
Run talk example failed #782

Closed

jhezjkp opened this issue Apr 17, 2023 · 8 comments

Labels
bug Something isn't working

Comments

jhezjkp commented Apr 17, 2023

➜ whisper.cpp git:(master) ✗ ./talk -p santa
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 2
whisper_model_load: mem required = 218.00 MB (+ 6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 140.60 MB
whisper_model_load: model size = 140.54 MB
whisper_init_state: kv self size = 5.25 MB
whisper_init_state: kv cross size = 17.58 MB
gpt2_model_load: loading model from 'models/ggml-gpt-2-117M.bin'
gpt2_model_load: n_vocab = 50257
gpt2_model_load: n_ctx = 1024
gpt2_model_load: n_embd = 768
gpt2_model_load: n_head = 12
gpt2_model_load: n_layer = 12
gpt2_model_load: f16 = 1
gpt2_model_load: ggml ctx size = 311.12 MB
gpt2_model_load: memory size = 72.00 MB, n_mem = 12288
gpt2_model_load: tensor 'model/h0/attn/c_attn/w' has wrong shape in model file: got [2304, 768], expected [768, 2304]
gpt2_init: failed to load model from 'models/ggml-gpt-2-117M.bin'

main: processing, 4 threads, lang = en, task = transcribe, timestamps = 0 ...

init: found 1 capture devices:
init: - Capture device #0: 'MacBook Pro麦克风'
init: attempt to open default capture device ...
init: obtained spec for input device (SDL Id = 2):
init: - sample rate: 16000
init: - format: 33056 (required: 33056)
init: - channels: 1 (required: 1)
init: - samples per frame: 1024
[1] 68264 segmentation fault ./talk -p santa

➜ whisper.cpp git:(master) ✗ shasum -a 256 ./models/ggml-gpt-2-117M.bin
b457d5fcc7f2f71e727bee74298d42d80610619e02af16beca53d44a71d5f607 ./models/ggml-gpt-2-117M.bin
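The tensor-shape mismatch above ([2304, 768] vs. [768, 2304]) usually means the talk binary and the GPT-2 model file disagree on the ggml tensor layout, e.g. after a format change on master. A minimal first step, assuming a standard Makefile build, is to re-sync and rebuild before retrying:

```sh
# Rebuild so the loader matches the current ggml format, then retry;
# if the shape mismatch persists, the GPT-2 model file itself likely
# needs to be re-downloaded or re-converted.
git pull origin master
make clean && make talk
./talk -p santa
```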

jhezjkp (Author) commented Apr 17, 2023

Apple M1 Pro
macOS 13.3

ggerganov added the bug (Something isn't working) label Apr 23, 2023
ggerganov (Owner) commented

I'll fix these in a few days

gab-luz commented Apr 29, 2023

The same is happening to me when using the q2 Whisper version. @ggerganov, I've also tried switching to the 4-bit branch and it still didn't work. Besides the usual "./main -m models/ggml-model-whisper-large-q4_0.bin -f file.mp4", is there anything else to be done to run a quantized model?
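A side note on the command above: whisper.cpp's main expects 16 kHz WAV input, not MP4, so the file likely needs converting first. A minimal sketch, assuming ffmpeg is installed and reusing the file names from the comment:

```sh
# Convert the MP4's audio track to 16 kHz mono 16-bit PCM WAV,
# the input format whisper.cpp expects
ffmpeg -i file.mp4 -ar 16000 -ac 1 -c:a pcm_s16le file.wav
./main -m models/ggml-model-whisper-large-q4_0.bin -f file.wav
```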

gab-luz commented Apr 29, 2023

I get the same error when running the quantized model: [1] 37589 segmentation fault (core dumped) ./main -m models/ggml-model-whisper-large-q4_0.bin -f

gab-luz commented Apr 30, 2023

I've tried a WAV file instead of an MP4 file and it didn't work. Running ggml-large.bin works, but it's not the quantized one, unfortunately.
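If the quantized file was produced by an older build, regenerating it against the current tree may help, since quantized models must match the loader's ggml format. A sketch, assuming the repo's quantize tool and the file names above:

```sh
# Rebuild the tools, then re-quantize the f16 large model to q4_0 so the
# model file and the loader agree on the ggml format version
make clean && make main quantize
./quantize models/ggml-large.bin models/ggml-model-whisper-large-q4_0.bin q4_0
./main -m models/ggml-model-whisper-large-q4_0.bin -f file.wav
```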

ggerganov (Owner) commented

Should be fixed now

ZechenM commented May 31, 2023

I am still getting the same error:

(py310-whisper) whisper.cpp % ./talk -p Santa
whisper_init_from_file_no_state: loading model from 'models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2
whisper_model_load: mem required = 310.00 MB (+ 6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx = 140.66 MB
whisper_model_load: model size = 140.54 MB
whisper_init_state: kv self size = 5.25 MB
whisper_init_state: kv cross size = 17.58 MB
whisper_init_state: loading Core ML model from 'models/ggml-base.en-encoder.mlmodelc'
whisper_init_state: first run on a device may take a while ...
whisper_init_state: Core ML model loaded
gpt2_model_load: loading model from 'models/ggml-gpt-2-117M.bin'
gpt2_model_load: failed to open 'models/ggml-gpt-2-117M.bin'
gpt2_init: failed to load model from 'models/ggml-gpt-2-117M.bin'

main: processing, 4 threads, lang = en, task = transcribe, timestamps = 0 ...

init: found 4 capture devices:
init: - Capture device #0: 'Zechen’s AirPods Pro #2'
init: - Capture device #1: 'Z’s iPhone Microphone'
init: - Capture device #2: 'MacBook Pro Microphone'
init: - Capture device #3: 'ZoomAudioDevice'
init: attempt to open default capture device ...
init: obtained spec for input device (SDL Id = 2):
init: - sample rate: 16000
init: - format: 33056 (required: 33056)
init: - channels: 1 (required: 1)
init: - samples per frame: 1024
zsh: segmentation fault ./talk -p Santa
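Note that this failure differs from the original report: here gpt2_model_load cannot open the file at all, which usually just means the GPT-2 model was never downloaded. A quick check, using the path from the log (see the talk example's README for where to obtain the model):

```sh
# talk needs the GPT-2 model in addition to the Whisper model;
# verify it exists where the binary looks for it
ls -l models/ggml-gpt-2-117M.bin \
  || echo "missing: fetch ggml-gpt-2-117M.bin (see the talk example's README)"
```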

fedorenko-dmitriy commented

Hi all. The same problem on Win10 when starting the Python script from the example. I tested with all Windows builds from 1.5.4 to 1.6.2:
PS C:\Users\123\CODE\test2> python -m test .\jfk.wav base
..\src\utils\whisper.cpp_win\main.exe -m ggml-base.bin -f .\jfk.wav
Error: Error processing audio: whisper_init_from_file_with_params_no_state: loading model from 'ggml-base.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 18.87 MB
whisper_init_state: kv pad size = 3.15 MB
whisper_init_state: compute buffer (conv) = 16.39 MB
whisper_init_state: compute buffer (encode) = 132.07 MB
whisper_init_state: compute buffer (cross) = 4.78 MB
whisper_init_state: compute buffer (decode) = 96.48 MB

system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0

main: processing '.\jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

whisper_print_timings: load time = 324.05 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 16.54 ms
whisper_print_timings: sample time = 120.25 ms / 141 runs ( 0.85 ms per run)
whisper_print_timings: encode time = 1851.88 ms / 1 runs ( 1851.88 ms per run)
whisper_print_timings: decode time = 5.56 ms / 1 runs ( 5.56 ms per run)
whisper_print_timings: batchd time = 307.09 ms / 138 runs ( 2.23 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 2641.22 ms

But if I run the command directly, it works normally:

PS C:\Users\123\CODE\test2> ..\src\utils\whisper.cpp_win\main.exe -m ggml-base.bin -f .\jfk.wav
whisper_init_from_file_with_params_no_state: loading model from 'ggml-base.bin'
whisper_init_with_params_no_state: use gpu = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0
whisper_init_with_params_no_state: dtw = 0
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_model_load: CPU total size = 147.37 MB
whisper_model_load: model size = 147.37 MB
whisper_init_state: kv self size = 18.87 MB
whisper_init_state: kv cross size = 18.87 MB
whisper_init_state: kv pad size = 3.15 MB
whisper_init_state: compute buffer (conv) = 16.39 MB
whisper_init_state: compute buffer (encode) = 132.07 MB
whisper_init_state: compute buffer (cross) = 4.78 MB
whisper_init_state: compute buffer (decode) = 96.48 MB

system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | METAL = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | CUDA = 0 | COREML = 0 | OPENVINO = 0

main: processing '.\jfk.wav' (176000 samples, 11.0 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = en, task = transcribe, timestamps = 1 ...

[00:00:00.000 --> 00:00:08.000] And so, my fellow Americans, ask not what your country can do for you,
[00:00:08.000 --> 00:00:11.000] ask what you can do for your country.

whisper_print_timings: load time = 366.80 ms
whisper_print_timings: fallbacks = 0 p / 0 h
whisper_print_timings: mel time = 40.23 ms
whisper_print_timings: sample time = 165.82 ms / 141 runs ( 1.18 ms per run)
whisper_print_timings: encode time = 3230.55 ms / 1 runs ( 3230.55 ms per run)
whisper_print_timings: decode time = 7.19 ms / 1 runs ( 7.19 ms per run)
whisper_print_timings: batchd time = 559.73 ms / 138 runs ( 4.06 ms per run)
whisper_print_timings: prompt time = 0.00 ms / 1 runs ( 0.00 ms per run)
whisper_print_timings: total time = 4438.11 ms

Any suggestions? Thanks.
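One likely explanation, offered as a guess: whisper.cpp's main prints its init banner and timing info to stderr and only the transcript to stdout, so a wrapper that treats any stderr output as a failure will report an "error" even though the run succeeded. A quick way to see the split from PowerShell, reusing the command above:

```powershell
# stdout carries only the transcript; the whisper_init_*/whisper_model_load
# banner and the timings go to stderr, so capture the two streams separately
..\src\utils\whisper.cpp_win\main.exe -m ggml-base.bin -f .\jfk.wav 2> init.log
Get-Content init.log
```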
