VAD coredump (whisper-server) #3403

@bkervaski

I found an issue that causes a core dump when VAD is enabled. The audio file being processed is fine: it's a stereo PCM WAV file, and if I re-process it, it works the second time.

The whisper-server startup command is:

bin/whisper-server -m models/ggml-large-v3.bin --port 8080 -nc -sns --diarize -t 8 --vad --vad-model models/ggml-silero-v5.1.2.bin

Here's the log output leading up to the core dump. I also have the actual core dump if needed:

Sep 02 14:24:58 ai0 whisper-server[389693]: operator(): processing '/tmp/01b3-b408-4c3b-c91c-f82c-cfe8-40ce-e110.wav' (356251 samples, 22.3 sec), 8 threads, 1 processors, lang = en, task = transcribe, timestamps = 1 ...
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_full: VAD is enabled, processing speech segments only
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_vad: VAD is enabled, processing speech segments only
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_vad_segments_from_samples: detecting speech timestamps in 356251 samples
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_vad_detect_speech: detecting speech in 356251 samples
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_vad_detect_speech: n_chunks: 696
Sep 02 14:24:58 ai0 whisper-server[389693]: whisper_vad_detect_speech: props size: 696
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_detect_speech: chunk_len: 411 < n_window: 512
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_detect_speech: vad time = 489470.28 ms processing 356251 samples
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: detecting speech timestamps using 696 probabilities
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: Merged 1 adjacent segments, now have 4 segments
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: Final speech segments after filtering: 4
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: VAD segment 0: start = 0.64, end = 1.05 (duration: 0.41)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: VAD segment 1: start = 1.44, end = 7.77 (duration: 6.33)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: VAD segment 2: start = 8.00, end = 21.60 (duration: 13.60)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad_segments_from_probs: VAD segment 3: start = 21.92, end = 22.27 (duration: 0.35)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: detected 4 speech segments
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Including segment 0: 0.64 - 1.15 (duration: 0.51)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Including segment 1: 1.44 - 7.87 (duration: 6.43)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Including segment 2: 8.00 - 21.70 (duration: 13.70)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Including segment 3: 21.92 - 22.27 (duration: 0.35)
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: total duration of speech segments: 20.99 seconds
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: vad_segment_info: orig_start: 0.64, orig_end: 1.05, vad_start: 0.00, vad_end: 0.51
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: vad_segment_info: orig_start: 1.44, orig_end: 7.77, vad_start: 0.61, vad_end: 7.04
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: vad_segment_info: orig_start: 8.00, orig_end: 21.60, vad_start: 7.14, vad_end: 20.84
Sep 02 14:24:59 ai0 audit[389693]: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:unconfined_service_t:s0 pid=389693 comm="whisper-server" exe="/opt/whisper/build/bin/whisper-server" sig=6 res=1
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: vad_segment_info: orig_start: 21.92, orig_end: 22.27, vad_start: 20.94, vad_end: 21.29
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Created time mapping table with 106 points
Sep 02 14:24:59 ai0 whisper-server[389693]: whisper_vad: Reduced audio from 356251 to 340571 samples (4.4% reduction)
Sep 02 14:24:59 ai0 whisper-server[389693]: malloc(): invalid size (unsorted)
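
For anyone cross-checking the numbers in the log above, here is a small Python sketch that recomputes the chunk count, the final chunk length, and the vad_start/vad_end mapping from the logged values. It is not whisper.cpp code: the helper names `chunk_layout` and `map_segments` are made up for illustration, and the 0.10 s gap between mapped segments is only inferred from the logged vad_start values, not taken from the source.

```python
import math

SAMPLE_RATE = 16000   # whisper.cpp operates on 16 kHz audio
N_WINDOW = 512        # from "n_window: 512" in the log

def chunk_layout(n_samples, n_window=N_WINDOW):
    """Return (number of VAD chunks, length of the final chunk)."""
    n_chunks = math.ceil(n_samples / n_window)
    last_len = n_samples - (n_chunks - 1) * n_window
    return n_chunks, last_len

def map_segments(segments, gap=0.10):
    """Lay the included segments end to end, separated by `gap` seconds.

    `gap` is an assumption inferred from the log, not read from the code.
    """
    mapped, cursor = [], 0.0
    for i, (start, end) in enumerate(segments):
        if i > 0:
            cursor += gap          # silence inserted between segments
        vad_start = cursor
        cursor += end - start      # segment keeps its padded duration
        mapped.append((round(vad_start, 2), round(cursor, 2)))
    return mapped

# Values copied from the "Including segment" log lines.
included = [(0.64, 1.15), (1.44, 7.87), (8.00, 21.70), (21.92, 22.27)]

print(chunk_layout(356251))                          # chunks, last chunk length
print(map_segments(included))                        # (vad_start, vad_end) pairs
print(round(340571 / SAMPLE_RATE, 2))                # reduced audio, seconds
print(round(100 * (356251 - 340571) / 356251, 1))    # reduction, percent
```

Under those assumptions the results agree with the log: 696 chunks with a 411-sample final chunk, mapped segments ending at 21.29 s (340571 samples at 16 kHz), and a 4.4% reduction. So the VAD bookkeeping itself looks internally consistent, and the log alone does not show where the invalid allocation happens.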
