Transcription stopped halfway #36

nhan000 · 2023-07-14T11:00:23Z

I downloaded this 27-minute YouTube video (uploaded here).

I ran the transcription with this command:
whisper-faster "C:\Users\ntnha\Videos\4K Video Downloader\Carl Sagan Astronomer of the People.mp4" --language en --model large-v2 --batch_recursive true

and it stopped at [13:15.860 --> 13:18.860] His greatest achievement was just around the corner.

I then downloaded the mp3 file from the same YouTube video (uploaded here) and ran:
whisper-faster "C:\Users\ntnha\Videos\4K Video Downloader\Carl Sagan Astronomer of the People.mp3" --language en --model large-v2 --batch_recursive true

and it was able to run to [26:44.760 --> 26:46.180] might have been enough.

Interestingly, it didn't transcribe the advertisement at the beginning and at the end of the video.

Purfview · 2023-07-14T11:05:04Z

Check whether the .srt subtitle file has been created at the point where you think it has "stopped".

nhan000 · 2023-07-14T11:49:28Z

The .srt file was created, but the second half is missing. It ends at the same timestamp as the output in the command prompt.
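
For anyone who wants to check this without reading the whole subtitle file, here is a minimal Python sketch that finds the end time of the last cue in an .srt file and compares it with the media length. The function names and the 60-second tolerance are my own assumptions for illustration, not part of whisper-faster. The media length has to be supplied by hand.

```python
import re

# An SRT timing line looks like "00:13:15,860 --> 00:13:18,860".
# Both "," and "." are accepted as the millisecond separator.
CUE_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def last_cue_end_seconds(srt_text):
    """Return the end time in seconds of the last cue, or None if there is no cue."""
    last = None
    for match in CUE_RE.finditer(srt_text):
        hours, minutes, seconds, millis = (int(x) for x in match.groups()[4:])
        last = hours * 3600 + minutes * 60 + seconds + millis / 1000
    return last


def looks_truncated(srt_text, media_seconds, tolerance=60.0):
    """Guess whether the subtitles end well before the media does.

    tolerance is an arbitrary allowance (hypothetical default) for
    untranscribed material at the end, such as a trailing ad.
    """
    end = last_cue_end_seconds(srt_text)
    return end is None or media_seconds - end > tolerance
```

To use it, read the file with `open(path, encoding="utf-8").read()` and pass the text along with the media length in seconds (about 27 * 60 here). A file that ends at 13:18 would be flagged as truncated, while one that ends at 26:46 would not.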

Purfview · 2023-07-14T11:54:37Z

Are you running it on CUDA? If so, try the --compute_type=int8 parameter.

nhan000 · 2023-07-14T18:57:12Z

I added the parameter you gave me, so the command is now:

whisper-faster "C:\Users\ntnha\Videos\4K Video Downloader\Carl Sagan Astronomer of the People.mp4" --language en --model large-v2 --batch_recursive true --compute_type=int8

It ran on CUDA, and it still stopped at the same location.

Purfview · 2023-07-14T19:03:08Z

I reproduced this issue on my side. Later I'll check what can be done about it.
Interestingly, this hallucination starts on the advertisement.

nhan000 · 2023-07-14T19:13:40Z

Thanks for looking into this, and separately, thanks for making this program. Very noob-friendly for people who are not very techy like me.

The video has 3 advertisement segments:

One at the beginning, which Whisper Standalone doesn't transcribe for either the mp4 or the mp3 file.
One in the middle (13:19), which it transcribes for the mp3 file but where it stops for the mp4 file.
One at the end (26:46), which it also doesn't transcribe for the mp3 file.

Purfview · 2023-07-16T10:26:10Z

It doesn't get stuck with the --beam_size=5 option.

The ads at the start and end are still ignored; the models were probably trained to ignore that kind of ad. By the way, the tiny and base models do transcribe that ad.
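
If you launch the tool from a script, a small Python sketch like the one below can assemble the same command with the beam size added. It only uses options that appear in this thread, and it assumes the beam size option is spelled with two dashes like the others. `build_command` is a hypothetical helper name, not something shipped with the program.

```python
def build_command(media_path, beam_size=5, compute_type=None):
    """Build the argument list for whisper-faster as used in this thread.

    Only options quoted in the thread are included. The double-dash
    spelling of --beam_size is an assumption based on the other options.
    """
    cmd = [
        "whisper-faster",
        media_path,
        "--language", "en",
        "--model", "large-v2",
        "--batch_recursive", "true",
        f"--beam_size={beam_size}",
    ]
    if compute_type is not None:
        # Optional, e.g. "int8" as suggested earlier in the thread.
        cmd.append(f"--compute_type={compute_type}")
    return cmd
```

The returned list can be passed to `subprocess.run(cmd)`. Passing a list instead of one long string avoids quoting problems with paths that contain spaces.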

nhan000 · 2023-07-16T21:19:38Z

Thanks a lot! I will keep the beam size parameter in mind and adjust it when I run into issues.

nhan000 closed this as completed Jul 16, 2023
