How to get the progress bar while transcribing? #850
Replies: 4 comments 19 replies
-
This is how I do it in JavaScript: https://github.com/mayeaux/generate-subtitles/blob/master/helpers/formatStdErr.js#L9
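For anyone who wants the same stderr-parsing idea in Python, here is a minimal sketch. It is not a port of the linked JavaScript helper; it only assumes the tqdm line format shown in the original question (`4058/6058 [00:21<00:00, 277.58frames/s]`). The function name `parse_progress_line` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches the percentage, the bar, and the "done/total" frame counter of a tqdm line,
# e.g. " 60%|#######   | 4058/6058 [00:21<00:00, 277.58frames/s]".
# Assumption: Whisper's progress output keeps this default tqdm layout.
_PROGRESS_RE = re.compile(r"(\d+)%\|.*?\|\s*(\d+)/(\d+)")


def parse_progress_line(line: str) -> Optional[Tuple[int, int]]:
    """Return (done, total) frames from one tqdm line, or None for any other line."""
    match = _PROGRESS_RE.search(line)
    if match is None:
        return None
    return int(match.group(2)), int(match.group(3))
```

Since tqdm redraws the bar with carriage returns, the captured stderr probably needs to be split on "\r" as well as "\n" before each piece is passed to this function.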
-
I think it would be great to have a function like I think it's not complicated, since
-
Another option is to override the tqdm.tqdm class that Whisper uses:

import os
import sys
import urllib.request

import tqdm


class _CustomProgressBar(tqdm.tqdm):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._current = self.n  # Set the initial value

    def update(self, n=1):
        super().update(n)
        self._current += n
        # Handle progress here
        print("Progress: " + str(self._current) + "/" + str(self.total))


# Inject into tqdm.tqdm of Whisper, so we can see progress.
# whisper.transcribe resolves to the function, so fetch the module from sys.modules.
import whisper.transcribe
transcribe_module = sys.modules['whisper.transcribe']
transcribe_module.tqdm.tqdm = _CustomProgressBar

import whisper

model = whisper.load_model("medium")
if not os.path.exists("sample1.wav"):
    urllib.request.urlretrieve("https://github.com/itsupera/audiobook_alignment/raw/main/samples/sample1.wav", "sample1.wav")
result = model.transcribe("sample1.wav", language="Japanese", fp16=False, verbose=None)
print(result['text'])

That way, you don't have to launch Whisper in another process or parse its output. I have a complete example of how this can be done, which I use to pass progress from Whisper into Gradio. But yeah, hopefully there will be a proper API soon so we don't have to rely on method hooks or parsing the console output.
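One caveat with the hook above: `transcribe_module.tqdm` is the tqdm module itself, so the assignment replaces `tqdm.tqdm` for every caller that looks it up through the module, not only for Whisper. A minimal sketch of scoping the patch with a context manager follows. The helper name `patched_attr` is hypothetical, and a stand-in object is used instead of the real tqdm module so the snippet runs without Whisper installed.

```python
import contextlib
import types


@contextlib.contextmanager
def patched_attr(obj, name, replacement):
    """Temporarily set obj.<name> to replacement and restore the original on exit."""
    original = getattr(obj, name)
    setattr(obj, name, replacement)
    try:
        yield
    finally:
        # Runs even if the body raises, so the patch never leaks.
        setattr(obj, name, original)


# Stand-in for the tqdm module (assumption: the real code would pass transcribe_module.tqdm).
fake_tqdm_module = types.SimpleNamespace(tqdm="original bar class")

with patched_attr(fake_tqdm_module, "tqdm", "custom bar class"):
    inside = fake_tqdm_module.tqdm  # replacement is visible only inside the block
after = fake_tqdm_module.tqdm       # original is back once the block exits
```

With the real objects this would look like `with patched_attr(transcribe_module.tqdm, "tqdm", _CustomProgressBar): result = model.transcribe(...)`. The standard library's `unittest.mock.patch.object` provides the same set-and-restore behavior if you would rather not write the helper.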
-
#!/usr/bin/env python3
import whisper

model = whisper.load_model("large-v3")
result = model.transcribe("scale.m4a")
with open("transcription.txt", "w") as f:
    f.write(result["text"])

I don't see the progress bar with this simple code. I've searched, and transcribe.py already uses tqdm. Thank you.
-
Here is my code.

import whisper

# Load the Whisper model
MODEL = whisper.load_model("model/medium.pt", in_memory=True)

# Set options
transcribe_options = dict(task="translate", language=translate_language, beam_size=5, best_of=5)
result_transcribe = MODEL.transcribe(audio, **transcribe_options, fp16=False, verbose=False)

When I set verbose=False, this is the terminal output while transcribing:
Detected language: Turkish
60%|███████████████████████████████████████████████████████████████████████████ | 4058/6058 [00:21<00:00, 277.58frames/s]
100%|███████████████████████████████████████████████████████████████████████████ | 6058/6058 [00:21<00:00, 277.58frames/s]
I want to get the progress bar while it's transcribing. Is that possible?
Thanks,