Add new parameter forced_decoder_ids to OpenAIWhisperParserLocal + small bug fix #8793
…ng tasks (translate/transcribe)
@@ -136,10 +156,19 @@ def __init__(self, device: str = "0", lang_model: Optional[str] = None):
         # load model for inference
         self.pipe = pipeline(
             "automatic-speech-recognition",
-            model="openai/whisper-medium",
+            model=self.lang_model,  # fix to use model name that was evaluated earlier
i would remove this comment, it loses meaning outside of this pr
Thank you. Fixed.
             chunk_length_s=30,
             device=self.device,
         )
         try:
             if forced_decoder_ids is not None:
nit: i feel like it's slightly nicer to do:
    if ...:
        try:
            ...
        except:
            ...
don't care super strongly tho
Thank you. Fixed.
lgtm! thanks
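The nesting suggested in the review thread above (check the optional argument first, then wrap only the risky assignment in try/except) can be sketched as follows. This is a minimal illustration, not the code merged in the PR: the helper name apply_forced_decoder_ids, the attribute path pipe.model.config.forced_decoder_ids, the choice of AttributeError, and the SimpleNamespace stand-in for a real pipeline are all assumptions made for the example.

```python
from types import SimpleNamespace
from typing import List, Optional, Tuple


def apply_forced_decoder_ids(
    pipe, forced_decoder_ids: Optional[List[Tuple[int, int]]]
) -> bool:
    """Set forced_decoder_ids on a pipeline's model config, if provided.

    Returns True when the ids were applied, False otherwise.
    (Hypothetical helper, for illustration only.)
    """
    # Check the optional argument first, so the try block wraps only
    # the statement that can actually fail.
    if forced_decoder_ids is not None:
        try:
            pipe.model.config.forced_decoder_ids = forced_decoder_ids
            return True
        except AttributeError as exc:
            # Assumed fallback: report the problem and keep the defaults.
            print(f"Could not set forced_decoder_ids: {exc}")
    return False


# Stand-in for a transformers pipeline object (an assumption, not the real API).
fake_pipe = SimpleNamespace(model=SimpleNamespace(config=SimpleNamespace()))

applied = apply_forced_decoder_ids(fake_pipe, [(1, 100), (2, 200)])
skipped = apply_forced_decoder_ids(fake_pipe, None)
broken = apply_forced_decoder_ids(SimpleNamespace(), [(1, 100)])
```

With this shape, a caller that passes no forced_decoder_ids never enters the try block at all, which is the readability point the reviewer was making.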
Minor fix: shortened one line because of a failed lint check. Will look into why local lint didn't display it for me.
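from transformers import WhisperProcessor
from langchain.document_loaders.generic import GenericLoader
from langchain.document_loaders.blob_loaders.youtube_audio import YoutubeAudioLoader
from langchain.document_loaders.parsers.audio import OpenAIWhisperParserLocal
processor = WhisperProcessor.from_pretrained("openai/whisper-medium")
forced_decoder_ids = processor.get_decoder_prompt_ids(language="french", task="transcribe")
# forced_decoder_ids = processor.get_decoder_prompt_ids(language="french", task="translate")  # alternative task
loader = GenericLoader(YoutubeAudioLoader(urls, save_dir), OpenAIWhisperParserLocal(lang_model="openai/whisper-medium", forced_decoder_ids=forced_decoder_ids))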
Please make sure your PR is passing linting and testing before submitting. Run make format, make lint and make test to check this locally.
See contribution guidelines for more information on how to write/run tests, lint, etc.: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->