Two issues raised by Zico Kolter in this Twitter thread:
Here's the result given the first few minutes of lecture 6 of Zico's course.

```
yt-dlp --extract-audio --audio-format mp3 https://www.youtube.com/watch?v=CukpVt-1PA4
```
```
whisper --model large --language en --initial_prompt "Terminology: i'th activation (or layer) in the network = z_i, the bias term = b_i, associated nonlinear function = sigma_i, weight term = W_i." Lecture\ 6\ -\ Fully\ connected\ networks\,\ optimization\,\ initialization\ \[CukpVt-1PA4\].mp3
```
We get output that ignores the prompted notation: the underscores don't appear, and the capitalisation doesn't match. I also tried appending further terminology to the prompt, without success. At a guess, perhaps the tokenisation is preventing the correct output here with the underscore, and perhaps the capitalisation is not amenable to prompting for whatever reason. Is there anything that can be done currently, or that could be developed, to improve this? When I tried increasing the temperature to 0.2, suspecting this could increase the prompt's efficacy, I got an error from line 497.
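For reference, here is a minimal sketch of the same run via the Python API rather than the CLI; the filename is hypothetical (whatever yt-dlp produced above), and `transcribe()` accepts `initial_prompt` and a `temperature` override directly:

```python
# Sketch: transcribe with a terminology prompt and a fixed sampling
# temperature, instead of the default temperature fallback schedule.
import whisper

model = whisper.load_model("large")
result = model.transcribe(
    "lecture6.mp3",  # hypothetical path to the extracted audio
    language="en",
    initial_prompt=(
        "Terminology: i'th activation (or layer) in the network = z_i, "
        "the bias term = b_i, associated nonlinear function = sigma_i, "
        "weight term = W_i."
    ),
    temperature=0.2,  # single temperature instead of the default (0.0, 0.2, ..., 1.0)
)
print(result["text"])
```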
I just wanted to record this and share it in passing while seeing what Whisper can do :-) Any thoughts or advice would be appreciated, and might I add, thank you for publishing your work open source!
Replies: 1 comment
I think the biggest reason here is the token suppression, where by default most special characters are explicitly forbidden during sampling:

whisper/whisper/tokenizer.py, lines 239 to 249 in 62fe7f1

You might see a better result if you relax this by supplying a different `suppress_tokens` value.
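A minimal sketch of what that relaxation could look like via the Python API, assuming `transcribe()` forwards `suppress_tokens` to the decoding options (the default value `"-1"` stands in for the non-speech token list defined at the lines referenced above); the filename is hypothetical:

```python
# Sketch: disable the default non-speech token suppression so characters
# like "_" stay available during sampling. An empty list suppresses only
# the mandatory special tokens rather than the whole non-speech list.
import whisper

model = whisper.load_model("large")
result = model.transcribe(
    "lecture6.mp3",        # hypothetical path to the audio from the question
    language="en",
    initial_prompt="Terminology: z_i, b_i, sigma_i, W_i.",
    suppress_tokens=[],    # default "-1" suppresses most special characters
)
print(result["text"])
```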