Does temperature make sense in transcription? #406
-
I'm likely wrong, but as I understand it, the temperature parameter is interesting in generative models (such as GPT) to "enhance creativity". However, I'm not sure how it helps in the case of speech-to-text, where one wants to get the exact words being pronounced. Isn't that why we get so many hallucinations in Whisper?
Replies: 2 comments 1 reply
-
What about translation? Do higher temperatures or other parameter changes lead to less literal translations? Skimming the paper, I don't see similar stats for the translation task.
-
By default, the temperature is not used for the first decoding attempt. It is only when the decoded results do not meet the `compression_ratio_threshold` or `logprob_threshold` that it resorts to temperature fallback. According to Table 7 in the paper, using temperature fallback does, on average, improve the performance of long-form transcription.