Whisper producing differing results and different inference time at every inference #913

nome2050 · 2023-01-31T18:20:01Z

I am using a T4 GPU to run Whisper `medium.en`.

The code sometimes gives different results on different inference runs.

The inference time also varies from run to run.

This makes benchmarking difficult.

Also, is there any way to decrease inference time?

Answered by jongwook

jongwook (Maintainer) · 2023-01-31T18:23:59Z

Please see #81 for the nondeterminism. Some avenues for optimization include quantization and CUDA kernel fusion, which have been discussed in #454 and #115 but are not yet included in this repo.
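For the benchmarking difficulty raised in the question, a small timing harness can make run-to-run variation easier to measure. The sketch below is not from the thread or the Whisper repo: `benchmark` is a hypothetical helper that discards a few warm-up runs, optionally synchronizes the GPU around each timed call, and reports mean, standard deviation, and minimum. The commented usage assumes the `openai-whisper` and PyTorch packages, a placeholder file `audio.wav`, and that fixing `temperature=0.0` is acceptable for the benchmark; it may reduce output variation but is not guaranteed to remove it.

```python
import statistics
import time
from typing import Callable, Dict, Optional


def benchmark(
    fn: Callable[[], object],
    warmup: int = 2,
    runs: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> Dict[str, object]:
    """Time fn() over `runs` calls after `warmup` untimed calls.

    `sync` is an optional callable (for example torch.cuda.synchronize)
    invoked before and after each timed call so that asynchronous GPU
    work is included in the measurement.
    """
    # Warm-up runs absorb one-time costs such as CUDA context creation
    # and kernel selection; their timings are discarded.
    for _ in range(warmup):
        fn()

    timings = []
    for _ in range(runs):
        if sync is not None:
            sync()
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        timings.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings) if len(timings) > 1 else 0.0,
        "min": min(timings),
        "timings": timings,
    }


# Hypothetical usage on a CUDA GPU (requires openai-whisper and torch):
#   import torch, whisper
#   model = whisper.load_model("medium.en", device="cuda")
#   stats = benchmark(
#       lambda: model.transcribe("audio.wav", temperature=0.0),
#       sync=torch.cuda.synchronize,
#   )
#   print(stats["mean"], stats["stdev"])
```

Reporting the spread alongside the mean shows how much of the variation remains after warm-up, which is usually more informative than a single timed run.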
