Utilizing CTranslate2 for a severalfold speedup #836
alexkrasov
started this conversation in
Ideas
Replies: 1 comment 4 replies
I didn't realize it had good gains for CPU only; I will take a closer look, if only for that reason. I've played with it before and it's quite good. The problem: potentially more to bundle.
Hey folks, I wanted to throw out an idea: adding a CTranslate2 backend to Handy.
In plain terms, this could make local transcription feel way snappier, especially on CPU-only machines.
On my own older i7 (no GPU), I tested Whisper Large V3 Turbo, 8-bit quantized, and got almost real-time speed: about 10 seconds of audio processed in ~10 seconds.
That was roughly 10 times faster than a similar “regular” setup I've tried before (all large Whisper models, including Turbo, are basically unusable on my machine).
If Handy could support this engine, it could be a big quality-of-life win for a lot of people, especially anyone not running a strong GPU.
If anyone wants to play with this stack, check out faster-whisper - it’s a great way to see these speed gains in practice.
In the end, this could make Whisper models genuinely usable on CPU, which is basically not practical right now for many users.
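For anyone who wants to reproduce the measurement, here is a minimal sketch using faster-whisper on CPU with 8-bit quantization. It is not Handy code, just an illustration. The helper names (`real_time_factor`, `transcribe_on_cpu`) and the audio path are hypothetical, and the `"large-v3-turbo"` model name is an assumption that depends on your faster-whisper version accepting it; swap in whatever model identifier your install supports.

```python
import time


def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Processing time divided by audio duration; 1.0 means exactly real-time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def transcribe_on_cpu(audio_path: str, model_name: str = "large-v3-turbo"):
    """Transcribe a file on CPU with int8 weights and report the real-time factor.

    model_name is an assumption: adjust it to a model your faster-whisper
    version knows about.
    """
    # Imported lazily so real_time_factor works without faster-whisper installed.
    from faster_whisper import WhisperModel

    # device="cpu" with compute_type="int8" selects the 8-bit quantized CPU path.
    model = WhisperModel(model_name, device="cpu", compute_type="int8")

    start = time.perf_counter()
    segments, info = model.transcribe(audio_path)
    # segments is a lazy generator: decoding actually runs while we iterate.
    text = "".join(segment.text for segment in segments)
    elapsed = time.perf_counter() - start

    return text, real_time_factor(info.duration, elapsed)
```

Calling `transcribe_on_cpu("sample.wav")` (a hypothetical file) returns the transcript and the real-time factor. The timing above corresponds to a factor of about 1.0; note that the timer includes decoding but not model loading, so the first run will feel slower than the number suggests.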