Instructions for running the cli version? #140

jrp2014 · 2024-05-18T21:42:07Z

Is the word whisperkit-cli missing from the README?

swift run whisperkit-cli transcribe --model-path "Models/whisperkit-coreml/openai_whisper-large-v3" --audio-path ~/.cache/whisper/alice.mp3 
Building for debugging...
[1/1] Write swift-version--58304C5D6DBC2206.txt
Build complete! (0.09s)

If I don't include it, I get error: no executable product named 'transcribe'.

Transcription seems to be pretty slow, with no use of the GPU.

The output is a wall of text, with some capitalisation anomalies.

Using the mlx whisper, you can add timestamps to the output, so that if two people are speaking, the transcript starts each change of speaker on a new line. Is the same capability available here?

I'm not sure which MP3 formats are supported. From a stereo 44.1 kHz .mp3 file I got: Error when transcribing /Users/xxx.mp3: loadAudioFailed("Unable to resample audio")

I'm not sure whether I'm using the large-v3 for 30s clips, or the one for full length transcripts.


ZachNagengast · 2024-05-19T05:29:33Z

It's mentioned in the readme here: https://github.com/argmaxinc/WhisperKit?tab=readme-ov-file#swift-cli

Did you see somewhere else that had swift run transcribe? We will update if you can point us to it.

Regarding the timestamps, we do have a parameter clipTimestamps in the Swift library, but it's not currently in the CLI. Making a note to get that brought over.

The mp3 resample bug you posted is interesting; I've yet to see this error. Are you able to provide the audio file you used so we can debug?

jrp2014 · 2024-05-19T12:26:18Z

Thanks. The README now seems to be corrected.

I'm sorry that I can't share the mp3. Perhaps the app ran out of memory as the clip is quite long.

ZachNagengast · 2024-05-20T20:47:09Z

Ok, if you can replicate it with a file you can share, let us know. Memory seems like a good candidate; will see if there is a better error message we can give there.

ZachNagengast added the documentation, enhancement, good first issue, and bug labels · May 19, 2024

ZachNagengast closed this as completed May 20, 2024
