This is an experimental fork of WhisperKit with AI-driven optimizations. It is not intended for production use.
Install the CLI via Homebrew:

```sh
brew install vibekernels/tap/whisperkit-vk-cli
```

Or build from source:

```sh
swift build
```

Build the CLI in release mode:

```sh
swift build -c release --product whisperkit-cli
```

Transcribe a short clip with the large-v2 model:

```sh
.build/release/whisperkit-cli transcribe \
  --audio-path Tests/WhisperKitTests/Resources/jfk.wav \
  --model large-v2 \
  --verbose
```

Transcribe a longer recording:

```sh
.build/release/whisperkit-cli transcribe \
  --audio-path Tests/WhisperKitTests/Resources/ted_60.m4a \
  --model large-v2 \
  --verbose
```

Run the same file across different models to compare speed and accuracy:
```sh
for model in tiny base small large-v2 large-v3; do
  echo "=== $model ==="
  .build/release/whisperkit-cli transcribe \
    --audio-path Tests/WhisperKitTests/Resources/jfk.wav \
    --model $model \
    --verbose
done
```

Run all models against short and long audio:
```sh
./scripts/benchmark.sh --all
```

See `./scripts/benchmark.sh --help` for options like `--audio`, `--models`, and `--long`.
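When comparing models as above, it can help to tabulate the resulting speed factors. A minimal sketch using hypothetical wall-clock timings (illustrative placeholders, not measured results) for an ~11-second clip:

```sh
# Model name plus hypothetical processing seconds; replace with your own measurements.
printf '%s\n' 'tiny 0.4' 'base 0.7' 'small 1.5' 'large-v2 4.1' 'large-v3 4.4' |
while read -r model seconds; do
  # Speed factor = audio duration / processing time (higher is better).
  awk -v m="$model" -v s="$seconds" -v a=11 \
      'BEGIN{printf "%-9s %6.2fx real-time\n", m, a/s}'
done
```

Swapping in the numbers printed by `--verbose` turns this into a quick side-by-side summary.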
The `--verbose` flag prints tokens per second, the real-time factor, and the speed factor after each transcription.
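These metrics are simple ratios. A minimal sketch of how they relate, using hypothetical numbers rather than actual CLI output:

```sh
# Hypothetical figures for a single run (illustrative only):
audio_seconds=11      # duration of the clip; jfk.wav is roughly 11 s
process_seconds=2.2   # wall-clock time the transcription took
tokens=60             # tokens emitted by the decoder

# Tokens per second: decoder throughput.
awk -v t="$tokens" -v p="$process_seconds" 'BEGIN{printf "tokens/s: %.1f\n", t/p}'
# Real-time factor: processing time / audio duration (lower is better).
awk -v p="$process_seconds" -v a="$audio_seconds" 'BEGIN{printf "RTF: %.2f\n", p/a}'
# Speed factor: audio duration / processing time (higher is better).
awk -v p="$process_seconds" -v a="$audio_seconds" 'BEGIN{printf "speed: %.1fx\n", a/p}'
```

An RTF below 1.0 (equivalently, a speed factor above 1.0x) means transcription runs faster than the audio plays.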
WhisperKit is released under the MIT License.
Based on argmaxinc/WhisperKit.