Minimal Obj-C application for automatic offline speech recognition. The inference runs locally, on-device.
whisper-iphone-13-mini-2.mp4
Real-time transcription demo:
whisper-iphone-13-mini-3.mp4
```bash
git clone https://github.com/ggerganov/whisper.cpp
open whisper.cpp/examples/whisper.objc/whisper.objc.xcodeproj/

# If you don't want to convert a Core ML model, you can skip this step by creating a dummy model:
mkdir models/ggml-base.en-encoder.mlmodelc
```
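If you later convert a real Core ML model, the dummy directory must be replaced. As a small sketch (the `coremldata.bin` check is an assumption about what a compiled `.mlmodelc` bundle contains), you can guard the `mkdir` so it never clobbers a converted model:

```shell
# Create the dummy Core ML encoder directory only if no real model is present.
# A compiled .mlmodelc bundle normally contains a coremldata.bin file (assumption).
MODEL_DIR="models/ggml-base.en-encoder.mlmodelc"
if [ ! -e "$MODEL_DIR/coremldata.bin" ]; then
    mkdir -p "$MODEL_DIR"
    echo "created dummy model at $MODEL_DIR"
fi
```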
Make sure to build the project in `Release`.

Also, don't forget to add the `-DGGML_USE_ACCELERATE` compiler flag for `ggml.c` in Build Phases. This can significantly improve the performance of the transcription.
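For a scripted build, roughly the same settings can be passed to `xcodebuild`. This is only a sketch: the scheme name is an assumption, and note that a command-line `OTHER_CFLAGS` override applies to every C/C++ file in the project, unlike the per-file flag set for `ggml.c` in Build Phases.

```shell
# Command-line Release build with the Accelerate flag (scheme name assumed).
xcodebuild -project whisper.cpp/examples/whisper.objc/whisper.objc.xcodeproj \
           -scheme whisper.objc \
           -configuration Release \
           OTHER_CFLAGS='$(inherited) -DGGML_USE_ACCELERATE' \
           build
```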
If you want to enable Core ML support, you can add the `-DWHISPER_USE_COREML -DWHISPER_COREML_ALLOW_FALLBACK` compiler flags for `whisper.cpp` in Build Phases.
Then follow the Core ML support section of the readme to convert the model.
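The conversion can be run from the root of the whisper.cpp checkout using the script shipped in `models/`. The Python package list below follows the main readme, but treat it as an assumption and check there for the current set:

```shell
# Install the Python dependencies needed for the Core ML conversion
pip install ane_transformers openai-whisper coremltools

# Generate the Core ML encoder for the base.en model; the resulting
# ggml-base.en-encoder.mlmodelc replaces the dummy directory created earlier
./models/generate-coreml-model.sh base.en
```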
This project also adds `-O3 -DNDEBUG` to `Other C Flags`, but adding flags at the app-project level is not ideal in real-world projects, since they apply to all C/C++ files; in your own project, consider splitting the xcodeproj into a workspace instead.