This repository is a test integration of whisper.cpp as a new WASI-NN backend for the LFX Mentorship project #3170.
- OS: macOS Sonoma 14.3 (23D56), arm64
- Kernel: 23.3.0
- Homebrew: 4.2.8
- CMake: 3.28.3
Create a clone using Git:

```shell
git clone https://github.com/WasmEdge/WasmEdge.git && cd WasmEdge
git checkout hydai/0.13.5_ggml_lts
```
Install the dependencies (CMake, Ninja, and LLVM):

```shell
brew install cmake
brew install ninja
brew install llvm
```
First, follow the guide to run llama.cpp. Run all of the following commands in the WasmEdge directory.
```shell
cmake -GNinja -Bbuild -DCMAKE_BUILD_TYPE=Release \
  -DWASMEDGE_PLUGIN_WASI_NN_BACKEND="GGML" \
  -DWASMEDGE_PLUGIN_WASI_NN_GGML_LLAMA_METAL=ON \
  -DWASMEDGE_PLUGIN_WASI_NN_GGML_LLAMA_BLAS=OFF \
  .
cmake --build build
cmake --install build
```
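To confirm the install actually produced the WASI-NN plugin, a small check like the following can help. This is a sketch: the plugin library name and the install directories passed in are assumptions and may differ on your system.

```shell
# find_wasi_nn_plugin DIR... — print the WASI-NN plugin library found in any
# of the given directories, or fail if none is present. Typical locations to
# try are "$HOME/.wasmedge/plugin" and /usr/local/lib/wasmedge (assumptions).
find_wasi_nn_plugin() {
  for d in "$@"; do
    # The library file name is an assumption; on macOS it is usually a .dylib.
    ls "$d"/libwasmedgePluginWasiNN.* 2>/dev/null && return 0
  done
  return 1
}
```

Usage (hypothetical): `find_wasi_nn_plugin "$HOME/.wasmedge/plugin" /usr/local/lib/wasmedge`.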
Download a cross-platform compatible portable Wasm file for the chat app. This application allows you to chat with the model via the command line.

```shell
curl -LO https://github.com/second-state/llama-utils/raw/main/chat/llama-chat.wasm
```
Choose and download a model in GGUF format. This time, llama-2-7b-chat.Q5_K_M.gguf was selected. Please download your preferred model from here.

```shell
curl -LO https://huggingface.co/TheBloke/Llama-2-7b-Chat-GGUF/resolve/main/llama-2-7b-chat.Q5_K_M.gguf
```
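Multi-gigabyte downloads occasionally fail partway or return an HTML error page instead of the model. A quick sanity check of the GGUF magic bytes can catch that early; `check_gguf` is a hypothetical helper, not part of any tool above.

```shell
# check_gguf FILE — succeed only if FILE starts with the 4-byte "GGUF" magic
# that every GGUF model file begins with.
check_gguf() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "GGUF" ]
}

# Usage (hypothetical):
# check_gguf llama-2-7b-chat.Q5_K_M.gguf && echo "looks like a GGUF model"
```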
Run the downloaded chat app with the model preloaded:

```shell
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:llama-2-7b-chat.Q5_K_M.gguf \
  llama-chat.wasm default
```
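The `--nn-preload` argument packs four colon-separated fields: NAME:BACKEND:DEVICE:MODEL_PATH. A tiny shape check (a hypothetical helper, and deliberately rough — it only counts fields) can catch typos before waiting on a long model load.

```shell
# check_preload SPEC — succeed if SPEC has the NAME:BACKEND:DEVICE:PATH shape,
# i.e. exactly four colon-separated fields. A model path containing a colon
# would defeat this rough check.
check_preload() {
  [ "$(printf '%s' "$1" | awk -F: '{print NF}')" -eq 4 ]
}
```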
If it doesn't work as expected, please check here.
![llama-chat-sample](https://private-user-images.githubusercontent.com/26399136/305917326-d19f0bf9-f9cc-43cb-8b49-ca855b71e96c.png)
Now let's run whisper.cpp for transcription, including real-time transcription!
- Clone the repository.
```shell
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp
```
- Select a model. Download a Whisper model converted to ggml format; this time, base.en was used.

```shell
bash ./models/download-ggml-model.sh base.en
```
You can choose your favorite model from here. The models are tiny, base, small, medium, and large; accuracy increases from left to right, but so do file size and memory usage. Models are multilingual unless the model name includes `.en`.
Transcribe a sample audio file:

```shell
./main -f samples/jfk.wav
```
Transcription succeeded if it completes with output like the following:

> And so my fellow Americans, ask not what your country can do for you, ask what you can do for your country.
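A scripted smoke test can assert that the expected phrase appears in the output. This is a sketch: `assert_transcript` is a hypothetical helper, and the match is deliberately loose and case-insensitive, since decoding output can vary slightly between models.

```shell
# assert_transcript FILE PHRASE — succeed if FILE contains PHRASE,
# case-insensitively.
assert_transcript() {
  grep -qi "$2" "$1"
}

# Usage (hypothetical):
# ./main -f samples/jfk.wav 2>/dev/null > jfk.txt
# assert_transcript jfk.txt "ask not what your country"
```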
- Install dependencies. Real-time transcription requires SDL2.

```shell
brew install sdl2
```
- Transcription in real time. Once built, the stream tool transcribes microphone input (`-t` sets the thread count; `--step` and `--length` are the audio step and window length in milliseconds).

```shell
make stream
./stream -m ./models/ggml-base.en.bin -t 8 --step 500 --length 5000
```
The demo audio was created with ElevenLabs.
whisper-sample.mp4
To use a language other than English, specify it with the `-l` option. Without this option, the system may default to English.
The example above uses the 'base' model. The following commands allow for real-time transcription in Japanese.
```shell
bash ./models/download-ggml-model.sh base
make stream
./stream -m ./models/ggml-base.bin -l ja -t 8 --step 500 --length 5000
```
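The two `./stream` invocations above differ only in model and language, so a small helper can assemble the command line. This is a sketch: `build_stream_cmd` is a hypothetical function, and the flags simply mirror the ones used above.

```shell
# build_stream_cmd LANG MODEL — print the ./stream command line used in this
# README for a given Whisper language code (e.g. en, ja) and model path.
build_stream_cmd() {
  printf './stream -m %s -l %s -t 8 --step 500 --length 5000\n' "$2" "$1"
}

# Example:
# build_stream_cmd ja ./models/ggml-base.bin
```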