Real-time in-browser speech recognition using Whisper and WebGPU, written in TypeScript.
This is a TypeScript version of the original webgpu-whisper example, with full type safety and better IDE support.
- 🎙️ Real-time speech recognition
- 🌐 Runs entirely in the browser (no server required)
- ⚡ Powered by WebGPU for fast inference
- 🔒 Private - all processing happens locally
- 🌍 Multi-language support (99+ languages)
- 📝 Fully typed with TypeScript
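Install dependencies with `npm install`, start the development server with `npm run dev`, and create a production build with `npm run build`.

- Transformers.js for running Whisper models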
- WebGPU for accelerated inference
- Vite for fast development and building
- Tailwind CSS for styling
- The app loads the Whisper-base model (~200MB) from Hugging Face
- It requests microphone access and starts recording audio
- Audio is continuously processed and transcribed in real time
- All processing happens in a Web Worker to keep the UI responsive
- Results are streamed back and displayed as they're generated
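
The flow above can be pictured as a small typed message protocol between the UI thread and the worker. The sketch below is illustrative only: the message shapes (`WorkerRequest`, `WorkerResponse`), the `UiState` fields, and the helper names are assumptions made for this example, not the app's actual types.

```typescript
// Hypothetical message protocol between the UI thread and the Web Worker.
// All names here are illustrative assumptions, not the app's real types.
type WorkerRequest =
  | { type: "load" } // ask the worker to download and initialise the model
  | { type: "generate"; audio: Float32Array; language: string };

type WorkerResponse =
  | { status: "loading"; progress: number } // model download progress, 0 to 100
  | { status: "ready" } // model is loaded and can transcribe
  | { status: "update"; text: string } // partial transcript streamed while decoding
  | { status: "complete"; text: string }; // final transcript for the current audio

interface UiState {
  modelReady: boolean;
  progress: number;
  transcript: string;
  busy: boolean;
}

const initialState: UiState = {
  modelReady: false,
  progress: 0,
  transcript: "",
  busy: false,
};

// Build the request the UI thread would post to the worker.
function makeGenerateRequest(audio: Float32Array, language: string): WorkerRequest {
  return { type: "generate", audio, language };
}

// Pure reducer: folds one worker message into the UI state.
function applyWorkerMessage(state: UiState, msg: WorkerResponse): UiState {
  switch (msg.status) {
    case "loading":
      return { ...state, progress: msg.progress };
    case "ready":
      return { ...state, modelReady: true, progress: 100 };
    case "update":
      return { ...state, transcript: msg.text, busy: true };
    case "complete":
      return { ...state, transcript: msg.text, busy: false };
  }
}
```

In a browser, this could be wired up with `worker.onmessage = (e) => { state = applyWorkerMessage(state, e.data); }` and `worker.postMessage(makeGenerateRequest(samples, "en"))`. Keeping the reducer pure means the streaming logic can be checked without a microphone or a GPU.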
Requires a browser with WebGPU support:
- Chrome/Edge 113+
- Safari/iOS 16.4+ (experimental)
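
Because support varies by browser, it can help to check for WebGPU before loading the model. The following is a minimal sketch, assuming a helper named `hasWebGPU` (hypothetical, not part of this project) and a small structural type `GpuLike` so that it compiles without the `@webgpu/types` package. It relies only on the standard `navigator.gpu.requestAdapter()` call, which resolves to `null` when no suitable adapter is available.

```typescript
// Minimal structural type for the part of navigator.gpu used here.
// Declared locally (an assumption for this sketch) to avoid needing @webgpu/types.
interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}

// Returns true only if the WebGPU API is exposed AND an adapter can be obtained.
// `nav` is passed in rather than read from the global so the check is easy to test.
async function hasWebGPU(nav: { gpu?: GpuLike } | undefined): Promise<boolean> {
  if (!nav?.gpu) return false; // API not exposed by this browser
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter !== null && adapter !== undefined;
  } catch {
    return false; // treat any failure as "not supported"
  }
}
```

In a page this might be called as `hasWebGPU(navigator as { gpu?: GpuLike })`, showing a fallback message when it resolves to `false`.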
License: same as the parent Transformers.js project.