An ultra-fast, lightweight, and 100% offline local AI subtitle assistant powered by Tauri, Vue 3, and SenseVoice-Small.
Shiyu Subtitle is an ultra-fast, lightweight, and 100% offline local AI subtitle assistant designed for creators, developers, and power users. By combining the high-performance speech recognition of SenseVoice-Small with the modern, efficient Tauri + Vue 3 desktop framework, Shiyu delivers a premium, distraction-free subtitle generation and editing experience directly on your local machine.
- 100% Local & Private: No cloud API keys, no network required. Your audio, video, and transcripts never leave your machine.
- SenseVoice-Small Powered: Fast speech-to-text inference on ONNX Runtime. Transcribes 1 hour of audio in under 1 minute.
- Smart Speech Segmentation: Bilingual segmentation algorithms produce natural, human-readable line breaks and work around the model's native limitations.
- Temporal Offset Compensation: A built-in -150 ms timestamp offset aligns subtitles with the audio waveform by compensating for the delay of SenseVoice's CTC peaks.
- Modern & Sleek UI: A premium dark-mode glassmorphism interface featuring smooth micro-animations, synchronized wave previews, and seamless timing navigation.
- Silent Daemon Management: Seamless, headless backend process lifecycle control. The Python API starts and terminates invisibly alongside the Tauri GUI—no annoying command prompt windows.
- CI/CD Ready: Ready-to-go GitHub Actions configuration for automated multi-platform compilation (Windows, macOS, and Linux) with platform-native Python packaging.
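The temporal offset compensation listed above can be pictured as a uniform shift applied to every segment's timestamps. The following is a minimal sketch, not Shiyu's actual implementation: the `Segment` class and the `compensate` function are hypothetical names introduced for illustration, and only the -150 ms figure comes from the feature list.

```python
from dataclasses import dataclass, replace

# Assumed value, taken from the "-150ms" figure in the feature list.
OFFSET_MS = -150


@dataclass(frozen=True)
class Segment:
    """Hypothetical subtitle segment; times are in milliseconds."""
    start_ms: int
    end_ms: int
    text: str


def compensate(segments, offset_ms=OFFSET_MS):
    """Shift every segment by offset_ms, clamping so no time goes below zero."""
    shifted = []
    for seg in segments:
        start = max(0, seg.start_ms + offset_ms)
        # Keep end >= start even when clamping moved the start.
        end = max(start, seg.end_ms + offset_ms)
        shifted.append(replace(seg, start_ms=start, end_ms=end))
    return shifted


if __name__ == "__main__":
    raw = [Segment(100, 900, "Hello"), Segment(1200, 2500, "world")]
    for seg in compensate(raw):
        print(seg.start_ms, seg.end_ms, seg.text)
```

Clamping at zero is a design choice of this sketch: a segment that starts within the first 150 ms would otherwise receive a negative timestamp, which subtitle formats generally cannot represent.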
graph TD
A[Tauri GUI - Vue 3 / Vite] <-->|Localhost API| B[Python Daemon - FastAPI]
B -->|ONNX Runtime Inference| C[SenseVoice-Small Model]
A -->|Tauri Command| D[Silent Process Controller]
D -->|Spawns / Terminates| B
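Because the GUI talks to the Python daemon over localhost, whatever spawns the daemon has to wait until it is accepting connections before sending requests. The sketch below illustrates that readiness check in Python using only the standard library. It is an assumption-laden illustration: the real controller lives in the Tauri (Rust) core, `wait_for_daemon` is a hypothetical name, and port 8000 is assumed here because it is Uvicorn's default, not because the README states it.

```python
import socket
import time


def wait_for_daemon(host="127.0.0.1", port=8000, timeout_s=10.0, interval_s=0.2):
    """Poll until a TCP connection to host:port succeeds or timeout_s elapses.

    Returns True once the daemon accepts a connection, False on timeout.
    The default port is an assumption (Uvicorn's default), not a project fact.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            # Not listening yet (or refused); wait briefly and retry.
            time.sleep(interval_s)
    return False
```

A caller would typically spawn the backend process first, then call `wait_for_daemon()` and report an error to the user if it returns `False`.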
- Frontend: Vue 3, Vite, Naive UI, Lucide Icons, Web Audio API
- Tauri Core: Rust (System windowing, silent process daemonization, registry isolation)
- Backend: FastAPI, Uvicorn, Python 3.10, ONNX Runtime (SenseVoice inference)
- Node.js (v18+) & npm
- Rust (stable toolchain)
- Python (3.10.x recommended)
- Clone the repository and navigate to the backend folder:
    cd backend
- Create a virtual environment and activate it:
    python -m venv venv
    # On Windows:
    .\venv\Scripts\activate
    # On macOS/Linux:
    source venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Download the SenseVoice-Small model (~230 MB) and place it in models/sensevoice-small/:
    # From HuggingFace
    huggingface-cli download SenseVoiceSmall/model-onnx --local-dir models/sensevoice-small/
    # Or download the ONNX model directly:
    # Download URL: https://huggingface.co/SenseVoiceSmall/model-onnx/resolve/main/model.onnx
    # Save to: models/sensevoice-small/model.onnx
⚠️ The model file is ~230 MB and is not included in the Git repository — download it separately.
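Since the model is downloaded separately, a quick sanity check before launching the backend can catch a missing or truncated file. The following is a minimal sketch under stated assumptions: `check_model` is a hypothetical helper, not part of the project, and the 200 MB threshold is an arbitrary value chosen to sit below the ~230 MB size quoted above.

```python
from pathlib import Path

# Path taken from the setup steps above.
MODEL_PATH = Path("models/sensevoice-small/model.onnx")
# Assumed threshold: comfortably below the ~230 MB quoted in this README.
MIN_SIZE_BYTES = 200 * 1024 * 1024


def check_model(path=MODEL_PATH, min_size=MIN_SIZE_BYTES):
    """Return (ok, message) describing whether the model file looks usable."""
    path = Path(path)
    if not path.is_file():
        return False, f"Model not found: {path}"
    size = path.stat().st_size
    if size < min_size:
        return False, f"Model looks truncated: {size} bytes at {path}"
    return True, f"Model found: {size} bytes at {path}"


if __name__ == "__main__":
    ok, message = check_model()
    print(message)
```

Run it from the backend folder so the relative path resolves. A size check only detects incomplete downloads; it does not verify the file's integrity.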
- Navigate to the frontend directory:
    cd ../frontend
- Install Node dependencies:
    npm install
- Run the application in development mode:
    npm run tauri dev
This repository includes a GitHub Actions workflow at .github/workflows/release.yml.
To compile and package installers for Windows (.msi/.exe), macOS (.dmg), and Linux (.deb) simultaneously:
- Push your code to GitHub.
- Create and push a version tag:
    git tag v1.0.0
    git push origin v1.0.0
- The runner will download the model from HuggingFace, compile Rust code, bundle native Python environments, and publish draft installers to your GitHub Releases.
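A malformed tag may not trigger the release workflow at all, so it can help to validate the tag string before pushing it. This is a hedged sketch: the pattern below is modeled only on the `v1.0.0` example above, the actual tag filter is defined in .github/workflows/release.yml and may differ, and `parse_release_tag` is a hypothetical helper.

```python
import re

# Assumed tag shape, modeled on the v1.0.0 example; the real trigger
# pattern lives in .github/workflows/release.yml and may differ.
TAG_PATTERN = re.compile(r"v(\d+)\.(\d+)\.(\d+)")


def parse_release_tag(tag):
    """Return (major, minor, patch) for tags shaped like v1.0.0, else None."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


if __name__ == "__main__":
    for candidate in ("v1.0.0", "1.0.0", "v1.0"):
        print(candidate, parse_release_tag(candidate))
```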
This project is licensed under the MIT License.

