Real-time streaming translator powered by a local LLM via Ollama.
- Streaming Translation — Get instant translation feedback as text is being translated
- Delta Translation — Translate text incrementally with context awareness for natural flow
- Compression & Summarization — Translate and compress text in a single pass
- Web Interface — Simple, fast frontend for real-time translation
- 10 Languages Supported — English, Dutch, German, French, Spanish, Polish, Italian, Portuguese, Chinese, and Russian
- Local & Private — Runs entirely on your machine, no external API calls
- Ollama installed and running
- Python 3.10+
- pip
```bash
# Clone the repo
git clone https://github.com/YOUR_USERNAME/translAI.git
cd translAI

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

Copy .env.example to .env and adjust if needed:

```bash
cp .env.example .env
```

Default values:

```env
OLLAMA_URL=http://localhost:11434
TRANSLAI_MODEL=qwen2.5:1.5b
```
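For reference, here is a minimal sketch of how these settings can be picked up on the Python side, assuming the server reads them from the environment (for example after loading .env with python-dotenv); the variable names and defaults mirror .env.example:

```python
# Minimal sketch: read translAI settings from the environment.
# Assumes .env has already been loaded (e.g. via python-dotenv);
# names and defaults mirror .env.example.
import os

OLLAMA_URL = os.getenv("OLLAMA_URL", "http://localhost:11434")
TRANSLAI_MODEL = os.getenv("TRANSLAI_MODEL", "qwen2.5:1.5b")

print(f"Using model {TRANSLAI_MODEL} at {OLLAMA_URL}")
```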
```bash
# Start Ollama (in another terminal)
ollama serve

# Pull the model (first time only)
ollama pull qwen2.5:1.5b

# Start the server
./start.sh
# or: uvicorn app:app --reload
```

Visit http://localhost:8000 in your browser.
Translate full text with streaming.
```json
{
  "text": "Привет, мир!",
  "target_lang": "en",
  "source_lang": "Russian"
}
```
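As a usage sketch, the response can be consumed from Python with HTTPX (already a project dependency). The route path /translate and the plain-text chunk format are assumptions here; check the actual route in app.py:

```python
# Hypothetical streaming client. The "/translate" path and plain-text
# chunking are assumptions; adjust to the server's actual route/format.
import httpx

payload = {
    "text": "Привет, мир!",
    "target_lang": "en",
    "source_lang": "Russian",
}

with httpx.stream("POST", "http://localhost:8000/translate",
                  json=payload, timeout=None) as resp:
    resp.raise_for_status()
    for chunk in resp.iter_text():
        # Print each partial translation chunk as it arrives.
        print(chunk, end="", flush=True)
```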
Translate incremental text changes for real-time translation UX.

```json
{
  "delta": "мир",
  "context_ru": "Привет,",
  "context_tr": "Hello,",
  "target_lang": "en"
}
```
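Below is a sketch of the incremental loop this enables: each new fragment is sent along with both the source and translated context so the model can continue the translation naturally. The /translate/delta path and the plain-text response are assumptions:

```python
# Hypothetical incremental-translation loop; the "/translate/delta"
# path and plain-text response format are assumptions.
import httpx

source_ctx, translated_ctx = "", ""

for delta in ["Привет,", " мир", "!"]:  # simulated typing
    resp = httpx.post(
        "http://localhost:8000/translate/delta",
        json={
            "delta": delta,
            "context_ru": source_ctx,
            "context_tr": translated_ctx,
            "target_lang": "en",
        },
        timeout=30.0,
    )
    resp.raise_for_status()
    source_ctx += delta
    translated_ctx += resp.text  # append the newly translated fragment
    print(translated_ctx)
```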
Translate and compress text simultaneously.

```json
{
  "text": "Long text...",
  "target_lang": "en"
}
```
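A one-shot call sketch; the /compress path and plain-text response are again assumptions:

```python
# Hypothetical translate-and-compress call; "/compress" path assumed.
import httpx

resp = httpx.post(
    "http://localhost:8000/compress",
    json={"text": "Long text...", "target_lang": "en"},
    timeout=60.0,
)
resp.raise_for_status()
print(resp.text)  # compressed translation (response shape assumed)
```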
Check Ollama connection and model availability.

Performance logging endpoints for monitoring translation speeds.
- FastAPI — Modern, fast web framework with async streaming
- Ollama — Local LLM inference engine
- HTTPX — Async HTTP client for Ollama communication
- Vanilla JS Frontend — Lightweight, no framework overhead
MIT