Translates content from a CSV, Excel, plain-text, or HTML file — or directly from a URL — into one or more target languages using a local LLM via Ollama or the OpenRouter API. Each translation is back-translated into the source language and scored for round-trip quality. Results are written back to the output file.
Three model roles are configured independently and may be the same or different models:
| Role | Purpose |
|---|---|
| Translator | Forward translation: source → target |
| Back-translator | Reverse translation: target → source |
| Evaluator | Scores semantic similarity of original vs back-translation (0.0–1.0) |
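The three roles compose into a simple round-trip loop. A minimal sketch of that flow, with `translate`, `back_translate`, and `score` as hypothetical stand-ins for the configured LLM calls (not the tool's actual API):

```python
def round_trip(phrase, translate, back_translate, score):
    """Run one forward/backward translation cycle and score it.

    The three callables are injected so any backend
    (Ollama, OpenRouter) could supply them.
    """
    forward = translate(phrase)      # source -> target
    back = back_translate(forward)   # target -> source
    quality = score(phrase, back)    # semantic similarity, 0.0-1.0
    return forward, back, quality

# Toy backends: a reversible "translation" round-trips perfectly.
fwd, back, q = round_trip(
    "hello",
    translate=lambda s: s.upper(),
    back_translate=lambda s: s.lower(),
    score=lambda a, b: 1.0 if a == b else 0.0,
)
```

A low `quality` value signals that meaning was lost somewhere in the forward or backward leg, which is what the score columns in the output report.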
Install uv — the package manager used to install and run the tool:
```
# macOS / Linux
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
```

**NOTE:** The Ollama mode works for demonstration purposes, but throughput limitations make it a poor fit for real translation workloads; OpenRouter-based translation is recommended for production use.
Install Ollama and pull at least one model:
```
ollama pull phi4-mini
```

The tool will attempt to start Ollama automatically if it is not already running. If you prefer to start it yourself, run `ollama serve` before launching the tool.
If you want to use OpenRouter instead of Ollama, create an account at openrouter.ai and obtain an API key. Then set it as an environment variable:
```
# macOS / Linux — add to ~/.bashrc or ~/.zshrc to make permanent
export OPENROUTER_API_KEY=your-key-here

# Windows (Command Prompt) — sets permanently for your user account
setx OPENROUTER_API_KEY your-key-here
```

**IMPORTANT:** Do not unzip the file into a folder managed by OneDrive. Copy it to a stable location on your C: drive and unzip there; otherwise uv and the installed libraries can become confused and fail to work.
Clone or unzip the project files into a new folder. Navigate to the folder in a command window and then install all dependencies from the lockfile:
```
uv sync --frozen
```

This creates a `.venv` in the project directory and installs exact locked versions with SHA-256 hash verification.
```
# navigate to your environment first
cd C:\Location\of\translator

# activate your UV environment
.venv\Scripts\activate

# Cloud models via OpenRouter (default config - recommended)
uv run translator --input phrases.csv --output results.csv

# Local models via Ollama (alternate config)
uv run translator --input phrases.csv --output results.csv --config config_ollama.toml
```

| Flag | Default | Description |
|---|---|---|
| `--input` | (optional) | Input file path (`.csv`, `.xlsx`, `.txt`, `.html`) or a URL. Omit to process all files in `input/`. |
| `--output` | (optional) | Output file path. Omit to overwrite `--input`, or write to `output/` in folder mode. Not used for HTML/URL input. |
| `--source-column` | `phrase` | CSV/Excel column containing source text. |
| `--config` | `./config.toml` | Path to config file. |
| `--verbose` | | Enable debug-level console output. |
| `--quiet` | | Suppress all console output except errors. |
| `--log-file` | `log_file.txt` | Path to log file. |
All model, language, backend, and threshold settings are configured in config.toml — see Configuration below.
On first run, the tool downloads evaluation model weights (~2–3 GB total) from HuggingFace for the sentence-transformer and BERTScore models. This is a one-time download; subsequent runs use the cached weights. Any configured Ollama models not already present are also pulled automatically.
Single file: pass --input and optionally --output directly:
```
uv run translator --input phrases.csv --output results.csv
```

Batch mode: place all your files in an `input/` folder next to the project, then run with no flags. Translated files are written to `output/` automatically:
```
translator/
├── input/
│   ├── phrases.csv
│   ├── marketing.xlsx
│   └── legal.txt
└── output/          ← created automatically
    ├── phrases.csv
    ├── marketing.xlsx
    └── legal.txt
```

```
uv run translator
```

Pass a URL or a local `.html`/`.htm` file as `--input` and the tool will translate every visible text node on the page:
```
# Translate a web page directly from its URL
uv run translator --input https://example.com/about

# Translate a local HTML file
uv run translator --input page.html
```

For each target language the tool writes two files into `output/`:
| File | Description |
|---|---|
| `{stem}_{lang}.html` | Full translated HTML with all markup preserved |
| `{stem}_summary.xlsx` | One row per text node: original, translation, back-translation, score |
The --output flag is not used for HTML/URL input — output is always written to output/.
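"Visible text node" here means text content outside of `<script>` and `<style>` elements. A minimal sketch of how such nodes can be collected with the standard library (the tool's actual HTML handling may differ):

```python
from html.parser import HTMLParser

class VisibleTextCollector(HTMLParser):
    """Collect visible text nodes, skipping script/style content."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # >0 while inside a skipped element
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.nodes.append(text)

parser = VisibleTextCollector()
parser.feed("<html><style>p{}</style><p>Hello</p><p> </p><b>World</b></html>")
# parser.nodes now holds only the user-visible strings
```

Each collected node would then be translated independently, which is why the summary workbook has one row per text node.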
Short phrases (at most `max_words` words, default 5) are cached in a local SQLite database at `cache/translations.db`. On subsequent runs any cached phrase is served instantly with no LLM call, which makes re-translating pages with repeated navigation text (menus, labels, headings) much faster. Short phrases that score below the quality threshold are accepted and cached without retry, since a two-word label is unlikely to improve on a second attempt.
To start fresh, delete the cache/ folder:
```
rm -rf cache/
```

To disable caching entirely, set `max_words = 0` in the `[cache]` section of your config file.
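The cache behaviour described above can be sketched as a word-count gate in front of the LLM call. The table and column names below are illustrative, not the tool's actual schema:

```python
import sqlite3

def cached_translate(conn, phrase, lang, translate, max_words=5):
    """Serve short phrases from the cache; fall through to the LLM otherwise."""
    if max_words == 0 or len(phrase.split()) > max_words:
        return translate(phrase)  # too long to cache: always call the model
    row = conn.execute(
        "SELECT translation FROM cache WHERE phrase=? AND lang=?",
        (phrase, lang),
    ).fetchone()
    if row:
        return row[0]  # cache hit: no LLM call
    result = translate(phrase)
    conn.execute("INSERT INTO cache VALUES (?, ?, ?)", (phrase, lang, result))
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (phrase TEXT, lang TEXT, translation TEXT)")

calls = []
def fake_llm(p):
    calls.append(p)      # record every model invocation
    return p.upper()

first = cached_translate(conn, "home page", "fr", fake_llm)
second = cached_translate(conn, "home page", "fr", fake_llm)  # served from cache
```

Note that setting `max_words=0` short-circuits straight to the model, matching the documented way to disable caching.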
The input CSV must have a column containing the source phrases (default column name: phrase).
```
id,phrase
1,The early bird catches the worm
2,Better late than never
```

Excel (`.xlsx`, `.xls`) and plain-text (`.txt`) files are also supported.
The original columns are preserved and three columns are appended per target language:
```
id,phrase,fr_translation,fr_back,fr_score,de_translation,de_back,de_score
1,The early bird catches the worm,Le lève-tôt attrape le ver,The early bird catches the worm,0.923,...
```

| Column | Description |
|---|---|
| `{lang}_translation` | Forward translation produced by the Translator model |
| `{lang}_back` | Back-translation produced by the Back-translator model |
| `{lang}_score` | Round-trip quality score, 0.0–1.0 |
Scores near 1.0 indicate the meaning was very well preserved. Scores below 0.6 suggest the translation may be unreliable.
For plain-text input, the output file begins with an average score line followed by each language's translation.
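Because the score columns follow a predictable naming pattern, low-scoring rows are easy to flag programmatically. A small sketch using only the standard library (the threshold value mirrors `warn_threshold` from the config; the CSV snippet is the documented output shape):

```python
import csv
import io

# A results snippet in the documented output shape.
results_csv = """id,phrase,fr_translation,fr_back,fr_score
1,The early bird catches the worm,Le lève-tôt attrape le ver,The early bird catches the worm,0.923
2,Break a leg,Casse une jambe,Break a leg bone,0.55
"""

WARN = 0.6  # scores below this are considered unreliable
flagged = [
    row["phrase"]
    for row in csv.DictReader(io.StringIO(results_csv))
    if float(row["fr_score"]) < WARN
]
# flagged lists the source phrases whose French round-trip scored poorly
```

The same loop works per language by swapping `fr_score` for any other `{lang}_score` column.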
All model, language, backend, and threshold settings live in config.toml. Pass --config path/to/file.toml to use a different config. Every key listed below is required — the tool will report all missing keys if any are absent rather than silently using defaults.
```
[models]
backend = "ollama"        # "ollama" or "openrouter"
translator = "phi4-mini"
backtranslator = "phi4-mini"
evaluator = "phi4-mini"
request_timeout = 60      # seconds per LLM call; raise for slow models

[ollama]
base_url = "http://localhost:11434"

[openrouter]
# Set OPENROUTER_API_KEY as an environment variable (see Prerequisites above)

[scoring]
quality_threshold = 0.7   # score at which the retry loop exits early
retries = 3               # max additional attempts per phrase/chunk
pass_threshold = 0.8      # informational: score considered passing
warn_threshold = 0.6      # informational: score considered acceptable
max_chunk_chars = 4000    # split inputs longer than this; 0 to disable

[cache]
max_words = 5             # cache phrases with this many words or fewer; 0 to disable

[languages]
source = "en"
targets = ["fr", "de", "nl", "fr-BE", "nl-BE"]
```

Four ready-made configs are included:
| File | Backend | Models |
|---|---|---|
| `config.toml` | Ollama (local) | phi4-mini for all roles |
| `config_ollama.toml` | Ollama (local) | phi4-mini for all roles |
| `config_or_cheap.toml` | OpenRouter | Claude Sonnet / GPT / DeepSeek |
| `config_or_pricey.toml` | OpenRouter | Claude Opus / GPT / Gemini |
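The `[scoring]` keys `quality_threshold` and `retries` drive a retry loop: each phrase is re-attempted until its round-trip score clears the threshold or the retry budget runs out. A sketch of one plausible policy, keeping the best-scoring attempt (this is a guess at the behaviour, not the tool's exact implementation):

```python
def translate_with_retries(phrase, attempt, quality_threshold=0.7, retries=3):
    """Retry until the round-trip score reaches quality_threshold.

    `attempt` returns a (translation, score) pair; the best-scoring
    attempt seen so far is kept.
    """
    best = attempt(phrase)
    for _ in range(retries):
        if best[1] >= quality_threshold:
            break  # good enough: exit the retry loop early
        candidate = attempt(phrase)
        if candidate[1] > best[1]:
            best = candidate
    return best

# Toy attempt function whose score improves on each call.
scores = iter([0.5, 0.65, 0.8])
result = translate_with_retries("hi", lambda p: (p.upper(), next(scores)))
```

With `retries = 3`, a phrase costs at most four forward/backward cycles, which is worth keeping in mind when budgeting OpenRouter usage.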
The project has a full test suite (unit and integration). Tests are excluded from the distribution zip to keep it lightweight, but are available in the full repository.
```
uv run pytest                 # Unit tests only (default)
uv run pytest -m integration  # Integration tests (requires live Ollama / OpenRouter)
uv run pytest -m ""           # Everything
```