A generic, project-agnostic localization pipeline for Godot games. Translates per-locale JSON files via Google Translate or a local LLM (LM Studio / Ollama), verifies parity, and gives you an at-a-glance dashboard of which locales are done.
Zero token cost for the local-LLM path. Runs entirely on your machine.
Works with any OpenAI-compatible local model — Qwen2.5, Llama 3.1/3.2, Mistral, Gemma 2, Aya and more. On first run it detects your CPU/RAM/GPU and recommends one that fits, with one-click setup.
Cross-platform: tested on Windows, macOS (incl. Apple Silicon), and Linux in CI.
Not tied to any one game. Adapt to your project by editing one config.yaml.
On first run it probes your hardware and lets you pick from the models your local LM Studio / Ollama already have loaded — with a one-click apply:
The whole workflow lives on one page — from a single-string smoke test, through scanning and translating UI JSON modules, to the six quality-assurance parity gates, brand sweeps, checkpoint maintenance, and registering a new locale:
It is: a tool that translates <localized_dir>/<source_locale>/*.json into the same shape under every other locale's dir, preserving placeholders / BBCode / printf tokens / brand literals you declare. Six inline verification gates catch the usual translator mistakes before you ship.
It isn't: tied to any game's content schema. There's no ticket pipeline, no masterlist generator, no GDScript-checklist printer. If your project needs an opinionated content-schema integration, fork this repo and add it.
git clone https://github.com/reprodev/Godot-AI-Localizer.git godotlocalizer
cd godotlocalizer
# Install (Python 3.11+). Local-LLM users (LM Studio / Ollama) need nothing extra.
pip install -e .
# Optional: only if you want the paid Google Cloud Translation backend.
# This pulls the large google-cloud-translate (grpc/protobuf) tree.
pip install -e ".[google]"
# Windows (PowerShell)
$env:GOOGLE_APPLICATION_CREDENTIALS = "C:\path\to\service-account.json"
# macOS / Linux
# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Run (the installed console script, or `python -m godotlocalizer`)
godotlocalizer
# → http://localhost:8997Important
Directory Scope: Always execute the run command python -m godotlocalizer from the repository root folder (the Godot-AI-Localizer checkout), not from inside the godotlocalizer/ package subdirectory itself. If run from inside the subdirectory, Python's module resolver will return No module named godotlocalizer.
Use --port N to change the bind port.
Not ready to point it at your own game? Four self-contained, openable Godot 4 example
projects live in examples/ — a basic menu, a visual novel, a fan
translation (source-extracted from GDScript), and a multi-locale RPG. Run one with:
godotlocalizer --config examples/visual-novel/config.yamlIndexing, the parity gates, source scan/generate, and the brand-fix sweep all work without a model running; only the actual translate step needs LM Studio / Ollama. See examples/README.md for the full menu.
These aren't mockups — they're the four bundled projects under examples/, each
opened with godotlocalizer --config examples/<name>/config.yaml. Every example sets
repo_root: ".", so the tool points at that example's own folder (you can see the path in the
Configuration Manager line of each shot) and the locale grid reflects the real JSON shipped in
that folder. Point the same --config flag at your game's config and the dashboard looks exactly
like this, with your locales and your completion state. The screenshots below were taken with no
model running — only the final translate step needs LM Studio / Ollama.
basic-menu — the 60-second hello-world. One target locale (fr), nothing translated yet, so the
grid shows 0 / 15 UI keys and the assistant's next step is "Translate Français". This is the
empty-starting-point every new project begins from.
visual-novel — a partially-translated locale. Japanese (ja, a non-Latin/CJK locale) is half
done — 11 / 23 UI keys — which is exactly when the English-fallback overlay matters: untranslated
keys still render in English instead of going blank. Note the per-locale "last touched" date.
fan-translation — extracting source strings from GDScript. This project has no English JSON to
begin with; the canonical strings live in const _EN dicts in scripts/GameData*.gd. The Source
Strings panel audits the code: GameDataItems.gd is uncaptured (it would ship untranslated),
while GameDataDialogue.gd and GameDataMenu.gd are mapped. Generate then writes the English
JSON from that code so your template can't drift behind the game.
rpg-quests — multi-locale at a glance. Two targets: German (de) partway done at 9 / 18, Spanish
(es) not started at 0 / 18. This is also the brand-protection example (AetherForge must read
identically everywhere), so it's where the parity gates and the brand-fix sweep earn their keep.
The Repo root in these shots is shown as a placeholder (
‹your game project folder›). When you run an example it's that example's own folder; for your own game it's wherever your project lives.
Godot-AI-Localizer only produces the per-locale JSON — <localized_dir>/<locale>/<category>.json.
Loading those strings at runtime is a few lines of GDScript. The pattern: add one autoload that
loads English first as a fallback, overlays the active locale on top (so a half-translated locale
still shows English instead of a blank), and looks strings up by "category/dotted.key".
- Add an autoload. Save this as
res://scripts/localization.gd, then register it in Project → Project Settings → Autoload with the nameL(so you can callL.t(...)anywhere).
extends Node
## Runtime loader for the per-locale JSON Godot-AI-Localizer writes.
const SOURCE_LOCALE := "en"
var locale := SOURCE_LOCALE
var _data: Dictionary = {} # category -> parsed JSON tree
func _ready() -> void:
set_locale(SOURCE_LOCALE)
func set_locale(code: String) -> void:
locale = code
_data.clear()
_load_locale(SOURCE_LOCALE) # base / fallback
if code != SOURCE_LOCALE:
_load_locale(code) # overlay the active locale
func _load_locale(code: String) -> void:
var dir_path := "res://data/localized/%s" % code
var d := DirAccess.open(dir_path)
if d == null:
return
for f in d.get_files():
if not f.ends_with(".json"):
continue
var category := f.get_basename()
var parsed: Variant = JSON.parse_string(FileAccess.get_file_as_string("%s/%s" % [dir_path, f]))
if not (parsed is Dictionary):
continue
_data[category] = _data.get(category, {})
_merge(_data[category], parsed)
func _merge(base: Dictionary, overlay: Dictionary) -> void:
for k in overlay:
if base.has(k) and base[k] is Dictionary and overlay[k] is Dictionary:
_merge(base[k], overlay[k])
else:
base[k] = overlay[k]
## Look up by "category/key.path" (dots or slashes). Returns the path itself if
## missing, so a typo is visible rather than crashing.
func t(path: String) -> Variant:
var node: Variant = _data
for part in path.replace(".", "/").split("/", false):
if node is Dictionary and node.has(part):
node = node[part]
else:
return path
return node- Use it. Where you'd hard-code English, call the loader instead:
$Label.text = L.t("menu/title") # data/localized/<locale>/menu.json → { "title": "..." }
$Button.text = L.t("hud.start_game") # dots and slashes both work- Switch language at runtime from your settings screen:
L.set_locale("fr") # re-render your UI afterwards (re-call the L.t(...) assignments)That's the whole integration. A complete, runnable version ships at
examples/basic-menu/scripts/localization.gd (and every
other project under examples/). Keep your category filenames and key paths stable —
they're the contract between the JSON the tool writes and the L.t(...) calls in your game.
Prefer a container? A slim image is included. The dashboard runs in the container; your LLM
stays on the host and is reached via host.docker.internal.
# 1. Put a config.yaml in the project you want to translate
# (copy docker/config.example.yaml — note the host.docker.internal base URLs).
# 2. Point the ./:/project mount in docker-compose.yml at that project.
# 3. With LM Studio / Ollama running on the host:
docker compose up --build # → http://localhost:8997- The image is built slim (no Google backend); add it in a derived image with
pip install .[google]. - On Linux,
host.docker.internalneeds theextra_hosts: ["host.docker.internal:host-gateway"]already set indocker-compose.yml(harmless on Docker Desktop). - Checkpoints + differential hashes persist in the
gl_statevolume (GODOTLOCALIZER_STATE_DIR=/state).
Godot-AI-Localizer provides a fully interactive Configuration Manager directly in the browser, eliminating the need to manually write or format raw YAML configuration files.
- Start the Web App: Run
python -m godotlocalizerand open your browser tohttp://localhost:8997. - Open the Config Editor: Click the Edit Settings button on the Configuration Manager card.
- Set Project Layout:
- Repository Root Path: Specify where your main Godot project lives relative to this tool (e.g.,
../..if Godot-AI-Localizer is in atools/folder) or use an absolute path. Click Browse… to open an in-app folder picker and navigate your machine to any folder on disk — the chosen absolute path is filled in for you, and a ✓/✗ badge confirms the folder exists as you type. - Localized Directory: The path inside your Godot project holding your locales (typically
data/localized). - Source Locale: The language you write your initial UI assets in (typically
en).
- Repository Root Path: Specify where your main Godot project lives relative to this tool (e.g.,
- Configure Target Locales:
- Use the Add New Locale box to register target languages by specifying their language code (e.g.,
pt_brorja) and a friendly Display Name. - Check special scripting badges if applicable: RTL for right-to-left scripts (e.g., Arabic), Non-Latin for character systems (e.g., CJK), or VBuffer for special text render behaviors (e.g., Hindi).
- Delete any unwanted languages instantly by clicking the Delete button next to each locale card.
- Use the Add New Locale box to register target languages by specifying their language code (e.g.,
- Set Glossaries & Placeholders:
- Placeholders: Input variable replacement tags (like
{PLAYER}or{SCORE}), one per line. They will be protected during LLM translations and restored verbatim. - Brand Literals: Add names or IPs that shouldn't be translated.
- Brand Replacements: Declare pairs in the format
incorrect: correct(e.g.,Synergy AI: Synergy-AI) to automate sweeps for brand variants.
- Placeholders: Input variable replacement tags (like
- Configure Translation Backends: Fill out base URLs, API keys, default models, temperature, and tokens for LM Studio, Ollama, or Google Translate.
- Save Safely: Click Save Configuration. Settings are written to
config.yamland persist across restarts — you only set them once. To re-run the first-launch hardware check and model picker at any time, click System Check in the Configuration Manager header.
Tip
Safety Validation Check: When you save settings, the server runs a verification pass on a temporary file (config.temp.yaml) first. If there are any schema syntax errors or missing directories, the active configuration is not overwritten, protecting your server from crashing. If it passes, settings hot-reload in-memory instantly, updating target translation options on the fly.
| Backend | Cost | Speed | Notes |
|---|---|---|---|
| Google Translate v3 | ~$20/M chars | Very fast | Needs a Google Cloud project + service account |
| LM Studio (local) | Free | Slower | OpenAI-compatible at :1234/v1 — load any chat model |
| Ollama (local) | Free | Slower | OpenAI-compatible at :11434/v1 |
Godot-AI-Localizer works with any OpenAI-compatible chat model. On first launch the System Check card detects your hardware (shown as guidance) and polls your running LM Studio / Ollama servers for the models they actually have loaded — pick one from that live list and click Apply, no hand-typing. A Re-scan button re-polls after you load or download a model. If no server is reachable yet, it falls back to a curated catalog (each entry showing a rough download size) so you can choose what fits your memory, then start your server and Re-scan. Re-run any time via the System Check button in the Configuration Manager.
Popular, translation-capable open models by GPU size (Qwen2.5 is the hardware default — strong CJK and the documented sweet spot; pick whatever you prefer):
| GPU / unified memory | Recommended | Good alternatives |
|---|---|---|
| ~4 GB | Qwen2.5 3B | Llama 3.2 3B, Gemma 2 2B |
| ~8 GB | Qwen2.5 7B | Llama 3.1 8B, Mistral 7B, Gemma 2 9B, Aya 23 8B |
| ~12–16 GB | Qwen2.5 14B | Mistral Nemo 12B, Phi-3 Medium |
| 24 GB+ | Qwen2.5 32B | Gemma 2 27B, Mixtral 8x7B, Qwen2.5 72B |
Balanced pick (maintainer's choice): google/gemma-4-26b-a4b in LM Studio (≈ 16 GB) —
a ~4B-active MoE that runs on an 8 GB GPU with RAM offload yet translates with big-model
accuracy. The System Check auto-selects it whenever it's loaded.
No dedicated GPU? CPU inference works but is slow — a 3B model stays usable, or use the
Google Translate backend. Change models any time in config.yaml →
backends.lmstudio.default_model (or via the settings form's Detect button, which
lists the models your local server currently has loaded).
The Performance & Hardware settings section (and the System Check) let you pick a profile that tunes batch size + token budget to your memory:
| Profile | Batch chunk | Max tokens | For |
|---|---|---|---|
| Conservative | 8 | 8192 | Low VRAM / CPU, smallest blast radius |
| Balanced (default) | 15 | 16384 | The documented sweet spot |
| Aggressive | 25 | 24576 | Lots of VRAM; faster passes |
Click Run gates for any locale to run all six:
| Gate | What it checks |
|---|---|
json_syntax |
Every target JSON parses cleanly |
key_parity |
Every leaf path in your source locale exists in the target (no missing, no extras) |
mojibake |
Regex scan for cp1252-of-UTF-8 mojibake (é where é should be, etc.) |
brand_parity |
Forbidden brand spellings from your config don't appear in any target file |
placeholder_integrity |
Every {PLACEHOLDER} / %s / [b]...[/b] from the source survives in the translation, in the same multiplicity |
english_leak |
No translatable source leaf is byte-identical to (or missing from) the target — i.e. nothing silently ships in the source language |
All six must be green to ship. The dashboard shows them as colored badges.
If your project's canonical strings live in GDScript const dictionaries/arrays (rather than directly in the localized JSON this tool reads), enable source_extraction in config.yaml. A new Source Strings panel then lets you:
- Scan — audit every file matched by
source_glob, classifying each as captured / covered / uncaptured. The uncaptured list is the actionable one: those strings have no source JSON and would ship untranslated. - Generate / Refresh — (re)build
<source_locale>/<category>.jsonfrom the matching consts, so your source template can never drift behind the code.POST /api/ui/translatewithrefresh_source_first: truedoes this automatically before a pass.
It's disabled by default. Leave it off if your source strings already live in the JSON files. Config keys: enabled, source_glob (e.g. scripts/*.gd), const_pattern (e.g. _EN), and an optional category_map.
After your first pass, edit any source-locale file. The next run with mode=differential only re-translates strings whose SHA-256 hash changed since the last pass. The hash sidecar lives at .godotlocalizer_state/ui_hashes_<locale>.json — never inside your project's data dirs.
If you make major environment modifications or need to clear active threads/sockets, you can perform a complete process reload directly from the web interface:
- Under Configuration Manager, click the Restart Server button.
- Confirm the action. The server spawns a daemon thread that relaunches a fresh process deterministically as
python -m godotlocalizerwith your original CLI args (host/port/log-level/config) — so it works the same whether you started via the module or the installedgodotlocalizerconsole script. - The old instance immediately exits (
os._exit(0)), releasing port8997instantly. - Your browser will display a custom "Server Restarting..." reconnecting overlay and reload your dashboard tab within 3 seconds.
| Endpoint | Purpose |
|---|---|
GET / |
Home page (status grid + all panels) |
GET /api/locales |
List configured locales + metadata |
GET /api/backends/health |
Ping local LM Studio / Ollama |
GET /api/backends/<name>/models |
List models the local server currently has loaded |
POST /api/translate/smoke |
One-string round-trip test |
GET /api/ui/scan |
Enumerate categories under the source locale |
POST /api/ui/translate |
Background job: translate every category |
GET /api/source/scan |
Audit source files: captured / covered / uncaptured (opt-in) |
POST /api/source/generate |
Regenerate source JSON from GDScript consts (opt-in) |
GET /api/jobs/<id> |
Poll a job's progress / result |
GET /api/jobs |
List recent jobs |
GET /api/parity/<target> |
Run all six gates for a locale |
GET /api/status/all |
Per-locale completion grid (UI + Parity) |
GET /api/system/probe |
Detect CPU/RAM/GPU + recommend a model & profile |
POST /api/system/apply |
Apply a chosen model + performance profile to config |
Every output panel has a Copy button in its header. Click it to put the panel's text on the clipboard — paste straight into Claude / ChatGPT / Gemini for a second-opinion review. The status grid copies as a clean markdown table.
Godot-AI-Localizer/
├── pyproject.toml
├── config.yaml # All project-specific knobs live here
├── README.md
├── godotlocalizer/
│ ├── __main__.py # python -m godotlocalizer → FastAPI on :8997
│ ├── app.py # Routes
│ ├── state.py # Checkpoint + hash sidecar
│ ├── jobs.py # Thread-safe job manager
│ ├── system.py # Cross-platform HW probe + model recommendation
│ ├── backends/
│ │ ├── base.py
│ │ ├── google.py
│ │ └── openai_compat.py # LM Studio + Ollama (same wire format)
│ ├── pipeline/
│ │ ├── glossary.py # Token protection
│ │ ├── batch.py # Chunking + retry + checkpoint
│ │ ├── ui_json.py # JSON walk + translate + write-out
│ │ ├── source_extract.py # Optional: extract source JSON from GDScript consts
│ │ ├── parity.py # Six inline gates
│ │ └── status.py # Per-locale completion scanner
│ ├── static/ # CSS + JS (vanilla, no framework)
│ └── templates/ # Jinja2 HTML
└── tests/ # pytest — 241 passing + 1 skipped
These are real failure modes you can hit when translating into Japanese, Korean, Hindi, and Urdu with a local LLM. If you hit one, the fix is here.
Symptom: Batches translating into Japanese / Korean / Chinese fail all 3 retries with could not parse JSON array. The captured output ends mid-string (e.g. "새로운 커피 with no closing quote).
Cause: CJK characters tokenize at ~2–3 tokens each on most LLMs. A 30-string batch of long narrative needs 6–10k output tokens; the default max_tokens: 4096 truncates.
Fix: Already applied in the default config.yaml (max_tokens: 16384, ui_chunk_size: 15). If you still see it, drop ui_chunk_size further to 10 or 8.
Symptom: HTTP 400 from LM Studio mid-pass, error body includes a � Unicode replacement character.
Cause: The model glitched on an earlier request and the KV cache is poisoned. Or a prior translation pass wrote a �-containing string back to your target locale JSON.
Fix: Reload the model in LM Studio (Eject → load again — clears the KV cache). Re-run with mode=missing — the pipeline detects � in existing translations and treats those entries as needs-retry.
Symptom: Batches containing source strings with inner 'PHRASE' constructs (like "NEW COFFEE FLAVOR: 'HYPER-CAFFEINE'") fail with Expecting ',' delimiter. The model confused the inner ' with a JSON " delimiter.
Fix: Already covered by the default glossary patterns inner_single_quoted and inner_double_quoted — word-boundary lookarounds protect them without eating French l'IA or English don't. If you've removed those patterns, add them back.
Symptom: The brand_parity gate flags forbidden spellings (e.g. translators rendered your MyBrand as My-Brand or Mybrand across multiple locales).
Fix: Populate glossary.brand_replacements in config.yaml, then click the Auto-fix brand variants panel. Dry-run shows per-file impact; Apply rewrites in place with BOM preservation and JSON validation.
pytest tests/Critical safety nets that must stay green:
test_glossary.py— round-trip every protected token form. If this is red, real translations will be mangled.test_parity.py— every inline gate detects its target failure mode.test_ui_json.py— full / missing / differential modes do the right thing; corrupted target strings (U+FFFD) trigger re-translate.test_batch.py— mid-pass crash leaves a resumable checkpoint, no string translated twice.test_brand_fix.py— BOM-preserving auto-fix never breaks JSON syntax.
See CONTRIBUTING.md for the dev workflow and code conventions.
MIT — see LICENSE.
See CONTRIBUTING.md for the dev workflow and how to extend the tool (new backends, new parity gates).






