Godot-AI-Localizer

A generic, project-agnostic localization pipeline for Godot games. Translates per-locale JSON files via Google Translate or a local LLM (LM Studio / Ollama), verifies parity, and gives you an at-a-glance dashboard of which locales are done.

Zero token cost for the local-LLM path. Runs entirely on your machine.

Works with any OpenAI-compatible local model — Qwen2.5, Llama 3.1/3.2, Mistral, Gemma 2, Aya and more. On first run it detects your CPU/RAM/GPU and recommends one that fits, with one-click setup.

Cross-platform: tested on Windows, macOS (incl. Apple Silicon), and Linux in CI.

Not tied to any one game. Adapt to your project by editing one config.yaml.

On first run it probes your hardware and lets you pick from the models your local LM Studio / Ollama already have loaded, with a one-click apply.

The whole workflow lives on one page: from a single-string smoke test, through scanning and translating UI JSON modules, to the six quality-assurance parity gates, brand sweeps, checkpoint maintenance, and registering a new locale.

What this is (and isn't)

It is: a tool that translates <localized_dir>/<source_locale>/*.json into the same shape under every other locale's dir, preserving placeholders / BBCode / printf tokens / brand literals you declare. Six inline verification gates catch the usual translator mistakes before you ship.

It isn't: tied to any game's content schema. There's no ticket pipeline, no masterlist generator, no GDScript-checklist printer. If your project needs an opinionated content-schema integration, fork this repo and add it.

Quick start

git clone https://github.com/reprodev/Godot-AI-Localizer.git godotlocalizer
cd godotlocalizer

# Install (Python 3.11+). Local-LLM users (LM Studio / Ollama) need nothing extra.
pip install -e .

# Optional: only if you want the paid Google Cloud Translation backend.
# This pulls the large google-cloud-translate (grpc/protobuf) tree.
pip install -e ".[google]"
# Windows (PowerShell)
$env:GOOGLE_APPLICATION_CREDENTIALS = "C:\path\to\service-account.json"
# macOS / Linux
# export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json

# Run (the installed console script, or `python -m godotlocalizer`)
godotlocalizer
# → http://localhost:8997

Important

Directory Scope: Always run python -m godotlocalizer from the repository root (the Godot-AI-Localizer checkout), not from inside the godotlocalizer/ package subdirectory. If you run it from inside the subdirectory, Python cannot resolve the package and fails with No module named godotlocalizer.

Use --port N to change the bind port.

Try an example first

Not ready to point it at your own game? Four self-contained, openable Godot 4 example projects live in examples/ — a basic menu, a visual novel, a fan translation (source-extracted from GDScript), and a multi-locale RPG. Run one with:

godotlocalizer --config examples/visual-novel/config.yaml

Indexing, the parity gates, source scan/generate, and the brand-fix sweep all work without a model running; only the actual translate step needs LM Studio / Ollama. See examples/README.md for the full menu.

The examples, in the dashboard

These aren't mockups — they're the four bundled projects under examples/, each opened with godotlocalizer --config examples/<name>/config.yaml. Every example sets repo_root: ".", so the tool points at that example's own folder (you can see the path in the Configuration Manager line of each shot) and the locale grid reflects the real JSON shipped in that folder. Point the same --config flag at your game's config and the dashboard looks exactly like this, with your locales and your completion state. The screenshots below were taken with no model running — only the final translate step needs LM Studio / Ollama.

basic-menu: the 60-second hello-world. One target locale (fr), nothing translated yet, so the grid shows 0 / 15 UI keys and the assistant's next step is "Translate Français". This is the empty starting point every new project begins from.

visual-novel — a partially-translated locale. Japanese (ja, a non-Latin/CJK locale) is half done — 11 / 23 UI keys — which is exactly when the English-fallback overlay matters: untranslated keys still render in English instead of going blank. Note the per-locale "last touched" date.

fan-translation — extracting source strings from GDScript. This project has no English JSON to begin with; the canonical strings live in const _EN dicts in scripts/GameData*.gd. The Source Strings panel audits the code: GameDataItems.gd is uncaptured (it would ship untranslated), while GameDataDialogue.gd and GameDataMenu.gd are mapped. Generate then writes the English JSON from that code so your template can't drift behind the game.

rpg-quests — multi-locale at a glance. Two targets: German (de) partway done at 9 / 18, Spanish (es) not started at 0 / 18. This is also the brand-protection example (AetherForge must read identically everywhere), so it's where the parity gates and the brand-fix sweep earn their keep.

The Repo root in these shots is shown as a placeholder (‹your game project folder›). When you run an example it's that example's own folder; for your own game it's wherever your project lives.

Wiring it into your game

Godot-AI-Localizer only produces the per-locale JSON — <localized_dir>/<locale>/<category>.json. Loading those strings at runtime is a few lines of GDScript. The pattern: add one autoload that loads English first as a fallback, overlays the active locale on top (so a half-translated locale still shows English instead of a blank), and looks strings up by "category/dotted.key".

Add an autoload. Save this as res://scripts/localization.gd, then register it in Project → Project Settings → Autoload with the name L (so you can call L.t(...) anywhere).

extends Node
## Runtime loader for the per-locale JSON Godot-AI-Localizer writes.
const SOURCE_LOCALE := "en"

var locale := SOURCE_LOCALE
var _data: Dictionary = {}   # category -> parsed JSON tree

func _ready() -> void:
    set_locale(SOURCE_LOCALE)

func set_locale(code: String) -> void:
    locale = code
    _data.clear()
    _load_locale(SOURCE_LOCALE)            # base / fallback
    if code != SOURCE_LOCALE:
        _load_locale(code)                 # overlay the active locale

func _load_locale(code: String) -> void:
    var dir_path := "res://data/localized/%s" % code
    var d := DirAccess.open(dir_path)
    if d == null:
        return
    for f in d.get_files():
        if not f.ends_with(".json"):
            continue
        var category := f.get_basename()
        var parsed: Variant = JSON.parse_string(FileAccess.get_file_as_string("%s/%s" % [dir_path, f]))
        if not (parsed is Dictionary):
            continue
        _data[category] = _data.get(category, {})
        _merge(_data[category], parsed)

func _merge(base: Dictionary, overlay: Dictionary) -> void:
    for k in overlay:
        if base.has(k) and base[k] is Dictionary and overlay[k] is Dictionary:
            _merge(base[k], overlay[k])
        else:
            base[k] = overlay[k]

## Look up by "category/key.path" (dots or slashes). Returns the path itself if
## missing, so a typo is visible rather than crashing.
func t(path: String) -> Variant:
    var node: Variant = _data
    for part in path.replace(".", "/").split("/", false):
        if node is Dictionary and node.has(part):
            node = node[part]
        else:
            return path
    return node

Use it. Where you'd hard-code English, call the loader instead:

$Label.text = L.t("menu/title")        # data/localized/<locale>/menu.json → { "title": "..." }
$Button.text = L.t("hud.start_game")   # dots and slashes both work

Switch language at runtime from your settings screen:

L.set_locale("fr")   # re-render your UI afterwards (re-call the L.t(...) assignments)

That's the whole integration. A complete, runnable version ships at examples/basic-menu/scripts/localization.gd (and every other project under examples/). Keep your category filenames and key paths stable — they're the contract between the JSON the tool writes and the L.t(...) calls in your game.

Docker (optional)

Prefer a container? A slim image is included. The dashboard runs in the container; your LLM stays on the host and is reached via host.docker.internal.

# 1. Put a config.yaml in the project you want to translate
#    (copy docker/config.example.yaml — note the host.docker.internal base URLs).
# 2. Point the ./:/project mount in docker-compose.yml at that project.
# 3. With LM Studio / Ollama running on the host:
docker compose up --build      # → http://localhost:8997

The image is built slim (no Google backend); add it in a derived image with pip install .[google].
On Linux, host.docker.internal needs the extra_hosts: ["host.docker.internal:host-gateway"] already set in docker-compose.yml (harmless on Docker Desktop).
Checkpoints + differential hashes persist in the gl_state volume (GODOTLOCALIZER_STATE_DIR=/state).

Setting up & Configuration (Web UI)

Godot-AI-Localizer provides a fully interactive Configuration Manager directly in the browser, eliminating the need to manually write or format raw YAML configuration files.

Step-by-Step Setup Guide:

1. Start the Web App: Run python -m godotlocalizer and open your browser to http://localhost:8997.
2. Open the Config Editor: Click the Edit Settings button on the Configuration Manager card.
3. Set Project Layout:
   - Repository Root Path: Specify where your main Godot project lives relative to this tool (e.g., ../.. if Godot-AI-Localizer is in a tools/ folder) or use an absolute path. Click Browse… to open an in-app folder picker and navigate to any folder on disk. The chosen absolute path is filled in for you, and a ✓/✗ badge confirms whether the folder exists as you type.
   - Localized Directory: The path inside your Godot project holding your locales (typically data/localized).
   - Source Locale: The language you write your initial UI assets in (typically en).
4. Configure Target Locales:
   - Use the Add New Locale box to register target languages by specifying their language code (e.g., pt_br or ja) and a friendly Display Name.
   - Check the special script badges if applicable: RTL for right-to-left scripts (e.g., Arabic), Non-Latin for non-Latin writing systems (e.g., CJK), or VBuffer for special text render behaviors (e.g., Hindi).
   - Delete any unwanted language by clicking the Delete button on its locale card.
5. Set Glossaries & Placeholders:
   - Placeholders: Enter variable replacement tags (like {PLAYER} or {SCORE}), one per line. They are protected during LLM translation and restored verbatim.
   - Brand Literals: Add names or IPs that shouldn't be translated.
   - Brand Replacements: Declare pairs in the format incorrect: correct (e.g., Synergy AI: Synergy-AI) to automate sweeps for brand variants.
6. Configure Translation Backends: Fill out base URLs, API keys, default models, temperature, and token limits for LM Studio, Ollama, or Google Translate.
7. Save Safely: Click Save Configuration. Settings are written to config.yaml and persist across restarts, so you only set them once. To re-run the first-launch hardware check and model picker at any time, click System Check in the Configuration Manager header.
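To make the placeholder protection in step 5 concrete, here is a minimal sketch of the mask-then-restore idea. The sentinel format and the function names protect and restore are hypothetical, chosen for illustration; the tool's own glossary module may work differently.

```python
# Hypothetical sentinel format; the tool's real masking scheme may differ.
SENTINEL = "__GL{:03d}__"


def protect(text: str, tokens: list[str]) -> tuple[str, dict[str, str]]:
    """Swap each declared token for an opaque sentinel before translation."""
    mapping: dict[str, str] = {}
    # Longest first, so a token that contains a shorter one is masked first.
    for token in sorted(tokens, key=len, reverse=True):
        if token in text:
            sentinel = SENTINEL.format(len(mapping))
            mapping[sentinel] = token
            text = text.replace(token, sentinel)
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original tokens back after the model has translated the rest."""
    for sentinel, token in mapping.items():
        text = text.replace(sentinel, token)
    return text


masked, mapping = protect("Welcome, {PLAYER}! Score: {SCORE}", ["{PLAYER}", "{SCORE}"])
translated = masked.replace("Welcome", "Bienvenue")   # stand-in for the LLM call
print(restore(translated, mapping))
```

The point of the sentinel is that the model never sees the braces, so it has nothing to "helpfully" translate or reformat.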

Tip

Safety Validation Check: When you save settings, the server runs a verification pass on a temporary file (config.temp.yaml) first. If there are any schema syntax errors or missing directories, the active configuration is not overwritten, protecting your server from crashing. If it passes, settings hot-reload in-memory instantly, updating target translation options on the fly.
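The validate-before-overwrite behaviour described in the tip can be sketched as follows. This is an illustration under assumptions, not the server's actual code: save_config_safely and the validate callback are hypothetical names, and the real check covers schema and directory validation that this sketch leaves to the caller.

```python
import os
from pathlib import Path
from typing import Callable


def save_config_safely(new_text: str, active: Path,
                       validate: Callable[[Path], None]) -> bool:
    """Write to a temp file, validate it, and only then replace the active config.

    `validate` is a hypothetical callback that raises ValueError on a bad config.
    """
    temp = active.with_name("config.temp.yaml")
    temp.write_text(new_text, encoding="utf-8")
    try:
        validate(temp)
    except ValueError:
        temp.unlink()          # rejected: the active file is never touched
        return False
    os.replace(temp, active)   # swap the validated file into place in one step
    return True
```

Because the active file is only replaced after validation succeeds, a rejected save leaves the running configuration exactly as it was.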

Backends

Backend	Cost	Speed	Notes
Google Translate v3	~$20/M chars	Very fast	Needs a Google Cloud project + service account
LM Studio (local)	Free	Slower	OpenAI-compatible at `:1234/v1` — load any chat model
Ollama (local)	Free	Slower	OpenAI-compatible at `:11434/v1`

Choosing a local model

Godot-AI-Localizer works with any OpenAI-compatible chat model. On first launch the System Check card detects your hardware (shown as guidance) and polls your running LM Studio / Ollama servers for the models they actually have loaded — pick one from that live list and click Apply, no hand-typing. A Re-scan button re-polls after you load or download a model. If no server is reachable yet, it falls back to a curated catalog (each entry showing a rough download size) so you can choose what fits your memory, then start your server and Re-scan. Re-run any time via the System Check button in the Configuration Manager.

Popular, translation-capable open models by GPU memory (Qwen2.5 is the hardware default, with strong CJK support and the documented sweet spot; pick whatever you prefer):

GPU / unified memory	Recommended	Good alternatives
~4 GB	Qwen2.5 3B	Llama 3.2 3B, Gemma 2 2B
~8 GB	Qwen2.5 7B	Llama 3.1 8B, Mistral 7B, Gemma 2 9B, Aya 23 8B
~12–16 GB	Qwen2.5 14B	Mistral Nemo 12B, Phi-3 Medium
24 GB+	Qwen2.5 32B	Gemma 2 27B, Mixtral 8x7B, Qwen2.5 72B

Balanced pick (maintainer's choice): google/gemma-4-26b-a4b in LM Studio (≈ 16 GB) — a ~4B-active MoE that runs on an 8 GB GPU with RAM offload yet translates with big-model accuracy. The System Check auto-selects it whenever it's loaded.

No dedicated GPU? CPU inference works but is slow — a 3B model stays usable, or use the Google Translate backend. Change models any time in config.yaml → backends.lmstudio.default_model (or via the settings form's Detect button, which lists the models your local server currently has loaded).

Performance profiles

The Performance & Hardware settings section (and the System Check) let you pick a profile that tunes batch size + token budget to your memory:

Profile	Batch chunk	Max tokens	For
Conservative	8	8192	Low VRAM / CPU, smallest blast radius
Balanced (default)	15	16384	The documented sweet spot
Aggressive	25	24576	Lots of VRAM; faster passes
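As a rough illustration of what a profile controls, the sketch below splits a list of strings into batches using the chunk sizes from the table. The PROFILES dict and the chunked helper are hypothetical names for this example; only the numbers come from the table above.

```python
# Values copied from the profile table; the dict layout itself is an assumption.
PROFILES = {
    "conservative": {"chunk": 8, "max_tokens": 8192},
    "balanced": {"chunk": 15, "max_tokens": 16384},
    "aggressive": {"chunk": 25, "max_tokens": 24576},
}


def chunked(strings: list[str], profile: str = "balanced") -> list[list[str]]:
    """Split strings into batches no larger than the profile's chunk size."""
    size = PROFILES[profile]["chunk"]
    return [strings[i:i + size] for i in range(0, len(strings), size)]


strings = [f"string {n}" for n in range(40)]
print([len(batch) for batch in chunked(strings, "balanced")])
```

Smaller chunks mean more requests, but a failed batch costs less to retry, which is the "smallest blast radius" trade-off the Conservative row refers to.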

The six inline parity gates

Click Run gates for any locale to run all six:

Gate	What it checks
`json_syntax`	Every target JSON parses cleanly
`key_parity`	Every leaf path in your source locale exists in the target (no missing, no extras)
`mojibake`	Regex scan for cp1252-of-UTF-8 mojibake (`Ã©` where `é` should be, etc.)
`brand_parity`	Forbidden brand spellings from your config don't appear in any target file
`placeholder_integrity`	Every `{PLACEHOLDER}` / `%s` / `[b]...[/b]` from the source survives in the translation, in the same multiplicity
`english_leak`	No translatable source leaf is byte-identical to (or missing from) the target — i.e. nothing silently ships in the source language

All six must be green to ship. The dashboard shows them as colored badges.
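To show the kind of check each gate performs, here is a minimal sketch of three of them: key_parity, placeholder_integrity, and mojibake. The function names, the token regex, and the mojibake regex are assumptions made for this example; the tool's own parity module may use different patterns and report formats.

```python
import re
from collections import Counter


def leaf_paths(tree: dict, prefix: str = ""):
    """Yield (dotted.path, value) for every non-dict leaf."""
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from leaf_paths(value, path)
        else:
            yield path, value


def key_parity(source: dict, target: dict) -> tuple[list[str], list[str]]:
    """Return (missing, extra) leaf paths of the target relative to the source."""
    src = {p for p, _ in leaf_paths(source)}
    tgt = {p for p, _ in leaf_paths(target)}
    return sorted(src - tgt), sorted(tgt - src)


# Assumed token shapes: {NAME}, printf-style %s/%d/%i/%f, and BBCode tags.
TOKEN_RE = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}|%[sdif]|\[/?[a-z]+(?:=[^\]]*)?\]")


def placeholder_integrity(source_text: str, target_text: str) -> bool:
    """True when every token appears the same number of times on both sides."""
    return Counter(TOKEN_RE.findall(source_text)) == Counter(TOKEN_RE.findall(target_text))


# Partial heuristic: "Ã" or "Â" followed by a character in U+0080..U+00BF.
MOJIBAKE_RE = re.compile("[ÃÂ][\u0080-\u00bf]")


def has_mojibake(text: str) -> bool:
    return bool(MOJIBAKE_RE.search(text))
```

Comparing token counts with Counter, rather than sets, is what enforces "the same multiplicity": a translation that drops one of two {PLAYER} occurrences still fails.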

Source-string extraction (optional)

If your project's canonical strings live in GDScript const dictionaries/arrays (rather than directly in the localized JSON this tool reads), enable source_extraction in config.yaml. A new Source Strings panel then lets you:

Scan — audit every file matched by source_glob, classifying each as captured / covered / uncaptured. The uncaptured list is the actionable one: those strings have no source JSON and would ship untranslated.
Generate / Refresh — (re)build <source_locale>/<category>.json from the matching consts, so your source template can never drift behind the code. POST /api/ui/translate with refresh_source_first: true does this automatically before a pass.

It's disabled by default. Leave it off if your source strings already live in the JSON files. Config keys: enabled, source_glob (e.g. scripts/*.gd), const_pattern (e.g. _EN), and an optional category_map.

Differential mode

After your first pass, edit any source-locale file. The next run with mode=differential only re-translates strings whose SHA-256 hash changed since the last pass. The hash sidecar lives at .godotlocalizer_state/ui_hashes_<locale>.json — never inside your project's data dirs.
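The differential idea can be sketched in a few lines: hash every source string, compare against the hashes saved after the last pass, and re-translate only the paths that differ. The function names and the flat path-to-text dict are assumptions for illustration; the actual sidecar format is not documented here.

```python
import hashlib


def sha256_of(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def changed_keys(source_leaves: dict[str, str],
                 previous_hashes: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Return (paths needing re-translation, fresh hashes to store)."""
    current = {path: sha256_of(text) for path, text in source_leaves.items()}
    changed = [p for p, h in current.items() if previous_hashes.get(p) != h]
    return changed, current


first_pass = {"menu.title": "Start", "menu.quit": "Quit"}
_, stored = changed_keys(first_pass, {})            # first pass: everything is new

edited = {"menu.title": "Start game", "menu.quit": "Quit"}
todo, stored = changed_keys(edited, stored)
print(todo)
```

New keys count as changed too, since they have no previous hash to match.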

Server Restart Utility

If you make major environment modifications or need to clear active threads/sockets, you can perform a complete process reload directly from the web interface:

Under Configuration Manager, click the Restart Server button.
Confirm the action. The server spawns a daemon thread that relaunches a fresh process deterministically as python -m godotlocalizer with your original CLI args (host/port/log-level/config) — so it works the same whether you started via the module or the installed godotlocalizer console script.
The old instance immediately exits (os._exit(0)), releasing port 8997 instantly.
Your browser will display a custom "Server Restarting..." reconnecting overlay and reload your dashboard tab within 3 seconds.

API

Endpoint	Purpose
`GET /`	Home page (status grid + all panels)
`GET /api/locales`	List configured locales + metadata
`GET /api/backends/health`	Ping local LM Studio / Ollama
`GET /api/backends/<name>/models`	List models the local server currently has loaded
`POST /api/translate/smoke`	One-string round-trip test
`GET /api/ui/scan`	Enumerate categories under the source locale
`POST /api/ui/translate`	Background job: translate every category
`GET /api/source/scan`	Audit source files: captured / covered / uncaptured (opt-in)
`POST /api/source/generate`	Regenerate source JSON from GDScript consts (opt-in)
`GET /api/jobs/<id>`	Poll a job's progress / result
`GET /api/jobs`	List recent jobs
`GET /api/parity/<target>`	Run all six gates for a locale
`GET /api/status/all`	Per-locale completion grid (UI + Parity)
`GET /api/system/probe`	Detect CPU/RAM/GPU + recommend a model & profile
`POST /api/system/apply`	Apply a chosen model + performance profile to config
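For scripting against these endpoints, a minimal standard-library sketch might look like the following. The endpoint paths come from the table; the JSON body layout (mode and refresh_source_first as top-level fields) and the helper names are assumptions, and the response shape is deliberately left uninterpreted because it is not documented here.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8997"   # default bind address from the quick start


def build_translate_request(base_url: str = BASE_URL, **fields) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/ui/translate with a JSON body."""
    return urllib.request.Request(
        f"{base_url}/api/ui/translate",
        data=json.dumps(fields).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url):
    """Send a request (or GET a URL) and decode the JSON response."""
    with urllib.request.urlopen(request_or_url, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# With the server running, a pass could be started and inspected like this:
#   job = fetch_json(build_translate_request(mode="differential", refresh_source_first=True))
#   jobs = fetch_json(f"{BASE_URL}/api/jobs")
```

Check the actual request schema in app.py before relying on the field names used above.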

Copy-to-clipboard

Every output panel has a Copy button in its header. Click it to put the panel's text on the clipboard — paste straight into Claude / ChatGPT / Gemini for a second-opinion review. The status grid copies as a clean markdown table.

Layout

Godot-AI-Localizer/
├── pyproject.toml
├── config.yaml                 # All project-specific knobs live here
├── README.md
├── godotlocalizer/
│   ├── __main__.py             # python -m godotlocalizer → FastAPI on :8997
│   ├── app.py                  # Routes
│   ├── state.py                # Checkpoint + hash sidecar
│   ├── jobs.py                 # Thread-safe job manager
│   ├── system.py               # Cross-platform HW probe + model recommendation
│   ├── backends/
│   │   ├── base.py
│   │   ├── google.py
│   │   └── openai_compat.py    # LM Studio + Ollama (same wire format)
│   ├── pipeline/
│   │   ├── glossary.py         # Token protection
│   │   ├── batch.py            # Chunking + retry + checkpoint
│   │   ├── ui_json.py          # JSON walk + translate + write-out
│   │   ├── source_extract.py   # Optional: extract source JSON from GDScript consts
│   │   ├── parity.py           # Six inline gates
│   │   └── status.py           # Per-locale completion scanner
│   ├── static/                 # CSS + JS (vanilla, no framework)
│   └── templates/              # Jinja2 HTML
└── tests/                      # pytest — 241 passing + 1 skipped

Troubleshooting

These are real failure modes you can hit when translating into Japanese, Korean, Hindi, and Urdu with a local LLM. If you hit one, the fix is here.

Mid-string truncation / JSON parse failures on CJK locales

Symptom: Batches translating into Japanese / Korean / Chinese fail all 3 retries with could not parse JSON array. The captured output ends mid-string (e.g. "새로운 커피 with no closing quote).

Cause: CJK characters tokenize at ~2–3 tokens each on most LLMs. A 30-string batch of long narrative needs 6–10k output tokens, so a max_tokens limit of 4096 cuts the response off mid-string.

Fix: Already applied in the default config.yaml (max_tokens: 16384, ui_chunk_size: 15). If you still see it, drop ui_chunk_size further to 10 or 8.
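The arithmetic behind this failure is easy to check. The sketch below estimates output tokens from character counts using the 2–3 tokens-per-character range quoted above; the 100-character average string length, the 0.8 headroom factor, and the function names are assumptions for illustration.

```python
def estimated_output_tokens(translated: list[str], tokens_per_char: float) -> int:
    """Rough token estimate for a batch of CJK output strings."""
    return int(sum(len(s) for s in translated) * tokens_per_char)


def fits(translated: list[str], max_tokens: int, headroom: float = 0.8) -> bool:
    """Use the pessimistic 3 tokens/char and leave headroom for JSON syntax."""
    return estimated_output_tokens(translated, 3.0) <= max_tokens * headroom


# 30 strings of about 100 CJK characters each (an assumed average length).
batch = ["あ" * 100] * 30
print(estimated_output_tokens(batch, 2.0))   # low end of the range
print(estimated_output_tokens(batch, 3.0))   # high end of the range
print(fits(batch, 4096), fits(batch, 16384))
```

Under these assumptions the batch lands between 6,000 and 9,000 tokens, well past a 4096 limit, which is why a smaller ui_chunk_size is the next lever when truncation persists.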

"Failed to parse input at pos 0" 400 from LM Studio

Symptom: HTTP 400 from LM Studio mid-pass, error body includes a � Unicode replacement character.

Cause: Either the model glitched on an earlier request and the KV cache is poisoned, or a prior translation pass wrote a �-containing string back to your target locale JSON.

Fix: Reload the model in LM Studio (Eject → load again — clears the KV cache). Re-run with mode=missing — the pipeline detects � in existing translations and treats those entries as needs-retry.
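If you want to see which entries are affected before re-running, a small walk over the parsed target JSON is enough. This is a standalone sketch with a hypothetical function name, not the pipeline's own detection code.

```python
REPLACEMENT_CHAR = "\ufffd"   # the Unicode replacement character


def corrupted_paths(tree: dict, prefix: str = "") -> list[str]:
    """Return dotted paths of string leaves that contain U+FFFD."""
    bad: list[str] = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            bad.extend(corrupted_paths(value, path))
        elif isinstance(value, str) and REPLACEMENT_CHAR in value:
            bad.append(path)
    return bad


target = {"menu": {"title": "スタ\ufffdト", "quit": "終了"}}
print(corrupted_paths(target))
```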

Quote confusion on inner-quoted phrases

Symptom: Batches containing source strings with inner 'PHRASE' constructs (like "NEW COFFEE FLAVOR: 'HYPER-CAFFEINE'") fail with Expecting ',' delimiter. The model confused the inner ' with a JSON " delimiter.

Fix: Already covered by the default glossary patterns inner_single_quoted and inner_double_quoted — word-boundary lookarounds protect them without eating French l'IA or English don't. If you've removed those patterns, add them back.
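To illustrate how word-boundary lookarounds separate a quoted phrase from an apostrophe, here is one possible pattern. It is an assumption written for this example, not the tool's actual inner_single_quoted definition.

```python
import re

# An opening quote must not follow a word character, and a closing quote must
# not be followed by one, so apostrophes inside words never match.
INNER_SINGLE = re.compile(r"(?<!\w)'([^'\n]+?)'(?!\w)")


def inner_quoted(text: str) -> list[str]:
    """Return the phrases wrapped in inner single quotes."""
    return INNER_SINGLE.findall(text)


print(inner_quoted("NEW COFFEE FLAVOR: 'HYPER-CAFFEINE'"))
print(inner_quoted("l'IA ne dort jamais"))
print(inner_quoted("don't stop"))
```

The first call finds the quoted phrase; the French elision and the English contraction produce no matches because their apostrophes are preceded by a letter.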

Brand-variant drift across locales

Symptom: The brand_parity gate flags forbidden spellings (e.g. translators rendered your MyBrand as My-Brand or Mybrand across multiple locales).

Fix: Populate glossary.brand_replacements in config.yaml, then click the Auto-fix brand variants panel. Dry-run shows per-file impact; Apply rewrites in place with BOM preservation and JSON validation.
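The dry-run / apply behaviour can be sketched as below: count the hits, validate that the rewritten text is still JSON, and write back with the original BOM only when asked. The function name fix_brands and its signature are hypothetical; the replacement pairs mirror the incorrect: correct format of glossary.brand_replacements.

```python
import codecs
import json
from pathlib import Path


def fix_brands(path: Path, replacements: dict[str, str], apply: bool = False) -> int:
    """Return the number of brand variants found; rewrite the file only if apply=True."""
    raw = path.read_bytes()
    has_bom = raw.startswith(codecs.BOM_UTF8)
    text = (raw[len(codecs.BOM_UTF8):] if has_bom else raw).decode("utf-8")

    hits = 0
    for wrong, right in replacements.items():
        hits += text.count(wrong)
        text = text.replace(wrong, right)

    json.loads(text)   # raises before anything is written if the rewrite broke JSON
    if apply and hits:
        bom = codecs.BOM_UTF8 if has_bom else b""
        path.write_bytes(bom + text.encode("utf-8"))
    return hits
```

Calling it with apply=False gives the per-file impact without touching disk, which corresponds to the Dry-run step; apply=True corresponds to Apply.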

Tests

pytest tests/

Critical safety nets that must stay green:

test_glossary.py — round-trip every protected token form. If this is red, real translations will be mangled.
test_parity.py — every inline gate detects its target failure mode.
test_ui_json.py — full / missing / differential modes do the right thing; corrupted target strings (U+FFFD) trigger re-translate.
test_batch.py — mid-pass crash leaves a resumable checkpoint, no string translated twice.
test_brand_fix.py — BOM-preserving auto-fix never breaks JSON syntax.

See CONTRIBUTING.md for the dev workflow and code conventions.

License

MIT — see LICENSE.

See CONTRIBUTING.md for the dev workflow and how to extend the tool (new backends, new parity gates).
