The first Model Context Protocol server built specifically for Spanish dialects.
DialectOS is an open-source Spanish dialect translation server that runs as an MCP (Model Context Protocol) tool and CLI. It translates English and other languages into 25 regional Spanish variants (Mexican, Argentinian, Colombian, Puerto Rican, and more) while preserving markdown structure, enforcing glossary terms, and applying adversarial quality gates that catch semantic drift before it reaches users.
Translate, detect, and adapt content across 25 regional Spanish variants while preserving markdown structure, code comments, and locale file formatting.
Documentation · Quick Start · MCP Tools · Packages · Contributing · Roadmap
DialectOS is a Spanish localization and dialect QA system for AI agents, documentation teams, app developers, and support organizations. It provides MCP tools, CLI workflows, glossary enforcement, locale-file validation, and adversarial quality gates for regional Spanish variants.
AI discovery: llms.txt provides a compact project summary for AI assistants and search crawlers.
Best-fit searches: Spanish dialect translation MCP server, Spanish localization QA, Model Context Protocol translation tool, i18n validation CLI, regional Spanish translator, glossary enforcement, AI localization audit, Spanish launch certification.
DialectOS is available as a paid Spanish localization launch audit. We certify your Spanish docs, app strings, support macros, or locale files across target dialects and deliver an MQM-aligned launch-readiness report.
- Beta pilot: $500
- Scope: up to 10,000 source words and 5 target dialects
- Deliverables: certification report, issue list, severity table, recommended fixes, launch decision
- Sample report: audits/sample-customer-report.md
- Offer details: docs/spanish-launch-certification.md
| Feature | Google Translate | DeepL API | DialectOS |
|---|---|---|---|
| Spanish dialect awareness | ❌ Generic "Spanish" | ❌ | ✅ 25 regional variants |
| MCP native integration | ❌ | ❌ | ✅ 17 MCP tools |
| Markdown structure preservation | ❌ | ❌ | ✅ Tables, code blocks, links intact |
| i18n locale file support | ❌ | ❌ | ✅ JSON locale diff & merge |
| Gender-neutral language | ❌ | ❌ | ✅ elles / latine / -x |
| Formality checking (tú vs usted) | ❌ | ❌ | ✅ Cross-dialect consistency |
| Adversarial quality gates | ❌ | ❌ | ✅ Semantic drift + structure validation |
| LLM-first dialect adaptation | ❌ Generic MT | ❌ | ✅ Any OpenAI/Anthropic/LM Studio local LLM + dialect contracts |
| Translation validation (any provider) | ❌ | ❌ | ✅ dialectos validate (standalone correctness check) |
| GitHub CI integration | ❌ | ❌ | ✅ Composite action for PR validation |
| Auto-glossary from corrections | ❌ | ❌ | ✅ Learns from user feedback |
| Public benchmark suite | ❌ | ❌ | ✅ 205 adversarial samples across 25 dialects |
"We shipped a product to Mexico using our Spain Spanish translations. Users thought we were being intentionally rude."
Spanish is not one language; it's 25 regional variants with different vocabulary, formality levels, slang, and grammatical preferences. Existing translation tools treat Spanish as a monolith.
DialectOS solves this by:
- Understanding regional differences (es-MX vs es-ES vs es-AR vs es-CO...)
- Preserving technical document structure during translation
- Providing glossary enforcement for consistent terminology
- Adding semantic context, dialect grammar profiles, quality contracts, and quality gates that catch drift before it reaches users
- Running as an MCP server so AI assistants can translate natively
The browser demo is no longer a fake/static rule replacer. It calls a local DialectOS backend, and that backend calls the configured provider stack.
LLM_API_URL="http://127.0.0.1:1234/v1/chat/completions" \
LLM_API_FORMAT="openai" \
LLM_MODEL="your-local-model-name" \
LLM_ALLOW_LOCAL=1 \
pnpm demo

Open http://127.0.0.1:8080.

For the beginner container walkthrough, see docs/full-app-demo.md.
Add to your Claude Desktop, Cursor, or any MCP client:
{
  "mcpServers": {
    "dialectos": {
      "command": "npx",
      "args": ["-y", "@dialectos/mcp"],
      "env": {
        "LLM_API_URL": "https://your-llm-gateway/v1/chat/completions",
        "LLM_MODEL": "your-dialect-capable-model",
        "LLM_API_KEY": "your-key-if-required",
        "LLM_API_FORMAT": "openai",
        "ALLOWED_LOCALE_DIRS": "/path/to/locales"
      }
    }
  }
}

For v0.3.0, the recommended default cloud model is glm-4.5-air through the Z.ai international Anthropic-compatible endpoint. It passed basic, expanded adversarial, and long-document certification. Use glm-5.1 when you want the higher-confidence/premium option, and qwen3.5-9b via LM Studio for local/offline certification.
export LLM_API_URL="https://api.z.ai/api/anthropic/v1/messages"
export LLM_MODEL="glm-4.5-air"
export LLM_API_FORMAT="anthropic"
export LLM_API_KEY="..."

Start LM Studio's local server, then point DialectOS at any downloaded local model. LLM_API_FORMAT=lmstudio uses LM Studio's native REST API and loads the model just-in-time when needed.
LM_STUDIO_URL="http://127.0.0.1:1234" \
LLM_MODEL="publisher/model-key-or-api-identifier" \
LLM_API_FORMAT="lmstudio" \
pnpm dialect:eval -- --live --provider=llm --out=/tmp/dialectos-lmstudio-eval

Use dialect:certify for long local-model or cloud-provider runs. It writes events.jsonl, progress.json, and an incrementally updated results.json after every sample, with per-sample timeout protection.
LM_STUDIO_URL="http://127.0.0.1:1234" \
LLM_MODEL="qwen3.5-9b" \
LLM_API_FORMAT="lmstudio" \
pnpm dialect:certify -- --live --provider=llm --sample-timeout-ms=300000 --out=/tmp/dialectos-certify

Use dialect:certify:adversarial to run paraphrase, dialect-collision, taboo-copy, placeholder, register, and repeatability traps. It wraps dialect:certify and writes a failure-matrix.md plus aggregate repeatability results.

pnpm dialect:certify:adversarial -- --live --provider=llm --repeat=2 --sample-timeout-ms=300000 --out=/tmp/dialectos-adversarial

Use dialect:certify:documents to certify README/API-doc/locale JSON flows, not just sentence fixtures. It checks markdown structure, placeholders, URLs, code fences, API tables, and locale JSON outputs.

pnpm dialect:certify:documents -- --live --provider=llm --dialects=es-MX,es-PA,es-PR --out=/tmp/dialectos-doc-cert

Use dialect:report to turn certification artifacts into a customer-facing Markdown deliverable for paid launch audits.

pnpm dialect:report -- --input=audits/release-candidate-2026-04-22/model-matrix.json --out=customer-report.md --customer="Acme SaaS" --product="Spanish launch"

# Install globally
npm install -g @dialectos/cli
# Translate to Mexican Spanish
dialectos translate "Hello world" --dialect es-MX
# Translate a README preserving structure
dialectos translate-readme README.md --dialect es-AR --output README.ar.md
# Validate an existing translation
dialectos validate --source "Click the button" --translated "Haz clic en el botón" --dialect es-MX
# Validate translation files
dialectos validate --source-file en.json --translated-file es-MX.json --dialect es-MX --format json
# View translation corpus statistics
dialectos corpus stats
# Run dialect quality benchmark
dialectos benchmark run --dialects es-MX,es-AR,es-ES
# Generate glossary suggestions from corrections
dialectos glossary suggest --min-occurrences 3
# Compare two glossary versions
dialectos glossary diff glossary-v1.json glossary-v2.json
# Detect missing i18n keys
dialectos i18n detect-missing ./locales/en.json ./locales/es.json
# List all supported dialects
dialectos dialects list

git clone https://github.com/KyaniteLabs/DialectOS.git
cd DialectOS
pnpm install
pnpm build
pnpm test # 1,000+ tests passing

| Tool | Description |
|---|---|
| translate_markdown | Translate while preserving tables, code blocks, links |
| extract_translatable | Extract only translatable text from markdown |
| translate_api_docs | Translate API docs with table cell-level translation |
| create_bilingual_doc | Side-by-side bilingual documents |
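
One common way to keep code blocks intact during translation is to swap them for placeholders before the text reaches a provider and restore them afterwards. The sketch below illustrates that general idea only. The names `maskCodeFences` and `restoreCodeFences` and the `@@CODE_n@@` placeholder format are hypothetical and are not taken from the DialectOS source.

```typescript
// Hypothetical sketch: protect fenced code blocks during translation.
// None of these names come from the DialectOS codebase.

interface MaskedDocument {
  text: string; // markdown with each fenced block replaced by a placeholder
  blocks: string[]; // original fenced blocks, in order of appearance
}

// Matches three backticks, any content (non-greedy), then three backticks.
const FENCE_PATTERN = new RegExp("`{3}[\\s\\S]*?`{3}", "g");
const PLACEHOLDER_PATTERN = /@@CODE_(\d+)@@/g;

function maskCodeFences(markdown: string): MaskedDocument {
  const blocks: string[] = [];
  const text = markdown.replace(FENCE_PATTERN, (block: string) => {
    blocks.push(block);
    return `@@CODE_${blocks.length - 1}@@`;
  });
  return { text, blocks };
}

function restoreCodeFences(translated: string, blocks: string[]): string {
  return translated.replace(PLACEHOLDER_PATTERN, (placeholder: string, index: string) => {
    const block = blocks[Number(index)];
    // Leave unknown placeholders in place so a later check can flag them.
    return block === undefined ? placeholder : block;
  });
}
```

Only `text` would be sent for translation. If a placeholder goes missing in the translated output, the restored document has fewer fenced blocks than the source, which is the kind of mismatch a structure check can detect.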

| Tool | Description |
|---|---|
| detect_missing_keys | Compare locale files for missing keys |
| translate_missing_keys | Auto-translate missing keys |
| batch_translate_locales | Batch translate to multiple dialects |
| manage_dialect_variants | Create dialect-specific variants |
| check_formality | Check tú vs usted consistency |
| apply_gender_neutral | Apply gender-neutral language |
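
To make the idea behind missing-key detection concrete, here is a minimal sketch that compares two nested locale objects by flattening them into dotted key paths. It assumes locale files are nested JSON objects with string leaves. The function names and the sample data are illustrative and do not reflect the actual `detect_missing_keys` implementation.

```typescript
// Hypothetical sketch of missing-key detection between two locale objects.
type LocaleValue = string | { [key: string]: LocaleValue };
type LocaleTree = { [key: string]: LocaleValue };

// Flatten nested keys into dotted paths, e.g. { a: { b: "x" } } -> ["a.b"].
function flattenKeys(tree: LocaleTree, prefix = ""): string[] {
  const paths: string[] = [];
  for (const [key, value] of Object.entries(tree)) {
    const path = prefix === "" ? key : `${prefix}.${key}`;
    if (typeof value === "string") {
      paths.push(path);
    } else {
      paths.push(...flattenKeys(value, path));
    }
  }
  return paths;
}

// Return every dotted path present in `source` but absent from `target`.
function findMissingKeys(source: LocaleTree, target: LocaleTree): string[] {
  const targetKeys = new Set(flattenKeys(target));
  return flattenKeys(source).filter((path) => !targetKeys.has(path));
}

// Made-up sample data for illustration.
const en: LocaleTree = { nav: { home: "Home", settings: "Settings" }, save: "Save" };
const esMX: LocaleTree = { nav: { home: "Inicio" }, save: "Guardar" };
const missing = findMissingKeys(en, esMX); // ["nav.settings"]
```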

| Tool | Description |
|---|---|
| translate_text | Translate with semantic context, grammar profiles, and quality contracts |
| detect_dialect | Detect dialect from sample text |
| translate_code_comment | Translate comments, preserve code |
| translate_readme | Full README translation pipeline |
| search_glossary | Search 300+ source-attributed glossary terms |
| list_dialects | List all 25 supported dialects |
| research_regional_term | Research source-backed regional lexeme proposals without mutating runtime data |

| Package | Version | Description | Tests |
|---|---|---|---|
| @dialectos/mcp | 0.3.0 | 17 MCP tools (stdio server) | 86 |
| @dialectos/cli | 0.3.0 | CLI: translate, validate, corpus, benchmark, glossary | 545 |
| @dialectos/providers | 0.3.0 | LLM, DeepL, LibreTranslate, MyMemory with circuit breaker + corpus | 152 |
| @dialectos/security | 0.3.0 | Rate limiting, SSRF protection, sanitization | 68 |
| @dialectos/types | 0.3.0 | Shared TypeScript types + glossary, profile, certification, and quality data | 54 |
| @dialectos/locale-utils | 0.3.0 | Locale file diff/merge utilities | 55 |
| @dialectos/markdown-parser | 0.3.0 | Structure-preserving markdown parser | 74 |

Total: 1034 tests across 7 packages plus the full-app docs, demo-server, and static-hardening contracts
DialectOS has undergone adversarial security hardening:
- 18 CVEs resolved via dependency overrides
- SSRF protection on all provider endpoints
- Circuit breaker with half-open probe locks
- Atomic checkpoint writes with schema versioning
- HTML injection detection in translated output
- Semantic drift scoring: catches "looks valid but meaning changed" cases
- Provider capability negotiation: validates language support before API calls
- Chaos harness for deterministic resilience testing
See SECURITY.md for details.
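
For readers unfamiliar with the "half-open probe lock" pattern, the sketch below shows one way such a circuit breaker can work: after a cooldown, exactly one trial request is allowed through, and concurrent callers are rejected until it settles. The class, its parameters, and the error messages are hypothetical; the real DialectOS breaker may differ.

```typescript
// Hypothetical sketch of a circuit breaker with a half-open probe lock.
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private probeInFlight = false;
  private readonly failureThreshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, cooldownMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (this.now() - this.openedAt < this.cooldownMs) {
        throw new Error("circuit open");
      }
      this.state = "half-open"; // cooldown elapsed: allow a trial request
    }
    let isProbe = false;
    if (this.state === "half-open") {
      // Probe lock: only one trial request may run while half-open.
      if (this.probeInFlight) throw new Error("probe already in flight");
      this.probeInFlight = true;
      isProbe = true;
    }
    try {
      const result = await operation();
      this.state = "closed";
      this.failures = 0;
      return result;
    } catch (error) {
      this.failures += 1;
      // A failed probe, or too many failures, (re)opens the circuit.
      if (isProbe || this.failures >= this.failureThreshold) {
        this.state = "open";
        this.openedAt = this.now();
      }
      throw error;
    } finally {
      if (isProbe) this.probeInFlight = false;
    }
  }
}
```

The injectable `now` function is a design choice that makes the cooldown deterministic in tests, in the same spirit as the chaos harness mentioned above.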

| Code | Region | Example Difference |
|---|---|---|
| es-ES | Spain | "Coche" (car), "Ordenador" (computer) |
| es-MX | Mexico | "Carro", "Computadora" |
| es-AR | Argentina | "Auto", "Computadora", "Che" |
| es-CO | Colombia | "Carro", "Computador", "Chévere" |
| es-CL | Chile | "Auto", "Computadora", "Caleta" |
| es-PE | Peru | "Carro", "Computadora", "Pe" |
| es-VE | Venezuela | "Carro", "Computadora", "Chamo" |
| es-UY | Uruguay | "Auto", "Computadora", "Bo" |
| es-GQ | Equatorial Guinea | "Carro", "Camisola", "Bacalao" |
| es-US | United States | "Carro", "Computadora", "Pocha" |
| es-PH | Philippines (Chavacano) | "Carro", "Jendeh", "Kame" |
| es-BZ | Belize | "Carro", "Breki", "Kriol" |
| es-AD | Andorra | "Carro", "Madriu", "Caldea" |

...and 12 more. Full list via dialectos dialects list.
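
As a small illustration of why dialect codes matter, the following sketch encodes the "car" and "computer" examples from the first five rows of the table above as a typed lookup. The data structure and the `regionalTerm` helper are hypothetical and are not how DialectOS stores its glossary.

```typescript
// Illustrative lookup built only from the example rows in the table above.
type DialectCode = "es-ES" | "es-MX" | "es-AR" | "es-CO" | "es-CL";
type Concept = "car" | "computer";

const REGIONAL_TERMS: Record<Concept, Record<DialectCode, string>> = {
  car: {
    "es-ES": "coche",
    "es-MX": "carro",
    "es-AR": "auto",
    "es-CO": "carro",
    "es-CL": "auto",
  },
  computer: {
    "es-ES": "ordenador",
    "es-MX": "computadora",
    "es-AR": "computadora",
    "es-CO": "computador",
    "es-CL": "computadora",
  },
};

// Pick the regional word for a concept in a given dialect.
function regionalTerm(concept: Concept, dialect: DialectCode): string {
  return REGIONAL_TERMS[concept][dialect];
}

const spainCar = regionalTerm("car", "es-ES"); // "coche"
const colombiaComputer = regionalTerm("computer", "es-CO"); // "computador"
```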
┌─────────────────────────────────────────────────────────────┐
│                         MCP Client                          │
│              (Claude Desktop / Cursor / etc.)               │
└──────────────────────┬──────────────────────────────────────┘
                       │ stdio
┌──────────────────────▼──────────────────────────────────────┐
│                       @dialectos/mcp                        │
│               17 tools • JSON-RPC over stdio                │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────────┐
│                       @dialectos/cli                        │
│    translate • validate • corpus • benchmark • glossary     │
│    ├─ Policy profiles (strict/balanced/permissive)          │
│    ├─ Quality gates (token/glossary/structure/semantic)     │
│    ├─ Translation corpus + auto-glossary                    │
│    └─ Checkpoint resumption + telemetry                     │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────────┐
│                    @dialectos/providers                     │
│  ┌─────────┐  ┌─────────────────┐  ┌─────────────────┐      │
│  │   LLM   │  │      DeepL      │  │ Libre/MyMemory  │      │
│  │ Primary │  │  Paid fallback  │  │ Generic fallback│      │
│  └────┬────┘  └────────┬────────┘  └────────┬────────┘      │
│       │                │                  │                 │
│       └────────────────┴──────────────────┘                 │
│               Circuit Breaker + Rate Limiter                │
└─────────────────────────────────────────────────────────────┘
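
The provider layer in the diagram (LLM primary, DeepL as paid fallback, LibreTranslate/MyMemory as generic fallback) can be pictured as an ordered fallback chain. The sketch below is a minimal illustration under that assumption. The `TranslationProvider` interface and `translateWithFallback` function are hypothetical names, and the circuit breaker and rate limiter are left out for brevity.

```typescript
// Hypothetical sketch of an ordered provider fallback chain.
interface TranslationProvider {
  name: string;
  translate(text: string, dialect: string): Promise<string>;
}

interface FallbackResult {
  provider: string; // name of the provider that produced the output
  output: string;
}

// Try each provider in order and return the first successful translation.
async function translateWithFallback(
  providers: TranslationProvider[],
  text: string,
  dialect: string,
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const provider of providers) {
    try {
      const output = await provider.translate(text, dialect);
      return { provider: provider.name, output };
    } catch (error) {
      // Record the failure and move on to the next provider in the chain.
      const reason = error instanceof Error ? error.message : String(error);
      errors.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`all providers failed: ${errors.join("; ")}`);
}
```

Callers would pass the providers in priority order, for example `[llm, deepl, libre]`, so a generic fallback is only reached when the dialect-aware options have failed.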
Every translation passes through 4 quality dimensions:
Quality Score = tokenIntegrity × 25% + glossaryFidelity × 30% + structureIntegrity × 20% + semanticSimilarity × 25%
| Gate | What it checks | Example failure |
|---|---|---|
| Token Integrity | Protected terms preserved | "Kyanite Labs" → "Cianita Labs" |
| Glossary Fidelity | Enforced terminology used | "API" → "Interfaz" (when glossary says "API") |
| Structure Integrity | Markdown structure intact | Missing code fence, broken table |
| Semantic Similarity | Meaning not drifted | "API is down" → "Hello world" |
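
The weighted formula above can be written out as a short function. This sketch assumes each gate reports a score between 0 and 1, which the README does not state explicitly. The `GateScores` type and `qualityScore` function are illustrative names, and only the four weights come from the formula.

```typescript
// Hypothetical sketch of the weighted quality score.
// Assumption: each gate score is a number in the range 0..1.
interface GateScores {
  tokenIntegrity: number;
  glossaryFidelity: number;
  structureIntegrity: number;
  semanticSimilarity: number;
}

// Weights taken from the formula above; they sum to 100%.
const WEIGHTS: GateScores = {
  tokenIntegrity: 0.25,
  glossaryFidelity: 0.3,
  structureIntegrity: 0.2,
  semanticSimilarity: 0.25,
};

function qualityScore(scores: GateScores): number {
  return (
    scores.tokenIntegrity * WEIGHTS.tokenIntegrity +
    scores.glossaryFidelity * WEIGHTS.glossaryFidelity +
    scores.structureIntegrity * WEIGHTS.structureIntegrity +
    scores.semanticSimilarity * WEIGHTS.semanticSimilarity
  );
}

// Example: half of the enforced glossary terms were replaced.
const halfGlossary = qualityScore({
  tokenIntegrity: 1,
  glossaryFidelity: 0.5,
  structureIntegrity: 1,
  semanticSimilarity: 1,
}); // approximately 0.85
```

Because glossary fidelity carries the largest weight, a glossary violation lowers the score more than an equally severe problem in any other single gate.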
| Feature | Google Translate | DeepL | DialectOS |
|---|---|---|---|
| Spanish dialect awareness | ❌ Generic "Spanish" | ❌ | ✅ 25 regional variants |
| MCP native integration | ❌ | ❌ | ✅ 17 MCP tools |
| Markdown structure preservation | ❌ | ❌ | ✅ Tables, code blocks, links intact |
| i18n locale file support | ❌ | ❌ | ✅ JSON locale diff & merge |
| Gender-neutral language | ❌ | ❌ | ✅ elles / latine / -x |
| Formality checking (tú vs usted) | ❌ | ❌ | ✅ Cross-dialect consistency |
| Adversarial quality gates | ❌ | ❌ | ✅ Semantic drift + structure validation |
| LLM-first dialect adaptation | ❌ Generic MT | ❌ | ✅ Any OpenAI/Anthropic/LM Studio local LLM + dialect contracts |
| Translation validation (any provider) | ❌ | ❌ | ✅ dialectos validate (standalone correctness check) |
| GitHub CI integration | ❌ | ❌ | ✅ Composite action for PR validation |
| Auto-glossary from corrections | ❌ | ❌ | ✅ Learns from user feedback |
| Public benchmark suite | ❌ | ❌ | ✅ 205 adversarial samples across 25 dialects |
| Open source | ❌ | ❌ | ✅ BSL 1.1 → Apache-2.0 in 2030 |
| Free | ❌ | Partial | ✅ |

What is DialectOS? DialectOS is an open-source translation engine for Spanish regional dialects. It runs as an MCP server (for AI assistants like Claude) and a CLI tool for developers.
How is DialectOS different from Google Translate? Google Translate treats Spanish as one language. DialectOS understands 25 regional variants, preserves markdown structure, enforces glossaries, and applies quality gates that catch errors before they reach users.
What are Spanish dialects? Spanish varies significantly by country. Mexican Spanish uses "carro" for car; Spain uses "coche"; Argentina uses "auto". DialectOS handles these differences automatically.
Does DialectOS work with ChatGPT / Claude? Yes. DialectOS is an MCP server, so Claude Desktop, Cursor, Windsurf, and other MCP clients can use its 17 translation tools natively.
Is DialectOS free? Yes. It's licensed under BSL 1.1 (free for most use) and becomes Apache-2.0 on 2030-04-20.
What is MCP? Model Context Protocol is an open standard that lets AI assistants use external tools. DialectOS exposes 17 translation tools through MCP so AI agents can translate natively.
Can I use DialectOS for commercial projects? Yes. BSL 1.1 allows production use. See LICENSE for details.
How accurate is the translation? DialectOS applies 4 quality gates (token integrity, glossary fidelity, structure integrity, semantic similarity) and adversarial tests. 1034 tests verify correctness across dialects.
Add this badge to your project if you use DialectOS for translation:
[](https://github.com/KyaniteLabs/DialectOS)

Validate Spanish translations in CI on every pull request:

- uses: KyaniteLabs/DialectOS/action@v0.3.0
  with:
    dialect: es-MX
    source-dir: locales/en
    target-patterns: 'locales/es-MX/*.json'
    fail-on-blocking: true

Multi-dialect matrix:

strategy:
  matrix:
    dialect: [es-ES, es-MX, es-AR, es-CO]
steps:
  - uses: KyaniteLabs/DialectOS/action@v0.3.0
    with:
      dialect: ${{ matrix.dialect }}
      fail-on-blocking: true

See docs/github-action.md for full configuration options.
We welcome contributors! See CONTRIBUTING.md for:
- Setting up your development environment
- Running the test suite
- Submitting pull requests
- Code style guidelines
Good first issues are tagged "good first issue" and are a good starting point for newcomers.
See ROADMAP.md for upcoming features including:
- Portuguese dialect support (pt-BR, pt-PT)
- Real-time collaborative translation
- Custom provider plugins
- OpenAI-compatible, Anthropic-compatible, and LM Studio local gateways via LLM_API_URL/LM_STUDIO_URL + LLM_MODEL + LLM_API_FORMAT
- VS Code extension
BSL 1.1; see LICENSE for details. The Licensed Work will become available under Apache-2.0 on 2030-04-20.
Made with ❤️ by KyaniteLabs and contributors.
Star ⭐ this repo if it helps your project!