feat(ai-translation): add writer → reviewer → editor pipeline#2739

Merged. Innei merged 4 commits into master from feat/translation-review-pipeline on May 27, 2026.

Innei commented May 27, 2026

Summary

Three-step translation pipeline (writer → reviewer → editor) that critiques the initial translation and revises flagged segments to read natively. Reviewer is critique-only with an independent runtime; editor returns segment patches merged into the result before a single DB write.

Changes

  • Pipeline shape — sequential write → review → edit (rejected agent-loop / reviewer-rewrite / iterate-to-threshold)
  • Lexical strategy — drop `MAX_BATCH_TOKENS` batching, single `callWriter` per pass; keep incremental block reuse
  • Markdown strategy — fence-aware paragraph splitter; editor patches target `text:p` paragraph IDs not the whole `text` field
  • Reviewer service — structured-output + text-mode fallback; sanitizes issues against `ALLOWED_IDS`
  • Prompts — `translation-reviewer.system.md` / `translation-editor.system.md` extracted to .md, imported via Vite `?raw` + `{{var}}` template renderer
  • Config — `enableTranslationReview`, `translationReviewModel`, `translationReviewScoreThreshold`, `AIFeatureKey.TranslationReview` (falls back to translator model)
  • Metrics out-param — `PipelineMetrics` exposes per-stage timing, score, issue IDs, patch keys, before/after samples for tests + observability
  • Incremental + review — reviewer sees full translated text (reused + new), but constrained to flag only changed segment IDs; full-reuse path skips review entirely (T1)
  • Live e2e — `pnpm test:live` (OpenRouter) / `pnpm test:live:local` (LM Studio) / `pnpm test:live:mix` / `pnpm test:live:deepseek-mix`; writes `tmp/translation-review-smoke-.md` quality report
  • bench-translation-prompt.ts — rewritten to drive real strategies + runtimes for A/B prompt comparison
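
The reviewer constraints listed above (issues sanitized against `ALLOWED_IDS`, incremental passes limited to changed segment IDs, full reuse skipping review) can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ReviewIssue`, `sanitizeIssues`, and `allowedIdsForPass` are hypothetical names and shapes.

```typescript
// Hypothetical issue shape; the PR does not show the real type.
interface ReviewIssue {
  segmentId: string
  comment: string
}

// Assumption: on an incremental pass, only segments translated in this
// pass (i.e. not reused from a previous translation) may be flagged.
function allowedIdsForPass(allIds: string[], reusedIds: string[]): string[] {
  const reused = new Set(reusedIds)
  return allIds.filter((id) => !reused.has(id))
}

// Drop any issue that points at a segment the reviewer was not allowed
// to flag, e.g. a hallucinated ID or a reused segment.
function sanitizeIssues(issues: ReviewIssue[], allowedIds: string[]): ReviewIssue[] {
  const allowed = new Set(allowedIds)
  return issues.filter((issue) => allowed.has(issue.segmentId))
}

// Full-reuse path: nothing changed, so there is nothing to review.
function shouldRunReview(allowedIds: string[]): boolean {
  return allowedIds.length > 0
}
```

Filtering after the model responds, rather than trusting the prompt alone, keeps a misbehaving reviewer from causing edits to segments that were reused unchanged.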

Spec

`docs/superpowers/specs/2026-05-27-translation-review-pipeline-design.md`

Test plan

  • `pnpm test` ai module 178 unit tests pass
  • `tsc --noEmit` clean
  • `eslint` clean (single `prefer-regex-literals` disable on `{{var}}` template regex)
  • markdown paragraph splitter 8/8 spec passes (fence, nested lists, round-trip, patch)
  • base-translation-strategy.spec passes after `callChunkTranslation` → `callWriter` rename
  • Live e2e against deepseek + GPT-5.5 — Case 1 zh→en quality probe 90 (Δ +8, issues -3), Case 3 markdown structure 5/5 preserved
  • Manual rollout: `enableTranslationReview` off by default in config; enable per-deployment to validate
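
For readers unfamiliar with the fence-aware splitting that the paragraph splitter spec exercises, here is a minimal sketch of the idea. It is not the PR's implementation: it handles only fenced code blocks (not nested lists), `splitParagraphs` and `applyParagraphPatches` are hypothetical names, and the numbered ID format `text:p0`, `text:p1`, ... is an assumption built on the `text:p` prefix mentioned in the description.

```typescript
interface Paragraph {
  id: string
  content: string
}

// Split on blank lines, but never inside a fenced code block, so a code
// sample containing empty lines stays one paragraph.
function splitParagraphs(text: string): Paragraph[] {
  const chunks: string[] = []
  let current: string[] = []
  let fence: string | null = null // opening fence marker while inside a block

  for (const line of text.split('\n')) {
    const match = /^\s*(`{3,}|~{3,})/.exec(line)
    if (match) {
      if (fence === null) {
        fence = match[1] // entering a fenced block
      } else if (match[1][0] === fence[0] && match[1].length >= fence.length) {
        fence = null // matching closing fence
      }
    }
    if (line.trim() === '' && fence === null) {
      if (current.length > 0) {
        chunks.push(current.join('\n'))
        current = []
      }
    } else {
      current.push(line)
    }
  }
  if (current.length > 0) chunks.push(current.join('\n'))

  // Assumed ID format: the `text:p` prefix plus the paragraph index.
  return chunks.map((content, index) => ({ id: `text:p${index}`, content }))
}

// Replace only the patched paragraphs and re-join the rest untouched.
function applyParagraphPatches(text: string, patches: Map<string, string>): string {
  return splitParagraphs(text)
    .map((p) => patches.get(p.id) ?? p.content)
    .join('\n\n')
}
```

In this sketch the split/join round trip is exact only when paragraphs are separated by exactly one blank line; a real splitter would need to preserve the original separators.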

Adds a three-step translation pipeline (writer → reviewer → editor) that
critiques the writer's initial translation and revises flagged segments
to read natively. Reviewer is critique-only with an independent runtime;
editor returns segment patches that are merged into the result before a
single DB write.
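
A rough sketch of the control flow described above, for orientation only. The names `Stages`, `mergePatches`, and `runPipeline` are hypothetical, and the gating rule (run the editor only when the reviewer reports issues and the score is below the threshold) is an assumption about how `translationReviewScoreThreshold` is applied.

```typescript
type Segments = Record<string, string> // segment ID -> text

interface Review {
  score: number
  issues: { segmentId: string; comment: string }[]
}

// Hypothetical stage interface; each stage would wrap a model call.
interface Stages {
  write(source: Segments): Promise<Segments>
  review(source: Segments, draft: Segments): Promise<Review>
  edit(draft: Segments, review: Review): Promise<Segments> // returns patches only
}

// Overlay editor patches on the draft; patches for unknown IDs are ignored.
function mergePatches(draft: Segments, patches: Segments): Segments {
  const merged: Segments = { ...draft }
  for (const id of Object.keys(patches)) {
    if (Object.prototype.hasOwnProperty.call(draft, id)) merged[id] = patches[id]
  }
  return merged
}

async function runPipeline(
  source: Segments,
  stages: Stages,
  scoreThreshold: number,
  save: (result: Segments) => Promise<void>,
): Promise<Segments> {
  const draft = await stages.write(source)
  const review = await stages.review(source, draft) // critique only, no rewriting

  let result = draft
  // Assumed gating rule: edit only when there is something to fix.
  if (review.issues.length > 0 && review.score < scoreThreshold) {
    const patches = await stages.edit(draft, review)
    result = mergePatches(draft, patches)
  }

  await save(result) // the single DB write, after all stages finish
  return result
}
```

Because the editor returns patches rather than a full rewrite, segments the reviewer did not flag pass through from the writer unchanged.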

Highlights:
- Drop lexical writer batching (single-call now); keep incremental
  block reuse and meta hash diffing
- New TranslationReviewerService with structured-output + text-mode
  fallback; sanitizes issues against ALLOWED_IDS
- Markdown patches go through a fence-aware paragraph splitter so
  patches target paragraph IDs, not the whole text field
- Reviewer / editor system prompts moved to .md files imported via
  Vite raw + {{var}} template renderer
- PipelineMetrics out-param exposes per-stage timing, score, issue
  IDs, patch keys, and before/after samples for tests and observability
- Independent reviewer model config (AIFeatureKey.TranslationReview,
  enableTranslationReview, translationReviewModel,
  translationReviewScoreThreshold) — falls back to translator model
- Live OpenRouter / LM Studio e2e (RUN_LIVE_TESTS=1) writes a markdown
  quality report covering writer output, revise step, post-edit
  quality probe, and structural fidelity. Default 'pnpm test' does
  not invoke it.
- bench-translation-prompt.ts rewritten to drive real strategies +
  runtimes for A/B prompt comparison
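
The reviewer model fallback mentioned above could look roughly like this. It is a sketch under assumptions: the three config keys come from this PR, but the `TranslationReviewConfig` shape, the `resolveReviewSettings` helper, and the numeric default are placeholders, since the actual value of `DEFAULT_REVIEW_SCORE_THRESHOLD` is not stated here.

```typescript
// Hypothetical config shape using the key names from this PR.
interface TranslationReviewConfig {
  enableTranslationReview: boolean
  translationReviewModel?: string
  translationReviewScoreThreshold?: number
}

// Placeholder value for illustration only.
const DEFAULT_REVIEW_SCORE_THRESHOLD = 80

interface ReviewSettings {
  model: string
  scoreThreshold: number
}

// Returns null when review is disabled, so callers can skip the
// reviewer and editor stages entirely.
function resolveReviewSettings(
  config: TranslationReviewConfig,
  translatorModel: string,
): ReviewSettings | null {
  if (!config.enableTranslationReview) return null
  return {
    // An unset or empty review model falls back to the translator model.
    model: config.translationReviewModel || translatorModel,
    scoreThreshold: config.translationReviewScoreThreshold ?? DEFAULT_REVIEW_SCORE_THRESHOLD,
  }
}
```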

Spec: docs/superpowers/specs/2026-05-27-translation-review-pipeline-design.md

safedep Bot commented May 27, 2026

SafeDep Report Summary

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App

Innei added 3 commits May 27, 2026 21:47
- Switch tsx → vite-node so Vite's ?raw imports (translation-reviewer.system.md, translation-editor.system.md) load correctly inside ai.prompts.ts
- Inject TranslationReviewerService into LexicalTranslationStrategy constructor to satisfy the new pipeline signature; review/edit stays inert when no reviewer runtime is passed to translate(...)

Verified vite-node starts the bench against openrouter and exercises LexicalTranslationStrategy end-to-end (streamed translations include real .md system prompt content).
- Move every AI system prompt out of ai.prompts.ts into prompts/*.md (loaded via Vite ?raw)
- Render dynamic interpolations through renderPromptTemplate placeholders
  ({{MAX_WORDS}}, {{GENRE_LIST}}, {{TARGET_LANGUAGE}})
- Split optional sections (Japanese ruby, stream reminders, input/output format) into .partial.md fragments concatenated by buildXxxSystem helpers
- ai.prompts.ts shrinks from ~440 lines of string literals to declarative composition
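
For illustration, a `{{var}}` renderer of the kind described above might look like the following. The function name `renderPromptTemplate` appears in this PR, but this signature and the choice to throw on a missing variable are assumptions, not the actual implementation.

```typescript
// Replace every {{NAME}} placeholder with the matching value.
// Assumption: an unknown placeholder is an error rather than being
// left in the prompt, so a typo in a .md file fails loudly.
function renderPromptTemplate(
  template: string,
  vars: Record<string, string | number>,
): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_placeholder: string, name: string) => {
    if (!Object.prototype.hasOwnProperty.call(vars, name)) {
      throw new Error(`Missing prompt variable: ${name}`)
    }
    return String(vars[name])
  })
}

// Usage: in the real code the template text would come from a .md file
// loaded through a Vite ?raw import; an inline string is used here.
const rendered = renderPromptTemplate(
  'Translate into {{TARGET_LANGUAGE}} in at most {{MAX_WORDS}} words.',
  { TARGET_LANGUAGE: 'English', MAX_WORDS: 50 },
)
console.log(rendered) // prints "Translate into English in at most 50 words."
```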
…ategies

Hoist DEFAULT_REVIEW_SCORE_THRESHOLD, emptyReviewerMetrics,
emptyEditorMetrics, and buildReviewerMetrics into the base strategy so
lexical and markdown translation strategies share one source of truth.
Simplifies metrics bookkeeping in runReviewAndEdit and editor patch
application.
Innei merged commit 4518edb into master on May 27, 2026 (11 checks passed)
Innei deleted the feat/translation-review-pipeline branch on May 27, 2026 at 15:04