error-corrector

A FastAPI service that takes HTML in and returns the same HTML with Hebrew word-level corrections wrapped in <mark class="error" data-corrections="...">…</mark>. Inference is remote via LiteLLM; no model weights are bundled, so the image is small and CPU-only.

This is the first stage of the broader editors-agent project. Future stages will reuse the same /correct HTTP contract with different correction backends.

Quick start (local)

cp .env.example .env       # fill in at least one provider key
docker compose up --build

Service listens on http://localhost:8000.

Interactive API docs: http://localhost:8000/docs.

API

`GET /health`

Liveness and readiness probe.

curl -s http://localhost:8000/health

{
  "status": "ok",
  "default_model": "gpt-5.5",
  "has_anthropic_key": true,
  "has_openai_key": true,
  "has_google_key": false
}
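
A caller can use this response to check which model families are usable before sending work. The following is a minimal sketch, assuming the response shape shown above. The function name `usable_model_prefixes` and the flag-to-prefix mapping are illustrative (the prefixes come from the Configuration table); they are not part of the service.

```python
import json

# Maps /health flags to the model-id prefix each provider key unlocks.
# Assumption: this mapping is for illustration only; the service does not expose it.
KEY_FLAG_TO_PREFIX = {
    "has_anthropic_key": "claude-",
    "has_openai_key": "gpt-",
    "has_google_key": "gemini/",
}


def usable_model_prefixes(health_body: str) -> list[str]:
    """Return the model-id prefixes whose provider key is reported as present."""
    health = json.loads(health_body)
    if health.get("status") != "ok":
        return []
    return [prefix for flag, prefix in KEY_FLAG_TO_PREFIX.items() if health.get(flag)]


# Same payload as the sample response above.
sample = """{
  "status": "ok",
  "default_model": "gpt-5.5",
  "has_anthropic_key": true,
  "has_openai_key": true,
  "has_google_key": false
}"""
print(usable_model_prefixes(sample))
```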

`POST /correct`

Raw HTML in, raw HTML out.

| Field | Where | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| body | request body | yes | - | HTML document or fragment, UTF-8 |
| `X-Model` | request header | no | `DEFAULT_MODEL` | LiteLLM model id override |
| `X-Service-Tier` | request header | no | `DEFAULT_SERVICE_TIER` | `priority` / `flex` / `default` (OpenAI only) |

Response:

Status 200, Content-Type: text/html; charset=utf-8, body = corrected HTML
Headers: X-Model, X-Chunks, X-Total-Ms, X-Cost-Usd, optional X-Warning

Error envelope (status 400 / 413 / 500): {"detail": "..."}.

curl -X POST http://localhost:8000/correct \
  -H "Content-Type: text/html; charset=utf-8" \
  --data-binary @article.html
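
The same call can be made from Python. This is a minimal sketch using only the standard library. The helper names `build_correct_request` and `summarize_response_headers` are hypothetical, and the numeric parsing of `X-Chunks`, `X-Total-Ms`, and `X-Cost-Usd` is an assumption about how those header values are formatted.

```python
import urllib.request


def build_correct_request(html, base_url="http://localhost:8000",
                          model=None, service_tier=None):
    """Build a POST /correct request; the optional headers are sent only when set."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if model:
        headers["X-Model"] = model
    if service_tier:
        headers["X-Service-Tier"] = service_tier
    return urllib.request.Request(
        f"{base_url}/correct",
        data=html.encode("utf-8"),
        headers=headers,
        method="POST",
    )


def _num(value, cast):
    """Cast a header value, returning None when it is missing or malformed."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def summarize_response_headers(headers):
    """Collect the documented response headers into a plain dict."""
    return {
        "model": headers.get("X-Model"),
        "chunks": _num(headers.get("X-Chunks"), int),        # assumed integer
        "total_ms": _num(headers.get("X-Total-Ms"), float),  # assumed numeric
        "cost_usd": _num(headers.get("X-Cost-Usd"), float),  # assumed numeric
        "warning": headers.get("X-Warning"),                 # optional
    }


# Sending the request (needs a running service):
# req = build_correct_request(open("article.html", encoding="utf-8").read())
# with urllib.request.urlopen(req) as resp:
#     corrected_html = resp.read().decode("utf-8")
#     info = summarize_response_headers(resp.headers)
```

`urllib.request.urlopen` raises `urllib.error.HTTPError` for the 400 / 413 / 500 cases; the JSON error envelope can then be read from the exception's body.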

`GET /metrics`

Prometheus exposition (FastAPI default instrumentation). Excluded from OpenAPI.

Configuration

All settings come from environment variables. For local dev they can be loaded from a .env file in the working directory.

| Name | Default | Required | Notes |
| --- | --- | --- | --- |
| `ANTHROPIC_API_KEY` | - | for `claude-*` models | secret |
| `OPENAI_API_KEY` | - | for `gpt-*` models | secret |
| `GOOGLE_API_KEY` | - | for `gemini/*` models | secret |
| `DEFAULT_MODEL` | `gpt-5.5` | no | LiteLLM model id used when `X-Model` is absent |
| `DEFAULT_SERVICE_TIER` | `priority` | no | Default OpenAI service tier: `priority` / `flex` / `default` / `off`. Used when `X-Service-Tier` is absent |
| `VERIFIER_MODEL` | `claude-haiku-4-5` | no | Validator model. Set to `off` to skip verification |
| `MAX_HTML_BYTES` | `5242880` (5 MiB) | no | Request body cap |
| `LOG_LEVEL` | `INFO` | no | Standard Python logging level |
| `LOG_FORMAT` | `json` | no | `json` for prod, `text` for local dev |
| `PORT` | `8000` | no | Host port mapping (docker compose only) |
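
To make the defaults concrete, here is a standard-library sketch of how these variables resolve. It is only an illustration: the repo layout lists pydantic-settings for `app/config.py`, and the `Settings` class and `load_settings` function below are hypothetical stand-ins, not the service's code. `PORT` is left out because it only affects the docker compose host mapping.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    anthropic_api_key: Optional[str]
    openai_api_key: Optional[str]
    google_api_key: Optional[str]
    default_model: str
    default_service_tier: str
    verifier_model: str
    max_html_bytes: int
    log_level: str
    log_format: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Resolve settings from an environment mapping, applying the documented defaults."""
    if env is None:
        env = os.environ
    return Settings(
        # Empty strings are treated as "not set" for the provider keys.
        anthropic_api_key=env.get("ANTHROPIC_API_KEY") or None,
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        google_api_key=env.get("GOOGLE_API_KEY") or None,
        default_model=env.get("DEFAULT_MODEL", "gpt-5.5"),
        default_service_tier=env.get("DEFAULT_SERVICE_TIER", "priority"),
        verifier_model=env.get("VERIFIER_MODEL", "claude-haiku-4-5"),
        max_html_bytes=int(env.get("MAX_HTML_BYTES", "5242880")),
        log_level=env.get("LOG_LEVEL", "INFO"),
        log_format=env.get("LOG_FORMAT", "json"),
    )


# Local-dev style override on top of the defaults.
settings = load_settings({"OPENAI_API_KEY": "sk-example", "LOG_FORMAT": "text"})
print(settings.default_model, settings.max_html_bytes, settings.log_format)
```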

Deployment (Kubernetes / ArgoCD)

Image build & push

docker build -t <registry>/error-corrector:<tag> .
docker push <registry>/error-corrector:<tag>

The image:

- runs uvicorn app.main:app on port 8000
- runs as non-root appuser (uid 1001)
- includes an in-image HEALTHCHECK hitting /health
- has no persistence; the service is fully stateless

Secret for provider API keys

apiVersion: v1
kind: Secret
metadata:
  name: error-corrector-keys
type: Opaque
stringData:
  ANTHROPIC_API_KEY: <redacted>
  # OPENAI_API_KEY: <redacted>
  # GOOGLE_API_KEY: <redacted>

Suggested Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: error-corrector
spec:
  replicas: 2
  selector:
    matchLabels:
      app: error-corrector
  template:
    metadata:
      labels:
        app: error-corrector
    spec:
      containers:
        - name: app
          image: <registry>/error-corrector:<tag>
          ports:
            - containerPort: 8000
              name: http
          envFrom:
            - secretRef:
                name: error-corrector-keys
          env:
            - name: DEFAULT_MODEL
              value: gpt-5.5
            - name: DEFAULT_SERVICE_TIER
              value: priority
            - name: VERIFIER_MODEL
              value: claude-haiku-4-5
            - name: LOG_FORMAT
              value: json
          readinessProbe:
            httpGet:
              path: /health
              port: http
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: http
            periodSeconds: 30
            failureThreshold: 3
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "768Mi"

Resource numbers are a starting point. The service is CPU-light (HTTP glue plus a Node subprocess for parse5) but each POST /correct triggers N synchronous LLM calls — one per Hebrew text node, plus batched verifier calls. Latency scales with article length, so scale horizontally via HPA, not vertically.

HorizontalPodAutoscaler

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: error-corrector
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: error-corrector
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Service / Ingress

Standard ClusterIP Service on port 80 → containerPort 8000, exposed through the org's Ingress / Gateway. Nothing service-specific is required.

ArgoCD

Stateless and immutable — a standard Application manifest pointing at the deployment-manifest path is enough. This repo ships the container image only.

Observability

Logs: structured JSON to stdout (LOG_FORMAT=json). Notable events:

- extract_nodes: per request, with n_nodes and n_relevant
- pipeline_complete: per request, with model, n_chunks, total_ms, cost_usd, invariant_ok, warnings_count
- pipeline_failed: uncaught pipeline exception with traceback
- detector_parse_failed: model returned invalid JSON for a text node

Metrics: Prometheus exposition at GET /metrics. Includes the default FastAPI instrumentation (http_requests_total, http_request_duration_seconds, etc.). Scrape with the standard ServiceMonitor / PodMonitor.

Operational notes

- Stateless: every request is independent. No DB, no cache, no on-disk state. Pods are interchangeable.
- Latency scales with the number of Hebrew text nodes in the input. A multi-paragraph article on a Claude Sonnet model typically completes in a few seconds.
- Cost is reported per request via the X-Cost-Usd response header. Token accounting uses LiteLLM's pricing tables with a heuristic fallback.
- Manual reconstruction: the model only returns JSON edit candidates. The service inserts <mark> tags into the original HTML deterministically in code. If stripping the inserted marks would not exactly recover the input HTML, the service returns the original HTML and surfaces a warning via X-Warning.
- Provider compatibility: requests with optional parameters that a provider rejects (e.g. service_tier, response_format) are retried once without the unsupported parameter, and the omission is surfaced via X-Warning.
- No model weights: all inference is remote via LiteLLM.
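
The reconstruction invariant described above can be sketched as a round-trip check: strip the inserted marks and compare the result with the input. This is an illustration under stated assumptions, not the code in `app/html/markup.py`. The regex assumes the exact attribute order `class="error" data-corrections="..."`, and the names `strip_marks` and `finalize` and the warning text are hypothetical.

```python
import re

# Assumption: inserted marks always have exactly this attribute layout and do not nest.
_MARK = re.compile(
    r'<mark class="error" data-corrections="[^"]*">(.*?)</mark>',
    re.DOTALL,
)


def strip_marks(html: str) -> str:
    """Remove inserted <mark> wrappers while keeping the text they wrap."""
    return _MARK.sub(r"\1", html)


def finalize(original_html: str, marked_html: str):
    """Return (html, warning): the marked HTML only if stripping marks recovers the input."""
    if strip_marks(marked_html) == original_html:
        return marked_html, None
    # Fail safe: hand back the untouched input and report the problem.
    return original_html, "reconstruction invariant failed; returned original HTML"


original = "<p>first wrod here</p>"
good = '<p>first <mark class="error" data-corrections="word">wrod</mark> here</p>'
bad = '<p>first <mark class="error" data-corrections="word">word</mark> here</p>'

print(finalize(original, good))  # marked HTML, no warning
print(finalize(original, bad))   # original HTML plus a warning
```

Comparing against the exact input string keeps the check strict: any change to the underlying text, even inside a mark, fails the comparison and the original document is returned.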

Repo layout

error-corrector/
├── pyproject.toml         # Python deps (PEP 621)
├── package.json           # parse5 (Node)
├── Dockerfile             # Python 3.12 + Node 20, non-root, in-image HEALTHCHECK
├── docker-compose.yml     # local dev only
├── .env.example
├── README.md
└── app/
    ├── main.py            # FastAPI app, logging + metrics setup
    ├── routes.py          # GET /health, POST /correct
    ├── config.py          # pydantic-settings
    ├── logging_setup.py   # structlog JSON config
    ├── pipeline.py        # extract → detect → verify → reconstruct
    ├── llm/
    │   ├── client.py      # litellm wrapper with retries
    │   ├── prompts.py     # detector + verifier prompts and schemas
    │   └── costs.py       # token / cost accounting
    ├── html/
    │   ├── nodes.py       # parse5 subprocess driver
    │   └── markup.py      # tokenization, mark insertion, mark stripping
    └── tools/
        └── html-text-nodes.mjs   # parse5-based extractor (Node)
