Summary
A comprehensive audit of docs/ compared against the current state of the codebase (primarily smartem-decisions, smartem-frontend, smartem-devtools). The docs are structurally sound with good organisation and useful ADRs, but they largely reflect the state of the codebase from mid-to-late 2025. Significant additions since then are undocumented, and several claims are now factually wrong.
Findings are grouped by severity.
Incorrect (will mislead readers)
These will cause confusion or errors if followed.
| Doc | Claim | Reality |
| --- | --- | --- |
| `backend/api-server.md` | `python -m smartem_backend.simulate_msg` and `./tools/simulate-messages.sh` exist | Neither exists. The tool is `tools/external_message_simulator.py` |
| `backend/api-server.md` | `./scripts/k8s/dev-k8s.sh up` (implies it's in smartem-decisions) | Script lives in smartem-devtools `scripts/k8s/`, not in smartem-decisions |
| `backend/database.md` | Lists 3 migrations (baseline, indexes, prediction models) | There are 7 migrations. Missing: quality metrics, schema drift fixes, suggested acquisition index, agent log table |
| `backend/database.md` | Baseline migration ID `6e6302f1ccb6` | Actual baseline file is `2025_09_18_1042-001_create_core_smartem_schema_baseline.py`; ID likely differs |
| `operations/kubernetes.md` | `./scripts/k8s/dev-k8s.sh` (implies in smartem-decisions) | Lives in smartem-devtools `scripts/k8s/` |
| `development/generate-docs.md` | `tox -e docs` to generate documentation | No `tox.ini` exists in smartem-decisions. Docs generation has moved elsewhere |
| `development/e2e-simulation.md` | `./tests/e2e/run-e2e-test.sh` (implies in smartem-decisions) | Lives in smartem-devtools `tests/e2e/` |
| `glossary.md` | ARIA = "Automated Real-time Image Analysis" | ARIA is a central metadata repository for structural biology data from multiple facilities (INSTRUCT-ERIC), not real-time image analysis |
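Most of the rows above are path references that no longer resolve in the repository the doc implies. A check like the following could catch that class of error automatically. This is a minimal sketch, not an existing tool: `find_missing_paths` and the path pattern are hypothetical, and it only recognises `.sh` and `.py` paths that contain at least one directory component.

```python
import re
from pathlib import Path

# Matches relative script-like paths such as ./scripts/k8s/dev-k8s.sh
# or tools/external_message_simulator.py (assumption: only .sh and .py matter).
PATH_PATTERN = re.compile(r"(?:\./)?(?:[\w.-]+/)+[\w.-]+\.(?:sh|py)\b")


def find_missing_paths(doc_text: str, repo_root: Path) -> list[str]:
    """Return script paths mentioned in doc_text that do not exist under repo_root."""
    missing: list[str] = []
    for match in PATH_PATTERN.finditer(doc_text):
        relative = match.group(0).removeprefix("./")
        # Report each missing path once, in order of first appearance.
        if not (repo_root / relative).exists() and relative not in missing:
            missing.append(relative)
    return missing
```

Run against each markdown file with the smartem-decisions checkout as `repo_root`, this would flag references such as `./scripts/k8s/dev-k8s.sh`, which only resolve inside smartem-devtools.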
Severely Incomplete (missing major functionality)
These documents exist but cover only a fraction of current functionality.
| Doc | What's documented | What actually exists |
| --- | --- | --- |
| `backend/api-documentation.md` | ~8 API endpoints | 60+ endpoints, including atlas tiles, quality predictions, agent sessions/logs, debug endpoints, frontend SSE stream, ordered foilholes, latent representations |
| `backend/database.md` | Implies ~5 core tables | 22 tables, including quality prediction models/weights/parameters, quality metrics/statistics, agent log/connection/session/instruction/acknowledgement, atlas tiles, overall quality predictions |
| `agent/cli-reference.md` | Documents parse/validate/watch | Missing the installed `smartem-agent` CLI entry point (via pyproject.toml `[project.scripts]`), and 5 other CLI entry points (`smartem.agent-cleanup`, `smartem.register-prediction-model`, `smartem.init-model-weight`, `smartem.random-model-predictions`, `smartem.random-prior-updates`) |
| `operations/environment-variables.md` | Lists DB and RabbitMQ vars | Missing `CORS_ALLOWED_ORIGINS`, `SMARTEM_BACKEND_CONFIG`, and `appconfig.yml` YAML-based configuration (database pool settings, `particle_select_batch_size`, etc.) |
| `development/tools.md` | Lists 4 tools | Missing `db_table_totals.py`, `check_schema_drift.py`, `generate_api_docs.py`, `makeiso.sh` |
Stale / Outdated (not wrong per se, but no longer reflects current state)
| Doc | Issue |
| --- | --- |
| smartem-decisions `README.md` | Still calls itself "proof of concept"; the system is production-grade with 60+ endpoints, 22 DB tables, CI/CD, K8s deployment, ML pipeline |
| `backend/api-server.md` | Only covers basic API/consumer startup. Doesn't mention `appconfig.yml`, `frontend_stream.py` (frontend SSE), agent log submission, or the ML prediction pipeline |
| `operations/containerization.md` | Documents multi-stage `developer`/`build`/`runtime` stages; this matches `Dockerfile.dev`, but the production `Dockerfile` is simpler (installs from PyPI). Docs don't distinguish between the two |
| `operations/containerization.md` | Image name `ghcr.io/diamondlightsource/smartem-decisions`; CI actually uses `ghcr.io/${{ github.repository }}` (case-sensitive) |
Entirely Missing Documentation
No docs exist for these areas:
- ML prediction pipeline: quality prediction models, model weights, training tables, metrics aggregation
- Frontend SSE stream: `GET /frontend/events/stream` for real-time UI updates
- Agent logging: `POST /agent/{agent_id}/session/{session_id}/logs` and the `agentlog` table (added March 2026)
- Debug endpoints: agent session management, test instruction creation
- Image serving: atlas and gridsquare image endpoints (`GET /grids/{grid_uuid}/atlas_image`, `GET /gridsquares/{gridsquare_uuid}/gridsquare_image`)
- `appconfig.yml`: YAML-based application configuration (DB pool tuning, batch sizes, log file path)
- smartem-frontend: no documentation at all. Frontend is now a monorepo with npm workspaces (`apps/legacy`, `apps/smartem`, `packages/api`, `packages/ui`), React 19, MUI 7, TanStack Router, Tailwind CSS 4, Orval API client generation
- smartem-devtools webui: the developer dashboard (React 19, Vite, MDX) has no documentation
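Documentation for the frontend SSE stream would likely benefit from a client-side example. The sketch below parses the generic `text/event-stream` wire format only. It is not taken from the codebase: `parse_sse` is a hypothetical helper, and the event names and payloads actually emitted by `GET /frontend/events/stream` are not known here. Unlike a full SSE client, it yields any event that has at least one field, even one without `data`.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict of fields per server-sent event from an iterable of text lines."""
    event: dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if event:
                yield event
                event = {}
            continue
        if line.startswith(":"):
            # Comment line, commonly used as a keep-alive ping.
            continue
        field, _, value = line.partition(":")
        value = value.removeprefix(" ")  # a single leading space is not part of the value
        if field == "data" and "data" in event:
            # Consecutive data lines are joined with newlines.
            event["data"] += "\n" + value
        else:
            event[field] = value
```

Any HTTP client that can iterate over the lines of a streaming response could feed this generator. The base URL and any authentication would need to come from the deployment's own configuration.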
Suggested Priority
- Fix incorrect claims (wrong script paths, phantom modules, wrong ARIA definition) — these actively mislead
- Update database.md — migration list and table inventory are significantly behind
- Update API documentation — endpoint coverage is ~13% of reality
- Add frontend docs — entire subsystem undocumented
- Fill in missing topics (ML pipeline, appconfig, image serving, agent logging)
- Refresh stale content (README "proof of concept", containerization docs)
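To track progress on the API documentation item, endpoint coverage could be measured rather than estimated. The following is a minimal sketch that assumes the backend can export an OpenAPI schema as a dictionary. The audit does not confirm that, although the Orval client generation suggests it. `list_operations` and `coverage` are hypothetical helper names.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "options", "head", "trace"}


def list_operations(openapi_schema: dict) -> list[str]:
    """Return every 'METHOD /path' pair declared in an OpenAPI schema, sorted."""
    operations = []
    for path, item in openapi_schema.get("paths", {}).items():
        for key in item:
            # Path items may also hold non-method keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                operations.append(f"{key.upper()} {path}")
    return sorted(operations)


def coverage(documented: set[str], openapi_schema: dict) -> float:
    """Fraction of declared operations that appear in the documented set."""
    actual = set(list_operations(openapi_schema))
    if not actual:
        return 1.0
    return len(documented & actual) / len(actual)
```

With roughly 8 documented operations out of 60 declared, `coverage` would return about 0.13, matching the figure quoted above.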