21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,21 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is loosely based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

Released versions are drafted automatically by [release-drafter](https://github.com/release-drafter/release-drafter); see `.github/release-drafter.yml` and `.github/workflows/release-drafter.yml`. Each entry on the GitHub Releases page corresponds to a tag of the form `vX.Y.Z`.

## Unreleased

### Added

- Initial harness scaffold (Python 3.14 + FastAPI + Pydantic v2 + OpenTelemetry; React 19.2 + Vite + TypeScript strict).
- 15 required CI status checks (lint, typecheck, tests, coverage ≥75%, import-linter, pre-commit, frontend build/quality, security suite, two meta-gates, PR-title lint).
- Release pipeline: tag-triggered build, push to GHCR, CycloneDX SBOM, GitHub Release publish.
- Eval harness scaffold (provider-agnostic runner + LLM-judge Protocol + 1 example golden case + workflow_dispatch nightly).
- `.claude/` agent integration (3 hooks, 6 auto-activating skills, settings example).

### Notes

- This template was extracted from a financial-agent take-home (Teller) and generalised. The harness is the product; the scaffold exists so every gate has something to operate on.
63 changes: 63 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,63 @@
# CLAUDE.md — agent project instructions

You are working in `harness-python-react`, a template repo whose harness IS the product. Code quality here is enforced mechanically — every gate fails CI, not just tests. Keep that bar as you work.

## What this repo is

A production-quality LLM-driven coding harness over a minimal FastAPI + React scaffold. The point isn't the features (one `/health`, one `/echo`, one hello page); the point is that every layer of the pipeline — lint, types, architecture, security, eval, agent hooks — catches a different failure class without anyone remembering to run it.

## Read first

- [`docs/HARNESS.md`](docs/HARNESS.md) — umbrella; the controls and where they live.
- [`docs/INVARIANTS.md`](docs/INVARIANTS.md) — the load-bearing rules. Every PR is checked against them.
- [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md) — layered import-linter contract; reverse imports fail CI.
- [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) — branching, commit format, justfile, CI overview.

## Workflow

- One issue per change. Branch name: `feat|fix|chore|docs|test|refactor/<issue-number>-<kebab-title>`.
- One PR per branch, base `develop`. PR title = the conventional-commit subject.
- `develop → main` happens via a `release:` PR.
- The pre-push gate is `just check` (lint + typecheck + architecture + tests). Run it before pushing.
- For frontend changes, also run `just frontend-check`.

## Code conventions

- **Python:** 3.14, `uv run --frozen` everywhere, mypy `--strict`, ruff with the wide select set (`E W F I N UP B SIM TCH S RUF`).
- **Type hints:** every public function. `from __future__ import annotations` at module top.
- **Models:** anything crossing a module / process seam inherits from `StrictModel` (`src/models/_base.py`), which sets `extra="forbid"`. Add `strict=True` to the class when you also want to disable type coercion (so the string `"3.14"` is rejected for a `float` field instead of being converted).
- **API:** every route under `/api/v1/`. Typed Pydantic responses, not raw dicts.
- **Layer flow:** one-way. Reverse imports are a CI failure. See `docs/BOUNDARIES.md`.
- **Observability:** OTel `agent_span(...)` for any operation in the request path; semconv-defined attribute keys only (constants at the top of `src/observability/spans.py`).
- **Frontend:** React 19 + TS strict; functional components + hooks; never `dangerouslySetInnerHTML` on backend output; SSE consumers use the typed primitive at `frontend/src/lib/api/client.ts`.

## What NOT to do

- Don't bypass gates. `--no-verify` / `--no-hooks` / `--no-gpg-sign` are blocked by `pretooluse_bash.py` for a reason. If a hook is wrong, fix the hook.
- Don't introduce a new commit-type prefix without updating both `pyproject.toml`'s commitizen schema AND `pr-title.yml` (the `Commit-type sync` meta-gate will fail otherwise).
- Don't add a CI job without listing it in `.github/branch-protection/{develop,main}.json` (the `Branch-protection contexts sync` meta-gate will fail).
- Don't skip the architecture contract by accident — `lint-imports` runs in CI and locally via `just architecture`.
- Don't write code without tests. Coverage gate is 75% on `src/`.
- Don't hand-roll secrets into config. Use env / `.env` (gitignored) → `Settings` from `src/models/config.py`.
- Don't create files unless they're necessary. The scaffold has no dead modules.
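
As a rough illustration of the first rule, a forbidden-flag check could look like the sketch below. It is not the contents of `pretooluse_bash.py`: the function name `forbidden_flags_in` and the tokenising strategy are assumptions, and only the three flag names come from the list above.

```python
import shlex

# The three flags named above; everything else in this sketch is hypothetical.
FORBIDDEN_FLAGS = frozenset({"--no-verify", "--no-hooks", "--no-gpg-sign"})


def forbidden_flags_in(command: str) -> list[str]:
    """Return the forbidden flags that appear as standalone tokens in a shell command."""
    try:
        # shlex keeps a quoted commit message as one token, so a flag that is
        # only mentioned inside the message text is not reported.
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes: fall back to whitespace splitting rather than crash.
        tokens = command.split()
    return sorted({token for token in tokens if token in FORBIDDEN_FLAGS})


if __name__ == "__main__":
    print(forbidden_flags_in("git commit --no-verify -m 'wip'"))
```

A real hook would also need to decide how to report the block to the agent; that wiring is left out here because it depends on the hook runner.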

## Use the skills

The agent-side skills in `.claude/skills/` auto-activate based on context:

- `architect` — when designing module boundaries, API contracts, layer-flow decisions.
- `code-reviewer` — after writing/editing code; runs the 10-point review checklist.
- `devops` — when touching Docker, CI, pyproject.toml, observability config.
- `frontend` — when working in `frontend/` (React 19 + TS + Vite).
- `qa-engineer` — when writing tests or extending the eval harness.
- `technical-writer` — when updating docs / READMEs.

Trust their guidance — they encode this project's conventions.

## When in doubt

- If the change touches a gate, update the meta-gate inputs (`branch-protection/*.json`, `pr-title.yml`, `check_required_contexts.py`'s exemption list).
- If the change touches an invariant, decide whether the invariant is wrong (update `docs/INVARIANTS.md` in the same PR) or the change is wrong (rework).
- If a CI job is failing for a reason that doesn't match the change, dig — don't reroll. Recent fix patterns: tag-vs-commit SHA in pinned action references, `if: hashFiles(...)` startup failures (see project-memory), pytest exit-5 on empty test suites.

The harness exists to make sloppy work hard. Lean into it — when a gate trips, it's protecting the next person reading this codebase.
79 changes: 79 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,79 @@
# Contributing

Thanks for taking a look. This template's harness is the product, so the contribution flow is opinionated — every change goes through the same gates as a feature.

## Branching

```
main ◄── release PR ◄── develop ◄── feat/123-short-name
                                ◄── fix/124-bug-name
                                ◄── chore/125-config-change
```

- `main` is the release line. Protected: 15 required status checks, code-owner approval, no force pushes.
- `develop` is the integration branch. The same gates apply, with one relaxation: PRs don't need to be rebased.
- Feature branches are short-lived and named `<type>/<issue-number>-<kebab-title>`. Open one issue per branch so the project board stays usable.
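
The naming rule can be checked mechanically. The sketch below is one possible validator, not a script that ships with the template: `BRANCH_RE` and `is_valid_branch` are hypothetical names, and the set of branch types (`feat`, `fix`, `chore`, `docs`, `test`, `refactor`) is assumed from the branch-name pattern given in `CLAUDE.md`.

```python
import re

# Assumed branch types, taken from the pattern in CLAUDE.md.
BRANCH_TYPES = ("feat", "fix", "chore", "docs", "test", "refactor")

# <type>/<issue-number>-<kebab-title>
BRANCH_RE = re.compile(
    r"(?P<type>" + "|".join(BRANCH_TYPES) + r")/"
    r"(?P<issue>[1-9][0-9]*)-"                   # issue number, no leading zero
    r"(?P<title>[a-z0-9]+(?:-[a-z0-9]+)*)"       # lowercase kebab-case title
)


def is_valid_branch(name: str) -> bool:
    """Return True when the whole branch name follows the convention."""
    return BRANCH_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("feat/123-short-name", "feature/123-short-name"):
        print(candidate, is_valid_branch(candidate))
```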

## Commit messages

Seven prefixes (enforced in three places — `[tool.commitizen]` in `pyproject.toml`, `pr-title.yml`, `check_commit_types.py`):

| Prefix | When |
|---|---|
| `feat:` | New capability |
| `fix:` | Bug fix |
| `docs:` | Documentation only |
| `test:` | Tests / eval harness |
| `refactor:` | Internal change with no behaviour delta |
| `chore:` | Tooling, deps, infra |
| `release:` | `develop → main` release PRs only |

The subject is **lowercase** after the colon. Title Case prose (`Add the thing`) is rejected; all-caps initialisms (`CI failure`, `SDK upgrade`) are fine.
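
To make the subject rule concrete, here is a minimal sketch of a title check. It is an illustration under assumptions, not the logic in `pr-title.yml` or `check_commit_types.py`: `check_title` is a hypothetical helper, scopes such as `feat(api):` are not handled, and an "initialism" is assumed to mean a first word of two or more uppercase letters or digits.

```python
from __future__ import annotations

import re

# The seven prefixes from the table above.
PREFIXES = ("feat", "fix", "docs", "test", "refactor", "chore", "release")

TITLE_RE = re.compile(r"^(?P<prefix>[a-z]+): (?P<subject>\S.*)$")
# Assumption: an initialism is all caps (digits allowed), at least two characters.
INITIALISM_RE = re.compile(r"^[A-Z][A-Z0-9]+$")


def check_title(title: str) -> str | None:
    """Return an error message, or None when the title passes."""
    match = TITLE_RE.match(title)
    if match is None:
        return "expected '<prefix>: <subject>'"
    if match["prefix"] not in PREFIXES:
        return f"unknown prefix {match['prefix']!r}"
    first_word = match["subject"].split()[0]
    if first_word[0].isupper() and not INITIALISM_RE.match(first_word):
        return "subject must start lowercase (initialisms excepted)"
    return None


if __name__ == "__main__":
    for title in ("feat: add the thing", "feat: Add the thing", "fix: CI failure"):
        print(f"{title!r}: {check_title(title) or 'ok'}")
```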

## Pull requests

1. Open the issue first. Use a feature/bug template; fill every section.
2. Branch off `develop` with the matching name.
3. Land one logical change per PR. Stack PRs if the work is naturally split.
4. The PR template asks five things — answer each (`None` is valid where applicable):
   - **What & why** (1–3 lines)
   - **Test plan** (checkbox list; CI covers most of it)
   - **Invariants affected** — cite numbered rules from `docs/INVARIANTS.md`
   - **New deps / actions / external surface** (anchor for supply-chain review)
   - **Screenshots** (UI changes only)
5. Wait for green CI + a code-owner review before merging.

## Local pre-push gate

```sh
just check # ruff + mypy + import-linter + pytest
cd frontend && npm run lint && npm run format:check && npm run check && npm run test && npm run build
uv run pre-commit run --all-files
```

A green pre-push run is a high-confidence predictor of a green CI run. The `just check` gate is intentionally a subset of CI: it trades completeness for fast feedback.

## Adding a check

When the harness grows a new gate:

1. Add the workflow job in `.github/workflows/`.
2. If it's a required gate, add the job's display name to the `contexts` arrays in `.github/branch-protection/{develop,main}.json`.
3. If it's NOT required (scheduled / dispatch-only / push-to-main-only), add the workflow filename to `EXEMPT_WORKFLOWS` in `.github/scripts/check_required_contexts.py`.
4. Update `docs/HARNESS.md` and `docs/SECURITY.md` (if security-relevant).
5. Land in one PR — the meta-gate `Branch-protection contexts sync` will fail if you skip step 2 or 3.
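
The idea behind the `Branch-protection contexts sync` meta-gate can be sketched as a set comparison. This is a simplified illustration, not the code in `check_required_contexts.py`: the function `find_drift`, its inputs, and the workflow and job names in the demo are all hypothetical, and parsing the workflow YAML and branch-protection JSON is assumed to have happened already.

```python
from __future__ import annotations


def find_drift(
    workflow_jobs: dict[str, list[str]],
    required_contexts: set[str],
    exempt_workflows: set[str],
) -> tuple[set[str], set[str]]:
    """Compare job display names against the required-context list.

    Returns (unlisted, stale):
      unlisted: jobs in non-exempt workflows missing from branch protection
      stale:    required contexts that no non-exempt workflow job provides
    """
    gated_jobs = {
        job
        for workflow, jobs in workflow_jobs.items()
        if workflow not in exempt_workflows
        for job in jobs
    }
    return gated_jobs - required_contexts, required_contexts - gated_jobs


if __name__ == "__main__":
    # Hypothetical data standing in for the parsed workflow and JSON files.
    jobs = {"ci.yml": ["Lint", "Unit tests"], "eval-nightly.yml": ["Nightly eval"]}
    unlisted, stale = find_drift(jobs, {"Lint"}, {"eval-nightly.yml"})
    print("unlisted:", unlisted, "stale:", stale)
```

Under this model, skipping step 2 shows up as an `unlisted` job, and skipping step 3 makes a non-required workflow's jobs count as gated, so they also land in `unlisted`.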

## Code of conduct

Be kind. Disagree on substance, not on people. If review feedback gets sharp, take it offline and come back when both sides are ready.

## Reporting security issues

If you find a vulnerability that affects users of the template, **do not open a public issue**. Email the maintainer (see commit history for contact). Include:

- Repro steps
- Affected version / commit SHA
- Severity assessment (informational / low / medium / high / critical)
- Suggested fix if you have one

We'll acknowledge within 72 hours.
71 changes: 69 additions & 2 deletions README.md
@@ -1,5 +1,72 @@
# harness-python-react

Production-quality coding harness for Python (FastAPI) backends and Vite + React + TypeScript frontends. Designed for LLM-driven development: every gate — lint, types, architecture, security, eval — is enforced mechanically so code quality stays consistent across many human and AI contributors.
> A production-quality coding harness for Python (FastAPI) + Vite/React/TypeScript projects. Designed for LLM-driven development: every gate — lint, types, architecture, security, eval — is enforced mechanically so code quality stays consistent across many human and AI contributors.

> **Status:** bootstrap. Full documentation, scaffolding, and the harness itself land across [issues #1–#28](https://github.com/constk/harness-python-react/issues). Track progress on the [project board](https://github.com/users/constk/projects/3).
## What ships

- **Backend:** Python 3.14, FastAPI, Pydantic v2 (`StrictModel` base), `uv` deps, OpenTelemetry SDK + OTLP exporter, structured JSON logs, generic tool-registry pattern.
- **Frontend:** Node 24 LTS, React 19.2, Vite 8, TypeScript strict, ESLint 10 flat config, Prettier, Vitest + jsdom + Testing Library.
- **Eval harness:** provider-agnostic runner + LLM-judge `Protocol`, three tolerance modes (exact / numeric / semantic), one example golden case, nightly workflow (disabled by default).
- **CI:** 15 required status checks across `ci.yml` (lint/format, mypy strict, unit tests, coverage ≥75%, import-linter architecture, pre-commit, frontend build, frontend quality, branch-protection sync, commit-type sync) + `security.yml` (gitleaks, pip-audit, npm audit, trivy) + PR-title lint.
- **Release:** tag-triggered workflow that builds the image, pushes to `ghcr.io`, generates a CycloneDX SBOM, and publishes the GitHub Release.
- **Agent integration:** `.claude/hooks/` (forbidden-flag blocker, secret scan, formatter dispatch, SessionStart context) + six auto-activating skills (architect / code-reviewer / devops / frontend / qa-engineer / technical-writer).
- **Docker:** multi-stage Dockerfile (non-root, healthcheck), `docker compose up` boots app + frontend + Jaeger.
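
To give a feel for the three tolerance modes in the eval harness, here is a minimal sketch of how a comparison might dispatch on them. It is an assumption-heavy illustration rather than the harness's API: the `Judge` protocol, its `equivalent` method, the `matches` function, and the `rel_tol` default are all hypothetical names and values.

```python
from __future__ import annotations

import math
from typing import Protocol


class Judge(Protocol):
    """Hypothetical judge interface; a real one would call an LLM provider."""

    def equivalent(self, expected: str, actual: str) -> bool: ...


def matches(
    expected: str,
    actual: str,
    mode: str,
    *,
    judge: Judge | None = None,
    rel_tol: float = 1e-6,
) -> bool:
    """Compare a golden answer with a model answer under one tolerance mode."""
    if mode == "exact":
        return expected.strip() == actual.strip()
    if mode == "numeric":
        try:
            return math.isclose(float(expected), float(actual), rel_tol=rel_tol)
        except ValueError:
            return False  # one side is not a number
    if mode == "semantic":
        if judge is None:
            raise ValueError("semantic mode needs a judge")
        return judge.equivalent(expected, actual)
    raise ValueError(f"unknown tolerance mode: {mode!r}")


class CaseInsensitiveJudge:
    """Stub judge for local runs; satisfies Judge structurally."""

    def equivalent(self, expected: str, actual: str) -> bool:
        return expected.casefold() == actual.casefold()


if __name__ == "__main__":
    print(matches("42", "42.0", "exact"), matches("42", "42.0", "numeric"))
```

Keeping the judge behind a `Protocol` means a stub like `CaseInsensitiveJudge` can stand in for a provider-backed judge in tests without inheritance.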

## Quickstart

```sh
git clone https://github.com/constk/harness-python-react.git
cd harness-python-react

uv sync --extra dev
uv run pre-commit install --hook-type pre-commit --hook-type commit-msg
(cd frontend && npm ci)

docker compose up # backend :8000, frontend :5173, Jaeger :16686
```

The pre-push gate is `just check` (= ruff + mypy + import-linter + pytest). For frontend changes add `just frontend-check`.

## Why a harness

The differentiator isn't the scaffold — it's that every layer of the pipeline catches a different failure class **without relying on the human or LLM coder remembering to run anything**. The same posture protects code regardless of who wrote it.

See [`docs/HARNESS.md`](docs/HARNESS.md) for the full umbrella. Highlights:

- **Pydantic `StrictModel` everywhere a contract crosses a seam** (rejects unknown keys at construction).
- **`import-linter` enforces one-way layer flow** (`api | eval → agent → tools → data → observability → models`).
- **Three independent secret scans** (PreToolUse hook → pre-commit gitleaks → CI gitleaks).
- **Two meta-gates** that catch *drift in the gates themselves*: `Branch-protection contexts sync` (workflow jobs vs branch-protection JSON) and `Commit-type sync` (commitizen schema vs PR-title allowlist).
- **CycloneDX SBOM attached to every release** for supply-chain attestation.
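
The one-way layer flow can be modelled in a few lines. The sketch below is not how `import-linter` is implemented or configured; it only illustrates the rule that a layer may import layers further down the chain. `LAYERS`, `RANK`, and `is_allowed` are hypothetical names, and treating the siblings `api` and `eval` as unable to import each other is an assumption.

```python
# Layers from highest to lowest, following the flow quoted above.
LAYERS = [
    {"api", "eval"},
    {"agent"},
    {"tools"},
    {"data"},
    {"observability"},
    {"models"},
]

# Map each layer name to its position; a lower index means a higher layer.
RANK = {name: index for index, group in enumerate(LAYERS) for name in group}


def is_allowed(importer: str, imported: str) -> bool:
    """Return True when `importer` may import `imported` under one-way flow."""
    if importer == imported:
        return True  # imports within a single layer are not restricted here
    # Strictly downward only; same-rank siblings are rejected (assumption).
    return RANK[importer] < RANK[imported]


if __name__ == "__main__":
    print(is_allowed("api", "agent"))   # downward import
    print(is_allowed("models", "api"))  # reverse import
```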

## Documentation

| File | Purpose |
|---|---|
| [`docs/HARNESS.md`](docs/HARNESS.md) | Umbrella: every control + where it lives |
| [`docs/INVARIANTS.md`](docs/INVARIANTS.md) | The numbered load-bearing rules |
| [`docs/BOUNDARIES.md`](docs/BOUNDARIES.md) | Module layering + the import-linter contracts |
| [`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) | Local setup, branching, justfile, CI |
| [`docs/EVAL_HARNESS.md`](docs/EVAL_HARNESS.md) | Eval flywheel + opt-in for the nightly workflow |
| [`docs/SECURITY.md`](docs/SECURITY.md) | Threat model + defence-in-depth map |
| [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) | Scaffold-level component view |
| [`CONTRIBUTING.md`](CONTRIBUTING.md) | Branching, commit format, PR flow |
| [`CLAUDE.md`](CLAUDE.md) | Agent-facing project instructions |

## Versions

Verified April 2026 (`endoflife.date`):

| Layer | Version | Support status |
|---|---|---|
| Python | 3.14.4 | active feature release |
| Node LTS | 24.15.0 | through 2028-04-30 |
| React | 19.2.5 | current stable |
| Vite | 8.x | current stable |
| TypeScript | 6.x | current stable |

Bump together (Python in `pyproject.toml`, Node in `frontend/package.json`, both in `Dockerfile` + the CI matrix). Document the bump in `docs/DEVELOPMENT.md`.

## License

[MIT](LICENSE).