ModelMirrorAI/fedcourtsai

FedCourtsAI

CI lint-actions codeql Python ≥3.12 License: MIT

Agentic AI system to predict events in US federal courts — for example, whether a motion before a court of appeals or the Supreme Court will be granted or denied, the likely vote of each judge or justice, and a detailed prediction of the court's reasoning.

Status: early scaffold. The pipeline shape, data contract, and automation are in place; most feature work is done by AI coding agents via the label-driven workflows below.

No predictions have been published yet — the first target is the OT2026 long-conference cert release (see milestones).

Not legal advice. Outputs are experimental model predictions — they may be wrong, carry no affiliation with or endorsement by any court, and are not legal advice or a forecast you should rely on for any decision.

Predictions about how individual judges or justices may vote describe likely outcomes — they are not assertions of fact and not statements about how anyone should decide.

How it works

The project runs as a label-driven pipeline of GitHub Actions. Work is represented as GitHub issues; applying a run:* label triggers the matching workflow. When a stage needs to hand off, it opens (or labels) an issue to trigger the next stage. Several stages delegate to agentic coding tools (Claude Code and Codex), which branch, do the work, and open a pull request.

| Label | Workflow | Does | Engine |
| --- | --- | --- | --- |
| run:dev | run-dev | Normal development on the pipeline codebase | Claude Code |
| run:seed | run-seed | Ingest initial dockets from CourtListener into the corpus | Script |
| run:pull | run-pull | Refresh tracked dockets (also runs on a daily schedule) | Script (agent only if ambiguous) |
| run:reconcile | run-reconcile | Confirm a decided event's outcome.json from the docket when pull can't | Claude Code |
| run:predict | run-predict | Predict open events with multiple competing predictors (fan-out) | Claude Code + Codex |
| run:evaluate | run-evaluate | Score past predictions against realized outcomes (evaluator × predictor) | Claude Code + Codex |

Plus run-ops, a read-only daily health & cost dashboard that has no run:* label — it runs on a schedule (or manual dispatch). See docs/pipeline.md.
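
As a rough illustration of the label-driven dispatch described above, the sketch below maps the run:* labels on an issue to the workflows they would trigger. The mapping is copied from the table; the function name `workflows_for_labels` is hypothetical, and the real dispatch happens in GitHub Actions rather than in Python.

```python
# Label -> workflow pairs copied from the table above.
# run-ops is deliberately absent: it has no run:* label.
LABEL_TO_WORKFLOW: dict[str, str] = {
    "run:dev": "run-dev",
    "run:seed": "run-seed",
    "run:pull": "run-pull",
    "run:reconcile": "run-reconcile",
    "run:predict": "run-predict",
    "run:evaluate": "run-evaluate",
}


def workflows_for_labels(labels: list[str]) -> list[str]:
    """Return the workflows an issue's labels would trigger, in label order.

    Hypothetical helper for illustration; labels without a run:* mapping
    (for example "bug") are ignored.
    """
    return [LABEL_TO_WORKFLOW[label] for label in labels if label in LABEL_TO_WORKFLOW]


print(workflows_for_labels(["bug", "run:pull", "run:predict"]))
```

Keeping the mapping in one table-like structure mirrors the README's design: adding a stage means adding one label and one workflow.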

flowchart TD
    seed["run:seed — seed dockets"] --> corpus[("corpus")]
    pull["run:pull — refresh dockets<br/>(daily schedule)"] --> corpus
    pull -->|"changed?"| predict["run:predict — predict open events<br/>(matrix over predictors)"]
    predict --> ppr[/"pull requests"/]
    predict -.->|"outcome lands via pull,<br/>or run:reconcile"| evaluate["run:evaluate — score every predictor"]
    evaluate --> epr[/"pull requests"/]

Longer term, an automated-research harness (in the spirit of Anthropic's automated alignment researchers) will propose new predictor designs, register them as new entries in the predictor registry, and let them compete, so that run-predict tracks a growing field of agents and run-evaluate becomes the tournament that ranks them.

Data model

State lives in two stores, split by kind. Raw facts — dockets, snapshots, judges, case and event metadata — go into a packed corpus (SQLite under DVC/S3), written identically by seed and pull. Derived artifacts are versioned as files in git, organized case-centrically so everything we conclude about a single predictable event lives together:

data/cases/<court_id>/<docket_id>/events/<event_id>/
  outcome.json                   # ground truth, once the event resolves
  predictions/<predictor_id>/<run_id>/
    prediction.json              # quantitative: granted 1/0, P(granted), votes
    reasoning.md                 # qualitative: predicted reasoning
  evaluations/<evaluator_id>/<predictor_id>/<run_id>/
    evaluation.json
    evaluation.md
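
The layout above can be expressed as a few path helpers. This is a minimal sketch: the function names are hypothetical (the actual helpers in the library's paths module are not shown here), and the ids in the usage lines are placeholders, not real dockets or predictors.

```python
from pathlib import PurePosixPath

# Root of the case-centric tree shown above.
CASES_ROOT = PurePosixPath("data/cases")


def event_dir(court_id: str, docket_id: str, event_id: str) -> PurePosixPath:
    """Directory holding everything concluded about one predictable event."""
    return CASES_ROOT / court_id / docket_id / "events" / event_id


def prediction_dir(event: PurePosixPath, predictor_id: str, run_id: str) -> PurePosixPath:
    """Directory for one predictor run (prediction.json and reasoning.md)."""
    return event / "predictions" / predictor_id / run_id


def evaluation_dir(
    event: PurePosixPath, evaluator_id: str, predictor_id: str, run_id: str
) -> PurePosixPath:
    """Directory for one evaluator's scoring of one predictor run."""
    return event / "evaluations" / evaluator_id / predictor_id / run_id


# Placeholder ids for illustration only.
event = event_dir("ca9", "12345", "evt-1")
print(event / "outcome.json")
print(prediction_dir(event, "baseline", "run-001") / "prediction.json")
print(evaluation_dir(event, "judge-a", "baseline", "run-001") / "evaluation.json")
```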

Every git artifact validates against a pydantic model in fedcourtsai.schemas (exported to schemas/*.schema.json). See docs/data-model.md for the rationale and docs/data-pipeline.md for the corpus.
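
To give a feel for what validating an artifact involves, here is a standard-library sketch of a check on prediction.json. It is not the project's schema: the real models are pydantic classes in fedcourtsai.schemas, and the field names `granted`, `p_granted`, and `votes` are assumptions inferred from the layout comment ("granted 1/0, P(granted), votes").

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PredictionSketch:
    """Hypothetical stand-in for the real pydantic prediction model."""

    granted: int            # assumed: 1 if the motion is predicted granted, else 0
    p_granted: float        # assumed: predicted probability of a grant
    votes: dict[str, str] = field(default_factory=dict)  # assumed: judge -> vote

    def __post_init__(self) -> None:
        # Reject values outside the ranges the layout comment implies.
        if self.granted not in (0, 1):
            raise ValueError(f"granted must be 0 or 1, got {self.granted!r}")
        if not 0.0 <= self.p_granted <= 1.0:
            raise ValueError(f"p_granted must be in [0, 1], got {self.p_granted!r}")


def load_prediction(raw: str) -> PredictionSketch:
    """Parse a JSON document and validate it against the sketch model."""
    return PredictionSketch(**json.loads(raw))


ok = load_prediction('{"granted": 1, "p_granted": 0.7, "votes": {"Judge A": "grant"}}')
print(ok.p_granted)
```

A document with `p_granted` outside [0, 1] raises `ValueError` here; with pydantic the equivalent failure would surface as a validation error from the model.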

Develop

Requires uv. A devcontainer is included (.devcontainer/) and is the recommended way to work in Codespaces.

uv sync                       # install deps into .venv
uv run fedcourts --help       # CLI (full reference: docs/cli.md)
uv run fedcourts export-schemas
uv run fedcourts validate data

# the local gate CI also runs:
uv run ruff format --check .
uv run ruff check .
uv run mypy
uv run pytest

seed and pull are single-docket helpers: each fetches one case from the CourtListener REST API into the corpus through the shared ingestion core, so both need a free API token. seed onboards a docket; pull refreshes one and reports whether it changed:

export FEDCOURTS_COURTLISTENER_API_TOKEN=...   # https://www.courtlistener.com/help/api/rest/
uv run fedcourts seed --court ca9 --docket <docket_id>   # onboard one docket into the corpus
uv run fedcourts pull --court ca9 --docket <docket_id>   # refresh one docket; report changes

The historical bulk is loaded by seed-backfill, which is what the run-seed workflow runs: deterministic, agent-free ingestion of CourtListener bulk data (no API token, no API budget) into the same corpus through the same core. Each run loads one chunk of the tracked courts and advances a resumable cursor (config/seed-progress.yaml), so repeated runs continue until the backfill is complete:

uv run fedcourts seed-backfill --report seed-report.json   # load the next bulk chunk

See docs/seed-backfill.md and docs/data-pipeline.md.
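
The resumable-cursor idea behind seed-backfill can be sketched in a few lines. This is an illustration under stated assumptions, not the project's loader: the cursor is stored as JSON rather than YAML to stay within the standard library, the `position` key and the function name `load_next_chunk` are hypothetical, and the court ids are placeholders.

```python
import json
import tempfile
from pathlib import Path


def load_next_chunk(courts: list[str], cursor_path: Path, chunk_size: int) -> list[str]:
    """Return the next chunk of courts and persist the advanced cursor.

    An empty list means the backfill is complete.
    """
    position = 0
    if cursor_path.exists():
        position = json.loads(cursor_path.read_text())["position"]
    chunk = courts[position : position + chunk_size]
    # A real loader would ingest the chunk here and only then commit the
    # cursor, so an interrupted run repeats the chunk instead of skipping it.
    cursor_path.write_text(json.dumps({"position": position + len(chunk)}))
    return chunk


with tempfile.TemporaryDirectory() as tmp:
    cursor = Path(tmp) / "seed-progress.json"
    courts = ["ca1", "ca2", "ca9"]  # placeholder court ids
    print(load_next_chunk(courts, cursor, chunk_size=2))  # first run
    print(load_next_chunk(courts, cursor, chunk_size=2))  # resumes after the first chunk
```

Because the cursor lives on disk, each invocation is independent: a scheduled job can call the loader once per run and stop when it returns an empty chunk.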

For AI agents

Start with AGENTS.md — it is the canonical instruction file and defines the branch-and-PR workflow every agent follows. CLAUDE.md points to it.

Repository layout

src/fedcourtsai/    library: CourtListener client, schemas, paths, registry, CLI
config/             predictor & evaluator registries, tracking settings
data/               tracked cases (versioned)
schemas/            JSON Schema exported from the pydantic models
docs/               architecture, data model, pipeline, security
.github/workflows/  the label-driven pipeline + CI + workflow linting
.github/prompts/    engine-agnostic prompts used by both Claude Code and Codex

Documentation

Data & attribution

Court data comes from CourtListener, a project of the Free Law Project — via the CourtListener REST API and the quarterly bulk-data exports. A great deal of this project rests on their work; please review and support it. Use of their data is governed by CourtListener's terms (CC BY-ND 4.0 for CourtListener's own content; the underlying federal records are public domain), with attribution also recorded in the top-level NOTICE.

The derived corpus is not publicly republished — it stays in an access-gated store; only our model-generated judgments over those public records reach public git. We ingest only public-record dockets and never sealed or privileged material. See docs/data-sources.md for the full position on terms, redistribution, the API budget, and PII.

FedCourtsAI is independent and is not affiliated with or endorsed by the Free Law Project or any court. Court records are public records of the U.S. federal courts; the predictions and evaluations in this repository are model-generated and are not official court records.

License

MIT — see LICENSE.
