billwatch

A cached, token-efficient CLI for querying US state legislative data. Designed for LLM consumption following AXI principles — the CLI is the data layer, the LLM is the intelligence layer.

What it does

Given a state and a keyword, billwatch returns legislative bills in a compact token-efficient format (TOON) with a help[] block pointing at the next useful command. Everything runs against a local SQLite cache with an FTS5 full-text index, so every query is offline and instant.

Data source

Bills come from OpenStates monthly Postgres dumps. OpenStates publishes a free dump covering every state + DC at data.openstates.org/postgres/. You restore it to a local Postgres, run billwatch import openstates once, and every state's bills, sponsors, abstracts, and full text land in the local SQLite cache. After that, all queries run offline — no API keys, no rate limits, no HTTP calls.

Refreshing is a monthly job: grab the next dump, re-run billwatch import openstates, and the cache is current. A live-refresh layer on top of LegiScan for sub-monthly freshness is planned but not yet implemented — tracked in the project's GitHub issues.

Install

git clone <this repo> billwatch
cd billwatch
uv sync --extra openstates --extra dev

No API keys to configure. The only prerequisite is a local Postgres to restore the dump into (see the import openstates workflow below).

Quick start

# what's cached, last sync
uv run billwatch

# one-time bulk import — every state, offline
uv run billwatch import openstates \
    --pg-url postgresql://openstates:test@localhost:54329/openstates \
    --with-text

# search (FTS5 against the local cache — 0 network calls)
uv run billwatch search NY "artificial intelligence"
uv run billwatch search IL "ai"

# single bill details
uv run billwatch bill NY S6953
uv run billwatch bill IL HB1806

# full bill text
uv run billwatch text NY S6953
uv run billwatch text IL HB1806

# browse cached bills
uv run billwatch ls            # all cached states
uv run billwatch ls NY         # all cached NY bills
uv run billwatch ls NY --status "signed by governor" --limit 20

Commands

`billwatch` (no args) — Ambient context

Prints version, cached state summary, and help suggestions. Run this first in any new conversation to see what's already available locally.

billwatch v0.1.0

cached[2]{state,bills,synced}:
  IL,108,2h ago
  NY,198,2h ago

help[]:
  Run 'billwatch search IL "tax"' to search cached bills
  Run 'billwatch ls' to list cached states
  Run 'billwatch --help' for all commands

`billwatch search STATE "KEYWORD"`

Runs an FTS5 search over the cached bills for the given state. Exits with cache_miss if the state has no cached bills yet — populate with billwatch import openstates --state XX first.

Options:

--since N — only show bills with last_action_date in the last N days
--status S — filter by status substring (case-insensitive), e.g. --status "signed"
--fields f1,f2 — add fields to the output (sponsors, url, description, session, bill_type)
--full — show all fields
--json — emit valid JSON instead of TOON (useful for piping into jq)

Query syntax notes:

Multi-word queries are automatically phrase-quoted for FTS5, so billwatch search NY "artificial intelligence" matches the exact phrase.
Single-word queries match any bill whose title, description, or bill number contains the word.
Malformed FTS5 queries (e.g. bare operators, unmatched quotes) return zero results rather than crashing.

`billwatch bill STATE BILL_NUMBER`

Shows full metadata for a single bill: sponsor, status, last action date, description, URL, etc. Description is truncated to 500 chars unless --full is passed. Returns not_found if the bill isn't in the cache.

uv run billwatch bill NY S6953
uv run billwatch bill IL HB1806 --full --json

`billwatch text STATE BILL_NUMBER`

Returns the complete bill text as plaintext from the cache. Default truncates to 4000 chars with a size hint; --full returns the complete text. Requires the import to have been run with --with-text.

uv run billwatch text NY S6953
uv run billwatch text IL HB1806 --full

Pipe through grep / head / tail to narrow in on specific sections:

uv run billwatch text NY S6953 --full | grep -A 3 "DEFINITIONS"

`billwatch ls [STATE]`

Without a state: lists all cached states with bill counts and freshness (relative time). With a state: lists the cached bills for that state, sorted by last_action_date desc.

Options (when a state is given):

--limit N — show N bills (default 20, 0 = all)
--status S — filter by status substring
--since N — bills with action in last N days
--fields, --full, --json — same as search

`billwatch import openstates`

Bulk-imports bills from an OpenStates monthly Postgres dump into the local SQLite cache. Run this once per month and you have every state's legislature indexed locally with zero ongoing API calls.

Workflow:

# 1. Download the monthly dump (~10 GB)
curl -O https://data.openstates.org/postgres/monthly/2026-04-public.pgdump

# 2. Start a local Postgres and restore
docker run -d --name openstates -e POSTGRES_PASSWORD=test \
    -e POSTGRES_USER=openstates -e POSTGRES_DB=openstates \
    -p 54329:5432 postgres:17
docker cp 2026-04-public.pgdump openstates:/tmp/dump.pgdump
docker exec openstates pg_restore --no-owner --no-privileges \
    -d openstates -U openstates /tmp/dump.pgdump

# 3. Install the optional psycopg dependency (already done if you ran `uv sync --extra openstates`)
uv sync --extra openstates --extra dev

# 4. Import one state (fast, ~30s for most states)
uv run billwatch import openstates \
    --pg-url postgresql://openstates:test@localhost:54329/openstates \
    --state NY \
    --with-text

# or: import every state at once (much slower, larger cache)
uv run billwatch import openstates \
    --pg-url postgresql://openstates:test@localhost:54329/openstates \
    --with-text

Options:

--pg-url URL — Postgres connection string. Alternatively set OPENSTATES_PG_URL env var.
--state XX — filter to one state (2-letter code). Default imports all states.
--limit N — max bills to import (useful for dry runs).
--with-text — also import full bill text from opencivicdata_searchablebill.raw_text. Slower but worth it.

What gets imported: bill identifier, title, status (derived from latest_passage_date + latest_action_description), sponsors, abstracts (as description), source URLs, session, and optionally raw text. Classification and related metadata are preserved in the cached data blob.

Output format — TOON

billwatch defaults to TOON (Token-Optimized Object Notation), a CSV-flavored format that's roughly 40% fewer tokens than JSON. A list looks like:

bill[3]{id,title,status,last_action}:
  S6953,Relates to the training and use of artificial intelligence frontier models,Signed by Governor,2025-12-19
  S7599,Relates to automated decision-making by government agencies,Signed by Governor,2025-10-07
  S822,Relates to the disclosure of automated employment decision-making tools and maintaining an artificial intelligence inventory,Signed by Governor,2025-12-19

Values containing commas are double-quoted (CSV-style with inner quotes doubled). Newlines are escaped as \n.

A single object:

bill{bill_number,title,status,last_action_date}:
  S6953
  Relates to the training and use of artificial intelligence frontier models
  Signed by Governor
  2025-12-19

If you need real JSON, pass --json.

Errors

Errors go to stdout (not stderr) as structured TOON with an exit code of 1:

error{code,message}:
  cache_miss
  No cached bills for CA. Run 'billwatch import openstates --state CA' to populate the cache.

Error codes you'll see:

cache_miss — the requested state has no cached bills; run import openstates first
not_found — bill lookup didn't match anything in the cache
invalid_state — unknown 2-letter state code
invalid_source — import <source> used an unknown source (only openstates is supported)
missing_pg_url — import openstates without --pg-url or OPENSTATES_PG_URL
missing_dependency — psycopg isn't installed; run uv sync --extra openstates

No prompts, no stderr noise, clean exit codes — safe to pipe and script.

Cache

Stored at ~/.billwatch/cache.db. SQLite with two tables:

bills — normalized bill records keyed on (state, bill_number, session)
bill_texts — plaintext bill bodies, joined by internal rowid

Plus a bills_fts FTS5 virtual table for local full-text search over cached bills (title, description, bill_number).

The cache is append/replace-only: import openstates upserts whatever's in the current dump. Delete the file to reset.

What billwatch is NOT

Not a summarizer. It doesn't categorize, rank, or generate text. If you ask "which IL bills have teeth?" billwatch returns the raw bill texts — an LLM (or a human) does the reasoning.
Not a live data source. The cache is only as fresh as the last OpenStates dump you imported (monthly cadence). For sub-monthly freshness, a LegiScan live-refresh layer is tracked in the GitHub issues.
Not magic about false positives. Keyword search hits anything containing the word. Many bills tagged "artificial intelligence" are budget line items or unrelated statutes that mention AI in passing. Expect to filter.

Running tests

uv run pytest tests/ -x -q
uvx ruff check src/ tests/
uvx ruff format --check src/ tests/ --line-length 144

All tests are offline — no network access required. The OpenStates importer tests mock psycopg cursors directly, so the suite runs even without psycopg installed.

Architecture sketch

              ┌────────────────────────┐
              │ OpenStates monthly dump│  user restores to local Postgres
              └───────────┬────────────┘
                          │ billwatch import openstates
                          ▼
              ┌────────────────────────┐
              │    SQLite cache        │  bills, bill_texts, bills_fts (FTS5)
              │    ~/.billwatch        │
              └───────────┬────────────┘
                          │ every query hits the cache
                          ▼
              ┌────────────────────────┐
              │          CLI           │  search, bill, text, ls, import
              └────────────────────────┘

Only two Python modules matter at runtime: openstates.py (one-shot bulk importer) and cache.py (SQLite + FTS5 reads). The CLI is a thin Click wrapper on top.

Design notes

Things that are deliberate and worth understanding before changing:

No ORM anywhere. SQLite uses raw sqlite3 with parameterized queries; OpenStates Postgres access uses raw psycopg with parameterized queries. Keeps the schema visible, debugging tractable, and dependencies minimal.
Unified bill dict shape. Bills normalize to {bill_id, state, bill_number, title, description, status, last_action_date, url, sponsors, data, fetched_at, session, source}. The cache, formatter, and CLI treat this as the canonical shape — any new source (e.g., a future LegiScan adapter) should produce dicts that match.
bill_id = None is a feature. The cache's upsert_bill dispatches on whether the caller supplied an integer bill_id or passed None. OpenStates passes None so SQLite auto-assigns a rowid, and the semantic identity lives in (state, bill_number, session).
FTS5 update ordering matters. bills_fts is an external-content FTS5 table pointing at bills. Updating a row means: DELETE FROM bills_fts WHERE rowid = ? first (while the old content is still in bills), then UPDATE bills, then INSERT INTO bills_fts with the new values. Deleting after the update would leave stale terms indexed. See upsert_bill in cache.py.
Batched enrichment in the importer. The OpenStates importer fetches bills in one query, then batches sponsors/abstracts/sources/texts in 500-row chunks using WHERE bill_id = ANY(%s). Limits round-trips and memory for a 10 GB dump.
Optional dependencies load lazily. psycopg is only installed via the [openstates] extra and is imported under try/except ImportError. Normal users don't pay for a dep they won't use, and the test suite runs without it because the tests mock psycopg cursors.
Errors on stdout, not stderr. All command failures emit structured TOON error{code,message} on stdout with exit code 1. This makes billwatch safe to pipe and script — consumers can distinguish success from failure by exit code without losing the error payload.
Cache-only at query time. search, bill, text, and ls never hit the network. If the cache is stale or missing data, the answer is to re-run import openstates, not to fall back to a live scrape.

Extending billwatch

The current architecture supports exactly one data source (OpenStates bulk import). The planned extension is a LegiScan live-refresh layer that sits alongside the importer and freshens individual bills between monthly dumps — see the LegiScan issue in the GitHub tracker for scope.

When that's built, the seams are:

A new legiscan.py module that returns the unified bill dict shape for a (state, bill_number) lookup.
A refresh hook in bill and text commands that calls LegiScan when a cached entry is older than some TTL.
A quota counter for the LegiScan API (the prior api_usage table was removed when the old scrapers were ripped out — reintroduce it if needed).

Until that lands, any query that needs fresh data should be answered by re-running the monthly import.

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
src/billwatch		src/billwatch
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
PROMPT.md		PROMPT.md
README.md		README.md
pyproject.toml		pyproject.toml
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

billwatch

What it does

Data source

Install

Quick start

Commands

`billwatch` (no args) — Ambient context

`billwatch search STATE "KEYWORD"`

`billwatch bill STATE BILL_NUMBER`

`billwatch text STATE BILL_NUMBER`

`billwatch ls [STATE]`

`billwatch import openstates`

Output format — TOON

Errors

Cache

What billwatch is NOT

Running tests

Architecture sketch

Design notes

Extending billwatch

See also

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

billwatch

What it does

Data source

Install

Quick start

Commands

billwatch (no args) — Ambient context

billwatch search STATE "KEYWORD"

billwatch bill STATE BILL_NUMBER

billwatch text STATE BILL_NUMBER

billwatch ls [STATE]

billwatch import openstates

Output format — TOON

Errors

Cache

What billwatch is NOT

Running tests

Architecture sketch

Design notes

Extending billwatch

See also

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

`billwatch` (no args) — Ambient context

`billwatch search STATE "KEYWORD"`

`billwatch bill STATE BILL_NUMBER`

`billwatch text STATE BILL_NUMBER`

`billwatch ls [STATE]`

`billwatch import openstates`

Packages