ContractWatch

Federal contract anomaly flagging tool. Published as a static dashboard at contractwatch.org.

ContractWatch screens every federal prime contract award above $1M against three patterns, then publishes only the flagged awards that survive a structural filter. Source data is USASpending.gov. Output is a static site served from Cloudflare Pages. No accounts, no API keys, no closed-source services.

The three flags

Code	Pattern
F01	No prior federal contracts for this UEI plus sole-source award above $10M
F02	No prior federal contracts plus competitive solicitation that received only one offer, above $25M
F03	No prior federal contracts plus first contract above $25M (regardless of competition)

A fired flag is descriptive, not a finding. It marks an award as worth a closer look, and many flagged awards have routine explanations. Every flagged award on the dashboard links to its USASpending record so readers can review the underlying contract directly.
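The three patterns in the table above can be sketched as a single predicate function. This is a minimal illustration, not the code in engine/flags.py: the Award record and its field names are hypothetical, the constant names follow those listed under "Adjusting the flags", and "above" is assumed to mean a strict greater-than comparison.

```python
from dataclasses import dataclass

# Dollar floors from the flag table (defaults per this README).
CRITICAL_SOLE_SOURCE_MIN = 10_000_000   # F01
ONE_OFFER_MIN_OBLIGATION = 25_000_000   # F02
FIRST_LARGE_AWARD_MIN = 25_000_000      # F03


@dataclass
class Award:
    # Hypothetical record shape, for illustration only.
    obligation: float
    has_prior_awards: bool   # does this UEI have earlier federal contracts?
    sole_source: bool
    competed: bool
    offers_received: int


def fired_flags(a: Award) -> list[str]:
    """Return the flag codes that fire on one award."""
    # All three flags require a recipient with no prior federal contracts.
    if a.has_prior_awards:
        return []
    flags = []
    if a.sole_source and a.obligation > CRITICAL_SOLE_SOURCE_MIN:
        flags.append("F01")
    if a.competed and a.offers_received == 1 and a.obligation > ONE_OFFER_MIN_OBLIGATION:
        flags.append("F02")
    if a.obligation > FIRST_LARGE_AWARD_MIN:
        flags.append("F03")
    return flags
```

Under these assumptions a single award can fire more than one flag: a $30M competed award with one offer to a new recipient fires both F02 and F03.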

The structural filter

A flag-firing award is stripped if it matches a structural rule. The rules cover patterns that fire the flags by design: major federal-prime subsidiaries, joint ventures formed for specific procurements, ANC and tribal subsidiaries, municipal authorities, healthcare providers, US utilities, foreign government recipients, FFRDC and national-lab operators, bridge-contract extensions, and a curated list of approximately 1,400 recipient names verified safe by individual review. See engine/structural_filter.py.

The curated list is maintained by batching new flag candidates and verifying each against SAM.gov entity records (formation date, address, business types, registration history), USASpending contract history (prior federal work under the same UEI), and public web sources (company website, news coverage, public-company filings). Investigation is performed by Claude Opus models with each batch reviewed and approved by a human before names are committed. When verification is ambiguous, the entity stays on the dashboard rather than getting safelisted. The filter is updated periodically rather than on every monthly refresh.

Dashboard features

The dashboard at contractwatch.org reads the static JSON files generated by export_json.py and renders them with the following features:

Sort and filter. Awards can be sorted by dollars, action date, or recipient name, and filtered by agency, state, NAICS, PSC, fiscal year, or flag code.
Per-award detail. Click any award to expand the full record: PIID, period of performance, contract type, competition mechanism, awarding office, the full description, and the list of flags that fired on it with their rationale text.
USASpending link. Every award includes a direct link to its public USASpending record so readers can verify the underlying contract data.
Export CSV. The "Export CSV" button in the dashboard header generates a CSV of the currently visible awards (respects active filters and sort order) and triggers a browser download. The file is named contractwatch-awards-MMDDYY.csv and contains 18 columns: piid, recipient, uei, state, agency, office, obligation, action_date, start_date, end_date, contract_type, competition, naics, psc, description, flag_codes, flag_details, usaspending_url. The CSV is generated client-side in the browser from the loaded JSON; no server round-trip, no separate export script to run.

Architecture

USASpending bulk archives (annual FY zips)
        |
        v
  tools/bulk_loader.py                 (one-time / monthly rebuild)
        |
        v
  contractwatch.db (SQLite, single file)
        |
        v
  tools/reflag_all.py + engine/flags.py + engine/structural_filter.py
        |
        v
  flags table
        |
        v
  export_json.py  ->  web/data/latest.json, stats.json, history/*.json
        |
        v
  web/index.html  ->  Cloudflare Pages  ->  contractwatch.org

A scheduled monthly refresh (monthly_scan.sh, fired by launchd on the 8th of each month) re-runs this whole pipeline against the latest USASpending bulk archive snapshot. scan.py remains available for manual catch-up scans against the live USASpending API when needed, but is no longer scheduled.

After deploy, monthly_scan.sh runs tools/build_review_queue.py to write a local-only logs/review_queue_<date>.json of new-and-uninvestigated awards, and sends an iMessage success summary with headline stats if CONTRACTWATCH_NOTIFY_PHONE is set.

Prerequisites

Before starting, the following must be installed on the local machine.

Python 3.11 or newer. Check the installed version with python3 --version. macOS ships with Python 3.9 in /usr/bin/python3 which is too old; install a newer version via python.org, Homebrew (brew install python@3.12), or uv python install 3.12.
uv, the Python package manager from Astral. Used in place of pip + virtualenv. Install on macOS or Linux with curl -LsSf https://astral.sh/uv/install.sh | sh, or on Windows with powershell -c "irm https://astral.sh/uv/install.ps1 | iex". ContractWatch uses uv for environment isolation and dependency resolution because it is roughly 10x faster than pip and handles Python version installation in one tool.
Approximately 25 GB of free disk space. Roughly 20 GB for the downloaded USASpending bulk archives (held on disk during processing) plus a few hundred MB for the resulting SQLite database. Add a few GB of headroom for working files. The 12 fiscal-year archives total about 19.76 GB downloaded; the final database after deduplication is around 356 MB.
Optional: wrangler. Cloudflare's CLI for deploying static sites to Cloudflare Pages. Only required if pushing the dashboard to a Cloudflare-hosted site. Install with npm install -g wrangler. The dashboard can also be served locally or from any other static host without wrangler.

From clone to live dashboard

Six steps. The full sequence takes roughly 30 minutes the first time, almost all of which is the bulk database build. Subsequent runs (reflag, export, view) take seconds.

Step 1. Install dependencies

From the repository root:

uv sync

This creates a local virtual environment in .venv/ and installs the project's Python dependencies (just requests, since ContractWatch deliberately keeps the dependency tree minimal). It also reads pyproject.toml for any Python-version constraints and downloads a matching interpreter if needed.

Step 2. Configure overrides (optional)

ContractWatch ships with reasonable defaults and runs without any configuration. To customize behavior, copy the example environment file to a real one:

cp .env.example .env

Then edit .env to set any of the optional variables documented in the Configuration section below (excluded agencies, scan window, Cloudflare project name, etc.). The .env file is gitignored so local settings stay out of the repository.

Step 3. Build the database

This step downloads USASpending's annual bulk archives, parses them, and loads the resulting awards into a SQLite database at contractwatch.db in the repository root.

uv run python tools/bulk_loader.py --mode initial

--mode initial loads FY2015 through the current federal fiscal year. The loader computes the current FY from today's date (FY starts Oct 1) and discovers the latest USASpending bulk archive snapshot date via HEAD probes against a known-stable FY URL. No URL editing is required when fiscal years roll over or when USASpending publishes a new monthly snapshot. The monthly scheduled refresh uses --mode monthly, which loads only the previous closed FY plus the current FY (the active years that change month over month).
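The fiscal-year arithmetic described above can be sketched as follows. The function names are hypothetical and this is not the loader's actual code; it only illustrates the October 1 rollover and the year ranges the two modes cover. Snapshot-date discovery via HEAD probes is left out.

```python
from datetime import date


def federal_fiscal_year(today: date) -> int:
    """Federal FY N runs from Oct 1 of year N-1 through Sep 30 of year N."""
    return today.year + 1 if today.month >= 10 else today.year


def fiscal_years_for_mode(mode: str, today: date, first_fy: int = 2015) -> list[int]:
    """Fiscal years to load for --mode initial or --mode monthly."""
    current = federal_fiscal_year(today)
    if mode == "initial":
        # FY2015 through the current fiscal year, inclusive.
        return list(range(first_fy, current + 1))
    if mode == "monthly":
        # Previous closed FY plus the current FY.
        return [current - 1, current]
    raise ValueError(f"unknown mode: {mode!r}")
```

For a run on 2026-05-28 this yields FY2015 through FY2026 for the initial mode, which matches the 12 archives mentioned elsewhere in this README.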

FY15, FY16, and FY17 are loaded as history-only: they populate the prior-history lookback that the flag pipeline uses to decide "is this entity new to federal contracting" but they are never themselves evaluated as flag candidates. The dashboard begins at FY18 (start of October 2017).

Expect roughly 26 minutes total wall time on a fast connection for the initial load. The loader pipelines downloads and parsing: while one archive is being parsed (~2 min), the next archive is already being fetched (~45 sec at typical USASpending S3 speed). Total downloaded across the 12 archives is about 19.76 GB; the resulting SQLite database after deduplication is around 356 MB. Subsequent monthly refreshes (--mode monthly) download just 2 archives (~3-4 GB, ~5 min). Watch live progress in a browser by running this in a separate terminal window:

python -m http.server 8000 -d web

Then open http://localhost:8000/loader.html in a browser. The page polls web/data/loader_status.json every two seconds and shows download speed, parse progress, and per-FY status.

If you are driving the build through an AI coding assistant (Claude Code, Cursor, etc.) that uses ephemeral shells, do NOT use python -m http.server & from the assistant's bash. Background processes started with & inside an assistant's shell frequently die when that shell exits, leading to a confusing "can't connect" pattern. Use ./serve_loader.sh instead, which uses nohup and disown to truly detach the server so it survives shell-context death.
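The download/parse overlap described earlier in this step can be sketched with a one-worker thread pool that prefetches the next archive while the current one is parsed. This is an illustration of the technique under stated assumptions, not the code in tools/bulk_loader.py; the download and parse callables are hypothetical stand-ins supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def pipelined_load(
    fiscal_years: Iterable[int],
    download: Callable[[int], str],   # hypothetical: fetch one FY archive, return its path
    parse: Callable[[str], int],      # hypothetical: parse an archive, return rows loaded
) -> List[int]:
    """Parse archive N on the main thread while archive N+1 downloads."""
    years = list(fiscal_years)
    if not years:
        return []
    results = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(download, years[0])
        for i in range(len(years)):
            archive = pending.result()              # wait for the current archive
            if i + 1 < len(years):
                # Start fetching the next archive before parsing this one.
                pending = pool.submit(download, years[i + 1])
            results.append(parse(archive))
    return results
```

Because parsing (~2 min) takes longer than fetching (~45 sec), wall time under these per-archive figures is roughly the first download plus twelve parses, about 25 minutes, which is in line with the 26-minute estimate.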

Step 4. Apply the flags

With the database built, run the flag pipeline:

uv run python tools/reflag_all.py

This evaluates F01/F02/F03 against every award in the database, applies the structural filter, and writes the surviving flags to the flags table. The full pass completes in about 1.5 seconds because the flag-eligible subset is pulled via three bulk SQL queries rather than per-row Python iteration. Output prints the candidate count, survivor count, and how many candidates were stripped by the structural filter.
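To illustrate why a bulk SQL query is fast here, the sketch below selects F01 candidates in one statement against an in-memory SQLite database. The table layout, column names, and sample rows are assumptions made for this example; the real schema lives in engine/db.py and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE awards (
    award_key   TEXT PRIMARY KEY,
    uei         TEXT NOT NULL,
    action_date TEXT NOT NULL,   -- ISO dates compare correctly as text
    obligation  REAL NOT NULL,
    sole_source INTEGER NOT NULL
);
""")
conn.executemany(
    "INSERT INTO awards VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "UEI-NEW",   "2024-03-01", 15_000_000, 1),  # new entity, sole-source
        ("B1", "UEI-OLD",   "2019-06-01",  2_000_000, 0),  # earlier history for UEI-OLD
        ("B2", "UEI-OLD",   "2024-04-01", 40_000_000, 1),  # has prior award B1
        ("C1", "UEI-SMALL", "2024-05-01",  5_000_000, 1),  # below the F01 floor
    ],
)

# One set-based query: sole-source awards above the floor whose UEI has
# no earlier award anywhere in the table.
F01_SQL = """
SELECT a.award_key
FROM awards AS a
WHERE a.sole_source = 1
  AND a.obligation > :floor
  AND NOT EXISTS (
        SELECT 1 FROM awards AS p
        WHERE p.uei = a.uei AND p.action_date < a.action_date
  )
ORDER BY a.award_key
"""
f01_candidates = [row[0] for row in conn.execute(F01_SQL, {"floor": 10_000_000})]
print(f01_candidates)
```

Only A1 qualifies in this sample: B2 is excluded by the prior award B1, and C1 falls below the $10M floor.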

The previous flags table is backed up to a timestamped table (e.g., flags_backup_20260528_071500) before being replaced, so prior flag state can be inspected or restored if needed.
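A timestamped backup of this kind can be sketched in a few lines. The helper name is hypothetical and the real logic in tools/reflag_all.py may differ; the name format follows the example above.

```python
import sqlite3
from datetime import datetime


def backup_flags_table(conn: sqlite3.Connection, now: datetime) -> str:
    """Copy the flags table into flags_backup_YYYYMMDD_HHMMSS and return that name."""
    backup_name = f"flags_backup_{now:%Y%m%d_%H%M%S}"
    # Table names cannot be bound as SQL parameters. Interpolation is safe
    # here only because the name is built from a timestamp, not user input.
    conn.execute(f'CREATE TABLE "{backup_name}" AS SELECT * FROM flags')
    conn.commit()
    return backup_name
```

Restoring is then a matter of copying rows back from the backup table, for example with INSERT INTO flags SELECT * FROM the chosen backup.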

Step 5. Generate the dashboard JSON

Convert the populated flags table into the static JSON files the dashboard reads:

uv run python export_json.py

This writes web/data/latest.json (the flagged awards), web/data/stats.json (running totals), and a dated archive at web/data/history/YYYY-MM-DD.json. The publish filter (drop pre-FY18 action dates) is applied at this step, so the published count is slightly lower than the in-database flag count.
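The publish filter and the dated archive name can be sketched as below. This is a minimal illustration, assuming each flagged award is a dict with an ISO-formatted action_date key; the function names are hypothetical and export_json.py may structure this differently.

```python
from datetime import date

# FY18 begins on October 1, 2017.
FY18_START = date(2017, 10, 1)


def publishable(awards: list[dict]) -> list[dict]:
    """Drop awards whose action date falls before the start of FY18."""
    return [a for a in awards if date.fromisoformat(a["action_date"]) >= FY18_START]


def history_filename(run_date: date) -> str:
    """Dated archive name in the web/data/history/YYYY-MM-DD.json pattern."""
    return f"{run_date.isoformat()}.json"
```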

Step 6. View the dashboard

Serve the web/ folder locally:

python -m http.server 8000 -d web

Then open http://localhost:8000/ in a browser. The dashboard reads the JSON files generated in step 5 and renders the flagged awards with sorting, filtering, and per-award detail expansion.

Monthly refresh

The scheduled workflow is the monthly_scan.sh shell script. It runs the bulk loader in --mode monthly (the previous closed FY plus the current FY), runs the bulk reflag, regenerates the dashboard JSON, and (if wrangler is configured) deploys to Cloudflare Pages.

./monthly_scan.sh

Expect roughly 5 minutes for the two-archive download (~3-4 GB). A full --mode initial rebuild takes roughly 26 minutes of wall time and ~25 GB of free disk.

For automated monthly runs on macOS, copy launchd/com.contractwatch.plist.example to ~/Library/LaunchAgents/com.contractwatch.plist, edit the two hardcoded absolute paths (launchd does not expand ~), then load the job:

launchctl load ~/Library/LaunchAgents/com.contractwatch.plist

The example template fires once a month on the 8th at 07:00 local time. USASpending typically publishes the monthly archive snapshot on the 5th or 6th, so the 8th gives a small buffer for the upstream data to settle. Adjust the Day, Hour, and Minute values in the plist to change the schedule. To disable the job, run launchctl unload with the same path.
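For readers unfamiliar with launchd calendar schedules, the sketch below builds a comparable job definition with Python's plistlib and prints the XML. It is not the contents of the shipped template: the Label value and the script path are placeholders assumed for illustration, while StartCalendarInterval with Day, Hour, and Minute is the standard launchd key for this kind of schedule.

```python
import plistlib

job = {
    # Assumed label; the shipped template may use a different one.
    "Label": "com.contractwatch",
    # Placeholder absolute path; launchd does not expand ~.
    "ProgramArguments": ["/ABSOLUTE/PATH/TO/contractwatch/monthly_scan.sh"],
    # Fire on the 8th of each month at 07:00 local time.
    "StartCalendarInterval": {"Day": 8, "Hour": 7, "Minute": 0},
}

plist_bytes = plistlib.dumps(job)   # XML plist by default
print(plist_bytes.decode())
```

Changing the schedule means editing only the three integers in StartCalendarInterval.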

Manual catch-up scans

scan.py remains available for ad-hoc catch-up against the live USASpending API when you need data fresher than the monthly bulk archive:

uv run python scan.py --start 2026-05-06 --end 2026-05-28 --min-amount 1000000

This is not scheduled by default. Use it sparingly; the live API is rate-limited and the monthly bulk archive is the authoritative source.

Project layout

contractwatch/
├── README.md              this file
├── LICENSE                MIT
├── pyproject.toml         uv-managed deps (only `requests`)
├── uv.lock
├── .env.example           optional env-var overrides; all keys optional
├── .gitignore             blocks DB, archives, .env, .venv, cache, web/data
├── wrangler.toml          Cloudflare Pages config
│
├── scan.py                CLI: live scan via USASpending API
├── export_json.py         build web/data/latest.json + stats.json
├── monthly_scan.sh        launchd-friendly bulk-load + reflag + export + deploy
├── serve_loader.sh        detached HTTP server for web/loader.html (use with AI coding assistants)
│
├── engine/                core flag pipeline
│   ├── config.py          thresholds, paths, env-var loader
│   ├── db.py              SQLite schema + helpers
│   ├── usaspending.py     USASpending HTTP client (transport only)
│   ├── normalize.py       award normalization (business logic)
│   ├── scanner.py         live scan engine
│   ├── flags.py           F01/F02/F03 definitions + detail formatters
│   └── structural_filter.py  rules + curated safe-recipient list
│
├── tools/
│   ├── bulk_loader.py            load USASpending archive ZIPs into the DB; --mode initial (FY15-current) or --mode monthly (prev FY + current FY); URLs and snapshot date generated internally
│   ├── reflag_all.py             bulk SQL re-flag of the full DB (1.5s)
│   └── build_review_queue.py     diff post-deploy latest.json against prior snapshot; write logs/review_queue_<date>.json of new-and-uninvestigated awards
│
├── launchd/
│   └── com.contractwatch.plist.example   launchd job template (macOS)
│
├── logs/                  local-only working state (gitignored)
│   ├── agent_verdicts.json     running history of per-recipient Opus verdicts; consulted by build_review_queue.py to skip already-investigated entities
│   ├── snapshots/              per-cycle baselines of latest.json, used for month-to-month diffs
│   └── review_queue_<date>.json   per-cycle queue of new-and-uninvestigated awards (produced by build_review_queue.py)
│
└── web/                   static dashboard (served by Cloudflare Pages)
    ├── index.html         main flagged-awards view
    ├── loader.html        live bulk-loader status (polls loader_status.json)
    ├── llms.txt           machine-readable site description
    ├── robots.txt, sitemap.xml, favicons, social-preview.png
    └── data/              generated JSON (gitignored; regenerate via export_json.py)

Data

Source: USASpending.gov, the federal government's authoritative public spending dataset. The annual bulk archives are used for both the initial database build and the monthly refresh. The search API is available for manual catch-up between monthly snapshots via scan.py but is not scheduled.
Coverage: Federal fiscal year 2018 (October 2017) through the latest USASpending publication. Most $1M+ prime contract actions report to USASpending on a 30-45 day lag; very recent activity will fill in retrospectively.
DB shape: One row per unique contract_award_unique_key, tagged with the LATEST action_date seen across all archives. A multi-year IDV with transactions in FY18-FY26 has a single row tagged with the FY26 date. The flag pipeline keys off "is this entity new" (does the UEI have any prior awards), which is the signal that matters for F01/F02/F03.
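The one-row-per-award shape described above can be illustrated with a SQLite upsert that keeps the latest action_date seen for each key. The table layout and sample transactions are assumptions for this sketch, not the real schema, and it relies on SQLite 3.24 or newer for ON CONFLICT support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE awards (
    contract_award_unique_key TEXT PRIMARY KEY,
    uei         TEXT,
    action_date TEXT            -- ISO dates compare correctly as text
)
""")

# On a repeated key, keep whichever action_date is later.
UPSERT = """
INSERT INTO awards (contract_award_unique_key, uei, action_date)
VALUES (?, ?, ?)
ON CONFLICT(contract_award_unique_key) DO UPDATE
SET action_date = MAX(action_date, excluded.action_date)
"""

# Three transactions on the same multi-year IDV, arriving out of order.
transactions = [
    ("IDV-1", "UEI-A", "2018-02-10"),
    ("IDV-1", "UEI-A", "2026-01-15"),
    ("IDV-1", "UEI-A", "2021-07-04"),
]
conn.executemany(UPSERT, transactions)
rows = conn.execute(
    "SELECT contract_award_unique_key, action_date FROM awards"
).fetchall()
print(rows)
```

The table ends with a single row for IDV-1 carrying the 2026 date, regardless of the order in which the transactions were loaded.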

Deployment

The dashboard is a static web/ folder. ContractWatch deploys to Cloudflare Pages via wrangler, but any static host works. To run your own copy on Cloudflare:

Create a Cloudflare account and a Pages project named whatever you like.
Install wrangler and run wrangler login.
Set CONTRACTWATCH_CF_PROJECT in your .env to your project's name.
Run wrangler pages deploy web --project-name="$CONTRACTWATCH_CF_PROJECT", or just run monthly_scan.sh.

You can also skip Cloudflare entirely and serve web/ from anywhere: python -m http.server, S3, GitHub Pages, nginx.

Configuration

All env vars are optional. See .env.example for the full list.

CONTRACTWATCH_EXCLUDED_AGENCIES: pipe-delimited agency names to skip at ingestion and strip at reflag time
CONTRACTWATCH_BACKFILL_DAYS: default lookback window in days for the manual scan.py catch-up tool, default 2 (used only when scan.py is invoked without explicit --start/--end / --days flags)
CONTRACTWATCH_CF_PROJECT: Cloudflare Pages project name for monthly_scan.sh
CONTRACTWATCH_NOTIFY_PHONE: optional E.164 phone (e.g. +15551234567) for iMessage notifications from monthly_scan.sh. Sends one success message with headline stats on a clean run and one failure message with phase plus exit code on any phase failure. macOS only (uses Messages.app via osascript). Leave empty to disable.
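Reading two of these variables might look like the sketch below. The helper names are hypothetical and the actual loader in engine/config.py may behave differently, for instance in how it trims whitespace or treats empty values.

```python
import os
from typing import Mapping


def excluded_agencies(env: Mapping[str, str] = os.environ) -> list[str]:
    """Split the pipe-delimited agency list, dropping blanks and stray spaces."""
    raw = env.get("CONTRACTWATCH_EXCLUDED_AGENCIES", "")
    return [name.strip() for name in raw.split("|") if name.strip()]


def backfill_days(env: Mapping[str, str] = os.environ) -> int:
    """Lookback window for scan.py; falls back to the documented default of 2."""
    raw = env.get("CONTRACTWATCH_BACKFILL_DAYS", "").strip()
    return int(raw) if raw else 2
```

Passing a plain dict in place of os.environ makes the parsing easy to check without touching the real environment.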

Adjusting the flags

Threshold knobs live in engine/config.py:

MIN_OBLIGATION: ingestion floor (default $1M)
CRITICAL_SOLE_SOURCE_MIN: F01 dollar floor (default $10M)
FIRST_LARGE_AWARD_MIN: F03 dollar floor (default $25M)

F02's ONE_OFFER_MIN_OBLIGATION is in engine/flags.py (default $25M). Edit a value, run uv run python tools/reflag_all.py, then uv run python export_json.py. New thresholds take effect immediately. No DB rebuild needed.

Extending the structural filter

Two ways to mark a recipient safe:

By name: add to CURATED_SAFE_RECIPIENT_NAMES in engine/structural_filter.py. Best for known-safe entities the existing rules don't catch (verified municipal authorities, specific hospitals, etc.).
By pattern: add a new StructuralRule to the STRUCTURAL_RULES list. Best for an entire category of false positives (a new kind of JV structure, a regional utility cooperative pattern, etc.).

Both are pure Python; no schema changes, no DB rebuild. Run tools/reflag_all.py to see the new exclusion take effect.
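The two extension points might be shaped roughly like this. The names StructuralRule, STRUCTURAL_RULES, and CURATED_SAFE_RECIPIENT_NAMES come from this README, but the field layout, the example rule, the sample recipient name, and the strip_reason helper are all assumptions for illustration; check engine/structural_filter.py for the real definitions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StructuralRule:
    # Assumed shape: a rule name plus a predicate over one award record.
    name: str
    matches: Callable[[dict], bool]


def _looks_like_joint_venture(award: dict) -> bool:
    # Hypothetical pattern rule; real JV detection is likely more careful.
    name = award["recipient"].upper()
    return "JOINT VENTURE" in name or name.endswith(" JV")


# By name: a made-up entry standing in for a verified-safe recipient.
CURATED_SAFE_RECIPIENT_NAMES = {"EXAMPLE COUNTY WATER AUTHORITY"}

# By pattern: one rule per category of false positive.
STRUCTURAL_RULES = [
    StructuralRule("joint_venture", _looks_like_joint_venture),
]


def strip_reason(award: dict) -> Optional[str]:
    """Return why an award would be stripped, or None if it survives."""
    if award["recipient"].strip().upper() in CURATED_SAFE_RECIPIENT_NAMES:
        return "curated_safe_name"
    for rule in STRUCTURAL_RULES:
        if rule.matches(award):
            return rule.name
    return None
```

Returning the rule name rather than a bare boolean makes it easier to report how many candidates each rule stripped.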

Tips for builders

Flags and the structural filter are paired by design. A flag is the positive signal: it catches a pattern. The structural filter is the negative signal: it strips patterns that fire mechanically rather than meaningfully (major-prime subsidiaries, named JVs, ANC sole-source, M&O contracts, etc.). Neither is useful without the other.

A flag without a paired structural filter is not a usable signal. Empirically, raw flag candidates outnumber surviving flags by 3:1 to 5:1 after structural filtering. Adding a flag without doing the matching filter work means dumping unreviewed noise onto the dashboard.

Plan flags in groups, not singletons. F01/F02/F03 form one group targeting the no-prior-history large-award pattern and share structural-filter rules across all three. A new flag for a different anomaly category (pass-through, repeat sole-source clustering, bid concentration) should be planned as its own group with its own filter work for that category's noise.

Test empirically before committing. Write the candidate SQL, apply the existing structural filter, inspect what survives. If most survivors are clearly legitimate, the signal-to-noise ratio is too low and the flag is not worth adding.

Not every detectable pattern is worth detecting. Leaving a coverage gap acknowledged is better than filling it with noise.

License

MIT. See LICENSE. Source data is public-domain US federal government data.
