Texas civic data, accessible to anyone who can search Google. A multi-agent system that turns plain-English questions into sourced answers across 6,061 Texas open-data datasets. Free. Open source. MIT-licensed.
You type a question in plain English. A team of OpenAI-powered agents picks the right dataset, writes the SoQL query, runs it on the source-of-truth portal, and hands you a sourced answer — every claim citable back to the originating portal, every step replayable.
Live: txlookup.vercel.app · Try it: Restaurants near 78704 with failing inspections this year · Pitch: /pitch
```
$ "Where do construction permits cluster in Austin in the last 30 days?"
→ Planner picks 3syk-w9eu (Austin Construction Permits)
→ Analyst runs $select=original_zip,count(*) $where=issue_date>='2026-04-10' $group=original_zip
→ 412 rows, 870ms
→ Critic verifies grounding
→ Reporter composes answer
→ ~7 seconds end-to-end · cited to data.austintexas.gov · replayable SODA URL
```
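The trace ends in a replayable SODA URL. A minimal sketch of how one could be reconstructed from the plan — `soda_url` is a hypothetical helper, not project code; the dataset ID, domain, and parameters are taken from the trace above:

```python
from urllib.parse import urlencode

def soda_url(domain: str, dataset_id: str, params: dict) -> str:
    # Socrata SODA resources live at https://<domain>/resource/<id>.json;
    # query parameters ($select, $where, $group) are plain URL parameters.
    return f"https://{domain}/resource/{dataset_id}.json?" + urlencode(params)

url = soda_url(
    "data.austintexas.gov",
    "3syk-w9eu",
    {
        "$select": "original_zip,count(*)",
        "$where": "issue_date>='2026-04-10'",
        "$group": "original_zip",
    },
)
```

Anyone can paste that URL into a browser and re-run the exact query the agent ran.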
Civic data is public. Reaching it isn't.
- 6 portals, two API styles (Socrata + CKAN). Different IDs, different conventions.
- Schema drift — 180+ columns just for permits, with overlapping semantics (`permittype` vs `work_class` vs `permit_class_mapped`).
- Brutal SoQL — `$where`, `$group`, `date_extract_y`, double-quoting, escape rules. One typo and the query 400s.
- Download + sift — the current path is "open the 200k-row CSV in a spreadsheet". Most people give up.
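Those escape rules are tameable with a tiny quoting helper. A sketch assuming SQL-style doubling of embedded single quotes, which is how SoQL string literals are escaped — `soql_quote` and `soql_where_eq` are illustrative names, not project code:

```python
def soql_quote(value: str) -> str:
    # SoQL string literals use single quotes; an embedded quote is
    # escaped by doubling it. One missed quote and the query 400s.
    return "'" + value.replace("'", "''") + "'"

def soql_where_eq(column: str, value: str) -> str:
    # Build a $where clause like: name='O''Brien Plumbing'
    return f"{column}={soql_quote(value)}"
```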
TXLookup is the layer between you and 6,061 datasets. If you can search Google, you can ask Texas civic data anything.
Seven specialist agents coordinate behind a single search box:
| Agent | Role |
|---|---|
| Planner | Picks the dataset, drafts a structured plan with bounded tool calls. |
| Data analyst | Writes SoQL, computes stats with quality flags (null rate, top concentration, sample factor). |
| Reporter | Composes plain-English answer, grounded in the analyst's findings. |
| Critic | Reviews plan + answer for groundedness and citation. Forces revision on reject. |
| Support | Handles meta-questions and disambiguation. No SoQL fired. |
| Dataset scout (cron) | Indexes new portal datasets every 6h. |
| Ingestor (cron) | Refreshes the local-mirror cache so pages stay fast and survive throttling. |
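The analyst's quality flags could be computed along these lines — a sketch of plausible definitions; `quality_flags` and its exact formulas are assumptions, not the project's implementation:

```python
def quality_flags(rows: list[dict], column: str, universe_size: int) -> dict:
    values = [r.get(column) for r in rows]
    non_null = [v for v in values if v is not None]
    counts: dict = {}
    for v in non_null:
        counts[v] = counts.get(v, 0) + 1
    top = max(counts.values(), default=0)
    return {
        # share of rows where the column is missing
        "null_rate": 1 - len(non_null) / len(values) if values else 0.0,
        # how much a single value dominates the non-null rows
        "top_concentration": top / len(non_null) if non_null else 0.0,
        # how much of the full dataset this sample covers
        "sample_factor": len(rows) / universe_size if universe_size else 0.0,
    }
```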
The patentable bit: a pattern-based doom-loop guard (identical-3x and [A,B,A,B] cycle predicates) plus an intent-preserving replan path that survives plan rewrites. See docs/deepinvent-submission.md.
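A minimal sketch of the two cycle predicates, assuming actions are compared by simple equality over the action history — names are illustrative, not the project's code:

```python
def identical_3x(actions: list[str]) -> bool:
    # Fires when the last three tool calls are identical.
    return len(actions) >= 3 and actions[-1] == actions[-2] == actions[-3]

def abab_cycle(actions: list[str]) -> bool:
    # Fires on an [A, B, A, B] tail with A != B.
    if len(actions) < 4:
        return False
    a, b = actions[-4], actions[-3]
    return a != b and (actions[-2], actions[-1]) == (a, b)
```

When either predicate fires, the guard would interrupt the loop and hand control to the replan path rather than letting the agent burn tool calls.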
Full architecture: txlookup.vercel.app/architecture · short doc: docs/architecture.md
Open txlookup.vercel.app, click any of the What people ask chips, watch the agent fire.
Claude Code:

```
claude mcp add txlookup -- python -m mcp.server
```

Codex:

```
codex mcp add txlookup --command python --args -m --args mcp.server
```

Or add it to your MCP config directly:

```json
{
  "txlookup": {
    "command": "python",
    "args": ["-m", "mcp.server"]
  }
}
```

The server exposes 8 MCP tools: `ask_data`, `discover_datasets`, `get_dataset_schema`, `fetch_data`, `get_task_status`, `create_miro_board`, `add_to_miro`, `list_known_tools`. Full reference at /use-as-agent.
```bash
git clone https://github.com/ATX-TXLookup/TXLookup
cd TXLookup
npm install
pip install -r requirements.txt

# .env.local — keys only, never committed
cat > .env.local <<'EOF'
OPENAI_API_KEY=sk-...
SOCRATA_KEY_ID=...     # optional, higher rate limit
SOCRATA_KEY_SECRET=...
MIRO_API_TOKEN=eyJ...  # optional, for /q "render to Miro" path
EOF

npm run dev           # web app on :3000
python mcp/server.py  # MCP server (stdio)
```

11 deeply curated (full schema knowledge, locally mirrored, hand-picked SoQL):
- `3syk-w9eu` — Austin construction permits
- `ecmv-9xxi` — Austin food-establishment inspections
- `xwdj-i9he` — Austin 311 service requests
- `6wtj-zbtb` — Austin code-complaint cases
- `9cir-efmm` — TX state franchise tax holders
- `gc4d-8a49` — Dallas 311
- `9fxf-t2tr` — Dallas police active calls
- `fdj4-gpfu` — Austin crime
- `y2wy-tgr5` — Austin traffic fatalities
- `2zpi-yjjs` — TX state expenditures
- `naix-2893` — Austin mixed-beverage licenses
6,061 datasets indexed across 6 portals — every other dataset is answered live: the agent reads catalog metadata, plans a query, and runs it on the source portal. Full provenance ledger at /sources.
- Frontend: Next.js 14 App Router · TypeScript · Tailwind · inline-SVG charts
- Agent runtime: OpenAI Codex / GPT-4o · 4 distinct LLM roles · Featherless fallback
- MCP server: FastMCP · stdio transport · 8 tools
- Data: Socrata SODA (Austin / Dallas / TX state) · CKAN (San Antonio / Houston) · Miro REST
- Cache: Local JSON mirror (data/cache/*.json) · refreshed every 6h via GitHub Actions cron
- Hosting: Vercel (Next.js serverless) · 60s function timeout · ephemeral filesystem
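The local JSON mirror might be read with a staleness check along these lines — a sketch; `load_cached` and everything beyond the `data/cache/*.json` layout are assumptions, not the project's code:

```python
import json
import time
from pathlib import Path

CACHE_DIR = Path("data/cache")
MAX_AGE_S = 6 * 3600  # matches the 6-hour refresh cadence

def load_cached(dataset_id: str):
    # Return mirrored rows for a dataset, or None when the mirror file is
    # missing or older than one refresh cycle, so the caller can fall back
    # to the live portal instead of serving stale data.
    path = CACHE_DIR / f"{dataset_id}.json"
    if not path.exists():
        return None
    if time.time() - path.stat().st_mtime > MAX_AGE_S:
        return None
    return json.loads(path.read_text())
```

Because the mirror is a committed directory of plain JSON, pages can render from it even when the upstream portal is throttling.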
```
.
├── app/                  # Next.js App Router pages (TypeScript)
│   ├── api/agent/        # SSE streaming agent endpoint
│   ├── q/                # the agent observatory + DAG visualization
│   ├── chat/             # conversational support agent
│   ├── reports/          # 5 reports + 1 cross-dataset Heat Index
│   ├── datasets/         # universe browse + per-dataset detail
│   ├── sources/          # citations + glossary
│   ├── architecture/     # how the system fits together
│   ├── about/            # team
│   └── lib/              # cache, catalog, agent loop, specialists
├── agent/                # Python agent runtime + tools
│   ├── specialists/      # dataset_scout.py, ingestor.py
│   └── tools/            # data.py (SoQL), miro.py (REST)
├── mcp/                  # FastMCP server (Python)
│   ├── server.py
│   └── manifest.json
├── prompts/              # System prompts per agent role
├── skills/txlookup/      # Cross-runtime skill doc
├── config/               # datasets.yaml, reports.ts, models.yaml
├── data/cache/           # local JSON mirror — committed, refreshed every 6h
├── docs/                 # how-it-works, agents-strategy, demo-script, ...
├── tests/                # Python + TS tests (catalog integrity, doom-loop, e2e)
└── .github/workflows     # deploy / scout / ingestor / watchdog crons
```
Issues and PRs welcome. Read CONTRIBUTING.md before opening a PR.
Areas where new contributors land easily:
- Add a portal — pick a Socrata or CKAN open-data portal. Add it to `scripts/fetch-discovered-catalog.mjs`. Open a PR.
- Add a dataset — pick one from the 6,000+ indexed. Add a `CatalogDataset` entry to `app/lib/catalog.ts` and an `INGEST_SPEC` row to `agent/specialists/ingestor.py`. The deep curation kicks in automatically.
- Add a report — write a `ReportDef` in `config/reports.ts`. The default `[slug]/page.tsx` renders bar/line/stat charts. For a flagship layout, see `app/reports/[slug]/AustinConstructionReport.tsx`.
MIT. See LICENSE.
All data is the property of its issuing agency, used under public-records terms. TXLookup does not claim ownership of any source data — every claim links back to its source portal.
Thanks to the City of Austin, the City of Dallas, the City of San Antonio, the City of Houston, and the State of Texas for publishing every dataset behind this site openly. Thanks to AITX and Codex for hosting the hackathon. Built on top of Anthropic's Model Context Protocol, Smithery, Miro, OpenAI, Featherless, and the Socrata + CKAN open-data standards.
Built by Ravinder Jilkapally, Kunal Vasavada, Godwyn James, and Raj Akula at the AITX × Codex Hackathon, May 8–10, 2026.