Congressional Hearing Database

A public-facing tool for exploring congressional committee hearings, witness testimony, and legislative activity. Track what's happening in Congress through an accessible web interface and powerful data management tools.

Live Demo: Deployed on Vercel (if applicable)

What Can You Do With This?

📅 Track Committee Activity - Monitor specific committees and subject areas
👥 Follow Members - See what congressional members are working on
🎤 Analyze Witness Testimony - Identify who testifies and organizational representation patterns
🔍 Search & Filter - Find hearings by chamber, committee, date, or keyword
📄 Access Documents - Direct links to official transcripts and supporting materials

Current Data

Congress: 119th (2025-2027)
Hearings: 1,168 (613 House, 555 Senate)
Witnesses: 1,545 unique individuals with 1,620 appearances
Committees: 53 parent committees + 161 subcommittees
Members: 538 congressional members tracked
Updates: Daily automated sync at 6am UTC

Quick Start

For Casual Users

Just want to browse hearings? Run the web interface locally:

# Clone and run locally
git clone <repository-url>
cd Hearing-Database
pip install -r requirements.txt
python cli.py web serve

Then open http://localhost:5000 in your browser.

👉 See the User Guide for web interface tutorials

For Technical Users

Want to import your own data or run custom queries?

Get a Congress.gov API Key (free): https://api.congress.gov/sign-up/

Configure environment:

cp .env.example .env
# Edit .env and add your API key

Initialize database:
```
python cli.py database init
```

Import data:

python cli.py import full --congress 119

👉 See the CLI Commands Reference for detailed command documentation

For Developers

Want to contribute or integrate with the API?

📚 Complete Documentation Hub - Explore all guides organized by audience

Essential Guides:

Development: Developer Guide - Architecture, patterns, and how to contribute
Database: Database Schema Reference - Complete schema documentation
Testing: Testing Guide - Writing and running tests
API: API Reference - Programmatic access documentation
Deployment: Deployment Guide - Vercel and production setup
Monitoring: Operations & Monitoring - Health checks and performance tracking

Features

Web Interface

Browse hearings with advanced filtering (chamber, committee, date range, search)
View detailed hearing information with witness lists and documents
Explore committee structures and membership
Track witness testimony across multiple hearings
View member detail pages with committee assignments
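
The filters listed above (chamber, committee, date range, search) map naturally onto a parameterized SQL query. The following is a minimal sketch of that idea, not the application's actual query code. The function name `build_hearing_query` and the column names (`hearing_id`, `hearing_date`, `committee_id`) are assumptions; only the table names come from the Database Schema section. The sample rows are made up for the demo.

```python
import sqlite3
from typing import List, Optional, Tuple


def build_hearing_query(
    chamber: Optional[str] = None,
    committee_id: Optional[int] = None,
    date_from: Optional[str] = None,
    date_to: Optional[str] = None,
    keyword: Optional[str] = None,
) -> Tuple[str, List[object]]:
    """Return (sql, params) covering only the filters that were supplied."""
    sql = "SELECT DISTINCT h.hearing_id, h.title, h.hearing_date FROM hearings h"
    clauses: List[str] = []
    params: List[object] = []
    if committee_id is not None:
        # Committee filtering goes through the hearing_committees link table.
        sql += " JOIN hearing_committees hc ON hc.hearing_id = h.hearing_id"
        clauses.append("hc.committee_id = ?")
        params.append(committee_id)
    if chamber:
        clauses.append("h.chamber = ?")
        params.append(chamber)
    if date_from:
        clauses.append("h.hearing_date >= ?")
        params.append(date_from)
    if date_to:
        clauses.append("h.hearing_date <= ?")
        params.append(date_to)
    if keyword:
        # Placeholders keep user input out of the SQL text itself.
        clauses.append("h.title LIKE ?")
        params.append(f"%{keyword}%")
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY h.hearing_date DESC", params


# Demo against an in-memory database with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hearings (
        hearing_id INTEGER PRIMARY KEY, title TEXT, hearing_date TEXT, chamber TEXT
    );
    CREATE TABLE hearing_committees (hearing_id INTEGER, committee_id INTEGER);
    INSERT INTO hearings VALUES (1, 'Oversight of Federal Cybersecurity', '2025-03-04', 'House');
    INSERT INTO hearings VALUES (2, 'Budget Request Review', '2025-04-10', 'Senate');
    INSERT INTO hearings VALUES (3, 'Cybersecurity Workforce', '2025-05-01', 'Senate');
    INSERT INTO hearing_committees VALUES (1, 10);
    INSERT INTO hearing_committees VALUES (2, 20);
    INSERT INTO hearing_committees VALUES (3, 20);
    """
)
sql, params = build_hearing_query(chamber="Senate", keyword="cyber")
senate_cyber = conn.execute(sql, params).fetchall()
print(senate_cyber)
```

Building the WHERE clause from a list keeps every filter optional while still passing all user-supplied values as bound parameters.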

Data Management CLI

Import: Full data import from Congress.gov API
Update: Incremental daily updates (fetches only changed data)
Enhance: Enrich existing data with additional details
Database: Schema management and maintenance
Analysis: Audit tools and data quality checks

Automated Updates

Daily cron job (Vercel deployment) syncs new and updated hearings
Incremental update strategy minimizes API usage
Error tracking and logging for monitoring
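
One way to picture the incremental strategy: look only at records updated inside a lookback window, and re-fetch only those whose update timestamp differs from what is already stored. The sketch below illustrates that idea under stated assumptions. The function `select_changed`, the record keys `event_id` and `updated_at`, and the in-memory `stored` mapping are hypothetical and are not taken from the project's updaters.

```python
from datetime import date, datetime, timedelta, timezone
from typing import Dict, Iterable, List, Optional


def select_changed(
    remote: Iterable[Dict[str, str]],
    stored: Dict[str, str],
    window_days: int = 30,
    today: Optional[date] = None,
) -> List[Dict[str, str]]:
    """Return remote records inside the window that are new or changed."""
    today = today or datetime.now(timezone.utc).date()
    cutoff = today - timedelta(days=window_days)
    changed = []
    for record in remote:
        updated = date.fromisoformat(record["updated_at"])
        if updated < cutoff:
            continue  # older than the lookback window, skip without fetching
        if stored.get(record["event_id"]) != record["updated_at"]:
            changed.append(record)  # unseen, or timestamp differs from ours
    return changed


# Made-up sample records for illustration.
remote_records = [
    {"event_id": "A", "updated_at": "2025-06-20"},  # unchanged
    {"event_id": "B", "updated_at": "2025-06-25"},  # changed since last sync
    {"event_id": "C", "updated_at": "2025-04-01"},  # outside the window
    {"event_id": "D", "updated_at": "2025-06-29"},  # new
]
stored_timestamps = {"A": "2025-06-20", "B": "2025-06-01"}
to_fetch = select_changed(
    remote_records, stored_timestamps, window_days=30, today=date(2025, 6, 30)
)
print([record["event_id"] for record in to_fetch])
```

Only the records returned here would need detail requests, which is how a lookback window keeps API usage low.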

Project Structure

Hearing-Database/
├── api/                   # Congress.gov API client and rate limiting
├── config/                # Configuration and logging
├── database/              # Database schema and operations (SQLite)
├── fetchers/              # API data fetchers (hearings, committees, witnesses, etc.)
├── parsers/               # Data validation and parsing
├── importers/             # Import orchestration
├── updaters/              # Daily update automation
├── web/                   # Flask web application
│   ├── blueprints/        # Modular route handlers
│   ├── templates/         # HTML templates
│   └── static/            # CSS, JavaScript, images
├── scripts/               # Legacy standalone scripts
├── docs/                  # Documentation
├── tests/                 # Test suite
└── cli.py                 # Unified command-line interface

Database Schema

The SQLite database tracks comprehensive congressional hearing data:

Core Tables:

hearings - Hearing metadata (title, date, chamber, status, type)
committees - Committee and subcommittee information
members - Congressional members with party and state
witnesses - Witness information (name, title, organization)

Relationship Tables:

hearing_committees - Links hearings to committees
hearing_transcripts - Transcript URLs and metadata
witness_appearances - Witness testimony records
witness_documents - Witness written statements
supporting_documents - Additional hearing materials
committee_memberships - Member committee assignments

Tracking Tables:

sync_tracking - Import/update history
update_logs - Daily update metrics
import_errors - Error tracking
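
To show how the core and relationship tables fit together, here is a small sketch that joins `witnesses` to `witness_appearances` and counts appearances per organization. The table names come from the lists above; every column name is an assumption made for illustration, so check the Database Schema Reference for the real definitions. The rows are placeholders, not real data.

```python
import sqlite3

# Assumed, simplified columns; the real schema likely has more fields.
SCHEMA = """
CREATE TABLE hearings (
    hearing_id INTEGER PRIMARY KEY, title TEXT, hearing_date TEXT, chamber TEXT
);
CREATE TABLE witnesses (
    witness_id INTEGER PRIMARY KEY, full_name TEXT, organization TEXT
);
CREATE TABLE witness_appearances (
    appearance_id INTEGER PRIMARY KEY,
    witness_id INTEGER NOT NULL REFERENCES witnesses(witness_id),
    hearing_id INTEGER NOT NULL REFERENCES hearings(hearing_id)
);
"""


def appearances_by_organization(conn: sqlite3.Connection):
    """Count witness appearances per organization, most frequent first."""
    return conn.execute(
        """
        SELECT w.organization, COUNT(*) AS appearances
        FROM witness_appearances wa
        JOIN witnesses w ON w.witness_id = wa.witness_id
        GROUP BY w.organization
        ORDER BY appearances DESC, w.organization
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executescript(
    """
    INSERT INTO hearings VALUES (1, 'Sample hearing 1', '2025-02-01', 'House');
    INSERT INTO hearings VALUES (2, 'Sample hearing 2', '2025-02-08', 'Senate');
    INSERT INTO witnesses VALUES (1, 'Witness One', 'Org A');
    INSERT INTO witnesses VALUES (2, 'Witness Two', 'Org B');
    INSERT INTO witness_appearances (witness_id, hearing_id) VALUES (1, 1);
    INSERT INTO witness_appearances (witness_id, hearing_id) VALUES (1, 2);
    INSERT INTO witness_appearances (witness_id, hearing_id) VALUES (2, 1);
    """
)
org_counts = appearances_by_organization(conn)
print(org_counts)
```

Keeping appearances in their own table is what lets one witness be counted across multiple hearings.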

Technology Stack

Backend: Python 3.8+ with Flask web framework
Database: SQLite (portable, serverless-friendly)
API: Congress.gov API v3 (5,000 requests/hour limit)
Frontend: Bootstrap 5 with vanilla JavaScript
Deployment: Vercel with automated daily cron jobs
CLI: Click framework for command-line interface
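
A limit of 5,000 requests per hour works out to one request every 3600 / 5000 = 0.72 seconds if requests are spaced evenly. The sketch below shows one simple way a client could respect that limit. It is an illustration only: the class name `IntervalRateLimiter` is hypothetical, and the rate limiter in `api/` may use a different strategy.

```python
import time
from typing import Callable, Optional


class IntervalRateLimiter:
    """Space requests evenly so an hourly quota is never exceeded."""

    def __init__(
        self,
        max_per_hour: int = 5000,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = 3600.0 / max_per_hour
        self._clock = clock
        self._sleep = sleep
        self._next_allowed: Optional[float] = None

    def wait(self) -> float:
        """Block until the next request may be sent; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        # Reserve the next slot one interval after this request.
        self._next_allowed = now + self.min_interval
        return delay


# Demo with a frozen clock and a recording sleep, so nothing actually blocks.
slept = []
limiter = IntervalRateLimiter(5000, clock=lambda: 100.0, sleep=slept.append)
first_delay = limiter.wait()   # first call goes through immediately
second_delay = limiter.wait()  # second call must wait one interval
print(limiter.min_interval, first_delay, round(second_delay, 2))
```

Even spacing is stricter than the quota requires, since it forbids short bursts, but it is easy to reason about. In real use, call `limiter.wait()` before each API request and leave `clock` and `sleep` at their defaults.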

Data Scope & Philosophy

Current Focus

119th Congress (2025-2027) - Currently active congress
Historical Archive - Data accumulates over time (119th Congress data will be retained when the 120th begins)
Metadata Focus - Stores links to documents rather than full text (lightweight approach)
Public Data - All data sourced from official Congress.gov API

Out of Scope (Current)

Bill tracking (schema exists but low priority)
Full-text document storage/search
Historical backfill to prior congresses (possible future enhancement)

Use Cases

Civic Engagement

Monitor hearings on topics you care about
See which organizations testify before Congress
Track your representatives' committee activities
Access official hearing documents and transcripts

Research & Analysis

Study witness testimony patterns and organizational representation
Analyze committee hearing frequency and topics
Track legislative oversight activities
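Export data for custom analysis (currently by querying the SQLite database directly; built-in CSV/JSON export is listed on the roadmap)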
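
For custom analysis, query results can be written to CSV with the Python standard library alone. This is a hedged sketch, not a feature of the CLI: the helper name `export_query_to_csv` is hypothetical, and the column names in the demo query are assumptions. The demo row is a placeholder.

```python
import csv
import io
import sqlite3
from typing import Optional, Sequence, TextIO


def export_query_to_csv(
    conn: sqlite3.Connection,
    sql: str,
    params: Sequence[object] = (),
    out: Optional[TextIO] = None,
) -> TextIO:
    """Run a query and write a header row plus all result rows as CSV."""
    out = io.StringIO() if out is None else out
    cursor = conn.execute(sql, params)
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return out


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hearings (hearing_id INTEGER PRIMARY KEY, title TEXT, chamber TEXT)"
)
conn.execute("INSERT INTO hearings VALUES (1, 'Sample hearing, part 1', 'House')")
csv_text = export_query_to_csv(
    conn,
    "SELECT hearing_id, title FROM hearings WHERE chamber = ?",
    ("House",),
).getvalue()
print(csv_text)
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `out`.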

Journalism & Transparency

Quick lookup of hearing information
Verify witness testimony claims
Monitor congressional activity timelines
Access primary source documents

API Endpoints

Public JSON API for programmatic access:

GET /api/stats - Database statistics
GET /api/update-status - Daily update status and history
GET /api/debug - System diagnostic information

See API Reference for complete documentation.
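
A minimal sketch of calling these endpoints from Python, using only the standard library. It assumes the server from Quick Start is running at `http://localhost:5000`. The helper names `endpoint_url` and `fetch_json` are hypothetical, and the shape of the returned JSON is not documented here, so the example makes no assumptions about its fields.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the local development server from the Quick Start section.
BASE_URL = "http://localhost:5000"


def endpoint_url(path: str, base_url: str = BASE_URL) -> str:
    """Join the base URL and an endpoint path without doubling slashes."""
    return urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))


def fetch_json(path: str, base_url: str = BASE_URL, opener=urlopen, timeout: float = 10):
    """GET an endpoint and decode the JSON body.

    `opener` is injectable so the function can be exercised without a server.
    """
    with opener(endpoint_url(path, base_url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


stats_url = endpoint_url("/api/stats")
print(stats_url)
```

With the server running, `fetch_json("/api/stats")` and `fetch_json("/api/update-status")` return the decoded responses as Python objects.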

Configuration

Key environment variables (.env file):

# Required
CONGRESS_API_KEY=your_api_key_here

# Optional
DATABASE_PATH=database.db          # Database file location
TARGET_CONGRESS=119                # Congress to import
BATCH_SIZE=50                      # Import batch size
UPDATE_WINDOW_DAYS=30              # Daily update lookback window
LOG_LEVEL=INFO                     # Logging verbosity
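
The variables above could be read into a typed settings object along these lines. This is a sketch, not the project's `config/` module: the `Settings` class and `load_settings` function are hypothetical, and the code assumes the variables are already present in the process environment (it does not parse the `.env` file itself). Defaults mirror the values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    congress_api_key: str
    database_path: str = "database.db"
    target_congress: int = 119
    batch_size: int = 50
    update_window_days: int = 30
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on a missing key."""
    api_key = env.get("CONGRESS_API_KEY", "").strip()
    if not api_key:
        raise ValueError("CONGRESS_API_KEY is required")
    return Settings(
        congress_api_key=api_key,
        database_path=env.get("DATABASE_PATH", "database.db"),
        target_congress=int(env.get("TARGET_CONGRESS", "119")),
        batch_size=int(env.get("BATCH_SIZE", "50")),
        update_window_days=int(env.get("UPDATE_WINDOW_DAYS", "30")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Demo with an explicit mapping instead of the real environment.
settings = load_settings({"CONGRESS_API_KEY": "your_api_key_here", "BATCH_SIZE": "25"})
print(settings.batch_size, settings.target_congress)
```

Passing a mapping rather than reading `os.environ` directly makes the loader easy to test.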

Development

Setup

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run tests
pytest

# Format code
black .

Running Locally

# Start web server
python cli.py web serve --host 127.0.0.1 --port 5000

# Or run Flask app directly
python web/app.py  # Runs on port 8000

Deployment

The system is designed for Vercel serverless deployment with automated daily updates:

Vercel handles web hosting and serverless functions
Cron job triggers daily data sync at 6am UTC
SQLite database deployed with the application
Read-mostly workload is well suited to Vercel's serverless environment
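
A daily 6am UTC sync corresponds to a cron expression such as `0 6 * * *` (an assumption about the configured schedule; see the Deployment Guide for the actual value). The sketch below computes when the next run would happen, which can help when checking whether the data is stale. The function name `next_sync` is hypothetical.

```python
from datetime import datetime, time, timedelta, timezone


def next_sync(now: datetime, hour_utc: int = 6) -> datetime:
    """Return the next daily run at hour_utc:00 UTC strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC first.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(now_utc.date(), time(hour_utc), tzinfo=timezone.utc)
    if candidate <= now_utc:
        candidate += timedelta(days=1)  # today's run has already happened
    return candidate


before = next_sync(datetime(2025, 3, 1, 5, 0, tzinfo=timezone.utc))
after = next_sync(datetime(2025, 3, 1, 6, 0, tzinfo=timezone.utc))
print(before.isoformat(), after.isoformat())
```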

See Deployment Guide for detailed instructions.

Performance & Limitations

Current Scale

Database Size: ~4.5 MB (1,168 hearings with full metadata)
API Rate Limit: 5,000 requests/hour (Congress.gov)
Update Time: ~5-10 minutes for daily incremental updates
Full Import: ~30-60 minutes for complete 119th Congress import

SQLite Considerations

Excellent for read-heavy workloads (web browsing)
Single-writer limitation (appropriate for daily batch updates)
Portable and serverless-friendly
May need migration to PostgreSQL if scale increases significantly
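
Because the web workload is read-heavy and writes happen only in batch updates, a reader can open the database in read-only mode so it cannot contend for the single writer slot. The following sketch uses SQLite's URI syntax for this. The helper name `open_readonly` is hypothetical and the application may open its connections differently.

```python
import os
import sqlite3
import tempfile
from pathlib import Path


def open_readonly(db_path: str) -> sqlite3.Connection:
    """Open an existing SQLite file in read-only mode via a file: URI."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


# Demo: create a throwaway database, then show that a read-only
# connection can query it but cannot write to it.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "database.db")
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE hearings (hearing_id INTEGER PRIMARY KEY, title TEXT)")
    writer.execute("INSERT INTO hearings (title) VALUES ('Sample hearing')")
    writer.commit()
    writer.close()

    reader = open_readonly(path)
    count = reader.execute("SELECT COUNT(*) FROM hearings").fetchone()[0]
    try:
        reader.execute("INSERT INTO hearings (title) VALUES ('blocked')")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True  # SQLite refuses writes on a mode=ro connection
    reader.close()

print(count, write_blocked)
```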

Contributing

Contributions welcome! Areas of interest:

Additional data visualizations
Enhanced search capabilities
Alternative transcript data source integration
Historical congress backfill
Performance optimizations
Documentation improvements

Please open an issue to discuss major changes before submitting PRs.

Roadmap

Potential Future Enhancements

Full-text search of transcript content (if alternative data source identified)
Historical backfill to prior congresses (118th, 117th, etc.)
Advanced analytics dashboard with charts and trends
Export capabilities (CSV, JSON) for custom analysis
Email notifications for committee/member activity
Mobile-responsive improvements

License

MIT License - See LICENSE file for details

Acknowledgments

Data provided by the Congress.gov API
Built for civic engagement and government transparency
Inspired by the need for accessible congressional oversight data

Support

Issues: Report bugs or request features via GitHub Issues
Documentation: See docs/ directory for detailed guides
Questions: Open a discussion for usage questions

Disclaimer: This is an independent project and is not affiliated with or endorsed by Congress.gov, the Library of Congress, or any government entity. All data is sourced from publicly available official sources.

CRS/Policy Library migrated to PostgreSQL with products/product_versions views
