IDOT — Project Qualification & Recommendation System

A Python-based system for managing engineering firm pre-qualifications and recommending eligible firms for Illinois Department of Transportation (IDOT) infrastructure projects.

Overview

This system automates the process of:

  • Firm Data Management — Processing, validating, and storing data for 415+ engineering firms with 44 unique prequalification categories
  • Project Bulletin Extraction — Parsing IDOT Project Technical Bulletins (PTBs) to extract project requirements, districts, and scope
  • Firm-Project Matching — Using TF-IDF similarity, historical award analysis, and district rotation rules to recommend the top eligible firms for each project
  • Continuous Data Pipeline — Automated processing of new bulletins with incremental database updates, backups, and monitoring
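The TF-IDF matching step mentioned above can be illustrated with a minimal sketch. It assumes scikit-learn's `TfidfVectorizer` and cosine similarity; the category names, the `rank_categories` helper, and the sample project text are hypothetical and are not taken from the repository.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical prequalification category names, for illustration only.
CATEGORIES = [
    "Highways - Roads and Streets",
    "Structures - Highway: Typical",
    "Special Services - Surveying",
]


def rank_categories(project_text, categories):
    """Return (category, score) pairs sorted by cosine similarity, best first."""
    vectorizer = TfidfVectorizer(stop_words="english")
    # Fit on the categories plus the project text so both share one vocabulary.
    matrix = vectorizer.fit_transform(list(categories) + [project_text])
    n = len(categories)
    # Row n is the project text; rows 0..n-1 are the categories.
    scores = cosine_similarity(matrix[n], matrix[:n]).ravel()
    return sorted(zip(categories, scores), key=lambda pair: pair[1], reverse=True)


ranked = rank_categories("Land surveying services for various routes", CATEGORIES)
for category, score in ranked:
    print(f"{score:.3f}  {category}")
```

In a real run the ranked categories would presumably be combined with award history and rotation eligibility before firms are recommended; this sketch covers only the text-similarity part.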

Project Structure

├── FIRM/                        # Firm master data (CSV)
├── ex/                          # Extended modules
│   ├── FIRM/                    # Qualification engine, recommendation system, award validation
│   ├── pipeline/                # Continuous data pipeline with SQLite database
│   └── *.py                     # Extractors and pipeline orchestration
├── files/
│   └── word/                    # PTB bulletin documents (DOCX) and analysis results
├── *.py                         # Root-level processors, analyzers, and utilities
├── *.json                       # Data files (awards, firms, prequalifications)
└── *.md                         # Documentation and guides

Key Components

Data Processing

  • firm_data_processor.py — Main firm data processor with cleaning, validation, and database storage
  • firm_excel_processor.py — Excel-to-JSON transformation with a column mapping reported as 100% accurate
  • build_corrected_json.py — Corrected firm JSON builder with full validation

Analysis & Matching

  • ptb217_fixed_extraction_system.py — PTB project extraction with TF-IDF prequalification matching
  • ptb217_rotation_test_system.py — District rotation rule testing (firms winning PTB N are ineligible for PTB N+1)
  • ex/FIRM/automated_recommendation_system.py — Automated top-5 firm recommendations per project
  • ex/FIRM/enhanced_qualification_engine.py — Multi-layer firm-project qualification matching
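The rotation rule (firms winning PTB N are ineligible for PTB N+1) can be sketched as a simple filter. This is an illustration under assumptions: the `eligible_for` function, the firm codes, and the shape of `winners_by_ptb` are hypothetical, and the sketch ignores any per-district scoping the actual scripts may apply.

```python
def eligible_for(ptb_number, candidates, winners_by_ptb):
    """Return candidates that did not win in the immediately preceding PTB.

    winners_by_ptb maps a PTB number to the set of firm codes awarded in it.
    """
    blocked = winners_by_ptb.get(ptb_number - 1, set())
    return [firm for firm in candidates if firm not in blocked]


# Hypothetical award history: F001 won in PTB 216, F002 won in PTB 215.
winners_by_ptb = {216: {"F001"}, 215: {"F002"}}
print(eligible_for(217, ["F001", "F002", "F003"], winners_by_ptb))
```

Only the previous bulletin blocks a firm here, so F002 becomes eligible again for PTB 217 while F001 does not.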

Validation & Quality

  • verify_prequals.py — Prequalification duplicate and formatting checker
  • check_duplicates.py / verify_duplicates.py — Firm code uniqueness verification
  • analyze_prequals.py — Comprehensive prequalification distribution analysis
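A firm code uniqueness check of the kind listed above might look like the following sketch. The `firm_code` field name and the record layout are assumptions for illustration; the actual JSON schema used by the scripts is not shown in this README.

```python
from collections import Counter


def find_duplicate_codes(firms):
    """Return the sorted firm codes that appear more than once.

    Codes are compared after trimming whitespace and upper-casing,
    so "f001 " and "F001" count as the same code.
    """
    counts = Counter(firm["firm_code"].strip().upper() for firm in firms)
    return sorted(code for code, n in counts.items() if n > 1)


# Hypothetical records; real data would be loaded from the firm JSON file.
firms = [
    {"firm_code": "F001", "name": "Alpha Engineering"},
    {"firm_code": "F002", "name": "Beta Consultants"},
    {"firm_code": "f001 ", "name": "Alpha Engineering (duplicate entry)"},
]
print(find_duplicate_codes(firms))
```

Normalizing before counting is a design choice that catches formatting-only duplicates; a stricter check could compare the raw strings instead.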

Pipeline

  • ex/continuous_data_pipeline.py — Continuous bulletin processing with scheduling, backups, and health monitoring
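The incremental database update can be sketched with the standard-library `sqlite3` module. The table name, columns, and helper functions below are hypothetical and serve only to show the idea of inserting new bulletin rows without duplicating ones already stored.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) projects table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS projects ("
        " ptb_number INTEGER,"
        " item_number INTEGER,"
        " district TEXT,"
        " PRIMARY KEY (ptb_number, item_number))"
    )
    return conn


def add_new_projects(conn, rows):
    """Insert only rows whose key is not stored yet; return how many were added."""
    before = conn.total_changes
    with conn:  # commits on success, rolls back on error
        conn.executemany("INSERT OR IGNORE INTO projects VALUES (?, ?, ?)", rows)
    return conn.total_changes - before


conn = open_db()
print(add_new_projects(conn, [(217, 1, "1"), (217, 2, "3")]))  # first load
print(add_new_projects(conn, [(217, 2, "3"), (217, 3, "5")]))  # overlapping reload
```

Because the primary key rejects repeated (PTB, item) pairs, reprocessing a bulletin is harmless, which is one simple way to keep a continuous pipeline idempotent.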

Data Summary

| Dataset | Count |
| --- | --- |
| Eligible Firms | 415 |
| Unique Prequalifications | 44 |
| Historical Award Records | 2,095 |
| Projects in Database | 46 |
| Data Quality Score | 98.5%+ |

Requirements

pandas
openpyxl
python-docx
scikit-learn
numpy

Getting Started

# Clone the repository
git clone https://github.com/rkhan60/IDOT.git
cd IDOT

# Install dependencies
pip install pandas openpyxl python-docx scikit-learn numpy

# Run firm analysis
python analyze_prequals.py

# Run data validation
python verify_prequals.py

# Run tests
python test_firm_processor.py
python test_excel_processor.py

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

License

MIT License
