Marc0Guo/Datathon-2026

Datathon 2026: Access to a Livable Planet

Using EPA AQI data, we apply machine learning and data visualization to uncover hidden air quality risk patterns across U.S. counties.

Repository Contents

  • data_cleaning.ipynb
    Loads, merges, and cleans the raw EPA county-level AQI data. Handles missing values and invalid ratios, normalizes features, and constructs the derived indicators used across all downstream analyses.

  • high_risk_cluster.ipynb
    Uses K-Means clustering to group counties based on long-term air quality profiles.
    Focuses on exposure frequency, pollutant composition, and historical intensity to identify high-risk air quality regimes.

  • extreme_event_early_warning.ipynb
    Develops an XGBoost-based early warning model for future extreme AQI events using historical trends and event recency features.
    Emphasizes interpretability and practical risk signals rather than short-term forecasting accuracy.
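
As a rough illustration of the kinds of derived indicators described above, the sketch below computes a validated ratio, a min-max normalized feature, and an event recency feature using only the Python standard library. It is not taken from the notebooks: the `CountyYear` fields, the helper names, the sample values, and the extreme-event threshold of 301 (the lower bound of EPA's "Hazardous" AQI category) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CountyYear:
    """One county-year record (hypothetical field names)."""
    county: str
    year: int
    days_with_aqi: int
    unhealthy_days: int
    max_aqi: int


def unhealthy_ratio(row: CountyYear) -> Optional[float]:
    """Share of monitored days that were unhealthy; None if the record is invalid."""
    if row.days_with_aqi <= 0:
        return None
    if not 0 <= row.unhealthy_days <= row.days_with_aqi:
        return None
    return row.unhealthy_days / row.days_with_aqi


def min_max(values: List[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def years_since_extreme(rows: List[CountyYear], threshold: int = 301) -> List[Optional[int]]:
    """For one county, years elapsed since the most recent *prior* extreme year.

    Only earlier years are consulted, so the feature does not leak the
    current year's outcome. None means no extreme year has been seen yet.
    """
    last_extreme = None
    out: List[Optional[int]] = []
    for row in sorted(rows, key=lambda r: r.year):
        out.append(None if last_extreme is None else row.year - last_extreme)
        if row.max_aqi >= threshold:
            last_extreme = row.year
    return out


# Made-up records for a single hypothetical county.
rows = [
    CountyYear("Example County", 2019, 365, 4, 180),
    CountyYear("Example County", 2020, 366, 12, 320),
    CountyYear("Example County", 2021, 365, 400, 150),  # more unhealthy days than monitored days
    CountyYear("Example County", 2022, 0, 0, 0),        # no monitored days
]

ratios = [unhealthy_ratio(r) for r in rows]
recency = years_since_extreme(rows)
scaled_max_aqi = min_max([float(r.max_aqi) for r in rows[:3]])
```

Here the two invalid records yield `None` ratios rather than being silently coerced to a number, which leaves the choice between dropping and imputing them to a later step. In a real pipeline the same logic would more likely be written with a dataframe library and applied per county.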

🔗 Project Links

Runa He, Marco Guo, Anna Huang, Amelia Li • DubsTech Datathon • 2026
