Skip to content

Repo for the winning submission of 2020 Duke Datathon. Performed inference and predictive modeling and forecasting on the economic impact of COVID-19 for different countries.

Notifications You must be signed in to change notification settings

leonz12345/2020_Duke_Datathon_Winner_Submission

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

44 Commits
 
 
 
 
 
 
 
 

Repository files navigation

2020 Duke Datathon Winner Submission

About the Competition

Competition Details

What We Did

With the continuous outbreak of COVID-19 around the world, many health experts believe that the pandemic will not end in 2021, and the cease of the pandemic will likely involve a mix of efforts. The main objective of this report is to provide a comprehensive analysis and concrete data-driven recommendations on the devastating long-term economic implications of COVID-19. To define economic impact, we found three important economic metrics: GDP, unemployment rate, and stock market index. We used principal component analysis to aggregate these three metrics into a single score which we call Economic Develop Index, EDI. We then build a regression model to see which attributes are significant. Finally, we used time-series forecasting to predict how a country would perform in the foreseeable future. Our model shows increases in Population Density, Death Rate, New Deaths Per Million, Stringency Index, and Hospital Beds Per Thousand will decrease the overall EDI. The findings match our expectation because of the increase of death rate, stringency index, new death counts, and hospital beds indicate the increase of severity for COVID-19. Moreover, we used the United States to demonstrate how our findings could be used to predict the EDI economic indicator. The same approach could be applied to other countries/cities with appropriately processed data. Finally, we made recommendations to countries facing the devastating long-term economic implications of this pandemic.

Our Approach

Methodology

Please find the problem statement of this competition here.

For more details of our findings, please refer to our report here.

-- Project Status: [Completed]
-- Competition Status: [Winner]

Methods Used

  • Principle Component Analysis
  • Multiple Linear Regression Analysis
  • Time-Series Forecasting

Tools Used

  • Dataiku for collaboration
  • Pandas and NumPy for data inspection, cleaning, verification, imputation, and merging
  • Scikit-learn for feature engineering and data transformation
  • R for Multiple Linear Regression Analysis
  • Facebook Prophet for time-Series forecasting

Team Members

Dean Huang: @DeanHuang-Git
Leon Zhang: @leonz12345
Michael Tang: @MTang0728
Tommy Tseng: @TommyTseng1129

About

Repo for the winning submission of 2020 Duke Datathon. Performed inference and predictive modeling and forecasting on the economic impact of COVID-19 for different countries.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Jupyter Notebook 95.5%
  • R 4.5%