by Michelle L. Gill
This is my third project for the Summer 2016 Metis Data Science Bootcamp. It incorporated supervised machine learning, PostgreSQL, and D3 for visualization, and it predicted whether or not a region would have an outbreak of Zika virus.
A blog post on themodernscientist.com provides further details about this project.
- environment.yml: list of conda Python libraries that were used during analysis.
- figures: images used in the presentation.
- map: D3 animated timeline used during the presentation. A movie of the animation is also available.
- notebooks: Jupyter notebooks used for analysis.
- presentation: PDF version of the final presentation.
This project made extensive use of external data sources, including data from GitHub repos and data scraped from various websites.
- Zika outbreak data was pulled from the CDC Epidemic Prediction Initiative GitHub repo. My project used data pulled on 07/30/2016, which corresponds to commit d44c5d1ca3af633224c8b8b490b1a3aafa9bcc8e. A clone of this commit is available here.
- Latitude and longitude data for Zika outbreaks was pulled from the following sources: the Google Maps API, Google Search (scraped via four proxies), and LatLong (scraped).
- Airport location information was scraped from Falling Rain.
- Worldwide historical weather data was scraped from Wunderground using closest airport code as the key.
- Aedes aegypti and Aedes albopictus occurrences were from Dryad. See references below for manuscripts related to this data.
- Worldwide population density was from the NASA Socioeconomic Data and Applications Center (SEDAC) Gridded Population and Population Density of the World.
- World GDP and purchasing power parity (PPP) adjusted GDP from 2015 were scraped from Knoema.
- Flight patterns were scraped from FlightRadar24; however, this data was not incorporated into the model due to time limitations.
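
The weather data above is keyed by the closest airport code to each outbreak location. The project's notebooks contain the actual matching code; the snippet below is only a minimal sketch of one way such a lookup could work, assuming great-circle (haversine) distance is an acceptable measure of "closest". The function names, the airport codes, and the approximate coordinates are all illustrative assumptions, not values taken from the project data.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def closest_airport(lat, lon, airports):
    """Return the code of the airport nearest to (lat, lon).

    `airports` maps an airport code to a (latitude, longitude) tuple.
    """
    return min(airports, key=lambda code: haversine_km(lat, lon, *airports[code]))


# Hypothetical sample data with approximate coordinates, for illustration only.
airports = {
    "GIG": (-22.81, -43.25),
    "BOG": (4.70, -74.15),
    "MIA": (25.79, -80.29),
}
outbreak_locations = {
    "Rio de Janeiro": (-22.91, -43.17),
    "Medellin": (6.25, -75.56),
}

# Map each outbreak location to the airport code used as its weather key.
weather_key = {
    place: closest_airport(lat, lon, airports)
    for place, (lat, lon) in outbreak_locations.items()
}
```

A linear scan like this is fine for a few thousand locations and airports; a spatial index would be a reasonable swap for much larger inputs.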
References for Aedes aegypti and Aedes albopictus occurrences
Kraemer MUG, Sinka ME, Duda KA, Mylne A, Shearer FM, Brady OJ, Messina JP, Barker CM, Moore CG, Carvalho RG, Coelho GE, Van Bortel W, Hendrickx G, Schaffner F, Wint GRW, Elyazar IRF, Teng H, Hay SI (2015) The global compendium of Aedes aegypti and Ae. albopictus occurrence. Scientific Data 2(7): 150035. http://dx.doi.org/10.1038/sdata.2015.35
Kraemer MUG, Sinka ME, Duda KA, Mylne A, Shearer FM, Brady OJ, Messina JP, Barker CM, Moore CG, Carvalho RG, Coelho GE, Van Bortel W, Hendrickx G, Schaffner F, Wint GRW, Elyazar IRF, Teng H, Hay SI (2015) Data from: The global compendium of Aedes aegypti and Ae. albopictus occurrence. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.47v3c