This repo analyzes the coronavirus (COVID-19) outbreak by extracting the latest data and generating reports. The repo is updated daily.
- Creates a time series dataset
- Creates a daily stats dataset
- Generates a number of visualizations
- You can also filter reports for a given country
- Generates an Excel report including all of the above
- All results are saved to the output `reports` folder
- Check out the kanban boards to see work in progress
- You may have noticed that there are some discrepancies in the JHU data.
- These discrepancies include rows for countries missing from some sheets, misspelled country names, and countries being named differently (South Korea vs. Republic of Korea, for example).
- I am doing my best to update the preprocessing code to fix these problems. Please be patient and I will release the newest version of covidify ASAP.
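To illustrate the kind of cleanup described above, here is a minimal sketch of how country names could be unified and missing rows detected. It is not covidify's actual preprocessing code: the alias table and the `normalize_country` and `find_missing_countries` helpers are hypothetical names chosen for this example.

```python
# Hypothetical alias table (an assumption for this sketch, not covidify's real mapping).
# Keys are lowercase variants, values are the canonical name to use everywhere.
COUNTRY_ALIASES = {
    "republic of korea": "South Korea",
    "south korea": "South Korea",
}


def normalize_country(name: str) -> str:
    """Return a canonical country name, or the whitespace-cleaned input if no alias is known."""
    cleaned = " ".join(name.split())  # collapse stray or repeated whitespace
    return COUNTRY_ALIASES.get(cleaned.lower(), cleaned)


def find_missing_countries(sheets: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each sheet, report countries that appear in another sheet but not in this one."""
    all_countries = set().union(*sheets.values())
    return {sheet: all_countries - names for sheet, names in sheets.items()}


if __name__ == "__main__":
    print(normalize_country("Republic of Korea"))
    print(find_missing_countries({
        "confirmed": {"South Korea", "Ireland"},
        "deaths": {"South Korea"},
    }))
```

Normalizing names before comparing sheets matters, since otherwise "Republic of Korea" in one sheet and "South Korea" in another would both be reported as missing.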
pip install covidify
How to run:
$ covidify
Usage: covidify [OPTIONS] COMMAND [ARGS]...
☣ COVIDIFY ☣
- use the most up-to-date data to generate reports of confirmed cases,
fatalities and recoveries.
Options:
--help Show this message and exit.
Commands:
run
$ covidify run --help
Usage: covidify run [OPTIONS]
Options:
--output TEXT Folder to output data and reports [Default:
/Users/award40/Desktop/covidify-output/]
--source TEXT There are two datasources to choose from, John Hopkins
github repo or wikipedia -- options are git or wiki
respectively [Default: git]
--country TEXT Filter reports by a country [Default: Global cases]
--help Show this message and exit.
Example Commands:
# Will default to desktop folder
# for output and github for datasource
covidify run
# Will default to desktop folder for output
covidify run --source=wiki
covidify run --output=/Users/award40/Documents/projects-folder --source=git
# Filter reports by country
covidify run --country="South Korea"
These plots are updated daily to visualize three attributes:
confirmed cases
deaths
recoveries
This is a cumulative sum trendline for all the confirmed cases, deaths and recoveries.
This is a daily sum trendline for all the confirmed cases, deaths and recoveries.
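The relationship between the cumulative and daily trendlines can be sketched as follows. This is an illustration only, not covidify's code: the function names are hypothetical and the numbers are made up. It assumes the first entry of a cumulative series counts entirely as new for that day.

```python
from itertools import accumulate


def daily_from_cumulative(cumulative):
    """Turn running totals into per-day counts by differencing consecutive values."""
    daily = []
    previous = 0  # assumption: nothing was recorded before the first date
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


def cumulative_from_daily(daily):
    """Turn per-day counts back into running totals."""
    return list(accumulate(daily))


if __name__ == "__main__":
    cumulative_confirmed = [10, 15, 15, 30]  # made-up running totals
    daily_confirmed = daily_from_cumulative(cumulative_confirmed)
    print(daily_confirmed)
    print(cumulative_from_daily(daily_confirmed))
```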
This stacked bar chart shows, for each day, the number of people who are currently confirmed (red) and the number of people newly confirmed on that day (blue).
A count of new cases recorded on the given date. It does not take past confirmations into account.
A count of deaths due to the virus recorded on the given date. It does not take past deaths into account.
A count of new recoveries recorded on the given date. It does not take past recoveries into account.
A count of all the people who are currently infected on a given date: confirmed cases - (recoveries + deaths).
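As a worked example of the currently-infected formula, the sketch below applies confirmed cases - (recoveries + deaths) to each date. The function names and the sample numbers are made up for illustration, and the inputs are assumed to be cumulative totals aligned by date.

```python
def active_cases(confirmed, deaths, recoveries):
    """Currently infected on one date: confirmed cases - (recoveries + deaths)."""
    return confirmed - (recoveries + deaths)


def active_cases_series(confirmed, deaths, recoveries):
    """Apply the formula date by date to three equally long lists of cumulative totals."""
    if not (len(confirmed) == len(deaths) == len(recoveries)):
        raise ValueError("all three series must cover the same dates")
    return [active_cases(c, d, r) for c, d, r in zip(confirmed, deaths, recoveries)]


if __name__ == "__main__":
    # Made-up cumulative totals for two consecutive dates.
    print(active_cases_series([100, 150], [2, 5], [10, 30]))
```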
- The data comes from Novel Coronavirus (COVID-19) Cases, a live dataset provided by JHU CSSE.
- Data available here.
- All code written by me (Aaron Ward - https://www.linkedin.com/in/aaronjward/)
- A special thank you to the JHU CSSE team for maintaining the data