seecovid

This is a data science portfolio project in which plotly|Dash was used to visualize COVID-19 data.

More information about the finished app can be found on my portfolio website: https://buckeye17.github.io/COVID-Dashboard/

In August 2020 the app was revised significantly. Changes are listed below. More information about the update can be found here: https://buckeye17.github.io/Docker-and-AWS/

Removed the animation feature from the heat map plot due to slow performance
Added a data picker for the heat map
Fixed bugs pertaining to logarithmic scale bar plots
Added Docker support
Added ability to download new data files from Amazon S3 bucket
Modified extract_transform Jupyter notebook as follows:
- Modified pipeline to conform new data structure to original structure
- Added Amazon S3 upload to enable automatic pushing of data to web app
- Added ability to restart Amazon Elastic Beanstalk app instances
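
The S3 and Elastic Beanstalk steps listed above could look roughly like the sketch below. It is not the code from the notebook or from app.py: the function names, file paths, bucket name and environment name are all hypothetical. The only real API calls are the boto3 client methods `upload_file`, `download_file` and `restart_app_server`. The clients are passed in as arguments so the helpers can be exercised without AWS access.

```python
def push_data_and_restart(s3_client, eb_client, local_path, bucket, key, environment_name):
    """Upload one data file to S3, then restart the Elastic Beanstalk app servers.

    All argument values are supplied by the caller; nothing here is taken
    from the actual repository.
    """
    # boto3 S3 client method: upload_file(Filename, Bucket, Key)
    s3_client.upload_file(local_path, bucket, key)
    # boto3 Elastic Beanstalk client method: restart_app_server(EnvironmentName=...)
    eb_client.restart_app_server(EnvironmentName=environment_name)


def pull_data(s3_client, bucket, key, local_path):
    """Download one data file from S3 (the web app side of the exchange)."""
    # boto3 S3 client method: download_file(Bucket, Key, Filename)
    s3_client.download_file(bucket, key, local_path)
    return local_path
```

With boto3 installed and credentials configured, the clients would be created with `boto3.client("s3")` and `boto3.client("elasticbeanstalk")` and passed to these helpers together with the real bucket and environment names.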

The deployed app can be found here: http://seecovid.eba-ishvxpm4.us-east-1.elasticbeanstalk.com/

Replication

The local Python environment can be re-created either using requirements_aws_ec2.txt with pip or using environment.yml with conda.

The requirements.txt file is used by Heroku to build the containerized environment for the Heroku app. The heroku-gitignore.txt file defines the files to be ignored for Heroku deployment.

The github_gitignore.txt file defines the files to be ignored when archiving on GitHub.

After its initial deployment as a Heroku app, the app was extended to also run within a Docker container and to be deployed on AWS Elastic Beanstalk. Hence the .dockerignore and Dockerfile files have been added to the root directory. The requirements_aws_ec2.txt file also defines the Python environment of the Docker image.

Files & Folders

The Procfile is needed for Heroku deployment. It tells the containerized Linux environment how to start the app.

app.py fully defines the Python Dash app.

data_clean folder contains:

Clean Pandas dataframes as pickle files. These files provide COVID data & population data.
GeoJSON files used for heat map region boundaries.
An initial plotly figure data structure for the heat map seen when landing on the site. This seems to reduce initial loading time from 25 seconds to 15 seconds!
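
The idea behind the stored initial figure can be sketched as follows: build the figure once, save its data structure to disk, and load that file at startup instead of rebuilding it. This is a minimal sketch, not the code in app.py. The function names, the file path and the use of a plain dict with the standard-library `pickle` module are assumptions made for illustration.

```python
import pickle
from pathlib import Path


def save_initial_figure(fig_dict, path):
    """Write a figure data structure (a plain dict here) to a pickle file."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(fig_dict, f)


def load_initial_figure(path, build_figure):
    """Return the cached figure if the file exists; otherwise build and cache it.

    build_figure is a hypothetical zero-argument callable that performs the
    slow figure construction and returns a dict.
    """
    path = Path(path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    fig_dict = build_figure()
    save_initial_figure(fig_dict, path)
    return fig_dict
```

A plotly figure can be converted to such a dict with `fig.to_dict()`, and a Dash `dcc.Graph` accepts a dict for its `figure` property, so the cached structure can be handed to the layout directly.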

data_raw folder contains:

CSV files as well as the XLSX files used to generate them. This is largely population data copied and pasted from Wikipedia tables. The sub-folder canada_provinces contains raw GeoJSON files for Canada.

images folder contains static images used by the app.

jupyter_notebooks folder contains:

The PopulationData notebook transforms raw population data into clean Pandas dataframe pickle files.
The Covid19_Extract_Transform notebook uses the clean population data and downloads Johns Hopkins COVID data to produce a clean Pandas dataframe pickle file for COVID data.
The Covid19_Visualization3 notebook was used to initially explore plotly visualizations, which were eventually combined into the Dash app.
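
One typical step when cleaning the Johns Hopkins time series files is reshaping them from one column per date into one row per region and date. The sketch below illustrates that step only; it is not taken from the Covid19_Extract_Transform notebook, which works with Pandas dataframes. It uses the standard-library `csv` module instead, the function name is hypothetical, and the column names assume the layout of the Johns Hopkins global time series CSV files (Province/State, Country/Region, Lat, Long, followed by date columns).

```python
import csv
import io

# Identifier columns assumed from the Johns Hopkins global time series layout.
ID_COLUMNS = ["Province/State", "Country/Region", "Lat", "Long"]


def wide_to_long(csv_text):
    """Turn one-column-per-date CSV text into a list of one-row-per-date dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Every column that is not an identifier is treated as a date column.
    date_columns = [c for c in reader.fieldnames if c not in ID_COLUMNS]
    rows = []
    for record in reader:
        for date in date_columns:
            rows.append({
                "province": record["Province/State"],
                "country": record["Country/Region"],
                "date": date,
                # Counts are read through float() in case a value is written as "2.0".
                "count": int(float(record[date])),
            })
    return rows
```

With Pandas, the same reshaping is commonly done with `DataFrame.melt`, keeping the identifier columns as `id_vars`.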

secret_credentials is a folder omitted from this GitHub repository because it contains authentication keys:

.mapbox_token provides a key for accessing the Mapbox maps used for all heat map figures.
client_secrets.json provides Google Drive access.
github_token.txt provides access to GitHub, enabling all the desired files in the Johns Hopkins COVID repository to be downloaded.
config & credentials are two files used by the boto3 package for AWS in the extract_transform notebook to push data files to an S3 bucket and to restart the Elastic Beanstalk app.
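
Anyone replicating the project has to supply their own versions of these files. The sketch below shows one way they might be read, assuming the folder sits next to app.py. The function names are hypothetical and this is not the loading code used in the repository. `AWS_SHARED_CREDENTIALS_FILE` and `AWS_CONFIG_FILE` are the environment variables boto3 consults when the AWS files are not in their default location.

```python
import os
from pathlib import Path


def load_mapbox_token(folder="secret_credentials"):
    """Read the Mapbox token from the .mapbox_token file, dropping surrounding whitespace."""
    return (Path(folder) / ".mapbox_token").read_text().strip()


def point_boto3_at(folder="secret_credentials"):
    """Tell boto3 to read its config and credentials files from the given folder.

    Must be called before any boto3 client or session is created.
    """
    folder = Path(folder)
    os.environ["AWS_SHARED_CREDENTIALS_FILE"] = str(folder / "credentials")
    os.environ["AWS_CONFIG_FILE"] = str(folder / "config")
```

The token string can then be passed to plotly, for example with `plotly.express.set_mapbox_access_token(token)`.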
