# COVID-19 Data Visualizations W/ Plotly

Creator: Teran Hembry

This notebook works with a dataset from https://ourworldindata.org/covid-deaths covering the global COVID-19 pandemic.
It contains multiple visualizations that show infection hotspots across the globe and the impact of the virus on populations.

The data will be visualized through Python 3 libraries including matplotlib and plotly.




To begin, first import all of the following libraries:


In [7]:

# To read data
import pandas as pd

# For numerical operations
import numpy as np

# For data visualization
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.express as px


The file used in this project is named "covid_deaths.csv".

Select the following columns from the dataset to create a dataframe:
iso_code, location, continent, date, total_cases, new_cases, total_deaths, and population.

Make sure the columns 'total_cases', 'new_cases', 'total_deaths', and 'population' are stored as floats, and convert 'date' to datetime.
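As a small illustration of the conversion step, here is a minimal sketch on hypothetical sample rows (the values and the `sample` frame are made up and are not taken from covid_deaths.csv). It uses `pd.to_numeric` with `errors="coerce"`, which is one way to handle stray non-numeric strings; the notebook itself uses `astype`.

```python
import pandas as pd

# Hypothetical rows standing in for the real file; values are illustrative only
sample = pd.DataFrame({
    "location": ["A", "B", "C"],
    "total_cases": ["5", "12", "n/a"],
    "population": ["1000", "2500", "4000"],
})

numeric_cols = ["total_cases", "population"]

# errors="coerce" turns strings that cannot be parsed into NaN instead of raising
converted = sample[numeric_cols].apply(pd.to_numeric, errors="coerce").astype(float)

print(converted.dtypes)
print(converted)
```

Plain `astype(float)` would raise on a value such as "n/a", so coercing first can be useful when the input has not been cleaned.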

In [50]:
covidds = pd.read_csv('covid_deaths.csv')
df = covidds[['iso_code','location', 'continent', 'date', 'total_cases', 'new_cases', 'total_deaths', 'population']]

# Converting columns from strings to float
dict_column_types = {'total_cases': float,'new_cases': float, 'total_deaths': float, 'population': float}
df = df.astype(dict_column_types)

# Converting the date column to datetime
df['date'] = pd.to_datetime(df['date'])

# Creating new columns for the percentage of the population infected and the percentage of the population that died
df['perc_pop_infected'] = (df['total_cases'] / df['population'])*100
df['pop_death_perc'] = (df['total_deaths'] / df['population'])*100


df.head()

| | iso_code | location | continent | date | total_cases | new_cases | total_deaths | population | perc_pop_infected | pop_death_perc |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AFG | Afghanistan | Asia | 2020-02-24 | 5.0 | 5.0 | NaN | 39835428.0 | 1.3e-05 | NaN |
| 1 | AFG | Afghanistan | Asia | 2020-02-25 | 5.0 | 0.0 | NaN | 39835428.0 | 1.3e-05 | NaN |
| 2 | AFG | Afghanistan | Asia | 2020-02-26 | 5.0 | 0.0 | NaN | 39835428.0 | 1.3e-05 | NaN |
| 3 | AFG | Afghanistan | Asia | 2020-02-27 | 5.0 | 0.0 | NaN | 39835428.0 | 1.3e-05 | NaN |
| 4 | AFG | Afghanistan | Asia | 2020-02-28 | 5.0 | 0.0 | NaN | 39835428.0 | 1.3e-05 | NaN |


# Global Data

In [51]:
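# Snapshot of the records from 2022-03-16, used for the maps below
gdata = df[df['date'] == '2022-03-16'][['iso_code','location','continent','total_cases', 'perc_pop_infected','total_deaths','pop_death_perc']]
# Continent-level aggregate rows, used for the line chart
# (assumes the continent aggregate is labeled 'Oceania', as in the OWID export; 'Australia' is a country row)
condata = df[df['location'].isin(['Europe', 'Africa', 'North America', 'South America', 'Asia', 'Oceania'])]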

# Figure 1 displays the growth of COVID cases over time
fig1 = px.line(condata, x = 'date',y = 'total_cases', color = 'location',
              labels={'date':'Date', 'total_cases':'COVID-19 Cases', 'location':'Continents'},
              title = '1.  Global COVID-19 Cases Over Time')
fig1.show()

In [39]:


# Figure 2A displays the total COVID-19 cases recorded in each country as of 2022-03-16
fig2A = px.choropleth(gdata, locations="iso_code",
                    color="total_cases", # total count of COVID cases
                    hover_name="location", # column to add to hover information
                    range_color=(0, 50000000),
                    color_continuous_scale="deep",
                   title = "2A.  2022 COVID-19 Cases by Country",
                   labels = {'total_cases' : 'COVID Cases'})
fig2A.show()


# Figure 2B displays the total COVID-19 deaths recorded in each country as of 2022-03-16
fig2B = px.choropleth(gdata, locations="iso_code",
                    color="total_deaths", # total count of COVID deaths
                    hover_name="location", # column to add to hover information
                    range_color=(0, 1000000),
                    color_continuous_scale="reds",
                   title = "2B.  2022 COVID-19 Death Count by Country",
                   labels = {'total_deaths' : 'COVID Related Deaths'})
fig2B.show()

In the figures above, many of the countries shaded darkest in figure 2A are also shaded darkest in figure 2B. This overlap suggests, but does not prove, a correlation between COVID-19 case counts and COVID-19 deaths.
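One way to put a number on that visual impression is a Pearson correlation between the two columns. The sketch below is illustrative only: the `snapshot` frame holds made-up values under the same column names as `gdata`, so the printed coefficient says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical snapshot with the same column names as gdata; values are made up
snapshot = pd.DataFrame({
    "location": ["A", "B", "C", "D", "E", "F"],
    "total_cases": [100.0, 200.0, 300.0, 400.0, 500.0, 600.0],
    "total_deaths": [2.0, 3.0, 7.0, 8.0, 11.0, None],
})

# Keep only locations that report both figures
complete = snapshot.dropna(subset=["total_cases", "total_deaths"])

# Series.corr computes the Pearson correlation coefficient by default
r = complete["total_cases"].corr(complete["total_deaths"])
print(f"Pearson r across {len(complete)} locations: {r:.2f}")
```

Replacing `snapshot` with `gdata` would run the same check on the notebook's data. A high coefficient on raw counts should still be read with care, since population size drives both columns.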

However, comparing figures 2A/2B with figure 1 shows the limits of raw counts: Europe had 40 million more cases than the other continents. Because populations differ so widely between countries, some locations stand out largely because of their size when absolute counts are displayed.

This time, figures 3A/3B use the same data but divide it by each location's population size, giving a percentage that represents the impact of the virus.
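To see why this normalization matters, consider a toy sketch with two hypothetical locations (the names and numbers are invented for illustration). The ranking by raw count and the ranking by share of population can point in opposite directions.

```python
import pandas as pd

# Two made-up locations: one large with many cases, one small with fewer cases
toy = pd.DataFrame({
    "location": ["Large", "Small"],
    "total_cases": [2_000_000.0, 150_000.0],
    "population": [100_000_000.0, 1_000_000.0],
})

# Same formula the notebook uses for perc_pop_infected
toy["perc_pop_infected"] = toy["total_cases"] / toy["population"] * 100

# Rank the locations both ways to compare
by_count = toy.sort_values("total_cases", ascending=False)["location"].tolist()
by_share = toy.sort_values("perc_pop_infected", ascending=False)["location"].tolist()

print("Ranked by raw cases:   ", by_count)
print("Ranked by % infected:  ", by_share)
```

Here the larger location leads on raw cases, while the smaller one has the higher percentage of its population affected, which is the effect figures 3A/3B are meant to expose.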


In [45]:
# Figure 3A displays cumulative COVID-19 cases as a percentage of each country's population
fig3A = px.choropleth(gdata, locations="iso_code",
                    color="perc_pop_infected", # cases as a percentage of population
                    hover_name="location", # column to add to hover information
                    range_color=(0,50),
                    color_continuous_scale="deep",
                   title = "3A.  2022 COVID-19 Country Population Case Percentage",
                   labels = {'perc_pop_infected' : '%Population Infected'})
fig3A.show()

# Figure 3B displays cumulative COVID-19 deaths as a percentage of each country's population
fig3B = px.choropleth(gdata, locations="iso_code",
                    color="pop_death_perc", # deaths as a percentage of population
                    hover_name="location", # column to add to hover information
                    range_color=(0,0.5),
                    color_continuous_scale="reds",
                   title = "3B.  2022 COVID-19 Country Population Death Percentage",
                   labels = {'pop_death_perc' : 'Population Death%'})
fig3B.show()