<a href="https://colab.research.google.com/github/erisaf2/BDI-Final-Project/blob/main/Final_Case_Study_Covid_Farimani.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

I chose the COVID-19 dataset to compare COVID cases at the height of the pandemic in the US on January 8, 2021 with cases on April 30, 2021. I'm curious to see how the US has improved since the vaccine was rolled out and fully administered to over 30% of the population as of May 2, 2021. The data covers the 50 states, the US territories, and the two cruise ships that had the initial large COVID outbreaks, Diamond Princess and Grand Princess.

Some questions I have when looking at these datasets include:
1. States/Territories with the most vs. least cases
2. States/Territories with the most vs. least deaths
3. How cases have changed overall over the last ~4 months
4. Cases in the US shown geographically
5. Fatality rate by State/Territory
6. Top 10 highest death rate
7. Bottom 10 lowest confirmed case rate

In [1]:
if 'google.colab' in str(get_ipython()):
    !pip install plotly==4.14.3



In [2]:
import pandas as pd
import numpy as np
import plotly
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio

In [3]:
#height of COVID-19 Pandemic in US
df1 = pd.read_csv('https://github.com/erisaf2/BDI-Final-Project/raw/main/01-08-2021.csv')

In [4]:
#Pandemic today after vaccine rollout
df2 = pd.read_csv('https://github.com/erisaf2/BDI-Final-Project/raw/main/04-30-2021.csv')

In [5]:
df1.shape


(58, 18)

In [6]:
df2.shape

(58, 18)

In [7]:
df1.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 58 entries, 0 to 57
Data columns (total 18 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Province_State        58 non-null     object 
 1   Country_Region        58 non-null     object 
 2   Last_Update           58 non-null     object 
 3   Lat                   56 non-null     float64
 4   Long_                 56 non-null     float64
 5   Confirmed             58 non-null     int64  
 6   Deaths                58 non-null     int64  
 7   Recovered             48 non-null     float64
 8   Active                58 non-null     float64
 9   FIPS                  58 non-null     float64
 10  Incident_Rate         56 non-null     float64
 11  Total_Test_Results    56 non-null     float64
 12  People_Hospitalized   0 non-null      float64
 13  Case_Fatality_Ratio   57 non-null     float64
 14  UID                   58 non-null     float64
 15  ISO3                  58 

In [8]:
#Cleaning the df1 data to remove columns with missing values
df1.drop(columns=['People_Hospitalized','Hospitalization_Rate'], inplace=True)

In [9]:
df1.head(3)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,Recovered,Active,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
0,Alabama,US,2021-01-09 05:30:45,32.3182,-86.9023,394287,5191,211684.0,177412.0,1.0,8041.446529,1944233.0,1.316554,84000001.0,USA,39652.450397
1,Alaska,US,2021-01-09 05:30:45,61.3707,-152.4044,49660,223,7165.0,42251.0,2.0,6785.501917,1337749.0,0.449244,84000002.0,USA,182866.262499
2,American Samoa,US,2021-01-09 05:30:45,-14.271,-170.132,0,0,,0.0,60.0,0.0,2140.0,,16.0,ASM,3846.084722


In [10]:
df1.sample(5)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,Recovered,Active,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
30,Montana,US,2021-01-09 05:30:45,46.9219,-110.4544,85568,1049,79114.0,5405.0,30.0,8006.152821,831887.0,1.225926,84000030.0,USA,77835.340922
46,South Carolina,US,2021-01-09 05:30:45,33.8569,-80.945,344176,5695,164575.0,173906.0,45.0,6684.698354,3398114.0,1.654677,84000045.0,USA,65999.276713
20,Kansas,US,2021-01-09 05:30:45,38.5266,-96.7265,244583,3141,4838.0,235669.0,20.0,8363.259161,1051207.0,1.289155,84000020.0,USA,36082.859589
53,Virginia,US,2021-01-09 05:30:45,37.7693,-78.17,387917,5312,32595.0,350010.0,51.0,4544.738287,4500002.0,1.369365,84000051.0,USA,52720.894886
51,Vermont,US,2021-01-09 05:30:45,44.0459,-72.7107,8619,156,5778.0,2685.0,50.0,1381.274349,743851.0,1.809955,84000050.0,USA,119208.992466


In [11]:
df2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 58 entries, 0 to 57
Data columns (total 18 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Province_State        58 non-null     object 
 1   Country_Region        58 non-null     object 
 2   Last_Update           58 non-null     object 
 3   Lat                   56 non-null     float64
 4   Long_                 56 non-null     float64
 5   Confirmed             58 non-null     int64  
 6   Deaths                58 non-null     int64  
 7   Recovered             0 non-null      float64
 8   Active                0 non-null      float64
 9   FIPS                  58 non-null     float64
 10  Incident_Rate         56 non-null     float64
 11  Total_Test_Results    56 non-null     float64
 12  People_Hospitalized   0 non-null      float64
 13  Case_Fatality_Ratio   57 non-null     float64
 14  UID                   58 non-null     float64
 15  ISO3                  58 

In [12]:
#Cleaning the df2 data to remove columns with missing values
df2.drop(columns=['Recovered','Active','People_Hospitalized','Hospitalization_Rate'], inplace=True)

In [13]:
df2.head(3)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
0,Alabama,US,2021-05-01 04:30:31,32.3182,-86.9023,527922,10896,1.0,10766.919869,2498822.0,2.063941,84000001.0,USA,50963.24124
1,Alaska,US,2021-05-01 04:30:31,61.3707,-152.4044,68148,347,2.0,9315.626516,2091352.0,0.509186,84000002.0,USA,285881.524718
2,American Samoa,US,2021-05-01 04:30:31,-14.271,-170.132,0,0,60.0,0.0,2140.0,,16.0,ASM,3846.084722


In [14]:
df2.sample(5)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
48,Tennessee,US,2021-05-01 04:30:31,35.7478,-86.6923,847430,12197,47.0,12408.967761,7643133.0,1.439293,84000047.0,USA,111918.849922
39,Northern Mariana Islands,US,2021-05-01 04:30:31,15.0979,145.6739,168,2,69.0,304.656898,17542.0,1.190476,580.0,MNP,31811.257798
12,Georgia,US,2021-05-01 04:30:31,33.0406,-83.6431,1100187,20190,13.0,10362.090688,8298740.0,1.835143,84000013.0,USA,78161.527519
17,Illinois,US,2021-05-01 04:30:31,40.3495,-88.9861,1334955,24291,17.0,10534.831576,22558270.0,1.819612,84000017.0,USA,178019.165517
13,Grand Princess,US,2021-05-01 04:30:31,,,103,3,99999.0,,,2.912621,84099999.0,USA,


In [15]:
#Top 5 States/Territories with most confirmed cases on Jan 8 2021
df1_sorted_by_cases = df1.sort_values('Confirmed', ascending=False)
df1_sorted_by_cases.head(5)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,Recovered,Active,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
5,California,US,2021-01-09 05:30:45,36.1162,-119.6816,2653925,29306,,2590723.0,6.0,6630.426742,35027330.0,1.110989,84000006.0,USA,88649.352885
49,Texas,US,2021-01-09 05:30:45,31.0545,-97.5635,1932554,30128,1536690.0,357064.0,48.0,6633.35251,14532743.0,1.541282,84000048.0,USA,50120.025668
11,Florida,US,2021-01-09 05:30:45,27.7663,-81.6868,1449252,22666,,1426586.0,12.0,6747.694136,16592877.0,1.563979,84000012.0,USA,77256.169959
36,New York,US,2021-01-09 05:30:45,42.1657,-74.9481,1101445,39282,108144.0,954019.0,36.0,5661.919687,26816135.0,3.566406,84000036.0,USA,137846.921702
17,Illinois,US,2021-01-09 05:30:45,40.3495,-88.9861,1017322,19108,,998214.0,17.0,8028.222621,13922611.0,1.878265,84000017.0,USA,109870.641323


In [16]:
#Top 5 States/Territories with most confirmed cases on Apr 30 2021
df2_sorted_by_cases = df2.sort_values('Confirmed', ascending=False)
df2_sorted_by_cases.head(5)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
5,California,US,2021-05-01 07:30:34,36.1162,-119.6816,3742115,62078,6.0,9470.778194,59795197.0,1.658901,84000006.0,USA,151333.416497
49,Texas,US,2021-05-01 04:30:31,31.0545,-97.5635,2893928,50219,48.0,9980.479641,23001820.0,1.735323,84000048.0,USA,79327.887985
11,Florida,US,2021-05-01 04:30:31,27.7663,-81.6868,2233518,35161,12.0,10399.22409,22617595.0,1.574243,84000012.0,USA,105307.160619
36,New York,US,2021-05-01 04:30:31,42.1657,-74.9481,2048150,52258,36.0,10528.406599,50991319.0,2.551473,84000036.0,USA,262118.174662
17,Illinois,US,2021-05-01 04:30:31,40.3495,-88.9861,1334955,24291,17.0,10534.831576,22558270.0,1.819612,84000017.0,USA,178019.165517


In [17]:
#Top 3 States/Territories with most overall deaths on Jan 8 2021
df1_sorted_by_deaths = df1.sort_values('Deaths', ascending=False)
df1_sorted_by_deaths.head(3)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,Recovered,Active,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
36,New York,US,2021-01-09 05:30:45,42.1657,-74.9481,1101445,39282,108144.0,954019.0,36.0,5661.919687,26816135.0,3.566406,84000036.0,USA,137846.921702
49,Texas,US,2021-01-09 05:30:45,31.0545,-97.5635,1932554,30128,1536690.0,357064.0,48.0,6633.35251,14532743.0,1.541282,84000048.0,USA,50120.025668
5,California,US,2021-01-09 05:30:45,36.1162,-119.6816,2653925,29306,,2590723.0,6.0,6630.426742,35027330.0,1.110989,84000006.0,USA,88649.352885


In [18]:
#Top 3 States/Territories with most overall deaths on Apr 30 2021
df2_sorted_by_deaths = df2.sort_values('Deaths', ascending=False)
df2_sorted_by_deaths.head(3)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
5,California,US,2021-05-01 07:30:34,36.1162,-119.6816,3742115,62078,6.0,9470.778194,59795197.0,1.658901,84000006.0,USA,151333.416497
36,New York,US,2021-05-01 04:30:31,42.1657,-74.9481,2048150,52258,36.0,10528.406599,50991319.0,2.551473,84000036.0,USA,262118.174662
49,Texas,US,2021-05-01 04:30:31,31.0545,-97.5635,2893928,50219,48.0,9980.479641,23001820.0,1.735323,84000048.0,USA,79327.887985
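
The two snapshots are only inspected separately above, so here is a minimal sketch of how they could be compared directly. It joins the January and April tables on `Province_State` and computes the change in cases and deaths. The frames `jan` and `apr` are small stand-ins built from rows shown in the outputs above, and the column names `New_Cases`, `New_Deaths`, and `Case_Growth_Pct` are made up for this illustration. In the notebook the same merge would presumably be applied to `df1` and `df2`.

```python
import pandas as pd

# Stand-in rows copied from the outputs above (hypothetical mini versions of df1 and df2).
jan = pd.DataFrame({
    "Province_State": ["California", "Texas", "New York"],
    "Confirmed": [2653925, 1932554, 1101445],
    "Deaths": [29306, 30128, 39282],
})
apr = pd.DataFrame({
    "Province_State": ["California", "Texas", "New York"],
    "Confirmed": [3742115, 2893928, 2048150],
    "Deaths": [62078, 50219, 52258],
})

# Join the snapshots on the state name; the suffixes mark which date a column came from.
change = jan.merge(apr, on="Province_State", suffixes=("_jan", "_apr"))

# Absolute and relative change between the two dates.
change["New_Cases"] = change["Confirmed_apr"] - change["Confirmed_jan"]
change["New_Deaths"] = change["Deaths_apr"] - change["Deaths_jan"]
change["Case_Growth_Pct"] = 100 * change["New_Cases"] / change["Confirmed_jan"]

# Largest increase in confirmed cases first.
change = change.sort_values("New_Cases", ascending=False).reset_index(drop=True)
print(change[["Province_State", "New_Cases", "New_Deaths", "Case_Growth_Pct"]])
```

Because both columns are cumulative counts, the differences are the cases and deaths added between the two snapshot dates.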


In [19]:
#Group by Case Fatality Ratio and sum Deaths; groupby sorts the ratios in ascending order,
#so tail(5) shows the 5 highest fatality ratios and the deaths in those states/territories
df1_fatality_by_cases = df1.groupby('Case_Fatality_Ratio', as_index=False).agg({'Deaths': ['sum']})
df1_fatality_by_cases.tail(5)

Unnamed: 0_level_0,Case_Fatality_Ratio,Deaths
Unnamed: 0_level_1,Unnamed: 1_level_1,sum
52,2.912621,3
53,3.069992,6324
54,3.093722,12985
55,3.566406,39282
56,3.824176,19756
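
As a cross-check on the ranking above, the sketch below recomputes the ratio from the raw counts, assuming `Case_Fatality_Ratio` means deaths per 100 confirmed cases. That reading is consistent with the Alabama, Montana, Vermont, and New York rows printed earlier, but it is an assumption about how the dataset defines the column, and other rows may not match exactly. The frame `snapshot` and the column `CFR_recomputed` are hypothetical names used only here.

```python
import pandas as pd

# Rows copied from the Jan 8 2021 outputs above (a hypothetical stand-in for df1).
snapshot = pd.DataFrame({
    "Province_State": ["Alabama", "Montana", "Vermont", "New York"],
    "Confirmed": [394287, 85568, 8619, 1101445],
    "Deaths": [5191, 1049, 156, 39282],
})

# Assumed definition: deaths per 100 confirmed cases.
snapshot["CFR_recomputed"] = 100 * snapshot["Deaths"] / snapshot["Confirmed"]

# nlargest ranks rows directly and keeps the state names, unlike grouping by the ratio.
top = snapshot.nlargest(2, "CFR_recomputed")[["Province_State", "Deaths", "CFR_recomputed"]]
print(top)
```

Ranking with `nlargest` keeps `Province_State` in the result, which makes the table easier to read than the grouped output.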


In [20]:
fig1 = px.box(df1[df1['Confirmed']<1000000],
             x='Confirmed',
             orientation='h',
             title='Confirmed Cases (for areas with fewer than 1 million cases) as of Jan 8 2021')
fig1.show()

In [21]:
fig2 = px.line(df2_sorted_by_deaths, y='Deaths', x='Province_State', title='Total Deaths by State/Territory as of Apr 30 2021')
fig2.show()

In [22]:
df2_sorted_by_case_fatality = df2.sort_values('Case_Fatality_Ratio', ascending=False)
df2_sorted_by_case_fatality.head(5)

Unnamed: 0,Province_State,Country_Region,Last_Update,Lat,Long_,Confirmed,Deaths,FIPS,Incident_Rate,Total_Test_Results,Case_Fatality_Ratio,UID,ISO3,Testing_Rate
13,Grand Princess,US,2021-05-01 04:30:31,,,103,3,99999.0,,,2.912621,84099999.0,USA,
34,New Jersey,US,2021-05-01 04:30:31,40.2989,-74.521,997223,25554,34.0,11227.219864,13555157.0,2.562516,84000034.0,USA,152610.527359
25,Massachusetts,US,2021-05-01 04:30:31,42.2302,-71.5301,688973,17610,25.0,9995.976788,21264961.0,2.555978,84000025.0,USA,308523.057589
36,New York,US,2021-05-01 04:30:31,42.1657,-74.9481,2048150,52258,36.0,10528.406599,50991319.0,2.551473,84000036.0,USA,262118.174662
7,Connecticut,US,2021-05-01 04:30:31,41.5978,-72.7554,339233,8097,9.0,9514.886179,8021441.0,2.386855,84000009.0,USA,224987.245066


In [23]:
fig3 = px.scatter(df2_sorted_by_case_fatality, y='Deaths', x='Case_Fatality_Ratio', title='Total Deaths vs. Fatality Ratio by each State/Territory as of Apr 30 2021')
fig3.show()

In [24]:
US_deaths_over_time = pd.read_csv('https://github.com/erisaf2/BDI-Final-Project/blob/main/time_series_covid19_deaths_US.csv?raw=true')
US_deaths_over_time.head(3)

Unnamed: 0,UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,Combined_Key,Population,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,...,3/24/21,3/25/21,3/26/21,3/27/21,3/28/21,3/29/21,3/30/21,3/31/21,4/1/21,4/2/21,4/3/21,4/4/21,4/5/21,4/6/21,4/7/21,4/8/21,4/9/21,4/10/21,4/11/21,4/12/21,4/13/21,4/14/21,4/15/21,4/16/21,4/17/21,4/18/21,4/19/21,4/20/21,4/21/21,4/22/21,4/23/21,4/24/21,4/25/21,4/26/21,4/27/21,4/28/21,4/29/21,4/30/21,5/1/21,5/2/21
0,84001001,US,USA,840,1001.0,Autauga,Alabama,US,32.539527,-86.644082,"Autauga, Alabama, US",55869,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,99,99,99,99,99,99,99,99,99,99,102,102,102,102,102,103,103,103,103,103,103,103,103,103,106,106,106,106,107,107,107,107,107,107,107,107,107,107,107,107
1,84001003,US,USA,840,1003.0,Baldwin,Alabama,US,30.72775,-87.722071,"Baldwin, Alabama, US",223234,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,300,300,301,301,301,301,301,301,301,301,301,301,301,301,301,300,300,300,300,300,300,300,302,302,302,302,302,302,302,303,303,305,305,305,305,305,305,305,306,306
2,84001005,US,USA,840,1005.0,Barbour,Alabama,US,31.868263,-85.387129,"Barbour, Alabama, US",24686,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,54,54,54,54,54,54,54,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,55,56,56,56,56,56,56,56,56,56,56,56


In [25]:
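fig4 = px.pie(
    US_deaths_over_time,
    names='Province_State',
    #Size slices by cumulative deaths on the latest date; without values, px.pie counts county rows
    values='5/2/21',
    title='Death Breakdown by State as of May 2 2021',
    width=800,
    height=700
)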

fig4.show()

In [26]:
us_cases_over_time = pd.read_csv('https://github.com/erisaf2/BDI-Final-Project/blob/main/time_series_covid19_confirmed_US.csv?raw=true')
us_cases_over_time.head(3)

Unnamed: 0,UID,iso2,iso3,code3,FIPS,Admin2,Province_State,Country_Region,Lat,Long_,Combined_Key,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,...,3/24/21,3/25/21,3/26/21,3/27/21,3/28/21,3/29/21,3/30/21,3/31/21,4/1/21,4/2/21,4/3/21,4/4/21,4/5/21,4/6/21,4/7/21,4/8/21,4/9/21,4/10/21,4/11/21,4/12/21,4/13/21,4/14/21,4/15/21,4/16/21,4/17/21,4/18/21,4/19/21,4/20/21,4/21/21,4/22/21,4/23/21,4/24/21,4/25/21,4/26/21,4/27/21,4/28/21,4/29/21,4/30/21,5/1/21,5/2/21
0,84001001,US,USA,840,1001.0,Autauga,Alabama,US,32.539527,-86.644082,"Autauga, Alabama, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,6533,6540,6543,6562,6570,6577,6580,6589,6595,6606,6617,6619,6620,6644,6675,6702,6710,6715,6723,6727,6734,6740,6748,6750,6760,6763,6763,6773,6793,6819,6835,6876,6879,6882,6889,6890,6897,6904,6907,6909
1,84001003,US,USA,840,1003.0,Baldwin,Alabama,US,30.72775,-87.722071,"Baldwin, Alabama, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,20395,20417,20423,20453,20473,20487,20492,20505,20523,20519,20526,20541,20542,20551,20573,20588,20600,20617,20631,20638,20652,20670,20674,20701,20714,20723,20730,20764,20787,20815,20833,20838,20847,20863,20875,20897,20921,20941,20966,20983
2,84001005,US,USA,840,1005.0,Barbour,Alabama,US,31.868263,-85.387129,"Barbour, Alabama, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,2216,2218,2221,2224,2226,2226,2227,2227,2227,2228,2231,2232,2232,2238,2239,2244,2245,2247,2247,2249,2252,2257,2262,2264,2271,2271,2271,2275,2284,2289,2292,2296,2296,2296,2297,2298,2299,2300,2302,2302


In [27]:
us_cases_over_time.drop(columns=['code3','FIPS'], inplace=True)
us_cases_over_time.rename(columns={'Admin2': 'County'}, inplace=True)

In [28]:
us_cases_over_time.sample(5)

Unnamed: 0,UID,iso2,iso3,County,Province_State,Country_Region,Lat,Long_,Combined_Key,1/22/20,1/23/20,1/24/20,1/25/20,1/26/20,1/27/20,1/28/20,1/29/20,1/30/20,1/31/20,2/1/20,2/2/20,2/3/20,2/4/20,2/5/20,2/6/20,2/7/20,2/8/20,2/9/20,2/10/20,2/11/20,2/12/20,2/13/20,2/14/20,2/15/20,2/16/20,2/17/20,2/18/20,2/19/20,2/20/20,2/21/20,...,3/24/21,3/25/21,3/26/21,3/27/21,3/28/21,3/29/21,3/30/21,3/31/21,4/1/21,4/2/21,4/3/21,4/4/21,4/5/21,4/6/21,4/7/21,4/8/21,4/9/21,4/10/21,4/11/21,4/12/21,4/13/21,4/14/21,4/15/21,4/16/21,4/17/21,4/18/21,4/19/21,4/20/21,4/21/21,4/22/21,4/23/21,4/24/21,4/25/21,4/26/21,4/27/21,4/28/21,4/29/21,4/30/21,5/1/21,5/2/21
2944,84048485,US,USA,Wichita,Texas,US,33.988429,-98.704103,"Wichita, Texas, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,14773,14793,14801,14808,14808,14808,14822,14822,14835,14835,14840,14840,14840,14845,14845,14857,14857,14868,14868,14868,14875,14875,14889,14889,14899,14899,14899,14909,14909,14916,14916,14930,14930,14930,14944,14944,14953,14953,14971,14971
997,84020143,US,USA,Ottawa,Kansas,US,39.132374,-97.650203,"Ottawa, Kansas, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,547,547,548,548,548,549,549,549,549,549,549,549,549,549,549,549,549,549,549,549,549,549,549,550,550,550,549,549,549,549,549,549,549,550,550,551,551,551,551,551
42,84001085,US,USA,Lowndes,Alabama,US,32.159728,-86.651584,"Lowndes, Alabama, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,1357,1358,1358,1358,1359,1359,1361,1362,1363,1363,1363,1363,1363,1362,1362,1363,1364,1364,1364,1364,1365,1365,1365,1365,1373,1373,1373,1376,1378,1380,1379,1378,1382,1383,1383,1383,1385,1383,1383,1383
1058,84021051,US,USA,Clay,Kentucky,US,37.164511,-83.712575,"Clay, Kentucky, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,2558,2563,2570,2568,2572,2574,2577,2581,2581,2587,2591,2591,2589,2592,2595,2597,2599,2600,2600,2603,2607,2613,2617,2619,2621,2621,2624,2631,2635,2639,2641,2643,2643,2645,2648,2650,2650,2653,2654,2654
962,84020073,US,USA,Greenwood,Kansas,US,37.877439,-96.232505,"Greenwood, Kansas, US",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,...,551,551,551,551,551,551,551,551,551,551,551,551,552,552,552,552,552,552,552,553,553,553,553,554,554,554,555,555,555,555,555,555,555,554,554,556,556,556,556,556
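
The time series table is wide, with one column per date, which makes it awkward to look at how cases changed over time. A minimal sketch of one way to reshape it is shown below: melt the date columns into rows, parse the dates, and sum the counties per state. The frame `wide` holds two county rows and three date columns copied from the output above. The names `long_df`, `state_daily`, and `New_Cases` are made up for this illustration.

```python
import pandas as pd

# Two county rows and three date columns copied from the output above
# (a hypothetical stand-in for us_cases_over_time).
wide = pd.DataFrame({
    "County": ["Autauga", "Baldwin"],
    "Province_State": ["Alabama", "Alabama"],
    "4/30/21": [6904, 20941],
    "5/1/21": [6907, 20966],
    "5/2/21": [6909, 20983],
})

# Turn the date columns into rows: one row per county per date.
id_cols = ["County", "Province_State"]
long_df = wide.melt(id_vars=id_cols, var_name="Date", value_name="Confirmed")
long_df["Date"] = pd.to_datetime(long_df["Date"], format="%m/%d/%y")

# Sum the counties to get cumulative confirmed cases per state per day.
state_daily = (
    long_df.groupby(["Province_State", "Date"], as_index=False)["Confirmed"].sum()
           .sort_values(["Province_State", "Date"])
)

# The counts are cumulative, so the day-over-day difference gives new cases.
state_daily["New_Cases"] = state_daily.groupby("Province_State")["Confirmed"].diff()
print(state_daily)
```

The first date of each state has no previous day to compare with, so its `New_Cases` value is `NaN`.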


In [29]:
df_0521 = us_cases_over_time.sort_values(['Province_State', '5/2/21'], ascending=[True, False]) \
  .groupby('Province_State').head(3)

fig5 = px.treemap(
    df_0521,
    path=['Province_State', 'County'],
    title='Top 3 Counties with Most Confirmed COVID Cases in Each State',
    values='5/2/21',
    height=700
)

fig5.show()


In [30]:


fig6 = px.density_heatmap(
    df2,
    x='Province_State',
    y='Confirmed',
    title='Confirmed Cases by State/Territory Heatmap',
    height=600
)
fig6.show()

In [31]:
#Use a new name so the heatmap stored in fig6 is not overwritten
fig7 = px.bar(
    df2,
    x='Confirmed',
    y='Province_State',
    template='plotly_dark',
    title='Confirmed Cases by State in Ascending Order',
    height=600
)

fig7.update_yaxes(categoryorder='total ascending')

fig7.show()
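
The introduction also asks about the highest death rates and the lowest confirmed case rates, which the cells above do not rank directly. A minimal sketch of one way to do this with `nlargest` and `nsmallest` follows. The rows in `snapshot` are copied from the April 30 outputs above. The names `snapshot`, `rated`, `highest_fatality`, and `lowest_incidence` are hypothetical, and `n = 3` is used only because the sample is small (the question itself asks for 10).

```python
import pandas as pd

# Rows copied from the Apr 30 2021 outputs above (a hypothetical stand-in for df2).
snapshot = pd.DataFrame({
    "Province_State": ["Grand Princess", "New Jersey", "Massachusetts",
                       "New York", "Alaska", "Northern Mariana Islands"],
    "Case_Fatality_Ratio": [2.912621, 2.562516, 2.555978, 2.551473, 0.509186, 1.190476],
    "Incident_Rate": [None, 11227.219864, 9995.976788, 10528.406599, 9315.626516, 304.656898],
})

# Rows such as Grand Princess have no Incident_Rate in the data; leave them out of the ranking.
rated = snapshot.dropna(subset=["Incident_Rate"])

n = 3  # use 10 on the full table
highest_fatality = rated.nlargest(n, "Case_Fatality_Ratio")
lowest_incidence = rated.nsmallest(n, "Incident_Rate")

print(highest_fatality[["Province_State", "Case_Fatality_Ratio"]])
print(lowest_incidence[["Province_State", "Incident_Rate"]])
```

Whether the cruise ships belong in a fatality ranking is a judgment call; this sketch excludes them only because they lack an incident rate.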