In [None]:
import pandas as pd

# Covid 19 Analysis
The COVID-19 pandemic has had a significant impact on the world, with millions of cases and deaths reported globally. In this analysis, we use pandas to explore a dataset of COVID-19 cases and deaths in the United States. For each state and submission date, the dataset records total, confirmed, probable, and new cases; total, confirmed, probable, and new deaths; and whether the state consented to reporting cases and deaths. By analyzing this dataset, we hope to gain insight into the trends and patterns of COVID-19 cases and deaths in the US, and to identify factors that may be contributing to the spread of the virus.

The data was obtained from Data.gov, the U.S. federal government's open data portal, which publishes datasets from government agencies for public use.
The link to the data set can be found here: [Data Website](https://catalog.data.gov/dataset/united-states-covid-19-cases-and-deaths-by-state-over-time)



In [4]:
covid_df = pd.read_csv('USACovid.csv')
covid_df

Unnamed: 0,submission_date,state,tot_cases,conf_cases,prob_cases,new_case,pnew_case,tot_death,conf_death,prob_death,new_death,pnew_death,created_at,consent_cases,consent_deaths
0,03/11/2021,KS,297229,241035.0,56194.0,0,0.0,4851,0.0,0.0,0,0.0,03/12/2021 03:20:13 PM,Agree,0
1,12/01/2021,ND,163565,135705.0,27860.0,589,220.0,1907,0.0,0.0,9,0.0,12/02/2021 02:35:20 PM,Agree,Not agree
2,01/02/2022,AS,11,0.0,0.0,0,0.0,0,0.0,0.0,0,0.0,01/03/2022 03:18:16 PM,0,0
3,11/22/2021,AL,841461,620483.0,220978.0,703,357.0,16377,12727.0,3650.0,7,3.0,11/22/2021 12:00:00 AM,Agree,Agree
4,05/30/2022,AK,251425,0.0,0.0,0,0.0,1252,0.0,0.0,0,0.0,05/31/2022 01:20:20 PM,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
60055,02/09/2021,TX,2504556,0.0,0.0,13329,2676.0,43306,0.0,0.0,207,0.0,02/11/2021 12:00:00 AM,Not agree,Not agree
60056,11/20/2020,FL,913561,0.0,0.0,8217,1677.0,19014,0.0,0.0,79,5.0,11/20/2020 12:00:00 AM,Not agree,Not agree
60057,08/17/2020,NM,23500,0.0,0.0,92,0.0,682,0.0,0.0,4,0.0,08/19/2020 12:00:00 AM,0,Not agree
60058,06/17/2020,MS,24223,24038.0,185.0,521,6.0,1191,1172.0,19.0,9,0.0,06/19/2020 12:00:00 AM,Agree,Agree


# Data Cleaning
To keep later calculations from failing on missing values, all NaN occurrences have been replaced with 0.
Replacing NaN with zero distorts the data, because a 0 may mean "not reported" rather than a true zero. For the calculations that follow, zero-valued entries in a column are therefore omitted from the evaluation.
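
One way to omit the zero placeholders is to turn them back into NaN just before aggregating, since pandas skips NaN by default. The sketch below is only an illustration under that assumption: the `sample` frame is a small stand-in built from a few rows shown above, and `mean_ignoring_zeros` is a hypothetical helper, not part of the notebook. Note that it also drops genuine zeros, which may or may not be acceptable for a given column.

```python
import numpy as np
import pandas as pd

# Small stand-in frame; the values are copied from rows of the table above.
sample = pd.DataFrame({
    "state": ["KS", "ND", "AL", "AK"],
    "conf_cases": [241035.0, 135705.0, 620483.0, 0.0],
    "prob_cases": [56194.0, 27860.0, 220978.0, 0.0],
})


def mean_ignoring_zeros(frame, column):
    """Mean of `column`, treating 0 as 'not reported' (hypothetical helper)."""
    values = frame[column].replace(0, np.nan)
    return values.mean()  # Series.mean() skips NaN by default


# The zero for AK pulls the naive mean down; the masked mean ignores it.
naive_mean = sample["conf_cases"].mean()
masked_mean = mean_ignoring_zeros(sample, "conf_cases")
print(naive_mean, masked_mean)
```

On the full dataset the same helper could be called as `mean_ignoring_zeros(covid_df, "conf_cases")` after the `fillna(0)` step.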

# Visuals
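To make the tables easier to read, they are kept sorted in ascending order by `submission_date`. Because the dates are stored as MM/DD/YYYY text, this ordering goes by month and day first and is not strictly chronological across years.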

In [15]:
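# Replace every missing value (NaN) with 0
covid_df = covid_df.fillna(0)
# Sort rows in ascending order of the submission_date column
covid_df = covid_df.sort_values('submission_date')
covid_df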

Unnamed: 0,submission_date,state,tot_cases,conf_cases,prob_cases,new_case,pnew_case,tot_death,conf_death,prob_death,new_death,pnew_death,created_at,consent_cases,consent_deaths
12774,01/01/2021,FL,1327296,0.0,0.0,9697,1580.0,23444,0.0,0.0,155,12.0,01/01/2021 12:00:00 AM,Not agree,Not agree
19276,01/01/2021,IL,963389,963389.0,0.0,0,0.0,18173,16647.0,1526.0,195,38.0,01/02/2021 02:50:51 PM,Agree,Agree
58583,01/01/2021,UT,279722,279722.0,0.0,3110,0.0,1278,1252.0,26.0,9,1.0,01/02/2021 02:50:51 PM,Agree,Agree
16101,01/01/2021,WI,522523,483007.0,39516.0,2085,180.0,5254,4869.0,385.0,12,2.0,01/02/2021 02:50:51 PM,Agree,Agree
17857,01/01/2021,ID,141077,116717.0,24360.0,0,0.0,1436,1269.0,167.0,0,0.0,01/02/2021 02:50:51 PM,Agree,Agree
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
28664,12/31/2021,ME,154173,112408.0,41765.0,1060,144.0,1531,1406.0,125.0,0,0.0,01/02/2022 12:00:00 AM,Agree,Agree
29706,12/31/2021,WI,1120669,994535.0,126134.0,8010,1018.0,11173,10063.0,1110.0,22,3.0,01/01/2022 02:12:51 PM,Agree,Agree
23625,12/31/2021,DC,94286,0.0,0.0,0,0.0,1211,0.0,0.0,0,0.0,01/01/2022 02:12:51 PM,0,0
58567,12/31/2021,OH,2016095,1589233.0,426862.0,20598,5021.0,29447,29447.0,0.0,667,0.0,01/01/2022 02:12:51 PM,Agree,Agree
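
The sorted table above starts at 01/01/2021 and ends at 12/31/2021 even though the dataset also contains 2020 and 2022 rows, which is what a text sort of MM/DD/YYYY values produces. If a chronological order is wanted instead, one option is to parse the column with `pd.to_datetime` before sorting. The following is a minimal sketch on a made-up four-row `sample` frame (dates and states taken from rows shown earlier), assuming every date follows the `%m/%d/%Y` pattern.

```python
import pandas as pd

# Stand-in frame using dates and states that appear in the tables above.
sample = pd.DataFrame({
    "submission_date": ["12/01/2021", "01/02/2022", "06/17/2020", "03/11/2021"],
    "state": ["ND", "AS", "MS", "KS"],
})

# Text sort: strings are compared character by character, so the month decides first.
by_text = sample.sort_values("submission_date")

# Chronological sort: convert the strings to datetimes, then sort.
by_date = sample.assign(
    submission_date=pd.to_datetime(sample["submission_date"], format="%m/%d/%Y")
).sort_values("submission_date")

print(by_text["state"].tolist())  # ['AS', 'KS', 'MS', 'ND']
print(by_date["state"].tolist())  # ['MS', 'KS', 'ND', 'AS']
```

Using `assign` leaves the original frame untouched, so the text dates remain available for display.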
