# Task 2 Team Report on COVID-19 DataSet

##### We use data from usafacts.org. The dataset is a daily, county-level tracker of COVID-19 cases, which makes it possible to follow the pandemic at a granular level and, together with the county population data, to express infections per 100,000 people. In this report we examine the structure of each dataset so that later analysis can identify patterns, trends, and correlations in the data and help inform future public health decisions.

In [11]:
import pandas as pd

#### Import the Cases, Deaths, and County Population Data Using Pandas

In [12]:
confirmed = pd.read_csv("covid_confirmed_usafacts.csv")
deaths = pd.read_csv("covid_deaths_usafacts.csv")
county_population = pd.read_csv("covid_county_population_usafacts.csv")
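
The `head()` outputs in this report include an `Unnamed: 0` column, which usually means the CSV was written with its row index. As a minimal sketch, assuming that column carries no information beyond the row number, it can be absorbed at load time with `index_col=0`. The inline CSV text below is a stand-in for the real files, and `csv_text`, `raw`, and `clean` are illustrative names.

```python
import io

import pandas as pd

# Stand-in for a CSV saved with its index: note the empty first header field.
csv_text = (
    ",countyFIPS,County Name,State,population\n"
    "0,0,Statewide Unallocated,AL,0\n"
    "1,1001,Autauga County,AL,55869\n"
)

# Default read: the unnamed index column shows up as "Unnamed: 0".
raw = pd.read_csv(io.StringIO(csv_text))

# Treat the first column as the index so it no longer counts as a data column.
clean = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(raw.columns))
print(list(clean.columns))
```

If the row and column counts reported in this report were taken from frames loaded without `index_col=0`, they would include that extra column.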

# COVID-19 Confirmed Cases Data 

In [13]:
confirmed.head()

Unnamed: 0,countyFIPS,County Name,State,StateFIPS,2020-01-22,2020-01-23,2020-01-24,2020-01-25,2020-01-26,2020-01-27,...,2023-07-14,2023-07-15,2023-07-16,2023-07-17,2023-07-18,2023-07-19,2023-07-20,2023-07-21,2023-07-22,2023-07-23
0,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1001,Autauga County,AL,1,0,0,0,0,0,0,...,19913,19913,19913,19913,19913,19913,19913,19913,19913,19913
2,1003,Baldwin County,AL,1,0,0,0,0,0,0,...,70521,70521,70521,70521,70521,70521,70521,70521,70521,70521
3,1005,Barbour County,AL,1,0,0,0,0,0,0,...,7582,7582,7582,7582,7582,7582,7582,7582,7582,7582
4,1007,Bibb County,AL,1,0,0,0,0,0,0,...,8149,8149,8149,8149,8149,8149,8149,8149,8149,8149


The dataset contains a time series of confirmed COVID-19 cases in the United States at the county level. 

* Number of Rows: There are 3,193 rows, each representing a different county or region within states.
* Number of Columns: The dataset has 1,269 columns.
* Column Details:

    * countyFIPS: The FIPS code for the county.
    * County Name: The name of the county.
    * State: The U.S. state abbreviation.
    * StateFIPS: The FIPS code for the state.
    * Date columns: Starting from 2020-01-22 and continuing daily, representing the cumulative number of confirmed COVID-19 cases for each date.

The first five rows provide a glimpse into the structure of the data, with the columns from 2020-01-22 to 2023-07-23 indicating the progression of case counts over time. Each row's values under the date columns show the cumulative confirmed cases of COVID-19 for the respective county up to that date.
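
Because the date columns hold cumulative totals, daily new cases can be recovered by differencing consecutive date columns. The following is a minimal sketch on a toy frame that mimics the layout described above; the counts and the three dates are made-up illustration values, and `confirmed_toy`, `id_cols`, `date_cols`, and `daily_new` are hypothetical names.

```python
import pandas as pd

# Toy frame with the same layout as the confirmed-cases data.
# The counts are made-up illustration values, not real figures.
confirmed_toy = pd.DataFrame({
    "countyFIPS": [1001, 1003],
    "County Name": ["Autauga County", "Baldwin County"],
    "State": ["AL", "AL"],
    "StateFIPS": [1, 1],
    "2020-03-24": [1, 3],
    "2020-03-25": [4, 3],
    "2020-03-26": [6, 5],
})

# Separate the identifier columns from the date columns.
id_cols = ["countyFIPS", "County Name", "State", "StateFIPS"]
date_cols = [c for c in confirmed_toy.columns if c not in id_cols]

# Difference across columns (axis=1) to turn cumulative totals into daily counts.
daily_new = confirmed_toy[date_cols].diff(axis=1)

# The first date has no predecessor, so keep its cumulative value as-is.
daily_new[date_cols[0]] = confirmed_toy[date_cols[0]]
daily_new = daily_new.astype(int)

print(daily_new)
```

On real data a negative difference can appear when a reported total is revised downward, so it may be worth inspecting negative values before using them.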


#### Variable Dictionary for the Confirmed Cases Data

|    Name        | Definition                     | Data type | Possible values                           | Required? |
|:--------------:|:------------------------------:|:---------:|:-----------------------------------------:|:---------:|
| County FIPS    | Unique identifier for a County | Integer   | 0, 1001, 1003                             |   Yes     |
| County Name    | Name of the County             | String    | "Randolph County", "Guilford County"      |   Yes     |
| State          | Abbreviation of the State      | String    | "AL", "NC", "NJ", "VA"                    |   Yes     |
| State FIPS     | Unique identifier for a State  | Integer   | 1, 37, 51                                 |   Yes     |

# COVID-19 Deaths Data

In [17]:
deaths.head()

Unnamed: 0,countyFIPS,County Name,State,StateFIPS,2020-01-22,2020-01-23,2020-01-24,2020-01-25,2020-01-26,2020-01-27,...,2023-07-14,2023-07-15,2023-07-16,2023-07-17,2023-07-18,2023-07-19,2023-07-20,2023-07-21,2023-07-22,2023-07-23
0,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1001,Autauga County,AL,1,0,0,0,0,0,0,...,235,235,235,235,235,235,235,235,235,235
2,1003,Baldwin County,AL,1,0,0,0,0,0,0,...,731,731,731,731,731,731,731,731,731,731
3,1005,Barbour County,AL,1,0,0,0,0,0,0,...,104,104,104,104,104,104,104,104,104,104
4,1007,Bibb County,AL,1,0,0,0,0,0,0,...,111,111,111,111,111,111,111,111,111,111


The dataset contains cumulative COVID-19-related deaths in the United States, broken down by county.

* Number of Rows: The dataset consists of 3,193 rows, representing different counties or regions within the United States.
* Number of Columns: There are 1,269 columns in the dataset.
* Column Details:

    * countyFIPS: The FIPS code for the county, which is a unique identifier for counties and county equivalents in the United States.
    * County Name: The name of the county.
    * State: The abbreviation for the state to which the county belongs.
    * StateFIPS: The FIPS code for the state.
    * Date columns: These columns start at 2020-01-22 and end at 2023-07-23. Each column holds the cumulative number of COVID-19 deaths recorded for each county up to that date.

The first five rows of the dataset show the cumulative death counts beginning from zero on January 22, 2020, and provide a daily update of these counts.
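
A wide table with one column per date is awkward to filter or plot by date. As a rough sketch, the date columns can be reshaped into a long format with one row per county and date using `DataFrame.melt`. The toy counts and dates below are made up for illustration, and `deaths_toy`, `deaths_long`, and the `cumulative_deaths` column name are hypothetical.

```python
import pandas as pd

# Toy frame with the same layout as the deaths data.
# The counts are made-up illustration values, not real figures.
deaths_toy = pd.DataFrame({
    "countyFIPS": [1001, 1003],
    "County Name": ["Autauga County", "Baldwin County"],
    "State": ["AL", "AL"],
    "StateFIPS": [1, 1],
    "2020-04-01": [0, 1],
    "2020-04-02": [1, 1],
})

id_cols = ["countyFIPS", "County Name", "State", "StateFIPS"]

# Unpivot: every date column becomes a (date, cumulative_deaths) pair per county.
deaths_long = deaths_toy.melt(
    id_vars=id_cols, var_name="date", value_name="cumulative_deaths"
)

# Parse the former column labels into real datetimes and order by county, then date.
deaths_long["date"] = pd.to_datetime(deaths_long["date"])
deaths_long = deaths_long.sort_values(["countyFIPS", "date"]).reset_index(drop=True)

print(deaths_long)
```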

#### Variable Dictionary for the Deaths Data

|    Name        | Definition                     | Data type | Possible values                           | Required? |
|:--------------:|:------------------------------:|:---------:|:-----------------------------------------:|:---------:|
| County FIPS    | Unique identifier for a County | Integer   | 0, 1001, 1003                             |   Yes     |
| County Name    | Name of the County             | String    | "Randolph County", "Guilford County"      |   Yes     |
| State          | Abbreviation of the State      | String    | "AL", "NC", "NJ", "VA"                    |   Yes     |
| State FIPS     | Unique identifier for a State  | Integer   | 1, 37, 51                                 |   Yes     |

# COVID-19 County Population Data

In [18]:
county_population.head()

Unnamed: 0,countyFIPS,County Name,State,population
0,0,Statewide Unallocated,AL,0
1,1001,Autauga County,AL,55869
2,1003,Baldwin County,AL,223234
3,1005,Barbour County,AL,24686
4,1007,Bibb County,AL,22394


The dataset contains the population of each county and is structured as follows:

* Number of Rows: There are 3,195 rows in the dataset.
* Number of Columns: The dataset comprises 4 columns.
* Column Details:

    * countyFIPS: The Federal Information Processing Standards code, which serves as a unique identifier for each county in the United States.
    * County Name: The name of the county.
    * State: The abbreviation of the state to which the county belongs.
    * population: The population of the county.

* First Few Rows: The dataset starts with a row labeled 'Statewide Unallocated' for the state of Alabama (AL) with a population of 0, suggesting that it might be a placeholder or summary row. The following rows provide population data for specific counties in Alabama, such as Autauga County with a population of 55,869, Baldwin County with 223,234, Barbour County with 24,686, and Bibb County with 22,394.

This dataset supplies the population denominator needed to express case and death counts per 100,000 people, which makes counties of different sizes comparable.
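
To illustrate how the population data combines with the case counts, here is a minimal sketch that joins the two tables and computes cases per 100,000 people. It uses the Alabama rows shown in the `head()` outputs above. Joining on both `countyFIPS` and `State` is an assumption, made in case the `0` code for "Statewide Unallocated" repeats across states; `latest_cases`, `population_toy`, `merged`, and `cases_per_100k` are hypothetical names.

```python
import pandas as pd

# Latest cumulative counts for the rows shown in confirmed.head().
latest_cases = pd.DataFrame({
    "countyFIPS": [0, 1001, 1003],
    "State": ["AL", "AL", "AL"],
    "2023-07-23": [0, 19913, 70521],
})

# Matching rows from county_population.head().
population_toy = pd.DataFrame({
    "countyFIPS": [0, 1001, 1003],
    "State": ["AL", "AL", "AL"],
    "population": [0, 55869, 223234],
})

# Assumption: countyFIPS 0 may not be unique across states, so join on State as well.
merged = latest_cases.merge(population_toy, on=["countyFIPS", "State"], how="inner")

# Drop zero-population rows (e.g. "Statewide Unallocated") to avoid dividing by zero.
merged = merged[merged["population"] > 0].copy()

# Rate per 100,000 residents.
merged["cases_per_100k"] = merged["2023-07-23"] / merged["population"] * 100_000

print(merged)
```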



#### Variable Dictionary for the County Population Data

|    Name        | Definition                     | Data type | Possible values                           | Required?  |
|:--------------:|:------------------------------:|:---------:|:-----------------------------------------:|:----------:|
| County FIPS    | Unique identifier for a County | Integer   | 0, 1001, 1003                             |    Yes     |
| County Name    | Name of the County             | String    | "Randolph County", "Guilford County"      |    Yes     |
| State          | Abbreviation of the State      | String    | "AL", "NC", "NJ", "VA"                    |    Yes     |
| Population     | The population of a County     | Integer   | 0, 9347, 139542                           |    Yes     |
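
A variable dictionary like the one above can also be checked against the loaded frame. The sketch below is one possible way to do that, not part of the original analysis: `check_dictionary` and `expected` are hypothetical names, and the check only covers column presence, missing values, and a coarse integer/string type test.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype, is_string_dtype

# Expected column kinds, transcribed from the variable dictionary.
expected = {
    "countyFIPS": "integer",
    "County Name": "string",
    "State": "string",
    "population": "integer",
}


def check_dictionary(df, expected):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    for col, kind in expected.items():
        if col not in df.columns:
            problems.append(f"missing column: {col}")
            continue
        # Every variable is marked as required, so no missing values are allowed.
        if df[col].isna().any():
            problems.append(f"missing values in: {col}")
        type_ok = is_integer_dtype(df[col]) if kind == "integer" else is_string_dtype(df[col])
        if not type_ok:
            problems.append(f"unexpected dtype for {col}: {df[col].dtype}")
    return problems


# Rows taken from county_population.head().
population_toy = pd.DataFrame({
    "countyFIPS": [0, 1001, 1003],
    "County Name": ["Statewide Unallocated", "Autauga County", "Baldwin County"],
    "State": ["AL", "AL", "AL"],
    "population": [0, 55869, 223234],
})

print(check_dictionary(population_toy, expected))
```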