# Member Task 3: COVID-19 Employment Dataset

### Analyzing Recent Trends: The Latest Week of COVID-19 Data for North Carolina

### Let's begin our analysis by loading the COVID-19 Cases dataset from the .csv file.

In [2]:
import pandas as pd

cases = pd.read_csv('covid_confirmed_usafacts.csv')
cases.head()

Unnamed: 0,countyFIPS,County Name,State,StateFIPS,2020-01-22,2020-01-23,2020-01-24,2020-01-25,2020-01-26,2020-01-27,...,2023-07-14,2023-07-15,2023-07-16,2023-07-17,2023-07-18,2023-07-19,2023-07-20,2023-07-21,2023-07-22,2023-07-23
0,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,1001,Autauga County,AL,1,0,0,0,0,0,0,...,19913,19913,19913,19913,19913,19913,19913,19913,19913,19913
2,1003,Baldwin County,AL,1,0,0,0,0,0,0,...,70521,70521,70521,70521,70521,70521,70521,70521,70521,70521
3,1005,Barbour County,AL,1,0,0,0,0,0,0,...,7582,7582,7582,7582,7582,7582,7582,7582,7582,7582
4,1007,Bibb County,AL,1,0,0,0,0,0,0,...,8149,8149,8149,8149,8149,8149,8149,8149,8149,8149


### To focus on North Carolina, we create a dataframe that contains only North Carolina's rows, then measure how the case counts changed across the last seven date columns.

In [4]:
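# Keep only the North Carolina rows
northcarolina_df = cases[cases['State'] == 'NC']

# The last seven columns hold the seven most recent daily counts
date_columns = northcarolina_df.columns[-7:]

# Day-over-day change per county; the first column has no predecessor and becomes NaN
trends_df = northcarolina_df[date_columns].diff(axis=1)

# Net change across all counties and days (sum() skips the NaN values)
overall_trend = trends_df.sum().sum()

if overall_trend > 0:
    print("Cases are increasing.")
elif overall_trend < 0:
    print("Cases are decreasing.")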
else:
    print("The number of cases has remained stable, showing no notable fluctuations.")

The number of cases has remained stable, showing no notable fluctuations.
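
The date columns appear to hold cumulative counts, so the sum of all day-over-day differences collapses to the last column minus the first. A total of zero therefore means no new cases were recorded in the window. The following is a minimal sketch on made-up numbers (the `toy` frame and its county names are hypothetical, not taken from the dataset) that illustrates this identity and shows how per-day totals could be read from the same differences.

```python
import pandas as pd

# Hypothetical cumulative counts for three counties over four days (toy data).
toy = pd.DataFrame(
    {
        "County Name": ["A County", "B County", "C County"],
        "2023-07-20": [100, 50, 10],
        "2023-07-21": [100, 52, 10],
        "2023-07-22": [103, 52, 10],
        "2023-07-23": [103, 55, 10],
    }
)

date_cols = toy.columns[1:]

# Day-over-day change in the cumulative counts. The first date column has
# no predecessor, so diff() leaves it as NaN and we drop it.
daily_new = toy[date_cols].diff(axis=1).iloc[:, 1:]

# New cases per day, summed over all counties.
new_per_day = daily_new.sum()

# The grand total of the differences equals last column minus first column.
total_change = daily_new.sum().sum()
last_minus_first = (toy[date_cols[-1]] - toy[date_cols[0]]).sum()

print(new_per_day)
print(total_change, last_minus_first)
```

A per-day series such as `new_per_day` would also make it possible to compare the latest week against the previous one, which a single summed total cannot show.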


### Loading and Displaying the Employment Data Set

In [5]:
employment_df = pd.read_excel('allhlcn231.xlsx')
employment_df.head()

Unnamed: 0,Area\nCode,St,Cnty,Own,NAICS,Year,Qtr,Area Type,St Name,Area,...,Industry,Status Code,Establishment Count,January Employment,February Employment,March Employment,Total Quarterly Wages,Average Weekly Wage,Employment Location Quotient Relative to U.S.,Total Wage Location Quotient Relative to U.S.
0,US000,US,0.0,0,10,2023,1,Nation,,U.S. TOTAL,...,"10 Total, all industries",,11841874,150223138,151012227,151528335,2874415473594,1465,1.0,1.0
1,US000,US,0.0,1,10,2023,1,Nation,,U.S. TOTAL,...,"10 Total, all industries",,60721,2865577,2879371,2883686,69924124474,1870,1.0,1.0
2,US000,US,0.0,2,10,2023,1,Nation,,U.S. TOTAL,...,"10 Total, all industries",,71769,4523955,4614397,4634781,87122334855,1460,1.0,1.0
3,US000,US,0.0,3,10,2023,1,Nation,,U.S. TOTAL,...,"10 Total, all industries",,171738,14256931,14378197,14442084,222917538523,1194,1.0,1.0
4,US000,US,0.0,5,10,2023,1,Nation,,U.S. TOTAL,...,"10 Total, all industries",,11537646,128576675,129140262,129567784,2494451475742,1486,1.0,1.0


### Data Integration: Merging COVID-19 Statistics with Employment Information

### Creating an Enhanced COVID Dataset: Integrating Confirmed Case Counts, Deaths, and County Population Data

In [7]:
cases_df = pd.read_csv("covid_confirmed_usafacts.csv")
deaths_df = pd.read_csv("covid_deaths_usafacts.csv")
population_df = pd.read_csv("covid_county_population_usafacts.csv")

cases_deaths = pd.merge(cases_df, deaths_df, on='countyFIPS', suffixes=('_cases', '_deaths'))
covid_df = pd.merge(cases_deaths, population_df, on='countyFIPS')
covid_df.head()

Unnamed: 0,countyFIPS,County Name_cases,State_cases,StateFIPS_cases,2020-01-22_cases,2020-01-23_cases,2020-01-24_cases,2020-01-25_cases,2020-01-26_cases,2020-01-27_cases,...,2023-07-17_deaths,2023-07-18_deaths,2023-07-19_deaths,2023-07-20_deaths,2023-07-21_deaths,2023-07-22_deaths,2023-07-23_deaths,County Name,State,population
0,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,Statewide Unallocated,AL,0
1,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,Statewide Unallocated,AK,0
2,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,Statewide Unallocated,AZ,0
3,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,Statewide Unallocated,AR,0
4,0,Statewide Unallocated,AL,1,0,0,0,0,0,0,...,0,0,0,0,0,0,0,Statewide Unallocated,CA,0
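
The preview above pairs Alabama's `Statewide Unallocated` case row with population rows from AK, AZ, AR, and CA. This is what a merge keyed only on `countyFIPS` produces: every state's unallocated row carries `countyFIPS` 0, so each one matches all the others. Below is a minimal sketch of one possible remedy, assuming that the files share a `State` column with matching abbreviations. The toy frames and most of their values are made up for illustration.

```python
import pandas as pd

# Toy frames mimicking the USAFacts layout (values are illustrative only).
cases_toy = pd.DataFrame({
    "countyFIPS": [0, 1001, 0, 2013],
    "State": ["AL", "AL", "AK", "AK"],
    "2023-07-23": [0, 19913, 0, 500],
})
population_toy = pd.DataFrame({
    "countyFIPS": [0, 1001, 0, 2013],
    "State": ["AL", "AL", "AK", "AK"],
    "population": [0, 55869, 0, 3000],
})

# Key on countyFIPS only: the two FIPS-0 rows pair up in every combination,
# and the shared State column is split into State_x / State_y.
by_fips = pd.merge(cases_toy, population_toy, on="countyFIPS")

# Key on countyFIPS plus State: one output row per input row.
# validate= makes pandas raise MergeError if the keys are not unique.
by_fips_state = pd.merge(
    cases_toy,
    population_toy,
    on=["countyFIPS", "State"],
    validate="one_to_one",
)

print(len(by_fips), len(by_fips_state))
```

The `validate` argument is optional, but it turns a silent row explosion into an immediate error, which is useful when the key columns have not been inspected beforehand.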


### Combining the Enhanced COVID Dataset with Employment Data for a Multi-Dimensional Analysis

In [8]:
# We'll isolate the county names from the 'Area' column, which presents data in the 'County, State' format

employment_df['County Name'] = employment_df['Area'].str.extract(r'^(.*?),')

# We will merge the enhanced COVID-19 dataset with the employment dataset by matching entries on the 'County Name' column

covid_employment = employment_df.merge(covid_df, on='County Name', how='inner')

# We will remove the newly created 'County Name' column to revert to the dataset's original structure
covid_employment = covid_employment.drop(columns='County Name')
covid_employment.head()

Unnamed: 0,Area\nCode,St,Cnty,Own,NAICS,Year,Qtr,Area Type,St Name,Area,...,2023-07-16_deaths,2023-07-17_deaths,2023-07-18_deaths,2023-07-19_deaths,2023-07-20_deaths,2023-07-21_deaths,2023-07-22_deaths,2023-07-23_deaths,State,population
0,1001,1,1.0,0,10,2023,1,County,Alabama,"Autauga County, Alabama",...,235,235,235,235,235,235,235,235,AL,55869
1,1001,1,1.0,1,10,2023,1,County,Alabama,"Autauga County, Alabama",...,235,235,235,235,235,235,235,235,AL,55869
2,1001,1,1.0,2,10,2023,1,County,Alabama,"Autauga County, Alabama",...,235,235,235,235,235,235,235,235,AL,55869
3,1001,1,1.0,3,10,2023,1,County,Alabama,"Autauga County, Alabama",...,235,235,235,235,235,235,235,235,AL,55869
4,1001,1,1.0,5,10,2023,1,County,Alabama,"Autauga County, Alabama",...,235,235,235,235,235,235,235,235,AL,55869
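
County names repeat across states, so a join on `County Name` alone can attach one state's COVID-19 figures to another state's employment rows. The sketch below uses hypothetical toy frames to contrast the name-based join with a join on the FIPS code. It assumes that the `Area\nCode` values of county-level rows parse to the same integers as `countyFIPS`, which should be checked against the real files before relying on it.

```python
import pandas as pd

# Toy employment rows (illustrative values); the real header contains a newline.
employment_toy = pd.DataFrame({
    "Area\nCode": ["US000", "1129", "37187"],
    "Area Type": ["Nation", "County", "County"],
    "Area": [
        "U.S. TOTAL",
        "Washington County, Alabama",
        "Washington County, North Carolina",
    ],
})

# Toy COVID rows; population figures are made up.
covid_toy = pd.DataFrame({
    "countyFIPS": [1129, 37187],
    "County Name": ["Washington County", "Washington County"],
    "State": ["AL", "NC"],
    "population": [16000, 12000],
})

# Name-based join, as in the cell above: both Washington Counties match
# each other, so two counties become four rows.
with_names = employment_toy.assign(
    **{"County Name": employment_toy["Area"].str.extract(r"^(.*?),", expand=False)}
)
by_name = with_names.merge(covid_toy, on="County Name", how="inner")

# FIPS-based join: keep county-level rows, convert the area code to an
# integer, and match it against countyFIPS.
county_rows = employment_toy[employment_toy["Area Type"] == "County"].copy()
county_rows["countyFIPS"] = pd.to_numeric(county_rows["Area\nCode"], errors="coerce")
county_rows = county_rows.dropna(subset=["countyFIPS"]).astype({"countyFIPS": "int64"})
by_fips = county_rows.merge(covid_toy, on="countyFIPS", how="inner")

print(len(by_name), len(by_fips))
```

In the real employment file each county also appears once per value of `Own`, so even a FIPS-based join yields several rows per county unless one ownership level is selected first.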
