# Descriptive Analysis of Jail Population

The goal of this notebook is to visualize past trends in the jail population data, particularly the data points from the COVID-19 pandemic. This will inform which method (if any) we use to mitigate the impact of those data points on our forecast.

In [5]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [2]:
# Load the daily snapshots and the 30-day average daily population (ADP)
daily_pop = pd.read_csv("../Data/daily_pop.csv", index_col=0)
monthly_pop = pd.read_csv("../Data/_30_day_adp.csv", index_col=0)

In [3]:
daily_pop.head()

Unnamed: 0,snapshot_date,Total Population
0,2016-06-02,9836
1,2016-06-03,9780
2,2016-06-04,9765
3,2016-06-05,9894
4,2016-06-06,9904


In [4]:
monthly_pop.head()

Unnamed: 0,Start Date,End Date,ADP
0,2016-05-25,2016-06-23,9809.0
1,2016-06-24,2016-07-23,9764.0
2,2016-07-24,2016-08-22,9762.0
3,2016-08-23,2016-09-21,9861.0
4,2016-09-22,2016-10-21,9816.0
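
`pd.read_csv` does not parse dates unless asked, so the date columns shown above load as plain strings and should be converted before they are compared against timestamps. The following is a minimal sketch of one way to check this, using only the five rows displayed above rebuilt inline; the variable names `csv_text`, `sample`, and `window_days` are illustrative and not part of the notebook.

```python
import io
import pandas as pd

# First rows of the 30-day ADP table as displayed above,
# rebuilt inline so the sketch does not depend on the CSV file.
csv_text = """Start Date,End Date,ADP
2016-05-25,2016-06-23,9809.0
2016-06-24,2016-07-23,9764.0
2016-07-24,2016-08-22,9762.0
2016-08-23,2016-09-21,9861.0
2016-09-22,2016-10-21,9816.0
"""

# parse_dates converts the two date columns to datetime64 on load
sample = pd.read_csv(io.StringIO(csv_text), parse_dates=["Start Date", "End Date"])

# Confirm the actual column names and dtypes before filtering on them
print(sample.columns.tolist())
print(sample.dtypes)

# Length of each window in calendar days, counting both endpoints
window_days = (sample["End Date"] - sample["Start Date"]).dt.days + 1
print(window_days.tolist())
```

Checking the column names this way before filtering makes it clear that the monthly table is keyed by `Start Date` and `End Date` and stores its values in `ADP`.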


According to the [Governor's office](https://www.governor.ny.gov/news/governor-cuomo-announces-new-york-ending-covid-19-state-disaster-emergency-june-24), New York Governor Andrew M. Cuomo ended the COVID-19 state disaster emergency, declared on March 7, 2020, on June 25, 2021, citing New York's dramatic progress against COVID-19, including successful vaccination rates and declining hospitalizations. We use these two dates to bound the COVID-19 period in this analysis.

In [6]:
# Identify the COVID-19 period
covid_start = pd.to_datetime('2020-03-07')
covid_end = pd.to_datetime('2021-06-25')

# monthly_pop has 'Start Date', 'End Date', and 'ADP' columns (no 'date' or 'population').
# Each 30-day window is assigned to a period by its start date.
monthly_pop['Start Date'] = pd.to_datetime(monthly_pop['Start Date'])

# Filter pre-COVID and COVID data
pre_covid = monthly_pop[monthly_pop['Start Date'] < covid_start]
covid_data = monthly_pop[(monthly_pop['Start Date'] >= covid_start) & (monthly_pop['Start Date'] <= covid_end)]

# Plot the KDE of each period on the same axes
# (fill=True replaces the deprecated shade=True)
plt.figure(figsize=(12, 6))
sns.kdeplot(pre_covid['ADP'], fill=True, label='Pre-COVID Distribution')
sns.kdeplot(covid_data['ADP'], fill=True, color='r', label='COVID-19 Data')

# Overlay the pre-COVID mean and median
plt.axvline(pre_covid['ADP'].mean(), color='blue', linestyle='--', label='Pre-COVID Mean')
plt.axvline(pre_covid['ADP'].median(), color='green', linestyle='--', label='Pre-COVID Median')

plt.title('Distribution of 30-Day Average Daily Jail Population with COVID-19 Overlay')
plt.xlabel('30-Day Average Daily Population (ADP)')
plt.ylabel('Density')
plt.legend()
plt.show()
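
Each row of the 30-day ADP table covers a window from `Start Date` to `End Date`, so a single window can straddle the start or end of the COVID-19 period. The sketch below shows one possible way to label windows by overlap rather than by a single date. It is an illustration under stated assumptions, not the notebook's method: `windows` is a hypothetical stand-in for `monthly_pop` that contains only generated dates (no ADP values), and the `period` column and its labels are made up for this example.

```python
import pandas as pd

covid_start = pd.to_datetime("2020-03-07")
covid_end = pd.to_datetime("2021-06-25")

# Hypothetical stand-in for monthly_pop: consecutive 30-day windows
# beginning 2016-05-25, matching the pattern of the first rows shown
# earlier. Dates only; no population values are invented here.
starts = pd.date_range("2016-05-25", periods=70, freq="30D")
windows = pd.DataFrame({
    "Start Date": starts,
    "End Date": starts + pd.Timedelta(days=29),
})

# Default to "covid" for any window that overlaps the emergency period,
# then overwrite the windows that fall entirely before or after it.
windows["period"] = "covid"
windows.loc[windows["End Date"] < covid_start, "period"] = "pre_covid"
windows.loc[windows["Start Date"] > covid_end, "period"] = "post_covid"

print(windows["period"].value_counts())
```

Labeling by overlap keeps a boundary window out of the pre-COVID baseline even when only part of it falls inside the emergency period. Whether that is preferable to filtering on the start date alone depends on how conservative the baseline should be.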