# COVID-19 Trends in Prison Facilities Analysis Project

# Instructions:
Please use the COVID-19 Trends by County (Prison) Dataset and the plot type named in each code cell's comments to answer the questions. Examples of how to read in this data and create each of these plot types are in the lecture notes. If you're still having trouble, please let me know! :)

### Dataset Overview
This data set contains information collected in March 2021 on the number of COVID-19 cases in
various prison facilities in the US. It also has the number of cases and deaths for inmates and
officers as well as the total number of inmates in each facility.




Step 1: Calculate a proxy for the number of cases per person in each facility
 * To do that, compute (Number of Cases in Inmates + Number of Cases in
 Officers) / (Number of Inmates), using the latest inmate population as the number of inmates.
 * Drop all rows that are missing the latest inmate population, since the value cannot be computed for them.


In [None]:
# Load required libraries
library(readr)
library(dplyr)

# Load data
facilities <- read_csv("/content/facilities.csv", show_col_types=FALSE)

# Data cleaning - drop rows missing inmate population data
facilities <- facilities %>%
  filter(!is.na(latest_inmate_population))

# Create normalized cases metric
facilities <- facilities %>%
  mutate(normalized_cases = (total_inmate_cases + total_officer_cases) / latest_inmate_population)

# View the updated dataset
head(facilities)

nyt_id,facility_name,facility_type,facility_city,facility_county,facility_county_fips,facility_state,facility_lng,facility_lat,latest_inmate_population,max_inmate_population_2020,total_inmate_cases,total_inmate_deaths,total_officer_cases,total_officer_deaths,note,normalized_cases
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<dbl>
F3EFE858,Alex City Work Release prison,Low-security work release,Alex City,Coosa,1037,Alabama,-86.00901,32.90451,188,,77,0,17,0,,0.5
5B910220,Alabama Therapeutic Education Facility prison,State rehabilitation center,Columbiana,Shelby,1117,Alabama,-86.62407,33.18075,272,,11,1,2,0,,0.04779412
02FB1675,Bibb Correctional Facility,State prison,Brent,Bibb,1007,Alabama,-87.16278,32.92075,1725,1825.0,164,3,61,0,,0.13043478
6378F6C4,Birmingham Women's Community Based Facility and Community Work Center,State prison,Birmingham,Jefferson,1073,Alabama,-86.80834,33.5311,192,,17,0,28,0,,0.234375
EAABF900,Bullock Correctional Facility,State prison,Bessemer,Bullock,1011,Alabama,-85.67393,32.14714,1477,1577.0,162,5,80,1,,0.16384563
D19A2461,Camden prison,State prison,Camden,Wilcox,1131,Alabama,-87.28786,31.98781,49,,5,0,3,0,,0.16326531


### **Question 1: Are there differences in the normalized number of cases based on the facility type (jail, prison, juvenile detention, etc.)?**
Use a bar chart to show the average normalized number of cases in each facility type.

In [None]:
options(repr.plot.width = 10, repr.plot.height = 6)
library(ggplot2)
library(dplyr)

# Ensure facility_type is a factor for consistent ordering and coloring


# Plot
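
One way to fill in this cell, offered as a minimal sketch rather than the expected solution: average `normalized_cases` within each `facility_type`, then draw the averages with `geom_col()`. The function name `plot_mean_cases_by_type`, the bar ordering, and the axis labels are illustrative choices, not part of the assignment.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: bar chart of the mean normalized case rate per facility type.
plot_mean_cases_by_type <- function(df) {
  by_type <- df %>%
    group_by(facility_type) %>%
    summarise(
      mean_normalized_cases = mean(normalized_cases, na.rm = TRUE),
      .groups = "drop"
    ) %>%
    # Turn facility_type into a factor ordered by its mean so bars appear sorted
    mutate(facility_type = reorder(facility_type, mean_normalized_cases))

  ggplot(by_type, aes(x = facility_type, y = mean_normalized_cases, fill = facility_type)) +
    geom_col(show.legend = FALSE) +
    coord_flip() +  # horizontal bars keep long facility type names readable
    labs(
      x = "Facility type",
      y = "Mean normalized cases",
      title = "Average normalized COVID-19 cases by facility type"
    )
}
```

Calling `plot_mean_cases_by_type(facilities)` after Step 1 should display the chart in the notebook. Summarising first and using `geom_col()` keeps the averaged values available for inspection.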



### **Question 2: Are there differences in the normalized number of cases across different states?**

In [None]:
library(ggplot2)

# Set plot size for Colab or Jupyter
options(repr.plot.width = 14, repr.plot.height = 6)

# Boxplot
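
A possible shape for this cell, as a sketch under stated assumptions: one box per `facility_state`, showing the spread of `normalized_cases` across that state's facilities. Sorting states by their median and rotating the axis labels are presentation choices made here for readability, and `plot_cases_by_state` is a hypothetical name.

```r
library(ggplot2)

# Hypothetical helper: boxplot of normalized cases for each state.
plot_cases_by_state <- function(df) {
  ggplot(
    df,
    aes(
      # Order states by their median normalized case rate
      x = reorder(facility_state, normalized_cases, FUN = median, na.rm = TRUE),
      y = normalized_cases
    )
  ) +
    geom_boxplot(outlier.alpha = 0.4) +
    labs(
      x = "State",
      y = "Normalized cases",
      title = "Normalized COVID-19 cases by state"
    ) +
    # Rotate state names so they do not overlap on the x-axis
    theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
}
```

Calling `plot_cases_by_state(facilities)` should draw the plot. A boxplot shows the within-state spread as well as the typical value, which a bar of averages would hide.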


### **Question 3: Is there a relationship between the location of the facility and the normalized number of COVID-19 deaths?**

In [None]:
library(ggplot2)

# Adjust plot size for Google Colab
options(repr.plot.width = 10, repr.plot.height = 6)

# Scatter plot
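
The notebook does not define a normalized death count, so the sketch below assumes it mirrors the Step 1 formula: (`total_inmate_deaths` + `total_officer_deaths`) / `latest_inmate_population`. It also assumes "location" means the `facility_lng` and `facility_lat` columns, plotted as a scatter with color encoding the death rate. The column name `normalized_deaths` and the function name `plot_deaths_by_location` are hypothetical.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: scatter of facility coordinates colored by normalized deaths.
plot_deaths_by_location <- function(df) {
  with_deaths <- df %>%
    # Assumed definition, analogous to normalized_cases in Step 1
    mutate(
      normalized_deaths =
        (total_inmate_deaths + total_officer_deaths) / latest_inmate_population
    ) %>%
    # Drop rows where the ratio is NA, NaN, or infinite
    filter(is.finite(normalized_deaths))

  ggplot(with_deaths, aes(x = facility_lng, y = facility_lat, colour = normalized_deaths)) +
    geom_point(alpha = 0.6) +
    scale_colour_viridis_c() +
    labs(
      x = "Longitude",
      y = "Latitude",
      colour = "Normalized deaths",
      title = "Normalized COVID-19 deaths by facility location"
    )
}
```

Calling `plot_deaths_by_location(facilities)` should draw the plot. If the question intends a single axis for location (for example, latitude only), the same helper could be adapted by mapping `normalized_deaths` to `y` instead of `colour`.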



### **Question 4: Is there a relationship between the inmate population and the number of COVID-19 cases among officers?**

In [None]:
library(ggplot2)

# Set figure size (for Google Colab or Jupyter)
options(repr.plot.width = 10, repr.plot.height = 6)

# Scatter plot with color by facility_type
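
A minimal sketch for this cell, assuming the inmate population is `latest_inmate_population` and the officer case count is `total_officer_cases`, with points colored by `facility_type` as the comment above suggests. The function name `plot_officer_cases_vs_population` and the labels are illustrative.

```r
library(ggplot2)

# Hypothetical helper: officer cases against inmate population, colored by facility type.
plot_officer_cases_vs_population <- function(df) {
  ggplot(
    df,
    aes(
      x = latest_inmate_population,
      y = total_officer_cases,
      colour = facility_type
    )
  ) +
    geom_point(alpha = 0.6) +  # transparency helps where many facilities overlap
    labs(
      x = "Latest inmate population",
      y = "Total officer cases",
      colour = "Facility type",
      title = "Officer COVID-19 cases vs. inmate population"
    )
}
```

Calling `plot_officer_cases_vs_population(facilities)` should draw the plot. Rows with a missing officer case count are dropped by ggplot2 with a warning rather than an error.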
