# The effects of COVID-19 on crime rates in Vancouver

Group project proposal:

Sean Lee, Neil Li, Tracy Wang, Wendi Zhong

## Introduction

Before the pandemic, our teammate Neil had experienced no crime more serious than perhaps public drunkenness, but since the pandemic started, he has been subjected to two attempted grand theft autos and one shooting. We cannot help but wonder: is this simply a streak of bad luck, or is there a genuine correlation between these crimes and the pandemic?

Now this is not a completely unfounded idea, as although research has shown 

### Research Question:

<b>Has COVID-19 affected the frequency and severity of crimes?</b>

In [None]:
library(tidyverse)
library(datateachr)
library(repr)
library(digest)
library(infer)
library(grid)

## Dataset Info:

The dataset is downloaded from [Vancouver Crime Data](https://geodash.vpd.ca/opendata/), an open dataset provided by the Vancouver Police Department. We selected the download that lists all the crimes committed in every neighbourhood in Vancouver since 2003.

In [None]:
crime_data <- read.csv("crimedata_csv_AllNeighbourhoods_AllYears.csv")
head(crime_data)

Because we want the crime data to reflect the difference between the years leading up to the pandemic and the years during and after it, we filter the data to include only the years from 2017 onwards. Because November 2022 has not yet occurred, we also keep only the months before November in every year, so that each year covers the same months. We only need the columns containing the type of crime and the year in which it was committed.

In [None]:
crime_data_processed <- crime_data %>%
    filter(YEAR >= 2017, MONTH <= 10) %>%
    select(TYPE, YEAR)

head(crime_data_processed)

In [None]:
set.seed(2190)

crime_sample <- crime_data_processed %>%
    rep_sample_n(size = 5000, replace = FALSE) %>%
    mutate(Pandemic = ifelse(YEAR < 2020, "Before", "After"))
head(crime_sample)

We first decided to visualize the overall spread of crime over the six years by taking a sample of size 3000, and bootstrapping 1000 samples from it to see the overall

In [None]:
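# Draw a single sample of 3000 crimes. This sample is the starting point for
# bootstrapping the proportion of crimes committed before the pandemic
# (YEAR < 2020) and obtaining a 95% confidence interval from it.
set.seed(1234)
# named crime_sample_3000 to avoid masking base R's sample()
crime_sample_3000 <- crime_data_processed %>%
    rep_sample_n(size = 3000)
head(crime_sample_3000)

# count the sampled crimes before and after the start of the pandemic
crime_sample %>%
    group_by(Pandemic) %>%
    summarize(n = n())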
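
One possible way to carry out the bootstrap step described above is sketched below with the `infer` package. The helper name `bootstrap_prop_ci` and its arguments are hypothetical, and the sketch assumes the data has a `Pandemic` column with the values "Before" and "After", as in `crime_sample`. It illustrates the idea and is not our final analysis.

```r
library(dplyr)
library(infer)

# Hypothetical helper (an assumption, not part of the original analysis):
# bootstrap a percentile confidence interval for the proportion of sampled
# crimes labelled "Before" in the `Pandemic` column.
bootstrap_prop_ci <- function(data, reps = 1000, level = 0.95) {
    data %>%
        ungroup() %>%        # drop any grouping left over from rep_sample_n()
        select(Pandemic) %>%
        specify(response = Pandemic, success = "Before") %>%
        generate(reps = reps, type = "bootstrap") %>%   # resample with replacement
        calculate(stat = "prop") %>%                    # proportion "Before" per resample
        get_confidence_interval(level = level, type = "percentile")
}
```

Calling `bootstrap_prop_ci(crime_sample)` after `set.seed()` would return a one-row table with the lower and upper bounds of the interval. Since the filtered data covers the same ten months in three pre-pandemic and three pandemic years, an interval that excludes 0.5 would suggest that crime counts differ between the two periods.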


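Because this is a large dataset, we have the luxury of drawing many large samples. With samples this large, we can apply the central limit theorem to approximate the sampling distribution of our estimates with a normal distribution, and use that approximation to build confidence intervals.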
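
As a sketch of what this approximation could look like, assume (in notation introduced only for this illustration) that $\hat{p}$ is the proportion of the $n$ sampled crimes that occurred before the pandemic and $p$ is the corresponding proportion in the full dataset. For large $n$, the central limit theorem gives approximately

$$
\hat{p} \sim N\!\left(p,\ \frac{p(1-p)}{n}\right),
\qquad
\text{95\% CI: } \hat{p} \pm 1.96\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
$$

Under the same assumption, the difference between the "before" and "after" shares is $\hat{d} = \hat{p} - (1-\hat{p}) = 2\hat{p} - 1$, so its standard error is twice as large:

$$
\text{95\% CI: } \hat{d} \pm 1.96 \cdot 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}
$$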

In [None]:
# calculate mean and standard deviation on the difference between the total amount of crime before and after the pandemic using the central limit theorem and obtain a 95% confidence interval from this

## Methods


Mention our plans for hypothesis testing, and our future plans to test how different kinds of crime have been affected.

## References:

Ferguson, E. (2015). Crime and punishment vocabulary with pronunciation. IELTS Liz. Retrieved October 31, 2022, from https://ieltsliz.com/crime-and-punishment-vocabulary/ 

Vancouver Police Department. (n.d.). Crime data download. VPD Open Data. Retrieved October 31, 2022, from https://geodash.vpd.ca/opendata/

Nivette, A. E., Zahnow, R., Aguilar, R., et al. (2021). A global analysis of the impact of COVID-19 stay-at-home restrictions on crime. Nature Human Behaviour, 5, 868–877. https://doi.org/10.1038/s41562-021-01139-z