# Vancouver Crime in Relation to Wealth

### Group 13 Project Proposal
#### Introduction

In recent years, crime in Vancouver, and theft in particular, has become an increasingly prevalent concern for the safety and security of the city's residents. CTV News reported on October 27th of this year that 258 arrests were made in a two-week period as the Vancouver Police Department continued its shoplifting crackdown[1]. On October 30th, a B.C. coalition called for immediate government action on theft, vandalism, and violent crime, which its members say have reached "epidemic proportions" across the province[2].

Given this issue, our project asks the question: “How do rates of theft differ between neighbourhoods in Vancouver?” We will use the crime data set[3] released by the VPD, covering 2003 to 2023, to investigate the number of incidents of robbery, theft of motor vehicles, theft from vehicles, and theft under $5,000. We will compare theft rates in two neighbourhoods: Strathcona (more commonly known as the Downtown Eastside), the poorest neighbourhood in Vancouver according to Wikipedia[4] and CBC[5], and Kerrisdale, one of the richest neighbourhoods in Vancouver[6][7]. Our location parameter of interest is the mean, and our scale parameter of interest is the standard deviation. We believe these two parameters will help us answer our question by comparing the rates of theft in the two neighbourhoods.
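
As a rough sketch of how these two parameters could be estimated, the snippet below counts theft incidents per neighbourhood and year, then takes the mean and standard deviation of those yearly counts. It assumes the quantity of interest is the yearly incident count, which is only one possible reading of the question. The toy tibble `toy_crime` and the labels in `theft_types` are made up for illustration; the actual TYPE labels should be checked against the VPD data.

```r
library(dplyr)

# Toy records standing in for the VPD data (values are invented for illustration)
toy_crime <- tibble::tibble(
  TYPE = c("Theft of Vehicle", "Other Theft", "Mischief", "Theft from Vehicle",
           rep("Other Theft", 8)),
  YEAR = c(2021, 2021, 2021, 2022, rep(2021, 3), rep(2022, 5)),
  NEIGHBOURHOOD = c(rep("Kerrisdale", 4), rep("Strathcona", 8))
)

# Assumed labels for theft-related incidents (hypothetical, verify against the data)
theft_types <- c("Theft of Vehicle", "Theft from Vehicle", "Other Theft")

theft_summary <- toy_crime |>
  filter(TYPE %in% theft_types) |>                  # keep theft incidents only
  count(NEIGHBOURHOOD, YEAR, name = "n_thefts") |>  # incidents per neighbourhood-year
  group_by(NEIGHBOURHOOD) |>
  summarise(
    mean_thefts = mean(n_thefts),  # location parameter
    sd_thefts = sd(n_thefts)       # scale parameter
  )

theft_summary
```

These are yearly counts rather than per-capita rates; turning them into rates would also need population figures, which this sketch does not include.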


In [1]:
# Setup
set.seed(3)

library(tidyverse)
library(tidymodels)
library(repr)
library(cowplot)
library(GGally)
library(ISLR)
options(repr.matrix.max.rows = 6)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.3     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
── Attaching packages ────────────────────────────────────── tidymodels 1.1.1 ──

✔ broom        1.0.5     ✔ rsample

#### Reading Data

The first step is to read in our data set from the web using the 'read_csv' function. The data were originally downloaded from "https://geodash.vpd.ca/opendata", where the records for Kerrisdale and Strathcona came in separate CSV files. We therefore read the two files separately and merged them into a single data frame using 'rbind'. The raw data also contained columns we do not need, such as the exact time (MONTH, DAY, HOUR, MINUTE) and the precise location (HUNDRED_BLOCK, X, Y) of each incident, so we kept only the columns relevant to our study: "TYPE", "YEAR", and "NEIGHBOURHOOD".


In [19]:
# Load the data from the web
kerris_url <- "https://drive.google.com/uc?export=download&id=1XOj_2FTc-0lW5-8RX9lRBOgzX5A2IMet"
strath_url <- "https://drive.google.com/uc?export=download&id=1wXQ8W3kBSo7ija-mhUzCVPbShviQHLNt"
kerrisdale_data <- read_csv(kerris_url) 
strathcona_data <- read_csv(strath_url)

# Merge the two data frames into a single data frame:
crime <- rbind(kerrisdale_data, strathcona_data)

# Select only relevant columns to our study
crime <- select(crime, c(TYPE, YEAR, NEIGHBOURHOOD))

print("Table 1: 2003-2023 Crime Data in Kerrisdale and Strathcona")
head(crime)
tail(crime)

Rows: 11506 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): TYPE, HUNDRED_BLOCK, NEIGHBOURHOOD
dbl (7): YEAR, MONTH, DAY, HOUR, MINUTE, X, Y

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 56640 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): TYPE, HUNDRED_BLOCK, NEIGHBOURHOOD
dbl (7): YEAR, MONTH, DAY, HOUR, MINUTE, X, Y

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.


[1] "Table 1: 2003-2023 Crime Data in Kerrisdale and Strathcona"


TYPE,YEAR,NEIGHBOURHOOD
<chr>,<dbl>,<chr>
Break and Enter Commercial,2017,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale
Break and Enter Commercial,2007,Kerrisdale
Break and Enter Commercial,2006,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale


TYPE,YEAR,NEIGHBOURHOOD
<chr>,<dbl>,<chr>
Vehicle Collision or Pedestrian Struck (with Injury),2022,Strathcona
Vehicle Collision or Pedestrian Struck (with Injury),2013,Strathcona
Vehicle Collision or Pedestrian Struck (with Injury),2018,Strathcona
Vehicle Collision or Pedestrian Struck (with Injury),2022,Strathcona
Vehicle Collision or Pedestrian Struck (with Injury),2023,Strathcona
Vehicle Collision or Pedestrian Struck (with Injury),2013,Strathcona


#### Cleaning and Wrangling Data
We tidy our data so that it follows a consistent format that tidyverse functions recognize. We convert 'NEIGHBOURHOOD' to a factor to make visualization and exploratory analysis easier.

In [22]:
# Convert NEIGHBOURHOOD to a factor so it is easier to visualize
crime_mutate <- crime |>
    mutate(NEIGHBOURHOOD = as_factor(NEIGHBOURHOOD))

print("Table 2: Tidy Crime Data")
head(crime_mutate)

[1] "Table 2: Tidy Crime Data"


TYPE,YEAR,NEIGHBOURHOOD
<chr>,<dbl>,<fct>
Break and Enter Commercial,2017,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale
Break and Enter Commercial,2007,Kerrisdale
Break and Enter Commercial,2006,Kerrisdale
Break and Enter Commercial,2020,Kerrisdale
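
To illustrate what the factor conversion does, the short sketch below compares 'as_factor' from forcats with base R's 'factor' on a made-up vector (the vector `hoods` is an assumption for illustration, not project data). 'as_factor' keeps levels in order of first appearance, whereas 'factor' sorts them, which can change the order of groups in plots and summary tables.

```r
library(forcats)

# Made-up neighbourhood labels, deliberately not in alphabetical order
hoods <- c("Strathcona", "Kerrisdale", "Strathcona")

levels_forcats <- levels(as_factor(hoods))  # levels in order of first appearance
levels_base <- levels(factor(hoods))        # levels in sorted order

levels_forcats
levels_base
```

In our merged data the Kerrisdale rows come first, so both approaches would likely give the same level order here; the difference only matters when the row order and the sorted order disagree.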


#### Preliminary exploratory data analysis
#### Methods
#### Expected outcomes and significance:
- What do you expect to find?
- What impact could such findings have?
- What future questions could this lead to?
#### References
[1]: https://bc.ctvnews.ca/258-arrests-made-in-2-week-period-as-vancouver-police-continue-shoplifting-crackdown-1.6620249

[2]: https://www.cbc.ca/news/canada/british-columbia/save-our-streets-seeks-crackdown-on-violent-retail-crimes-1.7013229

[3]: https://geodash.vpd.ca/opendata/# (source of our data)

[4]: https://en.wikipedia.org/wiki/Downtown_Eastside#:~:text=The%20Downtown%20Eastside%20(DTES)%20is,mental%20illness%20and%20sex%20work.

[5]: https://www.cbc.ca/news/canada/british-columbia/weighing-in-on-future-of-vancouver-s-downtown-eastside-1.1356400

[6]: https://www.thebestvancouver.com/where-do-rich-in-vancouver-live/

[7]: https://www.cantechletter.com/2023/07/the-five-wealthiest-neighbourhoods-in-vancouver-listed/

