# Group Proposal


**Authors:** Bryan Chang, Linda Huang, Jade Jordan, Inan Latif

**Group:** 5

## Introduction

Bike theft is a common concern among Vancouverites, since cycling is a primary mode of transportation for many people in the city. Cycling commuters would therefore benefit from knowing how to reduce the chances of their bikes being stolen. To approach this, we will analyze the frequency of bike thefts in each Vancouver neighbourhood alongside the hour of day in which they occur, so that people can better protect their bikes in high-risk locations and time periods.

The data set we are using is published directly by the Vancouver Police Department (VPD). It lets users filter crime data from 2003 to 2022, inclusive, and by any neighbourhood in Vancouver proper. We will filter X columns…

**Response variable + test of interest:** 

**Research Question:** In which location and at what time of day is bike theft most common?

**filter out:** bike theft, hour, neighbourhood





## Preliminary Results


```r
library(tidyverse)
library(datateachr)
library(repr)
library(digest)
library(infer)
library(grid)
library(gridExtra)
```

```r
# Read the data into R and rename all columns to a standard format
crime_data <-
    read.csv("CrimeData.csv") %>%
    setNames(c("type",
               "year",
               "month",
               "day",
               "hour",
               "minute",
               "hundred_block",
               "neighbourhood",
               "x",
               "y"))

head(crime_data)
```

| | type | year | month | day | hour | minute | hundred_block | neighbourhood | x | y |
|---|---|---|---|---|---|---|---|---|---|---|
| | `<chr>` | `<int>` | `<int>` | `<int>` | `<int>` | `<int>` | `<chr>` | `<chr>` | `<dbl>` | `<dbl>` |
| 1 | Theft from Vehicle | 2019 | 10 | 10 | 20 | 0 | 12XX W 76TH AVE | Marpole | 490316.5 | 5449844 |
| 2 | Theft from Vehicle | 2019 | 3 | 19 | 21 | 45 | 12XX W 7TH AVE | Fairview | 490387.2 | 5456955 |
| 3 | Theft from Vehicle | 2019 | 5 | 3 | 17 | 0 | 12XX W 8TH AVE | Fairview | 490298.3 | 5456855 |
| 4 | Theft from Vehicle | 2019 | 3 | 4 | 0 | 0 | 12XX W 8TH AVE | Fairview | 490368.2 | 5456863 |
| 5 | Theft from Vehicle | 2019 | 3 | 29 | 23 | 0 | 12XX W 8TH AVE | Fairview | 490381.3 | 5456853 |
| 6 | Theft from Vehicle | 2019 | 12 | 17 | 5 | 16 | 12XX W 8TH AVE | Fairview | 490441.0 | 5456852 |


## Cleaning Data

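```r
# Drop the type and minute columns
crime_select <- crime_data %>% select(-type, -minute)

# Copy hundred_block into a more readable column name, then drop the original
crime_rename <- crime_select %>%
                mutate(address = hundred_block) %>%
                select(-hundred_block)
head(crime_rename)

# Cumulative number of days before the start of each month in a non-leap year
days_by_m <- c(0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334)

# Convert day of month to day of year (leap years are not adjusted),
# then keep only year, day, hour, and neighbourhood
crime_newcol <- crime_rename %>%
                mutate(day = day + days_by_m[month]) %>%
                select(-month, -x, -y, -address)
head(crime_newcol)
```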

| | year | month | day | hour | neighbourhood | x | y | address |
|---|---|---|---|---|---|---|---|---|
| | `<int>` | `<int>` | `<int>` | `<int>` | `<chr>` | `<dbl>` | `<dbl>` | `<chr>` |
| 1 | 2019 | 10 | 10 | 20 | Marpole | 490316.5 | 5449844 | 12XX W 76TH AVE |
| 2 | 2019 | 3 | 19 | 21 | Fairview | 490387.2 | 5456955 | 12XX W 7TH AVE |
| 3 | 2019 | 5 | 3 | 17 | Fairview | 490298.3 | 5456855 | 12XX W 8TH AVE |
| 4 | 2019 | 3 | 4 | 0 | Fairview | 490368.2 | 5456863 | 12XX W 8TH AVE |
| 5 | 2019 | 3 | 29 | 23 | Fairview | 490381.3 | 5456853 | 12XX W 8TH AVE |
| 6 | 2019 | 12 | 17 | 5 | Fairview | 490441.0 | 5456852 | 12XX W 8TH AVE |


| | year | day | hour | neighbourhood |
|---|---|---|---|---|
| | `<int>` | `<dbl>` | `<int>` | `<chr>` |
| 1 | 2019 | 283 | 20 | Marpole |
| 2 | 2019 | 78 | 21 | Fairview |
| 3 | 2019 | 123 | 17 | Fairview |
| 4 | 2019 | 63 | 0 | Fairview |
| 5 | 2019 | 88 | 23 | Fairview |
| 6 | 2019 | 351 | 5 | Fairview |
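
Because the research question concerns bike theft only, the rows would need to be restricted by `type` before that column is dropped. The following is a minimal sketch of that step, not part of our pipeline yet. It runs on a small made-up tibble (`toy_crime`) so that it stands alone, and the category label `"Theft of Bicycle"` is an assumption about how the VPD export names bike thefts; it should be confirmed with `unique(crime_data$type)`.

```r
library(dplyr)

# Toy rows for illustration only; these are NOT taken from CrimeData.csv
toy_crime <- tibble(
  type = c("Theft of Bicycle", "Theft from Vehicle", "Theft of Bicycle",
           "Theft of Bicycle", "Mischief"),
  hour = c(18L, 20L, 18L, 9L, 2L),
  neighbourhood = c("Fairview", "Marpole", "Fairview", "Kitsilano", "Fairview")
)

# Assumed category label; confirm against unique(crime_data$type)
bike_label <- "Theft of Bicycle"

# Keep bike thefts only, then count thefts per neighbourhood and hour
bike_counts <- toy_crime %>%
  filter(type == bike_label) %>%
  count(neighbourhood, hour, name = "n_thefts") %>%
  arrange(desc(n_thefts))

bike_counts
```

Applied to the real data, the same `filter()` call would go before `select(-type, -minute)`, so that the remaining rows describe bike thefts only.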


## Methods & Plan

**Add Methods:**



After inferential analysis of our data, we expect to identify the Vancouver neighbourhood with the highest frequency of bike thefts. We will then visualize the frequency of thefts by time of day. These findings could help cyclists take precautions to keep their property safe, or perhaps avoid cycling in specific areas or at specific times.
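
The inferential method has not been settled yet. One possible approach, offered only as a sketch, is a bootstrap confidence interval for the proportion of bike thefts that occur in a given neighbourhood, using the `infer` package we already load. The data below (`toy_thefts`), the target neighbourhood, and the helper column `in_target` are all hypothetical and chosen for illustration.

```r
library(dplyr)
library(infer)

set.seed(2022)

# Made-up bike-theft records for illustration only; NOT real VPD data
toy_thefts <- tibble(
  neighbourhood = c(rep("Fairview", 30), rep("Marpole", 20), rep("Kitsilano", 50))
) %>%
  # Hypothetical indicator: did the theft occur in the target neighbourhood?
  mutate(in_target = factor(if_else(neighbourhood == "Kitsilano", "yes", "no"),
                            levels = c("yes", "no")))

# Point estimate: share of thefts in the target neighbourhood
p_hat <- mean(toy_thefts$in_target == "yes")

# Bootstrap distribution of the sample proportion
boot_props <- toy_thefts %>%
  specify(response = in_target, success = "yes") %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "prop")

# 95% percentile confidence interval for the proportion
ci <- get_confidence_interval(boot_props, level = 0.95, type = "percentile")
ci
```

Repeating this for each neighbourhood, or for each hour within one neighbourhood, would show whether the apparent differences in theft frequency exceed what sampling variation alone would explain.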



**add future progression/research**

What do you expect to find?

What impact could such findings have?

What future questions could this lead to?


