# Baltimore shootings analysis

By [Christine Zhang](mailto:czhang@baltsun.com) and [Caroline Pate](mailto:cpate@baltsun.com)

The Baltimore Sun conducted the following analysis of Baltimore Police Department Part 1 Victim Based Crime Data posted on [Open Baltimore](https://data.baltimorecity.gov/Public-Safety/BPD-Part-1-Victim-Based-Crime-Data/wsfq-mvij) on October 17, 2018. The report covers incidents between January 1, 2012, and October 13, 2018.

The analysis provided information for the October 18, 2018 Baltimore Sun story titled ["Baltimore's last weekend without a shooting was the weekend Freddie Gray was arrested"](http://www.baltimoresun.com/news/maryland/crime/bs-md-ci-violence-stats-20181018-story.html).

Here are the findings presented in the story:

- There were zero shootings in Baltimore from April 10 to 12, 2015, the last weekend in which the city had no shootings.
- Twelve shootings occurred on Sept. 24, 2016, the last day with at least as many shootings as the 11 recorded on Oct. 16, 2018.
- Since 2012, the longest stretch of time Baltimore went without a shooting was an eight-day period from February 12-19, 2014.

## Import R data analysis libraries

In [1]:
suppressMessages(library('tidyverse'))
suppressMessages(library('lubridate'))

## Data processing:

### Read in BPD Part 1 Victim Based Crime Data for analysis

In [2]:
crime <- read.csv("BPD_Part_1_Victim_Based_Crime_Data.csv", stringsAsFactors = F)

Create a column, `date`, which formats the `CrimeDate` field as a date. Create a new dataframe, `shootings`, which filters the data to include only shootings and homicides for which a firearm was the weapon used.

In [3]:
crime$date <- mdy(crime$CrimeDate)
shootings <- crime %>% filter((Description == 'SHOOTING' | Description == 'HOMICIDE') & Weapon == 'FIREARM') %>%
                       arrange(date) 
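
As a minimal sketch of what this filter keeps and drops, the same logic can be run on a few toy rows. The rows below are made up for illustration, and the sketch uses base R (`as.Date()` with an assumed month/day/year format) instead of `mdy()` and `filter()` so that it runs without extra packages.

```r
# Toy rows standing in for the BPD export; the values are made up.
toy <- data.frame(
  CrimeDate   = c("10/13/2018", "10/13/2018", "10/12/2018", "10/12/2018"),
  Description = c("SHOOTING", "HOMICIDE", "HOMICIDE", "ROBBERY - STREET"),
  Weapon      = c("FIREARM", "KNIFE", "FIREARM", "FIREARM"),
  stringsAsFactors = FALSE
)

# Base-R stand-in for lubridate::mdy(), assuming month/day/year strings.
toy$date <- as.Date(toy$CrimeDate, format = "%m/%d/%Y")

# Keep shootings and homicides, but only when the weapon was a firearm.
keep <- toy$Description %in% c("SHOOTING", "HOMICIDE") & toy$Weapon == "FIREARM"
toy_shootings <- toy[keep, ]
toy_shootings <- toy_shootings[order(toy_shootings$date), ]

print(toy_shootings[, c("date", "Description", "Weapon")])
```

In this toy data the knife homicide and the armed robbery are both excluded, leaving one firearm homicide and one shooting.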

### Group the `shootings` dataframe by date to get the number of shootings that occurred on each day

This is saved into another dataframe, `shootings.bydate`. The number of shootings on each date is provided in a column labeled `n`.

In [4]:
shootings.bydate <- shootings %>% group_by(date) %>% summarise(n = sum(Total.Incidents, na.rm = T))
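
The grouping step can be illustrated on toy data. This is a sketch using base R's `aggregate()` rather than `group_by()` and `summarise()`; the rows are made up, and only the `Total.Incidents` column name is taken from the notebook.

```r
# Made-up incident rows: two on one day, one on the next.
toy_shootings <- data.frame(
  date = as.Date(c("2018-10-12", "2018-10-12", "2018-10-13")),
  Total.Incidents = c(1, 1, 1)
)

# Sum Total.Incidents within each date, then rename the result column to n.
toy_bydate <- aggregate(Total.Incidents ~ date, data = toy_shootings, FUN = sum)
names(toy_bydate)[2] <- "n"

print(toy_bydate)
```

Summing `Total.Incidents` rather than counting rows means a row that records more than one incident would contribute its full count to the daily total.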

### Add extra rows for days when no shootings occurred and fill the `n` column in with zeros for those days

First, get the time period (minimum date and maximum date) of the `shootings.bydate` dataframe:

In [5]:
time.min <- shootings.bydate$date[1]
time.max <- shootings.bydate$date[length(shootings.bydate$date)]

print(time.min)
print(time.max)

[1] "2012-01-01"
[1] "2018-10-13"


Then, create a dataframe, `all.dates.frame`, containing all the days spanning that time period:

In [6]:
all.dates.frame <- data.frame(list(date = seq(time.min, time.max, by="day")))

Merge this dataframe with the `shootings.bydate` dataframe to get the dates with no shootings. This is saved into a dataframe called `merged.data`:

In [7]:
merged.data <- merge(all.dates.frame, shootings.bydate , all = T)

Extract the day of week using the `wday()` function in the `lubridate` R package. This is saved as the column `dow`:

In [8]:
merged.data$dow <- wday(merged.data$date, label=TRUE)

Finally, replace the NAs in the `n` column with zeros for the days with no shootings:

In [9]:
merged.data$n <- ifelse(is.na(merged.data$n) == T, 0, merged.data$n)
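
The zero-filling steps above can be sketched end to end on a toy table. The dates and counts here are made up; the calls (`seq()`, `merge()`, `is.na()`) are the same base-R functions the notebook uses.

```r
# Made-up daily totals with a two-day gap between them.
toy_bydate <- data.frame(
  date = as.Date(c("2018-10-10", "2018-10-13")),
  n = c(2, 1)
)

# Every calendar day from the first to the last date with a shooting.
all_dates <- data.frame(
  date = seq(min(toy_bydate$date), max(toy_bydate$date), by = "day")
)

# Outer join: days missing from toy_bydate come back with n = NA.
toy_merged <- merge(all_dates, toy_bydate, by = "date", all = TRUE)

# Replace the NAs with zeros.
toy_merged$n[is.na(toy_merged$n)] <- 0

print(toy_merged)
```

Because the date range runs from the first to the last day with a recorded shooting, any shooting-free days after the final recorded shooting would not appear in the merged table.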

## Data analysis:

### Finding: There were zero shootings in Baltimore from April 10 to 12, 2015, the last weekend in which the city had no shootings.

For each date in the `merged.data` dataframe, calculate the sum of the shootings in the three days prior to that date. This is saved into the variable `lag3_n`.

In [10]:
merged.data <- merged.data %>% mutate(lag3_n = lag(n, 1)+lag(n, 2)+lag(n, 3))

To determine which weekends (defined as the three-day period from Friday to Sunday) saw no shootings, filter `merged.data` to rows where `lag3_n` is 0 and `dow` is Monday. A Monday's `lag3_n` is the total for the preceding Friday, Saturday and Sunday.

In [11]:
mondays <- merged.data %>% filter(lag3_n == 0 & dow == 'Mon') 

In [12]:
mondays

date,n,dow,lag3_n
2012-01-16,3,Mon,0
2013-02-18,1,Mon,0
2013-03-25,1,Mon,0
2013-07-22,9,Mon,0
2014-02-17,0,Mon,0
2014-03-24,1,Mon,0
2015-02-09,1,Mon,0
2015-04-13,2,Mon,0


There were no shootings over the three-day period encompassing Friday 2015-04-10, Saturday 2015-04-11 and Sunday 2015-04-12. (There were 2 shootings on Monday 2015-04-13.)
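
The lag-and-filter logic can be sketched on just the four days discussed above, using the counts reported in this analysis. The `shift()` helper is hypothetical, written here as a base-R stand-in for `dplyr::lag()`, and the weekday test uses `format(date, "%u")` rather than `wday()`.

```r
# Counts for April 10-13, 2015, as reported in the analysis above.
toy <- data.frame(
  date = as.Date("2015-04-10") + 0:3,
  n = c(0, 0, 0, 2)
)

# Hypothetical helper: shift a vector down by k positions, padding with NA
# (the same idea as dplyr::lag(x, k)).
shift <- function(x, k) c(rep(NA, k), head(x, -k))

# Sum of the three preceding days; NA where fewer than three days exist.
toy$lag3_n <- shift(toy$n, 1) + shift(toy$n, 2) + shift(toy$n, 3)

# "%u" gives the ISO weekday number (1 = Monday), independent of locale.
toy$is_monday <- format(toy$date, "%u") == "1"

# which() drops the rows where lag3_n is NA.
quiet_weekend_mondays <- toy[which(toy$lag3_n == 0 & toy$is_monday), ]
print(quiet_weekend_mondays)
```

Only the Monday row survives the filter, since it is the only row with three full days of history and a zero total for them.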

### Finding: Twelve shootings occurred on Sept. 24, 2016, the last day with at least as many shootings as the 11 recorded on Oct. 16, 2018.

Use `table()` to view the distribution of shooting incidents in the `shootings.bydate` dataframe.

In [13]:
table(shootings.bydate$n)


  1   2   3   4   5   6   7   8   9  10  11  12 
641 524 331 212 113  76  48  21  13   4   3   2 

Filter to see the days with the maximum `n`, which in this case is 12.

In [14]:
shootings.bydate %>% filter(n == 12)

date,n
2015-05-20,12
2016-09-24,12


There were 12 shootings on Sept. 24, 2016, the more recent of the two days with that count.
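
One way to express "the last day with at least as many shootings as a given count" is to filter on a threshold and take the latest date. The sketch below is an illustration rather than the notebook's own code: it uses only the two 12-shooting days listed above, and `threshold` is a name introduced here.

```r
# The two 12-shooting days listed in the output above.
top_days <- data.frame(
  date = as.Date(c("2015-05-20", "2016-09-24")),
  n = c(12, 12)
)

# Number of shootings on Oct. 16, 2018, per the story.
threshold <- 11

# Keep days at or above the threshold, then take the most recent one.
at_least <- top_days[top_days$n >= threshold, ]
last_day <- max(at_least$date)

print(last_day)
```

Run against the full `shootings.bydate` dataframe, the same filter would also pick up the three days with 11 shootings shown in the `table()` output.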

### Finding: Since 2012, the longest stretch of time Baltimore went without a shooting was an eight-day period from February 12-19, 2014.

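Create a variable, `days_between_shootings`, that represents the number of shooting-free days between consecutive dates in the `shootings.bydate` dataframe. This is calculated using the `difftime()` function in base R, subtracting 1 so that two consecutive days with shootings have zero days between them.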

In [15]:
shootings.bydate$days_between_shootings <- difftime(shootings.bydate$date, lag(shootings.bydate$date)) - 1

In [16]:
shootings.bydate %>% filter(!is.na(days_between_shootings)) %>% arrange(desc(days_between_shootings)) %>% head()

date,n,days_between_shootings
2014-02-20,1,8 days
2018-08-26,5,7 days
2014-03-24,1,6 days
2015-02-09,1,6 days
2013-03-25,1,5 days
2013-07-22,9,5 days
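
The arithmetic behind the top row can be checked by hand. An eight-day gap ending before 2014-02-20 implies the previous date with a shooting was 2014-02-11; the sketch below uses base R's `diff()` on those two dates, and `gap_days` is a name introduced here.

```r
# The two consecutive dates with shootings on either side of the gap.
dates <- as.Date(c("2014-02-11", "2014-02-20"))

# The dates are 9 days apart; subtracting 1 leaves the shooting-free
# days strictly between them (Feb. 12 through Feb. 19).
gap_days <- as.numeric(diff(dates)) - 1

print(gap_days)
```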


Check this with the `merged.data` dataframe (only the first rows of the output are shown):

In [17]:
merged.data %>% filter(year(date) == 2014 & month(date) == 2) %>% select(date, n)

date,n
2014-02-01,0
2014-02-02,0
2014-02-03,1
2014-02-04,0
2014-02-05,2
2014-02-06,0
2014-02-07,2
2014-02-08,1
2014-02-09,0
2014-02-10,1


The eight-day period from 2014-02-12 to 2014-02-19 had zero recorded shootings.
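
As a further cross-check, the longest run of zero-shooting days can be found directly from a zero-filled daily table such as `merged.data`, using run-length encoding. This is a sketch of an alternative approach, not the method used above, and the counts and dates below are made up.

```r
# Made-up daily counts, purely for illustration.
toy <- data.frame(
  date = as.Date("2020-01-01") + 0:9,
  n = c(1, 0, 0, 2, 0, 0, 0, 0, 1, 0)
)

# rle() collapses the zero / non-zero indicator into runs of equal values.
runs <- rle(toy$n == 0)
run_end <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# Among the runs of zero-shooting days, pick the longest.
zero_runs <- which(runs$values)
longest <- zero_runs[which.max(runs$lengths[zero_runs])]

gap <- data.frame(
  start = toy$date[run_start[longest]],
  end = toy$date[run_end[longest]],
  days = runs$lengths[longest]
)
print(gap)
```

In the toy data the longest quiet stretch is four days, from the fifth through the eighth row. If several runs tie for the longest, `which.max()` returns the earliest one.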