<center><h1>Introduction to dplyr Package</h1></center>

# 1. The _dplyr_ Package

  - "dplyr" is short for "data pliers" (the "d" is for data frames)
  - R package for aggregating, summarizing, reshaping, and generally wrangling data
  - Extremely popular in the R community
  - Authored by Hadley Wickham
  - Part of the "tidyverse" set of packages

## 1.1 The _dplyr_ "Verbs"

  - The _dplyr_ package is organized around a set of "verbs", which are functions that operate on data
    + `filter()`
    + `summarise()`
    + `select()`
    + `mutate()`
    + `arrange()`
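
As a rough illustration of what each verb does, here is a minimal sketch on a small made-up data frame. The name `toy_df` and its columns are assumptions for illustration only and are not part of the arrests data used later.

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
toy_df <- data.frame(
    name  = c("a", "b", "c", "d"),
    group = c("x", "x", "y", "y"),
    value = c(10, 20, 30, 40)
)

kept       <- filter(toy_df, value > 15)                  # keep rows that match a condition
narrowed   <- select(toy_df, name, value)                 # keep only some columns
augmented  <- mutate(toy_df, double_value = value * 2)    # add a column computed from others
sorted     <- arrange(toy_df, desc(value))                # reorder rows, largest value first
summary_df <- summarise(toy_df, mean_value = mean(value)) # collapse to a one-row summary
```

Each verb takes a data frame as its first argument and returns a new data frame, leaving `toy_df` unchanged.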

## 1.2 The "Pipe" Operator

  - The pipe operator `%>%` passes the object on its left into the function call on its right, as that call's first argument
    + `x %>% f(y)` is the same as `f(x, y)`
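
The equivalence above can be checked directly. This is a small sketch using a made-up numeric vector `x` (an assumption for illustration) and the base R functions `round()` and `sum()`.

```r
library(dplyr)   # loading dplyr makes %>% available

x <- c(1.234, 5.678)        # hypothetical example values

piped  <- x %>% round(1)    # x becomes the first argument of round()
direct <- round(x, 1)       # the same call written without the pipe

identical(piped, direct)    # both forms give the same result

# Pipes can be chained and are read left to right
chained <- x %>% round(1) %>% sum()
nested  <- sum(round(x, 1))  # the nested form reads inside-out
```

The chained form tends to be easier to read once several steps are involved, which is why it is used for the `filter()` examples.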
    

# 2. `filter()` Examples with _dplyr_

In [3]:
library(dplyr)           # load the package

In [4]:
arrests_df <- read.csv("data/pvd_arrests_2020-10-03.csv")

In [7]:
arrests_df %>% 
    filter(gender == "Male") 

arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
2019-08-24T02:23:00.0,2019,8,Male,White,NonHispanic,1981,37,No Permanent Address,providence,Rhode Island,,,,,2019-00084142,"YGonzalez, LTaveras",pvd2218242150382148273
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2.1,Chemical Test Refusal,1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2,Driving Under the Influence of Liqour or Drugs (=>.08<.1),1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083963,JHanley,pvd1675234703933765967
2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,11-32-1,OBSTRUCTING OFFICER IN EXECUTION OF DUTY,1,2019-00083963,JHanley,pvd1675234703933765967
2019-08-23T14:42:00.0,2019,8,Male,White,Hispanic,1998,20,LAURA ST,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00083892,"JCotugno, ALevesque, JButen, JJohnson",pvd17953747948212880432


## 2.1 Comparing `filter()` with Logical Indexing

In [8]:
# dplyr approach
arrests_df %>% 
    filter(gender == "Male")


# "base" R approach
is_male <- arrests_df$gender == "Male"      # logical vector: TRUE where gender is "Male"

arrests_df[is_male, ]                       # keep only the rows where is_male is TRUE

arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
2019-08-24T02:23:00.0,2019,8,Male,White,NonHispanic,1981,37,No Permanent Address,providence,Rhode Island,,,,,2019-00084142,"YGonzalez, LTaveras",pvd2218242150382148273
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2.1,Chemical Test Refusal,1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2,Driving Under the Influence of Liqour or Drugs (=>.08<.1),1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083963,JHanley,pvd1675234703933765967
2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,11-32-1,OBSTRUCTING OFFICER IN EXECUTION OF DUTY,1,2019-00083963,JHanley,pvd1675234703933765967
2019-08-23T14:42:00.0,2019,8,Male,White,Hispanic,1998,20,LAURA ST,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00083892,"JCotugno, ALevesque, JButen, JJohnson",pvd17953747948212880432


,arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
1,2019-08-24T02:23:00.0,2019,8,Male,White,NonHispanic,1981,37,No Permanent Address,providence,Rhode Island,,,,,2019-00084142,"YGonzalez, LTaveras",pvd2218242150382148273
8,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2.1,Chemical Test Refusal,1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
9,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2,Driving Under the Influence of Liqour or Drugs (=>.08<.1),1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
10,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
11,2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
12,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
13,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
14,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083963,JHanley,pvd1675234703933765967
15,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,11-32-1,OBSTRUCTING OFFICER IN EXECUTION OF DUTY,1,2019-00083963,JHanley,pvd1675234703933765967
18,2019-08-23T14:42:00.0,2019,8,Male,White,Hispanic,1998,20,LAURA ST,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00083892,"JCotugno, ALevesque, JButen, JJohnson",pvd17953747948212880432
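
The two approaches are not fully interchangeable. In the base R output above, the original row labels (1, 8, 9, ...) are kept. A second difference shows up when the column contains missing values. The sketch below uses a made-up data frame `toy_df` (an assumption for illustration; whether `gender` in the arrests data contains `NA` is not shown here).

```r
library(dplyr)

# Hypothetical toy data with one missing gender
toy_df <- data.frame(
    gender = c("Male", NA, "Female", "Male"),
    age    = c(30, 41, 25, 19)
)

# dplyr: rows where the condition is NA are dropped
via_filter <- toy_df %>%
    filter(gender == "Male")

# base R: an NA in the logical vector produces a row filled with NA
is_male   <- toy_df$gender == "Male"      # TRUE NA FALSE TRUE
via_index <- toy_df[is_male, ]

# wrapping the condition in which() discards the NA positions
via_which <- toy_df[which(is_male), ]

nrow(via_filter)   # 2
nrow(via_index)    # 3
nrow(via_which)    # 2
```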


## 2.2 `filter()` Examples (cont.)

In [11]:
# Here we store the result of filter() in a new data.frame

arrests_males <- arrests_df %>%
    filter(gender == "Male")                

In [12]:
head(arrests_males)

,arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
1,2019-08-24T02:23:00.0,2019,8,Male,White,NonHispanic,1981,37,No Permanent Address,providence,Rhode Island,,,,,2019-00084142,"YGonzalez, LTaveras",pvd2218242150382148273
2,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2.1,Chemical Test Refusal,1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
3,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-27-2,Driving Under the Influence of Liqour or Drugs (=>.08<.1),1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
4,2019-08-23T23:43:00.0,2019,8,Male,Black,NonHispanic,1991,28,PUBLIC ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00084056,"CVingi, SCooney",pvd6431558757894418021
5,2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
6,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829


## 2.2 Using `filter()` with Multiple Conditions

In [14]:
arrests_teen_male <- arrests_df %>%
    filter(
        gender == "Male",
        age < 20
    )

head(arrests_teen_male)

,arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
1,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
2,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
3,2019-08-21T13:09:00.0,2019,8,Male,White,Hispanic,1999,19,MELISSA AVE,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083170,"ITavarez, IYousif, CBrown, EDelgado",pvd5047836359365815220
4,2019-08-21T13:09:00.0,2019,8,Male,White,Hispanic,1999,19,MELISSA AVE,Providence,,RI Statute Violation,11-45-1,DISORDERLY CONDUCT,1,2019-00083170,"ITavarez, IYousif, CBrown, EDelgado",pvd5047836359365815220
5,2019-08-21T13:09:00.0,2019,8,Male,White,Hispanic,1999,19,MELISSA AVE,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083170,"ITavarez, IYousif, CBrown, EDelgado",pvd5047836359365815220
6,2019-08-20T02:00:00.0,2019,8,Male,White,Hispanic,1999,19,MINK RD,Providence,Rhode Island,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00078616,"JGagnon, RMalloy",pvd1076862233562848683
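
Conditions separated by commas inside `filter()` are combined with a logical AND, so a row is kept only when every condition holds. The sketch below checks this on a made-up data frame `toy_df` (an assumption for illustration, with the same column names as the arrests data).

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
toy_df <- data.frame(
    gender = c("Male", "Male", "Female", "Female"),
    age    = c(19, 35, 18, 40)
)

# Comma-separated conditions
with_commas <- toy_df %>%
    filter(
        gender == "Male",
        age < 20
    )

# The same filter written with the elementwise AND operator
with_and <- toy_df %>%
    filter(gender == "Male" & age < 20)

identical(with_commas, with_and)   # both keep only the 19-year-old male
```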


### 2.2.1 Using `filter()` with Logical OR

  - Recall that `||` is the logical OR for single (length-one) logical values
  - The `|` operator plays the same role, but works elementwise on vectors, so it is the one to use on columns inside `filter()`
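
A short base R sketch of the difference, using two made-up logical vectors (`young` and `old` are assumptions for illustration):

```r
# Hypothetical logical vectors, one entry per row of some data frame
young <- c(TRUE, FALSE, FALSE)
old   <- c(FALSE, FALSE, TRUE)

# `|` compares position by position and returns one result per element
elementwise <- young | old          # TRUE FALSE TRUE

# `||` is meant for single values, such as the condition of an if statement
single <- young[1] || old[1]        # TRUE
```

Because a column holds one value per row, `filter()` needs the elementwise form. Recent versions of R signal an error when `||` is given vectors longer than one.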

In [16]:
young_old_male <- arrests_df %>%
    filter(
        gender == "Male",
        age < 25 | age > 65  
    )

head(young_old_male)

,arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
1,2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
2,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
3,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
4,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083963,JHanley,pvd1675234703933765967
5,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,11-32-1,OBSTRUCTING OFFICER IN EXECUTION OF DUTY,1,2019-00083963,JHanley,pvd1675234703933765967
6,2019-08-23T14:42:00.0,2019,8,Male,White,Hispanic,1998,20,LAURA ST,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00083892,"JCotugno, ALevesque, JButen, JJohnson",pvd17953747948212880432
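
Keeping the OR condition as its own comma-separated argument matters. In R, `&` binds more tightly than `|`, so writing everything in one expression without parentheses changes which rows are kept. The sketch below uses a made-up data frame `toy_df` (an assumption for illustration).

```r
library(dplyr)

# Hypothetical toy data, invented for this sketch
toy_df <- data.frame(
    gender = c("Male", "Female", "Male", "Female"),
    age    = c(20, 70, 40, 22)
)

# Male AND (young OR old): the OR is evaluated as one condition
grouped <- toy_df %>%
    filter(
        gender == "Male",
        age < 25 | age > 65
    )

# Without parentheses this reads as (Male AND young) OR old,
# so the 70-year-old female is kept as well
ungrouped <- toy_df %>%
    filter(gender == "Male" & age < 25 | age > 65)

# Parentheses restore the intended meaning
parenthesized <- toy_df %>%
    filter(gender == "Male" & (age < 25 | age > 65))

nrow(grouped)     # 1
nrow(ungrouped)   # 2
```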


### 2.2.2 Using `filter()` with Logical OR (cont.)

In [18]:
ptk_young_old_male <- arrests_df %>%
    filter(
        gender == "Male",
        age < 25 | age > 65 | from_city == "Pawtucket"
    )

head(ptk_young_old_male)

,arrest_date,year,month,gender,race,ethnicity,year_of_birth,age,from_address,from_city,from_state,statute_type,statute_code,statute_desc,counts,case_number,arresting_officers,arrestee_id
,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<int>,<int>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<int>,<chr>,<chr>,<chr>
1,2019-08-23T21:38:00.0,2019,8,Male,White,Hispanic,1996,22,DOUGLAS,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00084031,"RCarlin, SKennedy",pvd15614289459563584867
2,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-27-4,"Reckless Driving, Drag Racing - Attempting to Elude",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
3,2019-08-23T19:50:00.0,2019,8,Male,White,Hispanic,2000,19,MOWRY ST,Providence,,RI Statute Violation,31-11-18,"Driving after Denial, Suspension or Revocation of License",1,2019-00083996,"SCampbell, RMalloy",pvd900460037611487829
4,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,12-7-10,RESISTING LEGAL OR ILLEGAL ARREST,1,2019-00083963,JHanley,pvd1675234703933765967
5,2019-08-23T18:26:00.0,2019,8,Male,White,Hispanic,1996,23,CUMERFORD ST,Providence,,RI Statute Violation,11-32-1,OBSTRUCTING OFFICER IN EXECUTION OF DUTY,1,2019-00083963,JHanley,pvd1675234703933765967
6,2019-08-23T14:42:00.0,2019,8,Male,White,Hispanic,1998,20,LAURA ST,Providence,,RI Statute Violation,11-44-1,DOMESTIC-VANDALISM/MALICIOUS INJURY TO PROP,1,2019-00083892,"JCotugno, ALevesque, JButen, JJohnson",pvd17953747948212880432
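
String comparisons with `==` are case-sensitive, and the output above shows `from_city` spelled both "providence" and "Providence". If other cities are recorded as inconsistently, an exact match on "Pawtucket" could miss rows. One possible workaround, sketched here on a made-up data frame `toy_df` (an assumption for illustration, not the arrests data), is to lower-case the column before comparing.

```r
library(dplyr)

# Hypothetical toy data with inconsistent capitalization
toy_df <- data.frame(
    from_city = c("Pawtucket", "pawtucket", "Providence", "PAWTUCKET"),
    age       = c(30, 22, 41, 67)
)

# Exact match: only the row spelled exactly "Pawtucket"
exact_match <- toy_df %>%
    filter(from_city == "Pawtucket")

# Case-insensitive match: lower-case the column, then compare
loose_match <- toy_df %>%
    filter(tolower(from_city) == "pawtucket")

nrow(exact_match)   # 1
nrow(loose_match)   # 3
```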
