<center><h1>Using the <code>select()</code> Function in dplyr</h1></center>

# 1. Using `select()` to Extract Columns
  - Recall that `filter()` keeps the rows that satisfy a condition
  - Similarly, `select()` keeps the columns you name
  - These functions can be "chained" with the pipe operator, `%>%`
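
To make the row/column distinction concrete, here is a minimal sketch on a small made-up data frame. The name `toy` and all of its values are hypothetical and are not taken from the arrests data.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
    id     = c("a", "b", "c"),
    age    = c(25, 40, 31),
    gender = c("Female", "Male", "Female")
)

# filter() keeps the rows that satisfy a condition; all columns remain
toy_rows <- toy %>% filter(age > 30)

# select() keeps the named columns; all rows remain
toy_cols <- toy %>% select(id, age)

dim(toy_rows)   # 2 rows, 3 columns
dim(toy_cols)   # 3 rows, 2 columns
```
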

## 1.1 Example of `select()`

In [1]:
library(dplyr)

arrests <- read.csv("data/pvd_arrests_2020-10-03.csv")


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union




In [2]:
arrests_subset <- arrests %>% 
    select(arrestee_id, age, gender, statute_desc)

head(arrests_subset)

| | arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|---|
| 1 | pvd2218242150382148273 | 37 | Male | |
| 2 | pvd15166785558364246202 | 25 | | Driving after Denial, Suspension or Revocation of License |
| 3 | pvd3142917706201385905 | 34 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 4 | pvd3142917706201385905 | 34 | Female | DISORDERLY CONDUCT |
| 5 | pvd460449304532374599 | 18 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 6 | pvd460449304532374599 | 18 | Female | DISORDERLY CONDUCT |


### 1.1.1 Comparing `select()` to `[, ]` notation

In [3]:
# dplyr example
arrests %>% 
    select(arrestee_id, age, gender, statute_desc)


# equivalent example in "base" R
cols <- c("arrestee_id", "age", "gender", "statute_desc")

arrests[, cols]

| arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|
| pvd2218242150382148273 | 37 | Male | |
| pvd15166785558364246202 | 25 | | Driving after Denial, Suspension or Revocation of License |
| pvd3142917706201385905 | 34 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| pvd3142917706201385905 | 34 | Female | DISORDERLY CONDUCT |
| pvd460449304532374599 | 18 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| pvd460449304532374599 | 18 | Female | DISORDERLY CONDUCT |
| pvd460449304532374599 | 18 | Female | OBSTRUCTING OFFICER IN EXECUTION OF DUTY |
| pvd6431558757894418021 | 28 | Male | Chemical Test Refusal |
| pvd6431558757894418021 | 28 | Male | Driving Under the Influence of Liqour or Drugs (=>.08<.1) |
| pvd6431558757894418021 | 28 | Male | Driving after Denial, Suspension or Revocation of License |

| arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|
| pvd2218242150382148273 | 37 | Male | |
| pvd15166785558364246202 | 25 | | Driving after Denial, Suspension or Revocation of License |
| pvd3142917706201385905 | 34 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| pvd3142917706201385905 | 34 | Female | DISORDERLY CONDUCT |
| pvd460449304532374599 | 18 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| pvd460449304532374599 | 18 | Female | DISORDERLY CONDUCT |
| pvd460449304532374599 | 18 | Female | OBSTRUCTING OFFICER IN EXECUTION OF DUTY |
| pvd6431558757894418021 | 28 | Male | Chemical Test Refusal |
| pvd6431558757894418021 | 28 | Male | Driving Under the Influence of Liqour or Drugs (=>.08<.1) |
| pvd6431558757894418021 | 28 | Male | Driving after Denial, Suspension or Revocation of License |
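
One difference between the two notations shows up when a single column is requested. The sketch below uses a hypothetical `toy` data frame (invented for illustration) and assumes a plain `data.frame`, such as the one `read.csv()` returns: `select()` keeps the result as a data frame, while `[, ]` simplifies it to a vector unless `drop = FALSE` is supplied.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
    id  = c("a", "b", "c"),
    age = c(25, 40, 31)
)

# select() returns a data frame, even for a single column
one_col_dplyr <- toy %>% select(age)

# `[, ]` on a plain data.frame simplifies a single column to a vector ...
one_col_vec <- toy[, "age"]

# ... unless drop = FALSE is supplied
one_col_base <- toy[, "age", drop = FALSE]

is.data.frame(one_col_dplyr)   # TRUE
is.data.frame(one_col_vec)     # FALSE
is.data.frame(one_col_base)    # TRUE
```
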


## 1.2 Example of `select()` (cont.)

In [4]:
arrests_vio <- arrests %>%
    select(
        arrestee_id,
        age,
        gender,
        statute_desc
    )

In [5]:
head(arrests_vio)           # see the first six rows of the new data frame

| | arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|---|
| 1 | pvd2218242150382148273 | 37 | Male | |
| 2 | pvd15166785558364246202 | 25 | | Driving after Denial, Suspension or Revocation of License |
| 3 | pvd3142917706201385905 | 34 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 4 | pvd3142917706201385905 | 34 | Female | DISORDERLY CONDUCT |
| 5 | pvd460449304532374599 | 18 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 6 | pvd460449304532374599 | 18 | Female | DISORDERLY CONDUCT |


# 2. Chaining _dplyr_ Operators
  - Chaining is one key reason for _dplyr_'s popularity
  - _dplyr_ verbs/functions are "composable"
    + $(f \circ g)(x) = f(g(x))$
    + In pipe form, `x %>% g() %>% f()` is the same as `f(g(x))`
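
The composition idea can be checked directly. This is a minimal sketch on a hypothetical `toy` data frame (invented for illustration), comparing a piped chain with the equivalent nested call.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
    id     = c("a", "b", "c"),
    age    = c(25, 40, 31),
    gender = c("Female", "Male", "Female")
)

# Piped form: read top to bottom, filter() runs first, then select()
piped <- toy %>%
    filter(gender == "Female") %>%
    select(id, age)

# Nested form: the same composition written as f(g(x))
nested <- select(filter(toy, gender == "Female"), id, age)

identical(piped, nested)   # TRUE
```

The piped form lists the steps in the order they run, which tends to be easier to read than the inside-out nested form as chains get longer.
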

In [6]:
female_vio <- arrests %>%
    filter(gender == "Female") %>%
    select(arrestee_id, age, gender, statute_desc)

head(female_vio)

| | arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|---|
| 1 | pvd3142917706201385905 | 34 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 2 | pvd3142917706201385905 | 34 | Female | DISORDERLY CONDUCT |
| 3 | pvd460449304532374599 | 18 | Female | RESISTING LEGAL OR ILLEGAL ARREST |
| 4 | pvd460449304532374599 | 18 | Female | DISORDERLY CONDUCT |
| 5 | pvd460449304532374599 | 18 | Female | OBSTRUCTING OFFICER IN EXECUTION OF DUTY |
| 6 | pvd8555094992612905738 | 45 | Female | VANDALISM/MALICIOUS INJURY TO PROPERTY |


## 2.1 More Chaining

In [7]:
female_midage <- arrests %>%
    filter(
        gender == "Female",
        age > 45,
        statute_desc != ""
    ) %>%
    select(
        arrestee_id, 
        age, 
        gender,
        statute_desc
    ) %>%
    arrange(
        arrestee_id
    )

head(female_midage)

| | arrestee_id (`<chr>`) | age (`<int>`) | gender (`<chr>`) | statute_desc (`<chr>`) |
|---|---|---|---|---|
| 1 | pvd10124563078769507165 | 52 | Female | BENCH WARRANT ISSUED FROM SUPERIOR COURT |
| 2 | pvd10764271506246443138 | 53 | Female | DOMESTIC-SIMPLE ASSAULT/BATTERY |
| 3 | pvd10857153000728613606 | 54 | Female | Driving after Denial, Suspension or Revocation of License |
| 4 | pvd10879521477909738425 | 48 | Female | DISORDERLY CONDUCT |
| 5 | pvd10909858543687069510 | 49 | Female | |
| 6 | pvd11014001617297730559 | 47 | Female | Driving after Denial, Suspension or Revocation of License |
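
The order of the steps in a chain matters: a later verb can only use columns that earlier verbs have kept. The sketch below illustrates this on a hypothetical `toy` data frame (invented for illustration). It assumes no separate object named `gender` exists in the workspace, since `filter()` would otherwise fall back to that object instead of raising an error.

```r
library(dplyr)

# Hypothetical toy data, invented for illustration only
toy <- data.frame(
    id     = c("a", "b", "c"),
    age    = c(25, 40, 31),
    gender = c("Female", "Male", "Female")
)

# Works: gender is still present when filter() runs
ok <- toy %>%
    filter(gender == "Female") %>%
    select(id, age)

# Fails: select() has already dropped gender before filter() needs it.
# tryCatch() turns the error into a message so the script keeps running.
result <- tryCatch(
    toy %>%
        select(id, age) %>%
        filter(gender == "Female"),
    error = function(e) "error: gender is no longer available"
)

result
```

Filtering before selecting, as in the chain above, avoids this problem whenever the filter condition uses a column that is not part of the final output.
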
