## Exploring the relationship between gender and policing

Does the gender of a driver have an impact on police behavior during a traffic stop? In this chapter, you will explore that question while practicing filtering, grouping, method chaining, Boolean math, string methods, and more!

### Examining traffic violations
Before comparing the violations being committed by each gender, you should examine the violations committed by all drivers to get a baseline understanding of the data.

In this exercise, you'll count the unique values in the violation column, and then separately express those counts as proportions.

In [1]:
# Import the pandas library as pd
import pandas as pd

# Read 'Traffic stops in Rhode Island.csv' into a DataFrame named ri
ri = pd.read_csv('Traffic stops in Rhode Island.csv')

# Count the unique values in 'violation'
print(ri['violation'].value_counts())

# Express the counts as proportions
print(ri['violation'].value_counts(normalize = True))

Speeding               39705
Moving violation       12811
Equipment               7414
Other                   3755
Registration/plates     2542
Seat belt                376
Name: violation, dtype: int64
Speeding               0.596144
Moving violation       0.192349
Equipment              0.111316
Other                  0.056379
Registration/plates    0.038166
Seat belt              0.005645
Name: violation, dtype: float64
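
To make the link between the two printouts concrete, here is a minimal sketch on a small made-up Series (the values are hypothetical, not taken from the dataset). It suggests that `normalize=True` is equivalent to dividing each count by the total number of non-missing values.

```python
import pandas as pd

# Hypothetical toy data standing in for ri['violation']
violations = pd.Series(
    ['Speeding', 'Speeding', 'Speeding', 'Equipment', 'Other'],
    name='violation',
)

# Raw counts per category (missing values are excluded by default)
counts = violations.value_counts()

# Dividing each count by the total gives the proportions by hand
manual = counts / counts.sum()

# The built-in shortcut used in the cell above
builtin = violations.value_counts(normalize=True)

print(manual)
print(builtin)
```

Both Series should hold the same numbers, and the proportions should sum to 1.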


### Comparing violations by gender
The question we're trying to answer is whether male and female drivers tend to commit different types of traffic violations.

In this exercise, you'll first create a DataFrame for each gender, and then analyze the violations in each DataFrame separately.

In [2]:
# Create a DataFrame of female drivers
female = ri[ri['driver_gender'] == 'F']

# Create a DataFrame of male drivers
male = ri[ri['driver_gender'] == 'M']

# Compute the violations by female drivers (as proportions)
print(female.violation.value_counts(normalize=True))

# Compute the violations by male drivers (as proportions)
print(male.violation.value_counts(normalize=True))

Speeding               0.703508
Moving violation       0.134673
Equipment              0.089652
Registration/plates    0.039247
Other                  0.028200
Seat belt              0.004719
Name: violation, dtype: float64
Speeding               0.556342
Moving violation       0.213740
Equipment              0.119332
Other                  0.066829
Registration/plates    0.037767
Seat belt              0.005989
Name: violation, dtype: float64
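
The two separate DataFrames make the comparison explicit, but the same proportions can also be laid out side by side. The sketch below is one possible alternative, not the method used in this exercise: it applies `pd.crosstab` with `normalize='index'` to a small made-up DataFrame (the rows are hypothetical), so that each gender's row sums to 1.

```python
import pandas as pd

# Hypothetical toy data with the same column names as ri
toy = pd.DataFrame({
    'driver_gender': ['F', 'F', 'F', 'F', 'M', 'M', 'M', 'M'],
    'violation': ['Speeding', 'Speeding', 'Speeding', 'Equipment',
                  'Speeding', 'Speeding', 'Equipment', 'Other'],
})

# Cross-tabulate gender against violation and normalize within each row,
# so every row holds that gender's violation proportions
by_gender = pd.crosstab(
    toy['driver_gender'], toy['violation'], normalize='index'
)

print(by_gender)
```

Reading across a row gives the same kind of breakdown as calling `value_counts(normalize=True)` on that gender's DataFrame.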


### Comparing speeding outcomes by gender
When a driver is pulled over for speeding, many people believe that gender has an impact on whether the driver will receive a ticket or a warning. Can you find evidence of this in the dataset?

First, you'll create two DataFrames of drivers who were stopped for speeding: one containing females and the other containing males.

Then, for each gender, you'll use the stop_outcome column to calculate what percentage of stops resulted in a "Citation" (meaning a ticket) versus a "Warning".

In [3]:
# Create a DataFrame of female drivers stopped for speeding
female_and_speeding = ri[(ri.driver_gender == 'F') & (ri.violation == 'Speeding')]

# Create a DataFrame of male drivers stopped for speeding
male_and_speeding = ri[(ri.driver_gender == 'M') & (ri.violation == 'Speeding')]

# Compute the stop outcomes for female drivers (as proportions)
print(female_and_speeding.stop_outcome.value_counts(normalize = True))

# Compute the stop outcomes for male drivers (as proportions)
print(male_and_speeding.stop_outcome.value_counts(normalize = True))

Citation            0.968831
Arrest Driver       0.006234
N/D                 0.001105
Arrest Passenger    0.000552
No Action           0.000237
Name: stop_outcome, dtype: float64
Citation            0.957197
Arrest Driver       0.017498
Arrest Passenger    0.001258
N/D                 0.001073
No Action           0.001036
Name: stop_outcome, dtype: float64
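
The filters in this exercise combine two conditions with `&`. The sketch below, on a made-up DataFrame (hypothetical rows), names each Boolean mask before combining them. It also illustrates why the parentheses in the cell above matter: in Python, `&` binds more tightly than `==`, so each comparison has to be wrapped or stored first.

```python
import pandas as pd

# Hypothetical toy data with the same column names as ri
toy = pd.DataFrame({
    'driver_gender': ['F', 'F', 'M', 'M', 'M'],
    'violation': ['Speeding', 'Equipment', 'Speeding', 'Speeding', 'Other'],
    'stop_outcome': ['Citation', 'Warning', 'Citation', 'Warning', 'Citation'],
})

# Build each Boolean mask separately so the logic is easy to read
is_female = toy['driver_gender'] == 'F'
is_male = toy['driver_gender'] == 'M'
is_speeding = toy['violation'] == 'Speeding'

# Combine masks element-wise with & (logical AND)
female_speeding = toy[is_female & is_speeding]
male_speeding = toy[is_male & is_speeding]

# Outcome proportions within each filtered subset
female_rates = female_speeding['stop_outcome'].value_counts(normalize=True)
male_rates = male_speeding['stop_outcome'].value_counts(normalize=True)

print(female_rates)
print(male_rates)
```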


### Calculating the search rate
During a traffic stop, the police officer sometimes conducts a search of the vehicle. In this exercise, you'll calculate the percentage of all stops in the ri DataFrame that result in a vehicle search, also known as the search rate.

In [5]:
# Check the data type of 'search_conducted'
print(ri.search_conducted.dtype)

# Calculate the search rate by counting the values
print(ri.search_conducted.value_counts(normalize = True))


object
False    0.888087
FALSE    0.073800
True     0.035984
TRUE     0.002115
F        0.000014
Name: search_conducted, dtype: float64
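
The output above shows that search_conducted is stored as object and mixes several spellings (False, FALSE, True, TRUE, F), so the search rate is split across rows. If True and TRUE both mean a search took place, the rate would be roughly 0.035984 + 0.002115 ≈ 0.0381. The sketch below shows one possible way to collapse such labels into real Booleans on a made-up Series. The mapping, and in particular reading 'F' as False, is an assumption about what the labels mean, not something the dataset confirms.

```python
import pandas as pd

# Hypothetical toy data mimicking the mixed labels seen in the output
raw = pd.Series(
    [False, 'FALSE', True, 'TRUE', 'F', False, False, False],
    dtype=object,
)

# Assumed meaning of each label after lower-casing (an assumption)
label_to_bool = {'true': True, 'false': False, 'f': False}

# Convert everything to lower-case text, then map to Booleans;
# labels missing from the mapping become NaN
cleaned = raw.astype(str).str.strip().str.lower().map(label_to_bool)

# Count labels the mapping did not cover before trusting the result
unmapped = cleaned.isna().sum()

# The mean of a Boolean Series is the proportion of True values
search_rate = cleaned.dropna().astype(bool).mean()

print(unmapped)
print(search_rate)
```

Cleaning the column this way also matters for later filters: a comparison such as `== True` matches only real Boolean values, not the string 'TRUE'.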


### Counting protective frisks
During a vehicle search, the police officer may pat down the driver to check if they have a weapon. This is known as a "protective frisk."

In this exercise, you'll first check to see how many times "Protective Frisk" was the only search type. Then, you'll use a string method to locate all instances in which the driver was frisked.

In [7]:
# Count the 'search_type' values
print(ri.search_type.value_counts())

# Check if 'search_type' contains the string 'Protective Frisk'
ri['frisk'] = ri.search_type.str.contains('Protective Frisk', na=False)

# Check the data type of 'frisk'
print(ri['frisk'].dtype)

# Take the sum of 'frisk'
print(ri['frisk'].sum())

Incident to Arrest                                          1140
Probable Cause                                               671
Reasonable Suspicion                                         180
Inventory                                                    171
Protective Frisk                                             139
Incident to Arrest,Inventory                                 103
Incident to Arrest,Probable Cause                             80
Incident to Arrest,Protective Frisk                           32
Probable Cause,Reasonable Suspicion                           31
Probable Cause,Protective Frisk                               27
Incident to Arrest,Inventory,Probable Cause                   27
Incident to Arrest,Inventory,Protective Frisk                 18
Protective Frisk,Reasonable Suspicion                         17
Inventory,Probable Cause                                      16
Inventory,Protective Frisk                                    12
Incident to Arrest,Probab
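
The difference between an exact match and a substring match is easy to see on a small made-up Series (the values below are hypothetical). This sketch also shows the effect of `na=False`, which treats missing search types as "not a frisk" so the result stays Boolean.

```python
import pandas as pd

# Hypothetical toy data standing in for ri['search_type'];
# None represents a stop with no search recorded
search_type = pd.Series([
    'Protective Frisk',
    'Incident to Arrest,Protective Frisk',
    'Probable Cause',
    None,
])

# Exact match: rows where a frisk was the only search type
only_frisk = (search_type == 'Protective Frisk').sum()

# Substring match: rows where a frisk appears anywhere in the value;
# na=False turns missing values into False instead of NaN
frisk = search_type.str.contains('Protective Frisk', na=False)

print(only_frisk)
print(frisk.dtype)
print(frisk.sum())
```

Because True counts as 1 and False as 0, summing the Boolean Series gives the number of stops that involved a frisk.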

### Comparing frisk rates by gender
In this exercise, you'll compare the rates at which female and male drivers are frisked during a search. Are males frisked more often than females, perhaps because police officers consider them to be higher risk?

Before doing any calculations, it's important to filter the DataFrame to only include the relevant subset of data, namely stops in which a search was conducted.

In [8]:
# Create a DataFrame of stops in which a search was conducted
searched = ri[ri.search_conducted == True]

# Calculate the overall frisk rate by taking the mean of 'frisk'
print(searched.frisk.mean())

# Calculate the frisk rate for each gender
print(searched.groupby('driver_gender').frisk.mean())

0.10148902821316615
driver_gender
F    0.077586
M    0.105263
Name: frisk, dtype: float64
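
The frisk rates above rely on Boolean math: the mean of a Boolean column is the share of True values. Here is a minimal sketch on a made-up DataFrame (hypothetical rows, with search_conducted already stored as real Booleans) that walks through the same filter-then-group pattern.

```python
import pandas as pd

# Hypothetical toy data with the same column names as ri
toy = pd.DataFrame({
    'driver_gender': ['F', 'F', 'M', 'M', 'M', 'M'],
    'search_conducted': [True, False, True, True, True, False],
    'frisk': [False, False, True, False, False, False],
})

# Keep only the stops in which a search was conducted
searched = toy[toy['search_conducted']]

# Overall frisk rate: the share of searched stops where frisk is True
overall = searched['frisk'].mean()

# Frisk rate within each gender
by_gender = searched.groupby('driver_gender')['frisk'].mean()

print(overall)
print(by_gender)
```

Filtering first matters: computing the mean over all stops would divide by stops where no search happened, which would understate the frisk rate.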
