Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
108 lines (94 sloc) 9.06 KB
title: "Cook County Criminal Judge Severity Ranking"
author: "Jacob Gosselin"
date: "10/1/2018"
output: html_document
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
Hi there! What follows is a written out methodology for *italic* Injustice *italic* Watch's sentencing analysis on Criminal Division judges. All work was done in R and Stata. The R code and Stata code can be downloaded at our GitHub; since our data is read in from the Cook County Online Data portal, our work can easily be re-created or expanded upon. Thanks!
# Reading in Data/Converting Sentence Term
We'll start by reading in our original sentencing data (from We'll then create a conversion table to standardize our units (i.e. years=1, months=1/12, weeks=1/52, days=1/365, all other units are left undefined but the rows are kept). We'll then convert our sentence (i.e. 6 months=.5), and store it under a new variable, "converted_sentence".
```{r, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
original <- read_csv("")
conversion_table <- revalue(original$COMMITMENT_UNIT, c("Year(s)"=1, "Months"=1/12, "Weeks"=1/52, "Days"=1/365, "Natural Life"=100, "Pounds"=NA, "Dollars"=NA, "Term"=NA))
conversion_table <- as.double(conversion_table)
#Creating relevant subsets
We'll now create a series of subsets, to find median sentences. We're going to create a subset for class 1, 2, 3, 4, and X felonies. This will exclude 2792 cases, which are filed under class A, B, C, M, O, P, U, or Z felonies. A lot of these are mistaken filings, but we don't want to assign them. Since the sample size is large, we're better of ignoring them (they only make up <2% of cases).
We're also going to create further subsets (PJ) for sentences to Prison or Jail. We'll use these to find median sentences; while it eliminates a good chunk of our cases (~41%), you have to do this to get an accurate read on median sentence time. Otherwise, a two year probation will skew our median, since that will be considered harsher than a one year prison sentence.
```{r, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
CLASS_1 <- subset(original, CLASS=="1")
CLASS_2 <- subset(original, CLASS=="2")
CLASS_3 <- subset(original, CLASS=="3")
CLASS_4 <- subset(original, CLASS=="4")
CLASS_X <- subset(original, CLASS=="X")
CLASS_1_PJ <- subset(original, CLASS=="1" & (SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail"))
CLASS_2_PJ <- subset(original, CLASS=="2" & (SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail"))
CLASS_3_PJ <- subset(original, CLASS=="3" & (SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail"))
CLASS_4_PJ <- subset(original, CLASS=="4" & (SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail"))
CLASS_X_PJ <- subset(original, CLASS=="X" & (SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail"))
original_PJ <- subset(original, SENTENCE_TYPE=="Prison" | SENTENCE_TYPE=="Jail")
median_1 <- median(CLASS_1_PJ$converted_sentence, na.rm=TRUE)
median_2 <- median(CLASS_2_PJ$converted_sentence, na.rm=TRUE)
median_3 <- median(CLASS_3_PJ$converted_sentence, na.rm=TRUE)
median_4 <- median(CLASS_4_PJ$converted_sentence, na.rm=TRUE)
median_X <- median(CLASS_X_PJ$converted_sentence, na.rm=TRUE)
The outputs are our median prison sentences by felony class.
#Creating Severity Ranking
Now we construct our ranking of Criminal Division judges by sentence severity. We'll do this in both R and Stata (again, all our code can be found in our Github). First we're going to create a subset of our original which solely includes felonies of class 1, 2, 3, 4, and X (which is the vast majority of entries). Then we're going to create a boolean for whether the charge resulted in prison time, and if so, whether that prison sentence was above the median.
```{r, tidy=TRUE, tidy.opts=list(width.cutoff=60)}
original_subset <- subset(original, CLASS=="1" | CLASS=="2" | CLASS=="3" | CLASS=="4" | CLASS=="X")
conversion_table2 <- revalue(original_subset$SENTENCE_TYPE, c("Prison"=TRUE, "Jail"=TRUE))
above_median <- (original_subset$PJ==TRUE & ((original_subset$CLASS=="1" & original_subset$converted_sentence>median_1) | (original_subset$CLASS=="2" & original_subset$converted_sentence>median_2) | (original_subset$CLASS=="3" & original_subset$converted_sentence>median_3) | (original_subset$CLASS=="4" & original_subset$converted_sentence>median_4) | (original_subset$CLASS=="X" & original_subset$converted_sentence>median_X)))
original_subset["above_median"] <- above_median
write_csv(original_subset, "~/Desktop/rankings.csv")
Now we export our new dataset, and re-open it in Stata. We use each boolean to create a table of A) How many sentences each judge has handed down, B) How many prison/jail sentences each judge has handed down, C) How many class 1, 2, 3, and 4 felonies each judge has seen, D) How many class 1, 2, 3, and 4 felonies have resulted in prison time, D) How many of the class 1, 2, 3, and 4 felony sentences which resulted in prison were "above the median", and E) How many prison/jail sentences above the median each judge has handed down overall. We also drop every judge who's ruled on under 1000 cases, (it's unfair to include them, as their decision making patterns are still very susceptible to statistical randomness), and further subset to only include judges within the Criminal Division (as publicly listed at We then calculate the percent of class 4 felony sentences resulting in prison time, and the percent of prison/jail sentences above the median. We finally average out the two, to create our best measure of sentencing severity; this accounts for the severity of sentence type (we only look at class 4 felonies here, i.e. the most minor of felony offenses, to be fair to judges handling more serious cases) and sentence length.
import delimited /Users/jgosselin15/Desktop/rankings.csv
gen counter=1
gen counter_pj=1 if pj=="TRUE"
gen counter_overallabovemedian=1 if above_median=="TRUE"
gen counter_4=1 if class=="4"
gen counter_3=1 if class=="3"
gen counter_2=1 if class=="2"
gen counter_1=1 if class=="1"
gen counter_4pj=1 if class=="4" & pj=="TRUE"
gen counter_3pj=1 if class=="3" & pj=="TRUE"
gen counter_2pj=1 if class=="2" & pj=="TRUE"
gen counter_1pj=1 if class=="1" & pj=="TRUE"
gen counter_4_above=1 if class=="4" & above_median=="TRUE"
gen counter_3_above=1 if class=="3" & above_median=="TRUE"
gen counter_2_above=1 if class=="2" & above_median=="TRUE"
gen counter_1_above=1 if class=="1" & above_median=="TRUE"
collapse (sum) counter (sum) counter_pj (sum) counter_overallabovemedian (sum) counter_4 (sum) counter_4pj (sum) counter_3 (sum) counter_3pj (sum) counter_2 (sum) counter_2pj (sum) counter_1 (sum) counter_1pj (sum) counter_4_above (sum) counter_3_above (sum) counter_2_above (sum) counter_1_above, by(sentence_judge)
drop if counter<1000
keep if sentence_judge=="William T O'Brien"|sentence_judge=="Catherine Marie Haberkorn" |sentence_judge=="Dennis J Porter" |sentence_judge=="Lawrence Edward Flood" |sentence_judge=="Kenneth J Wadas"|sentence_judge=="William H Hooks"|sentence_judge=="Thomas V Gainer"|sentence_judge=="Vincent M Gaughan"|sentence_judge=="Araujo, Mauricio"|sentence_judge=="Byrne, Thomas"|sentence_judge=="Arthur F Hill"|sentence_judge=="Carol M Howard"|sentence_judge=="Timothy Joseph Joyce"|sentence_judge=="James B Linn"|sentence_judge=="Mary Margaret Brosnahan"|sentence_judge=="Michael B McHale"|sentence_judge=="Erica L Reddick"|sentence_judge=="Thomas J Hennelly"|sentence_judge=="Alfredo Maldonado"|sentence_judge=="Charles P Burns"|sentence_judge=="Matthew E Coghlan"|sentence_judge=="Thaddeus L Wilson"|sentence_judge=="Diane Gordon Cannon"|sentence_judge=="Maura Slattery Boyle"
gen class4_prisonpercent=counter_4pj/counter_4
gen class3_prisonpercent=counter_3pj/counter_3
gen class2_prisonpercent=counter_2pj/counter_2
gen class1_prisonpercent=counter_1pj/counter_1
gen above_median=counter_overallabovemedian/counter_pj
gen severity_metric=(class4_prisonpercent+above_median)/2
sort severity_metric
export excel using "/Volumes/GoogleDrive/My Drive/Injustice Watch/overall_rankings.xlsx", firstrow(variables)
The resulting table is saved on our Github. All judges running for retention are highlighted. The average rate of sentencing above the median, and class 4 felony prison percent, is included in an accompanying table.
#Closing Thoughts
In conclusion, it's worth mentioning here that just because a judge seems to sentence more harshly than his or her colleagues (i.e. their severity metric is above the overall average), that does not mean that judge is necessarily more severe. While cases are assigned randomly, further statistical testing is necessary to establish whether such a disparity is significant.
That's it! As previously mentioned, all code used is available, so please expand upon our work.