# Analysis of Jefferson County (TX) Jail Data

The code below analyzes arrest data from Jefferson County, Texas. For more details and context, [please read this page](https://github.com/BuzzFeedNews/2016-01-port-arthur-arrests).

## Load the data

In [1]:
import pandas as pd

In [2]:
date_cols = ["ARREST DT", "RELEASE DT"]

In [3]:
all_class_c_arrests = pd.read_csv("../data/clean/all_class_c_arrests.csv",
    parse_dates=date_cols, dtype={"WARRANT #": str})

In [4]:
all_class_c_arrests.head()

| | # | OFFENSE | BOND | WARRANT # | FILED BY | JAIL ID | ARREST DT | RELEASE DT | RACE_SEX | days_served | ARREST YEAR | RACE | SEX |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | FAILURE TO APPEAR | 0 | 8080676 | PAPD | 193551 | 2005-01-01 | 2005-01-01 | BM | 0 | 2005 | B | M |
| 1 | 2 | FAILURE TO APPEAR | 0 | 8080488 | PAPD | 193551 | 2005-01-01 | 2005-01-01 | BM | 0 | 2005 | B | M |
| 2 | 3 | FAILURE TO APPEAR | 0 | 9080676 | PAPD | 193551 | 2005-01-01 | 2005-01-01 | BM | 0 | 2005 | B | M |
| 3 | 4 | TRAFFIC OFFENSE-MOVING | 0 | 80676 | PAPD | 193551 | 2005-01-01 | 2005-01-01 | BM | 0 | 2005 | B | M |
| 4 | 5 | FAILURE TO MNTN FINAN RESP | 0 | 80676 | PAPD | 193551 | 2005-01-01 | 2005-01-01 | BM | 0 | 2005 | B | M |


In [5]:
traffic_only = pd.read_csv("../data/clean/traffic_only.csv", parse_dates=date_cols)

In [6]:
traffic_only.head()

| | JAIL ID | ARREST DT | charges | warrants_filed_by | RACE | SEX | RELEASE DT | ARREST YEAR | days_served |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 55 | 2005-08-08 | [FAILURE TO APPEAR, SPEEDING, DRIVING-WHILE LI... | [PAPD, PAPD, PAPD] | W | M | 2005-08-08 | 2005 | 0 |
| 1 | 55 | 2006-03-18 | [FAILURE TO APPEAR, FAILURE TO APPEAR, DRIVING... | [PAPD, PAPD, PAPD, PAPD] | W | M | 2006-03-18 | 2006 | 0 |
| 2 | 99 | 2009-04-19 | [STOP LIGHT- RUNNING, FAILURE TO APPEAR] | [PAPD, PAPD] | B | M | 2009-04-22 | 2009 | 3 |
| 3 | 246 | 2006-01-29 | [FAILURE TO APPEAR, SPEEDING] | [PAPD, PAPD] | B | M | 2006-01-29 | 2006 | 0 |
| 4 | 540 | 2015-06-23 | [DRIVING-NO DRIVERS LICENSE, SPEEDING, FAILURE... | [PAPD, PAPD, PAPD] | B | M | 2015-06-24 | 2015 | 1 |
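
In the five rows shown above, `days_served` matches the number of whole days between `ARREST DT` and `RELEASE DT`. The sketch below checks that relationship on those rows, rebuilt by hand. That the column was derived this way for the full dataset is an assumption; this notebook does not confirm it.

```python
import pandas as pd

# The five rows shown by traffic_only.head(), rebuilt by hand for illustration.
sample = pd.DataFrame({
    "ARREST DT": pd.to_datetime(
        ["2005-08-08", "2006-03-18", "2009-04-19", "2006-01-29", "2015-06-23"]),
    "RELEASE DT": pd.to_datetime(
        ["2005-08-08", "2006-03-18", "2009-04-22", "2006-01-29", "2015-06-24"]),
    "days_served": [0, 0, 3, 0, 1],
})

# Assumption: days_served is the whole-day gap between arrest and release.
recomputed_days = (sample["RELEASE DT"] - sample["ARREST DT"]).dt.days

# True only if every recomputed value equals the stored one.
matches = (recomputed_days == sample["days_served"]).all()
print(matches)
```

Running the same comparison on the full `traffic_only` frame would show whether the relationship holds beyond these five rows.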


In [7]:
traffic_by_year = traffic_only.groupby("ARREST YEAR")

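Number of distinct people (unique `JAIL ID` values) arrested per year, first for all Class C arrests and then for traffic-only arrests: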

In [8]:
all_class_c_arrests_by_year = all_class_c_arrests.groupby("ARREST YEAR")

In [9]:
pd.DataFrame(all_class_c_arrests_by_year["JAIL ID"].nunique())

| ARREST YEAR | JAIL ID |
|---|---|
| 2005 | 1007 |
| 2006 | 996 |
| 2007 | 1128 |
| 2008 | 927 |
| 2009 | 1223 |
| 2010 | 1708 |
| 2011 | 1763 |
| 2012 | 1446 |
| 2013 | 1354 |
| 2014 | 1055 |


In [10]:
pd.DataFrame(traffic_by_year["JAIL ID"].nunique())

| ARREST YEAR | JAIL ID |
|---|---|
| 2005 | 407 |
| 2006 | 391 |
| 2007 | 405 |
| 2008 | 341 |
| 2009 | 559 |
| 2010 | 558 |
| 2011 | 531 |
| 2012 | 419 |
| 2013 | 438 |
| 2014 | 290 |
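
The per-year figures above count distinct `JAIL ID` values rather than rows. That matters because `all_class_c_arrests` appears to hold one row per charge: the five rows shown by `head()` all share a single `JAIL ID`. A minimal sketch of the difference, using those five rows rebuilt by hand (the variable names are illustrative):

```python
import pandas as pd

# The five rows from all_class_c_arrests.head(), reduced to the relevant columns.
sample = pd.DataFrame({
    "JAIL ID": [193551] * 5,
    "OFFENSE": [
        "FAILURE TO APPEAR",
        "FAILURE TO APPEAR",
        "FAILURE TO APPEAR",
        "TRAFFIC OFFENSE-MOVING",
        "FAILURE TO MNTN FINAN RESP",
    ],
    "ARREST YEAR": [2005] * 5,
})

by_year = sample.groupby("ARREST YEAR")

# Row count: one per charge.
rows_per_year = by_year.size()

# Distinct JAIL IDs: one per person.
people_per_year = by_year["JAIL ID"].nunique()

print(rows_per_year.loc[2005], people_per_year.loc[2005])
```

Counting rows here would report five arrests where the data describes one person booked on five charges.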


*Note: Data does not cover the entire year of 2015.*

## Analyses

### "From 2009 to 2011, the height of the city’s traffic enforcement spree, about 1,500 people were booked into lockup for unpaid traffic fines."

In [11]:
traffic_only_2009_2011 = traffic_only[
    (traffic_only["ARREST YEAR"] >= 2009) & 
    (traffic_only["ARREST YEAR"] <= 2011)
]

In [12]:
traffic_only_2009_2011["JAIL ID"].nunique()

1518
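
The year-range filter used here recurs later in the notebook. One compact way to express it is `Series.between`, which includes both endpoints by default. The helper name `count_people` and the toy data below are hypothetical and not part of the original analysis.

```python
import pandas as pd

def count_people(df, first_year, last_year):
    """Count distinct JAIL IDs with an arrest in [first_year, last_year]."""
    in_range = df["ARREST YEAR"].between(first_year, last_year)
    return df.loc[in_range, "JAIL ID"].nunique()

# Toy data (made up): person 1 is arrested twice inside the 2009-2011 window.
toy = pd.DataFrame({
    "JAIL ID": [1, 1, 2, 3, 4],
    "ARREST YEAR": [2009, 2011, 2010, 2008, 2012],
})

people_2009_2011 = count_people(toy, 2009, 2011)
print(people_2009_2011)
```

With the real data, `count_people(traffic_only, 2009, 2011)` should be equivalent to the two-step filter and count shown above.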

### "Beyond those numbers lies a racial dimension: The people who Port Arthur police put behind bars for their traffic tickets are disproportionately black. While black people make up only 40% of the overall population — and are ticketed at about that rate — they accounted for about 70% of the arrests for these citations in 2014, according to a BuzzFeed News analysis."

In [13]:
all_class_c_arrests_2014 = all_class_c_arrests[
    (all_class_c_arrests["ARREST YEAR"] == 2014)
]

In [14]:
traffic_only_2014 = traffic_only[
    (traffic_only["ARREST YEAR"] == 2014)
]

In [15]:
n_arrested_class_c_2014 = all_class_c_arrests_2014["JAIL ID"].nunique()

n_black_arrested_class_c_2014 = all_class_c_arrests_2014[
    all_class_c_arrests_2014["RACE"].str.contains('B')
]["JAIL ID"].nunique()

In [16]:
print("""
• {0:,} people were arrested for Class C offenses by Port Arthur Police in 2014.
• Of those people, {1:,} — or {2:.1f}% — were listed as black.
""".format(
    n_arrested_class_c_2014,
    n_black_arrested_class_c_2014,
    n_black_arrested_class_c_2014 * 100.0 / n_arrested_class_c_2014))


• 1,055 people were arrested for Class C offenses by Port Arthur Police in 2014.
• Of those people, 668 — or 63.3% — were listed as black.



As the code below shows, a similar pattern holds when you look at people arrested only for traffic offenses.

In [17]:
n_arrested_traffic_only_2014 = traffic_only_2014["JAIL ID"].nunique()

n_black_arrested_traffic_only_2014 = traffic_only_2014[
    traffic_only_2014["RACE"].str.contains('B')
]["JAIL ID"].nunique()

In [18]:
print("""
• {0:,} people were arrested for Class C *traffic* offenses by Port Arthur Police in 2014.
• Of those people, {1:,} — or {2:.1f}% — were listed as black.
""".format(
    n_arrested_traffic_only_2014,
    n_black_arrested_traffic_only_2014,
    n_black_arrested_traffic_only_2014 * 100.0 / n_arrested_traffic_only_2014))


• 290 people were arrested for Class C *traffic* offenses by Port Arthur Police in 2014.
• Of those people, 208 — or 71.7% — were listed as black.
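
One rough way to set these shares against the roughly 40% population figure in the quoted passage is to divide the arrest share by the population share. The sketch below uses only the counts printed above and that 40% figure. The ratio is an illustration added here, not a statistic reported in the article.

```python
# Counts printed above for 2014.
class_c_total, class_c_black = 1055, 668
traffic_total, traffic_black = 290, 208

# Population share as quoted in the article passage ("only 40%").
population_share = 0.40

class_c_share = class_c_black / class_c_total
traffic_share = traffic_black / traffic_total

# A ratio above 1 means the group is over-represented relative to population.
class_c_ratio = class_c_share / population_share
traffic_ratio = traffic_share / population_share

print("Class C: {0:.1%} of people arrested, ratio {1:.2f}".format(
    class_c_share, class_c_ratio))
print("Traffic only: {0:.1%} of people arrested, ratio {1:.2f}".format(
    traffic_share, traffic_ratio))
```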



### "Over the past decade, about 1,300 people have spent three days or more in jail for traffic tickets — and about 75% of those people were black."

In [19]:
all_class_c_arrests_2005_2014 = all_class_c_arrests[
    (all_class_c_arrests["ARREST YEAR"] >= 2005) & 
    (all_class_c_arrests["ARREST YEAR"] <= 2014)
]

In [20]:
traffic_only_2005_2014 = traffic_only[
    (traffic_only["ARREST YEAR"] >= 2005) & 
    (traffic_only["ARREST YEAR"] <= 2014)
]

In [21]:
class_c_three_nights_2005_2014 = all_class_c_arrests_2005_2014[
    all_class_c_arrests_2005_2014["days_served"] >= 3
]

n_class_c_three_nights_2005_2014 = class_c_three_nights_2005_2014["JAIL ID"].nunique()

n_black_class_c_three_nights_2005_2014 = class_c_three_nights_2005_2014[
    class_c_three_nights_2005_2014["RACE"].str.contains('B')
]["JAIL ID"].nunique()

In [22]:
print("""
• Over the past decade, {0:,d} people spent 3 nights or more in jail for Class C offenses.
• Of those people, {1:,} — or {2:.1f}% — were black.
""".format(
    n_class_c_three_nights_2005_2014,
    n_black_class_c_three_nights_2005_2014,
    n_black_class_c_three_nights_2005_2014 * 100.0 / n_class_c_three_nights_2005_2014
))


• Over the past decade, 3,021 people spent 3 nights or more in jail for Class C offenses.
• Of those people, 2,083 — or 69.0% — were black.



As the code below shows, a similar pattern holds when you look at people jailed only for traffic offenses.

In [23]:
traffic_three_nights_2005_2014 = traffic_only_2005_2014[
    traffic_only_2005_2014["days_served"] >= 3
]

n_traffic_three_nights_2005_2014 = traffic_three_nights_2005_2014["JAIL ID"].nunique()

n_black_traffic_three_nights_2005_2014 = traffic_three_nights_2005_2014[
    traffic_three_nights_2005_2014["RACE"].str.contains('B')
]["JAIL ID"].nunique()

In [24]:
print("""
• Over the past decade, {0:,d} people spent 3 nights or more in jail for Class C *traffic* offenses.
• Of those people, {1:,} — or {2:.1f}% — were black.
""".format(
    n_traffic_three_nights_2005_2014,
    n_black_traffic_three_nights_2005_2014,
    n_black_traffic_three_nights_2005_2014 * 100.0 / n_traffic_three_nights_2005_2014
))


• Over the past decade, 1,315 people spent 3 nights or more in jail for Class C *traffic* offenses.
• Of those people, 1,000 — or 76.0% — were black.
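
The computations in this section and the previous one follow the same pattern: filter the rows, count distinct `JAIL ID` values, then count those listed as black. A hedged sketch of that pattern as one function follows. The name `race_share`, its parameters, and the toy rows are hypothetical, and it compares `RACE` by exact equality rather than with `str.contains`.

```python
import pandas as pd

def race_share(df, race_code="B", min_days=0):
    """Return (people, people with race_code, percentage) among rows
    whose days_served is at least min_days."""
    subset = df[df["days_served"] >= min_days]
    n_people = subset["JAIL ID"].nunique()
    n_matching = subset.loc[subset["RACE"] == race_code, "JAIL ID"].nunique()
    # Avoid dividing by zero when the filter leaves no rows.
    pct = n_matching * 100.0 / n_people if n_people else float("nan")
    return n_people, n_matching, pct

# Toy data (made up): person 1 has two stays of three or more days.
toy = pd.DataFrame({
    "JAIL ID": [1, 1, 2, 3, 4],
    "RACE": ["B", "B", "W", "B", "W"],
    "days_served": [3, 5, 4, 0, 3],
})

n_people, n_black, pct = race_share(toy, min_days=3)
print("{0:,} people, {1:,} ({2:.1f}%) listed as black".format(
    n_people, n_black, pct))
```

Applied to the real frames, a call such as `race_share(traffic_only_2005_2014, min_days=3)` should give the same counts as the cells above, provided every `RACE` value is a single-letter code.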



---
