# Analysis: ICE detainers in Travis County, Texas
This notebook has the scripts to parse and analyze data returned in response to a request for ICE detainer records in Travis County, Texas.

The sheriff's office posts [PDFs of daily reports](https://www.tcsheriff.org/inmate-jail-info/ice-listing) generated by its database software, but the Statesman requested, and received, a spreadsheet version of these reports in early October 2016.

The spreadsheet of raw data lives at `raw_data/ICE - Detainer Added.xlsx`. There's also a file of lookups to help standardize and group data at `./lookups.py`.
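The contents of `lookups.py` aren't shown in this notebook. As a rough sketch of how lookups like these get used by the parsing code, assume `COUNTRIES` is a dict that maps raw nativity strings to clean country names, and that the charge-level lookups are collections of level codes. Every name and value below (`EXAMPLE_COUNTRIES`, `EXAMPLE_FELONIES`, `EXAMPLE_MISDEMEANORS`, `clean_nativity`, `charge_class`) is a made-up stand-in for illustration, not the real lookup data.

```python
# Hypothetical stand-ins; the real values live in lookups.py
EXAMPLE_COUNTRIES = {'MEX': 'Mexico', 'MEXICO': 'Mexico'}
EXAMPLE_FELONIES = {'F1', 'F2', 'F3', 'FS'}
EXAMPLE_MISDEMEANORS = {'MA', 'MB', 'MC'}


def clean_nativity(raw, countries=EXAMPLE_COUNTRIES):
    """Return the clean country name, or the raw value if there is no match."""
    return countries.get(raw, raw)


def charge_class(level, felonies=EXAMPLE_FELONIES,
                 misdemeanors=EXAMPLE_MISDEMEANORS):
    """Bucket a charge-level code as 'felony', 'misdemeanor' or 'other'."""
    if level in felonies:
        return 'felony'
    if level in misdemeanors:
        return 'misdemeanor'
    return 'other'
```

Falling back to the raw value keeps unmatched entries visible in the grouped output instead of silently dropping them.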

Note: This notebook begins with a bash script that transforms the spreadsheet into a CSV and chops some cruft from each end of the file, so this step probably won't work if you're on a Windows machine.

Data from before 2006 was less reliable, so it was excluded from this analysis.

In [5]:
# render matplotlib plots inline in the notebook
%matplotlib inline

In [6]:
%%bash

# use csvkit to turn the spreadsheet into a CSV
in2csv "raw_data/ICE - Detainer Added.xlsx" > raw_data/data.csv

# trim five lines from the head and five from the tail
# (tail -n +6 starts output at line 6; head then keeps count - 10 of what's left)
count=$(wc -l < raw_data/data.csv | sed 's/ //g'); trim=$(echo $count - 10 | bc); \
tail -n +6 raw_data/data.csv | head -n $trim > raw_data/trimmed_data.csv

# clean up intermediate file
rm -rf raw_data/data.csv

# report line count
wc -l < raw_data/trimmed_data.csv | sed 's/ //g'

53439
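If you're on Windows, the trimming half of the bash cell can be approximated in plain Python. This is a minimal sketch, not part of the original analysis: `trim_lines` and `trim_csv` are hypothetical helpers, and the defaults mirror what the shell commands actually do (`tail -n +6` starts at line 6, so it skips five lines, and `head` then drops the last five of what remains). Like the shell version, it works line by line, so it assumes no CSV field contains an embedded newline. Converting the spreadsheet itself still needs `in2csv` or something similar.

```python
def trim_lines(lines, skip_head=5, skip_tail=5):
    """Drop skip_head lines from the start and skip_tail lines from the end."""
    end = len(lines) - skip_tail
    # if the file is shorter than the trim, return an empty list
    return lines[skip_head:max(end, skip_head)]


def trim_csv(in_path, out_path, skip_head=5, skip_tail=5):
    """Copy in_path to out_path without the leading and trailing cruft lines."""
    with open(in_path, 'r') as infile:
        lines = infile.readlines()
    with open(out_path, 'w') as outfile:
        outfile.writelines(trim_lines(lines, skip_head, skip_tail))
```

Calling `trim_csv('raw_data/data.csv', 'raw_data/trimmed_data.csv')` would stand in for the `tail | head` pipeline.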




In [7]:
# division is imported so the percentage math below also works under Python 2
from __future__ import division, print_function

import csv
import datetime
from collections import Counter

import agate
import matplotlib.pyplot as plt
from matplotlib.dates import YearLocator, MonthLocator, DateFormatter

from lookups import COUNTRIES, VIOLENT_CRIMES, FELONIES, MISDEMEANORS


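def name_unmangler(name_str):
    # split "LAST, REST" into a (last_name, rest_name) tuple;
    # a name without a comma comes back as (name, '') so that
    # indexing the result never slices into the string itself
    name_split = name_str.split(',')
    if len(name_split) > 1:
        return (name_split[0], name_split[1])
    return (name_str, '')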


with open('raw_data/trimmed_data.csv', 'r') as infile:
    data = csv.reader(infile, delimiter=',')

    # main dict to hold the data
    inmate_dict = {}

    # set initial defaults    
    new_record = False
    booking_id = None
    
    # loop over spreadsheet data
    for row in data:
        
        # if new_record flag is True, add a new key to dict
        if new_record:
            booking_id = row[1].strip()
            name = row[2].strip()
            race = row[3].strip()
            sex = row[4].strip()
            age = row[6].strip()
            booking_date = row[7].strip()
            nativity = row[8].strip()

            # use the clean country name if there is one, else keep the raw value
            nativity_clean = COUNTRIES.get(nativity, nativity)

            # add value to dict
            inmate_dict[booking_id] = {
                'last_name': name_unmangler(name)[0],
                'rest_name': name_unmangler(name)[1],
                'race': race,
                'sex': sex,
                'age': age,
                'booking_date': booking_date,
                'nativity': nativity,
                'nativity_clean': nativity_clean,
                'charges': [],
                'felonies': 0,
                'misdemeanors': 0,
                'violent_crimes': 0
            }
            
            # reset flag
            new_record = False
        else:
            # try to get record from dict
            rec = inmate_dict.get(booking_id, None)
            if rec:
                
                # after bio data, the charges
                # make sure it's not a charges subhead row, or an empty row
                if ''.join(row).strip() != '' and row[1].strip() != 'Charge':
                    charge = {}
                    charge['charge_id'] = row[1].strip()
                    charge['charge_description'] = row[2].strip()
                    charge['charge_level'] = row[3].strip()
                    charge['sentence'] = row[4].strip()
                    charge['disposition_date'] = row[6].strip()
                    charge['disposition_description'] = row[8].strip()
                    rec['charges'].append(charge)
                    
                    # if felony, increment felony counter
                    if row[3].strip() in FELONIES and row[2].strip().upper() != 'ICE DETAINER':
                        rec['felonies'] += 1
                    
                    # if misdemeanor, increment misdemeanors counter
                    if row[3].strip() in MISDEMEANORS and row[2].strip().upper() != 'ICE DETAINER':
                        rec['misdemeanors'] += 1

                    # if violent crime, increment violent crime counter
                    if row[2].strip() in VIOLENT_CRIMES and row[2].strip().upper() != 'ICE DETAINER':
                        rec['violent_crimes'] += 1
                        
        # if it's a blank row, the next one is a new record
        if ''.join(row).strip() == '' or row[1] == 'Booking No':
            # set flag
            new_record = True
            
    print('who has two thumbs and a dict with', len(inmate_dict), 'keys')
    print('you do that\'s who')

who has two thumbs and a dict with 9533 keys
you do that's who
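The loop above is a small state machine: a blank row or a `Booking No` header row means the next row is a person's bio data, and everything after that (minus the `Charge` subhead) is a charge. Here is a stripped-down sketch of the same idea. It is a simplification, not the exact logic of the cell above: `group_report_rows` is a hypothetical helper, and the sample rows are invented for illustration and do not reflect the real column layout of the report.

```python
def group_report_rows(rows):
    """Group flat report rows into records of one bio row plus its charge rows."""
    records = []
    current = None
    expect_bio = False

    for row in rows:
        is_blank = ''.join(row).strip() == ''
        second = row[1].strip() if len(row) > 1 else ''

        if is_blank or second == 'Booking No':
            # a blank row or a column header means the next data row opens a record
            expect_bio = True
        elif expect_bio:
            current = {'bio': row, 'charges': []}
            records.append(current)
            expect_bio = False
        elif current is not None and second != 'Charge':
            # anything else under a bio row, except the charges subhead, is a charge
            current['charges'].append(row)

    return records


# invented sample rows, for illustration only
sample_rows = [
    ['', 'Booking No', 'Name', 'Race'],
    ['', '1001', 'DOE, JANE', 'W'],
    ['', 'Charge', 'Description', 'Level'],
    ['', 'C1', 'EXAMPLE CHARGE', 'MB'],
    ['', '', '', ''],
    ['', '1002', 'ROE, JOHN', 'W'],
    ['', 'Charge', 'Description', 'Level'],
    ['', 'C2', 'EXAMPLE CHARGE', 'MA'],
    ['', 'C3', 'ANOTHER CHARGE', 'F3'],
]

records = group_report_rows(sample_rows)
```

With the sample above, `records` holds two entries, the first with one charge row and the second with two.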


### Load data into Agate for basic overall analysis

In [21]:
# set up an empty list to turn into an Agate table
rows = []

# define column names and types
column_names = ['booking_id', 'booking_date', 'booking_month_year', 'nativity', 'sex',
                'age', 'race', 'felony', 'misdemeanor', 'violent_crime']
column_types = [agate.Text(), agate.Date(), agate.Text(), agate.Text(), agate.Text(), 
                agate.Number(), agate.Text(), agate.Number(), agate.Number(), agate.Number()]

# define a list to feed to a Counter for a basic frequency count of charges
charge_list = []

# get a list of how 'ICE DETAINER' records are coded
ice_codes = []

for key in inmate_dict:
    # empty list to hold values
    ls = []
    
    # get month and year for grouping later
    month_year = inmate_dict[key]['booking_date'][:7]

    # append to placeholder list
    ls.append(key)
    ls.append(inmate_dict[key]['booking_date'])
    ls.append(month_year)
    ls.append(inmate_dict[key]['nativity_clean'])
    ls.append(inmate_dict[key]['sex'])
    ls.append(inmate_dict[key]['age'])
    ls.append(inmate_dict[key]['race'])
    ls.append(inmate_dict[key]['felonies'])
    ls.append(inmate_dict[key]['misdemeanors'])
    ls.append(inmate_dict[key]['violent_crimes'])
    
    # loop over charges
    for charge in inmate_dict[key]['charges']:

        # add to frequency list
        charge_list.append(charge['charge_description'])

        # record which charge level this ICE detainer row is coded with
        if charge['charge_description'] == 'ICE DETAINER':
            ice_codes.append(charge['charge_level'])

            
    # append placeholder list to Agate list
    rows.append(ls)

# most common charges (grab 11: 'ICE DETAINER' is one of them and gets filtered out below)
most_common_charges = Counter(charge_list).most_common(11)

# list of charge levels associated with detainers
ice_codes = list(set(ice_codes))

print("##############################################")
print("# SUMMARY STATS, JAN. 1, 2006 - OCT. 4, 2016 #")
print("##############################################\n")

# get most common charges
print("10 most common charges\n----------------------")
for charge in most_common_charges:
    print(charge[0] + ":", charge[1])

most_common_charges = [x[0] for x in most_common_charges if x[0] != 'ICE DETAINER']
    
print('\n')
    
# load data into an Agate table
table = agate.Table(rows, column_names, column_types).order_by('booking_date', reverse=True).where(
    # filter out pre-2006 records
    lambda row: row['booking_date'].year >= 2006
)

print("* Total number of detainees:", len(table.rows))

median_age = table.aggregate(agate.Median('age'))

print("* Median age: ", median_age, "\n")

subset_categories = ('felony', 'violent_crime')

for category in subset_categories:
    cat_table = table.where(
        lambda row: row[category] > 0
    )

    pct_with_cat = '{:.2%}'.format(len(cat_table.rows) / len(table.rows))
    
    print("* Detainees with at least 1", category + ":", len(cat_table.rows),
          "(" + pct_with_cat + ")")

    cat_table = table.where(
        lambda row: row[category] > 1
    )

    pct_with_cat = '{:.2%}'.format(len(cat_table.rows) / len(table.rows))
    
    print("* Detainees with more than 1", category + ":", len(cat_table.rows),
          "(" + pct_with_cat + ")\n")

with_only_one_misdemeanor = table.where(
    lambda row: row['felony'] == 0 and row['misdemeanor'] == 1
)

pct_only_one_misdemeanor = '{:.2%}'.format(len(with_only_one_misdemeanor.rows) / len(table.rows))

print("* Detainees with only one misdemeanor:", len(with_only_one_misdemeanor.rows),
      "(" + pct_only_one_misdemeanor + ")\n")

# phil q: how many people charged with "most common" charges also have a felony?
def is_common(booking_id):
    rec = inmate_dict.get(booking_id, None)
    if rec:
        for charge in rec['charges']:
            if charge['charge_description'] in most_common_charges:
                return True
    return False

has_common_charge = table.where(
    lambda row: is_common(row['booking_id'])
)

has_common_charge_and_felony = has_common_charge.where(
    lambda row: row['felony'] > 0
)

print("* Detainees charged with one of the 10 most common crimes:", len(has_common_charge.rows))
print("* Of those, detainees who also were charged with a felony:", len(has_common_charge_and_felony.rows))
print("\n")

# grab some overall grouped totals

pivot = ('nativity', 'sex', 'race')

for pivot_item in pivot:
    print('Grouped by', pivot_item)    
    print('-'*(11 + len(pivot_item)))
    # sort by count before limiting, so the bars show the 10 largest groups
    table.pivot(pivot_item).order_by('Count', reverse=True).limit(10) \
         .print_bars(pivot_item, 'Count', width=80)
    print('\n')

# by age
print("Grouped by age")
print('--------------')
binned_ages = table.bins('age', start=0, end=100, count=10)
binned_ages.print_bars('age', 'Count', width=80)

##############################################
# SUMMARY STATS, JAN. 1, 2006 - OCT. 4, 2016 #
##############################################

10 most common charges
----------------------
ICE DETAINER: 9602
DRIVING WHILE INTOXICATED: 2849
ASSLT CAUSES BODILY INJ:FAMILY MEMBER (MA): 1164
TRAFFIC OFFENSE MULTIPLE: 1077
PUBLIC INTOXICATION: 766
DRIVING WHILE INTOXICATED 2ND: 514
DRIVING WHILE INTOXICATED BAC>=0.15: 501
FAIL TO ID GIVING FALSE/FICTIOUS INFO (MB): 424
POSS MARIJ <2OZ (MB): 404
FAIL TO ID FUGITIVE INTENT GIVE FALSE (MA): 382
TRAFFIC OFFENSE SINGLE: 381


* Total number of detainees: 9489
* Median age:  29 

* Detainees with at least 1 felony: 2143 (22.58%)
* Detainees with more than 1 felony: 450 (4.74%)

* Detainees with at least 1 violent_crime: 1752 (18.46%)
* Detainees with more than 1 violent_crime: 203 (2.14%)

* Detainees with only one misdemeanor: 5041 (53.12%)

* Detainees charged with one of the 10 most common crimes: 9489
* Of those, detainees who also were charge

### Questions to explore
- breakdown of detainers by month, then by race, sex, nativity, age, charge, charge level, etc.
- for each month, how many of the total detainees had felonies?
- did ending Secure Communities have an effect on the number or type of detainees?
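
One possible starting point for the monthly felony question, sketched with the standard library only. It assumes records shaped like the values in `inmate_dict`: an ISO-formatted `booking_date` string and an integer `felonies` count. `felony_counts_by_month` is a hypothetical helper and the three example records are invented, so the printed numbers mean nothing.

```python
from collections import defaultdict


def felony_counts_by_month(inmates):
    """Count detainees, and detainees with at least one felony, per YYYY-MM."""
    counts = defaultdict(lambda: {'detainees': 0, 'with_felony': 0})
    for rec in inmates.values():
        month = rec['booking_date'][:7]
        counts[month]['detainees'] += 1
        if rec['felonies'] > 0:
            counts[month]['with_felony'] += 1
    return dict(counts)


# invented records, for illustration only
example_inmates = {
    'A1': {'booking_date': '2014-03-02', 'felonies': 1},
    'A2': {'booking_date': '2014-03-19', 'felonies': 0},
    'A3': {'booking_date': '2014-04-07', 'felonies': 2},
}

monthly = felony_counts_by_month(example_inmates)
for month in sorted(monthly):
    row = monthly[month]
    share = row['with_felony'] / float(row['detainees'])
    print(month, row['detainees'], row['with_felony'], '{:.1%}'.format(share))
```

The same grouping could be done on the Agate table with the `booking_month_year` column; the plain-dict version is just easier to check by hand.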