# Clean Slate: How do transfers to Superior Court affect the analysis of how many people are eligible for expungement?
> Prepared by [Laura Feeney](https://github.com/laurafeeney) for Code for Boston's [Clean Slate project](https://github.com/codeforboston/clean-slate).

## Summary
This notebook uses the Northwestern DA data and attempts to identify when a charge might have been transferred from District Court to Superior Court. Such a transfer appears in the dataset as two rows for the same person, with the same charge and offense date but different courts and usually different dispositions. In most cases (831 of the 854 District Court rows identified), the District Court disposition is 'Nolle Prosequi'.

This could affect the analysis if the apparent duplication of records makes it appear that someone has multiple charges within an incident. While charges heard in Superior Court are more serious than those heard in District Court, many are still expungeable.
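To make the duplication concrete, here is a minimal sketch on toy data. The person IDs, charges, and dates are hypothetical and only mimic the column names used in this notebook. It shows how one transferred charge inflates the per-person row count.

```python
import pandas as pd

# Toy data (hypothetical IDs and charges): P-1 has one charge recorded once in
# District Court and once in Superior Court; P-2 has a single ordinary charge.
toy = pd.DataFrame({
    "Person ID": ["P-1", "P-1", "P-2"],
    "Offense Date": ["2016-03-11", "2016-03-11", "2017-01-05"],
    "Charge": ["ROBBERY, UNARMED", "ROBBERY, UNARMED", "LARCENY"],
    "Court": ["Greenfield District Court", "Franklin Superior Court",
              "Orange District Court"],
    "Disposition": ["Nolle Prosequi", "Guilty", "Dismissed"],
})

# Rows per person: P-1 looks like two charges.
rows_per_person = toy.groupby("Person ID").size()

# Distinct (offense date, charge) pairs per person: P-1 has only one.
distinct_charges = (
    toy.drop_duplicates(["Person ID", "Offense Date", "Charge"])
       .groupby("Person ID")
       .size()
)

print(rows_per_person.to_dict())
print(distinct_charges.to_dict())
```

Any eligibility rule that counts rows per person would treat P-1 as having two charges, which is the concern this notebook checks.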

## Result
We can only identify these "transfers" if the text description of the charge is the same in both District and Superior courts. This matches 854 District Court rows. There may be other "transfers" in which the charge text differs slightly between the two courts; those are not detected.
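The limitation of exact text matching can be sketched on toy data. The rows below are hypothetical, and the flag `looks_like_transfer` is an illustrative name, not a column in the real dataset. The second person's charge is worded differently in Superior Court, so it goes undetected.

```python
import pandas as pd

# Toy data (hypothetical): P-1's charge text matches across courts,
# P-2's charge text differs between District and Superior Court.
toy = pd.DataFrame({
    "Person ID": ["P-1", "P-1", "P-2", "P-2"],
    "Offense Date": ["2016-03-11"] * 2 + ["2018-06-01"] * 2,
    "Charge": ["ROBBERY, UNARMED", "ROBBERY, UNARMED",
               "A&B", "A&B, SERIOUS BODILY INJURY"],
    "Court": ["Greenfield District Court", "Franklin Superior Court",
              "Orange District Court", "Franklin Superior Court"],
})
toy["Superior"] = toy["Court"].str.contains("Superior")

# Flag a District Court row when another row with the same person, date, and
# identical charge text was heard in a Superior court.
any_sup = (
    toy.groupby(["Person ID", "Offense Date", "Charge"])["Superior"]
       .transform("any")
)
toy["looks_like_transfer"] = any_sup & ~toy["Superior"]

print(toy[["Person ID", "Charge", "Court", "looks_like_transfer"]])
```

Only P-1's District Court row is flagged. Catching P-2 would require some form of charge normalization or fuzzy matching, which this notebook does not attempt.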

Removing the transfers we can identify makes no difference to how many individuals are eligible for expungement in the Northwestern DA data. With better identification of "transfers", a handful of people might newly appear eligible; however, this number is likely to be very low.

This will not impact analysis based on only one incident, or analysis without restriction on number of incidents/charges. 

In [1]:
import pandas as pd
pd.set_option("display.max_rows", 200)
import numpy as np
import regex as re
import glob, os
import datetime 
from datetime import date 


nw = pd.read_csv('../../data/processed/merged_nw.csv', encoding='cp1252',
                    dtype={'Analysis notes':str, 'extra_criteria':str, 'Expungeable': str}) 

nw = nw.loc[nw['CMRoffense'] == 'no']
nw = nw.drop(columns = ['CMRoffense'])

nw['Expungeable'].value_counts(dropna=False)

Yes        55052
No         20013
Attempt      220
Name: Expungeable, dtype: int64

In [2]:
print(nw['Disposition'].value_counts())
nw['Court'].value_counts()

Dismissed at Request of Comm              13991
Nolle Prosequi                            11173
Guilty                                    10801
Continued w/o Finding                      9138
Not Responsible                            8621
Responsible                                3948
c276s87 finding                            3743
Dismissed on Payment                       2952
Dismissed                                  2936
Dismissed Prior to Arraignment             1054
Responsible Filed                           851
Not Guilty                                  713
Guilty Filed                                626
Dismissed by Court                          232
Guilty on Lesser Included Offense           183
Agreed Plea                                 152
Charge Handled as a Civil Charge            145
Delinquent                                  121
Valor Act Dispo                             108
Accord/Satisfaction                         104
Case Transferred                        

Belchertown District Court    20769
Greenfield District Court     20282
Northampton District Court    18147
Orange District Court          9223
Hampshire Superior Court       2471
Franklin Superior Court        1807
Hadley Juvenile Court          1183
Greenfield Juvenile Court       625
Orange Juvenile Court           404
Belchertown Juvenile Court      374
Name: Court, dtype: int64

In [3]:
# Any disposition of "Nolle Prosequi" for any charge in a particular incident
nw['Incident_NP'] = nw['Disposition']=="Nolle Prosequi"
nw['Incident_NP'] = nw.groupby(['Person ID', 'Offense Date'])['Incident_NP'].transform('max')

# Code cases as tried in a Superior vs District Court. 
nw['Superior'] = False
nw.loc[nw['Court'].str.contains("Superior"), 'Superior'] = True
pd.crosstab(nw['Superior'], nw['Court'])

# Any charge in the incident tried in a Superior court
nw['Incident_Sup'] = nw.groupby(['Person ID', 'Offense Date'])['Superior'].transform('max')

# By Person, incident, charge --> was this charge tried in a superior court
# If this is True, and Superior is False, that's an indication that the charge was nolle prosequi in district court
# and moved to the Superior court. 
nw["counter_id"] = nw.groupby(["Person ID", "Offense Date", "Charge", "Court"]).cumcount() + 1
nw['final_court_sup'] = nw.groupby(['Person ID', 'Offense Date', 'Charge', 'counter_id'])['Superior'].transform('max')

# Offense date --> date
nw['Offense Date'] = pd.to_datetime(nw['Offense Date']).dt.date

# Sort
nw = nw.sort_values(by=['Person ID','Offense Date', 'Charge', 'Court'])
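
The role of `counter_id` in the cell above may be easier to see on a small case. The following is a sketch on hypothetical toy data: a person has two counts of the same charge in District Court, but only one row in Superior Court. Numbering the rows within each court pairs the first District Court count with the Superior Court row and leaves the second count unflagged.

```python
import pandas as pd

# Toy data (hypothetical): two identical counts in District Court,
# one matching row in Superior Court.
toy = pd.DataFrame({
    "Person ID": ["P-1"] * 3,
    "Offense Date": ["2016-03-11"] * 3,
    "Charge": ["LARCENY"] * 3,
    "Court": ["Greenfield District Court", "Greenfield District Court",
              "Franklin Superior Court"],
})
toy["Superior"] = toy["Court"].str.contains("Superior")

keys = ["Person ID", "Offense Date", "Charge"]

# Number repeated counts of the same charge within each court: 1, 2, ...
toy["counter_id"] = toy.groupby(keys + ["Court"]).cumcount() + 1

# The n-th count is treated as ending in Superior Court only if an n-th
# Superior Court row exists for the same person, date, and charge.
toy["final_court_sup"] = (
    toy.groupby(keys + ["counter_id"])["Superior"].transform("max")
)

print(toy[["Court", "counter_id", "Superior", "final_court_sup"]])
```

Which of the two District Court rows gets paired depends only on row order, so this is a counting device rather than a true link between specific records.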

In [4]:
# When a charge is tried in both district and superior court, what are the dispositions at the district court?

nw[['Person ID', 'Court', 'final_court_sup', 'Offense Date', 'Charge', 'Disposition', 'Dispo Date', 
    'Expungeable', 'Incident_NP', 'Incident_Sup']].loc[
    (nw['Incident_Sup']==True) &
    (nw['final_court_sup']==True) &
    (nw['Superior']==False) 
]['Disposition'].value_counts(dropna=False)

Nolle Prosequi                  831
NaN                               7
Dismissed                         5
Delinquent                        5
CLOSED-INDICTED                   4
Case Transferred                  1
Dismissed at Request of Comm      1
Name: Disposition, dtype: int64

In [5]:
# Browse a particular ID that has a 'transfer'-looking case. This one was cherry-picked because the person has only
# 2 rows in the dataframe, so it's easier to focus.
nw[['Person ID', 'Court', 'Superior', 'final_court_sup', 'Offense Date', 'Charge', 'Disposition', 
    'Dispo Date', 'Incident_NP', 'Incident_Sup']].loc[
    (nw['Person ID']=="NW-10255")]

Unnamed: 0,Person ID,Court,Superior,final_court_sup,Offense Date,Charge,Disposition,Dispo Date,Incident_NP,Incident_Sup
47609,NW-10255,Franklin Superior Court,True,True,2016-03-11,"ROBBERY, UNARMED c265 §19(b)",Guilty,2016-09-23,True,True
47608,NW-10255,Greenfield District Court,False,True,2016-03-11,"ROBBERY, UNARMED c265 §19(b)",Nolle Prosequi,2016-05-10,True,True


In [6]:
#browse -- when does this apparent transfer not involve a Nolle Prosequi disposition

#nw[['Person ID', 'Court', 'final_court_sup', 'Offense Date', 'Charge', 'Disposition', 'Dispo Date', 
#    'Expungeable', 'Incident_NP', 'Incident_Sup']].loc[
#    (nw['Incident_Sup']==True) &
#    (nw['final_court_sup']==True) &
#    (nw['Superior']==False) &
#    (nw['Disposition']!="Nolle Prosequi")
#]

#browse the people from above
#nw[['Person ID', 'Court', 'final_court_sup', 'Offense Date', 'Charge', 'Disposition', 'Dispo Date', 
#    'Expungeable', 'Incident_NP', 'Incident_Sup']].loc[nw['Person ID'].isin(('NW-5248', 'NW-7930', 'NW-4638', 'NW-14326'))]

In [7]:
# Create a version of the data, dropping any of the rows that look like a "transfer." Drop the District court version of 
# any charge that ended up in a Superior court. 

nw_no_transfers = nw[~(
    (nw['final_court_sup']==True) &
    (nw['Superior']==False) 
)].copy()
len(nw) - len(nw_no_transfers) 

854

In [8]:
# Does this change how many are eligible for expungement? (No)

nw_no_transfers['num_offenses'] = nw_no_transfers.groupby('Person ID')['Person ID'].transform('count')


x_nt = nw_no_transfers.loc[
    (nw_no_transfers['num_offenses']==1) &
    (nw_no_transfers['Expungeable'] != 'No') &
    (~nw_no_transfers['Age at Offense'].isnull()) &
    (nw_no_transfers['Age at Offense']<21)
]

People_eligible_no_transfers = x_nt['Person ID'].nunique()

In [9]:
nw['num_offenses']=nw.groupby('Person ID')['Person ID'].transform('count')
x = nw.loc[
    (nw['num_offenses']==1) &
    (nw['Expungeable'] != 'No') &
    (~nw['Age at Offense'].isnull()) &
    (nw['Age at Offense']<21)
]

People_eligible = x['Person ID'].nunique()
People_eligible

print(People_eligible, "\n")
print(x.Disposition.value_counts(dropna=False))

549 

c276s87 finding                     154
Dismissed at Request of Comm         96
Responsible                          80
Nolle Prosequi                       50
Dismissed Prior to Arraignment       42
Dismissed                            32
Dismissed on Payment                 28
Continued w/o Finding                26
NaN                                  25
Guilty                                6
Charge Handled as a Civil Charge      3
Dismissed by Court                    3
Delinquent                            3
Responsible Filed                     1
Name: Disposition, dtype: int64


In [10]:
People_eligible == People_eligible_no_transfers

True
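
Equal counts do not by themselves show that the same people are eligible in both versions of the data, so a set comparison is a slightly stronger check. The sketch below uses hypothetical toy data and a hypothetical helper, `eligible_ids`, that mirrors the filter used in the cells above. In this toy case, dropping the transfer row does change the result, which is the scenario the real data turned out not to contain.

```python
import pandas as pd

def eligible_ids(df):
    """Person IDs with exactly one row, a charge not marked 'No', and age < 21."""
    n_rows = df.groupby("Person ID")["Person ID"].transform("count")
    mask = (
        (n_rows == 1)
        & (df["Expungeable"] != "No")
        & (df["Age at Offense"] < 21)  # NaN ages compare as False
    )
    return set(df.loc[mask, "Person ID"])

# Toy data (hypothetical): P-1 has a District row plus its Superior "transfer".
toy = pd.DataFrame({
    "Person ID": ["P-1", "P-1", "P-2"],
    "Superior": [False, True, False],
    "final_court_sup": [True, True, False],
    "Expungeable": ["Yes", "Yes", "Yes"],
    "Age at Offense": [19, 19, 20],
})

# Drop the District Court version of any charge that ended up in Superior Court.
toy_no_transfers = toy[~(toy["final_court_sup"] & ~toy["Superior"])]

before = eligible_ids(toy)
after = eligible_ids(toy_no_transfers)

print("newly eligible:", after - before)
print("no longer eligible:", before - after)
```

Applied to `nw` and `nw_no_transfers`, two empty differences would confirm that the eligible populations are identical, not just equal in size.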