# Dashboard
* This notebook contains visualizations that could be incorporated into the final dashboard.

In [1]:
import ast
import itertools
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import regex as re
import seaborn as sns

from scipy.stats import chi2_contingency

In [4]:
df = pd.read_csv("parsed1.csv", index_col = 0)
df.drop(columns = ['docket_no'], inplace = True)

In [5]:
# convert string to datetime
#df["offense_date"] = pd.to_datetime(df["offense_date"])
df["arrest_dt"] = pd.to_datetime(df["arrest_dt"])
df["dob"] = pd.to_datetime(df["dob"])
df["bail_date"] = pd.to_datetime(df["bail_date"])
df["prelim_hearing_dt"] = pd.to_datetime(df["prelim_hearing_dt"])

# age column: calendar-year difference from 2020 (ignores whether the birthday has passed yet)
df['age'] = 2020 - df['dob'].dt.year

# public defender column: 1 if public defender, 0 if private defender
df["public_defender"] = df["attorney"].apply(lambda x: 1 if x =='Defender Association of  Philadelphia' else 0)

# convert "offenses" and "parsed_offenses" from the string representation of a list to an actual list
df["offenses"] = df["offenses"].apply(ast.literal_eval)
df["parsed_offenses"] = df["parsed_offenses"].apply(ast.literal_eval)

# zipcode: remove everything after the hyphen; leave non-string values (e.g. NaN) unchanged
df["zipcode_clean"] = df["zip"].apply(lambda x: re.sub('-.*$', '', x) if isinstance(x, str) else x)

# Create column indicating whether the zipcode falls in the 19102-19154 range treated here as Philadelphia
philly_zipcode = {str(item) for item in range(19102, 19155)}
df['philly_zipcode'] = df['zipcode_clean'].apply(lambda x: 1 if x in philly_zipcode else 0)

In [6]:
df.head()

Unnamed: 0,offenses,offense_date,arrest_dt,case_status,arresting_officer,attorney,dob,zip,bail_set_by,bail_amount,bail_paid,bail_date,bail_type,prelim_hearing_dt,prelim_hearing_time,parsed_offenses,age,public_defender,zipcode_clean,philly_zipcode
0,"[Rape Forcible Compulsion, Rape Forcible Compu...",5/28/20,2020-07-27,Active,"Bengochea, William",Defender Association of Philadelphia,1993-05-19,19123,"Bernard, Francis X.",300000,0,2020-07-28,Monetary,2020-07-28,4:49 AM,"[False Imprisonment, Indec Asslt-W/O Cons Of O...",27.0,1,19123,1
1,[Aggravated Assault - Attempts to cause SBI or...,5/4/20,2020-05-04,Active,"Soares, Baldomiro J. Jr.",Defender Association of Philadelphia,1997-05-05,19121,E-Filing Judge,50000,5000,2020-05-05,Monetary,2020-05-04,5:33 PM,[Aggravated Assault - Attempts to cause SBI or...,23.0,1,19121,1
2,"[Simple Assault, Simple Assault, Recklessly En...",2/2/20,2020-02-12,Active,"Jones, James",Defender Association of Philadelphia,1986-11-15,19124,"Stack, Patrick",7500,750,2020-02-13,Monetary,2020-02-13,8:22 AM,"[Recklessly Endangering Another Person, Simple...",34.0,1,19124,1
3,"[Contraband/Controlled Substance, Contraband/C...",2/10/20,2020-02-21,Active,"Balmer, James M.",Defender Association of Philadelphia,1980-12-13,19135,"Bernard, Francis X.",5000,0,2020-02-21,Unsecured,2020-02-21,8:34 PM,"[Contraband/Controlled Substance, Int Poss Con...",40.0,1,19135,1
4,"[Manufacture, Delivery, or Possession With Int...",3/13/20,2020-03-14,Active,"Sima, Raymond",Richard T. Brown Jr.,1997-11-05,19144,"Stack, Patrick",0,0,2020-03-14,ROR,2020-03-14,8:40 AM,"[Conspiracy, Int Poss Contr Subst By Per Not R...",23.0,0,19144,1


## 1. Aggregate bail information for the year 2020

## 2. Visualizations on magistrate information

In [9]:
df['bail_set_by'].value_counts()

O'Brien, James                            2704
Bernard, Francis X.                       2680
Stack, Patrick                            2454
Emergency Arraignment Court Magistrate    1835
E-Filing Judge                            1564
                                          ... 
Means, Rayford A.                            1
Palumbo, Frank                               1
Active Criminal Records Department           1
Whitfield, Stephanie Yvonne                  1
Pierre, Christopher                          1
Name: bail_set_by, Length: 88, dtype: int64
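
The counts above have a long tail: a handful of names account for most cases, while many of the 88 distinct values appear only once. One possible way to prepare this column for a bar chart is to keep the top few names and lump the rest together. The sketch below is only an illustration on made-up names; the helper `top_magistrates`, the cutoff `n`, and the `"Other"` label are assumptions, not part of the original analysis.

```python
import pandas as pd

def top_magistrates(series, n=5):
    """Return counts for the n most frequent values, with the rest summed into 'Other'."""
    counts = series.value_counts()          # sorted from most to least frequent
    top = counts.head(n)
    other = counts.iloc[n:].sum()           # total of everything outside the top n
    if other > 0:
        top = pd.concat([top, pd.Series({"Other": other})])
    return top

# Hypothetical stand-in for df['bail_set_by']
toy_names = pd.Series(["A"] * 3 + ["B"] * 2 + ["C", "D"])
summary = top_magistrates(toy_names, n=2)
print(summary)
```

On the real data this could be called as `top_magistrates(df['bail_set_by'])` and passed to a plotting function such as `sns.barplot`.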

## 3. Which neighborhoods are heavily impacted by bail?

The following visualizations show that monetary bail largely impacts people who live in areas with severe poverty.
* Left: Number of monetary bail cases by zip code.
    * (We can regenerate the image for the entire year of 2020 once we have the data.)
* Right: Percentage of the population living below the poverty line by zip code, from the US Census Bureau 2018 ACS 5-year estimates.
    * Data and link to the table are stored at 'data/poverty'.
    * Maybe we can grab 2020 data.

data | visualization
:-------------------------:|:-------------------------:
Case count of monetary bail by zip code | <img src="visualizations/monetary_bail_case_count.png" alt="drawing" width="800"/>
Percentage of population living below the poverty line by zip code | <img src="visualizations/percent_below_poverty.png" alt="drawing" width="800"/>
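
As a rough illustration of how the case counts behind the first map could be computed, the sketch below filters to rows whose `bail_type` is `'Monetary'` and counts them per `zipcode_clean`. The column names come from the dataframe above; the helper name `monetary_cases_by_zip` and the toy rows are assumptions made for the example, and the original image may have been produced differently.

```python
import pandas as pd

def monetary_cases_by_zip(frame):
    """Count rows with bail_type == 'Monetary' per cleaned zip code."""
    monetary = frame[frame["bail_type"] == "Monetary"]
    # groupby drops rows with a missing zip code by default
    return monetary.groupby("zipcode_clean").size().sort_values(ascending=False)

# Hypothetical toy rows standing in for the notebook's df
toy = pd.DataFrame({
    "zipcode_clean": ["19123", "19123", "19121", "19144", None],
    "bail_type": ["Monetary", "Monetary", "Monetary", "ROR", "Monetary"],
})
counts = monetary_cases_by_zip(toy)
print(counts)
```

Restricting the input to rows where `philly_zipcode == 1` first would keep the counts limited to Philadelphia zip codes.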

The following visualizations show that the median household income is higher than the median bail amount (\$25K) in many zip codes.
* Left: Median monetary bail amount by zip code.
    * Median computed only for zip codes that had 6 or more cases.
    * Median bail amount is usually \$25K.
    * We can regenerate the image once we have 2020 data available.
* Right: Median household income by zip code.
    * From the US Census Bureau 2018 ACS 5-year estimates.
    * Data and link to the table are stored at 'data/income'.
    * Maybe we can grab 2020 data.


data | visualization
:-------------------------:|:-------------------------:
Median monetary bail amount by zip code | <img src="visualizations/bail_amount.png" alt="drawing" width="800"/>
Median household income by zip code | <img src="visualizations/income.png" alt="drawing" width="800"/>
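
The rule that medians are computed only for zip codes with 6 or more cases can be sketched as follows. This is a minimal example under stated assumptions: it treats "cases" as monetary-bail rows with a non-missing `bail_amount`, and the helper name `median_bail_by_zip`, the `min_cases` parameter, and the toy values are hypothetical.

```python
import pandas as pd

def median_bail_by_zip(frame, min_cases=6):
    """Median monetary bail per zip code, keeping only zips with at least min_cases rows."""
    monetary = frame[frame["bail_type"] == "Monetary"]
    stats = monetary.groupby("zipcode_clean")["bail_amount"].agg(["count", "median"])
    # Drop zip codes with too few cases for a stable median
    return stats.loc[stats["count"] >= min_cases, "median"]

# Hypothetical toy rows: six cases in 19123, only two in 19121
toy = pd.DataFrame({
    "zipcode_clean": ["19123"] * 6 + ["19121"] * 2,
    "bail_type": ["Monetary"] * 8,
    "bail_amount": [5000, 10000, 25000, 25000, 50000, 300000, 7500, 50000],
})
medians = median_bail_by_zip(toy)
print(medians)
```

Using the median rather than the mean keeps a single very large bail amount, like the 300000 in the toy data, from dominating a zip code's value.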


## 4. Break down by race and gender

## 5. How much Philadelphians paid in bail 