# Pandas 101 from an app dev perspective

This notebook contains my notes on the Pandas library and what I have learned about it. I approach the topic from an application developer's perspective rather than a data scientist's.

I look at Pandas as a way to help build better applications, either at runtime or by helping me understand the data the application has to use.

This notebook is exercise focused, almost a cookbook. It tries to answer questions like 'How do I use Pandas to do *fill in the blank*?' or 'I need to understand what my data looks like when *fill in the blank*.'

# Table of Contents

- <a href="#1">1. What is Pandas</a>
- <a href="#2">2. Environment</a>

## <a id='1'> What is Pandas </a>


## <a id='2'> Environment </a>


In [1]:
import pandas as pd

In [13]:
# Pass explicit column dtypes: the IL data is large enough that pandas otherwise warns
# that we should set the dtype option, because it cannot settle on the dtypes until
# the entire file has been read in. Smaller files do not need this; for larger files
# it is a good idea.
# I found the types by reading in the file once and then executing:
# il.dtypes
column_dtypes = {
    'id':                    object,
    'state':                 object,
    'stop_date':             object,
    'stop_time':             object,
    'location_raw':          object,
    'county_name':           object,
    'county_fips':           float,
    'fine_grained_location': object,
    'police_department':     object,
    'driver_gender':         object,
    'driver_age_raw':        float,
    'driver_age':            float,
    'driver_race_raw':       object,
    'driver_race':           object,
    'violation_raw':         object,
    'violation':             object,
    'search_conducted':      bool,
    'search_type_raw':       object,
    'search_type':           object,
    'contraband_found':      bool,
    'stop_outcome':          object,
    'is_arrested':           float,
    'stop_duration':         float,
    'vehicle_type':          object,
    'drugs_related_stop':    object,
    'district':              object,
}

%time il = pd.read_csv('./data/police-il/IL-clean.csv', dtype=column_dtypes)

CPU times: user 36.2 s, sys: 6.88 s, total: 43.1 s
Wall time: 45.8 s
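
The workflow described in the comments above (read the data, look at the dtypes, then pass them back through `dtype=`) can be tried on a small scale. The sketch below is only an illustration: the inline CSV text and its three columns are a made-up stand-in for the real file, and it reads a sample with `nrows` instead of the whole file, which is my assumption and not what the cell above did. A sample can also mislead, for example when a column has no missing values in the first rows but does later on, so treat the inferred types as a starting point.

```python
import io

import pandas as pd

# Hypothetical stand-in for a large CSV; the real data would be a file on disk.
csv_text = (
    "id,county_fips,search_conducted\n"
    "IL-1,17031,False\n"
    "IL-2,,True\n"
    "IL-3,17043,False\n"
)

# Read only the first rows to see which dtypes pandas infers.
sample = pd.read_csv(io.StringIO(csv_text), nrows=2)
inferred = sample.dtypes.to_dict()

# Reuse the inferred dtypes for the full read so pandas does not have to guess.
full = pd.read_csv(io.StringIO(csv_text), dtype=inferred)
print(full.dtypes)
```

The missing value in `county_fips` is why that column comes back as a float rather than an integer, which matches the `float` entries in `column_dtypes`.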


In [14]:
il.head()

Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,search_conducted,search_type_raw,search_type,contraband_found,stop_outcome,is_arrested,stop_duration,vehicle_type,drugs_related_stop,district
0,IL-2004-000001,IL,2004-01-01,00:02,ILLINOIS STATE POLICE 17,,,17,Illinois State Police,F,...,False,,,False,Written Warning,,,Olds 2000,,ILLINOIS STATE POLICE 17
1,IL-2004-000002,IL,2004-01-01,00:07,ILLINOIS STATE POLICE 07,,,7,Illinois State Police,M,...,False,,,False,Written Warning,,,Linc 1990,,ILLINOIS STATE POLICE 07
2,IL-2004-000003,IL,2004-01-01,00:14,ILLINOIS STATE POLICE 11,,,11,Illinois State Police,M,...,False,,,False,Citation,,,Chev 1996,,ILLINOIS STATE POLICE 11
3,IL-2004-000004,IL,2004-01-01,00:15,ILLINOIS STATE POLICE 03,Cook County,17031.0,3,Illinois State Police,F,...,False,,,False,Citation,,,Buic 1992,,ILLINOIS STATE POLICE 03
4,IL-2004-000005,IL,2004-01-01,00:15,ILLINOIS STATE POLICE 09,,,9,Illinois State Police,F,...,False,,,False,Citation,,,Olds 1996,,ILLINOIS STATE POLICE 09


In [15]:
il.loc[::100000]

Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,search_conducted,search_type_raw,search_type,contraband_found,stop_outcome,is_arrested,stop_duration,vehicle_type,drugs_related_stop,district
0,IL-2004-000001,IL,2004-01-01,00:02,ILLINOIS STATE POLICE 17,,,17,Illinois State Police,F,...,False,,,False,Written Warning,,,Olds 2000,,ILLINOIS STATE POLICE 17
100000,IL-2004-100001,IL,2004-04-06,21:15,ILLINOIS STATE POLICE 02,,,2,Illinois State Police,M,...,False,,,False,Citation,,,Merc 1999,,ILLINOIS STATE POLICE 02
200000,IL-2004-200001,IL,2004-06-26,05:00,ILLINOIS STATE POLICE 03,Cook County,17031.0,3,Illinois State Police,M,...,False,,,False,Written Warning,,,Chev 1994,,ILLINOIS STATE POLICE 03
300000,IL-2004-300001,IL,2004-10-04,10:15,ILLINOIS STATE POLICE 07,,,7,Illinois State Police,M,...,False,,,False,Citation,,,Toyt 2001,,ILLINOIS STATE POLICE 07
400000,IL-2005-014190,IL,2005-01-16,20:11,ILLINOIS STATE POLICE 22,,,22,Illinois State Police,F,...,False,,,False,Citation,,,Pont 1999,False,ILLINOIS STATE POLICE 22
500000,IL-2005-114190,IL,2005-04-12,15:13,ILLINOIS STATE POLICE 11,,,11,Illinois State Police,F,...,False,,,False,Citation,,,Jeep 1999,False,ILLINOIS STATE POLICE 11
600000,IL-2005-214190,IL,2005-06-20,21:00,ILLINOIS STATE POLICE 18,,,18,Illinois State Police,F,...,False,,,False,Citation,,,Olds 1992,False,ILLINOIS STATE POLICE 18
700000,IL-2005-314190,IL,2005-09-11,23:25,ILLINOIS STATE POLICE 17,,,17,Illinois State Police,M,...,False,,,False,Written Warning,,,Ford 2002,False,ILLINOIS STATE POLICE 17
800000,IL-2005-414190,IL,2005-12-03,09:57,ILLINOIS STATE POLICE 15,,,15,Illinois State Police,M,...,False,,,False,Written Warning,,,Buic 2005,False,ILLINOIS STATE POLICE 15
900000,IL-2006-073706,IL,2006-03-10,21:08,ILLINOIS STATE POLICE 12,,,12,Illinois State Police,M,...,False,,,False,Written Warning,,,Chry 2004,,ILLINOIS STATE POLICE 12
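
The `il.loc[::100000]` call above uses a slice with a step to take every 100,000th row, which is a cheap way to skim a large frame from start to end. Here is a minimal sketch of the same idea on a tiny, made-up DataFrame (the `value` column is hypothetical), so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical ten-row frame with the default 0..9 index.
df = pd.DataFrame({"value": range(10)})

# Every third row, starting from the first one.
every_third = df.loc[::3]
print(every_third.index.tolist())  # [0, 3, 6, 9]

# iloc gives the same rows here because the slice has no start or stop.
same_rows = df.iloc[::3]
```

If a random peek is more useful than an evenly spaced one, `df.sample(n=3)` is the alternative I would reach for.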


In [18]:
il.loc[il['search_conducted'] & il['contraband_found']]

Unnamed: 0,id,state,stop_date,stop_time,location_raw,county_name,county_fips,fine_grained_location,police_department,driver_gender,...,search_conducted,search_type_raw,search_type,contraband_found,stop_outcome,is_arrested,stop_duration,vehicle_type,drugs_related_stop,district
7,IL-2004-000008,IL,2004-01-01,00:21,ILLINOIS STATE POLICE 15,,,15,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Citation,,,Chev 1998,TRUE,ILLINOIS STATE POLICE 15
28,IL-2004-000029,IL,2004-01-01,00:48,ILLINOIS STATE POLICE 16,,,16,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Citation,,,Pont 1994,TRUE,ILLINOIS STATE POLICE 16
48,IL-2004-000049,IL,2004-01-01,01:20,ILLINOIS STATE POLICE 06,,,06,Illinois State Police,M,...,True,Custodial Arrest,Incident to Arrest,True,Citation,,,Chev 1995,TRUE,ILLINOIS STATE POLICE 06
81,IL-2004-000082,IL,2004-01-01,02:10,ILLINOIS STATE POLICE 11,,,11,Illinois State Police,M,...,True,Custodial Arrest,Incident to Arrest,True,Citation,,,Kia 2002,TRUE,ILLINOIS STATE POLICE 11
97,IL-2004-000098,IL,2004-01-01,02:48,ILLINOIS STATE POLICE 15,,,15,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Citation,,,Chev 1989,TRUE,ILLINOIS STATE POLICE 15
106,IL-2004-000107,IL,2004-01-01,03:10,ILLINOIS STATE POLICE 05,,,05,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Citation,,,Dodg 1994,TRUE,ILLINOIS STATE POLICE 05
143,IL-2004-000144,IL,2004-01-01,06:34,ILLINOIS STATE POLICE 16,,,16,Illinois State Police,M,...,True,Custodial Arrest,Incident to Arrest,True,Citation,,,Niss 1996,FALSE,ILLINOIS STATE POLICE 16
147,IL-2004-000148,IL,2004-01-01,06:50,ILLINOIS STATE POLICE 09,,,09,Illinois State Police,M,...,True,Custodial Arrest,Incident to Arrest,True,Citation,,,Cadi 2002,TRUE,ILLINOIS STATE POLICE 09
409,IL-2004-000410,IL,2004-01-01,15:20,ILLINOIS STATE POLICE 08,,,08,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Written Warning,,,Merc 1991,TRUE,ILLINOIS STATE POLICE 08
605,IL-2004-000606,IL,2004-01-01,20:28,ILLINOIS STATE POLICE 13,,,13,Illinois State Police,M,...,True,Incident to Arrest,Incident to Arrest,True,Citation,,,Gmc 1994,TRUE,ILLINOIS STATE POLICE 13
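
The filter above combines two boolean columns into one row mask. A small sketch of how that works, using a made-up four-row frame (the values are invented for illustration, only the column names come from the IL data):

```python
import pandas as pd

# Hypothetical data; only the column names match the IL file.
stops = pd.DataFrame({
    "search_conducted": [True, True, False, False],
    "contraband_found": [True, False, False, False],
})

# & combines the two boolean Series element by element.
mask = stops["search_conducted"] & stops["contraband_found"]

# The mask selects rows; its sum counts how many rows matched.
hits = stops.loc[mask]
n_hits = int(mask.sum())

# Share of searches that found contraband.
hit_rate = n_hits / int(stops["search_conducted"].sum())
print(n_hits, hit_rate)
```

Because the mask is itself a Series, the same object answers both "show me the rows" and "how many are there", which is handy when an application only needs the count.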


In [6]:
il.isnull().sum()

id                             0
state                          0
stop_date                      0
stop_time                   2331
location_raw                 266
county_name              4240169
county_fips              4240169
fine_grained_location        266
police_department              0
driver_gender                  0
driver_age_raw                 0
driver_age                  2932
driver_race_raw                0
driver_race                    0
violation_raw                  0
violation                      0
search_conducted               0
search_type_raw          4529411
search_type              4562575
contraband_found               0
stop_outcome                   0
is_arrested              4715031
stop_duration            1286028
vehicle_type                   0
drugs_related_stop       2244026
district                     266
dtype: int64
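
Raw null counts are hard to judge without the row count next to them. For example, `is_arrested` shows 4715031 nulls, which is the same number of rows that `il.shape` reports, so that column is empty. One way to make this visible directly is to look at the fraction of missing values per column. The sketch below uses a made-up frame (invented values, column names borrowed from the IL data) and is just one possible approach.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the column names match the IL file.
df = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "county_name": [None, "Cook County", None, None],
    "is_arrested": [np.nan, np.nan, np.nan, np.nan],
})

# The mean of a boolean frame is the fraction of True values per column.
missing_fraction = df.isnull().mean().sort_values(ascending=False)
print(missing_fraction)

# Columns with no data at all are candidates to drop or ignore.
all_missing = missing_fraction[missing_fraction == 1.0].index.tolist()
```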

In [7]:
il.shape

(4715031, 26)

In [8]:
il.dtypes

id                        object
state                     object
stop_date                 object
stop_time                 object
location_raw              object
county_name               object
county_fips              float64
fine_grained_location     object
police_department         object
driver_gender             object
driver_age_raw           float64
driver_age               float64
driver_race_raw           object
driver_race               object
violation_raw             object
violation                 object
search_conducted            bool
search_type_raw           object
search_type               object
contraband_found            bool
stop_outcome              object
is_arrested              float64
stop_duration            float64
vehicle_type              object
drugs_related_stop        object
district                  object
dtype: object
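
Most of the columns above are `object`, which is how pandas stores strings here. For a column that repeats a handful of values, such as `police_department` in the samples shown earlier, converting to the `category` dtype may reduce memory. This is a sketch on made-up data, not something I measured on the IL file, so the actual savings there are an open question.

```python
import pandas as pd

# Hypothetical column: one department name repeated many times.
dept = pd.Series(["Illinois State Police"] * 3000, name="police_department")

# category stores each distinct value once plus a small integer code per row.
as_category = dept.astype("category")

# deep=True includes the memory held by the string values themselves.
bytes_before = dept.memory_usage(deep=True)
bytes_after = as_category.memory_usage(deep=True)
print(bytes_before, bytes_after)
```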