A simple Django project for loading, cleaning and querying Cook County, Illinois convictions data.
This is the preprocessing backend that drives the presentation of https://github.com/sc3/cook-convictions/
Create spatial database
$ createdb convictions
$ psql convictions
> CREATE EXTENSION postgis;
> CREATE EXTENSION postgis_topology;
spatialite convictions.sqlite3 "SELECT InitSpatialMetaData();"
git clone https://github.com/sc3/cook-convictions-data.git
mkvirtualenv convictions
cd cook-convictions-data
pip install -r requirements.txt
cp convictions/settings/dev.example.py convictions/settings/dev.py
# Edit convictions/settings/dev.py to fill in the needed variables
./manage.py syncdb
./manage.py migrate
We use DataMade's usaddress package to parse addresses when anonymizing them to the block level. However, the stable version of the package doesn't support Python 3. As a stopgap, we use a fork that I made that adds rough Python 3 support. Because we install this fork as an editable package, we need to run its training script ourselves.
workon convictions
cd /path/to/virtualenv/src/usaddress
python training/training.py
Load spatial data
Download and unpack the Shapefile version of Chicago Community Areas.
./manage.py load_spatial_data CommunityArea data/Comm_20Areas/CommAreas.shp
Download and unpack the Shapefile version of the Cook County Municipalities data from https://datacatalog.cookcountyil.gov/GIS-Maps/ccgisdata-Municipality/ta8t-zebk
./manage.py load_spatial_data Municipality data/Municipality/Municipality.shp
Download and unpack the Shapefile version of the Census Tract data.
./manage.py load_spatial_data CensusTract data/CensusTracts2010/CensusTractsTIGER2010.shp
Download and unpack the Shapefile version of the Illinois Census Places data.
./manage.py load_spatial_data CensusPlace data/tl_2010_17_place10/tl_2010_17_place10.shp
Download and unpack the Shapefile version of the Illinois County data.
./manage.py load_spatial_data County data/tl_2010_17_county10/tl_2010_17_county10.shp
For generating the Chicago and Cook County border GeoJSON file, we use the cartographic versions of the county and place shapefiles because they remove offshore areas. You'll want to download and unpack those too.
Load census data
./manage.py load_aff_data CensusTract total_population GEO.id2 HD01_VD01 HD02_VD01 data/ACS_10_5YR_B01003_with_ann__totpop__tracts.csv
./manage.py load_aff_data CensusTract per_capita_income GEO.id2 HD01_VD01 HD02_VD01 data/ACS_10_5YR_B19301_with_ann__per_capita_income__tracts.csv
./manage.py load_aff_data CensusPlace total_population GEO.id2 HD01_VD01 HD02_VD01 data/ACS_10_5YR_B01003_with_ann__totpop__places.csv
./manage.py load_aff_data CensusPlace per_capita_income GEO.id2 HD01_VD01 HD02_VD01 data/ACS_10_5YR_B19301_with_ann__per_capita_income__places.csv
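For orientation, here is a rough sketch of reading one of these American FactFinder CSV files in Python. It is not the `load_aff_data` implementation. It assumes that `GEO.id2` holds the geographic identifier, `HD01_VD01` the estimate and `HD02_VD01` the margin of error, and that the export may carry a second header row of descriptive labels. The function name `read_aff_values` and the sample rows are made up for illustration.

```python
import csv
import io

# Made-up sample rows in the assumed layout of an American FactFinder export.
SAMPLE = """GEO.id,GEO.id2,GEO.display-label,HD01_VD01,HD02_VD01
Id,Id2,Geography,Estimate; Total,Margin of Error; Total
1400000US17031010100,17031010100,"Census Tract 101, Cook County, Illinois",4000,350
1400000US17031010201,17031010201,"Census Tract 102.01, Cook County, Illinois",7000,500
"""


def read_aff_values(csv_file, id_col, value_col, moe_col):
    """Yield (geo_id, value, margin_of_error) tuples from an AFF-style CSV.

    Hypothetical helper. Real exports can contain non-numeric annotations
    in the value columns, which this sketch does not handle.
    """
    for row in csv.DictReader(csv_file):
        geo_id = row[id_col]
        if not geo_id.isdigit():
            # Assumed: a non-numeric id marks the descriptive label row.
            continue
        yield geo_id, int(row[value_col]), int(row[moe_col])


rows = list(
    read_aff_values(io.StringIO(SAMPLE), "GEO.id2", "HD01_VD01", "HD02_VD01")
)
for geo_id, value, moe in rows:
    print(geo_id, value, moe)
```

The column names are passed in as arguments, mirroring the order in which they appear on the command lines above.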
Aggregate census data to Chicago Community Areas
./manage.py flag_chicago_msa_places data/tl_2010_17_place10_chicago_msa.csv
Identify suburbs in Cook County
Load raw dispositions data
This command will also fix known issues with columns being shifted in some rows due to bad escaping of quoted columns in the raw CSV file.
Note that the --delete flag removes any previous records.
./manage.py load_dispositions_csv --delete data/Criminal_Convictions_ALLCOOK_05-09.csv
Populate clean disposition records
Note that the --delete flag removes any previous records.
./manage.py create_dispositions --delete
Geocode disposition records
Detect Community Area and Census Place boundaries
Create convictions records from the dispositions
./manage.py create_convictions --delete
Export Community Area and Census Place GeoJSON
./manage.py export_model_geojson CommunityArea > community_areas.json
./manage.py export_model_geojson CensusPlace > suburbs.json
Export most common charges overall
./manage.py most_common_statutes > top_statutes.csv
Export most common charges by geography
./manage.py most_common_statutes_by_geo --model CommunityArea > top_statutes_by_community_area.csv
./manage.py most_common_statutes_by_geo --model CensusPlace > export/top_statutes_by_suburb.csv
Extract Chicago and Cook County's border from a shapefile
./manage.py border_geojson_from_shp data/gz_2010_17_160_00_500k/gz_2010_17_160_00_500k.shp data/gz_2010_us_050_00_500k/gz_2010_us_050_00_500k.shp > chicago_cook_borders.json
Export convictions by age bucket
./manage.py export_age_json > convictions_by_age.json
Export disposition data
Export Disposition model records to CSV. Anonymize the data by dropping personal identifier fields and converting address fields to the block level. For example, an address number of "2707" would be converted to "2700".
./manage.py export_public_data > dispositions.csv
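The block-level conversion of address numbers can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: the function name `anonymize_address_number` is hypothetical, and it assumes that a block is the address number rounded down to the nearest hundred.

```python
def anonymize_address_number(number):
    """Round a street address number down to its block.

    Hypothetical helper for illustration. Assumes the number is a string
    of digits and that a block spans one hundred address numbers.
    """
    value = int(number)
    return str(value - (value % 100))


print(anonymize_address_number("2707"))  # 2700
```

Address numbers that contain letters or fractions would need extra handling that this sketch leaves out.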
Export table of felony convictions
Export a CSV table of felony convictions by class and year, mirroring the format of the data at https://performance.cookcountyil.gov/Public-Safety/Number-Of-Felony-Cases-Filed-By-Felony-Class/kcfs-dufb
Export a count of cases that ended in a felony conviction. These cases may include a charge that started as a misdemeanor but was later amended to a felony.
Export a count of cases where there was always a felony charge. That is, the charges were filed as felonies and were never amended to a different type.
./manage.py export_cases_by_class --filter felony_always
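The difference between the two counts can be illustrated with a small Python sketch. It is not the command's implementation. It assumes each case can be reduced to a pair of initial and final charge classes, and the class codes in `FELONY_CLASSES` as well as all function names are hypothetical.

```python
# Assumed codes: "M" (murder), "X" and "1" through "4" are felony classes,
# while "A", "B" and "C" are misdemeanor classes.
FELONY_CLASSES = {"M", "X", "1", "2", "3", "4"}


def is_felony(charge_class):
    """Return True if the charge class code is a felony class."""
    return charge_class in FELONY_CLASSES


def ended_as_felony(initial_class, final_class):
    """The case ended in a felony, whatever class it started as."""
    return is_felony(final_class)


def always_felony(initial_class, final_class):
    """The case was filed as a felony and was still a felony at the end."""
    return is_felony(initial_class) and is_felony(final_class)


# Made-up (initial class, final class) pairs.
cases = [("A", "4"), ("2", "2"), ("3", "A")]

ended_count = sum(ended_as_felony(i, f) for i, f in cases)
always_count = sum(always_felony(i, f) for i, f in cases)
print(ended_count, always_count)  # 2 1
```

The first pair counts only toward the first total because it started as a misdemeanor, which is the situation described above.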
Export table of how charge classes were amended
Or, as percentages (which is probably easier for seeing trends)
./manage.py export_cases_class_change --pct
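To show why percentages make trends easier to read, here is a rough Python sketch that turns counts of class changes into percentages of each initial class. It is an assumption that the percentages are taken per initial class; the command may normalize differently. The function name `class_change_percentages` and the sample data are made up.

```python
from collections import Counter


def class_change_percentages(changes):
    """Map (initial, final) class pairs to a percentage of the initial class.

    Hypothetical helper: `changes` is an iterable of
    (initial class, final class) tuples, one per case.
    """
    counts = Counter(changes)
    totals = Counter()
    for (initial, _final), n in counts.items():
        totals[initial] += n
    return {
        (initial, final): 100.0 * n / totals[initial]
        for (initial, final), n in counts.items()
    }


# Made-up sample: four cases filed as class 4, one filed as class A.
sample = [("4", "4"), ("4", "4"), ("4", "A"), ("4", "A"), ("A", "A")]
percentages = class_change_percentages(sample)
for pair, pct in sorted(percentages.items()):
    print(pair, pct)
```

With the sample data, half of the cases filed as class 4 stay class 4 and half end as class A, regardless of how many cases were filed in each class.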
Export table of drug convictions
./manage.py export_drug_stats drug_by_class > export/drug_by_class.csv
Export table of top community areas by DUI
./manage.py export_dui_convictions_by_geo --model CommunityArea --count 20 > export/top_dui_community_areas.csv
Creating a list of suburban places
It's hard to define Chicago's suburbs. I made the decision to define these as Census Places in the counties that are part of Chicago's Metropolitan Statistical Area:
I created a list of these census places by bringing the TIGER shapefile for Illinois counties into QGIS. I pared this down to the counties above. Then, I used the "Join Attributes by Location" vector data management tool to create a shapefile of only census places within these counties. Finally, I extracted the attributes from the shapefile as a CSV like this:
ogr2ogr -f CSV tl_2010_17_place10_chicago_msa.csv tl_2010_17_place10_chicago_msa/tl_2010_17_place10_chicago_msa.shp
Loading conviction places from dispositions
Because we added places mid-process, I didn't want to re-create Conviction records. I wrote a one-off management command to copy the places from the dispositions:
- Boundaries - Community Areas (current)
- Cook County Municipalities
- Boundaries - Census Tracts - 2010
- 2010 Illinois Census Place TIGER Shapefile
- 2010 Illinois County TIGER Shapefile
- 2010 Census Cartographic Boundary Shapefile for Counties
- 2010 Census Cartographic Boundary Shapefile for Places <https://www.census.gov/geo/maps-data/data/cbf/cbf_place.html>
- 2010 ACS 5-year Estimates "TOTAL POPULATION" (B01003) for Cook County Census Tracts
- 2010 ACS 5-year Estimates "TOTAL POPULATION" (B01003) for Illinois Census Places
- 2010 ACS 5-year Estimates "PER CAPITA INCOME IN THE PAST 12 MONTHS (IN 2010 INFLATION-ADJUSTED DOLLARS)" (B19301) for Cook County Census Tracts
- 2010 ACS 5-year Estimates "PER CAPITA INCOME IN THE PAST 12 MONTHS (IN 2010 INFLATION-ADJUSTED DOLLARS)" (B19301) for Illinois Census Places