<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Dictionary-of-harmonized-school-names" data-toc-modified-id="Dictionary-of-harmonized-school-names-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Dictionary of harmonized school names</a></span></li><li><span><a href="#Load-and-prepare-data" data-toc-modified-id="Load-and-prepare-data-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Load and prepare data</a></span></li><li><span><a href="#Harmonize-school-names" data-toc-modified-id="Harmonize-school-names-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Harmonize school names</a></span></li><li><span><a href="#Save" data-toc-modified-id="Save-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Save</a></span></li></ul></div>

**Description**: Loads data on the implementation of the Safe Passage program obtained through a FOIA request, harmonizes the school names according to the conventions used in this project, and saves the processed data.

---

In [1]:
import pickle
from pathlib import Path

import pandas as pd

In [2]:
data_path = Path('../../data')

# Dictionary of harmonized school names
Dictionary with school names as provided in the FOIA response as keys and harmonized names as values (manually matched). If multiple entries were found, the CPS website (http://cps.edu/Pages/safepassage.aspx) and the CPS school locator (http://cps.edu/ScriptLibrary/Map-SchoolLocator/index.html) were used to verify the match.

The only school that could not be matched is "Price".

In [3]:
foia_to_harmonized_names = {
    'Air Force':
    'Air Force HS',
    'Alcott':
    'Alcott Prep HS',
    'Alcott HS':
    'Alcott Prep HS',
    'Aldridge (Altgeld Gardens)':
    'Aldridge/Carver/CICS-Bond/CICS-Hawkins/Dubois',
    'CICS - Lloyd Bond (Altgeld Gardens)':
    'Aldridge/Carver/CICS-Bond/CICS-Hawkins/Dubois',
    'Carver (Altgeld Gardens)':
    'Aldridge/Carver/CICS-Bond/CICS-Hawkins/Dubois',
    'DuBois (Altgeld Gardens)':
    'Aldridge/Carver/CICS-Bond/CICS-Hawkins/Dubois',
    'Larry Hawkins (Altgeld Gardens)':
    'Aldridge/Carver/CICS-Bond/CICS-Hawkins/Dubois',
    'Ames':
    'Marine Acad at Ames',
    'Austin':
    'Austin CCA HS',
    'Bogan':
    'Bogan HS',
    'Bowen':
    'Bowen HS',
    'Brenneman':
    'Brennemann',
    'Brooks':
    'Brooks HS',
    'Back of the Yards':
    'Back of the Yards HS',
    'Cardenas':
    'Cardenas/Castellanos',
    'Chicago Military Academy HS':
    'Chicago Military Acad HS',
    'Clemente':
    'Clemente HS',
    'Corliss':
    'Corliss HS',
    'Crane':
    'Crane Medical HS',
    'CVCA':
    'Chicago Vocational HS',
    'DePriest':
    'De Priest',
    'Dusable':
    'Shabazz - Dusable HS',
    'Dyett':
    'Dyett HS',
    'Douglass':
    'Douglass HS',
    'Dunbar':
    'Dunbar HS',
    'DuSable':
    'Shabazz - Dusable HS',
    'Farragut':
    'Farragut HS',
    'Fenger':
    'Fenger HS',
    'Gage Park':
    'Gage Park HS',
    'Graham HS':
    'Graham, R HS',
    'Hamlin':
    'Hamline/Chavez',
    'Hamline':
    'Hamline/Chavez',
    'Harlan':
    'Harlan HS',
    'Harper':
    'Harper HS',
    'Hirsch':
    'Hirsch HS',
    'Hope':
    'Hope Prep HS',
    'Hyde Park':
    'Hyde Park HS',
    'Hyde Pk':
    'Hyde Park HS',
    'Julian':
    'Julian HS',
    'Kelly':
    'Kelly HS',
    'Gompers':
    'Owens',
    'Kelvyn Park':
    'Kelvyn Park HS',
    'Kenwood':
    'Kenwood HS',
    'Lindblom':
    'Lindblom HS',
    'Manley':
    'Manley HS',
    'Marshall':
    'Marshall, HS',
    'Marshall Middle':
    'Marshall, T',
    'Michele Clark':
    'Clark HS',
    'Michelle Clark':
    'Clark HS',
    'Morgan Park':
    'Morgan Park HS',
    'Morgan Pk':
    'Morgan Park HS',
    'Nicholson':
    'Nicholson Tech Acad',
    'NTA':
    'National Teachers',
    'Orr':
    'Orr HS',
    'Phillips':
    'Phillips HS',
    'Phoenix Military':
    'Phoenix Military ACAD',
    'Randolph ES':
    'Randolph',
    'Richards':
    'Richards HS',
    'Robeson':
    'Robeson HS',
    'Simeon':
    'Simeon HS',
    'Solorio':
    'Solorio HS',
    'Sandoval':
    'Hernandez/Sandoval/Solorio HS',
    'South Shore ES':
    'South Shore',
    'South Shore HS':
    'South Shore Intl HS',
    'S. Shore':
    'South Shore Intl HS',
    'Spencer':
    'Spencer Tech Acad',
    'Spry':
    'Spry HS',
    'Spry Community Links':
    'Spry ES/Spry HS/Telpochcalli/Saucedo/Hammond',
    'TEAM':
    'Team HS',
    'TEAM Englewood':
    'Team HS',
    'Team Englewood':
    'Team HS',
    'Tilden':
    'Tilden HS',
    'Uplift':
    'Uplift HS',
    'Urban Prep':
    'Urban Prep Chtr Bronzeville',
    'Ward, L.':
    'Ward, L',
    'Wells':
    'Wells HS',
    'Wells, I.':
    'Wells, I'
}
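
Because several FOIA spellings map to the same harmonized name, it can help to invert the mapping and review the aliases grouped by harmonized name. The following is a minimal sketch on a small excerpt of the dictionary above; `excerpt`, `aliases_by_harmonized`, and `multi_alias` are illustrative names and not part of the project code.

```python
from collections import defaultdict

# Small excerpt of the mapping above (FOIA name -> harmonized name).
excerpt = {
    'Alcott': 'Alcott Prep HS',
    'Alcott HS': 'Alcott Prep HS',
    'Dusable': 'Shabazz - Dusable HS',
    'DuSable': 'Shabazz - Dusable HS',
    'Bogan': 'Bogan HS',
}

# Invert the mapping: harmonized name -> list of FOIA spellings.
aliases_by_harmonized = defaultdict(list)
for foia_name, harmonized_name in excerpt.items():
    aliases_by_harmonized[harmonized_name].append(foia_name)

# Keep only harmonized names reached by more than one FOIA spelling.
multi_alias = {
    name: sorted(aliases)
    for name, aliases in aliases_by_harmonized.items()
    if len(aliases) > 1
}
print(multi_alias)
```

Running the same inversion on the full `foia_to_harmonized_names` dictionary would list every group of spellings that collapse to one school, which is one way to spot accidental merges.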

# Load and prepare data
Read in all school years as a dictionary with one entry per sheet of the Excel file (i.e., keys are sheet names and therefore school years).

In [4]:
foia_sp = pd.read_excel(
    data_path / 'raw/Safe_Passage_Schools_By_Implementation_Year_8.12.16.xlsx',
    sheet_name=None)

Remove leading and trailing whitespace from the school names.

In [5]:
# In each sheet, the school-name column has the same label as the sheet itself.
for sy in foia_sp:
    foia_sp[sy][sy] = foia_sp[sy][sy].str.strip()
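
Stripping matters because the dictionary lookup used for harmonization is an exact string match. A minimal sketch of the effect, using a one-entry excerpt of the dictionary and a hypothetical cell value with stray whitespace (`raw_name` is an assumption for illustration, not a value taken from the FOIA file):

```python
# Excerpt of the mapping above (FOIA name -> harmonized name).
excerpt = {'Bogan': 'Bogan HS'}

# Hypothetical cell value with stray whitespace around the name.
raw_name = ' Bogan '

# Without stripping, the key is not found and the name passes through unchanged.
unstripped_result = excerpt.get(raw_name, raw_name)

# After stripping, the exact-match lookup succeeds.
clean_name = raw_name.strip()
stripped_result = excerpt.get(clean_name, clean_name)

print(repr(unstripped_result), repr(stripped_result))
```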

# Harmonize school names
Replace each school name with its harmonized name if the dictionary has an entry for it; otherwise leave the name unchanged.

In [6]:
for sy in foia_sp.keys():
    foia_sp[sy][sy] = foia_sp[sy][sy].apply(
        lambda x: foia_to_harmonized_names.get(x, x))
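
The fallback behavior of `dict.get(x, x)` can be illustrated on a plain list. This is a sketch under stated assumptions: `excerpt` is a two-entry excerpt of the dictionary above, and `names` is an illustrative list rather than actual FOIA data.

```python
# Excerpt of the mapping above (FOIA name -> harmonized name).
excerpt = {'Bogan': 'Bogan HS', 'Hyde Pk': 'Hyde Park HS'}

# Illustrative input; "Price" has no entry in the dictionary.
names = ['Bogan', 'Hyde Pk', 'Price']

# Look each name up and fall back to the name itself when there is no entry.
harmonized = [excerpt.get(name, name) for name in names]

# Names without a dictionary entry pass through unchanged.
unmatched = [name for name in names if name not in excerpt]

print(harmonized)
print(unmatched)
```

Note that a list like `unmatched` would contain both names that are already in harmonized form and names that could not be matched at all, so it would still need manual review.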

# Save

In [7]:
with (data_path / 'processed/foia_sp.pkl').open('wb') as f:
    pickle.dump(foia_sp, f)
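
To read the result back in a later step, the pickle can be loaded with `pickle.load`. The round trip below is a minimal sketch that uses a temporary directory and a small stand-in object instead of the real `foia_sp`; the sheet name `'sheet_a'` and the file name `toy.pkl` are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for foia_sp: hypothetical sheet name -> list of school names.
toy = {'sheet_a': ['Bogan HS', 'Price']}

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = Path(tmp_dir) / 'toy.pkl'

    # Write the object in binary mode, as in the cell above.
    with out_path.open('wb') as f:
        pickle.dump(toy, f)

    # Read it back in binary mode.
    with out_path.open('rb') as f:
        restored = pickle.load(f)

print(restored == toy)
```

Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.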