# Mapping schools with >90% students of color

In [328]:
%matplotlib inline
import matplotlib.pyplot as plt
import geopandas as gpd
import numpy as np
import pandas as pd
import folium

After working with the small group of schools in Massachusetts that enroll over 90% students of color in our analysis, I thought it'd be interesting to put these 163 schools on a map with some information about their staff demographics. 

First, I read in the 'schools_w_residuals' file that we generated during our original analysis and our table of school addresses.

In [125]:
schools = pd.read_csv("schools_w_residuals.csv", converters={'Zip Code': str})
addresses = pd.read_csv("../student_data/school_addresses.csv", encoding = "ISO-8859-1", converters={'Zip Code': str}, usecols = ["Org Code", "Address 1", "Town", "State", "Zip Code"])

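I filtered the address table down to the schools in our analysis and wrote it out in the layout the Census batch geocoder expects: a CSV with no header row and the columns ID, street address, city, state, and ZIP code.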

In [126]:
schools_match = schools[["Org Code", "Zip Code"]]

In [127]:
addresses = pd.merge(addresses, schools_match, how = 'inner', on = ["Org Code", "Zip Code"])

In [128]:
addresses = addresses.set_index("Org Code")
addresses.head()

Org Code,Address 1,Town,State,Zip Code
35020405,2001 Roosevelt Avenue,Springfield,MA,1104
4200205,21 Notre Dame Avenue,Cambridge,MA,2140
350541,612 Metropolitan Av,Hyde Park,MA,2136
350390,380 Shawmut Avenue,Boston,MA,2118
350548,20 Church St,Back Bay,MA,2116


In [60]:
addresses.to_csv("geocode.csv", header = False)
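
To make the target layout concrete, here is a minimal sketch that writes two of the rows previewed above to an in-memory buffer instead of a file. The `sample` frame and the `geocoder_input` name are illustrative only and are not part of the original analysis.

```python
import io

import pandas as pd

# Two rows copied from the preview above, indexed by Org Code (illustrative sample)
sample = pd.DataFrame(
    {
        "Org Code": [350390, 350548],
        "Address 1": ["380 Shawmut Avenue", "20 Church St"],
        "Town": ["Boston", "Back Bay"],
        "State": ["MA", "MA"],
        "Zip Code": ["2118", "2116"],
    }
).set_index("Org Code")

# header=False drops the column names; the index (Org Code) becomes the ID column
buffer = io.StringIO()
sample.to_csv(buffer, header=False)
geocoder_input = buffer.getvalue()
print(geocoder_input)
```

Each printed line follows the ID, street address, city, state, ZIP order.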

I then read in the geocoded file from the Census and created Latitude and Longitude fields for (nearly) all of our schools.

In [129]:
coord = pd.read_csv("geocoded_addresses.csv", header = None, usecols = [0, 5], names = ["Org Code", "Coordinates"], index_col = "Org Code")

In [131]:
coord = coord["Coordinates"].str.split(',', expand=True)

In [132]:
coord = coord.rename(columns = {0:'Long', 1: 'Lat'})
coord = coord.reset_index()
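
The geocoded file stores each point as a single "longitude,latitude" string, which is why the first split column is named `Long`. A small sketch of the same split on made-up sample rows (the `coord_demo` and `parts` names are hypothetical; the coordinate values are taken from the table shown later in this notebook):

```python
import pandas as pd

# Hypothetical geocoder output: one "longitude,latitude" string per Org Code
coord_demo = pd.DataFrame(
    {"Coordinates": ["-72.60086,42.20262", "-72.60774,42.196575"]},
    index=pd.Index([1370040, 1370025], name="Org Code"),
)

# Split on the comma into two columns, name them, and convert the text to floats
parts = coord_demo["Coordinates"].str.split(",", expand=True)
parts = parts.rename(columns={0: "Long", 1: "Lat"}).astype(float)
print(parts)
```

A quick sanity check for Massachusetts is that every longitude is negative and every latitude is near 42.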

I joined the coordinates to our school dataframe, which contains information about student and staff demographics.

In [135]:
schools_geo = pd.merge(schools, coord, how = "left", on = "Org Code")

In [136]:
schools_geo.head()

Unnamed: 0.1,Unnamed: 0,Zip Code,Org Code,Org Name,Org Type,Year,Job Category,Nonwhite (Num Stu),Total Students,Nonwhite (Num Staff),...,Percent Poverty,% HS Graduates,% College Graduates,Zip Perc White,Percent Nonwhite Residents,In Boston,Residuals,Residuals Group,Long,Lat
0,69,1040,1370040,Holyoke: Kelly Elementary,Public School,2015,All Staff,571.0,586,20.5,...,38.4,77.3,23.4,82.1,17.9,Not Boston,9.11795,High,-72.60086,42.20262
1,73,1040,1370025,Holyoke: Morgan Full Service Community School,Public School,2015,All Staff,391.0,399,13.0,...,38.4,77.3,23.4,82.1,17.9,Not Boston,-3.018515,Low,-72.60774,42.196575
2,74,1040,1370030,Holyoke: William R. Peck School,Public School,2015,All Staff,345.0,371,31.9,...,38.4,77.3,23.4,82.1,17.9,Not Boston,12.319321,High,-72.63068,42.196697
3,75,1040,1370605,Holyoke: Wm J Dean Vocational Technical High,Public School,2015,All Staff,381.0,403,22.0,...,38.4,77.3,23.4,82.1,17.9,Not Boston,1.970393,High,-72.626724,42.184185
4,76,1040,4530005,Holyoke Community Charter (District): Holyoke ...,Charter School,2015,All Staff,659.0,704,30.4,...,38.4,77.3,23.4,82.1,17.9,Not Boston,20.844005,High,-72.63003,42.187336
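
Because this is a left join, any school the geocoder could not match keeps its row but gets empty `Long` and `Lat` values. One way to list those schools, sketched here on toy data rather than the real tables, is the `indicator=True` option of `pd.merge`. The frames and school names below are invented for illustration.

```python
import pandas as pd

# Toy stand-ins for the schools table and the geocoded coordinates
schools_demo = pd.DataFrame({"Org Code": [1, 2, 3], "Org Name": ["A", "B", "C"]})
coord_demo = pd.DataFrame(
    {"Org Code": [1, 3], "Long": [-72.6, -71.1], "Lat": [42.2, 42.3]}
)

# indicator=True adds a "_merge" column recording where each row was found
merged = pd.merge(schools_demo, coord_demo, how="left", on="Org Code", indicator=True)

# "left_only" rows are schools with no geocoded coordinates
unmatched = merged.loc[merged["_merge"] == "left_only", "Org Name"].tolist()
print(unmatched)
```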


In [205]:
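# dropna() with no arguments removes rows with a missing value in any column, not only missing coordinates
schools_geo = schools_geo.dropna()
# Renumber the rows from 0 so they line up with the list of locations built later
schools_geo = schools_geo.reset_index(drop=True)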

In [241]:
schools_geo["Lat"] = schools_geo["Lat"].astype(float)
schools_geo["Long"] = schools_geo["Long"].astype(float)
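
If the goal is only to drop schools that could not be geocoded, a possible alternative is to restrict `dropna` to the coordinate columns. This is a sketch on invented data, not what the notebook ran; the `demo` frame and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Invented rows: B is missing a non-coordinate field, C is missing coordinates
demo = pd.DataFrame(
    {
        "Org Name": ["A", "B", "C"],
        "Nonwhite Leader": ["Yes", None, "No"],
        "Long": ["-72.6", "-71.1", np.nan],
        "Lat": ["42.2", "42.3", np.nan],
    }
)

# Plain dropna() removes both B and C
all_complete = demo.dropna()

# subset= keeps B, because only the coordinate columns are checked
has_coords = demo.dropna(subset=["Lat", "Long"]).reset_index(drop=True)
has_coords = has_coords.astype({"Lat": float, "Long": float})
print(len(all_complete), len(has_coords))
```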

I created a variable to color code the schools on the map based on percent of staff members of color.

In [324]:
schools_geo["Alt Color"] = "red"
# Assign through .loc so each update modifies schools_geo itself rather than a temporary copy
staff_pct = schools_geo["Perc Nonwhite Staff"]
schools_geo.loc[staff_pct >= 20.0, "Alt Color"] = "orange"
schools_geo.loc[staff_pct >= 40.0, "Alt Color"] = "yellow"
schools_geo.loc[staff_pct >= 60.0, "Alt Color"] = "lightgreen"
schools_geo.loc[staff_pct >= 80.0, "Alt Color"] = "green"

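The same thresholds can also be expressed as a single binning step with `pd.cut`. The sketch below uses made-up percentages; `pct`, `bins`, `labels`, and `colors` are illustrative names, and `right=False` is chosen so that each bin includes its lower edge, matching the `>=` comparisons.

```python
import pandas as pd

# Made-up staff percentages, including values that sit exactly on a threshold
pct = pd.Series([5.0, 20.0, 39.9, 40.0, 61.2, 80.0, 100.0])

# Bin edges at 20/40/60/80; right=False makes each bin [lower, upper)
bins = [-float("inf"), 20, 40, 60, 80, float("inf")]
labels = ["red", "orange", "yellow", "lightgreen", "green"]
colors = pd.cut(pct, bins=bins, labels=labels, right=False).astype(str)
print(colors.tolist())
```

Keeping the edges and labels in two lists makes it easy to change the color scheme without editing several assignment lines.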

I also created a column with the full text I wanted to display for each school's popup. I'm guessing that there's a nicer way to do this. 

In [318]:
schools_geo["Perc Nonwhite Staff"] = schools_geo["Perc Nonwhite Staff"].round(1)
staff_text = " (Percent Staff Members of Color: " + schools_geo["Perc Nonwhite Staff"].map(str) + "%"
leader_text = " | School Leader of Color? " + schools_geo["Nonwhite Leader"].map(str) + ")"
schools_geo["Full Popup"] = schools_geo["Org Name"] + staff_text + leader_text
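
One possibly tidier option is to build the popup text with an f-string inside a small row-wise helper. This is only a sketch: `popup_text` is a hypothetical function, and the school name and the "Yes" leader value are invented sample data.

```python
import pandas as pd


def popup_text(row):
    """Build the popup string for one school (hypothetical helper)."""
    return (
        f"{row['Org Name']} "
        f"(Percent Staff Members of Color: {row['Perc Nonwhite Staff']:.1f}% | "
        f"School Leader of Color? {row['Nonwhite Leader']})"
    )


# One invented school with the three columns the popup needs
demo = pd.DataFrame(
    {
        "Org Name": ["Example School"],
        "Perc Nonwhite Staff": [31.94],
        "Nonwhite Leader": ["Yes"],
    }
)

# axis=1 passes each row to popup_text; :.1f handles the rounding in the text
demo["Full Popup"] = demo.apply(popup_text, axis=1)
print(demo["Full Popup"].iloc[0])
```

Formatting with `:.1f` rounds only the displayed text, so the underlying percentage column can stay unrounded.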

Finally, I made a list of locations from my columns of coordinates and looped through them to add markers to the folium map. This website was a really helpful guide: https://georgetsilva.github.io/posts/mapping-points-with-folium/

In [319]:
locations = schools_geo[['Lat', 'Long']]
loclist = locations.values.tolist()

Our final map shows every school in MA that enrolls >90% students of color. Each marker is color-coded by the percentage of staff of color in the building, and its popup provides a bit more information about the school.

In [326]:
m = folium.Map(location=[42, -72], tiles='cartodbpositron', zoom_start=8, max_zoom=15, min_zoom=7)
# Add one circle marker per school, colored by percent of staff of color
for location, color, text in zip(loclist, schools_geo["Alt Color"], schools_geo["Full Popup"]):
    popup = folium.Popup(text, parse_html=True)
    folium.CircleMarker(location, radius=7, color=color, popup=popup).add_to(m)

m.save("ma_schools.html")
m