# The Battle of the Neighborhoods - Week 2

## Introduction: Business Problem

### Background

Toronto is the biggest city in Canda. It is the provincial capital of Ontario and the most populous city in Canada, with a population of 2,731,571 in 2016. Toroto is a truly internal business center in North America and financial capital of Canada. 

Toronto is also the largest centre of education, research and innovation in Canada. The education system combines the Public and Privates schools in Toronto, both elementary and secondary, All schools take their curricular mandate from the Ontario Ministry of Education.

There are four types of school boards in Ontario. Depends on individual student language, religious background or choice, students can attend English Public, English Catholic, French Public, or French Catholic schools.

Publicly funded education is divided into three stages: early childhood education, for children from birth to age 6; elementary school, for students from kindergarten to grade 8; and secondary school, for students from grade 9 to 12.

### Business Problem

##### What is EQAO
- The Education Quality and Accountability Office (EQAO) is an an independent government agency of the Government of Ontario. The purpose of EQAO is develops and oversees reading, writing and mathematics tests that Ontario students must take in Grades 3, 6, 9, and 10.
- The EQAO test results give parents, teachers, principals and school boards information about how well students have learned what the province expects them to learn in reading, writing and mathematics.

##### EQAO results
- Only half of Ontario's Grade 6 students met the provincial standards for math in the 2016-2017 academic year, down seven percentage points from 2013. Meanwhile, 62 per cent of Grade 3 students met the provincial math standards, a decrease of five percentage points from 2014.
- For Grade 9 students, only 44 per cent met the standard in the applied math in 2017-2018. That number experienced a decline compare with years in 2013-2014.

##### Impacts
- After the release of results from 2017-2018 year’s EQAO standardized testing, The Ontario Government announced a four-year math strategy. 
- Ontario will spend more than $55 million this year hiring math learning leads for school boards, providing “extensive” training in elementary and secondary schools, and expanding other programs like tutoring.
- A public concern has been raised regarding the 2017-2018 EQAO results. There is growing number of students using or searching for private tutor services. https://www.cbc.ca/news/canada/toronto/ontario-math-curriculum-private-services-1.4445472

### Business Opportunity

Consider to open a after-school tutor service in Toronto? According to the Wall Street Journal’s Smart Money Magazine, now, it could be the perfect time for you to get into the education business.

Let's go to explore data we collect from multipled data sources and arrange them as a dataframe for the analysis; so that we can target the recommended locations across different areas according to what we discover from the collected data.

## Data Description

Data collection and process in most cases require up to 80% time in the whole Data Science project. How data is gathered and analyzed depends on many factors. These factors are including the content, the problems or issues can be identified with some indicators, the datasource integrity, and the size of data. 

There are some aspects should be considered in the data collection for this project. 

- The schools number in a neiboroughood: If the area has numbers of opening schools, particularly those are public schools, the higher demanding needs for tutoring services.
- The school ranking: if the school has lower ranking, then the number of students are looking for tutors services for academy improvement is higher.
- The number of tutor services: to avoid the competition and towards to more successful in business, the area has no or few tutor services business opening could have an opportunity to open one.

There are at least 3 datasource required for this project to provide data analysis and suggestion for business decision.

- Toronto schools Data : This can be collected from Toronto city open dataset, https://www.toronto.ca/city-government/data-research-maps/open-data/. This data provides the number of schools currently opening in Toronto; school name, and addresses.
- Toronto neiboroughood data: This geo data is from Wiki and foursqure API access, we can use it to analyze the school geo location for the potential location to open a business.
- Toronto Schools ratings: The school ranking is the key aspect in this project. Although parents would choose re-locate for a better school for their children, however, it is a time-consuming and stressful process. The ranking data is yearly updated and can be found in the Fraser Institute web site https://www.fraserinstitute.org/school-performance.

## Data preparation

In [1]:
# Import libraries
import random # library for random number generation
import numpy as np # library for vectorized computation
import pandas as pd # library to process data as dataframes
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt # plotting library

# Data collection
import json # library to handle JSON files
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe
import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

# Map
!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library
!conda install -c conda-forge geopy --yes
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# Import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.

Libraries imported.


#### Get Toronto Neiboroughood Data

In [2]:
# Collecting toronto neighborhood data
content = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
# parse data from the html into a beautifulsoup object
data = BeautifulSoup(content, 'html.parser')

In [3]:
# Process data to store, initial three empty lists
postalCodeList = []
boroughList = []
neighborhoodList = []

In [4]:
# Loop through table content; store Postal Code, Borough, and Neighborhood data into each list
# <tr><td>M9B</td><td><a href="/wiki/Etobicoke" title="Etobicoke">Etobicoke</a></td><td><a class="mw-redirect" href="/wiki/Islington,_Toronto" title="Islington, Toronto">Islington</a></td></tr>
for row in data.find('table').find_all('tr'):
    cells = row.find_all('td')
    if(len(cells) > 0):
        postalCodeList.append(cells[0].text)
        boroughList.append(cells[1].text)
        neighborhoodList.append(cells[2].text.rstrip('\n'))

In [5]:
# Define a dataframe consist data of three columns: PostalCode, Borough, and Neighborhood
df_toronto = pd.DataFrame({"PostalCode": postalCodeList,
                           "Borough": boroughList,
                           "Neighborhood": neighborhoodList})

df_toronto.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


In [6]:
# Ignore cells with a borough that is Not assigned.
df_toronto_dropna = df_toronto[df_toronto.Borough != "Not assigned"].reset_index(drop=True)
#df_toronto_dropna.head(10)

# Group neighborhoods that are in the same boroug
df_toronto_grouped = df_toronto_dropna.groupby(["PostalCode", "Borough"], as_index=False).agg(lambda x: ", ".join(x))
df_toronto_grouped.head(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"


In [7]:
# For those "Not assigned" neighborhood, fill neighborhood with same name as the borough
for index, row in df_toronto_grouped.iterrows():
    if row["Neighborhood"] == "Not assigned":
        row["Neighborhood"] = row["Borough"]
        
df_toronto_grouped.tail(10)

Unnamed: 0,PostalCode,Borough,Neighborhood
93,M9A,Etobicoke,Islington Avenue
94,M9B,Etobicoke,"Cloverdale, Islington, Martin Grove, Princess ..."
95,M9C,Etobicoke,"Bloordale Gardens, Eringate, Markland Wood, Ol..."
96,M9L,North York,Humber Summit
97,M9M,North York,"Emery, Humberlea"
98,M9N,York,Weston
99,M9P,Etobicoke,Westmount
100,M9R,Etobicoke,"Kingsview Village, Martin Grove Gardens, Richv..."
101,M9V,Etobicoke,"Albion Gardens, Beaumond Heights, Humbergate, ..."
102,M9W,Etobicoke,Northwest


In [8]:
# Print the number rows of dataframe
df_toronto_grouped.shape

(103, 3)

In [9]:
# Load cvs file that has the geographical coordinates of each postal code
coordinates = pd.read_csv("Geospatial_Coordinates.csv")
coordinates.head()

Unnamed: 0,Postal Code,Latitude,Longitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [10]:
# Make column name consistent
coordinates.rename(columns={"Postal Code": "PostalCode"}, inplace=True)

In [11]:
# Merge two dataframes; groupd by postal code
df_toronto_coordinates = df_toronto_grouped.merge(coordinates, on="PostalCode", how="left")
df_toronto_coordinates.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476


In [12]:
# Laverage geopy library to get the latitude and longitude values of Toronto
address = 'Toronto, Ontario'
# Define geo user agent
geolocator = Nominatim(user_agent="toronto_explorer")

# get the geographical coordinates of Toronto
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Toronto are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Toronto are 43.653963, -79.387207.


In [13]:
# Assign geo location variable to neighborhood
neighborhood_latitude = latitude
neighborhood_longitude = longitude

In [14]:
# create map of Toronto using latitude and longitude values
map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

In [15]:
# add markers to map, use dataframe df_toronto_coordinates from part2
for lat, lng, borough, neighborhood in zip(df_toronto_coordinates['Latitude'], df_toronto_coordinates['Longitude'], df_toronto_coordinates['Borough'], df_toronto_coordinates['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

#### Leverage Foursquare API to explore the neighborhoods

In [16]:
# Leverage Foursquare API to explore the neighborhoods
# Defince Foursquare constants
CLIENT_ID = 'RP3BUFDNLXHK1UUETESUVFWFQRKH4NI1GG1RR5BCE5LWIR03' # your Foursquare ID
CLIENT_SECRET = 'TQ1GIAA3GING3RH5PMMXTC3CFUASDKT0F1UHEIL5OXYRAWEJ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: RP3BUFDNLXHK1UUETESUVFWFQRKH4NI1GG1RR5BCE5LWIR03
CLIENT_SECRET:TQ1GIAA3GING3RH5PMMXTC3CFUASDKT0F1UHEIL5OXYRAWEJ


In [13]:
# add markers to map, use dataframe df_toronto_coordinates from part2
for lat, lng, borough, neighborhood in zip(df_toronto_coordinates['Latitude'], df_toronto_coordinates['Longitude'], df_toronto_coordinates['Borough'], df_toronto_coordinates['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_toronto)  
    
map_toronto

'https://api.foursquare.com/v2/venues/explore?&client_id=RP3BUFDNLXHK1UUETESUVFWFQRKH4NI1GG1RR5BCE5LWIR03&client_secret=TQ1GIAA3GING3RH5PMMXTC3CFUASDKT0F1UHEIL5OXYRAWEJ&v=20180605&ll=43.6711345,-79.3871298&radius=500&limit=100'

In [18]:
# Explore the first neighborhood in the dataframe
df_toronto_coordinates.loc[0, 'Neighborhood']

'Rouge, Malvern'

In [20]:
# Get the top 100 venues that are in Toronto within a radius of 500 meters
# Create the GET request URL. Name your URL url
radius = 500
LIMIT = 100
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, radius, LIMIT)

url

'https://api.foursquare.com/v2/venues/explore?client_id=RP3BUFDNLXHK1UUETESUVFWFQRKH4NI1GG1RR5BCE5LWIR03&client_secret=TQ1GIAA3GING3RH5PMMXTC3CFUASDKT0F1UHEIL5OXYRAWEJ&ll=43.653963,-79.387207&v=20180605&radius=500&limit=100'

In [21]:
# GET request and examine the resutls
results = requests.get(url).json()
#results
venues = results['response']['groups'][0]['items']
#venues

#### Leverage some functions learned from course

In [22]:
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

#### Get Toronto Schools data

In [111]:
# Load cvs file that has the Toronto schools data for all type
schools = pd.read_csv("Toronto-School-Locations.csv")
schools.head()

Unnamed: 0,_id,OBJECTID,GEO_ID,NAME,SCHOOL_LEVEL,SCHOOL_TYPE,BOARD_NAME,SOURCE_ADDRESS,SCHOOL_TYPE_DESC,ADDRESS_POINT_ID,ADDRESS_NUMBER,LINEAR_NAME_FULL,ADDRESS_FULL,POSTAL_CODE,MUNICIPALITY,CITY,PLACE_NAME,GENERAL_USE_CODE,CENTRELINE_ID,LO_NUM,LO_NUM_SUF,HI_NUM,HI_NUM_SUF,LINEAR_NAME_ID,X,Y,LATITUDE,LONGITUDE,geometry
0,3868,1,7963754,Avondale Public School,,EP,Toronto District School Board,25 Bunty Lane,English Public,7963754,25,Bunty Lane,25 Bunty Lane,M2K 1W4,North York,Toronto,Bayview Middle School,102001,7963734,25,,,,5064,314026.623,4848287.934,43.776502,-79.38519,"{u'type': u'Point', u'coordinates': (-79.38518..."
1,3869,2,7315504,Avondale Secondary Alternative School,,EP,Toronto District School Board,24 Silverview Dr,English Public,7315504,24,Silverview Dr,24 Silverview Dr,M2M 2B3,North York,Toronto,Griffin Centre,102001,7315505,24,,,,6720,311879.691,4849375.563,43.786315,-79.411846,"{u'type': u'Point', u'coordinates': (-79.41184..."
2,3870,3,20258267,AYJ Global Academy,,PR,,4 Lansing Sq,Priv,20258267,4,Lansing Sq,4 Lansing Sq,M2J 5A2,North York,Toronto,,104008,438287,4,,,,6007,318854.727,4848028.516,43.774091,-79.32522,"{u'type': u'Point', u'coordinates': (-79.32521..."
3,3871,4,7102999,Bais Brucha School,,PR,,3077 Bathurst St,Priv,7102999,3077,Bathurst St,3077 Bathurst St,M6A 1Z9,North York,Toronto,,104008,7102975,3077,,,,436,310459.388,4841999.775,43.719936,-79.42957,"{u'type': u'Point', u'coordinates': (-79.42956..."
4,3872,5,9171677,Bais Chaya Mushka Elementary School,,PR,,4375 Chesswood Dr,Priv,9171677,4375,Chesswood Dr,4375 Chesswood Dr,M3J 2C2,North York,Toronto,,115001,9171675,4375,,,,5187,306600.741,4846528.168,43.760717,-79.477445,"{u'type': u'Point', u'coordinates': (-79.47744..."


In [112]:
# Only keep the columns we need
schools_new=schools.drop(['_id', 'OBJECTID', 'GEO_ID', 'CITY', 'LO_NUM', 'LO_NUM_SUF', 'HI_NUM', 'HI_NUM_SUF', 'CENTRELINE_ID', 'X', 'Y', 'geometry','LINEAR_NAME_ID', 'GENERAL_USE_CODE', 'PLACE_NAME', 'LINEAR_NAME_FULL','SCHOOL_LEVEL', 'BOARD_NAME', 'ADDRESS_POINT_ID', 'SCHOOL_TYPE', 'ADDRESS_NUMBER'], axis=1)


In [113]:
schools_new.head()

Unnamed: 0,NAME,SOURCE_ADDRESS,SCHOOL_TYPE_DESC,ADDRESS_FULL,POSTAL_CODE,MUNICIPALITY,LATITUDE,LONGITUDE
0,Avondale Public School,25 Bunty Lane,English Public,25 Bunty Lane,M2K 1W4,North York,43.776502,-79.38519
1,Avondale Secondary Alternative School,24 Silverview Dr,English Public,24 Silverview Dr,M2M 2B3,North York,43.786315,-79.411846
2,AYJ Global Academy,4 Lansing Sq,Priv,4 Lansing Sq,M2J 5A2,North York,43.774091,-79.32522
3,Bais Brucha School,3077 Bathurst St,Priv,3077 Bathurst St,M6A 1Z9,North York,43.719936,-79.42957
4,Bais Chaya Mushka Elementary School,4375 Chesswood Dr,Priv,4375 Chesswood Dr,M3J 2C2,North York,43.760717,-79.477445


In [114]:
schools_new.shape

(1281, 8)

In [115]:
#schools_new.tail()
# clean up
schools_new_dropna = schools_new[schools_new.SCHOOL_TYPE_DESC != "University"].reset_index(drop=True)
schools_new_dropna.shape

(1273, 8)

In [116]:
schools_new_dropna = schools_new_dropna[schools_new_dropna.SCHOOL_TYPE_DESC != "College"].reset_index(drop=True)
schools_new_dropna.shape

(1250, 8)

In [117]:
schools_new_dropna = schools_new_dropna[schools_new_dropna.POSTAL_CODE != " "].reset_index(drop=True)
schools_new_dropna.shape

(1250, 8)

In [119]:
# Loop through df_toronto_coordinates, find neighborhood name list base on postal code
neighborhood_list = []

for index, row in schools_new_dropna.iterrows():
    post_code = row['POSTAL_CODE']
    #short = post_code.split(' ')
    short = str(post_code[:3])
    #print(post_code)
    neighborhood = row['MUNICIPALITY']
    for index2, row2 in df_toronto_coordinates.iterrows():
        if(short == row2['PostalCode']):
            neighborhood = row2['Neighborhood']
    #print(neighborhood)
    neighborhood_list.append(neighborhood)
    #print(len(neighborhood_list))       

print(len(neighborhood_list))
    

1250


In [120]:
# Remove dupliucated column
schools_new_update = schools_new_dropna.drop(['SOURCE_ADDRESS'], axis=1)

schools_new_update.head()

Unnamed: 0,NAME,SCHOOL_TYPE_DESC,ADDRESS_FULL,POSTAL_CODE,MUNICIPALITY,LATITUDE,LONGITUDE
0,Avondale Public School,English Public,25 Bunty Lane,M2K 1W4,North York,43.776502,-79.38519
1,Avondale Secondary Alternative School,English Public,24 Silverview Dr,M2M 2B3,North York,43.786315,-79.411846
2,AYJ Global Academy,Priv,4 Lansing Sq,M2J 5A2,North York,43.774091,-79.32522
3,Bais Brucha School,Priv,3077 Bathurst St,M6A 1Z9,North York,43.719936,-79.42957
4,Bais Chaya Mushka Elementary School,Priv,4375 Chesswood Dr,M3J 2C2,North York,43.760717,-79.477445


In [121]:
# Add Neighborhood data to each school
schools_new_update['Neighborhood'] = neighborhood_list

In [122]:
schools_new_update.head()

Unnamed: 0,NAME,SCHOOL_TYPE_DESC,ADDRESS_FULL,POSTAL_CODE,MUNICIPALITY,LATITUDE,LONGITUDE,Neighborhood
0,Avondale Public School,English Public,25 Bunty Lane,M2K 1W4,North York,43.776502,-79.38519,Bayview Village
1,Avondale Secondary Alternative School,English Public,24 Silverview Dr,M2M 2B3,North York,43.786315,-79.411846,"Newtonbrook, Willowdale"
2,AYJ Global Academy,Priv,4 Lansing Sq,M2J 5A2,North York,43.774091,-79.32522,"Fairview, Henry Farm, Oriole"
3,Bais Brucha School,Priv,3077 Bathurst St,M6A 1Z9,North York,43.719936,-79.42957,"Lawrence Heights, Lawrence Manor"
4,Bais Chaya Mushka Elementary School,Priv,4375 Chesswood Dr,M3J 2C2,North York,43.760717,-79.477445,"Northwood Park, York University"


#### Get schools ranking data

In [127]:
# import library to read excel datafile 
from pandas import ExcelWriter
from pandas import ExcelFile

In [128]:
# Load school ranking excel 
ranking_df = pd.read_excel('Toronto_schools_ranking_2017-2018.xlsx', sheet_name='Sheet1')

ranking_df.head()

Unnamed: 0,2017-18 Rank,Trend,School Name,Postal Code,City,2017-18 Rating
0,1/3046,,Avondale Alternative,M2N 2V4,Toronto,10.0
1,1/3046,,Havergal,M5N2H9,Toronto,10.0
2,1/3046,,Islamic Institute of Toronto,M1X 1S3,Toronto,10.0
3,1/3046,,Northmount,M3B 1S3,Toronto,10.0
4,1/3046,,Sathya Sai,M1R 4E5,Toronto,10.0


In [129]:
# Load toronto secondary school ranking excel
sr_ranking_df = pd.read_excel('Toronto_Sr_schools_ranking_2017-2018.xlsx', sheet_name='Sheet1')

sr_ranking_df.head()

Unnamed: 0,2017-18 Rank,Trend,School Name,Postal Code,City,2017-18 Rating
0,1/738,,Havergal,M5N2H9,Toronto,10.0
1,3/738,—,St Michael's Choir,M5B1X2,Toronto,9.6
2,3/738,,Ursula Franklin,M6P3J7,Toronto,9.6
3,7/738,—,North Toronto,M4P1T7,Toronto,9.2
4,14/738,—,Cardinal Carter-Arts,M2N3C8,Toronto,8.8


In [131]:
# clean up
ranking_df_new = ranking_df.drop(['Trend'], axis=1)
ranking_df_new.head()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating
0,1/3046,Avondale Alternative,M2N 2V4,Toronto,10.0
1,1/3046,Havergal,M5N2H9,Toronto,10.0
2,1/3046,Islamic Institute of Toronto,M1X 1S3,Toronto,10.0
3,1/3046,Northmount,M3B 1S3,Toronto,10.0
4,1/3046,Sathya Sai,M1R 4E5,Toronto,10.0


In [132]:
sr_ranking_df_new = sr_ranking_df.drop(['Trend'], axis=1)
sr_ranking_df_new.head()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating
0,1/738,Havergal,M5N2H9,Toronto,10.0
1,3/738,St Michael's Choir,M5B1X2,Toronto,9.6
2,3/738,Ursula Franklin,M6P3J7,Toronto,9.6
3,7/738,North Toronto,M4P1T7,Toronto,9.2
4,14/738,Cardinal Carter-Arts,M2N3C8,Toronto,8.8


In [133]:
# print the number of JR and SR ranking data
ranking_df_new.shape

(441, 5)

In [134]:
sr_ranking_df_new.shape

(107, 5)

We can tell the two datasets are imbalanced, therefore we need to keep the dataset seperated. We nned to have Neighborhood data to ranking. 
- for ranking data
- not ranking data, take average

In [185]:
# function to fetch Neighborhood data
def get_neighborhood_list(df1, df2):
    neighborhoodList = []
    for index, row in df1.iterrows():
        post_code = row['Postal Code']
        short = str(post_code[:3])
        neighbor = ""
        for index2, row2 in df2.iterrows():
            if(short == row2['PostalCode']):
                neighbor = row2['Neighborhood']
            print(neighbor)
        neighborhoodList.append(neighbor)
        #print(neighborhoodList)
    return neighborhoodList

In [186]:
# Get Neighborhood data for ranking Jr schools
jr_neighborhood_list = []

for index, row in ranking_df_new.iterrows():
    post_code = row['Postal Code']
    #short = post_code.split(' ')
    short = str(post_code[:3])
    #print(short)
    #neighborhood = "Toronto"
    for index2, row2 in df_toronto_coordinates.iterrows():
        if(short == row2['PostalCode']):
            neighborhood = row2['Neighborhood']
            print("short : {} and neighborhood: {}".format(short, neighborhood))
    #print(neighborhood)
    jr_neighborhood_list.append(neighborhood)
   

short : M2N and neighborhood: Willowdale South
short : M5N and neighborhood: Roselawn
short : M1X and neighborhood: Upper Rouge
short : M3B and neighborhood: Don Mills North
short : M1R and neighborhood: Maryvale, Wexford
short : M6H and neighborhood: Dovercourt Village, Dufferin
short : M9M and neighborhood: Emery, Humberlea
short : M1B and neighborhood: Rouge, Malvern
short : M4T and neighborhood: Moore Park, Summerhill East
short : M3B and neighborhood: Don Mills North
short : M9A and neighborhood: Islington Avenue
short : M4R and neighborhood: North Toronto West
short : M1W and neighborhood: L'Amoreaux West
short : M1S and neighborhood: Agincourt
short : M4N and neighborhood: Lawrence Park
short : M6J and neighborhood: Little Portugal, Trinity
short : M6S and neighborhood: Runnymede, Swansea
short : M1S and neighborhood: Agincourt
short : M4K and neighborhood: The Danforth West, Riverdale
short : M1S and neighborhood: Agincourt
short : M1M and neighborhood: Cliffcrest, Cliffside, S

short : M2N and neighborhood: Willowdale South
short : M4E and neighborhood: The Beaches
short : M1E and neighborhood: Guildwood, Morningside, West Hill
short : M1B and neighborhood: Rouge, Malvern
short : M4C and neighborhood: Woodbine Heights
short : M1M and neighborhood: Cliffcrest, Cliffside, Scarborough Village West
short : M1C and neighborhood: Highland Creek, Rouge Hill, Port Union
short : M9A and neighborhood: Islington Avenue
short : M9P and neighborhood: Westmount
short : M1L and neighborhood: Clairlea, Golden Mile, Oakridge
short : M3K and neighborhood: CFB Toronto, Downsview East
short : M1V and neighborhood: Agincourt North, L'Amoreaux East, Milliken, Steeles East
short : M6J and neighborhood: Little Portugal, Trinity
short : M1L and neighborhood: Clairlea, Golden Mile, Oakridge
short : M4L and neighborhood: The Beaches West, India Bazaar
short : M2L and neighborhood: Silver Hills, York Mills
short : M6P and neighborhood: High Park, The Junction South
short : M1T and neigh

short : M9V and neighborhood: Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown
short : M6N and neighborhood: The Junction North, Runnymede
short : M1V and neighborhood: Agincourt North, L'Amoreaux East, Milliken, Steeles East
short : M2N and neighborhood: Willowdale South
short : M1B and neighborhood: Rouge, Malvern
short : M9V and neighborhood: Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown
short : M1P and neighborhood: Dorset Park, Scarborough Town Centre, Wexford Heights
short : M1G and neighborhood: Woburn
short : M1J and neighborhood: Scarborough Village
short : M1V and neighborhood: Agincourt North, L'Amoreaux East, Milliken, Steeles East
short : M1L and neighborhood: Clairlea, Golden Mile, Oakridge
short : M6E and neighborhood: Caledonia-Fairbanks
short : M6H and neighborhood: Dovercourt Village, Dufferin
short : M6N and neighborhood: The Junction North, Ru

In [187]:
#print(jr_neighborhood_list)

In [188]:
# Add Neighborhood data to each JR school
ranking_df_new['Neighborhood'] = jr_neighborhood_list

In [189]:
ranking_df_new.head(10)

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood
0,1/3046,Avondale Alternative,M2N 2V4,Toronto,10.0,Willowdale South
1,1/3046,Havergal,M5N2H9,Toronto,10.0,Roselawn
2,1/3046,Islamic Institute of Toronto,M1X 1S3,Toronto,10.0,Upper Rouge
3,1/3046,Northmount,M3B 1S3,Toronto,10.0,Don Mills North
4,1/3046,Sathya Sai,M1R 4E5,Toronto,10.0,"Maryvale, Wexford"
5,1/3046,St Sebastian,M6H 3P1,Toronto,10.0,"Dovercourt Village, Dufferin"
6,23/3046,Nile Academy,M9M 1W5,Toronto,9.6,"Emery, Humberlea"
7,25/3046,Fleming,M1B 5B5,Toronto,9.5,"Rouge, Malvern"
8,25/3046,Whitney,M4T 1C7,Toronto,9.5,"Moore Park, Summerhill East"
9,34/3046,Denlow,M3B 1P7,Toronto,9.3,Don Mills North


In [190]:
# use function
sr_neighborhood_list = get_neighborhood_list(sr_ranking_df_new, df_toronto_coordinates)

#print(sr_neighborhood_list)
































































Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn
Roselawn






















































Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden District
Ryerson, Garden Distric














































Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin
Dovercourt Village, Dufferin


















































































High Park, The Junction South
High Park, The Junction South
High Park, The Junction South


Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale
Rosedale







Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea, Golden Mile, Oakridge
Clairlea,



























Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, Don Mills South
Flemingdon Park, 

East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto
East Toronto































































































Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland Wood, Old Burnhamthorpe
Bloordale Gardens, Eringate, Markland W






CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Downsview East
CFB Toronto, Do



















































































Northwest






East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, Ionview, Kennedy Park
East Birchmount Park, 

Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and Wellesley
Church and We

Dorset Park, Scarborough Town Centre, Wexford Heights
Dorset Park, Scarborough Town Centre, Wexford Heights
Dorset Park, Scarborough Town Centre, Wexford Heights
Dorset Park, Scarborough Town Centre, Wexford Heights


In [191]:
print(sr_neighborhood_list)

['Roselawn', 'Ryerson, Garden District', 'High Park, The Junction South', 'Davisville North', 'Willowdale South', 'Bathurst Manor, Downsview North, Wilson Heights', "Humber Bay, King's Mill Park, Kingsway Park South East, Mimico NE, Old Mill South, The Queensway East, Royal York South East, Sunnylea", 'Willowdale South', 'Ryerson, Garden District', 'Hillcrest Village', 'Leaside', 'Agincourt', 'Dovercourt Village, Dufferin', 'High Park, The Junction South', 'North Toronto West', 'The Beaches', 'Humber Bay Shores, Mimico South, New Toronto', 'Islington Avenue', 'Dovercourt Village, Dufferin', 'Harbord, University of Toronto', 'Studio District', 'Dovercourt Village, Dufferin', "Agincourt North, L'Amoreaux East, Milliken, Steeles East", 'Willowdale West', "L'Amoreaux West", 'Rosedale', 'Clairlea, Golden Mile, Oakridge', 'The Beaches', 'Agincourt', 'Roselawn', 'Davisville North', 'Cliffcrest, Cliffside, Scarborough Village West', 'Flemingdon Park, Don Mills South', "Humber Bay, King's Mill 

In [193]:
# Add Neighborhood data to each SR school
sr_ranking_df_new['Neighborhood'] = sr_neighborhood_list

sr_ranking_df_new.tail(10)

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood
97,686/738,John Polanyi,M6A1B1,Toronto,3.3,"Lawrence Heights, Lawrence Manor"
98,689/738,Westview Centennial,M3N1W7,Toronto,3.2,Downsview Northwest
99,699/738,Weston,M9N2Y9,Toronto,3.0,Weston
100,699/738,Downsview,M3K1W3,Toronto,3.0,"CFB Toronto, Downsview East"
101,716/738,George Harvey,M6M3W5,Toronto,1.8,"Del Ray, Keelesdale, Mount Dennis, Silverthorn"
102,716/738,Emery,M9M2V9,Toronto,1.8,"Emery, Humberlea"
103,720/738,Oakwood,M6E1A3,Toronto,1.5,Caledonia-Fairbanks
104,722/738,Kipling,M9R1H4,Toronto,1.4,"Kingsview Village, Martin Grove Gardens, Richv..."
105,729/738,Central,M5S2R5,Toronto,0.2,"Harbord, University of Toronto"
106,731/738,Bendale,M1P3C1,Toronto,0.0,"Dorset Park, Scarborough Town Centre, Wexford ..."


In [197]:
# the average rating for JR and SR
ranking_df_new.loc[:,"2017-18 Rating"].mean()

6.362131519274376

In [198]:
# the average rating for SR
sr_ranking_df_new.loc[:,"2017-18 Rating"].mean()

5.878504672897195

Now we know the the consider Neighborhoods are those schools which rating are below average

In [210]:
# select low rating school
jr_low_df = ranking_df_new[ranking_df_new['2017-18 Rating'] < 5]

jr_low_df.head()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood
353,2315/3046,Charles E Webster,M6M3X7,Toronto,4.9,"Del Ray, Keelesdale, Mount Dennis, Silverthorn"
354,2315/3046,Fairmount,M1M1C7,Toronto,4.9,"Cliffcrest, Cliffside, Scarborough Village West"
355,2315/3046,Our Lady of Guadalupe Catholic,M2J3C2,Toronto,4.9,"Fairview, Henry Farm, Oriole"
356,2315/3046,St Bernard,M6M4W4,Toronto,4.9,"Del Ray, Keelesdale, Mount Dennis, Silverthorn"
357,2315/3046,St Charles,M6B2W1,Toronto,4.9,Glencairn


In [209]:
jr_low_df.shape

(173, 6)

In [206]:
sr_low_df = sr_ranking_df_new[sr_ranking_df_new['2017-18 Rating'] < 5.878]

sr_low_df.head()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood
59,443/738,Pope John Paul II,M1E4P6,Toronto,5.8,"Guildwood, Morningside, West Hill"
60,461/738,Danforth,M4J4B7,Toronto,5.7,East Toronto
61,477/738,Wexford Collegiate-Arts,M1R2H7,Toronto,5.6,"Maryvale, Wexford"
62,477/738,St Basil The Great,M9M3B2,Toronto,5.6,"Emery, Humberlea"
63,492/738,Delphi Alternative,M1S2R7,Toronto,5.5,Agincourt


In [207]:
sr_low_df.shape

(48, 6)

In [241]:
# Apply geo data to those two lower ranking school dataset so that we can view them on map
# Creat a function return related geo data and Borough

def get_geo_data(df1, df2):
    lat_list = []
    lng_list = []
    post_code_list = []
    borough_list = []
    d = {}
    for index, row in df1.iterrows():
        post_code = row['Postal Code']
        short = str(post_code[:3])
        lat_i = "NA"
        lng_i = "NA"
        borough_i = "NA"
        for index2, row2 in df2.iterrows():
            post_code_2 = row2['POSTAL_CODE']
            short_2 = str(post_code_2[:3])
            if(short == short_2):
                lat_i = row2['LATITUDE']
                lng_i = row2['LONGITUDE']
                borough_i = row2['MUNICIPALITY']
            #print("lat : {} and lng: {}".format(lat_i, lng_i))
        lat_list.append(lat_i)
        lng_list.append(lng_i)
        post_code_list.append(post_code)
        borough_list.append(borough_i)
    #print("lat len: {} and lng len: {}".format(len(lat_list), len(lng_list)))
    d = {'Postal Code':post_code_list,'Borough':borough_list,'Latitude':lat_list,'Longitude':lng_list}
    return pd.DataFrame(d)

In [242]:
#jr_low_df.shape
jr_low_geo_df = get_geo_data(jr_low_df, schools_new_update)

In [243]:
jr_low_geo_df.head()

Unnamed: 0,Postal Code,Borough,Latitude,Longitude
0,M6M3X7,York,43.690287,-79.476243
1,M1M1C7,Scarborough,43.715646,-79.242479
2,M2J3C2,North York,43.781941,-79.348741
3,M6M4W4,York,43.690287,-79.476243
4,M6B2W1,North York,43.717388,-79.43548


In [244]:
# Merge two dataframe
updated_jr_low_df = jr_low_df.merge(jr_low_geo_df, on="Postal Code", how="left")

In [245]:
updated_jr_low_df.head()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood,Borough,Latitude,Longitude
0,2315/3046,Charles E Webster,M6M3X7,Toronto,4.9,"Del Ray, Keelesdale, Mount Dennis, Silverthorn",York,43.690287,-79.476243
1,2315/3046,Fairmount,M1M1C7,Toronto,4.9,"Cliffcrest, Cliffside, Scarborough Village West",Scarborough,43.715646,-79.242479
2,2315/3046,Our Lady of Guadalupe Catholic,M2J3C2,Toronto,4.9,"Fairview, Henry Farm, Oriole",North York,43.781941,-79.348741
3,2315/3046,St Bernard,M6M4W4,Toronto,4.9,"Del Ray, Keelesdale, Mount Dennis, Silverthorn",York,43.690287,-79.476243
4,2315/3046,St Charles,M6B2W1,Toronto,4.9,Glencairn,North York,43.717388,-79.43548


In [246]:
# create map of Toronto using latitude and longitude values
jr_school_map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

In [248]:
# add markers to map, use dataframe df_toronto_coordinates from part2
for lat_jr, lng_jr, borough_jr, neighborhood_jr in zip(updated_jr_low_df['Latitude'], updated_jr_low_df['Longitude'], updated_jr_low_df['Borough'], updated_jr_low_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood_jr, borough_jr)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat_jr, lng_jr],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(jr_school_map_toronto)  
    
jr_school_map_toronto

In [249]:
# same as to SR schools
sr_low_geo_df = get_geo_data(sr_low_df, schools_new_update)

In [250]:
sr_low_geo_df.head()

Unnamed: 0,Postal Code,Borough,Latitude,Longitude
0,M1E4P6,Scarborough,43.768514,-79.165034
1,M4J4B7,East York,43.691928,-79.348158
2,M1R2H7,Scarborough,43.740199,-79.303731
3,M9M3B2,North York,43.749191,-79.532014
4,M1S2R7,Scarborough,43.787004,-79.248054


In [251]:
# Merge two dataframe
updated_sr_low_df = sr_low_df.merge(sr_low_geo_df, on="Postal Code", how="left")

updated_sr_low_df.tail()

Unnamed: 0,2017-18 Rank,School Name,Postal Code,City,2017-18 Rating,Neighborhood,Borough,Latitude,Longitude
43,716/738,Emery,M9M2V9,Toronto,1.8,"Emery, Humberlea",North York,43.749191,-79.532014
44,720/738,Oakwood,M6E1A3,Toronto,1.5,Caledonia-Fairbanks,former Toronto,43.677385,-79.444437
45,722/738,Kipling,M9R1H4,Toronto,1.4,"Kingsview Village, Martin Grove Gardens, Richv...",Etobicoke,43.690831,-79.548997
46,729/738,Central,M5S2R5,Toronto,0.2,"Harbord, University of Toronto",former Toronto,43.669236,-79.389922
47,731/738,Bendale,M1P3C1,Toronto,0.0,"Dorset Park, Scarborough Town Centre, Wexford ...",Scarborough,43.747906,-79.27815


In [252]:
# create map of Toronto using latitude and longitude values
sr_school_map_toronto = folium.Map(location=[latitude, longitude], zoom_start=11)

In [253]:
# add markers to map, use dataframe df_toronto_coordinates from part2
for lat_sr, lng_sr, borough_sr, neighborhood_sr in zip(updated_sr_low_df['Latitude'], updated_sr_low_df['Longitude'], updated_sr_low_df['Borough'], updated_sr_low_df['Neighborhood']):
    label = '{}, {}'.format(neighborhood_sr, borough_sr)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat_sr, lng_sr],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(sr_school_map_toronto)  
    
sr_school_map_toronto

In [257]:
# Let us look into both result to see if there are neighborhoods appears both JR and SR dataframe

low_jr_neighborhoods_name_list = updated_jr_low_df['Neighborhood'].tolist()
#low_jr_neighborhood_name_list

In [258]:
# function to remove duplicated neighborhood name from list
def remove(duplicate): 
    final_list = [] 
    for num in duplicate: 
        if num not in final_list: 
            final_list.append(num) 
    return final_list

In [261]:
# remove duplicated neighborhood
removed_jr_neighborhoods_name_list = remove(low_jr_neighborhoods_name_list)
#removed_jr_neighborhoods_name_list

In [265]:
# loop through SR dataframe to look for the same neighborhood
neighborhood_list_on_both = []

for i in range(len(removed_jr_neighborhoods_name_list)):
    neighborhood_name = removed_jr_neighborhoods_name_list[i]
    founded = ""
    #print(short)
    for index, row in updated_sr_low_df.iterrows():
        if(neighborhood_name == row['Neighborhood']):
            founded = row['Neighborhood']
            #print(founded)
    if(len(founded) > 0):
        neighborhood_list_on_both.append(founded)

In [266]:
neighborhood_list_on_both

['Del Ray, Keelesdale, Mount Dennis, Silverthorn',
 'Cliffcrest, Cliffside, Scarborough Village West',
 'Glencairn',
 'East Birchmount Park, Ionview, Kennedy Park',
 'Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown',
 'Downsview, North Park, Upwood Park',
 'Emery, Humberlea',
 'Rouge, Malvern',
 'East Toronto',
 'Church and Wellesley',
 'Christie',
 'Caledonia-Fairbanks',
 'Downsview Northwest',
 'Dovercourt Village, Dufferin',
 'Dorset Park, Scarborough Town Centre, Wexford Heights',
 'Parkwoods',
 'Kingsview Village, Martin Grove Gardens, Richview Gardens, St. Phillips',
 'Harbord, University of Toronto',
 'Runnymede, Swansea',
 'Guildwood, Morningside, West Hill',
 'Woburn',
 'Maryvale, Wexford',
 'Agincourt',
 'Northwest',
 "L'Amoreaux West",
 'Brockton, Exhibition Place, Parkdale Village',
 'Newtonbrook, Willowdale']

## Results

After analyzed the data I collected from trusted resouce (see Data Prepartion section), to answer the in our business use case  

#### Which areas or neighborhoods in Toronto are ideal locations to open a after-school tutor service?

Here are the locations can be considered to open one.

'Del Ray, Keelesdale, Mount Dennis, Silverthorn',
 'Cliffcrest, Cliffside, Scarborough Village West',
 'Glencairn',
 'East Birchmount Park, Ionview, Kennedy Park',
 'Albion Gardens, Beaumond Heights, Humbergate, Jamestown, Mount Olive, Silverstone, South Steeles, Thistletown',
 'Downsview, North Park, Upwood Park',
 'Emery, Humberlea',
 'Rouge, Malvern',
 'East Toronto',
 'Church and Wellesley',
 'Christie',
 'Caledonia-Fairbanks',
 'Downsview Northwest',
 'Dovercourt Village, Dufferin',
 'Dorset Park, Scarborough Town Centre, Wexford Heights',
 'Parkwoods',
 'Kingsview Village, Martin Grove Gardens, Richview Gardens, St. Phillips',
 'Harbord, University of Toronto',
 'Runnymede, Swansea',
 'Guildwood, Morningside, West Hill',
 'Woburn',
 'Maryvale, Wexford',
 'Agincourt',
 'Northwest',
 "L'Amoreaux West",
 'Brockton, Exhibition Place, Parkdale Village',
 'Newtonbrook, Willowdale'