Steps to be taken:

    1) identify the universities in Toronto
    2) identify the city each university is in
    3) obtain the rental data for all the cities
    4) obtain the coordinates of the universities
    5) get the coordinates of the rentals within the cities where the universities are
    6) calculate the distance between each university and each rental house
    7) filter out rental houses that are far away (1 km?)
    8) using Foursquare, get the venues surrounding each of the rental houses
    9) cluster analysis
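
Steps 6 and 7 can be sketched ahead of time. The snippet below is only an illustration, assuming the haversine (great-circle) formula is an acceptable distance measure and that 1 km is the cut-off; the function names and the sample rental points are hypothetical and not part of the rental dataset.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def rentals_within(university, rentals, max_km=1.0):
    """Keep only the rentals whose distance to the university is at most max_km."""
    u_lat, u_lon = university
    return [r for r in rentals if haversine_km(u_lat, u_lon, r[0], r[1]) <= max_km]


# University of Toronto coordinates (rounded); the two rental points are
# made-up placeholders, not real listings.
u_of_t = (43.663462, -79.397760)
sample_rentals = [(43.6650, -79.3990), (43.7000, -79.3000)]
nearby = rentals_within(u_of_t, sample_rentals)
print(nearby)
```

Only the first placeholder point lies within 1 km of the university, so it is the only one kept. The threshold is a parameter so it can be tuned once the real rental data is available.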

In [153]:
import pandas as pd
import numpy as np
import folium

In [119]:
# Step 1: identify the universities in Toronto
# list of universities in Toronto obtained from: https://www.unipage.net/en/universities?country=38&city_id=6167865
df = pd.read_excel('uni_toronto.xlsx')

# preview of dataset
df

Unnamed: 0,Title,Country,City,Latitude,Longitude
0,University of Toronto,Canada,Toronto,,
1,York University,Canada,Toronto,,
2,Ryerson University,Canada,Toronto,,
3,Humber College,Canada,Toronto,,
4,Centennial College,Canada,Toronto,,
5,George Brown College,Canada,Toronto,,
6,OCAD University,Canada,Toronto,,
7,Michener Institute for Applied Health Sciences,Canada,Toronto,,
8,Canadian Memorial Chiropractic College,Canada,Toronto,,
9,Tyndale College & Seminary,Canada,Toronto,,


### Use geopy library to get the latitude and longitude values of Universities in Toronto

In order to define an instance of the geocoder, we need to define a user_agent. We will name our agent <em>ny_explorer</em>, as shown below.

In [81]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

address = 'University of Toronto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of University of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of University of Toronto are 43.663461999999996, -79.39775965337452.


Great! We are able to retrieve the latitude and longitude coordinates using geopy's Nominatim geocoder. Let's retrieve the coordinates for the remaining universities and add them to the dataframe.

In [122]:
# create the geocoder once and reuse it for every lookup
geolocator = Nominatim(user_agent="ny_explorer")

for index, row in df.iterrows():
    # the university name is stored in the 'Title' column
    location = geolocator.geocode(row['Title'])

    # geocode() returns None when nothing matches the query
    if location is None:
        print("No location")
        continue

    print(f"{index}: {row['Title']}: {location.latitude},{location.longitude}")
    df.loc[index, 'Latitude'] = location.latitude
    df.loc[index, 'Longitude'] = location.longitude

0: University of Toronto: 43.663461999999996,-79.39775965337452
1: York University: 43.7792419,-79.4835593
2: Ryerson University: 43.65846945,-79.37899327245886
3: Humber College: 43.7285115,-79.60624961542362
4: Centennial College: 43.7857924,-79.22780957535817
5: George Brown College: 6.93121,-10.91934
6: OCAD University: 43.649334,-79.3888874
No location
No location
9: Tyndale College & Seminary: 43.7972511,-79.39261743818221
No location
11: Seneca College: 43.7967304,-79.34937740361715
12: Victoria University: 0.31360965,32.58913184912379
No location
14: University of Trinity College: 43.665301549999995,-79.39525214965947
No location
No location
17: International Language Academy of Canada: 49.2883111,-123.1225453
18: Upper Madison College: 40.0057285,-75.1745453
19: Institute for Christian Studies: 52.21710445,0.1376036101856198
20: University of St. Michael's College: 43.6655987,-79.391267
21: Wycliffe College: 43.664512450000004,-79.39514104116316


Cool. We are able to get coordinates for most of the universities. A few have neither a latitude nor a longitude. That is fine; we can add them manually for now. First, let's preview the dataframe and see what we came up with.

In [125]:
df

Unnamed: 0,Title,Country,City,Latitude,Longitude
0,University of Toronto,Canada,Toronto,43.663462,-79.39776
1,York University,Canada,Toronto,43.779242,-79.483559
2,Ryerson University,Canada,Toronto,43.658469,-79.378993
3,Humber College,Canada,Toronto,43.728512,-79.60625
4,Centennial College,Canada,Toronto,43.785792,-79.22781
5,George Brown College,Canada,Toronto,6.93121,-10.91934
6,OCAD University,Canada,Toronto,43.649334,-79.388887
7,Michener Institute for Applied Health Sciences,Canada,Toronto,,
8,Canadian Memorial Chiropractic College,Canada,Toronto,,
9,Tyndale College & Seminary,Canada,Toronto,43.797251,-79.392617
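
Row 5 in this preview shows why a numeric sanity check can help before relying on the coordinates. The sketch below flags points that fall outside a rough bounding box around Toronto. The box limits and the function name `outside_toronto` are assumptions chosen for illustration, not an official boundary.

```python
# Rough bounding box around the Greater Toronto Area. These limits are an
# assumption chosen for illustration, not an official boundary.
LAT_MIN, LAT_MAX = 43.4, 44.0
LON_MIN, LON_MAX = -79.9, -79.0


def outside_toronto(lat, lon):
    """Return True when a coordinate pair falls outside the rough bounding box."""
    return not (LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX)


# A few (title, latitude, longitude) rows copied from the preview
rows = [
    ("University of Toronto", 43.663462, -79.39776),
    ("Humber College", 43.728512, -79.60625),
    ("George Brown College", 6.93121, -10.91934),
]
suspect = [title for title, lat, lon in rows if outside_toronto(lat, lon)]
print(suspect)
```

On the full dataframe the same check could be applied with `df[[outside_toronto(lat, lon) for lat, lon in zip(df['Latitude'], df['Longitude'])]]`. Rows with missing coordinates are flagged too, because comparisons with NaN evaluate to False.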


In [126]:
# saving what we have for now (index=False avoids writing the index as an extra column)
df.to_excel('uni_toronto.xlsx', index=False)

In [131]:
# get the universities that are still missing coordinates

df[df['Longitude'].isna()]

Unnamed: 0,Title,Country,City,Latitude,Longitude
7,Michener Institute for Applied Health Sciences,Canada,Toronto,,
8,Canadian Memorial Chiropractic College,Canada,Toronto,,
10,Regis College University of Toronto,Canada,Toronto,,
13,Academy of Design Toronto,Canada,Toronto,,
15,Sprott Shaw Language College,Canada,Toronto,,
16,Upper Career College Business & Technology,Canada,Toronto,,


In [149]:
# coordinates for Michener Institute for Applied Health Sciences: 43.6558896, -79.3910475
df.loc[7, 'Latitude'] = 43.6558896
df.loc[7, 'Longitude'] = -79.3910475

# coordinates for Canadian Memorial Chiropractic College: 43.8039466,-79.3718667
df.loc[8, 'Latitude'] = 43.8039466
df.loc[8, 'Longitude'] = -79.3718667

# coordinates for Regis College University of Toronto: 43.6639847,-79.3902391
df.loc[10, 'Latitude'] = 43.6639847
df.loc[10, 'Longitude'] = -79.3902391

# coordinates for Academy of Design Toronto: 43.696708,-79.396104
df.loc[13, 'Latitude'] = 43.696708
df.loc[13, 'Longitude'] = -79.396104

# coordinates for Sprott Shaw Language College: 43.7076913,-79.3949289
df.loc[15, 'Latitude'] = 43.7076913
df.loc[15, 'Longitude'] = -79.3949289


# coordinates for Upper Career College Business & Technology: 43.7076738,-79.3949037
df.loc[16, 'Latitude'] = 43.7076738
df.loc[16, 'Longitude'] = -79.3949037

In [150]:
df

Unnamed: 0,Title,Country,City,Latitude,Longitude
0,University of Toronto,Canada,Toronto,43.663462,-79.39776
1,York University,Canada,Toronto,43.779242,-79.483559
2,Ryerson University,Canada,Toronto,43.658469,-79.378993
3,Humber College,Canada,Toronto,43.728512,-79.60625
4,Centennial College,Canada,Toronto,43.785792,-79.22781
5,George Brown College,Canada,Toronto,6.93121,-10.91934
6,OCAD University,Canada,Toronto,43.649334,-79.388887
7,Michener Institute for Applied Health Sciences,Canada,Toronto,43.65589,-79.391047
8,Canadian Memorial Chiropractic College,Canada,Toronto,43.803947,-79.371867
9,Tyndale College & Seminary,Canada,Toronto,43.797251,-79.392617


Now that all the coordinates are filled in, let's visualise them to check that they are correct.

In [164]:
# create map of Toronto using latitude and longitude values
toronto_latitude = 43.6532
toronto_longitude = -79.3832
map_toronto = folium.Map(location=[toronto_latitude, toronto_longitude], zoom_start=10.5)

# add a circle marker for each university
for lat, lng, uni in zip(df['Latitude'], df['Longitude'], df['Title']):
    label = folium.Popup(uni, parse_html=True)
    # CircleMarker (not Marker) is the class that takes radius, color and fill options
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_toronto)

map_toronto