# Location for a new animal shelter in Toronto

**Ying Zhou**

**Time for the usual imports**

In [102]:
import pandas as pd
import numpy as np
import requests
import folium

Let's put the URL together.

The `CAT_LIST` covers the following Foursquare categories: animal shelters, pet cafés, pet services, pet stores and veterinarians. All category IDs are listed at https://developer.foursquare.com/docs/resources/categories. The search radius is kept moderate (the cell below uses `radius = 10000`, in metres) because with a much larger radius the number of results actually decreases.

In [118]:
CLIENT_ID = '35JVTJQ3DWXUWRZWIVRBAPEDD515TCLODKSVFL1TMKE0VOME' # your Foursquare ID
CLIENT_SECRET = 'UHOTU4NXUEBTO15T4BA3XN0032VU1TZ1OGSMXDRMCO5NGMMK' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LATITUDE = 43.70011
LONGITUDE = -79.4163
CAT_LIST = '56aa371be4b08b9a8d573508,5032897c91d4c4b30a586d69,4bf58dd8d48988d100951735,4e52d2d203646f7c19daa8ae,4d954af4a243a5684765b473'
LIMIT = 300
radius = 10000
url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&intent=checkin&ll={},{}&categoryId={}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    LATITUDE, 
    LONGITUDE,
    CAT_LIST,
    radius, 
    LIMIT)
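
As an aside, the same query string can be assembled with `urllib.parse.urlencode`, which keeps every parameter named and takes care of escaping. This is only a minimal sketch, not the cell used above: the credentials are placeholders, `build_search_url` is a hypothetical helper name, and the category list is shortened to two of the IDs from `CAT_LIST`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(lat, lng, category_ids, radius_m, limit,
                     client_id='YOUR_CLIENT_ID',          # placeholder, not a real key
                     client_secret='YOUR_CLIENT_SECRET',  # placeholder, not a real key
                     version='20180605'):
    """Hypothetical helper: build the venues/search URL from named parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'intent': 'checkin',
        'll': '{},{}'.format(lat, lng),
        'categoryId': ','.join(category_ids),
        'radius': radius_m,
        'limit': limit,
    }
    return BASE_URL + '?' + urlencode(params)

example_url = build_search_url(
    43.70011, -79.4163,
    ['56aa371be4b08b9a8d573508', '4d954af4a243a5684765b473'],
    radius_m=10000, limit=300)

# Parse the URL back to check that the parameters survived encoding.
query = parse_qs(urlsplit(example_url).query)
print(query['ll'], query['radius'])
```

Reading the credentials from environment variables instead of hard-coding them would also keep them out of a shared notebook.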

In [119]:
results = requests.get(url).json()

In [120]:
size = len(results['response']['venues'])
size

50

OK so we have 50 locations! :-)
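
Before indexing into `results['response']['venues']` it can help to confirm that the request actually succeeded. The following is a minimal sketch, assuming the usual Foursquare v2 response envelope with a `meta.code` field; `extract_venues` is a hypothetical helper and both payloads are made-up stand-ins for a real response.

```python
def extract_venues(payload):
    """Return the venue list, or raise if the API reported an error."""
    code = payload.get('meta', {}).get('code')
    if code != 200:
        raise RuntimeError('Foursquare request failed with code {}'.format(code))
    return payload.get('response', {}).get('venues', [])

# Made-up payloads for illustration only.
ok_payload = {'meta': {'code': 200},
              'response': {'venues': [{'name': 'Pet Valu'}, {'name': 'PetSmart'}]}}
bad_payload = {'meta': {'code': 400}, 'response': {}}

print(len(extract_venues(ok_payload)))
try:
    extract_venues(bad_payload)
except RuntimeError as err:
    print(err)
```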

In [121]:
res_arr = results['response']['venues']
data_list = []
for res in res_arr:
    name = res['name']
    address = ', '.join(res['location']['formattedAddress'])
    latitude = res['location']['lat']
    longitude = res['location']['lng']
    category = res['categories'][0]['name']  # use the first listed category
    zipcode = res['location'].get('postalCode', 'N/A').upper()  # defaults to 'N/A' when missing
    data_list.append([name, address, latitude, longitude, category, zipcode])
df = pd.DataFrame(data_list, columns=['name', 'address', 'latitude', 'longitude', 'category', 'zipcode'])
df

| | name | address | latitude | longitude | category | zipcode |
|---|---|---|---|---|---|---|
| 0 | Central Toronto Veterinary Referral Clinic | 1051 Eglinton Ave W, Toronto ON M6C 2C9, Canada | 43.699336 | -79.433081 | Veterinarian | M6C 2C9 |
| 1 | PetSmart | 835 Eglinton Ave E (at Laird Dr.), East York O... | 43.712682 | -79.362636 | Pet Store | M4G 4G9 |
| 2 | PetSmart | 75 Gunns Road, Unit 103, Toronto ON M6N 0A3, C... | 43.675062 | -79.471696 | Pet Store | M6N 0A3 |
| 3 | Pet Valu | 339 College St (at Augusta Ave), Toronto ON M5... | 43.657322 | -79.402946 | Pet Store | M5T 1S2 |
| 4 | Global Pet Foods | 133 Danforth Ave., Toronto ON, Canada | 43.676913 | -79.355225 | Pet Store | |
| 5 | Pet Valu | 1660 Bloor St W (at Indian Rd), Toronto ON M6P... | 43.655366 | -79.457232 | Pet Store | M6P 1A8 |
| 6 | Dogfather & Co. | 1007 Yonge Street (Crescent Rd.), Toronto ON M... | 43.677205 | -79.389451 | Pet Store | M4W 2K9 |
| 7 | Global Pet Foods | 2019 Yonge St (Manor Rd E), Toronto ON M4S 2A2... | 43.701266 | -79.397115 | Pet Store | M4S 2A2 |
| 8 | Global Pet Foods | Dupont, Toronto ON, Canada | 43.67558 | -79.403386 | Pet Store | |
| 9 | Global Pet Foods | 1947 Avenue Rd. (Wilson), Toronto ON, Canada | 43.734015 | -79.419371 | Pet Store | |
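
Some venues come back without a postal code, and a venue could in principle have an empty category list, so a more defensive version of the row-building step may be useful. The sketch below is one way to do it, not the code used above; `venue_to_row` is a hypothetical helper, and the second sample record has an empty category list invented purely to show the fallbacks.

```python
def venue_to_row(venue):
    """Flatten one venue record into [name, address, lat, lng, category, zipcode]."""
    loc = venue.get('location', {})
    categories = venue.get('categories') or []
    return [
        venue.get('name', ''),
        ', '.join(loc.get('formattedAddress', [])),
        loc.get('lat'),
        loc.get('lng'),
        categories[0]['name'] if categories else 'Unknown',  # fallback label (assumption)
        loc.get('postalCode', 'N/A').upper(),
    ]

complete = {
    'name': 'Central Toronto Veterinary Referral Clinic',
    'location': {
        'formattedAddress': ['1051 Eglinton Ave W', 'Toronto ON M6C 2C9', 'Canada'],
        'lat': 43.699336, 'lng': -79.433081, 'postalCode': 'M6C 2C9',
    },
    'categories': [{'name': 'Veterinarian'}],
}
sparse = {  # no postal code; empty category list is invented for illustration
    'name': 'Global Pet Foods',
    'location': {
        'formattedAddress': ['Dupont', 'Toronto ON', 'Canada'],
        'lat': 43.67558, 'lng': -79.403386,
    },
    'categories': [],
}

rows = [venue_to_row(v) for v in (complete, sparse)]
for row in rows:
    print(row)
```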


Now let's clean up the data. Row 49 is miscategorized, so let's relabel it by hand. The other categorizations are mostly accurate.

In [122]:
df.loc[49, 'category'] = 'Pet Store'
df

| | name | address | latitude | longitude | category | zipcode |
|---|---|---|---|---|---|---|
| 0 | Central Toronto Veterinary Referral Clinic | 1051 Eglinton Ave W, Toronto ON M6C 2C9, Canada | 43.699336 | -79.433081 | Veterinarian | M6C 2C9 |
| 1 | PetSmart | 835 Eglinton Ave E (at Laird Dr.), East York O... | 43.712682 | -79.362636 | Pet Store | M4G 4G9 |
| 2 | PetSmart | 75 Gunns Road, Unit 103, Toronto ON M6N 0A3, C... | 43.675062 | -79.471696 | Pet Store | M6N 0A3 |
| 3 | Pet Valu | 339 College St (at Augusta Ave), Toronto ON M5... | 43.657322 | -79.402946 | Pet Store | M5T 1S2 |
| 4 | Global Pet Foods | 133 Danforth Ave., Toronto ON, Canada | 43.676913 | -79.355225 | Pet Store | |
| 5 | Pet Valu | 1660 Bloor St W (at Indian Rd), Toronto ON M6P... | 43.655366 | -79.457232 | Pet Store | M6P 1A8 |
| 6 | Dogfather & Co. | 1007 Yonge Street (Crescent Rd.), Toronto ON M... | 43.677205 | -79.389451 | Pet Store | M4W 2K9 |
| 7 | Global Pet Foods | 2019 Yonge St (Manor Rd E), Toronto ON M4S 2A2... | 43.701266 | -79.397115 | Pet Store | M4S 2A2 |
| 8 | Global Pet Foods | Dupont, Toronto ON, Canada | 43.67558 | -79.403386 | Pet Store | |
| 9 | Global Pet Foods | 1947 Avenue Rd. (Wilson), Toronto ON, Canada | 43.734015 | -79.419371 | Pet Store | |


We also have several locations with incomplete addresses or missing postal codes. That's fine, since we only need their latitude and longitude anyway.

Now let's plot the 50 pet-related locations on a Folium map.

In [123]:
toronto_map = folium.Map(location=[LATITUDE, LONGITUDE], zoom_start=11)
for lat, lng, category, name in zip(df.latitude, df.longitude, df.category, df.name):
    folium.vector_layers.CircleMarker(
        [lat, lng],
        radius=5, # define how big you want the circle markers to be
        color='yellow',
        fill=True,
        popup=name + ' ' + category,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(toronto_map)

# show map
toronto_map

It seems that pets are indeed popular in Toronto! However, a single search returned only 50 venues even though `LIMIT` was set to 300, different search settings give different results, and changing the limit does not improve the situation. Hence we are going to run one search per neighborhood and define the Pet Score (PS) of a neighborhood as the number of pet-related venues found near its center.

In [124]:
import scipy as sp

Now we need to import the old neighborhood information from our .csv file. Note that M5W and M7Y aren't exactly neighborhoods. However, they are still locations within a borough, which is why they are still useful.

In [146]:
df_loc = pd.read_csv('/Users/CatLover/Documents/Python_Gamma/Jupiter Notebookz/neighborhoods_with_location.csv', index_col = 0)
df_loc.head()

| | PostalCode | Borough | Neighborhood | Latitude | Longitude |
|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.753259 | -79.329656 |
| 1 | M4A | North York | Victoria Village | 43.725882 | -79.315572 |
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park | 43.65426 | -79.360636 |
| 3 | M6A | North York | Lawrence Heights, Lawrence Manor | 43.718518 | -79.464763 |
| 4 | M7A | Queen's Park | Queen's Park | 43.662301 | -79.389494 |


Now let's calculate the PS. We have no information on whether animal shelters should be in rich or middle-class neighborhoods, nor can we really estimate the population that passes through an area. We therefore decide that PS should depend solely on how many pet-related venues already exist near a neighborhood. Let's use 1 km as the radius, on the assumption that many customers can comfortably walk 1 km while visiting multiple pet-related venues. The actual distance between the center of a neighborhood and each venue is not used, because a neighborhood has an extent and its center does not represent the entire neighborhood.

In [180]:
def get_pet_score(lat, lng, radius_m = 1000):
    # Count the pet-related venues Foursquare returns within radius_m metres of (lat, lng).
    url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&intent=checkin&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        lng,
        CAT_LIST,
        radius_m,
        LIMIT)
    results = requests.get(url).json()
    res_array = results['response']['venues']
    count = len(res_array) + 1 # Because this new animal shelter also counts!
    return count
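
For comparison, a similar count can be computed offline from coordinates that are already in hand, using the haversine great-circle distance. This is a rough sketch of an alternative, not the method used in this notebook: it only sees the venues passed to it, so its scores will not match the API-based ones. The names `haversine_km` and `local_pet_score` are hypothetical, and the sample coordinates are taken from tables in this notebook.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two (lat, lng) points."""
    lat1, lng1, lat2, lng2 = map(radians, (lat1, lng1, lat2, lng2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def local_pet_score(center, venues, radius_km=1.0):
    """Count venues within radius_km of center, plus one for the new shelter."""
    nearby = sum(1 for lat, lng in venues
                 if haversine_km(center[0], center[1], lat, lng) <= radius_km)
    return nearby + 1  # same +1 convention as get_pet_score

davisville = (43.704324, -79.388790)  # M4S (Davisville) center
venues = [
    (43.701266, -79.397115),  # Global Pet Foods, 2019 Yonge St
    (43.699336, -79.433081),  # Central Toronto Veterinary Referral Clinic
]
print(local_pet_score(davisville, venues))
```

Only the Yonge St store lies within 1 km of the Davisville center here; the clinic on Eglinton Ave W is several kilometres away.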

In [181]:
ps = [None] * df_loc.shape[0]
for index, row in df_loc.iterrows():
    print(row.PostalCode)
    #print(index)
    ps[index] = get_pet_score(row.Latitude, row.Longitude)
print(ps)

#df_loc.head()

M3A
M4A
M5A
M6A
M7A
M9A
M1B
M3B
M4B
M5B
M6B
M9B
M1C
M3C
M4C
M5C
M6C
M9C
M1E
M4E
M5E
M6E
M1G
M4G
M5G
M6G
M1H
M2H
M3H
M4H
M5H
M6H
M1J
M2J
M3J
M4J
M5J
M6J
M1K
M2K
M3K
M4K
M5K
M6K
M1L
M2L
M3L
M4L
M5L
M6L
M9L
M1M
M2M
M3M
M4M
M5M
M6M
M9M
M1N
M2N
M3N
M4N
M5N
M6N
M9N
M1P
M2P
M4P
M5P
M6P
M9P
M1R
M2R
M4R
M5R
M6R
M7R
M9R
M1S
M4S
M5S
M6S
M1T
M4T
M5T
M1V
M4V
M5V
M8V
M9V
M1W
M4W
M5W
M8W
M9W
M1X
M4X
M5X
M8X
M4Y
M7Y
M8Y
M8Z
[3, 5, 24, 4, 15, 1, 3, 6, 6, 23, 3, 2, 3, 1, 5, 19, 7, 4, 2, 18, 12, 3, 2, 9, 17, 17, 7, 2, 6, 9, 19, 13, 2, 1, 2, 7, 9, 23, 4, 3, 2, 10, 19, 10, 1, 1, 1, 12, 22, 2, 3, 4, 3, 2, 12, 8, 1, 1, 1, 12, 2, 2, 9, 5, 3, 4, 4, 16, 8, 16, 4, 3, 2, 9, 18, 15, 4, 2, 6, 19, 14, 10, 3, 4, 22, 2, 10, 7, 4, 1, 1, 7, 18, 2, 1, 1, 13, 18, 8, 20, 11, 3, 8]


In [167]:
#print(len(ps))
df_loc.insert(loc = 5, column = 'PetScore', value = ps)
df_loc

| | PostalCode | Borough | Neighborhood | Latitude | Longitude | PetScore |
|---|---|---|---|---|---|---|
| 0 | M3A | North York | Parkwoods | 43.753259 | -79.329656 | 3 |
| 1 | M4A | North York | Victoria Village | 43.725882 | -79.315572 | 5 |
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park | 43.654260 | -79.360636 | 24 |
| 3 | M6A | North York | Lawrence Heights, Lawrence Manor | 43.718518 | -79.464763 | 4 |
| 4 | M7A | Queen's Park | Queen's Park | 43.662301 | -79.389494 | 15 |
| 5 | M9A | Etobicoke | Islington Avenue | 43.667856 | -79.532242 | 1 |
| 6 | M1B | Scarborough | Rouge, Malvern | 43.806686 | -79.194353 | 3 |
| 7 | M3B | North York | Don Mills North | 43.745906 | -79.352188 | 6 |
| 8 | M4B | East York | Woodbine Gardens, Parkview Hill | 43.706397 | -79.309937 | 6 |
| 9 | M5B | Downtown Toronto | Ryerson, Garden District | 43.657162 | -79.378937 | 23 |


Note that wildlife control services and other miscategorized venues definitely exist in the results. However, where the PetScore is already high they don't make much of a difference. After all, most venues that are supposed to be pet-related actually are. Now we can order the table by PetScore.

In [168]:
df_loc_ordered = df_loc.sort_values('PetScore', ascending = False)
df_loc_ordered

| | PostalCode | Borough | Neighborhood | Latitude | Longitude | PetScore |
|---|---|---|---|---|---|---|
| 2 | M5A | Downtown Toronto | Harbourfront, Regent Park | 43.654260 | -79.360636 | 24 |
| 37 | M6J | West Toronto | Little Portugal, Trinity | 43.647927 | -79.419750 | 23 |
| 9 | M5B | Downtown Toronto | Ryerson, Garden District | 43.657162 | -79.378937 | 23 |
| 84 | M5T | Downtown Toronto | Chinatown, Grange Park, Kensington Market | 43.653206 | -79.400049 | 22 |
| 48 | M5L | Downtown Toronto | Commerce Court, Victoria Hotel | 43.648198 | -79.379817 | 22 |
| 99 | M4Y | Downtown Toronto | Church and Wellesley | 43.665860 | -79.383160 | 20 |
| 79 | M4S | Central Toronto | Davisville | 43.704324 | -79.388790 | 19 |
| 30 | M5H | Downtown Toronto | Adelaide, King, Richmond | 43.650571 | -79.384568 | 19 |
| 42 | M5K | Downtown Toronto | Design Exchange, Toronto Dominion Centre | 43.647177 | -79.381576 | 19 |
| 15 | M5C | Downtown Toronto | St. James Town | 43.651494 | -79.375418 | 19 |


Now it is time to plot our results.

In [172]:
toronto_map = folium.Map(location=[LATITUDE, LONGITUDE], zoom_start=11)
for lat, lng, zipcode, score in zip(df_loc.Latitude, df_loc.Longitude, df_loc.PostalCode, df_loc.PetScore):
    # Loop variable is named score so that it does not overwrite the ps list built earlier.
    if score < 10:
        pcolor = 'red'
    elif score < 15:
        pcolor = 'orange'
    elif score < 20:
        pcolor = 'yellow'
    else:
        pcolor = 'green'
    folium.vector_layers.CircleMarker(
        [lat, lng],
        radius=5, # define how big you want the circle markers to be
        color=pcolor,
        fill=True,
        popup=zipcode + " " + str(score),
        fill_color=pcolor,
        fill_opacity=0.6
    ).add_to(toronto_map)

# show map
toronto_map
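
The colour bands used in the map cell (below 10 red, below 15 orange, below 20 yellow, otherwise green) can also be expressed as a small lookup, which makes the thresholds easier to adjust. A minimal sketch, assuming the same cut-offs; `score_to_color` is a hypothetical helper name.

```python
from bisect import bisect_right

THRESHOLDS = [10, 15, 20]                      # upper bounds (exclusive) of the first three bands
COLORS = ['red', 'orange', 'yellow', 'green']  # one more colour than thresholds

def score_to_color(score):
    """Map a PetScore to its marker colour using the band thresholds."""
    return COLORS[bisect_right(THRESHOLDS, score)]

for s in (1, 10, 19, 24):
    print(s, score_to_color(s))
```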

It seems that M5A is the best postal area, followed by M6J, M5B, M5T and M5L. Now let's compare the average PetScore by borough.

In [179]:
df_b = df_loc[['Borough','PetScore']].groupby('Borough').mean().sort_values('PetScore', ascending = False)
df_b

| Borough | PetScore |
|---|---|
| Downtown Toronto | 16.666667 |
| Queen's Park | 15.0 |
| West Toronto | 14.5 |
| East Toronto | 12.6 |
| Central Toronto | 10.555556 |
| East York | 7.2 |
| Mississauga | 4.0 |
| York | 3.8 |
| Etobicoke | 3.333333 |
| North York | 3.291667 |


In terms of boroughs, Downtown Toronto is the best, followed by Queen's Park, West Toronto, East Toronto and Central Toronto. The rest have fairly low pet scores, which may be a consequence of them being less walkable. This isn't actually unfair to suburban areas, because walkability is in fact desirable for pet-related locations. Central Toronto is a bit overrated due to M5W and East Toronto is slightly underrated due to M7Y, but these shouldn't change the picture much.
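
A borough mean says little about how many postal codes stand behind it, so it can be worth reporting the count next to the average. The sketch below does this in plain Python on the first ten rows of the PetScore table only, so its numbers differ from the full borough averages; `borough_summary` is a hypothetical helper name.

```python
from collections import defaultdict

# (Borough, PetScore) pairs for rows 0-9 of the PetScore table; a subset, not the full data.
rows = [
    ('North York', 3), ('North York', 5), ('Downtown Toronto', 24),
    ('North York', 4), ("Queen's Park", 15), ('Etobicoke', 1),
    ('Scarborough', 3), ('North York', 6), ('East York', 6),
    ('Downtown Toronto', 23),
]

def borough_summary(pairs):
    """Return [(borough, (mean score, number of postal codes))], best mean first."""
    scores = defaultdict(list)
    for borough, score in pairs:
        scores[borough].append(score)
    summary = {b: (sum(s) / len(s), len(s)) for b, s in scores.items()}
    return sorted(summary.items(), key=lambda item: item[1][0], reverse=True)

summary = borough_summary(rows)
for borough, (mean, n) in summary:
    print(borough, mean, n)
```

In this subset Queen's Park's average rests on a single postal code, which is the kind of case where the count is a useful caveat.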

## Conclusion

So the verdict is in: the best location for a new animal shelter is M5A, namely Harbourfront and Regent Park.