# The Battle of the Neighbourhoods
## Pandemic and remote work: thinking about moving out of central Toronto?

## 1. Background and Research Problem

With a global pandemic unfolding, many people are now working from home. If that continues, it may no longer be necessary to live in a city where the cost of living tends to be high just to be close to employment. Remote work has made it possible for some people to consider moving from large cities to suburbs and smaller towns, where there is less urban density and more space. Until now, people have congregated in cities for many reasons, including access to employment, the communities found there, and the amenities available in larger urban environments.

In this study, we consider the example of Toronto and set out to look for towns in Southern Ontario that might best suit some of those looking to leave central Toronto. 

This study is aimed at readers who wish to leave Toronto while giving up as little as possible of the quality of life to which they have become accustomed.

In the study, we make a number of assumptions. First, we assume our readers are able to move out of the downtown core of Toronto but do not wish to stray too far from family, friends and existing social relationships, so we restrict ourselves to an area within 100 km of Toronto. Second, we limit our search to neighbourhoods around a train station within this area, for two reasons. Train stations represent a higher order of mobility options, which has practical benefits in itself. In addition, train stations tend to operate only where there is sufficient demand and, as such, are likely to correlate with other urban quality-of-life characteristics. We therefore use train stations as a proxy for quality-of-life characteristics that are otherwise difficult to measure.
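The 100 km restriction can be checked directly from coordinates. The sketch below is a dependency-free illustration and not part of the original analysis: the helper names `haversine_km` and `within_search_area` are hypothetical, the Earth is approximated as a sphere of radius 6,371 km, and the default centre is the Toronto coordinate returned by the geocoder in the appendix.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_search_area(lat, lon, centre=(43.6534817, -79.3839347), max_km=100.0):
    """True if (lat, lon) lies within max_km of the centre (hypothetical helper)."""
    return haversine_km(centre[0], centre[1], lat, lon) <= max_km


# Burlington GO Station, using the coordinates listed in the appendix
burlington_km = haversine_km(43.6534817, -79.3839347, 43.340608, -79.809863)
print(round(burlington_km, 1), within_search_area(43.340608, -79.809863))
```

For the Burlington GO Station coordinates this gives roughly 49 km, in line with the `location.distance` value Foursquare reports for that station.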


## 2. Data


The data we will use for this study will be familiar to peer learners of this course. 

Foursquare offers basic location data we can build on, including the train stations near Toronto together with their longitude, latitude, and municipality.

Also available are data on venues in different neighbourhoods, including descriptions of cafés, restaurants of different types, and other kinds of venues. Taken together, the collection of venues in a given neighbourhood can tell us a lot about the atmosphere and quality of life of that neighbourhood. 

Building on the locations of train stations within our search area, we can use the Foursquare data on which venues are most common around those locations to find the station neighbourhoods in Southern Ontario that are most like the big city.



## 3. Methodology


First, we will explore the Foursquare data to identify train stations within 100 km of Toronto and define them as centres of Station Neighbourhoods.  

Next, we get venues within 2,500 m of those stations and rank the 10 most common venue categories in each Station Neighbourhood.
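To make the ranking step concrete, here is a minimal sketch using only the standard library. It is an illustration under assumptions rather than the notebook's own code (the appendix uses pandas): the function name `top_categories` and the sample rows are invented, and each category's share is simply its count divided by the number of venues returned for that station.

```python
from collections import Counter


def top_categories(venue_rows, n=10):
    """Return {station: [(category, share), ...]} for the n most frequent categories.

    venue_rows is an iterable of (station, category) pairs.
    """
    counts = {}
    for station, category in venue_rows:
        counts.setdefault(station, Counter())[category] += 1

    ranking = {}
    for station, counter in counts.items():
        total = sum(counter.values())
        ranking[station] = [(cat, cnt / total) for cat, cnt in counter.most_common(n)]
    return ranking


# Invented sample data, for illustration only
sample_rows = [
    ("Station A", "Coffee Shop"),
    ("Station A", "Coffee Shop"),
    ("Station A", "Pub"),
    ("Station A", "Park"),
    ("Station B", "Hotel"),
    ("Station B", "Bar"),
    ("Station B", "Bar"),
]
ranking = top_categories(sample_rows, n=2)
print(ranking["Station A"])
```

These shares are the same quantity the appendix obtains by one-hot encoding the categories and taking the mean per station.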

Because we are looking to compare and partition these neighbourhoods into groups with similar characteristics, we will deploy k-means clustering to find groups of Station Neighbourhoods based on the most common venues in those neighbourhoods. The clustering process will be run multiple times to guard against local optima that make little real-world sense.
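Why repeated runs help can be shown with a small, dependency-free sketch. This is a simplified stand-in written for illustration, not the code used in the study: it runs Lloyd's algorithm from several random starts and keeps the run with the lowest inertia (sum of squared distances to the assigned centroid). The function names are hypothetical. In scikit-learn, the `n_init` and `random_state` parameters of `KMeans` serve a comparable purpose.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_once(points, k, rng, max_iter=100):
    """One run of Lloyd's algorithm from a random start; returns (labels, inertia)."""
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    inertia = sum(squared_distance(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def kmeans_best_of(points, k, restarts=10, seed=0):
    """Run k-means several times and keep the lowest-inertia result."""
    rng = random.Random(seed)
    return min((kmeans_once(points, k, rng) for _ in range(restarts)), key=lambda r: r[1])


# Three well-separated toy groups (invented data)
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
          (10.0, 0.0), (10.1, 0.0), (10.0, 0.1)]
labels, inertia = kmeans_best_of(points, k=3, restarts=20, seed=0)
print(labels, round(inertia, 3))
```

A single unlucky start can merge two groups and split the third; comparing inertia across restarts is one simple way to discard such solutions.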

We then go on to generate a map of Southern Ontario plotting the clusters. 

Finally, we print the details of each cluster to enable validation and to help us choose labels for the clusters the analysis generates.

With this information, our readers can use the map together with the details of each cluster to identify station neighbourhoods outside Toronto that are most similar to the Toronto neighbourhoods familiar to them, aiding their search and decision-making around a possible move out of the city.

## 4. Results

Cluster 4 contains the station neighbourhoods most like those around central Toronto. Outside central Toronto, this includes Hamilton, Guelph and the 30 Queen St East station in Mississauga. The latter sits in Port Credit, a suburban neighbourhood surrounding what used to be a small Ontario town that has since been swallowed by Toronto's urban growth.


## 5. Discussion

The analysis provides some indications to help the reader, and it also surfaces interesting outliers. We ran the clustering analysis multiple times, varying the number of clusters and the initialization of random parameters. One noteworthy result is that two station neighbourhoods, XXX and YYY, are so different from those around them that they consistently wind up in clusters of one. This may be a result of how k-means clustering treats outliers: every point is assigned to a cluster even if it does not belong in any. Avenues for further study might therefore include testing hierarchical clustering or DBSCAN.
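The contrast with density-based clustering can be sketched as follows. This is a deliberately minimal, pure-Python version of the DBSCAN idea written for illustration only; it was not run on the study data, and the toy points, `eps` and `min_samples` values are invented. A point with too few neighbours that is not reachable from any dense region is labelled `-1` (noise) instead of being forced into a cluster. For real work, scikit-learn's `sklearn.cluster.DBSCAN` would be the natural choice.

```python
def dbscan(points, eps, min_samples):
    """Minimal DBSCAN sketch; returns one label per point, with -1 marking noise."""
    def neighbours(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue  # already assigned, nothing to expand
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:  # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels


# Two dense toy groups plus one isolated point (invented data)
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11),
              (50, 50)]
toy_labels = dbscan(toy_points, eps=1.5, min_samples=3)
print(toy_labels)
```

Here the isolated point is reported as noise, whereas k-means with a fixed number of clusters would have to place it somewhere, possibly in a cluster of its own.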

## 6. Conclusion

By clustering the types of venues around train stations, we were able to identify station neighbourhoods not too far from central Toronto (within 100 km) that are linked to higher-order transit and most resemble central Toronto. Readers looking to move out of downtown may wish to consider Guelph, Hamilton or Port Credit.

## APPENDIX: Details of Analysis

### Load necessary supports

In [1]:
# Initialize libraries

import json, requests
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
# json_normalize is called as pd.json_normalize below, so no separate import is needed
!pip install folium
import folium # map rendering library
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import matplotlib.cm as cm
import matplotlib.colors as colors
from bs4 import BeautifulSoup
from sklearn.cluster import KMeans

print('Libraries imported.')

Libraries imported.


### Extract and Prepare Foursquare Data

In [2]:
# Prepare use of Foursquare data

CLIENT_ID = '2OUAF5OEOXMJAFKRGOB4P2ZQEHBNTSDAZEHC5DM20BAJLIFK' # your Foursquare ID
CLIENT_SECRET = 'Y3NTD1J0WPLK2LTDDC43K2XDTDQULNZ15DCHYGJHID04TJDB' # your Foursquare Secret
VERSION = '20180604'
LIMIT = 50
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 2OUAF5OEOXMJAFKRGOB4P2ZQEHBNTSDAZEHC5DM20BAJLIFK
CLIENT_SECRET:Y3NTD1J0WPLK2LTDDC43K2XDTDQULNZ15DCHYGJHID04TJDB
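Hard-coding and printing API credentials makes them easy to leak when a notebook is shared. One common alternative, sketched here under assumptions, is to read them from environment variables and mask them in any output. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and both helper functions are hypothetical and not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from environment variables (assumed names)."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


def mask(secret, visible=4):
    """Show only the first few characters so output never contains the full value."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demonstration with a stand-in mapping instead of the real environment
demo_env = {"FOURSQUARE_CLIENT_ID": "abc123", "FOURSQUARE_CLIENT_SECRET": "s3cr3t99"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID:", mask(client_id))
print("CLIENT_SECRET:", mask(client_secret))
```

In a notebook, calling `load_foursquare_credentials()` with no argument would read from the real environment.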


In [3]:
# Define the city and get its latitude & longitude 

city = 'Toronto'
geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(city)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

43.6534817 -79.3839347


In [4]:
# Explore for train stations within 100km of Toronto

categoryId = '4bf58dd8d48988d129951735'
radius = 100000

# Define the URL

url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'\
.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, categoryId, radius, LIMIT)
url

# Send the GET Request and examine the results

results = requests.get(url).json()

# Assign relevant part of JSON to venues
venues = results['response']['venues']

# Transform venues into a dataframe
stations_so = pd.json_normalize(venues)
stations_so.head()

Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,location.crossStreet,location.lat,location.lng,location.labeledLatLngs,location.distance,location.postalCode,location.cc,location.neighborhood,location.city,location.state,location.country,location.formattedAddress
0,4ad94f83f964a520b91921e3,Union Station,"[{'id': '4bf58dd8d48988d129951735', 'name': 'T...",v-1606045924,False,65 Front St W,btwn Bay & York St,43.645167,-79.380641,"[{'label': 'display', 'lat': 43.64516712040756...",962,M5J 1E6,CA,Financial District,Toronto,ON,Canada,"[65 Front St W (btwn Bay & York St), Toronto O..."
1,5a10c7cf840fc2618c73e3c5,Exhibition Station - Track 1,"[{'id': '4bf58dd8d48988d129951735', 'name': 'T...",v-1606045924,False,,,43.63584,-79.4187,"[{'label': 'display', 'lat': 43.63584, 'lng': ...",3420,M6K,CA,,Toronto,ON,Canada,"[Toronto ON M6K, Canada]"
2,5f43dd9ab347a862c43b2d7e,VIA Rail Arrivals,"[{'id': '4bf58dd8d48988d129951735', 'name': 'T...",v-1606045924,False,Union Station,,43.644474,-79.3803,"[{'label': 'display', 'lat': 43.644474, 'lng':...",1044,M5J 1E5,CA,Entertainment District,Toronto,ON,Canada,"[Union Station, Toronto ON M5J 1E5, Canada]"
3,4db199f1a86e63d2116ea484,Union Station Platform 26,"[{'id': '4f4531504b9074f6e4fb0102', 'name': 'P...",v-1606045924,False,65 Front St. W,at Union Station,43.64409,-79.379978,"[{'label': 'display', 'lat': 43.6440898992931,...",1092,,CA,,Toronto,ON,Canada,"[65 Front St. W (at Union Station), Toronto ON..."
4,4b2041a6f964a520752f24e3,Burlington GO Station,"[{'id': '4bf58dd8d48988d129951735', 'name': 'T...",v-1606045924,False,2101 Fairview St,Brant St,43.340608,-79.809863,"[{'label': 'display', 'lat': 43.3406080277783,...",48949,L7R 2E1,CA,,Burlington,ON,Canada,"[2101 Fairview St (Brant St), Burlington ON L7..."


In [5]:
stations_so.shape

(49, 18)

In [6]:
# Remove unneeded data and rename stations column

stations_so = stations_so.drop(['id', 'categories', 'referralId', 'hasPerk', 'location.crossStreet', 'location.labeledLatLngs', 'location.cc', 'location.country', 'location.formattedAddress'], axis='columns')
stations_so = stations_so.rename(columns = {'name':'Station'})
stations_so.head()

Unnamed: 0,Station,location.address,location.lat,location.lng,location.distance,location.postalCode,location.neighborhood,location.city,location.state
0,Union Station,65 Front St W,43.645167,-79.380641,962,M5J 1E6,Financial District,Toronto,ON
1,Exhibition Station - Track 1,,43.63584,-79.4187,3420,M6K,,Toronto,ON
2,VIA Rail Arrivals,Union Station,43.644474,-79.3803,1044,M5J 1E5,Entertainment District,Toronto,ON
3,Union Station Platform 26,65 Front St. W,43.64409,-79.379978,1092,,,Toronto,ON
4,Burlington GO Station,2101 Fairview St,43.340608,-79.809863,48949,L7R 2E1,,Burlington,ON


In [7]:
stations_so.shape

(49, 9)

In [9]:
# From Foursquare get data on the venues within 2,500m of the selected train stations

radius = 2500
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(stations_so['location.lat'], stations_so['location.lng'], stations_so['Station']):

    # create the API request URL
    
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
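The loop above indexes `categories[0]` directly, which raises an `IndexError` if a venue comes back without a category. A more defensive way to flatten each item is sketched below. The helper name `venue_row` is hypothetical, and the item layout is assumed to be exactly the one the loop already relies on (`venue` with `name`, `location.lat`, `location.lng` and a `categories` list).

```python
def venue_row(station, lat, lng, item):
    """Flatten one explore item into a tuple; return None if key fields are missing."""
    venue = item.get("venue", {})
    categories = venue.get("categories") or []
    location = venue.get("location", {})
    if not categories or "lat" not in location or "lng" not in location:
        return None  # skip venues we cannot categorise or place on the map
    return (station, lat, lng, venue.get("name"),
            location["lat"], location["lng"], categories[0]["name"])


# Stand-in items shaped like the fields used in the loop above
good_item = {"venue": {"name": "Canoe",
                       "location": {"lat": 43.647452, "lng": -79.38132},
                       "categories": [{"name": "Restaurant"}]}}
bad_item = {"venue": {"name": "Unknown",
                      "location": {"lat": 43.6, "lng": -79.4},
                      "categories": []}}

rows = [venue_row("Union Station", 43.645167, -79.380641, it) for it in (good_item, bad_item)]
rows = [r for r in rows if r is not None]
print(rows)
```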
    

In [10]:
# Convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# Define the column names
venues_df.columns = ['Station', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(3804, 7)


Unnamed: 0,Station,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Union Station,43.645167,-79.380641,Union Pearson Express,43.644362,-79.383199,Train Station
1,Union Station,43.645167,-79.380641,Scotiabank Arena,43.643446,-79.37904,Basketball Stadium
2,Union Station,43.645167,-79.380641,Delta Hotels by Marriott Toronto,43.642882,-79.383949,Hotel
3,Union Station,43.645167,-79.380641,Canoe,43.647452,-79.38132,Restaurant
4,Union Station,43.645167,-79.380641,Real Sports Apparel,43.64286,-79.380184,Sporting Goods Shop


In [11]:
# Check how many venues were returned for each neighbourhood

venues_df.groupby(["Station"]).count()

Station,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
Agincourt GO Station,100,100,100,100,100,100
Ajax GO Station,78,78,78,78,78,78
Aldershot VIA/GO Station,27,27,27,27,27,27
Allandale Waterfront GO Station,100,100,100,100,100,100
"Amtrak Station - Exchange Street (BFX) (Amtrak - Buffalo, NY Exchange Street Station)",100,100,100,100,100,100
Amtrak Station - Niagara Falls (NFL) (Amtrak - Niagara Falls Station),100,100,100,100,100,100
Appleby GO Station,68,68,68,68,68,68
Aurora GO Station,67,67,67,67,67,67
Barrie South GO Station,19,19,19,19,19,19
Bloor GO / UP Station,100,100,100,100,100,100


In [12]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 292 unique categories.


In [13]:
# Print the first 50 categories

venues_df['VenueCategory'].unique()[:50]

array(['Train Station', 'Basketball Stadium', 'Hotel', 'Restaurant',
       'Sporting Goods Shop', 'Plaza', 'Café', 'Park', 'Pub', 'Museum',
       'Japanese Restaurant', 'Gym', 'Brewery', 'Aquarium',
       'Scenic Lookout', 'Monument / Landmark', 'Food Truck',
       'Thai Restaurant', 'Mediterranean Restaurant', 'Neighborhood',
       'Speakeasy', 'Lake', 'Performing Arts Venue', 'Gastropub',
       'Vegetarian / Vegan Restaurant', 'Farmers Market', 'Movie Theater',
       'American Restaurant', 'Dessert Shop', 'Asian Restaurant',
       'Baseball Stadium', 'Deli / Bodega', 'Italian Restaurant',
       'Smoke Shop', 'Skating Rink', 'Theater', 'Creperie',
       'Liquor Store', 'New American Restaurant', 'Coffee Shop',
       'Pizza Place', 'Ice Cream Shop', 'Bookstore',
       'Middle Eastern Restaurant', 'Beer Bar', 'Clothing Store',
       'Yoga Studio', 'Souvlaki Shop', 'Shopping Mall',
       'Food & Drink Shop'], dtype=object)

### Analyse each neighbourhood

In [14]:
# One hot encoding

stations_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")


# Add the station column back to the dataframe

stations_onehot['Station'] = venues_df['Station'] 


# Move station column to the first column

fixed_columns = [stations_onehot.columns[-1]] + list(stations_onehot.columns[:-1])
stations_onehot = stations_onehot[fixed_columns]


print(stations_onehot.shape)
stations_onehot.head()

(3804, 293)


Unnamed: 0,Station,ATM,Airport,Airport Lounge,Airport Service,American Restaurant,Amphitheater,Aquarium,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Big Box Store,Bistro,Boat or Ferry,Bookstore,Border Crossing,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cable Car,Cafeteria,Café,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Cha Chaan Teng,Chinese Restaurant,Chocolate Shop,Circus,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cruise Ship,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Duty-free Shop,Eastern European Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Financial or Legal Service,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hockey Rink,Hong Kong Restaurant,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Hungarian Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish 
Pub,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Lake,Laser Tag,Latin American Restaurant,Lighthouse,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Monument / Landmark,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Outdoor Sculpture,Pakistani Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Portuguese Restaurant,Post Office,Poutine Place,Pub,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Resort,Rest Area,Restaurant,River,Road,Rock Climbing Spot,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,State / Provincial Park,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Temple,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tour Provider,Toy / Game Store,Track,Trail,Train Station,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese 
Restaurant,Warehouse Store,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo
0,Union Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Union Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Union Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Union Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Union Station,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [15]:
# Next, group rows by station and take the mean of the frequency of occurrence of each category

stations_grouped = stations_onehot.groupby(["Station"]).mean().reset_index()

print(stations_grouped.shape)
stations_grouped.head()

(49, 293)


Unnamed: 0,Station,ATM,Airport,Airport Lounge,Airport Service,American Restaurant,Amphitheater,Aquarium,Arcade,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Big Box Store,Bistro,Boat or Ferry,Bookstore,Border Crossing,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cable Car,Cafeteria,Café,Candy Store,Cantonese Restaurant,Caribbean Restaurant,Cha Chaan Teng,Chinese Restaurant,Chocolate Shop,Circus,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Comic Shop,Community Center,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cruise Ship,Cuban Restaurant,Cupcake Shop,Curling Ice,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dog Run,Donut Shop,Duty-free Shop,Eastern European Restaurant,Electronics Store,Elementary School,Escape Room,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Financial or Legal Service,Fish & Chips Shop,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Golf Course,Golf Driving Range,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hockey Rink,Hong Kong Restaurant,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotpot Restaurant,Hungarian Restaurant,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish 
Pub,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Korean Restaurant,Lake,Laser Tag,Latin American Restaurant,Lighthouse,Lingerie Store,Liquor Store,Lounge,Malay Restaurant,Market,Martial Arts School,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mini Golf,Miscellaneous Shop,Monument / Landmark,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Night Market,Nightclub,Noodle House,North Indian Restaurant,Office,Optical Shop,Organic Grocery,Other Great Outdoors,Outdoor Sculpture,Pakistani Restaurant,Paper / Office Supplies Store,Park,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Portuguese Restaurant,Post Office,Poutine Place,Pub,Racetrack,Ramen Restaurant,Record Shop,Recreation Center,Rental Car Location,Rental Service,Resort,Rest Area,Restaurant,River,Road,Rock Climbing Spot,Rock Club,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shanghai Restaurant,Shipping Store,Shoe Repair,Shoe Store,Shop & Service,Shopping Mall,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Soup Place,South American Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Sri Lankan Restaurant,Stables,State / Provincial Park,Steakhouse,Storage Facility,Supermarket,Supplement Shop,Surf Spot,Sushi Restaurant,Taco Place,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Temple,Tennis Court,Thai Restaurant,Theater,Theme Park,Theme Restaurant,Thrift / Vintage Store,Tibetan Restaurant,Tour Provider,Toy / Game Store,Track,Trail,Train Station,Vegetarian / Vegan Restaurant,Veterinarian,Video Game Store,Vietnamese 
Restaurant,Warehouse Store,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo
0,Agincourt GO Station,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.03,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.08,0.0,0.0,0.04,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0
1,Ajax GO Station,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.012821,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.038462,0.0,0.012821,0.0,0.0,0.012821,0.0,0.076923,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.012821,0.012821,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.012821,0.0,0.0,0.025641,0.0,0.012821,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.012821,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.012821,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.025641,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012821,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0
2,Aldershot VIA/GO Station,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.074074,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0
3,Allandale Waterfront GO Station,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.03,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.09,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.05,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0
4,Amtrak Station - Exchange Street (BFX) (Amtrak...,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.07,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.05,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0


### Visually inspect station neighbourhood characteristics

In [16]:
# Print each station neighborhood along with the top 5 most common venues

num_top_venues = 5

for station in stations_grouped['Station']:
    print("----"+station+"----")
    temp = stations_grouped[stations_grouped['Station'] == station].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agincourt GO Station----
                  venue  freq
0    Chinese Restaurant  0.08
1           Coffee Shop  0.06
2  Fast Food Restaurant  0.04
3        Clothing Store  0.04
4                  Bank  0.04


----Ajax GO Station----
                  venue  freq
0           Coffee Shop  0.08
1         Grocery Store  0.04
2                   Gym  0.04
3              Pharmacy  0.04
4  Caribbean Restaurant  0.04


----Aldershot VIA/GO Station----
                  venue  freq
0           Coffee Shop  0.11
1  Fast Food Restaurant  0.07
2         Grocery Store  0.04
3       Harbor / Marina  0.04
4            Steakhouse  0.04


----Allandale Waterfront GO Station----
                  venue  freq
0           Coffee Shop  0.09
1                   Pub  0.05
2  Fast Food Restaurant  0.05
3        Sandwich Place  0.05
4                 Hotel  0.05


----Amtrak Station - Exchange Street (BFX) (Amtrak - Buffalo, NY Exchange Street Station)----
                venue  freq
0                 Bar  0
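
The per-station loop above can also be written without explicit iteration. The following is a minimal sketch on a small, made-up frequency table (`toy_grouped` and its values are hypothetical, not the study's data) that reshapes the table to long format and keeps the highest-frequency venues for each station.

```python
import pandas as pd

# Hypothetical frequency table in the same shape as stations_grouped:
# one row per station, one column per venue category.
toy_grouped = pd.DataFrame({
    "Station": ["Station A", "Station B"],
    "Bank": [0.10, 0.00],
    "Coffee Shop": [0.30, 0.20],
    "Park": [0.05, 0.40],
    "Pub": [0.20, 0.10],
})

num_top_venues = 2

# Reshape to long format: one row per (station, venue) pair.
long_format = toy_grouped.melt(id_vars="Station", var_name="venue", value_name="freq")

# Sort by frequency within each station and keep the first num_top_venues rows.
top_venues = (
    long_format.sort_values(["Station", "freq"], ascending=[True, False])
    .groupby("Station")
    .head(num_top_venues)
    .reset_index(drop=True)
)

print(top_venues)
```

The result is a single dataframe rather than printed text, which may be easier to filter or export.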

### Create a dataframe with the top 10 venues for each neighbourhood

In [17]:
# First, let's write a function to sort the venues in descending order

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

# Now create a new dataframe and display the top 10 venues for each neighborhood

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# Create columns according to number of top venues
columns = ['Station']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# Create a new dataframe
stations_venues_sorted = pd.DataFrame(columns=columns)
stations_venues_sorted['Station'] = stations_grouped['Station']

for ind in np.arange(stations_grouped.shape[0]):
    stations_venues_sorted.iloc[ind, 1:] = return_most_common_venues(stations_grouped.iloc[ind, :], num_top_venues)

stations_venues_sorted.head()

Unnamed: 0,Station,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agincourt GO Station,Chinese Restaurant,Coffee Shop,Bank,Fast Food Restaurant,Supermarket,Clothing Store,Restaurant,Sandwich Place,Noodle House,Bubble Tea Shop
1,Ajax GO Station,Coffee Shop,Pharmacy,Caribbean Restaurant,Breakfast Spot,Grocery Store,Gym,Burger Joint,Department Store,Bank,Pub
2,Aldershot VIA/GO Station,Coffee Shop,Fast Food Restaurant,Breakfast Spot,Cosmetics Shop,Golf Course,Park,Steakhouse,Grocery Store,Gastropub,Pharmacy
3,Allandale Waterfront GO Station,Coffee Shop,Hotel,Pizza Place,Pub,Fast Food Restaurant,Sandwich Place,Diner,Ice Cream Shop,Bar,Bank
4,Amtrak Station - Exchange Street (BFX) (Amtrak...,Bar,Brewery,Hotel,Coffee Shop,Italian Restaurant,Harbor / Marina,Seafood Restaurant,Cruise Ship,New American Restaurant,American Restaurant
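
The `indicators` list with a fallback to `th` produces correct column names for the top 10 venues used here, but it would label a 21st column as `21th` if `num_top_venues` were ever raised that far. A small helper along the following lines is one way to build the labels for any count; the function name `ordinal` is our own and is not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    # Numbers ending in 11, 12 or 13 (and the rest of the teens) take 'th'.
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


num_top_venues = 10

# Same column layout as stations_venues_sorted.
columns = ["Station"] + [
    f"{ordinal(i)} Most Common Venue" for i in range(1, num_top_venues + 1)
]
print(columns)
```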


### Use k-means clustering to create groups of similar station neighbourhoods

In [18]:
# Run k-means to group the station neighbourhoods into 5 clusters.

# Set number of clusters
kclusters = 5

stations_grouped_clustering = stations_grouped.drop(columns='Station')

# Run k-means clustering
kmeans = KMeans(n_clusters=kclusters, n_init=10, random_state=42).fit(stations_grouped_clustering)

# Check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 


array([0, 0, 4, 4, 3, 0, 0, 0, 1, 3], dtype=int32)
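
The choice of 5 clusters is fixed by hand in the cell above. One common way to sanity-check such a choice is to compare inertia and silhouette scores over a range of values of k. The sketch below does this on synthetic data: `toy_features` is a randomly generated stand-in for `stations_grouped_clustering`, so the printed numbers say nothing about the station data itself.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic stand-in for stations_grouped_clustering: 40 rows of
# venue-category frequencies that each sum to 1 (NOT the study's data).
rng = np.random.default_rng(42)
toy_features = rng.dirichlet(alpha=np.ones(8), size=40)

# Fit k-means for several values of k and record two diagnostics:
# inertia (within-cluster sum of squares) and the mean silhouette score.
scores = {}
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(toy_features)
    scores[k] = (model.inertia_, silhouette_score(toy_features, model.labels_))

for k, (inertia, silhouette) in scores.items():
    print(f"k={k}: inertia={inertia:.3f}, silhouette={silhouette:.3f}")
```

Running the same loop on `stations_grouped_clustering` would show whether 5 is a reasonable value for this dataset. A higher silhouette score suggests better-separated clusters.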

### Create a new dataframe that includes the cluster and the top 10 venues for each station neighbourhood

In [19]:
# Create a new dataframe

stations_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

stations_merged = stations_so

# Join the cluster labels and top venues onto the station location data (latitude/longitude), keyed on Station
stations_merged = stations_merged.join(stations_venues_sorted.set_index('Station'), on='Station')

stations_merged.head()

Unnamed: 0,Station,location.address,location.lat,location.lng,location.distance,location.postalCode,location.neighborhood,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Union Station,65 Front St W,43.645167,-79.380641,962,M5J 1E6,Financial District,Toronto,ON,3,Café,Coffee Shop,Hotel,Restaurant,Park,Gym,Japanese Restaurant,Plaza,Sandwich Place,Baseball Stadium
1,Exhibition Station - Track 1,,43.63584,-79.4187,3420,M6K,,Toronto,ON,3,Park,Bakery,Café,Coffee Shop,Italian Restaurant,American Restaurant,Pizza Place,Gift Shop,Cocktail Bar,Soccer Stadium
2,VIA Rail Arrivals,Union Station,43.644474,-79.3803,1044,M5J 1E5,Entertainment District,Toronto,ON,3,Coffee Shop,Café,Hotel,Restaurant,Park,Japanese Restaurant,Beer Bar,Gym,Dessert Shop,Thai Restaurant
3,Union Station Platform 26,65 Front St. W,43.64409,-79.379978,1092,,,Toronto,ON,3,Coffee Shop,Café,Hotel,Restaurant,Park,Japanese Restaurant,Beer Bar,Gym,Dessert Shop,Thai Restaurant
4,Burlington GO Station,2101 Fairview St,43.340608,-79.809863,48949,L7R 2E1,,Burlington,ON,0,Restaurant,Coffee Shop,Bookstore,Pizza Place,Mediterranean Restaurant,Café,Grocery Store,Sushi Restaurant,Vegetarian / Vegan Restaurant,Gym / Fitness Center
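
Because the join is keyed on the station name, it is worth checking that names are unique in the cluster table and that every station found a match. A station without a match would carry a missing cluster label, which cannot be used to index a colour list when mapping. The sketch below shows one way to check this with `merge`; the `toy_` dataframes and their values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-ins for the station location table and the
# cluster/top-venue table (NOT the study's data).
toy_locations = pd.DataFrame({
    "Station": ["Station A", "Station B", "Station C"],
    "location.lat": [43.65, 43.34, 44.37],
    "location.lng": [-79.38, -79.81, -79.69],
})
toy_clusters = pd.DataFrame({
    "Station": ["Station A", "Station B"],
    "Cluster Labels": [3, 0],
})

# validate="many_to_one" raises pandas.errors.MergeError if a station name
# appears more than once in toy_clusters; indicator=True adds a "_merge"
# column recording whether each row found a match.
checked = toy_locations.merge(
    toy_clusters, on="Station", how="left",
    validate="many_to_one", indicator=True,
)

# Stations present in the location table but absent from the cluster table.
unmatched = checked.loc[checked["_merge"] == "left_only", "Station"].tolist()
print(unmatched)
```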


### Create a Map of the Station Neighbourhood Clusters

In [20]:
# Create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=9)

# Set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# Add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(stations_merged['location.lat'], stations_merged['location.lng'], stations_merged['Station'], stations_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
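
In the cell above, colours are looked up with `rainbow[cluster-1]`, so cluster label 0 reads index -1 and takes the last colour in the list. The colours remain distinct, but they are shifted by one relative to the labels. A minimal sketch of a palette indexed directly by label, using the same `matplotlib` calls, could look like this (the name `palette` is our own):

```python
import numpy as np
from matplotlib import cm, colors

kclusters = 5

# One hex colour per cluster label, evenly spaced along the rainbow colormap.
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

# Index directly by label so that label 0 takes the first colour.
for label in range(kclusters):
    print(label, palette[label])
```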

### Examine Cluster Details to Validate and Suggest Names

In [21]:
# Cluster 1 (cluster label 0)

stations_merged.loc[stations_merged['Cluster Labels'] == 0, stations_merged.columns[[1] + list(range(7, stations_merged.shape[1]))]]

Unnamed: 0,location.address,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,2101 Fairview St,Burlington,ON,0,Restaurant,Coffee Shop,Bookstore,Pizza Place,Mediterranean Restaurant,Café,Grocery Store,Sushi Restaurant,Vegetarian / Vegan Restaurant,Gym / Fitness Center
8,825 Depot Ave W,Niagara Falls,NY,0,Pizza Place,Discount Store,Hotel,Pharmacy,Convenience Store,Coffee Shop,Italian Restaurant,Scenic Lookout,Trail,Donut Shop
9,20 Brow Dr.,Toronto,ON,0,Coffee Shop,Grocery Store,Pharmacy,Fast Food Restaurant,Bakery,Sushi Restaurant,Pizza Place,Department Store,Cosmetics Shop,Clothing Store
15,1322 Bayly St.,Pickering,ON,0,Restaurant,Sandwich Place,Coffee Shop,Ice Cream Shop,Pharmacy,Fast Food Restaurant,Gym,Grocery Store,Pizza Place,Burger Joint
17,214 Cross Ave,Oakville,ON,0,Restaurant,Coffee Shop,Bakery,Pub,Pizza Place,Sushi Restaurant,Café,Sandwich Place,Grocery Store,Japanese Restaurant
18,4100 Sheppard Ave. E,Scarborough,ON,0,Chinese Restaurant,Coffee Shop,Bank,Fast Food Restaurant,Supermarket,Clothing Store,Restaurant,Sandwich Place,Noodle House,Bubble Tea Shop
22,121 Wellington St. E,Aurora,ON,0,Bank,Coffee Shop,Restaurant,Sushi Restaurant,Grocery Store,Gym,Diner,Seafood Restaurant,Sandwich Place,Middle Eastern Restaurant
23,100 Westney Rd. S.,Ajax,ON,0,Coffee Shop,Pharmacy,Caribbean Restaurant,Breakfast Spot,Grocery Store,Gym,Burger Joint,Department Store,Bank,Pub
24,7970 Kennedy Rd.,Markham,ON,0,Dessert Shop,Bubble Tea Shop,Park,Chinese Restaurant,Seafood Restaurant,Bank,Cha Chaan Teng,Japanese Restaurant,Gym,Bakery
25,1350 Brock St. S.,Whitby,ON,0,Coffee Shop,Gas Station,Restaurant,Japanese Restaurant,Sandwich Place,Breakfast Spot,Park,Café,Burger Joint,Pub


In [22]:
# Cluster 2 (cluster label 1)

stations_merged.loc[stations_merged['Cluster Labels'] == 1, stations_merged.columns[[1] + list(range(7, stations_merged.shape[1]))]]

Unnamed: 0,location.address,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
27,833 Yonge St.,Barrie,ON,1,Gas Station,Fast Food Restaurant,Coffee Shop,Supermarket,Pharmacy,Bank,Grocery Store,Liquor Store,Sandwich Place,Playground


In [23]:
# Cluster 3 (cluster label 2)

stations_merged.loc[stations_merged['Cluster Labels'] == 2, stations_merged.columns[[1] + list(range(7, stations_merged.shape[1]))]]

Unnamed: 0,location.address,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
38,7 Station Rd.,King City,ON,2,Italian Restaurant,BBQ Joint,Gas Station,Gastropub,Flower Shop,Skating Rink,Coffee Shop,Market,Pizza Place,Pharmacy


In [24]:
# Cluster 4 (cluster label 3)

stations_merged.loc[stations_merged['Cluster Labels'] == 3, stations_merged.columns[[1] + list(range(7, stations_merged.shape[1]))]]

Unnamed: 0,location.address,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,65 Front St W,Toronto,ON,3,Café,Coffee Shop,Hotel,Restaurant,Park,Gym,Japanese Restaurant,Plaza,Sandwich Place,Baseball Stadium
1,,Toronto,ON,3,Park,Bakery,Café,Coffee Shop,Italian Restaurant,American Restaurant,Pizza Place,Gift Shop,Cocktail Bar,Soccer Stadium
2,Union Station,Toronto,ON,3,Coffee Shop,Café,Hotel,Restaurant,Park,Japanese Restaurant,Beer Bar,Gym,Dessert Shop,Thai Restaurant
3,65 Front St. W,Toronto,ON,3,Coffee Shop,Café,Hotel,Restaurant,Park,Japanese Restaurant,Beer Bar,Gym,Dessert Shop,Thai Restaurant
5,36 Hunter St E,Hamilton,ON,3,Café,Coffee Shop,Restaurant,Pub,Park,Pizza Place,American Restaurant,Middle Eastern Restaurant,Bar,Bakery
6,75 Exhange St,Buffalo,NY,3,Bar,Brewery,Hotel,Coffee Shop,Italian Restaurant,Harbor / Marina,Seafood Restaurant,Cruise Ship,New American Restaurant,American Restaurant
7,61 Front St. W,Toronto,ON,3,Restaurant,Coffee Shop,Hotel,Park,Beer Bar,Japanese Restaurant,Gym,Café,Italian Restaurant,Monument / Landmark
11,Toronto Pearson International Airport,Mississauga,ON,3,Hotel,Coffee Shop,American Restaurant,Rental Car Location,Airport Lounge,Restaurant,Steakhouse,Hobby Shop,Convenience Store,Brewery
12,65 Front St. W,Toronto,ON,3,Coffee Shop,Café,Hotel,Restaurant,Park,Gym,Japanese Restaurant,Plaza,Sandwich Place,Baseball Stadium
13,1456 Bloor Street West,Toronto,ON,3,Café,Coffee Shop,Italian Restaurant,Bakery,Bar,Park,Restaurant,Indian Restaurant,Brewery,Ice Cream Shop


In [25]:
# Cluster 5 (cluster label 4)

stations_merged.loc[stations_merged['Cluster Labels'] == 4, stations_merged.columns[[1] + list(range(7, stations_merged.shape[1]))]]

Unnamed: 0,location.address,location.city,location.state,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,6845 Millcreek Drive,Mississauga,ON,4,Coffee Shop,Fast Food Restaurant,Hotel,Pizza Place,Mexican Restaurant,Sushi Restaurant,Gym,Bank,Grocery Store,Restaurant
14,1199 Waterdown Rd.,Burlington,ON,4,Coffee Shop,Fast Food Restaurant,Breakfast Spot,Cosmetics Shop,Golf Course,Park,Steakhouse,Grocery Store,Gastropub,Pharmacy
16,2104 Wyecroft Road,Oakville,ON,4,Coffee Shop,Gym,Bank,Restaurant,Park,Gas Station,Sandwich Place,Pharmacy,Pizza Place,Convenience Store
19,4105 Kingston Road,Toronto,ON,4,Fast Food Restaurant,Coffee Shop,Pizza Place,Sandwich Place,Discount Store,Supermarket,Bank,Pharmacy,Beer Store,Train Station
20,Lakeshore,Barrie,ON,4,Coffee Shop,Hotel,Pizza Place,Pub,Fast Food Restaurant,Sandwich Place,Diner,Ice Cream Shop,Bar,Bank
29,1713 Steeles Avenue East,Bramalea,ON,4,Coffee Shop,Indian Restaurant,Gas Station,Bank,Asian Restaurant,Fast Food Restaurant,Grocery Store,Greek Restaurant,Bookstore,Clothing Store
31,251 Holland St. E.,Bradford,ON,4,Coffee Shop,Pizza Place,Gas Station,Grocery Store,Sandwich Place,Fast Food Restaurant,Chinese Restaurant,Thai Restaurant,Beer Store,Hardware Store
35,1110 Southdown Rd,Mississauga,ON,4,Coffee Shop,Pizza Place,Italian Restaurant,Breakfast Spot,Hotel,Sandwich Place,Restaurant,Japanese Restaurant,Ice Cream Shop,Bank
36,721 Westburne Dr,Maple,ON,4,Fast Food Restaurant,Coffee Shop,Italian Restaurant,Gas Station,Sandwich Place,Pizza Place,Pharmacy,Park,Grocery Store,Sushi Restaurant
39,39 John St,Etobicoke,ON,4,Coffee Shop,Pizza Place,Gas Station,Sandwich Place,Fast Food Restaurant,Grocery Store,Supermarket,Vietnamese Restaurant,Bank,Ice Cream Shop
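
Reading the cluster tables by eye can be supplemented with a short summary of cluster sizes and the venue categories that recur most often in each cluster's top-venue lists, which may help when suggesting cluster names. The sketch below uses a small, made-up dataframe (`toy_merged`) with the same column naming as `stations_merged`; it is an illustration only, not the study's data.

```python
import pandas as pd

# Hypothetical stand-in for stations_merged (NOT the study's data).
toy_merged = pd.DataFrame({
    "Station": ["A", "B", "C", "D", "E"],
    "Cluster Labels": [0, 0, 3, 3, 3],
    "1st Most Common Venue": ["Restaurant", "Coffee Shop", "Café", "Coffee Shop", "Café"],
    "2nd Most Common Venue": ["Coffee Shop", "Pharmacy", "Coffee Shop", "Café", "Hotel"],
})

venue_columns = [c for c in toy_merged.columns if c.endswith("Most Common Venue")]

# Number of stations in each cluster.
cluster_sizes = toy_merged["Cluster Labels"].value_counts().sort_index()

# How often each venue category appears anywhere in a cluster's top-venue lists.
venue_counts = (
    toy_merged.melt(id_vars="Cluster Labels", value_vars=venue_columns, value_name="venue")
    .groupby(["Cluster Labels", "venue"])
    .size()
    .rename("count")
    .reset_index()
    .sort_values(["Cluster Labels", "count"], ascending=[True, False])
)

print(cluster_sizes)
print(venue_counts)
```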
