# Battle of the Neighborhoods  
The primary purpose of this notebook is to complete the IBM Data Science Professional Certificate Capstone course on www.coursera.org.  

## Table of Contents  
* [Abstract](#abstract)  
* [Hello World (Module 1)](#hello-world)
* [Segmenting and Clustering Neighborhoods in Toronto (Module 2)](#segmenting-and-clustering-neighborhoods-in-toronto)


## Abstract

## Hello World

In [1]:
import pandas as pd
import numpy as np
import json
from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated
import matplotlib.cm as cm
import matplotlib.colors as colors

In [2]:
print('Hello Capstone Project Course!')

Hello Capstone Project Course!


## Segmenting and Clustering Neighborhoods in Toronto

In [3]:
import requests
import lxml.html as lh
import pandas as pd
import numpy as np

We use the requests and lxml libraries to parse the HTML table from Wikipedia. We first fetch the page and load it into an lxml document. lxml is helpful because it lets us select the table rows by their HTML tag. We read the column names from the first row of the three-column table, then repeat the process for the data rows. Finally, we convert the result to a dictionary for pandas.

In [4]:
url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'
page = requests.get(url)
doc = lh.fromstring(page.content)

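tr_elements = doc.xpath('//tr')

# The first <tr> holds the column names; pair each name with an empty list.
col = []
for t in tr_elements[0]:
    name = t.text_content()
    name = name.replace('\n', '')
    col.append((name, []))

# Each following <tr> is a data row; stop at the first row that is not 3 cells wide.
for j in range(1, len(tr_elements)):
    row = tr_elements[j]
    if len(row) != 3:
        break
    i = 0

    for t in row.iterchildren():
        data = t.text_content()
        col[i][1].append(data)
        i += 1

# Convert the (name, values) pairs into a dict that pandas can load.
neighbor_dict = {title: column for title, column in col}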
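The header-then-rows pattern used above can be tried offline on a small inline table. This is a minimal sketch, assuming a hypothetical three-column table in place of the Wikipedia page; the variable names and cell values are illustrative only.

```python
import lxml.html as lh

# Hypothetical three-column table standing in for the Wikipedia page.
html = """
<table>
  <tr><th>Postcode</th><th>Borough</th><th>Neighbourhood</th></tr>
  <tr><td>M1A</td><td>Not assigned</td><td>Not assigned</td></tr>
  <tr><td>M3A</td><td>North York</td><td>Parkwoods</td></tr>
</table>
"""

doc = lh.fromstring(html)
rows = doc.xpath('.//tr')

# The first row holds the column names; strip stray whitespace and newlines.
headers = [cell.text_content().strip() for cell in rows[0]]

# The remaining rows hold the data; collect each cell under its column name.
columns = {name: [] for name in headers}
for row in rows[1:]:
    cells = [cell.text_content().strip() for cell in row]
    if len(cells) != len(headers):
        continue  # skip rows that do not match the header width
    for name, value in zip(headers, cells):
        columns[name].append(value)

print(columns)
```

Keying the lists by column name from the start avoids the separate index counter, and the resulting dictionary can be passed straight to `pd.DataFrame`.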

Next we load the dictionary into a DataFrame and apply several transformations to make the data workable. First, we strip the newline characters. We then replace 'Not assigned' with NaN so the missing values can be handled later. The Series.combine_first function fills each NaN Neighborhood with its Borough name. We then drop any rows that still contain NaNs and group the rows by postcode, joining the neighborhood names with commas. We also drop the original 'Neighbourhood' column, as the author is accustomed to the American spelling 'Neighborhood'.

For the Assignment, this is check mark 1.

In [5]:
nbhd_df = pd.DataFrame(neighbor_dict)
# Clean up newlines and switch to the 'Neighborhood' spelling
nbhd_df['Neighborhood'] = nbhd_df['Neighbourhood'].apply(lambda x: x.replace('\n', ''))
nbhd_df = nbhd_df.drop(['Neighbourhood'], axis=1)
# Mark unassigned cells as missing
nbhd_df = nbhd_df.replace('Not assigned', np.nan)
# Default a missing neighborhood to its borough name
nbhd_df.Neighborhood = nbhd_df.Neighborhood.combine_first(nbhd_df.Borough)
# Drop rows that are still missing a value (unassigned boroughs)
nbhd_df = nbhd_df.dropna(axis=0, how='any')
nbhd_df = nbhd_df.reset_index(drop=True)
# One row per postcode, with its neighborhoods joined by commas
nbhd_df = nbhd_df.groupby('Postcode').agg({'Borough': 'first', 'Neighborhood': ', '.join}).reset_index()
nbhd_df.head(12)
#Eliminate future key errors

Unnamed: 0,Postcode,Borough,Neighborhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae
5,M1J,Scarborough,Scarborough Village
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park"
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge"
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West"
9,M1N,Scarborough,"Birch Cliff, Cliffside West"
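
The effect of each cleaning step is easier to see on a handful of rows. The sketch below repeats the same sequence of calls on a toy frame; the postcodes and names are illustrative stand-ins, not rows taken from the scraped table.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the scraped table (values are illustrative).
raw = pd.DataFrame({
    'Postcode': ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough': ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})

clean = raw.replace('Not assigned', np.nan)
# A missing neighborhood falls back to the borough name.
clean['Neighborhood'] = clean['Neighborhood'].combine_first(clean['Borough'])
# Rows with no borough at all still contain NaN and are dropped here.
clean = clean.dropna(axis=0, how='any')
# One row per postcode, neighborhoods joined into a single string.
clean = (clean.groupby('Postcode')
              .agg({'Borough': 'first', 'Neighborhood': ', '.join})
              .reset_index())
print(clean)
```

In this toy run, M1A disappears because it has no borough, the two M5A rows collapse into one, and M7A takes its borough name as its neighborhood.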


In [6]:
nbhd_df.shape

(103, 3)

In [7]:
# from geopy.geocoders import GoogleV3
# from geopy.extra.rate_limiter import RateLimiter

Get the location of each postcode using the Google geocoding API, for which I already had a key. The code is commented out but left as an example. The data is instead imported from the locations that were written to file.

In [10]:
# geolocator = GoogleV3(api_key='', domain='maps.google.ca')
# geo = RateLimiter(geolocator.geocode, min_delay_seconds=.1, max_retries=5)
# locations = nbhd_df['Postcode']
# locations = locations.apply(lambda x: x + ', Toronto, Ontario')

In [11]:
# nbhd_df['loc'] = locations.apply(geo)
# nbhd_df['latitude'] = nbhd_df['loc'].apply(lambda x: x.latitude if x else None)
# nbhd_df['longitude'] = nbhd_df['loc'].apply(lambda x: x.longitude if x else None)
# nbhd_df.to_csv('Neighborhoods.csv')

From here on, we load the data set from the Neighborhoods.csv file saved alongside the notebook.

In [8]:
nbhd_df = pd.read_csv('Neighborhoods.csv')
nbhd_df = nbhd_df.set_index(['Unnamed: 0'])
nbhd_df = nbhd_df.reset_index(drop=True)
nbhd_df = nbhd_df.drop('loc', axis=1)

The table below corresponds to check mark 2.

In [9]:
nbhd_df.head(12)

Unnamed: 0,Postcode,Borough,Neighborhood,latitude,longitude
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848


In [10]:
import folium
import seaborn as sns

I'm going to assign colors to neighborhoods based on their borough for visualization purposes. I generate a 'Paired' color palette so I can fill the points with a lighter color. I then create a dictionary to hold all of the feature groups and colors so the colors are standardized.

In [11]:
boroughs = nbhd_df['Borough'].unique()
palette = sns.color_palette('Paired',2 * len(boroughs))
p1 = palette.as_hex()[::2]
p2 = palette.as_hex()[1::2]
layer_names = {}
for name, c1, c2 in zip(boroughs, p1, p2):
    layer_names[name] = (folium.map.FeatureGroup(name=name), (c1, c2))
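
The slicing above relies on the 'Paired' palette alternating a lighter and a darker shade of each hue. A minimal sketch of that pairing, with the first six ColorBrewer 'Paired' colors hard-coded so it runs without seaborn; the borough list and the `borough_colors` name are illustrative assumptions.

```python
# First six colors of ColorBrewer's 'Paired' scheme, hard-coded so the
# sketch runs without seaborn; the borough names are illustrative.
paired = ['#a6cee3', '#1f78b4', '#b2df8a', '#33a02c', '#fb9a99', '#e31a1c']
boroughs = ['Scarborough', 'North York', 'Etobicoke']

light = paired[::2]   # even positions: lighter shade of each pair
dark = paired[1::2]   # odd positions: darker shade of each pair

# Map each borough to its (light, dark) pair.
borough_colors = {name: (c1, c2) for name, c1, c2 in zip(boroughs, light, dark)}
print(borough_colors['North York'])
```

Requesting twice as many colors as there are boroughs gives every borough one such pair, so the darker shade can outline a marker while the lighter shade fills it.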

Create markers for each neighborhood with popups. Add each marker to a borough FeatureGroup so they can be toggled in layers.

In [12]:
map_toronto = folium.Map(location=[43.761539, -79.411079], zoom_start=10)
nbhd_df = nbhd_df.dropna(axis=0, how='any')
for lat, lng, label, bor in zip(nbhd_df['latitude'], nbhd_df['longitude'], nbhd_df['Neighborhood'], nbhd_df['Borough']):
    label = folium.Popup(bor + ': ' + label, parse_html=True)
    fgroup = layer_names[bor][0]
    c1 = layer_names[bor][1][1]
    c2 = layer_names[bor][1][0]
    fgroup.add_child(folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=c1,
        fill=True,
        fill_color=c2,
        fill_opacity=.7,
        parse_html=False
        ))
for x in layer_names.values():
    map_toronto.add_child(x[0])
map_toronto.add_child(folium.map.LayerControl())
map_toronto

The markers are color coded by borough, with popups added. This set of locations is my final revision of the map. Dufferin was the last misbehaving point, and its location was set manually. 

### Exploring Neighborhoods

In [20]:
# CLIENT_ID = ''
# CLIENT_SECRET = ''
# VERSION = 20190417
# LIMIT = 100

Code to pull data from Foursquare. The credentials and the call itself are commented out in favor of locally stored data, to reduce the number of API requests.

In [21]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
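
The flattening step can be checked without any API credentials. The sketch below is an assumption-laden stand-in: `mock_response` only mimics the fields that getNearbyVenues reads, and the venue names and coordinates are made up, so no request is sent.

```python
import pandas as pd

# Mock payload shaped like the fields getNearbyVenues reads; the venue
# values are made up for illustration and no request is sent.
mock_response = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Cafe A',
                   'location': {'lat': 43.80, 'lng': -79.19},
                   'categories': [{'name': 'Coffee Shop'}]}},
        {'venue': {'name': 'Park B',
                   'location': {'lat': 43.81, 'lng': -79.20},
                   'categories': [{'name': 'Park'}]}},
    ]}]}
}

items = mock_response['response']['groups'][0]['items']

# One tuple per venue: the neighborhood fields are repeated on every row.
rows = [('Toy Neighborhood', 43.8067, -79.1944,
         v['venue']['name'],
         v['venue']['location']['lat'],
         v['venue']['location']['lng'],
         v['venue']['categories'][0]['name']) for v in items]

toy_venues = pd.DataFrame(rows, columns=[
    'Neighborhood', 'Neighborhood Latitude', 'Neighborhood Longitude',
    'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category'])
print(toy_venues)
```

Each venue becomes one row that carries its neighborhood's name and coordinates, which is what later allows grouping venues by neighborhood.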

Generate list of venues

In [13]:
# venues = getNearbyVenues(nbhd_df['Neighborhood'], nbhd_df['latitude'], nbhd_df['longitude'])
# venues.to_csv('venues.csv')
venues = pd.read_csv('venues.csv')


In [14]:
venues = venues.set_index(['Unnamed: 0'])
venues = venues.reset_index(drop=True)
venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Rouge, Malvern",43.806686,-79.194353,Wendy's,43.807448,-79.199056,Fast Food Restaurant
1,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,Royal Canadian Legion,43.782533,-79.163085,Bar
2,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,Affordable Toronto Movers,43.787919,-79.162977,Moving Target
3,"Guildwood, Morningside, West Hill",43.763573,-79.188711,Swiss Chalet Rotisserie & Grill,43.767697,-79.189914,Pizza Place
4,"Guildwood, Morningside, West Hill",43.763573,-79.188711,G & G Electronics,43.765309,-79.191537,Electronics Store


In [15]:
print(venues.shape)
print('There are {} unique categories.'.format(len(venues['Venue Category'].unique())))

(2248, 7)
There are 277 unique categories.


One-hot encode the venue categories, with one indicator column per category.

In [16]:
tor_onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='')
tor_onehot['Name'] = venues['Neighborhood']

fixed_columns = [tor_onehot.columns[-1]] + list(tor_onehot.columns[:-1])
tor_onehot = tor_onehot[fixed_columns]

tor_onehot.head()

Unnamed: 0,Name,Accessories Store,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"Rouge, Malvern",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,"Highland Creek, Rouge Hill, Port Union",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,"Highland Creek, Rouge Hill, Port Union",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,"Guildwood, Morningside, West Hill",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,"Guildwood, Morningside, West Hill",0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Compute the frequency of each category per neighborhood by averaging the one-hot columns.

In [17]:
tor_grp = tor_onehot.groupby('Name').mean().reset_index()
tor_grp.head()

Unnamed: 0,Name,Accessories Store,Adult Boutique,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,...,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Wine Bar,Wine Shop,Wings Joint,Women's Store,Yoga Studio
0,"Adelaide, King, Richmond",0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0
1,Agincourt,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Agincourt North, L'Amoreaux East, Milliken, St...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Alderwood, Long Branch",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
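
Averaging one-hot columns within a group yields the share of venues in each category. A minimal sketch on toy data (the neighborhood and category names are illustrative; `dtype=float` is passed so the result does not depend on the pandas version's default dummy dtype):

```python
import pandas as pd

# Toy venue list: neighborhood 'A' has four venues, 'B' has two.
toy = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Park', 'Park', 'Cafe', 'Bar', 'Cafe', 'Cafe'],
})

# One indicator column per category, named after the category itself.
onehot = pd.get_dummies(toy[['Venue Category']], prefix='', prefix_sep='', dtype=float)
onehot.insert(0, 'Name', toy['Neighborhood'])

# The mean of 0/1 indicators is the fraction of venues in that category.
freq = onehot.groupby('Name').mean().reset_index()
print(freq)
```

Each row of the result sums to 1, so neighborhoods with very different venue counts are placed on the same scale before clustering.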


In [18]:
tor_grp.shape

(100, 278)

In [19]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Build a table of the top 10 venue categories for each neighborhood, which we use to interpret the clusters.

In [20]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = tor_grp['Name']

for ind in np.arange(tor_grp.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(tor_grp.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Adelaide, King, Richmond",Coffee Shop,Steakhouse,American Restaurant,Café,Thai Restaurant,Bar,Bakery,Restaurant,Burger Joint,Hotel
1,Agincourt,Lounge,Sandwich Place,Breakfast Spot,Skating Rink,Yoga Studio,Donut Shop,Diner,Discount Store,Dog Run,Doner Restaurant
2,"Agincourt North, L'Amoreaux East, Milliken, St...",Park,Playground,Sculpture Garden,Yoga Studio,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
3,"Albion Gardens, Beaumond Heights, Humbergate, ...",Grocery Store,Pizza Place,Pharmacy,Sandwich Place,Liquor Store,Fast Food Restaurant,Beer Store,Fried Chicken Joint,Concert Hall,Construction & Landscaping
4,"Alderwood, Long Branch",Pizza Place,Pharmacy,Pool,Skating Rink,Sandwich Place,Pub,Athletics & Sports,Coffee Shop,Gym,Comfort Food Restaurant
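
The ranking itself is a sort on one row of frequencies. A small sketch with a hypothetical helper, `top_categories`, that mirrors return_most_common_venues on a toy row:

```python
import pandas as pd

def top_categories(row, n):
    # Skip the leading 'Name' entry, then rank categories by frequency.
    ranked = row.iloc[1:].sort_values(ascending=False)
    return list(ranked.index[:n])

# Toy frequency row; the values are illustrative.
toy_row = pd.Series({'Name': 'A', 'Bar': 0.1, 'Cafe': 0.3, 'Park': 0.6})
print(top_categories(toy_row, 2))  # ['Park', 'Cafe']
```

When a neighborhood has fewer distinct categories than the requested top N, the remaining slots are filled by zero-frequency categories in whatever order the sort leaves them. This is probably why the same tail of categories recurs across many rows of the table above.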


In [21]:
from sklearn.cluster import KMeans

Cluster the neighborhoods with k-means (k = 3) on the category frequencies.

In [22]:
k = 3
tor_grp_clustering = tor_grp.drop('Name', axis=1)

kmeans = KMeans(n_clusters=k, random_state=0).fit(tor_grp_clustering)

kmeans.labels_[0:10]

array([1, 1, 2, 1, 1, 1, 1, 1, 1, 1])
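
To see what the labels mean, it helps to run k-means on a few hand-made frequency rows. This is a sketch under stated assumptions: the four rows and the choice of two clusters are illustrative, and `n_init` is set explicitly only to keep behavior the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy frequency rows: the first two are dominated by the first category,
# the last two by the third category.
freq = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
])

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(freq)
labels = model.labels_
print(labels)
```

Rows with similar category profiles receive the same label. The label numbers themselves are arbitrary, so cluster 0 in one run need not correspond to cluster 0 in another.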

In [23]:
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

tor_merged = nbhd_df

tor_merged = tor_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

In [24]:
tor_merged.head(10)

Unnamed: 0,Postcode,Borough,Neighborhood,latitude,longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M1B,Scarborough,"Rouge, Malvern",43.806686,-79.194353,1.0,Fast Food Restaurant,Donut Shop,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Drugstore,Farmers Market
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.784535,-79.160497,1.0,Moving Target,Bar,Yoga Studio,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711,1.0,Medical Center,Breakfast Spot,Rental Car Location,Mexican Restaurant,Intersection,Electronics Store,Pizza Place,Spa,Eastern European Restaurant,Dumpling Restaurant
3,M1G,Scarborough,Woburn,43.770992,-79.216917,1.0,Coffee Shop,Korean Restaurant,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476,1.0,Athletics & Sports,Caribbean Restaurant,Bakery,Bank,Thai Restaurant,Fried Chicken Joint,Hakka Restaurant,Lounge,Empanada Restaurant,Electronics Store
5,M1J,Scarborough,Scarborough Village,43.744734,-79.239476,1.0,Convenience Store,Playground,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
6,M1K,Scarborough,"East Birchmount Park, Ionview, Kennedy Park",43.727929,-79.262029,1.0,Discount Store,Department Store,Coffee Shop,Bus Station,Convenience Store,Donut Shop,Dim Sum Restaurant,Diner,Dog Run,Doner Restaurant
7,M1L,Scarborough,"Clairlea, Golden Mile, Oakridge",43.711112,-79.284577,1.0,Bus Line,Bakery,Metro Station,Intersection,Bus Station,Soccer Field,Fast Food Restaurant,Eastern European Restaurant,Dumpling Restaurant,Drugstore
8,M1M,Scarborough,"Cliffcrest, Cliffside, Scarborough Village West",43.716316,-79.239476,1.0,American Restaurant,Skating Rink,Motel,Yoga Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
9,M1N,Scarborough,"Birch Cliff, Cliffside West",43.692657,-79.264848,1.0,College Stadium,General Entertainment,Skating Rink,Café,Comic Shop,Dessert Shop,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant
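
The labels appear as 1.0 rather than 1 because the join is a left join. tor_grp has 100 rows while nbhd_df has 103, so some neighborhoods presumably had no venues and receive NaN, which forces the column to float. A minimal sketch with toy frames (all names and values are illustrative):

```python
import pandas as pd

# 'C' has no entry in the label table, as a neighborhood with no venues would.
left = pd.DataFrame({'Neighborhood': ['A', 'B', 'C'],
                     'latitude': [43.1, 43.2, 43.3]})
labels = pd.DataFrame({'Neighborhood': ['A', 'B'],
                       'Cluster Labels': [1, 0]})

merged = left.join(labels.set_index('Neighborhood'), on='Neighborhood')
print(merged['Cluster Labels'].tolist())  # [1.0, 0.0, nan]

# Dropping the unmatched rows allows the labels to be cast back to integers.
matched = merged.dropna(axis=0, how='any')
print(matched['Cluster Labels'].astype(int).tolist())  # [1, 0]
```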


Build map and color according to clusters

In [25]:
cl_map_toronto = folium.Map(location=[43.761539, -79.411079], zoom_start=10)
tor_merged = tor_merged.dropna(axis=0, how='any')

x = np.arange(k)
ys = [i + x + (i*x)**2 for i in range(k)]
colors_array = cm.rainbow(np.linspace(0,1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

markers_colors = []
for lat, lng, poi, cluster in zip(tor_merged['latitude'], tor_merged['longitude'], tor_merged['Neighborhood'], tor_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(int(cluster)), parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(cl_map_toronto)
cl_map_toronto.add_child(folium.map.LayerControl())
cl_map_toronto

In [26]:
tor_merged.loc[tor_merged['Cluster Labels'] == 0, tor_merged.columns[[2] + list(range(5, tor_merged.shape[1]))]].head(10)

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
91,"Humber Bay, King's Mill Park, Kingsway Park So...",0.0,Baseball Field,Yoga Studio,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Dessert Shop
97,"Emery, Humberlea",0.0,Baseball Field,Yoga Studio,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Dessert Shop


In [27]:
tor_merged.loc[tor_merged['Cluster Labels'] == 1, tor_merged.columns[[2] + list(range(5, tor_merged.shape[1]))]].head(10)

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Rouge, Malvern",1.0,Fast Food Restaurant,Donut Shop,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Drugstore,Farmers Market
1,"Highland Creek, Rouge Hill, Port Union",1.0,Moving Target,Bar,Yoga Studio,Department Store,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
2,"Guildwood, Morningside, West Hill",1.0,Medical Center,Breakfast Spot,Rental Car Location,Mexican Restaurant,Intersection,Electronics Store,Pizza Place,Spa,Eastern European Restaurant,Dumpling Restaurant
3,Woburn,1.0,Coffee Shop,Korean Restaurant,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
4,Cedarbrae,1.0,Athletics & Sports,Caribbean Restaurant,Bakery,Bank,Thai Restaurant,Fried Chicken Joint,Hakka Restaurant,Lounge,Empanada Restaurant,Electronics Store
5,Scarborough Village,1.0,Convenience Store,Playground,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop,Yoga Studio
6,"East Birchmount Park, Ionview, Kennedy Park",1.0,Discount Store,Department Store,Coffee Shop,Bus Station,Convenience Store,Donut Shop,Dim Sum Restaurant,Diner,Dog Run,Doner Restaurant
7,"Clairlea, Golden Mile, Oakridge",1.0,Bus Line,Bakery,Metro Station,Intersection,Bus Station,Soccer Field,Fast Food Restaurant,Eastern European Restaurant,Dumpling Restaurant,Drugstore
8,"Cliffcrest, Cliffside, Scarborough Village West",1.0,American Restaurant,Skating Rink,Motel,Yoga Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
9,"Birch Cliff, Cliffside West",1.0,College Stadium,General Entertainment,Skating Rink,Café,Comic Shop,Dessert Shop,Ethiopian Restaurant,Empanada Restaurant,Electronics Store,Eastern European Restaurant


In [28]:
tor_merged.loc[tor_merged['Cluster Labels'] == 2, tor_merged.columns[[2] + list(range(5, tor_merged.shape[1]))]].head(10)

Unnamed: 0,Neighborhood,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,"Agincourt North, L'Amoreaux East, Milliken, St...",2.0,Park,Playground,Sculpture Garden,Yoga Studio,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
20,"Silver Hills, York Mills",2.0,Park,Martial Arts Dojo,Cafeteria,Yoga Studio,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run
23,York Mills West,2.0,Park,Bank,Yoga Studio,Donut Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Drugstore
25,Parkwoods,2.0,Park,Fast Food Restaurant,Pool,Food & Drink Shop,Yoga Studio,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store
30,"CFB Toronto, Downsview East",2.0,Park,Airport,Bus Stop,Yoga Studio,Drugstore,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
40,East Toronto,2.0,Park,Convenience Store,Coffee Shop,Drugstore,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant,Donut Shop
44,Lawrence Park,2.0,Park,Bus Line,Swim School,Yoga Studio,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Donut Shop
50,Rosedale,2.0,Park,Trail,Playground,Yoga Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dog Run,Doner Restaurant
64,"Forest Hill North, Forest Hill West",2.0,Park,Sushi Restaurant,Jewelry Store,Trail,Yoga Studio,Doner Restaurant,Dim Sum Restaurant,Diner,Discount Store,Dog Run
74,Caledonia-Fairbanks,2.0,Park,Pharmacy,Women's Store,Market,Fast Food Restaurant,Doner Restaurant,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store


Group 0: Two neighborhoods in close proximity that appear to be defined by the baseball field and the venues that support it.  
Group 1: General restaurants and stores.  
Group 2: Defined by parks and other public services; very outdoorsy.