# Neighborhoods in Toronto

# Coursera - IBM Data Science Capstone

## Week 3 - Exercise Segmenting and Clustering Neighborhoods in Toronto

#### First step: Install and import the libraries needed for the tasks in this exercise.

In [2]:
!conda install -c conda-forge beautifulsoup4 --yes

!conda install -c conda-forge geopy --yes

!conda install -c conda-forge folium --yes


print('Libraries Installed!')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - beautifulsoup4


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2020.4.5.1         |   py36h9f0ad1d_0         151 KB  conda-forge
    openssl-1.1.1g             |       h516909a_0         2.1 MB  conda-forge
    ca-certificates-2020.4.5.1 |       hecc5488_0         146 KB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    beautifulsoup4-4.9.0       |   py36h9f0ad1d_0         160 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.6 MB

The following NEW packages will be INSTALLED:

    python_abi:      3.6-1_cp36m       conda-forge

The following packages will be UPDATED:

    beautifulsoup4:  4.7.1-py36_1                

In [28]:
import pandas as pd
import numpy as np # library to handle data in a vectorized manner
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import requests  # HTTP requests for the Wikipedia page and the Foursquare API
from pandas.io.json import json_normalize  # flatten nested JSON into a dataframe
import json

from bs4 import BeautifulSoup

from geopy.geocoders import Nominatim

import folium
import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans

print('Libraries imported!')

Libraries imported!


## Task 1: Scrape the Toronto neighborhood data and build a clean dataframe

Use the Notebook to build the code to scrape the following Wikipedia page, https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M, in order to obtain the data that is in the table of postal codes and to transform the data into a pandas dataframe.

1) The dataframe will consist of three columns: PostalCode, Borough, and Neighborhood

2) Only process the cells that have an assigned borough. Ignore cells with a borough that is Not assigned.

3) More than one neighborhood can exist in one postal code area. For example, in the table on the Wikipedia page, you will notice that M5A is listed twice and has two neighborhoods: Harbourfront and Regent Park. These two rows will be combined into one row with the neighborhoods separated with a comma.

4) If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough. So for the 9th cell in the table on the Wikipedia page, the value of the Borough and the Neighborhood columns will be Queen's Park.

5) Clean your Notebook and add Markdown cells to explain your work and any assumptions you are making.

6) In the last cell of your notebook, use the .shape attribute to print the number of rows of your dataframe.
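
The cleaning rules in steps 2 to 4 can be sketched on a small hand-made table. The rows below are illustrative rather than scraped data, and the helper name `clean_postal_table` is an assumption made for this sketch, not part of the assignment.

```python
import pandas as pd

def clean_postal_table(raw):
    """Apply the cleaning rules to a PostalCode/Borough/Neighborhood table."""
    # Rule 2: drop rows whose borough is not assigned
    df = raw[raw['Borough'] != 'Not assigned'].copy()
    # Rule 4: a missing neighborhood takes the name of its borough
    missing = df['Neighborhood'].isin(['Not assigned', ''])
    df.loc[missing, 'Neighborhood'] = df.loc[missing, 'Borough']
    # Rule 3: combine neighborhoods that share a postal code into one row
    return (df.groupby(['PostalCode', 'Borough'], as_index=False)['Neighborhood']
              .agg(lambda names: ', '.join(names)))

# Illustrative rows only (not the real Wikipedia table)
raw = pd.DataFrame({
    'PostalCode':   ['M1A', 'M5A', 'M5A', 'M7A'],
    'Borough':      ['Not assigned', 'Downtown Toronto', 'Downtown Toronto', "Queen's Park"],
    'Neighborhood': ['Not assigned', 'Harbourfront', 'Regent Park', 'Not assigned'],
})
cleaned = clean_postal_table(raw)
print(cleaned)
```

Filling the missing neighborhoods before grouping means every group has at least one real name to join.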

In [5]:
# Download the HTML of the Wikipedia page with requests.get and parse it with Beautiful Soup
Wiki = requests.get('https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M').text
soup = BeautifulSoup(Wiki, 'html.parser')

#### Creating Dataframe

Let's collect the table cells into lists with Beautiful Soup and then convert them to a dataframe with pandas.

In [6]:
L_postalCode = []
L_borough = []
L_neighborhood = []

for row in soup.find('table').find_all('tr'):
    cells = row.find_all('td')
    if(len(cells) > 0):
        L_postalCode.append(cells[0].text.rstrip('\n'))
        L_borough.append(cells[1].text.rstrip('\n'))
        L_neighborhood.append(cells[2].text.rstrip('\n')) # remove the new line char from neighborhood cell
        
toronto_neighorhood = [('PostalCode', L_postalCode), ('Borough', L_borough),('Neighborhood', L_neighborhood)]

toronto_df = pd.DataFrame.from_dict(dict(toronto_neighorhood))
toronto_df.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M1A,Not assigned,
1,M2A,Not assigned,
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,"Regent Park, Harbourfront"


#### Removing rows whose "Borough" value is "Not assigned"

In [7]:
toronto_dropna = toronto_df[toronto_df.Borough != 'Not assigned'].reset_index(drop=True)
toronto_dropna.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,"Regent Park, Harbourfront"
3,M6A,North York,"Lawrence Manor, Lawrence Heights"
4,M7A,Downtown Toronto,"Queen's Park, Ontario Provincial Government"


In [9]:
toronto_grouped = toronto_dropna.groupby(['PostalCode','Borough'], as_index=False).agg(lambda x: ','.join(x))
toronto_grouped.columns=['Postecode','Borough','Neighborhood']
print(toronto_grouped.shape)
toronto_grouped.head()


(103, 3)


Unnamed: 0,Postecode,Borough,Neighborhood
0,M1B,Scarborough,"Malvern, Rouge"
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


## Task 2: Getting coordinates and add to the Toronto DataFrame

Now that you have built a dataframe of the postal code of each neighborhood along with the borough name and neighborhood name, in order to utilize the Foursquare location data, we need to get the latitude and the longitude coordinates of each neighborhood.

In an older version of this course, we were leveraging the Google Maps Geocoding API to get the latitude and the longitude coordinates of each neighborhood. However, recently Google started charging for their API: http://geoawesomeness.com/developers-up-in-arms-over-google-maps-api-insane-price-hike/, so we will use the Geocoder Python package instead: https://geocoder.readthedocs.io/index.html.

The problem with this package is that you sometimes have to be persistent to get the geographical coordinates of a given postal code. A call for a given postal code can return None, and the same call made again can return the coordinates. So, to make sure that you get the coordinates for all of our neighborhoods, you can run a while loop for each postal code. Taking postal code M5G as an example, your code would look something like this:

import geocoder # import geocoder

postal_code = 'M5G' # example postal code

# initialize your variable to None
lat_lng_coords = None

# loop until you get the coordinates
while(lat_lng_coords is None):
  g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
  lat_lng_coords = g.latlng

latitude = lat_lng_coords[0]
longitude = lat_lng_coords[1]

Given that this package can be very unreliable, in case you are not able to get the geographical coordinates of the neighborhoods using the Geocoder package, here is a link to a csv file that has the geographical coordinates of each postal code: http://cocl.us/Geospatial_data

Important Note: There is a limit on how many times you can call the geocoder.google function: 2500 times per day. This should be more than enough for you to get acquainted with the package and to use it to get the geographical coordinates of the neighborhoods in Toronto.
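
Because a `while` loop that waits for a non-None result can spin forever and burn through the daily call limit, a bounded retry may be safer. The sketch below is one possible shape: `geocode_with_retry` and `flaky_lookup` are hypothetical names, and the stand-in lookup replaces a real geocoding call so the example runs offline.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=5, pause_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable returning a (lat, lng) pair or None; it stands in
    for a real geocoding call and is an assumption of this sketch.
    """
    for _ in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(pause_seconds)  # brief pause before trying again
    return None  # the caller decides how to handle a postal code with no result

# Stand-in geocoder that fails twice before answering (illustrative values)
calls = {'count': 0}

def flaky_lookup(query):
    calls['count'] += 1
    return None if calls['count'] < 3 else (43.0, -79.0)

coords = geocode_with_retry(flaky_lookup, 'M5G, Toronto, Ontario')
print(coords, 'after', calls['count'], 'calls')
```

A postal code that still has no coordinates after `max_attempts` can then be filled from the csv file mentioned above.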


In [11]:
url = ('http://cocl.us/Geospatial_data')
geo_data = pd.read_csv(url)
geo_data.columns = ['Postecode','Latitude','Logitude']

print(geo_data.shape)
geo_data.head()

(103, 3)


Unnamed: 0,Postecode,Latitude,Logitude
0,M1B,43.806686,-79.194353
1,M1C,43.784535,-79.160497
2,M1E,43.763573,-79.188711
3,M1G,43.770992,-79.216917
4,M1H,43.773136,-79.239476


In [12]:
LaLo = pd.merge(toronto_grouped, geo_data, on='Postecode')
LaLo.head()

Unnamed: 0,Postecode,Borough,Neighborhood,Latitude,Logitude
0,M1B,Scarborough,"Malvern, Rouge",43.806686,-79.194353
1,M1C,Scarborough,"Rouge Hill, Port Union, Highland Creek",43.784535,-79.160497
2,M1E,Scarborough,"Guildwood, Morningside, West Hill",43.763573,-79.188711
3,M1G,Scarborough,Woburn,43.770992,-79.216917
4,M1H,Scarborough,Cedarbrae,43.773136,-79.239476
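
`pd.merge` performs an inner join by default, so a postal code missing from the coordinates file would silently disappear. One way to check for that, sketched here on made-up frames (the postal code `M9Z` is invented for illustration), is a left join with `indicator=True`:

```python
import pandas as pd

# Illustrative frames; the column names follow the notebook's spelling
neighborhoods = pd.DataFrame({'Postecode': ['M1B', 'M1C', 'M9Z'],
                              'Borough': ['Scarborough', 'Scarborough', 'Example']})
coordinates = pd.DataFrame({'Postecode': ['M1B', 'M1C'],
                            'Latitude': [43.806686, 43.784535],
                            'Logitude': [-79.194353, -79.160497]})

# A left join keeps every neighborhood row; indicator=True adds a `_merge`
# column saying whether a match was found in the coordinates table
checked = pd.merge(neighborhoods, coordinates, on='Postecode',
                   how='left', indicator=True)
unmatched = checked.loc[checked['_merge'] == 'left_only', 'Postecode'].tolist()
print('Postal codes without coordinates:', unmatched)
```

In this notebook both tables have 103 rows, so an empty `unmatched` list would confirm that nothing was dropped.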


#### Create a map of Toronto

First, find the latitude and longitude of Toronto.

In [13]:
address = 'Toronto, Ontario'

geolocator = Nominatim(user_agent="T_On")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("the geographical coordinate of Toronto is {}, {}".format(latitude,longitude))

the geographical coordinate of Toronto is 43.6534817, -79.3839347


Use the latitude and longitude above to create the map.

In [14]:
toronto_map = folium.Map(location=[latitude, longitude], zoom_start=11)

for lat, lng, post, borough, neigh in zip(LaLo['Latitude'], LaLo['Logitude'], LaLo['Postecode'], LaLo['Borough'], LaLo['Neighborhood']):
    label = "{} ({}): {}".format(borough, post, neigh)
    popup = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=popup,
        color='green',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(toronto_map)
    
toronto_map

## Task 3: Explore and cluster the neighborhoods in Toronto

Explore and cluster the neighborhoods in Toronto. You can decide to work with only boroughs that contain the word Toronto and then replicate the same analysis we did to the New York City data. It is up to you.


#### To simplify the analysis, let's first use only the boroughs whose name contains the word "Toronto"

In [15]:
toronto_borough = ['East Toronto', 'Central Toronto', 'Downtown Toronto', 'West Toronto']
LaLoT = LaLo[LaLo['Borough'].isin(toronto_borough)].reset_index(drop=True)
print(LaLoT.shape)
LaLoT.head(10)

(39, 5)


Unnamed: 0,Postecode,Borough,Neighborhood,Latitude,Logitude
0,M4E,East Toronto,The Beaches,43.676357,-79.293031
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188
2,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572
3,M4M,East Toronto,Studio District,43.659526,-79.340923
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879
5,M4P,Central Toronto,Davisville North,43.712751,-79.390197
6,M4R,Central Toronto,North Toronto West,43.715383,-79.405678
7,M4S,Central Toronto,Davisville,43.704324,-79.38879
8,M4T,Central Toronto,"Moore Park, Summerhill East",43.689574,-79.38316
9,M4V,Central Toronto,"Summerhill West, Rathnelly, South Hill, Forest...",43.686412,-79.400049


In [16]:
toronto_borough_map = folium.Map(location=[latitude, longitude], zoom_start=11)

for lat, lng, post, borough, neigh in zip(LaLoT['Latitude'], LaLoT['Logitude'], LaLoT['Postecode'], LaLoT['Borough'], LaLoT['Neighborhood']):
    label = "{} ({}): {}".format(borough, post, neigh)
    popup = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=popup,
        color='Yellow',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(toronto_borough_map)

toronto_borough_map

#### Now let's use the functions we learned in the previous exercise of this course to get venue data from Foursquare

In [17]:
# The code was removed by Watson Studio for sharing.
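
The hidden cell presumably defines the `CLIENT_ID`, `CLIENT_SECRET` and `VERSION` names that the request URL in the next cell uses. A minimal sketch of keeping such values out of a shared notebook is to read them from environment variables. The variable names and the default version date below are assumptions, not values recovered from the removed cell.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Read the Foursquare settings from a mapping of environment variables.

    The variable names and the default version date are assumptions of this
    sketch, not values taken from the hidden cell.
    """
    try:
        client_id = env['FOURSQUARE_CLIENT_ID']
        client_secret = env['FOURSQUARE_CLIENT_SECRET']
    except KeyError as missing:
        raise RuntimeError('Missing environment variable: {}'.format(missing))
    version = env.get('FOURSQUARE_VERSION', '20180605')  # API version date (assumed)
    return client_id, client_secret, version

# Usage with a plain dict standing in for the real environment
demo_env = {'FOURSQUARE_CLIENT_ID': 'my-id', 'FOURSQUARE_CLIENT_SECRET': 'my-secret'}
CLIENT_ID, CLIENT_SECRET, VERSION = load_foursquare_credentials(demo_env)
print(CLIENT_ID, VERSION)
```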

In [18]:
radius = 500
LIMIT = 100

venues_list = []

for lat,lng, post, borough, neighborhood in zip(LaLoT['Latitude'],LaLoT['Logitude'],LaLoT['Postecode'],LaLoT['Borough'],LaLoT['Neighborhood']):
    # create the API request URL
    url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    for v in results:
        venues_list.append((
                post,
                borough,
                neighborhood,
                lat, 
                lng, 
                v['venue']['name'], 
                v['venue']['location']['lat'], 
                v['venue']['location']['lng'],  
                v['venue']['categories'][0]['name']))
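
The loop above indexes straight into the response, so a venue without categories or a response without groups would raise an exception. The following sketch shows one defensive way to read the same nested keys. `extract_venues` is a hypothetical helper and the payload is hand-written in the shape the loop expects, not a real API response.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response.

    The nesting mirrors the keys read by the notebook's loop; venues without
    any category are labelled 'Unknown' instead of raising IndexError.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Hand-written payload in the same shape (illustrative values only)
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Example Cafe', 'location': {'lat': 43.65, 'lng': -79.38},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'No Category Spot', 'location': {'lat': 43.66, 'lng': -79.39},
               'categories': []}},
]}]}}
print(extract_venues(sample))
print(extract_venues({'response': {}}))
```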

Next, create a dataframe from the list of venues.


In [19]:
df_venues = pd.DataFrame(venues_list)
df_venues.columns = ['Postecode', 
                     'Borough', 
                     'Neighborhood',
                     'Borough Latitude', 
                     'Borough Longitude', 
                     'Venue Name',
                     'Venue Latitude',
                     'Venue Logitude',
                     'Venue Category']
print("Dimension of the Dataframe:", df_venues.shape)
df_venues.head()

Dimension of the Dataframe: (1602, 9)


Unnamed: 0,Postecode,Borough,Neighborhood,Borough Latitude,Borough Longitude,Venue Name,Venue Latitude,Venue Logitude,Venue Category
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,Glen Manor Ravine,43.676821,-79.293942,Trail
1,M4E,East Toronto,The Beaches,43.676357,-79.293031,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,M4E,East Toronto,The Beaches,43.676357,-79.293031,Grover Pub and Grub,43.679181,-79.297215,Pub
3,M4E,East Toronto,The Beaches,43.676357,-79.293031,Upper Beaches,43.680563,-79.292869,Neighborhood
4,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,MenEssentials,43.67782,-79.351265,Cosmetics Shop


##### Let's check how many venues were returned for each borough

In [20]:
df_venues.groupby('Borough').count()

Borough,Postecode,Neighborhood,Borough Latitude,Borough Longitude,Venue Name,Venue Latitude,Venue Logitude,Venue Category
Central Toronto,117,117,117,117,117,117,117,117
Downtown Toronto,1205,1205,1205,1205,1205,1205,1205,1205
East Toronto,125,125,125,125,125,125,125,125
West Toronto,155,155,155,155,155,155,155,155


##### Let's find out how many unique categories can be curated from all the returned venues

In [21]:
print ('There are {} unique categories'.format(len(df_venues['Venue Category'].unique())))

There are 235 unique categories


#### Analyze each borough

Let's use the pandas function get_dummies to expand the "Venue Category" column into one column per category. After that we can do some analysis.

In [22]:
#one hot encoding
Toronto_onehot = pd.get_dummies(df_venues[['Venue Category']], prefix = "", prefix_sep = "")

#add Borough column back to dataframe
Toronto_onehot['Borough'] = df_venues['Borough']

#move Borough to first column
fixed_columns = [Toronto_onehot.columns[-1]] + list(Toronto_onehot.columns[:-1])
Toronto_onehot = Toronto_onehot[fixed_columns]

print("Dimension of the DT by categories:",Toronto_onehot.shape)
Toronto_onehot.head()

Dimension of the DT by categories: (1602, 236)


Unnamed: 0,Borough,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese 
Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0
1,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,East Toronto,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
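
A toy example may help show what the one-hot table encodes and why averaging its columns per borough gives venue frequencies. The venues below are made up for illustration. `dtype=int` is passed so the columns hold 0/1 integers regardless of the pandas version.

```python
import pandas as pd

# Illustrative venues: two boroughs, three categories
venues = pd.DataFrame({
    'Borough': ['East', 'East', 'East', 'East', 'West', 'West'],
    'Venue Category': ['Pub', 'Pub', 'Trail', 'Bakery', 'Bakery', 'Bakery'],
})

# One 0/1 column per category, one row per venue
onehot = pd.get_dummies(venues[['Venue Category']], prefix='', prefix_sep='', dtype=int)
onehot.insert(0, 'Borough', venues['Borough'])

# The mean of a 0/1 column within a borough is that category's share of venues
frequencies = onehot.groupby('Borough').mean()
print(frequencies)
```

Each row of `frequencies` sums to 1, since every venue contributes to exactly one category column.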


##### Next, let's group the rows by borough and take the mean of the frequency of occurrence of each category

In [24]:
Toronto_grouped = Toronto_onehot.groupby('Borough').mean().reset_index()
print("Dimension of the Borough by categories:",Toronto_grouped.shape)
Toronto_grouped

Dimension of the Borough by categories: (4, 236)


Unnamed: 0,Borough,Afghan Restaurant,Airport,Airport Food Court,Airport Gate,Airport Lounge,Airport Service,Airport Terminal,American Restaurant,Antique Shop,Aquarium,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Workshop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Stadium,Basketball Stadium,Beach,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Bistro,Board Shop,Boat or Ferry,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Building,Burger Joint,Burrito Place,Bus Line,Bus Stop,Butcher,Café,Cajun / Creole Restaurant,Candy Store,Caribbean Restaurant,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Arts Building,College Auditorium,College Cafeteria,College Gym,College Rec Center,Colombian Restaurant,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Convention Center,Cosmetics Shop,Coworking Space,Creperie,Cuban Restaurant,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Distribution Center,Dog Run,Doner Restaurant,Donut Shop,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Health & Beauty Service,Health Food Store,Historic Site,History Museum,Hobby Shop,Home Service,Hookah Bar,Hospital,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Intersection,Irish Pub,Italian Restaurant,Japanese 
Restaurant,Jazz Club,Jewelry Store,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Light Rail Station,Lingerie Store,Liquor Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Neighborhood,New American Restaurant,Nightclub,Noodle House,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Park,Performing Arts Venue,Pet Store,Pharmacy,Pizza Place,Plane,Playground,Plaza,Poke Place,Pool,Poutine Place,Pub,Ramen Restaurant,Record Shop,Recording Studio,Rental Car Location,Restaurant,Roof Deck,Sake Bar,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Smoothie Shop,Snack Place,Soup Place,Spa,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Stationery Store,Steakhouse,Strip Club,Supermarket,Sushi Restaurant,Swim School,Taco Place,Tailor Shop,Taiwanese Restaurant,Tanning Salon,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Toy / Game Store,Trail,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Yoga Studio
0,Central Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017094,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.0,0.008547,0.0,0.008547,0.0,0.008547,0.008547,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.008547,0.0,0.0,0.008547,0.0,0.008547,0.0,0.0,0.051282,0.0,0.0,0.0,0.008547,0.008547,0.0,0.0,0.0,0.017094,0.0,0.068376,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.034188,0.017094,0.008547,0.0,0.0,0.0,0.008547,0.0,0.0,0.0,0.0,0.0,0.008547,0.008547,0.0,0.0,0.0,0.0,0.008547,0.0,0.008547,0.0,0.0,0.0,0.0,0.008547,0.0,0.0,0.008547,0.0,0.008547,0.0,0.008547,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.008547,0.0,0.034188,0.017094,0.0,0.008547,0.0,0.0,0.008547,0.0,0.008547,0.0,0.0,0.008547,0.0,0.0,0.0,0.017094,0.0,0.0,0.0,0.017094,0.0,0.0,0.008547,0.0,0.0,0.0,0.0,0.008547,0.0,0.017094,0.0,0.0,0.0,0.0,0.0,0.008547,0.008547,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.059829,0.0,0.0,0.017094,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.008547,0.025641,0.0,0.0,0.0,0.008547,0.059829,0.0,0.0,0.008547,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008547,0.0,0.008547,0.008547,0.0,0.0,0.0,0.0,0.008547,0.034188,0.008547,0.0,0.0,0.0,0.0,0.0,0.008547,0.008547,0.0,0.0,0.008547,0.017094,0.0,0.008547,0.0,0.008547,0.0,0.0,0.0,0.008547
1,Downtown Toronto,0.00083,0.00083,0.00083,0.00083,0.00166,0.00249,0.00166,0.014108,0.00166,0.004149,0.009129,0.00166,0.00249,0.007469,0.0,0.0,0.00249,0.00083,0.00166,0.019087,0.006639,0.012448,0.00166,0.00332,0.00083,0.00083,0.014938,0.00249,0.00166,0.00332,0.0,0.00083,0.009959,0.00083,0.00166,0.011618,0.00249,0.005809,0.00249,0.006639,0.005809,0.0,0.0,0.00249,0.056432,0.0,0.00083,0.00249,0.00332,0.004979,0.00083,0.00083,0.0,0.014938,0.010788,0.099585,0.00083,0.00083,0.00166,0.00083,0.00083,0.00166,0.004149,0.00166,0.007469,0.0,0.00083,0.007469,0.0,0.006639,0.0,0.00083,0.00166,0.011618,0.004149,0.005809,0.008299,0.00166,0.00249,0.00083,0.00083,0.00083,0.00166,0.00166,0.00166,0.00249,0.00083,0.004149,0.006639,0.00083,0.0,0.00332,0.0,0.0,0.0,0.00083,0.00332,0.00249,0.00332,0.006639,0.00332,0.0,0.0,0.00249,0.00166,0.00083,0.0,0.0,0.013278,0.00166,0.00166,0.00332,0.00083,0.00332,0.00332,0.00249,0.00249,0.008299,0.016598,0.005809,0.00083,0.00083,0.0,0.00083,0.00166,0.00166,0.0,0.00083,0.00083,0.026556,0.00166,0.00166,0.007469,0.00332,0.00083,0.0,0.00249,0.023237,0.026556,0.004149,0.00083,0.00332,0.00083,0.00166,0.00166,0.0,0.00332,0.00249,0.004979,0.00083,0.00083,0.00332,0.00166,0.005809,0.004149,0.00083,0.00249,0.00083,0.00249,0.00166,0.00166,0.004149,0.00249,0.00166,0.005809,0.00332,0.00166,0.00166,0.00166,0.00083,0.00083,0.00083,0.017427,0.00166,0.00083,0.00249,0.010788,0.00083,0.00166,0.005809,0.00332,0.0,0.00166,0.012448,0.00332,0.00083,0.0,0.00083,0.035685,0.00166,0.00083,0.009959,0.004979,0.009959,0.00249,0.00166,0.017427,0.00166,0.004149,0.0,0.00083,0.00249,0.00166,0.00083,0.00166,0.00332,0.00249,0.008299,0.00083,0.0,0.0,0.009959,0.00083,0.00083,0.013278,0.0,0.00083,0.004149,0.00083,0.00083,0.009129,0.0,0.012448,0.010788,0.00083,0.0,0.00083,0.00249,0.009959,0.00166,0.00332,0.004979,0.0,0.00083,0.005809
2,East Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.024,0.008,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.016,0.0,0.0,0.0,0.04,0.008,0.0,0.0,0.016,0.0,0.0,0.008,0.04,0.0,0.0,0.008,0.008,0.0,0.0,0.0,0.0,0.008,0.0,0.056,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.008,0.0,0.008,0.0,0.008,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.024,0.0,0.008,0.008,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.008,0.008,0.016,0.0,0.008,0.008,0.0,0.016,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.072,0.008,0.008,0.016,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032,0.008,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.008,0.0,0.0,0.008,0.008,0.0,0.016,0.008,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.016,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.032,0.0,0.016,0.0,0.024,0.0,0.0,0.0,0.0,0.0,0.0,0.024,0.0,0.0,0.008,0.0,0.032,0.0,0.0,0.0,0.0,0.016,0.0,0.0,0.008,0.0,0.0,0.008,0.0,0.008,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.008,0.008,0.0,0.0,0.008,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.0,0.016,0.0,0.0,0.0,0.0,0.008,0.0,0.0,0.016
3,West Toronto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.006452,0.012903,0.0,0.0,0.0,0.0,0.0,0.025806,0.012903,0.058065,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.0,0.0,0.0,0.025806,0.006452,0.0,0.025806,0.012903,0.0,0.0,0.0,0.012903,0.0,0.006452,0.0,0.070968,0.006452,0.0,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.006452,0.03871,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.006452,0.0,0.0,0.0,0.0,0.012903,0.006452,0.0,0.0,0.0,0.012903,0.012903,0.0,0.0,0.006452,0.0,0.0,0.006452,0.0,0.0,0.0,0.006452,0.0,0.006452,0.0,0.006452,0.0,0.006452,0.0,0.006452,0.0,0.0,0.0,0.0,0.012903,0.006452,0.0,0.0,0.012903,0.0,0.0,0.0,0.0,0.006452,0.0,0.0,0.0,0.0,0.019355,0.0,0.006452,0.006452,0.019355,0.012903,0.006452,0.0,0.0,0.006452,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.006452,0.006452,0.0,0.03871,0.006452,0.0,0.0,0.006452,0.006452,0.0,0.006452,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012903,0.012903,0.006452,0.006452,0.0,0.0,0.0,0.0,0.006452,0.0,0.012903,0.0,0.006452,0.012903,0.0,0.0,0.0,0.0,0.0,0.0,0.019355,0.006452,0.006452,0.012903,0.019355,0.0,0.0,0.0,0.0,0.006452,0.0,0.012903,0.0,0.006452,0.0,0.0,0.03871,0.0,0.0,0.0,0.0,0.006452,0.006452,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.0,0.0,0.006452,0.0,0.0,0.006452,0.0,0.0,0.0,0.006452,0.012903,0.0,0.0,0.0,0.0,0.0,0.006452,0.0,0.012903,0.006452,0.0,0.0,0.0,0.0,0.019355,0.0,0.012903,0.006452,0.006452,0.0,0.019355


#### Let's print each Borough along with the top 5 most common venues

In [25]:
num_top_venues = 5

for hood in Toronto_grouped['Borough']:
    print("----"+hood+"----")
    temp = Toronto_grouped[Toronto_grouped['Borough'] == hood].T.reset_index()
    temp.columns = ['Venue', 'Frequency']
    temp = temp.iloc[1:]
    temp['Frequency'] = temp['Frequency'].astype(float)
    temp = temp.round({'Frequency':2})
    print(temp.sort_values('Frequency', ascending = False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Central Toronto----
            Venue  Frequency
0     Coffee Shop       0.07
1  Sandwich Place       0.06
2            Park       0.06
3     Pizza Place       0.05
4            Café       0.05


----Downtown Toronto----
                 Venue  Frequency
0          Coffee Shop       0.10
1                 Café       0.06
2           Restaurant       0.04
3                Hotel       0.03
4  Japanese Restaurant       0.03


----East Toronto----
                Venue  Frequency
0    Greek Restaurant       0.07
1         Coffee Shop       0.06
2  Italian Restaurant       0.04
3                Café       0.04
4             Brewery       0.04


----West Toronto----
                Venue  Frequency
0                Café       0.07
1                 Bar       0.06
2          Restaurant       0.04
3  Italian Restaurant       0.04
4         Coffee Shop       0.04




#### Let's put the most common venues into a pandas dataframe

First, let's write a function to sort the venues in descending order.

In [26]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each borough.

In [29]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for ind in np.arange(num_top_venues):
    try:
        # the first three ranks use 'st', 'nd', 'rd'
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # every later rank falls back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
Borough_venues_sorted = pd.DataFrame(columns=columns)
Borough_venues_sorted['Borough'] = Toronto_grouped['Borough']

for ind in np.arange(Toronto_grouped.shape[0]):
    Borough_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Toronto_grouped.iloc[ind, :], num_top_venues)

Borough_venues_sorted.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Central Toronto,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
1,Downtown Toronto,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
2,East Toronto,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
3,West Toronto,Café,Bar,Restaurant,Coffee Shop,Italian Restaurant,Bakery,Bookstore,Breakfast Spot,Yoga Studio,Gift Shop
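
The column labels above depend on picking the right ordinal suffix. A possible sketch of a small helper that builds the same labels without a `try`/`except` is shown below. The name `ordinal` is hypothetical and not part of the notebook; it also covers cases such as 11th to 13th and 21st, which only matter if more than 10 venues are requested.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 11 -> '11th'."""
    if 10 <= n % 100 <= 20:
        # 10th to 20th (and 110th to 120th, ...) always take 'th'
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Borough'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```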


### Cluster Neighborhoods

We are almost done with the analysis.

In [30]:
# set number of clusters
kclusters = 3

# drop the label column so that only the venue frequencies are clustered
Toronto_grouped_clustering = Toronto_grouped.drop('Borough', axis=1)

# run k-means
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Toronto_grouped_clustering)
kmeans.labels_[0:10]

array([0, 1, 2, 1], dtype=int32)
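
With only four rows (one per borough) and three clusters, exactly one pair of boroughs can share a label, which matches the output above. The following is a minimal sketch on invented data that shows the same behaviour; `toy_features` is an assumption for illustration, and `n_init=10` is set explicitly only to keep the result independent of the scikit-learn version's default.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy frequency matrix: 4 "boroughs" x 3 venue categories (made-up values)
toy_features = np.array([
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
    [0.10, 0.85, 0.05],   # deliberately close to the second row
])

toy_kmeans = KMeans(n_clusters=3, random_state=0, n_init=10).fit(toy_features)
toy_labels = toy_kmeans.labels_
print(toy_labels)
```

The label numbers themselves are arbitrary; what matters is which rows end up with the same label.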

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [31]:
Borough_venues_sorted.insert(0, 'Cluster Lables', kmeans.labels_)

Toronto_merged = LaLoT

Toronto_merged = Toronto_merged.join(Borough_venues_sorted.set_index('Borough'), on='Borough')

Toronto_merged.head()

Unnamed: 0,Postecode,Borough,Neighborhood,Latitude,Logitude,Cluster Lables,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,M4E,East Toronto,The Beaches,43.676357,-79.293031,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
1,M4K,East Toronto,"The Danforth West, Riverdale",43.679557,-79.352188,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
2,M4L,East Toronto,"India Bazaar, The Beaches West",43.668999,-79.315572,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
3,M4M,East Toronto,Studio District,43.659526,-79.340923,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
4,M4N,Central Toronto,Lawrence Park,43.72802,-79.38879,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
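
Because the clustering was done per borough, the join copies the borough's cluster label and top venues onto every neighborhood in that borough, which is why the rows above repeat. A small sketch of this behaviour on toy data follows; the frames `toy_locations` and `toy_borough_summary` and their values are hypothetical.

```python
import pandas as pd

# One row per neighborhood (invented names)
toy_locations = pd.DataFrame({
    'Neighborhood': ['N1', 'N2', 'N3'],
    'Borough': ['East', 'East', 'Central'],
})

# One row per borough, as produced after clustering
toy_borough_summary = pd.DataFrame({
    'Borough': ['Central', 'East'],
    'Cluster': [0, 2],
    '1st Most Common Venue': ['Coffee Shop', 'Greek Restaurant'],
})

# join() matches the 'Borough' column on the left against the index on the right
toy_merged = toy_locations.join(toy_borough_summary.set_index('Borough'), on='Borough')
print(toy_merged)
```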


### Visualize clusters

In [39]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=12)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Toronto_merged['Latitude'], Toronto_merged['Logitude'], Toronto_merged['Borough'], Toronto_merged['Cluster Lables']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
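
In the cell above, `ys` is only used for its length, which equals `kclusters`, so the palette can be built directly from the number of clusters. The sketch below does that; the helper name `cluster_palette` is an assumption and not part of the notebook.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

def cluster_palette(n_clusters):
    """Return one hex colour per cluster, evenly spaced along the rainbow colormap."""
    rgba = cm.rainbow(np.linspace(0, 1, n_clusters))
    return [colors.rgb2hex(c) for c in rgba]

palette = cluster_palette(3)
print(palette)
```

Note also that `rainbow[cluster-1]` maps label 0 to the last colour through negative indexing. Each cluster still gets its own colour, but indexing with `rainbow[cluster]` would be a more direct mapping.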

### Examine Clusters

Now we can examine each cluster and determine the discriminating venue categories that distinguish it. Based on the defining categories, we can then assign a name to each cluster.

In [43]:
Toronto_merged.loc[Toronto_merged['Cluster Lables'] == 0, Toronto_merged.columns[[1] + list(range(5, Toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Lables,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
5,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
6,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
7,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
8,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
9,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
22,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
23,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant
24,Central Toronto,0,Coffee Shop,Sandwich Place,Park,Pizza Place,Café,Dessert Shop,Gym,Sushi Restaurant,Pub,Restaurant


In [45]:
Toronto_merged.loc[Toronto_merged['Cluster Lables'] == 1, Toronto_merged.columns[[1] + list(range(5, Toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Lables,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
11,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
12,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
13,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
14,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
15,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
16,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
17,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
18,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym
19,Downtown Toronto,1,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Italian Restaurant,Bakery,Seafood Restaurant,Park,Gym


In [46]:
Toronto_merged.loc[Toronto_merged['Cluster Lables'] == 2, Toronto_merged.columns[[1] + list(range(5, Toronto_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Lables,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,East Toronto,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
1,East Toronto,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
2,East Toronto,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
3,East Toronto,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
38,East Toronto,2,Greek Restaurant,Coffee Shop,Brewery,Italian Restaurant,Café,Park,Ice Cream Shop,Restaurant,Fast Food Restaurant,Pizza Place
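
Since every neighborhood in a borough carries the same cluster label, a compact way to review the result is to list the boroughs that fall into each cluster. This is a sketch on invented data; `toy_merged` stands in for `Toronto_merged` and keeps the notebook's column spelling.

```python
import pandas as pd

toy_merged = pd.DataFrame({
    'Borough': ['East', 'East', 'Central', 'Downtown', 'West'],
    'Cluster Lables': [2, 2, 0, 1, 1],   # column name spelled as in the notebook
})

# For each cluster label, collect the distinct boroughs assigned to it
boroughs_per_cluster = (
    toy_merged.groupby('Cluster Lables')['Borough']
    .agg(lambda s: sorted(s.unique()))
)
print(boroughs_per_cluster)
```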


# Conclusion

Analysing the clusters, we can name them as follows:

    Cluster 0: Living area (it has Park, Gym, Pub and Restaurant venues)
    Cluster 1: Business area (it has hotels, many coffee shops and a gift shop)
    Cluster 2: Living area with a smaller population (it has fewer options; Gym, for example, is not among its most common venues)
  