<h1 align=center><font size = 5>Correlation between Malaga and Toronto Neighborhoods</font></h1>

## Introduction

The goal of this notebook is to explore the neighborhoods of Malaga and Toronto in order to identify similarities and correlations between them. In short, the question to be answered is: given a neighborhood X in the city of Toronto, which neighborhoods in Malaga have the same kind of venues?

For the Toronto and Malaga neighborhood data, there are webpages that have all the information needed to explore and cluster the neighborhoods. These webpages are scraped, and the data is then wrangled and cleaned. After that, the data is loaded into a pandas dataframe so that it is in a structured format. Once the data is structured, the dataset can be explored and the neighborhoods clustered through a machine learning algorithm. Finally, the resulting clusters are analyzed, both analytically and visually.

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Scrape Webpages to obtain the Data</a>
    1. <a href="#item1a">Toronto Data</a>
    2. <a href="#item1b">Malaga Data</a>
    
    
2. <a href="#item2">Data Wrangling / Data Cleaning</a>
    1. <a href="#item2a">Toronto Data</a>
    2. <a href="#item2b">Malaga Data</a>
    
    
3. <a href="#item3">Add Lat/Lon</a>
    1. <a href="#item3a">Toronto Data</a>
    2. <a href="#item3b">Malaga Data</a>
    
    
4. <a href="#item4">Merging Data</a>      

5. <a href="#item5">Explore Neighborhoods</a>  
    
6. <a href="#item6">Analyze Each Neighborhood</a>  
    
7. <a href="#item7">Cluster Neighborhoods</a>  
    
8. <a href="#item8">Examine Clusters</a> 
    
9. <a href="#item9">Conclusion</a>  
  
</font>
</div>

Before getting the data and exploring it, let's import all the dependencies that we will need.

In [1]:
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import geocoder # import geocoder
import folium # map rendering library
import requests # library to handle requests
from sklearn.cluster import KMeans # import k-means for clustering
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values
import matplotlib.cm as cm
import matplotlib.colors as colors

print('Libraries imported.')

Libraries imported.


<a id='item1'></a>

## 1. Scrape Webpages to obtain the Data

In this section, some webpages are scraped in order to obtain the Toronto and Malaga neighborhood data.

<a id='item1a'></a>

### A. Toronto Data

The following Wikipedia page, https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M, is scraped in order to obtain the table of postal codes and transform it into a pandas dataframe.

In [14]:
# Getting an old version of the Wikipedia Webpage since the format has changed recently
link = "https://en.wikipedia.org/w/index.php?title=List_of_postal_codes_of_Canada:_M&oldid=935851093"

# For scraping the table, I use pandas to read the table into a pandas dataframe.
tables = pd.read_html(link)

# There are 3 tables in the Wikipedia webpage, but I am interested in the first one
postal_code_toronto = tables[0]
postal_code_toronto.head()

Unnamed: 0,Postcode,Borough,Neighbourhood
0,M1A,Not assigned,Not assigned
1,M2A,Not assigned,Not assigned
2,M3A,North York,Parkwoods
3,M4A,North York,Victoria Village
4,M5A,Downtown Toronto,Harbourfront


<a id='item1b'></a>

### B. Malaga Data

In [15]:
link = "https://wiki.openstreetmap.org/wiki/ES:Lugares_en_M%C3%A1laga"

# For scraping the table, I use pandas to read the table into a pandas dataframe.
tables = pd.read_html(link)

# There are 3 tables in the Wiki page, but I am interested in the second one
districts_malaga = tables[1]
districts_malaga.head()

Unnamed: 0,Ref.,Distrito,Barrios
0,1,Centro,"Barcenillas, Campos Elíseos, Cañada de los Ing..."
1,2,Este,"Baños del Carmen, Bellavista, Castillo de Sant..."
2,3,Ciudad Jardín,"Alegría de la Huerta, Ciudad Jardín, Cortijo B..."
3,4,Bailén-Miraflores,"Camino de Suárez, Carlinda, Carlos Haya, Flori..."
4,5,Palma-Palmilla,"26 de febrero, 503 Viviendas, Arroyo de los Án..."


<a id='item2'></a>

## 2. Data Wrangling / Data Cleaning

<a id='item2a'></a>

### A. Toronto Data

In [16]:
# Renaming first column
postal_code_toronto.rename(columns = {'Postcode':'PostalCode'}, inplace = True)

In [17]:
# Dropping rows with "Not assigned" Borough
toronto_data = postal_code_toronto[postal_code_toronto['Borough'] != 'Not assigned'].reset_index(drop=True)
toronto_data.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor


In [18]:
# If a cell has a borough but a Not assigned neighborhood, then the neighborhood will be the same as the borough
toronto_data_clean = toronto_data.copy()
toronto_data_clean.loc[toronto_data['Neighbourhood'] == 'Not assigned', 'Neighbourhood'] = toronto_data['Borough']
toronto_data_clean.head(10)

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M3A,North York,Parkwoods
1,M4A,North York,Victoria Village
2,M5A,Downtown Toronto,Harbourfront
3,M6A,North York,Lawrence Heights
4,M6A,North York,Lawrence Manor
5,M7A,Downtown Toronto,Queen's Park
6,M9A,Queen's Park,Queen's Park
7,M1B,Scarborough,Rouge
8,M1B,Scarborough,Malvern
9,M3B,North York,Don Mills North


In [19]:
# More than one neighborhood can exist in one postal code area. 
# In this case, rows will be combined into one row with the neighborhoods separated with a comma
toronto_data_clean_grouped = toronto_data_clean.groupby(['PostalCode', 'Borough'])['Neighbourhood'].apply(lambda x: ', '.join(x)).reset_index()
toronto_data_clean_grouped.head()

Unnamed: 0,PostalCode,Borough,Neighbourhood
0,M1B,Scarborough,"Rouge, Malvern"
1,M1C,Scarborough,"Highland Creek, Rouge Hill, Port Union"
2,M1E,Scarborough,"Guildwood, Morningside, West Hill"
3,M1G,Scarborough,Woburn
4,M1H,Scarborough,Cedarbrae


In [20]:
print("Shape {}".format(toronto_data_clean_grouped.shape))

Shape (103, 3)
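
The three cleaning steps above (dropping unassigned boroughs, falling back to the borough name, and joining neighbourhoods that share a postal code) can be sketched on a small hand-made table. The rows below are illustrative only and are not taken from the scraped data; `toy` and `toy_grouped` are names introduced just for this sketch.

```python
import pandas as pd

# Toy rows (illustrative only) mimicking the scraped table's columns
toy = pd.DataFrame({
    'PostalCode': ['M1A', 'M6A', 'M6A', 'M7A'],
    'Borough': ['Not assigned', 'North York', 'North York', "Queen's Park"],
    'Neighbourhood': ['Not assigned', 'Lawrence Heights', 'Lawrence Manor', 'Not assigned'],
})

# 1) Drop rows whose borough is not assigned
toy = toy[toy['Borough'] != 'Not assigned'].reset_index(drop=True)

# 2) Where the neighbourhood is missing, fall back to the borough name
missing = toy['Neighbourhood'] == 'Not assigned'
toy.loc[missing, 'Neighbourhood'] = toy.loc[missing, 'Borough']

# 3) One row per postal code, neighbourhoods joined with a comma
toy_grouped = (
    toy.groupby(['PostalCode', 'Borough'])['Neighbourhood']
       .agg(', '.join)
       .reset_index()
)
print(toy_grouped)
```

Computing the mask once and reusing it on both sides of the assignment makes the row alignment explicit rather than relying on index matching.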


<a id='item2b'></a>

### B. Malaga Data

In [26]:
# Remove useless columns
if 'Observaciones' in districts_malaga.columns:
    malaga_data = districts_malaga.drop(columns=['Ref.', 'Observaciones'])
else:
    malaga_data = districts_malaga.drop(columns=['Ref.'])

# Renaming columns
malaga_data.columns = ['Borough', 'Neighbourhood']

malaga_data.head()

Unnamed: 0,Borough,Neighbourhood
0,Centro,"Barcenillas, Campos Elíseos, Cañada de los Ing..."
1,Este,"Baños del Carmen, Bellavista, Castillo de Sant..."
2,Ciudad Jardín,"Alegría de la Huerta, Ciudad Jardín, Cortijo B..."
3,Bailén-Miraflores,"Camino de Suárez, Carlinda, Carlos Haya, Flori..."
4,Palma-Palmilla,"26 de febrero, 503 Viviendas, Arroyo de los Án..."


In [27]:
print("Shape {}".format(malaga_data.shape))

Shape (11, 2)


<a id='item3'></a>

## 3.  Add Lat/Lon

<a id='item3a'></a>

### A. Toronto Data

In [28]:
# Initialize your variables
lats = []
lons = []

# loop through all postal codes
for postal_code in toronto_data_clean_grouped['PostalCode']:
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Toronto, Ontario'.format(postal_code))
        lat_lng_coords = g.latlng
    
    # append the new coordinates 
    lats.append(lat_lng_coords[0])
    lons.append(lat_lng_coords[1])
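
The loop above keeps querying until coordinates come back, so it would never finish if the geocoder kept returning nothing for some query. A bounded retry is one possible alternative. The sketch below is an assumption-laden illustration, not the notebook's method: `geocode_with_retry` and `flaky_lookup` are hypothetical names, and the lookup is an offline stand-in so the block runs without network access.

```python
import time

def geocode_with_retry(lookup, query, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is a hypothetical callable returning [lat, lng] or None,
    e.g. lambda q: geocoder.arcgis(q).latlng in the notebook's setting.
    """
    for attempt in range(max_attempts):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay_seconds)  # brief pause before the next attempt
    return None  # the caller decides how to handle a failed lookup


# Offline stand-in for a geocoder that fails twice, then succeeds
calls = {'n': 0}

def flaky_lookup(query):
    calls['n'] += 1
    return None if calls['n'] < 3 else [43.0, -79.0]

result = geocode_with_retry(flaky_lookup, 'M1B, Toronto, Ontario')
gave_up = geocode_with_retry(lambda q: None, 'nowhere', max_attempts=2)
print(result, gave_up)
```

Returning `None` after the last attempt lets the calling code skip or log the affected row instead of hanging.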

In [29]:
# add coordinates to the DF
toronto_data_clean_grouped['Latitude'] = lats
toronto_data_clean_grouped['Longitude'] = lons

# Remove the PostalCode column since it will not be used anymore
if 'PostalCode' in toronto_data_clean_grouped.columns:
    toronto_data_clean_grouped.drop(columns=['PostalCode'], inplace=True)

toronto_data_clean_grouped.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Scarborough,"Rouge, Malvern",43.811525,-79.195517
1,Scarborough,"Highland Creek, Rouge Hill, Port Union",43.785665,-79.158725
2,Scarborough,"Guildwood, Morningside, West Hill",43.765815,-79.175193
3,Scarborough,Woburn,43.768369,-79.21759
4,Scarborough,Cedarbrae,43.769688,-79.23944


<a id='item3b'></a>

### B. Malaga Data

In [30]:
# Initialize your variables
lats = []
lons = []

# loop through all districts
for district in malaga_data['Borough']:
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while lat_lng_coords is None:
        g = geocoder.arcgis('{}, Malaga, Spain'.format(district))
        lat_lng_coords = g.latlng
    
    # append the new coordinates 
    lats.append(lat_lng_coords[0])
    lons.append(lat_lng_coords[1])

In [31]:
# add coordinates to the DF
malaga_data['Latitude'] = lats
malaga_data['Longitude'] = lons

malaga_data.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Centro,"Barcenillas, Campos Elíseos, Cañada de los Ing...",36.72256,-4.42529
1,Este,"Baños del Carmen, Bellavista, Castillo de Sant...",36.746028,-4.091789
2,Ciudad Jardín,"Alegría de la Huerta, Ciudad Jardín, Cortijo B...",36.74609,-4.42186
3,Bailén-Miraflores,"Camino de Suárez, Carlinda, Carlos Haya, Flori...",36.72438,-4.35065
4,Palma-Palmilla,"26 de febrero, 503 Viviendas, Arroyo de los Án...",36.74094,-4.42922


<a id='item4'></a>

## 4. Merging Data

In [32]:
# [Optional] Work only with boroughs that contain the word Toronto --> TORONTO CITY
# .copy() returns an independent dataframe, so adding a column later does not trigger SettingWithCopyWarning
toronto_city_data = toronto_data_clean_grouped[toronto_data_clean_grouped['Borough'].str.contains('Toronto')].copy()
toronto_city_data.shape

(39, 4)

In [33]:
toronto_city_data['City'] = 'Toronto'
malaga_data['City'] = 'Malaga'


In [34]:
# Concat Toronto and Malaga data
toronto_malaga_data = pd.concat([toronto_city_data, malaga_data])
toronto_malaga_data.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,City
37,East Toronto,The Beaches,43.676531,-79.295425,Toronto
41,East Toronto,"The Danforth West, Riverdale",43.683178,-79.355105,Toronto
42,East Toronto,"The Beaches West, India Bazaar",43.667965,-79.314667,Toronto
43,East Toronto,Studio District,43.660629,-79.334855,Toronto
44,Central Toronto,Lawrence Park,43.72842,-79.387133,Toronto


<a id='item5'></a>

## 5. Explore Neighborhoods

#### Define Foursquare Credentials and Version

In [35]:
CLIENT_ID = 'GFRY1XKD0KM34O1PNLXM2PCVPRDDBEKMKH1XBNKSZJDYWUCH' # your Foursquare ID
CLIENT_SECRET = 'LIFCGOJSSBNFZOY2LXGUNIZEVS2QC5TL2OIVYTQZD0IAF12V' # your Foursquare Secret
VERSION = '20190929'
LIMIT = 100
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: GFRY1XKD0KM34O1PNLXM2PCVPRDDBEKMKH1XBNKSZJDYWUCH
CLIENT_SECRET:LIFCGOJSSBNFZOY2LXGUNIZEVS2QC5TL2OIVYTQZD0IAF12V


#### Let's create a function to get up to 100 venues around each neighborhood in Toronto and Malaga within a radius of 1,000 meters

In [36]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighbourhood', 
                  'Neighbourhood Latitude', 
                  'Neighbourhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
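
The function above depends on the nested structure `response -> groups[0] -> items` and on each item carrying a `venue` with at least one category. The sketch below walks a hand-made payload of that shape offline. The payload shape is inferred only from the keys the function accesses, the values are invented, and `extract_venues` is a hypothetical helper. It also shows one way to tolerate a venue whose `categories` list is empty, which the `[0]` lookup would otherwise reject with an `IndexError`.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten an explore-style payload into one tuple per venue."""
    items = payload['response']['groups'][0]['items']
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None when a venue carries no category
        category = categories[0]['name'] if categories else None
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows


# Hand-made payload (illustrative values, not real API output)
sample_payload = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Venue A', 'location': {'lat': 1.0, 'lng': 2.0},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Venue B', 'location': {'lat': 1.5, 'lng': 2.5},
               'categories': []}},
]}]}}

rows = extract_venues(sample_payload, 'Toy Hood', 1.1, 2.1)
print(rows)
```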

#### Get nearby venues

In [37]:
toronto_malaga_venues = getNearbyVenues(names=toronto_malaga_data['Neighbourhood'],
                                        latitudes=toronto_malaga_data['Latitude'],
                                        longitudes=toronto_malaga_data['Longitude'])
toronto_malaga_venues.head()

Unnamed: 0,Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,The Beaches,43.676531,-79.295425,Glen Manor Ravine,43.676821,-79.293942,Trail
1,The Beaches,43.676531,-79.295425,The Big Carrot Natural Food Market,43.678879,-79.297734,Health Food Store
2,The Beaches,43.676531,-79.295425,Tori's Bakeshop,43.672114,-79.290331,Vegetarian / Vegan Restaurant
3,The Beaches,43.676531,-79.295425,The Beech Tree,43.680493,-79.288846,Gastropub
4,The Beaches,43.676531,-79.295425,Mastermind Toys,43.671453,-79.293971,Toy / Game Store


Let's check how many venues were returned for each neighborhood

In [38]:
toronto_malaga_venues.groupby('Neighbourhood').count()

Neighbourhood,Neighbourhood Latitude,Neighbourhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"25 Años de Paz, Alaska, Almudena, Ardira, Ave María, Barceló, Cortijo Vallejo, Dos Hermanas, El Bulto, El Higueral, El Torcal, Finca El Pato, Girón, Guadaljaire, Haza de la Pesebrera, Haza Honda, Huelin, Jardín de la Abadía, La Luz, La Paz, La Princesa, Las Delicias, Los Girasoles, Los Guindos, Mainake, Nuevo San Andrés 1, Nuevo San Andrés 2, Pacífico, Parque Ayala, Parque Mediterráneo, Puerta Blanca, Regio, Sacaba Beach, San Andrés, San Carlos, San Carlos Condote, Santa Isabel, Santa Paula, Sixto, Tabacalera, Torres de la Serna, Virgen de Belén, Vistafranca.",62,62,62,62,62,62
"26 de febrero, 503 Viviendas, Arroyo de los Ángeles, Huerta La Palma, La Palma, La Palmilla, La Roca, La Rosaleda, Las Erizas, Las Virreinas, Martiricos, Virreina, Virreina Alta.",28,28,28,28,28,28
"4 de Diciembre, Arroyo del Cuarto, Camino de Antequera, Carranque, Cortijo de Torres, Cruz del Humilladero, El Duende, Explanada de la Estación, Haza Cuevas, La Asunción, La Aurora, La Barriguilla, La Unión, Los Prados, Los Tilos, Mármoles (Cruz de Humilladero), Nuestra Señora del Carmen, Núcleo General Franco, Polígono Alameda, Polígono Carretera de Cártama, Portada Alta, Renfe, San Rafael, Santa Cristina, Santa Julia, Santa Marta, Tiro de Pichón, San José del Viso, Intelhorce, Sánchez Blanca.",69,69,69,69,69,69
"Adelaide, King, Richmond",100,100,100,100,100,100
"Alegría de la Huerta, Ciudad Jardín, Cortijo Bazán, Hacienda Los Montes, Haza Carpintero, Herrera Oria, Huerta Nueva, Jardín de Málaga, Jardín Virginia, Las Flores, Los Cassini, Los Cipreses, Los Naranjos, Los Viveros, Mangas Verdes, Monte Dorado, Parque del Sur, Sagrada Familia, San José.",18,18,18,18,18,18
"Arroyo España, Atabal Este, El Atabal, El Chaparral, El Cortijuelo-Junta de los Caminos, El Limonero, El Tomillar, Fuente Alegre, Hacienda Altamira, Hacienda Cabello, Huerta Nueva-Puerto de la Torre, Las Morillas 2, Las Morillas-Puerto de la Torre, Los Almendros, Los Asperones 1 y 3, Los Morales, Los Morales 1, Los Morales 2, Los Ramos, Los Tomillares, Orozco, Puertosol, Salinas, Santa Isabel-Puerto de la Torre, Soliva Este, Torremar, Virgen del Carmen.",4,4,4,4,4,4
"Barcenillas, Campos Elíseos, Cañada de los Ingleses, Capuchinos, Centro Histórico, Conde de Ureña, Cristo de la Epidemia, El Ejido, El Molinillo, Ensanche Centro, La Caleta, La Goleta, La Malagueta, La Manía, La Merced, La Trinidad (Centro), La Victoria, Lagunillas, Los Antonios, Mármoles (Centro), Monte Sancha, Olletas, Perchel Norte, Perchel Sur, Pinares de Olletas, Plaza de Toros Vieja, San Felipe Neri, San Miguel, Santa Amalia, Segalerva, Sierra Blanquilla, Ventaja Alta.",100,100,100,100,100,100
"Baños del Carmen, Bellavista, Castillo de Santa Catalina, Cerrado de Calderón, Clavero, Echeverría del Palo, El Candado, El Chanquete, El Drago, El Limonar, El Mayorazgo, El Morlaco, El Palo, El Polvorín, El Rocío, Finca El Candado, Hacienda Clavero, Hacienda Miramar, Hacienda Paredes, Jarazmín, La Araña, La Mosca, La Pelusa, La Pelusilla, La Torrecilla, La Vaguada, La Viña, Las Acacias, Las Cuevas, Las Palmeras, Lomas de San Antón, Los Pinos del Limonar, Miraflores, Miraflores Alto, Miraflores del Palo, Miramar, Miramar del Palo, Olías, Parque Clavero, Pedregalejo, Pedregalejo Playa, Peinado Grande, Pinares de San Antón, Playa Virginia, Playas del Palo, Podadera, San Francisco, San Isidro, Santa Paula Miramar, Torre de San Telmo, Valle de los Galanes, Villa Cristina, Virgen de las Angustias.",51,51,51,51,51,51
Berczy Park,100,100,100,100,100,100
"Brockton, Exhibition Place, Parkdale Village",100,100,100,100,100,100


<a id='item6'></a>

## 6. Analyze Each Neighborhood

In [39]:
# one hot encoding
toronto_malaga_onehot = pd.get_dummies(toronto_malaga_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
toronto_malaga_onehot['Neighbourhood'] = toronto_malaga_venues['Neighbourhood'] 

# move neighborhood column to the first column
fixed_columns = [toronto_malaga_onehot.columns[-1]] + list(toronto_malaga_onehot.columns[:-1])
toronto_malaga_onehot = toronto_malaga_onehot[fixed_columns]

toronto_malaga_onehot.head()

Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,...,Video Store,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,The Beaches,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [40]:
toronto_malaga_grouped = toronto_malaga_onehot.groupby('Neighbourhood').mean().reset_index()
toronto_malaga_grouped.head()

Unnamed: 0,Neighbourhood,Accessories Store,Afghan Restaurant,American Restaurant,Animal Shelter,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,...,Video Store,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Winery,Wings Joint,Women's Store,Yoga Studio
0,"25 Años de Paz, Alaska, Almudena, Ardira, Ave ...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"26 de febrero, 503 Viviendas, Arroyo de los Án...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"4 de Diciembre, Arroyo del Cuarto, Camino de A...",0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Adelaide, King, Richmond",0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Alegría de la Huerta, Ciudad Jardín, Cortijo B...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
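
To make the encoding and averaging steps concrete, here is a minimal sketch on a made-up venue list. The mean of a 0/1 indicator column within a neighbourhood is simply the share of that neighbourhood's venues falling in the category. The data and the names `venues`, `onehot` and `freq` are illustrative assumptions.

```python
import pandas as pd

# Toy venue list (illustrative only)
venues = pd.DataFrame({
    'Neighbourhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bar', 'Park', 'Park'],
})

# One indicator column per category, one row per venue
onehot = pd.get_dummies(venues['Venue Category'], dtype=int)
onehot.insert(0, 'Neighbourhood', venues['Neighbourhood'])

# The mean of 0/1 indicators is the share of venues in each category
freq = onehot.groupby('Neighbourhood').mean()
print(freq)
```

Each row of `freq` sums to 1, so neighbourhoods with very different venue counts remain comparable.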


#### Let's print each neighborhood along with the top 5 most common venues

In [41]:
num_top_venues = 5

for hood in toronto_malaga_grouped['Neighbourhood']:
    print("----"+hood+"----")
    temp = toronto_malaga_grouped[toronto_malaga_grouped['Neighbourhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----25 Años de Paz, Alaska, Almudena, Ardira, Ave María, Barceló, Cortijo Vallejo, Dos Hermanas, El Bulto, El Higueral, El Torcal, Finca El Pato, Girón, Guadaljaire, Haza de la Pesebrera, Haza Honda, Huelin, Jardín de la Abadía, La Luz, La Paz, La Princesa, Las Delicias, Los Girasoles, Los Guindos, Mainake, Nuevo San Andrés 1, Nuevo San Andrés 2, Pacífico, Parque Ayala, Parque Mediterráneo, Puerta Blanca, Regio, Sacaba Beach, San Andrés, San Carlos, San Carlos Condote, Santa Isabel, Santa Paula, Sixto, Tabacalera, Torres de la Serna, Virgen de Belén, Vistafranca.----
              venue  freq
0        Restaurant  0.10
1  Tapas Restaurant  0.06
2              Café  0.06
3     Grocery Store  0.05
4     Shopping Mall  0.03


----26 de febrero, 503 Viviendas, Arroyo de los Ángeles, Huerta La Palma, La Palma, La Palmilla, La Roca, La Rosaleda, Las Erizas, Las Virreinas, Martiricos, Virreina, Virreina Alta.----
                 venue  freq
0       Clothing Store  0.14
1       Ice Cream Shop 

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [42]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]
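
To see what `return_most_common_venues` yields, here is a small self-contained sketch. The function is repeated so the block runs on its own, and the row values are made up for illustration.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]  # skip the leading 'Neighbourhood' entry
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Toy row (illustrative frequencies) shaped like a row of the grouped table
row = pd.Series({'Neighbourhood': 'Toy Hood', 'Bar': 0.1,
                 'Coffee Shop': 0.6, 'Park': 0.3})
top_two = list(return_most_common_venues(row, 2))
print(top_two)
```

The function returns category names rather than frequencies, so categories with equal frequency appear in whatever order the sort leaves them.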

Now let's create the new dataframe and display the top 10 venues for each neighborhood.

In [120]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighbourhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighbourhood'] = toronto_malaga_grouped['Neighbourhood']

for ind in np.arange(toronto_malaga_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(toronto_malaga_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighbourhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"25 Años de Paz, Alaska, Almudena, Ardira, Ave ...",Restaurant,Tapas Restaurant,Café,Grocery Store,Italian Restaurant,Shopping Mall,Pub,Nightclub,Hotel,Burger Joint
1,"26 de febrero, 503 Viviendas, Arroyo de los Án...",Clothing Store,Ice Cream Shop,Sporting Goods Shop,Bar,Coffee Shop,Tapas Restaurant,Market,Chinese Restaurant,Café,Supermarket
2,"4 de Diciembre, Arroyo del Cuarto, Camino de A...",Spanish Restaurant,Burger Joint,Café,Italian Restaurant,Ice Cream Shop,Grocery Store,Restaurant,Bakery,Plaza,Electronics Store
3,"Adelaide, King, Richmond",Hotel,Coffee Shop,Japanese Restaurant,Café,Theater,Restaurant,Vegetarian / Vegan Restaurant,Concert Hall,Italian Restaurant,Pizza Place
4,"Alegría de la Huerta, Ciudad Jardín, Cortijo B...",Grocery Store,Ice Cream Shop,Park,Snack Place,Market,Bar,Mexican Restaurant,BBQ Joint,Café,Gym / Fitness Center


<a id='item7'></a>

## 7. Cluster Neighborhoods

Run *k*-means with different values of *k* to find the one that yields the most clusters containing both Malaga and Toronto neighborhoods.

In [128]:
# In total, summing Malaga and Toronto, there are 50 neighborhoods, so only "k" values from 2 to 24 are tried
k_range = range(2, 25)

# set number of clusters
kclusters = k_range

# Drop "Neighbourhood" column
toronto_malaga_grouped_clustering = toronto_malaga_grouped.drop(columns='Neighbourhood')

# Score Array
score = []

for k in kclusters:
    # run k-means clustering
    kmeans = KMeans(n_clusters=k, random_state=5, n_init=50).fit(toronto_malaga_grouped_clustering)
    
    # add clustering labels
    if not 'Cluster_Labels' in neighborhoods_venues_sorted:
        neighborhoods_venues_sorted.insert(0, 'Cluster_Labels', kmeans.labels_)
        toronto_malaga_merged = toronto_malaga_data.copy()
        
        # merge neighborhoods_venues_sorted with toronto_malaga_data to add latitude/longitude for each neighborhood
        toronto_malaga_merged = toronto_malaga_merged.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')
    else:
        toronto_malaga_merged['Cluster_Labels'] = kmeans.labels_
        
    toronto_clusters = toronto_malaga_merged[toronto_malaga_merged['City'] == 'Toronto']['Cluster_Labels']
    malaga_clusters = toronto_malaga_merged[toronto_malaga_merged['City'] == 'Malaga']['Cluster_Labels']
      
    # "Score" is the amount of clusters where there is at least one Malaga Neighborhood and one Toronto Neighborhood
    score.append(len(np.intersect1d(toronto_clusters.unique(), malaga_clusters.unique())))

# Which is the best "k"? (highest score)
highest_score_k = kclusters[np.argmax(score)]
highest_score_k

20
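
The score used to pick *k* counts the clusters that contain at least one neighbourhood from each city. A dependency-free sketch of that count is shown below. `shared_cluster_score` is a hypothetical helper and the label and city lists are toy values; it assumes the two lists are aligned element by element.

```python
def shared_cluster_score(labels, cities):
    """Count clusters containing at least one neighbourhood from every city.

    `labels` and `cities` must be aligned: element i of each refers to
    the same neighbourhood.
    """
    cities_per_cluster = {}
    for label, city in zip(labels, cities):
        cities_per_cluster.setdefault(label, set()).add(city)
    all_cities = set(cities)
    return sum(1 for seen in cities_per_cluster.values() if seen == all_cities)


# Toy assignment (illustrative only): clusters 0 and 2 mix both cities
toy_labels = [0, 0, 1, 2, 2, 1]
toy_cities = ['Toronto', 'Malaga', 'Toronto', 'Malaga', 'Toronto', 'Toronto']
toy_score = shared_cluster_score(toy_labels, toy_cities)
print(toy_score)
```

Note that `np.argmax` returns the first maximum, so when several values of *k* tie on this score the smallest of them is selected.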

In [129]:
# run k-means clustering with the best "k" found
kmeans = KMeans(n_clusters=highest_score_k, random_state=5, n_init=50).fit(toronto_malaga_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_

array([17, 14, 17,  2, 12,  5, 17,  8,  2, 10,  3,  2, 19, 16,  9,  4, 15,
       13, 10, 16, 13,  2, 19, 19, 18,  2, 10,  2, 18, 10, 19,  7,  0, 11,
       10, 18, 19, 19, 13,  1,  6, 19, 13,  2,  2,  0, 16, 16, 16, 19])

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.

In [130]:
# Add clustering labels
if not 'Cluster_Labels' in neighborhoods_venues_sorted:
    neighborhoods_venues_sorted.insert(0, 'Cluster_Labels', kmeans.labels_)
    toronto_malaga_merged = toronto_malaga_data.copy()

    # merge neighborhoods_venues_sorted with toronto_malaga_data to add latitude/longitude for each neighborhood
    toronto_malaga_merged = toronto_malaga_merged.join(neighborhoods_venues_sorted.set_index('Neighbourhood'), on='Neighbourhood')
else:
    toronto_malaga_merged['Cluster_Labels'] = kmeans.labels_

toronto_malaga_merged.head()

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,East Toronto,The Beaches,43.676531,-79.295425,Toronto,17,Pub,Coffee Shop,Park,Beach,Breakfast Spot,Bakery,BBQ Joint,Sandwich Place,Caribbean Restaurant,Restaurant
41,East Toronto,"The Danforth West, Riverdale",43.683178,-79.355105,Toronto,14,Greek Restaurant,Coffee Shop,Café,Pub,Pizza Place,Fast Food Restaurant,Italian Restaurant,Bakery,Ice Cream Shop,Yoga Studio
42,East Toronto,"The Beaches West, India Bazaar",43.667965,-79.314667,Toronto,17,Indian Restaurant,Coffee Shop,Restaurant,Pub,Beach,Italian Restaurant,Fast Food Restaurant,Park,Pizza Place,Bakery
43,East Toronto,Studio District,43.660629,-79.334855,Toronto,2,Coffee Shop,Café,Pizza Place,Italian Restaurant,Bakery,Bar,American Restaurant,Diner,Brewery,Bank
44,Central Toronto,Lawrence Park,43.72842,-79.387133,Toronto,12,Café,Coffee Shop,Park,Bus Line,Bookstore,Restaurant,College Quad,College Gym,Trail,Gym / Fitness Center


Let's get the geographical coordinates of Toronto.

In [131]:
address = 'Toronto'

geolocator = Nominatim(user_agent="geo_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Toronto are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Toronto are 43.653963, -79.387207.


Let's visualize the resulting clusters on a map of Toronto.

In [133]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(highest_score_k)
ys = [i + x + (i*x)**2 for i in range(highest_score_k)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_malaga_merged['Latitude'], toronto_malaga_merged['Longitude'], toronto_malaga_merged['Neighbourhood'], toronto_malaga_merged['Cluster_Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

Let's get the geographical coordinates of Malaga.

In [134]:
address = 'Malaga'

geolocator = Nominatim(user_agent="geo_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Malaga are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Malaga are 36.7213028, -4.4216366.


Finally, let's visualize the resulting clusters on a map of Malaga.

In [135]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(highest_score_k)
ys = [i + x + (i*x)**2 for i in range(highest_score_k)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(toronto_malaga_merged['Latitude'], toronto_malaga_merged['Longitude'], toronto_malaga_merged['Neighbourhood'], toronto_malaga_merged['Cluster_Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<a id='item8'></a>

## 8. Examine Clusters

Now, we can examine each cluster and determine which ones contain both Malaga and Toronto neighborhoods.
Then, we can see which common venues characterize those clusters.

In [139]:
# Loop through all clusters
for cluster in range(highest_score_k):
    print('CLUSTER #{}:'.format(cluster))
    display(toronto_malaga_merged.loc[toronto_malaga_merged['Cluster_Labels'] == cluster, toronto_malaga_merged.columns[[0,1,4] + list(range(5, toronto_malaga_merged.shape[1]))]])
    is_malaga_toronto = len(toronto_malaga_merged.loc[toronto_malaga_merged['Cluster_Labels'] == cluster, 'City'].unique()) == 2
    if is_malaga_toronto:
        print('THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD')
    else:
        print('THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO')
    print('\n\n')

CLUSTER #0:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
77,West Toronto,"Little Portugal, Trinity",Toronto,0,Bar,Café,Coffee Shop,Restaurant,Bakery,Asian Restaurant,Pizza Place,Italian Restaurant,Cocktail Bar,Wine Bar
6,Carretera de Cádiz,"25 Años de Paz, Alaska, Almudena, Ardira, Ave ...",Malaga,0,Restaurant,Tapas Restaurant,Café,Grocery Store,Italian Restaurant,Shopping Mall,Pub,Nightclub,Hotel,Burger Joint


THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD



CLUSTER #1:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Centro,"Barcenillas, Campos Elíseos, Cañada de los Ing...",Malaga,1,Tapas Restaurant,Plaza,Café,Spanish Restaurant,Mediterranean Restaurant,Bar,Restaurant,Coffee Shop,Pub,Cocktail Bar


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #2:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
43,East Toronto,Studio District,Toronto,2,Coffee Shop,Café,Pizza Place,Italian Restaurant,Bakery,Bar,American Restaurant,Diner,Brewery,Bank
48,Central Toronto,"Moore Park, Summerhill East",Toronto,2,Coffee Shop,Park,Italian Restaurant,Grocery Store,Thai Restaurant,Gym,Gastropub,Café,Pub,Sushi Restaurant
51,Downtown Toronto,"Cabbagetown, St. James Town",Toronto,2,Park,Restaurant,Café,Japanese Restaurant,Gastropub,Coffee Shop,Pool,Diner,Rock Club,Theater
61,Downtown Toronto,"Commerce Court, Victoria Hotel",Toronto,2,Coffee Shop,Café,Restaurant,Hotel,Japanese Restaurant,Gastropub,Bakery,Concert Hall,Beer Bar,Art Gallery
66,Downtown Toronto,"Harbord, University of Toronto",Toronto,2,Café,Restaurant,Vegetarian / Vegan Restaurant,Bar,Bakery,Mexican Restaurant,Bookstore,Coffee Shop,Yoga Studio,Burrito Place
68,Downtown Toronto,"CN Tower, Bathurst Quay, Island airport, Harbo...",Toronto,2,Coffee Shop,Italian Restaurant,Bar,Bakery,Café,Restaurant,Yoga Studio,Gym,French Restaurant,Park
4,Palma-Palmilla,"26 de febrero, 503 Viviendas, Arroyo de los Án...",Malaga,2,Clothing Store,Ice Cream Shop,Sporting Goods Shop,Bar,Coffee Shop,Tapas Restaurant,Market,Chinese Restaurant,Café,Supermarket
5,Cruz de Humilladero,"4 de Diciembre, Arroyo del Cuarto, Camino de A...",Malaga,2,Spanish Restaurant,Burger Joint,Café,Italian Restaurant,Ice Cream Shop,Grocery Store,Restaurant,Bakery,Plaza,Electronics Store


THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD



CLUSTER #3:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
50,Downtown Toronto,Rosedale,Toronto,3,Park,Neighborhood,Athletics & Sports,Playground,Trail,Beer Store,Grocery Store,Candy Store,Yoga Studio,Donut Shop


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #4:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
55,Downtown Toronto,St. James Town,Toronto,4,Coffee Shop,Café,Restaurant,Seafood Restaurant,Hotel,Bakery,Italian Restaurant,Cosmetics Shop,Breakfast Spot,Farmers Market


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #5:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
45,Central Toronto,Davisville North,Toronto,5,Coffee Shop,Italian Restaurant,Pizza Place,Café,Dessert Shop,Fast Food Restaurant,Sushi Restaurant,Japanese Restaurant,Pharmacy,Park


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #6:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Este,"Baños del Carmen, Bellavista, Castillo de Sant...",Malaga,6,Seafood Restaurant,Mediterranean Restaurant,Spanish Restaurant,Café,Tapas Restaurant,Cocktail Bar,Ice Cream Shop,Restaurant,Food,Beach Bar


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #7:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
76,West Toronto,"Dovercourt Village, Dufferin",Toronto,7,Bar,Café,Bakery,Coffee Shop,Grocery Store,Park,Ice Cream Shop,Middle Eastern Restaurant,Skating Rink,Italian Restaurant


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #8:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
47,Central Toronto,Davisville,Toronto,8,Indian Restaurant,Italian Restaurant,Sushi Restaurant,Coffee Shop,Café,Restaurant,Gym,Pizza Place,Dessert Shop,Bar


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #9:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
54,Downtown Toronto,"Ryerson, Garden District",Toronto,9,Coffee Shop,Clothing Store,Middle Eastern Restaurant,Restaurant,Gastropub,Italian Restaurant,Japanese Restaurant,Diner,Plaza,Café


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #10:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
49,Central Toronto,"Deer Park, Forest Hill SE, Rathnelly, South Hi...",Toronto,10,Coffee Shop,Sushi Restaurant,Sandwich Place,Park,Café,Thai Restaurant,Italian Restaurant,Bank,Gym,Grocery Store
58,Downtown Toronto,"Adelaide, King, Richmond",Toronto,10,Hotel,Coffee Shop,Japanese Restaurant,Café,Theater,Restaurant,Vegetarian / Vegan Restaurant,Concert Hall,Italian Restaurant,Pizza Place
67,Downtown Toronto,"Chinatown, Grange Park, Kensington Market",Toronto,10,Café,Vegetarian / Vegan Restaurant,Bar,Vietnamese Restaurant,Coffee Shop,Yoga Studio,Bakery,Mexican Restaurant,Art Gallery,Comfort Food Restaurant
70,Downtown Toronto,"First Canadian Place, Underground city",Toronto,10,Hotel,Coffee Shop,Café,Restaurant,Japanese Restaurant,Theater,Lounge,Seafood Restaurant,Pizza Place,Concert Hall
82,West Toronto,"High Park, The Junction South",Toronto,10,Bar,Café,Coffee Shop,Thai Restaurant,Italian Restaurant,Convenience Store,Pizza Place,Breakfast Spot,Mexican Restaurant,Sushi Restaurant


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #11:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
78,West Toronto,"Brockton, Exhibition Place, Parkdale Village",Toronto,11,Coffee Shop,Restaurant,Bar,Café,Gift Shop,Furniture / Home Store,Bakery,Athletics & Sports,New American Restaurant,Cocktail Bar


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #12:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
44,Central Toronto,Lawrence Park,Toronto,12,Café,Coffee Shop,Park,Bus Line,Bookstore,Restaurant,College Quad,College Gym,Trail,Gym / Fitness Center


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #13:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
57,Downtown Toronto,Central Bay Street,Toronto,13,Coffee Shop,Clothing Store,Japanese Restaurant,Italian Restaurant,Sushi Restaurant,Department Store,Bookstore,Seafood Restaurant,Café,Restaurant
60,Downtown Toronto,"Design Exchange, Toronto Dominion Centre",Toronto,13,Coffee Shop,Hotel,Restaurant,Café,Italian Restaurant,Japanese Restaurant,Monument / Landmark,Beer Bar,Pizza Place,Plaza
87,East Toronto,Business Reply Mail Processing Centre 969 Eastern,Toronto,13,Coffee Shop,Pizza Place,Café,Hotel,Restaurant,Italian Restaurant,Japanese Restaurant,Fast Food Restaurant,Seafood Restaurant,Pub
3,Bailén-Miraflores,"Camino de Suárez, Carlinda, Carlos Haya, Flori...",Malaga,13,Seafood Restaurant,Beach,Spanish Restaurant,Restaurant,Food & Drink Shop,Supermarket,Market,Outdoors & Recreation,Golf Course,Cosmetics Shop


THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD



CLUSTER #14:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
41,East Toronto,"The Danforth West, Riverdale",Toronto,14,Greek Restaurant,Coffee Shop,Café,Pub,Pizza Place,Fast Food Restaurant,Italian Restaurant,Bakery,Ice Cream Shop,Yoga Studio


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #15:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
56,Downtown Toronto,Berczy Park,Toronto,15,Coffee Shop,Hotel,Restaurant,Café,Japanese Restaurant,Beer Bar,Bakery,Park,Breakfast Spot,Gym


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #16:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
53,Downtown Toronto,Harbourfront,Toronto,16,Coffee Shop,Pub,Park,Theater,Café,Bakery,Boat or Ferry,Italian Restaurant,Lounge,Bar
59,Downtown Toronto,"Harbourfront East, Toronto Islands, Union Station",Toronto,16,Harbor / Marina,Beach,Café,Park,Boat or Ferry,Disc Golf,Pier,Farmers Market,Farm,Falafel Restaurant
7,Churriana,"Buenavista, Cañada de Ceuta, Carambuco, Churri...",Malaga,16,Supermarket,Pharmacy,Café,Seafood Restaurant,Tennis Stadium,Burger Joint,Pizza Place,Video Store,Diner,Dessert Shop
8,Campanillas,"Campanillas, Castañetas, Colmenarejo, El Brill...",Malaga,16,Spanish Restaurant,Hotel,Pizza Place,Bank,Mediterranean Restaurant,Grocery Store,Smoke Shop,Café,Farm,Falafel Restaurant
9,Puerto de la Torre,"Arroyo España, Atabal Este, El Atabal, El Chap...",Malaga,16,Shipping Store,Coffee Shop,Metro Station,Park,Filipino Restaurant,Fast Food Restaurant,Farmers Market,Farm,Falafel Restaurant,Fish & Chips Shop


THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD



CLUSTER #17:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
37,East Toronto,The Beaches,Toronto,17,Pub,Coffee Shop,Park,Beach,Breakfast Spot,Bakery,BBQ Joint,Sandwich Place,Caribbean Restaurant,Restaurant
42,East Toronto,"The Beaches West, India Bazaar",Toronto,17,Indian Restaurant,Coffee Shop,Restaurant,Pub,Beach,Italian Restaurant,Fast Food Restaurant,Park,Pizza Place,Bakery
46,Central Toronto,North Toronto West,Toronto,17,Coffee Shop,Skating Rink,Restaurant,Italian Restaurant,Park,Diner,Mexican Restaurant,Café,Yoga Studio,Sushi Restaurant


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #18:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
65,Central Toronto,"The Annex, North Midtown, Yorkville",Toronto,18,Café,Coffee Shop,Gym,Restaurant,Pub,Italian Restaurant,Pizza Place,Mexican Restaurant,Vegetarian / Vegan Restaurant,Sandwich Place
69,Downtown Toronto,Stn A PO Boxes 25 The Esplanade,Toronto,18,Coffee Shop,Pizza Place,Café,Hotel,Restaurant,Italian Restaurant,Japanese Restaurant,Fast Food Restaurant,Seafood Restaurant,Pub
83,West Toronto,"Parkdale, Roncesvalles",Toronto,18,Coffee Shop,Café,Pizza Place,Restaurant,Grocery Store,Park,Sushi Restaurant,Bakery,Eastern European Restaurant,Pub


THIS CLUSTER HAS NO CORRELATION BETWEEN MALAGA AND TORONTO



CLUSTER #19:


Unnamed: 0,Borough,Neighbourhood,City,Cluster_Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
52,Downtown Toronto,Church and Wellesley,Toronto,19,Coffee Shop,Japanese Restaurant,Restaurant,Park,Gay Bar,Gastropub,Ramen Restaurant,Pizza Place,Men's Store,Hotel
63,Central Toronto,Roselawn,Toronto,19,Pharmacy,Bank,Trail,Skating Rink,Café,Fast Food Restaurant,Farmers Market,Farm,Falafel Restaurant,Event Space
64,Central Toronto,"Forest Hill North, Forest Hill West",Toronto,19,Park,Café,Sushi Restaurant,Italian Restaurant,Coffee Shop,Bank,Jewelry Store,Sandwich Place,Garden,Bookstore
75,Downtown Toronto,Christie,Toronto,19,Korean Restaurant,Café,Grocery Store,Coffee Shop,Mexican Restaurant,Park,Cocktail Bar,Japanese Restaurant,Ethiopian Restaurant,Diner
84,West Toronto,"Runnymede, Swansea",Toronto,19,Café,Coffee Shop,Pizza Place,Bakery,Pub,Falafel Restaurant,Sushi Restaurant,Gastropub,Italian Restaurant,Bar
85,Downtown Toronto,Queen's Park,Toronto,19,Coffee Shop,Gastropub,Japanese Restaurant,Bubble Tea Shop,Dessert Shop,Park,Italian Restaurant,Sushi Restaurant,Supermarket,Seafood Restaurant
2,Ciudad Jardín,"Alegría de la Huerta, Ciudad Jardín, Cortijo B...",Malaga,19,Grocery Store,Ice Cream Shop,Park,Snack Place,Market,Bar,Mexican Restaurant,BBQ Joint,Café,Gym / Fitness Center
10,Teatinos-Universidad,"Cañada de los Cardos, Ciudad Santa Inés, Colon...",Malaga,19,Pizza Place,Tapas Restaurant,Fast Food Restaurant,Café,Pub,Coffee Shop,Grocery Store,Playground,Auto Workshop,Chinese Restaurant


THIS CLUSTER HAS MALAGA AND TORONTO NEIGHBOURHOOD
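
The per-cluster check above can also be condensed into a single table. The following is only a minimal sketch: it assumes a dataframe with the same `City` and `Cluster_Labels` columns as `toronto_malaga_merged`, and it uses a small made-up dataframe (`toy`, a hypothetical stand-in) so that it runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for toronto_malaga_merged; only the two columns used here.
toy = pd.DataFrame({
    "City": ["Toronto", "Malaga", "Malaga", "Toronto", "Toronto", "Malaga"],
    "Cluster_Labels": [0, 0, 1, 2, 2, 2],
})

# Rows = cluster label, columns = city, values = number of neighbourhoods.
counts = pd.crosstab(toy["Cluster_Labels"], toy["City"])

# A cluster is "mixed" when every city has at least one neighbourhood in it.
mixed = counts[(counts > 0).all(axis=1)]
mixed_labels = mixed.index.tolist()

print(counts)
print("Clusters with both cities:", mixed_labels)
```

Applied to the real dataframe, the rows of `mixed` would give the number of Toronto and Malaga neighbourhoods in each shared cluster, which are the figures needed for the conclusion.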





<a id='item9'></a>

## 9. Conclusion

- **Clusters 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 15, 17, 18:** These clusters contain neighborhoods from only one city, so they show no correlation between Malaga and Toronto.

- **Cluster 0:** There is 1 Toronto neighborhood and 1 Malaga neighborhood where the main spots are cheap restaurants, bars and pubs. For this reason, we can name this cluster **Teenagers and Nightlife**.

- **Cluster 2:** There are 6 Toronto neighborhoods and 2 Malaga neighborhoods where the main spots are restaurants (Italian, fast food, Japanese ...) and cafés. For this reason, we can name this cluster **Restaurants and Cafés**.

- **Cluster 13:** There are 3 Toronto neighborhoods and 1 Malaga neighborhood where there is a broad mix of venues (shops, cafés, department stores, hotels, pubs ...). For this reason, we can name this cluster **Regular**.

- **Cluster 16:** There are 2 Toronto neighborhoods and 3 Malaga neighborhoods where the main spots are places close or related to the sea (harbors, ferries, seafood restaurants ...) and farmers markets. For this reason, we can name this cluster **Sea and Farms**.

- **Cluster 19:** There are 6 Toronto neighborhoods and 2 Malaga neighborhoods where the main spots are Asian restaurants (sushi, Japanese, Korean ...), cafés (or coffee shops) and parks. For this reason, we can name this cluster **Asian Restaurants, Cafés and Parks**.
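
The cluster names above were chosen by reading the venue tables by eye. A rough way to support that reading is to count how often each venue category appears among the "Nth Most Common Venue" columns of a cluster. The sketch below is illustrative only: `top_categories` is a hypothetical helper, the `toy` dataframe is made up, and the count ignores both the rank of a category and the real venue frequencies, so it is a reading aid rather than a measure.

```python
import pandas as pd

# Hypothetical stand-in with the same column naming as the tables above.
toy = pd.DataFrame({
    "Cluster_Labels": [16, 16, 16, 3],
    "1st Most Common Venue": ["Park", "Harbor / Marina", "Beach", "Park"],
    "2nd Most Common Venue": ["Pub", "Beach", "Park", "Playground"],
    "3rd Most Common Venue": ["Theater", "Park", "Farm", "Trail"],
})

def top_categories(df, cluster, n=3):
    """Count how often each category appears in the ranked venue columns of one cluster."""
    venue_cols = [c for c in df.columns if c.endswith("Most Common Venue")]
    rows = df.loc[df["Cluster_Labels"] == cluster, venue_cols]
    # Flatten all ranked columns into one column, then count the categories.
    flat = rows.melt(value_name="Category")["Category"]
    return flat.value_counts().head(n)

top = top_categories(toy, cluster=16, n=2)
print(top)
```

With the real data, calling the helper on `toronto_malaga_merged` for clusters 0, 2, 13, 16 and 19 would list the categories that recur most often across their neighborhoods.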