# The Battle of Neighborhoods - MADRID

### Introduction

I have selected the city of Madrid for my capstone project. The objective of this analysis is to segment and cluster the city's areas in order to select the most appropriate one in which to open a restaurant.

Madrid is my city, so before starting I would like to introduce it briefly.

Madrid, the capital and economic center of Spain, is a very diverse city with a rich culture. It is full of charming corners, great restaurants, and places of interest such as the Prado Museum and the Santiago Bernabéu stadium.

Therefore, the objective of the project can be summarized in the following question: <b>What is the most suitable neighborhood in which to open a new restaurant?</b>

To answer this question, we begin by identifying the data sources.
The page below, and more specifically the Excel file it links to, provides the neighborhoods and districts of the city of Madrid.

<b>Page:</b>

http://www.madrid.org/iestadis/fijas/clasificaciones/barrios.htm

<b>Excel:</b>

http://www.madrid.org/iestadis/fijas/clasificaciones/descarga/cobar18.xls


In the first part, we analyze and clean this information. The section ends by looking up the latitude and longitude of each district (borough) so that these coordinates can be used to query Foursquare.

<b>Let's start</b>

### Import and Download Libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

!conda install -c conda-forge folium=0.5.0 --yes # comment this line out if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - folium=0.5.0


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    altair-4.1.0               |             py_1         614 KB  conda-forge
    branca-0.4.0               |             py_0          26 KB  conda-forge
    openssl-1.1.1f             |       h516909a_0         2.1 MB  conda-forge
    ca-certificates-2020.4.5.1 |       hecc5488_0         146 KB  conda-forge
    folium-0.5.0               |             py_0          45 KB  conda-forge
    vincent-0.4.4              |             py_1          28 KB  conda-forge
    certifi-2020.4.5.1         |   py36h9f0ad1d_0         151 KB  conda-forge
    python_abi-3.6             |          1_cp36m           4 KB  conda-forge
    ------------------------------------------------------------
                       

### Load Data as Dataframe

In [2]:
df = pd.read_excel('http://www.madrid.org/iestadis/fijas/clasificaciones/descarga/cobar18.xls') 
df.head()

Unnamed: 0,munic,distr,ldistr,barrio,descrip,secci
0,796,1,Centro,1,Palacio,1
1,796,1,Centro,1,Palacio,2
2,796,1,Centro,1,Palacio,3
3,796,1,Centro,1,Palacio,4
4,796,1,Centro,1,Palacio,6


### Data Cleaning and Preprocessing

In [3]:
# Give the columns the names used in the rest of the notebook
df.rename(columns={'ldistr': 'Borough', 'descrip': 'Neighborhood'}, inplace=True)

# Build a district key by concatenating the municipality and district codes as strings
df['PostalCode'] = df['munic'].astype(str) + df['distr'].astype(str)

# Drop the columns that are no longer needed
df = df.drop(['barrio', 'secci', 'distr', 'munic'], axis=1)
df.head()

Unnamed: 0,Borough,Neighborhood,PostalCode
0,Centro,Palacio,7961
1,Centro,Palacio,7961
2,Centro,Palacio,7961
3,Centro,Palacio,7961
4,Centro,Palacio,7961


In [4]:
df.shape

(2443, 3)

In [5]:
# Keep one row per (Borough, Neighborhood, PostalCode) combination
df2 = df.drop_duplicates().reset_index(drop=True)
df2.head()

Unnamed: 0,Borough,Neighborhood,PostalCode
0,Centro,Palacio,7961
1,Centro,Embajadores,7961
2,Centro,Cortes,7961
3,Centro,Justicia,7961
4,Centro,Universidad,7961


In [6]:
df2.shape

(131, 3)

In [7]:
df3 = df2.groupby(['PostalCode','Borough'], sort=True).agg(', '.join)
df3.reset_index(inplace=True)
df3.shape

(21, 3)

In [8]:
df3.head()

Unnamed: 0,PostalCode,Borough,Neighborhood
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer..."
1,79610,Latina,"Los Cármenes, Puerta del Angel, Lucero, Aluche..."
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu..."
3,79612,Usera,"Orcasitas, Orcasur, San Fermín, Almendrales, M..."
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer..."
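
The aggregation in cell 7 can be hard to picture, so here is a minimal sketch on made-up rows (toy values, not the Madrid file) of how `groupby(...).agg(', '.join)` collapses one row per neighborhood into one row per borough. The frame name `toy` and its contents are assumptions for illustration only.

```python
import pandas as pd

# Toy rows standing in for df2 (values are illustrative only).
toy = pd.DataFrame({
    "PostalCode": ["7961", "7961", "79610"],
    "Borough": ["Centro", "Centro", "Latina"],
    "Neighborhood": ["Palacio", "Sol", "Aluche"],
})

# Within each (PostalCode, Borough) group, the neighborhood names
# are concatenated into a single comma-separated string.
per_borough = (
    toy.groupby(["PostalCode", "Borough"], sort=True)["Neighborhood"]
       .agg(", ".join)
       .reset_index()
)
print(per_borough)
```

Because the key is a string, the groups are sorted lexicographically, which is why `7961` is followed by `79610` rather than `7962` in the output above.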


### Find Latitude and Longitude

In [22]:
from geopy.geocoders import Nominatim

def get_geocoder(borough_from_df):
    # Build the address string for the borough
    address = 'Madrid ' + borough_from_df + ', Spain'
    print('Search address', address)
    # Obtain the latitude and longitude
    geolocator = Nominatim(user_agent="ny_explorer")
    location = geolocator.geocode(address)
    latitude = location.latitude
    longitude = location.longitude
    return latitude, longitude
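
Nominatim's `geocode` returns `None` when it finds no match, in which case `location.latitude` above would raise an `AttributeError`. The following is a minimal sketch, not the notebook's method, of a lookup loop that tolerates misses. The names `geocode_boroughs`, `lookup`, and `fake_lookup` are hypothetical, and the fake geocoder exists only so the sketch runs offline.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

Coordinates = Tuple[float, float]


def geocode_boroughs(
    boroughs: Iterable[str],
    lookup: Callable[[str], Optional[Coordinates]],
) -> Dict[str, Optional[Coordinates]]:
    """Return {borough: (lat, lon)}; boroughs the lookup cannot resolve map to None."""
    results = {}
    for borough in boroughs:
        address = "Madrid {}, Spain".format(borough)
        results[borough] = lookup(address)
    return results


def fake_lookup(address: str) -> Optional[Coordinates]:
    """Stand-in for a real geocoder so the sketch runs without network access."""
    known = {"Madrid Centro, Spain": (40.425356, -3.69819)}
    return known.get(address)


coords = geocode_boroughs(["Centro", "Atlantis"], fake_lookup)
print(coords)
```

In the notebook, `lookup` could be a small wrapper that calls `geolocator.geocode(address)` and returns `(location.latitude, location.longitude)` or `None`. Since the public Nominatim service asks for at most one request per second, wrapping the call with geopy's `RateLimiter` (from `geopy.extra.rate_limiter`) may also be worth considering.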

In [23]:
df3['Latitude'] = 0.0
df3['Longitude'] = 0.0

In [24]:
for i in range(len(df3)):
    # write through .loc; chained indexing such as df3['Latitude'][i] raises SettingWithCopyWarning
    df3.loc[i, ['Latitude', 'Longitude']] = get_geocoder(df3.loc[i, 'Borough'])

Search address Madrid Centro, Spain
Search address Madrid Latina, Spain
Search address Madrid Carabanchel, Spain
Search address Madrid Usera, Spain
Search address Madrid Puente de Vallecas, Spain
Search address Madrid Moratalaz, Spain
Search address Madrid Ciudad Lineal, Spain
Search address Madrid Hortaleza, Spain
Search address Madrid Villaverde, Spain
Search address Madrid Villa de Vallecas, Spain
Search address Madrid Vicalvaro, Spain
Search address Madrid Arganzuela, Spain
Search address Madrid San Blas, Spain
Search address Madrid Barajas, Spain
Search address Madrid Retiro, Spain
Search address Madrid Salamanca, Spain
Search address Madrid Chamartin, Spain
Search address Madrid Tetuan, Spain
Search address Madrid Chamberi, Spain
Search address Madrid Fuencarral, Spain
Search address Madrid Moncloa-Aravaca, Spain


In [16]:
df3

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819
1,79610,Latina,"Los Cármenes, Puerta del Angel, Lucero, Aluche...",40.411603,-3.749912
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",40.375855,-3.74091
3,79612,Usera,"Orcasitas, Orcasur, San Fermín, Almendrales, M...",40.37754,-3.715229
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer...",40.381633,-3.668024
5,79614,Moratalaz,"Pavones, Horcajo, Marroquina, Media Legua, Fon...",40.400081,-3.631538
6,79615,Ciudad Lineal,"Ventas, Pueblo Nuevo, Quintana, Concepción, Sa...",40.43398,-3.657251
7,79616,Hortaleza,"Palomas, Piovera, Canillas, Pinar del Rey, Apo...",40.458139,-3.641003
8,79617,Villaverde,"Villaverde alto, Casco Histórico de Villaverde...",40.358858,-3.708645
9,79618,Villa de Vallecas,"Casco Histórico de Vallecas, Santa Eugenia, En...",40.373537,-3.614098


### Plot Result

In [17]:
from geopy.geocoders import Nominatim
address = 'Madrid, Spain'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Madrid are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Madrid are 40.4167047, -3.7035825.


In [18]:
map_madrid = folium.Map(location=[latitude, longitude], zoom_start=10)

# Add one circle marker per borough, labelled with its neighborhoods
for lat, lng, borough, neighbourhood in zip(df3['Latitude'], df3['Longitude'], df3['Borough'], df3['Neighborhood']):
    label = '{}, {}'.format(neighbourhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_madrid)
map_madrid

### Foursquare

In [45]:
# The code was removed by Watson Studio for sharing.
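
The hidden cell presumably defined the four names that `getNearbyVenues` relies on: `CLIENT_ID`, `CLIENT_SECRET`, `VERSION`, and `LIMIT`. Its actual contents are not available, so the following is only a sketch of one way to supply them without committing secrets to the notebook. The environment variable names and the two literal values are assumptions.

```python
import os

# Hypothetical environment variable names; the hidden cell may have used literals instead.
CLIENT_ID = os.environ.get("FOURSQUARE_CLIENT_ID", "")
CLIENT_SECRET = os.environ.get("FOURSQUARE_CLIENT_SECRET", "")

VERSION = "20180605"  # assumed API version date in YYYYMMDD form
LIMIT = 100           # assumed cap on venues returned per request

if not CLIENT_ID or not CLIENT_SECRET:
    print("Foursquare credentials are not set; API calls will fail.")
```

Reading the credentials from the environment keeps them out of the shared notebook, which is the same goal Watson Studio achieves by stripping the cell.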

In [46]:
import requests # library to handle requests
from pandas.io.json import json_normalize # transform a JSON file into a pandas dataframe

def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
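
The function above indexes straight into `["response"]['groups'][0]['items']` and `['categories'][0]`, so an error response or a venue without a category would raise an exception. As a hedged sketch, the parsing step could be separated into a helper that tolerates those cases. The name `extract_venues` and the `"Unknown"` fallback are hypothetical, and the sample payload is hand-written to mimic the shape the notebook reads; it is not real API output.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore response into rows; tolerate missing groups or categories."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        # Some venues may carry an empty category list; fall back to a placeholder.
        categories = venue.get("categories") or [{}]
        rows.append((
            name, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            categories[0].get("name", "Unknown"),
        ))
    return rows


# Hand-written payload for illustration only.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Cafe A", "location": {"lat": 40.0, "lng": -3.0},
               "categories": [{"name": "Café"}]}},
    {"venue": {"name": "Unlabelled", "location": {"lat": 40.1, "lng": -3.1},
               "categories": []}},
]}]}}

print(extract_venues(sample, "Centro", 40.42, -3.69))
print(extract_venues({"meta": {"code": 429}}, "Centro", 40.42, -3.69))
```

Keeping the parsing apart from the HTTP call also makes it possible to test the row-building logic without touching the network.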

In [47]:
df3_venues = getNearbyVenues(names=df3['Neighborhood'],
                                   latitudes=df3['Latitude'],
                                   longitudes=df3['Longitude']
                                  )

Palacio, Embajadores, Cortes, Justicia, Universidad, Sol
Los Cármenes, Puerta del Angel, Lucero, Aluche, Campamento, Cuatro Vientos, Las Águilas
Comillas, Opañel, San Isidro, Vista Alegre, Puerta Bonita, Buenavista, Abrantes
Orcasitas, Orcasur, San Fermín, Almendrales, Moscardó, Zofío, Pradolongo
Entrevías, San Diego, Palomeras Bajas, Palomeras Sureste, Portazgo, Numancia
Pavones, Horcajo, Marroquina, Media Legua, Fontarrón, Vinateros
Ventas, Pueblo Nuevo, Quintana, Concepción, San Pascual, San Juan Bautista, Colina, Atalaya, Costillares
Palomas, Piovera, Canillas, Pinar del Rey, Apostol Santiago, Valdefuentes
Villaverde alto, Casco Histórico de Villaverde, San Cristobal, Butarque, Los Rosales, Los Angeles
Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas
Casco Histórico de Vicálvaro, Valdebernardo, Valderrivas, El Cañaveral
Imperial, Acacias, Chopera, Legazpi, Delicias, Palos de Moguer, Atocha
Simancas, Hellín, Amposta, Arcos, Rosas, Rejas, Canillejas, Salvador
Alameda 

In [48]:
print(df3_venues.shape)
df3_venues.head()

(711, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,Honest Greens,40.42488,-3.697894,Restaurant
1,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,Urso hotel & spa,40.426825,-3.698169,Hotel
2,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,DSTAgE,40.424729,-3.696305,Restaurant
3,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,Macera Tallerbar,40.426102,-3.698176,Cocktail Bar
4,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,La Duquesita,40.42551,-3.696688,Dessert Shop


In [49]:
df3_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
"Alameda de Osuna, Aeropuerto, Casco Histórico de Barajas, Timón, Corralejos",1,1,1,1,1,1
"Bellas Vistas, Cuatro Caminos, Castillejos, Almenara, Valdeacederas, Berruguete",29,29,29,29,29,29
"Casa de Campo, Argüelles, Ciudad Universitaria, Valdezarza, Valdemarín, El Plantío, Aravaca",4,4,4,4,4,4
"Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas",7,7,7,7,7,7
"Casco Histórico de Vicálvaro, Valdebernardo, Valderrivas, El Cañaveral",15,15,15,15,15,15
"Comillas, Opañel, San Isidro, Vista Alegre, Puerta Bonita, Buenavista, Abrantes",8,8,8,8,8,8
"El Pardo, Fuentelareina, Peñagrande, Pilar, La Paz, Valverde, Mirasierra, El Goloso",11,11,11,11,11,11
"El Viso, Prosperidad, Ciudad Jardín, Hispanoamérica, Nueva España, Castilla",40,40,40,40,40,40
"Entrevías, San Diego, Palomeras Bajas, Palomeras Sureste, Portazgo, Numancia",14,14,14,14,14,14
"Gaztambide, Arapiles, Trafalgar, Almagro, Rios Rosas, Vallehermoso",74,74,74,74,74,74


In [50]:
print('There are {} unique categories.'.format(len(df3_venues['Venue Category'].unique())))

There are 151 unique categories.


In [51]:
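# one hot encoding
df3_onehot = pd.get_dummies(df3_venues[['Venue Category']], prefix="", prefix_sep="")

# add the neighborhood label column
# note: Foursquare has a venue category that is itself called "Neighborhood", so this
# assignment overwrites that one-hot column in place instead of appending a new last column
df3_onehot['Neighborhood'] = df3_venues['Neighborhood'] 

# move the last column to the front; because of the name clash above, the column that
# moves is the last category ("Yoga Studio"), not the neighborhood label
fixed_columns = [df3_onehot.columns[-1]] + list(df3_onehot.columns[:-1])
df3_onehot = df3_onehot[fixed_columns]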

df3_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,American Restaurant,Apres Ski Bar,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Cafeteria,Café,Casino,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Event Space,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Fountain,Frozen Yogurt Shop,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motorcycle Shop,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Noodle House,Optical Shop,Other Nightlife,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Piano Bar,Pie Shop,Pizza Place,Platform,Plaza,Pool,Pool Hall,Pub,Ramen Restaurant,Rental Car Location,Restaurant,Road,Roof Deck,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Trail,Train Station,Travel Lounge,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
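
The header above contains a venue category that is itself named `Neighborhood`, which clashes with the label column the notebook adds. A possible way around the clash, shown here on toy data rather than the Foursquare results, is to rename the category before inserting the label at position 0. The suffix `(category)` is an arbitrary choice made for this sketch.

```python
import pandas as pd

# Toy venues; one of them has the category "Neighborhood", as in the output above.
venues = pd.DataFrame({
    "Neighborhood": ["Centro", "Centro", "Latina"],
    "Venue Category": ["Restaurant", "Neighborhood", "Park"],
})

onehot = pd.get_dummies(venues["Venue Category"]).astype(int)

# Rename the clashing category instead of overwriting it, then put the label first.
onehot = onehot.rename(columns={"Neighborhood": "Neighborhood (category)"})
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
print(onehot.columns.tolist())
```

`DataFrame.insert` places the column at an explicit position, so no reordering by `columns[-1]` is needed afterwards.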


In [52]:
df3_grouped = df3_onehot.groupby('Neighborhood').mean().reset_index()
df3_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,American Restaurant,Apres Ski Bar,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Buffet,Building,Burger Joint,Burrito Place,Bus Station,Business Service,Butcher,Cafeteria,Café,Casino,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Event Space,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Fountain,Frozen Yogurt Shop,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motorcycle Shop,Multiplex,Museum,Music Venue,Nightclub,Noodle House,Optical Shop,Other Nightlife,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Peruvian Restaurant,Pet Store,Pharmacy,Piano Bar,Pie Shop,Pizza Place,Platform,Plaza,Pool,Pool Hall,Pub,Ramen Restaurant,Rental Car Location,Restaurant,Road,Roof Deck,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Toy / Game Store,Trail,Train Station,Travel Lounge,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop
0,"Alameda de Osuna, Aeropuerto, Casco Histórico ...",0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,"Bellas Vistas, Cuatro Caminos, Castillejos, Al...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.034483,0.034483,0.0,0.0,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.172414,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0
2,"Casa de Campo, Argüelles, Ciudad Universitaria...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Casco Histórico de Vallecas, Santa Eugenia, En...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.142857,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"Casco Histórico de Vicálvaro, Valdebernardo, V...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,"El Pardo, Fuentelareina, Peñagrande, Pilar, La...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0
7,"El Viso, Prosperidad, Ciudad Jardín, Hispanoam...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.0,0.0,0.0,0.075,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.025,0.0,0.025,0.0
8,"Entrevías, San Diego, Palomeras Bajas, Palomer...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.071429,0.0,0.0,0.0,0.0
9,"Gaztambide, Arapiles, Trafalgar, Almagro, Rios...",0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.013514,0.040541,0.067568,0.0,0.027027,0.0,0.013514,0.013514,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.067568,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.027027,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.013514,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.013514,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.081081,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.067568,0.0,0.013514,0.013514,0.081081,0.0,0.0,0.0,0.067568,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.013514,0.0
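
Each value in the table above is the mean of a 0/1 column, that is, the share of a borough's venues falling in that category. A borough with a single venue therefore gets a frequency of 1.0 for that one category, as in row 0. The toy sketch below (made-up values) keeps the venue count next to the frequencies so that such thin samples are easy to spot. The column name `n_venues` is an assumption of this sketch.

```python
import pandas as pd

# Toy one-hot table: neighborhood "A" has four venues, "B" has only one.
onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B"],
    "Bar":          [1, 0, 0, 1, 0],
    "Park":         [0, 1, 1, 0, 1],
})

# Mean of a 0/1 column = fraction of venues in that category.
freq = onehot.groupby("Neighborhood").mean()

# Number of venues behind each row of frequencies.
counts = onehot.groupby("Neighborhood").size().rename("n_venues")

summary = freq.join(counts)
print(summary)
```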


In [53]:
num_top_venues = 5

for hood in df3_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = df3_grouped[df3_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alameda de Osuna, Aeropuerto, Casco Histórico de Barajas, Timón, Corralejos----
           venue  freq
0  Apres Ski Bar   1.0
1    Yoga Studio   0.0
2           Park   0.0
3    Music Venue   0.0
4      Nightclub   0.0


----Bellas Vistas, Cuatro Caminos, Castillejos, Almenara, Valdeacederas, Berruguete----
                venue  freq
0  Spanish Restaurant  0.17
1      Sandwich Place  0.10
2               Hotel  0.10
3          Restaurant  0.07
4  Italian Restaurant  0.07


----Casa de Campo, Argüelles, Ciudad Universitaria, Valdezarza, Valdemarín, El Plantío, Aravaca----
           venue  freq
0    Bus Station  0.25
1  Metro Station  0.25
2     Restaurant  0.25
3         Museum  0.25
4    Yoga Studio  0.00


----Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas----
              venue  freq
0      Soccer Field  0.14
1         Pet Store  0.14
2       Supermarket  0.14
3  Asian Restaurant  0.14
4  Tapas Restaurant  0.14


----Casco Histórico de Vicálvaro, Valdebernardo

In [54]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [55]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = df3_grouped['Neighborhood']

for ind in np.arange(df3_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df3_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Alameda de Osuna, Aeropuerto, Casco Histórico ...",Apres Ski Bar,Wine Shop,Department Store,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Electronics Store,Diner,Dessert Shop
1,"Bellas Vistas, Cuatro Caminos, Castillejos, Al...",Spanish Restaurant,Hotel,Sandwich Place,Restaurant,Breakfast Spot,Italian Restaurant,Business Service,Fast Food Restaurant,Building,Burger Joint
2,"Casa de Campo, Argüelles, Ciudad Universitaria...",Restaurant,Museum,Metro Station,Bus Station,Flower Shop,Falafel Restaurant,Event Space,Electronics Store,Diner,Dessert Shop
3,"Casco Histórico de Vallecas, Santa Eugenia, En...",Tapas Restaurant,Pet Store,Soccer Field,Grocery Store,Restaurant,Asian Restaurant,Supermarket,Wine Shop,Electronics Store,Diner
4,"Casco Histórico de Vicálvaro, Valdebernardo, V...",Pizza Place,Spanish Restaurant,Fast Food Restaurant,Supermarket,Park,Sandwich Place,Ice Cream Shop,Falafel Restaurant,Grocery Store,Café
5,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",Hotel,Tapas Restaurant,Café,Dance Studio,Supermarket,Restaurant,Food & Drink Shop,Bakery,Fast Food Restaurant,Falafel Restaurant
6,"El Pardo, Fuentelareina, Peñagrande, Pilar, La...",Café,Restaurant,Snack Place,Park,Train Station,Paper / Office Supplies Store,Pool,Diner,Event Space,Electronics Store
7,"El Viso, Prosperidad, Ciudad Jardín, Hispanoam...",Spanish Restaurant,Café,Platform,Hotel,Gym / Fitness Center,Restaurant,Train Station,Bar,Sandwich Place,Bowling Alley
8,"Entrevías, San Diego, Palomeras Bajas, Palomer...",Clothing Store,Shopping Mall,Spanish Restaurant,Bakery,Sandwich Place,Restaurant,Pizza Place,Park,Toy / Game Store,Gym
9,"Gaztambide, Arapiles, Trafalgar, Almagro, Rios...",Tapas Restaurant,Restaurant,Spanish Restaurant,Café,Theater,Bar,Bakery,Coffee Shop,Mexican Restaurant,Beer Bar
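
The column names in cell 55 rely on a `try`/`except` around a three-element suffix list, which works for the first ten positions but would label an eleventh column correctly only by accident of the fallback. As an alternative sketch, a small helper can compute the English ordinal suffix directly. The function name `ordinal` is hypothetical.

```python
def ordinal(n: int) -> str:
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return "{}{}".format(n, suffix)


num_top_venues = 10
columns = ["Neighborhood"] + [
    "{} Most Common Venue".format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns[1], columns[4], columns[10])
```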


### Modeling

In [56]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 5

df3_grouped_clustering = df3_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(df3_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([2, 0, 4, 0, 0, 0, 0, 0, 0, 0], dtype=int32)
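
The notebook fixes `kclusters = 5` without saying how the value was chosen, and the labels above show most boroughs landing in cluster 0. One common heuristic for picking the number of clusters is the elbow method: fit k-means for several values of k and look for the point where the inertia stops dropping sharply. The sketch below uses synthetic points, not the Madrid data, and the `n_init=10` setting is an assumption of this example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for df3_grouped_clustering: three well-separated blobs.
rng = np.random.RandomState(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.1 * rng.randn(7, 2) for c in centers])

inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, random_state=0, n_init=10).fit(X)
    inertias[k] = model.inertia_  # within-cluster sum of squared distances

for k, value in inertias.items():
    print(k, round(value, 3))
```

On data like this the inertia falls steeply up to k = 3 and then flattens. Whether the Madrid frequencies show a comparable elbow would have to be checked on `df3_grouped_clustering` itself.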

In [57]:
# add clustering labels
neighborhoods_venues_sorted = neighborhoods_venues_sorted.dropna(how='any')
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

df4 = df3.dropna()

df3_merged = df4

# join the cluster labels and top venues onto df3, which holds the latitude/longitude for each borough
df3_merged = df3_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

df3_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,0.0,Restaurant,Spanish Restaurant,Cocktail Bar,Hotel,Bakery,Deli / Bodega,Cosmetics Shop,Italian Restaurant,Café,Bookstore
1,79610,Latina,"Los Cármenes, Puerta del Angel, Lucero, Aluche...",40.411603,-3.749912,1.0,Theme Park Ride / Attraction,Restaurant,Metro Station,Tapas Restaurant,Snack Place,Buffet,Theme Park,Burger Joint,Falafel Restaurant,Event Space
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",40.375855,-3.74091,0.0,Hotel,Tapas Restaurant,Café,Dance Studio,Supermarket,Restaurant,Food & Drink Shop,Bakery,Fast Food Restaurant,Falafel Restaurant
3,79612,Usera,"Orcasitas, Orcasur, San Fermín, Almendrales, M...",40.37754,-3.715229,3.0,Athletics & Sports,Spanish Restaurant,Park,Beer Garden,Dessert Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Electronics Store
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer...",40.381633,-3.668024,0.0,Clothing Store,Shopping Mall,Spanish Restaurant,Bakery,Sandwich Place,Restaurant,Pizza Place,Park,Toy / Game Store,Gym


In [58]:
df3_merged = df3_merged.dropna(how='any')
df3_merged['Cluster Labels'].unique()
df3_merged['Cluster Labels'] = df3_merged['Cluster Labels'].astype(np.int64)
df3_merged['Cluster Labels'].unique()

array([0, 1, 3, 2, 4])

In [59]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df3_merged['Latitude'], df3_merged['Longitude'], df3_merged['Neighborhood'], df3_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
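
A note on the marker colors: the cluster labels are 0-based, so `rainbow[cluster-1]` depends on Python's negative indexing for label 0. The sketch below, with a stand-in palette of hypothetical color names, suggests that each cluster still gets its own color, only shifted by one position relative to direct indexing.

```python
kclusters = 5

# stand-in for the notebook's `rainbow` list of hex colors (hypothetical names)
palette = ['color_{}'.format(i) for i in range(kclusters)]

# k-means labels run from 0 to kclusters - 1
labels = list(range(kclusters))

# indexing used in the cell above: label 0 becomes index -1, i.e. the last color
shifted = [palette[label - 1] for label in labels]

# direct indexing: label i gets color i
direct = [palette[label] for label in labels]

print(shifted)
print(direct)
```

Either form keeps the clusters visually distinct; direct indexing is simply easier to match against a legend.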

### Cluster 1

In [61]:
df3_merged.loc[df3_merged['Cluster Labels'] == 0, df3_merged.columns[[1] + list(range(5, df3_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Centro,0,Restaurant,Spanish Restaurant,Cocktail Bar,Hotel,Bakery,Deli / Bodega,Cosmetics Shop,Italian Restaurant,Café,Bookstore
2,Carabanchel,0,Hotel,Tapas Restaurant,Café,Dance Studio,Supermarket,Restaurant,Food & Drink Shop,Bakery,Fast Food Restaurant,Falafel Restaurant
4,Puente de Vallecas,0,Clothing Store,Shopping Mall,Spanish Restaurant,Bakery,Sandwich Place,Restaurant,Pizza Place,Park,Toy / Game Store,Gym
6,Ciudad Lineal,0,Spanish Restaurant,Grocery Store,Restaurant,Chinese Restaurant,Mediterranean Restaurant,Hotel,Tapas Restaurant,Gourmet Shop,Pool Hall,Karaoke Bar
7,Hortaleza,0,Soccer Field,Metro Station,Italian Restaurant,Plaza,Theater,Wine Shop,Department Store,Falafel Restaurant,Event Space,Electronics Store
9,Villa de Vallecas,0,Tapas Restaurant,Pet Store,Soccer Field,Grocery Store,Restaurant,Asian Restaurant,Supermarket,Wine Shop,Electronics Store,Diner
10,Vicalvaro,0,Pizza Place,Spanish Restaurant,Fast Food Restaurant,Supermarket,Park,Sandwich Place,Ice Cream Shop,Falafel Restaurant,Grocery Store,Café
11,Arganzuela,0,Spanish Restaurant,Hotel,Restaurant,Grocery Store,Museum,Train Station,Sandwich Place,Italian Restaurant,Garden,Latin American Restaurant
12,San Blas,0,Hotel,Plaza,Spanish Restaurant,Tapas Restaurant,Hostel,Wine Bar,Restaurant,Gourmet Shop,Clothing Store,Bookstore
14,Retiro,0,Hotel,Plaza,Spanish Restaurant,Tapas Restaurant,Hostel,Wine Bar,Restaurant,Gourmet Shop,Clothing Store,Bookstore


### Cluster 2

In [62]:
df3_merged.loc[df3_merged['Cluster Labels'] == 1, df3_merged.columns[[1] + list(range(5, df3_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Latina,1,Theme Park Ride / Attraction,Restaurant,Metro Station,Tapas Restaurant,Snack Place,Buffet,Theme Park,Burger Joint,Falafel Restaurant,Event Space


### Cluster 3

In [63]:
df3_merged.loc[df3_merged['Cluster Labels'] == 2, df3_merged.columns[[1] + list(range(5, df3_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Barajas,2,Apres Ski Bar,Wine Shop,Department Store,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Electronics Store,Diner,Dessert Shop


### Cluster 4

In [64]:
df3_merged.loc[df3_merged['Cluster Labels'] == 3, df3_merged.columns[[1] + list(range(5, df3_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Usera,3,Athletics & Sports,Spanish Restaurant,Park,Beer Garden,Dessert Shop,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Event Space,Electronics Store
5,Moratalaz,3,Pharmacy,Grocery Store,Spanish Restaurant,Athletics & Sports,Coffee Shop,Dessert Shop,Fast Food Restaurant,Falafel Restaurant,Clothing Store,Event Space


### Cluster 5

In [65]:
df3_merged.loc[df3_merged['Cluster Labels'] == 4, df3_merged.columns[[1] + list(range(5, df3_merged.shape[1]))]]

Unnamed: 0,Borough,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
20,Moncloa-Aravaca,4,Restaurant,Museum,Metro Station,Bus Station,Flower Shop,Falafel Restaurant,Event Space,Electronics Store,Diner,Dessert Shop


## Review

As we can see, cluster 1 is the largest cluster, and restaurants of several kinds appear among the most common venues in its districts. This makes it the most appropriate area for opening a restaurant.

On closer inspection, however, the gastronomic variety in Madrid is very large. At this point in the analysis it is therefore necessary to re-segment cluster 1, that is, to run a new clustering within it, to learn what type of restaurant predominates in each area.

The aim is to find out which areas are dominated by fast food, traditional Spanish, or tapas restaurants.

Let's find out.
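
Before re-clustering, the claim about cluster 1 can be sanity-checked by counting how many of a district's ten most common venues are restaurants. The sketch below uses three rows copied from the cluster tables above; the `restaurant_count` helper is hypothetical and only matches category names containing the word "Restaurant", so places such as cafés or pizza places are not counted.

```python
# top-10 venue lists copied from the cluster tables (Centro and Arganzuela: cluster 1, Barajas: cluster 3)
top_venues_by_district = {
    'Centro': ['Restaurant', 'Spanish Restaurant', 'Cocktail Bar', 'Hotel', 'Bakery',
               'Deli / Bodega', 'Cosmetics Shop', 'Italian Restaurant', 'Café', 'Bookstore'],
    'Arganzuela': ['Spanish Restaurant', 'Hotel', 'Restaurant', 'Grocery Store', 'Museum',
                   'Train Station', 'Sandwich Place', 'Italian Restaurant', 'Garden',
                   'Latin American Restaurant'],
    'Barajas': ['Apres Ski Bar', 'Wine Shop', 'Department Store', 'Flower Shop',
                'Fast Food Restaurant', 'Falafel Restaurant', 'Event Space',
                'Electronics Store', 'Diner', 'Dessert Shop'],
}

def restaurant_count(venues):
    """Count categories whose name contains 'Restaurant' (a deliberately crude filter)."""
    return sum('Restaurant' in venue for venue in venues)

restaurant_counts = {district: restaurant_count(venues)
                     for district, venues in top_venues_by_district.items()}
print(restaurant_counts)
```

A count like this says nothing about demand or competition; it only indicates where restaurants already make up a visible share of the venues.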

In [145]:
df_C1 = df3_merged[df3_merged['Cluster Labels'] == 0]
df_C1 = df_C1.drop(['Cluster Labels', '1st Most Common Venue', '2nd Most Common Venue', '3rd Most Common Venue', '4th Most Common Venue', '5th Most Common Venue'], axis=1)
df_C1 = df_C1.drop(['6th Most Common Venue', '7th Most Common Venue', '8th Most Common Venue', '9th Most Common Venue', '10th Most Common Venue'], axis=1)
df_C1.head()

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",40.375855,-3.74091
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer...",40.381633,-3.668024
6,79615,Ciudad Lineal,"Ventas, Pueblo Nuevo, Quintana, Concepción, Sa...",40.43398,-3.657251
7,79616,Hortaleza,"Palomas, Piovera, Canillas, Pinar del Rey, Apo...",40.458139,-3.641003


### Pre-processing

### Foursquare

In [146]:
df_C1_venues = getNearbyVenues(names=df_C1['Neighborhood'],
                                   latitudes=df_C1['Latitude'],
                                   longitudes=df_C1['Longitude']
                                  )

Palacio, Embajadores, Cortes, Justicia, Universidad, Sol
Comillas, Opañel, San Isidro, Vista Alegre, Puerta Bonita, Buenavista, Abrantes
Entrevías, San Diego, Palomeras Bajas, Palomeras Sureste, Portazgo, Numancia
Ventas, Pueblo Nuevo, Quintana, Concepción, San Pascual, San Juan Bautista, Colina, Atalaya, Costillares
Palomas, Piovera, Canillas, Pinar del Rey, Apostol Santiago, Valdefuentes
Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas
Casco Histórico de Vicálvaro, Valdebernardo, Valderrivas, El Cañaveral
Imperial, Acacias, Chopera, Legazpi, Delicias, Palos de Moguer, Atocha
Simancas, Hellín, Amposta, Arcos, Rosas, Rejas, Canillejas, Salvador
Pacífico, Adelfas, Estrella, Ibiza, Jerónimos, Niño Jesús
Recoletos, Goya, Fuente del Berro, Guindalera, Lista, Castellana
El Viso, Prosperidad, Ciudad Jardín, Hispanoamérica, Nueva España, Castilla
Bellas Vistas, Cuatro Caminos, Castillejos, Almenara, Valdeacederas, Berruguete
Gaztambide, Arapiles, Trafalgar, Almagro, Rios Rosas

In [147]:
df_C1_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
"Bellas Vistas, Cuatro Caminos, Castillejos, Almenara, Valdeacederas, Berruguete",29,29,29,29,29,29
"Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas",7,7,7,7,7,7
"Casco Histórico de Vicálvaro, Valdebernardo, Valderrivas, El Cañaveral",15,15,15,15,15,15
"Comillas, Opañel, San Isidro, Vista Alegre, Puerta Bonita, Buenavista, Abrantes",8,8,8,8,8,8
"El Pardo, Fuentelareina, Peñagrande, Pilar, La Paz, Valverde, Mirasierra, El Goloso",11,11,11,11,11,11
"El Viso, Prosperidad, Ciudad Jardín, Hispanoamérica, Nueva España, Castilla",40,40,40,40,40,40
"Entrevías, San Diego, Palomeras Bajas, Palomeras Sureste, Portazgo, Numancia",14,14,14,14,14,14
"Gaztambide, Arapiles, Trafalgar, Almagro, Rios Rosas, Vallehermoso",74,74,74,74,74,74
"Imperial, Acacias, Chopera, Legazpi, Delicias, Palos de Moguer, Atocha",46,46,46,46,46,46
"Pacífico, Adelfas, Estrella, Ibiza, Jerónimos, Niño Jesús",100,100,100,100,100,100


In [148]:
# one hot encoding
df_C1_onehot = pd.get_dummies(df_C1_venues[['Venue Category']], prefix="", prefix_sep="")

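# add neighborhood column back to dataframe
# (the venue data already contains a category called 'Neighborhood', so this
# assignment overwrites that existing column instead of appending a new one)
df_C1_onehot['Neighborhood'] = df_C1_venues['Neighborhood']

# move the last column to the front; because 'Neighborhood' is not the last
# column in this case, it is 'Yoga Studio' that moves, and 'Neighborhood'
# only becomes the first column after the groupby/reset_index further on
fixed_columns = [df_C1_onehot.columns[-1]] + list(df_C1_onehot.columns[:-1])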
df_C1_onehot = df_C1_onehot[fixed_columns]

df_C1_onehot.head()

Unnamed: 0,Yoga Studio,Accessories Store,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,BBQ Joint,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Business Service,Butcher,Cafeteria,Café,Casino,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Event Space,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Fountain,Frozen Yogurt Shop,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motorcycle Shop,Multiplex,Museum,Music Venue,Neighborhood,Nightclub,Noodle House,Optical Shop,Other Nightlife,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Peruvian Restaurant,Pet Store,Piano Bar,Pie Shop,Pizza Place,Platform,Plaza,Pool,Pool Hall,Pub,Ramen Restaurant,Rental Car Location,Restaurant,Road,Roof Deck,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Travel Lounge,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,"Palacio, Embajadores, Cortes, Justicia, Univer...",0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [149]:
df_C1_grouped = df_C1_onehot.groupby('Neighborhood').mean().reset_index()
df_C1_grouped

Unnamed: 0,Neighborhood,Yoga Studio,Accessories Store,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,BBQ Joint,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Bowling Alley,Boxing Gym,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Business Service,Butcher,Cafeteria,Café,Casino,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Convenience Store,Cosmetics Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Event Space,Falafel Restaurant,Fast Food Restaurant,Flower Shop,Food & Drink Shop,Food Court,Fountain,Frozen Yogurt Shop,Garden,Gastropub,Gay Bar,General Entertainment,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Hawaiian Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Karaoke Bar,Korean Restaurant,Latin American Restaurant,Lounge,Market,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Motorcycle Shop,Multiplex,Museum,Music Venue,Nightclub,Noodle House,Optical Shop,Other Nightlife,Paella Restaurant,Paper / Office Supplies Store,Park,Pastry Shop,Performing Arts Venue,Peruvian Restaurant,Pet Store,Piano Bar,Pie Shop,Pizza Place,Platform,Plaza,Pool,Pool Hall,Pub,Ramen Restaurant,Rental Car Location,Restaurant,Road,Roof Deck,Salon / Barbershop,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Supermarket,Sushi Restaurant,Tapas Restaurant,Tattoo Parlor,Tea Room,Thai Restaurant,Theater,Toy / Game Store,Trail,Train Station,Travel Lounge,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop
0,"Bellas Vistas, Cuatro Caminos, Castillejos, Al...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.034483,0.034483,0.0,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.103448,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.172414,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0
1,"Casco Histórico de Vallecas, Santa Eugenia, En...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.142857,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,"Casco Histórico de Vicálvaro, Valdebernardo, V...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,"El Pardo, Fuentelareina, Peñagrande, Pilar, La...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0
5,"El Viso, Prosperidad, Ciudad Jardín, Hispanoam...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.0,0.0,0.0,0.075,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.025,0.0,0.025,0.0
6,"Entrevías, San Diego, Palomeras Bajas, Palomer...",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.071429,0.0,0.0,0.0,0.0
7,"Gaztambide, Arapiles, Trafalgar, Almagro, Rios...",0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.013514,0.013514,0.040541,0.067568,0.0,0.027027,0.0,0.013514,0.013514,0.0,0.0,0.013514,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.067568,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.027027,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.013514,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.013514,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.081081,0.0,0.0,0.0,0.013514,0.013514,0.0,0.0,0.0,0.0,0.0,0.0,0.067568,0.0,0.013514,0.013514,0.081081,0.0,0.0,0.0,0.067568,0.013514,0.0,0.0,0.0,0.0,0.013514,0.0
8,"Imperial, Acacias, Chopera, Legazpi, Delicias,...",0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.065217,0.021739,0.0,0.0,0.0,0.021739,0.0,0.0,0.108696,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.152174,0.021739,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0
9,"Pacífico, Adelfas, Estrella, Ibiza, Jerónimos,...",0.0,0.01,0.0,0.0,0.01,0.01,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.02,0.0,0.01,0.02,0.02,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.04,0.09,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.01,0.0,0.0,0.02,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.07,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.01,0.01,0.05,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.03,0.0
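
Taking the mean of one-hot columns per neighborhood is the same as computing each category's share of that neighborhood's venues. The sketch below spells this out in plain Python. The `category_frequencies` helper and the venue list are hypothetical, built only so that the totals match the first row above (29 venues).

```python
from collections import Counter

def category_frequencies(categories):
    """Share of each category among a neighborhood's venues (mean of the one-hot column)."""
    counts = Counter(categories)
    total = len(categories)
    return {category: n / total for category, n in counts.items()}

# hypothetical venue list: 29 venues, 5 Spanish restaurants, 2 breakfast spots
venues = ['Spanish Restaurant'] * 5 + ['Breakfast Spot'] * 2 + ['Other'] * 22
freqs = category_frequencies(venues)

print(round(freqs['Spanish Restaurant'], 6))
print(round(freqs['Breakfast Spot'], 6))
```

Under that assumption, the values 0.172414 and 0.068966 in the first row correspond to 5 and 2 venues out of 29.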


In [150]:
num_top_venues = 5

for hood in df_C1_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = df_C1_grouped[df_C1_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bellas Vistas, Cuatro Caminos, Castillejos, Almenara, Valdeacederas, Berruguete----
                venue  freq
0  Spanish Restaurant  0.17
1      Sandwich Place  0.10
2               Hotel  0.10
3          Restaurant  0.07
4  Italian Restaurant  0.07


----Casco Histórico de Vallecas, Santa Eugenia, Ensanche de Vallecas----
              venue  freq
0      Soccer Field  0.14
1  Asian Restaurant  0.14
2        Restaurant  0.14
3       Supermarket  0.14
4  Tapas Restaurant  0.14


----Casco Histórico de Vicálvaro, Valdebernardo, Valderrivas, El Cañaveral----
                venue  freq
0         Pizza Place  0.20
1  Spanish Restaurant  0.13
2            Beer Bar  0.07
3                Park  0.07
4  Falafel Restaurant  0.07


----Comillas, Opañel, San Isidro, Vista Alegre, Puerta Bonita, Buenavista, Abrantes----
          venue  freq
0    Restaurant  0.12
1  Dance Studio  0.12
2         Hotel  0.12
3          Café  0.12
4   Supermarket  0.12


----El Pardo, Fuentelareina, Peñagrande,

In [151]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = df_C1_grouped['Neighborhood']

for ind in np.arange(df_C1_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(df_C1_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,"Bellas Vistas, Cuatro Caminos, Castillejos, Al...",Spanish Restaurant,Hotel,Sandwich Place,Restaurant,Breakfast Spot,Italian Restaurant,Burger Joint,Building,Business Service,Fast Food Restaurant
1,"Casco Histórico de Vallecas, Santa Eugenia, En...",Tapas Restaurant,Supermarket,Pet Store,Restaurant,Soccer Field,Grocery Store,Asian Restaurant,Falafel Restaurant,Event Space,Electronics Store
2,"Casco Histórico de Vicálvaro, Valdebernardo, V...",Pizza Place,Spanish Restaurant,Ice Cream Shop,Supermarket,Café,Falafel Restaurant,Fast Food Restaurant,Sandwich Place,Breakfast Spot,Beer Bar
3,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",Bakery,Café,Restaurant,Supermarket,Tapas Restaurant,Food & Drink Shop,Dance Studio,Hotel,Deli / Bodega,Department Store
4,"El Pardo, Fuentelareina, Peñagrande, Pilar, La...",Café,Restaurant,Park,Paper / Office Supplies Store,Train Station,Snack Place,Pool,Diner,Wine Shop,Event Space
5,"El Viso, Prosperidad, Ciudad Jardín, Hispanoam...",Spanish Restaurant,Café,Platform,Hotel,Gym / Fitness Center,Restaurant,Bar,Sandwich Place,Train Station,Skating Rink
6,"Entrevías, San Diego, Palomeras Bajas, Palomer...",Clothing Store,Park,Chinese Restaurant,Sandwich Place,Shopping Mall,Burger Joint,Spanish Restaurant,Pizza Place,Bakery,Toy / Game Store
7,"Gaztambide, Arapiles, Trafalgar, Almagro, Rios...",Restaurant,Tapas Restaurant,Theater,Bar,Spanish Restaurant,Café,Bakery,Multiplex,Plaza,Mexican Restaurant
8,"Imperial, Acacias, Chopera, Legazpi, Delicias,...",Spanish Restaurant,Hotel,Restaurant,Grocery Store,Sandwich Place,Train Station,Museum,Gym,Brewery,Latin American Restaurant
9,"Pacífico, Adelfas, Estrella, Ibiza, Jerónimos,...",Hotel,Plaza,Tapas Restaurant,Spanish Restaurant,Hostel,Wine Bar,Gourmet Shop,Clothing Store,Restaurant,Dessert Shop
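
Several categories share the same frequency, so their relative order in a ranking like this depends on how ties are resolved (compare "Sandwich Place" and "Hotel" in the printed top 5 and in the table above). As a minimal sketch, assuming alphabetical order is an acceptable tie-breaker, a deterministic ranking could look like the following. The `top_venues` helper is hypothetical and is not the `return_most_common_venues` function used in the notebook.

```python
def top_venues(freqs, n):
    """Return the n most frequent categories, breaking ties alphabetically."""
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:n]]

# rounded frequencies taken from the first printed block above
freqs = {
    'Spanish Restaurant': 0.17,
    'Sandwich Place': 0.10,
    'Hotel': 0.10,
    'Restaurant': 0.07,
    'Italian Restaurant': 0.07,
}
ranking = top_venues(freqs, 5)
print(ranking)
```

With an explicit tie-breaker, rerunning the notebook gives the same ordering regardless of the sort algorithm used.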


### Modeling

In [152]:
# import k-means from clustering stage
from sklearn.cluster import KMeans

# set number of clusters
kclusters = 8

df_C1_grouped_clustering = df_C1_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(df_C1_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 4, 7, 5, 3, 0, 6, 1, 0, 1], dtype=int32)
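
With 8 clusters and only 14 neighborhoods, several clusters are bound to be small. A quick way to see this is to count how often each label occurs. The sketch below uses only the ten labels printed above, so the sizes it reports are partial.

```python
from collections import Counter

# the first ten labels, copied from the output above
first_labels = [0, 4, 7, 5, 3, 0, 6, 1, 0, 1]

sizes = Counter(first_labels)
singletons = sorted(label for label, n in sizes.items() if n == 1)

print(dict(sizes))
print(singletons)
```

Labels that occur only once among these ten may correspond to clusters holding a single neighborhood, which is worth keeping in mind when reading the per-cluster tables.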

In [153]:
# add clustering labels
neighborhoods_venues_sorted = neighborhoods_venues_sorted.dropna(how='any')
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

df_C1_a = df_C1.dropna()

df_C1_merged = df_C1_a

# join the cluster labels and top venues onto the cluster 1 neighborhood table, which already holds latitude/longitude
df_C1_merged = df_C1_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

df_C1_merged.head() # check the last columns!

Unnamed: 0,PostalCode,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,7961,Centro,"Palacio, Embajadores, Cortes, Justicia, Univer...",40.425356,-3.69819,1,Restaurant,Spanish Restaurant,Bakery,Hotel,Cocktail Bar,Cosmetics Shop,Café,Vegetarian / Vegan Restaurant,Bar,Italian Restaurant
2,79611,Carabanchel,"Comillas, Opañel, San Isidro, Vista Alegre, Pu...",40.375855,-3.74091,5,Bakery,Café,Restaurant,Supermarket,Tapas Restaurant,Food & Drink Shop,Dance Studio,Hotel,Deli / Bodega,Department Store
4,79613,Puente de Vallecas,"Entrevías, San Diego, Palomeras Bajas, Palomer...",40.381633,-3.668024,6,Clothing Store,Park,Chinese Restaurant,Sandwich Place,Shopping Mall,Burger Joint,Spanish Restaurant,Pizza Place,Bakery,Toy / Game Store
6,79615,Ciudad Lineal,"Ventas, Pueblo Nuevo, Quintana, Concepción, Sa...",40.43398,-3.657251,0,Spanish Restaurant,Grocery Store,Restaurant,Chinese Restaurant,Mediterranean Restaurant,Tapas Restaurant,Hotel,Gourmet Shop,Park,Butcher
7,79616,Hortaleza,"Palomas, Piovera, Canillas, Pinar del Rey, Apo...",40.458139,-3.641003,2,Metro Station,Soccer Field,Italian Restaurant,Plaza,Theater,Diner,Flower Shop,Fast Food Restaurant,Falafel Restaurant,Event Space


In [154]:
df_C1_merged = df_C1_merged.dropna(how='any')
df_C1_merged['Cluster Labels'].unique()
df_C1_merged['Cluster Labels'] = df_C1_merged['Cluster Labels'].astype(np.int64)
df_C1_merged['Cluster Labels'].unique()

array([1, 5, 6, 0, 2, 4, 7, 3])

In [155]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import matplotlib.colors as colors

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df_C1_merged['Latitude'], df_C1_merged['Longitude'], df_C1_merged['Neighborhood'], df_C1_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

### Cluster 1

In [156]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 0, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | Ciudad Lineal | 0 | Spanish Restaurant | Grocery Store | Restaurant | Chinese Restaurant | Mediterranean Restaurant | Tapas Restaurant | Hotel | Gourmet Shop | Park | Butcher |
| 11 | Arganzuela | 0 | Spanish Restaurant | Hotel | Restaurant | Grocery Store | Sandwich Place | Train Station | Museum | Gym | Brewery | Latin American Restaurant |
| 16 | Chamartin | 0 | Spanish Restaurant | Café | Platform | Hotel | Gym / Fitness Center | Restaurant | Bar | Sandwich Place | Train Station | Skating Rink |
| 17 | Tetuan | 0 | Spanish Restaurant | Hotel | Sandwich Place | Restaurant | Breakfast Spot | Italian Restaurant | Burger Joint | Building | Business Service | Fast Food Restaurant |


### Cluster 2 (label 1)

In [158]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 1, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Centro | 1 | Restaurant | Spanish Restaurant | Bakery | Hotel | Cocktail Bar | Cosmetics Shop | Café | Vegetarian / Vegan Restaurant | Bar | Italian Restaurant |
| 12 | San Blas | 1 | Hotel | Plaza | Tapas Restaurant | Spanish Restaurant | Hostel | Wine Bar | Gourmet Shop | Clothing Store | Restaurant | Dessert Shop |
| 14 | Retiro | 1 | Hotel | Plaza | Tapas Restaurant | Spanish Restaurant | Hostel | Wine Bar | Gourmet Shop | Clothing Store | Restaurant | Dessert Shop |
| 15 | Salamanca | 1 | Hotel | Spanish Restaurant | Plaza | Restaurant | Theater | Park | Café | Japanese Restaurant | Art Gallery | BBQ Joint |
| 18 | Chamberi | 1 | Restaurant | Tapas Restaurant | Theater | Bar | Spanish Restaurant | Café | Bakery | Multiplex | Plaza | Mexican Restaurant |


### Cluster 3 (label 2)

In [159]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 2, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | Hortaleza | 2 | Metro Station | Soccer Field | Italian Restaurant | Plaza | Theater | Diner | Flower Shop | Fast Food Restaurant | Falafel Restaurant | Event Space |


### Cluster 4 (label 3)

In [160]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 3, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | Fuencarral | 3 | Café | Restaurant | Park | Paper / Office Supplies Store | Train Station | Snack Place | Pool | Diner | Wine Shop | Event Space |


### Cluster 5 (label 4)

In [161]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 4, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9 | Villa de Vallecas | 4 | Tapas Restaurant | Supermarket | Pet Store | Restaurant | Soccer Field | Grocery Store | Asian Restaurant | Falafel Restaurant | Event Space | Electronics Store |


### Cluster 6 (label 5)

In [162]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 5, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | Carabanchel | 5 | Bakery | Café | Restaurant | Supermarket | Tapas Restaurant | Food & Drink Shop | Dance Studio | Hotel | Deli / Bodega | Department Store |


### Cluster 7 (label 6)

In [163]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 6, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | Puente de Vallecas | 6 | Clothing Store | Park | Chinese Restaurant | Sandwich Place | Shopping Mall | Burger Joint | Spanish Restaurant | Pizza Place | Bakery | Toy / Game Store |


### Cluster 8 (label 7)

In [164]:
df_C1_merged.loc[df_C1_merged['Cluster Labels'] == 7, df_C1_merged.columns[[1] + list(range(5, df_C1_merged.shape[1]))]]

|  | Borough | Cluster Labels | 1st Most Common Venue | 2nd Most Common Venue | 3rd Most Common Venue | 4th Most Common Venue | 5th Most Common Venue | 6th Most Common Venue | 7th Most Common Venue | 8th Most Common Venue | 9th Most Common Venue | 10th Most Common Venue |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | Vicalvaro | 7 | Pizza Place | Spanish Restaurant | Ice Cream Shop | Supermarket | Café | Falafel Restaurant | Fast Food Restaurant | Sandwich Place | Breakfast Spot | Beer Bar |
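
The eight tables above can be condensed into one count of boroughs per cluster. The sketch below is only an illustration: `demo` is a hypothetical stand-in for `df_C1_merged`, built from the borough names and labels transcribed from the tables in this section, and `cluster_sizes` is an illustrative name.

```python
import pandas as pd

# Borough -> cluster label, transcribed from the per-cluster tables above
labels = {
    'Ciudad Lineal': 0, 'Arganzuela': 0, 'Chamartin': 0, 'Tetuan': 0,
    'Centro': 1, 'San Blas': 1, 'Retiro': 1, 'Salamanca': 1, 'Chamberi': 1,
    'Hortaleza': 2, 'Fuencarral': 3, 'Villa de Vallecas': 4,
    'Carabanchel': 5, 'Puente de Vallecas': 6, 'Vicalvaro': 7,
}

# Hypothetical stand-in for df_C1_merged with only the two columns needed here
demo = pd.DataFrame({'Borough': list(labels.keys()),
                     'Cluster Labels': list(labels.values())})

# Number of boroughs assigned to each cluster label, ordered by label
cluster_sizes = demo['Cluster Labels'].value_counts().sort_index()
print(cluster_sizes)
```

On the real data the same count would come from `df_C1_merged['Cluster Labels'].value_counts()`. In the tables shown here, labels 0 and 1 hold four and five boroughs respectively, while each of the remaining six labels holds a single borough.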
