# Paris, Brussels, Rome, Madrid 
# Segmenting and Clustering Neighborhoods

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1.  <a href="#item1">Paris, Brussels, Rome, Madrid : Download and Explore Dataset.</a>

2.  <a href="#item66">Foursquare.</a>    
    
3.  <a href="#item22">Paris : Explore, Cluster and Examine Neighborhoods.</a>
    
4.  <a href="#item33">Brussels: Explore, Cluster and Examine Neighborhoods.</a>
    
5.  <a href="#item44">Rome: Explore, Cluster and Examine Neighborhoods.</a>    
    
6.  <a href="#item55">Madrid: Explore, Cluster and Examine Neighborhoods.</a>

    </font>
    </div>


# Index.
## Paris, Brussels, Rome, Madrid :
###            Download and Explore Dataset.
## Foursquare :
###            Credentials.
## Explore, Cluster and Examine Neighborhoods :
### Paris.
### Brussels.
### Rome.
### Madrid.

# Libraries


In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)


import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')


<a id='item1'></a>


## 1. Paris, Brussels, Rome, Madrid: Download and Explore Dataset.


### Paris

In [2]:
p_n=pd.read_csv('Paris.csv')

In [3]:
p_n.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Paris,Louvre,48.862563,2.336443
1,Paris,Bourse,48.868279,2.342803
2,Paris,Buttes-Chaumont,48.887076,2.384821
3,Paris,Luxembourg,48.84913,2.332898
4,Paris,Passy,48.860392,2.261971


In [4]:
address = 'Paris'

geolocator = Nominatim(user_agent="pa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))
p_latitude,p_longitude=(latitude, longitude)

The geographical coordinates of Paris are 48.8566969, 2.3514616.


In [5]:
neighborhoods=p_n

In [6]:
# create map of Paris using latitude and longitude values
map_neighborhoods = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_neighborhoods)  
    
map_neighborhoods

### Brussels

In [7]:
b_n=pd.read_csv('Brussels.csv')

In [8]:
b_n.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Brussels,Bruxelles-Ville,50.850346,4.351721
1,Brussels,Schaerbeek,50.867416,4.377298
2,Brussels,Etterbeek,50.832578,4.388994
3,Brussels,Ixelles,50.833343,4.366629
4,Brussels,Saint Gilles,50.830144,4.340218


In [9]:
address = 'Brussels'

geolocator = Nominatim(user_agent="pa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))
b_latitude,b_longitude=(latitude, longitude)

The geographical coordinates of Brussels are 50.8465573, 4.351697.


In [10]:
neighborhoods=b_n

In [11]:
# create map of Brussels using latitude and longitude values
map_neighborhoods = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_neighborhoods)  
    
map_neighborhoods

### Rome

In [114]:
r_n=pd.read_csv('Rome.csv')

In [115]:
r_n.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Rome,Municipio I – Historical Center,41.90286,12.485487
1,Rome,Municipio II – Parioli/Nomentano,41.922397,12.498321
2,Rome,Municipio III – Monte Sacro,41.942542,12.540979
3,Rome,Municipio IV – Tiburtina,41.92163,12.553682
4,Rome,Municipio V – Prenestino/Centocelle,41.891288,12.551022


In [116]:
address = 'Rome'

geolocator = Nominatim(user_agent="pa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))
r_latitude,r_longitude=(latitude, longitude)

The geographical coordinates of Rome are 41.8933203, 12.4829321.


In [117]:
neighborhoods=r_n

In [118]:
# create map of Rome using latitude and longitude values
map_neighborhoods = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_neighborhoods)  
    
map_neighborhoods

### Madrid

In [17]:
m_n=pd.read_csv('Madrid.csv')

In [18]:
m_n.head()

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude
0,Madrid,Centro,40.411535,-3.707628
1,Madrid,Arganzuela,40.398889,-3.710203
2,Madrid,Retiro,40.411335,-3.674905
3,Madrid,Salamanca,40.428002,-3.686771
4,Madrid,Chamartin,40.46152,-3.686584


In [19]:
address = 'Madrid'

geolocator = Nominatim(user_agent="pa_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))


The geographical coordinates of Madrid are 40.4167047, -3.7035825.


In [20]:
m_latitude,m_longitude=(latitude, longitude)

In [21]:
neighborhoods=m_n

In [22]:
# create map of Madrid using latitude and longitude values
map_neighborhoods = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, borough, neighborhood in zip(neighborhoods['Latitude'], neighborhoods['Longitude'], neighborhoods['Borough'], neighborhoods['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_neighborhoods)  
    
map_neighborhoods

<a id='item66'></a>


## 2. Foursquare data

#### Define Foursquare Credentials and Version


In [23]:

VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

CLIENT_ID = 'GVNP3JOMLBIADOJOE5JLTFWLQYJC30GAC23VHYTMTZVCGJY2' # your Foursquare ID
CLIENT_SECRET = 'RDD1EYDCYA0MHPIVVWYAQ2E5FO4YQYG25ABBY44JZBGEFD0V' # your Foursquare Secret
# ACCESS_TOKEN = 'DR4VOEHRB3NBWSSFJ0NEOYB43MNJ1SVDO1DQCXJ4C3J1C0JU' # your FourSquare Access Token

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: GVNP3JOMLBIADOJOE5JLTFWLQYJC30GAC23VHYTMTZVCGJY2
CLIENT_SECRET:RDD1EYDCYA0MHPIVVWYAQ2E5FO4YQYG25ABBY44JZBGEFD0V
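
Hard-coding and printing credentials makes them easy to leak when a notebook is shared. As a minimal sketch of one alternative, the keys could be read from environment variables and masked before printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, the helper names, and the demo values are all assumptions made for this illustration; they are not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET are hypothetical
    variable names chosen for this sketch.
    """
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError("Missing environment variable: {}".format(missing)) from None


def mask(secret, visible=4):
    """Keep the first few characters and replace the rest with asterisks."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)


# Demo with made-up values; in practice, call load_foursquare_credentials()
# with no argument so that the real environment is used.
demo_env = {"FOURSQUARE_CLIENT_ID": "ABCD1234", "FOURSQUARE_CLIENT_SECRET": "WXYZ5678"}
client_id, client_secret = load_foursquare_credentials(demo_env)
print("CLIENT_ID: " + mask(client_id))
print("CLIENT_SECRET: " + mask(client_secret))
```

With this approach the notebook can still confirm that credentials were loaded without showing them in full.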


<a id='item22'></a>


## 3. Paris: Explore, Cluster and Examine Neighborhoods.

In [24]:
neighbor_data=p_n
neighborhoods=p_n

In [25]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
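
The function above indexes `categories[0]` directly, so a venue with an empty category list would raise an `IndexError` and abort the whole download. The sketch below shows one way the row extraction could tolerate missing fields. `extract_venue_rows` is a hypothetical helper, and the second item in the sample payload is made up for illustration; only the field names are taken from the function above.

```python
def extract_venue_rows(name, lat, lng, items):
    """Flatten explore-style items into row tuples, tolerating missing fields."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        location = venue.get("location", {})
        categories = venue.get("categories") or []
        # Use None when a venue has no category instead of raising IndexError.
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"), category))
    return rows


# Illustrative payload: the first item mirrors a row shown in this notebook,
# the second is a made-up venue with no category.
sample_items = [
    {"venue": {"name": "Musée du Louvre",
               "location": {"lat": 48.860847, "lng": 2.33644},
               "categories": [{"name": "Art Museum"}]}},
    {"venue": {"name": "Uncategorized venue",
               "location": {"lat": 48.861, "lng": 2.336},
               "categories": []}},
]

rows = extract_venue_rows("Louvre", 48.862563, 2.336443, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept explicitly before the one-hot encoding step.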

In [26]:

n_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )


Louvre
Bourse
Buttes-Chaumont
Luxembourg
Passy
Opéra
Buttes-Montmartre
Entrepôt
Popincourt
Vaugirard
Temple
Palais-Bourbon
Panthéon
Élysée
Gobelins
Batignolles-Monceau
Ménilmontant
Hôtel-de-Ville
Observatoire
Reuilly


In [27]:
print(n_venues.shape)
n_venues.head()

(1273, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Louvre,48.862563,2.336443,Musée du Louvre,48.860847,2.33644,Art Museum
1,Louvre,48.862563,2.336443,Palais Royal,48.863236,2.337127,Historic Site
2,Louvre,48.862563,2.336443,Comédie-Française,48.863088,2.336612,Theater
3,Louvre,48.862563,2.336443,Place du Palais Royal,48.862523,2.336688,Plaza
4,Louvre,48.862563,2.336443,Cour Napoléon,48.861172,2.335088,Plaza


Let's check how many venues were returned for each neighborhood.


In [28]:
n_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Batignolles-Monceau,59,59,59,59,59,59
Bourse,100,100,100,100,100,100
Buttes-Chaumont,44,44,44,44,44,44
Buttes-Montmartre,44,44,44,44,44,44
Entrepôt,100,100,100,100,100,100
Gobelins,62,62,62,62,62,62
Hôtel-de-Ville,100,100,100,100,100,100
Louvre,82,82,82,82,82,82
Luxembourg,49,49,49,49,49,49
Ménilmontant,48,48,48,48,48,48


#### Let's find out how many unique categories can be curated from all the returned venues


In [29]:
print('There are {} unique categories.'.format(len(n_venues['Venue Category'].unique())))

There are 204 unique categories.


### 3. Analyze Each Neighborhood in Paris


In [30]:
# one hot encoding
n_onehot = pd.get_dummies(n_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
n_onehot['Neighborhood'] = n_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [n_onehot.columns[-1]] + list(n_onehot.columns[:-1])
n_onehot = n_onehot[fixed_columns]

n_onehot.head()

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auvergne Restaurant,Baby Store,Bagel Shop,Bakery,Bar,Basque Restaurant,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Bus Station,Bus Stop,Butcher,Café,Cambodian Restaurant,Canal,Candy Store,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Corsican Restaurant,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Cycle Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Donut Shop,Electronics Store,Escape Room,Ethiopian Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flower Shop,Food,Food & Drink Shop,Fountain,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,General College & University,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewish Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lebanese Restaurant,Lingerie Store,Liquor Store,Lounge,Lyonese Bouchon,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Store,New American Restaurant,Nightclub,Noodle House,Okonomiyaki Restaurant,Optical Shop,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Resort,Restaurant,Romanian Restaurant,Roof Deck,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Science Museum,Sculpture Garden,Seafood Restaurant,Shanxi Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Snack Place,Soba Restaurant,South American Restaurant,Southwestern French Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Stadium,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Udon Restaurant,University,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Zoo,Zoo Exhibit
0,Louvre,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Louvre,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.


In [31]:
n_onehot.shape

(1273, 205)
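
The shape follows from the earlier counts: 1273 venue rows, and 204 category columns plus the `Neighborhood` column give 205 columns. The toy example below sketches the same encoding on three made-up venues; the frame and its names are illustrative only, and `dtype=int` is passed so that the indicator columns are 0/1 integers regardless of the pandas version.

```python
import pandas as pd

# Made-up venues: two categories across two neighborhoods.
toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "B"],
    "Venue Category": ["Bakery", "Bar", "Bakery"],
})

# One indicator column per category; empty prefix keeps the bare category name.
toy_onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="", dtype=int)

# Put the neighborhood back as the first column.
toy_onehot.insert(0, "Neighborhood", toy["Neighborhood"])

print(toy_onehot)
print(toy_onehot.shape)  # 3 rows, 2 categories + 1 neighborhood column
```

Each row has exactly one 1 among the category columns, because each venue contributes only its first category.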

#### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category


In [32]:
n_grouped = n_onehot.groupby('Neighborhood').mean().reset_index()
n_grouped

Unnamed: 0,Neighborhood,Afghan Restaurant,African Restaurant,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auvergne Restaurant,Baby Store,Bagel Shop,Bakery,Bar,Basque Restaurant,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Bus Station,Bus Stop,Butcher,Café,Cambodian Restaurant,Canal,Candy Store,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Concert Hall,Convenience Store,Corsican Restaurant,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Cycle Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Donut Shop,Electronics Store,Escape Room,Ethiopian Restaurant,Exhibit,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Flower Shop,Food,Food & Drink Shop,Fountain,French Restaurant,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,General College & University,Gift Shop,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Halal Restaurant,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewish Restaurant,Juice Bar,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lebanese Restaurant,Lingerie Store,Liquor Store,Lounge,Lyonese Bouchon,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Museum,Music Store,New American Restaurant,Nightclub,Noodle House,Okonomiyaki Restaurant,Optical Shop,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Persian Restaurant,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Portuguese Restaurant,Print Shop,Pub,Ramen Restaurant,Resort,Restaurant,Romanian Restaurant,Roof Deck,Russian Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scandinavian Restaurant,Science Museum,Sculpture Garden,Seafood Restaurant,Shanxi Restaurant,Shoe Store,Shopping Mall,Smoke Shop,Snack Place,Soba Restaurant,South American Restaurant,Southwestern French Restaurant,Souvenir Shop,Souvlaki Shop,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Stadium,Thai Restaurant,Theater,Toy / Game Store,Trail,Turkish Restaurant,Udon Restaurant,University,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Video Game Store,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Zoo,Zoo Exhibit
0,Batignolles-Monceau,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.050847,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.169492,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.169492,0.0,0.0,0.0,0.0,0.0,0.0,0.067797,0.050847,0.0,0.0,0.0,0.033898,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.050847,0.0,0.016949,0.0,0.0,0.0,0.0,0.050847,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016949,0.0,0.0,0.0,0.0,0.016949,0.0,0.016949,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Bourse,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.06,0.02,0.01,0.0,0.02,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.13,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.04,0.01,0.01,0.01,0.02,0.0,0.01,0.03,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.01,0.02,0.0,0.0
2,Buttes-Chaumont,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.090909,0.0,0.0,0.045455,0.0,0.0,0.022727,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.022727,0.0,0.0,0.0,0.068182,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.113636,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0
3,Buttes-Montmartre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.136364,0.0,0.0,0.022727,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.113636,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.045455,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0
4,Entrepôt,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.05,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.13,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.05,0.0,0.01,0.04,0.0,0.0,0.01,0.02,0.03,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0
5,Gobelins,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.193548,0.0,0.0,0.0,0.0,0.032258,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.016129,0.0,0.0,0.0,0.080645,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.016129,0.0,0.016129,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.064516,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.016129,0.0,0.0,0.032258,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.080645,0.0,0.0,0.016129,0.0,0.0,0.0,0.0,0.0,0.0,0.209677,0.0,0.0,0.0,0.0,0.0
6,Hôtel-de-Ville,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.04,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,0.0,0.01,0.02,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.05,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.03,0.03,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0
7,Louvre,0.0,0.0,0.0,0.0,0.0,0.0,0.036585,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.02439,0.0,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.02439,0.02439,0.012195,0.0,0.012195,0.0,0.0,0.036585,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.097561,0.0,0.012195,0.02439,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.085366,0.0,0.0,0.0,0.0,0.0,0.0,0.036585,0.073171,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.085366,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.012195,0.0,0.0,0.012195,0.012195,0.012195,0.0,0.012195,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.02439,0.02439,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.012195,0.0,0.0,0.0
8,Luxembourg,0.0,0.0,0.020408,0.0,0.0,0.0,0.020408,0.0,0.0,0.020408,0.0,0.0,0.020408,0.061224,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.040816,0.0,0.020408,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.020408,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.040816,0.061224,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.040816,0.0,0.020408,0.0,0.0,0.0,0.0,0.020408,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.020408,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.061224,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.020408,0.061224,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.020408,0.0,0.020408,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.020408,0.0,0.0,0.020408,0.0,0.0,0.020408,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020408,0.0,0.0,0.0,0.0
9,Ménilmontant,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.104167,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.0,0.020833,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.041667,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.020833,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.020833,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.041667,0.083333,0.0,0.0,0.0,0.020833,0.0,0.020833,0.0,0.0,0.0,0.020833,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.020833,0.0,0.0,0.0,0.0


#### Let's confirm the new size


In [33]:
n_grouped.shape

(20, 205)
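
Because every venue row holds a single 1, the mean of a category column within a neighborhood is the share of that neighborhood's venues in that category. For instance, Batignolles-Monceau returned 59 venues, so one venue in a category appears as 1/59 ≈ 0.016949 and ten venues as 10/59 ≈ 0.169492, which matches the values in its row above. A small sketch with made-up indicator data:

```python
import pandas as pd

# Made-up one-hot rows: neighborhood A has 4 venues, B has 2.
toy_onehot = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Bakery": [1, 1, 1, 0, 0, 1],
    "Bar": [0, 0, 0, 1, 1, 0],
})

# The mean of 0/1 indicators is the relative frequency per neighborhood.
toy_grouped = toy_onehot.groupby("Neighborhood").mean().reset_index()
print(toy_grouped)
```

Each row of the grouped frame sums to 1 across the category columns, which makes neighborhoods with different venue counts comparable.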

#### Let's print each neighborhood along with the top 5 most common venues


In [34]:
num_top_venues = 5

for hood in n_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = n_grouped[n_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Batignolles-Monceau----
                venue  freq
0               Hotel  0.17
1   French Restaurant  0.17
2  Italian Restaurant  0.07
3               Plaza  0.05
4              Bakery  0.05


----Bourse----
               venue  freq
0  French Restaurant  0.13
1       Cocktail Bar  0.06
2           Wine Bar  0.06
3              Hotel  0.04
4             Bakery  0.04


----Buttes-Chaumont----
               venue  freq
0  French Restaurant  0.11
1                Bar  0.09
2               Café  0.07
3             Bistro  0.05
4            Brewery  0.05


----Buttes-Montmartre----
               venue  freq
0                Bar  0.14
1  French Restaurant  0.11
2  Convenience Store  0.05
3         Restaurant  0.05
4               Café  0.05


----Entrepôt----
               venue  freq
0  French Restaurant  0.13
1             Bistro  0.05
2              Hotel  0.05
3  Indian Restaurant  0.04
4               Café  0.04


----Gobelins----
                   venue  freq
0  Vietnamese Re

#### Let's put that into a _pandas_ dataframe


First, let's write a function to sort the venues in descending order.


In [35]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
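
As a quick check of what this helper returns, here is a minimal sketch on a made-up row. The neighborhood name `Toyville` and its frequencies are hypothetical; only the function body is taken from the cell above.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the neighborhood name), then rank categories by frequency.
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical row, shaped like a single row of n_grouped.
toy_row = pd.Series({'Neighborhood': 'Toyville', 'Bakery': 0.2, 'Bar': 0.5, 'Café': 0.3})

top_two = return_most_common_venues(toy_row, 2)
print(list(top_two))
```

Categories with equal frequencies may come back in either order, which is probably why tied venues can be ranked differently in the top 5 printout and in the top 10 table.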

Now let's create the new dataframe and display the top 10 venues for each neighborhood.


In [36]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = n_grouped['Neighborhood']

for ind in np.arange(n_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(n_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Batignolles-Monceau,Hotel,French Restaurant,Italian Restaurant,Japanese Restaurant,Bakery,Restaurant,Bistro,Plaza,Café,Korean Restaurant
1,Bourse,French Restaurant,Cocktail Bar,Wine Bar,Hotel,Bakery,Italian Restaurant,Bistro,Creperie,Cheese Shop,Coffee Shop
2,Buttes-Chaumont,French Restaurant,Bar,Café,Hotel,Bistro,Beer Bar,Brewery,Seafood Restaurant,Supermarket,Pizza Place
3,Buttes-Montmartre,Bar,French Restaurant,Hotel,Café,Coffee Shop,Convenience Store,Restaurant,Supermarket,Beer Store,Gastropub
4,Entrepôt,French Restaurant,Bistro,Hotel,Indian Restaurant,Coffee Shop,Café,Pizza Place,Japanese Restaurant,Korean Restaurant,Bar
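
The `try`/`except` loop above produces correct labels for the ten columns used here, but it would yield names such as "21th" if `num_top_venues` were raised past 20. A small sketch of a more general alternative follows; the helper name `ordinal` is hypothetical and not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    # 11, 12 and 13 (and 111, 112, ...) are irregular and always take 'th'.
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

num_top_venues = 10
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, num_top_venues + 1)
]
print(columns)
```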


<a id='item4'></a>


### Cluster Neighborhoods in Paris


Run _k_-means to cluster the neighborhoods into 5 clusters.


In [37]:
# set number of clusters
kclusters = 5

n_grouped_clustering = n_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(n_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 1, 1, 1, 1, 2, 1, 1, 1, 1])
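
To see how unevenly the neighborhoods are spread over the clusters, it can help to count the labels. The sketch below uses only the ten labels printed above; in the notebook itself, the full `kmeans.labels_` array would be passed instead.

```python
import numpy as np

# First ten labels as printed above (not the full array of 20).
labels_head = np.array([0, 1, 1, 1, 1, 2, 1, 1, 1, 1])

# Count how many neighborhoods fall into each of the 5 clusters.
sizes = np.bincount(labels_head, minlength=5)
for cluster, size in enumerate(sizes):
    print('Cluster label {}: {} neighborhoods'.format(cluster, size))
```

Eight of these ten neighborhoods share label 1, so one cluster dominates this part of the result.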

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.


In [38]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

n_merged = p_n

# join the neighborhood coordinates with the cluster labels and top venues
n_merged = n_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

n_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Paris,Louvre,48.862563,2.336443,1,French Restaurant,Plaza,Hotel,Japanese Restaurant,Coffee Shop,Art Museum,Italian Restaurant,Wine Bar,Udon Restaurant,Garden
1,Paris,Bourse,48.868279,2.342803,1,French Restaurant,Cocktail Bar,Wine Bar,Hotel,Bakery,Italian Restaurant,Bistro,Creperie,Cheese Shop,Coffee Shop
2,Paris,Buttes-Chaumont,48.887076,2.384821,1,French Restaurant,Bar,Café,Hotel,Bistro,Beer Bar,Brewery,Seafood Restaurant,Supermarket,Pizza Place
3,Paris,Luxembourg,48.84913,2.332898,1,Pastry Shop,Plaza,Bakery,French Restaurant,Fountain,Chocolate Shop,Hotel,Pharmacy,Deli / Bodega,Lebanese Restaurant
4,Paris,Passy,48.860392,2.261971,4,Plaza,Lake,French Restaurant,Pool,Bus Station,Art Museum,Boat or Ferry,Bus Stop,Park,Ethiopian Restaurant
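
The join above keeps every row of the coordinates table. If a neighborhood had returned no venues, it would be absent from `neighborhoods_venues_sorted` and its `Cluster Labels` value would become `NaN`, which also turns the column into floats. The sketch below illustrates this on made-up data; all names and values in it are hypothetical.

```python
import pandas as pd

# Hypothetical coordinates table with three neighborhoods.
coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [48.86, 48.87, 48.88],
    'Longitude': [2.33, 2.34, 2.35],
})

# Hypothetical clustering result that only covers two of them.
sorted_venues = pd.DataFrame({
    'Neighborhood': ['A', 'B'],
    'Cluster Labels': [0, 1],
    '1st Most Common Venue': ['Bar', 'Café'],
})

merged = coords.join(sorted_venues.set_index('Neighborhood'), on='Neighborhood')

# Rows without a cluster label, and a cleaned copy with integer labels.
missing = merged[merged['Cluster Labels'].isna()]
cleaned = merged.dropna(subset=['Cluster Labels']).astype({'Cluster Labels': int})
print(missing['Neighborhood'].tolist())
print(cleaned['Cluster Labels'].tolist())
```

In this notebook the Paris tables both have 20 rows, so no labels appear to be missing, but the check is cheap to run before plotting.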


Finally, let's visualize the resulting clusters.


In [39]:
latitude=p_latitude
longitude=p_longitude

In [40]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(n_merged['Latitude'], n_merged['Longitude'], n_merged['Neighborhood'], n_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
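
One detail of the marker loop is worth spelling out: the k-means labels start at 0, so `rainbow[cluster-1]` looks up index -1 for the first cluster, which Python reads as the last element of the list. The sketch below, using a hypothetical five-color palette in place of `rainbow`, suggests that every cluster still gets its own color, only shifted by one position.

```python
# Hypothetical palette standing in for the `rainbow` list built in the notebook.
palette = ['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']

# k-means labels are zero-based: 0, 1, 2, 3, 4.
labels = range(len(palette))

# Indexing as in the notebook (label - 1) versus indexing by the label itself.
shifted = {label: palette[label - 1] for label in labels}
direct = {label: palette[label] for label in labels}

for label in labels:
    print(label, shifted[label], direct[label])
```

Indexing with `rainbow[cluster]` would give the same set of colors without relying on negative indexing.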

<a id='item5'></a>


### Examine Clusters in Paris


#### Cluster 1


In [41]:
n_merged.loc[n_merged['Cluster Labels'] == 0, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Palais-Bourbon,Hotel,French Restaurant,Italian Restaurant,Plaza,Café,History Museum,Cocktail Bar,Historic Site,Japanese Restaurant,Gourmet Shop
13,Élysée,French Restaurant,Hotel,Bakery,Spa,Department Store,Cocktail Bar,Resort,Corsican Restaurant,Plaza,Italian Restaurant
15,Batignolles-Monceau,Hotel,French Restaurant,Italian Restaurant,Japanese Restaurant,Bakery,Restaurant,Bistro,Plaza,Café,Korean Restaurant
18,Observatoire,French Restaurant,Hotel,Bistro,Italian Restaurant,Bakery,Brasserie,Fast Food Restaurant,Supermarket,Sushi Restaurant,Tea Room
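
The column selection used in these cells keeps column 1 (`Neighborhood`) plus everything from column 5 onward, which drops `Borough`, `Latitude`, `Longitude` and `Cluster Labels` from the display. Note also that the heading "Cluster 1" corresponds to label 0, "Cluster 2" to label 1, and so on. A minimal sketch on a made-up frame with the same column layout (all values hypothetical):

```python
import pandas as pd

toy = pd.DataFrame({
    'Borough': ['X', 'X'],
    'Neighborhood': ['A', 'B'],
    'Latitude': [0.0, 1.0],
    'Longitude': [0.0, 1.0],
    'Cluster Labels': [0, 1],
    '1st Most Common Venue': ['Bar', 'Café'],
    '2nd Most Common Venue': ['Café', 'Bar'],
})

# Position 1 is 'Neighborhood'; positions 5 and up are the venue columns.
keep = toy.columns[[1] + list(range(5, toy.shape[1]))]

# Rows for the first cluster (label 0), restricted to the kept columns.
cluster0 = toy.loc[toy['Cluster Labels'] == 0, keep]
print(cluster0)
```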


#### Cluster 2


In [42]:
n_merged.loc[n_merged['Cluster Labels'] == 1, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Louvre,French Restaurant,Plaza,Hotel,Japanese Restaurant,Coffee Shop,Art Museum,Italian Restaurant,Wine Bar,Udon Restaurant,Garden
1,Bourse,French Restaurant,Cocktail Bar,Wine Bar,Hotel,Bakery,Italian Restaurant,Bistro,Creperie,Cheese Shop,Coffee Shop
2,Buttes-Chaumont,French Restaurant,Bar,Café,Hotel,Bistro,Beer Bar,Brewery,Seafood Restaurant,Supermarket,Pizza Place
3,Luxembourg,Pastry Shop,Plaza,Bakery,French Restaurant,Fountain,Chocolate Shop,Hotel,Pharmacy,Deli / Bodega,Lebanese Restaurant
5,Opéra,French Restaurant,Hotel,Bakery,Cocktail Bar,Bistro,Japanese Restaurant,Wine Bar,Lounge,Restaurant,Italian Restaurant
6,Buttes-Montmartre,Bar,French Restaurant,Hotel,Café,Coffee Shop,Convenience Store,Restaurant,Supermarket,Beer Store,Gastropub
7,Entrepôt,French Restaurant,Bistro,Hotel,Indian Restaurant,Coffee Shop,Café,Pizza Place,Japanese Restaurant,Korean Restaurant,Bar
8,Popincourt,French Restaurant,Restaurant,Café,Supermarket,Pastry Shop,Italian Restaurant,Bakery,Bar,Wine Bar,Cocktail Bar
9,Vaugirard,Italian Restaurant,Hotel,French Restaurant,Coffee Shop,Thai Restaurant,Indian Restaurant,Japanese Restaurant,Lebanese Restaurant,Park,Brasserie
10,Temple,French Restaurant,Italian Restaurant,Art Gallery,Japanese Restaurant,Coffee Shop,Gourmet Shop,Sandwich Place,Bakery,Cocktail Bar,Wine Bar


#### Cluster 3


In [43]:
n_merged.loc[n_merged['Cluster Labels'] == 2, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Gobelins,Vietnamese Restaurant,Asian Restaurant,Chinese Restaurant,Thai Restaurant,French Restaurant,Bakery,Juice Bar,Italian Restaurant,Cambodian Restaurant,Sandwich Place


#### Cluster 4


In [44]:
n_merged.loc[n_merged['Cluster Labels'] == 3, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
19,Reuilly,Zoo Exhibit,Supermarket,Monument / Landmark,Zoo,Antique Shop,African Restaurant,Fountain,Food & Drink Shop,Food,Flower Shop


#### Cluster 5


In [45]:
n_merged.loc[n_merged['Cluster Labels'] == 4, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Passy,Plaza,Lake,French Restaurant,Pool,Bus Station,Art Museum,Boat or Ferry,Bus Stop,Park,Ethiopian Restaurant


<a id='item33'></a>


## 4. Brussels: Explore, Cluster and Examine Neighborhoods.


In [46]:
neighbor_data=b_n
neighborhoods=b_n

In [47]:

n_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                                   latitudes=neighborhoods['Latitude'],
                                   longitudes=neighborhoods['Longitude']
                                  )


Bruxelles-Ville
Schaerbeek
Etterbeek
Ixelles
Saint Gilles
Anderlecht
Molenbeek-St-Jean
Koekelberg
Berchem-Ste-Agathe
Ganshoren
Jette
Evere
Woluwé-St-Pierre
Auderghem
Watermael-Boitsfort
Uccle
Forest
Woluwé-St-Lambert
St Josse-ten-Noode


In [48]:
print(n_venues.shape)
n_venues.head()

(597, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Bruxelles-Ville,50.850346,4.351721,Bia Mara,50.848682,4.350707,Fish & Chips Shop
1,Bruxelles-Ville,50.850346,4.351721,HEMA,50.850507,4.35346,Department Store
2,Bruxelles-Ville,50.850346,4.351721,Peck 47,50.84865,4.351121,Sandwich Place
3,Bruxelles-Ville,50.850346,4.351721,Corica Grand Place,50.8485,4.35109,Café
4,Bruxelles-Ville,50.850346,4.351721,Le Vismet,50.851005,4.350143,Seafood Restaurant


Let's check how many venues were returned for each neighborhood.


In [49]:
n_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Anderlecht,29,29,29,29,29,29
Auderghem,46,46,46,46,46,46
Berchem-Ste-Agathe,15,15,15,15,15,15
Bruxelles-Ville,100,100,100,100,100,100
Etterbeek,39,39,39,39,39,39
Evere,14,14,14,14,14,14
Forest,16,16,16,16,16,16
Ganshoren,21,21,21,21,21,21
Ixelles,38,38,38,38,38,38
Jette,26,26,26,26,26,26


#### Let's find out how many unique categories can be curated from all the returned venues


In [50]:
print('There are {} unique categories.'.format(len(n_venues['Venue Category'].unique())))

There are 166 unique categories.


<a id='item3'></a>


### Analyze Each Neighborhood in Brussels


In [51]:
# one hot encoding
n_onehot = pd.get_dummies(n_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
n_onehot['Neighborhood'] = n_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [n_onehot.columns[-1]] + list(n_onehot.columns[:-1])
n_onehot = n_onehot[fixed_columns]

n_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Bookstore,Botanical Garden,Boutique,Brasserie,Brazilian Restaurant,Breakfast Spot,Burger Joint,Bus Line,Bus Station,Bus Stop,Butcher,Cafeteria,Café,Carpet Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food & Drink Shop,Food Service,French Restaurant,Fried Chicken Joint,Friterie,Gaming Cafe,Garden,Gastropub,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Health Food Store,Herbs & Spices Store,Historic Site,History Museum,Hockey Field,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Theater,Indonesian Restaurant,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Lounge,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Middle Eastern Restaurant,Mini Golf,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Nightclub,Notary,Office,Organic Grocery,Other Repair Shop,Paella Restaurant,Park,Performing Arts Venue,Perfume Shop,Pharmacy,Piano Bar,Pizza Place,Platform,Plaza,Pool,Pool Hall,Portuguese Restaurant,Record Shop,Restaurant,Roller Rink,Rooftop Bar,Salad Place,Sandwich Place,Seafood Restaurant,Shoe Repair,Shoe Store,Shopping Mall,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Swiss Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Volleyball Court,Wine Bar,Women's Store,Yoga Studio
0,Bruxelles-Ville,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Bruxelles-Ville,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Bruxelles-Ville,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Bruxelles-Ville,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Bruxelles-Ville,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.


In [52]:
n_onehot.shape

(597, 167)
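
The shape reads as 597 venues by 167 columns: one column per unique category (166) plus the `Neighborhood` column. The sketch below reproduces the encoding step on three made-up venues. The data is hypothetical, and `dtype=int` is an addition here, since recent pandas versions return booleans from `get_dummies` by default rather than the 0/1 integers shown above.

```python
import pandas as pd

# Hypothetical venues: two in neighborhood A, one in neighborhood B.
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'B'],
    'Venue Category': ['Bar', 'Café', 'Bar'],
})

# One indicator column per category, named after the category itself.
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="", dtype=int)

# Put the neighborhood name back as the first column.
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

print(toy_onehot)
print(toy_onehot.shape)
```

Using `insert(0, ...)` places the column first in one step, as an alternative to appending it and reordering the columns afterwards.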

#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category


In [53]:
n_grouped = n_onehot.groupby('Neighborhood').mean().reset_index()
n_grouped

Unnamed: 0,Neighborhood,Accessories Store,African Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Bed & Breakfast,Beer Bar,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Bookstore,Botanical Garden,Boutique,Brasserie,Brazilian Restaurant,Breakfast Spot,Burger Joint,Bus Line,Bus Station,Bus Stop,Butcher,Cafeteria,Café,Carpet Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,City Hall,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Electronics Store,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food & Drink Shop,Food Service,French Restaurant,Fried Chicken Joint,Friterie,Gaming Cafe,Garden,Gastropub,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Health Food Store,Herbs & Spices Store,Historic Site,History Museum,Hockey Field,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Theater,Indonesian Restaurant,Indoor Play Area,Italian Restaurant,Japanese Restaurant,Jazz Club,Juice Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Lounge,Massage Studio,Mediterranean Restaurant,Men's Store,Metro Station,Middle Eastern Restaurant,Mini Golf,Moroccan Restaurant,Movie Theater,Museum,Music Venue,Nightclub,Notary,Office,Organic Grocery,Other Repair Shop,Paella Restaurant,Park,Performing Arts Venue,Perfume Shop,Pharmacy,Piano Bar,Pizza Place,Platform,Plaza,Pool,Pool Hall,Portuguese Restaurant,Record Shop,Restaurant,Roller Rink,Rooftop Bar,Salad Place,Sandwich Place,Seafood Restaurant,Shoe Repair,Shoe Store,Shopping Mall,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Club,Stadium,Steakhouse,Supermarket,Sushi Restaurant,Swiss Restaurant,Tapas Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Restaurant,Thrift / Vintage Store,Toy / Game Store,Train Station,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Volleyball Court,Wine Bar,Women's Store,Yoga Studio
0,Anderlecht,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.103448,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.068966,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Auderghem,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.043478,0.043478,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.065217,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.021739,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.043478,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.021739,0.0,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.043478,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Berchem-Ste-Agathe,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bruxelles-Ville,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.03,0.05,0.0,0.02,0.01,0.02,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.07,0.01,0.02,0.0,0.01,0.01,0.01,0.02,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.04,0.01,0.01,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.02,0.03,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01
4,Etterbeek,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.025641,0.0,0.051282,0.0,0.076923,0.025641,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.051282,0.0,0.0,0.025641,0.0,0.0,0.025641,0.025641,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Evere,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.071429,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Forest,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0625,0.0625,0.0,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Ganshoren,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.095238,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.047619,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.095238,0.0,0.095238,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Ixelles,0.0,0.026316,0.052632,0.0,0.0,0.0,0.0,0.0,0.052632,0.078947,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.026316,0.0,0.0,0.026316,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.052632,0.026316,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.078947,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.026316,0.026316,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.052632,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.026316,0.0,0.0,0.0,0.0,0.052632,0.0,0.0
9,Jette,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.076923,0.192308,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.115385,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.076923,0.038462,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


#### Let's confirm the new size


In [54]:
n_grouped.shape

(19, 167)
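
Because every venue has exactly one category, averaging the one-hot rows of a neighborhood gives the share of its venues in each category, so each row of the grouped table should sum to 1. A minimal sketch of that sanity check on made-up data (the toy frame is hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical one-hot table: neighborhood A has three venues, B has one.
toy_onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'B'],
    'Bar': [1, 0, 1, 0],
    'Café': [0, 1, 0, 1],
})

# Mean of the indicators per neighborhood = relative frequency of each category.
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()

# Each neighborhood's frequencies should add up to 1.
row_sums = toy_grouped.drop(columns='Neighborhood').sum(axis=1)
print(toy_grouped)
print(np.allclose(row_sums, 1.0))
```

The same check could be run on `n_grouped` itself by dropping its `Neighborhood` column and summing across the remaining columns.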

#### Let's print each neighborhood along with the top 5 most common venues


In [55]:
num_top_venues = 5

for hood in n_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = n_grouped[n_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Anderlecht----
               venue  freq
0                Bar  0.10
1             Bakery  0.07
2  Convenience Store  0.07
3         Restaurant  0.07
4              Plaza  0.07


----Auderghem----
                  venue  freq
0           Snack Place  0.07
1  Fast Food Restaurant  0.07
2      Sushi Restaurant  0.04
3                Bakery  0.04
4     French Restaurant  0.04


----Berchem-Ste-Agathe----
              venue  freq
0       Supermarket  0.13
1  Greek Restaurant  0.13
2    Sandwich Place  0.07
3      Burger Joint  0.07
4             Plaza  0.07


----Bruxelles-Ville----
            venue  freq
0  Chocolate Shop  0.07
1           Plaza  0.06
2             Bar  0.05
3       Bookstore  0.05
4           Hotel  0.04


----Etterbeek----
            venue  freq
0  Sandwich Place  0.08
1           Plaza  0.08
2             Bar  0.08
3  Cosmetics Shop  0.05
4     Supermarket  0.05


----Evere----
         venue  freq
0       Bakery  0.21
1    Brasserie  0.14
2  Supermarket  0.07


#### Let's put that into a _pandas_ dataframe


We reuse the `return_most_common_venues` function defined in the Paris section to sort the venues in descending order.


Now let's create the new dataframe and display the top 10 venues for each neighborhood.


In [56]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = n_grouped['Neighborhood']

for ind in np.arange(n_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(n_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Anderlecht,Bar,Convenience Store,Plaza,Restaurant,Greek Restaurant,Metro Station,Bakery,Supermarket,History Museum,Italian Restaurant
1,Auderghem,Snack Place,Fast Food Restaurant,Bakery,Sushi Restaurant,Thai Restaurant,Pizza Place,Bar,French Restaurant,Middle Eastern Restaurant,Gym / Fitness Center
2,Berchem-Ste-Agathe,Supermarket,Greek Restaurant,Burger Joint,Café,French Restaurant,Restaurant,Notary,Bakery,Sandwich Place,Gym
3,Bruxelles-Ville,Chocolate Shop,Plaza,Bar,Bookstore,Hotel,Bakery,Italian Restaurant,Seafood Restaurant,Clothing Store,Sandwich Place
4,Etterbeek,Bar,Sandwich Place,Plaza,Cosmetics Shop,Supermarket,Pizza Place,Snack Place,Diner,Department Store,Kebab Restaurant


<a id='item4'></a>


### Cluster Neighborhoods in Brussels


Run _k_-means to cluster the neighborhoods into 5 clusters.


In [57]:
# set number of clusters
kclusters = 5

n_grouped_clustering = n_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(n_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0])
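
Nine of the first ten Brussels labels are 0, which may indicate that `kclusters = 5` mostly isolates a few outlying neighborhoods. One common heuristic for checking the choice of _k_ is to compare the k-means inertia (within-cluster sum of squares) across several values and look for the point where the improvement levels off. The sketch below runs on synthetic two-dimensional data, not on `n_grouped_clustering`; the data, the range of _k_ and the `n_init=10` setting are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic data: three tight blobs of ten points each (hypothetical, not venue data).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.standard_normal((10, 2)) for c in centers])

# Fit k-means for k = 1..5 and record the inertia of each fit.
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, random_state=0, n_init=10).fit(X)
    inertias[k] = model.inertia_
    print(k, round(model.inertia_, 2))
```

On this toy data the inertia drops sharply up to _k_ = 3 and changes little afterwards; applying the same loop to the notebook's data would show whether five clusters is a reasonable choice there.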

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.


In [58]:
n_merged = b_n

In [59]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join the neighborhood coordinates with the cluster labels and top venues
n_merged = n_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

n_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Brussels,Bruxelles-Ville,50.850346,4.351721,0,Chocolate Shop,Plaza,Bar,Bookstore,Hotel,Bakery,Italian Restaurant,Seafood Restaurant,Clothing Store,Sandwich Place
1,Brussels,Schaerbeek,50.867416,4.377298,0,Tram Station,Supermarket,Plaza,Hookah Bar,Gastropub,Italian Restaurant,Coffee Shop,Middle Eastern Restaurant,Bus Station,Soccer Field
2,Brussels,Etterbeek,50.832578,4.388994,0,Bar,Sandwich Place,Plaza,Cosmetics Shop,Supermarket,Pizza Place,Snack Place,Diner,Department Store,Kebab Restaurant
3,Brussels,Ixelles,50.833343,4.366629,0,Bar,Italian Restaurant,Clothing Store,Wine Bar,Art Gallery,Tea Room,Coffee Shop,Bakery,French Restaurant,Plaza
4,Brussels,Saint Gilles,50.830144,4.340218,0,Bar,Greek Restaurant,Moroccan Restaurant,Bakery,Performing Arts Venue,Pizza Place,Plaza,Brasserie,Friterie,Gym / Fitness Center


Finally, let's visualize the resulting clusters.


In [60]:
latitude=b_latitude
longitude=b_longitude

In [61]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(n_merged['Latitude'], n_merged['Longitude'], n_merged['Neighborhood'], n_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
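
One detail of the loop above is worth spelling out: the labels run from 0 to `kclusters - 1`, so `rainbow[cluster-1]` reads index `-1` for cluster 0, which Python resolves to the last color in the list. Every cluster still receives its own color; the palette is simply rotated by one. A minimal sketch with a placeholder palette (the `color_i` strings are hypothetical stand-ins for the hex codes):

```python
# Placeholder palette standing in for the hex codes built from cm.rainbow
kclusters = 5
palette = ['color_{}'.format(i) for i in range(kclusters)]

# Same indexing as in the map cell: label - 1, so label 0 wraps around to index -1
assigned = {label: palette[label - 1] for label in range(kclusters)}

for label, color in assigned.items():
    print(label, color)
```

Indexing with `rainbow[cluster]` instead would pair label 0 with the first color, at the cost of changing the colors on the rendered map.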

<a id='item5'></a>


### Examine Clusters in Brussels


#### Cluster 1


In [62]:
n_merged.loc[n_merged['Cluster Labels'] == 0, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bruxelles-Ville,Chocolate Shop,Plaza,Bar,Bookstore,Hotel,Bakery,Italian Restaurant,Seafood Restaurant,Clothing Store,Sandwich Place
1,Schaerbeek,Tram Station,Supermarket,Plaza,Hookah Bar,Gastropub,Italian Restaurant,Coffee Shop,Middle Eastern Restaurant,Bus Station,Soccer Field
2,Etterbeek,Bar,Sandwich Place,Plaza,Cosmetics Shop,Supermarket,Pizza Place,Snack Place,Diner,Department Store,Kebab Restaurant
3,Ixelles,Bar,Italian Restaurant,Clothing Store,Wine Bar,Art Gallery,Tea Room,Coffee Shop,Bakery,French Restaurant,Plaza
4,Saint Gilles,Bar,Greek Restaurant,Moroccan Restaurant,Bakery,Performing Arts Venue,Pizza Place,Plaza,Brasserie,Friterie,Gym / Fitness Center
5,Anderlecht,Bar,Convenience Store,Plaza,Restaurant,Greek Restaurant,Metro Station,Bakery,Supermarket,History Museum,Italian Restaurant
7,Koekelberg,Gym,History Museum,Bar,Piano Bar,Convenience Store,Sandwich Place,Falafel Restaurant,Soccer Field,French Restaurant,Supermarket
8,Berchem-Ste-Agathe,Supermarket,Greek Restaurant,Burger Joint,Café,French Restaurant,Restaurant,Notary,Bakery,Sandwich Place,Gym
9,Ganshoren,Plaza,Bakery,Friterie,Bar,Bus Station,Pizza Place,Indian Restaurant,Thai Restaurant,Bus Stop,Boutique
10,Jette,Bar,Bus Station,Platform,Bakery,Park,Brasserie,Bed & Breakfast,Snack Place,Restaurant,Plaza


#### Cluster 2


In [63]:
n_merged.loc[n_merged['Cluster Labels'] == 1, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Evere,Bakery,Brasserie,Theater,Pizza Place,History Museum,Restaurant,Sandwich Place,Snack Place,Spa,Supermarket


#### Cluster 3


In [64]:
n_merged.loc[n_merged['Cluster Labels'] == 2, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
14,Watermael-Boitsfort,Stadium,Bus Stop,Athletics & Sports,Bike Shop,Cafeteria,Roller Rink,Electronics Store,Fast Food Restaurant,Farmers Market,Falafel Restaurant


#### Cluster 4


In [65]:
n_merged.loc[n_merged['Cluster Labels'] == 3, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Woluwé-St-Pierre,Tram Station,Hockey Field,Tennis Court,Park,Museum,Mini Golf,Lounge,Brasserie,Lake,French Restaurant


#### Cluster 5


In [66]:
n_merged.loc[n_merged['Cluster Labels'] == 4, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Molenbeek-St-Jean,Nightclub,Coffee Shop,Tram Station,Hostel,Italian Restaurant,Snack Place,Burger Joint,Supermarket,History Museum,Bagel Shop
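
The five cells above differ only in the label being filtered. As a hedged alternative, the same views could be produced in one loop. The sketch below runs on a tiny stand-in for `n_merged` (rows taken from the tables above, trimmed to two venue columns) and selects columns by name rather than by position; `cluster_view` and `toy_merged` are hypothetical names, not part of the notebook.

```python
import pandas as pd

# Small stand-in for n_merged, built from rows shown in the cluster tables above
toy_merged = pd.DataFrame({
    'Neighborhood': ['Bruxelles-Ville', 'Ixelles', 'Evere', 'Watermael-Boitsfort'],
    'Cluster Labels': [0, 0, 1, 2],
    '1st Most Common Venue': ['Chocolate Shop', 'Bar', 'Bakery', 'Stadium'],
    '2nd Most Common Venue': ['Plaza', 'Italian Restaurant', 'Brasserie', 'Bus Stop'],
})


def cluster_view(df, label):
    """Return the neighborhood name and venue columns for one cluster label."""
    venue_cols = [c for c in df.columns if c.endswith('Most Common Venue')]
    return df.loc[df['Cluster Labels'] == label, ['Neighborhood'] + venue_cols]


views = {label: cluster_view(toy_merged, label)
         for label in sorted(toy_merged['Cluster Labels'].unique())}

for label, view in views.items():
    # Headings in the notebook are 1-based: "Cluster 1" corresponds to label 0
    print('Cluster {}'.format(label + 1))
    print(view, end='\n\n')
```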


<a id='item44'></a>



## 5. Rome: Explore, Cluster and Examine Neighborhoods.


In [67]:
neighbor_data=r_n
neighborhoods=r_n

In [68]:

n_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                           latitudes=neighborhoods['Latitude'],
                           longitudes=neighborhoods['Longitude'])


Municipio I – Historical Center
Municipio II – Parioli/Nomentano
Municipio III – Monte Sacro
Municipio IV – Tiburtina
Municipio V – Prenestino/Centocelle
Municipio VI – Roma Delle Torri
Municipio VII – Appio-Latino/Tuscolano/Cinecittà
Municipio VIII – Appia Antica
Municipio IX – EUR
Municipio X – Ostia/Acilia
Municipio XI – Arvalia/Portuense
Municipio XII – Monte Verde
Municipio XIII – Aurelia
Municipio XIV – Monte Mario
Municipio XV – Cassia/Flaminia


In [69]:
print(n_venues.shape)
n_venues.head()

(251, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Municipio I – Historical Center,41.90286,12.485487,La Prosciutteria,41.901888,12.484467,Italian Restaurant
1,Municipio I – Historical Center,41.90286,12.485487,La Sandwicheria,41.902901,12.483336,Sandwich Place
2,Municipio I – Historical Center,41.90286,12.485487,Sant'Andrea,41.903197,12.483252,Italian Restaurant
3,Municipio I – Historical Center,41.90286,12.485487,Fendi,41.90271,12.484492,Boutique
4,Municipio I – Historical Center,41.90286,12.485487,Gelateria Valentino,41.901449,12.484981,Ice Cream Shop


Let's check how many venues were returned for each neighborhood.


In [70]:
n_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Municipio I – Historical Center,87,87,87,87,87,87
Municipio II – Parioli/Nomentano,32,32,32,32,32,32
Municipio III – Monte Sacro,7,7,7,7,7,7
Municipio IV – Tiburtina,4,4,4,4,4,4
Municipio V – Prenestino/Centocelle,10,10,10,10,10,10
Municipio VI – Roma Delle Torri,15,15,15,15,15,15
Municipio VII – Appio-Latino/Tuscolano/Cinecittà,10,10,10,10,10,10
Municipio VIII – Appia Antica,12,12,12,12,12,12
Municipio X – Ostia/Acilia,6,6,6,6,6,6
Municipio XI – Arvalia/Portuense,1,1,1,1,1,1
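
A neighborhood for which the API returns no venues never appears in `n_venues`, so it is silently absent from this count (and later from the clustering). A minimal sketch of how such neighborhoods could be listed, using toy stand-ins for the two frames; the neighborhood and venue names are placeholders, not real data:

```python
import pandas as pd

# Toy stand-ins: three neighborhoods were queried, but only two returned venues
toy_neighborhoods = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']})
toy_venues = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'C'],
    'Venue': ['venue 1', 'venue 2', 'venue 3'],
})

# size() counts rows per group and returns one number per neighborhood
venue_counts = toy_venues.groupby('Neighborhood').size()

# Neighborhoods that were queried but are missing from the results
missing = toy_neighborhoods.loc[
    ~toy_neighborhoods['Neighborhood'].isin(toy_venues['Neighborhood']),
    'Neighborhood',
].tolist()

print(venue_counts)
print('No venues returned for:', missing)
```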


#### Let's find out how many unique categories can be curated from all the returned venues


In [71]:
print('There are {} unique categories.'.format(len(n_venues['Venue Category'].unique())))

There are 80 unique categories.


<a id='item3'></a>


### Analyze Each Neighborhood in Rome


In [72]:
# one hot encoding
n_onehot = pd.get_dummies(n_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
n_onehot['Neighborhood'] = n_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [n_onehot.columns[-1]] + list(n_onehot.columns[:-1])
n_onehot = n_onehot[fixed_columns]

n_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Art Museum,Asian Restaurant,Bakery,Basketball Court,Basketball Stadium,Bed & Breakfast,Betting Shop,Boarding House,Bookstore,Boutique,Breakfast Spot,Brewery,Bus Line,Café,Chinese Restaurant,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop,Dessert Shop,Diner,Dog Run,Fast Food Restaurant,Flower Shop,Fountain,Fried Chicken Joint,Garden Center,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Historic Site,Home Service,Hotel,Hotel Pool,Ice Cream Shop,Italian Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lingerie Store,Mediterranean Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Museum,Nightclub,Noodle House,Office,Park,Pharmacy,Pizza Place,Plaza,Pool,Pub,Restaurant,Road,Roman Restaurant,Sandwich Place,Seafood Restaurant,Shopping Mall,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Tennis Court,Toy / Game Store,Trattoria/Osteria,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,Municipio I – Historical Center,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Municipio I – Historical Center,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Municipio I – Historical Center,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Municipio I – Historical Center,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Municipio I – Historical Center,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.


In [73]:
n_onehot.shape

(251, 81)
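
The column count is the 80 categories plus the `Neighborhood` column. To see the encoding on a small scale, the sketch below applies the same idea to the five venues shown in `n_venues.head()` above; `toy_venues` and `toy_onehot` are names introduced here for illustration.

```python
import pandas as pd

# Categories of the first five Rome venues, as listed in the output above
toy_venues = pd.DataFrame({
    'Neighborhood': ['Municipio I – Historical Center'] * 5,
    'Venue Category': ['Italian Restaurant', 'Sandwich Place', 'Italian Restaurant',
                       'Boutique', 'Ice Cream Shop'],
})

# One indicator column per distinct category, one row per venue
toy_onehot = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")

# Put the neighborhood name in front of the indicator columns
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

print(toy_onehot.shape)  # 5 venues; 4 distinct categories + 1 name column
print(list(toy_onehot.columns))
```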

#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category


In [74]:
n_grouped = n_onehot.groupby('Neighborhood').mean().reset_index()
n_grouped

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Art Museum,Asian Restaurant,Bakery,Basketball Court,Basketball Stadium,Bed & Breakfast,Betting Shop,Boarding House,Bookstore,Boutique,Breakfast Spot,Brewery,Bus Line,Café,Chinese Restaurant,Circus,Clothing Store,Cocktail Bar,Coffee Shop,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop,Dessert Shop,Diner,Dog Run,Fast Food Restaurant,Flower Shop,Fountain,Fried Chicken Joint,Garden Center,Gift Shop,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Historic Site,Home Service,Hotel,Hotel Pool,Ice Cream Shop,Italian Restaurant,Jazz Club,Jewelry Store,Juice Bar,Lingerie Store,Mediterranean Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Monument / Landmark,Museum,Nightclub,Noodle House,Office,Park,Pharmacy,Pizza Place,Plaza,Pool,Pub,Restaurant,Road,Roman Restaurant,Sandwich Place,Seafood Restaurant,Shopping Mall,Soccer Stadium,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Tennis Court,Toy / Game Store,Trattoria/Osteria,Vegetarian / Vegan Restaurant,Wine Bar,Wine Shop,Women's Store
0,Municipio I – Historical Center,0.011494,0.0,0.011494,0.0,0.0,0.0,0.0,0.011494,0.0,0.011494,0.011494,0.068966,0.0,0.011494,0.0,0.011494,0.011494,0.0,0.022989,0.011494,0.0,0.0,0.0,0.011494,0.0,0.022989,0.0,0.0,0.011494,0.0,0.034483,0.0,0.0,0.011494,0.0,0.0,0.0,0.0,0.011494,0.0,0.137931,0.0,0.068966,0.137931,0.011494,0.034483,0.011494,0.011494,0.011494,0.0,0.0,0.022989,0.011494,0.0,0.0,0.0,0.0,0.0,0.022989,0.103448,0.0,0.0,0.0,0.011494,0.0,0.034483,0.0,0.011494,0.0,0.0,0.0,0.0,0.011494,0.0,0.011494,0.011494,0.0,0.0,0.0,0.022989
1,Municipio II – Parioli/Nomentano,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.03125,0.0,0.0,0.0,0.03125,0.0,0.03125,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.09375,0.0,0.03125,0.1875,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.03125,0.0,0.0,0.0,0.0,0.03125,0.0625,0.0,0.03125,0.0625,0.0,0.03125,0.03125,0.125,0.0,0.0,0.03125,0.03125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Municipio III – Monte Sacro,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0
3,Municipio IV – Tiburtina,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Municipio V – Prenestino/Centocelle,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Municipio VI – Roma Delle Torri,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.066667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0
6,Municipio VII – Appio-Latino/Tuscolano/Cinecittà,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.2,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Municipio VIII – Appia Antica,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0
8,Municipio X – Ostia/Acilia,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Municipio XI – Arvalia/Portuense,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


#### Let's confirm the new size


In [75]:
n_grouped.shape

(14, 81)
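
Because each venue row holds a single 1, the mean of a column within a neighborhood is the share of that neighborhood's venues falling in that category. Municipio IV – Tiburtina, for instance, returned four venues in four different categories, so each of them scores 0.25 in the table above. A small sketch reproducing that row (only the category labels are taken from the notebook output; the frame names are introduced here):

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighborhood': ['Municipio IV – Tiburtina'] * 4,
    'Venue Category': ['Concert Hall', 'Diner', 'Gym Pool', 'Pizza Place'],
})

# Indicator columns, cast to float so the group mean is a plain frequency
toy_onehot = pd.get_dummies(toy_venues['Venue Category']).astype(float)
toy_onehot.insert(0, 'Neighborhood', toy_venues['Neighborhood'])

# One row per neighborhood, one frequency per category
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)
```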

#### Let's print each neighborhood along with the top 5 most common venues


In [76]:
num_top_venues = 5

for hood in n_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = n_grouped[n_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Municipio I – Historical Center----
                venue  freq
0               Hotel  0.14
1  Italian Restaurant  0.14
2               Plaza  0.10
3      Ice Cream Shop  0.07
4            Boutique  0.07


----Municipio II – Parioli/Nomentano----
                venue  freq
0  Italian Restaurant  0.19
1  Seafood Restaurant  0.12
2               Hotel  0.09
3               Plaza  0.06
4          Restaurant  0.06


----Municipio III – Monte Sacro----
                venue  freq
0               Plaza  0.29
1                Café  0.29
2       Jewelry Store  0.14
3   Trattoria/Osteria  0.14
4  Basketball Stadium  0.14


----Municipio IV – Tiburtina----
          venue  freq
0  Concert Hall  0.25
1   Pizza Place  0.25
2      Gym Pool  0.25
3         Diner  0.25
4        Museum  0.00


----Municipio V – Prenestino/Centocelle----
            venue  freq
0     Pizza Place   0.2
1            Café   0.2
2             Gym   0.1
3  Sandwich Place   0.1
4    Noodle House   0.1


----Municipio VI

#### Let's put that into a _pandas_ dataframe


First, let's write a function to sort the venues in descending order.
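
The helper `return_most_common_venues` used in the next cell is not repeated in this section. Judging from how it is called, it takes one row of `n_grouped` and returns the names of its highest-frequency categories. A minimal sketch under that assumption (the function name and body below are a guess at the behavior, not a copy of the notebook's definition):

```python
import pandas as pd


def most_common_venues_sketch(row, num_top_venues):
    """Assumed behavior: names of the num_top_venues largest frequencies in a row."""
    # Skip the first entry, which holds the neighborhood name
    frequencies = row.iloc[1:].astype(float)
    ranked = frequencies.sort_values(ascending=False)
    return ranked.index.values[:num_top_venues]


# Toy row with made-up frequencies
toy_row = pd.Series({'Neighborhood': 'Example', 'Bar': 0.2, 'Café': 0.5, 'Plaza': 0.3})
print(most_common_venues_sketch(toy_row, 2))
```

With a sort like this, categories that share the same frequency can come out in any order, which is probably why the four 0.25 entries for Municipio IV – Tiburtina are ordered differently in the top-5 printout and in the top-10 table.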


Now let's create the new dataframe and display the top 10 venues for each neighborhood.


In [77]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = n_grouped['Neighborhood']

for ind in np.arange(n_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(n_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Municipio I – Historical Center,Italian Restaurant,Hotel,Plaza,Ice Cream Shop,Boutique,Sandwich Place,Jewelry Store,Fountain,Dessert Shop,Pizza Place
1,Municipio II – Parioli/Nomentano,Italian Restaurant,Seafood Restaurant,Hotel,Plaza,Restaurant,Fountain,Nightclub,Coffee Shop,College Cafeteria,Juice Bar
2,Municipio III – Monte Sacro,Plaza,Café,Trattoria/Osteria,Basketball Stadium,Jewelry Store,Dog Run,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop
3,Municipio IV – Tiburtina,Gym Pool,Diner,Concert Hall,Pizza Place,Grocery Store,Gift Shop,Garden Center,Fried Chicken Joint,Fountain,Flower Shop
4,Municipio V – Prenestino/Centocelle,Pizza Place,Café,Supermarket,Italian Restaurant,Noodle House,Sandwich Place,Gym,Basketball Court,Basketball Stadium,Fast Food Restaurant


<a id='item4'></a>


### Cluster Neighborhoods in Rome


Run _k_-means to cluster the neighborhoods into 5 clusters.


In [78]:
# set number of clusters
kclusters = 5

n_grouped_clustering = n_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(n_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 4, 3, 0, 0, 0, 0, 0, 1])
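
The printed labels already hint at how uneven the clusters are. As a quick check, the ten labels shown above can be tallied with `np.bincount`; the array below is copied from that output, so it covers only the first ten of the fourteen clustered neighborhoods.

```python
import numpy as np

# First ten labels, copied from the output above
first_labels = np.array([0, 0, 4, 3, 0, 0, 0, 0, 0, 1])

# Position i of the result is the number of neighborhoods assigned to cluster i
counts = np.bincount(first_labels, minlength=5)
print(counts)  # [7 1 0 1 1]
```

In the notebook itself, `np.bincount(kmeans.labels_, minlength=kclusters)` would give the tally for all neighborhoods.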

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.


In [79]:
n_merged = r_n

In [80]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join the top-venue table (with its cluster labels) onto the neighborhood coordinates
n_merged = n_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

n_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Rome,Municipio I – Historical Center,41.90286,12.485487,0.0,Italian Restaurant,Hotel,Plaza,Ice Cream Shop,Boutique,Sandwich Place,Jewelry Store,Fountain,Dessert Shop,Pizza Place
1,Rome,Municipio II – Parioli/Nomentano,41.922397,12.498321,0.0,Italian Restaurant,Seafood Restaurant,Hotel,Plaza,Restaurant,Fountain,Nightclub,Coffee Shop,College Cafeteria,Juice Bar
2,Rome,Municipio III – Monte Sacro,41.942542,12.540979,4.0,Plaza,Café,Trattoria/Osteria,Basketball Stadium,Jewelry Store,Dog Run,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop
3,Rome,Municipio IV – Tiburtina,41.92163,12.553682,3.0,Gym Pool,Diner,Concert Hall,Pizza Place,Grocery Store,Gift Shop,Garden Center,Fried Chicken Joint,Fountain,Flower Shop
4,Rome,Municipio V – Prenestino/Centocelle,41.891288,12.551022,0.0,Pizza Place,Café,Supermarket,Italian Restaurant,Noodle House,Sandwich Place,Gym,Basketball Court,Basketball Stadium,Fast Food Restaurant


Finally, let's visualize the resulting clusters.


In [81]:
latitude=r_latitude
longitude=r_longitude

In [82]:
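# drop neighborhoods that received no cluster label (no venues were returned for them),
# then restore integer labels, which the NaN values had turned into floats
prpr2 = n_merged.dropna(subset=['Cluster Labels']).copy()
prpr2['Cluster Labels'] = prpr2['Cluster Labels'].astype(int)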
n_merged=prpr2
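
The cast back to `int` is needed because `join` keeps every row of the left frame: a neighborhood with no venues has no match in `neighborhoods_venues_sorted`, receives `NaN` as its label, and that turns the whole column into floats (hence the `0.0` and `4.0` values in the table above). A toy sketch of the effect and the cleanup, with placeholder names:

```python
import pandas as pd

# 'B' stands for a neighborhood that returned no venues and so has no label
toy_left = pd.DataFrame({'Neighborhood': ['A', 'B', 'C']})
toy_labels = pd.DataFrame({'Neighborhood': ['A', 'C'], 'Cluster Labels': [0, 4]})

# Left join on the neighborhood name: the unmatched row gets NaN
toy_joined = toy_left.join(toy_labels.set_index('Neighborhood'), on='Neighborhood')
print(toy_joined['Cluster Labels'].tolist())

# Keep only labeled rows, then restore integer labels
toy_clean = toy_joined.dropna(subset=['Cluster Labels']).copy()
toy_clean['Cluster Labels'] = toy_clean['Cluster Labels'].astype(int)
print(toy_clean)
```

Integer labels matter for the map cell, where the label is used as a list index.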

In [83]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(n_merged['Latitude'], n_merged['Longitude'], n_merged['Neighborhood'], n_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

<a id='item5'></a>


### Examine Clusters in Rome


#### Cluster 1


In [84]:
n_merged.loc[n_merged['Cluster Labels'] == 0, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Municipio I – Historical Center,Italian Restaurant,Hotel,Plaza,Ice Cream Shop,Boutique,Sandwich Place,Jewelry Store,Fountain,Dessert Shop,Pizza Place
1,Municipio II – Parioli/Nomentano,Italian Restaurant,Seafood Restaurant,Hotel,Plaza,Restaurant,Fountain,Nightclub,Coffee Shop,College Cafeteria,Juice Bar
4,Municipio V – Prenestino/Centocelle,Pizza Place,Café,Supermarket,Italian Restaurant,Noodle House,Sandwich Place,Gym,Basketball Court,Basketball Stadium,Fast Food Restaurant
5,Municipio VI – Roma Delle Torri,Supermarket,Pizza Place,Brewery,Wine Shop,Fried Chicken Joint,Italian Restaurant,Fast Food Restaurant,Middle Eastern Restaurant,Office,Plaza
6,Municipio VII – Appio-Latino/Tuscolano/Cinecittà,Pizza Place,Ice Cream Shop,Pub,Miscellaneous Shop,Italian Restaurant,Gym / Fitness Center,Garden Center,Fried Chicken Joint,Fountain,Flower Shop
7,Municipio VIII – Appia Antica,Pizza Place,Plaza,Diner,Vegetarian / Vegan Restaurant,Bakery,Café,Ice Cream Shop,Supermarket,Italian Restaurant,Soccer Stadium
9,Municipio X – Ostia/Acilia,Home Service,Bed & Breakfast,Garden Center,Pool,Supermarket,Circus,Basketball Court,Flower Shop,Cosmetics Shop,Cupcake Shop
12,Municipio XIII – Aurelia,Café,Hotel,Pizza Place,Breakfast Spot,Italian Restaurant,Plaza,Pub,Restaurant,Park,Diner
13,Municipio XIV – Monte Mario,Clothing Store,Supermarket,Pizza Place,Shopping Mall,Betting Shop,Gym / Fitness Center,Fried Chicken Joint,Dessert Shop,Gym,Coffee Shop
14,Municipio XV – Cassia/Flaminia,Italian Restaurant,Restaurant,Cocktail Bar,Ice Cream Shop,Wine Bar,Café,Pizza Place,Steakhouse,Hotel,Clothing Store


#### Cluster 2


In [85]:
n_merged.loc[n_merged['Cluster Labels'] == 1, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Municipio XI – Arvalia/Portuense,Italian Restaurant,Women's Store,Fast Food Restaurant,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop,Dessert Shop,Diner,Dog Run


#### Cluster 3


In [86]:
n_merged.loc[n_merged['Cluster Labels'] == 2, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
11,Municipio XII – Monte Verde,Pool,Cocktail Bar,Coffee Shop,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop,Dessert Shop,Diner,Dog Run


#### Cluster 4


In [87]:
n_merged.loc[n_merged['Cluster Labels'] == 3, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Municipio IV – Tiburtina,Gym Pool,Diner,Concert Hall,Pizza Place,Grocery Store,Gift Shop,Garden Center,Fried Chicken Joint,Fountain,Flower Shop


#### Cluster 5


In [88]:
n_merged.loc[n_merged['Cluster Labels'] == 4, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Municipio III – Monte Sacro,Plaza,Café,Trattoria/Osteria,Basketball Stadium,Jewelry Store,Dog Run,College Cafeteria,Concert Hall,Cosmetics Shop,Cupcake Shop


<a id='item55'></a>


## 6. Madrid: Explore, Cluster and Examine Neighborhoods.


In [89]:
neighbor_data=m_n
neighborhoods=m_n

In [90]:

n_venues = getNearbyVenues(names=neighborhoods['Neighborhood'],
                           latitudes=neighborhoods['Latitude'],
                           longitudes=neighborhoods['Longitude'])


Centro
Arganzuela
Retiro
Salamanca
Chamartin
Tetuan
Chamberi
Fuencarral-El Pardo
Moncloa-Aravaca
Latina
Carabanchel
Usera
Puente de Vallecas
Moratalaz
Ciudad Lineal
Hortaleza
Villaverde
Villa de Vallecas
Vicalvaro
San Blas-Canillejas
Barajas


In [91]:
print(n_venues.shape)
n_venues.head()

(573, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Centro,40.411535,-3.707628,Toga,40.411118,-3.706673,Spanish Restaurant
1,Centro,40.411535,-3.707628,La Tienda de la Cerveza,40.410546,-3.708128,Food & Drink Shop
2,Centro,40.411535,-3.707628,ok hostel,40.411135,-3.706819,Hostel
3,Centro,40.411535,-3.707628,Ruda Café,40.410486,-3.708,Coffee Shop
4,Centro,40.411535,-3.707628,Sala Equis,40.41219,-3.706028,Indie Movie Theater


Let's check how many venues were returned for each neighborhood.


In [92]:
n_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Arganzuela,47,47,47,47,47,47
Barajas,7,7,7,7,7,7
Carabanchel,4,4,4,4,4,4
Centro,100,100,100,100,100,100
Chamartin,36,36,36,36,36,36
Chamberi,100,100,100,100,100,100
Ciudad Lineal,30,30,30,30,30,30
Fuencarral-El Pardo,4,4,4,4,4,4
Hortaleza,15,15,15,15,15,15
Latina,14,14,14,14,14,14


#### Let's find out how many unique categories can be curated from all the returned venues


In [93]:
print('There are {} unique categories.'.format(len(n_venues['Venue Category'].unique())))

There are 132 unique categories.


<a id='item3'></a>


### Analyze Each Neighborhood in Madrid


In [94]:
# one hot encoding
n_onehot = pd.get_dummies(n_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
n_onehot['Neighborhood'] = n_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [n_onehot.columns[-1]] + list(n_onehot.columns[:-1])
n_onehot = n_onehot[fixed_columns]

n_onehot.head()

Unnamed: 0,Wine Shop,Accessories Store,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bakery,Bar,Beach,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Café,Candy Store,Cheese Shop,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Deli / Bodega,Department Store,Dessert Shop,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,General Entertainment,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Historic Site,History Museum,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Kebab Restaurant,Korean Restaurant,Lake,Latin American Restaurant,Liquor Store,Lounge,Market,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Molecular Gastronomy Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Music Venue,Neighborhood,Nightclub,Noodle House,Office,Other Nightlife,Outdoors & Recreation,Paella Restaurant,Park,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pub,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Skate Park,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Train,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Wine Bar
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Centro,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Centro,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Centro,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Centro,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Centro,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


And let's examine the new dataframe size.


In [95]:
n_onehot.shape

(573, 132)
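
Note that 132 categories plus a `Neighborhood` column should give 133 columns, not 132. The header of the table above suggests why: one of the venue categories returned for Madrid is itself called `Neighborhood`, so the assignment `n_onehot['Neighborhood'] = ...` overwrites that indicator column in place instead of appending a new one, and the reordering step then moves `Wine Shop` to the front. A hedged sketch of the collision and one possible way around it, on toy data (the category assignments and the `Area` column name are arbitrary choices made here):

```python
import pandas as pd

toy_venues = pd.DataFrame({
    'Neighborhood': ['Centro', 'Centro', 'Retiro'],
    'Venue Category': ['Hostel', 'Neighborhood', 'Park'],
})

# Same steps as the notebook cell: the category named 'Neighborhood' is overwritten
collided = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")
collided['Neighborhood'] = toy_venues['Neighborhood']
print(collided.shape)  # (3, 3): no column was added

# One way around it: store the area name under a label that clashes with no category
safe = pd.get_dummies(toy_venues[['Venue Category']], prefix="", prefix_sep="")
safe.insert(0, 'Area', toy_venues['Neighborhood'])
print(safe.shape)  # (3, 4): three categories plus the area name
```

Adopting such a change in the notebook would also mean grouping by the new column name in the following cells.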

#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category


In [96]:
n_grouped = n_onehot.groupby('Neighborhood').mean().reset_index()
n_grouped

Unnamed: 0,Neighborhood,Wine Shop,Accessories Store,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Bakery,Bar,Beach,Beer Bar,Beer Garden,Beer Store,Bistro,Bookstore,Boutique,Boxing Gym,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Café,Candy Store,Cheese Shop,Chinese Restaurant,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Convenience Store,Cosmetics Shop,Creperie,Deli / Bodega,Department Store,Dessert Shop,Diner,Dog Run,Donut Shop,Dumpling Restaurant,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,Furniture / Home Store,Garden,Gastropub,Gay Bar,General Entertainment,Gourmet Shop,Grocery Store,Gym,Gym / Fitness Center,Historic Site,History Museum,Hostel,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Kebab Restaurant,Korean Restaurant,Lake,Latin American Restaurant,Liquor Store,Lounge,Market,Mediterranean Restaurant,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Molecular Gastronomy Restaurant,Motorcycle Shop,Movie Theater,Multiplex,Music Venue,Nightclub,Noodle House,Office,Other Nightlife,Outdoors & Recreation,Paella Restaurant,Park,Peruvian Restaurant,Pharmacy,Pizza Place,Playground,Plaza,Pool,Pub,Resort,Restaurant,Salad Place,Salon / Barbershop,Sandwich Place,Scenic Lookout,Seafood Restaurant,Shoe Store,Skate Park,Snack Place,Soccer Field,Spa,Spanish Restaurant,Sporting Goods Shop,Sports Bar,Sports Club,Steakhouse,Supermarket,Sushi Restaurant,Tapas Restaurant,Theater,Thrift / Vintage Store,Toy / Game Store,Train,Train Station,Vegetarian / Vegan Restaurant,Video Game Store,Wine Bar
0,Arganzuela,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.106383,0.021277,0.0,0.042553,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.021277,0.021277,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.085106,0.021277,0.0,0.021277,0.042553,0.042553,0.021277,0.021277,0.0,0.021277,0.0,0.0,0.0,0.0,0.021277,0.0,0.021277,0.0,0.0,0.0,0.12766,0.0,0.0,0.0,0.0,0.021277,0.0,0.042553,0.0,0.0,0.0,0.0,0.021277,0.0,0.0,0.0
1,Barajas,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Carabanchel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Centro,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.05,0.0,0.01,0.01,0.0,0.03,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.03,0.05,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.09,0.0,0.01,0.01,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.09,0.0,0.0,0.0,0.0,0.01,0.0,0.13,0.02,0.0,0.0,0.0,0.0,0.02,0.0,0.02
4,Chamartin,0.0,0.0,0.027778,0.0,0.027778,0.0,0.0,0.027778,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.055556,0.027778,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.027778,0.027778,0.0,0.0,0.027778,0.0,0.0,0.0,0.027778,0.0,0.111111,0.0,0.0,0.0,0.0,0.027778,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.027778,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Chamberi,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.06,0.0,0.01,0.02,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.05,0.0,0.01,0.01,0.0,0.02,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.03,0.02,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.04,0.0,0.05,0.01,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.09,0.0,0.02,0.0,0.0,0.03,0.02,0.04,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0
6,Ciudad Lineal,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.033333,0.0,0.066667,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.033333,0.066667,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Fuencarral-El Pardo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.75,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Hortaleza,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.133333,0.0,0.066667,0.0,0.0,0.0,0.133333,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.133333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0
9,Latina,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.214286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0
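To see why the mean of one-hot rows can be read as a frequency, here is a minimal sketch on made-up data. The neighborhoods `A` and `B` and their venues are hypothetical and are not taken from the Foursquare results.

```python
import pandas as pd

# Hypothetical one-hot table: one row per venue, one 0/1 column per category.
toy_onehot = pd.DataFrame({
    'Neighborhood': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Bar':          [1, 1, 0, 0, 0, 0],
    'Café':         [0, 0, 1, 0, 1, 1],
    'Park':         [0, 0, 0, 1, 0, 0],
})

# The mean of a 0/1 column within a group is the share of that
# group's venues that fall into the category.
toy_grouped = toy_onehot.groupby('Neighborhood').mean().reset_index()
print(toy_grouped)

# Each row of frequencies sums to 1 because every venue has exactly one category.
row_sums = toy_grouped.drop(columns='Neighborhood').sum(axis=1)
print(row_sums)
```

This is consistent with the Carabanchel row above, whose only non-zero values are 0.5, 0.25 and 0.25.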


#### Let's confirm the new size


In [97]:
n_grouped.shape

(21, 132)

#### Let's print each neighborhood along with the top 5 most common venues


In [98]:
num_top_venues = 5

for hood in n_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = n_grouped[n_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Arganzuela----
                venue  freq
0  Spanish Restaurant  0.13
1                 Bar  0.11
2                Park  0.09
3               Plaza  0.04
4          Playground  0.04


----Barajas----
                venue  freq
0         Pizza Place  0.14
1          Restaurant  0.14
2  Mexican Restaurant  0.14
3                Café  0.14
4    Tapas Restaurant  0.14


----Carabanchel----
                  venue  freq
0                   Bar  0.50
1             Gastropub  0.25
2  Gym / Fitness Center  0.25
3             Wine Shop  0.00
4       Motorcycle Shop  0.00


----Centro----
                venue  freq
0    Tapas Restaurant  0.13
1  Spanish Restaurant  0.09
2               Plaza  0.09
3               Hotel  0.05
4                 Bar  0.05


----Chamartin----
                      venue  freq
0        Spanish Restaurant  0.22
1                Restaurant  0.11
2  Mediterranean Restaurant  0.08
3                     Hotel  0.06
4                 Nightclub  0.06


----Chamberi--

#### Let's put that into a _pandas_ dataframe


First, let's write a function to sort the venues in descending order.
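The cell that builds the dataframe calls `return_most_common_venues(n_grouped.iloc[ind, :], num_top_venues)`, so the helper is expected to take one row of `n_grouped` and return the names of its most frequent categories. A minimal sketch of such a helper follows, assuming the first entry of the row is the neighborhood name and every other entry is a category frequency. The `toy_row` used to exercise it is hypothetical.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    """Return the names of the highest-frequency venue categories in one row."""
    # Skip the first entry (the neighborhood name) and keep only the frequencies.
    row_categories = row.iloc[1:].astype(float)
    # Sort the frequencies from highest to lowest.
    row_categories_sorted = row_categories.sort_values(ascending=False)
    # The index of the sorted Series holds the category names.
    return row_categories_sorted.index.values[:num_top_venues]

# Hypothetical row shaped like one row of n_grouped.
toy_row = pd.Series({'Neighborhood': 'A', 'Bar': 0.50, 'Café': 0.10, 'Park': 0.25, 'Gym': 0.15})
print(return_most_common_venues(toy_row, 3))
```

When several categories share the same frequency, the order among them depends on the sort algorithm, so tied venues may appear in a different order than in a simple printed ranking.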


Now let's create the new dataframe and display the top 10 venues for each neighborhood.


In [99]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # no 'st'/'nd'/'rd' suffix left, so fall back to 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = n_grouped['Neighborhood']

for ind in np.arange(n_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(n_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arganzuela,Spanish Restaurant,Bar,Park,Plaza,Playground,Beer Garden,Tapas Restaurant,Ice Cream Shop,Pool,Pizza Place
1,Barajas,Hostel,Café,Restaurant,Tapas Restaurant,Mexican Restaurant,Pizza Place,Coffee Shop,Concert Hall,Convenience Store,Comfort Food Restaurant
2,Carabanchel,Bar,Gastropub,Gym / Fitness Center,Wine Bar,Donut Shop,Dog Run,Diner,Dessert Shop,Department Store,Deli / Bodega
3,Centro,Tapas Restaurant,Plaza,Spanish Restaurant,Bar,Hotel,Hostel,Restaurant,Bistro,Coffee Shop,Dessert Shop
4,Chamartin,Spanish Restaurant,Restaurant,Mediterranean Restaurant,Hotel,Tapas Restaurant,Nightclub,Ice Cream Shop,Gym / Fitness Center,Garden,Japanese Restaurant


<a id='item4'></a>


### Cluster Neighborhoods in Madrid


Run _k_-means to cluster the neighborhoods into 5 clusters.


In [100]:
# set number of clusters
kclusters = 5

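# keep only the numeric frequency columns (the positional axis argument is not accepted by recent pandas versions)
n_grouped_clustering = n_grouped.drop(columns='Neighborhood')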

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(n_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 3, 1, 1, 1, 1, 2, 1, 1])
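To illustrate what these labels mean, the sketch below runs the same `KMeans` call on a tiny, made-up frequency table in which two neighborhoods are bar-heavy and two are restaurant-heavy. The data and the choice of 2 clusters are assumptions for illustration only. `n_init=10` is passed explicitly because its default value differs between scikit-learn versions.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical grouped frequencies for four neighborhoods.
toy_grouped = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C', 'D'],
    'Bar':          [0.8, 0.7, 0.1, 0.0],
    'Restaurant':   [0.2, 0.3, 0.9, 1.0],
})

# Cluster on the numeric columns only.
toy_features = toy_grouped.drop(columns='Neighborhood')
toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_features)
toy_labels = toy_kmeans.labels_

# One label per row, in the same order as toy_grouped.
print(dict(zip(toy_grouped['Neighborhood'], toy_labels)))
# Number of neighborhoods assigned to each cluster.
print(np.bincount(toy_labels))
```

The label numbers are arbitrary identifiers: they say which rows belong together, not which cluster is "larger" or "better". In the Madrid output above, most of the first ten neighborhoods share label 1.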

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each neighborhood.


In [101]:
n_merged = m_n

In [102]:
# add clustering labels (insert raises ValueError if this cell is re-run, because the column already exists)
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

# join the cluster labels and top venues onto n_merged, which holds latitude/longitude for each neighborhood
n_merged = n_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

n_merged.head() # check the last columns!

Unnamed: 0,Borough,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Madrid,Centro,40.411535,-3.707628,1,Tapas Restaurant,Plaza,Spanish Restaurant,Bar,Hotel,Hostel,Restaurant,Bistro,Coffee Shop,Dessert Shop
1,Madrid,Arganzuela,40.398889,-3.710203,1,Spanish Restaurant,Bar,Park,Plaza,Playground,Beer Garden,Tapas Restaurant,Ice Cream Shop,Pool,Pizza Place
2,Madrid,Retiro,40.411335,-3.674905,1,Spanish Restaurant,Grocery Store,Garden,Supermarket,Plaza,Ice Cream Shop,Brewery,Jazz Club,Mediterranean Restaurant,Diner
3,Madrid,Salamanca,40.428002,-3.686771,1,Restaurant,Spanish Restaurant,Clothing Store,Boutique,Tapas Restaurant,Furniture / Home Store,Hotel,Jewelry Store,Sporting Goods Shop,Japanese Restaurant
4,Madrid,Chamartin,40.46152,-3.686584,1,Spanish Restaurant,Restaurant,Mediterranean Restaurant,Hotel,Tapas Restaurant,Nightclub,Ice Cream Shop,Gym / Fitness Center,Garden,Japanese Restaurant
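`DataFrame.join` with `on='Neighborhood'` looks up each neighborhood of the left table in the index of the right table. A minimal sketch on hypothetical data shows this behavior, including the case where a neighborhood has no match and receives `NaN` values:

```python
import pandas as pd

# Hypothetical location table, playing the role of n_merged before the join.
toy_locations = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude':     [40.41, 40.40, 40.43],
    'Longitude':    [-3.71, -3.70, -3.69],
})

# Hypothetical sorted-venues table; neighborhood 'C' returned no venues.
toy_sorted = pd.DataFrame({
    'Cluster Labels':        [1, 0],
    'Neighborhood':          ['B', 'A'],
    '1st Most Common Venue': ['Bar', 'Park'],
})

# Rows are matched by name, not by position, so the differing order is harmless.
toy_merged = toy_locations.join(toy_sorted.set_index('Neighborhood'), on='Neighborhood')
print(toy_merged)

# Neighborhoods without a match get NaN; they can be dropped before plotting.
toy_merged_clean = toy_merged.dropna(subset=['Cluster Labels'])
print(len(toy_merged_clean))
```

A `NaN` in `Cluster Labels` also turns the column into floats, which is worth checking before the labels are used as list indices.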


Finally, let's visualize the resulting clusters.


In [104]:

latitude = m_latitude
longitude = m_longitude

In [105]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(n_merged['Latitude'], n_merged['Longitude'], n_merged['Neighborhood'], n_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    # labels run from 0 to kclusters - 1, so they index the color list directly
    marker_color = rainbow[int(cluster)]
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=marker_color,
        fill=True,
        fill_color=marker_color,
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
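The color scheme in the cell above only needs one hex color per cluster. As a minimal sketch of that step in isolation, the snippet below samples the `rainbow` colormap at `kclusters` evenly spaced points; the name `palette` is introduced here for illustration and does not appear in the notebook.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5  # same number of clusters as in the k-means cell

# Sample the colormap at kclusters evenly spaced points in [0, 1]
# and convert each RGBA value to a hex string such as '#ff0000'.
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]

# k-means labels run from 0 to kclusters - 1, so a label can index the palette directly.
for label in range(kclusters):
    print(label, palette[label])
```

Indexing with the label itself keeps the mapping easy to read: cluster label 0 always gets the first color and label `kclusters - 1` the last.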

<a id='item5'></a>


### Examine Clusters in Madrid


#### Cluster 1


In [106]:
n_merged.loc[n_merged['Cluster Labels'] == 0, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
18,Vicalvaro,Breakfast Spot,Dog Run,Bar,Wine Bar,Dessert Shop,Dumpling Restaurant,Donut Shop,Diner,Department Store,Farmers Market


#### Cluster 2


In [107]:
n_merged.loc[n_merged['Cluster Labels'] == 1, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Centro,Tapas Restaurant,Plaza,Spanish Restaurant,Bar,Hotel,Hostel,Restaurant,Bistro,Coffee Shop,Dessert Shop
1,Arganzuela,Spanish Restaurant,Bar,Park,Plaza,Playground,Beer Garden,Tapas Restaurant,Ice Cream Shop,Pool,Pizza Place
2,Retiro,Spanish Restaurant,Grocery Store,Garden,Supermarket,Plaza,Ice Cream Shop,Brewery,Jazz Club,Mediterranean Restaurant,Diner
3,Salamanca,Restaurant,Spanish Restaurant,Clothing Store,Boutique,Tapas Restaurant,Furniture / Home Store,Hotel,Jewelry Store,Sporting Goods Shop,Japanese Restaurant
4,Chamartin,Spanish Restaurant,Restaurant,Mediterranean Restaurant,Hotel,Tapas Restaurant,Nightclub,Ice Cream Shop,Gym / Fitness Center,Garden,Japanese Restaurant
5,Tetuan,Spanish Restaurant,Coffee Shop,Restaurant,Grocery Store,Supermarket,Bar,Chinese Restaurant,Hotel,Seafood Restaurant,Resort
6,Chamberi,Spanish Restaurant,Bar,Café,Restaurant,Theater,Sandwich Place,Pub,Tapas Restaurant,Pizza Place,Supermarket
9,Latina,Pizza Place,Metro Station,Asian Restaurant,Train Station,Grocery Store,Italian Restaurant,Supermarket,Scenic Lookout,Fast Food Restaurant,Lake
11,Usera,Spanish Restaurant,Café,Mobile Phone Shop,Grocery Store,Noodle House,Pizza Place,Theater,Bubble Tea Shop,Asian Restaurant,Fast Food Restaurant
13,Moratalaz,Restaurant,Metro Station,Pizza Place,Bar,Concert Hall,Pub,Coffee Shop,Gym,Chinese Restaurant,Comfort Food Restaurant


#### Cluster 3


In [108]:
n_merged.loc[n_merged['Cluster Labels'] == 2, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Fuencarral-El Pardo,Spanish Restaurant,Soccer Field,Deli / Bodega,Donut Shop,Dog Run,Diner,Dessert Shop,Department Store,Creperie,Electronics Store
8,Moncloa-Aravaca,Spanish Restaurant,Wine Bar,Department Store,Dumpling Restaurant,Donut Shop,Dog Run,Diner,Dessert Shop,Deli / Bodega,Farmers Market


#### Cluster 4


In [109]:
n_merged.loc[n_merged['Cluster Labels'] == 3, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Carabanchel,Bar,Gastropub,Gym / Fitness Center,Wine Bar,Donut Shop,Dog Run,Diner,Dessert Shop,Department Store,Deli / Bodega


#### Cluster 5


In [110]:
n_merged.loc[n_merged['Cluster Labels'] == 4, n_merged.columns[[1] + list(range(5, n_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Puente de Vallecas,Spanish Restaurant,Supermarket,Park,Electronics Store,Pizza Place,Tapas Restaurant,Burger Joint,Restaurant,Cosmetics Shop,Department Store
16,Villaverde,Spanish Restaurant,Pizza Place,Train,Brewery,Wine Bar,Deli / Bodega,Dog Run,Diner,Dessert Shop,Department Store


## Change Log

| Date (YYYY-MM-DD) | Version | Changed By    | Change Description         |
| ----------------- | ------- | ------------- | -------------------------- |
| 2020-11-26        | 2.0     | Lakshmi Holla | Updated the markdown cells |
| 2020-12-31        | 3.0     | Victor Pagan  | D Science Capstone Project |

## <h3 align="center"> © IBM Corporation 2020. All rights reserved. </h3>
