# Northern / Southern Rome. A false dichotomy?

To get acquainted with Foursquare's API, I used the IBM final course notebooks to find out whether the old ideological division between Northern and Southern Rome makes sense numerically: clustering the neighborhoods of Rome by their local venues could be one way to read the territory in that sense.

Please note: I'm not Roman. This was also a fun way to get to know the city's areas.

So, let's import some libraries.

In [1]:
# Import libraries
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

import geocoder

import requests # library to handle requests
from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## Phase 1 : Constructing Datasets.

### Rome Lat Long - Neighborhoods datasets.

First, we generate a dataframe with the columns ID, name, lat, lon for Rome's community areas. Let's define the geocoder using the Bing API.

In [2]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.bing('{}, Roma, Italia'.format(neighborhood), key="-vcZ2fo-O")
        lat_lng_coords = g.latlng
    return lat_lng_coords
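
The loop above keeps retrying for as long as the geocoder returns no coordinates, which never ends if the key is invalid or the network is down. A minimal sketch of a bounded retry follows; `retry_until_result`, `max_attempts` and `delay_seconds` are hypothetical names (not part of `geocoder`), and `fetch` stands in for a call such as `lambda: geocoder.bing(...).latlng`.

```python
import time

def retry_until_result(fetch, max_attempts=5, delay_seconds=1.0):
    """Call fetch() until it returns something other than None, at most max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        result = fetch()
        if result is not None:
            return result
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before the next try
    raise RuntimeError("no result after {} attempts".format(max_attempts))


# Stand-in for a geocoder call: fails twice, then returns made-up coordinates.
_answers = iter([None, None, [41.9, 12.5]])
coords = retry_until_result(lambda: next(_answers), max_attempts=5, delay_seconds=0)
print(coords)
```

Raising after a fixed number of attempts makes a bad key or a misspelled neighborhood name surface as an error instead of a hanging cell.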

In [3]:
# test the functioning of the geo_coder. Import the "center of Rome" lat, long.
latitude, longitude = get_latlng("")

In [4]:
#load manually written Rome Neighborhood datasets (there was no tabular wiki)
rome_quart = pd.read_csv("data/roma.csv")

In [5]:
# request lat,long via bing
coords = [ get_latlng(quart) for quart in rome_quart["Name"].tolist() ]

In [6]:
# merge the datasets
df_coords = pd.DataFrame(coords, columns=["Latitude", "Longitude"])
rome_quart["Latitude"] = df_coords["Latitude"]
rome_quart["Longitude"] = df_coords["Longitude"]

In [126]:
rome_quart.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Rione Monti,41.89492,12.49434
1,Rione Trevi,41.902199,12.48606
2,Rione Campo Marzio,41.907108,12.47786
3,Rione Ponte,41.89904,12.46711
4,Rione Regola,41.894791,12.47027


In [8]:
# split the dataset into quartieri and rioni by name prefix
quartieri_roma = rome_quart[rome_quart["Name"].str.startswith("Quartiere")]
rioni_roma = rome_quart[rome_quart["Name"].str.startswith("Rione")]

In [9]:
# create map of Rome using latitude and longitude values
map_rome = folium.Map(location=[latitude, longitude], tiles="Stamen Terrain",zoom_start=11.5)

# add quartieri_circle to map
for lat, lng, neighborhood in zip(quartieri_roma['Latitude'], quartieri_roma['Longitude'], quartieri_roma['Name']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lng],
        radius=1200,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_rome)
    
# add rioni_circle to map
for lat, lng, neighborhood in zip(rioni_roma['Latitude'], rioni_roma['Longitude'], rioni_roma['Name']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.Circle(
        [lat, lng],
        radius=400,
        popup=label,
        color='red',
        fill=True,
        fill_color='#FA8072',
        fill_opacity=0.7).add_to(map_rome)
    
    
map_rome

I manually removed two circles that overlapped too much, corresponding to the "Rioni" *Parione* and *Campitelli*.

In [10]:
# save the map as HTML file
map_rome.save('map_rome.html')

## Phase 2 : Venues Analysis.

First, we define the mandatory API identification codes. WARNING: blank them out before publishing!

In [11]:
CLIENT_ID = '' # your Foursquare ID
CLIENT_SECRET = '' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: <redacted>
CLIENT_SECRET:<redacted>
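
Since real credentials should not end up in a published notebook, one option is to read them from environment variables and to mask them when printing. The sketch below is only an illustration: the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_foursquare_credentials` and `mask` are assumptions, not something Foursquare prescribes.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) from the environment, or raise if missing."""
    client_id = env.get("FOURSQUARE_CLIENT_ID", "")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET", "")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret

def mask(secret, visible=4):
    """Show only the first few characters of a secret when printing it."""
    return secret[:visible] + "*" * max(len(secret) - visible, 0)

# Example with a plain dict standing in for os.environ (placeholder values).
fake_env = {"FOURSQUARE_CLIENT_ID": "ABCD1234", "FOURSQUARE_CLIENT_SECRET": "WXYZ5678"}
client_id, client_secret = load_foursquare_credentials(fake_env)
print("CLIENT_ID: " + mask(client_id))
```

With this approach the notebook source and its printed output can both be shared without exposing the keys.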


#### Now, let's get the top 100 venues that are in the first "Rione" within a radius of 500 meters.

In [12]:
print("The Rione chosen for testing is '{}'".format(rome_quart.loc[0, "Name"]))

The Rione chosen for testing is 'Rione Monti'


Get the neighborhood's latitude and longitude values.

In [13]:
neighborhood_latitude = rome_quart.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = rome_quart.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = rome_quart.loc[0, 'Name'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Rione Monti are 41.894920349121094, 12.494339942932129.


Then, let's create the GET request URL

In [14]:
LIMIT = 200 # limit of number of venues returned by Foursquare API
radius = 500 # define radius

# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=<redacted>&client_secret=<redacted>&v=20180605&ll=41.894920349121094,12.494339942932129&radius=500&limit=200'
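
As an alternative to formatting the query string by hand, the parameters can be kept in a dictionary and encoded with the standard library. This is a sketch rather than the notebook's own method; `build_explore_url` is a hypothetical helper, and the credentials in the example are placeholders. Note that `urlencode` percent-encodes the comma in `ll` as `%2C`.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the explore URL from a dict of query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)

example_url = build_explore_url("ID", "SECRET", "20180605", 41.89492, 12.49434, 500, 100)
print(example_url)
```

The same dictionary could also be passed to `requests.get(url, params=...)`, which performs the encoding itself.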

Through a GET request, we acquire the raw JSON `results` from Foursquare.

In [15]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5cd93dbff594df21bfead284'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Monti',
  'headerFullLocation': 'Monti, Rome',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 109,
  'suggestedBounds': {'ne': {'lat': 41.899420353621096,
    'lng': 12.50037403611768},
   'sw': {'lat': 41.89042034462109, 'lng': 12.488305849746578}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4fca3346e4b0e1d3a31b740f',
       'name': 'Fatamorgana',
       'location': {'address': 'Piazza Degli Zingari 5',
        'lat': 41.89561,
        'lng': 12.493304,
        'labeledLatLngs': [{'label': 'display',
          'lat': 41.89561,
          'lng': 12.

In [16]:
# function that extracts the category of the venue from Foursquare Labs

def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:  # the flattened frame uses the 'venue.' prefix
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [17]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Fatamorgana,Ice Cream Shop,41.89561,12.493304
1,Trieste,Pizza Place,41.896305,12.494132
2,Grezzo,Pastry Shop,41.896681,12.494535
3,Montipalace Hotel,Hotel,41.895384,12.493839
4,Libreria Caffè Bohemien,Cocktail Bar,41.895444,12.492863


In [18]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


#### Then we define the following function, which uses the *explore* endpoint of the Foursquare API to return the venues near each neighborhood.

In [19]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            # some venues may come back without any category
            v['venue']['categories'][0]['name'] if v['venue']['categories'] else None
            ) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
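
A venue can come back with an empty `categories` list, a case the earlier helper `get_category_type` guards against. The sketch below shows the same flattening step as a small function that can be tried offline; `venue_rows` is a hypothetical name, and the sample payload is hand-written to mimic the response shape shown earlier (the second venue is made up).

```python
def venue_rows(name, lat, lng, items):
    """Turn explore-style items into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((name, lat, lng,
                     venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Hand-written sample shaped like the explore response (second venue is invented).
sample_items = [
    {"venue": {"name": "Fatamorgana",
               "location": {"lat": 41.89561, "lng": 12.493304},
               "categories": [{"name": "Ice Cream Shop"}]}},
    {"venue": {"name": "Uncategorized Place",
               "location": {"lat": 41.8950, "lng": 12.4940},
               "categories": []}},
]
rows = venue_rows("Rione Monti", 41.89492, 12.49434, sample_items)
for row in rows:
    print(row)
```

Keeping the parsing separate from the HTTP call makes it possible to test the row layout without spending API quota.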

Please note that the "rioni" are packed much more closely together than the "quartieri". We should therefore split the two datasets and use a different search radius for each (the calls below use 400 m for the rioni and 1200 m for the quartieri).
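
To get a feel for a sensible radius, one can compute the distance between two neighboring centers. The sketch below uses the haversine formula with a spherical Earth of radius 6371 km (an approximation); `haversine_m` is a hypothetical helper, and the coordinates are those of Rione Monti and Rione Trevi from the table above.

```python
from math import asin, cos, radians, sin, sqrt

def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(a))

# Centers of Rione Monti and Rione Trevi.
d = haversine_m(41.89492, 12.49434, 41.902199, 12.48606)
print(round(d))
```

The two centers come out roughly one kilometer apart, so search circles of a few hundred meters around each rione overlap only slightly, if at all.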

In [20]:
# Now write the code to run the above function on each neighborhood and create a new merged dataframe.

In [21]:
rioni_venues = getNearbyVenues(names=rioni_roma['Name'],
                                   latitudes=rioni_roma['Latitude'],
                                   longitudes=rioni_roma['Longitude'],
                                   radius=400
                                  )
quartieri_venues = getNearbyVenues(names=quartieri_roma['Name'],
                                   latitudes=quartieri_roma['Latitude'],
                                   longitudes=quartieri_roma['Longitude'],
                                   radius=1200
                                  )

Rione Monti
Rione Trevi
Rione Campo Marzio
Rione Ponte
Rione Regola
Rione Sant’Eustachio
Rione Pigna
Rione Sant’Angelo
Rione Ripa
Rione Trastevere
Rione Borgo
Rione Esquilino
Rione Ludovisi
Rione Sallustiano
Rione Castro Pretorio
Rione Celio
Rione Testaccio
Rione Prati
Quartiere Flaminio
Quartiere Parioli
Quartiere Pinciano
Quartiere Salario
Quartiere Nomentano
Quartiere Tiburtino
Quartiere Prenestino Labicano
Quartiere Tuscolano
Quartiere Appio Latino
Quartiere Ostiense
Quartiere Portuense
Quartiere Gianicolense
Quartiere Aurelio
Quartiere Trionfale
Quartiere Della Vittoria
Quartiere Monte Sacro
Quartiere Trieste
Quartiere Tor di Quinto
Quartiere Prenestino Centocelle
Quartiere Ardeatino
Quartiere Pietralata
Quartiere Collatino
Quartiere Alessandrino
Quartiere Don Bosco
Quartiere Appio Claudio
Quartiere Appio Pignatelli
Quartiere Primavalle
Quartiere Monte Sacro Alto
Quartiere Ponte Mammolo
Quartiere San Basilio
Quartiere Giuliano Dalmata
Quartiere Eur


In [22]:
quartieri_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Quartiere Flaminio,41.929989,12.46451,Neve Di Latte,41.929813,12.464887,Ice Cream Shop
1,Quartiere Flaminio,41.929989,12.46451,MAXXI Museo Nazionale delle Arti del XXI Secolo,41.928455,12.46684,Art Museum
2,Quartiere Flaminio,41.929989,12.46451,Bistrot 64,41.930391,12.466205,Restaurant
3,Quartiere Flaminio,41.929989,12.46451,Hostaria Lo Sgobbone,41.928453,12.462484,Italian Restaurant
4,Quartiere Flaminio,41.929989,12.46451,Stadio dei Marmi,41.933528,12.458784,Stadium
5,Quartiere Flaminio,41.929989,12.46451,Siciliainbocca,41.932792,12.467468,Italian Restaurant
6,Quartiere Flaminio,41.929989,12.46451,Stadio Olimpico,41.932312,12.457413,Soccer Stadium
7,Quartiere Flaminio,41.929989,12.46451,20MQ Design e Caffè,41.925191,12.470467,Café
8,Quartiere Flaminio,41.929989,12.46451,Cuccurucù,41.92478,12.45889,Italian Restaurant
9,Quartiere Flaminio,41.929989,12.46451,Accademia Nazionale di Santa Cecilia,41.929612,12.473657,Concert Hall


Then we can concatenate the two dataframes and remove venues with duplicated names. Note that `keep=False` drops every row whose venue name occurs more than once, not just the extra copies. We assume that
1. there is some residual overlap between our searches,
1. in Italy, two venues in the same city cannot share the same name.

In [23]:
rome_venues = pd.concat([rioni_venues,quartieri_venues])
rome_venues.drop_duplicates(subset ="Venue", 
                     keep = False, inplace = True) 
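
The effect of the `keep` argument is easy to check on a toy frame. In the sketch below the venue names are invented; it contrasts `keep=False`, as used above, with `keep="first"`.

```python
import pandas as pd

toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "B"],
    "Venue": ["Gelateria X", "Gelateria X", "Bar Y"],
})

# keep=False removes every row whose name is repeated, so "Gelateria X" disappears entirely.
dropped_all = toy.drop_duplicates(subset="Venue", keep=False)

# keep="first" retains one copy of each repeated name.
kept_first = toy.drop_duplicates(subset="Venue", keep="first")

print(len(dropped_all), len(kept_first))
```

With `keep=False`, a chain with several branches under the same name vanishes from the dataset altogether, which may matter when counting categories per neighborhood.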

Let's check the venues dataframe size.

In [24]:
print(rome_venues.shape)
rome_venues.head()

(2297, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
1,Rione Monti,41.89492,12.49434,Trieste,41.896305,12.494132,Pizza Place
2,Rione Monti,41.89492,12.49434,Grezzo,41.896681,12.494535,Pastry Shop
3,Rione Monti,41.89492,12.49434,Montipalace Hotel,41.895384,12.493839,Hotel
4,Rione Monti,41.89492,12.49434,Libreria Caffè Bohemien,41.895444,12.492863,Cocktail Bar
5,Rione Monti,41.89492,12.49434,Relais Monti,41.896606,12.494637,Bed & Breakfast


In [25]:
rome_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Quartiere Alessandrino,11,11,11,11,11,11
Quartiere Appio Claudio,8,8,8,8,8,8
Quartiere Appio Latino,60,60,60,60,60,60
Quartiere Appio Pignatelli,10,10,10,10,10,10
Quartiere Ardeatino,40,40,40,40,40,40
Quartiere Aurelio,83,83,83,83,83,83
Quartiere Collatino,34,34,34,34,34,34
Quartiere Della Vittoria,39,39,39,39,39,39
Quartiere Don Bosco,46,46,46,46,46,46
Quartiere Eur,71,71,71,71,71,71


#### Let's find out how many unique categories can be curated from all the returned venues

In [26]:
print('There are {} unique categories.'.format(len(rome_venues['Venue Category'].unique())))

There are 212 unique categories.


## Phase 3 : Analyze Each Neighborhood.

In [27]:
# one hot encoding
rome_onehot = pd.get_dummies(rome_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
rome_onehot['Neighborhood'] = rome_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [rome_onehot.columns[-1]] + list(rome_onehot.columns[:-1])
rome_onehot = rome_onehot[fixed_columns]

rome_onehot.head()

Unnamed: 0,Neighborhood,Abruzzo Restaurant,Accessories Store,African Restaurant,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bistro,Boarding House,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buffet,Burger Joint,Bus Line,Bus Station,Butcher,Cafeteria,Café,Camera Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convention Center,Cosmetics Shop,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Donut Shop,Electronics Store,Embassy / Consulate,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Film Studio,Flea Market,Fondue Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Friterie,Garden,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Hardware Store,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Lake,Laser Tag,Light Rail Station,Lingerie Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Noodle House,Other Nightlife,Outdoors & Recreation,Paper / Office Supplies Store,Park,Parking,Pastry Shop,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Café,Pet Store,Pharmacy,Photography Lab,Piano Bar,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Recording Studio,Resort,Restaurant,Road,Rock Club,Roman Restaurant,Roof Deck,Russian Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Spa,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Toy / Game Store,Train Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,Zoo
1,Rione Monti,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Rione Monti,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Rione Monti,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Rione Monti,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Rione Monti,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [28]:
# Let's check that the dimensions are consistent. We should have 2297 venues and 212 categories (+1 column for the neighborhood)
rome_onehot.shape

(2297, 213)

#### Next, let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [29]:
rome_grouped = rome_onehot.groupby('Neighborhood').mean().reset_index()
rome_grouped

Unnamed: 0,Neighborhood,Abruzzo Restaurant,Accessories Store,African Restaurant,Airport,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bistro,Boarding House,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buffet,Burger Joint,Bus Line,Bus Station,Butcher,Cafeteria,Café,Camera Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,College Gym,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convention Center,Cosmetics Shop,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Diner,Discount Store,Dog Run,Donut Shop,Electronics Store,Embassy / Consulate,Ethiopian Restaurant,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Film Studio,Flea Market,Fondue Restaurant,Food,Food & Drink Shop,Food Court,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Friterie,Garden,Gastropub,Gay Bar,General Entertainment,General Travel,German Restaurant,Gift Shop,Gluten-free Restaurant,Golf Course,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Halal Restaurant,Hardware Store,Health Food Store,Historic Site,History Museum,Hostel,Hotel,Hotel Bar,Hotel Pool,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Lake,Laser Tag,Light Rail Station,Lingerie Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Monument / Landmark,Motorcycle Shop,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Noodle House,Other Nightlife,Outdoors & Recreation,Paper / Office Supplies Store,Park,Parking,Pastry Shop,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Café,Pet Store,Pharmacy,Photography Lab,Piano Bar,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Pub,Public Art,Radio Station,Ramen Restaurant,Record Shop,Recording Studio,Resort,Restaurant,Road,Rock Club,Roman Restaurant,Roof Deck,Russian Restaurant,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Skate Park,Skating Rink,Smoke Shop,Snack Place,Soccer Field,Soccer Stadium,Spa,Speakeasy,Spiritual Center,Sporting Goods Shop,Sports Bar,Steakhouse,Supermarket,Sushi Restaurant,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park,Thrift / Vintage Store,Toy / Game Store,Train Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Game Store,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,Zoo
0,Quartiere Alessandrino,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.181818,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.181818,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Quartiere Appio Claudio,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Quartiere Appio Latino,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.033333,0.0,0.0,0.0,0.016667,0.0,0.0,0.016667,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.0,0.0,0.0,0.033333,0.016667,0.016667,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.033333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.066667,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.016667,0.016667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.016667,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Quartiere Appio Pignatelli,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Quartiere Ardeatino,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.025,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.05,0.0,0.0,0.025,0.0,0.0,0.0,0.1,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.025,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.075,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Quartiere Aurelio,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.084337,0.0,0.0,0.0,0.048193,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.012048,0.036145,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.180723,0.0,0.012048,0.024096,0.012048,0.0,0.0,0.13253,0.012048,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.012048,0.0,0.024096,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.048193,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.012048,0.0,0.036145,0.0,0.0,0.0,0.0,0.0,0.0,0.048193,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.024096,0.012048,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012048,0.012048,0.0,0.0,0.0,0.012048,0.0,0.0,0.0,0.012048,0.0
6,Quartiere Collatino,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.088235,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.029412,0.0,0.0,0.0,0.058824,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.088235,0.0,0.0,0.0,0.058824,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0,0.0,0.029412,0.0,0.0,0.0,0.0
7,Quartiere Della Vittoria,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.102564,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.051282,0.0,0.0,0.0,0.25641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.051282,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.025641,0.051282,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0
8,Quartiere Don Bosco,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.043478,0.0,0.0,0.0,0.130435,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.108696,0.0,0.0,0.021739,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0
9,Quartiere Eur,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.014085,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.112676,0.0,0.0,0.0,0.042254,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028169,0.028169,0.0,0.0,0.0,0.0,0.0,0.028169,0.0,0.042254,0.028169,0.014085,0.028169,0.0,0.0,0.0,0.070423,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.084507,0.0,0.0,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028169,0.014085,0.0,0.0,0.014085,0.014085,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.056338,0.0,0.0,0.0,0.0,0.0,0.0,0.042254,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.014085,0.0,0.0,0.0,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.028169,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [30]:
rome_grouped.shape

(50, 213)

In [31]:
num_top_venues = 5

for hood in rome_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = rome_grouped[rome_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Quartiere Alessandrino----
                   venue  freq
0         Ice Cream Shop  0.18
1                   Park  0.18
2           Burger Joint  0.09
3            Pizza Place  0.09
4  Outdoors & Recreation  0.09


----Quartiere Appio Claudio----
                venue  freq
0  Italian Restaurant  0.25
1         Film Studio  0.25
2        Burger Joint  0.12
3                Café  0.12
4  Seafood Restaurant  0.12


----Quartiere Appio Latino----
                venue  freq
0                 Pub  0.08
1         Pizza Place  0.08
2               Plaza  0.07
3   Trattoria/Osteria  0.07
4  Italian Restaurant  0.07


----Quartiere Appio Pignatelli----
                  venue  freq
0    Italian Restaurant   0.1
1                 Hotel   0.1
2     Martial Arts Dojo   0.1
3  Gym / Fitness Center   0.1
4          Tennis Court   0.1


----Quartiere Ardeatino----
                venue  freq
0                Café  0.20
1  Italian Restaurant  0.10
2          Restaurant  0.08
3         Supermarket

                venue  freq
0  Italian Restaurant  0.19
1               Plaza  0.08
2       Historic Site  0.08
3    Roman Restaurant  0.05
4      Scenic Lookout  0.05


----Rione Sant’Eustachio----
                venue  freq
0  Italian Restaurant  0.32
1               Hotel  0.08
2      Ice Cream Shop  0.06
3          Restaurant  0.06
4            Fountain  0.06


----Rione Testaccio----
                venue  freq
0  Italian Restaurant  0.17
1    Roman Restaurant  0.07
2           Nightclub  0.07
3      Ice Cream Shop  0.04
4        Cocktail Bar  0.04


----Rione Trastevere----
                venue  freq
0  Italian Restaurant  0.32
1        Cocktail Bar  0.11
2         Pizza Place  0.11
3                 Bar  0.05
4               Plaza  0.05


----Rione Trevi----
                venue  freq
0  Italian Restaurant  0.22
1               Hotel  0.18
2      Ice Cream Shop  0.06
3                Café  0.04
4         Pizza Place  0.04




First, let's write a function that returns the most common venue categories of a neighborhood, sorted by descending frequency.

In [32]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]
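
A quick way to check the function is to call it on a hand-made row. The sketch below uses a toy row whose frequencies are invented for illustration (they do not come from the Foursquare data); the function is repeated so the snippet runs on its own.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the neighborhood name), then sort the frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# toy row: name first, then one frequency per venue category (illustrative values)
toy_row = pd.Series({
    'Neighborhood': 'Toy District',
    'Café': 0.10,
    'Hotel': 0.30,
    'Pizza Place': 0.40,
    'Park': 0.20,
})

top3 = return_most_common_venues(toy_row, 3)
print(list(top3))  # ['Pizza Place', 'Hotel', 'Park']
```

Only the category names are returned, not their frequencies, which is why the table built in the next cell contains labels only.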

In [33]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = rome_grouped['Neighborhood']

for ind in np.arange(rome_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(rome_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Quartiere Alessandrino,Park,Ice Cream Shop,Athletics & Sports,Pizza Place,Supermarket,Metro Station,Burger Joint,Outdoors & Recreation,Performing Arts Venue,Falafel Restaurant
1,Quartiere Appio Claudio,Italian Restaurant,Film Studio,Golf Course,Burger Joint,Seafood Restaurant,Café,Zoo,Food,Fondue Restaurant,Flea Market
2,Quartiere Appio Latino,Pizza Place,Pub,Café,Restaurant,Italian Restaurant,Plaza,Trattoria/Osteria,Dessert Shop,Ice Cream Shop,Bakery
3,Quartiere Appio Pignatelli,Food & Drink Shop,Italian Restaurant,Tennis Court,Café,Martial Arts Dojo,Garden,Gym / Fitness Center,Pizza Place,Hotel,Bed & Breakfast
4,Quartiere Ardeatino,Café,Italian Restaurant,Restaurant,Supermarket,Hotel,Bistro,College Gym,Seafood Restaurant,Mexican Restaurant,Camera Store


In [116]:
# set number of clusters
kclusters = 3

rome_grouped_clustering = rome_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(rome_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 0, 0, 0, 1, 0, 2, 0, 0], dtype=int32)
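
The number of clusters is fixed by hand here. One common way to compare candidate values of k, which is not what this notebook did, is the silhouette score. The sketch below is a minimal illustration on synthetic 2-D points (three well-separated blobs, an assumption made so the snippet is self-contained); with the real data, `rome_grouped_clustering` would take the place of `X`.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# synthetic stand-in for the feature matrix: three well-separated blobs
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centers])

# fit k-means for several k and record the mean silhouette score of each fit
scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

# a higher silhouette score means tighter, better separated clusters
best_k = max(scores, key=scores.get)
print(scores)
print('best k:', best_k)
```

On venue-frequency data the scores are usually much closer together than on this toy example, so the result is a hint rather than a verdict.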

In [117]:
# add clustering labels
try:
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
except ValueError:  # the column already exists from a previous run of this cell
    neighborhoods_venues_sorted.drop("Cluster Labels", axis=1, inplace=True)
    neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

rome_merged = rome_quart
#rome_merged.drop("Neighborhood", axis=1, inplace=True)
#rome_merged.rename(columns={'Name': 'Neighborhood'}, inplace=True)

# join the cluster labels and sorted venues onto rome_quart, which holds latitude/longitude for each neighborhood
rome_merged = rome_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

rome_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Rione Monti,41.89492,12.49434,2,Italian Restaurant,Pizza Place,Hotel,Ice Cream Shop,Wine Bar,Plaza,Chinese Restaurant,Sandwich Place,Cocktail Bar,Coffee Shop
1,Rione Trevi,41.902199,12.48606,1,Italian Restaurant,Hotel,Ice Cream Shop,Pizza Place,Café,Plaza,Toy / Game Store,Theater,Bed & Breakfast,Fountain
2,Rione Campo Marzio,41.907108,12.47786,2,Italian Restaurant,Hotel,Boutique,Ice Cream Shop,Jewelry Store,Wine Bar,Plaza,Women's Store,Pizza Place,Café
3,Rione Ponte,41.89904,12.46711,2,Italian Restaurant,Pizza Place,Roman Restaurant,Café,Trattoria/Osteria,Hotel,Breakfast Spot,Cocktail Bar,Plaza,Spa
4,Rione Regola,41.894791,12.47027,2,Italian Restaurant,Plaza,Sandwich Place,Hotel,Café,Seafood Restaurant,Pub,Trattoria/Osteria,Art Museum,Ice Cream Shop
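
Because the join is a left join on `rome_quart`, any neighborhood that has no row in `rome_grouped` would end up with `NaN` in the new columns, which is why the map cell below tests `np.isnan(cluster)`. The sketch below reproduces that behavior on two toy dataframes; the names `coords` and `labelled` and all their values are hypothetical stand-ins, not the notebook's data.

```python
import pandas as pd

# toy stand-in for rome_quart: one row per neighborhood with coordinates
coords = pd.DataFrame({
    'Neighborhood': ['A', 'B', 'C'],
    'Latitude': [41.90, 41.91, 41.92],
    'Longitude': [12.48, 12.49, 12.50],
})

# toy stand-in for neighborhoods_venues_sorted: 'B' is deliberately missing
labelled = pd.DataFrame({
    'Cluster Labels': [0, 2],
    'Neighborhood': ['A', 'C'],
})

# same join pattern as in the cell above
merged = coords.join(labelled.set_index('Neighborhood'), on='Neighborhood')
print(merged)

# neighborhoods that received no cluster label
missing = merged[merged['Cluster Labels'].isna()]
print(missing['Neighborhood'].tolist())  # ['B']
```

Note that a single missing label turns the whole `Cluster Labels` column into floats, so the labels need an `int(...)` cast before they can be used as list indices.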


In [119]:
# split the merged data into quartieri and rioni
a_quartieri_roma = rome_merged[rome_merged["Neighborhood"].str.contains("Quartiere")]
a_rioni_roma = rome_merged[rome_merged["Neighborhood"].str.contains("Rione")]

# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add rioni markers to the map
for lat, lon, poi, cluster in zip(a_rioni_roma['Latitude'], a_rioni_roma['Longitude'], a_rioni_roma['Neighborhood'], a_rioni_roma['Cluster Labels']):
    # skip neighborhoods that received no cluster label
    if np.isnan(cluster):
        continue
    cluster = int(cluster)  # labels become floats when the column contains NaN
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=400,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)

# add quartieri markers to the map (larger radius than the rioni)
for lat, lon, poi, cluster in zip(a_quartieri_roma['Latitude'], a_quartieri_roma['Longitude'], a_quartieri_roma['Neighborhood'], a_quartieri_roma['Cluster Labels']):
    # skip neighborhoods that received no cluster label
    if np.isnan(cluster):
        continue
    cluster = int(cluster)  # labels become floats when the column contains NaN
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.Circle(
        [lat, lon],
        radius=800,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
map_clusters
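
To read the map it helps to know which color each cluster gets. In the cell above, `ys` is only used for its length (equal to `kclusters`), so the palette is simply `kclusters` evenly spaced samples of the rainbow colormap. The sketch below rebuilds the palette on its own and prints the color assigned to each label; it assumes `kclusters = 3` as in the clustering cell.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 3  # same value as in the clustering cell

# sample the rainbow colormap at kclusters evenly spaced points
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(c) for c in colors_array]

# the map indexes the palette with cluster-1, so label 0 wraps around
# to the last color, label 1 gets the first one, and so on
for cluster in range(kclusters):
    print(cluster, '->', rainbow[cluster - 1])
```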

In [125]:
map_clusters.save('map_clusters.html')

## 5. Examine Clusters

#### Cluster 0 - Middle-to-low-class residential neighborhoods

In [120]:
rome_merged.loc[rome_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Quartiere Nomentano,41.914471,12.5221,0,Platform,Pizza Place,Ice Cream Shop,Mediterranean Restaurant,Pub,Italian Restaurant,Bookstore,Restaurant,Food Court,Dessert Shop
23,Quartiere Tiburtino,41.89785,12.52101,0,Pub,Italian Restaurant,Pizza Place,Other Nightlife,Hotel,Café,Dessert Shop,Restaurant,Trattoria/Osteria,Steakhouse
24,Quartiere Prenestino Labicano,41.8857,12.53521,0,Pizza Place,Italian Restaurant,Bar,Café,Cocktail Bar,Plaza,Bistro,Pub,Restaurant,Sandwich Place
25,Quartiere Tuscolano,41.871639,12.54055,0,Italian Restaurant,Steakhouse,Pizza Place,Wine Bar,Ice Cream Shop,Restaurant,Trattoria/Osteria,Bar,Lounge,Cupcake Shop
26,Quartiere Appio Latino,41.874611,12.51333,0,Pizza Place,Pub,Café,Restaurant,Italian Restaurant,Plaza,Trattoria/Osteria,Dessert Shop,Ice Cream Shop,Bakery
28,Quartiere Portuense,41.852829,12.45724,0,Italian Restaurant,Pizza Place,Gym / Fitness Center,Gym,Sushi Restaurant,Café,Soccer Field,Food & Drink Shop,Dessert Shop,Restaurant
29,Quartiere Gianicolense,41.874222,12.45766,0,Italian Restaurant,Pizza Place,Café,Hotel,Ice Cream Shop,Pub,Restaurant,Plaza,Dessert Shop,Brewery
33,Quartiere Monte Sacro,41.940369,12.53273,0,Pizza Place,Café,Plaza,Hotel,Italian Restaurant,Cocktail Bar,Chinese Restaurant,Japanese Restaurant,Park,Bistro
34,Quartiere Trieste,41.9249,12.51652,0,Café,Italian Restaurant,Ice Cream Shop,Pizza Place,Diner,Cocktail Bar,Burger Joint,Bar,Restaurant,Dessert Shop
36,Quartiere Prenestino Centocelle,41.884159,12.56621,0,Pizza Place,Wine Bar,Supermarket,Italian Restaurant,Restaurant,Light Rail Station,Soccer Field,Lounge,Seafood Restaurant,Electronics Store


#### Cluster 1 - Middle-to-high-class residential neighborhoods

In [122]:
rome_merged.loc[rome_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Rione Trevi,41.902199,12.48606,1,Italian Restaurant,Hotel,Ice Cream Shop,Pizza Place,Café,Plaza,Toy / Game Store,Theater,Bed & Breakfast,Fountain
6,Rione Pigna,41.897758,12.48032,1,Italian Restaurant,Plaza,Hotel,Café,Monument / Landmark,Ice Cream Shop,Historic Site,Chinese Restaurant,Pub,Church
8,Rione Ripa,41.882999,12.48204,1,Hotel,Park,Scenic Lookout,Plaza,Italian Restaurant,Greek Restaurant,Restaurant,Beer Bar,Garden,Church
12,Rione Ludovisi,41.907761,12.48955,1,Hotel,Italian Restaurant,Restaurant,Cocktail Bar,Japanese Restaurant,Middle Eastern Restaurant,Boarding House,Chinese Restaurant,Bed & Breakfast,Roman Restaurant
14,Rione Castro Pretorio,41.90551,12.50188,1,Hotel,Italian Restaurant,Hostel,Pizza Place,Roman Restaurant,Café,Bar,Filipino Restaurant,Ethiopian Restaurant,Korean Restaurant
17,Rione Prati,41.908272,12.46498,1,Italian Restaurant,Hotel,Café,Restaurant,Ice Cream Shop,Bed & Breakfast,Pizza Place,Plaza,Burger Joint,Vegetarian / Vegan Restaurant
19,Quartiere Parioli,41.93177,12.48622,1,Hotel,Café,Italian Restaurant,Coffee Shop,Pool,Multiplex,Skating Rink,Brazilian Restaurant,Lake,Supermarket
20,Quartiere Pinciano,41.918861,12.4841,1,Italian Restaurant,Plaza,Hotel,Art Museum,Museum,Restaurant,Garden,Café,Dog Run,Seafood Restaurant
30,Quartiere Aurelio,41.896099,12.43926,1,Hotel,Italian Restaurant,Café,Chinese Restaurant,Restaurant,Park,Dessert Shop,Pub,Mobile Phone Shop,Steakhouse
31,Quartiere Trionfale,41.921082,12.43806,1,Italian Restaurant,Pizza Place,Hotel,Café,Train Station,Plaza,Japanese Restaurant,Mediterranean Restaurant,Park,Cocktail Bar


#### Cluster 2 - Historical / Tourist Areas in Rome

In [123]:
rome_merged.loc[rome_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Rione Monti,41.89492,12.49434,2,Italian Restaurant,Pizza Place,Hotel,Ice Cream Shop,Wine Bar,Plaza,Chinese Restaurant,Sandwich Place,Cocktail Bar,Coffee Shop
2,Rione Campo Marzio,41.907108,12.47786,2,Italian Restaurant,Hotel,Boutique,Ice Cream Shop,Jewelry Store,Wine Bar,Plaza,Women's Store,Pizza Place,Café
3,Rione Ponte,41.89904,12.46711,2,Italian Restaurant,Pizza Place,Roman Restaurant,Café,Trattoria/Osteria,Hotel,Breakfast Spot,Cocktail Bar,Plaza,Spa
4,Rione Regola,41.894791,12.47027,2,Italian Restaurant,Plaza,Sandwich Place,Hotel,Café,Seafood Restaurant,Pub,Trattoria/Osteria,Art Museum,Ice Cream Shop
5,Rione Sant’Eustachio,41.899639,12.47491,2,Italian Restaurant,Hotel,Fountain,Restaurant,Café,Ice Cream Shop,Bistro,Plaza,Diner,Chocolate Shop
7,Rione Sant’Angelo,41.893539,12.47871,2,Italian Restaurant,Historic Site,Plaza,Scenic Lookout,Theater,Roman Restaurant,Café,Bakery,Restaurant,Cheese Shop
9,Rione Trastevere,41.888401,12.46616,2,Italian Restaurant,Pizza Place,Cocktail Bar,Café,Restaurant,Church,Bar,Plaza,Dessert Shop,Other Nightlife
10,Rione Borgo,41.90242,12.46194,2,Italian Restaurant,Hotel,Café,Ice Cream Shop,Trattoria/Osteria,Restaurant,Castle,Cocktail Bar,Wine Bar,Friterie
11,Rione Esquilino,41.894032,12.506,2,Italian Restaurant,Café,Hotel,Indian Restaurant,Bed & Breakfast,Plaza,Hostel,Lounge,Record Shop,Korean Restaurant
13,Rione Sallustiano,41.907829,12.49613,2,Italian Restaurant,Hotel,Pizza Place,Coffee Shop,Café,Sandwich Place,Seafood Restaurant,Monument / Landmark,Mediterranean Restaurant,Juice Bar


## Conclusions

It seems that venues are not uniformly distributed across Rome. With k = 3 clusters, the difference is clear:

1. The high-class residential part of Rome lies in the north-west of the city, with "Quartiere Giuliano Dalmata" as an outlier because of the sport-related venues nearby;
1. The medium-class residential part lies in a strip running from south-west to north-east. With k = 4 there was a significant split between the south-west end of the strip, predicted as belonging to middle-class inhabitants, and the north-east end, whose humbler venue distribution is associated with a suburban part of the city;
1. The historical part is scattered around the center of the city, with "Cinecittà" (Quartiere Appio Claudio) as an outlier.
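
The north/south question can also be put in numbers by cross-tabulating the cluster labels against the side of the city each neighborhood falls on. The sketch below is only an illustration: it uses twelve rows copied from the cluster tables above (not the full dataset) and a reference latitude of 41.9, which is an assumption standing in for the `latitude` value returned by the geocoder.

```python
import numpy as np
import pandas as pd

# a subset of rows copied from the cluster tables above
df = pd.DataFrame({
    'Neighborhood': [
        'Quartiere Nomentano', 'Quartiere Tuscolano', 'Quartiere Portuense',
        'Quartiere Monte Sacro', 'Rione Trevi', 'Rione Ripa',
        'Quartiere Parioli', 'Quartiere Trionfale', 'Rione Monti',
        'Rione Campo Marzio', 'Rione Trastevere', 'Rione Sallustiano',
    ],
    'Latitude': [
        41.914471, 41.871639, 41.852829, 41.940369, 41.902199, 41.882999,
        41.93177, 41.921082, 41.89492, 41.907108, 41.888401, 41.907829,
    ],
    'Cluster Labels': [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
})

center_lat = 41.9  # assumed reference latitude, not taken from the notebook

# label each neighborhood by the side of the reference latitude it lies on
df['Side'] = np.where(df['Latitude'] >= center_lat, 'North', 'South')

# count neighborhoods per cluster and side
table = pd.crosstab(df['Cluster Labels'], df['Side'])
print(table)
```

A cluster that sat almost entirely on one side would support the dichotomy; with so few rows this subset says nothing conclusive, and the full `rome_merged` dataframe would be needed for a real check.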

This was only a preparatory analysis, ahead of the final project.