# IBM Applied Data Science Capstone Course by Coursera

### Paris's battle of neighborhoods

   -  Build a dataframe of neighborhoods in Paris, France by web scraping the data from a Wikipedia page
   -  Get the geographical coordinates of the neighborhoods
   -  Obtain the venue data for the neighborhoods from the Foursquare API
   -  Explore and cluster the neighborhoods
   -  Select the best cluster to open a new restaurant in Paris.

### Installing and Importing the required Libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from bs4 import BeautifulSoup # library to parse HTML and XML documents

from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans
!conda install -c anaconda xlrd --yes
!conda install -c conda-forge folium=0.5.0 --yes

import folium # map rendering library

print("Libraries imported.")


Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - xlrd


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    xlrd-1.2.0                 |             py_0         108 KB  anaconda
    ca-certificates-2020.1.1   |                0         132 KB  anaconda
    certifi-2020.6.20          |           py36_0         160 KB  anaconda
    openssl-1.1.1g             |       h7b6447c_0         3.8 MB  anaconda
    ------------------------------------------------------------
                                           Total:         4.2 MB

The following packages will be UPDATED:

    ca-certificates: 2020.1.1-0        --> 2020.1.1-0        anaconda
    certifi:         2020.6.20-py36_0  --> 2020.6.20-py36_0  anaconda
    openssl:         1.1.1g-h7b6447c_0 --> 1.1.1g-h7b6447c_0 anaconda
    xlrd:            1.2.0-py_0       

In [2]:
!pip install geocoder
import geocoder # to get coordinates

Collecting geocoder
  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
Collecting ratelim (from geocoder)
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fdd592cad49c/ratelim-0.1.6-py2.py3-none-any.whl
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


### Scraping the Wikipedia page

In [3]:
source = requests.get('https://en.wikipedia.org/wiki/Quarters_of_Paris').text
soup=BeautifulSoup(source,'lxml')
print(soup.title)
from IPython.display import display_html
table = str(soup.table)
display_html(table,raw=True)

<title>Quarters of Paris - Wikipedia</title>


Arrondissement (Districts),Quartiers (Quarters),Quartiers (Quarters).1,Population in 1999[3],Area (hectares)[3],Map
"1st arrondissement (Called ""du Louvre"")",1st,Saint-Germain-l'Auxerrois,1672,86.9,
"1st arrondissement (Called ""du Louvre"")",2nd,Les Halles,8984,41.2,
"1st arrondissement (Called ""du Louvre"")",3rd,Palais-Royal,3195,27.4,
"1st arrondissement (Called ""du Louvre"")",4th,Place-Vendôme,3044,26.9,
"2nd arrondissement (Called ""de la Bourse"")",5th,Gaillon,1345,18.8,
"2nd arrondissement (Called ""de la Bourse"")",6th,Vivienne,2917,24.4,
"2nd arrondissement (Called ""de la Bourse"")",7th,Mail,5783,27.8,
"2nd arrondissement (Called ""de la Bourse"")",8th,Bonne-Nouvelle,9595,28.2,
"3rd arrondissement (Called ""du Temple"")",9th,Arts-et-Métiers,9560,31.8,
"3rd arrondissement (Called ""du Temple"")",10th,Enfants-Rouges,8562,27.2,


### Convert the HTML table to a pandas DataFrame

In [4]:
data = pd.read_html(table)
df=data[0]
df.head(10)

Unnamed: 0,Arrondissement(Districts),Quartiers(Quarters),Quartiers(Quarters).1,Population in1999[3],Area(hectares)[3],Map
0,"1st arrondissement(Called ""du Louvre"")",1st,Saint-Germain-l'Auxerrois,1672,86.9,
1,"1st arrondissement(Called ""du Louvre"")",2nd,Les Halles,8984,41.2,
2,"1st arrondissement(Called ""du Louvre"")",3rd,Palais-Royal,3195,27.4,
3,"1st arrondissement(Called ""du Louvre"")",4th,Place-Vendôme,3044,26.9,
4,"2nd arrondissement(Called ""de la Bourse"")",5th,Gaillon,1345,18.8,
5,"2nd arrondissement(Called ""de la Bourse"")",6th,Vivienne,2917,24.4,
6,"2nd arrondissement(Called ""de la Bourse"")",7th,Mail,5783,27.8,
7,"2nd arrondissement(Called ""de la Bourse"")",8th,Bonne-Nouvelle,9595,28.2,
8,"3rd arrondissement(Called ""du Temple"")",9th,Arts-et-Métiers,9560,31.8,
9,"3rd arrondissement(Called ""du Temple"")",10th,Enfants-Rouges,8562,27.2,


In [5]:
#shape
df.shape

(80, 6)

### Evaluating for Missing Data

In [6]:
missing_data = df.isnull()
missing_data.head(5)

Unnamed: 0,Arrondissement(Districts),Quartiers(Quarters),Quartiers(Quarters).1,Population in1999[3],Area(hectares)[3],Map
0,False,False,False,False,False,True
1,False,False,False,False,False,True
2,False,False,False,False,False,True
3,False,False,False,False,False,True
4,False,False,False,False,False,True
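
The boolean frame above is easier to act on once it is reduced to one count per column. The following is a minimal sketch on a toy frame; the column values are illustrative stand-ins, not the scraped table itself.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the scraped table; the values are illustrative only.
toy = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "population": [1672, 8984, 3195],
    "Map": [np.nan, np.nan, np.nan],
})

# isnull() gives a boolean frame; sum() counts the True values per column.
missing_counts = toy.isnull().sum()
print(missing_counts)

# Columns with no values at all are candidates for dropping.
empty_columns = missing_counts[missing_counts == len(toy)].index.tolist()
print(empty_columns)
```

A per-column count makes it obvious when a column such as `Map` carries no data, which is a reasonable basis for dropping it in the next step.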


##### Drop the columns that we don't need

In [7]:
data_f=df.copy()

In [8]:
data_f.columns

Index(['Arrondissement(Districts)', 'Quartiers(Quarters)',
       'Quartiers(Quarters).1', 'Population in1999[3]', 'Area(hectares)[3]',
       'Map'],
      dtype='object')

In [9]:
# drop columns
data_f.drop(['Quartiers(Quarters)', 'Area(hectares)[3]','Map'], axis=1, inplace=True)

In [10]:
data_f.head()

Unnamed: 0,Arrondissement(Districts),Quartiers(Quarters).1,Population in1999[3]
0,"1st arrondissement(Called ""du Louvre"")",Saint-Germain-l'Auxerrois,1672
1,"1st arrondissement(Called ""du Louvre"")",Les Halles,8984
2,"1st arrondissement(Called ""du Louvre"")",Palais-Royal,3195
3,"1st arrondissement(Called ""du Louvre"")",Place-Vendôme,3044
4,"2nd arrondissement(Called ""de la Bourse"")",Gaillon,1345


#### Rename columns

In [11]:
data_f.columns = ['Borough','Neighborhood','population']
                     

In [12]:
# the new dataset 
data_f.head()

Unnamed: 0,Borough,Neighborhood,population
0,"1st arrondissement(Called ""du Louvre"")",Saint-Germain-l'Auxerrois,1672
1,"1st arrondissement(Called ""du Louvre"")",Les Halles,8984
2,"1st arrondissement(Called ""du Louvre"")",Palais-Royal,3195
3,"1st arrondissement(Called ""du Louvre"")",Place-Vendôme,3044
4,"2nd arrondissement(Called ""de la Bourse"")",Gaillon,1345
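
The drop and rename steps can also be written as one chained expression that leaves the source frame untouched. This is only a sketch: the two rows are copied from the table above, and renaming by mapping is shown as an alternative to assigning `columns` positionally.

```python
import pandas as pd

# Two rows copied from the scraped table for illustration.
toy = pd.DataFrame({
    "Arrondissement(Districts)": ["1st arrondissement", "2nd arrondissement"],
    "Quartiers(Quarters)": ["2nd", "5th"],
    "Quartiers(Quarters).1": ["Les Halles", "Gaillon"],
    "Population in1999[3]": [8984, 1345],
    "Area(hectares)[3]": [41.2, 18.8],
    "Map": [None, None],
})

# drop() and rename() both return new frames, so `toy` is not modified.
tidy = (
    toy.drop(columns=["Quartiers(Quarters)", "Area(hectares)[3]", "Map"])
       .rename(columns={
           "Arrondissement(Districts)": "Borough",
           "Quartiers(Quarters).1": "Neighborhood",
           "Population in1999[3]": "population",
       })
)
print(tidy)
```

Renaming with an explicit mapping does not depend on column order, so it keeps working if the Wikipedia table gains or reorders columns.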


### Getting the geographical coordinates

In [13]:
# define a function to get coordinates
def get_coordinate(data):
    # initialize your variable to None
    coordinate = None
    # loop until you get the coordinates
    while(coordinate is None):
        g = geocoder.arcgis('{}, Paris, France'.format(data))
        coordinate = g.latlng
    return coordinate
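
The loop above never terminates if the geocoder keeps returning `None` for a name. A bounded variant might look like the sketch below. The function name, the `lookup` parameter and the retry policy are assumptions made for illustration; in the notebook, `lookup` could wrap `geocoder.arcgis(...).latlng`.

```python
import time

def get_coordinate_with_retries(query, lookup, max_attempts=5, delay_seconds=0.0):
    """Call lookup(query) until it returns coordinates or attempts run out.

    `lookup` is any callable that returns [lat, lng] or None (hypothetical
    interface, chosen so the sketch runs without network access).
    """
    for attempt in range(max_attempts):
        coordinate = lookup(query)
        if coordinate is not None:
            return coordinate
        # Pause between attempts, but not after the final one.
        if attempt != max_attempts - 1:
            time.sleep(delay_seconds)
    return None

# Fake lookup that fails twice and then succeeds.
responses = iter([None, None, [48.86319, 2.34201]])
result = get_coordinate_with_retries("Les Halles, Paris, France",
                                     lambda q: next(responses))

# A lookup that never succeeds gives up after max_attempts.
never = get_coordinate_with_retries("nowhere", lambda q: None, max_attempts=2)
print(result, never)
```

Returning `None` after a fixed number of attempts lets the caller decide how to handle neighborhoods that could not be geocoded instead of hanging the cell.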

In [14]:
# call the function to get the coordinates
cd = [ get_coordinate(Neighborhood) for Neighborhood in data_f['Neighborhood'].tolist() ]

In [15]:
cd

[[48.859710000000064, 2.340240000000051],
 [48.86319000000003, 2.342010000000073],
 [48.863500000000045, 2.338760000000036],
 [48.86778000000004, 2.3301100000000474],
 [48.869020000000035, 2.3344500111085558],
 [48.87109994892962, 2.341280062178989],
 [48.84321995614296, 2.322170080500852],
 [48.87106000000006, 2.3479100366437633],
 [48.81769380634804, 2.3340533224475077],
 [48.86294000000004, 2.361239981678191],
 [48.863570025535296, 2.3608899233943887],
 [48.86061042170753, 2.356056113702135],
 [48.85847000000007, 2.350530000000049],
 [48.85480000000007, 2.353930000000048],
 [48.85361000000006, 2.3660800000000677],
 [48.85312999884621, 2.348860049483026],
 [48.84840110617736, 2.350754544036743],
 [48.84214000000003, 2.355670000000032],
 [48.84092000000004, 2.3444300000000453],
 [48.84982000000008, 2.344090000000051],
 [48.85307000000006, 2.343270000000075],
 [48.849140000000034, 2.338640000000055],
 [48.842460000000074, 2.3347700000000486],
 [48.853770000000054, 2.33331000000004],
 [

In [16]:
# create a dataframe with Latitude and Longitude from the list "cd"
df_cd = pd.DataFrame(cd, columns=['Latitude', 'Longitude'])

In [17]:
# add the coordinates from df_cd to our dataframe
data_f['Latitude'] = df_cd['Latitude']
data_f['Longitude'] = df_cd['Longitude']

In [18]:
# check our new dataframe
data_f.head()

Unnamed: 0,Borough,Neighborhood,population,Latitude,Longitude
0,"1st arrondissement(Called ""du Louvre"")",Saint-Germain-l'Auxerrois,1672,48.85971,2.34024
1,"1st arrondissement(Called ""du Louvre"")",Les Halles,8984,48.86319,2.34201
2,"1st arrondissement(Called ""du Louvre"")",Palais-Royal,3195,48.8635,2.33876
3,"1st arrondissement(Called ""du Louvre"")",Place-Vendôme,3044,48.86778,2.33011
4,"2nd arrondissement(Called ""de la Bourse"")",Gaillon,1345,48.86902,2.33445


In [19]:
# save the DataFrame as CSV file
data_f.to_csv("data.csv", index=False)

In [20]:
# get the coordinates of Paris 
address = 'Paris,France'
geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Paris, France are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Paris, France are 48.8566969, 2.3514616.


### Visualizing the Neighborhoods

In [21]:
my_map = folium.Map(location=[48.8566969, 2.3514616], zoom_start=10)

for latitude, longitude, borough, neighborhood in zip(data_f['Latitude'], data_f['Longitude'], data_f['Borough'], data_f['Neighborhood']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [latitude, longitude],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#FF5733',
        fill_opacity=0.7,
        parse_html=False).add_to(my_map)
my_map

In [22]:
# save the map as HTML file
my_map.save('map_paris.html')

### Foursquare API time

<h6> Define Foursquare Credentials and Version </h6>

In [23]:
# define Foursquare credentials and version
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180604' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)


Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET


#### Up to 100 popular venues within 2 km of each neighborhood

In [26]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(data_f['Latitude'], data_f['Longitude'], data_f['Neighborhood']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
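
The loop above indexes `categories[0]` directly, which raises an `IndexError` for any venue that has no category. The parsing step can be isolated into a small helper that tolerates this case. This is a sketch only: `extract_venues` is a hypothetical name, and the sample payload is hand-written to mirror the keys the loop accesses, not a real API response.

```python
def extract_venues(payload, neighborhood, lat, lng):
    """Flatten one venues/explore payload into a list of row tuples."""
    items = payload["response"]["groups"][0]["items"]
    rows = []
    for item in items:
        venue = item["venue"]
        # Fall back to None when a venue carries no category.
        categories = venue.get("categories") or []
        category = categories[0]["name"] if categories else None
        rows.append((
            neighborhood, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category,
        ))
    return rows

# Hand-written payload with the same nesting as the loop above expects.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Pont des Arts",
               "location": {"lat": 48.858565, "lng": 2.337635},
               "categories": [{"name": "Bridge"}]}},
    {"venue": {"name": "Uncategorized venue",  # hypothetical entry
               "location": {"lat": 48.86, "lng": 2.34},
               "categories": []}},
]}]}}

rows = extract_venues(sample, "Saint-Germain-l'Auxerrois", 48.85971, 2.34024)
print(rows)
```

Keeping the parsing separate from the HTTP call also makes it possible to test the row-building logic without spending API quota.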
    

In [27]:
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['Neighborhood', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head()

(7801, 7)


Unnamed: 0,Neighborhood,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Saint-Germain-l'Auxerrois,48.85971,2.34024,Cour Carrée du Louvre,48.86036,2.338543,Pedestrian Plaza
1,Saint-Germain-l'Auxerrois,48.85971,2.34024,Place du Louvre,48.859841,2.340822,Plaza
2,Saint-Germain-l'Auxerrois,48.85971,2.34024,La Vénus de Milo (Vénus de Milo),48.859943,2.337234,Exhibit
3,Saint-Germain-l'Auxerrois,48.85971,2.34024,Église Saint-Germain-l'Auxerrois (Église Saint...,48.85952,2.341306,Church
4,Saint-Germain-l'Auxerrois,48.85971,2.34024,Pont des Arts,48.858565,2.337635,Bridge


#### Venue categories found in Paris

In [28]:
venues_df['VenueCategory'].unique()

array(['Pedestrian Plaza', 'Plaza', 'Exhibit', 'Church', 'Bridge',
       'Coffee Shop', 'Art Museum', 'Museum', 'Wine Bar',
       'French Restaurant', 'Historic Site', 'Theater', 'Cosmetics Shop',
       'Restaurant', 'Hotel', 'Garden', 'Toy / Game Store',
       'Cocktail Bar', 'Italian Restaurant', 'Lebanese Restaurant',
       'Pastry Shop', 'Sandwich Place', 'Bookstore', "Women's Store",
       'Udon Restaurant', 'Ice Cream Shop', 'Seafood Restaurant',
       'Japanese Restaurant', 'Breton Restaurant', 'Burger Joint',
       'Bakery', 'Souvenir Shop', 'Art Gallery', 'Creperie', 'Fountain',
       'Electronics Store', 'Clothing Store', 'Bistro', 'Chocolate Shop',
       'Szechuan Restaurant', 'Tea Room', 'Pizza Place',
       'Furniture / Home Store', 'Park', 'Beer Bar', 'Miscellaneous Shop',
       'Comic Shop', 'Sushi Restaurant', 'Tapas Restaurant',
       'Bubble Tea Shop', 'Indie Movie Theater', 'Comedy Club',
       'Candy Store', "Men's Store", 'Opera House',
       'Argent

In [29]:
print('we have {} unique categories'.format(len(venues_df['VenueCategory'].unique())))

we have 228 unique categories


In [30]:
# check if the results of category contain "Hotel"
"Hotel" in venues_df['VenueCategory'].unique()

True
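
Beyond checking membership, it can help to see how often each category occurs. The sketch below uses a small made-up list of categories rather than the real `venues_df`.

```python
import pandas as pd

# Made-up categories standing in for venues_df['VenueCategory'].
toy_venues = pd.DataFrame({
    "VenueCategory": ["Hotel", "French Restaurant", "Hotel",
                      "Bakery", "French Restaurant", "Hotel"],
})

# value_counts() sorts categories from most to least frequent.
category_counts = toy_venues["VenueCategory"].value_counts()
print(category_counts.head(3))

# nunique() is a shorthand for len(series.unique()).
n_unique = toy_venues["VenueCategory"].nunique()
has_hotel = "Hotel" in category_counts.index
print(n_unique, has_hotel)
```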

### Analyze Each Neighborhood

In [31]:
# one hot encoding
my_data = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column 
my_data['Neighborhood'] = venues_df['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [my_data.columns[-1]] + list(my_data.columns[:-1])
my_data = my_data[fixed_columns]

print(my_data.shape)
my_data.head()

(7801, 229)


Unnamed: 0,Neighborhood,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auvergne Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Basque Restaurant,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Breton Restaurant,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Butcher,Café,Cambodian Restaurant,Canal,Canal Lock,Candy Store,Caribbean Restaurant,Cemetery,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Circus,Circus School,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Corsican Restaurant,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doner Restaurant,Drive-in Theater,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Film Studio,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,General Entertainment,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Island,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jiangxi Restaurant,Juice Bar,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Library,Liquor Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Noodle House,Office,Opera House,Organic Grocery,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pharmacy,Photography Lab,Pizza Place,Planetarium,Playground,Plaza,Pool,Pop-Up Shop,Portuguese Restaurant,Provençal Restaurant,Pub,Racecourse,Radio Station,Ramen Restaurant,Record Shop,Recording Studio,Restaurant,Rock Club,Roof Deck,Rugby Stadium,Russian Restaurant,Salad Place,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Seafood Restaurant,Shanxi Restaurant,Shopping Mall,Shopping Plaza,Soccer Field,Soccer Stadium,South American Restaurant,Southern / Soul Food Restaurant,Southwestern French Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Street Art,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Park Ride / Attraction,Tourist Information Center,Toy / Game Store,Track,Trail,Train Station,Trattoria/Osteria,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Zoo Exhibit
0,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Saint-Germain-l'Auxerrois,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
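
With 228 categories the one-hot table is hard to read, so here is the same encoding on a three-row toy frame. The rows are invented for illustration; only the column names follow the notebook.

```python
import pandas as pd

# Invented rows; column names match venues_df in the notebook.
toy_venues = pd.DataFrame({
    "Neighborhood": ["Archives", "Archives", "Arsenal"],
    "VenueCategory": ["Hotel", "Bakery", "Hotel"],
})

# One column per category, with a 1 marking the category of each venue.
one_hot = pd.get_dummies(toy_venues["VenueCategory"], dtype=int)

# Put the neighborhood name in front of the indicator columns.
one_hot.insert(0, "Neighborhood", toy_venues["Neighborhood"])
print(one_hot)
```

Using `insert` here is a design choice: it raises an error if a category were itself called "Neighborhood", whereas plain column assignment would silently overwrite that category's indicator column.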


In [32]:
df_grouped = my_data.groupby(["Neighborhood"]).mean().reset_index()

print(df_grouped.shape)
df_grouped

(79, 229)


Unnamed: 0,Neighborhood,African Restaurant,American Restaurant,Antique Shop,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auvergne Restaurant,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Basque Restaurant,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Bike Rental / Bike Share,Bistro,Boat or Ferry,Bookstore,Botanical Garden,Boutique,Boxing Gym,Brasserie,Brazilian Restaurant,Breakfast Spot,Breton Restaurant,Brewery,Bridge,Bubble Tea Shop,Burger Joint,Butcher,Café,Cambodian Restaurant,Canal,Canal Lock,Candy Store,Caribbean Restaurant,Cemetery,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Circus,Circus School,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Convenience Store,Corsican Restaurant,Cosmetics Shop,Creperie,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doner Restaurant,Drive-in Theater,Eastern European Restaurant,Electronics Store,Ethiopian Restaurant,Event Space,Exhibit,Falafel Restaurant,Farm,Farmers Market,Film Studio,Fish & Chips Shop,Fish Market,Flea Market,Food & Drink Shop,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gastropub,General Entertainment,Gluten-free Restaurant,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Health Food Store,Historic Site,History Museum,Hookah Bar,Hostel,Hotel,Hotel Bar,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Island,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jiangxi Restaurant,Juice Bar,Korean Restaurant,Latin American Restaurant,Lebanese Restaurant,Library,Liquor Store,Lounge,Market,Martial Arts Dojo,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Store,Music Venue,Nightclub,Noodle House,Office,Opera House,Organic Grocery,Outdoor Sculpture,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Persian Restaurant,Peruvian Restaurant,Pharmacy,Photography Lab,Pizza Place,Planetarium,Playground,Plaza,Pool,Pop-Up Shop,Portuguese Restaurant,Provençal Restaurant,Pub,Racecourse,Radio Station,Ramen Restaurant,Record Shop,Recording Studio,Restaurant,Rock Club,Roof Deck,Rugby Stadium,Russian Restaurant,Salad Place,Sandwich Place,Scandinavian Restaurant,Scenic Lookout,Science Museum,Seafood Restaurant,Shanxi Restaurant,Shopping Mall,Shopping Plaza,Soccer Field,Soccer Stadium,South American Restaurant,Southern / Soul Food Restaurant,Southwestern French Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Street Art,Supermarket,Sushi Restaurant,Szechuan Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tattoo Parlor,Tea Room,Tech Startup,Tennis Court,Tennis Stadium,Thai Restaurant,Theater,Theme Park Ride / Attraction,Tourist Information Center,Toy / Game Store,Track,Trail,Train Station,Trattoria/Osteria,Turkish Restaurant,Udon Restaurant,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wine Shop,Women's Store,Zoo Exhibit
0,Archives,0.0,0.0,0.0,0.0,0.01,0.02,0.02,0.0,0.01,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.07,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.04,0.0,0.0,0.0,0.01,0.05,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.05,0.0,0.0,0.0
1,Arsenal,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.01,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.06,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.08,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.0,0.0,0.01,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.03,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.04,0.0,0.0,0.0
2,Arts-et-Métiers,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.09,0.04,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.19,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0
3,Auteuil,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.15,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.04,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.01,0.03,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0
4,Batignolles,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.01,0.01,0.04,0.04,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.19,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.08,0.01,0.0,0.0,0.01,0.0,0.0,0.08,0.02,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0
5,Bel-Air,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.02,0.01,0.0,0.01,0.0,0.0,0.01,0.04,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.17,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.04,0.0,0.02,0.0,0.0,0.01,0.07,0.01,0.01,0.01,0.0,0.01,0.0,0.03,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Belleville,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.05,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.02,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.02,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.08,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.07,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.02,0.03,0.0,0.0,0.0
7,Bercy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.04,0.03,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.04,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.12,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.03,0.04,0.0,0.0,0.0
8,Bonne-Nouvelle,0.0,0.0,0.0,0.0,0.01,0.02,0.03,0.0,0.0,0.0,0.0,0.0,0.06,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.06,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.05,0.0,0.01,0.0,0.01,0.0,0.0,0.04,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.09,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.01,0.0
9,Chaillot,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.14,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.17,0.01,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [33]:
len(df_grouped[df_grouped["Hotel"] > 0])

73

#### Create a new DataFrame for Hotel data only

In [34]:
hotel_data = df_grouped[["Neighborhood","Hotel"]]

In [35]:
hotel_data.head()

Unnamed: 0,Neighborhood,Hotel
0,Archives,0.02
1,Arsenal,0.01
2,Arts-et-Métiers,0.04
3,Auteuil,0.02
4,Batignolles,0.08


### 7. Cluster Neighborhoods
Run k-means to cluster the neighborhoods in Paris into 3 clusters, using the Hotel frequency column as the only feature.

In [36]:
# set number of clusters
kclusters = 3

# keep only the numeric Hotel column as the clustering feature
my_clustering = hotel_data.drop(columns=["Neighborhood"])

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(my_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([0, 0, 0, 0, 2, 2, 0, 0, 0, 1], dtype=int32)
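
The label ids returned by k-means are arbitrary, so a label of 0 does not by itself mean "fewest hotels". A minimal sketch of one way to interpret them, assuming a small subset of the Hotel values shown in this notebook: rank the clusters by their center value. The names `sample`, `rank_of_label` and the column `Hotel Rank` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hotel frequencies copied from rows displayed in this notebook (illustrative subset)
sample = pd.DataFrame({
    "Neighborhood": ["Archives", "Arsenal", "Auteuil",
                     "Batignolles", "Vivienne", "Javel",
                     "Chaillot", "Les Ternes", "Plaine Monceau"],
    "Hotel": [0.02, 0.01, 0.02, 0.08, 0.08, 0.07, 0.17, 0.17, 0.16],
})

# cluster on the single Hotel column, as in the cell above
model = KMeans(n_clusters=3, random_state=0, n_init=10).fit(sample[["Hotel"]])
sample["Cluster Labels"] = model.labels_

# sort the cluster centers so that rank 0 is the lowest hotel frequency
centers = model.cluster_centers_.ravel()
order = np.argsort(centers)
rank_of_label = {int(label): rank for rank, label in enumerate(order)}
sample["Hotel Rank"] = sample["Cluster Labels"].map(rank_of_label)

print(sample)
```

Reading the rank column instead of the raw label makes the later cluster-by-cluster tables easier to compare across reruns.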

In [37]:
# create a new dataframe that holds the hotel frequency and the cluster label for each neighborhood
data_merged = hotel_data.copy()

# add clustering labels
data_merged["Cluster Labels"] = kmeans.labels_

In [39]:
data_merged.head()

Unnamed: 0,Neighborhood,Hotel,Cluster Labels
0,Archives,0.02,0
1,Arsenal,0.01,0
2,Arts-et-Métiers,0.04,0
3,Auteuil,0.02,0
4,Batignolles,0.08,2


In [41]:
# merge data_merged with data_f to add latitude/longitude for each neighborhood
data_merged = data_merged.join(data_f.set_index("Neighborhood"), on="Neighborhood")
print(data_merged.shape)
data_merged.head() 

(79, 7)


Unnamed: 0,Neighborhood,Hotel,Cluster Labels,Borough,population,Latitude,Longitude
0,Archives,0.02,0,"3rd arrondissement(Called ""du Temple"")",8609,48.86357,2.36089
1,Arsenal,0.01,0,"4th arrondissement(Called ""de l'Hôtel-de-Ville"")",9474,48.85361,2.36608
2,Arts-et-Métiers,0.04,0,"3rd arrondissement(Called ""du Temple"")",9560,48.817694,2.334053
3,Auteuil,0.02,0,"16th arrondissement(Called ""de Passy"")",67967,48.84694,2.26366
4,Batignolles,0.08,2,"17th arrondissement(Called ""des Batignolles-Mo...",38691,48.88333,2.31667
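
The join above is a left join on the neighborhood name, so any name that is spelled differently in the two tables would silently end up with missing coordinates. The sketch below shows one way to check for that, under the assumption that the coordinate table has one row per neighborhood. The two real rows are copied from the output above; `Unknown-Quarter` and its value are made up purely to trigger the unmatched case.

```python
import pandas as pd

# left table: hotel frequency per neighborhood
# ("Unknown-Quarter" is a hypothetical name with a made-up value)
left = pd.DataFrame({
    "Neighborhood": ["Archives", "Arsenal", "Unknown-Quarter"],
    "Hotel": [0.02, 0.01, 0.05],
})

# right table: coordinates, standing in for data_f
coords = pd.DataFrame({
    "Neighborhood": ["Archives", "Arsenal"],
    "Latitude": [48.86357, 48.85361],
    "Longitude": [2.36089, 2.36608],
})

# same join pattern as the notebook cell
merged = left.join(coords.set_index("Neighborhood"), on="Neighborhood")

# rows of the left table with no match keep NaN in the joined columns
missing = merged.loc[merged["Latitude"].isna(), "Neighborhood"].tolist()

print(merged.shape)
print(missing)
```

An empty `missing` list on the real data would confirm that every neighborhood can be placed on the map.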


In [42]:
# sort the results by Cluster Labels
print(data_merged.shape)
data_merged.sort_values(["Cluster Labels"], inplace=True)
data_merged

(79, 7)


Unnamed: 0,Neighborhood,Hotel,Cluster Labels,Borough,population,Latitude,Longitude
0,Archives,0.02,0,"3rd arrondissement(Called ""du Temple"")",8609,48.86357,2.36089
38,Mail,0.03,0,"2nd arrondissement(Called ""de la Bourse"")",5783,48.84322,2.32217
40,Monnaie,0.03,0,"6th arrondissement(Called ""du Luxembourg"")",6185,48.85307,2.34327
41,Montparnasse,0.04,0,"14th arrondissement(Called ""de l'Observatoire"")",18570,48.84313,2.32129
43,Notre-Dame,0.04,0,"4th arrondissement(Called ""de l'Hôtel-de-Ville"")",4087,48.85313,2.34886
44,Notre-Dame-des-Champs,0.02,0,"6th arrondissement(Called ""du Luxembourg"")",24731,48.84246,2.33477
45,Odéon,0.03,0,"6th arrondissement(Called ""du Luxembourg"")",8833,48.84914,2.33864
49,Picpus,0.03,0,"12th arrondissement(Called ""de Reuilly"")",62947,48.84442,2.40227
52,Plaisance,0.02,0,"14th arrondissement(Called ""de l'Observatoire"")",57229,48.84455,2.38994
53,Pont-de-Flandre,0.0,0,"19th arrondissement(Called ""des Buttes-Chaumont"")",24584,48.89437,2.38153


#### Let's visualize the resulting clusters

In [45]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
for lat, lon, poi, cluster in zip(data_merged['Latitude'], data_merged['Longitude'], data_merged['Neighborhood'], data_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
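
The map cell picks each marker color with `rainbow[cluster-1]`. Because Python lists accept negative indices, label 0 selects the last palette entry, so every label still gets its own color. The sketch below, which assumes `kclusters = 3` as above, prints both the direct lookup and the shifted lookup so the mapping can be read off when building a legend. `palette` and `color_for` are hypothetical names used only here.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 3  # same value as in the clustering cell

# one hex color per cluster, evenly spaced along the rainbow colormap
palette = [colors.rgb2hex(c) for c in cm.rainbow(np.linspace(0, 1, kclusters))]


def color_for(cluster_label):
    """Return the hex color for a label in 0..kclusters-1 (hypothetical helper)."""
    return palette[int(cluster_label)]


# compare direct indexing with the shifted indexing used in the map cell
for label in range(kclusters):
    print(label, color_for(label), palette[label - 1])
```

Either indexing scheme gives three distinct colors; the direct one keeps label and palette position aligned.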

In [46]:
# save the map as HTML file
map_clusters.save('map_clusters1.html')

### 8. Examine Clusters

#### Cluster 0

In [47]:
data_merged.loc[data_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Hotel,Cluster Labels,Borough,population,Latitude,Longitude
0,Archives,0.02,0,"3rd arrondissement(Called ""du Temple"")",8609,48.86357,2.36089
38,Mail,0.03,0,"2nd arrondissement(Called ""de la Bourse"")",5783,48.84322,2.32217
40,Monnaie,0.03,0,"6th arrondissement(Called ""du Luxembourg"")",6185,48.85307,2.34327
41,Montparnasse,0.04,0,"14th arrondissement(Called ""de l'Observatoire"")",18570,48.84313,2.32129
43,Notre-Dame,0.04,0,"4th arrondissement(Called ""de l'Hôtel-de-Ville"")",4087,48.85313,2.34886
44,Notre-Dame-des-Champs,0.02,0,"6th arrondissement(Called ""du Luxembourg"")",24731,48.84246,2.33477
45,Odéon,0.03,0,"6th arrondissement(Called ""du Luxembourg"")",8833,48.84914,2.33864
49,Picpus,0.03,0,"12th arrondissement(Called ""de Reuilly"")",62947,48.84442,2.40227
52,Plaisance,0.02,0,"14th arrondissement(Called ""de l'Observatoire"")",57229,48.84455,2.38994
53,Pont-de-Flandre,0.0,0,"19th arrondissement(Called ""des Buttes-Chaumont"")",24584,48.89437,2.38153


#### Cluster 1

In [48]:
data_merged.loc[data_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Hotel,Cluster Labels,Borough,population,Latitude,Longitude
10,Champs-Élysées,0.15,1,"8th arrondissement(Called ""de l'Élysée"")",4614,48.86906,2.30993
9,Chaillot,0.17,1,"16th arrondissement(Called ""de Passy"")",21213,48.86894,2.29151
19,Faubourg-du-Roule,0.14,1,"8th arrondissement(Called ""de l'Élysée"")",10038,48.876818,2.299611
54,Porte-Dauphine,0.12,1,"16th arrondissement(Called ""de Passy"")",27423,48.86753,2.28026
37,Les Ternes,0.17,1,"17th arrondissement(Called ""des Batignolles-Mo...",39137,48.87892,2.29325
25,Gros-Caillou,0.12,1,"7th arrondissement(Called ""du Palais-Bourbon"")",25156,48.85838,2.29901
51,Plaine Monceau,0.16,1,"17th arrondissement(Called ""des Batignolles-Mo...",38958,48.87776,2.310777


#### Cluster 2

In [49]:
data_merged.loc[data_merged['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Hotel,Cluster Labels,Borough,population,Latitude,Longitude
76,Vivienne,0.08,2,"2nd arrondissement(Called ""de la Bourse"")",2917,48.8711,2.34128
31,La Madeleine,0.09,2,"8th arrondissement(Called ""de l'Élysée"")",6045,48.87001,2.32491
32,La Muette,0.1,2,"16th arrondissement(Called ""de Passy"")",45214,48.858248,2.273207
73,Salpêtrière,0.06,2,"13th arrondissement(Called ""des Gobelins"")",18246,48.8353,2.3583
28,Javel,0.07,2,"15th arrondissement(Called ""de Vaugirard"")",49092,48.84387,2.28613
4,Batignolles,0.08,2,"17th arrondissement(Called ""des Batignolles-Mo...",38691,48.88333,2.31667
70,Saint-Vincent-de-Paul,0.06,2,"10th arrondissement(Called ""de l'Entrepôt"")",21624,48.87849,2.35176
69,Saint-Thomas-d'Aquin,0.06,2,"7th arrondissement(Called ""du Palais-Bourbon"")",12661,48.85658,2.32678
5,Bel-Air,0.07,2,"12th arrondissement(Called ""de Reuilly"")",33976,48.84824,2.29692
35,Les Halles,0.08,2,"1st arrondissement(Called ""du Louvre"")",8984,48.86319,2.34201
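
The three tables can also be compared numerically by summarizing the Hotel column per cluster. This is a sketch on a handful of rows copied from the tables above, so the numbers describe that subset only and not all 79 neighborhoods; on the real data the same `groupby` could be applied to `data_merged`. The names `rows`, `sample` and `summary` are illustrative.

```python
import pandas as pd

# (Neighborhood, Hotel, Cluster Labels) rows copied from the cluster tables above
rows = [
    ("Archives", 0.02, 0), ("Mail", 0.03, 0), ("Montparnasse", 0.04, 0),
    ("Chaillot", 0.17, 1), ("Porte-Dauphine", 0.12, 1), ("Plaine Monceau", 0.16, 1),
    ("Vivienne", 0.08, 2), ("La Muette", 0.10, 2), ("Salpêtrière", 0.06, 2),
]
sample = pd.DataFrame(rows, columns=["Neighborhood", "Hotel", "Cluster Labels"])

# count, range and mean of the hotel frequency within each cluster
summary = (
    sample.groupby("Cluster Labels")["Hotel"]
    .agg(["count", "min", "max", "mean"])
    .round(3)
)

print(summary)
```

A table like this makes it straightforward to see which label corresponds to the low, middle and high hotel frequencies before drawing conclusions.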


## Observations:

Most of the hotels are concentrated in the central area of Paris, with the highest number in cluster 0 and a moderate number in cluster 1. Cluster 2, on the other hand, has no hotels in the central area of Paris. Opening a new hotel in cluster 1 is therefore a great opportunity.
We will develop this idea in the report.