## CAPSTONE PROJECT FOR FINDING OPTIMIZED VENUES IN FRANCE

# Introduction

This report is for the final course of the Data Science Specialization, a nine-course series created by IBM and hosted on the Coursera platform. The problem and the analysis approach are left for the learner to decide, with the requirement that Foursquare location data be used to explore or compare neighborhoods or cities of the learner's choice, or to solve a problem the learner defines.

In this project, the problem is to find the city, or cluster of cities, in France that best matches a user's preferred venue types, e.g. Bar, Plaza and Gym. To achieve this, an analytical approach is used, based on advanced machine learning techniques and data analysis, specifically clustering, along with some data visualization.

So do a city's surroundings have the user's preferred venues?
If so, which venue types have the most effect on a cluster, both positively and negatively?

The target audience for this project is people, such as tourists, who choose a hotel based on the venues they prefer to have nearby.


# Import required libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # installs geopy; comment this line out if it is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0 exposes this as pandas.json_normalize)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if it is already installed
import folium # map rendering library

print('Libraries imported.')


# Download and Explore Dataset

Download the data from https://simplemaps.com/data/fr-cities as a CSV file.

In [2]:
df = pd.read_csv('france_geo.csv', sep = ';')

In [3]:
data_df = pd.DataFrame(df)

In [4]:
data_df.head()

Unnamed: 0,city,lat,lng,country,iso2,capital,population
0,Paris,48.866667,2.333333,France,FR,primary,9904000
1,Lyon,45.748457,4.846711,France,FR,admin,1423000
2,Marseille,43.285413,5.37606,France,FR,admin,1400000
3,Lille,50.632971,3.058585,France,FR,admin,1044000
4,Nice,43.713644,7.25952,France,FR,927000,338620


Rename the columns to make them easier to understand

In [5]:
data_df.columns = ['CITY', 'LATITUDE', 'LONGITUDE','COUNTRY','COUNTRY_CODE','CAPITAL','POPULATION']

Drop the columns that are not required

In [6]:
data_df = data_df.drop(['COUNTRY_CODE','CAPITAL'], axis=1)

In [7]:
data_df.head()

Unnamed: 0,CITY,LATITUDE,LONGITUDE,COUNTRY,POPULATION
0,Paris,48.866667,2.333333,France,9904000
1,Lyon,45.748457,4.846711,France,1423000
2,Marseille,43.285413,5.37606,France,1400000
3,Lille,50.632971,3.058585,France,1044000
4,Nice,43.713644,7.25952,France,338620


### Use geopy library to get the latitude and longitude values of France

In [8]:
address = 'France'

geolocator = Nominatim(user_agent='france_explorer') # Nominatim expects a user agent; this name is arbitrary
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of France are {}, {}.'.format(latitude, longitude))

The geographical coordinates of France are 46.603354, 1.8883335.


#### Create a map of France with cities superimposed on top.

In [9]:
# create map of france using latitude and longitude values
map_france = folium.Map(location=[latitude, longitude], zoom_start=6)

# add markers to map
for lat, lng, country, city in zip(data_df['LATITUDE'], data_df['LONGITUDE'], data_df['COUNTRY'], data_df['CITY']):
    label = '{}, {}'.format(city, country)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_france)  
    
map_france

### Define Foursquare Credentials and Version

In [10]:
CLIENT_ID = 'ESYH340ZLYESMFLKUKCHDQ33YNUJINGWDUPRBZC21VVYTFMT' # your Foursquare ID
CLIENT_SECRET = 'EYRI0QRQTSMVWD5AWU1JGD4FXZBNCPOXM4NRO1TKBS3EVHOZ' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: ESYH340ZLYESMFLKUKCHDQ33YNUJINGWDUPRBZC21VVYTFMT
CLIENT_SECRET:EYRI0QRQTSMVWD5AWU1JGD4FXZBNCPOXM4NRO1TKBS3EVHOZ


#### Let's explore the first city in our dataframe.

In [11]:
data_df.loc[0,'CITY']

'Paris'

In [12]:
neighborhood_latitude = data_df.loc[0, 'LATITUDE'] # neighborhood latitude value
neighborhood_longitude = data_df.loc[0, 'LONGITUDE'] # neighborhood longitude value

neighborhood_name = data_df.loc[0, 'CITY'] # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Paris are 48.866667, 2.333333.


In [13]:
LIMIT = 100 # limit of number of venues returned by Foursquare API

radius = 500 # define radius

#create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=ESYH340ZLYESMFLKUKCHDQ33YNUJINGWDUPRBZC21VVYTFMT&client_secret=EYRI0QRQTSMVWD5AWU1JGD4FXZBNCPOXM4NRO1TKBS3EVHOZ&v=20180605&ll=48.866667,2.333333&radius=500&limit=100'
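The URL above embeds the client ID and secret as literals. A minimal sketch of an alternative, assuming the credentials are exported as environment variables, builds the same query string from a dictionary with the standard library. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helper `build_explore_url` are hypothetical and not part of the original notebook.

```python
import os
from urllib.parse import urlencode

# Assumption: credentials live in environment variables with these
# hypothetical names; the fallbacks are placeholders, not real keys.
CLIENT_ID = os.environ.get('FOURSQUARE_CLIENT_ID', 'YOUR_CLIENT_ID')
CLIENT_SECRET = os.environ.get('FOURSQUARE_CLIENT_SECRET', 'YOUR_CLIENT_SECRET')

EXPLORE_ENDPOINT = 'https://api.foursquare.com/v2/venues/explore'


def build_explore_url(lat, lng, radius=500, limit=100, version='20180605'):
    """Return the explore URL for one coordinate pair (hypothetical helper)."""
    params = {
        'client_id': CLIENT_ID,
        'client_secret': CLIENT_SECRET,
        'v': version,
        'll': '{},{}'.format(lat, lng),
        'radius': radius,
        'limit': limit,
    }
    # safe=',' keeps the comma in "lat,lng" unescaped, as in the cell above
    return EXPLORE_ENDPOINT + '?' + urlencode(params, safe=',')


url = build_explore_url(48.866667, 2.333333)
print(url.split('&ll=')[1])
```

The same dictionary could also be passed to `requests.get(EXPLORE_ENDPOINT, params=params)`, which encodes the query string itself.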

In [14]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c388426dd57975fd5e7d21f'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Place Vendôme',
  'headerFullLocation': 'Place Vendôme, Paris',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 244,
  'suggestedBounds': {'ne': {'lat': 48.8711670045, 'lng': 2.340161078526742},
   'sw': {'lat': 48.8621669955, 'lng': 2.326504921473258}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4cbdcb0b7148f04d510aefab',
       'name': 'Pierre Hermé',
       'location': {'address': "39 avenue de l'Opéra",
        'lat': 48.86822151447183,
        'lng': 2.333396617684349,
        'labeledLatLngs': [{'label': 'display',
          'lat': 48.86822151

In [15]:
# function that extracts the category of the venue
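def get_category_type(row):
    # the column is named 'categories' or 'venue.categories' depending on how the JSON was flattened
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']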

In [16]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Pierre Hermé,Pastry Shop,48.868222,2.333397
1,Le Roch Hotel & Spa Paris,Hotel,48.8662,2.332995
2,Cantine California,Food Truck,48.867401,2.332017
3,Boulangerie Aki,Bakery,48.866211,2.335458
4,Brasserie Réjane,Restaurant,48.865486,2.334824


In [17]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [18]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
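
`getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which raises an `IndexError` for any venue returned without a category. A minimal sketch of a more defensive extraction step follows. The helper name `extract_venue_rows` and the sample items are hypothetical, and the item layout is assumed to match the response shown earlier.

```python
def extract_venue_rows(city, lat, lng, items):
    """Flatten explore items into tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories') or []
        # Fall back to None instead of failing on an empty category list
        category = categories[0]['name'] if categories else None
        rows.append((
            city,
            lat,
            lng,
            venue.get('name'),
            location.get('lat'),
            location.get('lng'),
            category,
        ))
    return rows


# Hypothetical sample shaped like results['response']['groups'][0]['items']
sample_items = [
    {'venue': {'name': 'Venue A',
               'location': {'lat': 48.8682, 'lng': 2.3334},
               'categories': [{'name': 'Pastry Shop'}]}},
    {'venue': {'name': 'Venue B',
               'location': {'lat': 48.8662, 'lng': 2.3330},
               'categories': []}},
]

rows = extract_venue_rows('Paris', 48.866667, 2.333333, sample_items)
for row in rows:
    print(row)
```

Rows with a `None` category can then be dropped or kept, depending on whether uncategorized venues should count toward a city's totals.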

### Now write the code to run the above function on each city and create a new dataframe called france_venues.

In [19]:
france_venues = getNearbyVenues(names=data_df['CITY'],
                                   latitudes=data_df['LATITUDE'],
                                   longitudes=data_df['LONGITUDE']
                                  )

Paris
Lyon
Marseille
Lille
Nice
Toulouse
Bordeaux
Rouen
Strasbourg
Nantes
Metz
Grenoble
Toulon
Montpellier
Nancy
Saint-Étienne
Melun
Le Havre
Tours
Clermont-Ferrand
Orléans
Mulhouse
Rennes
Reims
Caen
Angers
Dijon
Nîmes
Limoges
Aix-en-Provence
Perpignan
Biarritz
Brest
Le Mans
Amiens
Besançon
Annecy
Calais
Poitiers
Versailles
Kerbrient
Béziers
La Rochelle
Roanne
Bourges
Arras
Troyes
Cherbourg
Agen
Tarbes
Ajaccio
Saint-Brieuc
Nevers
Vichy
Dieppe
Auxerre
Bastia


#### Let's check the size of the resulting dataframe

In [20]:
print(france_venues.shape)
france_venues.head()

(1371, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Paris,48.866667,2.333333,Pierre Hermé,48.868222,2.333397,Pastry Shop
1,Paris,48.866667,2.333333,Le Roch Hotel & Spa Paris,48.8662,2.332995,Hotel
2,Paris,48.866667,2.333333,Cantine California,48.867401,2.332017,Food Truck
3,Paris,48.866667,2.333333,Boulangerie Aki,48.866211,2.335458,Bakery
4,Paris,48.866667,2.333333,Brasserie Réjane,48.865486,2.334824,Restaurant


In [21]:
# work on copies so france_venues itself stays unchanged
df_venues2 = france_venues.copy()
df_venues3 = france_venues.copy()
# despite its name, df_venues_rest holds the venues whose category contains 'Bar'
df_venues_rest = df_venues2[df_venues2['Venue Category'].str.contains('Bar')].reset_index(drop=True)
df_venues_rest['Venue Type'] = 'Bar'
# despite its name, df_venues_hotel holds the venues whose category contains 'Plaza'
df_venues_hotel = df_venues3[df_venues3['Venue Category'].str.contains('Plaza')].reset_index(drop=True)
df_venues_hotel['Venue Type'] = 'Plaza'
df_venues_final = pd.concat([df_venues_rest,df_venues_hotel]).reset_index(drop=True)
df_venues_final.shape

(218, 8)
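
`str.contains('Bar')` is a substring match, so in this dataset it also counts categories such as `Juice Bar` and `Hotel Bar` as bars, and the Gym preference mentioned in the introduction is not tagged at all. A rough sketch of a more explicit tagging rule follows, assuming whole-word matching plus a small exclusion list is what is wanted. The function `venue_type`, the pattern table, and the exclusion set are illustrative and not part of the original analysis; `Barbershop` is a hypothetical category used only to show the difference.

```python
import re

# Illustrative rules: one whole-word pattern per preferred venue type
TYPE_PATTERNS = {
    'Bar': re.compile(r'\bBar\b'),
    'Plaza': re.compile(r'\bPlaza\b'),
    'Gym': re.compile(r'\bGym\b'),
}
# Assumption: categories a user would not count as the preferred type
EXCLUDED = {'Juice Bar'}


def venue_type(category):
    """Return 'Bar', 'Plaza', 'Gym', or None for a Foursquare category name."""
    if category in EXCLUDED:
        return None
    for label, pattern in TYPE_PATTERNS.items():
        if pattern.search(category):
            return label
    return None


for name in ['Wine Bar', 'Juice Bar', 'Pedestrian Plaza',
             'Gym / Fitness Center', 'Barbershop', 'Bakery']:
    print(name, '->', venue_type(name))
```

With pandas, such a function could be applied with `france_venues['Venue Category'].map(venue_type)`; note that doing so would change the counts reported above.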

In [22]:
df_venues_final.groupby('Neighborhood')['Venue Type']\
.value_counts()\
.unstack(level=1)\
.plot.bar(stacked=True)


[Figure: stacked bar chart of Bar and Plaza venue counts per city]

In [23]:
print('There are {} unique categories.'.format(len(france_venues['Venue Category'].unique())))

There are 184 unique categories.


### Analyze Each City

In [24]:
# one hot encoding
france_onehot = pd.get_dummies(france_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
france_onehot['Neighborhood'] =  france_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [france_onehot.columns[-1]] + list(france_onehot.columns[:-1])
france_onehot = france_onehot[fixed_columns]

france_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,Bagel Shop,Bakery,Bar,Basque Restaurant,Bed & Breakfast,Beer Bar,Beer Garden,Bistro,Bookstore,Botanical Garden,Boutique,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Camera Store,Candy Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Creperie,Cupcake Shop,Cycle Studio,Dance Studio,Department Store,Dessert Shop,Diner,Dive Bar,Doner Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Flower Shop,Food,Food & Drink Shop,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Latin American Restaurant,Lingerie Store,Lounge,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,Newsstand,Nightclub,Noodle House,Opera House,Optical Shop,Other Nightlife,Other Repair Shop,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pet Service,Pharmacy,Photography Studio,Pizza Place,Plaza,Pool,Portuguese Restaurant,Pub,Public Art,Ramen Restaurant,Record Shop,Rental Car Location,Resort,Restaurant,River,Rock Club,Rugby Pitch,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Stables,Steakhouse,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Stadium,Thai Restaurant,Theater,Theme Park,Tourist Information Center,Toy / Game Store,Trail,Train Station,Tram Station,Udon Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store
0,Paris,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Paris,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Paris,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Paris,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Paris,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Next, let's group rows by neighborhood and take the mean of the frequency of occurrence of each category

In [25]:
france_grouped = france_onehot.groupby('Neighborhood').mean().reset_index()
france_grouped

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Aquarium,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,Bagel Shop,Bakery,Bar,Basque Restaurant,Bed & Breakfast,Beer Bar,Beer Garden,Bistro,Bookstore,Botanical Garden,Boutique,Brasserie,Brazilian Restaurant,Breakfast Spot,Brewery,Bubble Tea Shop,Burger Joint,Burrito Place,Bus Station,Bus Stop,Business Service,Café,Camera Store,Candy Store,Castle,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Creperie,Cupcake Shop,Cycle Studio,Dance Studio,Department Store,Dessert Shop,Diner,Dive Bar,Doner Restaurant,Electronics Store,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Financial or Legal Service,Fish & Chips Shop,Flower Shop,Food,Food & Drink Shop,Food Truck,Fountain,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,Gift Shop,Gourmet Shop,Greek Restaurant,Grocery Store,Gym,Gym / Fitness Center,Harbor / Marina,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Home Service,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Irish Pub,Israeli Restaurant,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Kebab Restaurant,Korean Restaurant,Latin American Restaurant,Lingerie Store,Lounge,Mediterranean Restaurant,Men's Store,Metro Station,Mexican Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Moroccan Restaurant,Movie Theater,Multiplex,Museum,Music Venue,New American Restaurant,Newsstand,Nightclub,Noodle House,Opera House,Optical Shop,Other Nightlife,Other Repair Shop,Park,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pet Service,Pharmacy,Photography Studio,Pizza Place,Plaza,Pool,Portuguese Restaurant,Pub,Public Art,Ramen Restaurant,Record Shop,Rental Car Location,Resort,Restaurant,River,Rock Club,Rugby Pitch,Salad Place,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Skating Rink,Snack Place,Soccer Stadium,Spanish Restaurant,Sporting Goods Shop,Stables,Steakhouse,Supermarket,Sushi Restaurant,Taiwanese Restaurant,Tapas Restaurant,Tea Room,Tennis Stadium,Thai Restaurant,Theater,Theme Park,Tourist Information Center,Toy / Game Store,Trail,Train Station,Tram Station,Udon Restaurant,Vietnamese Restaurant,Whisky Bar,Wine Bar,Wine Shop,Women's Store
0,Agen,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Aix-en-Provence,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.03,0.0,0.03,0.01,0.04,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.03,0.0,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.21,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.03,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.02,0.1,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0
2,Ajaccio,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Amiens,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Angers,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.018519,0.111111,0.0,0.0,0.0,0.018519,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.092593,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.037037,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.037037,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0
5,Annecy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.068966,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.068966,0.0,0.034483,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.137931,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Arras,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.235294,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.176471,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.117647,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.058824,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Auxerre,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.083333,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Bastia,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Besançon,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.2,0.0,0.0,0.0,0.0,0.0,0.0
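
Because each one-hot column holds only 0 or 1, its mean within a city is the share of that city's venues that fall in the category. A small sketch reproduces the arithmetic behind a row such as Agen, where three venues in three different categories give 0.333333 each. The helper `category_frequencies` is hypothetical and uses plain Python instead of pandas.

```python
from collections import Counter


def category_frequencies(categories):
    """Share of venues per category, i.e. the mean of each one-hot column."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


# Categories with a non-zero value in Agen's row above: three venues, one each
agen = ['Supermarket', 'Dance Studio', 'Park']
freqs = category_frequencies(agen)
print({k: round(v, 6) for k, v in freqs.items()})
```

One consequence is that cities with very few venues get large, coarse frequencies, while cities that reach the 100-venue limit get values in steps of 0.01.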


#### Let's print each city along with its top 5 most common venues

In [26]:
num_top_venues = 5

for hood in france_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = france_grouped[france_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Agen----
               venue  freq
0        Supermarket  0.33
1       Dance Studio  0.33
2               Park  0.33
3  Accessories Store  0.00
4       Optical Shop  0.00


----Aix-en-Provence----
               venue  freq
0  French Restaurant  0.21
1              Plaza  0.10
2                Pub  0.04
3   Pedestrian Plaza  0.04
4                Bar  0.04


----Ajaccio----
              venue  freq
0             Hotel  0.29
1             Plaza  0.14
2        Restaurant  0.14
3              Café  0.14
4  Sushi Restaurant  0.14


----Amiens----
             venue  freq
0            Hotel  0.50
1              Pub  0.25
2             Café  0.25
3  Other Nightlife  0.00
4           Museum  0.00


----Angers----
                 venue  freq
0                  Bar  0.11
1    French Restaurant  0.09
2                  Pub  0.06
3               Lounge  0.06
4  Japanese Restaurant  0.04


----Annecy----
              venue  freq
0             Hotel  0.14
1    Clothing Store  0.07
2  Departm

In [27]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each city.

In [28]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # positions beyond the third take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = france_grouped['Neighborhood']

for ind in np.arange(france_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(france_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Agen,Supermarket,Dance Studio,Park,Women's Store,Electronics Store,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service
1,Aix-en-Provence,French Restaurant,Plaza,Bar,Pedestrian Plaza,Pub,Bagel Shop,Burger Joint,Italian Restaurant,Asian Restaurant,Ice Cream Shop
2,Ajaccio,Hotel,Sushi Restaurant,French Restaurant,Restaurant,Café,Plaza,Art Museum,Arts & Crafts Store,Food Truck,Food & Drink Shop
3,Amiens,Hotel,Pub,Café,Doner Restaurant,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
4,Angers,Bar,French Restaurant,Pub,Lounge,Sandwich Place,Indian Restaurant,Italian Restaurant,Department Store,Japanese Restaurant,Hotel
5,Annecy,Hotel,Department Store,Bar,Clothing Store,Pizza Place,Candy Store,Shopping Mall,Mobile Phone Shop,Sandwich Place,Café
6,Arras,Bar,French Restaurant,Plaza,Italian Restaurant,Seafood Restaurant,Sandwich Place,Monument / Landmark,Church,Food,Art Museum
7,Auxerre,Hotel,French Restaurant,Tourist Information Center,Historic Site,Harbor / Marina,Grocery Store,Pizza Place,Restaurant,Bar,Plaza
8,Bastia,Café,French Restaurant,Auto Workshop,Mediterranean Restaurant,Plaza,Ice Cream Shop,History Museum,Food & Drink Shop,Flower Shop,Fish & Chips Shop
9,Besançon,Hotel,Tram Station,Train Station,Italian Restaurant,Doner Restaurant,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service


## Cluster Neighborhoods

Run k-means to group the cities into 5 clusters.

In [29]:
# set number of clusters
kclusters = 5

france_grouped_clustering = france_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(france_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 0, 4, 4, 1, 4, 0, 4, 0, 4], dtype=int32)
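
As a rough illustration of what the clustering step does, the sketch below runs k-means on a tiny made-up frequency matrix. The array `toy_freq` and its values are hypothetical, and `n_init=10` is set explicitly only so the call behaves the same across scikit-learn versions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-frequency rows (columns: Bar, Hotel, Plaza); values are made up.
toy_freq = np.array([
    [0.90, 0.05, 0.05],  # bar-heavy city
    [0.85, 0.10, 0.05],  # bar-heavy city
    [0.05, 0.90, 0.05],  # hotel-heavy city
    [0.10, 0.85, 0.05],  # hotel-heavy city
])

# Fit two clusters; rows with similar venue profiles end up with the same label.
toy_kmeans = KMeans(n_clusters=2, random_state=0, n_init=10).fit(toy_freq)
toy_labels = toy_kmeans.labels_
print(toy_labels)
```

The label numbers themselves are arbitrary identifiers: only the grouping matters, not which group is called 0 or 1.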

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each city.

In [30]:
france_merged = data_df

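# add clustering labels (assigned by row position, so data_df must be in the same row order as france_grouped)
france_merged['Cluster Labels'] = kmeans.labels_

# merge neighborhoods_venues_sorted into the city data so each city keeps its latitude/longitude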
france_merged = france_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='CITY')

france_merged.head() # check the last columns!

Unnamed: 0,CITY,LATITUDE,LONGITUDE,COUNTRY,POPULATION,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Paris,48.866667,2.333333,France,9904000,0,Japanese Restaurant,Hotel,French Restaurant,Café,Ramen Restaurant,Jewelry Store,Pastry Shop,Bakery,Udon Restaurant,Bookstore
1,Lyon,45.748457,4.846711,France,1423000,0,Restaurant,Diner,Bistro,Pizza Place,Plaza,Hobby Shop,Italian Restaurant,Sandwich Place,Fast Food Restaurant,French Restaurant
2,Marseille,43.285413,5.37606,France,1400000,4,Plaza,Bus Stop,Lounge,French Restaurant,Cupcake Shop,Church,Scenic Lookout,Hotel,Asian Restaurant,Farmers Market
3,Lille,50.632971,3.058585,France,1044000,4,French Restaurant,Bar,Japanese Restaurant,Pub,Cocktail Bar,Italian Restaurant,Coffee Shop,Plaza,Burger Joint,Hotel
4,Nice,43.713644,7.25952,France,338620,1,French Restaurant,Plaza,Seafood Restaurant,Mediterranean Restaurant,Gym,Farmers Market,Women's Store,Doner Restaurant,Fish & Chips Shop,Financial or Legal Service
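
The labels above are attached by row position. If the city table and the grouped table are ordered differently, one option is to attach the labels by city name instead. The following is a minimal sketch under that assumption; `toy_cities`, `toy_sorted` and `toy_labels` are hypothetical stand-ins with made-up values.

```python
import pandas as pd

# Hypothetical stand-ins: a city table in its original order,
# and a per-city venue table sorted alphabetically by name.
toy_cities = pd.DataFrame({"CITY": ["Paris", "Lyon", "Agen"]})
toy_sorted = pd.DataFrame({
    "Neighborhood": ["Agen", "Lyon", "Paris"],
    "1st Most Common Venue": ["Supermarket", "Restaurant", "Japanese Restaurant"],
})
toy_labels = [2, 0, 1]  # made-up labels, in the row order of toy_sorted

# Store each label next to the name it was computed for ...
toy_sorted.insert(0, "Cluster Labels", toy_labels)

# ... then join on the name, so row order no longer matters.
toy_merged = toy_cities.join(toy_sorted.set_index("Neighborhood"), on="CITY")
print(toy_merged)
```

With this approach each city receives the label that was computed for its own venue profile, regardless of how either table is sorted.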


Finally, let's visualize the resulting clusters on a map.

In [31]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=6)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(france_merged['LATITUDE'], france_merged['LONGITUDE'], france_merged['CITY'], france_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
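
The color scheme in the cell above boils down to one hex color per cluster label. A minimal sketch of just that step follows, assuming five clusters as above. `cluster_color` is a hypothetical helper name, and this sketch indexes colors directly by label, whereas the notebook uses `rainbow[cluster-1]`.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# Sample the rainbow colormap at evenly spaced points, one per cluster.
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))

# Convert each RGBA tuple to a hex string that folium accepts as a marker color.
rainbow = [colors.to_hex(c) for c in colors_array]

# Hypothetical lookup: cluster label -> marker color.
cluster_color = {label: rainbow[label] for label in range(kclusters)}
print(cluster_color)
```

Either indexing works as long as every label maps to a distinct color, since the colors only serve to tell clusters apart on the map.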

In [32]:
france_merged.count()

CITY                      57
LATITUDE                  57
LONGITUDE                 57
COUNTRY                   57
POPULATION                57
Cluster Labels            57
1st Most Common Venue     57
2nd Most Common Venue     57
3rd Most Common Venue     57
4th Most Common Venue     57
5th Most Common Venue     57
6th Most Common Venue     57
7th Most Common Venue     57
8th Most Common Venue     57
9th Most Common Venue     57
10th Most Common Venue    57
dtype: int64

### CLUSTER 1

In [33]:
cluster1 = france_merged.loc[france_merged['Cluster Labels'] == 0, france_merged.columns[[0] + list(range(4, france_merged.shape[1]))]]
cluster1

Unnamed: 0,CITY,POPULATION,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Paris,9904000,0,Japanese Restaurant,Hotel,French Restaurant,Café,Ramen Restaurant,Jewelry Store,Pastry Shop,Bakery,Udon Restaurant,Bookstore
1,Lyon,1423000,0,Restaurant,Diner,Bistro,Pizza Place,Plaza,Hobby Shop,Italian Restaurant,Sandwich Place,Fast Food Restaurant,French Restaurant
6,Bordeaux,803000,0,Hotel,French Restaurant,Plaza,Coffee Shop,Pub,Shopping Mall,Café,Clothing Store,Pedestrian Plaza,Tram Station
8,Strasbourg,439972,0,French Restaurant,Pharmacy,Flower Shop,Bus Station,Supermarket,Bus Stop,Women's Store,Falafel Restaurant,Food,Fish & Chips Shop
10,Metz,409186,0,Bar,French Restaurant,Plaza,Italian Restaurant,Hotel,Sandwich Place,Pub,Coffee Shop,Department Store,Fast Food Restaurant
12,Toulon,168701,0,Supermarket,Shopping Mall,Rugby Pitch,Pizza Place,Cosmetics Shop,Sporting Goods Shop,Fast Food Restaurant,Mobile Phone Shop,Food,Flower Shop
16,Melun,38953,0,Home Service,Pizza Place,History Museum,Boutique,Supermarket,Women's Store,Electronics Store,Food,Flower Shop,Fish & Chips Shop
18,Tours,141621,0,Wine Bar,Brazilian Restaurant,Portuguese Restaurant,Performing Arts Venue,Lingerie Store,Women's Store,Electronics Store,Food,Flower Shop,Fish & Chips Shop
26,Dijon,169946,0,Home Service,Historic Site,Asian Restaurant,River,Women's Store,Doner Restaurant,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service
27,Nîmes,148236,0,Plaza,Bar,Italian Restaurant,French Restaurant,Ice Cream Shop,Mediterranean Restaurant,Supermarket,Coffee Shop,Bakery,Other Nightlife


### CLUSTER 2

In [34]:
france_merged.loc[france_merged['Cluster Labels'] == 1, france_merged.columns[[0] + list(range(4, france_merged.shape[1]))]]

Unnamed: 0,CITY,POPULATION,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Nice,338620,1,French Restaurant,Plaza,Seafood Restaurant,Mediterranean Restaurant,Gym,Farmers Market,Women's Store,Doner Restaurant,Fish & Chips Shop,Financial or Legal Service
11,Grenoble,158552,1,Bar,Camera Store,French Restaurant,Food Truck,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant
15,Saint-Étienne,176280,1,Pub,Plaza,Clothing Store,Nightclub,Movie Theater,French Restaurant,Park,Coffee Shop,Department Store,Bakery
19,Clermont-Ferrand,233050,1,Bar,French Restaurant,Restaurant,Pedestrian Plaza,Café,Sushi Restaurant,Italian Restaurant,Mobile Phone Shop,Plaza,Japanese Restaurant
20,Orléans,217301,1,French Restaurant,Hotel,Shopping Mall,Pub,Gastropub,Tram Station,Art Museum,Fast Food Restaurant,Sandwich Place,Department Store
21,Mulhouse,111430,1,Japanese Restaurant,Women's Store,Doner Restaurant,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
28,Limoges,152199,1,French Restaurant,Plaza,Café,Pedestrian Plaza,Italian Restaurant,Department Store,Clothing Store,Sandwich Place,Bakery,Lounge
30,Perpignan,110706,1,French Restaurant,Plaza,Café,Bar,Convention Center,Pizza Place,Coffee Shop,Music Venue,Castle,Multiplex
34,Amiens,143086,1,Hotel,Pub,Café,Doner Restaurant,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service,Fast Food Restaurant,Farmers Market
35,Besançon,128426,1,Hotel,Tram Station,Train Station,Italian Restaurant,Doner Restaurant,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service


### CLUSTER 3

In [35]:
france_merged.loc[france_merged['Cluster Labels'] == 2, france_merged.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
33,Le Mans,Hotel,Supermarket,Park,Nightclub,French Restaurant,IT Services,History Museum,Doner Restaurant,Flower Shop,Fish & Chips Shop


### CLUSTER 4

In [36]:
france_merged.loc[france_merged['Cluster Labels'] == 3, france_merged.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
22,Rennes,Irish Pub,Creperie,Tea Room,Plaza,Coffee Shop,Bar,Park,Historic Site,Kebab Restaurant,Japanese Restaurant


### CLUSTER 5

In [37]:
france_merged.loc[france_merged['Cluster Labels'] == 4, france_merged.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Marseille,Plaza,Bus Stop,Lounge,French Restaurant,Cupcake Shop,Church,Scenic Lookout,Hotel,Asian Restaurant,Farmers Market
3,Lille,French Restaurant,Bar,Japanese Restaurant,Pub,Cocktail Bar,Italian Restaurant,Coffee Shop,Plaza,Burger Joint,Hotel
5,Toulouse,Plaza,Sandwich Place,Bar,Tapas Restaurant,Italian Restaurant,Gastropub,Metro Station,Burger Joint,River,Farmers Market
7,Rouen,Hotel,French Restaurant,Multiplex,Fast Food Restaurant,Clothing Store,Sandwich Place,Japanese Restaurant,Gym,Electronics Store,Flower Shop
9,Nantes,Bar,French Restaurant,Plaza,Hotel,Coffee Shop,Restaurant,Burger Joint,Indian Restaurant,Tea Room,Bistro
13,Montpellier,French Restaurant,Bar,Wine Bar,Burger Joint,Pub,Coffee Shop,Cocktail Bar,Pizza Place,Café,Plaza
14,Nancy,Bar,French Restaurant,Hotel,Italian Restaurant,Nightclub,Plaza,Coffee Shop,Cosmetics Shop,Historic Site,Pizza Place
17,Le Havre,Hotel,Pizza Place,Japanese Restaurant,Falafel Restaurant,Doner Restaurant,Food & Drink Shop,Food,Flower Shop,Fish & Chips Shop,Financial or Legal Service
23,Reims,Hotel,Italian Restaurant,Asian Restaurant,Supermarket,Tea Room,Camera Store,Opera House,Brewery,Tourist Information Center,Department Store
24,Caen,Pharmacy,Trail,Sandwich Place,Tennis Stadium,Food & Drink Shop,Cupcake Shop,Cycle Studio,Food,Flower Shop,Fish & Chips Shop


In [38]:
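# keep cities that list 'Hotel' among their top 10 venues
get_Hotel = france_merged[france_merged.eq('Hotel').any(axis=1)]
# of those, keep cities that also list 'Bar'
# (DataFrame.eq() compares against a single value, so only 'Bar' is matched here)
tot_cluster = get_Hotel[get_Hotel.eq('Bar').any(axis=1)]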
#tot_cluster = get_cluster[get_cluster.eq('Plaza').any(axis=1)]
tot_cluster.head(10)

  


Unnamed: 0,CITY,LATITUDE,LONGITUDE,COUNTRY,POPULATION,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Lille,50.632971,3.058585,France,1044000,4,French Restaurant,Bar,Japanese Restaurant,Pub,Cocktail Bar,Italian Restaurant,Coffee Shop,Plaza,Burger Joint,Hotel
9,Nantes,47.216509,-1.552379,France,438537,4,Bar,French Restaurant,Plaza,Hotel,Coffee Shop,Restaurant,Burger Joint,Indian Restaurant,Tea Room,Bistro
10,Metz,49.115461,6.175875,France,409186,0,Bar,French Restaurant,Plaza,Italian Restaurant,Hotel,Sandwich Place,Pub,Coffee Shop,Department Store,Fast Food Restaurant
14,Nancy,48.69211,6.187756,France,105334,4,Bar,French Restaurant,Hotel,Italian Restaurant,Nightclub,Plaza,Coffee Shop,Cosmetics Shop,Historic Site,Pizza Place
25,Angers,47.473806,-0.54774,France,168279,4,Bar,French Restaurant,Pub,Lounge,Sandwich Place,Indian Restaurant,Italian Restaurant,Department Store,Japanese Restaurant,Hotel
32,Brest,48.390756,-4.486165,France,140929,0,Hotel,Fast Food Restaurant,Sandwich Place,Shopping Mall,Pedestrian Plaza,Bookstore,Electronics Store,Café,Thai Restaurant,Bar
36,Annecy,45.906206,6.126699,France,49232,0,Hotel,Department Store,Bar,Clothing Store,Pizza Place,Candy Store,Shopping Mall,Mobile Phone Shop,Sandwich Place,Café
44,Bourges,47.083333,2.4,France,67987,1,Plaza,Pub,French Restaurant,Hotel,Bar,Tourist Information Center,Park,Department Store,Clothing Store,Snack Place
52,Nevers,46.991203,3.157084,France,43988,1,French Restaurant,Historic Site,Supermarket,Diner,Dessert Shop,Park,Bar,Hotel,Dance Studio,Department Store
55,Auxerre,47.7996,3.57033,France,34552,0,Hotel,French Restaurant,Tourist Information Center,Historic Site,Harbor / Marina,Grocery Store,Pizza Place,Restaurant,Bar,Plaza
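
To filter on several preferred venues at once, one option is `DataFrame.isin`, which accepts a list of values. The sketch below is a minimal illustration on made-up data; `toy`, `wanted` and the city names are hypothetical, and it keeps a city if it lists a hotel plus at least one of the wanted venues.

```python
import pandas as pd

# Hypothetical stand-in for france_merged with only two venue columns.
toy = pd.DataFrame({
    "CITY": ["A", "B", "C", "D"],
    "1st Most Common Venue": ["Hotel", "Plaza", "Hotel", "Bar"],
    "2nd Most Common Venue": ["Bar", "Hotel", "Café", "Plaza"],
})

# Restrict the comparison to the venue columns only.
venue_cols = [c for c in toy.columns if c.endswith("Most Common Venue")]
wanted = ["Bar", "Plaza", "Shopping Mall"]  # user-preferred venues

has_hotel = toy[venue_cols].eq("Hotel").any(axis=1)      # city lists a hotel
has_wanted = toy[venue_cols].isin(wanted).any(axis=1)    # city lists any wanted venue

toy_matches = toy[has_hotel & has_wanted]
print(toy_matches)
```

Replacing `.any(axis=1)` on the second mask with a per-venue check would instead require all preferred venues to be present, which is a stricter reading of the user's preferences.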


### USER_CLUSTER 1

Cluster based on user selection

In [39]:
tot_cluster.loc[tot_cluster['Cluster Labels'] == 0, tot_cluster.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Metz,Bar,French Restaurant,Plaza,Italian Restaurant,Hotel,Sandwich Place,Pub,Coffee Shop,Department Store,Fast Food Restaurant
32,Brest,Hotel,Fast Food Restaurant,Sandwich Place,Shopping Mall,Pedestrian Plaza,Bookstore,Electronics Store,Café,Thai Restaurant,Bar
36,Annecy,Hotel,Department Store,Bar,Clothing Store,Pizza Place,Candy Store,Shopping Mall,Mobile Phone Shop,Sandwich Place,Café
55,Auxerre,Hotel,French Restaurant,Tourist Information Center,Historic Site,Harbor / Marina,Grocery Store,Pizza Place,Restaurant,Bar,Plaza


### USER_CLUSTER 2

Cluster based on user selection

In [40]:
tot_cluster.loc[tot_cluster['Cluster Labels'] == 1, tot_cluster.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
44,Bourges,Plaza,Pub,French Restaurant,Hotel,Bar,Tourist Information Center,Park,Department Store,Clothing Store,Snack Place
52,Nevers,French Restaurant,Historic Site,Supermarket,Diner,Dessert Shop,Park,Bar,Hotel,Dance Studio,Department Store


### USER_CLUSTER 3

Cluster based on user selection

In [41]:
tot_cluster.loc[tot_cluster['Cluster Labels'] == 2, tot_cluster.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


### USER_CLUSTER 4

Cluster based on user selection

In [42]:
tot_cluster.loc[tot_cluster['Cluster Labels'] == 3, tot_cluster.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue


### USER_CLUSTER 5

Cluster based on user selection

In [43]:
tot_cluster.loc[tot_cluster['Cluster Labels'] == 4, tot_cluster.columns[[0] + list(range(6, france_merged.shape[1]))]]

Unnamed: 0,CITY,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Lille,French Restaurant,Bar,Japanese Restaurant,Pub,Cocktail Bar,Italian Restaurant,Coffee Shop,Plaza,Burger Joint,Hotel
9,Nantes,Bar,French Restaurant,Plaza,Hotel,Coffee Shop,Restaurant,Burger Joint,Indian Restaurant,Tea Room,Bistro
14,Nancy,Bar,French Restaurant,Hotel,Italian Restaurant,Nightclub,Plaza,Coffee Shop,Cosmetics Shop,Historic Site,Pizza Place
25,Angers,Bar,French Restaurant,Pub,Lounge,Sandwich Place,Indian Restaurant,Italian Restaurant,Department Store,Japanese Restaurant,Hotel


## Create MAP for USER based on user input filter

In [44]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=5)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(tot_cluster['LATITUDE'], tot_cluster['LONGITUDE'], tot_cluster['CITY'], tot_cluster['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

5.) Discussion:

It is interesting how much the venue mix varies from one city to another. The differences are clearest once the clusters are filtered by the user's inputs, although some venues are common to several clusters.

As a recommendation, further study is needed to better predict which cluster of cities matches a user's preferred venues, for example when a tourist wants to find cities where hotels appear alongside bars, plazas, gyms and so on.


6.) Conclusion:

As far as this data shows, some of the clusters are left empty once the user filter is applied.

USER_CLUSTER 1 and USER_CLUSTER 5 most likely hold the most cities that match the user's hotel-based preferences. With more data and more carefully framed filter logic, the output could be made more accurate.

7.) References:

https://developer.foursquare.com/docs/api/venues/

https://simplemaps.com/data/fr-cities

https://www.coursera.org/