# Capstone Project - Where can I open a new restaurant?

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction <a name="introduction"></a>

### Background
The restaurant business is risky. Success depends on getting every detail right: menu, interior, target audience, location, and so on. Even a chain of restaurants or cafes that is successful in one city or country is not guaranteed to succeed in another, and a mistake can mean a complete loss of the investment. Decisions such as choosing a location therefore need to be made with great care.

### Problem 
Cities can be compared with each other using data about their cafes, restaurants, cinemas, hotels and other socially significant places. The aim of this project is to compare and cluster the capitals of developed countries. The results can then be used to choose a location for a new establishment: pick a city that is as similar as possible to the city where you already run a successful restaurant.

### Interest
This analysis is aimed at restaurateurs who plan to expand their business outside their own country.

## Data <a name="data"></a>

### Data sources
Lists of developed countries and of world capitals are available at https://en.wikipedia.org/wiki/Developed_country and https://geographyfieldwork.com/WorldCapitalCities.htm. Information about socially significant venues in each city can be obtained through the Foursquare API (https://foursquare.com/).

### Data analysis
The project uses a machine learning method, clustering. The input data are the list of capitals of developed countries and the number and type of socially significant venues in each of them (cafes, restaurants, cinemas, hotels, etc.).

In [271]:
import numpy as np
import requests
import pandas as pd
import bs4
import folium

from geopy.geocoders import Nominatim

from sklearn.cluster import KMeans # import k-means from clustering stage

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

Scrape the list of developed countries from Wikipedia

In [272]:
developed_countries_wiki_url = "https://en.wikipedia.org/wiki/Developed_country"

# read html table to dataframe
top_31_countries_df = pd.read_html(developed_countries_wiki_url)[1]

# keep only the country name column (double brackets keep it a DataFrame)
top_31_countries_df = top_31_countries_df[['Country/territory']]

# rename the country name column
top_31_countries_df = top_31_countries_df.rename(columns = {'Country/territory': 'Country'})

top_31_countries_df.head()

Unnamed: 0,Country
0,Norway
1,Switzerland
2,Ireland
3,Germany
4,Hong Kong


Scrape the list of capitals from the web page

In [273]:
capitals_url = "https://geographyfieldwork.com/WorldCapitalCities.htm"

# read html page to dataframe
capitals_df = pd.read_html(capitals_url, attrs = {'summary': 'World Capitals'})[0]
capitals_df.rename(columns = {'Capital City': 'Capital'}, inplace = True)
capitals_df.head()

Unnamed: 0,Country,Capital
0,Afghanistan,Kabul
1,Albania,Tirana (Tirane)
2,Algeria,Algiers
3,Andorra,Andorra la Vella
4,Angola,Luanda


Join countries and capitals

In [274]:
df = top_31_countries_df.set_index('Country').join(capitals_df.set_index('Country'), how = 'left')
df.head()

Country,Capital
Norway,Oslo
Switzerland,Bern
Ireland,Dublin
Germany,Berlin
Hong Kong,
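
The empty `Capital` value for Hong Kong shows that a left join silently leaves gaps wherever the country names in the two tables do not match. A minimal sketch of how such rows could be listed before fixing them by hand is given here. It uses small made-up tables in place of the scraped ones, and the function name `find_missing_capitals` is hypothetical.

```python
import pandas as pd


def find_missing_capitals(countries_df, capitals_df):
    """Left-join countries to capitals and return the countries left without a capital."""
    joined = countries_df.set_index('Country').join(
        capitals_df.set_index('Country'), how='left')
    return joined.index[joined['Capital'].isna()].tolist()


# Toy stand-ins for the scraped tables (illustrative values only)
countries = pd.DataFrame({'Country': ['Norway', 'Hong Kong', 'Czech Republic']})
capitals = pd.DataFrame({'Country': ['Norway', 'Czechia'],
                         'Capital': ['Oslo', 'Prague']})

missing = find_missing_capitals(countries, capitals)
print(missing)
```

Rows reported this way are the ones that need a manual fix of the kind applied in the next cell.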


Fix some problems in the data

In [275]:
df.loc['Hong Kong'] = 'Hong Kong'
df.loc['Netherlands'] = 'Amsterdam'
df.loc['Israel'] = 'Jerusalem'
df.loc['Czech Republic'] = 'Prague'

Add latitude and longitude to the data

In [276]:
geolocator = Nominatim(user_agent="ny_explorer")

def get_coords(address):    
    location = geolocator.geocode(address)
    latitude = location.latitude
    longitude = location.longitude
    
    return {'lat': latitude, 'lon': longitude}
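
`geolocator.geocode` returns `None` when it cannot resolve an address, in which case `location.latitude` raises an `AttributeError`. A defensive variant might look like the sketch below. It takes the geocoding function as a parameter so that it can be tried without network access; `safe_coords` and the fake geocoder are hypothetical names used only for illustration.

```python
from types import SimpleNamespace


def safe_coords(geocode, address):
    """Return {'lat', 'lon'} for an address, or None values if it cannot be resolved."""
    location = geocode(address)
    if location is None:
        return {'lat': None, 'lon': None}
    return {'lat': location.latitude, 'lon': location.longitude}


def fake_geocode(address):
    """Stand-in for geolocator.geocode that knows a single address."""
    known = {'Oslo': SimpleNamespace(latitude=59.91333, longitude=10.73897)}
    return known.get(address)


print(safe_coords(fake_geocode, 'Oslo'))
print(safe_coords(fake_geocode, 'Nowhere City'))
```

With the real geocoder the call would be `safe_coords(geolocator.geocode, row['Capital'])`.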

In [277]:
coords_list = []

for index, row in df.iterrows():
    coords = get_coords(row['Capital'])
    coords_list.append(coords)

coords_df = pd.DataFrame(coords_list)

# work on an explicit copy so the assignments below do not raise SettingWithCopyWarning
full_df = df.copy()

full_df['lat'] = coords_df['lat'].values
full_df['lon'] = coords_df['lon'].values

full_df.head()


Country,Capital,lat,lon
Norway,Oslo,59.91333,10.73897
Switzerland,Bern,46.948271,7.451451
Ireland,Dublin,53.349764,-6.260273
Germany,Berlin,52.517037,13.38886
Hong Kong,Hong Kong,22.279328,114.162813


In [278]:
# show the capitals on a map
developed_countries_map = folium.Map(location = [50.639944, -23.276366], zoom_start = 2)

for index, row in full_df.iterrows():
    label = '{}, {}'.format(row['Capital'], index)
    label = folium.Popup(label, parse_html=True)
    
    folium.CircleMarker(
        location = [row['lat'], row['lon']],
        radius = 5,
        popup = label,
        color = "blue",
        fill = True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(developed_countries_map)

developed_countries_map

## Methodology <a name="methodology"></a>

Steps:
1. Get the venues within a 2 km radius of the center of every capital
2. Prepare the data for k-means
3. Cluster the capitals with k-means
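
The three steps can be sketched end to end on a tiny made-up venue table. The city names and venue categories below are invented for illustration and do not come from Foursquare; the calls mirror the ones used in the analysis (`pd.get_dummies`, `groupby(...).mean()`, `KMeans`).

```python
import pandas as pd
from sklearn.cluster import KMeans

# Step 1 (stand-in): one row per venue, as an API call might return them
venues = pd.DataFrame({
    'Capital': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'Venue Category': ['Café', 'Café', 'Bar',
                       'Café', 'Bar', 'Bar',
                       'Park', 'Park', 'Zoo'],
})

# Step 2: one-hot encode the categories and average them per capital,
# which yields the share of each category among a capital's venues
onehot = pd.get_dummies(venues['Venue Category']).astype(float)
onehot.insert(0, 'Capital', venues['Capital'])
profiles = onehot.groupby('Capital').mean()

# Step 3: cluster the capitals by their category shares
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(profiles)
clusters = dict(zip(profiles.index, kmeans.labels_))
print(clusters)
```

In this toy example capitals A and B share café and bar venues and end up in one cluster, while C, with parks and a zoo, forms its own.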

## Analysis <a name="analysis"></a>

In [279]:
import os

# Credentials are read from environment variables so that secrets are not stored or printed in the notebook.
# The variable names below are arbitrary; set them to your own Foursquare values before running.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
ACCESS_TOKEN = os.environ.get('FOURSQUARE_ACCESS_TOKEN') # your Foursquare Access Token (not used below)

VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value


In [280]:
def getNearbyVenues(capitals, countries, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for capital, country, lat, lng in zip(capitals, countries, latitudes, longitudes):
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            capital,
            country,
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = [
        'Capital', 
        'Country',
        'Capital Latitude', 
        'Capital Longitude',
        'Venue', 
        'Venue Latitude', 
        'Venue Longitude', 
        'Venue Category']
    
    return nearby_venues
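
The list comprehension above indexes `v['venue']['categories'][0]`, which would raise an `IndexError` if a venue came back with an empty category list. A minimal sketch of a more defensive extraction step follows. It assumes the same response layout that `getNearbyVenues` already relies on; the helper name `extract_venue_rows` and the sample payload are hypothetical.

```python
def extract_venue_rows(items, capital, country, lat, lng):
    """Flatten the 'items' list of an explore response, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        if not categories:
            continue  # no category to count, so the venue is skipped
        rows.append((
            capital,
            country,
            lat,
            lng,
            venue['name'],
            venue['location']['lat'],
            venue['location']['lng'],
            categories[0]['name'],
        ))
    return rows


# Hand-written sample in the layout the function expects (not real API output)
sample_items = [
    {'venue': {'name': 'Venue One', 'location': {'lat': 1.0, 'lng': 2.0},
               'categories': [{'name': 'Coffee Shop'}]}},
    {'venue': {'name': 'Venue Two', 'location': {'lat': 1.1, 'lng': 2.1},
               'categories': []}},
]

rows = extract_venue_rows(sample_items, 'Oslo', 'Norway', 59.91333, 10.73897)
print(rows)
```

Skipping uncategorized venues is one possible choice; assigning them a placeholder category would be another.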

In [281]:
# get venues for capitals in radius 2 km
capitals_venues = getNearbyVenues(full_df['Capital'], full_df.index, full_df['lat'], full_df['lon'], radius = 2 * 1000)

In [282]:
capitals_venues

Unnamed: 0,Capital,Country,Capital Latitude,Capital Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Oslo,Norway,59.91333,10.738970,Ben & Jerry's,59.914360,10.737109,Ice Cream Shop
1,Oslo,Norway,59.91333,10.738970,Stockfleths,59.913656,10.741206,Coffee Shop
2,Oslo,Norway,59.91333,10.738970,Nordvegan,59.915591,10.737863,Vegetarian / Vegan Restaurant
3,Oslo,Norway,59.91333,10.738970,Dinner,59.913957,10.734757,Chinese Restaurant
4,Oslo,Norway,59.91333,10.738970,Det Norske Teatret,59.915360,10.738657,Theater
...,...,...,...,...,...,...,...,...
2858,Nicosia,Cyprus,35.17393,33.364726,Derviş Büfe,35.184503,33.364319,Café
2859,Nicosia,Cyprus,35.17393,33.364726,Aphrodite's Snacks,35.169175,33.348400,Sandwich Place
2860,Nicosia,Cyprus,35.17393,33.364726,Sicily,35.162069,33.353693,Coffee Shop
2861,Nicosia,Cyprus,35.17393,33.364726,Il Bacaro,35.165232,33.349866,Wine Bar


In [283]:
print('There are {} unique categories.'.format(len(capitals_venues['Venue Category'].unique())))

There are 331 unique categories.


In [284]:
# one hot encoding
capitals_onehot = pd.get_dummies(capitals_venues[['Venue Category']], prefix="", prefix_sep="")

# add the capital column back to the dataframe
capitals_onehot['Capital'] = capitals_venues['Capital'] 

# move the capital column to the first position
fixed_columns = [capitals_onehot.columns[-1]] + list(capitals_onehot.columns[:-1])
capitals_onehot = capitals_onehot[fixed_columns]

capitals_onehot.head()

Unnamed: 0,Capital,Accessories Store,African Restaurant,Airport,Airport Service,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,...,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,Yoshoku Restaurant,Zoo
0,Oslo,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Oslo,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Oslo,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Oslo,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Oslo,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [285]:
capitals_grouped = capitals_onehot.groupby('Capital').mean().reset_index()
capitals_grouped

Unnamed: 0,Capital,Accessories Store,African Restaurant,Airport,Airport Service,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,...,Vietnamese Restaurant,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Women's Store,Yoga Studio,Yoshoku Restaurant,Zoo
0,Amsterdam,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,...,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.02,0.0,0.0
1,Berlin,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,...,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bern,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,...,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01
3,Brussels,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0
4,Canberra,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Copenhagen,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.0,0.0
6,Dublin,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,...,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0
7,Helsinki,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,...,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.0
8,Hong Kong,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,...,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.04,0.0,0.01
9,Jerusalem,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
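
Because each venue row has exactly one category, the mean of the one-hot columns per capital is simply the share of that category among the capital's venues. The sketch below checks this on a small made-up table by comparing the result with `pd.crosstab(..., normalize='index')`; the data are illustrative.

```python
import pandas as pd

venues = pd.DataFrame({
    'Capital': ['Oslo', 'Oslo', 'Oslo', 'Oslo', 'Bern', 'Bern'],
    'Venue Category': ['Coffee Shop', 'Coffee Shop', 'Park', 'Bar', 'Café', 'Bar'],
})

# Route used in the analysis: one-hot encode, then average per capital
onehot = pd.get_dummies(venues['Venue Category']).astype(float)
onehot['Capital'] = venues['Capital']
by_mean = onehot.groupby('Capital').mean()

# Equivalent route: count categories per capital and normalize each row
by_crosstab = pd.crosstab(venues['Capital'], venues['Venue Category'], normalize='index')

# Share of coffee shops among the Oslo venues in this toy table
print(by_mean.loc['Oslo', 'Coffee Shop'])
```

Both routes give the same table, so every row sums to 1 and the values can be read as proportions.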


In [286]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [287]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Capital']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
capital_venues_sorted = pd.DataFrame(columns=columns)
capital_venues_sorted['Capital'] = capitals_grouped['Capital']

for ind in np.arange(capitals_grouped.shape[0]):
    capital_venues_sorted.iloc[ind, 1:] = return_most_common_venues(capitals_grouped.iloc[ind, :], num_top_venues)

capital_venues_sorted.head()

Unnamed: 0,Capital,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Amsterdam,Hotel,Bar,Restaurant,Coffee Shop,Italian Restaurant,Canal,Sandwich Place,Cocktail Bar,Beer Bar,Market
1,Berlin,Plaza,Coffee Shop,Hotel,Monument / Landmark,History Museum,Bookstore,Concert Hall,Exhibit,Modern European Restaurant,Art Museum
2,Bern,Café,Bar,Swiss Restaurant,Restaurant,Plaza,Park,Hotel,Ice Cream Shop,Science Museum,Pizza Place
3,Brussels,Bar,Chocolate Shop,Plaza,Bookstore,Hotel,Toy / Game Store,Seafood Restaurant,Sandwich Place,Beer Bar,Art Museum
4,Canberra,Park,Café,Pizza Place,Sports Club,Gym / Fitness Center,Doner Restaurant,Gas Station,Fish & Chips Shop,Garden Center,Beach


In [288]:
# set number of clusters
kclusters = 5

capital_grouped_clustering = capitals_grouped.drop(columns='Capital')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(capital_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:1000] 

array([3, 0, 3, 0, 4, 2, 2, 2, 0, 3, 2, 0, 1, 0, 2, 2, 0, 0, 3, 2, 0, 0,
       2, 2, 2, 0, 3, 3, 0, 0, 2])
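
Before interpreting the clusters it helps to see how many capitals landed in each one. A short sketch that counts the labels printed above (the list is copied from that output):

```python
from collections import Counter

# Cluster labels copied from the k-means output above
labels = [3, 0, 3, 0, 4, 2, 2, 2, 0, 3, 2, 0, 1, 0, 2, 2, 0, 0, 3, 2, 0, 0,
          2, 2, 2, 0, 3, 3, 0, 0, 2]

# Map each cluster label to the number of capitals assigned to it
cluster_sizes = dict(sorted(Counter(labels).items()))
print(cluster_sizes)
```

Clusters 1 and 4 hold a single capital each, which is worth keeping in mind when reading the results.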

In [289]:
# add clustering labels
capital_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

capitals_merged = full_df

capitals_merged = capitals_merged.join(capital_venues_sorted.set_index('Capital'), on='Capital')

capitals_merged.head() # check the last columns!

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Norway,Oslo,59.91333,10.73897,2,Coffee Shop,Indian Restaurant,Restaurant,Scandinavian Restaurant,Tapas Restaurant,Park,Burger Joint,Cocktail Bar,Bar,Movie Theater
Switzerland,Bern,46.948271,7.451451,3,Café,Bar,Swiss Restaurant,Restaurant,Plaza,Park,Hotel,Ice Cream Shop,Science Museum,Pizza Place
Ireland,Dublin,53.349764,-6.260273,2,Coffee Shop,Café,Pub,Park,Italian Restaurant,Burger Joint,Irish Pub,Theater,Indie Movie Theater,Cocktail Bar
Germany,Berlin,52.517037,13.38886,0,Plaza,Coffee Shop,Hotel,Monument / Landmark,History Museum,Bookstore,Concert Hall,Exhibit,Modern European Restaurant,Art Museum
Hong Kong,Hong Kong,22.279328,114.162813,0,Hotel,Japanese Restaurant,Yoga Studio,Café,Gym / Fitness Center,Chinese Restaurant,Cantonese Restaurant,Steakhouse,Italian Restaurant,Bakery


In [290]:
capitals_merged

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Norway,Oslo,59.91333,10.73897,2,Coffee Shop,Indian Restaurant,Restaurant,Scandinavian Restaurant,Tapas Restaurant,Park,Burger Joint,Cocktail Bar,Bar,Movie Theater
Switzerland,Bern,46.948271,7.451451,3,Café,Bar,Swiss Restaurant,Restaurant,Plaza,Park,Hotel,Ice Cream Shop,Science Museum,Pizza Place
Ireland,Dublin,53.349764,-6.260273,2,Coffee Shop,Café,Pub,Park,Italian Restaurant,Burger Joint,Irish Pub,Theater,Indie Movie Theater,Cocktail Bar
Germany,Berlin,52.517037,13.38886,0,Plaza,Coffee Shop,Hotel,Monument / Landmark,History Museum,Bookstore,Concert Hall,Exhibit,Modern European Restaurant,Art Museum
Hong Kong,Hong Kong,22.279328,114.162813,0,Hotel,Japanese Restaurant,Yoga Studio,Café,Gym / Fitness Center,Chinese Restaurant,Cantonese Restaurant,Steakhouse,Italian Restaurant,Bakery
Australia,Canberra,-35.297591,149.101268,4,Park,Café,Pizza Place,Sports Club,Gym / Fitness Center,Doner Restaurant,Gas Station,Fish & Chips Shop,Garden Center,Beach
Iceland,Reykjavik,64.145981,-21.942237,2,Bar,Café,Seafood Restaurant,Restaurant,Coffee Shop,Hotel,Scandinavian Restaurant,Burger Joint,Bakery,Concert Hall
Sweden,Stockholm,59.325117,18.071093,2,Scandinavian Restaurant,Hotel,Coffee Shop,Bakery,Café,Clothing Store,Bookstore,Plaza,Gym / Fitness Center,Falafel Restaurant
Singapore,Singapore,1.357107,103.819499,2,Chinese Restaurant,Café,Thai Restaurant,Trail,Asian Restaurant,Coffee Shop,Restaurant,Indian Restaurant,Ice Cream Shop,Bakery
Netherlands,Amsterdam,52.37276,4.893604,3,Hotel,Bar,Restaurant,Coffee Shop,Italian Restaurant,Canal,Sandwich Place,Cocktail Bar,Beer Bar,Market


In [291]:
# create map, centered on the same point as the first map
map_clusters = folium.Map(location=[50.639944, -23.276366], zoom_start=2)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
rainbow

['#8000ff', '#00b5eb', '#80ffb4', '#ffb360', '#ff0000']

In [292]:
# add markers to the map
for lat, lon, poi, cluster in zip(capitals_merged['lat'], capitals_merged['lon'], capitals_merged['Capital'], capitals_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)],  # cluster labels start at 0, so they index the color list directly
        fill=True,
        fill_color=rainbow[int(cluster)],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

## Results and Discussion <a name="results"></a>

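The capitals of developed countries were divided into 5 groups with k-means. The number of groups was set by hand (`kclusters = 5`), so it is a modeling choice rather than a finding. Cities that fall into the same group have a similar mix of venue categories, which suggests, but does not prove, that the cities are similar to each other.

Under this model, the best candidate location for your new restaurant is a city from the same group as the city where you already have a successful restaurant. Note that two groups (clusters 1 and 4) contain a single capital each, Luxembourg and Canberra, so they offer no comparison city.

The capitals in each group are listed below.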

In [293]:
capitals_merged.loc[capitals_merged['Cluster Labels'] == 0]

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Germany,Berlin,52.517037,13.38886,0,Plaza,Coffee Shop,Hotel,Monument / Landmark,History Museum,Bookstore,Concert Hall,Exhibit,Modern European Restaurant,Art Museum
Hong Kong,Hong Kong,22.279328,114.162813,0,Hotel,Japanese Restaurant,Yoga Studio,Café,Gym / Fitness Center,Chinese Restaurant,Cantonese Restaurant,Steakhouse,Italian Restaurant,Bakery
Canada,Ottawa,45.421106,-75.690308,0,Hotel,Coffee Shop,Restaurant,Tapas Restaurant,Café,Seafood Restaurant,Concert Hall,Museum,Mexican Restaurant,Pub
United Kingdom,London,51.507322,-0.127647,0,Theater,Plaza,Hotel,Bakery,Ice Cream Shop,Coffee Shop,Bookstore,Garden,Dessert Shop,Movie Theater
United States,Washington D.C.,38.895037,-77.036543,0,Hotel,Coffee Shop,Monument / Landmark,Art Museum,History Museum,Mediterranean Restaurant,Plaza,Hotel Bar,Indian Restaurant,Garden
Belgium,Brussels,50.846557,4.351697,0,Bar,Chocolate Shop,Plaza,Bookstore,Hotel,Toy / Game Store,Seafood Restaurant,Sandwich Place,Beer Bar,Art Museum
Japan,Tokyo,35.682839,139.759455,0,Hotel,Bakery,French Restaurant,Sushi Restaurant,Yoshoku Restaurant,Café,Soba Restaurant,Theater,Lounge,Japanese Restaurant
Austria,Vienna,48.208354,16.372504,0,Austrian Restaurant,Hotel,Plaza,Restaurant,Concert Hall,Art Museum,Park,Café,Cocktail Bar,French Restaurant
South Korea,Seoul,37.566679,126.978291,0,Hotel,Korean Restaurant,Chinese Restaurant,Lounge,Bookstore,Japanese Restaurant,Historic Site,Palace,Coffee Shop,Café
Spain,Madrid,40.416705,-3.703582,0,Plaza,Restaurant,Café,Tapas Restaurant,Spanish Restaurant,Hotel,Bookstore,Theater,Mediterranean Restaurant,Cocktail Bar


In [294]:
capitals_merged.loc[capitals_merged['Cluster Labels'] == 1]

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Luxembourg,Luxembourg,49.815868,6.129675,1,Racetrack,BBQ Joint,Zoo,Factory,Eastern European Restaurant,Electronics Store,Escape Room,Ethiopian Restaurant,Event Space,Exhibit


In [295]:
capitals_merged.loc[capitals_merged['Cluster Labels'] == 2]

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Norway,Oslo,59.91333,10.73897,2,Coffee Shop,Indian Restaurant,Restaurant,Scandinavian Restaurant,Tapas Restaurant,Park,Burger Joint,Cocktail Bar,Bar,Movie Theater
Ireland,Dublin,53.349764,-6.260273,2,Coffee Shop,Café,Pub,Park,Italian Restaurant,Burger Joint,Irish Pub,Theater,Indie Movie Theater,Cocktail Bar
Iceland,Reykjavik,64.145981,-21.942237,2,Bar,Café,Seafood Restaurant,Restaurant,Coffee Shop,Hotel,Scandinavian Restaurant,Burger Joint,Bakery,Concert Hall
Sweden,Stockholm,59.325117,18.071093,2,Scandinavian Restaurant,Hotel,Coffee Shop,Bakery,Café,Clothing Store,Bookstore,Plaza,Gym / Fitness Center,Falafel Restaurant
Singapore,Singapore,1.357107,103.819499,2,Chinese Restaurant,Café,Thai Restaurant,Trail,Asian Restaurant,Coffee Shop,Restaurant,Indian Restaurant,Ice Cream Shop,Bakery
Denmark,Copenhagen,55.686724,12.570072,2,Coffee Shop,Bakery,Beer Bar,Scandinavian Restaurant,Park,Wine Bar,Cocktail Bar,Pizza Place,Café,Plaza
Finland,Helsinki,60.167488,24.942747,2,Scandinavian Restaurant,Hotel,Café,Coffee Shop,Pizza Place,Park,Middle Eastern Restaurant,Toy / Game Store,Beer Bar,Filipino Restaurant
New Zealand,Wellington,-41.288795,174.777211,2,Coffee Shop,Restaurant,Café,Bar,Hotel,Brewery,Italian Restaurant,Park,Pizza Place,Vietnamese Restaurant
Slovenia,Ljubljana,46.04998,14.50686,2,Café,Eastern European Restaurant,Plaza,Coffee Shop,Restaurant,Park,Bar,Bistro,Pub,Burger Joint
Estonia,Tallinn,59.437216,24.745369,2,Scenic Lookout,Wine Bar,Restaurant,Hotel,Bar,Coffee Shop,Park,Cocktail Bar,Asian Restaurant,Pub


In [296]:
capitals_merged.loc[capitals_merged['Cluster Labels'] == 3]

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Switzerland,Bern,46.948271,7.451451,3,Café,Bar,Swiss Restaurant,Restaurant,Plaza,Park,Hotel,Ice Cream Shop,Science Museum,Pizza Place
Netherlands,Amsterdam,52.37276,4.893604,3,Hotel,Bar,Restaurant,Coffee Shop,Italian Restaurant,Canal,Sandwich Place,Cocktail Bar,Beer Bar,Market
Liechtenstein,Vaduz,47.139286,9.522796,3,Hotel,Café,History Museum,Italian Restaurant,Supermarket,Gas Station,Swiss Restaurant,Tourist Information Center,Bed & Breakfast,Bar
Israel,Jerusalem,31.795924,35.211981,3,Hotel,Café,Middle Eastern Restaurant,Italian Restaurant,Bar,Restaurant,Mediterranean Restaurant,Coffee Shop,BBQ Joint,Ice Cream Shop
Czech Republic,Prague,50.087465,14.421254,3,Café,Hotel,Cocktail Bar,Dessert Shop,Italian Restaurant,Island,Boutique,Plaza,Beer Garden,Garden
Malta,Valletta,35.898982,14.513676,3,Mediterranean Restaurant,Restaurant,Italian Restaurant,Historic Site,Café,Bar,Cocktail Bar,Hotel,Garden,Ice Cream Shop


In [297]:
capitals_merged.loc[capitals_merged['Cluster Labels'] == 4]

Country,Capital,lat,lon,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
Australia,Canberra,-35.297591,149.101268,4,Park,Café,Pizza Place,Sports Club,Gym / Fitness Center,Doner Restaurant,Gas Station,Fish & Chips Shop,Garden Center,Beach


## Conclusion <a name="conclusion"></a>

This project has shown that machine learning algorithms can support key business decisions, such as where to open a new restaurant.

Important decisions should not be made on data alone, however. The results can help narrow down the choice, but the final decision should rest with an expert in the particular type of business.