## Capstone Project - The Battle of Neighborhoods

### Introduction/Business Problem

In this project we will try to find an optimal location for a hotel. Specifically, this report is targeted at stakeholders interested in opening a new hotel in one of the regional capital cities of Chile.

Since Chile does not have many hotels, we will try to detect **locations that are not already crowded with hotels**. We are also particularly interested in **areas with no hotels among the principal venues**.

We will use data science techniques to shortlist the most promising areas based on these criteria. The advantages of each area will then be clearly expressed so that stakeholders can choose the best possible final location.

## Data

Based on the definition of our problem, the factors that will influence our decision are:
* Location of each regional capital city in Chile
* List of the venues of each capital city

The data used for this project was taken from the Wikipedia page https://es.wikipedia.org/wiki/Anexo:Ciudades_de_Chile, which was processed in Excel to obtain a table with the latitude and longitude of each capital city.

The Foursquare API will be used to collect the venues registered in each capital city within a specific radius of its location.

In addition, various Python packages will be used to create maps and machine learning models to understand the collected data and give the best advice for the business problem.

## Getting the data for the regional capital cities in Chile

#### Importing libraries required for the Notebook

In [1]:
import pandas as pd
import numpy as np
import requests

#### Getting the locations for the Capital Cities in Chile from .csv file (Wikipedia Table)

The file was uploaded to the IBM Data Cloud and imported with the code shown below

In [2]:
# The code was removed by Watson Studio for sharing.

Unnamed: 0,City,Province,Region,Latitude,Longitude
0,Arica,Arica,Arica y Parinacota,-18.455,-70.29
1,Iquique,Iquique,Tarapaca,-20.244,-70.139
2,Antofagasta,Antofagasta,Antofagasta,-23.651,-70.395
3,Calama,El Loa,Antofagasta,-22.474,-68.924
4,Copiapo,Copiapo,Atacama,-27.375,-70.329
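Because the original loading cell was stripped before sharing, the following is only a sketch of how such a table might be loaded with pandas. The inline CSV text is an assumption standing in for the uploaded file, and it reproduces only the five rows displayed above; `data_chile` is the name the rest of the notebook expects.

```python
import io

import pandas as pd

# Stand-in for the uploaded .csv file (assumption: the real file has the
# same columns); only the five rows displayed above are reproduced here.
csv_text = """City,Province,Region,Latitude,Longitude
Arica,Arica,Arica y Parinacota,-18.455,-70.29
Iquique,Iquique,Tarapaca,-20.244,-70.139
Antofagasta,Antofagasta,Antofagasta,-23.651,-70.395
Calama,El Loa,Antofagasta,-22.474,-68.924
Copiapo,Copiapo,Atacama,-27.375,-70.329
"""

data_chile = pd.read_csv(io.StringIO(csv_text))

# Sanity checks: coordinates should fall inside a rough bounding box for Chile.
assert data_chile['Latitude'].between(-56, -17).all()
assert data_chile['Longitude'].between(-76, -66).all()
print(data_chile.head())
```

Checking the coordinate ranges right after loading is a cheap way to catch swapped latitude/longitude columns or decimal-separator problems introduced during the Excel step.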


# Explore and cluster the Capital Cities in Chile

Import the libraries required for this part of the assignment

In [3]:
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # installs folium; comment this line out if folium is already available
import folium # map rendering library

print('Libraries imported.')


Use the geopy library to get the latitude and longitude of Santiago (capital of the Metropolitan Region of Chile)

In [4]:
address = 'Santiago, Chile'

# Nominatim requires a user_agent string that identifies the application
geolocator = Nominatim(user_agent="chile_explorer")
location = geolocator.geocode(address)
latitude_santiago = location.latitude
longitude_santiago = location.longitude
print('The geographical coordinates of Santiago, Chile are {}, {}.'.format(latitude_santiago, longitude_santiago))
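geopy's `geocode` returns `None` when an address cannot be resolved, in which case reading `location.latitude` would fail. A minimal sketch of a guard is shown below; `geocode_or_default`, `FakeGeolocator` and `SANTIAGO_FALLBACK` are hypothetical names introduced only for illustration, and the fake geolocator exists so the example runs without network access.

```python
def geocode_or_default(geolocator, address, default):
    """Return (lat, lon) for `address`, or `default` when nothing is found."""
    location = geolocator.geocode(address)
    if location is None:
        return default
    return location.latitude, location.longitude


class FakeGeolocator:
    """Offline stand-in (hypothetical) that never finds the address."""

    def geocode(self, address):
        return None


# Approximate coordinates of central Santiago, used only as a fallback.
SANTIAGO_FALLBACK = (-33.45, -70.67)
lat, lon = geocode_or_default(FakeGeolocator(), 'Santiago, Chile', SANTIAGO_FALLBACK)
print(lat, lon)
```

In the notebook, the same helper could be called with the real `Nominatim` instance in place of `FakeGeolocator()`.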

In [5]:
# Alias the dataframe data_chile as df_santiago, the name used in the rest of the notebook

df_santiago = data_chile
df_santiago

Unnamed: 0,City,Province,Region,Latitude,Longitude
0,Arica,Arica,Arica y Parinacota,-18.455,-70.29
1,Iquique,Iquique,Tarapaca,-20.244,-70.139
2,Antofagasta,Antofagasta,Antofagasta,-23.651,-70.395
3,Calama,El Loa,Antofagasta,-22.474,-68.924
4,Copiapo,Copiapo,Atacama,-27.375,-70.329
5,La Serena,Elqui,Coquimbo,-29.907,-71.247
6,Valparaiso,Valparaiso,Valparaiso,-33.05,-71.616
7,Rancagua,Cachapoal,Lib. Gral. Bernardo OHiggins,-34.162,-70.741
8,Talca,Talca,Maule,-35.423,-71.657
9,Chillan,Diguillin,Nuble,-36.601,-72.109


Create a map centered on the coordinates of Santiago, adding markers for the cities from the df_santiago dataframe

In [6]:
# Chile is one of the longest countries in the world (north to south), so it is hard to fit it all in one view. Drag the map to scroll along it.

map_santiago = folium.Map(location=[latitude_santiago, longitude_santiago], zoom_start=3.8)

# Add markers to the map for the cities
for lat, lng, city in zip(df_santiago['Latitude'], df_santiago['Longitude'], df_santiago['City']):
    label = '{}'.format(city)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_santiago)  
    
map_santiago
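Since the country is hard to center by hand, one option is to derive the map center and bounds from the city coordinates themselves. The sketch below uses only the ten cities displayed in the table above, so the resulting numbers are illustrative rather than the values for all 17 cities.

```python
import pandas as pd

# The ten cities displayed in the table above.
cities = pd.DataFrame({
    'City': ['Arica', 'Iquique', 'Antofagasta', 'Calama', 'Copiapo',
             'La Serena', 'Valparaiso', 'Rancagua', 'Talca', 'Chillan'],
    'Latitude': [-18.455, -20.244, -23.651, -22.474, -27.375,
                 -29.907, -33.05, -34.162, -35.423, -36.601],
    'Longitude': [-70.29, -70.139, -70.395, -68.924, -70.329,
                  -71.247, -71.616, -70.741, -71.657, -72.109],
})

south, north = float(cities['Latitude'].min()), float(cities['Latitude'].max())
west, east = float(cities['Longitude'].min()), float(cities['Longitude'].max())

# Midpoint of the bounding box and its south-west / north-east corners.
center = [(south + north) / 2, (west + east) / 2]
bounds = [[south, west], [north, east]]
print(center)
print(bounds)
```

With folium, `center` could be passed as `location=` to `folium.Map`, and calling `fit_bounds(bounds)` on the map would zoom it so that all markers are visible.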

#### Define Foursquare Credentials and Version

In [7]:
CLIENT_ID = 'Deleted for sharing' # your Foursquare ID
CLIENT_SECRET = 'Deleted for sharing' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: Deleted for sharin

### Explore Capital Cities in Chile

#### Let's create a function to explore all the cities defined in the dataframe df_santiago

In [8]:
def getNearbyVenues(names, latitudes, longitudes):

    # A large radius (40,000 metres) is used because each query covers a whole regional capital city

    radius=40000 
    LIMIT=100
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
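The function above indexes straight into the decoded response, so an error response or a venue without categories would raise an exception. The sketch below shows one way the parsing step might be made more defensive. `extract_venues` is a hypothetical helper, the `'Unknown'` label is an assumption, and the sample payload is hand-written to mimic only the keys that the function above reads.

```python
def extract_venues(payload, city, lat, lng):
    """Turn one decoded `venues/explore` response into a list of row tuples."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Venues without a category are kept and labelled 'Unknown' (assumption).
        category = categories[0]['name'] if categories else 'Unknown'
        rows.append((city, lat, lng, venue['name'],
                     venue['location']['lat'], venue['location']['lng'], category))
    return rows


# Hand-written payload; the second venue is invented to exercise the fallback.
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Playa Chinchorro',
               'location': {'lat': -18.454318, 'lng': -70.301581},
               'categories': [{'name': 'Beach'}]}},
    {'venue': {'name': 'Venue without category',
               'location': {'lat': -18.46, 'lng': -70.30},
               'categories': []}},
]}]}}

rows = extract_venues(sample, 'Arica', -18.455, -70.29)
print(rows)

# A payload without a 'response' key yields an empty list instead of a KeyError.
print(extract_venues({'meta': {'code': 400}}, 'Arica', -18.455, -70.29))
```

Separating the parsing from the HTTP call also makes it possible to test the row-building logic offline.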

#### Now write the code to run the above function on each city and create a new dataframe called *santiago_venues*.

In [9]:
santiago_venues = getNearbyVenues(names=df_santiago['City'],
                                   latitudes=df_santiago['Latitude'],
                                   longitudes=df_santiago['Longitude']
                                  )

Arica
Iquique
Antofagasta
Calama
Copiapo
La Serena
Valparaiso
Rancagua
Talca
Chillan
Concepcion
Temuco
Valdivia
Puerto Montt
Coyhaique
Punta Arenas
Santiago


In [10]:
print(santiago_venues.shape)
santiago_venues

(1237, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Arica,-18.455,-70.290000,Playa Chinchorro,-18.454318,-70.301581,Beach
1,Arica,-18.455,-70.290000,Rayú,-18.462694,-70.304099,South American Restaurant
2,Arica,-18.455,-70.290000,Playa Las Machas,-18.445029,-70.298442,Surf Spot
3,Arica,-18.455,-70.290000,Valle de Azapa,-18.492309,-70.280021,Field
4,Arica,-18.455,-70.290000,Pacay Sangucheria,-18.454745,-70.296948,Sandwich Place
5,Arica,-18.455,-70.290000,La Fontana,-18.484163,-70.303503,Ice Cream Shop
6,Arica,-18.455,-70.290000,SushiLounge,-18.483530,-70.293474,Sushi Restaurant
7,Arica,-18.455,-70.290000,Playa El Laucho,-18.487818,-70.326610,Beach
8,Arica,-18.455,-70.290000,Milkhouse,-18.483351,-70.304355,Snack Place
9,Arica,-18.455,-70.290000,Lucano Pizzeria Napoletana,-18.474736,-70.315091,Italian Restaurant


Let's check how many venues were returned for each city

In [11]:
santiago_venues.groupby('City').count()

City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Antofagasta,100,100,100,100,100,100
Arica,53,53,53,53,53,53
Calama,54,54,54,54,54,54
Chillan,100,100,100,100,100,100
Concepcion,100,100,100,100,100,100
Copiapo,43,43,43,43,43,43
Coyhaique,47,47,47,47,47,47
Iquique,51,51,51,51,51,51
La Serena,76,76,76,76,76,76
Puerto Montt,100,100,100,100,100,100


#### Let's find out how many unique categories can be curated from all the returned venues

In [12]:
print('There are {} unique categories.'.format(len(santiago_venues['Venue Category'].unique())))

There are 208 unique categories.


In [13]:
santiago_venues.shape

(1237, 7)

## Analyze Each City

In [14]:
# one hot encoding
santiago_onehot = pd.get_dummies(santiago_venues[['Venue Category']], prefix="", prefix_sep="")

# add city column back to dataframe; it is named 'City_fixed' because 'City' is also a venue category
santiago_onehot['City_fixed'] = santiago_venues['City'] 

# move the City_fixed column to the first position

fixed_columns = [santiago_onehot.columns[-1]] + list(santiago_onehot.columns[:-1])


santiago_onehot = santiago_onehot[fixed_columns]

santiago_onehot.head()

Unnamed: 0,City_fixed,Airport,Airport Terminal,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,...,Vineyard,Waste Facility,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Yoga Studio,Zoo
0,Arica,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Arica,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Arica,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Arica,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Arica,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Next, let's group rows by city, taking the mean of the frequency of occurrence of each category

In [15]:
santiago_grouped = santiago_onehot.groupby('City_fixed').mean().reset_index()
santiago_grouped

Unnamed: 0,City_fixed,Airport,Airport Terminal,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,...,Vineyard,Waste Facility,Water Park,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Winery,Yoga Studio,Zoo
0,Antofagasta,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.01,...,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0
1,Arica,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Calama,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chillan,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0
4,Concepcion,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0
5,Copiapo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Coyhaique,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Iquique,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,La Serena,0.0,0.0,0.013158,0.0,0.013158,0.013158,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.013158
9,Puerto Montt,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
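To see why the mean of the one-hot columns can be read as a frequency, here is a small sketch on toy data. The cities `A` and `B` and their venues are hypothetical; only the technique (one-hot encoding followed by a grouped mean) mirrors the cells above.

```python
import pandas as pd

# Toy data (hypothetical): four venues in city A, two in city B.
venues = pd.DataFrame({
    'City': ['A', 'A', 'A', 'A', 'B', 'B'],
    'Venue Category': ['Hotel', 'Beach', 'Beach', 'Park', 'Hotel', 'Hotel'],
})

# One 0/1 column per category, then the city label in front.
onehot = pd.get_dummies(venues['Venue Category'], dtype=float)
onehot.insert(0, 'City', venues['City'])

# The mean of a 0/1 column is the fraction of rows equal to 1,
# i.e. the share of that city's venues falling in the category.
freq = onehot.groupby('City').mean()
print(freq)
```

Each row of `freq` sums to 1, so the values are shares of the venues returned for a city, not absolute counts.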


#### Let's confirm the new size

In [16]:
santiago_grouped.shape

(17, 209)

#### Let's print each city along with its top 10 most common venues

In [17]:
num_top_venues = 10

for hood in santiago_grouped['City_fixed']:
    print("----"+hood+"----")
    temp = santiago_grouped[santiago_grouped['City_fixed'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Antofagasta----
                 venue  freq
0                Beach  0.10
1           Restaurant  0.06
2   Chinese Restaurant  0.05
3  Peruvian Restaurant  0.04
4               Bistro  0.04
5                 Park  0.04
6   Seafood Restaurant  0.03
7                Hotel  0.03
8         Soccer Field  0.03
9          Pizza Place  0.03


----Arica----
                           venue  freq
0                     Restaurant  0.09
1                          Beach  0.09
2                          Hotel  0.06
3                 Ice Cream Shop  0.06
4                      Surf Spot  0.06
5      South American Restaurant  0.04
6                            Pub  0.04
7                 History Museum  0.04
8             Chinese Restaurant  0.04
9  Vegetarian / Vegan Restaurant  0.04


----Calama----
                  venue  freq
0                 Hotel  0.15
1            Restaurant  0.11
2                   Gym  0.06
3        Breakfast Spot  0.06
4  Fast Food Restaurant  0.06
5     Convenience S
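Because the business question is about hotel saturation, the `Hotel` frequency can be compared across cities directly. The sketch below uses only the three values visible in the printed output above (Antofagasta 0.03, Arica 0.06, Calama 0.15); it is an illustration, not the full ranking.

```python
import pandas as pd

# Hotel frequencies copied from the printed output above.
hotel_freq = pd.Series(
    {'Antofagasta': 0.03, 'Arica': 0.06, 'Calama': 0.15},
    name='Hotel',
)

# Lowest share first: fewer hotels among the venues returned for the city.
ranking = hotel_freq.sort_values()
print(ranking)
```

On the full data, the equivalent ranking could be obtained with `santiago_grouped.set_index('City_fixed')['Hotel'].sort_values()`. Note that these are shares of at most 100 returned venues per city, so a low value suggests, but does not prove, a low number of hotels.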

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [18]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Now let's create the new dataframe and display the top 10 venues for each city.

In [19]:
import numpy as np

In [20]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City_fixed']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['City_fixed'] = santiago_grouped['City_fixed']

for ind in np.arange(santiago_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(santiago_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted

Unnamed: 0,City_fixed,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Antofagasta,Beach,Restaurant,Chinese Restaurant,Bistro,Park,Peruvian Restaurant,Seafood Restaurant,Hotel,Soccer Field,Pizza Place
1,Arica,Restaurant,Beach,Hotel,Ice Cream Shop,Surf Spot,Chinese Restaurant,History Museum,South American Restaurant,Pub,Vegetarian / Vegan Restaurant
2,Calama,Hotel,Restaurant,Breakfast Spot,Gym,Fast Food Restaurant,Historic Site,Convenience Store,Nightclub,Chinese Restaurant,Mountain
3,Chillan,Plaza,Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Park,Hotel,Peruvian Restaurant,Café,Asian Restaurant
4,Concepcion,Beach,Sandwich Place,Pizza Place,Italian Restaurant,Hotel,Park,City,Ice Cream Shop,Sushi Restaurant,Plaza
5,Copiapo,Hotel,Nightclub,Pizza Place,Café,History Museum,Diner,Restaurant,Pub,Sushi Restaurant,Movie Theater
6,Coyhaique,Bed & Breakfast,Café,Restaurant,Park,Pub,Sushi Restaurant,BBQ Joint,Scenic Lookout,Sandwich Place,Hotel
7,Iquique,Hotel,Plaza,Ice Cream Shop,Beach,Latin American Restaurant,Restaurant,Museum,Pizza Place,Sandwich Place,Coffee Shop
8,La Serena,Seafood Restaurant,Restaurant,Beach,Burger Joint,Surf Spot,Pizza Place,Dessert Shop,Spa,Sandwich Place,Pool
9,Puerto Montt,Hotel,Restaurant,Café,Scenic Lookout,Beach,Boat or Ferry,German Restaurant,History Museum,Rental Car Location,Bed & Breakfast
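The column labels above are built by indexing a short suffix list and falling back to "th" when the index runs out. That works for ten columns but would give "21th" for larger counts. A small helper, sketched below under the hypothetical name `ordinal`, handles the general case without relying on an exception.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    if 10 <= n % 100 <= 20:
        # 10th through 20th (and 110th, 111th, ...) always take 'th'.
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)


columns = ['City_fixed'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns)
```

The resulting `columns` list has the same labels as the table above.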


## Cluster Neighborhoods

Run *k*-means to group the cities into 5 clusters.

In [21]:
# set number of clusters
kclusters = 5

santiago_grouped_clustering = santiago_grouped.drop(columns='City_fixed')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(santiago_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([2, 1, 3, 4, 2, 3, 0, 4, 1, 0], dtype=int32)
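The value `kclusters = 5` is chosen by hand in this notebook. One common heuristic for checking such a choice, not used in the original analysis, is the elbow method: fit *k*-means for several values of *k* and look for the point where inertia stops dropping sharply. The sketch below runs on synthetic data standing in for `santiago_grouped_clustering`; the three centers and the noise level are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in (assumption): 17 points, like the 17 cities,
# scattered tightly around three well-separated centers.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.1 * rng.standard_normal((n, 2))
               for c, n in zip(centers, (6, 6, 5))])

# Inertia is the sum of squared distances of samples to their closest center.
inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

for k, value in inertias.items():
    print(k, round(value, 3))
```

On this synthetic data the inertia falls steeply up to *k* = 3 and flattens afterwards. With only 17 real cities and about 200 sparse features, the curve may be much less clear, so the result should be read with caution.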

Let's create a new dataframe that includes the cluster as well as the top 10 venues for each city.

In [22]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

santiago_merged = df_santiago

# join the cluster labels and top venues onto df_santiago, which holds latitude/longitude for each city
santiago_merged = santiago_merged.join(neighborhoods_venues_sorted.set_index('City_fixed'), on='City')

santiago_merged

Unnamed: 0,City,Province,Region,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arica,Arica,Arica y Parinacota,-18.455,-70.29,1,Restaurant,Beach,Hotel,Ice Cream Shop,Surf Spot,Chinese Restaurant,History Museum,South American Restaurant,Pub,Vegetarian / Vegan Restaurant
1,Iquique,Iquique,Tarapaca,-20.244,-70.139,4,Hotel,Plaza,Ice Cream Shop,Beach,Latin American Restaurant,Restaurant,Museum,Pizza Place,Sandwich Place,Coffee Shop
2,Antofagasta,Antofagasta,Antofagasta,-23.651,-70.395,2,Beach,Restaurant,Chinese Restaurant,Bistro,Park,Peruvian Restaurant,Seafood Restaurant,Hotel,Soccer Field,Pizza Place
3,Calama,El Loa,Antofagasta,-22.474,-68.924,3,Hotel,Restaurant,Breakfast Spot,Gym,Fast Food Restaurant,Historic Site,Convenience Store,Nightclub,Chinese Restaurant,Mountain
4,Copiapo,Copiapo,Atacama,-27.375,-70.329,3,Hotel,Nightclub,Pizza Place,Café,History Museum,Diner,Restaurant,Pub,Sushi Restaurant,Movie Theater
5,La Serena,Elqui,Coquimbo,-29.907,-71.247,1,Seafood Restaurant,Restaurant,Beach,Burger Joint,Surf Spot,Pizza Place,Dessert Shop,Spa,Sandwich Place,Pool
6,Valparaiso,Valparaiso,Valparaiso,-33.05,-71.616,2,Scenic Lookout,Coffee Shop,Pizza Place,Italian Restaurant,Peruvian Restaurant,Restaurant,Hotel,Beach,Seafood Restaurant,Bakery
7,Rancagua,Cachapoal,Lib. Gral. Bernardo OHiggins,-34.162,-70.741,4,Restaurant,Pizza Place,Hotel,BBQ Joint,Soccer Field,Plaza,Gastropub,Gas Station,Ice Cream Shop,Sandwich Place
8,Talca,Talca,Maule,-35.423,-71.657,2,Park,Coffee Shop,Ice Cream Shop,Vineyard,Pizza Place,Other Great Outdoors,Cocktail Bar,Resort,Restaurant,Bookstore
9,Chillan,Diguillin,Nuble,-36.601,-72.109,4,Plaza,Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Park,Hotel,Peruvian Restaurant,Café,Asian Restaurant
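The join above is a left join keyed on the city name, so any city for which Foursquare returned no venues would end up with missing cluster labels. In this run all 17 cities have venues, but the behaviour is worth knowing. The sketch below uses a hypothetical three-city table in which one city has no label.

```python
import pandas as pd

cities = pd.DataFrame({'City': ['Arica', 'Iquique', 'Calama']})

# Labels for only two of the three cities (hypothetical gap).
labels = pd.DataFrame({'City_fixed': ['Arica', 'Calama'],
                       'Cluster Labels': [1, 3]})

merged = cities.join(labels.set_index('City_fixed'), on='City')
print(merged)

# A missing match turns the integer column into floats with NaN.
missing = merged[merged['Cluster Labels'].isna()]
print(missing['City'].tolist())
```

Dropping or inspecting such rows before plotting avoids errors when the label is later converted with `int(cluster)`.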


In [23]:
# create map
map_clusters = folium.Map(location=[latitude_santiago, longitude_santiago], zoom_start=4)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(santiago_merged['Latitude'], santiago_merged['Longitude'], santiago_merged['City'], santiago_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[int(cluster)-1],
        fill=True,
        fill_color=rainbow[int(cluster)-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
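The color setup above can be expressed more directly as a mapping from cluster label to color. This is a sketch of an alternative, not the notebook's original code; `palette` and `cluster_color` are hypothetical names.

```python
import numpy as np
import matplotlib.cm as cm
import matplotlib.colors as colors

kclusters = 5

# One evenly spaced rainbow color per cluster label 0..kclusters-1.
palette = [colors.rgb2hex(rgba) for rgba in cm.rainbow(np.linspace(0, 1, kclusters))]
cluster_color = {label: palette[label] for label in range(kclusters)}
print(cluster_color)
```

Looking colors up as `cluster_color[int(cluster)]` makes the label-to-color correspondence explicit, which helps when adding a legend to the map.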

In [24]:
santiago_merged.loc[santiago_merged['Cluster Labels'] == 0, 
                    santiago_merged.columns[[0] + list(range(5, santiago_merged.shape[1]))]]

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Valdivia,0,Restaurant,Scenic Lookout,Hotel,Bed & Breakfast,Beach,Bar,Café,Brewery,Historic Site,Park
13,Puerto Montt,0,Hotel,Restaurant,Café,Scenic Lookout,Beach,Boat or Ferry,German Restaurant,History Museum,Rental Car Location,Bed & Breakfast
14,Coyhaique,0,Bed & Breakfast,Café,Restaurant,Park,Pub,Sushi Restaurant,BBQ Joint,Scenic Lookout,Sandwich Place,Hotel
15,Punta Arenas,0,Restaurant,Hotel,Café,History Museum,Other Great Outdoors,Hostel,Scenic Lookout,Tea Room,Gastropub,Diner


In [25]:
santiago_merged.loc[santiago_merged['Cluster Labels'] == 1, 
                    santiago_merged.columns[[0] + list(range(5, santiago_merged.shape[1]))]]

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Arica,1,Restaurant,Beach,Hotel,Ice Cream Shop,Surf Spot,Chinese Restaurant,History Museum,South American Restaurant,Pub,Vegetarian / Vegan Restaurant
5,La Serena,1,Seafood Restaurant,Restaurant,Beach,Burger Joint,Surf Spot,Pizza Place,Dessert Shop,Spa,Sandwich Place,Pool


In [26]:
santiago_merged.loc[santiago_merged['Cluster Labels'] == 2, 
                    santiago_merged.columns[[0] + list(range(5, santiago_merged.shape[1]))]]

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,Antofagasta,2,Beach,Restaurant,Chinese Restaurant,Bistro,Park,Peruvian Restaurant,Seafood Restaurant,Hotel,Soccer Field,Pizza Place
6,Valparaiso,2,Scenic Lookout,Coffee Shop,Pizza Place,Italian Restaurant,Peruvian Restaurant,Restaurant,Hotel,Beach,Seafood Restaurant,Bakery
8,Talca,2,Park,Coffee Shop,Ice Cream Shop,Vineyard,Pizza Place,Other Great Outdoors,Cocktail Bar,Resort,Restaurant,Bookstore
10,Concepcion,2,Beach,Sandwich Place,Pizza Place,Italian Restaurant,Hotel,Park,City,Ice Cream Shop,Sushi Restaurant,Plaza
16,Santiago,2,Park,Bakery,Pizza Place,Vineyard,Golf Course,Snack Place,Scenic Lookout,Deli / Bodega,Museum,Mountain


In [27]:
santiago_merged.loc[santiago_merged['Cluster Labels'] == 3, 
                    santiago_merged.columns[[0] + list(range(5, santiago_merged.shape[1]))]]

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
3,Calama,3,Hotel,Restaurant,Breakfast Spot,Gym,Fast Food Restaurant,Historic Site,Convenience Store,Nightclub,Chinese Restaurant,Mountain
4,Copiapo,3,Hotel,Nightclub,Pizza Place,Café,History Museum,Diner,Restaurant,Pub,Sushi Restaurant,Movie Theater


In [28]:
santiago_merged.loc[santiago_merged['Cluster Labels'] == 4, 
                    santiago_merged.columns[[0] + list(range(5, santiago_merged.shape[1]))]]

Unnamed: 0,City,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Iquique,4,Hotel,Plaza,Ice Cream Shop,Beach,Latin American Restaurant,Restaurant,Museum,Pizza Place,Sandwich Place,Coffee Shop
7,Rancagua,4,Restaurant,Pizza Place,Hotel,BBQ Joint,Soccer Field,Plaza,Gastropub,Gas Station,Ice Cream Shop,Sandwich Place
9,Chillan,4,Plaza,Restaurant,Sushi Restaurant,Pizza Place,Coffee Shop,Park,Hotel,Peruvian Restaurant,Café,Asian Restaurant
11,Temuco,4,Plaza,Café,Sandwich Place,Sushi Restaurant,Tea Room,Pizza Place,Diner,Burger Joint,Restaurant,Coffee Shop
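To connect the cluster tables back to the business question, one can check where `Hotel` appears in each city's top-10 list. The sketch below copies three rows from the tables above; `hotel_rank` is a hypothetical helper, and matching only the exact category `Hotel` (ignoring related categories such as `Bed & Breakfast` or `Hostel`) is a simplifying assumption.

```python
def hotel_rank(top_venues, category='Hotel'):
    """Return the 1-based position of `category` in the list, or None if absent."""
    if category in top_venues:
        return top_venues.index(category) + 1
    return None


# Top-10 lists copied from the cluster tables above (three cities only).
top10 = {
    'Calama': ['Hotel', 'Restaurant', 'Breakfast Spot', 'Gym', 'Fast Food Restaurant',
               'Historic Site', 'Convenience Store', 'Nightclub', 'Chinese Restaurant',
               'Mountain'],
    'Chillan': ['Plaza', 'Restaurant', 'Sushi Restaurant', 'Pizza Place', 'Coffee Shop',
                'Park', 'Hotel', 'Peruvian Restaurant', 'Café', 'Asian Restaurant'],
    'Temuco': ['Plaza', 'Café', 'Sandwich Place', 'Sushi Restaurant', 'Tea Room',
               'Pizza Place', 'Diner', 'Burger Joint', 'Restaurant', 'Coffee Shop'],
}

ranks = {city: hotel_rank(venues) for city, venues in top10.items()}
no_hotel = [city for city, rank in ranks.items() if rank is None]
print(ranks)
print(no_hotel)
```

Cities where the rank is `None` have no `Hotel` entry among their ten most common venue categories, which matches the second criterion stated in the introduction.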
