# Greetings, dear viewer!

### We have been tasked with finding a suitable sector of Bogota, Colombia in which to establish a brand-new hotel!
### In this notebook we will import data about the localities of Bogota and the venues surrounding them, use k-means clustering to group the localities, and see what characterizes each cluster.
### Let us begin by importing the required libraries.

In [2]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

## Package Plan ##

  environment location: /opt/conda/envs/Python36

  added / updated specs: 
    - geopy


The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.11.28         |           py36_0         149 KB  conda-forge
    openssl-1.1.1d             |       h516909a_0         2.1 MB  conda-forge
    ca-certificates-2019.11.28 |       hecc5488_0         145 KB  conda-forge
    geopy-1.20.0               |             py_0          57 KB  conda-forge
    geographiclib-1.50         |             py_0          34 KB  conda-forge
    ------------------------------------------------------------
                                           Total:         2.5 MB

The following NEW packages will be INSTALLED:

    geographiclib:   1.50-py_0         conda-forge
    geopy:           1.20.0-py_0       conda-forge

The following packages will be UPDATED:

    ca-

In [3]:

!pip install geocoder
import geocoder

Collecting geocoder
  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
     |████████████████████████████████| 102kB 20.2MB/s eta 0:00:01
Collecting ratelim (from geocoder)
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fdd592cad49c/ratelim-0.1.6-py2.py3-none-any.whl
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


#### Now let's get information about the Localities in Bogota

In [4]:
df = pd.read_html('https://es.wikipedia.org/wiki/Anexo:Localidades_de_Bogot%C3%A1')[0]
df.head()

Unnamed: 0,Nº,Localidad,Códigos Postales,Superficie km²[2]​,Población[3]​,Densidad hab/km²
0,1,Usaquén,110111-110151,65.31,501 999,7 686.4
1,2,Chapinero,110211-110231,38.15,139 701,3 661.88
2,3,Santa Fe,110311-110321,45.17,110 048,2 436.3
3,4,San Cristóbal,110411-110441,49.09,404 697,8 243.98
4,5,Usme,110511-110571,215.06,457 302,2 126.39


In [5]:
l = list(df.columns)
l = l[2:6]

In [6]:
l

['Códigos Postales',
 'Superficie km²[2]\u200b',
 'Población[3]\u200b',
 'Densidad hab/km²']

In [7]:
for i in l:
    del df[i]

df.head()

Unnamed: 0,Nº,Localidad
0,1,Usaquén
1,2,Chapinero
2,3,Santa Fe
3,4,San Cristóbal
4,5,Usme


In [8]:
df.rename(columns={"Nº": "Community Code", "Localidad": "Community"}, inplace = True)
df.head()

Unnamed: 0,Community Code,Community
0,1,Usaquén
1,2,Chapinero
2,3,Santa Fe
3,4,San Cristóbal
4,5,Usme


### We will now acquire the coordinates of each locality using the geocoder 

In [9]:
# define a function to get coordinates
def get_latlng(neighborhood):
    # initialize your variable to None
    lat_lng_coords = None
    # loop until you get the coordinates
    while(lat_lng_coords is None):
        g = geocoder.arcgis('{}, Bogota, Colombia'.format(neighborhood))
        lat_lng_coords = g.latlng
    return lat_lng_coords
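
The loop above retries until the geocoder returns coordinates, so a name that never resolves (or a network outage) would keep it running indefinitely. A minimal sketch of a bounded variant follows. The names `get_latlng_bounded`, `lookup`, `max_tries` and `delay` are hypothetical and introduced only for illustration; `lookup` stands in for a call such as `geocoder.arcgis(...).latlng`.

```python
import time

def get_latlng_bounded(query, lookup, max_tries=5, delay=0.0):
    """Call lookup(query) until it returns coordinates or max_tries is reached.

    `lookup` is a hypothetical callable standing in for the real geocoder call;
    it should return [lat, lng] or None.
    """
    for attempt in range(max_tries):
        coords = lookup(query)
        if coords is not None:
            return coords
        time.sleep(delay)  # brief pause before the next attempt
    return None  # the caller decides how to handle an unresolved name


# Usage with a stand-in lookup that fails twice before succeeding
calls = {"n": 0}

def fake_lookup(query):
    calls["n"] += 1
    return [4.6, -74.1] if calls["n"] >= 3 else None

result = get_latlng_bounded("Chapinero, Bogota, Colombia", fake_lookup)
print(result)
```

Returning `None` after a fixed number of attempts leaves it to the caller to drop or hand-correct the locality instead of blocking the notebook.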

In [10]:
coords = [ get_latlng(neighborhood) for neighborhood in df["Community"].tolist() ]

In [11]:
coords

[[4.692590000000052, -74.03008999999997],
 [4.638480000000072, -74.06020999999998],
 [4.594590000000039, -74.06404999999995],
 [4.576430000000073, -74.09313999999995],
 [4.4982800000000225, -74.10744999999997],
 [4.561820000000068, -74.12733999999995],
 [4.609740000000045, -74.18279999999999],
 [4.627480000000048, -74.17021999999997],
 [4.686370000000068, -74.15099999999995],
 [4.701270000000022, -74.11268999999999],
 [4.734380000000044, -74.08562999999998],
 [4.669710000000066, -74.07784999999996],
 [4.623290000000054, -74.07224999999994],
 [4.604310000000055, -74.08978999999994],
 [4.596580000000074, -74.11201999999997],
 [4.633340000000032, -74.10627999999997],
 [4.594370000000026, -74.07688999999993],
 [4.624070000000074, -74.06613999999996],
 [4.553670000000068, -74.14647999999994],
 [4.554740000000038, -74.14691999999997]]
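
The last two coordinate pairs above (Ciudad Bolívar and Sumapaz, by row order) are almost identical, which may be worth a check before relying on them. A small sketch that flags suspiciously close pairs using the haversine distance; the function names and the 0.5 km threshold are assumptions chosen for illustration.

```python
import math
from itertools import combinations

def haversine_km(a, b):
    """Great-circle distance in kilometres between two [lat, lng] pairs."""
    lat1, lng1, lat2, lng2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def close_pairs(names, coords, threshold_km=0.5):
    """Return (name_a, name_b, distance_km) for pairs closer than threshold_km."""
    flagged = []
    for (n1, c1), (n2, c2) in combinations(zip(names, coords), 2):
        d = haversine_km(c1, c2)
        if d < threshold_km:
            flagged.append((n1, n2, d))
    return flagged

# Three of the coordinates from the output above (rounded)
names = ["Usme", "Ciudad Bolívar", "Sumapaz"]
coords = [[4.49828, -74.10745], [4.55367, -74.14648], [4.55474, -74.14692]]
pairs = close_pairs(names, coords)
print(pairs)
```

Any pair reported here would be a candidate for manual verification, since two distinct localities are unlikely to share a reference point.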

In [12]:
df_coords = pd.DataFrame(coords, columns=['Latitude', 'Longitude'])
df['Latitude'] = df_coords['Latitude']
df['Longitude'] = df_coords['Longitude']

In [13]:
print(df.shape)
df.head()


(20, 4)


Unnamed: 0,Community Code,Community,Latitude,Longitude
0,1,Usaquén,4.69259,-74.03009
1,2,Chapinero,4.63848,-74.06021
2,3,Santa Fe,4.59459,-74.06405
3,4,San Cristóbal,4.57643,-74.09314
4,5,Usme,4.49828,-74.10745


In [14]:
# get the coordinates of Bogota
address = 'Bogota, Colombia'
geolocator = Nominatim(user_agent="Coursera Capstone")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Bogota, Colombia are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Bogota, Colombia are 4.59808, -74.0760439.


### Let's see them on a map now

In [15]:
map_dxb = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, neighborhood in zip(df['Latitude'], df['Longitude'], df['Community']):
    label = '{}'.format(neighborhood)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_dxb)  
map_dxb

### Now we will use Foursquare to get info about the venues near each locality

In [16]:
CLIENT_ID = '3YGHGVXACCDCKV442IBQO5VQ3KVFCZPH1XK5T201WVP0EO0N' # your Foursquare ID
CLIENT_SECRET = '4IXEW0VVLD1HIBCB2ODYG3PRLYVRVKNFWSO35POQXUNCHUPF' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: 3YGHGVXACCDCKV442IBQO5VQ3KVFCZPH1XK5T201WVP0EO0N
CLIENT_SECRET:4IXEW0VVLD1HIBCB2ODYG3PRLYVRVKNFWSO35POQXUNCHUPF
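
Hard-coding and printing API keys means they travel with every shared copy of the notebook. A hedged sketch of one alternative: read them from environment variables and mask them when printing. The variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` and the helpers `load_credentials` and `mask` are assumptions made for this example.

```python
import os

def load_credentials(env=os.environ):
    """Read the Foursquare keys from environment variables (names assumed)."""
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first")
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters when printing a secret."""
    return value[:visible] + "*" * (len(value) - visible)

# Usage with a stand-in mapping instead of the real environment
demo_env = {"FOURSQUARE_CLIENT_ID": "ABCD1234", "FOURSQUARE_CLIENT_SECRET": "WXYZ5678"}
CLIENT_ID, CLIENT_SECRET = load_credentials(demo_env)
print("CLIENT_ID: " + mask(CLIENT_ID))
```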


### Get info about the venues in each Community

In [17]:
# Since there are fewer communities than in the lab, each covering quite a large area, let's use higher values for the radius and limit
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT = 200):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
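
The list comprehension in `getNearbyVenues` indexes `v['venue']['categories'][0]` directly, which would raise an error if a venue came back with an empty category list or if the response had no `groups`. A hedged sketch of a more defensive extraction follows. The helper names are hypothetical, and the sample payload is invented to mimic only the fields the function reads; it is not a real API response.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Assemble the explore URL with properly encoded query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params)

def extract_rows(name, lat, lng, payload):
    """Turn one response dictionary into rows, skipping incomplete venues."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        location = venue.get("location", {})
        if not categories or "lat" not in location or "lng" not in location:
            continue  # skip venues missing a category or coordinates
        rows.append((name, lat, lng, venue.get("name"),
                     location["lat"], location["lng"], categories[0]["name"]))
    return rows

# Invented payload that mimics only the fields used above
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "La Tarta", "location": {"lat": 4.694171, "lng": -74.031004},
               "categories": [{"name": "Dessert Shop"}]}},
    {"venue": {"name": "No category", "location": {"lat": 4.69, "lng": -74.03},
               "categories": []}},
]}]}}
rows = extract_rows("Usaquén", 4.69259, -74.03009, sample)
url = build_explore_url("ID", "SECRET", "20180605", 4.5, -74.1, 500, 200)
print(rows)
```

Skipping incomplete venues keeps one malformed entry from aborting the whole loop over localities, at the cost of silently dropping that venue.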

In [18]:


bogota_venues = getNearbyVenues(names=df['Community'],
                                   latitudes=df['Latitude'],
                                   longitudes=df['Longitude']
                                  )

Usaquén
Chapinero
Santa Fe
San Cristóbal
Usme
Tunjuelito
Bosa
Kennedy
Fontibón
Engativá
Suba
Barrios Unidos
Teusaquillo
Los Mártires
Antonio Nariño
Puente Aranda
La Candelaria
Rafael Uribe Uribe
Ciudad Bolívar
Sumapaz


### Checking new dataset with venues

In [19]:
print(bogota_venues.shape)
bogota_venues.head()

(333, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Usaquén,4.69259,-74.03009,La Puerta De Alcalá,4.694399,-74.029996,Spanish Restaurant
1,Usaquén,4.69259,-74.03009,La Provence,4.694381,-74.030867,French Restaurant
2,Usaquén,4.69259,-74.03009,Hotel NH Collection Bogotá Hacienda Royal,4.691981,-74.031946,Hotel
3,Usaquén,4.69259,-74.03009,La Tarta,4.694171,-74.031004,Dessert Shop
4,Usaquén,4.69259,-74.03009,Shake It Funny Bar,4.694474,-74.03015,Dessert Shop


### Checking how many venues were returned for each locality

In [20]:
bogota_venues.groupby('Neighborhood').count()[['Venue Category']]

Neighborhood,Venue Category
Antonio Nariño,4
Barrios Unidos,20
Bosa,7
Chapinero,28
Ciudad Bolívar,2
Engativá,18
Fontibón,4
Kennedy,4
La Candelaria,44
Los Mártires,9


### Checking the unique venue categories, sorted alphabetically

In [21]:
print('There are {} unique categories.'.format(len(bogota_venues['Venue Category'].unique())))

There are 108 unique categories.


In [22]:
with np.printoptions(linewidth=150):
    print(np.sort(bogota_venues['Venue Category'].unique()))

['American Restaurant' 'Arepa Restaurant' 'Argentinian Restaurant' 'Art Gallery' 'Art Museum' 'Asian Restaurant' 'Auto Garage' 'BBQ Joint' 'Bakery'
 'Bar' 'Beer Bar' 'Beer Garden' 'Bike Rental / Bike Share' 'Bookstore' 'Boutique' 'Bowling Alley' 'Brazilian Restaurant' 'Breakfast Spot' 'Brewery'
 'Burger Joint' 'Burrito Place' 'Café' 'Candy Store' 'Caribbean Restaurant' 'Clothing Store' 'Cocktail Bar' 'Coffee Shop' 'Concert Hall'
 'Construction & Landscaping' 'Convenience Store' 'Coworking Space' 'Creperie' 'Cuban Restaurant' 'Cultural Center' 'Cupcake Shop' 'Deli / Bodega'
 'Department Store' 'Dessert Shop' 'Dog Run' 'Donut Shop' 'Electronics Store' 'Farmers Market' 'Fast Food Restaurant' 'Flea Market' 'Flower Shop'
 'Food Court' 'French Restaurant' 'Fried Chicken Joint' 'Furniture / Home Store' 'Gastropub' 'General Entertainment' 'Grocery Store' 'Gun Range'
 'Gym' 'Gym / Fitness Center' 'Gymnastics Gym' 'History Museum' 'Hockey Arena' 'Hockey Field' 'Home Service' 'Hot Dog Joint' 'Hot

### One-hot encoding the venue categories of each Community

In [23]:
# one hot encoding
bogota_onehot = pd.get_dummies(bogota_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
bogota_onehot['Neighborhood'] = bogota_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [bogota_onehot.columns[-1]] + list(bogota_onehot.columns[:-1])
bogota_onehot = bogota_onehot[fixed_columns]

bogota_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Auto Garage,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Candy Store,Caribbean Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Dog Run,Donut Shop,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Flower Shop,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gastropub,General Entertainment,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gymnastics Gym,History Museum,Hockey Arena,Hockey Field,Home Service,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Juice Bar,Latin American Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Motorcycle Shop,Mountain,Movie Theater,Multiplex,Museum,Music Venue,Nightclub,Park,Peruvian Restaurant,Pizza Place,Plaza,Pub,Recreation Center,Restaurant,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soccer Field,South American Restaurant,Spanish Restaurant,Steakhouse,Sushi Restaurant,Taco Place,Tea Room,Tennis Court,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Whisky Bar,Wings Joint,Women's Store
0,Usaquén,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
1,Usaquén,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Usaquén,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Usaquén,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Usaquén,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [24]:
bogota_onehot.shape

(333, 109)

### Grouping Neighborhoods and checking frequency

In [25]:
bogota_grouped = bogota_onehot.groupby('Neighborhood').mean().reset_index()
bogota_grouped

Unnamed: 0,Neighborhood,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Auto Garage,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Candy Store,Caribbean Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Dog Run,Donut Shop,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Flower Shop,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gastropub,General Entertainment,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gymnastics Gym,History Museum,Hockey Arena,Hockey Field,Home Service,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Juice Bar,Latin American Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Motorcycle Shop,Mountain,Movie Theater,Multiplex,Museum,Music Venue,Nightclub,Park,Peruvian Restaurant,Pizza Place,Plaza,Pub,Recreation Center,Restaurant,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soccer Field,South American Restaurant,Spanish Restaurant,Steakhouse,Sushi Restaurant,Taco Place,Tea Room,Tennis Court,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Whisky Bar,Wings Joint,Women's Store
0,Antonio Nariño,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Barrios Unidos,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bosa,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chapinero,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.035714,0.035714,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.071429,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.035714,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.035714,0.0,0.035714,0.0
4,Ciudad Bolívar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Engativá,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.055556,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.055556,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.055556,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0
6,Fontibón,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Kennedy,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25
8,La Candelaria,0.0,0.0,0.022727,0.022727,0.068182,0.0,0.0,0.022727,0.045455,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.159091,0.0,0.022727,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.113636,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.022727,0.0,0.0,0.068182,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.113636,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0
9,Los Mártires,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.222222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [26]:
bogota_grouped.shape

(20, 109)
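
To make the two steps above concrete: one-hot encoding turns each venue into a row of 0/1 flags, and taking the mean per neighborhood turns those flags into the share of that neighborhood's venues falling in each category. A small sketch on made-up rows (the data below is illustrative and not taken from the Foursquare results):

```python
import pandas as pd

toy = pd.DataFrame({
    "Neighborhood": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Café", "Café", "Hotel", "Park", "Hotel", "Hotel"],
})

# One column per category, one row per venue
onehot = pd.get_dummies(toy[["Venue Category"]], prefix="", prefix_sep="").astype(float)
onehot.insert(0, "Neighborhood", toy["Neighborhood"])

# Mean of the 0/1 flags = fraction of the neighborhood's venues in each category
grouped = onehot.groupby("Neighborhood").mean().reset_index()
print(grouped)
```

Because each venue contributes exactly one flag, every row of `grouped` sums to 1, so localities with very few venues (such as those with 2 or 4 above) end up with coarse shares like 0.5 or 0.25.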

### I'll copy the DataFrame in case something goes wrong

In [28]:
test = bogota_grouped.copy()

In [29]:
test.head(n=6)

Unnamed: 0,Neighborhood,American Restaurant,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Auto Garage,BBQ Joint,Bakery,Bar,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bookstore,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Burger Joint,Burrito Place,Café,Candy Store,Caribbean Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Concert Hall,Construction & Landscaping,Convenience Store,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Dog Run,Donut Shop,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Flower Shop,Food Court,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Gastropub,General Entertainment,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gymnastics Gym,History Museum,Hockey Arena,Hockey Field,Home Service,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Italian Restaurant,Japanese Restaurant,Juice Bar,Latin American Restaurant,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Miscellaneous Shop,Mobile Phone Shop,Motorcycle Shop,Mountain,Movie Theater,Multiplex,Museum,Music Venue,Nightclub,Park,Peruvian Restaurant,Pizza Place,Plaza,Pub,Recreation Center,Restaurant,Sandwich Place,Seafood Restaurant,Shoe Store,Shopping Mall,Snack Place,Soccer Field,South American Restaurant,Spanish Restaurant,Steakhouse,Sushi Restaurant,Taco Place,Tea Room,Tennis Court,Theater,Theme Restaurant,Vegetarian / Vegan Restaurant,Whisky Bar,Wings Joint,Women's Store
0,Antonio Nariño,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Barrios Unidos,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Bosa,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.285714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Chapinero,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.035714,0.035714,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.071429,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.035714,0.035714,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.035714,0.0,0.035714,0.0
4,Ciudad Bolívar,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Engativá,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.055556,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.055556,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.055556,0.0,0.0,0.055556,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0


### Since we want to open a hotel, it is convenient to look at the kinds of venues that are useful for a hotel to have nearby. So let's combine the columns for similar venue types into one column per type. Let's also create a "new" DataFrame just in case.

In [30]:
import re 

In [31]:
new = pd.DataFrame()
test_list = ['Restaurant', 'Café', 'Bar', 'Store', 'Mall', 'Market', 'Bowling', 'Center', 'Movie', 'Park', 'Nightclub', 'Theater', 'Plaza', 'Recreation', 'Field']
for word in test_list:
    result = test.columns.str.contains(pat = word) 
    result_series = pd.Series(result)
    values = test.columns[result_series]
    test[word] = 0
    for i in values:
        test[word] = test[word] + test[i] 
        
    new[word] = test[word]
        

In [32]:
new.head()

Unnamed: 0,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field
0,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0
1,0.2,0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0,0,0.1,0,0.0,0.05
2,0.857143,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0
3,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0
4,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0
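
One detail of the loop above is worth checking. Several keywords (`Restaurant`, `Café`, `Bar`, `Park`, `Nightclub`, `Theater`, `Plaza`) are also exact category names in `test`, so `test[word] = 0` overwrites that category before it is summed, and the keyword's own column is then among the matches being added to itself. The all-zero integer columns for `Café`, `Park`, `Nightclub` and `Plaza` in the output above are consistent with this. A sketch of a variant that reads from an untouched frame and writes only to the new one, shown on invented numbers (`aggregate_by_keyword` is a hypothetical helper, not part of the notebook):

```python
import pandas as pd

def aggregate_by_keyword(frame, keywords, id_column="Neighborhood"):
    """Sum, per keyword, every category column whose name contains that keyword.

    Reads only from `frame` and writes only to the returned frame, so a keyword
    that is also a category name (e.g. 'Café') is counted once and never overwritten.
    """
    categories = [c for c in frame.columns if c != id_column]
    out = pd.DataFrame({id_column: frame[id_column]})
    for word in keywords:
        matches = [c for c in categories if word in c]
        out[word] = frame[matches].sum(axis=1)
    return out

# Invented shares for two neighborhoods
toy = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Asian Restaurant": [0.2, 0.0],
    "Restaurant": [0.1, 0.0],
    "Café": [0.3, 0.5],
    "Park": [0.4, 0.5],
})
summary = aggregate_by_keyword(toy, ["Restaurant", "Café", "Park"])
print(summary)
```

With this approach the grouped values would differ from the ones shown in this notebook, so the clustering would need to be rerun before comparing results.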


In [33]:
new['Neighborhood'] = test['Neighborhood']

In [34]:
new.head()

Unnamed: 0,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Neighborhood
0,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0,Antonio Nariño
1,0.2,0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0,0,0.1,0,0.0,0.05,Barrios Unidos
2,0.857143,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,Bosa
3,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,Chapinero
4,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,Ciudad Bolívar


In [35]:
fixed_columns = [new.columns[-1]] + list(new.columns[:-1])
new = new[fixed_columns]
new.head()

Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field
0,Antonio Nariño,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0
1,Barrios Unidos,0.2,0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0,0,0.1,0,0.0,0.05
2,Bosa,0.857143,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0
3,Chapinero,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0
4,Ciudad Bolívar,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0


### K-Means Clustering

In [50]:
# set number of clusters
kclusters = 3

bogota_clustering = new.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(bogota_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([0, 2, 1, 1, 0, 2, 0, 0, 1, 0], dtype=int32)
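
The value `kclusters = 3` is fixed by hand. One common way to sanity-check such a choice is to compare the within-cluster sum of squares (`inertia_`) across several values of k and look for the point where improvements level off. A sketch on synthetic data: the random points stand in for `bogota_clustering`, and `n_init=10` is set explicitly as an assumption about a reasonable setting.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
# Three synthetic blobs standing in for the per-neighborhood feature rows
points = np.vstack([
    rng.normal(loc=center, scale=0.05, size=(7, 2))
    for center in ([0.0, 0.0], [1.0, 0.0], [0.0, 1.0])
])

inertias = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
    inertias[k] = model.inertia_  # within-cluster sum of squared distances

for k, value in inertias.items():
    print(k, round(value, 4))
```

On real data with only 20 rows the curve may not show a clear bend, in which case the choice of k remains a judgment call.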

In [51]:
#Just in case...
bogota_merged = new.copy()

# add clustering labels
bogota_merged["Cluster Labels"] = kmeans.labels_

### Adding Longitude and Latitude

In [53]:
bogota_merged_total = bogota_merged.join(df.set_index("Community"), on="Neighborhood")

print(bogota_merged_total.shape)
bogota_merged_total.head()

(20, 20)


Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Cluster Labels,Community Code,Latitude,Longitude
0,Antonio Nariño,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0,0,15,4.59658,-74.11202
1,Barrios Unidos,0.2,0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0,0,0.1,0,0.0,0.05,2,12,4.66971,-74.07785
2,Bosa,0.857143,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,7,4.60974,-74.1828
3,Chapinero,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,2,4.63848,-74.06021
4,Ciudad Bolívar,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,19,4.55367,-74.14648


In [54]:
# sort the results by Cluster Labels
print(bogota_merged_total.shape)
bogota_merged_total.sort_values(["Cluster Labels"], inplace=True)
bogota_merged_total

(20, 20)


Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Cluster Labels,Community Code,Latitude,Longitude
0,Antonio Nariño,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0,0,15,4.59658,-74.11202
15,Sumapaz,0.333333,0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,20,4.55474,-74.14692
14,Suba,0.181818,0,0.090909,0.272727,0.0,0.0,0.0,0.181818,0.090909,0,0,0.181818,0,0.090909,0.090909,0,11,4.73438,-74.08563
7,Kennedy,0.0,0,0.0,0.5,0.25,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,8,4.62748,-74.17022
6,Fontibón,0.0,0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,9,4.68637,-74.151
9,Los Mártires,0.222222,0,0.0,0.222222,0.222222,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,14,4.60431,-74.08979
4,Ciudad Bolívar,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,19,4.55367,-74.14648
3,Chapinero,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,2,4.63848,-74.06021
8,La Candelaria,0.431818,0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0,0,0.0,0,0.0,0.0,1,17,4.59437,-74.07689
18,Usaquén,0.64,0,0.03,0.01,0.01,0.01,0.0,0.01,0.01,0,0,0.02,0,0.0,0.0,1,1,4.69259,-74.03009


### Mapping the clusters

In [55]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bogota_merged_total['Latitude'], bogota_merged_total['Longitude'], bogota_merged_total['Neighborhood'], bogota_merged_total['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
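
The map colors are taken from `rainbow[cluster-1]`, so cluster 0 picks up the last color through negative indexing. That works, but indexing by the label itself is easier to read. A dependency-free sketch that builds one hex color per cluster with the standard-library `colorsys` module; this is an alternative to the Matplotlib colormap used above, and `cluster_palette` is a hypothetical helper.

```python
import colorsys

def cluster_palette(n_clusters):
    """Return n_clusters evenly spaced hues as '#rrggbb' strings."""
    palette = []
    for i in range(n_clusters):
        r, g, b = colorsys.hsv_to_rgb(i / n_clusters, 0.8, 0.9)
        palette.append("#{:02x}{:02x}{:02x}".format(
            int(round(r * 255)), int(round(g * 255)), int(round(b * 255))))
    return palette

palette = cluster_palette(3)
labels = [0, 2, 1, 1, 0]
marker_colors = [palette[label] for label in labels]  # index directly by label
print(marker_colors)
```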

In [56]:
bogota_merged_total.loc[bogota_merged_total['Cluster Labels'] == 0]

Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Cluster Labels,Community Code,Latitude,Longitude
0,Antonio Nariño,0.0,0,0.0,0.25,0.0,0.0,0.0,0.25,0.0,0,0,0.0,0,0.0,0.0,0,15,4.59658,-74.11202
15,Sumapaz,0.333333,0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,20,4.55474,-74.14692
14,Suba,0.181818,0,0.090909,0.272727,0.0,0.0,0.0,0.181818,0.090909,0,0,0.181818,0,0.090909,0.090909,0,11,4.73438,-74.08563
7,Kennedy,0.0,0,0.0,0.5,0.25,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,8,4.62748,-74.17022
6,Fontibón,0.0,0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,9,4.68637,-74.151
9,Los Mártires,0.222222,0,0.0,0.222222,0.222222,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,14,4.60431,-74.08979
4,Ciudad Bolívar,0.0,0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,0,19,4.55367,-74.14648


In [57]:
bogota_merged_total.loc[bogota_merged_total['Cluster Labels'] == 1]

Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Cluster Labels,Community Code,Latitude,Longitude
3,Chapinero,0.392857,0,0.071429,0.107143,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,2,4.63848,-74.06021
8,La Candelaria,0.431818,0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0,0,0.0,0,0.0,0.0,1,17,4.59437,-74.07689
18,Usaquén,0.64,0,0.03,0.01,0.01,0.01,0.0,0.01,0.01,0,0,0.02,0,0.0,0.0,1,1,4.69259,-74.03009
11,Rafael Uribe Uribe,0.431373,0,0.078431,0.039216,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,18,4.62407,-74.06614
2,Bosa,0.857143,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,7,4.60974,-74.1828
16,Teusaquillo,0.642857,0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,1,13,4.62329,-74.07225


In [58]:
bogota_merged_total.loc[bogota_merged_total['Cluster Labels'] == 2]

Unnamed: 0,Neighborhood,Restaurant,Café,Bar,Store,Mall,Market,Bowling,Center,Movie,Park,Nightclub,Theater,Plaza,Recreation,Field,Cluster Labels,Community Code,Latitude,Longitude
10,Puente Aranda,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,2,16,4.63334,-74.10628
12,San Cristóbal,0.0,0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,2,4,4.57643,-74.09314
13,Santa Fe,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,2,3,4.59459,-74.06405
1,Barrios Unidos,0.2,0,0.0,0.05,0.0,0.05,0.0,0.0,0.05,0,0,0.1,0,0.0,0.05,2,12,4.66971,-74.07785
17,Tunjuelito,0.0,0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,2,6,4.56182,-74.12734
5,Engativá,0.222222,0,0.0,0.055556,0.055556,0.0,0.111111,0.0,0.055556,0,0,0.111111,0,0.0,0.0,2,10,4.70127,-74.11269
19,Usme,0.0,0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0,0,0.0,0,0.0,0.0,2,5,4.49828,-74.10745


# Conclusion

The clustering algorithm, with the help of the Foursquare database, has produced 3 clusters covering the 20 localities of Bogota, Colombia. As can be seen from the breakdown of each cluster, the localities in Cluster No. 0 seem to have a slightly higher presence of cultural and entertainment facilities than the other clusters. Cluster No. 1 has a significantly higher presence of restaurants and food-related businesses than the other clusters. Lastly, Cluster No. 2 seems to be rather empty.

In conclusion, even though Cluster No. 0 does seem to provide more entertainment facilities, the difference from Cluster No. 1 is slight, while Cluster No. 1 has a significant upper hand in restaurants and similar businesses. Therefore, with regard to the location of a new hotel, Cluster No. 1 is the recommended option.
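
Statements like these are easier to verify with per-cluster averages than by scanning the three tables. A sketch using a few rows copied from the cluster tables above; only three of the feature columns and two localities per cluster are kept for brevity, so the numbers illustrate the method and are not a full summary.

```python
import pandas as pd

rows = pd.DataFrame({
    "Neighborhood": ["Antonio Nariño", "Kennedy", "Chapinero",
                     "Teusaquillo", "Santa Fe", "Engativá"],
    "Restaurant": [0.0, 0.0, 0.392857, 0.642857, 0.0, 0.222222],
    "Store": [0.25, 0.5, 0.107143, 0.0, 0.0, 0.055556],
    "Theater": [0.0, 0.0, 0.0, 0.0, 0.0, 0.111111],
    "Cluster Labels": [0, 0, 1, 1, 2, 2],
})

# Average share of each venue type within every cluster
profile = rows.drop(columns="Neighborhood").groupby("Cluster Labels").mean()
print(profile.round(3))
```

On the full `bogota_merged_total` frame, the same `groupby("Cluster Labels").mean()` call (after dropping the name, code and coordinate columns) would give the complete profile for each cluster.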