## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Introduction: Business Problem <a name="introduction"></a>

The objective of this project is to find an optimal location for a bar, this report will be targeted to stakeholders interested in opening a bar in the city of Quito, Ecuador

In the city there are lots of bars, specifically, 64 identified in Quito in the Foursquare application. That's why if someone is looking to open a bar it is important to know where to open it, because the location will influence the influx of people and, therefore, the success of the business.

## Data <a name="data"></a>

Based on the problem, factors that will influence our decission are the location of the most common venues visited by the public.

The data sources that will be needed to extract/generate the required information are the number of bars and location in every neighborhood obtained with Foursquare API

**Download and Explore the Dataset**

In [11]:
import pandas as pd
df_Quito=pd.read_csv('C:/Users/karol/Desktop/BarriosQuito.csv')

In [12]:
df_Quito.head()

Unnamed: 0,Neighborhoods
0,Alangasí
1,Atucucho
2,Bellavista
3,Carcelén
4,Caupichu


In [13]:
df_Quito.Neighborhoods[1]

'Atucucho'

**Get the Latitude and Longitude of Quito City using geopy library**

In [21]:
#!pip install geocoder
#!pip install geopandas
#!pip install geopy
from geopy.geocoders import Nominatim
import geopy
#, geopandas
import geocoder
import numpy as np
locator = Nominatim(user_agent="myGeocoderKaro",timeout=3)

#location = locator.geocode("San Martin, Quito, Ecuador")
#print("Latitude = {}, Longitude = {}".format(location.latitude, location.longitude))
latitude=[]
longitude=[]
for i in range(0,len(df_Quito)):
    
    print(df_Quito.Neighborhoods[i]+", Quito, Ecuador")
    location1=df_Quito.Neighborhoods[i]+", Quito, Ecuador"
    location = locator.geocode(location1)
    
    latitude.append(location.latitude)
    longitude.append(location.longitude)
    print("Latitude = {}, Longitude = {}".format(location.latitude, location.longitude))

Alangasí, Quito, Ecuador
Latitude = -0.3083181, Longitude = -78.4151607
Atucucho, Quito, Ecuador
Latitude = -0.1289647, Longitude = -78.5136987
Bellavista, Quito, Ecuador
Latitude = 0.1687322, Longitude = -78.5797978
Carcelén, Quito, Ecuador
Latitude = -0.0833844, Longitude = -78.4668033
Caupichu, Quito, Ecuador
Latitude = -0.3149596, Longitude = -78.5379399
Centro Histórico, Quito, Ecuador
Latitude = -0.2236462, Longitude = -78.5155931
Chilibulo, Quito, Ecuador
Latitude = -0.2366418, Longitude = -78.5333404
Chillogallo, Quito, Ecuador
Latitude = -0.2759473, Longitude = -78.5539684
Chimbacalle, Quito, Ecuador
Latitude = -0.2423312, Longitude = -78.5123345
Ciudadela del Ejército, Quito, Ecuador
Latitude = -0.29496985, Longitude = -78.56274376792595
Ciudadela Ibarra, Quito, Ecuador
Latitude = -0.2983475, Longitude = -78.5698703
Comité del Pueblo, Quito, Ecuador
Latitude = -0.1120565, Longitude = -78.46879197649866
Conocoto, Quito, Ecuador
Latitude = -0.2802732, Longitude = -78.4780120954

In [202]:
df1_Quito=pd.concat([df_Quito, pd.DataFrame(latitude, columns=['Latitude'])], axis=1)
df2_Quito=pd.concat([df1_Quito, pd.DataFrame(longitude, columns=['Longitude'])], axis=1)
df2_Quito

Unnamed: 0,Neighborhoods,Latitude,Longitude
0,Alangasí,-0.308318,-78.415161
1,Atucucho,-0.128965,-78.513699
2,Bellavista,0.168732,-78.579798
3,Carcelén,-0.083384,-78.466803
4,Caupichu,-0.314960,-78.537940
...,...,...,...
81,Tababela,-0.184774,-78.344679
82,Toctiuco,-0.298896,-78.474674
83,Tumbaco,-0.212255,-78.404495
84,Turubamba,-0.335882,-78.532407


**Create a map of Quito with 	Neighborhoods superimposed on top.**

In [203]:
!pip install folium 
import folium 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

geolocator = Nominatim()
locationQ = geolocator.geocode('Quito,Ecuador')

latitude_Q = locationQ.latitude
longitude_Q = locationQ.longitude

# create map of London using latitude and longitude values
map_Quito = folium.Map(location=[latitude_Q, longitude_Q], zoom_start=10)

## add markers to map
for lat, lng, Neighborhoods in zip(df2_Quito['Latitude'], df2_Quito['Longitude'], df2_Quito['Neighborhoods']):
    label = '{}'.format(Neighborhoods)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_Quito)  

map_Quito



You should consider upgrading via the 'python -m pip install --upgrade pip' command.
  """


**Read the latitude and longitude coordinates of all the neighborhoods in Quito**

In [24]:
CLIENT_ID = '450VSYH04TSSMNF0HAYJIG2PF4FESUL15TJKNHS3MAWLX0QU' # your Foursquare ID
CLIENT_SECRET = 'UTISVQ4GYLYCJBJGEXKJMB1LW3TPLKWK4DWTLHIJ2VTXR1IC' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

## The correct answer is:
LIMIT = 100 
radius = 500 


# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    -0.308318,
    -78.415161,  
    radius, 
    LIMIT)
url # display URL

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e5bed6347e0d6001b43dfba'},
  'headerLocation': 'Current map view',
  'headerFullLocation': 'Current map view',
  'headerLocationGranularity': 'unknown',
  'totalResults': 2,
  'suggestedBounds': {'ne': {'lat': -0.3038179954999955,
    'lng': -78.41066933249103},
   'sw': {'lat': -0.31281800450000447, 'lng': -78.41965266750897}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4e3b558f2271d21e86dacc84',
       'name': 'Parque De Alangasi',
       'location': {'lat': -0.30770054114391665,
        'lng': -78.41541170488495,
        'labeledLatLngs': [{'label': 'display',
          'lat': -0.30770054114391665,
          'lng': -78.41541170488495}],
        'distance': 74,
        'cc': 'EC',
        'state': 'Pichincha',
        'country

**Extract the Categories of the Venues**

In [204]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [205]:
results = requests.get(url).json()

venues = results['response']['groups'][0]['items']
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]
nearby_venues

Unnamed: 0,name,categories,lat,lng
0,Parque De Alangasi,Park,-0.307701,-78.415412
1,Cancha Vieja,Soccer Field,-0.306081,-78.415789


## Methodology <a name="methodology"></a>

In this project it will be detected the areas of Quito that have high bar density. 

In first step the required data has been collected: location and type of every venue.

Second step in the analysis is the calculation and exploration of venues density across different areas of Quito.

The third and final step is the creation of clusters of locations. It will be presented a map of all locations and clusters, using k-means clustering, of those locations to identify optimal venue location by stakeholders.

## Analysis <a name="analysis"></a>

Basic explanatory data analysis and some additional info from the raw data.

In [256]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

**New dataframe called Quito venue**

In [29]:
Quito_venues = getNearbyVenues(names=df2_Quito['Neighborhoods'],
                                   latitudes=df2_Quito['Latitude'],
                                   longitudes=df2_Quito['Longitude']
                                  )

Alangasí
Atucucho
Bellavista
Carcelén
Caupichu
Centro Histórico
Chilibulo
Chillogallo
Chimbacalle
Ciudadela del Ejército
Ciudadela Ibarra
Comité del Pueblo
Conocoto
Cornejo
Cotocollao
Cumbayá
El Batán
El Beaterio
El Calzado
El Camal
El Condado
El Dorado
El Ejido
El Inca
El Panecillo
El Pintado
El Tejar
El Troje
Guajalo
Guamaní
Guápulo
Iñaquito
Kennedy
La Argelia
La Bota
La Ecuatoriana
La Ferroviaria
La Floresta
La Florida
La Forestal
La González Suárez
La Guaragua
La Libertad
La Loma Grande
La Magdalena
La Marín
La Mariscal
La Mena
La Ronda
La Tola
La Vicentina
La Victoria
Las Casas
Lucha de los Pobres
Luluncoto
Manuelita Saenz
Mena de Hierro
Miraflores
Monjas
Nueva Aurora
Oriente Quiteño
Pifo
Ponceano
Puembo
Puengasí
Quito Norte
Quito Sur
Quito Tennis
Quitumbe
Reino de Quito
Rumiñahui
San
San Bartolo
San Carlos
San Diego
San Juan
San Marcos
San Martin
San Rafael
Santa Rita
Solanda
Tababela
Toctiuco
Tumbaco
Turubamba
Villaflora


In [206]:
len(Quito_venues)

509

In [207]:
print(Quito_venues.shape)
Quito_venues

(509, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Alangasí,-0.308318,-78.415161,Parque De Alangasi,-0.307701,-78.415412,Park
1,Alangasí,-0.308318,-78.415161,Cancha Vieja,-0.306081,-78.415789,Soccer Field
2,Atucucho,-0.128965,-78.513699,Teatro Nacional Casa de la Cultura,-0.129168,-78.512561,Music Venue
3,Atucucho,-0.128965,-78.513699,Parque La Carolina,-0.129168,-78.512561,Music Venue
4,Atucucho,-0.128965,-78.513699,Maquillate,-0.128100,-78.510902,Health & Beauty Service
...,...,...,...,...,...,...,...
504,Villaflora,-0.245672,-78.521399,Pizzería Hornero,-0.247817,-78.520198,Pizza Place
505,Villaflora,-0.245672,-78.521399,Supermercados Santa Maria,-0.244234,-78.517966,Supermarket
506,Villaflora,-0.245672,-78.521399,Sandry,-0.243998,-78.519549,Fried Chicken Joint
507,Villaflora,-0.245672,-78.521399,Trolebús: Villa Flora,-0.244392,-78.518940,Bus Station


**Check how many venues were returned for each neighborhood**

In [208]:
Quito_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alangasí,2,2,2,2,2,2
Atucucho,3,3,3,3,3,3
Carcelén,5,5,5,5,5,5
Centro Histórico,27,27,27,27,27,27
Chilibulo,8,8,8,8,8,8
...,...,...,...,...,...,...
Solanda,4,4,4,4,4,4
Tababela,4,4,4,4,4,4
Toctiuco,2,2,2,2,2,2
Tumbaco,6,6,6,6,6,6


In [209]:
df3_Quito=Quito_venues.groupby('Neighborhood').count()

df4_Quito=df3_Quito.index.values
df4_Quito[4]
df2_Quito.head()

Unnamed: 0,Neighborhoods,Latitude,Longitude
0,Alangasí,-0.308318,-78.415161
1,Atucucho,-0.128965,-78.513699
2,Bellavista,0.168732,-78.579798
3,Carcelén,-0.083384,-78.466803
4,Caupichu,-0.31496,-78.53794


In [237]:
df5_Quito=[]

for k in range(0,len(df4_Quito)):
    df5_Quito.append(df2_Quito.loc[df2_Quito['Neighborhoods'] == df4_Quito[k]])



df5_Quito = []
for i, day in enumerate(df4_Quito):
  data_day = df2_Quito.loc[df2_Quito['Neighborhoods'] == df4_Quito[i]]
  df5_Quito .append(data_day)
df_Quito_f = pd.concat(df5_Quito)
df_Quito_f

Unnamed: 0,Neighborhoods,Latitude,Longitude
0,Alangasí,-0.308318,-78.415161
1,Atucucho,-0.128965,-78.513699
3,Carcelén,-0.083384,-78.466803
5,Centro Histórico,-0.223646,-78.515593
6,Chilibulo,-0.236642,-78.533340
...,...,...,...
80,Solanda,-0.275049,-78.541256
81,Tababela,-0.184774,-78.344679
82,Toctiuco,-0.298896,-78.474674
83,Tumbaco,-0.212255,-78.404495


 **Analyze Each Neighborhood in Quito**

In [35]:
# one hot encoding
Quito_onehot = pd.get_dummies(Quito_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
Quito_onehot['Neighborhood'] = Quito_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [Quito_onehot.columns[-1]] + list(Quito_onehot.columns[:-1])
Quito_onehot = Quito_onehot[fixed_columns]

Quito_onehot.head()

Unnamed: 0,Neighborhood,Airport Terminal,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,...,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Tourist Information Center,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store
0,Alangasí,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Alangasí,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Atucucho,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Atucucho,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Atucucho,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


**Group rows by neighborhood and by taking the mean of the frequency of occurrence of each category**

In [36]:
Quito_grouped = Quito_onehot.groupby('Neighborhood').mean().reset_index()
Quito_grouped.head()

Unnamed: 0,Neighborhood,Airport Terminal,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,...,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Tourist Information Center,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store
0,Alangasí,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Atucucho,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Carcelén,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Centro Histórico,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0
4,Chilibulo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [37]:
Quito_grouped.columns

Index(['Neighborhood', 'Airport Terminal', 'American Restaurant',
       'Argentinian Restaurant', 'Art Gallery', 'Art Museum',
       'Arts & Crafts Store', 'Asian Restaurant', 'Auto Workshop', 'BBQ Joint',
       ...
       'Tennis Court', 'Tex-Mex Restaurant', 'Thai Restaurant', 'Theater',
       'Tourist Information Center', 'Vegetarian / Vegan Restaurant',
       'Vietnamese Restaurant', 'Wine Bar', 'Wings Joint', 'Women's Store'],
      dtype='object', length=139)

In [38]:
Quito_grouped.shape

(64, 139)

**The top 5 most common venues for each neighborhood**

In [39]:
num_top_venues = 5
for hood in Quito_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = Quito_grouped[Quito_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alangasí----
              venue  freq
0              Park   0.5
1      Soccer Field   0.5
2  Airport Terminal   0.0
3      Noodle House   0.0
4     Movie Theater   0.0


----Atucucho----
                     venue  freq
0              Music Venue  0.67
1  Health & Beauty Service  0.33
2     Other Great Outdoors  0.00
3            Movie Theater  0.00
4                Multiplex  0.00


----Carcelén----
                venue  freq
0         Pizza Place   0.4
1                Park   0.2
2  Seafood Restaurant   0.2
3      Farmers Market   0.2
4    Airport Terminal   0.0


----Centro Histórico----
                       venue  freq
0                      Hotel  0.19
1                      Plaza  0.11
2             History Museum  0.11
3  Latin American Restaurant  0.07
4                 Restaurant  0.07


----Chilibulo----
               venue  freq
0         Restaurant  0.25
1             Bakery  0.12
2  Other Repair Shop  0.12
3                Gym  0.12
4   Basketball Court  0.12


--

**Sort the venues in descending order**

In [40]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

**Top 10 venues for each neighborhood**

In [41]:
num_top_venues = 10
indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))
        
# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Quito_grouped['Neighborhood']

for ind in np.arange(Quito_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Quito_grouped.iloc[ind, :], num_top_venues)

In [42]:
neighborhoods_venues_sorted

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alangasí,Park,Soccer Field,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
1,Atucucho,Music Venue,Health & Beauty Service,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
2,Carcelén,Pizza Place,Park,Farmers Market,Seafood Restaurant,Diner,Fast Food Restaurant,Farm,Empanada Restaurant,Electronics Store,Dog Run
3,Centro Histórico,Hotel,History Museum,Plaza,Church,Restaurant,Latin American Restaurant,Bed & Breakfast,Museum,Diner,Coffee Shop
4,Chilibulo,Restaurant,Bus Station,Other Repair Shop,Martial Arts Dojo,Gym,Basketball Court,Bakery,Women's Store,Electronics Store,Farmers Market
...,...,...,...,...,...,...,...,...,...,...,...
59,Solanda,BBQ Joint,Big Box Store,Bus Station,Chinese Restaurant,Empanada Restaurant,Food & Drink Shop,Food,Fast Food Restaurant,Farmers Market,Farm
60,Tababela,Airport Terminal,Hostel,Bed & Breakfast,Farm,Electronics Store,Food,Fast Food Restaurant,Farmers Market,Empanada Restaurant,Dog Run
61,Toctiuco,Park,Paper / Office Supplies Store,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Dessert Shop
62,Tumbaco,Fast Food Restaurant,Bakery,Ice Cream Shop,Plaza,Gymnastics Gym,Seafood Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store


**Cluster the Neighborhoods**

In [259]:
from sklearn.cluster import KMeans
# set number of clusters
kclusters = 5
Quito_grouped_clustering = Quito_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Quito_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

Quito_grouped

Unnamed: 0,Neighborhood,Airport Terminal,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auto Workshop,BBQ Joint,...,Tex-Mex Restaurant,Thai Restaurant,Theater,Tourist Information Center,Vegetarian / Vegan Restaurant,Vietnamese Restaurant,Wine Bar,Wings Joint,Women's Store,Cluster Labels
0,Alangasí,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0
1,Atucucho,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,4
2,Carcelén,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,4
3,Centro Histórico,0.00,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.037037,...,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,4
4,Chilibulo,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
59,Solanda,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.250000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,2
60,Tababela,0.25,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,4
61,Toctiuco,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,0
62,Tumbaco,0.00,0.0,0.0,0.000000,0.0,0.0,0.0,0.0,0.000000,...,0.0,0.0,0.0,0.0,0.0,0.0,0.000000,0.0,0.0,4


In [241]:
Quito_merged = df_Quito_f
#print(len(Quito_merged),len(kmeans.labels_),len(neighborhoods_venues_sorted))
Quito_merged['Cluster Labels'] = kmeans.labels_
# add clustering labels
# merge Neighborhoods dataframe with Newham borough dataframe to add latitude/longitude for each neighborhood
Quito_merged = Quito_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhoods')

Quito_merged.head() # check the last columns!


Unnamed: 0,Neighborhoods,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alangasí,-0.308318,-78.415161,0,Park,Soccer Field,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
1,Atucucho,-0.128965,-78.513699,4,Music Venue,Health & Beauty Service,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
3,Carcelén,-0.083384,-78.466803,4,Pizza Place,Park,Farmers Market,Seafood Restaurant,Diner,Fast Food Restaurant,Farm,Empanada Restaurant,Electronics Store,Dog Run
5,Centro Histórico,-0.223646,-78.515593,4,Hotel,History Museum,Plaza,Church,Restaurant,Latin American Restaurant,Bed & Breakfast,Museum,Diner,Coffee Shop
6,Chilibulo,-0.236642,-78.53334,4,Restaurant,Bus Station,Other Repair Shop,Martial Arts Dojo,Gym,Basketball Court,Bakery,Women's Store,Electronics Store,Farmers Market


In [248]:
# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
# create map
map_clusters = folium.Map(location=[latitude_Q, longitude_Q], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
y = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(y)))
rainbow = [colors.rgb2hex(i) for i in colors_array]



# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(df_Quito_f['Latitude'], df_Quito_f['Longitude'], df_Quito_f['Neighborhoods'], Quito_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

**Examine the Clusters**

In [249]:
Quito_merged.loc[Quito_merged['Cluster Labels'] == 0]

Unnamed: 0,Neighborhoods,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alangasí,-0.308318,-78.415161,0,Park,Soccer Field,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
14,Cotocollao,-0.115963,-78.500099,0,Restaurant,Burger Joint,Park,Food Truck,Gym / Fitness Center,Women's Store,Dog Run,Farmers Market,Farm,Empanada Restaurant
18,El Calzado,-0.258902,-78.532287,0,Pizza Place,Dog Run,Park,Soccer Field,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Women's Store
21,El Dorado,-0.292912,-78.472998,0,Restaurant,BBQ Joint,Steakhouse,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store
27,El Troje,-0.103967,-78.395351,0,Park,Women's Store,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
47,La Mena,-0.263388,-78.555313,0,Shopping Mall,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner,Food & Drink Shop
52,Las Casas,-0.190091,-78.506226,0,Restaurant,Empanada Restaurant,Women's Store,Diner,Fast Food Restaurant,Farmers Market,Farm,Electronics Store,Dog Run,Dessert Shop
75,San Juan,-0.213202,-78.512286,0,Park,Taco Place,Bed & Breakfast,Shopping Mall,Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Dog Run
82,Toctiuco,-0.298896,-78.474674,0,Park,Paper / Office Supplies Store,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Dessert Shop


In [250]:
Quito_merged.loc[Quito_merged['Cluster Labels'] == 1]

Unnamed: 0,Neighborhoods,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
48,La Ronda,-0.114997,-78.413893,1,Pizza Place,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner,Convenience Store
55,Manuelita Saenz,-0.256524,-78.490066,1,Pizza Place,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner,Convenience Store
63,Puembo,-0.177678,-78.357972,1,Pizza Place,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner,Convenience Store


In [251]:
Quito_merged.loc[Quito_merged['Cluster Labels'] == 3]

Unnamed: 0,Neighborhoods,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
36,La Ferroviaria,-0.255578,-78.510732,3,Mexican Restaurant,Women's Store,Electronics Store,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Dog Run,Food Court
39,La Forestal,-0.254945,-78.504436,3,Mexican Restaurant,Women's Store,Electronics Store,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Dog Run,Food Court


In [252]:
Quito_merged.loc[Quito_merged['Cluster Labels'] == 4]

Unnamed: 0,Neighborhoods,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,Atucucho,-0.128965,-78.513699,4,Music Venue,Health & Beauty Service,Women's Store,Dog Run,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
3,Carcelén,-0.083384,-78.466803,4,Pizza Place,Park,Farmers Market,Seafood Restaurant,Diner,Fast Food Restaurant,Farm,Empanada Restaurant,Electronics Store,Dog Run
5,Centro Histórico,-0.223646,-78.515593,4,Hotel,History Museum,Plaza,Church,Restaurant,Latin American Restaurant,Bed & Breakfast,Museum,Diner,Coffee Shop
6,Chilibulo,-0.236642,-78.53334,4,Restaurant,Bus Station,Other Repair Shop,Martial Arts Dojo,Gym,Basketball Court,Bakery,Women's Store,Electronics Store,Farmers Market
7,Chillogallo,-0.275947,-78.553968,4,Breakfast Spot,Stadium,Grocery Store,Food & Drink Shop,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Women's Store
8,Chimbacalle,-0.242331,-78.512334,4,Chinese Restaurant,Women's Store,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
10,Ciudadela Ibarra,-0.298347,-78.56987,4,Middle Eastern Restaurant,Pharmacy,Dog Run,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Electronics Store,Diner
9,Ciudadela del Ejército,-0.29497,-78.562744,4,Hotel,Business Service,Women's Store,Electronics Store,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Dog Run
11,Comité del Pueblo,-0.112057,-78.468792,4,Motel,Women's Store,Electronics Store,Food,Fast Food Restaurant,Farmers Market,Farm,Empanada Restaurant,Dog Run,Convenience Store
12,Conocoto,-0.280273,-78.478012,4,Spa,Farm,Dog Run,Food,Fast Food Restaurant,Farmers Market,Empanada Restaurant,Electronics Store,Women's Store,Food & Drink Shop


## Results and Discussion <a name="results"></a>

The analysis shows that there is not great number of bars in Quito, because in most of the neighborhoods this venues were not the most common ones.

Then, all location were clustered to create zones of interest which contain greatest number of location candidates. 

Result of this is 5 clusters containing largest number of potential new venues locations based on number of existing venues. This analysis will provide info on areas with lots of bars in Quito. 

The recommended neighborhoods are just a starting point for deeper analysis with other factors taken into account.

## Conclusion <a name="conclusion"></a>

The urpose of this project was to identify optimal location for a new bar in Quito neighborhoods. By calculating venues density distribution from Foursquare data it was identified the top venues in the neighborhoods. Clustering of those locations was then performed in order to create major zones of interest for final exploration.

After examining the 5 clusters, it is recommended to stakeholders that El Batan, El Pintado, Kennedy, Quito Tenis, La Bota, are the best neighborhoods in Quito, to open a bar. Because in these areas, one of the most common venue visited is pubs. This also concludes that it is more advisable to put a bar in the north of Quito than in other sectors.