# Competitive analysis: cafe and restaurant sector in Uruguay

The purpose of this research is to study the market and make a recommendation on whether to open a cafe in Montevideo. The analysis is structured as follows:
1. Uruguay economy overview
2. Main cities characterization
3. Business opportunities in Montevideo (capital city) and its neighborhoods

# Data section

## Uruguay economy overview and main cities characterization
- Doing Business: https://www.doingbusiness.org/
- World Bank: https://www.worldbank.org/
- Instituto Nacional de Estadística (national statistics and census): https://www.ine.gub.uy/

## Business opportunities in Montevideo (capital city) and its neighborhoods
- Foursquare: https://developer.foursquare.com/


### Dependencies

In [3]:
import json
import requests
import numpy as np # library to handle data in a vectorized manner
import pandas as pd # library for data analysis

from pandas import json_normalize # transform JSON into a pandas dataframe

import random # library for random number generation

from sklearn.cluster import KMeans

!pip install geopy
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
from geopy.extra.rate_limiter import RateLimiter # throttles calls to the geocoding service

# libraries for displaying images
from IPython.display import HTML, Image

!pip install folium
import folium # plotting library

from pandas import ExcelWriter

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors


!pip install bokeh
!pip install selenium



### Foursquare App Settings (fill in with your ID)

In [4]:
CLIENT_ID='YOUR_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET='YOUR_CLIENT_SECRET' # your Foursquare secret
VERSION='20180323'
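
Keeping credentials in the notebook exposes them to anyone who reads it. A minimal sketch of one alternative, assuming the keys live in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`. Both names and the `load_foursquare_settings` helper are hypothetical and not part of the original notebook.

```python
import os

def load_foursquare_settings(env=None):
    """Read Foursquare credentials from a mapping (os.environ by default).

    The variable names are hypothetical; use whatever your setup defines.
    """
    env = os.environ if env is None else env
    required = ("FOURSQUARE_CLIENT_ID", "FOURSQUARE_CLIENT_SECRET")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "client_id": env["FOURSQUARE_CLIENT_ID"],
        "client_secret": env["FOURSQUARE_CLIENT_SECRET"],
        "v": env.get("FOURSQUARE_VERSION", "20180323"),  # same version string as above
    }

# Example with a stand-in mapping instead of the real environment
settings = load_foursquare_settings(
    {"FOURSQUARE_CLIENT_ID": "my-id", "FOURSQUARE_CLIENT_SECRET": "my-secret"}
)
```

The returned dictionary can be merged into the `params` passed to `requests.get`.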

# 1. Uruguay economy overview

## Is Uruguay a good country to invest in?

Several sources of information can be used to evaluate a country. Here I use the World Bank to understand the situation of Uruguay.
The World Bank publishes the following overview of Uruguay on its website; the main ideas are summarized below:

Uruguay stands out in Latin America for being an egalitarian society and for its high income per capita, low level of inequality and poverty and the almost complete absence of extreme poverty. In relative terms, its middle class is the largest in America, and represents more than 60% of its population. **Uruguay is positioned among the first places in the region in relation to various well-being indices, such as the Human Development Index, the Human Opportunity Index and the Economic Freedom Index.** Institutional stability and low levels of corruption are reflected in the high level of public trust in government. According to the Human Opportunity Index constructed by the World Bank, Uruguay has managed to attain a high level of equal opportunities in terms of access to basic services such as education, running water, electricity and sanitation.
**In July 2013, the World Bank classified Uruguay as a high-income country. By 2017, the Gross National Income per capita at purchasing power parity (PPP) amounted to US$21,870.**

Since 2003, the Uruguayan economy has had positive economic growth rates, averaging 4.1% from 2003 to 2018. Uruguay’s economic growth has remained positive even in 2017 and 2018, in spite of recessions experienced by Argentina and Brazil, thus departing from previous patterns when growth was synchronized with that of its main neighbors. Prudent macroeconomic policies and a commitment to diversify its markets and products within the dominant agriculture and forestry sectors have increased the country’s ability to withstand regional shocks.

In order to reduce the dependency on its main trading partners, **Uruguay diversified its export markets. In 2018, Brazil and Argentina, Uruguay’s traditional trading partners, only represented 12% and 5% respectively of the total merchandise exports.** Nowadays, its main trading partners are China (26%) and the European Union (18%).
Two main characteristics —a solid social contract and economic openness— paved the way to poverty reduction and the promotion of shared prosperity that Uruguay successfully followed in the last decade.
According to official measures, moderate poverty went from 32.5% in 2006 to 8.1% in 2018, while extreme poverty has practically disappeared: it went down from 2.5% to 0.1% in the same period. In terms of equity, income levels among the poorest 40% of the Uruguayan population increased much faster than the average growth rate of income levels of the entire population. Nonetheless, there are significant differences: the proportion of the population below the (national) poverty line is still significantly higher in the North of the country; among children and youth (17.2% among children younger than 6, and 15% and 13.9% among the age groups 6 to 12 and 13 to 17, respectively); and, among the afro descendant population (17.4%).

Inclusive social policies have focused on expanding program coverage; according to the World Bank, **around 87% of the population aged 65 or more is covered by the pension system. This is one of the highest coefficients in Latin America and the Caribbean alongside Argentina and Brazil.**

The strong macroeconomic performance was also reflected in the labor market, with a historically low unemployment rate recorded in 2011 (6.3%). However, given the noticeable slowdown in economic growth, the unemployment rate increased to 7.9% in 2018.

Despite recent progress in Uruguay, several structural constraints to growth remain, in particular in the areas of infrastructure investment, integration into global value chains and education/skills performance, which may obstruct the progress towards sustainable development outcomes. **The strong institutional performance in other areas, such as public trust in government, low corruption and a consensus-based political approach, together with a deep commitment to strengthening its institutional set-up, give the country a solid basis from which to continue renewing its social contract and put in place policies to address such constraints.**
Uruguay maintains an adequate macroeconomic framework but in a much more complicated external environment.

Source : (https://www.worldbank.org/en/country/uruguay/overview#1)


## What are the trends if we compare Uruguay vs. Argentina in terms of GDP?

Uruguay and Argentina have strong cultural bonds. The two societies share a passion for soccer, tango, and good meat, but their economies have taken different paths. The charts below compare the progress of Uruguay's economy with Argentina's. I use World Bank data to plot the evolution of GDP per capita and adjusted net national income per capita. The graphs show the current value of each economy and the steady growth that Uruguay has achieved.

Source:  (https://data.worldbank.org/country/uruguay)
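
One way to put a number on "steady growth" is the compound annual growth rate (CAGR) between two years of a series. A minimal sketch follows. The `cagr` helper is illustrative, and the input values are made-up placeholders, not World Bank figures.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two observations `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1

# Placeholder numbers for illustration only (not World Bank data):
# a series that doubles over ten years
rate = cagr(start_value=100.0, end_value=200.0, years=10)
print(f"{rate:.2%}")
```

Applied to the first and last years of the `Uruguay` and `Argentina` columns, the same function would give one comparable growth figure per country.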

In [5]:
workbook = pd.ExcelFile('PIB_per_capita.xlsx')
dictionary = {}
for sheet_name in workbook.sheet_names:
    pbi_per_capita = workbook.parse (sheet_name)
    dictionary [sheet_name] = pbi_per_capita

In [6]:
from bokeh.plotting import figure, output_file, show
from bokeh.io import output_notebook, push_notebook, show, export_png
from bokeh.models import ColumnDataSource
from bokeh.models.tools import HoverTool

p = figure(plot_width=800, plot_height=400, title="GDP per capita", x_axis_label='Year', y_axis_label='U$D')
p.title.text_font = 'arial'
p.title.text_font_size = '14pt'
p.title.text_font_style = 'bold'
p.title.align = 'center'

source = ColumnDataSource(pbi_per_capita)

# add a line renderer
p.line( x='Year', y='Argentina',source=source, line_width=2, legend_label="Argentina")
p.circle( x='Year', y='Argentina',source=source, fill_color="white", size=8)
p.line( x='Year', y='Uruguay',source=source, line_width=2,line_color="red", legend_label="Uruguay")
p.circle( x='Year', y='Uruguay',source=source, line_color="red", fill_color="white", size=8)
p.legend.location = "bottom_right"

hover = HoverTool()
hover.tooltips=[
    ('Argentina',"@Argentina{,}"),  ('Uruguay',"@Uruguay{,}"), ('datos.bancomundial.org','12/2020')
]

p.add_tools(hover)

output_file("pbi_per_capita.html")
export_png(p, filename="pbi_per_capita.png")

output_notebook()
show(p)

In [7]:
workbook = pd.ExcelFile('Ingreso_nacional_neto_ajustado_per_capita.xlsx')
dictionary = {}
for sheet_name in workbook.sheet_names:
    ingreso_nacional_per_capita = workbook.parse (sheet_name)
    dictionary [sheet_name] = ingreso_nacional_per_capita

In [8]:
from bokeh.plotting import figure, output_file, show
from bokeh.io import output_notebook, push_notebook, show, export_png
from bokeh.models import ColumnDataSource
from bokeh.models.tools import HoverTool

p = figure(plot_width=800, plot_height=400, title="Adjusted net national income per capita", x_axis_label='Year', y_axis_label='U$D')
p.title.text_font = 'arial'
p.title.text_font_size = '14pt'
p.title.text_font_style = 'bold'
p.title.align = 'center'

source = ColumnDataSource(ingreso_nacional_per_capita)

# add a line renderer
p.line( x='Year', y='Argentina',source=source, line_width=2, legend_label="Argentina")
p.circle( x='Year', y='Argentina',source=source, fill_color="white", size=8)
p.line( x='Year', y='Uruguay',source=source, line_width=2,line_color="red", legend_label="Uruguay")
p.circle( x='Year', y='Uruguay',source=source, line_color="red", fill_color="white", size=8)
p.legend.location = "bottom_right"

hover = HoverTool()
hover.tooltips=[
    ('Argentina',"@Argentina{,}"),  ('Uruguay',"@Uruguay{,}"), ('datos.bancomundial.org','12/2020')
]

p.add_tools(hover)

output_file("ingreso_nacional_per_capita.html")
export_png(p, filename="ingreso_nacional_per_capita.png")

output_notebook()
show(p)

# 2. Main cities characterization

## Population
The 2011 census recorded 3.3 million inhabitants in Uruguay, which makes it a small country compared with its neighbors.
Uruguay has 19 departments and 616 main cities. On average, each department has 32 cities and a population of 112 thousand inhabitants.

Demographics from Wikipedia:

Uruguay's rate of population growth is much lower than in other Latin American countries.[22] Its median age of 35.3 years is higher than the global average[24] due to its low birth rate, high life expectancy, and relatively high rate of emigration among younger people. A quarter of the population is less than 15 years old and about a sixth are aged 60 and older.[22] In 2017 the average total fertility rate (TFR) across Uruguay was 1.70 children born per woman, below the replacement rate of 2.1; it remains considerably below the high of 5.76 children born per woman in 1882.[108]

Metropolitan Montevideo is the only large city, with around 1.9 million inhabitants, or more than half the country's total population. The rest of the urban population lives in about 30 towns.[24] 



Source (https://ine.gub.uy/inicio)

Source (https://en.wikipedia.org/wiki/Uruguay#Demographics)

In [9]:
workbook = pd.ExcelFile('localidades_uy_2011.xlsx')
dictionary = {}
for sheet_name in workbook.sheet_names:
    main_cities = workbook.parse (sheet_name)
    dictionary [sheet_name] = main_cities

main_cities['latitude'] = range (len(main_cities))
main_cities['longitude'] = range (len(main_cities))

## Main 10 cities

The ten most populated cities in Uruguay account for 60% of the total population. Montevideo, the capital, has 42% of the population, and the share rises to 54% when the nearby cities are included. If you want to do business in Uruguay, **Montevideo is the most important city in which to have a commercial presence.**

In [10]:

main_cities = main_cities.iloc[0:10].copy() # copy so later assignments do not write to a slice of the original
main_cities.head(11)

Unnamed: 0,DEPARTAMEN,NOMBRE_LOC,CODLOC,POBL,PROP,DENS_HB_KM,latitude,longitude
0,MONTEVIDEO,MONTEVIDEO,1020,1304729,0.419491,5440,0,0
1,SALTO,SALTO,15120,104011,0.033441,2812,1,1
2,PAYSANDU,PAYSANDU,11120,76412,0.024568,3539,2,2
3,CANELONES,LAS PIEDRAS,3221,71258,0.022911,3150,3,3
4,RIVERA,RIVERA,13220,64465,0.020727,2221,4,4
5,MALDONADO,MALDONADO,10320,62590,0.020124,4916,5,5
6,TACUAREMBO,TACUAREMBO,18220,54755,0.017605,1721,6,6
7,CERRO LARGO,MELO,4220,51830,0.016664,3131,7,7
8,SORIANO,MERCEDES,17220,41974,0.013495,3757,8,8
9,ARTIGAS,ARTIGAS,2220,40657,0.013072,2740,9,9


### Share of the population in the top ten cities and in Montevideo

In [11]:
print(f'Ten cities are the {"%.2f" % (main_cities.iloc[:,4].sum()*100)}% of the population')
print(f'And Montevideo represents {"%.2f" % (main_cities.iloc[0,4].sum()*100)}% of the population')

Ten cities are the 60.21% of the population
And Montevideo represents 41.95% of the population
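
The same percentages can be recomputed by name rather than by column position. A plain-Python sketch using the ten `PROP` values printed in the table above (the `prop_by_city` dictionary is only an illustrative container):

```python
# PROP values copied from the top-10 table (share of national population)
prop_by_city = {
    "MONTEVIDEO": 0.419491,
    "SALTO": 0.033441,
    "PAYSANDU": 0.024568,
    "LAS PIEDRAS": 0.022911,
    "RIVERA": 0.020727,
    "MALDONADO": 0.020124,
    "TACUAREMBO": 0.017605,
    "MELO": 0.016664,
    "MERCEDES": 0.013495,
    "ARTIGAS": 0.013072,
}

top10_share = sum(prop_by_city.values())
montevideo_share = prop_by_city["MONTEVIDEO"]

print(f"Ten cities are {top10_share * 100:.2f}% of the population")
print(f"Montevideo is {montevideo_share * 100:.2f}% of the population")
```

In the notebook itself, the equivalent by-name expressions would be `main_cities['PROP'].sum()` and `main_cities.loc[0, 'PROP']`.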


### Add Longitude and latitude to the top 10 cities

In [12]:
# create the geocoder once and throttle its requests to the Nominatim service
geolocator = Nominatim(user_agent="to_explorer")
geocode = RateLimiter(geolocator.geocode, min_delay_seconds=4)

for i in range(len(main_cities)):
    localidad = main_cities.iloc[i, 1]     # NOMBRE_LOC
    departamento = main_cities.iloc[i, 0]  # DEPARTAMEN
    address = f'{localidad}, {departamento}'

    # use the rate-limited wrapper rather than calling geolocator.geocode directly
    location = geocode(address)

    main_cities.iloc[i, 6] = location.latitude
    main_cities.iloc[i, 7] = location.longitude

main_cities.head(20)


Unnamed: 0,DEPARTAMEN,NOMBRE_LOC,CODLOC,POBL,PROP,DENS_HB_KM,latitude,longitude
0,MONTEVIDEO,MONTEVIDEO,1020,1304729,0.419491,5440,-34.905892,-56.19131
1,SALTO,SALTO,15120,104011,0.033441,2812,-31.38889,-57.960888
2,PAYSANDU,PAYSANDU,11120,76412,0.024568,3539,-32.321726,-58.089214
3,CANELONES,LAS PIEDRAS,3221,71258,0.022911,3150,-34.72749,-56.216498
4,RIVERA,RIVERA,13220,64465,0.020727,2221,-30.900058,-55.540815
5,MALDONADO,MALDONADO,10320,62590,0.020124,4916,-34.908716,-54.958272
6,TACUAREMBO,TACUAREMBO,18220,54755,0.017605,1721,-31.711018,-55.978876
7,CERRO LARGO,MELO,4220,51830,0.016664,3131,-32.377277,-54.14943
8,SORIANO,MERCEDES,17220,41974,0.013495,3757,-33.24842,-58.029867
9,ARTIGAS,ARTIGAS,2220,40657,0.013072,2740,-30.398414,-56.463808
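
`geolocator.geocode` returns `None` when it finds no match, and reading `.latitude` from `None` raises `AttributeError`. A minimal sketch of a guard follows, written against any callable with that return convention so it can be tried without network access. `safe_coordinates`, `FakeLocation` and `fake_geocode` are illustrative names, not part of geopy.

```python
from collections import namedtuple

# Stand-in for geopy's Location object, exposing only the two attributes used here
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])

def safe_coordinates(geocode, query):
    """Return (latitude, longitude), or (None, None) if the geocoder finds nothing."""
    location = geocode(query)
    if location is None:
        return (None, None)
    return (location.latitude, location.longitude)

def fake_geocode(query):
    """Offline stand-in for a geocoder; knows a single address."""
    known = {"SALTO, SALTO": FakeLocation(-31.38889, -57.960888)}
    return known.get(query)

found = safe_coordinates(fake_geocode, "SALTO, SALTO")
not_found = safe_coordinates(fake_geocode, "NOWHERE")
```

In the notebook, the rate-limited geocoder could be passed in place of `fake_geocode`.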


In [13]:
# the context manager saves and closes the file on exit
with ExcelWriter('10_main_cities.xlsx') as writer:
    main_cities.to_excel(writer, sheet_name='10_main_cities')

### Map with the top 10 cities
Each circle's radius is 100 times the city's share of the national population (`PROP`), so the map shows the size of the potential demand in each city. For reference, Montevideo is about 12.5 times larger than Salto, the second most populated city.
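
As a worked example of that scaling, using the `PROP` values for Montevideo and Salto from the table above (`marker_radius` is an illustrative helper; a `folium.CircleMarker` radius is measured in pixels):

```python
def marker_radius(prop, scale=100):
    """Radius in pixels for a city holding `prop` of the national population."""
    return scale * prop

montevideo_radius = marker_radius(0.419491)
salto_radius = marker_radius(0.033441)

# The radius ratio equals the population ratio between the two cities
ratio = montevideo_radius / salto_radius
print(f"Montevideo: {montevideo_radius:.1f}px, Salto: {salto_radius:.1f}px, ratio: {ratio:.1f}")
```

Because the radius scales linearly with population share, the drawn area grows with the square of the share, which visually exaggerates Montevideo's weight.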

In [14]:
cities_map = folium.Map(tiles='Stamen Toner',location=[-32.522779, -55.765835], zoom_start=6) # generate map centred in the center of Uruguay

for lat, lng, label, prop in zip(main_cities.latitude, main_cities.longitude,main_cities.NOMBRE_LOC ,main_cities.PROP):
    folium.CircleMarker(
        [lat, lng],
        radius=100*prop,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.5,
        parse_html=False,
        max_width=100
    ).add_to(cities_map)


cities_map.save('cities_map.html')


cities_map

# 3. Business opportunities in Montevideo (capital city) and its neighborhoods

### Montevideo Location

In [15]:
latitude = -34.8833
longitude = -56.1667
[latitude, longitude]

[-34.8833, -56.1667]

### Example of using **/search** method to find a coffee shop near a given location

In [16]:
url = 'https://api.foursquare.com/v2/venues/search'

params = dict(
    client_id=CLIENT_ID,
    client_secret=CLIENT_SECRET,
    v=VERSION,
    ll=f'{latitude},{longitude}',
    query='cafe',
    limit=90,
    #radius=1500,
)
resp = requests.get(url=url, params=params)
data = json.loads(resp.text)

In [17]:
# assign relevant part of JSON to venues
venues = data['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
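
`json_normalize` flattens nested dictionaries into dotted column names, which is why the next cell can select the columns that start with `location.`. A plain-Python sketch of the same idea on a made-up venue record follows. The record and the `flatten` helper are illustrative; a real Foursquare response has more fields.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single dict with dotted keys; lists are kept as-is."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat

# Made-up record shaped like the fields this notebook uses
venue = {
    "id": "abc123",
    "name": "Example Cafe",
    "location": {"lat": -34.9, "lng": -56.2},
    "categories": [{"name": "Café"}],
}

row = flatten(venue)
print(sorted(row))
```

The `categories` value stays a list, which is why the notebook needs a separate step to pull out the first category name.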

In [18]:
# keep only columns that include venue name, and anything that is associated with location
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
dataframe_filtered = dataframe.loc[:, filtered_columns].copy() # copy so the column edits below do not target a view

# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
dataframe_filtered['categories'] = dataframe_filtered.apply(get_category_type, axis=1)

# clean column names by keeping only last term
dataframe_filtered.columns = [column.split('.')[-1] for column in dataframe_filtered.columns]

# replace accented and special characters in the venue names
replacements = {'á': 'a', 'é': 'e', 'í': 'i', 'ó': 'o', 'ú': 'u', 'ñ': 'n', '&': ' ', "Camila's": 'Camila s'}
for old, new in replacements.items():
    dataframe_filtered['name'] = dataframe_filtered['name'].str.replace(old, new, regex=False)

In [22]:
venues_map = folium.Map(tiles='Stamen Toner', location=[-34.8833, -56.1667], zoom_start=12) # generate map centred around Montevideo

# add a red circle marker to represent Montevideo
folium.CircleMarker(
    [latitude, longitude],
    radius=5,
    color='red',
    popup='Montevideo',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

#add the Cafe as blue circle markers
for lat, lng, label in zip(dataframe_filtered.lat, dataframe_filtered.lng, dataframe_filtered.name):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6,
        parse_html=False,
        max_width=100
    ).add_to(venues_map)


venues_map.save('venues_map_search.html')


venues_map

### Example of using **/explore** to find a coffee shop near a given location. This method works better than /search, so I use it in the rest of the analysis

In [23]:

url = 'https://api.foursquare.com/v2/venues/explore'

params = dict(
client_id= CLIENT_ID,
client_secret= CLIENT_SECRET,
v=VERSION,
ll=f'{latitude},{longitude}',
near = 'Montevideo, Uruguay',
#section='coffee',
categoryId= '4bf58dd8d48988d1e0931735',
#friendVisits= 'visited',
#sortByPopularity=1,
limit=50,
#radius=1500,
)
resp = requests.get(url=url, params=params)
explore = json.loads(resp.text)


In [24]:
# define the dataframe columns
column_names = ['Id','name', 'latitude', 'longitude','Rating'] 

# instantiate the dataframe
explorename = pd.DataFrame(columns=column_names)
explorename.head()

Unnamed: 0,Id,name,latitude,longitude,Rating


In [25]:
# collect one row per venue, then build the dataframe in a single step
# (DataFrame.append was removed in pandas 2.0)
items = explore['response']['groups'][0]['items']
rows = []
for item in items:
    venue = item['venue']
    rows.append({'Id': venue['id'],
                 'name': venue['name'],
                 'latitude': venue['location']['lat'],
                 'longitude': venue['location']['lng']})

# Rating is left empty here and filled in by a later cell
explorename = pd.DataFrame(rows, columns=column_names)

Folium has problems with Spanish characters, so I fix them this way.

In [26]:
explorename.name = explorename.name.str.replace('á','a')
explorename.name = explorename.name.str.replace('é','e')
explorename.name = explorename.name.str.replace('í','i')
explorename.name = explorename.name.str.replace('ó','o')
explorename.name = explorename.name.str.replace('ú','u')
explorename.name = explorename.name.str.replace('ñ','n')
explorename.name = explorename.name.str.replace('&',' ')
explorename.name = explorename.name.str.replace("Camila's",'Camila s')
explorename.name = explorename.name.str.replace("Diego's",'Diego s')
explorename.name = explorename.name.str.replace("Valentino's Coffee",'Valentino s Coffee')
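
An alternative to listing each accented character is to decompose the text with Unicode normalization and drop the combining marks. This is a swapped-in technique, not what the notebook does, and `strip_accents` is an illustrative helper. Unlike the cell above, it leaves `&` and apostrophes untouched.

```python
import unicodedata

def strip_accents(text):
    """Remove diacritics by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

cleaned = strip_accents("Café Ñandú")
print(cleaned)
```

On a dataframe column this could be applied with `explorename['name'].map(strip_accents)`.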

In [27]:
for i in range(len(explorename)):
    VENUE_ID = explorename.iloc[i, 0]
    url2 = f'https://api.foursquare.com/v2/venues/{VENUE_ID}'
    params2 = dict(client_id=CLIENT_ID, client_secret=CLIENT_SECRET, v=VERSION)
    resp2 = requests.get(url2, params=params2)
    venue_explore = json.loads(resp2.text)
    # a venue may have no rating; .get returns None instead of raising KeyError
    Rating = venue_explore['response']['venue'].get('rating')
    explorename.iloc[i, 4] = Rating


In [28]:
explorename.tail(10)

Unnamed: 0,Id,name,latitude,longitude,Rating
40,5915fe3d5315934e1bdce427,Chesterhouse,-34.905136,-56.20012,7.3
41,52f516c811d2de79b7c1c6bc,McCafe,-34.8691,-56.169433,7.2
42,5bbd58d386f4cc002c0936dc,La Latina Cafe,-34.910305,-56.152765,8.1
43,4de7b4daae60b9d735340595,Palacio del Cafe,-34.903053,-56.190908,7.2
44,57cf4103498e0e7c2c585605,Ramona,-34.907307,-56.196889,7.2
45,4c8b99981556bfb7943f0293,PV Restaurant Lounge,-34.906684,-56.201397,7.1
46,5ab2b3dba92d9801c7544674,Starbucks,-34.90354,-56.137189,7.1
47,4e9b4ca22c5b4d6405b4ca3c,Oro Del Rhin,-34.915486,-56.148886,7.1
48,54a1b782498e8a88a0bd56a2,Coco Petit Cafe,-34.915356,-56.155919,7.0
49,4da1f93163b5a35dc108f419,Oro del Rhin,-34.924214,-56.158342,7.0


In [29]:
# the context manager saves and closes the file on exit
with ExcelWriter('50_cafes_foursquare.xlsx') as writer:
    explorename.to_excel(writer, sheet_name='50 cafes Montevideo')

In [30]:
from folium import plugins

explorename_map = folium.Map(tiles='Stamen Toner', location=[-34.8833, -56.1667], zoom_start=12) # generate map centred around the Montevideo

loc = 'Top 50 cafes in Montevideo according to foursquare.com'
title_html = '''
             <h3 align="center" style="font-size:16px"><b>{}</b></h3>
             '''.format(loc) 

explorename_map.get_root().html.add_child(folium.Element(title_html))

cafes = plugins.MarkerCluster().add_to(explorename_map)


# add a red circle marker to represent Montevideo
folium.CircleMarker(
    [-34.8833, -56.1667],
    radius=5,
    color='red',
    popup='Montevideo',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(cafes)

# add the Cafe restaurants as blue circle markers
for lat, lng, name, rating in zip(explorename.latitude, explorename.longitude, explorename.name, explorename.Rating):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup= (f'Name:{name} - Rating:{rating}'),
        fill = True,
        fill_color='blue',
        fill_opacity=0.6,
        parse_html=False,
        max_width=100
    ).add_to(cafes)

explorename_map.save('50_cafes_foursquare.html')

explorename_map

### Municipios of Montevideo

In [31]:
workbook = pd.ExcelFile('municipios_montevideo.xlsx')
dictionary = {}
for sheet_name in workbook.sheet_names:
    municipios_montevideo = workbook.parse (sheet_name)
    dictionary [sheet_name] = municipios_montevideo

In [32]:
municipios_montevideo.head(10)

Unnamed: 0,Municipio,Direccion,Latitude,Longitude
0,Municipio A,Av. Carlos María Ramírez esq. Rivera Indarte,-34.86631,-56.23627
1,Municipio B,Joaquín Requena 1701,-34.89722,-56.17121
2,Municipio C,L. A. de Herrera 4547,-34.85758,-56.19821
3,Municipio CH,Brito del Pino 1590,-34.90042,-56.15713
4,Municipio D,Av. Gral. Flores 4694. Anexo,-34.84551,-56.15445
5,Municipio E,Av. Bolivia S/Nº - Estadio Charrúa,-34.877822,-56.087809
6,Municipio F,Av. 8 de Octubre 4700,-34.858619,-56.133591
7,Municipio G,Cno. Castro 730 esq. Ma. Orticochea,-34.850122,-56.201355


#### Let's create a function to search for venues in all the neighborhoods of Montevideo

In [33]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            100)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
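
The list comprehension in `getNearbyVenues` assumes every venue has at least one category; `v['venue']['categories'][0]` raises `IndexError` otherwise. A sketch of a per-item extractor that tolerates an empty list follows, tried here on made-up items shaped like the fields the function reads. `venue_row` is an illustrative helper.

```python
def venue_row(neighborhood, lat, lng, item):
    """Build one output row from an explore item; category is None if none is listed."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    category = categories[0]["name"] if categories else None
    return (
        neighborhood,
        lat,
        lng,
        venue["name"],
        venue["location"]["lat"],
        venue["location"]["lng"],
        category,
    )

# Made-up items for illustration (not real Foursquare results)
items = [
    {"venue": {"name": "Cafe One", "location": {"lat": -34.90, "lng": -56.17},
               "categories": [{"name": "Coffee Shop"}]}},
    {"venue": {"name": "No Category Place", "location": {"lat": -34.91, "lng": -56.18},
               "categories": []}},
]

rows = [venue_row("Municipio B", -34.89722, -56.17121, item) for item in items]
```

Rows with a `None` category can then be dropped or kept before one-hot encoding, depending on the analysis.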

#### Create a new dataframe called *montevideo_venues*.

In [34]:


montevideo_venues = getNearbyVenues(names=municipios_montevideo['Municipio'],
                                   latitudes=municipios_montevideo['Latitude'],
                                   longitudes=municipios_montevideo['Longitude']
                                  )



Municipio A
Municipio B
Municipio C
Municipio CH
Municipio D
Municipio E
Municipio F
Municipio G


In [35]:
len (montevideo_venues)

265

Let's check how many venues were returned for each neighborhood

In [36]:
montevideo_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Municipio A,4,4,4,4,4,4
Municipio B,97,97,97,97,97,97
Municipio C,30,30,30,30,30,30
Municipio CH,82,82,82,82,82,82
Municipio D,8,8,8,8,8,8
Municipio E,28,28,28,28,28,28
Municipio F,5,5,5,5,5,5
Municipio G,11,11,11,11,11,11


### Analyze Each Neighborhood

In [38]:
# one hot encoding
montevideo_onehot = pd.get_dummies(montevideo_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
montevideo_onehot['Neighborhood'] = montevideo_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [montevideo_onehot.columns[-1]] + list(montevideo_onehot.columns[:-1])
montevideo_onehot = montevideo_onehot[fixed_columns]

montevideo_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Basketball Court,...,Sports Club,Stadium,Steakhouse,Supermarket,Tapas Restaurant,Tennis Court,Theater,Track Stadium,Vegetarian / Vegan Restaurant,Women's Store
0,Municipio A,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Municipio A,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Municipio A,0,0,0,0,1,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Municipio A,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Municipio B,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### Let's group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [39]:
montevideo_grouped = montevideo_onehot.groupby('Neighborhood').mean().reset_index()
montevideo_grouped

Unnamed: 0,Neighborhood,American Restaurant,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,BBQ Joint,Bakery,Bar,Basketball Court,...,Sports Club,Stadium,Steakhouse,Supermarket,Tapas Restaurant,Tennis Court,Theater,Track Stadium,Vegetarian / Vegan Restaurant,Women's Store
0,Municipio A,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Municipio B,0.010309,0.010309,0.0,0.0,0.010309,0.041237,0.010309,0.092784,0.010309,...,0.0,0.0,0.010309,0.010309,0.0,0.0,0.020619,0.0,0.0,0.0
2,Municipio C,0.0,0.033333,0.0,0.0,0.0,0.033333,0.033333,0.0,0.033333,...,0.0,0.033333,0.0,0.033333,0.0,0.033333,0.0,0.0,0.0,0.0
3,Municipio CH,0.0,0.0,0.012195,0.02439,0.012195,0.060976,0.0,0.012195,0.0,...,0.012195,0.0,0.0,0.012195,0.012195,0.0,0.0,0.012195,0.02439,0.0
4,Municipio D,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Municipio E,0.0,0.0,0.0,0.0,0.0,0.035714,0.035714,0.0,0.0,...,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.035714
6,Municipio F,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0
7,Municipio G,0.0,0.090909,0.0,0.0,0.0,0.0,0.090909,0.0,0.090909,...,0.0,0.090909,0.0,0.090909,0.0,0.0,0.0,0.0,0.0,0.0


#### Let's print each neighborhood along with the top 5 most common venues

In [42]:
num_top_venues = 5

for hood in montevideo_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = montevideo_grouped[montevideo_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Municipio A----
                 venue  freq
0                Plaza  0.25
1   Athletics & Sports  0.25
2        Grocery Store  0.25
3         Soccer Field  0.25
4  American Restaurant  0.00


----Municipio B----
               venue  freq
0                Bar  0.09
1        Pizza Place  0.07
2     Sandwich Place  0.06
3  Electronics Store  0.04
4          BBQ Joint  0.04


----Municipio C----
            venue  freq
0          Garden  0.13
1       Nightclub  0.07
2            Park  0.07
3             Gym  0.07
4  Ice Cream Shop  0.03


----Municipio CH----
               venue  freq
0        Pizza Place  0.07
1          BBQ Joint  0.06
2         Restaurant  0.06
3     Soccer Stadium  0.06
4  Electronics Store  0.04


----Municipio D----
               venue  freq
0       Dance Studio  0.25
1  Electronics Store  0.12
2      Deli / Bodega  0.12
3   Basketball Court  0.12
4      Movie Theater  0.12


----Municipio E----
                  venue  freq
0        Ice Cream Shop  0.07
1    

### Write a function to sort the venues in descending order, then create a new dataframe showing the top 10 venues for each neighborhood

In [45]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [46]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = montevideo_grouped['Neighborhood']

for ind in np.arange(montevideo_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(montevideo_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Municipio A,Plaza,Athletics & Sports,Grocery Store,Soccer Field,Food Truck,Dessert Shop,Diner,Electronics Store,Farmers Market,Fast Food Restaurant
1,Municipio B,Bar,Pizza Place,Sandwich Place,BBQ Joint,Restaurant,Electronics Store,Ice Cream Shop,Pharmacy,Pub,Coffee Shop
2,Municipio C,Garden,Nightclub,Park,Gym,Ice Cream Shop,Soccer Stadium,Deli / Bodega,Fast Food Restaurant,Coffee Shop,Mediterranean Restaurant
3,Municipio CH,Pizza Place,BBQ Joint,Restaurant,Soccer Stadium,Sandwich Place,Electronics Store,Gym,Café,Latin American Restaurant,Italian Restaurant
4,Municipio D,Dance Studio,Soccer Field,Electronics Store,Basketball Court,Movie Theater,Skate Park,Deli / Bodega,Dessert Shop,Diner,Farmers Market


## Cluster Neighborhoods

Run *k*-means to group the neighborhoods into 3 clusters.

In [47]:
# set number of clusters
kclusters = 3

montevideo_grouped_clustering = montevideo_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(montevideo_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:8] 

array([1, 0, 0, 0, 0, 0, 2, 0])

In [48]:
# montevideo_grouped is sorted by neighborhood name, which matches the row order of municipios_montevideo
municipios_montevideo['Cluster'] = kmeans.labels_
    
municipios_montevideo.head(10)

Unnamed: 0,Municipio,Direccion,Latitude,Longitude,Cluster
0,Municipio A,Av. Carlos María Ramírez esq. Rivera Indarte,-34.86631,-56.23627,1
1,Municipio B,Joaquín Requena 1701,-34.89722,-56.17121,0
2,Municipio C,L. A. de Herrera 4547,-34.85758,-56.19821,0
3,Municipio CH,Brito del Pino 1590,-34.90042,-56.15713,0
4,Municipio D,Av. Gral. Flores 4694. Anexo,-34.84551,-56.15445,0
5,Municipio E,Av. Bolivia S/Nº - Estadio Charrúa,-34.877822,-56.087809,0
6,Municipio F,Av. 8 de Octubre 4700,-34.858619,-56.133591,2
7,Municipio G,Cno. Castro 730 esq. Ma. Orticochea,-34.850122,-56.201355,0


### Visualize the resulting clusters

In [51]:

municipios_montevideo_map = folium.Map(tiles='Stamen Toner', location=[-34.8833, -56.1667], zoom_start=12) # generate map centred around Montevideo

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]


loc = 'The 8 municipios of Montevideo and their clusters'
title_html = '''
             <h3 align="center" style="font-size:16px"><b>{}</b></h3>
             '''.format(loc) 

municipios_montevideo_map.get_root().html.add_child(folium.Element(title_html))


# add a red circle marker to Montevideo center
folium.CircleMarker(
    [-34.8833, -56.1667],
    radius=5,
    color='red',
    popup='Montevideo',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(municipios_montevideo_map)

# add municipios as blue circle markers
for lat, lng, name, cluster in zip(municipios_montevideo.Latitude, municipios_montevideo.Longitude, municipios_montevideo.Municipio, municipios_montevideo.Cluster):
    folium.Circle(
        [lat, lng],
        radius=1000,
        color='blue',
        popup=(f'Name:{name} - Cluster:{cluster}'),
        fill = True,
        fill_color= rainbow[cluster-1],
        fill_opacity=0.6,
        parse_html=False,
        max_width=100
    ).add_to(municipios_montevideo_map)


municipios_montevideo_map.save('municipios_montevideo_cluster.html')


municipios_montevideo_map