<a href="https://colab.research.google.com/github/SarahLares/Coursera_Capstone/blob/master/Capstone_The_Battle_of_the_Neighborhoods_for_a_Tea_Salon.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Capstone Project - The Battle of the Neighborhoods (Week 2)
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)


## Introduction: Business Problem <a name="introduction"></a>

A **tea room** is a small establishment that serves tea-based drinks and light food. The tea room boom in **Uruguay** began in 2009 with the arrival of large tea chains. Since then, tea drinking has become popular and more entrepreneurs want to start a tea business.

On the other hand, coffee shops have also become very popular in the City of **Montevideo**, and they are usually the main competition for tea rooms.

There are several tea franchises in Uruguay. With the arrival of new tea entrepreneurs and coffee shop franchises, the **tea franchises have to choose strategically where to open their new stores** so that each opening succeeds and valuable company resources are not lost. These **franchises are the target audience**.

The main objective of this project is the **determination of the location for a new store for a tea room franchise in the city of Montevideo through the implementation of machine learning algorithms**.


## Data <a name="data"></a>


* List of neighborhoods in Montevideo, Uruguay.

* Latitude and Longitude of these neighborhoods. 

* Information about notable locations in the city.

* Information about the bus stops in the city.

* Venue data related to tea rooms and coffee shops. This will help us find the neighborhoods that are most suitable for opening a tea room.


The data sets for the city of Montevideo were obtained from [catalogodatos.gub.uy](https://catalogodatos.gub.uy/). The National Open Data Catalog allows access to open data from public bodies, academia, civil society organizations and private companies. Anyone can freely use published data for storytelling, research, visualization, civic applications, and entrepreneurship.

First, install the four libraries needed to read the coordinate files.

In [117]:
# Important library for many geopython libraries
!apt install gdal-bin python-gdal python3-gdal 
!apt install python3-rtree 
!pip install git+git://github.com/geopandas/geopandas.git
!pip install descartes 

Reading package lists... Done
Building dependency tree       
Reading state information... Done
gdal-bin is already the newest version (2.2.3+dfsg-2).
python-gdal is already the newest version (2.2.3+dfsg-2).
python3-gdal is already the newest version (2.2.3+dfsg-2).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
Reading package lists... Done
Building dependency tree       
Reading state information... Done
python3-rtree is already the newest version (0.8.3+ds-1).
The following package was automatically installed and is no longer required:
  libnvidia-common-440
Use 'apt autoremove' to remove it.
0 upgraded, 0 newly installed, 0 to remove and 39 not upgraded.
Collecting git+git://github.com/geopandas/geopandas.git
  Cloning git://github.com/geopandas/geopandas.git to /tmp/pip-req-build-wy2ksdrd
  Running command git clone -q git://githu

Import the necessary libraries.

In [118]:
import folium 
import matplotlib.cm as cm
import matplotlib.colors as colors
from shapely.geometry import Polygon
import geopandas as gpd
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim

### Coordinates of the City of Montevideo

In [119]:
address = 'Montevideo, Uruguay'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Montevideo City are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Montevideo City are -34.9059039, -56.1913569.


The coordinates of the municipalities are written in POLYGON format, so we calculate the centroid of the polygon for each municipality.

In [120]:
MV_gdf = gpd.read_file('sig_municipios.dbf')
MV_gdf['centroid_lon'] = MV_gdf['geometry'].centroid.x
MV_gdf['centroid_lat'] = MV_gdf['geometry'].centroid.y

In [121]:
MV_gdf.head()

| | GID | MUNICIPIO | SERIE | LIMITES | AREA_HA | geometry | centroid_lon | centroid_lat |
|---|---|---|---|---|---|---|---|---|
| 0 | 33 | G | BZA, BZB, BZC, BRA, BRB, BRC. | Arroyo Miguelete, Carlos M. de Pena, Camino Le... | 14240.0 | POLYGON ((562865.152 6159697.654, 562877.897 6... | 568015.195234 | 6151534.0 |
| 1 | 34 | D | BOA, BOB, BDD, BNC, BNB, BBC, BBD, BBB, BBA | Arroyo Miguelete, Limite Departamental, Camino... | 8603.33 | POLYGON ((578626.759 6158316.307, 578633.837 6... | 577773.266481 | 6150149.0 |
| 2 | 35 | F | BDA, BDB, BDC, BDE, BDF | Arroyo Carrasco, Camino Carrasco, Pan de Azuca... | 8514.36 | POLYGON ((585913.756 6141330.154, 585884.175 6... | 583137.219809 | 6146240.0 |
| 3 | 36 | E | BCE, BCB, BCG, BCF, BCA, BCC, BCD | Rio de la Plata, Bvr. Jose Batlle y Ordo?ez, A... | 2683.4 | POLYGON ((586190.696 6141118.461, 586197.037 6... | 583199.073761 | 6139526.0 |
| 4 | 37 | CH | ATB, AUA, AUB, AZA, BAB, BAA, AZB, AXA, AXB | Rio de la Plata, Bvr. Jose Batlle y Ordo?ez, A... | 1194.85 | POLYGON ((577615.401 6139432.128, 577890.719 6... | 577791.047597 | 6137397.0 |
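
To make the centroid step concrete, here is a minimal sketch in plain Python. It assumes the centroid reported for a polygon is the usual area-weighted one, computed with the shoelace formula. `polygon_centroid` is a hypothetical helper written for illustration and is not part of geopandas or shapely.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as (x, y) vertices.

    Hypothetical helper for illustration; the vertices may or may not
    repeat the first point at the end, as POLYGON rings do.
    """
    area2 = 0.0  # twice the signed area (shoelace sum)
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon with zero area")
    # Cx = sum / (6 * A) and A = area2 / 2, hence the factor 3 * area2
    return cx / (3 * area2), cy / (3 * area2)


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(polygon_centroid(square))  # (1.0, 1.0)
```

The centroid comes out in the same units as the input vertices. The values in the `centroid_lon` and `centroid_lat` columns above are far outside the range of degrees, which suggests they are projected coordinates in meters rather than longitude and latitude.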



### Map of the Montevideo Municipalities


The city of Montevideo is divided into municipalities.

In [122]:
m_MV = folium.Map(location=[latitude,longitude], zoom_start=11)
folium.GeoJson(MV_gdf['geometry']).add_to(m_MV)
m_MV

### Maps of the Zonal Community Centers in Montevideo

In turn, the municipalities are divided into zonal community centers.

In [123]:
MV_ccz = gpd.read_file('sig_comunales.dbf',crs="EPSG:32721")
MV_ccz['centroid_lon'] = MV_ccz['geometry'].centroid.x
MV_ccz['centroid_lat'] = MV_ccz['geometry'].centroid.y
MV_ccz.crs = "EPSG:32721"


In [124]:
MV_ccz.head()

| | GID | ZONA_LEGAL | geometry | centroid_lon | centroid_lat |
|---|---|---|---|---|---|
| 0 | 8647694.0 | CCZ12 | POLYGON ((562865.152 6159697.654, 562877.897 6... | 567274.400914 | 6152640.0 |
| 1 | 8647695.0 | CCZ10 | POLYGON ((578626.759 6158316.307, 578633.837 6... | 578136.698416 | 6151954.0 |
| 2 | 8647696.0 | CCZ17 | POLYGON ((559541.850 6143183.792, 559579.581 6... | 564830.826564 | 6140448.0 |
| 3 | 8647697.0 | CCZ14 | POLYGON ((572356.482 6142949.631, 572837.455 6... | 570029.947126 | 6142748.0 |
| 4 | 8647698.0 | CCZ09 | POLYGON ((585913.756 6141330.154, 585884.175 6... | 583137.219809 | 6146240.0 |


In [125]:
m = folium.Map(location=[latitude,longitude], zoom_start=11)
folium.GeoJson(MV_ccz['geometry']).add_to(m)
m 

### Places of Interest


*   Education Centers

*   Notable Locations


In [126]:
MV_edu = gpd.read_file('uptu_educacion.dbf')
MV_nl = gpd.read_file('v_mdg_ubicaciones_notables.dbf')

## Foursquare

We are interested in the categories related to tea rooms and coffee shops. These are the ids of those categories.

In [127]:
#cafeteria = '4bf58dd8d48988d128941735'
#cafe = '4bf58dd8d48988d16d941735'
#Juice Bar = '4bf58dd8d48988d112941735'
#Pet Café = '56aa371be4b08b9a8d573508'
#Tea Room = '4bf58dd8d48988d1dc931735'

categories = ['4bf58dd8d48988d128941735', '4bf58dd8d48988d16d941735', 
              '4bf58dd8d48988d112941735','56aa371be4b08b9a8d573508',
              '4bf58dd8d48988d1dc931735']
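
As an illustration of how these ids end up in a request, the sketch below keeps each label next to its id and builds the query string with the standard library. The search function defined later joins the ids with commas into a single `categoryId` parameter, and this follows the same convention. `CATEGORY_IDS` and `build_search_query` are hypothetical names introduced here, and the credentials and version parameters are left out on purpose.

```python
from urllib.parse import urlencode

# Labels and ids copied from the cell above
CATEGORY_IDS = {
    'cafeteria': '4bf58dd8d48988d128941735',
    'cafe': '4bf58dd8d48988d16d941735',
    'Juice Bar': '4bf58dd8d48988d112941735',
    'Pet Café': '56aa371be4b08b9a8d573508',
    'Tea Room': '4bf58dd8d48988d1dc931735',
}


def build_search_query(lat, lon, category_ids, radius=600, limit=50):
    """Return a URL-encoded query string (hypothetical helper, no credentials)."""
    return urlencode({
        'll': f'{lat},{lon}',
        'categoryId': ','.join(category_ids),  # comma-separated ids
        'radius': radius,
        'limit': limit,
    })


query = build_search_query(-34.9059039, -56.1913569, CATEGORY_IDS.values())
print(query)
```

`urlencode` percent-encodes the commas, which is why they appear as `%2C` in the printed string.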

In [129]:
from datetime import date

import requests


def get_tea_salon_nearby(lat, lon, category, client_id, client_secret, radius=600, limit=50):
    # The API version is passed as today's date in YYYYMMDD format
    version = date.today().strftime("%Y%m%d")
    # Accept either a single category id or a list of ids
    categories = category if isinstance(category, str) else ','.join(category)
    url = ('https://api.foursquare.com/v2/venues/search'
           '?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'
           ).format(client_id, client_secret, version, lat, lon, categories, radius, limit)
    try:
        results = requests.get(url).json()['response']['venues']
        # get_categories and format_address are helper functions defined outside this cell
        venues = [(item['id'],
                   item['name'],
                   get_categories(item['categories']),
                   (item['location']['lat'], item['location']['lng']),
                   format_address(item['location']),
                   item['location']['distance']) for item in results]
    except Exception:
        # Any request or parsing failure yields an empty result for this location
        venues = []
        print(f'\nError on {url}')
    return venues
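
The function above calls `get_categories` and `format_address`, which do not appear in any cell of this section. The sketch below shows one possible shape for them. It assumes each venue category is a dict with `name` and `id` keys, and that the location dict may carry a `formattedAddress` list or a plain `address` string. These bodies are illustrative assumptions, not the notebook's original helpers.

```python
def get_categories(categories):
    """Return (name, id) pairs for a venue's category list (assumed structure)."""
    return [(cat['name'], cat['id']) for cat in categories]


def format_address(location):
    """Join a venue's address lines into one string (assumed structure)."""
    lines = location.get('formattedAddress')
    if lines:
        return ', '.join(lines)
    # Fall back to the single-line address, or an empty string if it is missing
    return location.get('address', '')


# Made-up sample record, only to show the expected input shape
sample_categories = [{'id': '4bf58dd8d48988d1dc931735', 'name': 'Tea Room'}]
sample_location = {'formattedAddress': ['Street 123', 'Montevideo', 'Uruguay']}
print(get_categories(sample_categories))
print(format_address(sample_location))
```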

In [130]:
# Foursquare API credentials: replace the placeholders with your own and keep them out of shared notebooks
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'
VERSION = '20180605'

In [None]:
get_tea_salon_nearby(latitude, longitude, categories, CLIENT_ID, CLIENT_SECRET, radius=600, limit=50)