# Capstone Project Battle of the Neighborhoods (Week 4)

## Business Problem
Mexico is a developing country that attracts many people from Central America who want to start a business. When you arrive in a new country, you have no idea where to open your restaurant or coffee shop. There are two main ways to decide: walk the streets for ages and judge possible locations with your own eyes, or use data. We will use data to choose the most convenient place for a new business.

With the data, we will map all the coffee shops in three Mexico City neighborhoods. Once we have the map and some statistics, we will conclude which location is best for opening a new coffee shop.

As noted above, Mexico is a developing country with many opportunities for opening new businesses. For the new coffee shop, I am hoping to find a spot in an area that is known for its coffee shops but is not too crowded.

## Data
With the business problem defined, the following factors are crucial:
- Mexico City neighborhoods
- Each neighborhood's latitude and longitude
- Venues around each neighborhood
- How the coffee shops are distributed geographically
- **Rating** and **price** for each venue

#### Data sources we will extract information from
- To get three neighborhoods from Mexico City, we will use information from **Wikipedia** (<a href="https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Mexico_City">*Click Here*</a>)
- Coffee shops in Mexico City from the **Foursquare API**
- Latitude and longitude of the neighborhoods provided by the **geopy** library

(Hidden annotations)
<!-- ## Meeting with Salvador
- Take each coffee shop's rating and how expensive it is. Look at the relationship between a business's rating and its price
- A coffee shop's profitability is its price tier (number of peso signs) times the number of people who have visited

### Machine Learning
Build a random forest regression model that uses the neighborhood (one-hot encoded), the business's price tier (1, 2, 3, or 4) (one-hot encoded), and the rating (which can be standard-scaled) to predict the business's profitability


- Condesa
- San Ángel
- Coyoacán -->

In [1]:
import requests # library to handle requests
import pandas as pd # library for data analysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values
from IPython.display import Image, HTML
from pandas import json_normalize # top-level import; pandas.io.json.json_normalize is deprecated
import folium # plotting library
from bs4 import BeautifulSoup
import re
print('Libraries imported.')

Libraries imported.


## Mexico's neighborhoods

In [2]:
res = requests.get("https://en.wikipedia.org/wiki/List_of_neighborhoods_in_Mexico_City")
res

<Response [200]>

In [3]:
soup = BeautifulSoup(res.content,'lxml')
soup

<!DOCTYPE html>
<html class="client-nojs" dir="ltr" lang="en">
<head>
<meta charset="utf-8"/>
<title>List of neighborhoods in Mexico City - Wikipedia</title>
<script>document.documentElement.className="client-js";RLCONF={"wgBreakFrames":!1,"wgSeparatorTransformTable":["",""],"wgDigitTransformTable":["",""],"wgDefaultDateFormat":"dmy","wgMonthNames":["","January","February","March","April","May","June","July","August","September","October","November","December"],"wgRequestId":"XsNVHQpAAEYAAG1rav0AAACN","wgCSPNonce":!1,"wgCanonicalNamespace":"","wgCanonicalSpecialPageName":!1,"wgNamespaceNumber":0,"wgPageName":"List_of_neighborhoods_in_Mexico_City","wgTitle":"List of neighborhoods in Mexico City","wgCurRevisionId":951001915,"wgRevisionId":951001915,"wgArticleId":29906394,"wgIsArticle":!0,"wgIsRedirect":!1,"wgAction":"view","wgUserName":null,"wgUserGroups":["*"],"wgCategories":["Articles to be expanded from March 2013","All articles to be expanded","Articles with empty sections from March

In [4]:
ul = soup.find_all('ul')[0] 
ul

<ul><li><a href="/wiki/Bosques_de_las_Lomas" title="Bosques de las Lomas">Bosques de las Lomas</a>-Upscale residential neighborhood and business center.</li>
<li><a class="mw-redirect" href="/wiki/Historic_Center_of_Mexico_City" title="Historic Center of Mexico City">Centro</a> - Covers the historic downtown (<i>centro histórico</i>) of Mexico City.</li>
<li><a href="/wiki/Condesa" title="Condesa">Condesa</a> - Twenties post-Revolution neighborhood.</li>
<li><a href="/wiki/Colonia_Roma" title="Colonia Roma">Roma</a> - <a href="/wiki/Beaux-Arts_architecture" title="Beaux-Arts architecture">Beaux Arts</a> neighbourhood next to Condesa, one of the oldest in Mexico City.</li>
<li><a class="mw-redirect" href="/wiki/Colonia_Juarez_(Mexico_City)" title="Colonia Juarez (Mexico City)">Colonia Juarez</a> - includes the Zona Rosa area</li>
<li><a href="/wiki/Coyoac%C3%A1n" title="Coyoacán">Coyoacán</a> - Town founded by Cortés swallowed by the city in the 1950s, <a class="mw-redirect" href="/wiki

In [5]:
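neighborhoods_list = []
# each <li> of the first list starts with a link whose text is the neighborhood name
for li in ul.find_all('li', recursive=False):
    link = li.find('a')
    if link is not None:
        neighborhoods_list.append(link.get_text())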

In [6]:
neighborhoods_list

['Bosques de las Lomas',
 'Centro',
 'Condesa',
 'Roma',
 'Colonia Juarez',
 'Coyoacán',
 'Del Valle',
 'Jardines del Pedregal',
 'Lomas de Chapultepec',
 'Nápoles',
 'San Ángel',
 'Santa Fe',
 'Polanco',
 'Tepito',
 'Tlatelolco',
 'Zona Rosa']

## Choosing neighborhoods
For this project we will choose three neighborhoods:
- Condesa
- San Ángel
- Coyoacán

In [7]:
geolocator = Nominatim(user_agent='mexico')

Find the coordinates of each neighborhood, retrying when the geolocator fails on the first attempt. If it still cannot find a neighborhood's coordinates after **five** attempts, we give up and set them to *0.000, 0.000*.
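
The retry rule described above can be isolated in a small helper. This is a minimal sketch, not the notebook's own code: `geocode_with_retry`, `FakeLocation`, and `flaky_geocode` are hypothetical names, and the stand-in geocoder only imitates a service that fails twice before answering.

```python
import time

def geocode_with_retry(geocode, address, max_attempts=5, delay_seconds=0.0):
    """Call geocode(address) up to max_attempts times.

    Returns (latitude, longitude), or (0.0, 0.0) if every attempt fails.
    `geocode` is any callable returning an object with `.latitude` and
    `.longitude`, or None when nothing is found.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            location = geocode(address)
        except Exception:
            location = None  # treat a failed request like "not found"
        if location is not None:
            return location.latitude, location.longitude
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # pause before the next attempt
    return 0.0, 0.0


# Stand-in geocoder for illustration only: fails twice, then succeeds.
class FakeLocation:
    latitude = 19.4148639
    longitude = -99.176429

calls = {"count": 0}

def flaky_geocode(address):
    calls["count"] += 1
    return FakeLocation() if calls["count"] >= 3 else None

print(geocode_with_retry(flaky_geocode, "Condesa, Mexico City, Mexico"))
print(geocode_with_retry(lambda address: None, "Nowhere"))
```

With the real geolocator, the call would look like `geocode_with_retry(geolocator.geocode, address, delay_seconds=1)`; a pause of about one second between attempts is a polite choice for a public geocoding service.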

In [8]:
data = []
for neigh in neighborhoods_list:
    address = neigh + ', Mexico City, Mexico'
    print(neigh)
    attempts = 1
    while True:
        try:
            # geocode() returns None when nothing matches, so the attribute
            # access below raises AttributeError and triggers a retry
            location = geolocator.geocode(address)
            latitude = location.latitude
            longitude = location.longitude
            data.append((neigh, latitude, longitude))
        except Exception:
            print("Still haven't found")
            attempts += 1
            if attempts > 5:
                # give up and store a sentinel that is filtered out later
                latitude = 0.000
                longitude = 0.000
                data.append((neigh, latitude, longitude))
                break
        else:
            print("Found")
            print()
            break

data

Bosques de las Lomas
Found

Centro
Found

Condesa
Found

Roma
Found

Colonia Juarez
Found

Coyoacán
Found

Del Valle
Found

Jardines del Pedregal
Found

Lomas de Chapultepec
Found

Nápoles
Found

San Ángel
Found

Santa Fe
Found

Polanco
Found

Tepito
Found

Tlatelolco
Found

Zona Rosa
Still haven't found
Still haven't found
Still haven't found
Still haven't found
Still haven't found


[('Bosques de las Lomas', 19.403526, -99.2454929),
 ('Centro', 19.4326296, -99.1331785),
 ('Condesa', 19.4148639, -99.176429),
 ('Roma', 19.4326296, -99.1331785),
 ('Colonia Juarez', 16.21, -95.0275),
 ('Coyoacán', 19.32804005, -99.15106340693589),
 ('Del Valle', 19.3942366, -99.1670335),
 ('Jardines del Pedregal', 19.3197812, -99.2073801),
 ('Lomas de Chapultepec', 16.716111, -99.611389),
 ('Nápoles', 19.3937697, -99.1766017),
 ('San Ángel', 19.3498069, -99.0388727),
 ('Santa Fe', 19.382986, -99.2396662),
 ('Polanco', 19.43353, -99.190915),
 ('Tepito', 19.445545, -99.1273622),
 ('Tlatelolco', 19.455014, -99.1431694),
 ('Zona Rosa', 0.0, 0.0)]
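
Some of these results look suspicious: Colonia Juarez and Lomas de Chapultepec come back with latitudes near 16, and Centro and Roma share exactly the same point. A quick sanity check can flag such rows before they are used. The sketch below copies a subset of the results above; the bounding box around Mexico City is a rough assumption, not an official boundary, and `outside_city` is a hypothetical helper.

```python
from collections import defaultdict

# Geocoded results copied from the output above (a subset, for brevity).
data = [
    ('Centro', 19.4326296, -99.1331785),
    ('Condesa', 19.4148639, -99.176429),
    ('Roma', 19.4326296, -99.1331785),
    ('Colonia Juarez', 16.21, -95.0275),
    ('Lomas de Chapultepec', 16.716111, -99.611389),
    ('Zona Rosa', 0.0, 0.0),
]

# Assumed rough bounding box around Mexico City; adjust as needed.
LAT_MIN, LAT_MAX = 19.0, 19.6
LNG_MIN, LNG_MAX = -99.4, -98.9

def outside_city(lat, lng):
    """True when the point falls outside the assumed bounding box."""
    return not (LAT_MIN <= lat <= LAT_MAX and LNG_MIN <= lng <= LNG_MAX)

out_of_bounds = [name for name, lat, lng in data if outside_city(lat, lng)]

# Group names by coordinate pair to spot neighborhoods that collapsed
# onto the same point.
by_point = defaultdict(list)
for name, lat, lng in data:
    by_point[(lat, lng)].append(name)
duplicates = [names for names in by_point.values() if len(names) > 1]

print(out_of_bounds)
print(duplicates)
```

A box check only catches gross errors; a point can sit inside the box and still belong to a different neighborhood, so flagged and unflagged rows alike deserve a look on the map.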

In [9]:
df_neighs = pd.DataFrame.from_records(data, columns=['Neighborhood', 'Latitude', 'Longitude'])
df_neighs

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Bosques de las Lomas,19.403526,-99.245493
1,Centro,19.43263,-99.133178
2,Condesa,19.414864,-99.176429
3,Roma,19.43263,-99.133178
4,Colonia Juarez,16.21,-95.0275
5,Coyoacán,19.32804,-99.151063
6,Del Valle,19.394237,-99.167034
7,Jardines del Pedregal,19.319781,-99.20738
8,Lomas de Chapultepec,16.716111,-99.611389
9,Nápoles,19.39377,-99.176602


In [10]:
df_neighs = df_neighs[ (df_neighs['Latitude']!=0.000) & (df_neighs['Longitude']!=0.000) ]
df_neighs

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Bosques de las Lomas,19.403526,-99.245493
1,Centro,19.43263,-99.133178
2,Condesa,19.414864,-99.176429
3,Roma,19.43263,-99.133178
4,Colonia Juarez,16.21,-95.0275
5,Coyoacán,19.32804,-99.151063
6,Del Valle,19.394237,-99.167034
7,Jardines del Pedregal,19.319781,-99.20738
8,Lomas de Chapultepec,16.716111,-99.611389
9,Nápoles,19.39377,-99.176602


In [11]:
map_mexicocity = folium.Map(location=[19.434345, -99.140411], zoom_start=11)

# add a circle and a labeled pin for each neighborhood
for lat, lng, neighborhood in zip(df_neighs['Latitude'], df_neighs['Longitude'], df_neighs['Neighborhood']):
    folium.CircleMarker(
        [lat, lng],
        radius=7,
        color='yellow',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_mexicocity)
    folium.Marker([lat, lng], popup=neighborhood).add_to(map_mexicocity)

map_mexicocity

## Foursquare

In [12]:
import os

# Read the Foursquare credentials from environment variables instead of
# hard-coding them; the variable names are this notebook's own choice
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret
CODE = os.environ.get('FOURSQUARE_CODE', '')
ACCESS_TOKEN = os.environ.get('FOURSQUARE_ACCESS_TOKEN', '')
# API version date: the end of last year
VERSION = '20191231'
LIMIT = 100
print('Credentials loaded.')

Credentials loaded.


## Exploring each neighborhood
Get the venues of each neighborhood with the Foursquare API
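
Each request to the `venues/explore` endpoint carries the credentials, the API version, the neighborhood's coordinates, a search radius, and a result limit. As a minimal sketch, the query string for one request could be assembled with the standard library as shown here; `build_explore_url` is a hypothetical helper and the credentials are placeholders.

```python
from urllib.parse import urlencode

def build_explore_url(client_id, client_secret, version, lat, lng,
                      radius=500, limit=100):
    """Return the explore URL for one neighborhood center."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'v': version,
        'll': '{},{}'.format(lat, lng),  # "latitude,longitude"
        'radius': radius,                # meters around the center
        'limit': limit,                  # maximum venues per request
    }
    # urlencode escapes special characters, such as the comma in `ll`
    return 'https://api.foursquare.com/v2/venues/explore?' + urlencode(params)

url = build_explore_url('MY_ID', 'MY_SECRET', '20191231', 19.4148639, -99.176429)
print(url)
```

When using `requests`, the same dictionary can be passed as `requests.get(base_url, params=params)`, which performs the encoding itself.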

In [39]:
# Get the venues around every neighborhood in Mexico City
def getNearbyVenues(names, latitudes, longitudes, radius=500):

    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID,
            CLIENT_SECRET,
            VERSION,
            lat,
            lng,
            radius,
            LIMIT)

        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']

        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v['venue']['name'],
            v['venue']['id'],
            v['venue']['location']['lat'],
            v['venue']['location']['lng'],
            v['venue']['categories'][0]['name']) for v in results])

    # flatten the per-neighborhood lists into one row per venue
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])

    nearby_venues.columns = ['Neighborhood',
                  'Neighborhood Latitude',
                  'Neighborhood Longitude',
                  'Venue',
                  'Venue ID',
                  'Venue Latitude',
                  'Venue Longitude',
                  'Venue Category']

    return nearby_venues

In [40]:
# in this run the API returned no venues for Colonia Juarez and
# Lomas de Chapultepec, so they contribute no rows to the table
mexico_city_venues = getNearbyVenues(names=df_neighs['Neighborhood'],
                                   latitudes=df_neighs['Latitude'],
                                   longitudes=df_neighs['Longitude']
                                  )

mexico_city_venues

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue ID,Venue Latitude,Venue Longitude,Venue Category
0,Bosques de las Lomas,19.403526,-99.245493,Puntarena,553bf6fc498e07485b2cab0a,19.403129,-99.242527,Seafood Restaurant
1,Bosques de las Lomas,19.403526,-99.245493,Häggen-Dazs,53487059498eb1a054c70906,19.402976,-99.242554,Dessert Shop
2,Bosques de las Lomas,19.403526,-99.245493,Häagen-Dazs,4be9b6636295c9b6638f8508,19.403008,-99.242641,Ice Cream Shop
3,Bosques de las Lomas,19.403526,-99.245493,Krispy Kreme,53f673b4498e43f6d687df3a,19.403058,-99.242446,Donut Shop
4,Bosques de las Lomas,19.403526,-99.245493,Hard Candy Fitness Mexico,4cae1b3f8c48a09382dd712c,19.403628,-99.242275,Gym
...,...,...,...,...,...,...,...,...
732,Tlatelolco,19.455014,-99.143169,Hamburguesas San Simon,4d2d19896cdea09075c15143,19.456407,-99.144894,Food Truck
733,Tlatelolco,19.455014,-99.143169,Restaurant los Alcatraces,50dc8055e4b0298a125bc7cd,19.453209,-99.141337,Mexican Restaurant
734,Tlatelolco,19.455014,-99.143169,Gym Guerrero,520a658811d2a29ac7a93951,19.450867,-99.141399,Gym
735,Tlatelolco,19.455014,-99.143169,Teatro Félix Azuela,50c3eea9e4b092542caf3675,19.455277,-99.147097,General Entertainment
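
Each row of this table comes from one nested item in the API response. The sketch below shows that flattening step on a hand-written sample: the first item mirrors the first row of the table, the second is hypothetical and shows one way to guard against a venue with no category. `flatten_items` is an illustrative helper, not part of the notebook.

```python
# Hand-written sample shaped like the items returned by the API.
sample_items = [
    {'venue': {'id': '553bf6fc498e07485b2cab0a',
               'name': 'Puntarena',
               'location': {'lat': 19.403129, 'lng': -99.242527},
               'categories': [{'name': 'Seafood Restaurant'}]}},
    # Hypothetical venue with an empty category list.
    {'venue': {'id': 'hypothetical-id',
               'name': 'Venue without category',
               'location': {'lat': 19.4, 'lng': -99.24},
               'categories': []}},
]

def flatten_items(neighborhood, items):
    """Turn nested response items into flat tuples, one per venue."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Fall back to None instead of failing on an empty list.
        category = categories[0]['name'] if categories else None
        rows.append((neighborhood, venue['name'], venue['id'],
                     venue['location']['lat'], venue['location']['lng'],
                     category))
    return rows

rows = flatten_items('Bosques de las Lomas', sample_items)
for row in rows:
    print(row)

# A neighborhood with no results simply yields no rows.
print(flatten_items('Colonia Juarez', []))
```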


In [41]:
mexico_city_venues.shape

(737, 8)

#### Show how many venues there are for each neighborhood

In [42]:
mexico_city_venues.groupby('Neighborhood').count()['Venue Category']

Neighborhood
Bosques de las Lomas      43
Centro                   100
Condesa                   62
Coyoacán                  13
Del Valle                 95
Jardines del Pedregal     18
Nápoles                  100
Polanco                   85
Roma                     100
San Ángel                 16
Santa Fe                  13
Tepito                    44
Tlatelolco                48
Name: Venue Category, dtype: int64

**Count the unique venue categories in Mexico City**

In [43]:
print('There are {} unique categories.'.format(mexico_city_venues['Venue Category'].nunique()))

There are 165 unique categories.


Apply *one-hot encoding* to the venue categories, because categorical data often must be encoded as numbers before it can be used with machine learning algorithms.

In [44]:
# one hot encoding
mexico_city_onehot = pd.get_dummies(mexico_city_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
mexico_city_onehot['Neighborhood'] = mexico_city_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [mexico_city_onehot.columns[-1]] + list(mexico_city_onehot.columns[:-1])
mexico_city_onehot = mexico_city_onehot[fixed_columns]

mexico_city_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Sushi Restaurant,Taco Place,Tapas Restaurant,Tattoo Parlor,Tea Room,Theater,Thrift / Vintage Store,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Water Park
0,Bosques de las Lomas,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Bosques de las Lomas,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Bosques de las Lomas,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Bosques de las Lomas,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Bosques de las Lomas,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Compute the mean of each one-hot column grouped by neighborhood, which gives the share of each venue category per neighborhood

In [45]:
mexico_city_grouped = mexico_city_onehot.groupby('Neighborhood').mean().reset_index()
mexico_city_grouped

Unnamed: 0,Neighborhood,American Restaurant,Antique Shop,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,...,Sushi Restaurant,Taco Place,Tapas Restaurant,Tattoo Parlor,Tea Room,Theater,Thrift / Vintage Store,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Water Park
0,Bosques de las Lomas,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.069767,0.0,0.0,0.0,0.046512,0.0,0.0,0.0,0.0,0.0
1,Centro,0.0,0.01,0.0,0.0,0.0,0.06,0.06,0.0,0.0,...,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0
2,Condesa,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.016129,0.080645,0.016129,0.016129,0.016129,0.016129,0.0,0.016129,0.0,0.0
3,Coyoacán,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,...,0.0,0.153846,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Del Valle,0.0,0.0,0.010526,0.010526,0.0,0.0,0.0,0.010526,0.0,...,0.010526,0.042105,0.0,0.0,0.0,0.0,0.0,0.010526,0.010526,0.010526
5,Jardines del Pedregal,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.055556,0.055556,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0
6,Nápoles,0.0,0.0,0.01,0.01,0.01,0.01,0.0,0.0,0.0,...,0.01,0.03,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0
7,Polanco,0.011765,0.0,0.0,0.011765,0.011765,0.0,0.0,0.0,0.0,...,0.011765,0.011765,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Roma,0.0,0.01,0.0,0.0,0.0,0.06,0.06,0.0,0.0,...,0.0,0.02,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0
9,San Ángel,0.0,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,...,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
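
To make the arithmetic behind this table visible, here is a plain-Python sketch of what one-hot encoding followed by a per-neighborhood mean computes. It deliberately avoids pandas, and the venues and neighborhood names are hypothetical.

```python
from collections import defaultdict

# Hypothetical venues: (neighborhood, category)
venues = [
    ('A', 'Coffee Shop'),
    ('A', 'Taco Place'),
    ('A', 'Coffee Shop'),
    ('A', 'Bakery'),
    ('B', 'Taco Place'),
    ('B', 'Bakery'),
]

categories = sorted({category for _, category in venues})

# One-hot encode: one row per venue, one 0/1 column per category.
onehot = [
    (hood, [1 if category == c else 0 for c in categories])
    for hood, category in venues
]

# Collect the encoded rows of each neighborhood.
rows_by_hood = defaultdict(list)
for hood, row in onehot:
    rows_by_hood[hood].append(row)

# Average every 0/1 column within a neighborhood: the result is the
# share of that neighborhood's venues falling in each category.
frequencies = {
    hood: {c: sum(col) / len(rows) for c, col in zip(categories, zip(*rows))}
    for hood, rows in rows_by_hood.items()
}

print(categories)
print(frequencies['A'])
print(frequencies['B'])
```

Because every venue has exactly one category here, the frequencies of a neighborhood add up to 1.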


In [46]:
num_top_venues = 5

for hood in mexico_city_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = mexico_city_grouped[mexico_city_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    # take everything but the hood name
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Bosques de las Lomas----
                venue  freq
0      Ice Cream Shop  0.07
1    Sushi Restaurant  0.07
2          Restaurant  0.05
3          Food Truck  0.05
4  Seafood Restaurant  0.05


----Centro----
                 venue  freq
0           Art Museum  0.06
1  Arts & Crafts Store  0.06
2               Museum  0.06
3             Boutique  0.05
4       Ice Cream Shop  0.05


----Condesa----
                  venue  freq
0            Taco Place  0.08
1                Bakery  0.05
2    Italian Restaurant  0.03
3  Gym / Fitness Center  0.03
4            Restaurant  0.03


----Coyoacán----
                venue  freq
0          Taco Place  0.15
1              Lounge  0.15
2          Food Truck  0.08
3  Seafood Restaurant  0.08
4        Costume Shop  0.08


----Del Valle----
                venue  freq
0  Mexican Restaurant  0.11
1          Restaurant  0.06
2              Bakery  0.05
3  Seafood Restaurant  0.05
4         Coffee Shop  0.04


----Jardines del Pedregal----
       

## Some quick conclusions
- In Mexico City there are lots of restaurants (Mexican restaurants, to be precise)
- Take the neighborhoods that have Coffee Shop (or Café) in their top 5 venue categories, regardless of its `freq`, because we want a place that is already known for its coffee shops:
    - Del Valle
        - **Coffee Shop**
    - Jardines del Pedregal
        - **Coffee Shop**
    - Nápoles 
        - **Coffee Shop**
        - **Café**
    - Polanco
        - **Coffee Shop**
    - San Ángel
        - **Café**

The `freq` value may not be a good measure for this experiment because it does not tell us how many coffee shops an area has. It tells us what share of a neighborhood's venues are coffee shops. I'll take a more explicit approach: count the number of coffee shops per neighborhood.
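
To make the difference between a share and a count concrete, here is a small sketch with made-up data (the neighborhoods `A` and `B` and their venues are hypothetical, not taken from the Foursquare results). It shows that a neighborhood can have the higher coffee shop frequency while having fewer coffee shops in absolute terms.

```python
import pandas as pd

# Hypothetical toy data: neighborhood A has few venues, B has many.
toy_venues = pd.DataFrame({
    'Neighborhood': ['A'] * 4 + ['B'] * 20,
    'Venue Category': ['Coffee Shop'] * 2 + ['Taco Place'] * 2
                      + ['Coffee Shop'] * 5 + ['Taco Place'] * 15,
})

# 1 for coffee shops and cafés, 0 for everything else
is_coffee = toy_venues['Venue Category'].isin(['Coffee Shop', 'Café']).astype(int)

summary = (
    toy_venues.assign(is_coffee=is_coffee)
    .groupby('Neighborhood')['is_coffee']
    .agg(freq='mean', count='sum')  # share of venues vs. absolute number
)
print(summary)
```

In this toy case `A` has the higher `freq` but `B` has more coffee shops, which is why the absolute count is the measure used from here on.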

In [47]:
list_hoods_count = []
for hood in df_neighs['Neighborhood']:
    hood_coffee_shops = mexico_city_venues[ 
        (mexico_city_venues['Neighborhood']==hood) 
        & ( (mexico_city_venues['Venue Category']=='Coffee Shop')
           | (mexico_city_venues['Venue Category']=='Café')
          ) 
    ]
    
    # len() is 0 when the neighborhood has no coffee shops, so no try/except is needed
    list_hoods_count.append( (hood, len(hood_coffee_shops)) )
        
        
list_hoods_count = sorted(list_hoods_count, key=lambda x: x[1], reverse=True)

for hood, count in list_hoods_count:
    print(hood, count)

Nápoles 18
Del Valle 7
Polanco 5
Centro 3
Roma 3
Tlatelolco 3
Bosques de las Lomas 2
Condesa 2
Jardines del Pedregal 2
San Ángel 1
Colonia Juarez 0
Coyoacán 0
Lomas de Chapultepec 0
Santa Fe 0
Tepito 0
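
The same counts can also be obtained without an explicit loop. The sketch below assumes a frame shaped like `mexico_city_venues`, with `Neighborhood` and `Venue Category` columns; the four rows of toy data and the helper name `count_coffee_shops` are illustrative only.

```python
import pandas as pd

COFFEE_CATEGORIES = ['Coffee Shop', 'Café']

def count_coffee_shops(venues, neighborhoods):
    """Return coffee shop counts per neighborhood, including zeros."""
    coffee = venues[venues['Venue Category'].isin(COFFEE_CATEGORIES)]
    counts = coffee['Neighborhood'].value_counts()
    # reindex adds the neighborhoods that have no coffee shops, with a count of 0
    return counts.reindex(neighborhoods, fill_value=0).sort_values(ascending=False)

# Made-up rows, only to exercise the helper
toy = pd.DataFrame({
    'Neighborhood': ['Nápoles', 'Nápoles', 'Del Valle', 'Tepito'],
    'Venue Category': ['Coffee Shop', 'Café', 'Coffee Shop', 'Market'],
})
counts = count_coffee_shops(toy, ['Del Valle', 'Nápoles', 'Tepito'])
print(counts)
```

With the notebook's data, the call would presumably be `count_coffee_shops(mexico_city_venues, df_neighs['Neighborhood'])`.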


Take the top 3 neighborhoods by coffee shop count

In [48]:
neighborhoods_wanted = ['Del Valle', 'Nápoles', 'Polanco']

In [49]:
df_wanted_neighs = df_neighs[ df_neighs['Neighborhood'].isin(neighborhoods_wanted) ].reset_index(drop=True)
df_wanted_neighs

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Del Valle,19.394237,-99.167034
1,Nápoles,19.39377,-99.176602
2,Polanco,19.43353,-99.190915


In [50]:
df_wanted_neighs['Radius'] = [30, 30, 30]
df_wanted_neighs['Color'] = ['purple', 'red', 'yellow']
df_wanted_neighs

Unnamed: 0,Neighborhood,Latitude,Longitude,Radius,Color
0,Del Valle,19.394237,-99.167034,30,purple
1,Nápoles,19.39377,-99.176602,30,red
2,Polanco,19.43353,-99.190915,30,yellow


As a reminder, here is what ```mexico_city_venues``` contains

In [53]:
mexico_city_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue ID,Venue Latitude,Venue Longitude,Venue Category
0,Bosques de las Lomas,19.403526,-99.245493,Puntarena,553bf6fc498e07485b2cab0a,19.403129,-99.242527,Seafood Restaurant
1,Bosques de las Lomas,19.403526,-99.245493,Häggen-Dazs,53487059498eb1a054c70906,19.402976,-99.242554,Dessert Shop
2,Bosques de las Lomas,19.403526,-99.245493,Häagen-Dazs,4be9b6636295c9b6638f8508,19.403008,-99.242641,Ice Cream Shop
3,Bosques de las Lomas,19.403526,-99.245493,Krispy Kreme,53f673b4498e43f6d687df3a,19.403058,-99.242446,Donut Shop
4,Bosques de las Lomas,19.403526,-99.245493,Hard Candy Fitness Mexico,4cae1b3f8c48a09382dd712c,19.403628,-99.242275,Gym


In [54]:
mexico_city_coffee_shops = mexico_city_venues[ 
    (mexico_city_venues['Neighborhood'].isin(neighborhoods_wanted)) 
    & ( (mexico_city_venues['Venue Category']=='Coffee Shop')
       | (mexico_city_venues['Venue Category']=='Café')
      ) 
]

In [55]:
mexico_city_coffee_shops.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue ID,Venue Latitude,Venue Longitude,Venue Category
320,Del Valle,19.394237,-99.167034,Café Passmar,4bc8d26c68f976b03b0c5c83,19.395669,-99.166141,Coffee Shop
331,Del Valle,19.394237,-99.167034,Momentto Café 100% Colombiano,51aa469f498e2fcb2aefbf76,19.395554,-99.168838,Coffee Shop
371,Del Valle,19.394237,-99.167034,El color de mi tierra,4cdd6379cea2224bc6de904c,19.397338,-99.166295,Café
374,Del Valle,19.394237,-99.167034,Starbucks,4bc4e60ce58e9521bc21c9e1,19.396978,-99.165122,Coffee Shop
386,Del Valle,19.394237,-99.167034,Starbucks,58f90a64dec1d656aaa74b23,19.397023,-99.166367,Coffee Shop


In [56]:
# generate the base map, centred between the three selected neighborhoods
venues_map = folium.Map(location=[19.415695, -99.180428], zoom_start=13.3)

# add a red circle marker at (latitude, longitude), labelled 'Zócalo' in its popup
folium.CircleMarker(
    [latitude, longitude],
    radius=10,
    color='red',
    popup='Zócalo',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(venues_map)

# add one circle per selected neighborhood: green outline, filled with the neighborhood's color
for lat, lng, label, radius, color in zip(df_wanted_neighs['Latitude'], df_wanted_neighs['Longitude'], df_wanted_neighs['Neighborhood'], df_wanted_neighs['Radius'], df_wanted_neighs['Color']):
    folium.CircleMarker(
        [lat, lng],
        radius=radius,
        color='green',
        popup=label,
        fill = True,
        fill_color=color,
        fill_opacity=0.3
    ).add_to(venues_map)

# add one pin per coffee shop, with the venue name in its popup
for lat, lng, label in zip(mexico_city_coffee_shops['Venue Latitude'], mexico_city_coffee_shops['Venue Longitude'], mexico_city_coffee_shops['Venue']):
    folium.Marker([lat, lng], popup=label).add_to(venues_map)

venues_map

## Get venues info

In [57]:
mexico_city_coffee_shops

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue ID,Venue Latitude,Venue Longitude,Venue Category
320,Del Valle,19.394237,-99.167034,Café Passmar,4bc8d26c68f976b03b0c5c83,19.395669,-99.166141,Coffee Shop
331,Del Valle,19.394237,-99.167034,Momentto Café 100% Colombiano,51aa469f498e2fcb2aefbf76,19.395554,-99.168838,Coffee Shop
371,Del Valle,19.394237,-99.167034,El color de mi tierra,4cdd6379cea2224bc6de904c,19.397338,-99.166295,Café
374,Del Valle,19.394237,-99.167034,Starbucks,4bc4e60ce58e9521bc21c9e1,19.396978,-99.165122,Coffee Shop
386,Del Valle,19.394237,-99.167034,Starbucks,58f90a64dec1d656aaa74b23,19.397023,-99.166367,Coffee Shop
400,Del Valle,19.394237,-99.167034,Café El Pretexto,4cf3d56a94feb1f733c824ba,19.389754,-99.166955,Café
403,Del Valle,19.394237,-99.167034,El Cafe-to,4c490a6f6594be9ae6cf8a24,19.393482,-99.169818,Café
448,Nápoles,19.39377,-99.176602,Starbucks lobby WTC,576d6761498e3df00d40549f,19.394639,-99.173981,Coffee Shop
451,Nápoles,19.39377,-99.176602,Cielito Querido Café,4db0dec8fa8ca4b3e9ec3449,19.395085,-99.174784,Coffee Shop
452,Nápoles,19.39377,-99.176602,Hey! Brew Bar,55a2b168498e873deaff9378,19.392193,-99.179207,Café


For each neighborhood, look at the average rating and price, and from there decide where there is an opportunity to open a coffee shop

In [77]:
prices = []
ratings = []
for index, row in mexico_city_coffee_shops.iterrows():
    venue_id = row['Venue ID']
    url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
    result = requests.get(url).json()
    
    price = result['response']['venue']['price']['tier']
    rating = result['response']['venue']['rating']
    
    prices.append(price)
    ratings.append(rating)

KeyError: 'venue'
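
The `KeyError: 'venue'` above means the JSON response had no `venue` key under `response`. One plausible cause, which is an assumption here, is that the API returned an error payload (for example, when a request quota is exhausted) instead of the venue details. Below is a minimal sketch of a more defensive extraction; `extract_price_and_rating` is a hypothetical helper, and the sample dictionaries are made up to mimic the response shape accessed in the loop above.

```python
def extract_price_and_rating(result):
    """Return (price_tier, rating), using None for anything that is missing."""
    venue = result.get('response', {}).get('venue', {})
    price = venue.get('price', {}).get('tier')
    rating = venue.get('rating')
    return price, rating

# Made-up payloads shaped like the responses accessed in the loop above
complete = {'response': {'venue': {'price': {'tier': 2}, 'rating': 8.7}}}
no_price = {'response': {'venue': {'rating': 7.5}}}
error = {'meta': {'code': 429}, 'response': {}}

print(extract_price_and_rating(complete))  # (2, 8.7)
print(extract_price_and_rating(no_price))  # (None, 7.5)
print(extract_price_and_rating(error))     # (None, None)
```

Appending `None` for a missing value keeps `prices` and `ratings` the same length as `mexico_city_coffee_shops`, so the two lists can still be assigned as columns afterwards.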

In [87]:
mexico_city_coffee_shops['Price'] = prices
mexico_city_coffee_shops['Rating'] = ratings
mexico_city_coffee_shops

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue ID,Venue Latitude,Venue Longitude,Venue Category,Price,Rating
320,Del Valle,19.394237,-99.167034,Café Passmar,4bc8d26c68f976b03b0c5c83,19.395669,-99.166141,Coffee Shop,2,8.7
331,Del Valle,19.394237,-99.167034,Momentto Café 100% Colombiano,51aa469f498e2fcb2aefbf76,19.395554,-99.168838,Coffee Shop,2,8.3
371,Del Valle,19.394237,-99.167034,El color de mi tierra,4cdd6379cea2224bc6de904c,19.397338,-99.166295,Café,1,8.0
374,Del Valle,19.394237,-99.167034,Starbucks,4bc4e60ce58e9521bc21c9e1,19.396978,-99.165122,Coffee Shop,2,8.0
386,Del Valle,19.394237,-99.167034,Starbucks,58f90a64dec1d656aaa74b23,19.397023,-99.166367,Coffee Shop,1,7.5
400,Del Valle,19.394237,-99.167034,Café El Pretexto,4cf3d56a94feb1f733c824ba,19.389754,-99.166955,Café,1,7.3
403,Del Valle,19.394237,-99.167034,El Cafe-to,4c490a6f6594be9ae6cf8a24,19.393482,-99.169818,Café,1,6.7
448,Nápoles,19.39377,-99.176602,Starbucks lobby WTC,576d6761498e3df00d40549f,19.394639,-99.173981,Coffee Shop,1,8.6
451,Nápoles,19.39377,-99.176602,Cielito Querido Café,4db0dec8fa8ca4b3e9ec3449,19.395085,-99.174784,Coffee Shop,1,8.1
452,Nápoles,19.39377,-99.176602,Hey! Brew Bar,55a2b168498e873deaff9378,19.392193,-99.179207,Café,2,8.9
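
The warning shown for this cell appears because `mexico_city_coffee_shops` was created by filtering `mexico_city_venues`, so pandas cannot tell whether the new columns are also meant to affect the original frame. In pandas versions that emit this warning, a common way to avoid it is to take an explicit copy right after filtering. A sketch on a made-up three-row frame:

```python
import pandas as pd

# Made-up rows standing in for mexico_city_venues
venues = pd.DataFrame({
    'Venue': ['Shop 1', 'Taquería 1', 'Shop 2'],
    'Venue Category': ['Coffee Shop', 'Taco Place', 'Café'],
})

# .copy() makes the filtered frame independent of `venues`
coffee_shops = venues[
    venues['Venue Category'].isin(['Coffee Shop', 'Café'])
].copy()

# Adding columns now only touches the copy
coffee_shops['Price'] = [2, 1]
coffee_shops['Rating'] = [8.7, 8.0]
print(coffee_shops)
```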


In [94]:
mexico_city_coffee_shops[  ['Neighborhood', 'Price', 'Rating'] ] \
.groupby('Neighborhood') \
.agg({'Price':'mean', 'Rating':['mean', 'count']}) \
.reset_index()


Unnamed: 0_level_0,Neighborhood,Price,Rating,Rating
Unnamed: 0_level_1,Unnamed: 1_level_1,mean,mean,count
0,Del Valle,1.428571,7.785714,7
1,Nápoles,1.277778,8.055556,18
2,Polanco,1.6,8.46,5
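
The aggregation above produces two-level column headers (`Price`/`mean`, `Rating`/`mean`, `Rating`/`count`). If flat column names are preferred, named aggregation (available since pandas 0.25) is one option. The sketch below uses three made-up rows and the hypothetical output names `avg_price`, `avg_rating`, and `n_shops`.

```python
import pandas as pd

# Made-up rows with the same column names as mexico_city_coffee_shops
shops = pd.DataFrame({
    'Neighborhood': ['Del Valle', 'Del Valle', 'Polanco'],
    'Price': [2, 1, 2],
    'Rating': [8.0, 7.0, 9.0],
})

summary = (
    shops.groupby('Neighborhood')
    .agg(
        avg_price=('Price', 'mean'),     # mean price tier
        avg_rating=('Rating', 'mean'),   # mean rating
        n_shops=('Rating', 'count'),     # number of rated shops
    )
    .reset_index()
)
print(summary)
```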


# Final conclusions
To start our new coffee shop empire in Mexico City, I recommend opening it in **Del Valle**. The neighborhood is already known for its coffee shops, ranking second by number of coffee shops (7), but it is not as crowded as **Nápoles** (18). To stand out from the competition, the new shop must offer good service: the average rating in Del Valle is below 8 (about 7.79), so there is a business opportunity to exploit.

**What follows is leftover scratch code that helped at the beginning of this analysis**

```python
venue_id = '4c23d5f1f1272d7fce1282c5'
url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&v={}'.format(venue_id, CLIENT_ID, CLIENT_SECRET, VERSION)
url
```

```python
print(result['response']['venue']['id'])
print(result['response']['venue']['name'])
print(result['response']['venue']['stats'])
print(result['response']['venue']['price']['tier'])
print(result['response']['venue']['rating'])
```

``` python
# Get the ids of the categories of coffee shops
#                      Cafe                    Coffee Shop
categoriesId = '4bf58dd8d48988d16d941735,4bf58dd8d48988d1e0931735'
radius = 1000
print(categoriesId + ' .... OK!')
latitude = 19.414864
longitude = -99.176429
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&categoryId={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude, VERSION, categoriesId, radius, LIMIT)
url
```

``` python
results = requests.get(url).json()
results
```

``` python
# assign relevant part of JSON to venues
venues = results['response']['venues']

# transform venues into a dataframe
dataframe = json_normalize(venues)
dataframe.shape
```

``` python
filtered_columns = ['name', 'categories'] + [col for col in dataframe.columns if col.startswith('location.')] + ['id']
filtered_columns
```

``` python
df_coffee_shops = dataframe.loc[:, filtered_columns]
```

``` python
def get_category_type(row):
    categories_list = row['categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

# filter the category for each row
df_coffee_shops['categories'] = df_coffee_shops.apply(get_category_type, axis=1)

# clean column names by keeping only last term
df_coffee_shops.columns = [column.split('.')[-1] for column in df_coffee_shops.columns]

df_coffee_shops
```

``` python
df_coffee_shops['categories'].value_counts()
```

``` python
df_coffee_shops['name'].value_counts()
```