## Capstone Project - The Battle of Neighborhoods (Week 2)

**A description of the problem and a discussion of the background.**   

Talking about typical Mexican drinks is an interesting topic for many people. The love for drinks like tequila, mezcal, and even pulque has always come from Mexicans. But Mexico, in addition to these so, has a huge variety of famous drinks that range from simple fresh waters to a called Tuba made with the flower of coconut trees in Colima and Nayarit.

Knowing this, we can say that it would be competitive to start a beverage business, more specifically a bar.

First of all, we must collect data from the entire beverage business in Mexico City, including its name, identification (address, latitude, longitude) and then look for the one that is most frequented by people. For active data we use FourSquare and apply folio to visualize a particular beverage business in which we will observe customer "traffic" and predict the appropriate location of a new bar in the city.

**A description of the data and how it will be used to solve the problem.** 

We will use the data collected from FourSquare to predict the proper location to start a new beverage business in town.

**Installing and Importing the required Libraries** 

In [2]:
import requests # library to handle requests
import pandas as pd # library for data analsysis
import numpy as np # library to handle data in a vectorized manner
import random # library for random number generation
!conda install -c conda-forge folium=0.5.0 --yes
import folium # plotting library

Collecting package metadata (current_repodata.json): done
Solving environment: done

# All requested packages already installed.



**Credentials for FourSquare** 

In [3]:
CLIENT_ID = 'D3S0DGXFOQB0OVI0J4AURKICYYGGH2FAI3XCMNPVAQNS4MIM' # your Foursquare ID
CLIENT_SECRET = 'BULV32R1WKYCMVCDBJPE1SDNNYJIK340A5BXJMEIQEXOXGBG' # your Foursquare Secret
VERSION = '20201104'
LIMIT = 50
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: D3S0DGXFOQB0OVI0J4AURKICYYGGH2FAI3XCMNPVAQNS4MIM
CLIENT_SECRET:BULV32R1WKYCMVCDBJPE1SDNNYJIK340A5BXJMEIQEXOXGBG


**Get request near Mexico City** 

In [4]:
import requests

request_parameters = {
    "client_id": CLIENT_ID,
    "client_secret": CLIENT_SECRET,
    "v": '20180605',
    "section": "drinks",
    "near": "Mexico City",
    "radius": 1000,
    "limit": 50}

data = requests.get("https://api.foursquare.com/v2/venues/explore", params=request_parameters)

**Transform data into json then request geocode.** 

In [5]:
d = data.json()["response"]
d.keys()

dict_keys(['suggestedFilters', 'geocode', 'headerLocation', 'headerFullLocation', 'headerLocationGranularity', 'query', 'totalResults', 'suggestedBounds', 'groups'])

In [6]:
d["headerLocationGranularity"], d["headerLocation"], d["headerFullLocation"]

('city', 'Mexico City', 'Mexico City')

In [7]:
d["suggestedBounds"], d["totalResults"]

({'ne': {'lat': 19.437794487323732, 'lng': -99.12664159911786},
  'sw': {'lat': 19.419832607587598, 'lng': -99.13756824320605}},
 52)

In [8]:
d["geocode"]

{'what': '',
 'where': 'mexico city',
 'center': {'lat': 19.42847, 'lng': -99.12766},
 'displayString': 'Mexico City, DF, Mexico',
 'cc': 'MX',
 'geometry': {'bounds': {'ne': {'lat': 19.515304989460464,
    'lng': -99.05579900650167},
   'sw': {'lat': 19.356858007471764, 'lng': -99.25983899084375}}},
 'slug': 'mexico-city',
 'longId': '72057594041458533'}

**We start creating group including information which is recommended.** 

In [9]:
d["groups"][0].keys()

dict_keys(['type', 'name', 'items'])

In [10]:
d["groups"][0]["type"], d["groups"][0]["name"]

('Recommended Places', 'recommended')

**Creating items of bars and their attributes - id, address, name, etc** 

In [11]:
items = d["groups"][0]["items"]
print("number of items: %i" % len(items))
items[0]

number of items: 50


{'reasons': {'count': 0,
  'items': [{'summary': 'This spot is popular',
    'type': 'general',
    'reasonName': 'globalInteractionReason'}]},
 'venue': {'id': '4f3f65d4e4b0ae0655090150',
  'name': 'Pulqueria la Elegante',
  'location': {'lat': 19.42349945614243,
   'lng': -99.12861024918861,
   'labeledLatLngs': [{'label': 'display',
     'lat': 19.42349945614243,
     'lng': -99.12861024918861}],
   'cc': 'MX',
   'state': 'Distrito Federal',
   'country': 'México',
   'formattedAddress': ['Distrito Federal', 'México']},
  'categories': [{'id': '50327c8591d4c4b30a586d5d',
    'name': 'Brewery',
    'pluralName': 'Breweries',
    'shortName': 'Brewery',
    'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/food/brewery_',
     'suffix': '.png'},
    'primary': True}],
  'photos': {'count': 0, 'groups': []}},
 'referralId': 'e-4-4f3f65d4e4b0ae0655090150-0'}

**Based on that we start to organize what we have got.** 

In [12]:
df_raw = []
for item in items:
    venue = item["venue"]
    categories, uid, name, location = venue["categories"], venue["id"], venue["name"], venue["location"]
    print(location)
    assert len(categories) == 1
    shortname = categories[0]["shortName"]
    address =  ''
    if hasattr(location, 'address'):
      address = location['address']
    if not "postalCode" in location:
        continue
    postalcode = location["postalCode"]
    lat = location["lat"]
    lng = location["lng"]
    datarow = (uid, name, shortname, address, postalcode, lat, lng)
    df_raw.append(datarow)
df = pd.DataFrame(df_raw, columns=["uid", "name", "shortname", "address", "postalcode", "lat", "lng"])
df.head()

{'lat': 19.42349945614243, 'lng': -99.12861024918861, 'labeledLatLngs': [{'label': 'display', 'lat': 19.42349945614243, 'lng': -99.12861024918861}], 'cc': 'MX', 'state': 'Distrito Federal', 'country': 'México', 'formattedAddress': ['Distrito Federal', 'México']}
{'address': 'Alhóndiga 26', 'lat': 19.431050123135268, 'lng': -99.12713826475824, 'labeledLatLngs': [{'label': 'display', 'lat': 19.431050123135268, 'lng': -99.12713826475824}], 'cc': 'MX', 'city': 'Ciudad de México', 'state': 'Distrito Federal', 'country': 'México', 'formattedAddress': ['Alhóndiga 26', 'Ciudad de México, Distrito Federal', 'México']}
{'address': 'República de Guatemala 4', 'crossStreet': 'República de Brasil', 'lat': 19.43530583753121, 'lng': -99.13365729219127, 'labeledLatLngs': [{'label': 'display', 'lat': 19.43530583753121, 'lng': -99.13365729219127}], 'postalCode': '06020', 'cc': 'MX', 'neighborhood': 'Downtown', 'city': 'Ciudad de México', 'state': 'Distrito Federal', 'country': 'México', 'formattedAddres

Unnamed: 0,uid,name,shortname,address,postalcode,lat,lng
0,4f67bea9c2ee09e7b3e57225,Terraza Catedral,Beer Garden,,6020,19.435306,-99.133657
1,4d0dbb4dbe6d6ea800e706b5,Hostería La Bota,Tapas,,6050,19.427067,-99.137017
2,4b0586fef964a520c47922e3,La Casa de las Sirenas,Cocktail,,6000,19.434975,-99.132292
3,51a0208d498e210b5bd14aec,Terraza Regina,Beer Garden,,6090,19.42777,-99.135235
4,50c7edd9e4b0dec51dff8ef4,Hilaria Gastrobar,Bar,,6000,19.433335,-99.135603


**As we can see that there are many bars without address we need to execute hasattr() to determine if each object (beverage business) has a attribute (address).**

**Next step we will execute a very important part - get coordinates of Mexico City and create folium map which will help visualize what we have got from data.** 

In [13]:
mc_center = d["geocode"]["center"]
mc_center

{'lat': 19.42847, 'lng': -99.12766}

In [14]:
from folium import plugins

map_mc = folium.Map(location=[19.42847, -99.12766], zoom_start=14)

def add_markers(df):
    for (j, row) in df.iterrows():
        label = folium.Popup(row["name"], parse_html=True)
        folium.CircleMarker(
            [row["lat"], row["lng"]],
            radius=5,
            popup=label,
            color='red',
            fill=True,
            fill_color='#3186cc',
            fill_opacity=0.7,
            parse_html=False).add_to(map_mc)

add_markers(df)
hm_data = df[["lat", "lng"]].to_numpy().tolist()
map_mc.add_child(plugins.HeatMap(hm_data))

map_mc

**On the map we can see the busiest bars in the center of Mexico City**   
**We need a place near and at a certain distance from the busiest places to decrease business competitiveness** 

In [17]:
lat = 19.42847
lng = -99.12766
map_mc = folium.Map(location=[lat, lng], zoom_start=16)
add_markers(df)
folium.CircleMarker(
    [lat, lng],
    radius=15,
    popup="My New Bar!",
    color='green',
    fill=True,
    fill_color='#3186cc',
    fill_opacity=0.7,
    parse_html=False).add_to(map_mc)
map_mc

**Here we can see the hypothetical place where the new bar will be. It is a place in the center of Mexico City near tourist places where nightlife is better in the company of friends and why not a good drink.**  