# Capstone Final Project - The Battle of the Neighborhoods - Exploring Bogota

## Introduction

In this project we explore the city of Bogota.

We will give tourists a top 5 of venues in each district, so they can try different experiences. Whenever tourists travel to a new place, they look for the best spots to explore nearby. To give good recommendations, we will use the Foursquare API to get the venues of each district and then recommend districts, together with the venues worth visiting in each.

## Description of the data that we will be using to solve the problem or execute the idea.

In this notebook we are going to explore the top venues of Bogota, the capital of Colombia.

For this analysis I chose two sources:
1. Bogota is divided into 20 districts (localidades): 19 urban and one rural. Many of them are among the oldest towns in Colombia. Each district is made up of a group of "UPZ" (Unidades de Planeamiento Zonal), and each UPZ is subdivided into neighborhoods. In this analysis we explore the districts. To get their geographical coordinates, we downloaded the CSV file from this site:
    
    Link: https://bogota-laburbano.opendatasoft.com/explore/dataset/georeferencia-puntual-por-localidad/table/
    
    
2. The Foursquare API, used to get data about venues in Bogota.

With this data we can explore the venues in each district and find the top venues to recommend to tourists.

###### Table of contents

    1. Import libraries
    2. Download and read the data
    3. Define Foursquare Credentials and Version
    4. Clustering districts
    

### 1. Import libraries

In [4]:
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe
import types
from botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share the notebook.
client_b3db8fa85a8946ad834caaa4f83f4eba = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='<YOUR_IBM_API_KEY>', # credential removed before sharing the notebook
    ibm_auth_endpoint="https://iam.cloud.ibm.com/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3-api.us-geo.objectstorage.service.networklayer.com')

print("Libraries imported")

Libraries imported


### 2. Download and read the data

In [5]:
body = client_b3db8fa85a8946ad834caaa4f83f4eba.get_object(Bucket='capstoneprojectcourse-donotdelete-pr-qxydsseiqruq4s',Key='localidades_bta.csv')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(body, "__iter__"): body.__iter__ = types.MethodType( __iter__, body )
bogota_data = pd.read_csv(body)
bogota_data.head()

Unnamed: 0,LOCALIDAD,LONGITUD,LATITUD
0,CHAPINERO,-74.0467,4.6569
1,TUNJUELITO,-74.1407,4.5875
2,ANTONIO NARINO,-74.1009,4.5486
3,BARRIOS UNIDOS,-74.084,4.6664
4,ENGATIVA,-74.1072,4.7071


In [6]:
bogota_data.rename(columns={"LOCALIDAD":"District","LONGITUD":"Longitude","LATITUD":"Latitude"}, inplace=True)
bogota_data = bogota_data[["District","Latitude","Longitude"]]
bogota_data.head()

Unnamed: 0,District,Latitude,Longitude
0,CHAPINERO,4.6569,-74.0467
1,TUNJUELITO,4.5875,-74.1407
2,ANTONIO NARINO,4.5486,-74.1009
3,BARRIOS UNIDOS,4.6664,-74.084
4,ENGATIVA,4.7071,-74.1072
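
When the notebook is run outside IBM Watson Studio, the same preparation can be done from a local copy of the file. The sketch below assumes the CSV has the `LOCALIDAD`, `LONGITUD` and `LATITUD` columns shown above. The inline sample (the first three rows of the output) and the helper name `load_districts` are illustrative assumptions, not part of the original notebook.

```python
import io
import pandas as pd

# Inline sample standing in for a local copy of localidades_bta.csv (assumption).
SAMPLE_CSV = """LOCALIDAD,LONGITUD,LATITUD
CHAPINERO,-74.0467,4.6569
TUNJUELITO,-74.1407,4.5875
ANTONIO NARINO,-74.1009,4.5486
"""

def load_districts(source):
    """Read the districts file and return District, Latitude, Longitude columns."""
    df = pd.read_csv(source)
    df = df.rename(columns={"LOCALIDAD": "District",
                            "LONGITUD": "Longitude",
                            "LATITUD": "Latitude"})
    # Fail early if a coordinate is outside the valid range.
    if not df["Latitude"].between(-90, 90).all():
        raise ValueError("Latitude out of range")
    if not df["Longitude"].between(-180, 180).all():
        raise ValueError("Longitude out of range")
    return df[["District", "Latitude", "Longitude"]]

districts = load_districts(io.StringIO(SAMPLE_CSV))
print(districts)
```

Passing a file path instead of the `io.StringIO` object would read a real local copy in the same way.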


### Get the geographical coordinates of Bogota

In [7]:
!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

Solving environment: done

# All requested packages already installed.



In [8]:
address = 'Bogota, CO'
geolocator = Nominatim(user_agent = "bo_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print("The geographical coordinates of Bogota are {}, {}".format(latitude, longitude))

The geographical coordinates of Bogota are 4.59808, -74.0760439


### Visualize the data

Create a map of Bogota with districts superimposed on top

In [9]:
!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

Solving environment: done

# All requested packages already installed.



In [10]:
map_bogota = folium.Map(location=[latitude, longitude], zoom_start=11)

#add markers to map
for lat, lng, localidad in zip(bogota_data["Latitude"], bogota_data["Longitude"], bogota_data["District"]):
    label = '{}'.format(localidad)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat,lng],
        radius=5,
        popup=label,
        color='green',
        fill=True,
        fill_color="#3186cc",
        fill_opacity=0.7,
        parse_html=False).add_to(map_bogota)

map_bogota

### 3. Define Foursquare Credentials and Version

In [11]:
CLIENT_ID = '<YOUR_FOURSQUARE_CLIENT_ID>' # your Foursquare ID
CLIENT_SECRET = '<YOUR_FOURSQUARE_CLIENT_SECRET>' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: <YOUR_FOURSQUARE_CLIENT_ID>
CLIENT_SECRET: <YOUR_FOURSQUARE_CLIENT_SECRET>
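
Keys written directly in a cell are easy to leak when the notebook is shared. One alternative, sketched here under the assumption that the keys are stored in environment variables named `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` (names chosen for illustration), is to read them at run time and print only a masked form. The helpers `load_foursquare_credentials` and `mask` are hypothetical.

```python
import os

def load_foursquare_credentials(env=os.environ):
    """Return (client_id, client_secret) from environment variables.

    The variable names are an assumption made for this sketch.
    """
    client_id = env.get("FOURSQUARE_CLIENT_ID")
    client_secret = env.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Foursquare credentials are not set in the environment")
    return client_id, client_secret

def mask(value, visible=4):
    """Show only the first few characters of a secret."""
    return value[:visible] + "*" * max(len(value) - visible, 0)

# Demonstration with a fake environment; real keys would come from os.environ.
fake_env = {"FOURSQUARE_CLIENT_ID": "ABCD1234",
            "FOURSQUARE_CLIENT_SECRET": "WXYZ5678"}
demo_id, demo_secret = load_foursquare_credentials(fake_env)
print("CLIENT_ID:", mask(demo_id))
print("CLIENT_SECRET:", mask(demo_secret))
```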


### Create a function to explore all districts in Bogota

In [12]:
def getNearbyVenues(names, latitudes, longitudes, radius=500, LIMIT=100):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
        
        #create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
                CLIENT_ID,
                CLIENT_SECRET,
                VERSION,
                lat,
                lng,
                radius,
                LIMIT)

        #make the GET request
        results = requests.get(url).json()["response"]["groups"][0]["items"]

        #return only relevant information for each nearby venue
        venues_list.append([(
            name,
            lat,
            lng,
            v["venue"]["name"],
            v["venue"]['id'],
            v["venue"]["location"]["lat"],
            v["venue"]["location"]["lng"],
            v["venue"]["categories"][0]["name"]) for v in results])
    
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['District', 
                  'District Latitude', 
                  'District Longitude', 
                  'Venue', 'id',
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
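
The function above indexes straight into the response, so it raises an error if `groups` is missing or a venue has an empty `categories` list. A more tolerant parsing step could look like the sketch below. The payload is hand-made and only mimics the keys the function reads, and the helper name `extract_venues` and the `"Uncategorized"` label are assumptions for illustration.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore response into rows, tolerating missing pieces."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        # Fall back to a label instead of raising IndexError on an empty list.
        category = categories[0]["name"] if categories else "Uncategorized"
        location = venue.get("location", {})
        rows.append((name, lat, lng,
                     venue.get("name"), venue.get("id"),
                     location.get("lat"), location.get("lng"),
                     category))
    return rows

# Hand-made payload with the same keys the function above reads (illustrative data).
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Venue A", "id": "a1",
               "location": {"lat": 4.65, "lng": -74.05},
               "categories": [{"name": "Trail"}]}},
    {"venue": {"name": "Venue B", "id": "b2",
               "location": {"lat": 4.66, "lng": -74.04},
               "categories": []}},
]}]}}

rows = extract_venues(sample, "CHAPINERO", 4.6569, -74.0467)
empty = extract_venues({"response": {}}, "CHAPINERO", 4.6569, -74.0467)
print(rows)
print(empty)
```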

### Now create a DataFrame by applying the above function to each district

In [13]:
bogota_venues= getNearbyVenues(names=bogota_data["District"],
                                latitudes=bogota_data["Latitude"],
                                longitudes=bogota_data["Longitude"])

CHAPINERO
TUNJUELITO
ANTONIO NARINO
BARRIOS UNIDOS
ENGATIVA
SUMAPAZ
TEUSAQUILLO
LA CANDELARIA
SANTA FE
SUBA
FONTIBON
LOS MARTIRES
SAN CRISTOBAL
USME
PUENTE ARANDA
USAQUEN
BOSA
CIUDAD BOLIVAR
RAFAEL URIBE URIBE
KENNEDY


In [14]:
print(bogota_venues.shape)

bogota_venues.head()

(149, 8)


Unnamed: 0,District,District Latitude,District Longitude,Venue,id,Venue Latitude,Venue Longitude,Venue Category
0,CHAPINERO,4.6569,-74.0467,Cerros Orientales Bogotá,5117bb47e4b046a9093beaf1,4.653562,-74.049712,Trail
1,TUNJUELITO,4.5875,-74.1407,Country's Pizza,4fafebb6e4b078fd5b9f20ac,4.590904,-74.138513,Fast Food Restaurant
2,TUNJUELITO,4.5875,-74.1407,Oma Parque de la 93,4e39da9881301b0bdc247080,4.589493,-74.142212,Café
3,TUNJUELITO,4.5875,-74.1407,Pescadería EL CHEFF MARINO,51d08e92498ef93fccacd5ac,4.589536,-74.138519,Fish & Chips Shop
4,TUNJUELITO,4.5875,-74.1407,Tienda Azul,5246715511d256064663ebf9,4.590491,-74.137632,Bar


Check how many venues were returned for each district:

In [15]:
bogota_venues.groupby("District").count()

District,District Latitude,District Longitude,Venue,id,Venue Latitude,Venue Longitude,Venue Category
ANTONIO NARINO,4,4,4,4,4,4,4
BARRIOS UNIDOS,24,24,24,24,24,24,24
BOSA,5,5,5,5,5,5,5
CHAPINERO,1,1,1,1,1,1,1
ENGATIVA,8,8,8,8,8,8,8
FONTIBON,6,6,6,6,6,6,6
KENNEDY,4,4,4,4,4,4,4
LA CANDELARIA,59,59,59,59,59,59,59
LOS MARTIRES,7,7,7,7,7,7,7
PUENTE ARANDA,5,5,5,5,5,5,5
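
The counts vary widely, from 1 venue for CHAPINERO to 59 for LA CANDELARIA, so any ranking for the sparse districts rests on very little data. The sketch below flags districts under a minimum count. The small frame is rebuilt from three of the counts above, and the threshold `MIN_VENUES = 5` is an illustrative assumption.

```python
import pandas as pd

# One row per venue, rebuilt from three of the counts shown above.
venues = pd.DataFrame({
    "District": ["CHAPINERO"] * 1 + ["KENNEDY"] * 4 + ["ENGATIVA"] * 8,
})

MIN_VENUES = 5  # illustrative threshold, not from the original notebook

counts = venues.groupby("District").size()
sparse = counts[counts < MIN_VENUES].sort_values()
print(sparse)
```

For districts flagged this way, calling `getNearbyVenues` with a larger `radius` may return more venues, although that has not been tested here.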


### Analyze each District

In [16]:
#one hot encoding
bogota_onehot = pd.get_dummies(bogota_venues[["Venue Category"]], prefix="", prefix_sep="")

#add district column back to dataframe
bogota_onehot["District"] = bogota_venues["District"]

#move district column to the first column
fixed_columns = [bogota_onehot.columns[-1]] + list(bogota_onehot.columns[:-1])
bogota_onehot = bogota_onehot[fixed_columns]

bogota_onehot.head()

Unnamed: 0,District,Argentinian Restaurant,Art Gallery,Art Museum,BBQ Joint,Bakery,Bar,Bookstore,Breakfast Spot,Burger Joint,Burrito Place,Café,Campground,Caribbean Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cultural Center,Deli / Bodega,Department Store,Dessert Shop,Diner,Dog Run,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Golf Course,Grocery Store,Gym / Fitness Center,History Museum,Hostel,Hot Dog Joint,Hotel,Italian Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant,Motorcycle Shop,Museum,Music Store,Nightlife Spot,Optical Shop,Paintball Field,Park,Performing Arts Venue,Peruvian Restaurant,Pizza Place,Plaza,Pub,Restaurant,Sandwich Place,Seafood Restaurant,Shopping Mall,Soccer Field,South American Restaurant,Steakhouse,Supermarket,Theater,Trail,Vegetarian / Vegan Restaurant,Water Park
0,CHAPINERO,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
1,TUNJUELITO,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,TUNJUELITO,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,TUNJUELITO,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,TUNJUELITO,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [17]:
bogota_grouped = bogota_onehot.groupby("District").sum().reset_index()
bogota_grouped.head()

Unnamed: 0,District,Argentinian Restaurant,Art Gallery,Art Museum,BBQ Joint,Bakery,Bar,Bookstore,Breakfast Spot,Burger Joint,Burrito Place,Café,Campground,Caribbean Restaurant,Clothing Store,Cocktail Bar,Coffee Shop,Comfort Food Restaurant,Concert Hall,Construction & Landscaping,Convenience Store,Cultural Center,Deli / Bodega,Department Store,Dessert Shop,Diner,Dog Run,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Food,French Restaurant,Fried Chicken Joint,Furniture / Home Store,Golf Course,Grocery Store,Gym / Fitness Center,History Museum,Hostel,Hot Dog Joint,Hotel,Italian Restaurant,Latin American Restaurant,Mediterranean Restaurant,Mexican Restaurant,Motorcycle Shop,Museum,Music Store,Nightlife Spot,Optical Shop,Paintball Field,Park,Performing Arts Venue,Peruvian Restaurant,Pizza Place,Plaza,Pub,Restaurant,Sandwich Place,Seafood Restaurant,Shopping Mall,Soccer Field,South American Restaurant,Steakhouse,Supermarket,Theater,Trail,Vegetarian / Vegan Restaurant,Water Park
0,ANTONIO NARINO,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0
1,BARRIOS UNIDOS,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,1,0,1,0,0,0,0,0,5,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,3,0,0,1,0,2,0,0,1,1,0,0,0,0,1
2,BOSA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0,1,0,0,0,0
3,CHAPINERO,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
4,ENGATIVA,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,2,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
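
Because the grouping uses `sum()`, `bogota_grouped` holds raw counts per category rather than shares. If relative frequencies are preferred, for example to compare districts with very different venue totals, each row can be divided by its total. This is a minimal sketch on a small stand-in frame with illustrative values, not the notebook's own data.

```python
import pandas as pd

# Small stand-in for bogota_grouped: raw counts per category (illustrative values).
grouped = pd.DataFrame({
    "District": ["A", "B"],
    "Bakery": [2, 0],
    "Pizza Place": [3, 1],
    "Trail": [0, 1],
})

counts = grouped.set_index("District")
# Divide every row by its total so each row sums to 1.
shares = counts.div(counts.sum(axis=1), axis=0)
print(shares.round(2))
```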


In [18]:
top_venues = 5

for loc in bogota_grouped["District"]:
    print("-----"+loc+"-----")
    temp = bogota_grouped[bogota_grouped["District"] == loc].T.reset_index()
    temp.columns = ["venue","freq"]
    temp = temp.iloc[1:]
    temp["freq"] = temp["freq"].astype(float)
    temp = temp.round({"freq":2})
    print(temp.sort_values("freq", ascending= False).reset_index(drop=True).head(top_venues))
    print("\n")

-----ANTONIO NARINO-----
                        venue  freq
0  Construction & Landscaping   1.0
1                  Restaurant   1.0
2                Burger Joint   1.0
3                  Campground   1.0
4                      Museum   0.0


-----BARRIOS UNIDOS-----
                venue  freq
0         Golf Course   5.0
1        Dessert Shop   3.0
2         Pizza Place   3.0
3              Bakery   2.0
4  Seafood Restaurant   2.0


-----BOSA-----
             venue  freq
0    Grocery Store   1.0
1      Supermarket   1.0
2  Motorcycle Shop   1.0
3       Restaurant   1.0
4    Shopping Mall   1.0


-----CHAPINERO-----
                      venue  freq
0                     Trail   1.0
1    Argentinian Restaurant   0.0
2  Mediterranean Restaurant   0.0
3              Optical Shop   0.0
4            Nightlife Spot   0.0


-----ENGATIVA-----
                  venue  freq
0  Fast Food Restaurant   2.0
1          Optical Shop   1.0
2           Supermarket   1.0
3   Fried Chicken Joint   1.0


Put the results into a DataFrame

In [19]:
#first write a function to sort the venues in descending order

def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [23]:
import numpy as np
num_top_venues = 5

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['District']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
district_venues_sorted = pd.DataFrame(columns=columns)
district_venues_sorted['District'] = bogota_grouped['District']

for ind in np.arange(bogota_grouped.shape[0]):
    district_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bogota_grouped.iloc[ind, :], num_top_venues)

district_venues_sorted

Unnamed: 0,District,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue
0,ANTONIO NARINO,Burger Joint,Restaurant,Campground,Construction & Landscaping,Fish & Chips Shop
1,BARRIOS UNIDOS,Golf Course,Dessert Shop,Pizza Place,Bakery,Seafood Restaurant
2,BOSA,Restaurant,Grocery Store,Supermarket,Motorcycle Shop,Shopping Mall
3,CHAPINERO,Trail,Water Park,Diner,Construction & Landscaping,Convenience Store
4,ENGATIVA,Fast Food Restaurant,Fried Chicken Joint,Convenience Store,Burger Joint,Optical Shop
5,FONTIBON,Bar,Fried Chicken Joint,Latin American Restaurant,Furniture / Home Store,Grocery Store
6,KENNEDY,Department Store,Coffee Shop,Soccer Field,Diner,Construction & Landscaping
7,LA CANDELARIA,Café,Restaurant,History Museum,Latin American Restaurant,Italian Restaurant
8,LOS MARTIRES,Shopping Mall,Burger Joint,Clothing Store,Steakhouse,Deli / Bodega
9,PUENTE ARANDA,Burger Joint,Burrito Place,Grocery Store,Latin American Restaurant,Cultural Center
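
Sorting alone does not separate zero counts from real ones, so districts with fewer than five categories are padded with categories that have no venues there. CHAPINERO, for instance, has a single venue (a Trail) yet four more categories are listed. The variant below leaves those slots empty instead. The helper name `top_nonzero_categories` and the sample rows are assumptions for illustration.

```python
import pandas as pd

def top_nonzero_categories(row, num_top_venues):
    """Return the most common categories, padding with None instead of zero-count ones."""
    counts = row.iloc[1:].astype(float)  # skip the District column
    counts = counts[counts > 0].sort_values(ascending=False, kind="mergesort")
    top = list(counts.index[:num_top_venues])
    return top + [None] * (num_top_venues - len(top))

# Stand-ins for rows of bogota_grouped (illustrative values).
row = pd.Series({"District": "CHAPINERO", "Bakery": 0, "Trail": 1, "Water Park": 0})
other = pd.Series({"District": "X", "Bakery": 2, "Trail": 5, "Water Park": 1})

print(top_nonzero_categories(row, 3))
print(top_nonzero_categories(other, 2))
```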
