<h1 align=center><font size = 5> ANALYSIS OF DESIRABLE NEIGHBORHOODS IN LATIN-AMERICAN CITIES FOR EXPANSION</font></h1>


##### IBM Data Science Applied Capstone Project - April 2020

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Introduction</a>

2. <a href="#item2">Data</a>

3. <a href="#item3">Methodology</a>

4. <a href="#item4">Results</a>

5. <a href="#item5">Discussion</a>    

6. <a href="#item5">Conclusion</a>   
    

## 1. Introduction:

A renowned Colombian restaurant chain is evaluating a business expansion that includes opening four new restaurants in Latin American cities before the end of 2020. The effects of COVID-19 on the food business may create good opportunities for new entrants to the market. Venue selection plays a major role in maximizing profits and mitigating deployment risks, so the first stage of selection is to locate the neighborhoods whose conditions are closest to those of the restaurants the firm already operates in Colombia.

## 2. Data:

This report analyzes the neighborhoods of four Latin American cities in order to classify them and to identify, in each city, one or more neighborhoods whose characteristics are closest to those of the neighborhoods where the company already operates in Colombia. This is the first step in the venue selection process. The analysis requires information on all neighborhoods in Rio de Janeiro, Lima, Buenos Aires, Montevideo and Bogota, organized as one dataset per city containing neighborhood names and coordinates.

Finally, an additional dataset is required for Bogota in order to select only the neighborhoods where the firm has restaurants in operation.


## 3. Methodology

Data collection starts by acquiring the list of neighborhoods for each pre-selected city from Wikipedia, which is easily accessed online. The project is split into several notebooks to keep the results and analysis readable, and each stage links to its sub-project notebook. Once the neighborhood data has been obtained and processed, Foursquare is used to characterize each neighborhood by the venues available in it. One more dataset is needed: the neighborhoods in Bogota where the firm already operates its restaurants successfully. This dataset is the reference against which the neighborhoods of the other cities are compared.

Neighborhoods are clustered with k-means, chosen for its simplicity of implementation and its effectiveness. The expected result is a selection of neighborhoods that offer types of services or venues similar to those around the existing operation in Bogota.
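As an illustration of the clustering idea described above, here is a minimal sketch on made-up data. The neighborhood names, the three venue categories, the frequency values and the choice of two clusters are all assumptions for the example, not values from this project.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-frequency vectors: one row per neighborhood,
# one column per venue category (restaurant, bar, park).
names = ["Bogota A", "Bogota B", "Candidate 1", "Candidate 2", "Candidate 3"]
freqs = np.array([
    [0.80, 0.20, 0.00],  # restaurant-heavy (existing location)
    [0.70, 0.30, 0.00],  # restaurant-heavy (existing location)
    [0.75, 0.25, 0.00],  # profile close to the Bogota rows
    [0.00, 0.10, 0.90],  # park-heavy
    [0.10, 0.00, 0.90],  # park-heavy
])

# n_clusters=2 is an arbitrary choice for this toy example.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(freqs)
labels = dict(zip(names, kmeans.labels_))

# Candidates that fall in the same cluster as an existing Bogota location.
bogota_clusters = {labels["Bogota A"], labels["Bogota B"]}
similar = [n for n in names
           if n.startswith("Candidate") and labels[n] in bogota_clusters]
print(similar)
```

Comparing cluster membership rather than the numeric label avoids depending on which cluster k-means happens to call 0 or 1.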

The results are limited to a first look at neighborhoods where conditions are similar to those of the existing restaurants; they do not designate specific venues. Designating specific venues requires a later stage that includes budget and viability analysis.

### Stage 1. Datasets acquisition:

#### 1.1 WEB SCRAPING TO COLLECT NEIGHBORHOODS INFORMATION

The first step in data acquisition was to scrape the neighborhood information from Wikipedia. Because this takes considerable computation time, only the resulting dataset is used here. The full data acquisition process is available at the following link:

[Webscraping Notebook](https://github.com/AndresReinoso/Coursera_Capstone/blob/master/WEBSCRAPING%20BOGOTA.ipynb)

Note that this link only shows the process for Bogota, but it can easily be extended to the remaining cities. To keep this notebook short, a pre-processed CSV file with the results is imported instead. This is also convenient because both geopy and Foursquare limit the number of calls per day.

**Note**. This first step was done on Watson Studio because the Skills Network environment does not have BeautifulSoup (BS) installed.
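To give a feel for the scraping step without opening the linked notebook, here is a minimal sketch. It is not the notebook's code: the linked notebook uses BeautifulSoup, while this sketch swaps in the standard library's `html.parser` so it runs without extra installs. The HTML fragment is a hypothetical stand-in for a Wikipedia list.

```python
from html.parser import HTMLParser

# Hypothetical fragment standing in for a Wikipedia list of neighborhoods.
SAMPLE_HTML = """
<ul id="neighborhoods">
  <li><a href="/wiki/Caju">Caju</a></li>
  <li><a href="/wiki/Gamboa">Gamboa</a></li>
  <li><a href="/wiki/Santo_Cristo">Santo Cristo</a></li>
</ul>
"""

class ListItemParser(HTMLParser):
    """Collect the text content of every <li> element."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Only keep text that appears inside a list item.
        if self.in_item:
            self.items[-1] += data

parser = ListItemParser()
parser.feed(SAMPLE_HTML)
neighborhoods = [item.strip() for item in parser.items]
print(neighborhoods)
```

Real Wikipedia pages mix lists and tables, so the element to target would need to be adapted for each city.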

#### 1.2 OBTAINING COORDINATES USING GEOPY 

Another important step in acquiring the data is the query that obtains the geographical coordinates of each neighborhood. I found two restrictions here: the query takes a long time to process for the more than 1,900 neighborhoods in Bogota, and only 2,500 calls are possible per day. It is therefore difficult to implement this in one single notebook.

[Geopy Notebook](https://github.com/AndresReinoso/Coursera_Capstone/blob/master/GEOPY%20COORDINATES%20SEARCH.ipynb)

So in this case a pre-processed CSV file from my Watson Studio project is used again. The Geopy notebook linked above shows in detail how the datasets with coordinates were built.
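The lookup described above can be sketched as a loop that pauses between calls. This is not the code from the linked notebook: `geocode_all`, `fake_geocode` and the one-second default delay are hypothetical names and values chosen for illustration. The stand-in geocoder lets the sketch run offline; a real run would pass the `geocode` method of geopy's `Nominatim` instead.

```python
import time
from collections import namedtuple

def geocode_all(names, geocode, city, delay_seconds=1.0):
    """Return {name: (lat, lon) or None} using the supplied geocode callable.

    `geocode` takes a query string and returns an object with `.latitude`
    and `.longitude` attributes, or None when nothing is found (the same
    contract as geopy's Nominatim.geocode).
    """
    coords = {}
    for name in names:
        if name in coords:            # skip names already resolved
            continue
        location = geocode("{}, {}".format(name, city))
        coords[name] = (location.latitude, location.longitude) if location else None
        time.sleep(delay_seconds)     # pause to stay under the service's rate limit
    return coords

# Stand-in geocoder so the sketch runs without network access.
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])

def fake_geocode(query):
    known = {"Barranco, Lima": FakeLocation(-12.143959, -77.020268)}
    return known.get(query)

result = geocode_all(["Barranco", "Unknown Place"], fake_geocode, "Lima",
                     delay_seconds=0)
print(result)
```

Names that cannot be resolved are stored as `None` so they can be reviewed or retried later instead of stopping the whole run.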

#### 1.3 LOADING THE CSV FILES FOR ALL LOCATIONS WITH COORDINATES

With the web scraping and the coordinate lookup done, the datasets can be loaded into the main notebook. First, let's import the libraries needed to move forward:

In [1]:
import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # comment out this line if geopy is already installed
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means for the clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # comment out this line if folium is already installed
import folium # map rendering library

print('Libraries imported.')


Loading the CSV files

We start with Rio de Janeiro; the same process is repeated for all datasets.

In [69]:
df_rio = pd.read_csv(r'/resources/labs/DP0701EN/RIOCOORDINATES.csv')
df_rio2 = df_rio.drop(['Latitude', 'Longitude'], axis=1)
df_rio2['City'] = "Rio de Janeiro"  # used later to build the list of neighborhoods by city
df_rio.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Caju,-22.880306,-43.221494
1,Gamboa,-22.897749,-43.192904
2,Santo Cristo,-22.900766,-43.203393
3,Saúde,-22.897184,-43.184154
4,Centro,-22.904393,-43.183065


In [70]:
df_lima = pd.read_csv(r'/resources/labs/DP0701EN/LIMACOORDINATES.csv')
df_lima2 = df_lima.drop(['Latitude', 'Longitude'], axis=1)
df_lima2['City'] = "Lima"  # used later to build the list of neighborhoods by city
df_lima.head() 

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Ancón,-11.696554,-77.111655
1,Ate Vitarte,-12.036748,-76.932624
2,Barranco,-12.143959,-77.020268
3,Breña,-12.0597,-77.050119
4,Carabayllo,-11.794993,-76.989292


In [71]:
df_baires = pd.read_csv(r'/resources/labs/DP0701EN/BAIRESCOORDINATES.csv')
df_baires2 = df_baires.drop(['Latitude', 'Longitude'], axis=1)
df_baires2['City'] = "Buenos Aires"  # used later to build the list of neighborhoods by city
df_baires.head() 

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Agronomía,-34.591516,-58.485385
1,Almagro,-34.609988,-58.422233
2,Barracas,-34.645285,-58.387562
3,Belgrano,-34.561308,-58.456545
4,Boedo,-34.630252,-58.41879


In [72]:
df_montev = pd.read_csv(r'/resources/labs/DP0701EN/MONTEVCOORDINATES.csv')
df_montev2 = df_montev.drop(['Latitude', 'Longitude'], axis=1)
df_montev2['City'] = "Montevideo"  # used later to build the list of neighborhoods by city
df_montev.head() 

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Ciudad Vieja,-34.906351,-56.20598
1,Centro,-34.906067,-56.189656
2,Barrio Sur,-34.911202,-56.194784
3,Cordón,-34.900827,-56.180125
4,Palermo,-34.911351,-56.184365


Finally, the dataset with coordinates for Bogota. Note that for Bogota a much smaller dataset than the original Wikipedia list is used: to keep things simple, it only includes the neighborhoods where the firm under analysis already has restaurants in operation.

In [74]:
df_bogota = pd.read_csv(r'/resources/labs/DP0701EN/NeighborhoodsBog.csv')
df_bogota2 = df_bogota.drop(['Neighborhood Latitude', 'Neighborhood Longitude'], axis=1)
df_bogota2['City'] = "Bogota"  # used later to build the list of neighborhoods by city
df_bogota.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude
0,Ainsuca,4.630006,-74.066195
1,Belmira,4.719519,-74.030192
2,La Sonora,4.61053,-74.113048
3,Nueva Autopista,4.721539,-74.048015
4,Escuela de Infantería,4.683122,-74.043817


In [7]:
# Rename the columns so that all city datasets share the same column names and can be merged.
df_bogota.rename(columns={'Neighborhood Latitude': 'Latitude', 'Neighborhood Longitude': 'Longitude'}, inplace=True)
df_bogota.tail()

Unnamed: 0,Neighborhood,Latitude,Longitude
43,El Motorista,4.636931,-74.098484
44,Bombay,4.657891,-74.060228
45,Las Torres,4.619145,-74.084762
46,Casa Loma,4.631093,-74.070698
47,Perpetuo Socorro,4.612864,-74.065908


### Stage 2. Cleaning up and merging the data:

In [8]:
# Now let's merge the different dataframes into a single one.
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0; the original row indexes are kept.
df_total = pd.concat([df_rio, df_baires, df_lima, df_montev, df_bogota])
df_total.tail() # check that the dataset is complete

Unnamed: 0,Neighborhood,Latitude,Longitude
43,El Motorista,4.636931,-74.098484
44,Bombay,4.657891,-74.060228
45,Las Torres,4.619145,-74.084762
46,Casa Loma,4.631093,-74.070698
47,Perpetuo Socorro,4.612864,-74.065908


In [9]:
# First, check whether any rows contain NaN values
nan_df=df_total[df_total.isna().any(axis=1)]
nan_df.head()

Unnamed: 0,Neighborhood,Latitude,Longitude


No NaN data was found, so next we check whether duplicates need to be removed.

In [10]:
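# Remove duplicates from the dataset.
# Note: matching on the name alone also drops same-named neighborhoods in different cities
# (for example, "Centro" appears in both Rio de Janeiro and Montevideo above).
df_total.drop_duplicates(subset='Neighborhood', keep="first", inplace=True)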
df_total.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Caju,-22.880306,-43.221494
1,Gamboa,-22.897749,-43.192904
2,Santo Cristo,-22.900766,-43.203393
3,Saúde,-22.897184,-43.184154
4,Centro,-22.904393,-43.183065


In [11]:
# Now checking our final dataset shape
df_total.shape

(356, 3)
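Because several cities share neighborhood names, one alternative worth considering is to carry a `City` column through the merge and deduplicate on the pair. The sketch below is not what this notebook does. It uses two toy dataframes built from rows shown earlier, and the `frames` dictionary and `df_all` name are assumptions for the example.

```python
import pandas as pd

# Toy stand-ins for two of the per-city dataframes (rows taken from the outputs above).
df_rio = pd.DataFrame({"Neighborhood": ["Centro", "Gamboa"],
                       "Latitude": [-22.904393, -22.897749],
                       "Longitude": [-43.183065, -43.192904]})
df_montev = pd.DataFrame({"Neighborhood": ["Centro", "Palermo"],
                          "Latitude": [-34.906067, -34.911351],
                          "Longitude": [-56.189656, -56.184365]})

frames = {"Rio de Janeiro": df_rio, "Montevideo": df_montev}

# Tag each row with its city, then stack everything into one dataframe.
df_all = pd.concat(
    [df.assign(City=city) for city, df in frames.items()],
    ignore_index=True,
)

# Deduplicate on the (City, Neighborhood) pair instead of the name alone.
df_all = df_all.drop_duplicates(subset=["City", "Neighborhood"], keep="first")
print(df_all[["City", "Neighborhood"]])
```

With this approach both "Centro" rows survive, and later grouping steps would also need to group by city and neighborhood together.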

### Stage 3. Obtaining venues information

With the full list of neighborhoods ready, we query Foursquare for the venues in every neighborhood of the dataset. We then sort the results and select the 10 most common venue categories per neighborhood, which serve as the basis for organizing the clusters.
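Once per-neighborhood category frequencies are available (they are computed in Stage 4), the ranking step described above could look like the sketch below. The toy table, the `top_venues` helper and the cut-off of 2 are assumptions for illustration; the analysis itself uses the 10 most common categories.

```python
import pandas as pd

# Toy frequency table: mean one-hot value per category (hypothetical numbers).
grouped = pd.DataFrame({
    "Neighborhood": ["A", "B"],
    "Bar": [0.5, 0.1],
    "Bakery": [0.3, 0.2],
    "Park": [0.2, 0.7],
})

def top_venues(row, n):
    """Return the n category names with the highest frequency in this row."""
    freqs = row.drop("Neighborhood").astype(float)
    return freqs.sort_values(ascending=False).index[:n].tolist()

top_n = 2  # the analysis uses 10; 2 keeps the toy output readable
ranked = {row["Neighborhood"]: top_venues(row, top_n)
          for _, row in grouped.iterrows()}
print(ranked)
```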

Obtaining Foursquare credentials:

In [20]:
CLIENT_ID = 'YOUR_CLIENT_ID' # your Foursquare ID (placeholder; keep real credentials out of shared notebooks)
CLIENT_SECRET = 'YOUR_CLIENT_SECRET' # your Foursquare Secret (placeholder)
VERSION = '20180605' # Foursquare API version
REGISTERED_REDIRECT_URI = 'https://www.google.com'
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_CLIENT_ID
CLIENT_SECRET: YOUR_CLIENT_SECRET


In [21]:
# Foursquare authentication process
# Step 1: build the authorization URL and open it in a browser
url = 'https://foursquare.com/oauth2/authenticate?client_id='+CLIENT_ID+'&response_type=code&redirect_uri='+REGISTERED_REDIRECT_URI
print(url)

https://foursquare.com/oauth2/authenticate?client_id=YOUR_CLIENT_ID&response_type=code&redirect_uri=https://www.google.com


In [22]:
# Step 2: paste the value of the `code` parameter from the redirected Google URL
# (copy only the code itself, without the trailing '#_=_' fragment the browser appends)
code = 'YOUR_AUTHORIZATION_CODE'

In [23]:
# Step 3: build the URL that returns the access token
url2 = 'https://foursquare.com/oauth2/access_token?client_id='+CLIENT_ID+'&client_secret='+CLIENT_SECRET+'&grant_type=authorization_code&redirect_uri='+REGISTERED_REDIRECT_URI+'&code='+code
print(url2)

https://foursquare.com/oauth2/access_token?client_id=YOUR_CLIENT_ID&client_secret=YOUR_CLIENT_SECRET&grant_type=authorization_code&redirect_uri=https://www.google.com&code=YOUR_AUTHORIZATION_CODE
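The URLs above are built by string concatenation, which leaves special characters in the redirect URI unescaped. A minimal alternative sketch using the standard library's `urlencode` is shown below. The `build_url` helper and the `YOUR_CLIENT_ID` placeholder are assumptions for illustration; the endpoint and parameter names are the ones already used in this notebook.

```python
from urllib.parse import urlencode

def build_url(base, **params):
    """Append URL-encoded query parameters to a base URL."""
    return base + "?" + urlencode(params)

auth_url = build_url(
    "https://foursquare.com/oauth2/authenticate",
    client_id="YOUR_CLIENT_ID",             # placeholder, not a real credential
    response_type="code",
    redirect_uri="https://www.google.com",
)
print(auth_url)
```

The same helper could build the access-token URL by passing `client_secret`, `grant_type`, `redirect_uri` and `code` as keyword arguments.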


In [24]:
# Store the access token returned by the URL above (placeholder shown here)
ACCESS = 'YOUR_ACCESS_TOKEN'
LIMIT = 100
radius = 500

In [25]:
# Function to obtain the nearby venues
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    venues_list = []
    for name, lat, lng in zip(names, latitudes, longitudes):
        # build the explore request for this neighborhood
        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&oauth_token={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            ACCESS)
        # make the GET request and keep the list of returned items
        results = requests.get(url).json()["response"]['groups'][0]['items']
        # keep only the relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    # flatten the per-neighborhood lists into a single dataframe
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']   
    return nearby_venues
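The function above indexes straight into the JSON reply, so an empty or error response, or a venue with no category, would raise an exception partway through the loop. The sketch below shows one defensive way to flatten a reply. The `extract_venues` helper and the hand-written `sample` payload are assumptions for illustration; the payload contains only the fields the function above reads, not the full Foursquare response format.

```python
def extract_venues(payload):
    """Flatten an explore-style reply into (name, lat, lng, category) tuples."""
    groups = payload.get("response", {}).get("groups", [])
    items = groups[0].get("items", []) if groups else []
    rows = []
    for item in items:
        venue = item["venue"]
        categories = venue.get("categories") or []
        # Venues without a category get None instead of raising IndexError.
        category = categories[0]["name"] if categories else None
        rows.append((venue["name"],
                     venue["location"]["lat"],
                     venue["location"]["lng"],
                     category))
    return rows

# Hand-written payload mimicking the fields read by getNearbyVenues.
sample = {"response": {"groups": [{"items": [
    {"venue": {"name": "Mississippi Delta Blues Bar",
               "location": {"lat": -22.896467, "lng": -43.19461},
               "categories": [{"name": "Bar"}]}},
    {"venue": {"name": "Unnamed spot",
               "location": {"lat": -22.9, "lng": -43.2},
               "categories": []}},
]}]}}

print(extract_venues(sample))
print(extract_venues({"response": {}}))  # an empty reply yields an empty list
```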

In [26]:
# Apply the function on our df_total dataset
Venues_results = getNearbyVenues(names=df_total['Neighborhood'],
                                   latitudes=df_total['Latitude'],
                                   longitudes=df_total['Longitude']
                                  )

In [27]:
Venues_results.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Caju,-22.880306,-43.221494,Viação Cometa S/A,-22.87809,-43.2238,Bus Station
1,Caju,-22.880306,-43.221494,Seninha,-22.882607,-43.217525,Gun Range
2,Gamboa,-22.897749,-43.192904,Instituto de Pesquisa e Memória Pretos Novos (...,-22.895975,-43.192908,History Museum
3,Gamboa,-22.897749,-43.192904,Mississippi Delta Blues Bar,-22.896467,-43.19461,Bar
4,Gamboa,-22.897749,-43.192904,Tocando o terror no Youpix com a DarkSide®,-22.896161,-43.188457,Bookstore


In [28]:
# Save the full dataset to a CSV file so it is available if the information is needed again.
Venues_results.to_csv('Venues_results_total.csv',index=False)

### Stage 4. Analyze Each Neighborhood

In [29]:
# one hot encoding
Neighborhoods_onehot = pd.get_dummies(Venues_results[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
# (Foursquare also has a venue category called "Neighborhood"; when it is present,
# this assignment overwrites that dummy column in place instead of appending a new one)
Neighborhoods_onehot['Neighborhood'] = Venues_results['Neighborhood'] 

# move the last column to the front (this is the neighborhood column only if it was appended above)
fixed_columns = [Neighborhoods_onehot.columns[-1]] + list(Neighborhoods_onehot.columns[:-1])
Neighborhoods_onehot = Neighborhoods_onehot[fixed_columns]

Neighborhoods_onehot.head()

Unnamed: 0,Yoga Studio,ATM,Acai House,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Amphitheater,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auditorium,Austrian Restaurant,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Basketball Stadium,Bathing Area,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cable Car,Cafeteria,Café,Cajun / Creole Restaurant,Campground,Candy Store,Caribbean Restaurant,Casino,Cave,Central Brazilian Restaurant,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Quad,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dive Shop,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Entertainment Service,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Film Studio,Fire Station,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Football Stadium,Fountain,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Go Kart Track,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hockey Field,Home Service,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Insurance Office,Internet Cafe,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lawyer,Leather Goods Store,Light Rail Station,Lighting Store,Lingerie Store,Liquor Store,Lottery Retailer,Lounge,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Military Base,Mineiro Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Nature Preserve,Neighborhood,New American Restaurant,Nightclub,Noodle House,Northeastern Brazilian Restaurant,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Outdoors & Recreation,Outlet Store,Paintball Field,Paper / Office Supplies Store,Park,Pastelaria,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Piercing Parlor,Pilates Studio,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,River,Road,Rock Club,Roof Deck,Rugby Stadium,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Salsa Club,Samba School,Sandwich Place,Sausage Shop,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Ski Trail,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southeastern Brazilian Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Stationery Store,Steakhouse,Street Art,Street Fair,Student Center,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Swiss Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tapiocaria,Tattoo Parlor,Taxi Stand,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tiki Bar,Toll Booth,Tourist Information Center,Toy / Game Store,Track,Trail,Train,Train Station,Tram Station,Tunnel,Turkish Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Waterfall,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Caju,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Caju,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Gamboa,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Gamboa,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,Gamboa,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [30]:
Neighborhoods_onehot.shape

(8471, 398)
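In the header of the one-hot table above, `Yoga Studio` comes first and `Neighborhood` sits among the venue categories. This suggests the venue data contains a category literally named "Neighborhood", so assigning the label column overwrote that dummy column in place and the reordering step moved the last category to the front instead. The sketch below shows one way to avoid the clash on toy data. The renamed column label and the use of `DataFrame.insert` are assumptions for illustration, not the notebook's method.

```python
import pandas as pd

# Toy venue list; "Neighborhood" appears as a venue category on purpose.
venues = pd.DataFrame({
    "Neighborhood": ["Caju", "Caju", "Gamboa"],
    "Venue Category": ["Bus Station", "Neighborhood", "Bar"],
})

onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="")

# Rename the clashing category so it survives as its own feature.
onehot = onehot.rename(columns={"Neighborhood": "Neighborhood (venue category)"})

# Insert the label column at position 0 instead of reordering afterwards.
onehot.insert(0, "Neighborhood", venues["Neighborhood"])
print(onehot.columns.tolist())
```

In this notebook the grouped table further down is unaffected in layout, because `groupby('Neighborhood')` followed by `reset_index()` places the label column first either way; only the overwritten category is lost as a feature.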

In [31]:
Neighborhoods_grouped = Neighborhoods_onehot.groupby('Neighborhood').mean().reset_index()
Neighborhoods_grouped.head()

Unnamed: 0,Neighborhood,Yoga Studio,ATM,Acai House,Accessories Store,Adult Boutique,Afghan Restaurant,American Restaurant,Amphitheater,Arcade,Arepa Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Auditorium,Austrian Restaurant,Auto Dealership,Auto Garage,Auto Workshop,Automotive Shop,BBQ Joint,Baby Store,Bagel Shop,Bakery,Bank,Bar,Baseball Field,Basketball Court,Basketball Stadium,Bathing Area,Beach,Beach Bar,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Belgian Restaurant,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Board Shop,Boarding House,Boat or Ferry,Bookstore,Border Crossing,Botanical Garden,Boutique,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Bridge,Bubble Tea Shop,Buffet,Building,Burger Joint,Burrito Place,Bus Line,Bus Station,Bus Stop,Business Service,Butcher,Cable Car,Cafeteria,Café,Cajun / Creole Restaurant,Campground,Candy Store,Caribbean Restaurant,Casino,Cave,Central Brazilian Restaurant,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Club House,Cocktail Bar,Coffee Shop,College Academic Building,College Quad,Colombian Restaurant,Comedy Club,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Convention Center,Cosmetics Shop,Costume Shop,Coworking Space,Creperie,Cuban Restaurant,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Design Studio,Dessert Shop,Dim Sum Restaurant,Diner,Discount Store,Dive Bar,Dive Shop,Doctor's Office,Dog Run,Donut Shop,Drugstore,Dumpling Restaurant,Eastern European Restaurant,Electronics Store,Empanada Restaurant,English Restaurant,Entertainment Service,Event Service,Event Space,Exhibit,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Film Studio,Fire Station,Fish & Chips Shop,Fish Market,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Service,Food Stand,Food Truck,Football Stadium,Fountain,Frame Store,French Restaurant,Fried Chicken Joint,Frozen Yogurt Shop,Fruit & Vegetable Store,Furniture / Home Store,Gaming Cafe,Garden,Garden Center,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Go Kart Track,Golf Course,Gourmet Shop,Government Building,Greek Restaurant,Grocery Store,Gun Range,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Harbor / Marina,Hardware Store,Hawaiian Restaurant,Health & Beauty Service,Health Food Store,Heliport,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Hockey Arena,Hockey Field,Home Service,Hostel,Hot Dog Joint,Hotel,Hotel Bar,Hotel Pool,IT Services,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Insurance Office,Internet Cafe,Intersection,Irish Pub,Italian Restaurant,Japanese Restaurant,Jazz Club,Jewelry Store,Jewish Restaurant,Juice Bar,Karaoke Bar,Kebab Restaurant,Kids Store,Kitchen Supply Store,Korean Restaurant,Lake,Latin American Restaurant,Laundromat,Lawyer,Leather Goods Store,Light Rail Station,Lighting Store,Lingerie Store,Liquor Store,Lottery Retailer,Lounge,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Military Base,Mineiro Restaurant,Miscellaneous Shop,Mobile Phone Shop,Modern European Restaurant,Molecular Gastronomy Restaurant,Monument / Landmark,Moroccan Restaurant,Motel,Motorcycle Shop,Mountain,Movie Theater,Moving Target,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Nature Preserve,New American Restaurant,Nightclub,Noodle House,Northeastern Brazilian Restaurant,Office,Opera House,Optical Shop,Organic Grocery,Other Great Outdoors,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Outdoors & Recreation,Outlet Store,Paintball Field,Paper / Office Supplies Store,Park,Pastelaria,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Perfume Shop,Peruvian Restaurant,Pet Service,Pet Store,Pharmacy,Photography Studio,Pie Shop,Pier,Piercing Parlor,Pilates Studio,Pizza Place,Planetarium,Platform,Playground,Plaza,Pool,Pool Hall,Portuguese Restaurant,Print Shop,Pub,Public Art,Racetrack,Ramen Restaurant,Record Shop,Recording Studio,Recreation Center,Rental Car Location,Rental Service,Residential Building (Apartment / Condo),Resort,Rest Area,Restaurant,River,Road,Rock Club,Roof Deck,Rugby Stadium,Russian Restaurant,Sake Bar,Salad Place,Salon / Barbershop,Salsa Club,Samba School,Sandwich Place,Sausage Shop,Scandinavian Restaurant,Scenic Lookout,School,Science Museum,Seafood Restaurant,Shoe Store,Shop & Service,Shopping Mall,Shopping Plaza,Skate Park,Ski Trail,Smoke Shop,Smoothie Shop,Snack Place,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southeastern Brazilian Restaurant,Southern / Soul Food Restaurant,Souvenir Shop,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Sports Club,Stadium,Stationery Store,Steakhouse,Street Art,Street Fair,Student Center,Supermarket,Supplement Shop,Sushi Restaurant,Swim School,Swiss Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tapiocaria,Tattoo Parlor,Taxi Stand,Tea Room,Tennis Court,Tex-Mex Restaurant,Thai Restaurant,Theater,Theme Park,Theme Park Ride / Attraction,Theme Restaurant,Thrift / Vintage Store,Tiki Bar,Toll Booth,Tourist Information Center,Toy / Game Store,Track,Trail,Train,Train Station,Tram Station,Tunnel,Turkish Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Venezuelan Restaurant,Veterinarian,Video Game Store,Video Store,Vietnamese Restaurant,Warehouse Store,Water Park,Waterfall,Waterfront,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Women's Store
0,Abayubá,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Abolição,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.142857,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.071429,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.035714,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Acari,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Agronomía,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Aguada,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [32]:
Neighborhoods_grouped.to_csv('Neighborhoods_grouped.csv',index=False)

In [36]:
import numpy as np

In [37]:
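def return_most_common_venues(row, num_top_venues):
    """Return the names of the most frequent venue categories in one row."""
    # skip the first entry, which holds the neighborhood name
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]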

In [38]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # positions after the 3rd take the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = Neighborhoods_grouped['Neighborhood']

for ind in np.arange(Neighborhoods_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Neighborhoods_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Abayubá,Gym / Fitness Center,Plaza,Food & Drink Shop,Bus Stop,Mobile Phone Shop,Factory,Basketball Court,Bakery,Pizza Place,Stadium
1,Abolição,BBQ Joint,Deli / Bodega,Burger Joint,Bus Station,Portuguese Restaurant,Gymnastics Gym,Food Truck,Frame Store,Steakhouse,Snack Place
2,Acari,Market,Ice Cream Shop,Churrascaria,Soccer Field,Pizza Place,Film Studio,Event Space,Exhibit,Fabric Shop,Factory
3,Agronomía,Bus Stop,BBQ Joint,Farmers Market,Plaza,Garden Center,Burger Joint,Trail,Athletics & Sports,Tunnel,Fish & Chips Shop
4,Aguada,Bakery,Nightclub,Train Station,Food & Drink Shop,Fast Food Restaurant,Convenience Store,Pizza Place,Ice Cream Shop,Rental Car Location,Event Space
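The ranking step can be illustrated on a single row. The sketch below uses a hypothetical one-row frequency table (the name `Toyville` and its values are invented for illustration) and a helper named `top_venues` that mirrors the function above. Note that when a neighborhood has fewer distinct categories than `num_top_venues`, the remaining positions are filled with zero-frequency categories, which appears to be the case for Acari in the table above.

```python
import pandas as pd

def top_venues(row, num_top_venues):
    """Return the category names with the highest frequencies in one row."""
    # Skip the first entry, which holds the neighborhood name.
    frequencies = row.iloc[1:].astype(float)
    return frequencies.sort_values(ascending=False).index.values[:num_top_venues]

# Hypothetical one-row frequency table, not taken from the project data.
toy = pd.DataFrame(
    [["Toyville", 0.1, 0.6, 0.3]],
    columns=["Neighborhood", "Bakery", "Bar", "Café"],
)
top_two = list(top_venues(toy.iloc[0], 2))
print(top_two)
```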


Forming the clusters

In [39]:
# set number of clusters
kclusters = 5

Neighborhoods_grouped_clustering = Neighborhoods_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Neighborhoods_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([4, 4, 3, 4, 4, 2, 4, 2, 2, 2], dtype=int32)
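To make the clustering step more concrete, here is a minimal sketch on invented data. It assumes scikit-learn is installed; the four rows are hypothetical venue-frequency vectors, not values from this project. It shows that rows with similar frequency profiles receive the same label, and that the label numbers themselves are arbitrary identifiers.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical venue-frequency rows: two "restaurant-heavy", two "park-heavy".
X = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
])

# n_init is set explicitly so the sketch behaves the same across versions.
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = model.labels_
print(labels)
```

Which group is called 0 and which is called 1 depends on the initialization, so a cluster number only has meaning relative to the rows it contains.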

In [40]:
# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

neighborhoods_merged = df_total

# join the cluster labels and top venues onto df_total to add latitude/longitude for each neighborhood
neighborhoods_merged = neighborhoods_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

neighborhoods_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Caju,-22.880306,-43.221494,2.0,Bus Station,Gun Range,Women's Store,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Film Studio
1,Gamboa,-22.897749,-43.192904,4.0,Brazilian Restaurant,Factory,Restaurant,Theater,Fast Food Restaurant,Cable Car,Samba School,Bus Station,History Museum,Mountain
2,Santo Cristo,-22.900766,-43.203393,4.0,Brazilian Restaurant,Tram Station,Pharmacy,Factory,Hotel,Bus Station,Restaurant,Photography Studio,Bar,Entertainment Service
3,Saúde,-22.897184,-43.184154,4.0,Brazilian Restaurant,Bar,Music Venue,Nightclub,Restaurant,Tram Station,Dive Bar,Seafood Restaurant,Bistro,Coffee Shop
4,Centro,-22.904393,-43.183065,2.0,Brazilian Restaurant,Middle Eastern Restaurant,Café,Coffee Shop,Chocolate Shop,Music Venue,Bakery,Bar,Gourmet Shop,Art Gallery


The map of clusters below shows the final results. Note that the map is initially centered on Bogota, so you will need to navigate to see the other cities.

In [42]:
address = 'Bogota'
geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Bogota are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Bogota are 4.59808, -74.0760439.


In [45]:
# Replace NaN cluster labels with 0, then cast the labels to int:
neighborhoods_merged['Cluster Labels']=neighborhoods_merged['Cluster Labels'].replace(np.nan,0)
neighborhoods_merged['Cluster Labels']=neighborhoods_merged['Cluster Labels'].astype(int)
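The NaN labels handled above presumably come from neighborhoods that have coordinates but no row in the clustered table, which also explains why the labels were displayed as floats (2.0, 4.0). The sketch below reproduces this on hypothetical data. It also shows one alternative, dropping the unlabeled rows, since replacing NaN with 0 places those neighborhoods in cluster 0 alongside genuinely clustered ones. Whether that matters for this analysis is an assumption to verify against the real data.

```python
import pandas as pd

# Hypothetical frames: "C" has coordinates but no cluster label.
coords = pd.DataFrame({"Neighborhood": ["A", "B", "C"], "Latitude": [1.0, 2.0, 3.0]})
labels = pd.DataFrame({"Neighborhood": ["A", "B"], "Cluster Labels": [2, 4]})

merged = coords.join(labels.set_index("Neighborhood"), on="Neighborhood")

# The missing label for "C" becomes NaN, which upcasts the column to float.
missing = merged["Cluster Labels"].isna()

# Alternative to filling with 0: keep only the labeled rows, then cast to int.
cleaned = merged.loc[~missing].copy()
cleaned["Cluster Labels"] = cleaned["Cluster Labels"].astype(int)
print(cleaned)
```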

In [48]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=10)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(neighborhoods_merged['Latitude'], neighborhoods_merged['Longitude'], neighborhoods_merged['Neighborhood'], neighborhoods_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

The map above shows that all of the Bogota neighborhoods fall into the same cluster, **cluster 2**. We will therefore analyze this cluster to find the neighborhoods in the other cities that K-means grouped with them.

### Stage 5. Analyzing the Cluster 2 Results

Here is a first look at the 10 most common venue categories in the cluster 2 neighborhoods:

In [54]:
neighborhoods_merged.loc[neighborhoods_merged['Cluster Labels'] == 2, neighborhoods_merged.columns[[0] + list(range(4, neighborhoods_merged.shape[1]))]]

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Caju,Bus Station,Gun Range,Women's Store,Fabric Shop,Factory,Falafel Restaurant,Farm,Farmers Market,Fast Food Restaurant,Film Studio
4,Centro,Brazilian Restaurant,Middle Eastern Restaurant,Café,Coffee Shop,Chocolate Shop,Music Venue,Bakery,Bar,Gourmet Shop,Art Gallery
10,Botafogo,Dessert Shop,Hotel,Japanese Restaurant,Coffee Shop,Pizza Place,Bookstore,Brazilian Restaurant,Scenic Lookout,Chocolate Shop,Creperie
11,Catete,Hotel,Bar,Dance Studio,Gym / Fitness Center,Hostel,Coffee Shop,Brazilian Restaurant,Café,Stationery Store,Farmers Market
12,Cosme Velho,Café,Tourist Information Center,Bakery,Pharmacy,Event Service,Fast Food Restaurant,Sushi Restaurant,BBQ Joint,Garden,Coffee Shop
15,Humaitá,Bar,Gift Shop,Pie Shop,Deli / Bodega,Café,Bakery,Restaurant,Farmers Market,Gym / Fitness Center,Supermarket
16,Laranjeiras,Bar,Gym / Fitness Center,Lounge,Fruit & Vegetable Store,Yoga Studio,Bakery,Thai Restaurant,Beer Garden,Gastropub,Sports Club
18,Copacabana,Hotel,Gym / Fitness Center,Pizza Place,Beach Bar,Ice Cream Shop,Bakery,Bar,Gym,Restaurant,Theater
19,Leme,Hotel,Beach,Scenic Lookout,Seafood Restaurant,Beer Garden,Pizza Place,Mountain,Bar,Brazilian Restaurant,Historic Site
20,Gávea,Art Gallery,Park,Cultural Center,Brazilian Restaurant,Exhibit,History Museum,College Quad,Athletics & Sports,Bakery,Breakfast Spot


Finally, we'll generate a list of the neighborhoods by city:

In [87]:
df_results_segmentation=neighborhoods_merged[['Neighborhood','Cluster Labels']]
df_results_segmentation.head(3)

Unnamed: 0,Neighborhood,Cluster Labels
0,Caju,2
1,Gamboa,4
2,Santo Cristo,4


In [88]:
df_results_segmentation.shape

(356, 2)

In [89]:
# Now let's combine the per-city dataframes (each with the City name for its neighborhoods) into a single one.
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
df_total2 = pd.concat([df_rio2, df_baires2, df_lima2, df_montev2, df_bogota2])
df_total2.tail(3) # checking if the dataset is complete and working!

Unnamed: 0,Neighborhood,City
45,Las Torres,Bogota
46,Casa Loma,Bogota
47,Perpetuo Socorro,Bogota


In [103]:
# Now merging this with the results of the k-means.
result = pd.merge(df_total2, df_results_segmentation, on='Neighborhood')
result.rename(columns={'Cluster Labels': 'Cluster'}, inplace=True)
result.head()

Unnamed: 0,Neighborhood,City,Cluster
0,Caju,Rio de Janeiro,2
1,Gamboa,Rio de Janeiro,4
2,Santo Cristo,Rio de Janeiro,4
3,Saúde,Rio de Janeiro,4
4,Centro,Rio de Janeiro,2


Now, filtering data to obtain only cluster 2 results:

In [133]:
cluster=result.Cluster ==2
results_cluster2=result[cluster]

In [130]:
results_cluster2.head()

Unnamed: 0,Neighborhood,City,Cluster
0,Caju,Rio de Janeiro,2
4,Centro,Rio de Janeiro,2
5,Centro,Montevideo,2
11,Botafogo,Rio de Janeiro,2
12,Catete,Rio de Janeiro,2


In [None]:
results_cluster2.set_index('City',inplace=True) # this line changes the index of our df

In [119]:
results_cluster2.sort_values(by=['City'],inplace=True,ascending=False) # Sorting the df 
results_cluster2.shape

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  """Entry point for launching an IPython kernel.


(165, 2)
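The warning above is raised because the filtered dataframe was modified in place after being derived from another dataframe. A minimal sketch of one way to avoid it, on hypothetical data, is to take an explicit copy when filtering and to use non-inplace operations:

```python
import pandas as pd

# Hypothetical stand-in for the merged results table.
result = pd.DataFrame({
    "Neighborhood": ["Caju", "Gamboa", "Barranco"],
    "City": ["Rio de Janeiro", "Rio de Janeiro", "Lima"],
    "Cluster": [2, 4, 2],
})

# .copy() makes the filtered frame independent of `result`.
cluster2 = result[result["Cluster"] == 2].copy()

# Chain non-inplace calls instead of modifying the frame in place.
cluster2 = cluster2.set_index("City").sort_index(ascending=False)
print(cluster2)
```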

This completes the k-means analysis of the neighborhoods evaluated.

## 4. Results

We have now completed the data acquisition, data preparation and neighborhood segmentation processes. The outcome is a dataframe containing all the neighborhoods whose characteristics are similar to those of the Bogota neighborhoods where the firm operates restaurants. The list in the dataframe `results_cluster2` constitutes the first part of the venue selection process for the firm's restaurant expansion.

Before finishing, we'll check the number of matching neighborhoods in each capital:

In [142]:
results_cluster2.groupby('City').count()

City,Neighborhood,Cluster
Bogota,48,48
Buenos Aires,29,29
Lima,33,33
Montevideo,24,24
Rio de Janeiro,31,31


As shown, the dataframe still includes the Bogota neighborhoods, which can easily be removed if needed. Each of the other cities has a substantial number of neighborhoods in the same cluster as the Bogota reference neighborhoods, ranging from 24 (Montevideo) to 33 (Lima) per capital.
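As a quick arithmetic check, the per-city counts in the table above can be totaled with and without Bogota. The numbers are copied from that table; the variable names are only illustrative.

```python
# Cluster 2 neighborhood counts per city, copied from the table above.
counts = {
    "Bogota": 48,
    "Buenos Aires": 29,
    "Lima": 33,
    "Montevideo": 24,
    "Rio de Janeiro": 31,
}

total = sum(counts.values())            # all cluster 2 rows
candidates = total - counts["Bogota"]   # rows outside the reference city
print(total, candidates)
```

The total matches the 165 rows reported for `results_cluster2`, and the remainder after excluding Bogota is the number of candidate neighborhoods in the four target capitals.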

The full list of neighborhoods for each city can also be generated if needed.
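A minimal sketch of producing such per-city lists follows. It uses a small hypothetical frame with the same columns as the results table (`Neighborhood`, `City`, `Cluster`) and excludes Bogota, on the assumption that only the target capitals are of interest.

```python
import pandas as pd

# Hypothetical sample with the same columns as the results table.
results = pd.DataFrame({
    "Neighborhood": ["Caju", "Botafogo", "Barranco", "Breña", "Las Torres"],
    "City": ["Rio de Janeiro", "Rio de Janeiro", "Lima", "Lima", "Bogota"],
    "Cluster": [2, 2, 2, 2, 2],
})

# Drop the reference city, then collect the neighborhood names per city.
targets = results[results["City"] != "Bogota"]
per_city = {
    city: group["Neighborhood"].tolist()
    for city, group in targets.groupby("City")
}

for city, names in per_city.items():
    print(city, names)
```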

## 5. Discussion

Although the deliverable (the main list of desirable neighborhoods for expansion) has been obtained, it is also recommended to examine the specific pre-selected neighborhoods for information including, but not limited to, rent per square meter, rental contracts in the area and other relevant factors that were not initially taken into account, in order to find more suitable venues for the possible expansion.

For this we will probably need to split the dataframe generated above by city. The first step is shown here; it will need to be completed with local information in each case.

### Neighborhoods Lima:

In [140]:
City2 = results_cluster2.City == "Lima"  # build the mask from the dataframe being filtered
results_lima = results_cluster2[City2]
results_lima.head()

  


Unnamed: 0,Neighborhood,City,Cluster
198,Ate Vitarte,Lima,2
199,Barranco,Lima,2
200,Breña,Lima,2
201,Carabayllo,Lima,2
203,Chorrillos,Lima,2


In [141]:
results_lima.shape

(33, 3)

So for Lima we have found 33 candidate neighborhoods where the viability of the restaurants can be assessed.

## 6. Conclusion

The analysis of the neighborhoods that could be included in the 2020 expansion plan of the reference food chain found a total of 117 options for deploying the new restaurants. These correspond exclusively to the neighborhoods in the same cluster as the currently operating restaurants.

A dataframe containing all of these neighborhoods was generated, fulfilling the main objective of this stage. However, much more information is still needed to make a more accurate selection.

Further steps may include collecting the floor area in square meters of the current restaurants, rental costs, and databases of the venues available for rent in each neighborhood.

### This concludes my Project Capstone, thanks!