# Introduction/Business Problem
---

`Clearly define a problem or an idea of your choice, where you would need to leverage the Foursquare location data to solve or execute. Remember that data science problems always target an audience and are meant to help a group of stakeholders solve a problem, so make sure that you explicitly describe your audience and why they would care about your problem`


If someone is looking to open a restaurant in blumenau, which neighbourhood would you recommend?

This is the defining problem for this capstone final project. The audience would be anyone who wants to or is thinking about starting a restuarent in Blumenau. Blumenau, is a small, yet, rapidly growing city in the south of Brazil. Because the city is growing, Blumenau has become a hot spot or perfect place to begin a restaurant. 

Numerous events occur in the city that promote the ever increaseing influx of foreigners, domestic and international, such as Oktober fest. As such, Blumenau does indeed offer a prefect place to begin a restaurant.  


# Data
`Describe the data that you will be using to solve the problem or execute your idea. Remember that you will need to use the Foursquare location data to solve the problem or execute your idea. You can absolutely use other datasets in combination with the Foursquare location data. So make sure that you provide adequate explanation and discussion, with examples, of the data that you will be using, even if it is only Foursquare location data.`

I will be using a simple table of neighbourhoods also known as bairros in portguese. The data can be acquired from the local government [website](https://www.blumenau.sc.gov.br/secretarias/secretaria-de-desenvolvimento-urbano/pagina/historia-sobre-municipio/divisa-administrativa-bairros). Foursquare can then be used after the geocoordinates of each barrio is found.

The following is the data from the table that will be scrapped using BeautifulSoup:

Sobre o Município - Bairros - Divisão Administrativa
Bairros - Divisão Administrativa
Bairro Água Verde
Bairro Badenfurt
Bairro Boa Vista
Bairro Bom Retiro
Bairro Centro
Bairro Da Glória
Bairro Do Salto
Bairro Escola Agrícola
Bairro Fidélis
Bairro Fortaleza
Bairro Fortaleza Alta
Bairro Garcia
Bairro Itoupava Central
Bairro Itoupava Norte
Bairro Itoupava Seca
Bairro Itoupavazinha
Bairro Jardim Blumenau
Bairro Nova Esperança
Bairro Passo Manso
Bairro Ponta Aguda
Bairro Progresso
Bairro Ribeirão Fresco
Bairro Salto do Norte
Bairro Salto Weissbach
Bairro Testo Salto
Bairro Tribess
Bairro Valparaíso
Bairro Velha
Bairro Velha Central
Bairro Velha Grande
Bairro Victor Konder
Bairro Vila Formosa
Bairro Vila Itoupava
Bairro Vila Nova
Bairro Vorstardt



In [1]:
import os
import time
import json, requests
from bs4 import BeautifulSoup
import pandas as pd
import numpy as np

from dotenv import load_dotenv
load_dotenv()

client_id = os.getenv("client_id")
client_secret = os.getenv("client_secret")
version = '20180604'
limit = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/explore'

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import matplotlib.cm as cm
import matplotlib.colors as colors

from sklearn.cluster import KMeans
from sklearn.preprocessing import normalize, MinMaxScaler
from sklearn.model_selection import KFold, GridSearchCV
from sklearn.metrics import silhouette_score

import folium 

from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="capstone project app")

resp = requests.get('https://www.blumenau.sc.gov.br/secretarias/secretaria-de-desenvolvimento-urbano/pagina/historia-sobre-municipio/divisa-administrativa-bairros').text
soup = BeautifulSoup(resp, 'lxml')
data = soup.find('div',{'id':'ultimas'})

address = 'Blumenau, Brazil'

location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Blumenau are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Blumenau are -26.9195567, -49.0658025.


In [2]:
bairros = []
for row in data.findAll('li')[1:]:
    cells = row.find_all(['span'])
   
    try:
        
        if(cells[0].text== 'Bairro Vorstardt'):
            bairro = 'Bairro Vorstadt'
        else:
            bairro = cells[0].text
        
        
    except IndexError:
        continue

    bairros.append(bairro.rstrip())

print(bairros)

['Bairro Água Verde', 'Bairro Badenfurt', 'Bairro Boa Vista', 'Bairro Bom Retiro', 'Bairro Centro', 'Bairro Da Glória', 'Bairro Do Salto', 'Bairro Escola Agrícola', 'Bairro Fidélis', 'Bairro Fortaleza', 'Bairro Fortaleza Alta', 'Bairro Garcia', 'Bairro Itoupava Central', 'Bairro Itoupava Norte', 'Bairro Itoupava Seca', 'Bairro Itoupavazinha', 'Bairro Jardim Blumenau', 'Bairro Nova Esperança', 'Bairro Passo Manso', 'Bairro Ponta Aguda', 'Bairro Progresso', 'Bairro Ribeirão Fresco', 'Bairro Salto do Norte', 'Bairro Salto Weissbach', 'Bairro Testo Salto', 'Bairro Tribess', 'Bairro Valparaíso', 'Bairro Velha', 'Bairro Velha Central', 'Bairro Velha Grande', 'Bairro Victor Konder', 'Bairro Vila Formosa', 'Bairro Vila Itoupava', 'Bairro Vila Nova', 'Bairro Vorstadt']


In [3]:
df = pd.DataFrame(bairros, columns=['Bairros'])
df['Bairros'] = df['Bairros'].map(lambda x: str(x)[7:])

df.head()

Unnamed: 0,Bairros
0,Água Verde
1,Badenfurt
2,Boa Vista
3,Bom Retiro
4,Centro


In [4]:
# def locate(x):
#     try:
#         location = geolocator.geocode('blumenau {}'.format(x))
#         print(x, location.latitude, location.longitude)
#     except:
#         time.sleep(2)
#         location = geolocator.geocode('blumenau {}'.format(x))
#         print(x, location.latitude, location.longitude)
#     time.sleep(2)
#     return location.latitude, location.longitude

# df["Latitude"], df["Longitude"] = zip(*df["Bairros"].map(locate))

# in fact we could just use the coords file, but in case we add addition columns in future df I will leave this way
latlong = pd.read_csv('coords.csv')
df = pd.merge(df, latlong, on='Bairros')

df.head()

Unnamed: 0,Bairros,Latitude,Longitude
0,Água Verde,-26.910743,-49.107369
1,Badenfurt,-26.88306,-49.135753
2,Boa Vista,-26.901357,-49.066842
3,Bom Retiro,-26.925561,-49.071635
4,Centro,-26.919902,-49.065934


# Methodology 

`the main component of the report where you discuss and describe any exploratory data analysis that you did, any inferential statistical testing that you performed, and what machine learnings were used and why.`

**important**

Because we work with a unlabeled dataset, I will use K-means clustering to find interesting groups/clusters within the dataset. I will also use cross validation and ensemble learning to fine-tune the model.

---

After data processing, the latitude and longitude for each bairro was used with Foursquare to obtain a list of venues specifically restuarants. 43 unique categories were found. The 20 most popular venues were selected and then a K-Fold and GridSearchCV with the following values were used:

```python
rand_state=50

folds=3

k_fold = KFold(n_splits=folds, shuffle=True, random_state=rand_state)

hyperparams = {
    "n_clusters": [2, 3, 4],
    "n_init": [10, 15, 20],
    "max_iter": [100, 200, 300, 400, 500],
    "tol": [.0000001, .000001, .00001, .0001],
}
```

`GridSearchCV()` typicall returns best parameters of `{'max_iter': 200, 'n_clusters': 4, 'n_init': 15, 'tol': 1e-05}` with a score of 0.33 (closer to 1 is best).




In [5]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        params = dict(
          client_id = client_id,
          client_secret = client_secret,
          v=version,
          ll='{},{}'.format(lat,lng),
          radius=radius,
          query='Restaurant',
          limit=limit
        )

        resp = requests.get(url=url, params=params)
        data = json.loads(resp.text)

        results = data["response"]['groups'][0]['items']      
        
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Bairros', 
                  'Bairros Latitude', 
                  'Bairros Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

blumenau_venues = getNearbyVenues(
    names=df['Bairros'],
    latitudes=df['Latitude'],
    longitudes=df['Longitude'])

Água Verde
Badenfurt
Boa Vista
Bom Retiro
Centro
Da Glória
Do Salto
Escola Agrícola
Fidélis
Fortaleza
Fortaleza Alta
Garcia
Itoupava Central
Itoupava Norte
Itoupava Seca
Itoupavazinha
Jardim Blumenau
Nova Esperança
Passo Manso
Ponta Aguda
Progresso
Ribeirão Fresco
Salto do Norte
Salto Weissbach
Testo Salto
Tribess
Valparaíso
Velha
Velha Central
Velha Grande
Victor Konder
Vila Formosa
Vila Itoupava
Vila Nova
Vorstadt


In [6]:
print('There are {} uniques categories.'.format(len(blumenau_venues['Venue Category'].unique())))

There are 42 uniques categories.


In [7]:
# drop certain Venue Categories
ignore_list = ['Bakery', 'Café', 'Snack Place', 'Food']

# one hot encoding
blumenau_onehot = pd.get_dummies(blumenau_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
blumenau_onehot['Bairros'] = blumenau_venues['Bairros'] 

# move neighborhood column to the first column
fixed_columns = [blumenau_onehot.columns[-1]] + list(blumenau_onehot.columns[:-1])
blumenau_onehot = blumenau_onehot[fixed_columns]

blumenau_onehot.drop(ignore_list, axis=1, inplace=True)

blumenau_onehot.head()

Unnamed: 0,Bairros,American Restaurant,BBQ Joint,Bagel Shop,Bistro,Brazilian Restaurant,Breakfast Spot,Burger Joint,Cafeteria,Chinese Restaurant,Churrascaria,Creperie,Deli / Bodega,Diner,Fast Food Restaurant,Fish & Chips Shop,Food Court,Food Stand,Food Truck,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Hawaiian Restaurant,Hot Dog Joint,Italian Restaurant,Japanese Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Pastelaria,Pizza Place,Restaurant,Salad Place,Sandwich Place,Southern Brazilian Restaurant,Steakhouse,Sushi Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant
0,Água Verde,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Água Verde,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Badenfurt,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
3,Badenfurt,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Boa Vista,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0


In [8]:
blumenau_grouped = blumenau_onehot.groupby('Bairros').mean().reset_index()
blumenau_grouped

Unnamed: 0,Bairros,American Restaurant,BBQ Joint,Bagel Shop,Bistro,Brazilian Restaurant,Breakfast Spot,Burger Joint,Cafeteria,Chinese Restaurant,Churrascaria,Creperie,Deli / Bodega,Diner,Fast Food Restaurant,Fish & Chips Shop,Food Court,Food Stand,Food Truck,Fried Chicken Joint,Gastropub,German Restaurant,Gluten-free Restaurant,Hawaiian Restaurant,Hot Dog Joint,Italian Restaurant,Japanese Restaurant,Mexican Restaurant,Middle Eastern Restaurant,Pastelaria,Pizza Place,Restaurant,Salad Place,Sandwich Place,Southern Brazilian Restaurant,Steakhouse,Sushi Restaurant,Turkish Restaurant,Vegetarian / Vegan Restaurant
0,Badenfurt,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Boa Vista,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0
2,Bom Retiro,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Centro,0.0,0.0,0.0,0.0,0.166667,0.016667,0.083333,0.0,0.016667,0.0,0.0,0.0,0.016667,0.05,0.0,0.033333,0.0,0.0,0.0,0.016667,0.0,0.016667,0.016667,0.0,0.066667,0.033333,0.0,0.0,0.016667,0.066667,0.1,0.0,0.016667,0.016667,0.0,0.0,0.0,0.05
4,Da Glória,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Do Salto,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
6,Escola Agrícola,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Fortaleza,0.0,0.0,0.0,0.0,0.083333,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.083333,0.0,0.0,0.0,0.0,0.166667,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Fortaleza Alta,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Garcia,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.166667,0.0,0.0


In [9]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [10]:
num_top_venues = 20

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Bairros']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
bairros_venues_sorted = pd.DataFrame(columns=columns)
bairros_venues_sorted['Bairros'] = blumenau_grouped['Bairros']

for ind in np.arange(blumenau_grouped.shape[0]):
    bairros_venues_sorted.iloc[ind, 1:] = return_most_common_venues(blumenau_grouped.iloc[ind, :], num_top_venues)

bairros_venues_sorted

Unnamed: 0,Bairros,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
0,Badenfurt,Brazilian Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Chinese Restaurant,Food Truck,Burger Joint,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint
1,Boa Vista,Food Stand,Steakhouse,Pastelaria,Chinese Restaurant,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Churrascaria,Vegetarian / Vegan Restaurant,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Cafeteria
2,Bom Retiro,Italian Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint
3,Centro,Brazilian Restaurant,Restaurant,Burger Joint,Italian Restaurant,Pizza Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Japanese Restaurant,Food Court,Hawaiian Restaurant,Chinese Restaurant,Gastropub,Sandwich Place,Gluten-free Restaurant,Diner,Breakfast Spot,Southern Brazilian Restaurant,Pastelaria,BBQ Joint,Bagel Shop
4,Da Glória,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint
5,Do Salto,Food Truck,Bistro,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Vegetarian / Vegan Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bagel Shop,BBQ Joint,Chinese Restaurant,Fried Chicken Joint
6,Escola Agrícola,Creperie,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint
7,Fortaleza,Restaurant,Pizza Place,Hot Dog Joint,Italian Restaurant,Brazilian Restaurant,Burger Joint,Vegetarian / Vegan Restaurant,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Churrascaria,Chinese Restaurant,Food Stand,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint
8,Fortaleza Alta,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint
9,Garcia,Sushi Restaurant,Fast Food Restaurant,Restaurant,Deli / Bodega,Vegetarian / Vegan Restaurant,Chinese Restaurant,Food Court,Fish & Chips Shop,Diner,Creperie,Churrascaria,Cafeteria,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand


In [11]:
rand_state=50
folds=3
k_fold = KFold(n_splits=folds, shuffle=True, random_state=rand_state)
hyperparams = {
    "n_clusters": [2, 3, 4],
    "n_init": [10, 15, 20],
    "max_iter": [100, 200, 300, 400, 500],
    "tol": [.0000001, .000001, .00001, .0001],
}

k_means = KMeans()

ensemble = GridSearchCV(
    estimator=k_means,
    param_grid=hyperparams,
    cv=k_fold,
    n_jobs=-1
)

blumenau_grouped_clustering = blumenau_grouped.drop('Bairros', 1)
ensemble.fit(blumenau_grouped_clustering)

labels = ensemble.predict(blumenau_grouped_clustering)
score = silhouette_score(blumenau_grouped_clustering, labels)

print(score)
print(ensemble.best_params_)

0.3240350214087167
{'max_iter': 200, 'n_clusters': 4, 'n_init': 10, 'tol': 1e-05}


# Results 

Using the following parameters for K-means:
`{'max_iter': 200, 'n_clusters': 4, 'n_init': 10, 'tol': 1e-06}`, which showed the best silhoutte score. The silhoutte score shows how close the points are to the center of their clusters where tighter clusters will give a better score. If the data points are very scattered, the clusters are too loose. 

We obtained 4 clusters and dropped all NaN rows leaving 31 bairros to examine while 4 were removed. 

Based on the Folium map below and from the output for cluster label 1, we can see that this cluster represents the bulk of the bairros (26). Cluster label 2 only represent a single neighbourhood while cluster label 0, and 3 represent two. 

![Folium Map](folium.png)


In [12]:
kmeans = KMeans(n_clusters=ensemble.best_params_['n_clusters'], max_iter=ensemble.best_params_['max_iter'], n_init=ensemble.best_params_['n_init'], tol=ensemble.best_params_['tol'], random_state=rand_state).fit(blumenau_grouped_clustering)
print(len(kmeans.labels_), len(blumenau_grouped_clustering), len(df), len(bairros_venues_sorted))

31 31 35 31


In [13]:
blumenau_merged = df

# merge toronto_grouped with toronto_data to add latitude/longitude for each neighborhood
blumenau_merged = blumenau_merged.join(bairros_venues_sorted.set_index('Bairros'), on='Bairros')

# drop all rows with NaN
blumenau_merged = blumenau_merged.dropna()

# add clustering labels
blumenau_merged['Cluster Labels'] = kmeans.labels_

blumenau_merged.head()

Unnamed: 0,Bairros,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
0,Água Verde,-26.910743,-49.107369,Fast Food Restaurant,Hot Dog Joint,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,3
1,Badenfurt,-26.88306,-49.135753,Brazilian Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Chinese Restaurant,Food Truck,Burger Joint,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,2
2,Boa Vista,-26.901357,-49.066842,Food Stand,Steakhouse,Pastelaria,Chinese Restaurant,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Churrascaria,Vegetarian / Vegan Restaurant,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Cafeteria,0
3,Bom Retiro,-26.925561,-49.071635,Italian Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,1
4,Centro,-26.919902,-49.065934,Brazilian Restaurant,Restaurant,Burger Joint,Italian Restaurant,Pizza Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Japanese Restaurant,Food Court,Hawaiian Restaurant,Chinese Restaurant,Gastropub,Sandwich Place,Gluten-free Restaurant,Diner,Breakfast Spot,Southern Brazilian Restaurant,Pastelaria,BBQ Joint,Bagel Shop,2


In [14]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(ensemble.best_params_['n_clusters'])
ys = [i+x+(i*x)**2 for i in range(ensemble.best_params_['n_clusters'])]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(blumenau_merged['Latitude'], blumenau_merged['Longitude'], blumenau_merged['Bairros'], blumenau_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [15]:
blumenau_merged.loc[blumenau_merged['Cluster Labels'] == 0, blumenau_merged.columns[list(range(blumenau_merged.shape[1]))]]

Unnamed: 0,Bairros,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
2,Boa Vista,-26.901357,-49.066842,Food Stand,Steakhouse,Pastelaria,Chinese Restaurant,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Churrascaria,Vegetarian / Vegan Restaurant,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Cafeteria,0


In [16]:
blumenau_merged.loc[blumenau_merged['Cluster Labels'] == 1, blumenau_merged.columns[list(range(blumenau_merged.shape[1]))]]

Unnamed: 0,Bairros,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
3,Bom Retiro,-26.925561,-49.071635,Italian Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,1
7,Escola Agrícola,-26.895078,-49.099026,Creperie,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint,1
10,Fortaleza Alta,-26.847192,-49.050457,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint,1
12,Itoupava Central,-26.81619,-49.089223,Restaurant,Pizza Place,Vegetarian / Vegan Restaurant,Chinese Restaurant,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Churrascaria,Cafeteria,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,1
13,Itoupava Norte,-26.879553,-49.07824,Sushi Restaurant,Burger Joint,Japanese Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,1
15,Itoupavazinha,-26.848878,-49.113873,Restaurant,Vegetarian / Vegan Restaurant,Chinese Restaurant,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Churrascaria,Cafeteria,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,1
18,Passo Manso,-26.907455,-49.148027,Brazilian Restaurant,Hot Dog Joint,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,1
19,Ponta Aguda,-26.915872,-49.064282,Brazilian Restaurant,Pizza Place,Burger Joint,Restaurant,Churrascaria,Bagel Shop,Diner,Pastelaria,Gluten-free Restaurant,Fish & Chips Shop,Fast Food Restaurant,Deli / Bodega,Creperie,Vegetarian / Vegan Restaurant,Chinese Restaurant,Cafeteria,Food Stand,Breakfast Spot,Bistro,BBQ Joint,1
26,Valparaíso,-26.957671,-49.073054,Food Truck,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Vegetarian / Vegan Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Chinese Restaurant,Fried Chicken Joint,1
28,Velha Central,-26.930023,-49.126869,BBQ Joint,Diner,Pizza Place,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Deli / Bodega,Creperie,Cafeteria,Chinese Restaurant,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,Food Stand,Fried Chicken Joint,1


In [17]:
blumenau_merged.loc[blumenau_merged['Cluster Labels'] == 2, blumenau_merged.columns[list(range(blumenau_merged.shape[1]))]]

Unnamed: 0,Bairros,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
1,Badenfurt,-26.88306,-49.135753,Brazilian Restaurant,Restaurant,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Chinese Restaurant,Food Truck,Burger Joint,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,2
4,Centro,-26.919902,-49.065934,Brazilian Restaurant,Restaurant,Burger Joint,Italian Restaurant,Pizza Place,Vegetarian / Vegan Restaurant,Fast Food Restaurant,Japanese Restaurant,Food Court,Hawaiian Restaurant,Chinese Restaurant,Gastropub,Sandwich Place,Gluten-free Restaurant,Diner,Breakfast Spot,Southern Brazilian Restaurant,Pastelaria,BBQ Joint,Bagel Shop,2
5,Da Glória,-26.964187,-49.059479,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint,2
6,Do Salto,-26.883472,-49.102599,Food Truck,Bistro,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Vegetarian / Vegan Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bagel Shop,BBQ Joint,Chinese Restaurant,Fried Chicken Joint,2
9,Fortaleza,-26.879053,-49.065259,Restaurant,Pizza Place,Hot Dog Joint,Italian Restaurant,Brazilian Restaurant,Burger Joint,Vegetarian / Vegan Restaurant,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Churrascaria,Chinese Restaurant,Food Stand,Breakfast Spot,Bistro,Bagel Shop,BBQ Joint,2
16,Jardim Blumenau,-26.926254,-49.061806,Restaurant,Vegetarian / Vegan Restaurant,Brazilian Restaurant,Chinese Restaurant,Pizza Place,Burger Joint,Bistro,Food Court,Japanese Restaurant,Sushi Restaurant,Food Truck,Bagel Shop,BBQ Joint,Breakfast Spot,Cafeteria,Churrascaria,Creperie,Deli / Bodega,Diner,Fast Food Restaurant,2
20,Progresso,-26.972253,-49.075171,Restaurant,Burger Joint,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Cafeteria,Chinese Restaurant,Food Truck,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,2
22,Salto do Norte,-26.87048,-49.100269,Gastropub,Diner,Restaurant,Food Truck,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Deli / Bodega,Creperie,Chinese Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,2
23,Salto Weissbach,-26.896694,-49.129936,German Restaurant,Diner,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Fast Food Restaurant,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,2
24,Testo Salto,-26.849167,-49.146605,Vegetarian / Vegan Restaurant,Churrascaria,Food Stand,Food Court,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Turkish Restaurant,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Truck,Fried Chicken Joint,2


In [18]:
blumenau_merged.loc[blumenau_merged['Cluster Labels'] == 3, blumenau_merged.columns[list(range(blumenau_merged.shape[1]))]]

Unnamed: 0,Bairros,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
0,Água Verde,-26.910743,-49.107369,Fast Food Restaurant,Hot Dog Joint,Vegetarian / Vegan Restaurant,Churrascaria,Food Court,Fish & Chips Shop,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Food Truck,Cafeteria,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,Fried Chicken Joint,3
11,Garcia,-26.934577,-49.059467,Sushi Restaurant,Fast Food Restaurant,Restaurant,Deli / Bodega,Vegetarian / Vegan Restaurant,Chinese Restaurant,Food Court,Fish & Chips Shop,Diner,Creperie,Churrascaria,Cafeteria,Food Truck,Burger Joint,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,Food Stand,3
14,Itoupava Seca,-26.895138,-49.081718,Restaurant,American Restaurant,Japanese Restaurant,Turkish Restaurant,Burger Joint,Italian Restaurant,Fish & Chips Shop,Fast Food Restaurant,Diner,Deli / Bodega,Creperie,Chinese Restaurant,Churrascaria,Food Stand,Cafeteria,Breakfast Spot,Brazilian Restaurant,Bistro,Bagel Shop,BBQ Joint,3



# Discussion
---

Using a groupby we can see that a bulk of venues in category label 1 are generic restaurants. This category also contains the most restaurants.

Although we did not specify, in the business plan, the type of restaurant, we can see that majority of fast food resturants are located in Água Verde. Brazilian Restaurants are mainly found in Badenfurt, Centro, Passo Manso, and Ponta Aguda. Vila Formosa and Victor Konder are common places for burger joints, Tribess for fried chicken, Salto Weissbach for German food, Bom Retiro for italian food, Da Glória, Fortaleza Alta, Testo Salto for vegetarian food, Itoupava Norte and Garcia for sushi. Meanwhile, for restaurants in general, the following bairros are popular Fortaleza, Itoupava Central, Itoupava Seca, Itoupavazinha, Velha, Jardim Blumenau, Progresso, Velha and Vila Nova.

These results may be interrupted in several ways for a new restaurant idea. The following bairros, Fortaleza, Itoupava Central, Itoupava Seca, Itoupavazinha, Velha, Jardim Blumenau, Progresso, Velha and Vila Nova, may be the best locations for generic restaurants; however, one may need to consider the overall cost for that location along with competition factors. Alternatively, these locations may be best due to general thinking that these areas contain the most generic restaurants bringing forth the most customers. 

**important**

Below, one may see a visualization of the top 20 venues and their abundancy.

![Visual Map](visual.png)

In [19]:
blumenau_merged.groupby(['Cluster Labels', '1st Most Common Venue']).count()

Unnamed: 0_level_0,Unnamed: 1_level_0,Bairros,Latitude,Longitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue
Cluster Labels,1st Most Common Venue,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1
0,Food Stand,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,BBQ Joint,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Brazilian Restaurant,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
1,Burger Joint,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Creperie,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Diner,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Food Truck,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Italian Restaurant,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
1,Restaurant,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2
1,Sushi Restaurant,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1


In [20]:
blumenau_merged.groupby(['1st Most Common Venue', 'Bairros']).count()

Unnamed: 0_level_0,Unnamed: 1_level_0,Latitude,Longitude,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,11th Most Common Venue,12th Most Common Venue,13th Most Common Venue,14th Most Common Venue,15th Most Common Venue,16th Most Common Venue,17th Most Common Venue,18th Most Common Venue,19th Most Common Venue,20th Most Common Venue,Cluster Labels
1st Most Common Venue,Bairros,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1,Unnamed: 12_level_1,Unnamed: 13_level_1,Unnamed: 14_level_1,Unnamed: 15_level_1,Unnamed: 16_level_1,Unnamed: 17_level_1,Unnamed: 18_level_1,Unnamed: 19_level_1,Unnamed: 20_level_1,Unnamed: 21_level_1,Unnamed: 22_level_1,Unnamed: 23_level_1
BBQ Joint,Velha Central,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
BBQ Joint,Vorstadt,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Brazilian Restaurant,Badenfurt,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Brazilian Restaurant,Centro,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Brazilian Restaurant,Passo Manso,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Brazilian Restaurant,Ponta Aguda,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Burger Joint,Victor Konder,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Burger Joint,Vila Formosa,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Creperie,Escola Agrícola,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
Diner,Vila Itoupava,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1


In [21]:
graph=blumenau_merged.groupby(['1st Most Common Venue']).size()
# graph=graph.unstack()
graph.plot(kind='bar', figsize=(24,12))

<matplotlib.axes._subplots.AxesSubplot at 0x118e97048>

# Conclusion
---

In summary, there are four distinct categories described by the K-means method. I used K-fold and GridSearchCV to optimize the best values for the K-means method. The first category contains the bulk of the bairros that contain restaurants while category 0, 2, and 3 contain two or less bairros. Selection of bairro is strongly dictated by the type or theme of the restaurant; however, the most common bairros for general restaurants are  Fortaleza, Itoupava Central, Itoupava Seca, Itoupavazinha, Velha, Jardim Blumenau, Progresso, Velha and Vila Nova. These bairros can represent either the best location to place a new restuarant or the worse. Factors such as rental costs, competition, etc. play an important role in the decision to open a restaurant in these locations, and must be considered.