# Capstone Project - The Battle of Neighborhoods (Week 2)

# Doing Business in Brazil

## Introduction/Business Problem

The business problem is to determine how Brazilians behave and what they like, so that a visitor can build rapport and do business in Brazil more easily.

When doing business in a country, it is important to understand local behavior and what people appreciate most. With that understanding, you can build rapport, which facilitates negotiations.


## Data and how it will be used to solve the problem

I will explore the cities that contribute most to the Brazilian GDP as described in a Wikipedia page that has all the information I need. ('https://pt.wikipedia.org/wiki/Lista_de_munic%C3%ADpios_do_Brasil_por_PIB')

I will use the Foursquare API's **explore** endpoint to retrieve venues in each city and find the most common venue categories. I will then use the *k*-means clustering algorithm to group cities with similar venue profiles. Finally, I will use the Folium library to visualize the cities, compare the venues, and determine what types of places Brazilians like.


## Table of Contents

#### 1. <a href="#item1">Download and Explore Dataset</a>

#### 2. <a href="#item2">Explore Cities in Brazil</a>

#### 3. <a href="#item3">Analyze Each City</a>

#### 4. <a href="#item4">Cluster City</a>

#### 5. <a href="#item5">Examine Clusters</a>

Before I get the data and start exploring it, let me import all the dependencies that I will need.

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


## 1. Download and Explore Dataset

For the data, I use a Wikipedia page that has all the information I need to explore and cluster the cities in Brazil. I will scrape the page, clean the data, and read it into a pandas dataframe so that it is in a structured format.

In [2]:
Table = pd.read_html('https://pt.wikipedia.org/wiki/Lista_de_munic%C3%ADpios_do_Brasil_por_PIB',header=0)[0]
Table.rename(columns={'Município':'City','PIB 2016 (R$ 1.000)':'GDP'},inplace=True)
Table.head(50)

Unnamed: 0,Posição,City,GDP,Estado
0,1,São Paulo,687 035 890,SP
1,2,Rio de Janeiro,329 431 360,RJ
2,3,Brasília,235 497 107,DF
3,4,Belo Horizonte,88 277 463,MG
4,5,Curitiba,83 788 904,PR
5,6,Osasco,74 402 691,SP
6,7,Porto Alegre,73 425 264,RS
7,8,Manaus,70 296 364,AM
8,9,Salvador,61 102 373,BA
9,10,Fortaleza,60 141 145,CE
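
The GDP column is read as text with spaces as thousands separators (for example `687 035 890`). If the values are needed as numbers, for instance to double-check the ranking, they can be converted. The following is a minimal sketch, assuming the separators are ordinary or non-breaking spaces; `parse_gdp` is a hypothetical helper and not part of the original notebook.

```python
def parse_gdp(text):
    """Convert a string such as '687 035 890' to an integer (in R$ 1,000)."""
    # Drop every whitespace character, including non-breaking spaces.
    cleaned = "".join(ch for ch in str(text) if not ch.isspace())
    return int(cleaned)


sample = ["687 035 890", "329 431 360", "235 497 107"]
print([parse_gdp(value) for value in sample])
```

In the notebook this could probably be applied with `Table['GDP'].map(parse_gdp)`, provided every cell in the column follows the same pattern.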


In [3]:
Table.shape

(5570, 4)

In [4]:
gdp_cities_brazil = Table.drop(columns=['Posição', 'GDP'])
gdp_cities_brazil.head()

Unnamed: 0,City,Estado
0,São Paulo,SP
1,Rio de Janeiro,RJ
2,Brasília,DF
3,Belo Horizonte,MG
4,Curitiba,PR


In [5]:
gdp_cities_brazil.shape

(5570, 2)

I will work with the top 50 cities by GDP.

In [6]:
top_50 = gdp_cities_brazil.head(50)
top_50.head()

Unnamed: 0,City,Estado
0,São Paulo,SP
1,Rio de Janeiro,RJ
2,Brasília,DF
3,Belo Horizonte,MG
4,Curitiba,PR


In [7]:
for index, row in top_50.iterrows():
    print (row['City'],row['Estado'])

São Paulo SP
Rio de Janeiro RJ
Brasília DF
Belo Horizonte MG
Curitiba PR
Osasco SP
Porto Alegre RS
Manaus AM
Salvador BA
Fortaleza CE
Campinas SP
Guarulhos SP
Recife PE
Barueri SP
Goiânia GO
São Bernardo do Campo SP
Duque de Caxias RJ
Jundiaí SP
São José dos Campos SP
Uberlândia MG
Paulínia SP
Sorocaba SP
Ribeirão Preto SP
Belém PA
São Luís MA
Contagem MG
Santo André SP
Campo Grande MS
Joinville SC
Betim MG
Niterói RJ
Cuiabá MT
Santos SP
Camaçari BA
Natal RN
Vitória ES
Piracicaba SP
Maceió AL
Caxias do Sul RS
São José dos Pinhais PR
Canoas RS
Itajaí SC
Teresina PI
João Pessoa PB
Florianópolis SC
Londrina PR
Serra ES
Cubatão SP
Macaé RJ
Campos dos Goytacazes RJ


#### Use geopy library to get the latitude and longitude values of Cities.

In order to define an instance of the geocoder, I need to define a user_agent. I will name the agent <em>br_explorer</em>, as shown below.

In [8]:
geolocator = Nominatim(user_agent="br_explorer")  # create the geocoder once and reuse it
for index, row in top_50.iterrows():
    address = row['City'] + ", " + row['Estado']
    location = geolocator.geocode(address)
    latitude = location.latitude
    longitude = location.longitude
    print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of São Paulo, SP are -23.5506507, -46.6333824.
The geographical coordinates of Rio de Janeiro, RJ are -22.9110137, -43.2093727.
The geographical coordinates of Brasília, DF are -15.7934036, -47.8823172.
The geographical coordinates of Belo Horizonte, MG are -19.9227318, -43.9450948.
The geographical coordinates of Curitiba, PR are -25.4295963, -49.2712724.
The geographical coordinates of Osasco, SP are -23.5324859, -46.7916801.
The geographical coordinates of Porto Alegre, RS are -30.0324999, -51.2303767.
The geographical coordinates of Manaus, AM are -3.1316333, -59.9825041.
The geographical coordinates of Salvador, BA are -12.9822499, -38.4812772.
The geographical coordinates of Fortaleza, CE are -3.7304512, -38.5217989.
The geographical coordinates of Campinas, SP are -22.90556, -47.06083.
The geographical coordinates of Guarulhos, SP are -23.4430602, -46.524459.
The geographical coordinates of Recife, PE are -8.0641931, -34.8781517.
The geographical coordinates of Barueri, SP are 
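
The loop above assumes that every address is found. geopy's `geocode` returns `None` when there is no match, and the public Nominatim service asks for at most one request per second. The sketch below is one way to handle both points. It takes the geocoding function as a parameter so it can be tried without network access; `geocode_cities` and `pause_seconds` are names invented for this illustration.

```python
import time


def geocode_cities(cities, geocode, pause_seconds=1.0):
    """Return (rows, missing) for an iterable of (city, state) pairs.

    `geocode` is any callable that takes an address string and returns an
    object with `.latitude` and `.longitude`, or None when nothing is found.
    """
    rows, missing = [], []
    for city, state in cities:
        address = "{}, {}".format(city, state)
        location = geocode(address)
        if location is None:
            # Record the failure instead of raising AttributeError on None.
            missing.append(address)
        else:
            rows.append({"State": state, "City": city,
                         "Latitude": location.latitude,
                         "Longitude": location.longitude})
        time.sleep(pause_seconds)  # pause between requests to the service
    return rows, missing
```

In the notebook, `geolocator.geocode` could be passed as `geocode`, and the returned `rows` could be handed to `pd.DataFrame`.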

In [9]:
# define the dataframe columns
column_names = ['State', 'City', 'Latitude', 'Longitude'] 

# instantiate the dataframe
Brazil = pd.DataFrame(columns=column_names)
Brazil

Unnamed: 0,State,City,Latitude,Longitude


In [10]:
geolocator = Nominatim(user_agent="br_explorer")  # create the geocoder once and reuse it

rows = []
for index, row in top_50.iterrows():
    address = row['City'] + ", " + row['Estado']
    location = geolocator.geocode(address)
    rows.append({'State': row['Estado'],
                 'City': row['City'],
                 'Latitude': location.latitude,
                 'Longitude': location.longitude})

# DataFrame.append was removed in pandas 2.0, so build the dataframe from the collected rows
Brazil = pd.DataFrame(rows, columns=column_names)

In [11]:
Brazil

Unnamed: 0,State,City,Latitude,Longitude
0,SP,São Paulo,-23.550651,-46.633382
1,RJ,Rio de Janeiro,-22.911014,-43.209373
2,DF,Brasília,-15.793404,-47.882317
3,MG,Belo Horizonte,-19.922732,-43.945095
4,PR,Curitiba,-25.429596,-49.271272
5,SP,Osasco,-23.532486,-46.79168
6,RS,Porto Alegre,-30.0325,-51.230377
7,AM,Manaus,-3.131633,-59.982504
8,BA,Salvador,-12.98225,-38.481277
9,CE,Fortaleza,-3.730451,-38.521799


In [12]:
address = 'Brazil'

geolocator = Nominatim(user_agent="br_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Brazil are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Brazil are -10.3333333, -53.2.


#### Create a map of Brazil.

In [13]:
# create map of Brazil using latitude and longitude values
map_brazil = folium.Map(location=[latitude, longitude], zoom_start=4)

# add markers to map
for lat, lng, state, city in zip(Brazil['Latitude'], Brazil['Longitude'], Brazil['State'], Brazil['City']):
    label = '{}, {}'.format(city, state)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_brazil)
map_brazil

Next, I will use the Foursquare API to explore the cities and segment them.

#### Define Foursquare Credentials and Version

In [14]:
CLIENT_ID = 'LPHXG0C4IITZCYW2BJN1T5KJXBL1EF4AGODY54HKZPLOZ5SC' # your Foursquare ID
CLIENT_SECRET = 'IIFSFH05BMHIJBCQS3GJ4WW1YWMJKO05RWOLP2HVHBT4TVKF' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: LPHXG0C4IITZCYW2BJN1T5KJXBL1EF4AGODY54HKZPLOZ5SC
CLIENT_SECRET:IIFSFH05BMHIJBCQS3GJ4WW1YWMJKO05RWOLP2HVHBT4TVKF
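
Hard-coding and printing API credentials makes them visible to anyone who reads the notebook. A common alternative is to read them from environment variables. The sketch below assumes two variable names, `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET`, which are chosen for this example and are not prescribed by Foursquare.

```python
import os


def load_foursquare_credentials(environ=os.environ):
    """Read the client ID and secret from a mapping of environment variables."""
    # The variable names below are assumptions made for this sketch.
    client_id = environ.get("FOURSQUARE_CLIENT_ID")
    client_secret = environ.get("FOURSQUARE_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError(
            "Set FOURSQUARE_CLIENT_ID and FOURSQUARE_CLIENT_SECRET first.")
    return client_id, client_secret
```

With the variables set, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would replace the literal strings in the cell above.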


#### Let's explore the first city in my dataframe.

In [15]:
Brazil.loc[0, 'City']

'São Paulo'

Get the city's latitude and longitude values.

In [16]:
city_latitude = Brazil.loc[0, 'Latitude'] # city latitude value
city_longitude = Brazil.loc[0, 'Longitude'] # city longitude value

city_name = Brazil.loc[0, 'City'] # city name

print('Latitude and longitude values of {} are {}, {}.'.format(city_name, 
                                                               city_latitude, 
                                                               city_longitude))

Latitude and longitude values of São Paulo are -23.5506507, -46.6333824.


#### Now, let's get the top 100 venues within 50 km of the center of São Paulo.

First, let's create the GET request URL.

In [17]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 50000 # define radius
# create URL
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    city_latitude, 
    city_longitude, 
    radius, 
    LIMIT)
url # display URL


'https://api.foursquare.com/v2/venues/explore?&client_id=LPHXG0C4IITZCYW2BJN1T5KJXBL1EF4AGODY54HKZPLOZ5SC&client_secret=IIFSFH05BMHIJBCQS3GJ4WW1YWMJKO05RWOLP2HVHBT4TVKF&v=20180605&ll=-23.5506507,-46.6333824&radius=50000&limit=100'
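
The URL above is assembled with string formatting. As an alternative sketch, the standard library's `urlencode` can build the query string from a dictionary, which escapes the values and avoids the stray `&` right after the `?`. The helper name `build_explore_url` is hypothetical; the parameter names are the ones already used in the notebook's URL.

```python
from urllib.parse import urlencode


def build_explore_url(client_id, client_secret, version, lat, lng, radius, limit):
    """Build the venues/explore URL from its individual parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),
        "radius": radius,
        "limit": limit,
    }
    # safe="," keeps the comma in "lat,lng" readable instead of encoding it as %2C.
    return "https://api.foursquare.com/v2/venues/explore?" + urlencode(params, safe=",")


print(build_explore_url("ID", "SECRET", "20180605", -23.5506507, -46.6333824, 50000, 100))
```

Since `requests.get` also accepts a `params` dictionary, the same dictionary could be passed to it directly instead of building the string by hand.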

Send the GET request and examine the results.

In [18]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5c6314dc9fb6b72ab0aacd31'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'São Paulo',
  'headerFullLocation': 'São Paulo',
  'headerLocationGranularity': 'city',
  'totalResults': 217,
  'suggestedBounds': {'ne': {'lat': -23.10065024999955,
    'lng': -46.14341107531995},
   'sw': {'lat': -24.000651150000447, 'lng': -47.12335372468005}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b17eb00f964a520a1c923e3',
       'name': 'Centro Cultural Banco do Brasil (CCBB)',
       'location': {'address': 'R. Álvares Penteado, 112',
        'crossStreet': 'R. Quitanda',
        'lat': -23.547588190396358,
        'lng': -46.6346831174672,
        'lab

From the Foursquare lab in the previous module, we know that all the information is in the *items* key. Before we proceed, let's borrow the **get_category_type** function from the Foursquare lab.

In [19]:
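# function that extracts the name of a venue's first category
def get_category_type(row):
    # the column is named 'categories' or 'venue.categories' depending on
    # whether the dataframe columns have already been cleaned
    try:
        categories_list = row['categories']
    except KeyError:
        categories_list = row['venue.categories']

    # a venue may have no category at all
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']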
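
The nesting that this function and the following cell rely on can be seen on a small example. The dictionary below is a hand-written, heavily reduced stand-in for the API response, not real data; it only mirrors the `response` / `groups` / `items` / `venue` keys visible in the output above.

```python
# A hand-written, heavily reduced stand-in for the API response (not real data).
mock_results = {
    "response": {
        "groups": [{
            "items": [
                {"venue": {"name": "Venue A", "categories": [{"name": "Theater"}]}},
                {"venue": {"name": "Venue B", "categories": []}},
            ]
        }]
    }
}

# The venues live in the 'items' list of the first group.
items = mock_results["response"]["groups"][0]["items"]
for item in items:
    venue = item["venue"]
    categories = venue["categories"]
    # Take the first category name when present, otherwise None.
    category = categories[0]["name"] if categories else None
    print(venue["name"], category)
```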

Now we are ready to clean the JSON and structure it into a *pandas* dataframe.

In [20]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues = nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

Unnamed: 0,name,categories,lat,lng
0,Centro Cultural Banco do Brasil (CCBB),Cultural Center,-23.547588,-46.634683
1,Teatro Renault,Theater,-23.55412,-46.638695
2,Theatro Municipal de São Paulo,Theater,-23.545387,-46.638765
3,Casa de Francisca,Music Venue,-23.548733,-46.634763
4,Casa Mathilde,Dessert Shop,-23.545409,-46.634746


And how many venues were returned by Foursquare?

In [21]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


#### Let's find out how many unique categories can be curated from all the returned venues

In [22]:
print('There are {} unique categories.'.format(len(nearby_venues['categories'].unique())))

There are 55 unique categories.


## 2. Explore Cities in Brazil

#### Let's create a function to repeat the same process for all the cities in the dataframe

In [23]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['City', 
                  'City Latitude', 
                  'City Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return nearby_venues
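
The list comprehension in `getNearbyVenues` indexes `categories[0]` directly, which would raise an `IndexError` for a venue that has no category. The sketch below shows one possible per-venue helper that falls back to `None` in that case. `venue_row` is a hypothetical name, and the sample item is invented for illustration.

```python
def venue_row(city, lat, lng, item):
    """Flatten one item of the explore response into a tuple of seven fields."""
    venue = item["venue"]
    categories = venue.get("categories") or []
    # Fall back to None rather than raising IndexError on an empty list.
    category = categories[0]["name"] if categories else None
    return (city, lat, lng,
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            category)


# Invented sample item with an empty category list.
sample_item = {"venue": {"name": "Venue B",
                         "location": {"lat": -23.5, "lng": -46.6},
                         "categories": []}}
print(venue_row("City A", -23.55, -46.63, sample_item))
```

Inside the function, the comprehension could then read `[venue_row(name, lat, lng, v) for v in results]`. Rows with a `None` category would need to be dropped or labeled before one-hot encoding.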

In [25]:
City_venues = getNearbyVenues(names=Brazil['City'],
                                   latitudes=Brazil['Latitude'],
                                   longitudes=Brazil['Longitude']
                                  )

São Paulo
Rio de Janeiro
Brasília
Belo Horizonte
Curitiba
Osasco
Porto Alegre
Manaus
Salvador
Fortaleza
Campinas
Guarulhos
Recife
Barueri
Goiânia
São Bernardo do Campo
Duque de Caxias
Jundiaí
São José dos Campos
Uberlândia
Paulínia
Sorocaba
Ribeirão Preto
Belém
São Luís
Contagem
Santo André
Campo Grande
Joinville
Betim
Niterói
Cuiabá
Santos
Camaçari
Natal
Vitória
Piracicaba
Maceió
Caxias do Sul
São José dos Pinhais
Canoas
Itajaí
Teresina
João Pessoa
Florianópolis
Londrina
Serra
Cubatão
Macaé
Campos dos Goytacazes


In [26]:
print(City_venues.shape)
City_venues.head()

(2427, 7)


Unnamed: 0,City,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,São Paulo,-23.550651,-46.633382,Caixa Cultural,-23.549381,-46.632849,Art Gallery
1,São Paulo,-23.550651,-46.633382,Casa de Francisca,-23.548733,-46.634763,Music Venue
2,São Paulo,-23.550651,-46.633382,Kopenhagen,-23.551759,-46.63537,Chocolate Shop
3,São Paulo,-23.550651,-46.633382,Livraria Saraiva,-23.55191,-46.634302,Bookstore
4,São Paulo,-23.550651,-46.633382,João Justino Jóias,-23.549357,-46.634534,Jewelry Store


In [27]:
City_venues.groupby('City').count()

Unnamed: 0_level_0,City Latitude,City Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
City,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Barueri,33,33,33,33,33,33
Belo Horizonte,80,80,80,80,80,80
Belém,58,58,58,58,58,58
Betim,45,45,45,45,45,45
Brasília,69,69,69,69,69,69
Camaçari,42,42,42,42,42,42
Campinas,70,70,70,70,70,70
Campo Grande,54,54,54,54,54,54
Campos dos Goytacazes,39,39,39,39,39,39
Canoas,46,46,46,46,46,46


In [28]:
print('There are {} unique categories.'.format(len(City_venues['Venue Category'].unique())))

There are 254 unique categories.


## 3. Analyze Each City

<a id='item2'></a>

In [29]:
# one hot encoding
Brazil_onehot = pd.get_dummies(City_venues[['Venue Category']], prefix="", prefix_sep="")

# add city column back to dataframe
Brazil_onehot['City'] = City_venues['City'] 

# move city column to the first column
fixed_columns = [Brazil_onehot.columns[-1]] + list(Brazil_onehot.columns[:-1])
Brazil_onehot = Brazil_onehot[fixed_columns]

Brazil_onehot.head()

Unnamed: 0,City,Acai House,Accessories Store,Adult Boutique,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bistro,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Buffet,Building,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Café,Camera Store,Candy Store,Capitol Building,Castle,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,College Gym,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doctor's Office,Dumpling Restaurant,Electronics Store,Empada House,Empanada Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Fishing Store,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Friterie,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Goiano Restaurant,Gourmet Shop,Government Building,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Hardware Store,Health & Beauty Service,Health Food Store,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Korean Restaurant,Library,Lingerie Store,Lottery Retailer,Lounge,Mac & 
Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mineiro Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Motel,Movie Theater,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Newsstand,Nightclub,Noodle House,Office,Optical Shop,Other Event,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Paella Restaurant,Paper / Office Supplies Store,Park,Pastelaria,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Lab,Piadineria,Pie Shop,Pizza Place,Planetarium,Plaza,Pool,Pool Hall,Portuguese Restaurant,Pub,Public Art,RV Park,Record Shop,Recording Studio,Rental Car Location,Restaurant,Road,Rock Club,Sake Bar,Salad Place,Salon / Barbershop,Samba School,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southern Brazilian Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tapiocaria,Tattoo Parlor,Tea Room,Theater,Theme Park,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Train Station,Tree,Turkish Home Cooking Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Watch Shop,Water Park,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo
0,São Paulo,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,São Paulo,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,São Paulo,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,São Paulo,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,São Paulo,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [30]:
Brazil_onehot.shape

(2427, 255)
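
Each one-hot row marks a single venue's category, so averaging those rows per city, as the next step does, gives the share of each category among that city's venues. For instance, Barueri has 33 venues, so a category that appears once there gets 1/33 ≈ 0.030303. The sketch below reproduces the same arithmetic with plain Python counters on invented toy data, as an illustration rather than a replacement for the pandas code.

```python
from collections import Counter, defaultdict

# Toy (city, venue category) pairs, invented for illustration only.
toy_venues = [
    ("City A", "Bar"), ("City A", "Bar"), ("City A", "Bakery"), ("City A", "Hotel"),
    ("City B", "Hotel"), ("City B", "Bakery"),
]

# Count how often each category occurs in each city.
counts = defaultdict(Counter)
for city, category in toy_venues:
    counts[city][category] += 1

# Share of each category among a city's venues; this equals the mean of the
# corresponding one-hot column over that city's rows.
shares = {
    city: {cat: n / sum(counter.values()) for cat, n in counter.items()}
    for city, counter in counts.items()
}
print(shares["City A"])
```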

#### Next, let's group rows by taking the mean of the frequency of occurrence of each category

In [31]:
Brazil_grouped = Brazil_onehot.groupby('City').mean().reset_index()
Brazil_grouped

Unnamed: 0,City,Acai House,Accessories Store,Adult Boutique,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bistro,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Buffet,Building,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Café,Camera Store,Candy Store,Capitol Building,Castle,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,College Gym,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doctor's Office,Dumpling Restaurant,Electronics Store,Empada House,Empanada Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Fishing Store,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Friterie,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Goiano Restaurant,Gourmet Shop,Government Building,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Hardware Store,Health & Beauty Service,Health Food Store,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Korean Restaurant,Library,Lingerie Store,Lottery Retailer,Lounge,Mac & 
Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mineiro Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Motel,Movie Theater,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Newsstand,Nightclub,Noodle House,Office,Optical Shop,Other Event,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Paella Restaurant,Paper / Office Supplies Store,Park,Pastelaria,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Lab,Piadineria,Pie Shop,Pizza Place,Planetarium,Plaza,Pool,Pool Hall,Portuguese Restaurant,Pub,Public Art,RV Park,Record Shop,Recording Studio,Rental Car Location,Restaurant,Road,Rock Club,Sake Bar,Salad Place,Salon / Barbershop,Samba School,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southern Brazilian Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tapiocaria,Tattoo Parlor,Tea Room,Theater,Theme Park,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Train Station,Tree,Turkish Home Cooking Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Watch Shop,Water Park,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo
0,Barueri,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0
1,Belo Horizonte,0.0,0.0,0.0,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0125,0.125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0625,0.0125,0.0125,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0125,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0
2,Belém,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.034483,0.017241,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.034483,0.051724,0.017241,0.0,0.0,0.051724,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0
3,Betim,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.088889,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0
4,Brasília,0.014493,0.014493,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.028986,0.028986,0.0,0.043478,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.057971,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.028986,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028986,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028986,0.0,0.0,0.0,0.0,0.0,0.0,0.057971,0.0,0.0,0.072464,0.0,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.028986,0.0,0.014493,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.057971,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Camaçari,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.047619,0.0,0.071429,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.071429,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0
6,Campinas,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042857,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.042857,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.142857,0.0,0.014286,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.042857,0.014286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.014286,0.014286,0.0,0.042857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.028571,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Campo Grande,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.037037,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.037037,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.018519,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.018519,0.0,0.055556,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.055556,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Campos dos Goytacazes,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.051282,0.102564,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.076923,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0
9,Canoas,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.065217,0.0,0.0,0.0,0.021739,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0


In [32]:
Brazil_grouped.shape

(50, 255)

#### Let's print the top 5 most common venues in each city

In [33]:
num_top_venues = 5

# for each city, transpose its frequency row and list the most frequent venue categories
for city in Brazil_grouped['City']:
    print("----"+city+"----")
    temp = Brazil_grouped[Brazil_grouped['City'] == city].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]  # drop the first row, which holds the city name
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Barueri----
                  venue  freq
0                   Pub  0.06
1  Fast Food Restaurant  0.06
2                   Bar  0.06
3           Snack Place  0.06
4        Sandwich Place  0.06


----Belo Horizonte----
                  venue  freq
0                   Bar  0.12
1            Restaurant  0.06
2  Brazilian Restaurant  0.06
3                  Café  0.05
4  Gym / Fitness Center  0.04


----Belém----
                  venue  freq
0  Brazilian Restaurant  0.07
1                 Hotel  0.07
2              Pharmacy  0.05
3           Pizza Place  0.05
4                 Plaza  0.03


----Betim----
                  venue  freq
0            Restaurant  0.11
1                Bakery  0.09
2                 Plaza  0.07
3  Brazilian Restaurant  0.07
4  Fast Food Restaurant  0.07


----Brasília----
            venue  freq
0  Clothing Store  0.09
1  Ice Cream Shop  0.07
2            Café  0.06
3         Theater  0.06
4           Hotel  0.06


----Camaçari----
           venue  freq
0 

#### Let's put that into a *pandas* dataframe

First, let's write a function to sort the venues in descending order.

In [34]:
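def return_most_common_venues(row, num_top_venues):
    # skip the first entry (the city name) and keep only the venue frequencies
    row_categories = row.iloc[1:]
    # rank the categories from most to least frequent
    row_categories_sorted = row_categories.sort_values(ascending=False)

    return row_categories_sorted.index.values[0:num_top_venues]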
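
To see what the function returns, here is a minimal sketch on a single made-up row. The city name and frequencies in `toy_row` are hypothetical and are not taken from the Foursquare data; the function body is the same as above.

```python
import pandas as pd

def return_most_common_venues(row, num_top_venues):
    # Skip the first entry (the city name), then rank the remaining frequencies
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    return row_categories_sorted.index.values[0:num_top_venues]

# Hypothetical single-city row; the frequencies are made up for illustration
toy_row = pd.Series({'City': 'Example City', 'Bar': 0.10, 'Café': 0.30,
                     'Plaza': 0.05, 'Hotel': 0.20})

top3 = list(return_most_common_venues(toy_row, 3))
print(top3)
```

Note that when several categories share the same frequency, their relative order is not guaranteed by the default sort. That is probably why Barueri's five tied categories appear in a different order in the printed top 5 than in the sorted table.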

Now let's create the new dataframe and display the top 10 venues.

In [35]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['City']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # ranks past the third have no entry in indicators and take 'th'
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
City_venues_sorted = pd.DataFrame(columns=columns)
City_venues_sorted['City'] = Brazil_grouped['City']

for ind in np.arange(Brazil_grouped.shape[0]):
    City_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Brazil_grouped.iloc[ind, :], num_top_venues)

City_venues_sorted.head(50)

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Barueri,Sandwich Place,Bar,Pub,Snack Place,Fast Food Restaurant,Ice Cream Shop,Food Truck,Restaurant,Plaza,Pizza Place
1,Belo Horizonte,Bar,Restaurant,Brazilian Restaurant,Café,Gym / Fitness Center,Bookstore,Juice Bar,BBQ Joint,Coffee Shop,Hotel
2,Belém,Brazilian Restaurant,Hotel,Pharmacy,Pizza Place,Japanese Restaurant,Gym,Burger Joint,Snack Place,Shopping Mall,Restaurant
3,Betim,Restaurant,Bakery,Brazilian Restaurant,Plaza,Fast Food Restaurant,Gym / Fitness Center,Ice Cream Shop,Burger Joint,Bar,Music Venue
4,Brasília,Clothing Store,Ice Cream Shop,Café,Hotel,Theater,Restaurant,Brazilian Restaurant,Men's Store,Fast Food Restaurant,Nightclub
5,Camaçari,Shopping Mall,Plaza,Ice Cream Shop,Pizza Place,Electronics Store,Theater,Gym,Café,Department Store,Taco Place
6,Campinas,Brazilian Restaurant,Coffee Shop,Arts & Crafts Store,Department Store,Bar,Juice Bar,Buffet,Café,Sandwich Place,Bookstore
7,Campo Grande,Sporting Goods Shop,Plaza,Pastelaria,Ice Cream Shop,Middle Eastern Restaurant,Chocolate Shop,Bar,Theater,Buffet,Museum
8,Campos dos Goytacazes,Bar,Pizza Place,Bakery,Gym,Coffee Shop,Brazilian Restaurant,Café,Plaza,Burger Joint,Snack Place
9,Canoas,Restaurant,Brazilian Restaurant,Café,Burger Joint,Sushi Restaurant,Snack Place,Buffet,Optical Shop,Churrascaria,Furniture / Home Store
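
The column labels above come from a three-item suffix list with a fallback to "th". That works for ranks 1 to 20, but it would label rank 21 as "21th". If more venues per city were ever needed, a small helper could build the labels instead. This is only a sketch; `ordinal` is a hypothetical helper name and is not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st'."""
    # Numbers ending in 11, 12 or 13 always take 'th'
    if 10 <= n % 100 <= 20:
        suffix = 'th'
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Same column names as the notebook builds for num_top_venues = 10
columns = ['City'] + ['{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)]
print(columns[1], '|', columns[4], '|', columns[10])
```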


<a id='item4'></a>

## 4. Cluster Cities

Run *k*-means to group the cities into 5 clusters.

In [36]:
# set number of clusters
kclusters = 5

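# keep only the numeric venue frequencies; the positional axis argument drop('City', 1) no longer works in recent pandas
Brazil_grouped_clustering = Brazil_grouped.drop(columns='City')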

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Brazil_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:50] 

array([1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
       1, 1, 1, 4, 1, 0, 1, 2, 3, 0, 1, 1, 1, 1, 1, 1, 1, 1, 3, 0, 1, 1,
       1, 1, 1, 1, 1, 3], dtype=int32)
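
Before interpreting the clusters, it helps to check how many cities landed in each one. The sketch below counts the labels copied from the output above with the standard library, so it does not need the fitted `kmeans` object.

```python
from collections import Counter

# Labels copied from the kmeans.labels_ output above (50 cities)
labels = [1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,
          1, 1, 1, 4, 1, 0, 1, 2, 3, 0, 1, 1, 1, 1, 1, 1, 1, 1, 3, 0, 1, 1,
          1, 1, 1, 1, 1, 3]

cluster_sizes = Counter(labels)
for cluster in sorted(cluster_sizes):
    print('label', cluster, '->', cluster_sizes[cluster], 'cities')
```

With these labels, cluster label 1 holds 40 of the 50 cities, while labels 2 and 4 hold a single city each. Such an uneven split suggests that most cities have similar venue profiles and that a different number of clusters may be worth trying.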

In [37]:
Brazil_grouped_clustering

Unnamed: 0,Acai House,Accessories Store,Adult Boutique,American Restaurant,Antique Shop,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Arts & Entertainment,Asian Restaurant,Athletics & Sports,Australian Restaurant,Automotive Shop,BBQ Joint,Bagel Shop,Bakery,Bar,Basketball Court,Basketball Stadium,Bed & Breakfast,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bistro,Board Shop,Bookstore,Boutique,Brazilian Restaurant,Breakfast Spot,Brewery,Bridal Shop,Buffet,Building,Burger Joint,Bus Line,Bus Station,Bus Stop,Business Service,Café,Camera Store,Candy Store,Capitol Building,Castle,Champagne Bar,Cheese Shop,Chinese Restaurant,Chocolate Shop,Church,Churrascaria,Clothing Store,Cocktail Bar,Coffee Shop,College Bookstore,College Gym,Comfort Food Restaurant,Comic Shop,Concert Hall,Construction & Landscaping,Convenience Store,Cosmetics Shop,Costume Shop,Credit Union,Cultural Center,Cupcake Shop,Dance Studio,Deli / Bodega,Department Store,Dessert Shop,Dim Sum Restaurant,Diner,Dive Bar,Doctor's Office,Dumpling Restaurant,Electronics Store,Empada House,Empanada Restaurant,Fabric Shop,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Fishing Store,Flea Market,Flower Shop,Food,Food & Drink Shop,Food Court,Food Truck,French Restaurant,Fried Chicken Joint,Friterie,Fruit & Vegetable Store,Furniture / Home Store,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Gluten-free Restaurant,Goiano Restaurant,Gourmet Shop,Government Building,Grocery Store,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Halal Restaurant,Hardware Store,Health & Beauty Service,Health Food Store,Herbs & Spices Store,Historic Site,History Museum,Hobby Shop,Hostel,Hot Dog Joint,Hotel,Hotel Bar,IT Services,Ice Cream Shop,Indie Movie Theater,Indie Theater,Italian Restaurant,Japanese Restaurant,Jewelry Store,Juice Bar,Karaoke Bar,Korean Restaurant,Library,Lingerie Store,Lottery Retailer,Lounge,Mac & Cheese Joint,Market,Martial Arts Dojo,Massage Studio,Mediterranean Restaurant,Men's Store,Mexican Restaurant,Middle Eastern Restaurant,Mineiro Restaurant,Miscellaneous Shop,Mobile Phone Shop,Molecular Gastronomy Restaurant,Monument / Landmark,Motel,Movie Theater,Multiplex,Museum,Music School,Music Store,Music Venue,Nail Salon,Newsstand,Nightclub,Noodle House,Office,Optical Shop,Other Event,Other Nightlife,Other Repair Shop,Outdoor Sculpture,Paella Restaurant,Paper / Office Supplies Store,Park,Pastelaria,Pastry Shop,Pedestrian Plaza,Performing Arts Venue,Pet Store,Pharmacy,Photography Lab,Piadineria,Pie Shop,Pizza Place,Planetarium,Plaza,Pool,Pool Hall,Portuguese Restaurant,Pub,Public Art,RV Park,Record Shop,Recording Studio,Rental Car Location,Restaurant,Road,Rock Club,Sake Bar,Salad Place,Salon / Barbershop,Samba School,Sandwich Place,Scenic Lookout,Science Museum,Sculpture Garden,Seafood Restaurant,Shoe Store,Shopping Mall,Shopping Plaza,Skate Park,Smoke Shop,Snack Place,Soba Restaurant,Soccer Field,Soccer Stadium,Social Club,Soup Place,South American Restaurant,Southern Brazilian Restaurant,Spa,Spanish Restaurant,Speakeasy,Sporting Goods Shop,Sports Bar,Stadium,Steakhouse,Street Art,Supermarket,Supplement Shop,Sushi Restaurant,Taco Place,Tailor Shop,Tapas Restaurant,Tapiocaria,Tattoo Parlor,Tea Room,Theater,Theme Park,Thrift / Vintage Store,Tourist Information Center,Toy / Game Store,Train Station,Tree,Turkish Home Cooking Restaurant,Used Bookstore,Vegetarian / Vegan Restaurant,Video Game Store,Video Store,Watch Shop,Water Park,Wine Bar,Wine Shop,Women's Store,Yoga Studio,Zoo
0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.030303,0.0,0.030303,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.060606,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.0,0.0
1,0.0,0.0,0.0,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0125,0.125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0625,0.0125,0.0125,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0125,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0125,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0375,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0,0.025,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0125,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0125,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0625,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0125,0.0,0.0,0.0,0.0
2,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.034483,0.017241,0.017241,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.017241,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.068966,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.0,0.0,0.034483,0.051724,0.017241,0.0,0.0,0.051724,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.034483,0.0,0.0,0.034483,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017241,0.0,0.0,0.017241,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.088889,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.044444,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.022222,0.0,0.066667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.111111,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.022222,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022222,0.0,0.0
4,0.014493,0.014493,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.028986,0.028986,0.0,0.043478,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.057971,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.028986,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028986,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028986,0.0,0.0,0.0,0.0,0.0,0.0,0.057971,0.0,0.0,0.072464,0.0,0.0,0.0,0.0,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.028986,0.0,0.014493,0.014493,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.057971,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014493,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.02381,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.047619,0.0,0.071429,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.02381,0.071429,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0,0.0,0.0,0.0,0.047619,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02381,0.0,0.0
6,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.042857,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.042857,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.142857,0.0,0.014286,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.028571,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.057143,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.042857,0.014286,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.014286,0.014286,0.0,0.042857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.028571,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.014286,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.037037,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.037037,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.018519,0.018519,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.018519,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.018519,0.0,0.055556,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.018519,0.0,0.055556,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.037037,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.018519,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.051282,0.102564,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.076923,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.051282,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025641,0.0,0.0,0.0,0.0
9,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.086957,0.0,0.0,0.0,0.043478,0.0,0.065217,0.0,0.0,0.0,0.021739,0.065217,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.021739,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.086957,0.0,0.0,0.0,0.0,0.021739,0.0,0.021739,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.043478,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.021739,0.0,0.0


Let's create a new dataframe that includes the cluster label as well as the top 10 venues for each city.

In [38]:
Brazil.head()

Unnamed: 0,State,City,Latitude,Longitude
0,SP,São Paulo,-23.550651,-46.633382
1,RJ,Rio de Janeiro,-22.911014,-43.209373
2,DF,Brasília,-15.793404,-47.882317
3,MG,Belo Horizonte,-19.922732,-43.945095
4,PR,Curitiba,-25.429596,-49.271272


In [39]:
City_venues_sorted.head(50)

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Barueri,Sandwich Place,Bar,Pub,Snack Place,Fast Food Restaurant,Ice Cream Shop,Food Truck,Restaurant,Plaza,Pizza Place
1,Belo Horizonte,Bar,Restaurant,Brazilian Restaurant,Café,Gym / Fitness Center,Bookstore,Juice Bar,BBQ Joint,Coffee Shop,Hotel
2,Belém,Brazilian Restaurant,Hotel,Pharmacy,Pizza Place,Japanese Restaurant,Gym,Burger Joint,Snack Place,Shopping Mall,Restaurant
3,Betim,Restaurant,Bakery,Brazilian Restaurant,Plaza,Fast Food Restaurant,Gym / Fitness Center,Ice Cream Shop,Burger Joint,Bar,Music Venue
4,Brasília,Clothing Store,Ice Cream Shop,Café,Hotel,Theater,Restaurant,Brazilian Restaurant,Men's Store,Fast Food Restaurant,Nightclub
5,Camaçari,Shopping Mall,Plaza,Ice Cream Shop,Pizza Place,Electronics Store,Theater,Gym,Café,Department Store,Taco Place
6,Campinas,Brazilian Restaurant,Coffee Shop,Arts & Crafts Store,Department Store,Bar,Juice Bar,Buffet,Café,Sandwich Place,Bookstore
7,Campo Grande,Sporting Goods Shop,Plaza,Pastelaria,Ice Cream Shop,Middle Eastern Restaurant,Chocolate Shop,Bar,Theater,Buffet,Museum
8,Campos dos Goytacazes,Bar,Pizza Place,Bakery,Gym,Coffee Shop,Brazilian Restaurant,Café,Plaza,Burger Joint,Snack Place
9,Canoas,Restaurant,Brazilian Restaurant,Café,Burger Joint,Sushi Restaurant,Snack Place,Buffet,Optical Shop,Churrascaria,Furniture / Home Store


In [40]:
# add clustering labels
City_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

Brazil_merged = Brazil

# join City_venues_sorted onto Brazil to add the cluster label and top venues to each city's latitude/longitude
Brazil_merged = Brazil_merged.join(City_venues_sorted.set_index('City'), on='City')

Brazil_merged.head(50) # check the last columns!

Unnamed: 0,State,City,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,SP,São Paulo,-23.550651,-46.633382,1,Café,Japanese Restaurant,Bookstore,Brazilian Restaurant,Chocolate Shop,Historic Site,Cosmetics Shop,Restaurant,Miscellaneous Shop,Nightclub
1,RJ,Rio de Janeiro,-22.911014,-43.209373,1,Bar,Restaurant,Plaza,Café,Brazilian Restaurant,Train Station,Gym,Diner,Samba School,Snack Place
2,DF,Brasília,-15.793404,-47.882317,1,Clothing Store,Ice Cream Shop,Café,Hotel,Theater,Restaurant,Brazilian Restaurant,Men's Store,Fast Food Restaurant,Nightclub
3,MG,Belo Horizonte,-19.922732,-43.945095,1,Bar,Restaurant,Brazilian Restaurant,Café,Gym / Fitness Center,Bookstore,Juice Bar,BBQ Joint,Coffee Shop,Hotel
4,PR,Curitiba,-25.429596,-49.271272,1,Café,Historic Site,Brazilian Restaurant,Theater,Middle Eastern Restaurant,Plaza,Bar,Gym / Fitness Center,Snack Place,Pizza Place
5,SP,Osasco,-23.532486,-46.79168,3,Bar,Pizza Place,Hot Dog Joint,Farmers Market,General Entertainment,Brazilian Restaurant,Brewery,Burger Joint,Café,Sandwich Place
6,RS,Porto Alegre,-30.0325,-51.230377,1,Brazilian Restaurant,Buffet,Gym / Fitness Center,Coffee Shop,Café,Art Museum,Restaurant,Bar,Theater,Bakery
7,AM,Manaus,-3.131633,-59.982504,0,Hotel,Brazilian Restaurant,Korean Restaurant,Pharmacy,Gift Shop,Music Venue,Food Truck,Food Court,Food & Drink Shop,Food
8,BA,Salvador,-12.98225,-38.481277,1,Ice Cream Shop,Gym,Pizza Place,Automotive Shop,Gym / Fitness Center,Gluten-free Restaurant,Furniture / Home Store,Fast Food Restaurant,Dive Bar,Restaurant
9,CE,Fortaleza,-3.730451,-38.521799,1,Hostel,Gym / Fitness Center,Pizza Place,Restaurant,Café,Buffet,Brazilian Restaurant,Furniture / Home Store,Chinese Restaurant,Chocolate Shop
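
One thing to watch with this join: a city that returned no venues would have no row in `City_venues_sorted`, so its cluster label would be `NaN` and the whole column would turn into floats. `Brazil_grouped` has 50 rows, so this probably does not happen here, assuming `Brazil` lists the same 50 cities. The sketch below shows the effect and one way to guard against it, using small made-up frames (`cities` and `sorted_venues` are hypothetical stand-ins for `Brazil` and `City_venues_sorted`).

```python
import pandas as pd

# Hypothetical miniature versions of Brazil and City_venues_sorted
cities = pd.DataFrame({'City': ['A', 'B', 'C'],
                       'Latitude': [-1.0, -2.0, -3.0]})
sorted_venues = pd.DataFrame({'Cluster Labels': [1, 0],
                              'City': ['A', 'B'],
                              '1st Most Common Venue': ['Bar', 'Hotel']})

merged = cities.join(sorted_venues.set_index('City'), on='City')

# City 'C' has no venue row, so its label is NaN and the column becomes float
missing = merged[merged['Cluster Labels'].isna()]
print(missing['City'].tolist())

# Drop unmatched cities and restore integer labels before using them as indices
clean = merged.dropna(subset=['Cluster Labels']).copy()
clean['Cluster Labels'] = clean['Cluster Labels'].astype(int)
print(clean[['City', 'Cluster Labels']])
```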


Finally, let's visualize the resulting clusters on a map.

In [41]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=4)

# set color scheme for the clusters: one evenly spaced rainbow color per cluster
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(Brazil_merged['Latitude'], Brazil_merged['Longitude'], Brazil_merged['City'], Brazil_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
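
A note on the marker colors: the loop indexes the palette with `rainbow[cluster-1]`, and the k-means labels start at 0. Label 0 therefore wraps around to the last color through Python's negative indexing. The sketch below uses a stand-in list of color names (an assumption, in place of the real hex values) to show that every label still gets its own color.

```python
kclusters = 5

# Stand-in palette; the notebook builds real hex colors with matplotlib
palette = ['color_{}'.format(i) for i in range(kclusters)]

# Color picked for each label by the cluster-1 indexing used in the map cell
picked = [palette[cluster - 1] for cluster in range(kclusters)]
for cluster, color in enumerate(picked):
    print('label', cluster, '->', color)
```

Indexing with `rainbow[cluster]` would give the same one-to-one mapping without relying on the wrap-around.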

<a id='item5'></a>

## 5. Examine Clusters

Now we can examine each cluster and determine the venue categories that distinguish it. Based on the defining categories, I assign a name to each cluster. Note that the cluster headings below are numbered from 1, while the *k*-means labels start at 0, so Cluster 1 corresponds to label 0.

#### Cluster 1 - Food

In [42]:
Brazil_merged.loc[Brazil_merged['Cluster Labels'] == 0, Brazil_merged.columns[[1] + list(range(5, Brazil_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Manaus,Hotel,Brazilian Restaurant,Korean Restaurant,Pharmacy,Gift Shop,Music Venue,Food Truck,Food Court,Food & Drink Shop,Food
11,Guarulhos,Grocery Store,Sandwich Place,Steakhouse,Bakery,Farmers Market,Gym,Gym / Fitness Center,Gym Pool,Fast Food Restaurant,Pet Store
20,Paulínia,Ice Cream Shop,Brazilian Restaurant,Hotel,Coffee Shop,Gym,Pizza Place,Pharmacy,Japanese Restaurant,Gym / Fitness Center,Convenience Store
21,Sorocaba,Plaza,Department Store,Hotel,Restaurant,Brazilian Restaurant,Pharmacy,Arts & Crafts Store,Coffee Shop,Japanese Restaurant,Bookstore
23,Belém,Brazilian Restaurant,Hotel,Pharmacy,Pizza Place,Japanese Restaurant,Gym,Burger Joint,Snack Place,Shopping Mall,Restaurant


#### Cluster 2 - Nightlife and Food

In [43]:
Brazil_merged.loc[Brazil_merged['Cluster Labels'] == 1, Brazil_merged.columns[[1] + list(range(5, Brazil_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,São Paulo,Café,Japanese Restaurant,Bookstore,Brazilian Restaurant,Chocolate Shop,Historic Site,Cosmetics Shop,Restaurant,Miscellaneous Shop,Nightclub
1,Rio de Janeiro,Bar,Restaurant,Plaza,Café,Brazilian Restaurant,Train Station,Gym,Diner,Samba School,Snack Place
2,Brasília,Clothing Store,Ice Cream Shop,Café,Hotel,Theater,Restaurant,Brazilian Restaurant,Men's Store,Fast Food Restaurant,Nightclub
3,Belo Horizonte,Bar,Restaurant,Brazilian Restaurant,Café,Gym / Fitness Center,Bookstore,Juice Bar,BBQ Joint,Coffee Shop,Hotel
4,Curitiba,Café,Historic Site,Brazilian Restaurant,Theater,Middle Eastern Restaurant,Plaza,Bar,Gym / Fitness Center,Snack Place,Pizza Place
6,Porto Alegre,Brazilian Restaurant,Buffet,Gym / Fitness Center,Coffee Shop,Café,Art Museum,Restaurant,Bar,Theater,Bakery
8,Salvador,Ice Cream Shop,Gym,Pizza Place,Automotive Shop,Gym / Fitness Center,Gluten-free Restaurant,Furniture / Home Store,Fast Food Restaurant,Dive Bar,Restaurant
9,Fortaleza,Hostel,Gym / Fitness Center,Pizza Place,Restaurant,Café,Buffet,Brazilian Restaurant,Furniture / Home Store,Chinese Restaurant,Chocolate Shop
10,Campinas,Brazilian Restaurant,Coffee Shop,Arts & Crafts Store,Department Store,Bar,Juice Bar,Buffet,Café,Sandwich Place,Bookstore
12,Recife,Brazilian Restaurant,Restaurant,Historic Site,Bookstore,Sandwich Place,Cosmetics Shop,Museum,Coffee Shop,Vegetarian / Vegan Restaurant,Shopping Mall


#### Cluster 3 - Stores

In [44]:
Brazil_merged.loc[Brazil_merged['Cluster Labels'] == 2, Brazil_merged.columns[[1] + list(range(5, Brazil_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
30,Niterói,Hardware Store,Market,Furniture / Home Store,Smoke Shop,Motel,Zoo,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market


#### Cluster 4 - Nightlife

In [45]:
Brazil_merged.loc[Brazil_merged['Cluster Labels'] == 3, Brazil_merged.columns[[1] + list(range(5, Brazil_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,Osasco,Bar,Pizza Place,Hot Dog Joint,Farmers Market,General Entertainment,Brazilian Restaurant,Brewery,Burger Joint,Café,Sandwich Place
35,Vitória,Bar,Plaza,Spanish Restaurant,Restaurant,Tapas Restaurant,Coffee Shop,Theater,Sandwich Place,Breakfast Spot,Boutique
46,Serra,Bar,Mediterranean Restaurant,Castle,Movie Theater,Paella Restaurant,Fishing Store,Fast Food Restaurant,Fish & Chips Shop,Fish Market,Zoo


#### Cluster 5 - Food

In [49]:
Brazil_merged.loc[Brazil_merged['Cluster Labels'] == 4, Brazil_merged.columns[[1] + list(range(5, Brazil_merged.shape[1]))]]

Unnamed: 0,City,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
48,Macaé,Brazilian Restaurant,Burger Joint,Pizza Place,Grocery Store,Zoo,Fishing Store,Farmers Market,Fast Food Restaurant,Fish & Chips Shop,Fish Market
