# Best place to live in Oporto
### (Venue analysis in different neighbourhoods around Oporto cities)

João Martins | __Project Capstone Assignment 3 - Week 1__

Date: 15/MAR/2021

#### __Introduction:__


This study will support any foreign student or portuguese citizen that want to live in Porto to find the best place for his accommodation in the city/neighbourhood within the Porto district, located in the North of Portugal.

The independent variables that can affect the decision of a person to chose one place on another are the venues categories around the city centre on average 1.000m


#### __Business Problem__

__Question:__ Where I should select my accommodation if I have to move to Porto for working or studying?

#### __Data collection, data cleaning and data preparation__

The data from the all country (cities, districts, population and geolocation) will be collected from the Governement open source website. Grouping the cities by districts we can find the cities and geolocation for the 27 cities within the Porto district.

We need to check if there is any null value, drop columns with information not clear or not necessary. Check the type of the values (dtype), dimension (shape) of the dataframe and count the amount of cities within each district.

From the foursquare API we can extract the top 100 venues within the Porto district in a radius of 1.000meters then we can group the venues by neighbourhoods.

After we put the top venues in the graph we can cluster the locations according the most common venue category within the neighbourhood.

Analysing the clusters segments we can decide which is the best location to live within Porto district acording to the clusters segment in terms of most 50th common venues. 

In [1]:
# Import the libraries required for this project
import pandas as pd              #library for data analsysis
import numpy as np               #library to handle data in a vectorized manner
import requests
import plotly.graph_objects as go
from plotly.subplots import make_subplots

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim   #convert an address into latitude and longitude values

import requests # library to handle requests
from pandas.io.json import json_normalize # tranform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Libraries imported.


In [2]:
pt_cities = 'pt.csv'
pt_df = pd.read_csv(pt_cities)
pt_df.head()

Unnamed: 0,city,lat,lng,country,iso2,admin_name,capital,population,population_proper
0,Lisbon,38.7452,-9.1604,Portugal,PT,Lisboa,primary,506654.0,506654.0
1,Vila Nova de Gaia,41.1333,-8.6167,Portugal,PT,Porto,minor,302295.0,302295.0
2,Porto,41.1495,-8.6108,Portugal,PT,Porto,admin,237591.0,237591.0
3,Braga,41.5333,-8.4167,Portugal,PT,Braga,admin,181494.0,181494.0
4,Matosinhos,41.2077,-8.6674,Portugal,PT,Porto,minor,175478.0,175478.0


In [3]:
pt_df.dtypes

city                  object
lat                  float64
lng                  float64
country               object
iso2                  object
admin_name            object
capital               object
population           float64
population_proper    float64
dtype: object

In [4]:
pt_df.describe()

Unnamed: 0,lat,lng,population,population_proper
count,336.0,336.0,267.0,267.0
mean,39.651551,-9.207921,31731.419476,31731.419476
std,1.801183,3.929988,50926.536431,50926.536431
min,32.6412,-28.6333,1065.0,1065.0
25%,38.7488,-8.751775,6355.5,6355.5
50%,39.91645,-8.36765,13391.0,13391.0
75%,41.0833,-7.8,36285.5,36285.5
max,42.1167,-6.2667,506654.0,506654.0


In [5]:
pt_df.columns

Index(['city', 'lat', 'lng', 'country', 'iso2', 'admin_name', 'capital',
       'population', 'population_proper'],
      dtype='object')

In [6]:
pt_df.drop(columns=['iso2','country','capital', 'population_proper',], axis=1, inplace=True)
pt_df.head()

Unnamed: 0,city,lat,lng,admin_name,population
0,Lisbon,38.7452,-9.1604,Lisboa,506654.0
1,Vila Nova de Gaia,41.1333,-8.6167,Porto,302295.0
2,Porto,41.1495,-8.6108,Porto,237591.0
3,Braga,41.5333,-8.4167,Braga,181494.0
4,Matosinhos,41.2077,-8.6674,Porto,175478.0


In [7]:
pt_df.rename(columns = {'city':'City', 'lat':'Latitude', 'lng':'Longitude','admin_name':'District', 'population':'Population'}, inplace = True)
pt_df.head()

Unnamed: 0,City,Latitude,Longitude,District,Population
0,Lisbon,38.7452,-9.1604,Lisboa,506654.0
1,Vila Nova de Gaia,41.1333,-8.6167,Porto,302295.0
2,Porto,41.1495,-8.6108,Porto,237591.0
3,Braga,41.5333,-8.4167,Braga,181494.0
4,Matosinhos,41.2077,-8.6674,Porto,175478.0


In [8]:
pt_df.shape

(336, 5)

In [9]:
pt_missing_data = pt_df.isnull()
pt_missing_data.head(5)

Unnamed: 0,City,Latitude,Longitude,District,Population
0,False,False,False,False,False
1,False,False,False,False,False
2,False,False,False,False,False
3,False,False,False,False,False
4,False,False,False,False,False


In [10]:
for column in pt_missing_data.columns.values.tolist():
    print(column)
    print (pt_missing_data[column].value_counts())
    print("")

City
False    336
Name: City, dtype: int64

Latitude
False    336
Name: Latitude, dtype: int64

Longitude
False    336
Name: Longitude, dtype: int64

District
False    336
Name: District, dtype: int64

Population
False    267
True      69
Name: Population, dtype: int64



In [11]:
pt_df['District'].value_counts()

Porto               27
Lisboa              26
Viseu               24
Santarém            22
Aveiro              20
Faro                19
Coimbra             19
Leiria              18
Braga               18
Portalegre          15
Beja                15
Setúbal             14
Évora               14
Guarda              14
Vila Real           14
Azores              13
Bragança            12
Castelo Branco      11
Viana do Castelo    11
Madeira             10
Name: District, dtype: int64

In [12]:
pt_df['District'].value_counts().idxmax()

'Porto'

In [13]:
porto_df = pt_df[pt_df['District']=="Porto"]
porto_df.reset_index(drop=True, inplace=True)
porto_df.head(27)

Unnamed: 0,City,Latitude,Longitude,District,Population
0,Vila Nova de Gaia,41.1333,-8.6167,Porto,302295.0
1,Porto,41.1495,-8.6108,Porto,237591.0
2,Matosinhos,41.2077,-8.6674,Porto,175478.0
3,Gondomar,41.15,-8.5333,Porto,168027.0
4,Maia,41.2333,-8.6167,Porto,135306.0
5,Valongo,41.1833,-8.5,Porto,93858.0
6,Paredes,41.2,-8.3333,Porto,86854.0
7,Vila do Conde,41.35,-8.75,Porto,79533.0
8,Penafiel,41.2,-8.2833,Porto,72265.0
9,Póvoa de Varzim,41.3916,-8.7571,Porto,63408.0


In [14]:
print('Teh dataframe from Portugal has {} districts and {} cities.'.format(
        len(pt_df['District'].unique()),
        pt_df.shape[0]
    )
)

Teh dataframe from Portugal has 20 districts and 336 cities.


In [17]:
address = 'Porto'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geograpical coordinate of Porto City are {}, {}.'.format(latitude, longitude))

The geograpical coordinate of Porto City are 41.1494512, -8.6107884.


In [18]:
# create map of Porto using latitude and longitude values
map_porto = folium.Map(location=[latitude, longitude], zoom_start=10)
map_porto

In [19]:
# create map of Totonto using latitude and longitude values
map_porto = folium.Map(location=[latitude, longitude], zoom_start=10)

# add markers to map
for lat, lng, borough, neighborhood in zip(porto_df['Latitude'], porto_df['Longitude'], porto_df['District'], porto_df['City']):
    label = '{}, {}'.format(neighborhood, borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_porto)  
    
map_porto

In [20]:
CLIENT_ID = 'BEONOV3310S3XZTDHSJLC50E4O0KL53NKFWEGT5KKFDIYSK0' # your Foursquare ID
CLIENT_SECRET = 'JMK35K44HWHZK2EEKJKKRFNNULB2AFLDSQIBNMYM5BJHUI3E' # your Foursquare Secret
ACCESS_TOKEN = 'XWEIL4PYQWW2HTOK2UKCRPFXPRIKDBE0P0AMGS50KDM52AMF' # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 30
print('Your credentails:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET:' + CLIENT_SECRET)

Your credentails:
CLIENT_ID: BEONOV3310S3XZTDHSJLC50E4O0KL53NKFWEGT5KKFDIYSK0
CLIENT_SECRET:JMK35K44HWHZK2EEKJKKRFNNULB2AFLDSQIBNMYM5BJHUI3E


In [21]:
LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 1000 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    latitude, 
    longitude, 
    radius, 
    LIMIT)
url # display URL

'https://api.foursquare.com/v2/venues/explore?&client_id=BEONOV3310S3XZTDHSJLC50E4O0KL53NKFWEGT5KKFDIYSK0&client_secret=JMK35K44HWHZK2EEKJKKRFNNULB2AFLDSQIBNMYM5BJHUI3E&v=20180604&ll=41.1494512,-8.6107884&radius=1000&limit=100'

In [22]:
results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '604f2dbdb915ec4d1be22002'},
 'response': {'suggestedFilters': {'header': 'Tap to show:',
   'filters': [{'name': 'Open now', 'key': 'openNow'}]},
  'headerLocation': 'Porto',
  'headerFullLocation': 'Porto',
  'headerLocationGranularity': 'city',
  'totalResults': 230,
  'suggestedBounds': {'ne': {'lat': 41.15845120900001,
    'lng': -8.598858445454079},
   'sw': {'lat': 41.14045119099999, 'lng': -8.622718354545922}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4c22b79d9085d13a28de86cc',
       'name': 'Avenida dos Aliados',
       'location': {'address': 'Av. dos Aliados',
        'lat': 41.148302294633744,
        'lng': -8.61104001015237,
        'labeledLatLngs': [{'label': 'display',
          'lat': 41.148302294633744,
     

In [23]:
# function that extracts the category of the venue
def get_category_type(row):
    try:
        categories_list = row['categories']
    except:
        categories_list = row['venue.categories']
        
    if len(categories_list) == 0:
        return None
    else:
        return categories_list[0]['name']

In [24]:
venues = results['response']['groups'][0]['items']
    
nearby_venues = json_normalize(venues) # flatten JSON

# filter columns
filtered_columns = ['venue.name', 'venue.categories', 'venue.location.lat', 'venue.location.lng']
nearby_venues =nearby_venues.loc[:, filtered_columns]

# filter the category for each row
nearby_venues['venue.categories'] = nearby_venues.apply(get_category_type, axis=1)

# clean columns
nearby_venues.columns = [col.split(".")[-1] for col in nearby_venues.columns]

nearby_venues.head()

  This is separate from the ipykernel package so we can avoid doing imports until


Unnamed: 0,name,categories,lat,lng
0,Avenida dos Aliados,Plaza,41.148302,-8.61104
1,Tábua Rasa,Portuguese Restaurant,41.149303,-8.612494
2,Rivoli Cinema Hostel,Hostel,41.147622,-8.609883
3,Boa-Bao,Asian Restaurant,41.149274,-8.613109
4,Porto Lounge Hostel,Hostel,41.149567,-8.612209


In [25]:
print('{} venues were returned by Foursquare.'.format(nearby_venues.shape[0]))

100 venues were returned by Foursquare.


In [26]:
def getNearbyVenues(names, latitudes, longitudes, radius=1000):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)

In [27]:
porto_venues = getNearbyVenues(names=porto_df['City'],
                                   latitudes=porto_df['Latitude'],
                                   longitudes=porto_df['Longitude']
                                  )

Vila Nova de Gaia
Porto
Matosinhos
Gondomar
Maia
Valongo
Paredes
Vila do Conde
Penafiel
Póvoa de Varzim
Felgueiras
Paços de Ferreira
Amarante
Marco de Canavezes
Rio Tinto
Lousada
Trofa
Ermezinde
Vizela
Baião
Alfena
Arcozelo
Valadares
Aver-o-Mar
Castelões de Cepeda
Olival
Santo Tirso


In [28]:
print(porto_venues.shape)
porto_venues.head()

(684, 7)


Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Vila Nova de Gaia,41.1333,-8.6167,Croft Port,41.134585,-8.614832,Wine Shop
1,Vila Nova de Gaia,41.1333,-8.6167,Caves Taylor's,41.134341,-8.614405,Winery
2,Vila Nova de Gaia,41.1333,-8.6167,The Yeatman,41.133652,-8.612981,Hotel
3,Vila Nova de Gaia,41.1333,-8.6167,Barão Fladgate,41.134561,-8.614298,Portuguese Restaurant
4,Vila Nova de Gaia,41.1333,-8.6167,7groaster,41.136979,-8.613956,Café


In [29]:
porto_venues.tail()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
679,Santo Tirso,41.3428,-8.4775,Café Kanimambo,41.33981,-8.473931,Sports Bar
680,Santo Tirso,41.3428,-8.4775,Galp,41.337416,-8.478757,Gas Station
681,Santo Tirso,41.3428,-8.4775,Loja MEO,41.338511,-8.472309,Tech Startup
682,Santo Tirso,41.3428,-8.4775,Pantir,41.33637,-8.476125,Bakery
683,Santo Tirso,41.3428,-8.4775,Estação Ferroviária de Santo Tirso,41.35085,-8.473071,Train Station


In [30]:
porto_venues.groupby('Neighborhood').count()

Unnamed: 0_level_0,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Neighborhood,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Alfena,5,5,5,5,5,5
Amarante,33,33,33,33,33,33
Arcozelo,13,13,13,13,13,13
Aver-o-Mar,11,11,11,11,11,11
Baião,4,4,4,4,4,4
Castelões de Cepeda,25,25,25,25,25,25
Ermezinde,40,40,40,40,40,40
Felgueiras,4,4,4,4,4,4
Gondomar,10,10,10,10,10,10
Lousada,11,11,11,11,11,11


In [31]:
print('There are {} uniques categories.'.format(len(porto_venues['Venue Category'].unique())))

There are 129 uniques categories.


In [32]:
porto_venues.shape

(684, 7)

In [33]:
# one hot encoding
porto_onehot = pd.get_dummies(porto_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
porto_onehot['Neighborhood'] = porto_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [porto_onehot.columns[-1]] + list(porto_onehot.columns[:-1])
porto_onehot = porto_onehot[fixed_columns]

porto_onehot.head()

Unnamed: 0,Neighborhood,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,...,Theme Park,Toll Plaza,Trail,Train Station,Vegetarian / Vegan Restaurant,Waterfront,Wine Bar,Wine Shop,Winery,Yoga Studio
0,Vila Nova de Gaia,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
1,Vila Nova de Gaia,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,1,0
2,Vila Nova de Gaia,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Vila Nova de Gaia,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Vila Nova de Gaia,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [34]:
porto_onehot.shape

(684, 130)

In [35]:
porto_grouped = porto_onehot.groupby('Neighborhood').mean().reset_index()
porto_grouped

Unnamed: 0,Neighborhood,Art Gallery,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,...,Theme Park,Toll Plaza,Trail,Train Station,Vegetarian / Vegan Restaurant,Waterfront,Wine Bar,Wine Shop,Winery,Yoga Studio
0,Alfena,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.4,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Amarante,0.0,0.0,0.0,0.0,0.0,0.0,0.030303,0.090909,0.121212,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Arcozelo,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.230769,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Aver-o-Mar,0.0,0.0,0.0,0.0,0.0,0.0,0.090909,0.090909,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Baião,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Castelões de Cepeda,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.16,0.04,...,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0
6,Ermezinde,0.0,0.0,0.0,0.025,0.0,0.0,0.05,0.1,0.05,...,0.0,0.0,0.0,0.025,0.0,0.0,0.0,0.0,0.0,0.0
7,Felgueiras,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Gondomar,0.0,0.0,0.0,0.0,0.0,0.0,0.1,0.2,0.1,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Lousada,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.272727,0.181818,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [36]:
porto_grouped.shape

(27, 130)

In [37]:
num_top_venues = 5

for hood in porto_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = porto_grouped[porto_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----Alfena----
           venue  freq
0         Bakery   0.4
1           Park   0.2
2           Café   0.2
3  Big Box Store   0.2
4    Art Gallery   0.0


----Amarante----
                   venue  freq
0             Restaurant  0.15
1                    Bar  0.12
2                   Café  0.09
3                 Bakery  0.09
4  Portuguese Restaurant  0.09


----Arcozelo----
                   venue  freq
0                 Bakery  0.23
1             Restaurant  0.15
2  Portuguese Restaurant  0.15
3          Grocery Store  0.15
4     Seafood Restaurant  0.08


----Aver-o-Mar----
                   venue  freq
0  Portuguese Restaurant  0.36
1             Restaurant  0.18
2            Supermarket  0.09
3              BBQ Joint  0.09
4                 Bakery  0.09


----Baião----
                   venue  freq
0  Portuguese Restaurant  0.25
1         Ice Cream Shop  0.25
2             Restaurant  0.25
3                 Bakery  0.25
4            Pizza Place  0.00


----Castelões de Cepeda---

In [38]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

In [39]:
num_top_venues = 50

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
porto_neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
porto_neighborhoods_venues_sorted['Neighborhood'] = porto_grouped['Neighborhood']

for ind in np.arange(porto_grouped.shape[0]):
    porto_neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(porto_grouped.iloc[ind, :], num_top_venues)

porto_neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
0,Alfena,Bakery,Big Box Store,Park,Café,Yoga Studio,Fast Food Restaurant,Food Court,Food,Flea Market,...,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Brazilian Restaurant
1,Amarante,Restaurant,Bar,Portuguese Restaurant,Bakery,Café,Coffee Shop,Lounge,Pizza Place,Plaza,...,Food Court,Yoga Studio,Convenience Store,Construction & Landscaping,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,Beach
2,Arcozelo,Bakery,Portuguese Restaurant,Restaurant,Grocery Store,Soccer Stadium,Pharmacy,Seafood Restaurant,Gas Station,Yoga Studio,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share
3,Aver-o-Mar,Portuguese Restaurant,Restaurant,Hotel,BBQ Joint,Spa,Bakery,Supermarket,Diner,Dessert Shop,...,Athletics & Sports,Auto Garage,Auto Workshop,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro
4,Baião,Ice Cream Shop,Portuguese Restaurant,Bakery,Restaurant,Deli / Bodega,Dessert Shop,Diner,Dutch Restaurant,Garden,...,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot


In [40]:
# set number of clusters
kclusters = 15

porto_grouped_clustering = porto_grouped.drop('Neighborhood', 1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(porto_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([ 3,  2, 10, 12,  0,  2,  2, 11,  2, 14], dtype=int32)

In [41]:
# add clustering labels
porto_neighborhoods_venues_sorted.insert(0, 'Cluster Labels Porto', kmeans.labels_)

porto_merged = porto_venues

# merge manhattan_grouped with manhattan_data to add latitude/longitude for each neighborhood
porto_merged = porto_merged.join(porto_neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

porto_merged.head() # check the last columns!

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category,Cluster Labels Porto,1st Most Common Venue,2nd Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
0,Vila Nova de Gaia,41.1333,-8.6167,Croft Port,41.134585,-8.614832,Wine Shop,1,Portuguese Restaurant,Wine Bar,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar
1,Vila Nova de Gaia,41.1333,-8.6167,Caves Taylor's,41.134341,-8.614405,Winery,1,Portuguese Restaurant,Wine Bar,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar
2,Vila Nova de Gaia,41.1333,-8.6167,The Yeatman,41.133652,-8.612981,Hotel,1,Portuguese Restaurant,Wine Bar,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar
3,Vila Nova de Gaia,41.1333,-8.6167,Barão Fladgate,41.134561,-8.614298,Portuguese Restaurant,1,Portuguese Restaurant,Wine Bar,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar
4,Vila Nova de Gaia,41.1333,-8.6167,7groaster,41.136979,-8.613956,Café,1,Portuguese Restaurant,Wine Bar,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar


In [44]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(porto_merged['Venue Latitude'], porto_merged['Venue Longitude'], porto_merged['Neighborhood'], porto_merged['Cluster Labels Porto']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [73]:
porto_cluster0=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 0, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster0.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
582,Baião,Ice Cream Shop,Portuguese Restaurant,Bakery,Restaurant,Deli / Bodega,Dessert Shop,Diner,Dutch Restaurant,Garden,...,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot
583,Baião,Ice Cream Shop,Portuguese Restaurant,Bakery,Restaurant,Deli / Bodega,Dessert Shop,Diner,Dutch Restaurant,Garden,...,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot


In [74]:
porto_cluster1=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 1, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster1.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
0,Vila Nova de Gaia,Portuguese Restaurant,Wine Bar,Restaurant,Bar,Café,Winery,Italian Restaurant,Hotel,Plaza,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar
1,Vila Nova de Gaia,Portuguese Restaurant,Wine Bar,Restaurant,Bar,Café,Winery,Italian Restaurant,Hotel,Plaza,...,Sushi Restaurant,Waterfront,Vegetarian / Vegan Restaurant,Construction & Landscaping,Candy Store,Chinese Restaurant,Chocolate Shop,Church,Clothing Store,Cocktail Bar


In [75]:
porto_cluster2=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 2, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster2.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
204,Gondomar,Bakery,Supermarket,BBQ Joint,Sushi Restaurant,Café,Seafood Restaurant,Bar,Scenic Lookout,Coffee Shop,...,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share
205,Gondomar,Bakery,Supermarket,BBQ Joint,Sushi Restaurant,Café,Seafood Restaurant,Bar,Scenic Lookout,Coffee Shop,...,Arts & Crafts Store,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share


In [76]:
porto_cluster3=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 3, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster3.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
586,Alfena,Bakery,Big Box Store,Park,Café,Yoga Studio,Fast Food Restaurant,Food Court,Food,Flea Market,...,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Brazilian Restaurant
587,Alfena,Bakery,Big Box Store,Park,Café,Yoga Studio,Fast Food Restaurant,Food Court,Food,Flea Market,...,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Bike Rental / Bike Share,Bistro,Brazilian Restaurant


In [77]:
porto_cluster4=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 4, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster4.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
200,Matosinhos,Convenience Store,Sporting Goods Shop,Pharmacy,BBQ Joint,Empanada Restaurant,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Workshop,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot
201,Matosinhos,Convenience Store,Sporting Goods Shop,Pharmacy,BBQ Joint,Empanada Restaurant,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Workshop,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot


In [78]:
porto_cluster5=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 5, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster5.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
577,Vizela,Bar,Wine Bar,Pizza Place,Tea Room,Yoga Studio,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro
578,Vizela,Bar,Wine Bar,Pizza Place,Tea Room,Yoga Studio,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro


In [79]:
porto_cluster6=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 6, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster6.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
649,Olival,BBQ Joint,Soccer Field,Bus Station,Auto Garage,Farm,Yoga Studio,Food Court,Food,Flea Market,...,Auto Workshop,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Brazilian Restaurant
650,Olival,BBQ Joint,Soccer Field,Bus Station,Auto Garage,Farm,Yoga Studio,Food Court,Food,Flea Market,...,Auto Workshop,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Brazilian Restaurant


In [80]:
porto_cluster7=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 7, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster7.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
604,Valadares,Breakfast Spot,Music Venue,Soccer Field,Portuguese Restaurant,Restaurant,Train Station,Gym / Fitness Center,Martial Arts School,Fast Food Restaurant,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store
605,Valadares,Breakfast Spot,Music Venue,Soccer Field,Portuguese Restaurant,Restaurant,Train Station,Gym / Fitness Center,Martial Arts School,Fast Food Restaurant,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store


In [81]:
porto_cluster8=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 8, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster8.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
423,Paços de Ferreira,Café,Ice Cream Shop,Portuguese Restaurant,Sandwich Place,Supermarket,Grocery Store,Electronics Store,Flea Market,Fast Food Restaurant,...,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share
424,Paços de Ferreira,Café,Ice Cream Shop,Portuguese Restaurant,Sandwich Place,Supermarket,Grocery Store,Electronics Store,Flea Market,Fast Food Restaurant,...,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share


In [82]:
porto_cluster9=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 9, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster9.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
274,Valongo,Café,Tapas Restaurant,Cocktail Bar,Bar,Deli / Bodega,Park,Snack Place,Portuguese Restaurant,Tennis Court,...,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Beach,Beer Bar,Beer Garden,Big Box Store
275,Valongo,Café,Tapas Restaurant,Cocktail Bar,Bar,Deli / Bodega,Park,Snack Place,Portuguese Restaurant,Tennis Court,...,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bakery,Beach,Beer Bar,Beer Garden,Big Box Store


In [83]:
porto_cluster10=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 10, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster10.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
591,Arcozelo,Bakery,Portuguese Restaurant,Restaurant,Grocery Store,Soccer Stadium,Pharmacy,Seafood Restaurant,Gas Station,Yoga Studio,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share
592,Arcozelo,Bakery,Portuguese Restaurant,Restaurant,Grocery Store,Soccer Stadium,Pharmacy,Seafood Restaurant,Gas Station,Yoga Studio,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share


In [84]:
porto_cluster11=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 11, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster11.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
419,Felgueiras,Bakery,Restaurant,Portuguese Restaurant,Diner,Electronics Store,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot
420,Felgueiras,Bakery,Restaurant,Portuguese Restaurant,Diner,Electronics Store,Food,Flea Market,Fast Food Restaurant,Farm,...,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro,Breakfast Spot


In [85]:
porto_cluster12=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 12, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster12.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
613,Aver-o-Mar,Portuguese Restaurant,Restaurant,Hotel,BBQ Joint,Spa,Bakery,Supermarket,Diner,Dessert Shop,...,Athletics & Sports,Auto Garage,Auto Workshop,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro
614,Aver-o-Mar,Portuguese Restaurant,Restaurant,Hotel,BBQ Joint,Spa,Bakery,Supermarket,Diner,Dessert Shop,...,Athletics & Sports,Auto Garage,Auto Workshop,Bar,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro


In [86]:
porto_cluster13=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 13, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster13.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
524,Trofa,Café,Shopping Mall,Bakery,Gas Station,Soccer Stadium,Park,Portuguese Restaurant,Clothing Store,Pub,...,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store
525,Trofa,Café,Shopping Mall,Bakery,Gas Station,Soccer Stadium,Park,Portuguese Restaurant,Clothing Store,Pub,...,Asian Restaurant,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Bar,Beach,Beer Bar,Beer Garden,Big Box Store


In [87]:
porto_cluster14=porto_merged.loc[porto_merged['Cluster Labels Porto'] == 14, porto_merged.columns[[0] + list(range(8, porto_merged.shape[1]))]]
porto_cluster14.head(2)

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
513,Lousada,Bakery,Bar,Gym / Fitness Center,Restaurant,Snack Place,Food Court,Coffee Shop,Portuguese Restaurant,Diner,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro
514,Lousada,Bakery,Bar,Gym / Fitness Center,Restaurant,Snack Place,Food Court,Coffee Shop,Portuguese Restaurant,Diner,...,Athletics & Sports,Auto Garage,Auto Workshop,BBQ Joint,Beach,Beer Bar,Beer Garden,Big Box Store,Bike Rental / Bike Share,Bistro


In [63]:
#porto_cluster2['1st Most Common Venue'].count_values()
for column in porto_cluster2.columns.values.tolist():
    print(column)
    print (porto_cluster2[column].value_counts())
    print("")  

Neighborhood
Maia                   60
Ermezinde              40
Amarante               33
Santo Tirso            29
Rio Tinto              26
Castelões de Cepeda    25
Póvoa de Varzim        24
Marco de Canavezes     24
Paredes                23
Penafiel               22
Gondomar               10
Name: Neighborhood, dtype: int64

2nd Most Common Venue
Bakery                   92
Café                     86
Portuguese Restaurant    49
Bar                      33
Restaurant               24
Park                     22
Supermarket              10
Name: 2nd Most Common Venue, dtype: int64

3rd Most Common Venue
Portuguese Restaurant    140
Restaurant                53
Gym                       40
Electronics Store         26
Park                      25
Pizza Place               22
BBQ Joint                 10
Name: 3rd Most Common Venue, dtype: int64

4th Most Common Venue
Café                    78
Coffee Shop             60
Bakery                  55
Fast Food Restaurant    40
Men's St

In [69]:
#porto_cluester2(porto_cluester2['Neighborhood']=="Maia")
porto_cluster2.groupby(['Neighborhood']).size()

Neighborhood
Amarante               33
Castelões de Cepeda    25
Ermezinde              40
Gondomar               10
Maia                   60
Marco de Canavezes     24
Paredes                23
Penafiel               22
Póvoa de Varzim        24
Rio Tinto              26
Santo Tirso            29
dtype: int64

In [72]:
Maia_cluster_result = porto_cluster2.loc[porto_cluster2['Neighborhood']== 'Maia']
Maia_cluster_result.head()

#df.loc[df['column_name'] == some_value]

Unnamed: 0,Neighborhood,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,...,41th Most Common Venue,42th Most Common Venue,43th Most Common Venue,44th Most Common Venue,45th Most Common Venue,46th Most Common Venue,47th Most Common Venue,48th Most Common Venue,49th Most Common Venue,50th Most Common Venue
214,Maia,Café,Portuguese Restaurant,Coffee Shop,Gym,Supermarket,Pizza Place,Restaurant,Sushi Restaurant,Pharmacy,...,Auto Garage,Athletics & Sports,Gastropub,General Entertainment,Gift Shop,Asian Restaurant,Gourmet Shop,Grocery Store,Arts & Crafts Store,Diner
215,Maia,Café,Portuguese Restaurant,Coffee Shop,Gym,Supermarket,Pizza Place,Restaurant,Sushi Restaurant,Pharmacy,...,Auto Garage,Athletics & Sports,Gastropub,General Entertainment,Gift Shop,Asian Restaurant,Gourmet Shop,Grocery Store,Arts & Crafts Store,Diner
216,Maia,Café,Portuguese Restaurant,Coffee Shop,Gym,Supermarket,Pizza Place,Restaurant,Sushi Restaurant,Pharmacy,...,Auto Garage,Athletics & Sports,Gastropub,General Entertainment,Gift Shop,Asian Restaurant,Gourmet Shop,Grocery Store,Arts & Crafts Store,Diner
217,Maia,Café,Portuguese Restaurant,Coffee Shop,Gym,Supermarket,Pizza Place,Restaurant,Sushi Restaurant,Pharmacy,...,Auto Garage,Athletics & Sports,Gastropub,General Entertainment,Gift Shop,Asian Restaurant,Gourmet Shop,Grocery Store,Arts & Crafts Store,Diner
218,Maia,Café,Portuguese Restaurant,Coffee Shop,Gym,Supermarket,Pizza Place,Restaurant,Sushi Restaurant,Pharmacy,...,Auto Garage,Athletics & Sports,Gastropub,General Entertainment,Gift Shop,Asian Restaurant,Gourmet Shop,Grocery Store,Arts & Crafts Store,Diner


# Conclusion

Considering the data and the analysis, we can say that the best palce to find an accommodation within the Porto district is the city of __Maia__