# Where to buy  in Helsinki

## 1 Introduction
### 1.1 Background

**Helsinki** is the capital of Finland, with a population of 657,674. Together with the cities of Espoo, Vantaa, and Kauniainen and the surrounding commuter towns, it forms the Greater Helsinki metropolitan area (Uusimaa), which has a population of over 1.5 million. This area is the country's most important center for politics, education, finance, culture, and research. The urbanization and development of the Uusimaa area have created great opportunities for tertiary-sector businesses, including catering. Anyone looking for a suitable place to open a restaurant in Helsinki will want to know how restaurants are distributed across the city and which neighborhoods have the most of them. My project analyzes the 60 neighborhoods in the Helsinki area and the restaurant situation in each one. I then divide the neighborhoods into several clusters ...

### 1.2 Data description

The data used in this project include:

- Subdivisions (neighborhoods) of Helsinki, collected from the Wikipedia page [1].
- The center coordinates of each neighborhood, collected from Google Maps [2].
- Housing price per square meter of each neighborhood, collected from the Blok company website [3].
- The most common venues in each neighborhood, collected from the Foursquare API [4].

## 2 Methodology

### 2.1 Data preparation

2.1.1 Prepare the libraries needed for data collection, pre-processing and data modeling

In [1]:
# import libraries
import numpy as np 
import pandas as pd 
import requests # library to handle requests
!pip install bs4
from bs4 import BeautifulSoup

import json # library to handle JSON files
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge geopy --yes 
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

!conda install -c conda-forge folium=0.5.0 --yes 
import folium # map rendering library


2.1.2 Get the neighborhood data from Wikipedia using BeautifulSoup

In [31]:
# Use Webscraping to Extract data
url = 'https://en.wikipedia.org/wiki/Subdivisions_of_Helsinki'
data = requests.get(url)

soup= BeautifulSoup(data.content, "html.parser")
helsinki_neiborhood_raw = soup.find_all("div", {"class": "div-col"})[0].find_all("li")

records = []
for row in helsinki_neiborhood_raw:
    col = row.get_text().split(" ")
    code = col[0]
    neighborhood = col[1]
    codelen = len(code)  # length of the code
    records.append({"Code": code, "Neighborhood": neighborhood, "Codelen": codelen})
# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
df = pd.DataFrame(records, columns=["Code", "Neighborhood", "Codelen"])

df = df[df.Codelen != 3]  # remove sub-neighborhood rows (their codes have three digits)
df.drop(['Codelen', 'Code'], axis=1, inplace=True)  # drop columns 'Codelen' and 'Code'
df.reset_index(drop=True, inplace=True)
df.replace({"Ultuna\n591": "Ultuna"}, inplace=True)  # fix the value in row 58
helsinki_neighborhood = df
helsinki_neighborhood

Unnamed: 0,Neighborhood
0,Kruununhaka
1,Kluuvi
2,Kaartinkaupunki
3,Kamppi
4,Punavuori
5,Eira
6,Ullanlinna
7,Katajanokka
8,Kaivopuisto
9,Sörnäinen
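
The parsing step above splits each list item on single spaces, which is why a nested sub-item leaks into the "Ultuna" row and needs a manual fix. A minimal sketch of a more tolerant parser follows. The helper name `parse_subdivision_item` and the sample strings are illustrative assumptions, not taken from the Wikipedia page; it assumes each item starts with a numeric code followed by the name, with any nested sub-items on later lines.

```python
def parse_subdivision_item(text):
    """Split one list-item string into (code, name).

    Assumption: the first line holds "<code> <name>" and nested
    sub-items, if any, follow on separate lines.
    """
    first_line = text.strip().splitlines()[0]
    # maxsplit=1 keeps multi-word names in one piece
    code, name = first_line.split(maxsplit=1)
    return code, name.strip()


# Hypothetical sample strings shaped like the scraped list items
print(parse_subdivision_item("01 Kruununhaka"))
print(parse_subdivision_item("59 Ultuna\n591 Landbo"))
```

With a helper like this, the explicit replacement of `"Ultuna\n591"` would probably no longer be needed, although that depends on the actual page markup.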


2.1.3 Get housing price data from Blok website

In [4]:
url2 = 'https://blok.ai/en/neighbourhoods/'
data2 = requests.get(url2)

soup2=BeautifulSoup(data2.content,'html.parser')
table = soup2.find_all('table')
housing_price_raw = table[0]

In [5]:
columns2 = ["Postcode","Neighborhood","City", "Avg_price_per_sqaure_meter_2020", "Price_change_percentage_1yr", "Price_change_percentage_5yr"]
records2 = []
rows = housing_price_raw.find('tbody').find_all('tr')
for row in rows:
    col = row.find_all('td')
    # cells 2-7 hold postcode, neighborhood, city, average price, 1-year change and 5-year change
    records2.append(dict(zip(columns2, [cell.string for cell in col[2:8]])))
# build the dataframe in one step (DataFrame.append was removed in pandas 2.0)
df2 = pd.DataFrame(records2, columns=columns2)

In [6]:
# Keep only rows with City value "Helsinki"; resetting the index on the filtered copy avoids a SettingWithCopyWarning
housing_price = df2[df2.City=='Helsinki'].reset_index(drop=True)
housing_price.head()

Unnamed: 0,Postcode,Neighborhood,City,Avg_price_per_sqaure_meter_2020,Price_change_percentage_1yr,Price_change_percentage_5yr
0,140,Kaivopuisto - Ullanlinna,Helsinki,8713,2%,29%
1,150,Eira - Hernesaari,Helsinki,8367,4%,27%
2,120,Punavuori,Helsinki,8160,6%,27%
3,180,Kamppi - Ruoholahti,Helsinki,8023,14%,27%
4,220,Jätkäsaari,Helsinki,7871,,
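
The scraped price and percentage columns are still strings (for example `8713` and `14%`), and an empty table cell yields `None` from BeautifulSoup's `.string`. Before any numeric comparison they would need converting. The sketch below shows one way to do that; `parse_number` is a hypothetical helper and is not part of the original notebook.

```python
def parse_number(text):
    """Convert strings such as '8713' or '14%' to float.

    Returns None for missing or blank values instead of raising.
    """
    if text is None:
        return None
    cleaned = text.strip().rstrip("%").strip()
    return float(cleaned) if cleaned else None


# Values taken from the table above; the blank stands for a missing cell
for raw in ["8713", "14%", "", None]:
    print(repr(raw), "->", parse_number(raw))
```

Applied to the dataframe, this could look like `housing_price["Price_change_percentage_1yr"].map(parse_number)`.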


2.1.4 Get the center coordinates of each neighborhood from LatLong.net. The coordinates were downloaded from LatLong.net as a CSV file.

In [28]:
geocodes = pd.read_csv('helsinki_neighborhood_geocode.csv')
geocodes.head()

Unnamed: 0,Location,Latitude,Longitude
0,Jakomäki - Alppikylä Helsinki Finland,60.26013,25.07803
1,Kontula - Vesala Helsinki Finland,60.23661,25.08363
2,Mellunmäki Helsinki Finland,60.23722,25.11409
3,Siltamäki Helsinki Finland,60.2744,24.98955
4,Puistola Helsinki Finland,60.27128,25.04527


In [29]:
geocodes.rename(columns={'Location':'Neighborhood'},inplace=True) #rename column 'Location' to 'Neighborhood'

In [30]:
geocodes['Neighborhood'] = geocodes.Neighborhood.str.replace(' Helsinki Finland','') #remove ' Helsinki Finland' from the neighborhood values
geocodes.head()

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Jakomäki - Alppikylä,60.26013,25.07803
1,Kontula - Vesala,60.23661,25.08363
2,Mellunmäki,60.23722,25.11409
3,Siltamäki,60.2744,24.98955
4,Puistola,60.27128,25.04527


2.1.5 Merge the geocodes dataframe with the helsinki_neighborhood dataframe, and separately with the housing_price dataframe. The three dataframes cannot be merged into one because helsinki_neighborhood and housing_price divide the city into different neighborhoods.

In [36]:
helsinki_neighborhood_geo = pd.merge(helsinki_neighborhood, geocodes, on =['Neighborhood'])
helsinki_neighborhood_geo

Unnamed: 0,Neighborhood,Latitude,Longitude
0,Kruununhaka,60.17164,24.95658
1,Kluuvi,60.17047,24.94654
2,Kaartinkaupunki,60.16486,24.9494
3,Kamppi,60.16739,24.93108
4,Punavuori,60.16147,24.93726
5,Eira,60.15654,24.93817
6,Ullanlinna,60.15749,24.94948
7,Katajanokka,60.16646,24.96935
8,Kaivopuisto,60.158,24.95977
9,Sörnäinen,60.18664,24.96759


In [37]:
housing_price_new = pd.merge(housing_price,geocodes, on =['Neighborhood'] )
housing_price_new.head()

Unnamed: 0,Postcode,Neighborhood,City,Avg_price_per_sqaure_meter_2020,Price_change_percentage_1yr,Price_change_percentage_5yr,Latitude,Longitude
0,140,Kaivopuisto - Ullanlinna,Helsinki,8713,2%,29%,60.158,24.95977
1,150,Eira - Hernesaari,Helsinki,8367,4%,27%,60.15654,24.93817
2,120,Punavuori,Helsinki,8160,6%,27%,60.16147,24.93726
3,180,Kamppi - Ruoholahti,Helsinki,8023,14%,27%,60.16739,24.93108
4,220,Jätkäsaari,Helsinki,7871,,,60.15826,24.9128
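
`pd.merge` performs an inner join by default, so any neighborhood whose name has no exact match in the other dataframe is dropped silently. A minimal sketch of how unmatched names could be surfaced is shown below, using a left join with `indicator=True`. The two small dataframes are toy data built for illustration (the missing "Eira" row is deliberate), not the notebook's real tables.

```python
import pandas as pd

# Toy stand-ins for helsinki_neighborhood and geocodes
neighborhoods = pd.DataFrame({"Neighborhood": ["Kamppi", "Punavuori", "Eira"]})
geocodes_toy = pd.DataFrame({
    "Neighborhood": ["Kamppi", "Punavuori"],
    "Latitude": [60.16739, 60.16147],
    "Longitude": [24.93108, 24.93726],
})

# A left join keeps every neighborhood; the _merge column records where each row came from
checked = neighborhoods.merge(geocodes_toy, on="Neighborhood", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "Neighborhood"].tolist()
print(unmatched)
```

Running the same check on the real dataframes would show which neighborhoods need their names aligned before the inner joins above.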


### 2.2 Explore and cluster the neighborhoods in Helsinki

2.2.1 Use geopy library to get the latitude and longitude values of Helsinki

In [38]:
address = 'Helsinki, FI'

geolocator = Nominatim(user_agent="ny_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Helsinki are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Helsinki are 60.1674881, 24.9427473.


2.2.2 Visualize the neighborhoods in Helsinki

In [43]:
# create map of Helsinki using latitude and longitude values
map_helsinki = folium.Map(location=[latitude, longitude], zoom_start=11)

# add markers to map
for lat, lng, label in zip(helsinki_neighborhood_geo['Latitude'], helsinki_neighborhood_geo['Longitude'], helsinki_neighborhood_geo['Neighborhood']):
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_helsinki)  
    
map_helsinki

2.2.3 Utilize the Foursquare API to explore the neighborhoods and segment them.

In [44]:
# Define Foursquare API credentials.
# Read them from environment variables instead of hard-coding secrets (the variable names are illustrative).
import os
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']
VERSION = '20180605'
LIMIT = 100

In [47]:
#define a function to get the venue data of Helsinki neighborhoods
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
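
The function above indexes straight into the JSON response, so a failed request or a venue without categories would raise an exception partway through the loop. The sketch below shows a more defensive way to read the same fields. `extract_venues` is a hypothetical helper, and the sample payload is a hand-built dictionary that mirrors the keys used above, filled with one venue from the results shown later.

```python
def extract_venues(payload):
    """Return (name, lat, lng, category) tuples from an explore-style response dict.

    Missing 'response' or 'groups' keys yield an empty list; a venue
    without categories gets None as its category.
    """
    groups = payload.get("response", {}).get("groups", [])
    if not groups:
        return []
    rows = []
    for item in groups[0].get("items", []):
        venue = item["venue"]
        categories = venue.get("categories") or [{}]
        rows.append((
            venue["name"],
            venue["location"]["lat"],
            venue["location"]["lng"],
            categories[0].get("name"),
        ))
    return rows


# Hand-built sample with the same nesting as the keys accessed above
sample = {"response": {"groups": [{"items": [{"venue": {
    "name": "Kuurna",
    "location": {"lat": 60.170128, "lng": 24.958564},
    "categories": [{"name": "Scandinavian Restaurant"}],
}}]}]}}
print(extract_venues(sample))
print(extract_venues({"meta": {"code": 429}}))
```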

In [48]:
#Run the above function and create a new dataframe called helsinki_venues
helsinki_venues = getNearbyVenues(names=helsinki_neighborhood_geo['Neighborhood'],
                                   latitudes=helsinki_neighborhood_geo['Latitude'],
                                   longitudes=helsinki_neighborhood_geo['Longitude']
                                  )

In [49]:
helsinki_venues.shape

(2758, 7)

In [50]:
helsinki_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Kruununhaka,60.17164,24.95658,Kuurna,60.170128,24.958564,Scandinavian Restaurant
1,Kruununhaka,60.17164,24.95658,Papu Cafe,60.17304,24.956453,Café
2,Kruununhaka,60.17164,24.95658,Korea House,60.17291,24.956436,Korean Restaurant
3,Kruununhaka,60.17164,24.95658,Cafe LOV,60.171284,24.956623,Café
4,Kruununhaka,60.17164,24.95658,Bei Fang,60.171602,24.95399,Chinese Restaurant


Check how many venues were returned for each neighborhood

In [51]:
helsinki_venues.groupby('Neighborhood').count() 

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Alppiharju,100,100,100,100,100,100
Aluemeri,100,100,100,100,100,100
Eira,35,35,35,35,35,35
Etu-Töölö,33,33,33,33,33,33
Haaga,3,3,3,3,3,3
Hermanni,28,28,28,28,28,28
Herttoniemi,42,42,42,42,42,42
Kaarela,18,18,18,18,18,18
Kaartinkaupunki,72,72,72,72,72,72
Kaivopuisto,25,25,25,25,25,25


Find out how many unique categories can be curated from all the returned venues

In [53]:
print('There are {} unique categories.'.format(len(helsinki_venues['Venue Category'].unique())))

There are 241 unique categories.


Analyze Each Neighborhood

In [54]:
# one hot encoding
helsinki_onehot = pd.get_dummies(helsinki_venues[['Venue Category']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
helsinki_onehot['Neighborhood'] = helsinki_venues['Neighborhood'] 

# move neighborhood column to the first column
fixed_columns = [helsinki_onehot.columns[-1]] + list(helsinki_onehot.columns[:-1])
helsinki_onehot = helsinki_onehot[fixed_columns]

helsinki_onehot.head()

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Antique Shop,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auditorium,BBQ Joint,...,Venezuelan Restaurant,Video Game Store,Vietnamese Restaurant,Waterfront,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo,Zoo Exhibit
0,Kruununhaka,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,Kruununhaka,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,Kruununhaka,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,Kruununhaka,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,Kruununhaka,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


Group rows by neighborhood, taking the mean of the frequency of occurrence of each category

In [55]:
helsinki_grouped = helsinki_onehot.groupby('Neighborhood').mean().reset_index()
helsinki_grouped

Unnamed: 0,Neighborhood,Accessories Store,American Restaurant,Antique Shop,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,Auditorium,BBQ Joint,...,Venezuelan Restaurant,Video Game Store,Vietnamese Restaurant,Waterfront,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo,Zoo Exhibit
0,Alppiharju,0.0,0.0,0.0,0.02,0.03,0.01,0.02,0.0,0.0,...,0.0,0.01,0.01,0.0,0.02,0.02,0.0,0.01,0.0,0.0
1,Aluemeri,0.0,0.0,0.0,0.02,0.03,0.01,0.02,0.0,0.0,...,0.0,0.01,0.01,0.0,0.02,0.02,0.0,0.01,0.0,0.0
2,Eira,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.028571,0.0,0.0,0.0,0.028571,0.0,0.0
3,Etu-Töölö,0.0,0.0,0.0,0.030303,0.0,0.0,0.030303,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Haaga,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
5,Hermanni,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.0,...,0.0,0.0,0.035714,0.0,0.0,0.0,0.0,0.035714,0.0,0.0
6,Herttoniemi,0.02381,0.0,0.0,0.0,0.0,0.02381,0.02381,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Kaarela,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.055556,0.0,0.0,0.0,0.0
8,Kaartinkaupunki,0.0,0.013889,0.0,0.0,0.0,0.0,0.013889,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.013889,0.0,0.0,0.0,0.0
9,Kaivopuisto,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
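
Taking the mean of one-hot columns within a neighborhood gives each category's share of that neighborhood's venues. For instance, Eira returned 35 venues, so a category that appears once has a value of 1/35 ≈ 0.028571, which matches the table above. The sketch below reproduces that arithmetic with plain Python; the venue mix in `toy` is invented for illustration and only the total of 35 comes from the data.

```python
from collections import Counter


def category_frequencies(categories):
    """Return each category's share of the venue list (equivalent to the mean of its one-hot column)."""
    counts = Counter(categories)
    total = len(categories)
    return {category: count / total for category, count in counts.items()}


# Hypothetical neighborhood with 35 venues, one of which is a 'Waterfront'
toy = ["Café"] * 4 + ["Waterfront"] + ["Other"] * 30
freqs = category_frequencies(toy)
print(round(freqs["Waterfront"], 6))  # 0.028571
```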


Write a function to sort the venue categories of each neighborhood by frequency

In [56]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

Create a new dataframe and display the top 10 venues of each neighborhood

In [59]:
num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:  # ranks beyond 3rd use the 'th' suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = helsinki_grouped['Neighborhood']

for ind in np.arange(helsinki_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(helsinki_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Alppiharju,Burger Joint,Coffee Shop,Scandinavian Restaurant,Art Museum,Clothing Store,Café,Gym / Fitness Center,Indie Movie Theater,Food Court,Chinese Restaurant
1,Aluemeri,Burger Joint,Coffee Shop,Scandinavian Restaurant,Art Museum,Clothing Store,Café,Gym / Fitness Center,Indie Movie Theater,Food Court,Chinese Restaurant
2,Eira,Café,Italian Restaurant,Pizza Place,Bakery,Ice Cream Shop,French Restaurant,Park,Waterfront,Beach,Cocktail Bar
3,Etu-Töölö,Scandinavian Restaurant,Pub,Plaza,Park,Bakery,Coffee Shop,Bookstore,Sushi Restaurant,Restaurant,Road
4,Haaga,Sushi Restaurant,Café,Grocery Store,Zoo Exhibit,Event Space,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Film Studio
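
Haaga returned only 3 venues, yet ten "most common" categories are listed for it. The trailing entries are presumably categories with zero frequency that the sort still has to place somewhere. A minimal sketch of a ranking that keeps only categories actually present is given below. `top_categories` is a hypothetical helper, and the frequencies in `haaga_like` assume one venue per category.

```python
def top_categories(frequencies, n):
    """Return up to n category names with positive frequency, most frequent first.

    Ties are broken alphabetically so the result is deterministic.
    """
    positive = [(name, freq) for name, freq in frequencies.items() if freq > 0]
    positive.sort(key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in positive[:n]]


# Assumed frequencies for a neighborhood with three venues in three categories
haaga_like = {
    "Sushi Restaurant": 1 / 3,
    "Café": 1 / 3,
    "Grocery Store": 1 / 3,
    "Zoo Exhibit": 0.0,
    "Event Space": 0.0,
}
print(top_categories(haaga_like, 10))
```

The result is shorter than ten entries for sparse neighborhoods, so a fixed-width table would need the remaining cells filled with a placeholder.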


Cluster Neighborhoods

Run K-means to group the neighborhoods into 3 clusters

In [67]:
# set number of clusters
kclusters = 3

helsinki_grouped_clustering = helsinki_grouped.drop('Neighborhood', axis=1)

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(helsinki_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1], dtype=int32)
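
The number of clusters is set by hand above. One common way to compare candidate values is the silhouette score, which is higher when points sit close to their own cluster and far from the others. The sketch below is not part of the original analysis: `silhouette_by_k` is a hypothetical helper, and it runs on synthetic two-dimensional data with three well-separated groups rather than on the venue frequencies.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def silhouette_by_k(features, k_values, random_state=0):
    """Fit k-means for each k and return {k: silhouette score}."""
    scores = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, random_state=random_state, n_init=10).fit_predict(features)
        scores[k] = silhouette_score(features, labels)
    return scores


# Synthetic toy data: three tight blobs of 20 points each
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
toy = np.vstack([center + rng.normal(scale=0.5, size=(20, 2)) for center in centers])

scores = silhouette_by_k(toy, range(2, 6))
best_k = max(scores, key=scores.get)
print(best_k)
```

On the real data the same helper could be called with `helsinki_grouped_clustering`. The scores there may be much less clear-cut than on this toy example.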

Create a new dataframe that includes the cluster label as well as the top 10 venues for each neighborhood.

In [68]:
# add clustering labels
neighborhoods_venues_sorted['Cluster Labels']=kmeans.labels_
helsinki_merged = helsinki_neighborhood_geo

# join the sorted venues and cluster labels onto helsinki_merged, which holds latitude/longitude for each neighborhood
helsinki_merged = helsinki_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Neighborhood')

helsinki_merged.head() 

Unnamed: 0,Neighborhood,Latitude,Longitude,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
0,Kruununhaka,60.17164,24.95658,Café,Boat or Ferry,Bar,Grocery Store,Pizza Place,Scandinavian Restaurant,Theater,Modern European Restaurant,Chinese Restaurant,Plaza,1
1,Kluuvi,60.17047,24.94654,Café,Scandinavian Restaurant,Hotel,Coffee Shop,Park,Plaza,Bar,Chinese Restaurant,Clothing Store,Bistro,1
2,Kaartinkaupunki,60.16486,24.9494,Scandinavian Restaurant,Hotel,Café,Park,Furniture / Home Store,Vegetarian / Vegan Restaurant,Cocktail Bar,Hotel Bar,Boat or Ferry,Bar,1
3,Kamppi,60.16739,24.93108,Wine Bar,Scandinavian Restaurant,Hotel,Sushi Restaurant,Japanese Restaurant,Coffee Shop,Middle Eastern Restaurant,Chinese Restaurant,Asian Restaurant,Bar,1
4,Punavuori,60.16147,24.93726,Scandinavian Restaurant,Hotel,Coffee Shop,Café,Restaurant,Bakery,Park,Pizza Place,Kitchen Supply Store,Italian Restaurant,1


Visualize the resulting clusters

In [69]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(helsinki_merged['Latitude'], helsinki_merged['Longitude'], helsinki_merged['Neighborhood'], helsinki_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [72]:
helsinki_merged.loc[helsinki_merged['Cluster Labels'] == 2, helsinki_merged.columns[[1] + list(range(5, helsinki_merged.shape[1]))]]

Unnamed: 0,Latitude,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue,Cluster Labels
10,60.18609,Dog Run,Music School,Chinese Restaurant,Grocery Store,Park,Harbor / Marina,Gym,Italian Restaurant,2
15,60.20061,Plaza,Supermarket,Bar,Himalayan Restaurant,Filipino Restaurant,Farm,Farmers Market,Fast Food Restaurant,2
23,60.20911,Pool,Pub,Playground,Pizza Place,Fast Food Restaurant,Event Space,Falafel Restaurant,Farm,2
25,60.21827,Park,Kebab Restaurant,Go Kart Track,Pharmacy,Café,Grocery Store,Filipino Restaurant,Fast Food Restaurant,2
27,60.23345,Soccer Field,Grocery Store,Gym,Plaza,Skating Rink,Fast Food Restaurant,Taxi Stand,Playground,2
33,60.24441,Playground,Pizza Place,Grocery Store,Farmers Market,Ethiopian Restaurant,Event Space,Falafel Restaurant,Farm,2
36,60.24496,Grocery Store,Platform,Pizza Place,Pub,Convenience Store,Pet Store,Dive Bar,Park,2
41,60.18495,Badminton Court,Music School,Scandinavian Restaurant,Chinese Restaurant,Gym,Park,Harbor / Marina,Grocery Store,2
48,60.17156,Food Court,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,Film Studio,Filipino Restaurant,Fast Food Restaurant,2
57,60.25,Lounge,Park,Zoo Exhibit,Event Space,Flower Shop,Flea Market,Fish Market,Fish & Chips Shop,2
