# Capstone Project - The Battle of Neighborhoods


## Table of contents:
1. Introduction to Business Problem
2. Data Description
3. Methodology
4. Results and Discussion
5. Conclusion
6. References

### 1. Introduction to Business Problem

Frankfurt is a global hub for commerce, culture, education, tourism and transportation. It is the fifth-largest city in Germany and the country's financial capital. This leads to a huge variety of restaurants, bars, nightspots, shops, museums and so on. In this project, however, I would like to consider one part of the everyday life of locals: the opportunity to do sports.
This project can be of interest to business people who want to open a fitness center or gym in Frankfurt and need to know the geographical distribution of sports facilities.

### 2. Data Description

For this project we need the following data:
* Geographical data of Frankfurt boroughs with names and coordinates;
* List of sports facilities for each borough.

The following sources were used:
* Geographical data: Wikipedia [1];
* Foursquare API [2];
* Foursquare list of categories [3];
* GeoJSON file with a map of Frankfurt am Main [4].


### 3. Methodology

This section describes the data processing step by step. First, all necessary libraries and packages are installed.

In [1]:
!pip install pandas
!pip install requests
!pip install bs4
!pip install plotly
!conda install -c conda-forge geopy --yes
!conda install -c conda-forge folium=0.5.0 --yes 
!pip install geocoder
!pip install pgeocode
print("Done")


Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... 
  - anaconda/win-64::ca-certificates-2020.10.14-0
  - defaults/win-64::ca-certificates-2020.10.14-0done

# All requested packages already installed.

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... 
  - anaconda/win-64::ca-certificates-2020.10.14-0
  - defaults/win-64::ca-certificates-2020.10.14-0done

# All requested packages already installed.

Done


In [2]:
# Import libraries for web scraping, geocoding, clustering and mapping
import pandas as pd
import requests
from bs4 import BeautifulSoup
import numpy as np
from geopy.geocoders import Nominatim
import matplotlib.cm as cm
import matplotlib.colors as colors
from sklearn.cluster import KMeans
import folium
import geocoder
print("Done")

Done


We download the Wikipedia page with *requests*, parse its tables with *pandas*, and check how many data frames were found.

In [3]:
url = "https://de.wikipedia.org/wiki/Liste_der_Ortsbezirke_von_Frankfurt_am_Main"
frankfurt_html = requests.get(url).text
dataframe_list = pd.read_html(url,flavor='bs4')
print(len(dataframe_list))

1


The output above shows that the Wiki page contains only one data frame. We take it and save it.

In [4]:
frankfurt_data = dataframe_list[0]
frankfurt_data = pd.DataFrame(frankfurt_data)
frankfurt_data.head()

Unnamed: 0,Nr.,Name,Stadtteile,Einwohnerzahl[1],Fläche (km²)[2],Bevölkerungsdichte(Einwohner/km²),Ortsvorsteher[3]
0,1,Innenstadt I,"Gallus, Gutleutviertel, Bahnhofsviertel, Altst...",63.171,8848,7.14,Oliver Strank (SPD)
1,2,Innenstadt II,"Bockenheim, Westend-Süd, Westend-Nord",71.684,12160,5.895,Axel Kaufmann (CDU) (ww)
2,3,Innenstadt III,"Nordend-West, Nordend-Ost",54.046,4632,11.668,Karin Guder (Grüne) (ww)
3,4,Bornheim/Ostend,"Ostend, Bornheim",60.41,8350,7.235,Herrmann Steib (Grüne)
4,5,Süd,"Flughafen, Sachsenhausen-Süd, Sachsenhausen-No...",102.33,67778,1.51,Christian Becker (CDU) (nw)


Next, we drop the columns we do not need and rename the remaining ones.

In [5]:
frankfurt_data.drop(["Nr.","Ortsvorsteher[3]","Bevölkerungsdichte(Einwohner/km²)","Fläche (km²)[2]","Einwohnerzahl[1]"], axis = 1, inplace=True)


In [6]:
# avoid shadowing the built-in name `dict`
column_names = {"Name": "Borough", "Stadtteile": "Neighbourhood"}
frankfurt_data.rename(columns=column_names, inplace=True)
frankfurt_data

Unnamed: 0,Borough,Neighbourhood
0,Innenstadt I,"Gallus, Gutleutviertel, Bahnhofsviertel, Altst..."
1,Innenstadt II,"Bockenheim, Westend-Süd, Westend-Nord"
2,Innenstadt III,"Nordend-West, Nordend-Ost"
3,Bornheim/Ostend,"Ostend, Bornheim"
4,Süd,"Flughafen, Sachsenhausen-Süd, Sachsenhausen-No..."
5,West,"Schwanheim, Griesheim, Nied, Sossenheim, Höchs..."
6,Mitte-West,"Rödelheim, Praunheim, Hausen, STB 343"
7,Nord-West,"Niederursel, Heddernheim, STB 426 (Praunheim-N..."
8,Mitte-Nord,"Eschersheim, Ginnheim, Dornbusch"
9,Nord-Ost,"Eckenheim, Preungesheim, Berkersheim, Frankfur..."


The table contains only geographical names, so we need to obtain the geographical coordinates (latitude and longitude) of each city district. For this purpose the *Nominatim* geocoder from *geopy* is used.

In [8]:
from geopy.geocoders import Nominatim

# the geocoder has to exist before the first lookup
geolocator = Nominatim(user_agent="frankfurt_explorer")

districts = frankfurt_data['Borough'].tolist()
city = "Frankfurt am Main"
country = "DE"
lat = []
long = []
for district in districts:
    # the query looks like "Süd,Frankfurt am Main,DE"
    loc = geolocator.geocode(district + ',' + city + ',' + country)
    lat.append(loc.latitude)
    long.append(loc.longitude)

print(lat, long)

[50.1039941, 50.1220292, 50.129308699999996, 50.125123099999996, 50.06215895, 50.09805815, 50.13711895, 50.166981, 50.15483965, 50.16316295, 50.13569915, 50.186279299999995, 50.2026088, 50.182287, 50.2017328, 50.15801465] [8.64359066368981, 8.648538084839267, 8.688586554310248, 8.712300724919832, 8.63761978937761, 8.546215090872613, 8.610849292665335, 8.619091035283446, 8.661879055251527, 8.694544780994663, 8.747953192695935, 8.639054519087743, 8.7119602, 8.6929716, 8.666710664514689, 8.762038791958947]
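
The loop above assumes that every lookup succeeds. Nominatim's `geocode` returns `None` when it finds nothing, in which case `loc.latitude` would raise an error. The following is a minimal sketch of a more defensive variant. The names `geocode_districts` and `fake_geocode` are hypothetical, and the stub stands in for the real geocoder so that the example runs without network access.

```python
from collections import namedtuple

# Stand-in for the location objects returned by a geocoder (assumption for this sketch)
Location = namedtuple("Location", ["latitude", "longitude"])


def geocode_districts(districts, geocode, city="Frankfurt am Main", country="DE"):
    """Return {district: (latitude, longitude)}, or None for districts that were not found."""
    coords = {}
    for district in districts:
        query = "{}, {}, {}".format(district, city, country)
        loc = geocode(query)  # may be None when the geocoder finds no match
        coords[district] = None if loc is None else (loc.latitude, loc.longitude)
    return coords


def fake_geocode(query):
    """Hypothetical stub that only knows one borough."""
    known = {"Süd, Frankfurt am Main, DE": Location(50.06215895, 8.63761978937761)}
    return known.get(query)


result = geocode_districts(["Süd", "Nowhere"], fake_geocode)
print(result)
```

With geopy, the same helper could be called as `geocode_districts(districts, geolocator.geocode)`; boroughs mapped to `None` can then be inspected by hand.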


In [9]:
frankfurt_data['Latitude'] = lat
frankfurt_data['Longitude'] = long

frankfurt_data.head()



Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude
0,Innenstadt I,"Gallus, Gutleutviertel, Bahnhofsviertel, Altst...",50.103994,8.643591
1,Innenstadt II,"Bockenheim, Westend-Süd, Westend-Nord",50.122029,8.648538
2,Innenstadt III,"Nordend-West, Nordend-Ost",50.129309,8.688587
3,Bornheim/Ostend,"Ostend, Bornheim",50.125123,8.712301
4,Süd,"Flughafen, Sachsenhausen-Süd, Sachsenhausen-No...",50.062159,8.63762


Mapping each borough on a map

In [10]:
address = 'Frankfurt,DE'

geolocator = Nominatim(user_agent="frankfurt_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Frankfurt are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Frankfurt are 50.1106444, 8.6820917.


In [62]:

frankfurt_geo = r'https://offenedaten.frankfurt.de/dataset/85b38876-729c-4a78-910c-a52d5c6df8d2/resource/21d455e1-217d-47c5-af3d-ecdef1c50586/download/ffmortsbezirke.geojson'
# create map of Frankfurt using latitude and longitude values
frankfurt_map = folium.Map(location=[latitude, longitude], zoom_start=12)
folium.GeoJson(frankfurt_geo, name="geojson").add_to(frankfurt_map)
# add markers to map
for lat, lng, borough in zip(frankfurt_data['Latitude'], frankfurt_data['Longitude'], frankfurt_data['Borough']):
    label = '{}'.format(borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='yellow',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(frankfurt_map)  

frankfurt_map

**Foursquare API**  In this step we obtain a list of venues through the Foursquare API. Since we want to explore one specific category, namely Gym/Fitness Center venues, it is necessary to use the *search* endpoint in the request and to know the category ID (see [3] in the reference list).

In [12]:
# @hidden_cell
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID' # your Foursquare ID
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET' # your Foursquare Secret
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value
print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: YOUR_FOURSQUARE_CLIENT_ID
CLIENT_SECRET: YOUR_FOURSQUARE_CLIENT_SECRET
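
Credentials written into a notebook cell are easy to leak when the notebook is shared. A minimal sketch of one alternative is to read them from environment variables. The function name `load_foursquare_credentials` and the variable names `FOURSQUARE_CLIENT_ID` and `FOURSQUARE_CLIENT_SECRET` are assumptions made for this example, not part of the Foursquare API.

```python
import os


def load_foursquare_credentials(env=os.environ):
    """Read the client ID and secret from a mapping of environment variables."""
    try:
        return env["FOURSQUARE_CLIENT_ID"], env["FOURSQUARE_CLIENT_SECRET"]
    except KeyError as missing:
        # report which variable is missing without printing any secret value
        raise RuntimeError(
            "Missing environment variable: {}".format(missing.args[0])
        ) from None


# usage with an explicit mapping, so the sketch runs without real credentials
client_id, client_secret = load_foursquare_credentials(
    {"FOURSQUARE_CLIENT_ID": "demo-id", "FOURSQUARE_CLIENT_SECRET": "demo-secret"}
)
print("CLIENT_ID loaded:", bool(client_id))
```

In the notebook, `CLIENT_ID, CLIENT_SECRET = load_foursquare_credentials()` would then pick up the values set in the shell before Jupyter was started.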


To obtain the list of venues with names, coordinates and subcategories, the function **getNearbyVenues** is created.

In [72]:
def getNearbyVenues(names, latitudes, longitudes, radius=3000, LIMIT=50, categoryId="4bf58dd8d48988d175941735"):

    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)

        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/search?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}&categoryId={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT,
            categoryId)

        # make the GET request
        results = requests.get(url).json()["response"]['venues']

        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['name'], 
            v['location']['lat'], 
            v['location']['lng'],  
            v['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Borough', 
                  'Borough Latitude', 
                  'Borough Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
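
The request URL in **getNearbyVenues** is assembled with string formatting. As a sketch of an alternative, the same query string can be built with `urllib.parse.urlencode`, which escapes the parameter values. The endpoint and parameter names are the ones used above; `build_search_url` and `SEARCH_ENDPOINT` are hypothetical names introduced for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, version, lat, lng,
                     radius, limit, category_id):
    """Build the venue search URL with properly escaped query parameters."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": "{},{}".format(lat, lng),  # latitude,longitude of the borough
        "radius": radius,                # search radius in meters
        "limit": limit,
        "categoryId": category_id,
    }
    return SEARCH_ENDPOINT + "?" + urlencode(params)


url = build_search_url("ID", "SECRET", "20180605", 50.103994, 8.643591,
                       3000, 50, "4bf58dd8d48988d175941735")
# decode the query string again to check what the server would receive
query = parse_qs(urlsplit(url).query)
print(query["ll"], query["categoryId"])
```

Alternatively, the `params` dictionary can be passed to `requests.get(SEARCH_ENDPOINT, params=params)`, which performs the same encoding.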

In [73]:
frankfurt_venues = getNearbyVenues(names = frankfurt_data['Borough'],
                                   latitudes=frankfurt_data['Latitude'],
                                   longitudes=frankfurt_data['Longitude']
                                  )

Innenstadt I
Innenstadt II
Innenstadt III
Bornheim/Ostend
Süd
West
Mitte-West
Nord-West
Mitte-Nord
Nord-Ost
Ost
Kalbach/Riedberg
Nieder-Erlenbach
Harheim
Nieder-Eschbach
Bergen-Enkheim


In [74]:
print(frankfurt_venues.shape)
frankfurt_venues

(402, 7)


Unnamed: 0,Borough,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Innenstadt I,50.103994,8.643591,Gym @ Capri by Fraser,50.108852,8.648248,Gym / Fitness Center
1,Innenstadt I,50.103994,8.643591,FFM CrossFit,50.094738,8.642883,Gym
2,Innenstadt I,50.103994,8.643591,Classy Mid,50.105072,8.668682,Gym / Fitness Center
3,Innenstadt I,50.103994,8.643591,L1FE - EVOLVING MOVEMENTS,50.114049,8.673175,Gym / Fitness Center
4,Innenstadt I,50.103994,8.643591,Pure Training Sachsenhausen,50.101278,8.686467,Gym
...,...,...,...,...,...,...,...
397,Bergen-Enkheim,50.158015,8.762039,Netzwerk Körper,50.183391,8.744016,Gym / Fitness Center
398,Bergen-Enkheim,50.158015,8.762039,T Hall,50.133517,8.765915,Climbing Gym
399,Bergen-Enkheim,50.158015,8.762039,Sportanlage Turnverein Bergen-Enkheim 1874 e.V.,50.159743,8.747232,Soccer Field
400,Bergen-Enkheim,50.158015,8.762039,Sporthalle Bad Vilbel,50.184160,8.735944,Gym


Finally, the working data frame is created and ready to use.

In [75]:
frankfurt_venus_all = frankfurt_venues[["Borough","Venue","Venue Latitude", "Venue Longitude", "Venue Category"]]
frankfurt_venus_all 

Unnamed: 0,Borough,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Innenstadt I,Gym @ Capri by Fraser,50.108852,8.648248,Gym / Fitness Center
1,Innenstadt I,FFM CrossFit,50.094738,8.642883,Gym
2,Innenstadt I,Classy Mid,50.105072,8.668682,Gym / Fitness Center
3,Innenstadt I,L1FE - EVOLVING MOVEMENTS,50.114049,8.673175,Gym / Fitness Center
4,Innenstadt I,Pure Training Sachsenhausen,50.101278,8.686467,Gym
...,...,...,...,...,...
397,Bergen-Enkheim,Netzwerk Körper,50.183391,8.744016,Gym / Fitness Center
398,Bergen-Enkheim,T Hall,50.133517,8.765915,Climbing Gym
399,Bergen-Enkheim,Sportanlage Turnverein Bergen-Enkheim 1874 e.V.,50.159743,8.747232,Soccer Field
400,Bergen-Enkheim,Sporthalle Bad Vilbel,50.184160,8.735944,Gym


A data frame of the unique categories is shown below. It will be used later.

In [76]:
frankfurt_uniq = frankfurt_venus_all["Venue Category"].unique()
frankfurt_uniq=pd.DataFrame(frankfurt_uniq)
frankfurt_uniq

Unnamed: 0,0
0,Gym / Fitness Center
1,Gym
2,Gym Pool
3,Yoga Studio
4,Martial Arts School
5,Track
6,Athletics & Sports
7,Pilates Studio
8,Gymnastics Gym
9,Climbing Gym


Mapping all venues

In [77]:
frankfurt_geo = r'https://offenedaten.frankfurt.de/dataset/85b38876-729c-4a78-910c-a52d5c6df8d2/resource/21d455e1-217d-47c5-af3d-ecdef1c50586/download/ffmortsbezirke.geojson'
frankfurt_venues_map = folium.Map(location=[latitude, longitude], zoom_start=11)
folium.GeoJson(frankfurt_geo, name="geojson").add_to(frankfurt_venues_map)
# add markers to map
for lat, lng, borough, venue, vencat in zip(frankfurt_venus_all['Venue Latitude'], frankfurt_venus_all['Venue Longitude'], frankfurt_venus_all['Borough'], frankfurt_venus_all['Venue'],frankfurt_venus_all["Venue Category"] ):
    label = '{}, {},{}'.format(venue,vencat,borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=4,
        popup=label,
        color="white",
        fill=True,
        fill_color= "green",
        fill_opacity=0.7,
        parse_html=False).add_to(frankfurt_venues_map)  
    
frankfurt_venues_map

In [78]:
frankfurt_venues.groupby('Borough').count()

Unnamed: 0_level_0,Borough Latitude,Borough Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
Borough,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Bergen-Enkheim,14,14,14,14,14,14
Bornheim/Ostend,47,47,47,47,47,47
Harheim,12,12,12,12,12,12
Innenstadt I,45,45,45,45,45,45
Innenstadt II,45,45,45,45,45,45
Innenstadt III,47,47,47,47,47,47
Kalbach/Riedberg,10,10,10,10,10,10
Mitte-Nord,32,32,32,32,32,32
Mitte-West,35,35,35,35,35,35
Nieder-Erlenbach,6,6,6,6,6,6


In this step each borough is analysed. First, one-hot encoding is applied; then the data frame is grouped by borough and sorted by frequency of occurrence.

In [79]:
# one hot encoding
frankfurt_onehot = pd.get_dummies(frankfurt_venus_all[['Venue Category']], prefix="", prefix_sep="")

# add borough column back to dataframe
frankfurt_onehot['Borough'] = frankfurt_venus_all['Borough'] 

# move borough column to the first column
fixed_columns = [frankfurt_onehot.columns[-1]] + list(frankfurt_onehot.columns[:-1])
frankfurt_onehot = frankfurt_onehot[fixed_columns]

frankfurt_onehot.head()

Unnamed: 0,Borough,Athletics & Sports,Boxing Gym,Climbing Gym,Gym,Gym / Fitness Center,Gym Pool,Gymnastics Gym,Martial Arts School,Medical Center,Outdoor Gym,Pilates Studio,Soccer Field,Track,Yoga Studio
0,Innenstadt I,0,0,0,0,1,0,0,0,0,0,0,0,0,0
1,Innenstadt I,0,0,0,1,0,0,0,0,0,0,0,0,0,0
2,Innenstadt I,0,0,0,0,1,0,0,0,0,0,0,0,0,0
3,Innenstadt I,0,0,0,0,1,0,0,0,0,0,0,0,0,0
4,Innenstadt I,0,0,0,1,0,0,0,0,0,0,0,0,0,0
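
To make the encoding step concrete: one-hot encoding followed by a group mean gives, for every borough, the share of its venues that fall into each category. The plain-Python sketch below computes the same quantity on a few toy rows (not the real data set). `category_frequencies` is a hypothetical helper, not part of pandas.

```python
from collections import Counter, defaultdict


def category_frequencies(rows):
    """rows: (borough, category) pairs. Return {borough: {category: share of venues}}."""
    rows = list(rows)
    counts = defaultdict(Counter)
    for borough, category in rows:
        counts[borough][category] += 1
    # every borough gets a value for every category, like the one-hot columns
    categories = sorted({category for _, category in rows})
    freqs = {}
    for borough, counter in counts.items():
        total = sum(counter.values())
        freqs[borough] = {cat: counter[cat] / total for cat in categories}
    return freqs


# toy rows for illustration only
toy_rows = [
    ("Innenstadt I", "Gym / Fitness Center"),
    ("Innenstadt I", "Gym"),
    ("Innenstadt I", "Gym / Fitness Center"),
    ("Innenstadt I", "Gym / Fitness Center"),
    ("Süd", "Gym Pool"),
]
freqs = category_frequencies(toy_rows)
print(freqs["Innenstadt I"])
```

Categories that do not occur in a borough get a share of 0, which corresponds to the zero columns of the one-hot table.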


Next, a table of the 10 most common venue categories per borough is created.

In [80]:
frankfurt_grouped = frankfurt_onehot.groupby('Borough').mean().reset_index()
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Borough']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        # ranks beyond the third use the generic "th" suffix
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
borough_venues_sorted = pd.DataFrame(columns=columns)
borough_venues_sorted['Borough'] = frankfurt_grouped['Borough']

for ind in np.arange(frankfurt_grouped.shape[0]):
    borough_venues_sorted.iloc[ind, 1:] = return_most_common_venues(frankfurt_grouped.iloc[ind, :], num_top_venues)

borough_venues_sorted.head()

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Bergen-Enkheim,Gym / Fitness Center,Martial Arts School,Gym,Climbing Gym,Soccer Field,Outdoor Gym,Medical Center,Yoga Studio,Track,Pilates Studio
1,Bornheim/Ostend,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gymnastics Gym,Track,Pilates Studio,Gym Pool,Climbing Gym,Soccer Field
2,Harheim,Gym / Fitness Center,Gym,Climbing Gym,Yoga Studio,Boxing Gym,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center
3,Innenstadt I,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gym Pool,Track,Athletics & Sports,Soccer Field,Pilates Studio,Outdoor Gym
4,Innenstadt II,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Track,Pilates Studio,Gym Pool,Athletics & Sports,Soccer Field,Outdoor Gym
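
Note that the ranking always fills all 10 columns, even for a borough with fewer than 10 categories present. Nieder-Erlenbach, for example, has only 6 venues, so its lower ranks are necessarily categories with a frequency of 0. The sketch below shows one possible way to rank categories while leaving out the empty ones. `top_categories` is a hypothetical helper and the frequencies are toy values.

```python
def top_categories(freqs, n):
    """Return up to n category names, most frequent first, skipping zero shares."""
    # sort by descending share; ties are broken alphabetically for a stable result
    ranked = sorted(freqs.items(), key=lambda item: (-item[1], item[0]))
    return [category for category, share in ranked[:n] if share > 0]


# toy frequencies for one borough, for illustration only
toy_freqs = {
    "Gym / Fitness Center": 0.5,
    "Gym": 0.3,
    "Yoga Studio": 0.2,
    "Track": 0.0,
    "Soccer Field": 0.0,
}
top = top_categories(toy_freqs, 4)
print(top)
```

Here only three names are returned although four were requested, because the remaining categories do not occur in the borough.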


Finally, k-means clustering is applied. Since our locations are widely spread (except for the city center), the number of clusters is set to 10.

In [82]:
# set number of clusters
kclusters = 10

frankfurt_grouped_clustering = frankfurt_grouped.drop(columns='Borough')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(frankfurt_grouped_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10] 

array([8, 5, 9, 5, 5, 5, 1, 7, 1, 4])
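
For readers unfamiliar with k-means, the following plain-Python sketch shows the basic iteration (Lloyd's algorithm): assign every point to its nearest centroid, then move each centroid to the mean of its points. It is not scikit-learn's implementation, which adds a smarter initialization among other refinements. The function `kmeans_sketch`, the fixed starting centroids and the toy points are assumptions made for illustration.

```python
import math


def kmeans_sketch(points, centroids, iterations=100):
    """Basic Lloyd iteration with explicit starting centroids. Returns (labels, centroids)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # assignment step: index of the nearest centroid for every point
        new_labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change, so the result is stable
        labels = new_labels
        # update step: move each centroid to the mean of its assigned points
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# toy frequency vectors: (share of "Gym / Fitness Center", share of "Gym")
toy_points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
labels, centers = kmeans_sketch(toy_points, [toy_points[0], toy_points[2]])
print(labels)
```

In the notebook, each point is one borough's row of category frequencies, so boroughs with a similar mix of venue categories receive the same label.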

In [83]:
# add clustering labels
borough_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

frankfurt_merged = frankfurt_data

# merge frankfurt_data with borough_venues_sorted to add the cluster label and top venues for each borough
frankfurt_merged = frankfurt_merged.join(borough_venues_sorted.set_index('Borough'), on='Borough')

frankfurt_merged.head() 

Unnamed: 0,Borough,Neighbourhood,Latitude,Longitude,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Innenstadt I,"Gallus, Gutleutviertel, Bahnhofsviertel, Altst...",50.103994,8.643591,5,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gym Pool,Track,Athletics & Sports,Soccer Field,Pilates Studio,Outdoor Gym
1,Innenstadt II,"Bockenheim, Westend-Süd, Westend-Nord",50.122029,8.648538,5,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Track,Pilates Studio,Gym Pool,Athletics & Sports,Soccer Field,Outdoor Gym
2,Innenstadt III,"Nordend-West, Nordend-Ost",50.129309,8.688587,5,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gymnastics Gym,Track,Pilates Studio,Gym Pool,Climbing Gym,Athletics & Sports
3,Bornheim/Ostend,"Ostend, Bornheim",50.125123,8.712301,5,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gymnastics Gym,Track,Pilates Studio,Gym Pool,Climbing Gym,Soccer Field
4,Süd,"Flughafen, Sachsenhausen-Süd, Sachsenhausen-No...",50.062159,8.63762,6,Gym / Fitness Center,Gym,Gym Pool,Martial Arts School,Yoga Studio,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center


In [84]:
frankfurt_geo = r'https://offenedaten.frankfurt.de/dataset/85b38876-729c-4a78-910c-a52d5c6df8d2/resource/21d455e1-217d-47c5-af3d-ecdef1c50586/download/ffmortsbezirke.geojson'


# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)
folium.GeoJson(frankfurt_geo, name="geojson").add_to(map_clusters)
# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**3 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(frankfurt_merged['Latitude'], frankfurt_merged['Longitude'], frankfurt_merged['Borough'], frankfurt_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.9).add_to(map_clusters)
       
map_clusters

Cluster 1

In [85]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 0,  frankfurt_merged.columns[[0] + list(range(5,  frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
9,Nord-Ost,Gym / Fitness Center,Gym,Yoga Studio,Climbing Gym,Soccer Field,Gymnastics Gym,Boxing Gym,Track,Pilates Studio,Outdoor Gym


Cluster 2

In [86]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 1, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
6,Mitte-West,Gym / Fitness Center,Gym,Yoga Studio,Pilates Studio,Martial Arts School,Gym Pool,Track,Soccer Field,Outdoor Gym,Medical Center
11,Kalbach/Riedberg,Gym / Fitness Center,Gym,Yoga Studio,Pilates Studio,Track,Soccer Field,Outdoor Gym,Medical Center,Martial Arts School,Gymnastics Gym
14,Nieder-Eschbach,Gym / Fitness Center,Gym,Yoga Studio,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center,Martial Arts School,Gymnastics Gym


Cluster 3

In [87]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 2, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
10,Ost,Gym / Fitness Center,Gym,Martial Arts School,Yoga Studio,Gymnastics Gym,Climbing Gym,Soccer Field,Outdoor Gym,Medical Center,Gym Pool


Cluster 4 

In [88]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 3, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
5,West,Gym,Gym / Fitness Center,Yoga Studio,Gym Pool,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center,Martial Arts School


Cluster 5

In [89]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 4, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
12,Nieder-Erlenbach,Gym,Gym / Fitness Center,Yoga Studio,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center,Martial Arts School,Gymnastics Gym


Cluster 6

In [90]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 5, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Innenstadt I,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gym Pool,Track,Athletics & Sports,Soccer Field,Pilates Studio,Outdoor Gym
1,Innenstadt II,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Track,Pilates Studio,Gym Pool,Athletics & Sports,Soccer Field,Outdoor Gym
2,Innenstadt III,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gymnastics Gym,Track,Pilates Studio,Gym Pool,Climbing Gym,Athletics & Sports
3,Bornheim/Ostend,Gym / Fitness Center,Gym,Yoga Studio,Martial Arts School,Gymnastics Gym,Track,Pilates Studio,Gym Pool,Climbing Gym,Soccer Field


Cluster 7

In [91]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 6, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
4,Süd,Gym / Fitness Center,Gym,Gym Pool,Martial Arts School,Yoga Studio,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center


Cluster 8

In [92]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 7, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
7,Nord-West,Gym,Gym / Fitness Center,Yoga Studio,Pilates Studio,Track,Soccer Field,Outdoor Gym,Medical Center,Martial Arts School,Gymnastics Gym
8,Mitte-Nord,Gym,Gym / Fitness Center,Yoga Studio,Climbing Gym,Martial Arts School,Gymnastics Gym,Boxing Gym,Track,Soccer Field,Pilates Studio


Cluster 9

In [93]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 8, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
15,Bergen-Enkheim,Gym / Fitness Center,Martial Arts School,Gym,Climbing Gym,Soccer Field,Outdoor Gym,Medical Center,Yoga Studio,Track,Pilates Studio


Cluster 10

In [94]:
frankfurt_merged.loc[frankfurt_merged['Cluster Labels'] == 9, frankfurt_merged.columns[[0] + list(range(5, frankfurt_merged.shape[1]))]]

Unnamed: 0,Borough,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
13,Harheim,Gym / Fitness Center,Gym,Climbing Gym,Yoga Studio,Boxing Gym,Track,Soccer Field,Pilates Studio,Outdoor Gym,Medical Center


### 4. Results and Discussion

According to our analysis, the highest concentration of sports facilities is in the city center (Innenstadt I-III) and Bornheim/Ostend, where the historical center and the financial district are situated. Three areas share second place by density of sports facilities: Mitte-Nord, Mitte-West and Ost. Süd (including the airport), the largest borough by population, has a rather low number of sports venues, which suggests that this area is the most suitable for a new fitness club or gym. The three most popular categories of sports venues are gym, fitness studio and yoga studio.

### 5. Conclusion

In this project the density of indoor sports locations was analysed. Areas with the highest densities were identified.

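### 6. References
1. https://de.wikipedia.org/wiki/Liste_der_Ortsbezirke_von_Frankfurt_am_Main
2. https://developer.foursquare.com/
3. https://developer.foursquare.com/docs/build-with-foursquare/categories/
4. https://offenedaten.frankfurt.de/dataset/85b38876-729c-4a78-910c-a52d5c6df8d2/resource/21d455e1-217d-47c5-af3d-ecdef1c50586/download/ffmortsbezirke.geojson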