# Family Friendly Places in Munich

## Introduction

This notebook is my capstone project for the [IBM Data Science Professional Certificate](https://www.coursera.org/professional-certificates/ibm-data-science) course.
For this assignment, each student could come up with a problem that can be solved by leveraging Foursquare location data and machine learning algorithms.

## Business Problem

Because I live in Munich, Germany, and have a family, I decided to explore the city's districts and compare them by their family-friendly food places (restaurants, bakeries, etc.).
Not every place in a city is family friendly. This project analyzes Munich from that perspective to help families explore the city, or maybe even settle in one of its districts.
The audience for this project is families who either live in Munich or are visiting the city.

## Data

In this section I get the data required for the analysis and show examples of it.

Munich has the following districts:

In [26]:
district_names = [
    'Altstadt - Lehel',
    'Ludwigsvorstadt - Isarvorstadt',
    'Maxvorstadt',
    'Schwabing-West',
    'Au - Haidhausen',
    'Sendling',
    'Sendling - Westpark',
    'Schwanthalerhöhe',
    'Neuhausen - Nymphenburg',
    'Moosach',
    'Milbertshofen - Am Hart',
    'Schwabing - Freimann',
    'Bogenhausen',
    'Berg am Laim',
    'Trudering - Riem',
    'Ramersdorf - Perlach',
    'Obergiesing',
    'Untergiesing - Harlaching',
    'Thalkirchen - Obersendling - Forstenried - Fürstenried - Solln',
    'Hadern',
    'Pasing - Obermenzing',
    'Aubing - Lochhausen - Langwied',
    'Allach - Untermenzing',
    'Feldmoching - Hasenbergl',
    'Laim'
]

Let’s install the dependencies required to explore the districts.

In [1]:
!conda install -c conda-forge geopy --yes

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... 
  - anaconda/win-64::openssl-1.1.1d-he774522_2
  - defaults/win-64::openssl-1.1.1d-he774522_2done

# All requested packages already installed.



In [2]:
!conda install -c conda-forge folium=0.5.0 --yes

Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... 
  - anaconda/win-64::openssl-1.1.1d-he774522_2
  - defaults/win-64::openssl-1.1.1d-he774522_2done

# All requested packages already installed.



Importing packages required for maps

In [3]:
# convert an address into latitude and longitude values
from geopy.geocoders import Nominatim

# map rendering library
import folium

Getting the location of Munich

In [4]:
city_address = 'Munich, Germany'

geolocator = Nominatim(user_agent='munich_explorer')
location = geolocator.geocode(city_address)
city_latitude = location.latitude
city_longitude = location.longitude

print(f'The geographical coordinates of Munich are {city_latitude}, {city_longitude}.')

The geographical coordinates of Munich are 48.1371079, 11.5753822.


Getting the location of each district

In [27]:
def get_location(name):
    # geocode a district by qualifying its name with the city
    return geolocator.geocode(f'{name}, Munich, Germany')

district_locations = []
for name in district_names:
    loc = get_location(name)
    if loc is None:
        print(f'Could not find location for {name}')
    else:
        district_locations.append((name, loc.latitude, loc.longitude))
district_locations

[('Altstadt - Lehel', 48.1378285, 11.5745823),
 ('Ludwigsvorstadt - Isarvorstadt', 48.1317712, 11.5558087),
 ('Maxvorstadt', 48.1465704, 11.5714445),
 ('Schwabing-West', 48.164417, 11.5703639),
 ('Au - Haidhausen', 48.1287531, 11.5905362),
 ('Sendling', 48.1180125, 11.5390832),
 ('Sendling - Westpark', 48.11803085, 11.519332770284128),
 ('Schwanthalerhöhe', 48.1337822, 11.5410566),
 ('Neuhausen - Nymphenburg', 48.1542217, 11.5315172),
 ('Moosach', 48.1798949, 11.5105712),
 ('Milbertshofen - Am Hart', 48.1823848, 11.5750432),
 ('Schwabing - Freimann', 48.1892784, 11.60858258301819),
 ('Bogenhausen', 48.1547823, 11.6334838),
 ('Berg am Laim', 48.1234833, 11.6334511),
 ('Trudering - Riem', 48.1260355, 11.6633383),
 ('Ramersdorf - Perlach', 48.1141401, 11.6142551),
 ('Obergiesing', 48.1111557, 11.5889093),
 ('Untergiesing - Harlaching', 48.1149632, 11.5701894),
 ('Thalkirchen - Obersendling - Forstenried - Fürstenried - Solln',
  48.1028401,
  11.5459789),
 ('Hadern', 48.118064, 11.4818417
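
Nominatim's public service asks clients to stay at or below one request per second, so a loop like the one above may benefit from a pause between calls. The following is a minimal sketch, not the notebook's own code: `geocode_districts` is a hypothetical helper that accepts any geocoding callable, and `fake_geocode` with its `Location` tuple is an offline stub so the snippet runs without network access.

```python
import time
from collections import namedtuple

# Stand-in for the object a geocoder returns (hypothetical, for illustration only).
Location = namedtuple('Location', ['latitude', 'longitude'])


def geocode_districts(names, geocode, city='Munich, Germany', delay_seconds=1.0):
    """Return (found, missing); found holds (name, lat, lng) tuples."""
    found, missing = [], []
    for i, name in enumerate(names):
        if i > 0:
            time.sleep(delay_seconds)  # pause between consecutive requests
        loc = geocode(f'{name}, {city}')
        if loc is None:
            missing.append(name)
        else:
            found.append((name, loc.latitude, loc.longitude))
    return found, missing


def fake_geocode(query):
    """Offline stub: knows a single district, returns None for anything else."""
    known = {'Maxvorstadt, Munich, Germany': Location(48.1465704, 11.5714445)}
    return known.get(query)


found, missing = geocode_districts(
    ['Maxvorstadt', 'Nowhere'], fake_geocode, delay_seconds=0)
print(found)    # [('Maxvorstadt', 48.1465704, 11.5714445)]
print(missing)  # ['Nowhere']
```

In the notebook, `geolocator.geocode` could be passed in place of `fake_geocode`. geopy also ships a `RateLimiter` wrapper in `geopy.extra.rate_limiter` that serves the same purpose.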

Let’s take a look at Munich and its districts on the map

In [30]:
munich_map = folium.Map(location=[city_latitude, city_longitude], zoom_start=11.5)

for name, lat, lng in district_locations:
    label = folium.Popup(name, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(munich_map)  
    
munich_map

Let’s connect to the Foursquare API to explore location data.
Since this notebook is public, I store the credentials locally in a separate file.

In [31]:
from configparser import ConfigParser

parser = ConfigParser()
_ = parser.read('secrets.cfg')

foursquare_client_id = parser.get('secrets', 'foursquare_client_id')
foursquare_client_secret = parser.get('secrets', 'foursquare_client_secret')
foursquare_version = '20180605'
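
For reference, here is a sketch of the file layout the cell above expects. The section name `secrets` and the two keys come from the code; the values are placeholders, and `read_string` is used only so the example runs without a file on disk.

```python
from configparser import ConfigParser

# Assumed contents of secrets.cfg; the values are placeholders, not real credentials.
sample_cfg = """
[secrets]
foursquare_client_id = YOUR_CLIENT_ID
foursquare_client_secret = YOUR_CLIENT_SECRET
"""

demo_parser = ConfigParser()
# parser.read('secrets.cfg') would load the same layout from disk
demo_parser.read_string(sample_cfg)

client_id = demo_parser.get('secrets', 'foursquare_client_id')
client_secret = demo_parser.get('secrets', 'foursquare_client_secret')
print(client_id)  # YOUR_CLIENT_ID
```

Note that `ConfigParser.read` silently skips files it cannot find and returns the list of files it did parse, so a missing `secrets.cfg` only surfaces later as a `NoSectionError` from `parser.get`.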

The idea is to search for venues in the `food` category that are marked as `family friendly`. To do that, I use the `venues/explore` endpoint and specify `family friendly` in the query:

In [51]:
import requests
import pandas as pd

venues_list = []
categoryId = '4d4b7105d754a06374d81259' # food category
for name, lat, lng in district_locations:
    # build the API request (requests URL-encodes the parameters)
    url = 'https://api.foursquare.com/v2/venues/explore'
    params = {
        'query': 'family friendly',
        'client_id': foursquare_client_id,
        'client_secret': foursquare_client_secret,
        'v': foursquare_version,
        'categoryId': categoryId,
        'll': f'{lat},{lng}',
        'limit': 1000,
    }

    # make the GET request
    results = requests.get(url, params=params).json()['response']['groups'][0]['items']

    # return only relevant information for each nearby venue
    venues_list.append([(
        name, 
        lat, 
        lng, 
        v['venue']['name'], 
        v['venue']['location']['lat'], 
        v['venue']['location']['lng'],  
        v['venue']['categories'][0]['name']) for v in results])
    
nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
nearby_venues.columns = ['Neighborhood', 
              'Neighborhood Latitude', 
              'Neighborhood Longitude', 
              'Venue', 
              'Venue Latitude', 
              'Venue Longitude', 
              'Venue Category']
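
The chained lookup `['response']['groups'][0]['items']` raises an exception as soon as a response lacks one of those keys or a venue has no category. A more forgiving variant might look like the sketch below. `extract_venues` is a hypothetical helper, and `sample_payload` is hand-written to mirror the nesting used in the cell above; it is not an actual API response.

```python
def extract_venues(payload, name, lat, lng):
    """Flatten one explore-style payload into rows, skipping incomplete venues."""
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        location = venue.get('location', {})
        categories = venue.get('categories', [])
        complete = ('name' in venue and 'lat' in location
                    and 'lng' in location and categories)
        if not complete:
            continue  # skip venues missing any field the table needs
        rows.append((name, lat, lng, venue['name'],
                     location['lat'], location['lng'], categories[0]['name']))
    return rows


# Hand-written payload with the same nesting as the code above (illustrative only).
sample_payload = {
    'response': {'groups': [{'items': [
        {'venue': {'name': 'Ratskeller',
                   'location': {'lat': 48.137583, 'lng': 11.576337},
                   'categories': [{'name': 'Bavarian Restaurant'}]}},
        {'venue': {'name': 'No category',
                   'location': {'lat': 0.0, 'lng': 0.0},
                   'categories': []}},
    ]}]}
}

rows = extract_venues(sample_payload, 'Altstadt - Lehel', 48.1378285, 11.5745823)
print(rows)  # one row, for Ratskeller
print(extract_venues({'response': {}}, 'Altstadt - Lehel', 48.1378285, 11.5745823))  # []
```

Skipping incomplete venues keeps every row the same width, which is what the `DataFrame` column assignment relies on.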

In [52]:
nearby_venues.head(10)

| | Neighborhood | Neighborhood Latitude | Neighborhood Longitude | Venue | Venue Latitude | Venue Longitude | Venue Category |
|---|---|---|---|---|---|---|---|
| 0 | Altstadt - Lehel | 48.137828 | 11.574582 | Augustiner Klosterwirt | 48.138649 | 11.572527 | German Restaurant |
| 1 | Altstadt - Lehel | 48.137828 | 11.574582 | Andechser am Dom | 48.138302 | 11.573778 | Bavarian Restaurant |
| 2 | Altstadt - Lehel | 48.137828 | 11.574582 | Kilians Irish Pub | 48.13872 | 11.574556 | Irish Pub |
| 3 | Altstadt - Lehel | 48.137828 | 11.574582 | La Burrita | 48.136143 | 11.574489 | Burrito Place |
| 4 | Altstadt - Lehel | 48.137828 | 11.574582 | Chocolaterie Beluga | 48.13575 | 11.575776 | Café |
| 5 | Altstadt - Lehel | 48.137828 | 11.574582 | Nürnberger Bratwurst Glöckl am Dom | 48.138191 | 11.574165 | Bavarian Restaurant |
| 6 | Altstadt - Lehel | 48.137828 | 11.574582 | Bite Delite | 48.139996 | 11.575072 | Café |
| 7 | Altstadt - Lehel | 48.137828 | 11.574582 | Restaurant Dallmayr | 48.138489 | 11.576791 | German Restaurant |
| 8 | Altstadt - Lehel | 48.137828 | 11.574582 | CrêpeFruit | 48.138478 | 11.576857 | Creperie |
| 9 | Altstadt - Lehel | 48.137828 | 11.574582 | Ratskeller | 48.137583 | 11.576337 | Bavarian Restaurant |


How many distinct venue categories are there?

In [54]:
print('There are {} unique categories.'.format(len(nearby_venues['Venue Category'].unique())))

There are 98 unique categories.


Let’s see them:

In [55]:
nearby_venues['Venue Category'].unique()

array(['German Restaurant', 'Bavarian Restaurant', 'Irish Pub',
       'Burrito Place', 'Café', 'Creperie', 'Seafood Restaurant',
       'Ice Cream Shop', 'Restaurant', 'Pizza Place',
       'Falafel Restaurant', 'Snack Place', 'Italian Restaurant',
       'Food Court', 'Afghan Restaurant', 'Soup Place',
       'Vegetarian / Vegan Restaurant', 'Theme Restaurant',
       'Sandwich Place', 'Steakhouse', 'Cupcake Shop',
       'English Restaurant', 'Argentinian Restaurant', 'Salad Place',
       'Burger Joint', 'French Restaurant', 'Portuguese Restaurant',
       'Mediterranean Restaurant', 'Trattoria/Osteria', 'Manti Place',
       'Japanese Restaurant', 'Breakfast Spot', 'Bakery',
       'Modern European Restaurant', 'Asian Restaurant',
       'Xinjiang Restaurant', 'Vietnamese Restaurant', 'Greek Restaurant',
       'BBQ Joint', 'Middle Eastern Restaurant', 'Spanish Restaurant',
       'Caucasian Restaurant', 'African Restaurant', 'Chinese Restaurant',
       'Mexican Restaurant', 'Tur

Using the data obtained on family-friendly food places, I’ll compare the districts of Munich with K-means clustering.
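
To make the clustering idea concrete, here is a small plain-Python sketch of K-means (Lloyd's algorithm). It rests on two assumptions that the text above does not confirm: that each district is described by the share of its venues falling into each category, and that the first `k` points seed the centroids (chosen only to keep the toy run deterministic). The helper names and the toy data are invented for illustration and are not results from this notebook.

```python
from collections import Counter


def category_shares(rows, categories):
    """rows: (district, category) pairs -> {district: [share per category]}."""
    counts = {}
    for district, category in rows:
        counts.setdefault(district, Counter())[category] += 1
    return {d: [c[cat] / sum(c.values()) for cat in categories]
            for d, c in counts.items()}


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids (toy initialisation)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: nearest centroid for every point
        labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                  for p in points]
        # update step: move each centroid to the mean of its members
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Invented toy data: (district, venue category) pairs, not results from the notebook.
toy_rows = [
    ('A', 'Café'), ('A', 'Café'), ('A', 'Bakery'),
    ('B', 'Pizza Place'), ('B', 'Pizza Place'), ('B', 'Café'),
    ('C', 'Café'), ('C', 'Bakery'), ('C', 'Café'),
]
categories = ['Bakery', 'Café', 'Pizza Place']
shares = category_shares(toy_rows, categories)
districts = sorted(shares)
labels = kmeans([shares[d] for d in districts], k=2)
print(dict(zip(districts, labels)))  # {'A': 0, 'B': 1, 'C': 0}
```

Districts A and C share the same venue mix and land in one cluster, while B sits alone. For real data, a library implementation such as scikit-learn's `KMeans`, which tries several random initialisations, would usually be the more robust choice.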