# Capstone Project - Museums Berlin

## Introduction / Business Problem

Many people **visit Berlin** for **cultural** reasons and want to explore its many **museums**. To **reduce the time** spent travelling to a museum and back, it helps to **stay at a place** located in the postal code area with the **highest density of museums**.

Therefore, in this project we will try to locate the areas in **Berlin** with the highest density of **museums**.

We will use data science methods to identify the area in which to book a hotel, Airbnb, etc., based on this criterion. The number of museums within each postal code area will then be visualized in order to find the best possible location for your stay.

## Data

The data needed for our analysis can be retrieved via the **Foursquare API**: the number of museums in each neighborhood / postal code area.

## Methodology

In a first step we will retrieve the Foursquare entries that match the query "Museum" within a radius of 2,000 m around the center of Berlin.

In a second step we will filter the data based on the postal codes of the locations, dropping entries without a postal code.

In a last step we will count the museums per postal code and identify the postal code area with the highest count, which serves as our measure of density.

## Analysis

In [13]:
#Import libraries
import requests
import pandas as pd 
import numpy as np

from geopy.geocoders import Nominatim # module to convert an address into latitude and longitude values

import folium # plotting library

from pandas import json_normalize # top-level import; the pandas.io.json location is deprecated



In [9]:
#Define foursquare credentials
CLIENT_ID = '5GU4TM5OKKJNTW01ANMGESFQ2QI3K2AKQWCS3WKPU3BCRDJT' # your Foursquare ID
CLIENT_SECRET = 'BEM55QHRBTDSNZYMRHKUUX21CBZH5BODNYS4FVQDTOU4RJX0' # your Foursquare Secret
ACCESS_TOKEN = 'L30HL4MSA1A1EDYY2EFC3IU3UPU1SPZ3NDNXR2YAZQNNH23S' # your Foursquare Access Token
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [6]:
#Define location
address = 'Berlin'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

52.5170365 13.3888599


In [18]:
#Define search query
search_query = 'Museum'
radius = 2000
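
The `radius` is given in meters and limits the search to venues near the geocoded center of Berlin. As a rough sanity check, the sketch below computes the great-circle (haversine) distance between that center and the coordinates of the first venue in the results table further down. The function name `haversine_m` and the mean Earth radius of 6,371 km are choices made for this illustration, not part of the original analysis.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(lat1, lng1, lat2, lng2):
    """Return the great-circle distance between two points in meters."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


center = (52.5170365, 13.3888599)  # geocoded center of Berlin
venue = (52.521484, 13.393945)     # lat/lng of the first row in the results table

distance = haversine_m(*center, *venue)
print(round(distance), distance <= 2000)
```

The computed value is close to the `location.distance` of 603 reported for that venue, which is consistent with both the radius and the distance being expressed in meters.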


In [19]:
#Define corresponding url
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, latitude, longitude,ACCESS_TOKEN, VERSION, search_query, radius, LIMIT)
url

'https://api.foursquare.com/v2/venues/search?client_id=5GU4TM5OKKJNTW01ANMGESFQ2QI3K2AKQWCS3WKPU3BCRDJT&client_secret=BEM55QHRBTDSNZYMRHKUUX21CBZH5BODNYS4FVQDTOU4RJX0&ll=52.5170365,13.3888599&oauth_token=L30HL4MSA1A1EDYY2EFC3IU3UPU1SPZ3NDNXR2YAZQNNH23S&v=20180605&query=Museum&radius=2000&limit=100'
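
The URL above is assembled by hand with `str.format`. As a hedged alternative, the sketch below builds a similar query string with `urllib.parse.urlencode` from the standard library, which also escapes special characters in the search term. The credential values are placeholders, the helper name `build_search_url` is hypothetical, and the `oauth_token` parameter is left out for brevity.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.foursquare.com/v2/venues/search"


def build_search_url(client_id, client_secret, lat, lng, query, radius, limit,
                     version="20180605"):
    """Build a venue search URL; urlencode escapes every parameter value."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "ll": f"{lat},{lng}",
        "v": version,
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder credentials; real ones are better kept outside the notebook.
example_url = build_search_url(
    "YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET",
    52.5170365, 13.3888599, "Museum", 2000, 100,
)
print(example_url)
```

Keeping the parameters in a dictionary makes it easier to change a single value, such as the radius, without editing a long format string.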

In [81]:
#Examine results
results = requests.get(url).json()

# assign relevant part of JSON to venues
venues = results['response']['venues']


# transform venues into a dataframe
dataframe = json_normalize(venues)
# mark venues without a postal code with 0 so they can be dropped later
dataframe['location.postalCode'] = dataframe['location.postalCode'].fillna(0)
# shorten the coordinate column names
dataframe.rename(columns={'location.lat': 'lat', 'location.lng': 'lng'}, inplace = True)

dataframe




Unnamed: 0,id,name,categories,referralId,hasPerk,location.address,lat,lng,location.labeledLatLngs,location.distance,location.cc,location.city,location.state,location.country,location.formattedAddress,location.postalCode,location.crossStreet,location.neighborhood,venuePage.id
0,4ed2360702d5feaa1e6594e6,Antik- und Buchmarkt am Bode-Museum,"[{'id': '4bf58dd8d48988d1f7941735', 'name': 'F...",v-1625670575,False,Am Kupfergraben,52.521484,13.393945,"[{'label': 'display', 'lat': 52.52148422035696...",603,DE,Berlin,Berlin,Deutschland,"[Am Kupfergraben, Berlin]",0,,,
1,4adcda83f964a520aa4821e3,German Historical Museum (Deutsches Historisch...,"[{'id': '4bf58dd8d48988d190941735', 'name': 'H...",v-1625670575,False,Unter den Linden 2,52.517788,13.396948,"[{'label': 'display', 'lat': 52.5177881130353,...",554,DE,Berlin,Berlin,Deutschland,"[Unter den Linden 2, 10117 Berlin]",10117,,,
2,4adcda83f964a520b24821e3,Neues Museum,"[{'id': '4bf58dd8d48988d190941735', 'name': 'H...",v-1625670575,False,Bodestr. 1-3,52.520158,13.397838,"[{'label': 'display', 'lat': 52.5201576091737,...",700,DE,Berlin,Berlin,Deutschland,"[Bodestr. 1-3 (Museumsinsel), 10178 Berlin]",10178,Museumsinsel,Museumsinsel,
3,4adcda81f964a5207d4821e3,Museum für Kommunikation,"[{'id': '4bf58dd8d48988d191941735', 'name': 'S...",v-1625670575,False,Leipziger Str. 16,52.509822,13.387077,"[{'label': 'display', 'lat': 52.50982208432867...",812,DE,Berlin,Berlin,Deutschland,"[Leipziger Str. 16 (Mauerstr.), 10117 Berlin]",10117,Mauerstr.,,
4,4adcda81f964a5205d4821e3,Museum für Islamische Kunst,"[{'id': '4bf58dd8d48988d18f941735', 'name': 'A...",v-1625670575,False,Am Kupfergraben 5,52.520709,13.396697,"[{'label': 'display', 'lat': 52.52070890284828...",670,DE,Berlin,Berlin,Deutschland,"[Am Kupfergraben 5, 10117 Berlin]",10117,,,
5,4adcda81f964a520574821e3,Altes Museum,"[{'id': '4bf58dd8d48988d190941735', 'name': 'H...",v-1625670575,False,Am Lustgarten 1,52.519537,13.398803,"[{'label': 'display', 'lat': 52.51953741567019...",728,DE,Berlin,Berlin,Deutschland,"[Am Lustgarten 1, 10178 Berlin]",10178,,Museumsinsel,
6,4adcda81f964a5205b4821e3,Bode-Museum,"[{'id': '4bf58dd8d48988d18f941735', 'name': 'A...",v-1625670575,False,Am Kupfergraben,52.52173,13.395044,"[{'label': 'display', 'lat': 52.52173, 'lng': ...",669,DE,Berlin,Berlin,Deutschland,"[Am Kupfergraben, 10178 Berlin]",10178,,Museumsinsel,
7,4adcda80f964a5201c4821e3,Museum Island (Museumsinsel),"[{'id': '50aaa4314b90af0d42d5de10', 'name': 'I...",v-1625670575,False,Am Lustgarten,52.520296,13.398786,"[{'label': 'display', 'lat': 52.52029572914336...",763,DE,Berlin,Berlin,Deutschland,"[Am Lustgarten (Bodestr.), 10178 Berlin]",10178,Bodestr.,Museumsinsel,
8,4ae9c897f964a52062b621e3,DDR Museum,"[{'id': '4bf58dd8d48988d190941735', 'name': 'H...",v-1625670575,False,Karl-Liebknecht-Str. 1,52.519404,13.402239,"[{'label': 'display', 'lat': 52.51940384308041...",943,DE,Berlin,Berlin,Deutschland,"[Karl-Liebknecht-Str. 1, 10178 Berlin]",10178,,,48098990.0
9,4b13cbcbf964a5203d9923e3,Café im Bode-Museum,"[{'id': '4bf58dd8d48988d16d941735', 'name': 'C...",v-1625670575,False,Am Kupfergraben 1,52.522067,13.394092,"[{'label': 'display', 'lat': 52.52206730505164...",662,DE,Berlin,Berlin,Deutschland,"[Am Kupfergraben 1, 10785 Berlin]",10785,,,
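
A text query for "Museum" also matches venues that merely carry the word in their name, such as the café and the book market in the table above. One possible refinement, sketched here under assumptions, is to keep only venues whose category name contains "museum". The sample records mimic the structure of the Foursquare response, but the category names are hypothetical because the output above truncates them.

```python
# Hand-written sample in the shape of the API response.
# The category names are assumed values, not taken from the real output.
sample_venues = [
    {"name": "Neues Museum", "categories": [{"name": "History Museum"}]},
    {"name": "Café im Bode-Museum", "categories": [{"name": "Café"}]},
    {"name": "Bode-Museum", "categories": [{"name": "Art Museum"}]},
    {"name": "Venue without category", "categories": []},
]


def is_museum(venue):
    """True if any category name of the venue contains the word 'museum'."""
    return any(
        "museum" in category.get("name", "").lower()
        for category in venue.get("categories", [])
    )


museums_only = [venue["name"] for venue in sample_venues if is_museum(venue)]
print(museums_only)
```

Filtering on the category rather than on the venue name would remove the café from the counts, at the cost of depending on how consistently venues are categorized.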


In [65]:
dataframe.shape

(50, 19)

In [67]:
#Drop rows with no postal codes
indexNames = dataframe[dataframe['location.postalCode'] == 0].index
dataframe.drop(indexNames , inplace=True)

dataframe.shape

(43, 19)

In [87]:
#Visualize venues (museums) on map

venues_map = folium.Map(location=[latitude, longitude], zoom_start=14) # zoom level 14 roughly frames the 2 km search radius

for lat, lng, label in zip(dataframe['lat'], dataframe['lng'], dataframe['name']):
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        color='blue',
        popup=label,
        fill = True,
        fill_color='blue',
        fill_opacity=0.6
    ).add_to(venues_map)

# display map
venues_map

In [69]:
#Count postal codes
postalcode_counts = dataframe['location.postalCode'].value_counts()
postalcode_counts

10178    18
10117     9
10785     5
10179     4
10969     4
10115     2
10557     1
Name: location.postalCode, dtype: int64
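
The same kind of count can be obtained without first writing 0 into missing postal codes. The sketch below uses `collections.Counter` on a few hand-written records in the shape of the Foursquare response and skips venues that lack a `postalCode` key. The records are an illustrative subset, not the full result set.

```python
from collections import Counter

# Illustrative subset of venues; the first one has no postal code.
sample = [
    {"name": "Antik- und Buchmarkt am Bode-Museum",
     "location": {"address": "Am Kupfergraben"}},
    {"name": "Neues Museum", "location": {"postalCode": "10178"}},
    {"name": "Altes Museum", "location": {"postalCode": "10178"}},
    {"name": "Museum für Kommunikation", "location": {"postalCode": "10117"}},
]

# Count only venues that actually carry a postal code.
counts = Counter(
    venue["location"]["postalCode"]
    for venue in sample
    if "postalCode" in venue.get("location", {})
)

ranking = counts.most_common()  # sorted from most to least frequent
print(ranking)
```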

## Results and Discussion

The **postal code area 10178** has **18 museums**, twice as many as the second-ranked postal code area (10117, with 9).
This suggests a dense cluster of museums in that area. Several of the listed venues carry the neighborhood label Museumsinsel (Museum Island), so it **might be a bigger complex with several museums** in it.
The **type of museum is not displayed**, so **further analysis** of the museum types would be needed.
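
The comparison can be checked directly from the counts listed in the analysis. The short calculation below (variable names are chosen for illustration) derives the ratio between the two leading postal codes and the share of all counted venues that fall into 10178.

```python
# Counts per postal code as reported in the analysis output.
counts_by_code = {
    "10178": 18, "10117": 9, "10785": 5, "10179": 4,
    "10969": 4, "10115": 2, "10557": 1,
}

# Rank postal codes from highest to lowest count.
ranked = sorted(counts_by_code.items(), key=lambda item: item[1], reverse=True)
(top_code, top_n), (second_code, second_n) = ranked[:2]

total = sum(counts_by_code.values())
ratio = top_n / second_n   # how many times larger the top area is
share = top_n / total      # fraction of all counted venues in the top area

print(f"{top_code}: {top_n} of {total} venues ({share:.1%}), {ratio:.1f}x {second_code}")
# prints: 10178: 18 of 43 venues (41.9%), 2.0x 10117
```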

## Conclusion

Based on the data, visitors planning a **cultural visit to Berlin** are **recommended** to look for **accommodation within** the postal code area **10178**.