# Capstone Project - The Battle of Neighborhoods (Week 1)
## Analyzing Geospatial Data using Foursquare and Python

__created by MaroDataScience__

## Introduction

### Problem Description

This project addresses a problem that all visionaries are confronted with: __finding the right place to establish your company__.

As I am starting my career in data science and machine learning, I wondered where to locate a __Data Science Start-Up__ near my residence. The target city is __Frankfurt, Germany__. But what should one expect of such a location? My idea is to look at existing, well-rated data science companies in Germany and analyze their locations, so that they can be compared with the neighborhoods of Frankfurt.

### Business Problem

The challenging part of this project is to generate features from the location data of data science companies and to use those features to cluster the neighborhoods of Frankfurt, in order to answer the question:
#### __Where to place my Data Science Start-Up in Frankfurt, Germany to get the best environment for workmates and customers?__

### Addressed Audience
This project covers several important techniques for collecting environmental data about company locations and applying that data to a selection of locations in a destination city. In short, it should be of interest to future entrepreneurs who are considering starting a business in a particular city.

## Data Description

In [1]:
import pandas as pd 
import numpy as np
import folium
# !pip install pgeocode
import pgeocode

In [2]:
# coordinates used later to center the map of Frankfurt
latitude = 50.110924
longitude = 8.682127

In [3]:
frankfurt_data = pd.read_csv('frankfurt_parts.csv')

In [4]:
frankfurt_data.head()

Unnamed: 0,Stadtteil,Postleitzahl
0,Altstadt,"60311, 60313"
1,Bahnhofsviertel,60329
2,Bergen-Enkheim,"60388, 60389"
3,Berkersheim,60435
4,Bockenheim,"60325, 60431, 60486, 60487"


In [5]:
# split the comma-separated postal codes and give each code its own row
frankfurt_df = frankfurt_data.assign(Postleitzahl=frankfurt_data['Postleitzahl'].str.split(',')).explode('Postleitzahl').reset_index(drop=True)

In [37]:
print("shape before grouping zips: {}".format(frankfurt_df.shape))
# trim whitespaces
frankfurt_df['Postleitzahl'] = frankfurt_df['Postleitzahl'].str.strip()
fr_df = frankfurt_df.groupby('Postleitzahl')['Stadtteil'].apply(lambda x: ", ".join(x)).to_frame().reset_index()
print("shape after grouping zips: {}".format(fr_df.shape))

shape before grouping zips: (116, 2)
shape after grouping zips: (42, 2)
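
The reshaping in the cells above (split, explode, strip, group) can be hard to follow on the full data set, so here is a minimal sketch on a toy frame. The three rows are an illustrative subset consistent with the outputs shown in this notebook, not the full CSV, and the names `toy`, `long_form` and `by_zip` are made up for this example.

```python
import pandas as pd

# Toy stand-in for frankfurt_parts.csv (illustrative subset, not the real file)
toy = pd.DataFrame({
    "Stadtteil": ["Altstadt", "Bahnhofsviertel", "Innenstadt"],
    "Postleitzahl": ["60311, 60313", "60329", "60310, 60311, 60312, 60313"],
})

# 1) split the comma-separated string into a list, 2) one row per list element
long_form = (
    toy.assign(Postleitzahl=toy["Postleitzahl"].str.split(","))
       .explode("Postleitzahl")
       .reset_index(drop=True)
)

# 3) strip the blanks left over from ", " so that " 60311" and "60311" match
long_form["Postleitzahl"] = long_form["Postleitzahl"].str.strip()

# 4) one row per postal code, with all districts sharing that code joined
by_zip = (
    long_form.groupby("Postleitzahl")["Stadtteil"]
             .agg(", ".join)
             .reset_index()
)

print(long_form.shape)
print(by_zip)
```

Without the strip step, a code such as `60311` would appear twice as a group key (once with and once without a leading blank), which is why the whitespace is trimmed before grouping.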


In [40]:
fr_df.head()

Unnamed: 0,Postleitzahl,Stadtteil
0,60306,Westend-Süd
1,60308,Westend-Süd
2,60310,Innenstadt
3,60311,"Altstadt, Innenstadt"
4,60312,Innenstadt


In [41]:
# look up coordinates for all German ('de') postal codes in one call
zipcoder = pgeocode.Nominatim('de')
fr_lat_lng = zipcoder.query_postal_code(fr_df['Postleitzahl'].values)

In [43]:
fr_lat_lng.head()

Unnamed: 0,postal_code,country_code,place_name,state_name,state_code,county_name,county_code,community_name,community_code,latitude,longitude,accuracy
0,60306,DE,Frankfurt am Main,Hessen,HE,Regierungsbezirk Darmstadt,64.0,"Frankfurt am Main, Stadt",6412.0,50.1159,8.6702,6.0
1,60308,DE,Frankfurt am Main,Hessen,HE,Regierungsbezirk Darmstadt,64.0,"Frankfurt am Main, Stadt",6412.0,50.1125,8.6529,6.0
2,60310,DE,Frankfurt am Main,Hessen,HE,Regierungsbezirk Darmstadt,64.0,"Frankfurt am Main, Stadt",6412.0,50.1107,8.673,6.0
3,60311,DE,Frankfurt am Main,Hessen,HE,Regierungsbezirk Darmstadt,64.0,"Frankfurt am Main, Stadt",6412.0,50.1112,8.6831,6.0
4,60312,,,,,,,,,,,


In [47]:
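# attach the coordinates column-wise; this relies on both frames having the same row order and index
frankframe = pd.concat([fr_df, fr_lat_lng[['latitude', 'longitude']]], axis=1)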

In [49]:
# drop postal codes for which no coordinates were found (e.g. 60312)
frankframe.dropna(axis=0, inplace=True)
frankframe.head()

Unnamed: 0,Postleitzahl,Stadtteil,latitude,longitude
0,60306,Westend-Süd,50.1159,8.6702
1,60308,Westend-Süd,50.1125,8.6529
2,60310,Innenstadt,50.1107,8.673
3,60311,"Altstadt, Innenstadt",50.1112,8.6831
5,60313,"Altstadt, Innenstadt",50.1153,8.6823
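
Before dropping rows, it can be worth checking which postal codes came back without coordinates. The following is only a sketch under assumptions: the two toy frames copy the first rows shown above, the names `districts`, `coords`, `merged`, `missing` and `clean` are hypothetical, and joining on the postal code with `merge` is an alternative to the positional `pd.concat` used in this notebook, not what the notebook itself does.

```python
import numpy as np
import pandas as pd

# Toy frames mirroring the first rows shown above; 60312 has no coordinates
districts = pd.DataFrame({
    "Postleitzahl": ["60306", "60308", "60310", "60311", "60312"],
    "Stadtteil": ["Westend-Süd", "Westend-Süd", "Innenstadt",
                  "Altstadt, Innenstadt", "Innenstadt"],
})
coords = pd.DataFrame({
    "postal_code": ["60306", "60308", "60310", "60311", "60312"],
    "latitude": [50.1159, 50.1125, 50.1107, 50.1112, np.nan],
    "longitude": [8.6702, 8.6529, 8.6730, 8.6831, np.nan],
})

# Join on the postal code itself instead of relying on identical row order
merged = (
    districts.merge(coords, left_on="Postleitzahl",
                    right_on="postal_code", how="left")
             .drop(columns="postal_code")
)

# List the codes that could not be geocoded before they are removed
missing = merged.loc[merged["latitude"].isna(), "Postleitzahl"].tolist()
print("postal codes without coordinates:", missing)

# Keep only rows with a complete coordinate pair
clean = merged.dropna(subset=["latitude", "longitude"]).reset_index(drop=True)
print(clean)
```

Restricting `dropna` to the two coordinate columns avoids discarding rows because of unrelated missing values, and printing `missing` first makes it explicit which districts disappear from the map.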


In [50]:
# create map of frankfurt using latitude and longitude values
map_frankfurt = folium.Map(location=[latitude, longitude], zoom_start=12)

# add markers to map
for lat, lng, borough in zip(frankframe['latitude'], frankframe['longitude'], frankframe['Stadtteil']):
    label = '{}'.format(borough)
    label = folium.Popup(label, parse_html=True)
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7).add_to(map_frankfurt)
    
map_frankfurt

## How the Data Is Used to Solve the Problem