# Description of the project

## Introduction / Business Problem

Beer is a major part of German culture. In 2012, Germany ranked third in Europe in per-capita beer consumption, behind the Czech Republic and Austria. Germany is also famous for its beer festivals (e.g. Oktoberfest in München or Cannstatter Volksfest in Stuttgart). Having a glass of beer at a bar in the evening is therefore likely to be a popular activity in Germany.

<div class="alert alert-block alert-success">
<b>Goal:</b> In this project, I investigate the bar business in Frankfurt am Main and identify the top 5 districts where opening a new bar may be worthwhile.
</div>

Frankfurt is the most populous city in the German state of Hesse. It is a global hub for commerce, culture, education, tourism and transportation, and is rated as an "alpha world city" by GaWC. <br>Sounds like a good candidate for the research!

The results of this project may be of interest to the hospitality industry.

## Data

For this research we need the following data:
1. Frankfurt districts (Stadtteile)
2. Their longitude and latitude
3. Their geometry (optional)
4. Information about the population density for each district
5. Information about the bars in Frankfurt (number, longitude, latitude)

I will use the following services to extract the data:

1. Wikipedia is parsed to get the Frankfurt districts together with their area and population, which are used to calculate the population density (https://de.wikipedia.org/wiki/Liste_der_Stadtteile_von_Frankfurt_am_Main).
2. Python Geocoder is used to get the latitude and longitude of each district. These coordinates are later used in the Foursquare API search/explore query to find the bars in each district.
3. The geometry of the districts is downloaded as a .geojson file (https://offenedaten.frankfurt.de/dataset/85b38876-729c-4a78-910c-a52d5c6df8d2/resource/84dff094-ab75-431f-8c64-39606672f1da/download/ffmstadtteilewahlen.geojson).
4. The Foursquare search/explore query API is used to find bars and their coordinates for each district, with categoryId='4bf58dd8d48988d116941735' ('bars').
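
Two of the steps above lend themselves to a short sketch: deriving the population density from area and population, and assembling the Foursquare bar query for one district. This is a minimal illustration, not the notebook's actual code. The function names `population_density` and `bar_search_params`, the legacy v2 `venues/search` endpoint, the default radius, limit and version date, and the sample numbers are all assumptions; only the category ID comes from the text above. No request is sent.

```python
from urllib.parse import urlencode

# Assumption: legacy Foursquare v2 search endpoint; check the current API docs.
FOURSQUARE_SEARCH_URL = "https://api.foursquare.com/v2/venues/search"
BAR_CATEGORY_ID = "4bf58dd8d48988d116941735"  # 'bars', as given in the text


def population_density(population, area_km2):
    """Return inhabitants per square kilometre."""
    if area_km2 <= 0:
        raise ValueError("area must be positive")
    return population / area_km2


def bar_search_params(lat, lng, client_id, client_secret,
                      radius_m=1000, limit=50, version="20200101"):
    """Build the query parameters for one district's bar search.

    radius_m, limit and version are illustrative defaults (assumptions).
    """
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,                 # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",         # district centre from the geocoder
        "categoryId": BAR_CATEGORY_ID,
        "radius": radius_m,
        "limit": limit,
    }


# Placeholder values only, not real Frankfurt figures or coordinates.
density = population_density(population=30_000, area_km2=2.5)
params = bar_search_params(50.0, 8.5, "YOUR_ID", "YOUR_SECRET")
url = f"{FOURSQUARE_SEARCH_URL}?{urlencode(params)}"
```

Once real credentials are filled in, `params` could be passed to `requests.get(FOURSQUARE_SEARCH_URL, params=params)` for each district in turn.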

<div class="alert alert-block alert-success">
<b>Plan:</b> Cluster the Frankfurt districts by number of bars and population density. Based on the clustering, identify candidate districts for new bars.
</div>
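
To make the plan concrete, the sketch below clusters a handful of made-up districts on two features, number of bars and population density. It uses a small dependency-free version of Lloyd's k-means as a stand-in for `sklearn.cluster.KMeans`, which the notebook imports. The district names, the figures, the choice of three clusters and the helper names `standardize` and `kmeans` are all assumptions for illustration.

```python
import math


def standardize(rows):
    """Scale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)]
            for row in rows]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids.

    sklearn's KMeans uses k-means++ seeding and several restarts instead.
    """
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # no change, so the algorithm has converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical districts: (number of bars, inhabitants per km^2).
districts = {
    "District A": (40, 9000),
    "District C": (3, 9500),
    "District D": (2, 1000),
    "District B": (35, 8500),
    "District F": (5, 8800),
    "District E": (4, 1200),
}
labels = kmeans(standardize(list(districts.values())), k=3)
clusters = {}
for name, label in zip(districts, labels):
    clusters.setdefault(label, []).append(name)
```

In this toy data, the cluster that contains the dense districts with few bars (here "District C" and "District F") is the kind of group the plan would flag as candidates. Standardizing first keeps the density column, which has much larger raw values, from dominating the distance.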

### Imports

In [None]:
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
from pandas import json_normalize # pandas >= 1.0; pandas.io.json.json_normalize is deprecated
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from bs4 import BeautifulSoup
import matplotlib.colors as colors
import matplotlib.cm as cm
import folium # map rendering library
from geopy.geocoders import Nominatim
import requests
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.cluster import KMeans