# IBM Applied Data Science Course on Coursera
## Capstone Final Project


### Proposal for a new vegetarian / vegan restaurant in Berlin, Germany.

* Build a dataframe of the districts of Berlin, Germany by scraping the data from a Wikipedia category page.
* Get the geographical coordinates of the neighborhoods.
* Obtain the venue data for the neighborhoods from Foursquare API.
* Explore and cluster the neighborhoods.
* Select the best cluster to open a new vegetarian / vegan restaurant.

## 1. Import libraries

In [1]:
import numpy as np # library to handle data in a vectorized manner

import pandas as pd # library for data analysis
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import json # library to handle JSON files

!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from urllib.request import urlopen
from bs4 import BeautifulSoup # BeautifulSoup library to parse HTML input

import requests # library to handle requests
from pandas.io.json import json_normalize # transform JSON file into a pandas dataframe

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

print('Libraries imported.')

Solving environment: done

# All requested packages already installed.

Solving environment: done

# All requested packages already installed.

Libraries imported.


## 2. Scrape data from Wikipedia page into a DataFrame

In [2]:
# Extract data from Wikipedia
wiki_url = 'https://de.wikipedia.org/wiki/Kategorie:Bezirk_von_Berlin'
#data = request.get(wiki_url).text

In [3]:
# parse data from the html into a beautifulsoup object
soup = BeautifulSoup(urlopen(wiki_url), 'html.parser') # name the parser explicitly to avoid the "no parser specified" warning

In [4]:
# create a list to store neighborhood data
distric_list = []


In [5]:
# append each district name from the category listing to distric_list
for row in soup.find_all('div', class_='mw-category')[0].findAll('li'):
    distric_list.append(row.text)

In [6]:
# create a new DataFrame from the list
ber_df = pd.DataFrame({'Districts of Berlin, DE': distric_list})

ber_df

Unnamed: 0,"Districts of Berlin, DE"
0,Bezirk Charlottenburg-Wilmersdorf
1,Bezirk Friedrichshain-Kreuzberg
2,Bezirk Lichtenberg
3,Bezirk Marzahn-Hellersdorf
4,Bezirk Mitte
5,Bezirk Neukölln
6,Bezirk Pankow
7,Bezirk Reinickendorf
8,Bezirk Spandau
9,Bezirk Steglitz-Zehlendorf


In [7]:
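# Remove the leading 'Bezirk ' ("district" in German) from each district's name.
# Anchoring the pattern and consuming the whitespace avoids leaving a leading blank in the names.
ber_df = ber_df.replace(r'^Bezirk\s+', '', regex = True)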
ber_df

Unnamed: 0,"Districts of Berlin, DE"
0,Charlottenburg-Wilmersdorf
1,Friedrichshain-Kreuzberg
2,Lichtenberg
3,Marzahn-Hellersdorf
4,Mitte
5,Neukölln
6,Pankow
7,Reinickendorf
8,Spandau
9,Steglitz-Zehlendorf


In [8]:
# print the shape (rows, columns) of the DataFrame
ber_df.shape

(12, 1)

## 3. Get the geographical coordinates of each district before building a map of Berlin

In [9]:
url = 'https://raw.githubusercontent.com/Tatirmp/Coursera_IBM_Data_Science/master/Berlin_Bezirk.csv'


In [10]:
geo_data_df = pd.read_csv(url)
geo_data_df

Unnamed: 0,Kurzname,Breitengrad,Längengrad
0,ChWi,52.507856,13.263952
1,FrKr,52.515306,13.461612
2,Lich,52.532161,13.511893
3,MaHe,52.522523,13.587663
4,Mitt,52.51769,13.402376
5,Neuk,52.48115,13.43535
6,Pank,52.597637,13.436374
7,Rein,52.604763,13.295287
8,Span,52.535788,13.197792
9,StZe,52.429205,13.229974


In [11]:
geo_data_df.shape

(12, 3)

In [12]:
ber_df['Latitude'] = geo_data_df['Breitengrad']
ber_df['Longitude'] = geo_data_df['Längengrad']

In [13]:
# Checking if the change works
print(ber_df.shape)
ber_df

(12, 3)


Unnamed: 0,"Districts of Berlin, DE",Latitude,Longitude
0,Charlottenburg-Wilmersdorf,52.507856,13.263952
1,Friedrichshain-Kreuzberg,52.515306,13.461612
2,Lichtenberg,52.532161,13.511893
3,Marzahn-Hellersdorf,52.522523,13.587663
4,Mitte,52.51769,13.402376
5,Neukölln,52.48115,13.43535
6,Pankow,52.597637,13.436374
7,Reinickendorf,52.604763,13.295287
8,Spandau,52.535788,13.197792
9,Steglitz-Zehlendorf,52.429205,13.229974
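The column assignment above pairs coordinates with districts purely by row position, so it only works while both tables are sorted identically. A minimal sketch of a key-based join on three rows copied from the tables above follows. The `short_to_name` mapping is an assumption inferred from the matching row order of the two tables, and `districts`, `coords` and `merged` are hypothetical names.

```python
import pandas as pd

# three rows copied from the tables above
districts = pd.DataFrame({'District': ['Mitte', 'Pankow', 'Spandau']})
coords = pd.DataFrame({
    'Kurzname': ['Span', 'Mitt', 'Pank'],  # deliberately in a different order
    'Breitengrad': [52.535788, 52.517690, 52.597637],
    'Längengrad': [13.197792, 13.402376, 13.436374],
})

# assumed mapping from short code to district name (inferred from row order above)
short_to_name = {'Mitt': 'Mitte', 'Pank': 'Pankow', 'Span': 'Spandau'}
coords['District'] = coords['Kurzname'].map(short_to_name)

# join on the district name; validate raises if a name appears more than once
merged = districts.merge(
    coords[['District', 'Breitengrad', 'Längengrad']],
    on='District', how='left', validate='one_to_one')
merged = merged.rename(columns={'Breitengrad': 'Latitude', 'Längengrad': 'Longitude'})

print(merged)
```

With a left join, a district without a match shows up as `NaN` in the coordinate columns instead of silently receiving another district's coordinates.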


## 4. Create a map of Berlin with neighborhoods superimposed on top

### Using the geopy library to get the latitude and longitude of Berlin, Germany

In [14]:
address = 'Berlin, Germany'

geolocator = Nominatim(user_agent="my-application")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Berlin are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Berlin are 52.5170365, 13.3888599.


### Create a map of Berlin to visualize the neighborhoods

In [16]:
# create a map of Berlin using latitude and longitude values
map_ber = folium.Map(location=[latitude, longitude], zoom_start = 11)

# add markers to the map
for lat, lgn, district in zip(ber_df['Latitude'], ber_df['Longitude'], ber_df['Districts of Berlin, DE']):
    label = '{}'.format(district)
    label = folium.Popup(label, parse_html = True)
    folium.CircleMarker([lat, lgn],
                       radius = 5,
                       popup = label,
                       color = 'blue',
                       fill = True,
                       fill_color = '#3186cc',
                       fill_opacity = 0.7).add_to(map_ber)
map_ber

In [18]:
# save the map as HTML file
map_ber.save('map_ber.html')

## 5. Use the Foursquare API to explore the neighborhoods

In [59]:
# define Foursquare Credentials and Version

CLIENT_ID = 'your Foursquare ID' # your Foursquare ID
CLIENT_SECRET = 'your Foursquare Secret' # your Foursquare Secret
VERSION = '20180604'

print('Your credentials:')
print('CLIENT_ID: ' + CLIENT_ID)
print('CLIENT_SECRET: ' + CLIENT_SECRET)

Your credentials:
CLIENT_ID: your Foursquare ID
CLIENT_SECRET: your Foursquare Secret


### Exploring the districts of Berlin, Germany. Let's get the top 100 venues within a radius of 2000 meters of each district's coordinates.

In [25]:
radius = 2000
LIMIT = 100

venues = []

for lat, long, neighborhood in zip(ber_df['Latitude'], ber_df['Longitude'], ber_df['Districts of Berlin, DE']):
    
    # create the API request URL
    url = "https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}".format(
        CLIENT_ID,
        CLIENT_SECRET,
        VERSION,
        lat,
        long,
        radius, 
        LIMIT)
    
    # make the GET request
    results = requests.get(url).json()["response"]['groups'][0]['items']
    
    # return only relevant information for each nearby venue
    for venue in results:
        venues.append((
            neighborhood,
            lat, 
            long, 
            venue['venue']['name'], 
            venue['venue']['location']['lat'], 
            venue['venue']['location']['lng'],  
            venue['venue']['categories'][0]['name']))
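The loop above indexes straight into the JSON response, so a failed request or a venue without a category would raise a `KeyError` or `IndexError`. A minimal sketch of a more defensive parsing step follows. `extract_venues` and the `sample` payload are hypothetical; the payload only mimics the keys the loop above reads.

```python
def extract_venues(payload, district, lat, lng):
    """Flatten one 'venues/explore' JSON payload into row tuples.

    Missing keys and venues without a category are skipped rather than raising.
    """
    groups = payload.get('response', {}).get('groups', [])
    items = groups[0].get('items', []) if groups else []
    rows = []
    for item in items:
        venue = item.get('venue', {})
        categories = venue.get('categories') or []
        location = venue.get('location', {})
        if not categories or 'lat' not in location or 'lng' not in location:
            continue  # skip incomplete records
        rows.append((district, lat, lng, venue.get('name'),
                     location['lat'], location['lng'], categories[0]['name']))
    return rows


# hypothetical payload with one complete venue and one without a category
sample = {'response': {'groups': [{'items': [
    {'venue': {'name': 'Rasas',
               'location': {'lat': 52.5121, 'lng': 13.264464},
               'categories': [{'name': 'Indian Restaurant'}]}},
    {'venue': {'name': 'No category',
               'location': {'lat': 52.5, 'lng': 13.2},
               'categories': []}},
]}]}}

rows = extract_venues(sample, 'Charlottenburg-Wilmersdorf', 52.507856, 13.263952)
print(rows)

# a payload without a 'response' key (e.g. an error reply) yields no rows
print(extract_venues({'meta': {'code': 429}}, 'Mitte', 52.51769, 13.402376))
```

Inside the loop, the result of `requests.get(url, timeout=10).json()` could be passed to this function and the returned rows added with `venues.extend(...)`.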

In [27]:
# convert the venues list into a new DataFrame
venues_df = pd.DataFrame(venues)

# define the column names
venues_df.columns = ['District', 'Latitude', 'Longitude', 'VenueName', 'VenueLatitude', 'VenueLongitude', 'VenueCategory']

print(venues_df.shape)
venues_df.head(15)

(816, 7)


Unnamed: 0,District,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
0,Charlottenburg-Wilmersdorf,52.507856,13.263952,Die Wühlmäuse,52.50883,13.270733,Comedy Club
1,Charlottenburg-Wilmersdorf,52.507856,13.263952,Rasas,52.5121,13.264464,Indian Restaurant
2,Charlottenburg-Wilmersdorf,52.507856,13.263952,Adik's Stehcafe,52.507889,13.258131,Café
3,Charlottenburg-Wilmersdorf,52.507856,13.263952,Block House,52.509393,13.270958,Steakhouse
4,Charlottenburg-Wilmersdorf,52.507856,13.263952,Drachenberg,52.502594,13.249834,Mountain
5,Charlottenburg-Wilmersdorf,52.507856,13.263952,Hotel Villa Kastania,52.51031,13.268223,Hotel
6,Charlottenburg-Wilmersdorf,52.507856,13.263952,Lindenwirtin,52.510335,13.271707,German Restaurant
7,Charlottenburg-Wilmersdorf,52.507856,13.263952,Piccolo Mondo,52.512355,13.267806,Italian Restaurant
8,Charlottenburg-Wilmersdorf,52.507856,13.263952,Café K,52.509789,13.255227,Café
9,Charlottenburg-Wilmersdorf,52.507856,13.263952,Westend Klause,52.516494,13.260109,Bar


### Let's check how many venues were returned for each district.

In [28]:
venues_df.groupby(["District"]).count()

Unnamed: 0_level_0,Latitude,Longitude,VenueName,VenueLatitude,VenueLongitude,VenueCategory
District,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Charlottenburg-Wilmersdorf,100,100,100,100,100,100
Friedrichshain-Kreuzberg,100,100,100,100,100,100
Lichtenberg,77,77,77,77,77,77
Marzahn-Hellersdorf,37,37,37,37,37,37
Mitte,100,100,100,100,100,100
Neukölln,100,100,100,100,100,100
Pankow,26,26,26,26,26,26
Reinickendorf,44,44,44,44,44,44
Spandau,82,82,82,82,82,82
Steglitz-Zehlendorf,56,56,56,56,56,56


### Let's find out how many unique categories can be curated from all the returned venues.

In [29]:
print('There are {} unique categories.'.format(len(venues_df['VenueCategory'].unique())))

There are 193 unique categories.


In [36]:
# print out the list of categories
venues_df['VenueCategory'].unique()[:50]

array(['Comedy Club', 'Indian Restaurant', 'Café', 'Steakhouse',
       'Mountain', 'Hotel', 'German Restaurant', 'Italian Restaurant',
       'Bar', 'Organic Grocery', 'Chinese Restaurant', 'Scenic Lookout',
       'Supermarket', 'Park', 'Vietnamese Restaurant', 'Art Museum',
       'Stadium', 'Pizza Place', 'Concert Hall', 'Soccer Stadium',
       'Garden', 'Asian Restaurant', 'Flower Shop', 'Lounge', 'Pool',
       'Bowling Alley', 'Building', 'Argentinian Restaurant',
       'Historic Site', 'Dessert Shop', 'Drugstore',
       'Fried Chicken Joint', 'American Restaurant', 'Falafel Restaurant',
       'Mexican Restaurant', 'Trattoria/Osteria', 'Plaza', 'Bakery',
       'Pet Store', 'Persian Restaurant', 'Japanese Restaurant',
       'Thai Restaurant', 'Playground', 'Deli / Bodega',
       'Filipino Restaurant', 'Doner Restaurant', 'History Museum',
       'Restaurant', 'Boarding House', 'Clothing Store'], dtype=object)

In [37]:
# check if the results contain "Vegetarian/Vegan Restaurant"
"Vegetarian / Vegan Restaurant" in venues_df['VenueCategory'].unique()

True

## 6. Analyze Each Neighborhood

In [40]:
# one hot encoding
ber_onehot = pd.get_dummies(venues_df[['VenueCategory']], prefix="", prefix_sep="")

# add neighborhood column back to dataframe
ber_onehot['Districts'] = venues_df['District'] 

# move neighborhood column to the first column
fixed_columns = [ber_onehot.columns[-1]] + list(ber_onehot.columns[:-1])
ber_onehot = ber_onehot[fixed_columns]

print(ber_onehot.shape)
ber_onehot.head()

(816, 194)


Unnamed: 0,Districts,Adult Boutique,African Restaurant,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Automotive Shop,Bagel Shop,Bakery,Bank,Bar,Bathing Area,Beach,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Boarding House,Boat Rental,Bookstore,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Stop,Cable Car,Cafeteria,Café,Canal,Castle,Caucasian Restaurant,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Concert Hall,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Discount Store,Dive Bar,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flower Shop,Food & Drink Shop,Food Service,Forest,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Go Kart Track,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Shop,Gym / Fitness Center,Harbor / Marina,Hardware Store,Historic Site,History Museum,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indoor Play Area,Intersection,Italian Restaurant,Japanese Restaurant,Kumpir Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Mountain,Movie Theater,Museum,Music Venue,Nail Salon,Nature Preserve,Neighborhood,Nightclub,Opera House,Optical Shop,Organic Grocery,Outdoor Sculpture,Paintball Field,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Pub,Racetrack,Ramen Restaurant,Real Estate Office,Record Shop,Rest Area,Restaurant,River,Roof Deck,Russian Restaurant,Sandwich Place,Sauna / Steam Room,Scenic Lookout,Seafood Restaurant,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Spa,Speakeasy,Sports Club,Squash Court,Stadium,Stationery Store,Steakhouse,Street Art,Supermarket,Sushi Restaurant,Syrian Restaurant,Tapas Restaurant,Taverna,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Trail,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Volleyball Court,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Charlottenburg-Wilmersdorf,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Charlottenburg-Wilmersdorf,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Charlottenburg-Wilmersdorf,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
3,Charlottenburg-Wilmersdorf,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,Charlottenburg-Wilmersdorf,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


### Next, let's group rows by district and take the mean of the frequency of occurrence of each category.

In [41]:
ber_grouped = ber_onehot.groupby(["Districts"]).mean().reset_index()

print(ber_grouped.shape)
ber_grouped

(12, 194)


Unnamed: 0,Districts,Adult Boutique,African Restaurant,American Restaurant,Argentinian Restaurant,Art Gallery,Art Museum,Asian Restaurant,Athletics & Sports,Automotive Shop,Bagel Shop,Bakery,Bank,Bar,Bathing Area,Beach,Beer Bar,Beer Garden,Beer Store,Big Box Store,Bike Rental / Bike Share,Bike Shop,Bistro,Boarding House,Boat Rental,Bookstore,Bowling Alley,Brazilian Restaurant,Breakfast Spot,Brewery,Building,Burger Joint,Burrito Place,Bus Stop,Cable Car,Cafeteria,Café,Canal,Castle,Caucasian Restaurant,Chinese Restaurant,Chocolate Shop,Church,Climbing Gym,Clothing Store,Cocktail Bar,Coffee Shop,Comedy Club,Concert Hall,Cosmetics Shop,Cupcake Shop,Deli / Bodega,Department Store,Dessert Shop,Discount Store,Dive Bar,Doner Restaurant,Donut Shop,Drugstore,Dumpling Restaurant,Electronics Store,Event Space,Falafel Restaurant,Farmers Market,Fast Food Restaurant,Filipino Restaurant,Flower Shop,Food & Drink Shop,Food Service,Forest,Fried Chicken Joint,Furniture / Home Store,Gaming Cafe,Garden,Gas Station,Gastropub,Gay Bar,General Entertainment,German Restaurant,Gift Shop,Go Kart Track,Gourmet Shop,Greek Restaurant,Grocery Store,Gun Shop,Gym / Fitness Center,Harbor / Marina,Hardware Store,Historic Site,History Museum,Hookah Bar,Hostel,Hot Dog Joint,Hotel,Ice Cream Shop,Indian Restaurant,Indie Movie Theater,Indie Theater,Indoor Play Area,Intersection,Italian Restaurant,Japanese Restaurant,Kumpir Restaurant,Lake,Light Rail Station,Liquor Store,Lounge,Market,Mediterranean Restaurant,Memorial Site,Men's Store,Metro Station,Mexican Restaurant,Middle Eastern Restaurant,Mobile Phone Shop,Modern European Restaurant,Monument / Landmark,Motorcycle Shop,Mountain,Movie Theater,Museum,Music Venue,Nail Salon,Nature Preserve,Neighborhood,Nightclub,Opera House,Optical Shop,Organic Grocery,Outdoor Sculpture,Paintball Field,Park,Pastry Shop,Performing Arts Venue,Persian Restaurant,Pet Store,Pharmacy,Pizza Place,Platform,Playground,Plaza,Poke Place,Pool,Pool Hall,Pub,Racetrack,Ramen Restaurant,Real Estate Office,Record Shop,Rest Area,Restaurant,River,Roof Deck,Russian Restaurant,Sandwich Place,Sauna / Steam Room,Scenic Lookout,Seafood Restaurant,Shopping Mall,Skating Rink,Snack Place,Soccer Field,Soccer Stadium,Soup Place,Spa,Speakeasy,Sports Club,Squash Court,Stadium,Stationery Store,Steakhouse,Street Art,Supermarket,Sushi Restaurant,Syrian Restaurant,Tapas Restaurant,Taverna,Tea Room,Tennis Court,Thai Restaurant,Theater,Theme Park Ride / Attraction,Trail,Tram Station,Trattoria/Osteria,Turkish Restaurant,Vegetarian / Vegan Restaurant,Video Store,Vietnamese Restaurant,Volleyball Court,Whisky Bar,Wine Bar,Wine Shop,Yoga Studio
0,Charlottenburg-Wilmersdorf,0.0,0.0,0.01,0.01,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.07,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.06,0.01,0.03,0.0,0.0,0.0,0.0,0.06,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.02,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.01,0.04,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0
1,Friedrichshain-Kreuzberg,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.01,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.02,0.0,0.01,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.12,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.04,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.05,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.05,0.0,0.01,0.0,0.0,0.0,0.02,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.03,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.08,0.0,0.01,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.0,0.0,0.01,0.01,0.01
2,Lichtenberg,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.025974,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.051948,0.0,0.0,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.012987,0.0,0.0,0.025974,0.0,0.0,0.012987,0.0,0.0,0.0,0.038961,0.0,0.0,0.012987,0.0,0.0,0.0,0.025974,0.0,0.025974,0.0,0.0,0.0,0.012987,0.012987,0.038961,0.0,0.012987,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.012987,0.012987,0.0,0.012987,0.012987,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.025974,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038961,0.0,0.0,0.0,0.0,0.0,0.012987,0.0,0.0,0.012987,0.0,0.142857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.116883,0.0,0.0,0.0,0.0,0.077922,0.0,0.0,0.0,0.0,0.0
3,Marzahn-Hellersdorf,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.054054,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.189189,0.027027,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.027027,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.216216,0.0,0.0,0.0,0.0,0.027027,0.0,0.0,0.0,0.027027,0.027027,0.0,0.027027,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Mitte,0.0,0.0,0.0,0.0,0.04,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.02,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.02,0.01,0.0,0.02,0.02,0.05,0.0,0.03,0.02,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.05,0.0,0.0,0.0,0.09,0.01,0.0,0.03,0.0,0.0,0.0,0.01,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.02,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.01,0.0,0.0
5,Neukölln,0.0,0.03,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.16,0.0,0.0,0.01,0.01,0.0,0.0,0.01,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.04,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.06,0.11,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02,0.0,0.0,0.0,0.01,0.0,0.0,0.02,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.02,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.04,0.01,0.0,0.0,0.03,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.01,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.01,0.0,0.01,0.0,0.0,0.0,0.01,0.0,0.0,0.0,0.0,0.0,0.01,0.04,0.0,0.0,0.0,0.01,0.01,0.02,0.01
6,Pankow,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.076923,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.038462,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.269231,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.115385,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,Reinickendorf,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.045455,0.0,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.045455,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.022727,0.0,0.0,0.022727,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.113636,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.022727,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
8,Spandau,0.0,0.0,0.012195,0.02439,0.0,0.0,0.012195,0.012195,0.012195,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.012195,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.012195,0.0,0.085366,0.0,0.012195,0.02439,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.012195,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.036585,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.012195,0.0,0.012195,0.012195,0.012195,0.02439,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.02439,0.02439,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.0,0.0,0.04878,0.0,0.0,0.02439,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.036585,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.012195,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.121951,0.012195,0.0,0.0,0.0,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.02439,0.0,0.0,0.0,0.012195,0.0,0.0,0.0,0.0,0.0
9,Steglitz-Zehlendorf,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.035714,0.017857,0.0,0.017857,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.107143,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.035714,0.0,0.0,0.017857,0.0,0.017857,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.017857,0.0,0.017857,0.0,0.0,0.0,0.107143,0.0,0.0,0.017857,0.035714,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.017857,0.0,0.0,0.0,0.035714,0.0,0.107143,0.017857,0.0,0.0,0.017857,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.017857
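Because each venue row holds exactly one 1 after one-hot encoding, the group mean of a category column equals that category's share of the district's venues. A minimal sketch on a toy frame follows; the districts `A` and `B` and their venues are made up for illustration.

```python
import pandas as pd

# toy venue list: 4 venues in district A, 2 in district B (made-up data)
toy = pd.DataFrame({
    'District': ['A', 'A', 'A', 'A', 'B', 'B'],
    'VenueCategory': ['Café', 'Café', 'Bar',
                      'Vegetarian / Vegan Restaurant', 'Bar', 'Hotel'],
})

# one column per category, 1 where the venue belongs to it
onehot = pd.get_dummies(toy['VenueCategory'], dtype=int)
onehot.insert(0, 'District', toy['District'])

# the mean of a 0/1 column is the fraction of venues in that category
shares = onehot.groupby('District').mean()
print(shares)
```

Each row of `shares` sums to 1, which is why a value such as 0.04 in the table above can be read as "4% of the venues returned for that district".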


In [42]:
len(ber_grouped[ber_grouped["Vegetarian / Vegan Restaurant"] > 0])

3

### Create a new DataFrame for Vegetarian / Vegan Restaurant data only.

In [43]:
ber_veg = ber_grouped[["Districts", "Vegetarian / Vegan Restaurant"]]

In [46]:
ber_veg

Unnamed: 0,Districts,Vegetarian / Vegan Restaurant
0,Charlottenburg-Wilmersdorf,0.0
1,Friedrichshain-Kreuzberg,0.04
2,Lichtenberg,0.0
3,Marzahn-Hellersdorf,0.0
4,Mitte,0.01
5,Neukölln,0.04
6,Pankow,0.0
7,Reinickendorf,0.0
8,Spandau,0.0
9,Steglitz-Zehlendorf,0.0
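The shares in this table are fractions of the venues returned per district, so they can be turned back into absolute counts by multiplying with the venue counts from the per-district count table earlier in the notebook. A small worked example for the three districts with a non-zero share (the dictionary names are hypothetical):

```python
# venues returned per district (from the per-district count table)
venue_counts = {'Friedrichshain-Kreuzberg': 100, 'Mitte': 100, 'Neukölln': 100}

# share of vegetarian / vegan venues (from the table above)
veg_share = {'Friedrichshain-Kreuzberg': 0.04, 'Mitte': 0.01, 'Neukölln': 0.04}

# share * number of venues = number of vegetarian / vegan venues found
veg_counts = {d: round(veg_share[d] * venue_counts[d]) for d in veg_share}
print(veg_counts)
```

The whole analysis therefore rests on nine vegetarian / vegan venues among the 816 returned, which is worth keeping in mind when reading the clusters.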


## 7. Cluster Neighborhoods

### Run k-means to cluster the neighborhoods in Berlin into 3 clusters.

In [48]:
# set number of clusters
kclusters = 3

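ber_clustering = ber_veg.drop(columns=["Districts"]) # keep only the numeric feature; the positional axis argument is no longer accepted by recent pandas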

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(ber_clustering)

# check cluster labels generated for each row in the dataframe
kmeans.labels_[0:10]

array([1, 0, 1, 1, 2, 0, 1, 1, 1, 1], dtype=int32)
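The label numbers k-means assigns are arbitrary: another seed or scikit-learn version may number the same groups differently. A minimal sketch of one way to make the numbering meaningful is to rank the clusters by their centroid, so that 0 is the lowest share and 2 the highest. It uses the ten shares listed in the table above; `shares`, `rank_of_label` and `ordered_labels` are hypothetical names, and `n_init=10` is set explicitly as an assumption about the intended number of restarts.

```python
import numpy as np
from sklearn.cluster import KMeans

# vegetarian / vegan shares of the ten districts listed above
shares = np.array([0.0, 0.04, 0.0, 0.0, 0.01,
                   0.04, 0.0, 0.0, 0.0, 0.0]).reshape(-1, 1)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(shares)

# order the raw labels by centroid value, ascending
order = np.argsort(kmeans.cluster_centers_.ravel())

# rank_of_label[raw_label] -> rank of that cluster (0 = lowest share)
rank_of_label = np.empty_like(order)
rank_of_label[order] = np.arange(len(order))

ordered_labels = rank_of_label[kmeans.labels_]
print(ordered_labels)
```

With only three distinct values in the feature, three clusters simply separate the districts by value, so the clustering here is equivalent to grouping by share.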

In [49]:
# create a new dataframe that holds the vegetarian / vegan share and the cluster label for each district
ber_merged = ber_veg.copy()

# add clustering labels
ber_merged["Cluster Labels"] = kmeans.labels_

In [50]:
ber_merged.rename(columns={"Districts": "District"}, inplace=True)
ber_merged.head()

Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels
0,Charlottenburg-Wilmersdorf,0.0,1
1,Friedrichshain-Kreuzberg,0.04,0
2,Lichtenberg,0.0,1
3,Marzahn-Hellersdorf,0.0,1
4,Mitte,0.01,2


In [52]:
# join ber_merged with ber_df to add latitude/longitude for each district
ber_merged = ber_merged.join(ber_df.set_index("Districts of Berlin, DE"), on="District")

print(ber_merged.shape)
ber_merged.head() # check the last columns!

(12, 5)


Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels,Latitude,Longitude
0,Charlottenburg-Wilmersdorf,0.0,1,52.507856,13.263952
1,Friedrichshain-Kreuzberg,0.04,0,52.515306,13.461612
2,Lichtenberg,0.0,1,52.532161,13.511893
3,Marzahn-Hellersdorf,0.0,1,52.522523,13.587663
4,Mitte,0.01,2,52.51769,13.402376


In [53]:
# sort the results by Cluster Labels
print(ber_merged.shape)
ber_merged.sort_values(["Cluster Labels"], inplace=True)
ber_merged

(12, 5)


Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels,Latitude,Longitude
1,Friedrichshain-Kreuzberg,0.04,0,52.515306,13.461612
5,Neukölln,0.04,0,52.48115,13.43535
0,Charlottenburg-Wilmersdorf,0.0,1,52.507856,13.263952
2,Lichtenberg,0.0,1,52.532161,13.511893
3,Marzahn-Hellersdorf,0.0,1,52.522523,13.587663
6,Pankow,0.0,1,52.597637,13.436374
7,Reinickendorf,0.0,1,52.604763,13.295287
8,Spandau,0.0,1,52.535788,13.197792
9,Steglitz-Zehlendorf,0.0,1,52.429205,13.229974
10,Tempelhof-Schöneberg,0.0,1,52.440603,13.373703


### Finally, let's visualize the resulting clusters.

In [54]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=11)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i+x+(i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(ber_merged['Latitude'], ber_merged['Longitude'], ber_merged['District'], ber_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' - Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters

In [55]:
# save the map as HTML file
map_clusters.save('map_clusters.html')

## 8. Examine Clusters

### Cluster 0

In [56]:
ber_merged.loc[ber_merged['Cluster Labels'] == 0]

Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels,Latitude,Longitude
1,Friedrichshain-Kreuzberg,0.04,0,52.515306,13.461612
5,Neukölln,0.04,0,52.48115,13.43535


### Cluster 1

In [57]:
ber_merged.loc[ber_merged['Cluster Labels'] == 1]

Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels,Latitude,Longitude
0,Charlottenburg-Wilmersdorf,0.0,1,52.507856,13.263952
2,Lichtenberg,0.0,1,52.532161,13.511893
3,Marzahn-Hellersdorf,0.0,1,52.522523,13.587663
6,Pankow,0.0,1,52.597637,13.436374
7,Reinickendorf,0.0,1,52.604763,13.295287
8,Spandau,0.0,1,52.535788,13.197792
9,Steglitz-Zehlendorf,0.0,1,52.429205,13.229974
10,Tempelhof-Schöneberg,0.0,1,52.440603,13.373703
11,Treptow-Köpenick,0.0,1,52.417893,13.600185


### Cluster 2

In [58]:
ber_merged.loc[ber_merged['Cluster Labels'] == 2]

Unnamed: 0,District,Vegetarian / Vegan Restaurant,Cluster Labels,Latitude,Longitude
4,Mitte,0.01,2,52.51769,13.402376


## 9. Conclusion

#### The data collected above suggest that vegetarian / vegan restaurants are still rare among the venues Foursquare returned for Berlin. Most of those few restaurants are concentrated in Friedrichshain-Kreuzberg and Neukölln, which form cluster 0. There is also a vegetarian / vegan restaurant in Mitte, the central area of Berlin, which forms cluster 2.
#### Cluster 1, which contains the largest number of districts, has no vegetarian / vegan restaurants listed at all. These districts therefore represent a promising opportunity to open new restaurants for people who do not want to eat meat, since there is almost no competition from existing restaurants of this kind in the area.
#### This project therefore recommends that property developers use these findings to open new vegetarian / vegan restaurants in the districts of cluster 1, where there is almost no competition. New vegetarian / vegan restaurants could also be opened in cluster 2 (Mitte), where competition is still low.