>>> ## IBM Applied Data Science Capstone Final Project:
>> # Exploring the Services Available Near and the Quality of Playgrounds in Germany
#### This project is designed to satisfy the IBM Applied Data Science Capstone final project requirements. However, the project is also intended to be useful.

# Table of contents
1. [Introduction](#introduction)
2. [Part I: Data preparation](#part1)
    1. [Part IA: Get a list of playgrounds in a city of interest](#part1A)
    2. [Part IB: Get detailed playground information for each location](#part1B)
    3. [Part IC: Visualizing the playground dataset](#part1C)
    4. [Part ID: Adding Foursquare data](#part1D)
3. [Part II: Methodology and results](#part2)
    1. [Part IIA: Exploring the top five venue types for each playground](#part2A)
    2. [Part IIB: Clustering and mapping based on venues](#part2B)
    3. [Part IIC: Clustering and mapping based on playground equipment](#bonus1)
    4. [Part IID: Finding the playgrounds that are near fast food restaurants, ice cream shops, etc.](#bonus2)
4. [Discussion and concluding remarks](#discussionandconclusion)

# Introduction <a name="introduction"></a>

### Description of the problem/background <a name="description"></a>
Before the series of coronavirus lockdowns that we've had in Germany, my spouse and I generally did our shopping and errands while our children were at school. We could then take family trips to the nearby playgrounds in the afternoon, sometimes several times a week. 
<br /> 
<br />
Now in coronavirus times, things are a bit different. For weeks at a time, our children are home ALL THE TIME in lockdown with us. So, doing our shopping means one of us stays with the children while the other shops. One option is of course to stay home with them, but they're also quite bored with being in lockdown and some outside time is great for them anyway. So, now that the lockdowns are less intense (playgrounds are open, but schools and daycares not always), we like to combine shopping and playground trips. This generally involves one of us going into the store while the other takes the children to a playground.
<br />
<br />
To aid the combined shopping-playground process, this project combines playground and commercial venue data. Using a crowd-sourced playground database and the Foursquare API, we can check which playgrounds are near which sorts of shops. Do we need to go to a variety of stores to meet our shopping requirements? There's a cluster of playgrounds for that. Do we need a set of playgrounds near supermarkets and such? Yep, we can find those. How about playgrounds with extensive equipment or playgrounds that are away from the shops so the children can really go crazy? Yes, we can identify those too. So, there are a couple of different problems being discussed here. The primary one is combining shopping and playground trips into the village. The other is identifying playgrounds that fit specific playing needs: playgrounds with a lot of equipment for an extended adventure versus more limited ones for shorter trips. I should be able to address both sorts of information requirements once I collect and prepare the data.
<br />
<br />

### Data plan<a name="data"></a>
The plan is to use HTML web scraping to retrieve a list of playgrounds and their characteristics from the crowd-sourced website 'spielplatznet' (https://spielplatznet.de/spielplaetze). This site allows a user to search for a city in Germany and then returns a list of playgrounds in the vicinity. It relies on playground users entering the data, so not all areas of the country are well represented. However, the area where I will conduct the analysis, the village of Wedel in the state of Schleswig-Holstein (near Hamburg), has pretty good data. A plus is that I know many of the locations well and so can confirm whether the data is complete or missing.
<br />
<br />
The second substantial data source is the Foursquare API. I will use it to retrieve information on venues near each playground. I can then classify the playgrounds based on what's nearby for the purpose of combining shopping/errands and playground trips to the village. I'll primarily use the Foursquare data in a k-means clustering process, but also search through it for particular types of venues. These will include particular stores, store types, or stores with keywords in their titles such as 'ice cream' ('eis' in German).
<br />
<br />
I will also use the Nominatim geocoder from geopy to look up the village's geocoordinates. This is probably a little excessive, as I could simply take the mean of the playground coordinates.
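
The alternative mentioned here, averaging the playground coordinates, can be sketched as follows. The helper name `mean_center` is hypothetical, and the two coordinate pairs are taken from the playground data shown in this notebook. A plain arithmetic mean is only a reasonable approximation because the playgrounds lie within a few kilometers of each other.

```python
from statistics import mean

def mean_center(coords):
    """Return the (latitude, longitude) mean of an iterable of (lat, lon) pairs."""
    coords = list(coords)
    if not coords:
        raise ValueError("need at least one coordinate pair")
    return mean(lat for lat, _ in coords), mean(lon for _, lon in coords)

# Two playgrounds from the Wedel dataset, used purely as sample input:
playground_coords = [
    (53.5926308917772, 9.73169803619385),  # Waldspielplatz Moorwegsiedlung
    (53.5743968895724, 9.68239367008209),  # Hamburger Yachthafen
]
center_lat, center_lon = mean_center(playground_coords)
print(round(center_lat, 4), round(center_lon, 4))
```

With the full list of playgrounds the result would serve as a starting location for the map without any external lookup.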

### Data examples<a name="dataexample"></a>
To aid in planning, I have gathered data from the Spielplatznet and Foursquare websites to make example rows of the dataframes that I will be developing:

####  From the playground information website Spielplatznet:

In [2]:
#The playground data includes identifying information as well as a short description (in German) and I will scrape
#information on the number and types of playground equipment available.
import pandas as pd
pd.set_option('display.max_columns', None)
df_columns=('playground', 'latitude', 'longitude', 'description',
       'rating', 'water feature', 'sandpit', 'cable car', 'playhouse',
       'tree house', 'slide', 'swing', 'climbing features', 'sledding hill',
       'football field', 'seesaw', 'basketball', 'nest swing', 'total equipment')
df=pd.DataFrame([['Spielplatz Waldspielplatz Moorwegsiedlung Wedel',53.5926308917772,9.73169803619385,
    'Großer Spielplatz im Wald. Viel Wiese.',5,0,0,2,0,0,0,1,1,0,0,0,0,0,4]],columns=df_columns)
df

Unnamed: 0,playground,latitude,longitude,description,rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,total equipment
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,9.731698,Großer Spielplatz im Wald. Viel Wiese.,5,0,0,2,0,0,0,1,1,0,0,0,0,0,4


#### From the Foursquare website:

In [3]:
#This is an example of the data when it has been grouped by playground and mean-normalized for use in 
#the k-means clustering algorithm.
df2_columns=('Playground','Asian Restaurant','Auto Garage','Bakery','Beach','Beach Bar','Boat Rental',
            'Boat or Ferry','Bookstore','Bus Stop','Café','Clothing Store','College Gym','Construction & Landscaping',
            'Drugstore','Electronics Store','Farmers Market','Fast Food Restaurant','Flea Market','Food & Drink Shop',
            'French Restaurant','Furniture / Home Store','Garden','Garden Center','German Restaurant','Gym',
            'Gym / Fitness Center','Harbor / Marina','Hotel','Insurance Office','Italian Restaurant',
            'Light Rail Station','Liquor Store','Mexican Restaurant','Museum','Nightclub','Optical Shop','Pet Store',
            'Pier','Plaza','Pool','Pub','Residential Building (Apartment / Condo)','Restaurant','Sandwich Place',
            'Sculpture Garden','Seafood Restaurant','Shopping Mall','Soccer Field','Spa','Steakhouse','Supermarket',
            'Taverna','Tea Room','Thai Restaurant','Theater','Trail','Trattoria/Osteria','Turkish Restaurant')
df2=pd.DataFrame([['Spielplatz Croningstraße Wedel',0.0625,0.0625,0,0,0,0,0,0,0,0,0,0,0,0,0,
                  0.0625,0,0.125,0,0,0.0625,0.0625,0,0,0,0,0.0625,0,0,0,0,0,0,0,0,0.0625,0,
                  0.0625,0,0,0,0,0,0.0625,0.0625,0,0,0,0,0,0,0.187500,0.062500,0,0,0,0,0]],columns=df2_columns)
df2

Unnamed: 0,Playground,Asian Restaurant,Auto Garage,Bakery,Beach,Beach Bar,Boat Rental,Boat or Ferry,Bookstore,Bus Stop,Café,Clothing Store,College Gym,Construction & Landscaping,Drugstore,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,French Restaurant,Furniture / Home Store,Garden,Garden Center,German Restaurant,Gym,Gym / Fitness Center,Harbor / Marina,Hotel,Insurance Office,Italian Restaurant,Light Rail Station,Liquor Store,Mexican Restaurant,Museum,Nightclub,Optical Shop,Pet Store,Pier,Plaza,Pool,Pub,Residential Building (Apartment / Condo),Restaurant,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shopping Mall,Soccer Field,Spa,Steakhouse,Supermarket,Taverna,Tea Room,Thai Restaurant,Theater,Trail,Trattoria/Osteria,Turkish Restaurant
0,Spielplatz Croningstraße Wedel,0.0625,0.0625,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0625,0,0.125,0,0,0.0625,0.0625,0,0,0,0,0.0625,0,0,0,0,0,0,0,0,0.0625,0,0.0625,0,0,0,0,0,0.0625,0.0625,0,0,0,0,0,0,0.1875,0.0625,0,0,0,0,0
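
The fractional values in the row above are what a per-playground frequency table looks like: each entry is the share of that playground's nearby venues that fall into a category. Values such as 0.0625, 0.125, and 0.1875 are what one would see with 16 venues in total (1/16, 2/16, 3/16), although that count is an inference from the numbers rather than something the output states. The sketch below reproduces the calculation with the standard library only. The venue list is invented for illustration, and `category_shares` is a hypothetical helper, not a function from this notebook.

```python
from collections import Counter

def category_shares(categories):
    """Map each venue category to its share of all venues in the list."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}

# Invented sample: 16 venues near one playground.
venues = (
    ["Supermarket"] * 3
    + ["Food & Drink Shop"] * 2
    + ["Asian Restaurant", "Auto Garage", "Farmers Market", "Garden",
       "Garden Center", "Harbor / Marina", "Pet Store", "Plaza",
       "Sandwich Place", "Sculpture Garden", "Taverna"]
)
shares = category_shares(venues)
print(shares["Supermarket"], shares["Food & Drink Shop"], shares["Plaza"])
```

Because every row sums to one, playgrounds with many and few nearby venues become comparable before clustering.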


### Analysis plan<a name="setup"></a>
The general plan follows:
- Retrieve a list of the playgrounds in the vicinity of a German city.
- Use that list to then look up each playground's detailed information.
- Use Foursquare's API to then find which venues are nearby and add them to the dataset.
- Find the commercial characteristics of each playground's neighborhood.
- Cluster the playgrounds based on their commercial surroundings.
- Also cluster the playgrounds based on the equipment available.
- Finally, make a few lists of playgrounds with kid-friendly food and ice cream nearby and certain playground features.

#### Import the relevant libraries:

In [4]:
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests
import re
import folium
from folium import plugins
from folium.plugins import HeatMap
from geopy.geocoders import Nominatim
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.cm as cm
import matplotlib.colors as colors
from IPython.display import display
pd.set_option('display.max_columns', None)

# Part I: Data preparation: <a name="part1"></a>

### Part IA: Get a list of playgrounds in a city of interest <a name="part1A"></a>
Save as a list of URLs linking to detailed playground information

#### Set up the search URL by specifying the city where we want to look for playgrounds:

In [5]:
'''
Note this notebook was designed to query the playgrounds in the vicinities of Elmshorn and Wedel.
For the purposes of this assignment, Wedel is used as the data is particularly complete.
'''
search_city = input("Please enter a German city: ")
print('')
print("If the search does not return a list of playgrounds in the city, please enter the name of a larger city.")
#This is the website url where we can then append a city name and search:
search_site_url = "https://spielplatznet.de/spielplaetze/"
url = search_site_url + search_city
print('')
print("Here is the url where more information is available on the playgrounds available:")
print(url)

Please enter a German city: Wedel

If the search does not return a list of playgrounds in the city, please enter the name of a larger city.

Here is the url where more information is available on the playgrounds available:
https://spielplatznet.de/spielplaetze/Wedel
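
City names containing spaces or umlauts (for example "Bad Oldesloe" or "Lübeck") are worth percent-encoding before the URL is printed or stored. A minimal sketch using the standard library follows. The function name `build_search_url` is hypothetical, and whether the site itself requires encoded paths has not been verified here.

```python
from urllib.parse import quote

SEARCH_SITE_URL = "https://spielplatznet.de/spielplaetze/"

def build_search_url(city, base=SEARCH_SITE_URL):
    """Append a percent-encoded city name to the search URL."""
    city = city.strip()
    if not city:
        raise ValueError("city name must not be empty")
    # quote() encodes spaces and non-ASCII characters as UTF-8 percent escapes.
    return base + quote(city)

print(build_search_url("Wedel"))
print(build_search_url("Bad Oldesloe"))
print(build_search_url("Lübeck"))
```

Stripping the input also guards against a stray space typed at the `input()` prompt.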


#### Import the city playground search data using a get request and make it more readable with Beautiful Soup:

In [6]:
data = requests.get(url).text
#Create a soup object using the variable 'data':
soup = BeautifulSoup(data,"html5lib")
#Display the material available to work with:
print(soup.prettify())

<html class="no-js" lang="de">
 <head>
 </head>
 <body class="blau" onload="initialize('Wedel',-1,51.158592,10.406305,10,1,53.5989169842047,9.73169803619385,53.5669774121414,9.68239367008209); ">
  ﻿
  <meta content="ie=edge" http-equiv="x-ua-compatible"/>
  <meta content="width=device-width, initial-scale=1.0" name="viewport"/>
  <meta content="text/html; charset=utf-8" http-equiv="Content-Type"/>
  <meta content="text/javascript" http-equiv="Content-Script-Type"/>
  <meta content="text/css" http-equiv="Content-Style-Type"/>
  <meta content="german" http-equiv="content-language"/>
  <meta content="public,no-cache" http-equiv="Cache-Control"/>
  <meta content="de" name="Content-Language"/>
  <meta content="german" name="Language"/>
  <meta content="all" name="audience"/>
  <meta content="Spielplatznet.de" name="copyright"/>
  <meta content="INDEX,FOLLOW" name="Robots"/>
  <meta content="0" name="expires"/>
  <meta content="7 days" name="revisit-after"/>
  <meta content="Ralph Anthes" n

#### Given a search city, get a list of playground information urls:

In [7]:
#Prepare an empty list:
str_links=[]
#Search through soup for html anchor/links represented by the tag <a>:
for link in soup.find_all('a'):
    #Some anchors have no href attribute, so skip those:
    temp_href=link.get('href')
    if temp_href is None:
        continue
    #When found, append the full url to the list:
    str_links.append("https://spielplatznet.de"+temp_href)
#I only want the list entries that have url data, so find and keep those:
playground_urls = [s for s in str_links if "/"+search_city+"/" in s]
#Because of the way the source site is written, also need to find and drop entries containing "spielplaetze":
playground_urls = [s for s in playground_urls if "spielplaetze" not in s]
#Check whether the list of playground urls matches the expectation (for Wedel should be 51):
print(len(playground_urls))
playground_urls

51


['https://spielplatznet.de/spielplatz/872/Wedel/Waldspielplatz Moorwegsiedlung',
 'https://spielplatznet.de/spielplatz/849/Wedel/Haselweg',
 'https://spielplatznet.de/spielplatz/855/Wedel/Meisenweg',
 'https://spielplatznet.de/spielplatz/17868/Wedel/Wasserspielplatz Haus am See',
 'https://spielplatznet.de/spielplatz/856/Wedel/Mühlenweg',
 'https://spielplatznet.de/spielplatz/864/Wedel/Rotdornstraße',
 'https://spielplatznet.de/spielplatz/846/Wedel/Hamburger Yachthafen',
 'https://spielplatznet.de/spielplatz/844/Wedel/Ginsterweg',
 'https://spielplatznet.de/spielplatz/847/Wedel/Hans-Böckler Platz',
 'https://spielplatznet.de/spielplatz/861/Wedel/Pulverstraße',
 'https://spielplatznet.de/spielplatz/830/Wedel/Alter Zirkusplatz',
 'https://spielplatznet.de/spielplatz/831/Wedel/Altstadtschule',
 'https://spielplatznet.de/spielplatz/832/Wedel/Anne-Frank-Weg',
 'https://spielplatznet.de/spielplatz/833/Wedel/Ansgariusweg',
 'https://spielplatznet.de/spielplatz/835/Wedel/Brombeerweg',
 'https:
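
The filtering step can also be written as a small pure function, which makes it easy to check without fetching the page. This is a sketch under assumptions: `playground_links` is a hypothetical name, and the sample hrefs are invented, modeled on the relative links visible in the output above.

```python
BASE_URL = "https://spielplatznet.de"

def playground_links(hrefs, city, base=BASE_URL):
    """Keep hrefs that point at a playground page for the given city."""
    marker = "/" + city + "/"
    links = []
    for href in hrefs:
        # Anchors without an href attribute yield None and are skipped.
        if not href:
            continue
        # Keep city-specific links, but drop the search pages ("spielplaetze").
        if marker in href and "spielplaetze" not in href:
            links.append(base + href)
    return links

sample_hrefs = [
    None,
    "/spielplaetze/Wedel/karte",          # invented search-page link
    "/spielplatz/849/Wedel/Haselweg",
    "/spielplatz/855/Wedel/Meisenweg",
    "/impressum",                         # invented unrelated link
]
print(playground_links(sample_hrefs, "Wedel"))
```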

### Part IB: Get detailed playground information for each location <a name="part1B"></a>
Transform the playground information into a dataframe.
<br /> Note the source is an amateur, crowd-sourced site. 
So, getting the relevant information out of it is less straightforward than some of those used in the course.
<br />
<br />In this section I primarily rely on writing functions to do the work, then calling them at the end.

#### A function that retrieves a playground's name and geocoordinates:

In [8]:
'''
This function pulls the basic descriptive data for a playground. It takes a playground's url data as 'a_soup'
and returns the playground's name, latitude, longitude, and a longer description that sometimes includes the 
street address. The longer description is crowd-sourced and so is a bit inconsistent. The output is a dictionary
with entries {a_name: playground name, a_lat: latitude, a_long: longitude, a_name_address: long description}
'''
def name_location_data(a_soup):
    #Parse out the playground's name:
    a_name=str(a_soup.find_all('meta',property="og:site_name"))
    a_name=a_name.split('"')
    a_name=a_name[1].split(':')
    a_name=[a_name[1]]
    #Parse out the latitude:
    a_lat=str(a_soup.find_all('meta',property="og:latitude"))
    a_lat=[a_lat.split('"')[1]]
    #Parse out the longitude:
    a_long=str(a_soup.find_all('meta',property="og:longitude"))
    a_long=[a_long.split('"')[1]]
    #Parse out the long description including street address:
    a_name_address=str(a_soup.find_all('meta', property="og:description"))
    a_name_address=a_name_address.split('"')[1]
    a_name_address=[a_name_address.split(str(len(playground_urls)))[0]]
    #Turn the components into a dictionary to make it easier to process later.
    #This could also have been integrated into the former steps but is at least as transparent this way:
    result={}
    result['a_name']=a_name
    result['a_lat']=a_lat
    result['a_long']=a_long
    result['a_name_address']=a_name_address
    #Return the dictionary:
    return result

#### A function that retrieves a longer playground description and user rating:

In [9]:
'''
The source website includes a space for users to write a longer description of the playground and various notes.
For example, they sometimes note the most appropriate age range for the equipment available.
Some playgrounds also have a star rating (with 5-star being the best). However, not all are rated.
This data is mostly retrieved for personal use.
'''
def descriptions_and_rating(a_soup): 
    #Sometimes these sections are blank, preparing the result for this:
    a_description=['']
    a_feature=['']
    a_rating=['']
    #Retrieve the site descriptions:
    for heading in a_soup.find_all("h3"):
        #Heading "Beschreibung" has the long description of the site which sometimes includes the best age range:
        if heading.text.strip()=="Beschreibung":
            #Get to the description following the heading:
            a_description=heading.next.next
            #Only retrieve the portion needed:
            pattern = '<div class="description">'
            string = str(a_description)
            repl = ' '
            a_description = re.sub(pattern, repl, string, count=1)
            pattern = '</div>'
            string = str(a_description)
            repl = ' '
            a_description = re.sub(pattern, repl, string, count=1)
            a_description=[a_description]
        ##Occasionally there were other short notes, which were ultimately not used.
        ##These are additional short notes, e.g. whether the site has shade:
        #elif heading.text.strip()=='Features':
        #    a_feature=[heading.next.next]
        #Sometimes the playground sites also report a user-assigned star rating within the heading "Bewertungen/ Kommentare".
        #So, also retrieving that:
        elif heading.text.strip()=='Bewertungen/ Kommentare':
            #Getting to the data under the heading:
            a_rating=str(heading.next.next.next) 
            #If there isn't a rating, the site encourages the user to add one, we don't need this:
            if a_rating!='Leider wurden noch keine Bewertungen getätigt.' :
                #If there is a rating, parse it out and return it:
                if a_rating.split('"')[3]!='':
                    a_rating=[a_rating.split('"')[3]]
                #If there isn't a rating, return a blank:
                else:
                    a_rating=[''] 
            else:
                a_rating=['']
    #Turn the results into a dictionary for later convenience:
    result={}
    result['a_description']=a_description
    result['a_rating']=a_rating
    #Return the dictionary:
    return result

#### A function that builds a dictionary of German-English playground equipment names:

In [10]:
'''
As the source website is in German, I find it useful to build a German-English dictionary of playground equipment.
The German names of common playground equipment types are the keys, and the English equivalents are the values. I am also
using the dictionary in a later step to keep track of the equipment at a playground. So, the dictionary is of form
{German name: [English name, 0]} and the zero is then replaced with a count later.
'''
def equipment_dictionary():
    #Prepare an empty dictionary:
    DE_EN_dictionary=  {} 
    #Make a list of German playground equipment:
    DE_text=["Wasserspiel","Sand","Seilbahn","Spielhaus","Baumhaus","Rutsche","Schaukel","Kletter",
             "Rodelberg","Bolzplatz","Wippe","Basketball","Nestschaukel","Schwingschaukel",
             "Drehscheibe","Karussell","Tischtennis","Trampolin","eisenbahn","traktor","Bagger",
             "Kletterturm","Tunnel","Federbrett","Blancierbretter","Toilette","Fahrradständ"]
    #Make a list of the same equipment in English:
    EN_text=["Water feature","Sandpit","Cable car","Playhouse","tree house","slide","Swing","climbing features",
             "sledding Hill","Football field","seesaw","Basketball","Nest swing","swings",
             "turntable","carousel","table tennis","trampoline","railroad","tractor","Excavator",
             "Climbing tower","tunnel","Spring board","Blancing boards","Toilets",
             "Bicycle stand"]
    #Populate the dictionary by pairing each German word with its English counterpart.
    #Keys and values are saved as all lower case for easier searching and matching.
    #Each value is a list [English name, 0]; the zero is a count that is filled in during the next step:
    for de_word, en_word in zip(DE_text, EN_text):
        DE_EN_dictionary[de_word.lower()] = [en_word.lower(), 0]
    #Return the resulting dictionary:
    return DE_EN_dictionary

#### A pair of functions that retrieve any playground equipment listed:

In [11]:
'''
The source website sometimes makes note of the types of playground equipment available.
The website is great for the intended use, but the html for this area is particularly difficult to retrieve and parse.
So, I retrieve the start of the relevant section and the 1,000 characters to follow. I then transform this information 
into a string, parse out just the section needed, clean it up a little, and then search through it for the dictionary
keys from the prior function. When the relevant words are found, the playground equipment dictionary also acts as a 
counter and records their existence.
'''
#This function takes a string, searches for key words against the dictionary, and counts when they're found:
def equipment_counter(temp_features, DE_EN_dictionary):  
    #Remove any leading spaces from the string:
    temp_features = temp_features.strip()
    #Make the string all lower case:
    temp_features = temp_features.lower()
    # Split the string into words noting that a lot of different punctuation is used in html:
    words_split = re.findall( r'\w+|[^\s\w]+', temp_features)
    #Loop through the equipment dictionary and the parsed string to search for any words of interest and count them:
    for key in DE_EN_dictionary:
        for word in words_split:
            #Only an exact match between a word and a dictionary key is counted:
            if word==key:
                DE_EN_dictionary[key][1] += 1
    #Return the dictionary which now also includes the count of equipment at the playground:
    return DE_EN_dictionary

#This function generally calls the former one plus the equipment dictionary. 
#But first it retrieves the html code that might include equipment:
def playground_equipment_counted(a_soup):
    #As the html is a bit unstructured, retrieve as a string:
    all_string=str(a_soup)
    #Keep the part of the string that might contain the playground equipment:
    #From the heading "Spielplatzgeräte" (playground equipment) plus 1,000 characters.
    temp_features=all_string[all_string.find('Spielplatzgeräte'):all_string.find('Spielplatzgeräte')+1000]
    #Remove more pieces of the string which aren't needed and could cause double-counting:
    pattern = '</h3>.*"/>'
    string = temp_features
    repl = ' '
    temp_features = re.sub(pattern, repl, string, count=1)
    #Next, use the functions equipment_dictionary() and equipment_counter(temp_features, DE_EN_dictionary) to make 
    #counts of the playground equipment available at each site in the dictionary:
    DE_EN_equipment_counted=equipment_counter(temp_features, equipment_dictionary() )
    #Return the playground equipment dictionary, but now with the playground equipment for the site counted:
    return DE_EN_equipment_counted
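
Because the counter compares whole tokens with `==`, stem-like keys such as "kletter" or "sand" only count when they appear as standalone words. The sketch below contrasts that behavior with a substring match. `count_matches` is a hypothetical helper, the sample sentence is invented, and whether substring matching was the intent behind the stem-like keys is an assumption.

```python
import re

def count_matches(text, keys, substring=False):
    """Count, per key, the tokens that equal it (or contain it if substring=True)."""
    # Same tokenization as in equipment_counter: words or runs of punctuation.
    tokens = re.findall(r"\w+|[^\s\w]+", text.lower())
    counts = {}
    for key in keys:
        if substring:
            counts[key] = sum(1 for token in tokens if key in token)
        else:
            counts[key] = sum(1 for token in tokens if token == key)
    return counts

sample = "Rutsche, Schaukel, Kletterturm und Klettergerüst; Sandkasten"
keys = ["rutsche", "schaukel", "kletter", "sand", "kletterturm"]
exact = count_matches(sample, keys)
loose = count_matches(sample, keys, substring=True)
print(exact)
print(loose)
```

The trade-off is double counting: with substring matching, "kletterturm" is counted under both "kletter" and "kletterturm", and "nestschaukel" would also count as "schaukel".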

#### Making a dataframe of playground information:
The dataframe will contain the information for all listed playgrounds in the search city's area.

In [12]:
#Make the column names for the resulting dataframe:
a_columns=('a_name', 'a_lat', 'a_long', 'a_name_address', 'a_description',
       'a_rating', 'water feature', 'sandpit', 'cable car', 'playhouse',
       'tree house', 'slide', 'swing', 'climbing features', 'sledding hill',
       'football field', 'seesaw', 'basketball', 'nest swing', 'swings',
       'turntable', 'carousel', 'table tennis', 'trampoline', 'railroad',
       'tractor', 'excavator', 'climbing tower', 'tunnel', 'spring board',
       'blancing boards', 'toilets', 'bicycle stand', 'total equipment')
#Make an empty dataframe with the column headers in which to put the results:
playground_df = pd.DataFrame(columns=a_columns)
#For each playground url in the list from part I, retrieve the playground's details:
for i in range(len(playground_urls)):
    #Assign i's url to be the one retrieved in this loop:
    a_url=playground_urls[i]
    #In-process the playground site's information:
    a_data  = requests.get(a_url).text
    a_soup = BeautifulSoup(a_data,"html5lib")
    #Run the prior functions to process and format the playground's information:
    out1=name_location_data(a_soup)
    out2=descriptions_and_rating(a_soup)
    out3=playground_equipment_counted(a_soup)
    #Convert each function's results into dataframes:
    a_output=pd.DataFrame(out1)
    b_output=pd.DataFrame(out2)
    c_output=pd.DataFrame(out3)
    #The c_output is the dictionary of playground equipment counts.
    #Initially it has two rows. The first row has equipment names which are then used as the column names:
    c_output.columns= c_output.iloc[0].copy()
    #The equipment counts are then retained as the first row and the extra row dropped:
    c_output.iloc[0]=c_output.iloc[1]
    c_output.drop(index=c_output.index[1], axis=0, inplace=True)
    #I then add an extra column which is a sum of the number of equipment available:
    total_equipment=int(c_output[c_output.columns].sum().sum())
    c_output['total equipment']=total_equipment
    #Combine the dataframes for a playground into one output which is a single row with several columns:
    result = pd.concat([a_output, b_output, c_output], axis=1, join='inner')
    #The playground's row is then appended to the dataframe containing all the playground's information:
    #(DataFrame.append was removed in pandas 2.0, so pd.concat is used here.)
    playground_df=pd.concat([playground_df, result], ignore_index=False, sort=False)
#For each feature count, treat as a 0/1 indicator variable:
feature_list=['water feature', 'sandpit', 'cable car', 'playhouse',
       'tree house', 'slide', 'swing', 'climbing features', 'sledding hill',
       'football field', 'seesaw', 'basketball', 'nest swing', 'swings',
       'turntable', 'carousel', 'table tennis', 'trampoline', 'railroad',
       'tractor', 'excavator', 'climbing tower', 'tunnel', 'spring board',
       'blancing boards', 'toilets', 'bicycle stand']
for feature in feature_list:
    playground_df.loc[playground_df[feature]!=0, feature] = 1
#Show the resulting dataframe:
playground_df

Unnamed: 0,a_name,a_lat,a_long,a_name_address,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.5926308917772,9.73169803619385,Großer Spielplatz im Wald. Viel Wiese.,Großer Spielplatz im Wald. Viel Wiese.,5.0,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13
0,Spielplatz Haselweg Wedel,53.5912910844463,9.70646917819977,Schöner Spielplatz mit angegliederter kleiner ...,Schöner Spielplatz mit angegliederter kleiner...,5.0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Meisenweg Wedel,53.5943062279335,9.71506834030151,Großer Spielplatz mit viel Wiese. Die Spielger...,Großer Spielplatz mit viel Wiese. Die Spielge...,5.0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Wasserspielplatz Haus am See Wedel,53.5914407324793,9.70553040504456,Spielplatz Wasserspielplatz Haus am See in Wed...,Der Spielplatz macht einen herausragenden Eind...,5.0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,Spielplatz Mühlenweg Wedel,53.581066013162,9.71019208431244,Schön gestalteter Spielplatz am Mühlenweg.,Schön gestalteter Spielplatz am Mühlenweg.<br/>,4.0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,Spielplatz Rotdornstraße Wedel,53.591066930019,9.68836158514023,Besonderheiten laut der Liste aus Wedel in Zah...,Besonderheiten laut der Liste aus Wedel in Za...,5.0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,Spielplatz Hamburger Yachthafen Wedel,53.5743968895724,9.68239367008209,"neuer, riesiger toller spielplatz, muss man hi...","neuer, riesiger toller spielplatz, muss man h...",5.0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,Spielplatz Ginsterweg Wedel,53.5736384685368,9.71911311149597,Der Spielplatz ist auf mehrere Ebenen in einem...,Der Spielplatz ist auf mehrere Ebenen in eine...,4.0,0,0,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8
0,Spielplatz Hans-Böckler Platz Wedel,53.5688601987249,9.71491277217865,Der Spielplatz ist zur Straße hin mit einem Za...,Der Spielplatz ist zur Straße hin mit einem Z...,4.0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,5
0,Spielplatz Pulverstraße Wedel,53.5712051224608,9.71940010786057,Dieser mittelgroße Spielplatz befindet sich nö...,Dieser mittelgroße Spielplatz befindet sich n...,4.0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,5


In [13]:
#Changing the dataframe's latitudes and longitudes to float type:
playground_df['a_lat']=playground_df['a_lat'].astype(float)
playground_df['a_long']=playground_df['a_long'].astype(float)
#Copying the dataframe. The dataframe built from the website's information has no more changes done to it and now serves
#as source material for the rest of the process:
p_df=playground_df.copy()

### Part IC: Visualizing the playground dataset <a name="part1C"></a>

#### Get some starting coordinates for mapping the playgrounds:

In [14]:
#This can certainly be done by taking an average of the playground coordinates or something like that.
#Another way is to use geolocator:
#Searching for the coordinates based on a search for the city in Germany 'search_city':
address = '{}, Germany' .format(search_city)
geolocator = Nominatim(user_agent="playground_explorer")
#Call and retrieve the location data:
location = geolocator.geocode(address)
#Retrieve the latitude:
latitude = location.latitude
#Retrieve the longitude:
longitude = location.longitude
#Display the geocoordinates and the city name:
print('The geographical coordinates of {} are {}, {}.'.format(address, latitude, longitude))

The geographical coordinates of Wedel, Germany are 53.5810226, 9.7038772.


#### Visualize the playground dataset:

In [15]:
#Create a folium map of the city's area using the latitude and longitude values:
map_playgrounds = folium.Map(location=[latitude, longitude], zoom_start=13)
#Add markers to the map for each playground in a loop over the dataframe:
for lat, lng, playground in zip(p_df['a_lat'], p_df['a_long'], p_df['a_name']):
    #Assign the playground name to the marker label:
    label = '{}'.format(playground)
    #Make the label pop up when clicked on:
    label = folium.Popup(label, parse_html=True)
    #Define the folium inputs - lat, long, marker size, color, pop up label data, etc.:
    folium.CircleMarker(
        [lat, lng],
        radius=5,
        popup=label,
        color='blue',
        fill=True,
        fill_color='#3186cc',
        fill_opacity=0.7,
        parse_html=False).add_to(map_playgrounds) #Finally add the point to the map.
#Display the map:
map_playgrounds

In [16]:
#It might also be interesting to view the playgrounds in the form of a heat map:
heat_map_playgrounds = folium.Map(location=[latitude, longitude], zoom_start=13)
HeatMap(list(zip(p_df['a_lat'], p_df['a_long'])),radius=35).add_to(heat_map_playgrounds)
heat_map_playgrounds

### Part ID: Adding Foursquare data <a name="part1D"></a>

### Set up to use Foursquare:

#### Set the Foursquare client information (hidden when sharing):

In [151]:
#Hide
#This cell should be hidden when sharing.
#Foursquare ID (placeholder; insert your own credentials):
CLIENT_ID = 'YOUR_FOURSQUARE_CLIENT_ID'
#Foursquare secret (placeholder; insert your own credentials):
CLIENT_SECRET = 'YOUR_FOURSQUARE_CLIENT_SECRET'

#### Establish the Foursquare search parameters:

In [152]:
#Foursquare version:
VERSION = '20210520'
#Limit the total number of venues returned for each point:
LIMIT=100
#Search radius in meters around each playground:
radius=500
#The radius could also be a user input at the beginning, after the city name.

#### A function that searches Foursquare for venues within a radius around a playground:

In [153]:
def getNearbyVenues(latitudes, longitudes, playgrounds, radius, LIMIT):
    '''
    This function accesses the Foursquare API, passes the latitudes and longitudes of a set of playgrounds
    (as well as a radius and return limit), and receives back information on the commercial venues within the circle
    defined by the radius around each playground.
    '''
    #Create a list to store returned venues in:
    venues_list=[]
    #Run as a loop through the playgrounds in the dataframe when called:
    for lat, lng, playground in zip(latitudes, longitudes, playgrounds):
        print(playground)     
        #Create the API request URL:
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, CLIENT_SECRET, VERSION, lat, lng, radius, LIMIT)    
        #Make the relevant get request:
        results = requests.get(url).json()["response"]['groups'][0]['items']
        #Store the information for each nearby listed venue:
        venues_list.append([(
            playground, lat, lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])
    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    #Define the dataframe column names:
    nearby_venues.columns = ['Playground', 'Playground Latitude', 'Playground Longitude', 
                  'Venue', 'Venue Latitude', 'Venue Longitude', 'Venue Category']
    #Return the venues information:
    return nearby_venues
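
The function above indexes `v['venue']['categories'][0]['name']` directly, which would raise an error if a venue came back without a category. Whether the API actually returns such venues is an assumption here. The sketch below shows one way the parsing step could be separated out and made tolerant of that case. The helper name `parse_venue_items` and the sample response fragment are hypothetical and only mirror the structure that the function above already relies on.

```python
def parse_venue_items(playground, lat, lng, items):
    """Turn a list of response 'items' into flat rows, skipping venues without a category."""
    rows = []
    for item in items:
        venue = item.get("venue", {})
        categories = venue.get("categories") or []
        if not categories:
            # No category to record, so skip this venue instead of failing.
            continue
        location = venue.get("location", {})
        rows.append((playground, lat, lng, venue.get("name"),
                     location.get("lat"), location.get("lng"),
                     categories[0]["name"]))
    return rows

# Hypothetical response fragment, for illustration only.
sample_items = [
    {"venue": {"name": "Bakery A", "location": {"lat": 53.59, "lng": 9.72},
               "categories": [{"name": "Bakery"}]}},
    {"venue": {"name": "Unlabeled", "location": {"lat": 53.58, "lng": 9.71},
               "categories": []}},
]
rows = parse_venue_items("Spielplatz X", 53.592, 9.731, sample_items)
print(rows)
```

Keeping the parsing separate from the request also makes it possible to check the row format without calling the API.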

#### Call the getNearbyVenues(...) function for the playgrounds dataset:

In [154]:
Playground_venues = getNearbyVenues(playgrounds=p_df['a_name'], 
                                    latitudes=p_df['a_lat'], longitudes=p_df['a_long'], 
                                    radius=radius, LIMIT=LIMIT)
#The function also prints the name of each playground as it is queried.

 Spielplatz Waldspielplatz Moorwegsiedlung Wedel
 Spielplatz Haselweg Wedel
 Spielplatz Meisenweg Wedel
 Spielplatz Wasserspielplatz Haus am See Wedel
 Spielplatz Mühlenweg Wedel
 Spielplatz Rotdornstraße Wedel
 Spielplatz Hamburger Yachthafen Wedel
 Spielplatz Ginsterweg Wedel
 Spielplatz Hans-Böckler Platz Wedel
 Spielplatz Pulverstraße Wedel
 Spielplatz Alter Zirkusplatz Wedel
 Spielplatz Altstadtschule Wedel
 Spielplatz Anne-Frank-Weg Wedel
 Spielplatz Ansgariusweg Wedel
 Spielplatz Brombeerweg Wedel
 Spielplatz Croningstraße Wedel
 Spielplatz Gärtnerstraße Wedel
 Spielplatz Ernst-Thälmann-Weg Wedel
 Spielplatz Geesthang Wedel
 Spielplatz Heinrich-Schacht-Straße Wedel
 Spielplatz Lindenstraße Wedel
 Spielplatz Pferdekoppel Wedel
 Spielplatz Pinneberger Straße Wedel
 Spielplatz Rosengarten Wedel
 Spielplatz Schwartenseekamp Wedel
 Spielplatz Strandbad Wedel
 Spielplatz Vogt-Körner Straße Wedel
 Spielplatz Wacholderstraße Wedel
 Spielplatz Kronskamp Wedel
 Spielplatz Appelboomtwiete 

#### Check the playgrounds-venues data:

In [155]:
#Check the output size and format:
print(Playground_venues.shape)
#Take a look at the first few rows of the resulting dataframe:
Playground_venues.head(10)

(308, 7)


Unnamed: 0,Playground,Playground Latitude,Playground Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,9.731698,Waldspielplatz,53.592623,9.731623,Playground
1,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,9.731698,Hackradt Bäcker,53.590744,9.725168,Bakery
2,Spielplatz Haselweg Wedel,53.591291,9.706469,ALDI NORD,53.59282,9.710503,Supermarket
3,Spielplatz Meisenweg Wedel,53.594306,9.715068,ALDI NORD,53.59282,9.710503,Supermarket
4,Spielplatz Meisenweg Wedel,53.594306,9.715068,Spielplatz,53.592523,9.710625,Playground
5,Spielplatz Meisenweg Wedel,53.594306,9.715068,Schokoengel,53.590827,9.715937,Café
6,Spielplatz Meisenweg Wedel,53.594306,9.715068,elBistro Wedel,53.590727,9.715602,Mexican Restaurant
7,Spielplatz Meisenweg Wedel,53.594306,9.715068,Spielplatz,53.5925,9.72089,Playground
8,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,9.70553,ALDI NORD,53.59282,9.710503,Supermarket
9,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,9.70553,Steinberghalle Wedel,53.589295,9.699597,College Gym


#### Check which categories are returned and for which playgrounds:

In [156]:
#Find out how many categories of venues have been returned:
print('There are {} unique categories.'.format(len(Playground_venues['Venue Category'].unique())))
#Get a list of the venue categories:
print(Playground_venues['Venue Category'].unique())
#Get a list of the counts of venues around each playground:
#Note these include the number of other playgrounds ('Spielplatz') nearby.
Playground_venues.groupby('Playground')['Playground'].count()

There are 54 unique categories.
['Playground' 'Bakery' 'Supermarket' 'Café' 'Mexican Restaurant'
 'College Gym' 'Thai Restaurant' 'Italian Restaurant' 'Drugstore' 'Gym'
 'Doner Restaurant' 'Shopping Mall' 'Fast Food Restaurant' 'Garden Center'
 'Seafood Restaurant' 'Harbor / Marina' 'Boat or Ferry' 'Garden'
 'Photography Studio' 'Taverna' 'Bus Stop' 'Beach' 'Turkish Restaurant'
 'Clothing Store' 'Bank' 'Optical Shop' 'Pub' 'Pool' 'Steakhouse'
 'German Restaurant' 'Hotel' 'Trattoria/Osteria' 'Sculpture Garden'
 'Restaurant' 'Museum' 'Theater' 'Insurance Office' 'Tea Room'
 'Food & Drink Shop' 'Arts & Crafts Store' 'Sandwich Place' 'Nightclub'
 'French Restaurant' 'Gym / Fitness Center' 'Furniture / Home Store'
 'Electronics Store' 'Pet Store' 'Asian Restaurant' 'Plaza' 'Spa'
 'Beach Bar' 'Pier' 'Soccer Field' 'Sushi Restaurant']


Playground
 Spielplatz Albert-Schweizer Schule Wedel            5
 Spielplatz Alter Zirkusplatz Wedel                 12
 Spielplatz Altstadtschule Wedel                    20
 Spielplatz Anne-Frank-Weg Wedel                     4
 Spielplatz Ansgariusweg Wedel                       4
 Spielplatz Appelboomtwiete Ecke Aastwiete Wedel     3
 Spielplatz Appelboomtwiete Ecke Steinberg Wedel     3
 Spielplatz Autal Wedel                              6
 Spielplatz Brombeerweg Wedel                        4
 Spielplatz Bürgerpark Wedel                         8
 Spielplatz Croningstraße Wedel                     16
 Spielplatz Egenbüttelweg Wedel                      5
 Spielplatz Elbstraße Wedel                          5
 Spielplatz Ernst-Thälmann-Weg Wedel                 4
 Spielplatz Geesthang Wedel                          1
 Spielplatz Gerhart-Hauptmann Straße Wedel           5
 Spielplatz Ginsterweg Wedel                         5
 Spielplatz Gärtnerstraße Wedel                     15

#### Prepare to analyze the data using dummy variable encoding:

In [157]:
#Use one hot encoding on the venues data as a new dataframe:
Playgrounds_onehot = pd.get_dummies(Playground_venues[['Venue Category']], prefix="", prefix_sep="")
#Drop the 'Playground' venue category (other playgrounds nearby) so that it does not clash with the name column:
Playgrounds_onehot = Playgrounds_onehot.drop(['Playground'], axis=1)
#Add the playground names to the dataframe as the first column:
Playgrounds_onehot.insert(0, 'Playground', Playground_venues['Playground'])
#Get the shape of the resulting dataframe and explore the first few rows:
print(Playgrounds_onehot.shape)
Playgrounds_onehot.head(10)

(308, 54)


Unnamed: 0,Playground,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Beach,Beach Bar,Boat or Ferry,Bus Stop,Café,Clothing Store,College Gym,Doner Restaurant,Drugstore,Electronics Store,Fast Food Restaurant,Food & Drink Shop,French Restaurant,Furniture / Home Store,Garden,Garden Center,German Restaurant,Gym,Gym / Fitness Center,Harbor / Marina,Hotel,Insurance Office,Italian Restaurant,Mexican Restaurant,Museum,Nightclub,Optical Shop,Pet Store,Photography Studio,Pier,Plaza,Pool,Pub,Restaurant,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shopping Mall,Soccer Field,Spa,Steakhouse,Supermarket,Sushi Restaurant,Taverna,Tea Room,Thai Restaurant,Theater,Trattoria/Osteria,Turkish Restaurant
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,Spielplatz Haselweg Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
3,Spielplatz Meisenweg Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
4,Spielplatz Meisenweg Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,Spielplatz Meisenweg Wedel,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,Spielplatz Meisenweg Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
7,Spielplatz Meisenweg Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
8,Spielplatz Wasserspielplatz Haus am See Wedel,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0
9,Spielplatz Wasserspielplatz Haus am See Wedel,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0


In [158]:
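#For the purposes of clustering, group the rows by playground and take the mean of each one-hot column.
#This gives the share of each playground's nearby venues that falls into each category: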
Playgrounds_grouped = Playgrounds_onehot.groupby('Playground').mean().reset_index()
#Again review the dataframe:
print(Playgrounds_grouped.shape)
Playgrounds_grouped.head(10)

(50, 54)


Unnamed: 0,Playground,Arts & Crafts Store,Asian Restaurant,Bakery,Bank,Beach,Beach Bar,Boat or Ferry,Bus Stop,Café,Clothing Store,College Gym,Doner Restaurant,Drugstore,Electronics Store,Fast Food Restaurant,Food & Drink Shop,French Restaurant,Furniture / Home Store,Garden,Garden Center,German Restaurant,Gym,Gym / Fitness Center,Harbor / Marina,Hotel,Insurance Office,Italian Restaurant,Mexican Restaurant,Museum,Nightclub,Optical Shop,Pet Store,Photography Studio,Pier,Plaza,Pool,Pub,Restaurant,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shopping Mall,Soccer Field,Spa,Steakhouse,Supermarket,Sushi Restaurant,Taverna,Tea Room,Thai Restaurant,Theater,Trattoria/Osteria,Turkish Restaurant
0,Spielplatz Albert-Schweizer Schule Wedel,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Spielplatz Alter Zirkusplatz Wedel,0.0,0.0,0.083333,0.083333,0.0,0.0,0.0,0.083333,0.083333,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.083333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.166667,0.0,0.083333,0.0,0.0,0.0,0.0,0.083333
2,Spielplatz Altstadtschule Wedel,0.0,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.0,0.1,0.0,0.1,0.0,0.05,0.0,0.0,0.0,0.0,0.0,0.0,0.05,0.05,0.05,0.0,0.05,0.0,0.0,0.0,0.0,0.05,0.0,0.0,0.05,0.0,0.05,0.05,0.05,0.0
3,Spielplatz Anne-Frank-Weg Wedel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Spielplatz Ansgariusweg Wedel,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.25,0.0,0.0,0.0,0.0
5,Spielplatz Appelboomtwiete Ecke Aastwiete Wedel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0
6,Spielplatz Appelboomtwiete Ecke Steinberg Wedel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.333333,0.0,0.0,0.0,0.0
7,Spielplatz Autal Wedel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.166667,0.0,0.166667,0.0,0.0,0.166667,0.0,0.0,0.0
8,Spielplatz Brombeerweg Wedel,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,Spielplatz Bürgerpark Wedel,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.125,0.0,0.125,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.125,0.0,0.0,0.125,0.0,0.0,0.0,0.0,0.125,0.125,0.0,0.0,0.0,0.0,0.125,0.0,0.0
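
The encoding and grouping steps above can be traced on a small example. The sketch below uses made-up venue rows (the playground names `A` and `B` are hypothetical) and follows the same steps: one-hot encode the categories, drop the `Playground` category, add the names, and average per playground.

```python
import pandas as pd

# Made-up venue rows for two hypothetical playgrounds.
venues = pd.DataFrame({
    "Playground": ["A", "A", "A", "A", "B", "B"],
    "Venue Category": ["Bakery", "Bakery", "Café", "Playground", "Supermarket", "Café"],
})

# One-hot encode the categories (cast to int so that the columns are 0/1).
onehot = pd.get_dummies(venues[["Venue Category"]], prefix="", prefix_sep="").astype(int)
# Drop the 'Playground' category, then add the playground names as the first column.
onehot = onehot.drop(columns="Playground")
onehot.insert(0, "Playground", venues["Playground"])

# The mean of each 0/1 column is the share of venues in that category.
grouped = onehot.groupby("Playground").mean().reset_index()
print(grouped)
```

Because the row for the nearby playground stays in the table as an all-zero row, the shares for `A` add up to 0.75 rather than 1. The same effect would explain why some rows in the output above do not sum to 1.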


# Part II: Methodology and results<a name="part2"></a>

### Part IIA: Exploring the top five venue types for each playground<a name="part2A"></a>

In [159]:
#Note we want the top 5:
num_top_venues = 5
#Run as a loop of each playground:
for location in Playgrounds_grouped['Playground']:
    print("----"+location+"----")
    #Select the row for this playground and transpose it so that each venue category becomes a row:
    temp = Playgrounds_grouped[Playgrounds_grouped['Playground'] == location].T.reset_index()
    #Name the resulting columns:
    temp.columns = ['venue','freq']
    #Drop the first row, which holds the playground name rather than a frequency:
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    #Sort and return the results:
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

---- Spielplatz Albert-Schweizer Schule Wedel----
                 venue  freq
0  Arts & Crafts Store   0.2
1   Photography Studio   0.2
2             Bus Stop   0.2
3          Supermarket   0.2
4               Garden   0.2


---- Spielplatz Alter Zirkusplatz Wedel----
                venue  freq
0       Shopping Mall  0.17
1         Supermarket  0.17
2  Turkish Restaurant  0.08
3            Bus Stop  0.08
4             Taverna  0.08


---- Spielplatz Altstadtschule Wedel----
                venue  freq
0  Italian Restaurant  0.10
1               Hotel  0.10
2    Doner Restaurant  0.05
3    Sculpture Garden  0.05
4          Restaurant  0.05


---- Spielplatz Anne-Frank-Weg Wedel----
                 venue  freq
0     Insurance Office  0.25
1          Supermarket  0.25
2          College Gym  0.25
3        Garden Center  0.25
4  Arts & Crafts Store  0.00


---- Spielplatz Ansgariusweg Wedel----
                 venue  freq
0               Bakery  0.25
1             Tea Room  0.25
2     

4     Harbor / Marina  0.11


---- Spielplatz Theaterstraße Wedel----
                venue  freq
0           Drugstore  0.15
1  Italian Restaurant  0.08
2                Café  0.08
3                 Gym  0.08
4          Restaurant  0.08


---- Spielplatz Tinsdaler Weg Wedel----
                 venue  freq
0   Photography Studio  0.33
1              Taverna  0.33
2               Garden  0.33
3  Arts & Crafts Store  0.00
4     Sculpture Garden  0.00


---- Spielplatz Vogt-Körner Straße Wedel----
                venue  freq
0         Supermarket  0.33
1  Turkish Restaurant  0.11
2        Optical Shop  0.11
3                Café  0.11
4      Clothing Store  0.11


---- Spielplatz Von-Suttner Straße Wedel----
                 venue  freq
0               Bakery  0.25
1           Restaurant  0.25
2  Arts & Crafts Store  0.00
3   Seafood Restaurant  0.00
4            Nightclub  0.00


---- Spielplatz Wacholderstraße Wedel----
                 venue  freq
0     Insurance Office  0.25
1       

#### A function that sorts and returns the most common venues:

In [160]:
def return_most_common_venues(row, num_top_venues):
    #Note the playground row's categories:
    row_categories = row.iloc[1:]
    #Sort the data by most common venue category:
    row_categories_sorted = row_categories.sort_values(ascending=False)
    #Return the sorted data:
    return row_categories_sorted.index.values[0:num_top_venues]
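
To illustrate what the ranking function returns, here is a small sketch on made-up frequencies. The helper `top_categories` is a hypothetical variant that follows the same approach (skip the name in the first position, sort in descending order, keep the first `n` labels); the values are chosen without ties, since the order of tied categories is not guaranteed.

```python
import pandas as pd

# Made-up category shares for two hypothetical playgrounds.
grouped = pd.DataFrame({
    "Playground": ["A", "B"],
    "Bakery": [0.5, 0.1],
    "Café": [0.3, 0.2],
    "Supermarket": [0.2, 0.7],
})

def top_categories(row, n):
    # Skip the playground name, then sort the shares from highest to lowest.
    freqs = row.iloc[1:].astype(float)
    return list(freqs.sort_values(ascending=False).index[:n])

top_two = {row["Playground"]: top_categories(row, 2) for _, row in grouped.iterrows()}
print(top_two)
```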

#### Display the playgrounds with their frequency of categories returned: 

In [161]:
#Return the top ten most common venue categories. 
#However, top ten might be a bit much, something like top five would probably be pretty informative.
num_top_venues = 10
#Use these endings to make the numbers more readable:
indicators = ['st', 'nd', 'rd']
#Create columns by the number of top venues:
columns = ['Playground']
#Generate the column names to assign the venues to:
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))
#Create a new dataframe results and assign the column labels just generated to it:
playgrounds_venues_sorted = pd.DataFrame(columns=columns)
#Assign data to the new dataframe from above to start from:
playgrounds_venues_sorted['Playground'] = Playgrounds_grouped['Playground']
#Sort through the data for each playground, get the name of the most common venue, and put into the dataframe cell:
for ind in np.arange(Playgrounds_grouped.shape[0]):
    #Note calling of the prior function to get the ranked venues:
    playgrounds_venues_sorted.iloc[ind, 1:] = return_most_common_venues(Playgrounds_grouped.iloc[ind, :], num_top_venues)
#Return the dataframe with the most common venues for each playground noted:
playgrounds_venues_sorted.head()

Unnamed: 0,Playground,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Albert-Schweizer Schule Wedel,Arts & Crafts Store,Garden,Supermarket,Photography Studio,Bus Stop,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center
1,Spielplatz Alter Zirkusplatz Wedel,Supermarket,Shopping Mall,Turkish Restaurant,Café,Bakery,Bank,Taverna,Optical Shop,Clothing Store,Bus Stop
2,Spielplatz Altstadtschule Wedel,Italian Restaurant,Hotel,Sculpture Garden,Trattoria/Osteria,Museum,Fast Food Restaurant,Drugstore,Doner Restaurant,Pool,German Restaurant
3,Spielplatz Anne-Frank-Weg Wedel,Garden Center,Supermarket,College Gym,Insurance Office,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
4,Spielplatz Ansgariusweg Wedel,Bakery,Tea Room,Supermarket,Garden Center,Turkish Restaurant,Drugstore,Gym / Fitness Center,Gym,German Restaurant,Garden


### Part IIB: Clustering and mapping based on venues <a name="part2B"></a>

#### Fit the k-means algorithm to the normalized venue data around each playground:

In [162]:
#Set the number of clusters (five seems to work fine):
kclusters = 5
#Drop the name and set the dataset to use:
Playgrounds_grouped_clustering = Playgrounds_grouped.drop(columns='Playground')
#Run/fit the k-means clustering algorithm:
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(Playgrounds_grouped_clustering)
#Check the cluster labels generated:
kmeans.labels_[0:len(Playgrounds_grouped)]

array([3, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 3, 0, 4, 1, 3, 1, 0, 1, 3, 0,
       1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 0, 1, 3, 1, 1, 1, 4, 0, 1, 1, 1, 1,
       1, 1, 0, 1, 0, 0])
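
The number of clusters above was chosen by judgment. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The sketch below does this on synthetic points with three well-separated groups; it is not the method used in this notebook, and the data are generated only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Synthetic data: three tight groups of 20 points each (illustration only).
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [20.0, 0.0]])
points = np.vstack([c + rng.normal(scale=0.5, size=(20, 2)) for c in centers])

# Fit k-means for several values of k and record the silhouette score of each.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(points)
    scores[k] = silhouette_score(points, labels)

# The k with the highest score gives the most clearly separated clusters.
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

On the playground data the same loop could be run over `Playgrounds_grouped_clustering`, although with only 50 sparse rows the scores may not point to one clear value of k.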

#### Create a new dataframe that includes the cluster labels as well as the top 10 venues around each playground:

In [163]:
#Insert the cluster labels into the sorted/ranked venues dataframe:
playgrounds_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)
#Copy the playground dataframe and rename the playground column to use in merging:
Playgrounds_merged=p_df.copy().rename(columns={"a_name":"Playground"})
#Join the ranked venues (playgrounds_venues_sorted) onto the playground dataframe (p_df) so that coordinates and venues are in one table:
final_df = Playgrounds_merged.join(playgrounds_venues_sorted.set_index('Playground'), on='Playground')
#Note the dataframe's dimensions:
print(final_df.shape)
#Move the cluster labels to the first column:
first_column = final_df.pop('Cluster Labels')
final_df.insert(0, 'Cluster Labels', first_column)
#There are a couple of remote playgrounds without any venues listed.
#These will instead be added separately to the map:
exception_df=final_df[final_df['Cluster Labels'].isna()]
final_df = final_df[final_df['Cluster Labels'].notna()].copy()
#The missing values for those playgrounds forced the cluster labels to float, so convert back to int:
final_df['Cluster Labels']=final_df['Cluster Labels'].astype(int)
#Display the resulting dataframe:
final_df.head(60)

(51, 45)


Unnamed: 0,Cluster Labels,Playground,a_lat,a_long,a_name_address,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,1,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,9.731698,Großer Spielplatz im Wald. Viel Wiese.,Großer Spielplatz im Wald. Viel Wiese.,5.0,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13,Bakery,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden,Furniture / Home Store
0,0,Spielplatz Haselweg Wedel,53.591291,9.706469,Schöner Spielplatz mit angegliederter kleiner ...,Schöner Spielplatz mit angegliederter kleiner...,5.0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7,Supermarket,Turkish Restaurant,Drugstore,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden,Furniture / Home Store
0,1,Spielplatz Meisenweg Wedel,53.594306,9.715068,Großer Spielplatz mit viel Wiese. Die Spielger...,Großer Spielplatz mit viel Wiese. Die Spielge...,5.0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6,Mexican Restaurant,Supermarket,Café,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,0,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,9.70553,Spielplatz Wasserspielplatz Haus am See in Wed...,Der Spielplatz macht einen herausragenden Eind...,5.0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,Supermarket,College Gym,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,1,Spielplatz Mühlenweg Wedel,53.581066,9.710192,Schön gestalteter Spielplatz am Mühlenweg.,Schön gestalteter Spielplatz am Mühlenweg.<br/>,4.0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,Bakery,Drugstore,Italian Restaurant,Thai Restaurant,Gym,Shopping Mall,Fast Food Restaurant,Doner Restaurant,Electronics Store,Gym / Fitness Center
0,4,Spielplatz Rotdornstraße Wedel,53.591067,9.688362,Besonderheiten laut der Liste aus Wedel in Zah...,Besonderheiten laut der Liste aus Wedel in Za...,5.0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,Garden Center,Turkish Restaurant,Insurance Office,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden,Furniture / Home Store,French Restaurant
0,1,Spielplatz Hamburger Yachthafen Wedel,53.574397,9.682394,"neuer, riesiger toller spielplatz, muss man hi...","neuer, riesiger toller spielplatz, muss man h...",5.0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,Boat or Ferry,Harbor / Marina,Seafood Restaurant,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,3,Spielplatz Ginsterweg Wedel,53.573638,9.719113,Der Spielplatz ist auf mehrere Ebenen in einem...,Der Spielplatz ist auf mehrere Ebenen in eine...,4.0,0,0,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,Taverna,Supermarket,Garden,Bus Stop,Photography Studio,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant
0,3,Spielplatz Hans-Böckler Platz Wedel,53.56886,9.714913,Der Spielplatz ist zur Straße hin mit einem Za...,Der Spielplatz ist zur Straße hin mit einem Z...,4.0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,5,Bakery,Beach,Supermarket,Bus Stop,Turkish Restaurant,Fast Food Restaurant,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,3,Spielplatz Pulverstraße Wedel,53.571205,9.7194,Dieser mittelgroße Spielplatz befindet sich nö...,Dieser mittelgroße Spielplatz befindet sich n...,4.0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,5,Bakery,Supermarket,Bus Stop,Garden,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
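
The join above keeps every playground, so playgrounds without venues end up with missing cluster labels. The sketch below reproduces that behavior on a made-up table (playground `C` is a hypothetical playground with no venues) and shows the split into clustered and unclustered rows.

```python
import pandas as pd

# Made-up tables: three playgrounds, only two of which received a cluster label.
playgrounds = pd.DataFrame({"Playground": ["A", "B", "C"], "a_lat": [53.59, 53.58, 53.57]})
clustered = pd.DataFrame({"Playground": ["A", "B"], "Cluster Labels": [1, 0]})

# join() is a left join by default, so 'C' is kept with a missing label.
merged = playgrounds.join(clustered.set_index("Playground"), on="Playground")

# Separate the rows without a label, then restore the integer type for the rest.
unclustered = merged[merged["Cluster Labels"].isna()]
with_cluster = merged[merged["Cluster Labels"].notna()].copy()
with_cluster["Cluster Labels"] = with_cluster["Cluster Labels"].astype(int)
print(merged["Cluster Labels"].dtype, with_cluster["Cluster Labels"].dtype)
```

Taking a `.copy()` of the filtered rows before assigning the converted column avoids pandas' warning about setting values on a slice.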


#### Visualize the clustering result:

In [164]:
#Create another folium map:
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)
#Set the color scheme for the clusters in a way that is flexible to the number of clusters:
#Get a set of evenly spaced values between 0 and 1 and use them to pick colors from the rainbow colormap:
colors_array = cm.rainbow(np.linspace(0, 1, kclusters))
#Convert the colors to hex codes:
rainbow = [colors.rgb2hex(i) for i in colors_array]
#Add markers to the map for each playground based on cluster:
for lat, lon, poi, cluster in zip(final_df['a_lat'], final_df['a_long'], final_df['Playground'], final_df['Cluster Labels']):
    #Label the markers based on cluster and playground name, set the cluster parameters and add to the map:
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters) 
#Add the playground points for the outliers that didn't have venues nearby to use in clustering:    
for lat, lon, poi in zip(exception_df['a_lat'], exception_df['a_long'], exception_df['Playground']):
    #Label these exceptions as outliers and map in the color black:
    label = folium.Popup(str(poi) + ' *Note not clustered - no venues listed', parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color='#000000',
        fill=True,
        fill_color='#000000',
        fill_opacity=0.7).add_to(map_clusters)
#Display the map with the clustered and non-clustered playgrounds:
map_clusters

#### Make detailed lists of each cluster for analysis:

In [165]:
'''
This cluster is located in a more residential neighborhood of the village.
There are fewer cafes here and more residential/homeowner services.
'''
final_df.loc[final_df['Cluster Labels'] == 0, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Haselweg Wedel,53.591291,Schöner Spielplatz mit angegliederter kleiner...,5.0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7,Supermarket,Turkish Restaurant,Drugstore,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden,Furniture / Home Store
0,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,Der Spielplatz macht einen herausragenden Eind...,5.0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,Supermarket,College Gym,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,Spielplatz Anne-Frank-Weg Wedel,53.58895,Spielplatz mit Matschanlage (also die Ersatzk...,4.0,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,Garden Center,Supermarket,College Gym,Insurance Office,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Ansgariusweg Wedel,53.587196,An der Zufahrt zum Fährmannssand liegt dieser...,4.0,0,0,0,1,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,Bakery,Tea Room,Supermarket,Garden Center,Turkish Restaurant,Drugstore,Gym / Fitness Center,Gym,German Restaurant,Garden
0,Spielplatz Ernst-Thälmann-Weg Wedel,53.588167,Eingeschlossen von Häusern liegt hier ein ruh...,4.0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,Garden Center,Supermarket,College Gym,Insurance Office,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Pferdekoppel Wedel,53.588979,Dieser Spielplatz mit schöner Spielburg liegt...,4.0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,Supermarket,College Gym,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,Spielplatz Wacholderstraße Wedel,53.590775,Spielplatz mit Schwerpunkt Sandspiele.<br/>,4.0,0,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,Tea Room,Garden Center,College Gym,Insurance Office,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Appelboomtwiete Ecke Steinberg Wedel,53.588835,Diesen Spielplatz haben wir noch nicht besuch...,3.0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,Tea Room,College Gym,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,Spielplatz Hainbuchenweg Wedel,53.589938,Spielplatz eher für etwas ältere Kinder.<br/>,3.0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,Tea Room,Garden Center,College Gym,Insurance Office,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Schlehdornweg Wedel,53.589801,Besonderheiten laut der Liste aus Wedel in Za...,3.0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,Bakery,Tea Room,Supermarket,Garden Center,Turkish Restaurant,Drugstore,Gym / Fitness Center,Gym,German Restaurant,Garden


In [166]:
'''This cluster includes the bulk of observations. 
These playgrounds are generally in the more urban area of the village.
From these locations, users have access to several small shops and services.
They're perhaps a better choice for an extended outing that includes playgrounds and socializing.
'''
final_df.loc[final_df['Cluster Labels'] == 1, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,Großer Spielplatz im Wald. Viel Wiese.,5.0,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13,Bakery,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden,Furniture / Home Store
0,Spielplatz Meisenweg Wedel,53.594306,Großer Spielplatz mit viel Wiese. Die Spielge...,5.0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6,Mexican Restaurant,Supermarket,Café,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,Spielplatz Mühlenweg Wedel,53.581066,Schön gestalteter Spielplatz am Mühlenweg.<br/>,4.0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,Bakery,Drugstore,Italian Restaurant,Thai Restaurant,Gym,Shopping Mall,Fast Food Restaurant,Doner Restaurant,Electronics Store,Gym / Fitness Center
0,Spielplatz Hamburger Yachthafen Wedel,53.574397,"neuer, riesiger toller spielplatz, muss man h...",5.0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,Boat or Ferry,Harbor / Marina,Seafood Restaurant,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden
0,Spielplatz Alter Zirkusplatz Wedel,53.575596,Versteckter Spielplatz mit schattigen Ecken u...,4.0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,Supermarket,Shopping Mall,Turkish Restaurant,Café,Bakery,Bank,Taverna,Optical Shop,Clothing Store,Bus Stop
0,Spielplatz Altstadtschule Wedel,53.582625,Dieser Spielplatz liegt auf dem Schulhof der ...,4.0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,Italian Restaurant,Hotel,Sculpture Garden,Trattoria/Osteria,Museum,Fast Food Restaurant,Drugstore,Doner Restaurant,Pool,German Restaurant
0,Spielplatz Brombeerweg Wedel,53.574119,Schöner kleiner Spielplatz unter Bäumen<br/>,4.0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,7,Arts & Crafts Store,Bus Stop,Café,Food & Drink Shop,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center
0,Spielplatz Croningstraße Wedel,53.58191,Dieser Spielplatz nur von der Croningstraße e...,4.0,0,0,0,1,0,1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,Supermarket,Fast Food Restaurant,Furniture / Home Store,Restaurant,Gym / Fitness Center,Pet Store,French Restaurant,Nightclub,Sandwich Place,Electronics Store
0,Spielplatz Gärtnerstraße Wedel,53.585607,Dieser Spielplatz wurde seit meinem letzten B...,4.0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,Hotel,Italian Restaurant,Sculpture Garden,Trattoria/Osteria,Museum,College Gym,Pub,German Restaurant,Café,Restaurant
0,Spielplatz Heinrich-Schacht-Straße Wedel,53.580021,Ein größerer Spielplatz. Bemerkenswert neben ...,4.0,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,7,Supermarket,Fast Food Restaurant,Sandwich Place,French Restaurant,Restaurant,Gym / Fitness Center,Taverna,Bakery,Beach,Gym


In [167]:
'''
The playground in this cluster isn't the only beach playground on the list. But it's in a cluster of its own due to
its isolation from other shops.
'''
final_df.loc[final_df['Cluster Labels'] == 2, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Hellgrund Wedel,53.566977,Versteckt im Tal in direkter Nähe zum Vattenf...,3,0,0,0,1,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,Beach,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden Center,Garden,Furniture / Home Store


In [168]:
'''
It is interesting that this group received its own cluster rather than being grouped with the large cluster.
It definitely has a different, distinctive feel from the center of the village. The area appears to have been
redeveloped in the 1960s-70s, and that difference is somehow reflected in the shops available.
'''
final_df.loc[final_df['Cluster Labels'] == 3, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Ginsterweg Wedel,53.573638,Der Spielplatz ist auf mehrere Ebenen in eine...,4,0,0,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,Taverna,Supermarket,Garden,Bus Stop,Photography Studio,Turkish Restaurant,Electronics Store,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Hans-Böckler Platz Wedel,53.56886,Der Spielplatz ist zur Straße hin mit einem Z...,4,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,5,Bakery,Beach,Supermarket,Bus Stop,Turkish Restaurant,Fast Food Restaurant,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Pulverstraße Wedel,53.571205,Dieser mittelgroße Spielplatz befindet sich n...,4,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,5,Bakery,Supermarket,Bus Stop,Garden,Turkish Restaurant,Electronics Store,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant
0,Spielplatz Albert-Schweizer Schule Wedel,53.571721,Dieser Spielplatz befindet sich auf dem Gelän...,3,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,Arts & Crafts Store,Garden,Supermarket,Photography Studio,Bus Stop,Electronics Store,Gym / Fitness Center,Gym,German Restaurant,Garden Center
0,Spielplatz Elbstraße Wedel,53.56994,Kleiner Spielplatz.<br/>,3,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,Bus Stop,Bakery,Beach,Supermarket,Turkish Restaurant,Fast Food Restaurant,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant


In [169]:
'''
This cluster is just a pair of playgrounds on the road into the village that share the same nearby shops.
'''
final_df.loc[final_df['Cluster Labels'] == 4, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,Spielplatz Rotdornstraße Wedel,53.591067,Besonderheiten laut der Liste aus Wedel in Za...,5,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3,Garden Center,Turkish Restaurant,Insurance Office,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden,Furniture / Home Store,French Restaurant
0,Spielplatz Geesthang Wedel,53.589786,Dieser Spielplatz befindet sich am Ende der H...,4,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,Garden Center,Turkish Restaurant,Insurance Office,Harbor / Marina,Gym / Fitness Center,Gym,German Restaurant,Garden,Furniture / Home Store,French Restaurant


In [170]:
'''
Finally, any playgrounds for which Foursquare returned no nearby commercial venues end up in this list.
These may still be great playgrounds, but they are inconvenient for a combined play-shopping trip.
Alternatively, Foursquare is not entirely consistent for this area: this playground was assigned to a cluster in prior runs.
'''
exception_df

Unnamed: 0,Cluster Labels,Playground,a_lat,a_long,a_name_address,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,,Spielplatz Opn Klint Wedel,53.58734,9.708695,"Dieser Spielplatz ist zum Teil öffentlch, zum ...","Dieser Spielplatz ist zum Teil öffentlch, zum...",3,0,0,0,1,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7,,,,,,,,,,


### Part IIC: Clustering and mapping based on playground equipment <a name="bonus1"></a>
This is an alternative way to map and explore the data. If the focus isn't on combining tasks, but rather on choosing a
playground based on the equipment available, clustering can help reduce the search cost involved.

#### Prepare a second dataframe to use in clustering on playground equipment:

In [171]:
#Start by copying the dataframe that has playground equipment noted but not surrounding venues:
df_alt=p_df.copy()
#Copy the columns that will be dropped when clustering so they can be added back afterwards: 
df_labels=df_alt[['a_name','a_lat','a_long','a_name_address','a_description','a_rating']].copy()
#Drop the columns not needed in clustering:
df_alt=df_alt.drop(['a_lat','a_long','a_name_address','a_description','a_rating'], axis=1)
#Rename and keep the Playground name column for merging later:
df_alt=df_alt.rename(columns={"a_name":"Playground"})
df_row_names=df_alt['Playground'].copy()
df_alt=df_alt.drop(['Playground'], axis=1)
#Normalize the data using StandardScaler:
normalized_DS=StandardScaler().fit_transform(df_alt)
print(normalized_DS)
df_alt.head(10)

[[-0.29172998 -0.14142136  4.94974747 ...  0.          0.
   3.50051626]
 [-0.29172998 -0.14142136 -0.20203051 ...  0.          0.
   1.19198614]
 [-0.29172998 -0.14142136 -0.20203051 ...  0.          0.
   0.80723112]
 ...
 [-0.29172998 -0.14142136 -0.20203051 ...  0.          0.
  -1.501299  ]
 [-0.29172998 -0.14142136 -0.20203051 ...  0.          0.
  -1.501299  ]
 [-0.29172998 -0.14142136 -0.20203051 ...  0.          0.
  -1.501299  ]]


Unnamed: 0,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13
0,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7
0,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,0,0,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8
0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,5
0,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,5
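The scaled values printed above can look odd for 0/1 indicator columns, so here is a rough plain-Python illustration of what the scaling step does. It assumes `StandardScaler`'s default behavior (population standard deviation, and constant columns mapped to zero). The helper name `zscore_column` and the sample column are hypothetical and not part of the notebook.

```python
from math import sqrt

def zscore_column(values):
    """Standardize one column using the population standard deviation (ddof=0)."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant column carries no information; map every entry to 0.
        return [0.0] * n
    return [(v - mean) / std for v in values]

# Hypothetical indicator column: 2 of 51 playgrounds have the feature.
rare_feature = [1, 1] + [0] * 49
scaled = zscore_column(rare_feature)
print(round(scaled[0], 8), round(scaled[-1], 8))

# Hypothetical column that is 0 everywhere.
constant = zscore_column([0] * 51)
print(constant[0])
```

A rare feature therefore gets a large positive value wherever it is present, which means rare equipment can weigh heavily in the distance calculations that k-means relies on.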


#### Run k-means clustering on the playground equipment data:

In [172]:
#Set number of clusters
kclusters = 5
#Run k-means clustering:
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(normalized_DS)
#Check cluster labels generated for each row in the dataframe:
kmeans.labels_[0:len(df_alt)]

array([1, 0, 0, 2, 0, 2, 4, 0, 0, 0, 0, 4, 2, 3, 0, 0, 2, 0, 3, 3, 4, 4,
       4, 0, 0, 3, 4, 0, 0, 4, 4, 0, 0, 4, 4, 4, 3, 4, 3, 4, 4, 0, 4, 4,
       3, 4, 4, 4, 4, 4, 4])
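Before interpreting the clusters, it can help to see how many playgrounds landed in each one. The sketch below is only an illustration: it copies the label array from the output above into a plain list and counts it with the standard library rather than reading `kmeans.labels_` directly.

```python
from collections import Counter

# Cluster labels copied from the k-means output above (one per playground).
labels = [1, 0, 0, 2, 0, 2, 4, 0, 0, 0, 0, 4, 2, 3, 0, 0, 2, 0, 3, 3, 4, 4,
          4, 0, 0, 3, 4, 0, 0, 4, 4, 0, 0, 4, 4, 4, 3, 4, 3, 4, 4, 0, 4, 4,
          3, 4, 4, 4, 4, 4, 4]

cluster_sizes = Counter(labels)
for cluster in sorted(cluster_sizes):
    print(f"Cluster {cluster}: {cluster_sizes[cluster]} playgrounds")
```

Very small clusters, such as a cluster containing a single playground, are worth inspecting by hand to decide whether they reflect a genuinely unusual playground or just sparse data.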

#### Combine clustering result and datasets:

In [173]:
#Add the playground names:
df_alt['Playground']=df_row_names
#Add the cluster labels:
df_alt.insert(0, 'Cluster Labels', kmeans.labels_)
#Prepare the datasets for merging on 'Playground'
#(df_alt already has a 'Playground' column, so its rename is a harmless no-op):
df_alt=df_alt.rename(columns={"a_name":"Playground"})
df_labels=df_labels.rename(columns={"a_name":"Playground"})
#Merge/join the datasets:
final_df = df_labels.join(df_alt.set_index('Playground'), on='Playground')
#Check the dataframe shape:
print(final_df.shape)
#Move cluster labels to the front:
first_column = final_df.pop('Cluster Labels')
final_df.insert(0, 'Cluster Labels', first_column)
#Ensure cluster labels are still integers (the dtype changes if any were blank):
final_df['Cluster Labels']=final_df['Cluster Labels'].astype(int)
#Display the dataframe:
final_df.head()

(51, 35)


Unnamed: 0,Cluster Labels,Playground,a_lat,a_long,a_name_address,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,1,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,9.731698,Großer Spielplatz im Wald. Viel Wiese.,Großer Spielplatz im Wald. Viel Wiese.,5,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13
0,0,Spielplatz Haselweg Wedel,53.591291,9.706469,Schöner Spielplatz mit angegliederter kleiner ...,Schöner Spielplatz mit angegliederter kleiner...,5,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7
0,0,Spielplatz Meisenweg Wedel,53.594306,9.715068,Großer Spielplatz mit viel Wiese. Die Spielger...,Großer Spielplatz mit viel Wiese. Die Spielge...,5,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6
0,2,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,9.70553,Spielplatz Wasserspielplatz Haus am See in Wed...,Der Spielplatz macht einen herausragenden Eind...,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,0,Spielplatz Mühlenweg Wedel,53.581066,9.710192,Schön gestalteter Spielplatz am Mühlenweg.,Schön gestalteter Spielplatz am Mühlenweg.<br/>,4,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4


#### Map the new clusters:

In [174]:
#This follows the same process discussed in the last two mapping cells above.
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]
# add markers to the map:
for lat, lon, poi, cluster in zip(final_df['a_lat'], final_df['a_long'], final_df['Playground'], final_df['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters) 
'''
Note I do not expect any particular clustering pattern to emerge in the map. The map is useful, however, in
determining which playgrounds with certain equipment groups are available near a given point.
'''
map_clusters
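One detail of the mapping cell that is easy to miss: the marker color is looked up with `rainbow[cluster-1]`, so cluster 0 uses index -1, which Python reads as the last entry of the list. The small sketch below shows the resulting assignment with a hypothetical five-color palette standing in for the `rainbow` list.

```python
kclusters = 5
# Hypothetical stand-in palette; the notebook builds `rainbow` with matplotlib.
palette = ["purple", "blue", "green", "orange", "red"]

# Index cluster - 1 is -1 for cluster 0, i.e. the last color in the list.
colors_used = [palette[cluster - 1] for cluster in range(kclusters)]
for cluster, color in enumerate(colors_used):
    print(f"Cluster {cluster} -> {color}")
```

Each cluster still gets its own color; the assignment is simply shifted by one position.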

#### Analyzing the playground equipment-based groupings:

In [175]:
'''
These are mostly some pretty good playgrounds. Probably good for an hour-long trip.
'''
final_df.loc[final_df['Cluster Labels'] == 0, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Haselweg Wedel,53.591291,Schöner Spielplatz mit angegliederter kleiner...,5,0,0,0,1,0,1,1,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Meisenweg Wedel,53.594306,Großer Spielplatz mit viel Wiese. Die Spielge...,5,0,0,0,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Mühlenweg Wedel,53.581066,Schön gestalteter Spielplatz am Mühlenweg.<br/>,4,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,Spielplatz Ginsterweg Wedel,53.573638,Der Spielplatz ist auf mehrere Ebenen in eine...,4,0,0,1,1,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8
0,Spielplatz Hans-Böckler Platz Wedel,53.56886,Der Spielplatz ist zur Straße hin mit einem Z...,4,0,0,0,0,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,5
0,Spielplatz Pulverstraße Wedel,53.571205,Dieser mittelgroße Spielplatz befindet sich n...,4,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,5
0,Spielplatz Alter Zirkusplatz Wedel,53.575596,Versteckter Spielplatz mit schattigen Ecken u...,4,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5
0,Spielplatz Brombeerweg Wedel,53.574119,Schöner kleiner Spielplatz unter Bäumen<br/>,4,0,0,0,1,0,1,1,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Croningstraße Wedel,53.58191,Dieser Spielplatz nur von der Croningstraße e...,4,0,0,0,1,0,1,1,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Ernst-Thälmann-Weg Wedel,53.588167,Eingeschlossen von Häusern liegt hier ein ruh...,4,0,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5


In [176]:
'''
This is an exceptional playground, the kind that parents send pictures of to other people. 
It is in a class of its own, so a cluster of one makes sense. There's probably even some equipment that's not
listed here.
'''
final_df.loc[final_df['Cluster Labels'] == 1, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Waldspielplatz Moorwegsiedlung Wedel,53.592631,Großer Spielplatz im Wald. Viel Wiese.,5,0,0,1,0,0,1,1,0,0,0,1,0,0,0,1,0,0,1,0,0,1,1,0,0,0,0,0,13


In [177]:
'''
One nice result is that the clustering algorithm has placed all the playgrounds with water features (pumps, troughs, 
water wheels, etc.) into this cluster. They tend to have less of the standard playground equipment and focus
more on water play.
'''
final_df.loc[final_df['Cluster Labels'] == 2, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,Der Spielplatz macht einen herausragenden Eind...,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,Spielplatz Rotdornstraße Wedel,53.591067,Besonderheiten laut der Liste aus Wedel in Za...,5,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,Spielplatz Anne-Frank-Weg Wedel,53.58895,Spielplatz mit Matschanlage (also die Ersatzk...,4,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Gärtnerstraße Wedel,53.585607,Dieser Spielplatz wurde seit meinem letzten B...,4,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3


In [178]:
'''
Also good playgrounds, definitely good for an hour or so of children's play. 
One defining difference is that these playgrounds all include football/soccer fields.
'''
final_df.loc[final_df['Cluster Labels'] == 3, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Ansgariusweg Wedel,53.587196,An der Zufahrt zum Fährmannssand liegt dieser...,4,0,0,0,1,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Geesthang Wedel,53.589786,Dieser Spielplatz befindet sich am Ende der H...,4,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Heinrich-Schacht-Straße Wedel,53.580021,Ein größerer Spielplatz. Bemerkenswert neben ...,4,0,0,0,0,0,1,1,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Strandbad Wedel,53.570948,Großer Spielplatz:<br/>Wegen des Wassers in d...,4,0,0,0,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Hellgrund Wedel,53.566977,Versteckt im Tal in direkter Nähe zum Vattenf...,3,0,0,0,1,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6
0,Spielplatz Opn Klint Wedel,53.58734,"Dieser Spielplatz ist zum Teil öffentlch, zum...",3,0,0,0,1,0,0,1,1,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,7
0,Spielplatz Im Grund Wedel,53.575454,"Zunächst findet man hier den Bolzplatz, dahin...",2,0,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4


In [179]:
'''
Some of these playgrounds have less going on and are probably more neighborhood playgrounds than ones
worth driving to. This cluster also contains the playgrounds without ratings or with poorly listed equipment.

There are a couple ways to interpret these. Since the data is crowdsourced, it could be that these just have not had
much data provided to the webpage. In that regard, they're wildcards. So they may be worth exploring and could be
surprising. However, they could also be playgrounds with limited equipment available and not much fun for the kids.
For the ones I'm familiar with, the latter explanation is most often the case. One has an oddly high rating for not
having much equipment available.
'''

final_df.loc[final_df['Cluster Labels'] == 4, final_df.columns[[1] + [2] + list(range(5, final_df.shape[1]))]]

Unnamed: 0,Playground,a_lat,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,Spielplatz Hamburger Yachthafen Wedel,53.574397,"neuer, riesiger toller spielplatz, muss man h...",5.0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4
0,Spielplatz Altstadtschule Wedel,53.582625,Dieser Spielplatz liegt auf dem Schulhof der ...,4.0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
0,Spielplatz Lindenstraße Wedel,53.578413,Ein langezogener Spielplatz mit insgesamt vie...,4.0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,Spielplatz Pferdekoppel Wedel,53.588979,Dieser Spielplatz mit schöner Spielburg liegt...,4.0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,Spielplatz Pinneberger Straße Wedel,53.585592,Abgegrenzt vom Obstbaumweg an der Rückseite u...,4.0,0,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5
0,Spielplatz Vogt-Körner Straße Wedel,53.575123,Dieser kleine Spielplatz liegt versteckt hint...,4.0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,2
0,Spielplatz Appelboomtwiete Ecke Steinberg Wedel,53.588835,Diesen Spielplatz haben wir noch nicht besuch...,3.0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
0,Spielplatz Albert-Schweizer Schule Wedel,53.571721,Dieser Spielplatz befindet sich auf dem Gelän...,3.0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,Spielplatz Elbstraße Wedel,53.56994,Kleiner Spielplatz.<br/>,3.0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,Spielplatz Gerhart-Hauptmann Straße Wedel,53.591999,Besonderheiten laut der Liste aus Wedel in Za...,3.0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1


### Part IID: Finding the playgrounds that are near fast food restaurants, icecream shops, etc. <a name="bonus2"></a>
Sometimes it's more important to find a playground with a particular venue nearby. Here are some of those lists.

#### Find the playgrounds with 'fast food restaurants' nearby:

In [180]:
'''
Listing the playgrounds but also the restaurant names. 
Kids gotta eat and sometimes playground adventures go longer than planned. 
If it seems like one of those days, maybe it's best to go to one of these playgrounds.
'''
Playground_venues[Playground_venues['Venue Category']=='Fast Food Restaurant']

Unnamed: 0,Playground,Playground Latitude,Playground Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
18,Spielplatz Mühlenweg Wedel,53.581066,9.710192,Hähnchengrill Wedel,53.579899,9.703757,Fast Food Restaurant
69,Spielplatz Altstadtschule Wedel,53.582625,9.699267,Hähnchengrill Wedel,53.579899,9.703757,Fast Food Restaurant
86,Spielplatz Croningstraße Wedel,53.58191,9.723378,Burger King,53.583856,9.726297,Fast Food Restaurant
92,Spielplatz Croningstraße Wedel,53.58191,9.723378,McDonald's,53.58353,9.723654,Fast Food Restaurant
123,Spielplatz Heinrich-Schacht-Straße Wedel,53.580021,9.72488,Burger King,53.583856,9.726297,Fast Food Restaurant
128,Spielplatz Heinrich-Schacht-Straße Wedel,53.580021,9.72488,McDonald's,53.58353,9.723654,Fast Food Restaurant
161,Spielplatz Rosengarten Wedel,53.581053,9.705455,Hähnchengrill Wedel,53.579899,9.703757,Fast Food Restaurant
199,Spielplatz Kronskamp Wedel,53.580795,9.722074,Burger King,53.583856,9.726297,Fast Food Restaurant
204,Spielplatz Kronskamp Wedel,53.580795,9.722074,McDonald's,53.58353,9.723654,Fast Food Restaurant
300,Spielplatz Theaterstraße Wedel,53.582217,9.70818,Hähnchengrill Wedel,53.579899,9.703757,Fast Food Restaurant
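The list above repeats a playground once per restaurant. As a possible follow-up, the rows can be collapsed into one entry per playground. The sketch below does this in plain Python with the (playground, venue) pairs copied from the output; in the notebook itself, a pandas `groupby` on the 'Playground' column would be a natural alternative.

```python
from collections import defaultdict

# (playground, venue) pairs copied from the fast food output above.
rows = [
    ("Spielplatz Mühlenweg Wedel", "Hähnchengrill Wedel"),
    ("Spielplatz Altstadtschule Wedel", "Hähnchengrill Wedel"),
    ("Spielplatz Croningstraße Wedel", "Burger King"),
    ("Spielplatz Croningstraße Wedel", "McDonald's"),
    ("Spielplatz Heinrich-Schacht-Straße Wedel", "Burger King"),
    ("Spielplatz Heinrich-Schacht-Straße Wedel", "McDonald's"),
    ("Spielplatz Rosengarten Wedel", "Hähnchengrill Wedel"),
    ("Spielplatz Kronskamp Wedel", "Burger King"),
    ("Spielplatz Kronskamp Wedel", "McDonald's"),
    ("Spielplatz Theaterstraße Wedel", "Hähnchengrill Wedel"),
]

# Collect the venue names for each playground, preserving row order.
venues_by_playground = defaultdict(list)
for playground, venue in rows:
    venues_by_playground[playground].append(venue)

for playground, venues in venues_by_playground.items():
    print(f"{playground}: {', '.join(venues)}")
```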


#### Find the playgrounds near venues that serve ice cream:

In [181]:
'''
Search for 'Eis' in the business name since ice cream doesn't have its own venue category. 
Note 'Eis' is German for 'ice cream'. There is another restaurant in the village that has a walk-up ice cream window,
but 'Eis' isn't in the name. One option would be to vastly increase the number of Foursquare API calls and see if
we can check menus for ice cream.
'''
Playground_venues[Playground_venues['Venue'].str.contains('Eis')]

Unnamed: 0,Playground,Playground Latitude,Playground Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
39,Spielplatz Alter Zirkusplatz Wedel,53.575596,9.710723,Eiscafé Venezia,53.577454,9.70535,Café
154,Spielplatz Rosengarten Wedel,53.581053,9.705455,Eiscafé Venezia,53.577454,9.70535,Café
182,Spielplatz Vogt-Körner Straße Wedel,53.575123,9.705176,Eiscafé Venezia,53.577454,9.70535,Café
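The name search is case-sensitive, and that matters: a lowercase substring match on 'eis' would also catch unrelated German words such as 'Reisebüro' (travel agency). The sketch below contrasts the two approaches. Apart from 'Eiscafé Venezia', the venue names are made up for illustration, and the function names are hypothetical.

```python
import re

# Only 'Eiscafé Venezia' comes from the output above; the others are invented.
venue_names = ["Eiscafé Venezia", "Reisebüro Wedel", "Netto Marken-Discount"]

def naive_match(name):
    """Case-insensitive substring match; prone to false positives."""
    return "eis" in name.lower()

EIS_AT_WORD_START = re.compile(r"\bEis")

def word_start_match(name):
    """Case-sensitive match for 'Eis' at the start of a word."""
    return bool(EIS_AT_WORD_START.search(name))

print([n for n in venue_names if naive_match(n)])
print([n for n in venue_names if word_start_match(n)])
```

Neither variant finds a venue that sells ice cream without saying so in its name, which is the limitation noted in the cell above.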


#### Find the playgrounds that list water features:

In [182]:
'''
These generally include various pumps, channels, waterwheels, etc. 
Very popular and useful in the summer. 
Unfortunately, the playgrounds where water features were noted do not correspond to the playgrounds 
near the ice cream shop in the prior list (in the case of the village of Wedel).
'''
final_df[final_df['water feature']==1]

Unnamed: 0,Cluster Labels,Playground,a_lat,a_long,a_name_address,a_description,a_rating,water feature,sandpit,cable car,playhouse,tree house,slide,swing,climbing features,sledding hill,football field,seesaw,basketball,nest swing,swings,turntable,carousel,table tennis,trampoline,railroad,tractor,excavator,climbing tower,tunnel,spring board,blancing boards,toilets,bicycle stand,total equipment
0,2,Spielplatz Wasserspielplatz Haus am See Wedel,53.591441,9.70553,Spielplatz Wasserspielplatz Haus am See in Wed...,Der Spielplatz macht einen herausragenden Eind...,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
0,2,Spielplatz Rotdornstraße Wedel,53.591067,9.688362,Besonderheiten laut der Liste aus Wedel in Zah...,Besonderheiten laut der Liste aus Wedel in Za...,5,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3
0,2,Spielplatz Anne-Frank-Weg Wedel,53.58895,9.693766,Spielplatz mit Matschanlage (also die Ersatzkl...,Spielplatz mit Matschanlage (also die Ersatzk...,4,1,0,0,0,0,1,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6
0,2,Spielplatz Gärtnerstraße Wedel,53.585607,9.697387,Dieser Spielplatz wurde seit meinem letzten Be...,Dieser Spielplatz wurde seit meinem letzten B...,4,1,0,0,0,0,0,1,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,3


#### Find the playgrounds with supermarkets nearby:

In [183]:
'''
Gotta shop sometimes, might as well do a shopping-playground trip and save time.
I'm partial to Netto, so searching based on that brand of supermarket.
'''
Playground_venues[(Playground_venues['Venue Category']=='Supermarket') 
                  & (Playground_venues['Venue'].str.contains("Netto"))]

Unnamed: 0,Playground,Playground Latitude,Playground Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
32,Spielplatz Hans-Böckler Platz Wedel,53.56886,9.714913,Netto Marken-Discount,53.56941,9.7138,Supermarket
36,Spielplatz Pulverstraße Wedel,53.571205,9.7194,Netto Marken-Discount,53.56941,9.7138,Supermarket
73,Spielplatz Anne-Frank-Weg Wedel,53.58895,9.693766,Netto Marken-Discount,53.585539,9.691813,Supermarket
76,Spielplatz Ansgariusweg Wedel,53.587196,9.686224,Netto Marken-Discount,53.585539,9.691813,Supermarket
105,Spielplatz Gärtnerstraße Wedel,53.585607,9.697387,Netto Marken-Discount,53.585539,9.691813,Supermarket
114,Spielplatz Ernst-Thälmann-Weg Wedel,53.588167,9.69368,Netto Marken-Discount,53.585539,9.691813,Supermarket
215,Spielplatz Bürgerpark Wedel,53.584817,9.692506,Netto Marken-Discount,53.585539,9.691813,Supermarket
228,Spielplatz Elbstraße Wedel,53.56994,9.711303,Netto Marken-Discount,53.56941,9.7138,Supermarket
249,Spielplatz Reepschlägerstraße Wedel,53.58633,9.693267,Netto Marken-Discount,53.585539,9.691813,Supermarket
258,Spielplatz Schlehdornweg Wedel,53.589801,9.69021,Netto Marken-Discount,53.585539,9.691813,Supermarket
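The separate lists in this part can also be combined to answer questions such as "which water playgrounds are near a Netto?". A minimal sketch using plain Python sets follows, with the playground names copied from the outputs above. The variable names are illustrative only.

```python
# Playground names copied from the water feature, ice cream, and Netto outputs above.
water = {
    "Spielplatz Wasserspielplatz Haus am See Wedel",
    "Spielplatz Rotdornstraße Wedel",
    "Spielplatz Anne-Frank-Weg Wedel",
    "Spielplatz Gärtnerstraße Wedel",
}
ice_cream = {
    "Spielplatz Alter Zirkusplatz Wedel",
    "Spielplatz Rosengarten Wedel",
    "Spielplatz Vogt-Körner Straße Wedel",
}
netto = {
    "Spielplatz Hans-Böckler Platz Wedel",
    "Spielplatz Pulverstraße Wedel",
    "Spielplatz Anne-Frank-Weg Wedel",
    "Spielplatz Ansgariusweg Wedel",
    "Spielplatz Gärtnerstraße Wedel",
    "Spielplatz Ernst-Thälmann-Weg Wedel",
    "Spielplatz Bürgerpark Wedel",
    "Spielplatz Elbstraße Wedel",
    "Spielplatz Reepschlägerstraße Wedel",
    "Spielplatz Schlehdornweg Wedel",
}

# Set intersection keeps only the playgrounds that appear in both lists.
water_and_ice_cream = water & ice_cream
water_and_netto = water & netto
print("Water + ice cream:", sorted(water_and_ice_cream))
print("Water + Netto:", sorted(water_and_netto))
```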


# Discussion and concluding remarks <a name="discussionandconclusion"></a>

This section concludes the report with a discussion of the results and some concluding remarks.
<br />
<br />
The clustering methods used in this study appear to work well at dividing the playgrounds in the village of interest into groups. From experience with the village, clustering based on nearby venues captures differences in the village neighborhoods well. In particular, it is not ex ante apparent that the neighborhoods of clusters zero and one would fall into different clusters. Yet, the neighborhoods do have a different feel to them in real life. This exercise reveals that one of the sources of that difference is that different sorts of venues are concentrated in each. So too with clusters two and four which are in less-dense areas of the village (cluster three is also in an isolated corner on the Elbe beach).
<br />
<br />
When analyzing the data again based on clustering by playground characteristics, patterns again emerge. Many of the
playgrounds in the village are good and fairly consistent, and these have been grouped together. The one really exceptional playground is placed in a cluster of its own. Playgrounds with football/soccer fields are also grouped well, as are
those with water features. The playgrounds with fewer features and less information are loosely grouped together; these are
the ones that we wouldn't want to show up to expecting the children to enjoy for an hour, and the ones we'd want to have a 
backup plan for. Overall, I'd say the clustering exercise has worked well.
<br />
<br />
Finally, sometimes it's about having a lot of shops nearby or specific equipment on the playground. But other times it's 
about having a specific business or business category nearby. Combining the playground and Foursquare data results in a few
relevant lists. We now have lists of the playgrounds near fast food, ice cream, and one of the supermarkets readily
available.
<br />
<br />
I think there is more that can be done with this sort of database. For instance, we could calculate the distance between each playground and the venues. This might be useful since walking with children can be difficult. I have also considered adding a visual analysis. For instance, I could use the coordinates to retrieve some satellite imagery, and then use some sort of computer vision approach to check how much shade is available. That could be an interesting and useful addition. Parking and other features would be interesting to retrieve too. Anyway, linking the Foursquare API to the data scraped from the community crowd-sourced spielplatznet website has led to some interesting and useful insights.
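<br />
<br />
As a starting point for the distance idea, here is a minimal sketch of the haversine (great-circle) formula in plain Python. The function name `haversine_m` and the Earth radius constant are my own choices, and the coordinates are one playground-venue pair taken from the fast food list. This gives the straight-line distance only, so actual walking distance will usually be longer.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (an approximation)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))

# Coordinates copied from the fast food list above.
playground = (53.58191, 9.723378)  # Spielplatz Croningstraße Wedel
venue = (53.58353, 9.723654)       # McDonald's

distance = haversine_m(*playground, *venue)
print(f"{distance:.0f} m as the crow flies")
```

Applied to every row of the venues dataframe, a function like this could add a distance column to sort or filter by.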