>>> ## IBM Applied Data Science Capstone Final Project:
>> # Exploring the Services Available Near and the Quality of Playgrounds in Germany
#### This project is designed to satisfy the IBM Applied Data Science Capstone final project requirements. However, the project is also intended to be useful.

# Introduction <a name="introduction"></a>

### Description of the problem/background <a name="description"></a>
Before the series of coronavirus lockdowns that we've had in Germany, my spouse and I generally did our shopping and errands while our children were at school. We could then take family trips to the nearby playgrounds in the afternoon, sometimes several times a week.
<br /> 
<br />
Now, in coronavirus times, things are a bit different. For weeks at a time, our children are home ALL THE TIME in lockdown with us. So, doing our shopping means one of us stays with the children while the other shops. One option is of course to stay home with them, but they're quite bored with lockdown, and some outside time is great for them anyway. So, now that the lockdowns are less intense (playgrounds are open, but schools and daycares not always), we like to combine shopping and playground trips. This generally involves one of us going into the store while the other takes the children to a playground.
<br />
<br />
To aid the combined shopping-playground process, this project will combine playground and commercial venue data. Using a crowd-sourced playground database and the Foursquare API, we can check which playgrounds are near which sorts of shops. Do we need to go to a variety of stores to meet our shopping requirements? There's a cluster of playgrounds for that. Do we need a set of playgrounds near supermarkets and such? Yep, we can find those. How about playgrounds with extensive equipment, or playgrounds that are away from the shops so the children can really go crazy? Yes, we can identify those too. So, there are a couple of different problems being discussed here. The primary one is combining shopping and playground trips into the village. The other is identifying playgrounds that fit specific playing needs: playgrounds with a lot of equipment for an extended adventure versus more limited ones for shorter trips. I should be able to address both sorts of information requirements once I collect and prepare the data.
<br />

### Data plan<a name="data"></a>
The plan is to use HTML web scraping to retrieve a list of playgrounds and their characteristics from the crowd-sourced website 'spielplatznet' (https://spielplatznet.de/spielplaetze). The site lets a user search for a city in Germany and then returns a list of playgrounds in the vicinity. Because it relies on playground users entering the data, not all areas of the country are well represented. However, the area where I will conduct the analysis, the village of Wedel in the state of Schleswig-Holstein (near Hamburg), has pretty good data. A plus is that I know many of the locations well and so can tell when the data is complete or missing.
<br />
<br />
The second substantial data source is the Foursquare API. I will use it to retrieve information on venues near each playground. I can then classify the playgrounds based on what's nearby for the purpose of combining shopping/errands and playground trips to the village. I'll primarily use the Foursquare data in a k-means clustering process, but I will also search it for particular types of venues. These will include particular stores, store types, or stores with keywords in their names such as 'ice cream' ('eis' in German).
<br />
<br />
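As a rough illustration of the keyword search mentioned above, here is a minimal sketch using only the standard library. It assumes each venue is a dictionary with a `name` key; that layout, the helper name `venues_matching`, and the sample venues are all hypothetical and not taken from the Foursquare response. The `word_start` option is there because a plain substring search for 'eis' would also match words such as 'Reisebüro' (travel agency).

```python
import re

def venues_matching(venues, keyword, word_start=True):
    """Return venues whose name contains `keyword`, ignoring case.

    With word_start=True the keyword must begin a word, so 'eis' matches
    'Eiscafé' but not 'Reisebüro'.
    """
    pattern = (r"\b" if word_start else "") + re.escape(keyword)
    rx = re.compile(pattern, re.IGNORECASE)
    return [v for v in venues if rx.search(v["name"])]

# Hypothetical venues, for illustration only.
venues = [
    {"name": "Eiscafé Beispiel", "category": "Ice Cream Shop"},
    {"name": "Supermarkt Muster", "category": "Supermarket"},
    {"name": "Reisebüro Muster", "category": "Travel Agency"},
]

strict = [v["name"] for v in venues_matching(venues, "eis")]
loose = [v["name"] for v in venues_matching(venues, "eis", word_start=False)]
print(strict)
print(loose)
```

The stricter word-start match is only a heuristic; it would still miss names where 'eis' appears inside a compound word, so the results are worth checking by hand.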
I will also use geolocator to look up the village's geocoordinates. This is probably a little excessive, as I could simply take the mean of the playground coordinates.
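
The alternative of averaging the playground coordinates can be sketched in a few lines. This is a minimal sketch, not the notebook's implementation: the helper name `center_of` is hypothetical, and only the first coordinate pair comes from the example row in this notebook; the other two are placeholders. A plain arithmetic mean of latitude and longitude is a reasonable approximation for a small area like a single village, but not for points spread over large distances.

```python
from statistics import fmean

def center_of(points):
    """Return the (latitude, longitude) mean of an iterable of (lat, lon) pairs."""
    points = list(points)
    if not points:
        raise ValueError("need at least one coordinate pair")
    lats, lons = zip(*points)
    return fmean(lats), fmean(lons)

playground_coords = [
    (53.5926308917772, 9.73169803619385),  # from the example row in this notebook
    (53.5800, 9.7000),                     # placeholder, not real data
    (53.5850, 9.7100),                     # placeholder, not real data
]

lat, lon = center_of(playground_coords)
print(round(lat, 4), round(lon, 4))
```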

### Data examples<a name="dataexample"></a>
To aid in planning, I have gathered data from the Spielplatznet and Foursquare websites to make example rows of the dataframes that I will be developing:

####  From the playground information website Spielplatznet:

In [19]:
#The playground data includes identifying information as well as a short description (in German) and I will scrape
#information on the number and types of playground equipment available.
import pandas as pd
pd.set_option('display.max_columns', None)
df_columns=('playground', 'latitude', 'longitude', 'description',
       'rating', 'water feature', 'sandpit', 'cable car', 'playhouse',
       'tree house', 'slide', 'swing', 'climbing features', 'sledding hill',
       'football field', 'seesaw', 'basketball', 'nest swing', 'total equipment')
df=pd.DataFrame([['Spielplatz Waldspielplatz Moorwegsiedlung Wedel',53.5926308917772,9.73169803619385,
    'Großer Spielplatz im Wald. Viel Wiese.',5,0,0,2,0,0,0,1,1,0,0,0,0,0,4]],columns=df_columns)
df

| | playground | latitude | longitude | description | rating | water feature | sandpit | cable car | playhouse | tree house | slide | swing | climbing features | sledding hill | football field | seesaw | basketball | nest swing | total equipment |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Spielplatz Waldspielplatz Moorwegsiedlung Wedel | 53.592631 | 9.731698 | Großer Spielplatz im Wald. Viel Wiese. | 5 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 4 |


#### From the Foursquare website:

In [20]:
#This is an example of the data when it has been grouped by playground and mean-normalized for use in 
#the k-means clustering algorithm.
df2_columns=('Playground','Asian Restaurant','Auto Garage','Bakery','Beach','Beach Bar','Boat Rental',
            'Boat or Ferry','Bookstore','Bus Stop','Café','Clothing Store','College Gym','Construction & Landscaping',
            'Drugstore','Electronics Store','Farmers Market','Fast Food Restaurant','Flea Market','Food & Drink Shop',
            'French Restaurant','Furniture / Home Store','Garden','Garden Center','German Restaurant','Gym',
            'Gym / Fitness Center','Harbor / Marina','Hotel','Insurance Office','Italian Restaurant',
            'Light Rail Station','Liquor Store','Mexican Restaurant','Museum','Nightclub','Optical Shop','Pet Store',
            'Pier','Plaza','Pool','Pub','Residential Building (Apartment / Condo)','Restaurant','Sandwich Place',
            'Sculpture Garden','Seafood Restaurant','Shopping Mall','Soccer Field','Spa','Steakhouse','Supermarket',
            'Taverna','Tea Room','Thai Restaurant','Theater','Trail','Trattoria/Osteria','Turkish Restaurant')
df2=pd.DataFrame([['Spielplatz Croningstraße Wedel',0.0625,0.0625,0,0,0,0,0,0,0,0,0,0,0,0,0,
                  0.0625,0,0.125,0,0,0.0625,0.0625,0,0,0,0,0.0625,0,0,0,0,0,0,0,0,0.0625,0,
                  0.0625,0,0,0,0,0,0.0625,0.0625,0,0,0,0,0,0,0.187500,0.062500,0,0,0,0,0]],columns=df2_columns)
df2

Unnamed: 0,Playground,Asian Restaurant,Auto Garage,Bakery,Beach,Beach Bar,Boat Rental,Boat or Ferry,Bookstore,Bus Stop,Café,Clothing Store,College Gym,Construction & Landscaping,Drugstore,Electronics Store,Farmers Market,Fast Food Restaurant,Flea Market,Food & Drink Shop,French Restaurant,Furniture / Home Store,Garden,Garden Center,German Restaurant,Gym,Gym / Fitness Center,Harbor / Marina,Hotel,Insurance Office,Italian Restaurant,Light Rail Station,Liquor Store,Mexican Restaurant,Museum,Nightclub,Optical Shop,Pet Store,Pier,Plaza,Pool,Pub,Residential Building (Apartment / Condo),Restaurant,Sandwich Place,Sculpture Garden,Seafood Restaurant,Shopping Mall,Soccer Field,Spa,Steakhouse,Supermarket,Taverna,Tea Room,Thai Restaurant,Theater,Trail,Trattoria/Osteria,Turkish Restaurant
0,Spielplatz Croningstraße Wedel,0.0625,0.0625,0,0,0,0,0,0,0,0,0,0,0,0,0,0.0625,0,0.125,0,0,0.0625,0.0625,0,0,0,0,0.0625,0,0,0,0,0,0,0,0,0.0625,0,0.0625,0,0,0,0,0,0.0625,0.0625,0,0,0,0,0,0,0.1875,0.0625,0,0,0,0,0
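
The values in the row above are multiples of 0.0625 (1/16), which would be consistent with 16 venues having been returned near that playground and each value being the share of those venues in a category. A minimal sketch of that calculation follows, assuming the raw data is a list of (playground, venue category) pairs. The function name `category_shares` and the sample pairs are hypothetical; in the notebook the same result would more naturally come from one-hot encoding the categories and taking a per-playground mean in pandas.

```python
from collections import Counter, defaultdict

def category_shares(venues):
    """Map each playground to {category: share of its nearby venues}."""
    counts = defaultdict(Counter)
    for playground, category in venues:
        counts[playground][category] += 1
    return {
        playground: {cat: n / sum(c.values()) for cat, n in c.items()}
        for playground, c in counts.items()
    }

# Hypothetical (playground, venue category) pairs, for illustration only.
venue_pairs = [
    ("Playground A", "Supermarket"),
    ("Playground A", "Supermarket"),
    ("Playground A", "Bakery"),
    ("Playground A", "Café"),
    ("Playground B", "Trail"),
]

shares = category_shares(venue_pairs)
print(shares["Playground A"])
```

Because each playground's shares sum to 1, playgrounds with many and with few nearby venues end up on the same scale before clustering.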


### Analysis plan<a name="setup"></a>
The general plan follows:
- Retrieve a list of the playgrounds in the vicinity of a German city.
- Use that list to then look up each playground's detailed information.
- Use Foursquare's API to find which venues are nearby and add them to the dataset.
- Find the commercial characteristics of each playground's neighborhood.
- Cluster the playgrounds based on their commercial surroundings.
- Also cluster the playgrounds based on the equipment available.
- Finally, make a few lists of playgrounds with kid-friendly food and ice cream nearby and certain playground features.
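
To make the clustering steps in the plan more concrete, here is a minimal sketch of k-means (plain Lloyd's algorithm) written with the standard library only. It is not the implementation used in this project; in practice a library routine such as scikit-learn's `KMeans` would typically be used instead. The function name `kmeans`, its parameters, and the feature vectors (imagined shares of supermarkets, restaurants, and trails) are hypothetical.

```python
import random

def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical feature rows: [supermarket share, restaurant share, trail share].
features = [
    [0.5, 0.4, 0.0],
    [0.6, 0.3, 0.0],
    [0.0, 0.1, 0.8],
    [0.0, 0.0, 0.9],
]

centroids, labels = kmeans(features, k=2)
print(labels)
```

The cluster numbers themselves are arbitrary; what matters is which playgrounds share a label. Here the two shop-heavy rows should end up together and the two trail-heavy rows together.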

#### Finally, here is the planned table of contents:
### Table of contents
1. Introduction
2. Part I: Get a list of playgrounds in a city of interest
3. Part II: Get detailed playground information for each location
4. Part III: Visualizing the playground dataset and adding Foursquare data
5. Part IV: Clustering playgrounds using k-means clustering
6. Bonus I: Clustering and mapping based on playground equipment
7. Bonus II: Finding the playgrounds that are near fast food restaurants, ice cream shops, etc.
8. Concluding remarks