# Problem and Background

In the summer of 2021, the Toronto City Council commissioned a detailed analysis of the city's infrastructure. The goal is to identify future areas of investment that would improve general "liveability" for Toronto's residents and make the city more attractive to global talent, which is especially important considering the growing start-up scene.

This task was divided into several priority areas. The first, which is the focus of this analysis, is to determine how well parks and green areas have been integrated into the city's infrastructure, since research has shown that:

_"City parks provide access to recreational opportunities, increase property values, spur local economies, combat crime, and protect cities from environmental impact."_   
Source: https://cityparksalliance.org/about-us/why-city-parks-matter/

Furthermore, according to the latest report from the Intergovernmental Panel on Climate Change (IPCC, https://www.ipcc.ch/report/ar6/wg1/), focusing on urban areas is essential in preparing for climate change:

_"For cities, some aspects of climate change may be amplified, including heat (since urban areas are usually warmer than their surroundings), flooding from heavy precipitation events and sea level rise in coastal cities."_  
Source: https://www.ipcc.ch/2021/08/09/ar6-wg1-20210809-pr/

Toronto does not appear in several rankings of the world's greenest cities, which could indicate that this focus area has room for improvement. Examples include: https://www.afar.com/magazine/greenest-cities-in-the-world-in-2020 and https://assets.new.siemens.com/siemens/assets/api/uuid:cf26889b-3254-4dcb-bc50-fef7e99cb3c7/gci-report-summary.pdf.

However, it is not straightforward to simply add green areas wherever possible:

_"Park planning and design represents a complex process that begins with the decision to create a park and ends with the final construction and subsequent use of the park. A design context, design concepts study is an important step in park development which should precede the development of the site plan and working drawings. Design context examines community and resident needs, potential user demand, and site analysis […]."_  
Source: https://js.sagamorepub.com/jpra/article/view/1849

_“Typically, a park project gets started through a demonstrated need from surveys of community members, and other public input that is incorporated into the city’s Comprehensive Plan and the Parks and Recreation Open Space (PROS) Plan,” […]_  
Source: https://www.nrpa.org/parks-recreation-magazine/2017/march/from-concept-to-reality/

All in all, there seems to be valid support for arguing that the further development of parks and green areas in Toronto needs to be prioritized. The question is, of course, where to start. A survey of community members, as suggested above, could yield helpful information. However, this would first require some background analysis so that the questions can be structured in a way that yields enough useful information. Secondly, the results need to be seen in context in order to determine the highest return on investment. Hence, it has been suggested to start with a benchmarking analysis comparing Toronto to a similar city known for its focus on green urban areas. For this analysis, Berlin has been chosen. It consistently ranks near the top of the greenest cities worldwide (see links above), is of similar size (although slightly bigger) and, like Toronto, has a diverse population and a growing start-up scene.

# Data and Problem-solving

The analysis and comparison of Toronto and Berlin will be based on the "venues" by neighbourhood as listed by Foursquare's location data.

Both cities comprise a very high number of neighbourhoods as defined by postal code, which should make it possible to generate data detailed enough for a thorough analysis of the infrastructure. It should be acknowledged that the postal code system differs between the cities, hence the neighbourhoods might be of varying sizes. Nevertheless, this is deemed to be an acceptable approach for a first-round analysis considering the time and resources needed.

Postal codes for both cities are widely available via public sources. For our project the following sources have been selected; a quick crosscheck with other information providers could validate the quality of the information listed here:

Toronto:  https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M  
Berlin:  https://www.dasoertliche.de/Themen/Postleitzahlen/Berlin.html

These sources were used to provide input for the Foursquare API in order to identify the venues by postal code (neighbourhood):

_"Foursquare summarizes data from thousands of references, incorporates validation and user-generated content from app and Super Users, and checks our POI against geographic assets"_  
Source: https://foursquare.com/products/places/
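As a minimal sketch of this step, the snippet below builds query parameters for a venue search and flattens a response into one row per venue. It assumes the legacy Foursquare v2 `venues/explore` endpoint and its response layout, and it assumes that each postal code has already been geocoded to a latitude and longitude; none of this is confirmed by the sources above. The helper names, the radius and limit defaults, and the sample payload are hypothetical and purely illustrative.

```python
from typing import Any

# Assumption: the legacy v2 "explore" endpoint; the report does not name the endpoint used.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_params(lat: float, lon: float, client_id: str, client_secret: str,
                         radius_m: int = 500, limit: int = 100,
                         version: str = "20210801") -> dict[str, Any]:
    """Build the query parameters for one neighbourhood centre (hypothetical helper)."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date, format YYYYMMDD
        "ll": f"{lat},{lon}",    # latitude,longitude of the postal code centre
        "radius": radius_m,      # search radius in metres (illustrative default)
        "limit": limit,          # maximum number of venues (illustrative default)
    }


def flatten_venues(postal_code: str, payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn one response payload into rows of postal code, venue name and category."""
    rows = []
    for group in payload.get("response", {}).get("groups", []):
        for item in group.get("items", []):
            venue = item.get("venue", {})
            categories = venue.get("categories") or [{}]
            rows.append({
                "postal_code": postal_code,
                "venue": venue.get("name"),
                "category": categories[0].get("name"),
            })
    return rows


# Illustrative payload in the assumed response layout (not real data).
sample_payload = {
    "response": {"groups": [{"items": [
        {"venue": {"name": "Example Park", "categories": [{"name": "Park"}]}},
        {"venue": {"name": "Example Café", "categories": [{"name": "Café"}]}},
    ]}]}
}

params = build_explore_params(43.65, -79.38, "CLIENT_ID", "CLIENT_SECRET")
rows = flatten_venues("M5A", sample_payload)
```

In practice the parameters would be sent to `EXPLORE_URL` with an HTTP client, once per postal code, and the resulting rows concatenated into a single table for both cities.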

Although auto-generated content should always be handled with care, it is assumed that Foursquare, one of the most widely used location service providers, gives a broadly accurate picture of city infrastructure. Furthermore, the venues identified cover a large number of categories, ranging from stores and restaurants to museums and stadiums. These categories, although not complete, are highly detailed and capture some of the main characteristics of local communities. Hence the Foursquare venues seem adequate for our analysis.

The Foursquare venue information was then used to select only those neighbourhoods whose venues included a "park" for further analysis. However, no further criteria, such as the size of the park, were taken into account. This might be an important factor to consider in future analyses.
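The selection step can be sketched as follows. This is a minimal illustration that assumes the venues are available as rows with a `postal_code` and a `category` field, and that a neighbourhood qualifies when at least one venue has the exact category "Park"; the field names, the function name and the sample rows are hypothetical.

```python
def neighbourhoods_with_park(rows, park_category="Park"):
    """Keep all venue rows that belong to a postal code with at least one park."""
    # Exact match on the category name (assumption); see the note below.
    park_codes = {r["postal_code"] for r in rows if r["category"] == park_category}
    return [r for r in rows if r["postal_code"] in park_codes]


# Illustrative rows (not real data).
rows = [
    {"postal_code": "M5A", "venue": "Example Park", "category": "Park"},
    {"postal_code": "M5A", "venue": "Example Café", "category": "Café"},
    {"postal_code": "M4K", "venue": "Example Garage", "category": "Parking"},
    {"postal_code": "10115", "venue": "Beispielpark", "category": "Park"},
]

selected = neighbourhoods_with_park(rows)
selected_codes = sorted({r["postal_code"] for r in selected})
```

An exact match is used here because a case-insensitive substring search for "park" would also match categories such as "Parking". Whether related categories (for example playgrounds or gardens) should count as green areas is a separate design decision that this sketch leaves open.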

The dataset was clustered using algorithms from the Scikit-Learn library, a widely used machine learning library used by companies such as Microsoft and Fujitsu (https://scikit-learn.org/stable/). It was initially developed by David Cournapeau as a Google Summer of Code project (https://en.wikipedia.org/wiki/Scikit-learn).
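The text does not state which clustering algorithm was used. As one possible sketch, the example below assumes k-means (`sklearn.cluster.KMeans`) applied to the share of each venue category per neighbourhood. The feature construction, the number of clusters, the function name and the sample rows are all assumptions made for illustration.

```python
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans


def category_frequencies(rows):
    """Return (postal codes, categories, matrix) where each matrix row holds
    the share of each venue category within one neighbourhood."""
    codes = sorted({r["postal_code"] for r in rows})
    cats = sorted({r["category"] for r in rows})
    counts = Counter((r["postal_code"], r["category"]) for r in rows)
    matrix = np.array([[counts[(c, k)] for k in cats] for c in codes], dtype=float)
    matrix /= matrix.sum(axis=1, keepdims=True)  # normalise counts to shares
    return codes, cats, matrix


# Illustrative venue rows (not real data).
rows = (
    [{"postal_code": "M5A", "category": c} for c in ["Park", "Café", "Café"]]
    + [{"postal_code": "M4K", "category": c} for c in ["Park", "Café", "Café", "Café"]]
    + [{"postal_code": "10115", "category": c} for c in ["Park", "Museum", "Museum"]]
    + [{"postal_code": "10117", "category": c} for c in ["Park", "Museum"]]
)

codes, cats, X = category_frequencies(rows)

# n_clusters=2 is an arbitrary choice for this toy example.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)
cluster_by_code = dict(zip(codes, labels))
```

Normalising counts to shares keeps neighbourhoods with many listed venues from dominating the distance calculation. For real data, the number of clusters would need to be chosen deliberately, for example by comparing inertia or silhouette scores across several values.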

The results were subsequently visualised to support interpretation and recommendations for next steps.