# Battle of Neighbourhoods - Sydney Water Bubblers 

### 1. Introduction

It is only when something is taken away that we realise just how important it was to our lives, and it is safe to say that the loss of outdoor drinking fountains and indoor water dispensers would have a significant impact on us. We expect to see them in schools, offices and even hotel lobbies and, luckily, we are rarely disappointed nowadays.

Different locations prefer different taps depending on who uses them. School and gym drinking fountains face constant demand for small, quick bursts of water, which is exactly what the bubbler tap is designed for: the user dips down, takes a mouthful of water and gets on with their day. Offices and other professional settings, by contrast, are unlikely to want staff dipping down to catch a quick mouthful, as it hardly projects a professional image. Even people who are not being active may need a quick drink, for example between classes or after lunch.

There is a large network of drinking fountains (water bubblers) dotted around the city of Sydney where one can get fresh water while they are walking or cycling. Most are in parks and playgrounds while some can also be found along main roads or near tourist attractions. Sydney water fountains are not only useful, they are something of a tourist attraction in themselves - with some more than 100 years old.


### 2. Business Problem

Research that evaluates access to and supply of drinking water in a variety of settings, such as open spaces and sports and recreation centres, can help ensure that places with higher demand have an adequate and continuous supply. This guide is based on such research findings and a review of drinking water fountains.

##### Consider location:
* Map out locations of existing water fountains and identify opportunities for installation  of new water fountains/refill stations
* Research indicates that if drinking water sources are not in prominent areas and blend in with surroundings, they are less likely to be used
* Units in poor locations or not installed on appropriate (flat) surfaces can make it difficult to access the water source

##### For new installation sites:
* Identify high pedestrian traffic areas, such as next to a playground
* Look for open spaces where there are opportunities to do physical activity
* Look for open spaces where there are planned picnic tables/BBQ facilities
* Check for the presence of a water bottle refill station

## Methodology

This report helped me detect areas of Sydney that have low water bubbler density, particularly those around crowded CBD areas.

In the first step I collected the required **data: the location and type (category) of every water bubbler in the Sydney CBD area**. I also **identified venues near those bubblers** (according to Foursquare categorization).

In the final step we focus on the most promising areas and, within those, create **clusters of bubbler locations that meet some basic requirements** established in discussion with stakeholders: we take into consideration locations with **venues around these bubblers**, and we want locations chosen **so that these venues can be served by an appropriate number of bubblers**. We present a map of all such locations and also cluster them (using **k-means clustering**) to identify general zones, neighbourhoods and addresses that should serve as a starting point for the final 'street level' exploration and the search for an optimal bubbler location by stakeholders.

### 3. Data

The data for this project was retrieved from multiple sources and processed with careful consideration of the accuracy of the methods used.


#### 3.1 Neighbourhoods Sydney Water Bubbler Data

The water bubbler data for Sydney can be downloaded from the [City of Sydney](https://data.cityofsydney.nsw.gov.au/datasets/166d56fa6d644397add849d6190fc388_0/data) site. The downloaded data is then read into a pandas DataFrame.

#### 3.2 Geocoding

The latitude and longitude of Sydney are retrieved with the Nominatim geocoder from `geopy` (section 5.13) and used to centre the map. The coordinates of each bubbler come directly from the downloaded dataset.

#### 3.3 Venue Data

From the location data in the downloaded dataset, the venue data is obtained by passing the required parameters to the Foursquare API and creating another DataFrame that holds the venue details together with the bubbler site they belong to.

### 4. Data Processing

#### 4.1 Importing Packages

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import folium
import requests
import json
from bs4 import BeautifulSoup
import matplotlib.cm as cm
import matplotlib.colors as colors
from geopy import OpenCage
from sklearn.cluster import KMeans

%matplotlib inline

#### 4.2 Reading data from CSV file

In [2]:
water_df = pd.read_csv('Drinking_fountains_water_bubblers.csv')
water_df.head()

Unnamed: 0,Longitude,Latitude,OBJECTID,site_name,Suburb,Location,Accessible
0,151.181929,-33.895707,1,WJ Thurbon Playground,Newtown,Park,No wheelchair access
1,151.192401,-33.879331,2,M J (Paddy) Dougherty Reserve,Glebe,Reserve,Wheelchair accessible
2,151.184712,-33.879971,3,St James Park,Glebe,Park,Wheelchair accessible
3,151.184337,-33.880301,4,St James Park,Glebe,Park,
4,151.184114,-33.880468,5,John St Reserve,Glebe,Park,


#### 4.3 Rename columns and drop unneeded ones

In [3]:
columns = ['Longitude', 'Latitude', 'Site', 'Suburb']
water_df.drop(['OBJECTID', 'Location', 'Accessible'], axis=1, inplace=True)
water_df.columns = columns
water_df.head()

Unnamed: 0,Longitude,Latitude,Site,Suburb
0,151.181929,-33.895707,WJ Thurbon Playground,Newtown
1,151.192401,-33.879331,M J (Paddy) Dougherty Reserve,Glebe
2,151.184712,-33.879971,St James Park,Glebe
3,151.184337,-33.880301,St James Park,Glebe
4,151.184114,-33.880468,John St Reserve,Glebe


#### 4.4 Removing rows with Null or Blanks

In [4]:
# drop rows where any of the four columns is missing
water_df = water_df.dropna(subset=['Longitude', 'Latitude', 'Site', 'Suburb'])
water_df.shape

(215, 4)
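
Because the stated goal is to find areas with few bubblers, a per-suburb count is a natural first check on the cleaned frame. The sketch below is only an illustration on a hand-built frame that reuses the first five rows shown earlier; the name `sample_df` is hypothetical, and in the notebook the real `water_df` would take its place.

```python
import pandas as pd

# Hypothetical mini-frame reusing the first rows of the bubbler data shown above.
sample_df = pd.DataFrame({
    'Site': ['WJ Thurbon Playground', 'M J (Paddy) Dougherty Reserve',
             'St James Park', 'St James Park', 'John St Reserve'],
    'Suburb': ['Newtown', 'Glebe', 'Glebe', 'Glebe', 'Glebe'],
})

# Count bubblers per suburb and sort ascending, so sparse suburbs come first.
bubblers_per_suburb = (
    sample_df.groupby('Suburb')
             .size()
             .sort_values()
             .rename('bubbler_count')
)
print(bubblers_per_suburb)
```

Raw counts ignore suburb area and foot traffic, so they are best read as a rough indicator rather than a true density.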

### 5. Foursquare API Call

#### 5.1 Preparing URL Calls

In [6]:
search_url = 'https://api.foursquare.com/v2/venues/search'
explore_url = 'https://api.foursquare.com/v2/venues/explore'

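import os

# Credentials are read from the environment instead of being hard-coded;
# the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are assumptions.
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID']
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET']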
VERSION = '20180605'

#### 5.2 Finding coordinates of one of the locations in Sydney

In [8]:
sample_bubbler = water_df.loc[0]
neighborhood_latitude = water_df.loc[0, 'Latitude'] # neighborhood latitude value
neighborhood_longitude = water_df.loc[0, 'Longitude'] # neighborhood longitude value

neighborhood_name = sample_bubbler.Suburb # neighborhood name

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_name, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of Newtown are -33.895707290374304, 151.18192926396196.


#### 5.3 Building the search URL for the sample location

In [9]:
search_query = sample_bubbler.Suburb
LIMIT = 100
radius = 500
url = 'https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&v={}&query={}&radius={}&limit={}'.format(CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, search_query, radius, LIMIT)
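
String formatting leaves the query text unescaped, which matters once a suburb name contains a space. One hedged alternative is to let the standard library encode the parameters. The helper name `build_search_url` and the placeholder credentials below are assumptions for illustration; the endpoint and parameter names are the ones used in the cell above.

```python
from urllib.parse import urlencode

SEARCH_URL = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, lat, lng, version,
                     query, radius=500, limit=100):
    """Return the search URL with every parameter percent-encoded."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(lat, lng),
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    return SEARCH_URL + '?' + urlencode(params)

# Placeholder credentials; a two-word suburb name shows the escaping.
example_url = build_search_url('YOUR_ID', 'YOUR_SECRET',
                               -33.895707, 151.181929,
                               '20180605', 'Surry Hills')
print(example_url)
```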

#### 5.4 Creating a function for the Foursquare API call

In [10]:
def getNearbyVenues(names, latitudes, longitudes, radius=500):
    
    venues_list=[]
    for name, lat, lng in zip(names, latitudes, longitudes):
        print(name)
            
        # create the API request URL
        url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
            CLIENT_ID, 
            CLIENT_SECRET, 
            VERSION, 
            lat, 
            lng, 
            radius, 
            LIMIT)
            
        # make the GET request
        results = requests.get(url).json()["response"]['groups'][0]['items']
        
        # return only relevant information for each nearby venue
        venues_list.append([(
            name, 
            lat, 
            lng, 
            v['venue']['name'], 
            v['venue']['location']['lat'], 
            v['venue']['location']['lng'],  
            v['venue']['categories'][0]['name']) for v in results])

    nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
    nearby_venues.columns = ['Neighborhood', 
                  'Neighborhood Latitude', 
                  'Neighborhood Longitude', 
                  'Venue', 
                  'Venue Latitude', 
                  'Venue Longitude', 
                  'Venue Category']
    
    return(nearby_venues)
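
The list comprehension above assumes every venue has at least one category; a venue without one would raise an `IndexError`. The following sketch shows one defensive way to flatten the `items` list. The function name `flatten_items`, the `'Uncategorized'` fallback and the hand-built payload (including the made-up "Unnamed Kiosk" entry) are assumptions; only the nested keys come from the code above.

```python
def flatten_items(name, lat, lng, items, fallback='Uncategorized'):
    """Turn Foursquare 'items' into flat tuples, tolerating missing categories."""
    rows = []
    for item in items:
        venue = item['venue']
        categories = venue.get('categories') or []
        # Use the first category when present, otherwise the fallback label.
        category = categories[0]['name'] if categories else fallback
        rows.append((name, lat, lng,
                     venue['name'],
                     venue['location']['lat'],
                     venue['location']['lng'],
                     category))
    return rows

# Hand-built payload: the second venue is hypothetical and has no category.
sample_items = [
    {'venue': {'name': 'Gelato Messina',
               'location': {'lat': -33.89606, 'lng': 151.18117},
               'categories': [{'name': 'Ice Cream Shop'}]}},
    {'venue': {'name': 'Unnamed Kiosk',
               'location': {'lat': -33.8957, 'lng': 151.1819},
               'categories': []}},
]
rows = flatten_items('WJ Thurbon Playground', -33.895707, 151.181929, sample_items)
print(rows)
```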

#### 5.5 Calling the function and storing nearby venues in a DataFrame

In [11]:
bubbler_venues = getNearbyVenues(names=water_df['Site'],
                                   latitudes=water_df['Latitude'],
                                   longitudes=water_df['Longitude']
                                  )

WJ Thurbon Playground
M J (Paddy) Dougherty Reserve
St James Park
St James Park
John St Reserve
May Pitt Playground
Seamer St Reserve
Lyons Rd Reserve
Glebe Foreshore Walk Stage 5
Peace Park
Oxford Square
Frog Hollow Reserve
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Sydney Park
Pitt Street Mall
Pitt Street Mall
Pitt Street Mall
Pitt Street Mall
Munn Reserve
Amy Street Reserve
Amy Street Reserve
5010 O'Connell Street
Elizabeth McCrea Playground
5040 MacDonald Street
Wentworth Park
Wentworth Park
Wentworth Park
Wentworth Park
Wentworth Park
Wentworth Park
5010 Morrissey Road
Hyde Park South
Hyde Park South
Hyde Park South
Hyde Park South
Hyde Park South
Hyde Park South
5020 Rothschild Avenue
5010 Archibald Avenue
4010 King Street
4010 King Street
4090 Erskineville Road
4020 Barcom Avenue
4220 Cleveland Street
4060 George Street
Matron Ruby Park - 3 Joynton Avenue


In [12]:
bubbler_venues.head()

Unnamed: 0,Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
0,WJ Thurbon Playground,-33.895707,151.181929,Gelato Messina,-33.89606,151.18117,Ice Cream Shop
1,WJ Thurbon Playground,-33.895707,151.181929,Dendy Cinemas,-33.896094,151.180544,Indie Movie Theater
2,WJ Thurbon Playground,-33.895707,151.181929,Brewtown Newtown,-33.893849,151.182478,Café
3,WJ Thurbon Playground,-33.895707,151.181929,Delhi 'O' Delhi,-33.897142,151.180542,Indian Restaurant
4,WJ Thurbon Playground,-33.895707,151.181929,Black Sheep,-33.895758,151.181115,Cocktail Bar


#### 5.6 Grouping by neighbourhood

In [13]:
bubbler_venues.groupby('Neighborhood').count()

Neighborhood,Neighborhood Latitude,Neighborhood Longitude,Venue,Venue Latitude,Venue Longitude,Venue Category
3000 Hay Street,100,100,100,100,100,100
4010 College Street,87,87,87,87,87,87
4010 King Street,99,99,99,99,99,99
4020 Barcom Avenue,100,100,100,100,100,100
4040 Kent Street,71,71,71,71,71,71
...,...,...,...,...,...,...
Waterloo Oval,60,60,60,60,60,60
Waterloo Park & Mount Carmel,17,17,17,17,17,17
Wattle & Broadway Rest Area,100,100,100,100,100,100
Wentworth Park,138,138,138,138,138,138


#### 5.7 Finding the number of unique venue categories

In [14]:
print('There are {} unique categories.'.format(len(bubbler_venues['Venue Category'].unique())))

There are 250 unique categories.


#### 5.8 One hot encoding

In [15]:
bubbler_onehot = pd.get_dummies(bubbler_venues[['Venue Category']], prefix="", prefix_sep="")

bubbler_onehot['Neighborhood'] = bubbler_venues['Neighborhood'] 

fixed_columns = [bubbler_onehot.columns[-1]] + list(bubbler_onehot.columns[:-1])
bubbler_onehot = bubbler_onehot[fixed_columns]

bubbler_onehot.head()

Unnamed: 0,Neighborhood,American Restaurant,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo Exhibit
0,WJ Thurbon Playground,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,WJ Thurbon Playground,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,WJ Thurbon Playground,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,WJ Thurbon Playground,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,WJ Thurbon Playground,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


#### 5.9 Grouping by neighbourhood again

In [16]:
bubbler_grouped = bubbler_onehot.groupby('Neighborhood').mean().reset_index()
bubbler_grouped

Unnamed: 0,Neighborhood,American Restaurant,Antique Shop,Aquarium,Arcade,Argentinian Restaurant,Art Gallery,Art Museum,Arts & Crafts Store,Asian Restaurant,...,Vegetarian / Vegan Restaurant,Video Game Store,Vietnamese Restaurant,Water Park,Whisky Bar,Wine Bar,Wine Shop,Wings Joint,Yoga Studio,Zoo Exhibit
0,3000 Hay Street,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.010000,0.0,0.010000,0.0,0.01,0.000000,0.0,0.0,0.01,0.0
1,4010 College Street,0.011494,0.0,0.0,0.00,0.0,0.011494,0.011494,0.0,0.0,...,0.011494,0.0,0.000000,0.0,0.00,0.000000,0.0,0.0,0.00,0.0
2,4010 King Street,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.020202,0.0,0.020202,0.0,0.00,0.020202,0.0,0.0,0.00,0.0
3,4020 Barcom Avenue,0.020000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.00,0.010000,0.0,0.0,0.01,0.0
4,4040 Kent Street,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.00,0.000000,0.0,0.0,0.00,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
145,Waterloo Oval,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.00,0.000000,0.0,0.0,0.00,0.0
146,Waterloo Park & Mount Carmel,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.00,0.000000,0.0,0.0,0.00,0.0
147,Wattle & Broadway Rest Area,0.000000,0.0,0.0,0.01,0.0,0.020000,0.000000,0.0,0.0,...,0.000000,0.0,0.010000,0.0,0.00,0.020000,0.0,0.0,0.00,0.0
148,Wentworth Park,0.000000,0.0,0.0,0.00,0.0,0.000000,0.000000,0.0,0.0,...,0.000000,0.0,0.000000,0.0,0.00,0.000000,0.0,0.0,0.00,0.0
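
Averaging one-hot columns per site is equivalent to computing each category's share of that site's venues. A small standard-library sketch of the same idea follows; the site names and venue list are hypothetical, and `category_shares` is not part of the notebook.

```python
from collections import Counter, defaultdict

def category_shares(pairs):
    """Map each site to {category: share of that site's venues}."""
    counts = defaultdict(Counter)
    for site, category in pairs:
        counts[site][category] += 1
    return {
        site: {cat: n / sum(tally.values()) for cat, n in tally.items()}
        for site, tally in counts.items()
    }

# Hypothetical (site, venue category) pairs.
pairs = [
    ('Site A', 'Café'), ('Site A', 'Café'),
    ('Site A', 'Pub'), ('Site A', 'Park'),
    ('Site B', 'Park'),
]
shares = category_shares(pairs)
print(shares)
```

Each site's shares sum to 1, which is why the values in the grouped table can be compared across sites regardless of how many venues were returned for each.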


#### 5.10 Finding top 5 venues around bubblers

In [18]:
num_top_venues = 5

for hood in bubbler_grouped['Neighborhood']:
    print("----"+hood+"----")
    temp = bubbler_grouped[bubbler_grouped['Neighborhood'] == hood].T.reset_index()
    temp.columns = ['venue','freq']
    temp = temp.iloc[1:]
    temp['freq'] = temp['freq'].astype(float)
    temp = temp.round({'freq': 2})
    print(temp.sort_values('freq', ascending=False).reset_index(drop=True).head(num_top_venues))
    print('\n')

----3000 Hay Street----
                 venue  freq
0      Thai Restaurant  0.16
1                 Café  0.14
2                Hotel  0.06
3   Chinese Restaurant  0.04
4  Japanese Restaurant  0.04


----4010 College Street----
          venue  freq
0          Café  0.17
1   Coffee Shop  0.05
2          Park  0.03
3        Museum  0.03
4  Dessert Shop  0.02


----4010 King Street----
                venue  freq
0                Café  0.20
1     Thai Restaurant  0.10
2         Pizza Place  0.06
3         Coffee Shop  0.04
4  Italian Restaurant  0.04


----4020 Barcom Avenue----
                venue  freq
0                Café  0.27
1  Italian Restaurant  0.07
2                 Bar  0.05
3                 Pub  0.05
4         Pizza Place  0.03


----4040 Kent Street----
                venue  freq
0                Café  0.14
1         Coffee Shop  0.08
2                 Bar  0.08
3               Hotel  0.07
4  Italian Restaurant  0.06


----4060 Cowper Street----
                venue  freq
...
4  Eastern European Restaurant  0.05


----5090 Harris Street----
                 venue  freq
0                 Café  0.20
1                 Park  0.05
2                  Pub  0.04
3       Sandwich Place  0.04
4  Japanese Restaurant  0.04


----5090 Kent Street----
                 venue  freq
0          Coffee Shop  0.11
1                 Café  0.08
2  Japanese Restaurant  0.06
3         Cocktail Bar  0.06
4                Hotel  0.05


----5090 King Street----
            venue  freq
0            Café  0.12
1     Coffee Shop  0.05
2       Speakeasy  0.05
3   Shopping Mall  0.04
4  Sandwich Place  0.04


----Alexandria Park----
                   venue  freq
0                   Café  0.27
1                    Pub  0.09
2  Vietnamese Restaurant  0.06
3           Liquor Store  0.04
4                    Bar  0.04


----Amy Street Reserve----
                  venue  freq
0                  Café  0.22
1                   Pub  0.08
2       Thai Restaurant  0.07
3               Theater  0.
...



----Harry Noble Reserve----
                venue  freq
0                Café  0.21
1                 Pub  0.08
2  Italian Restaurant  0.08
3              Bistro  0.04
4           Pet Store  0.04


----Hollis Park----
                           venue  freq
0                Thai Restaurant  0.11
1                           Café  0.11
2                    Pizza Place  0.05
3  Vegetarian / Vegan Restaurant  0.03
4                   Burger Joint  0.03


----Hyde Park North----
             venue  freq
0             Café  0.12
1      Coffee Shop  0.06
2    Shopping Mall  0.05
3  Thai Restaurant  0.04
4            Hotel  0.04


----Hyde Park South----
                 venue  freq
0                 Café  0.13
1                Hotel  0.08
2      Thai Restaurant  0.07
3          Coffee Shop  0.05
4  Japanese Restaurant  0.05


----Jack Haynes Rest Area----
                 venue  freq
0                 Café  0.15
1      Thai Restaurant  0.13
2            Bookstore  0.04
3  Japanese Restaurant
...

                 venue  freq
0                 Café  0.19
1          Coffee Shop  0.06
2                  Bar  0.06
3  Dumpling Restaurant  0.03
4         Burger Joint  0.03


----Wentworth Park----
                venue  freq
0                Café  0.19
1  Seafood Restaurant  0.14
2                 Pub  0.09
3         Fish Market  0.07
4    Sushi Restaurant  0.04


----Wynyard Park----
          venue  freq
0          Café  0.12
1           Bar  0.10
2  Cocktail Bar  0.07
3     Speakeasy  0.07
4   Coffee Shop  0.06




#### 5.11 Function to return most common venues

In [20]:
def return_most_common_venues(row, num_top_venues):
    row_categories = row.iloc[1:]
    row_categories_sorted = row_categories.sort_values(ascending=False)
    
    return row_categories_sorted.index.values[0:num_top_venues]

num_top_venues = 10

indicators = ['st', 'nd', 'rd']

# create columns according to number of top venues
columns = ['Neighborhood']
for ind in np.arange(num_top_venues):
    try:
        columns.append('{}{} Most Common Venue'.format(ind+1, indicators[ind]))
    except IndexError:
        columns.append('{}th Most Common Venue'.format(ind+1))

# create a new dataframe
neighborhoods_venues_sorted = pd.DataFrame(columns=columns)
neighborhoods_venues_sorted['Neighborhood'] = bubbler_grouped['Neighborhood']

for ind in np.arange(bubbler_grouped.shape[0]):
    neighborhoods_venues_sorted.iloc[ind, 1:] = return_most_common_venues(bubbler_grouped.iloc[ind, :], num_top_venues)

neighborhoods_venues_sorted.head()

Unnamed: 0,Neighborhood,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,3000 Hay Street,Thai Restaurant,Café,Hotel,Chinese Restaurant,Japanese Restaurant,Coffee Shop,Breakfast Spot,Sandwich Place,Korean Restaurant,Pizza Place
1,4010 College Street,Café,Coffee Shop,Museum,Park,Burger Joint,Australian Restaurant,Thai Restaurant,Shopping Mall,Bar,Nightclub
2,4010 King Street,Café,Thai Restaurant,Pizza Place,Bar,Coffee Shop,Italian Restaurant,Burger Joint,Pub,Southern / Soul Food Restaurant,Spanish Restaurant
3,4020 Barcom Avenue,Café,Italian Restaurant,Pub,Bar,Pizza Place,American Restaurant,Cocktail Bar,Nightclub,Lounge,Japanese Restaurant
4,4040 Kent Street,Café,Coffee Shop,Bar,Hotel,Italian Restaurant,Speakeasy,Restaurant,Bakery,Seafood Restaurant,Steakhouse
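
The `indicators` lookup is sufficient for ten columns, but a 21st column would be labelled `21th`. If `num_top_venues` is ever raised, a general suffix helper avoids that. `ordinal` below is a hypothetical helper, not part of the notebook.

```python
def ordinal(n):
    """Return n with its English ordinal suffix, e.g. 1 -> '1st', 12 -> '12th'."""
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th and the other teens
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return '{}{}'.format(n, suffix)

# Same column layout as above, built with the helper.
columns = ['Neighborhood'] + [
    '{} Most Common Venue'.format(ordinal(i)) for i in range(1, 11)
]
print(columns)
```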


#### 5.12 Finding clusters around neighbourhood

In [22]:
# set number of clusters
kclusters = 5

bubbler_grouped_clustering = bubbler_grouped.drop(columns='Neighborhood')

# run k-means clustering
kmeans = KMeans(n_clusters=kclusters, random_state=0).fit(bubbler_grouped_clustering)


# add clustering labels
neighborhoods_venues_sorted.insert(0, 'Cluster Labels', kmeans.labels_)

bubbler_merged = water_df

# join the cluster labels and top venues onto the bubbler table, matching on site name
bubbler_merged = bubbler_merged.join(neighborhoods_venues_sorted.set_index('Neighborhood'), on='Site')

bubbler_merged.head() # check the last columns!

Unnamed: 0,Longitude,Latitude,Site,Suburb,Cluster Labels,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,151.181929,-33.895707,WJ Thurbon Playground,Newtown,4,Café,Thai Restaurant,Bar,Ice Cream Shop,Bakery,Cocktail Bar,Pub,Coffee Shop,Deli / Bodega,Japanese Restaurant
1,151.192401,-33.879331,M J (Paddy) Dougherty Reserve,Glebe,0,Café,Eastern European Restaurant,Italian Restaurant,Coffee Shop,Bar,Clothing Store,Movie Theater,Farmers Market,Tapas Restaurant,Dumpling Restaurant
2,151.184712,-33.879971,St James Park,Glebe,3,Café,Pub,Pizza Place,Indian Restaurant,Thai Restaurant,Bakery,Japanese Restaurant,Eastern European Restaurant,Vietnamese Restaurant,Breakfast Spot
3,151.184337,-33.880301,St James Park,Glebe,3,Café,Pub,Pizza Place,Indian Restaurant,Thai Restaurant,Bakery,Japanese Restaurant,Eastern European Restaurant,Vietnamese Restaurant,Breakfast Spot
4,151.184114,-33.880468,John St Reserve,Glebe,3,Café,Pub,Pizza Place,Thai Restaurant,Indian Restaurant,Bakery,Japanese Restaurant,Coffee Shop,Paper / Office Supplies Store,Beer Garden
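
The value `kclusters = 5` is set by hand. One common way to sanity-check such a choice is to compare silhouette scores across several values of k. The sketch below does this on synthetic data (three well-separated blobs), since the real frequency matrix is not reproduced here; the helper name `best_k_by_silhouette` and the candidate range are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_k_by_silhouette(X, candidates=range(2, 7), random_state=0):
    """Fit k-means for each candidate k and return (best k, all scores)."""
    scores = {}
    for k in candidates:
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=random_state).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
    return max(scores, key=scores.get), scores

# Synthetic stand-in for the frequency matrix: three tight, distant blobs.
rng = np.random.RandomState(0)
centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([c + rng.normal(scale=0.5, size=(30, 2)) for c in centres])

best_k, scores = best_k_by_silhouette(X)
print(best_k)
```

On the notebook's data, `bubbler_grouped_clustering` would be passed in place of `X`; the score is a heuristic, so the result is a hint rather than a definitive answer.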


#### 5.13 Finding geographical coordinates of Sydney

In [23]:
from geopy.geocoders import Nominatim 
address = 'Sydney, NSW'

geolocator = Nominatim(user_agent="Sydney_explorer")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print('The geographical coordinates of Sydney are {}, {}.'.format(latitude, longitude))

The geographical coordinates of Sydney are -33.8888621, 151.2048978618509.


#### 5.14 Mapping clusters on Sydney map

In [35]:
# create map
map_clusters = folium.Map(location=[latitude, longitude], zoom_start=13)

# set color scheme for the clusters
x = np.arange(kclusters)
ys = [i + x + (i*x)**2 for i in range(kclusters)]
colors_array = cm.rainbow(np.linspace(0, 1, len(ys)))
rainbow = [colors.rgb2hex(i) for i in colors_array]

# add markers to the map
markers_colors = []
for lat, lon, poi, cluster in zip(bubbler_merged['Latitude'], bubbler_merged['Longitude'], bubbler_merged['Site'], bubbler_merged['Cluster Labels']):
    label = folium.Popup(str(poi) + ' Cluster ' + str(cluster), parse_html=True)
    folium.CircleMarker(
        [lat, lon],
        radius=5,
        popup=label,
        color=rainbow[cluster-1],
        fill=True,
        fill_color=rainbow[cluster-1],
        fill_opacity=0.7).add_to(map_clusters)
       
map_clusters
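
The colour set-up above can be simplified: only one colour per cluster label is needed, and indexing the palette by the label itself avoids the `cluster-1` offset. A standard-library sketch follows, with `cluster_palette` as a hypothetical helper and the saturation and brightness values chosen arbitrarily.

```python
import colorsys

def cluster_palette(k):
    """Return k evenly spaced hex colours, one per cluster label 0..k-1."""
    palette = []
    for i in range(k):
        # Spread hues evenly around the colour wheel.
        r, g, b = colorsys.hsv_to_rgb(i / k, 0.8, 0.9)
        palette.append('#{:02x}{:02x}{:02x}'.format(
            round(r * 255), round(g * 255), round(b * 255)))
    return palette

palette = cluster_palette(5)
print(palette)
```

With such a palette the marker colour could be set as `color=palette[cluster]`, assuming the cluster labels are integers.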

### 6. Clustered Neighbourhood

#### 6.1 Cluster 1

In [27]:
bubbler_merged.loc[bubbler_merged['Cluster Labels'] == 1, bubbler_merged.columns[[2] + list(range(5, bubbler_merged.shape[1]))]]

Unnamed: 0,Site,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
86,Southern Cross Dr Reserve,Golf Course,Hotel Bar,Food & Drink Shop,Supermarket,Park,Shopping Mall,Farmers Market,Event Space,Eastern European Restaurant,Electronics Store
114,Kimberly Grove Reserve,Golf Course,Cosmetics Shop,Thai Restaurant,Bakery,Café,Fish Market,Fish & Chips Shop,Field,Eastern European Restaurant,Fast Food Restaurant
115,Bannerman Cres.Reserve,Golf Course,Thai Restaurant,Park,Zoo Exhibit,Eastern European Restaurant,Food Court,Food & Drink Shop,Flea Market,Fish Market,Fish & Chips Shop


#### 6.2 Cluster 2

In [28]:
bubbler_merged.loc[bubbler_merged['Cluster Labels'] == 2, bubbler_merged.columns[[2] + list(range(5, bubbler_merged.shape[1]))]]

Unnamed: 0,Site,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
108,Harold Park,Café,Ice Cream Shop,Italian Restaurant,Pub,Pizza Place,Pet Store,Park,Middle Eastern Restaurant,Japanese Restaurant,Hostel
127,Federal Park 1,Park,Café,Ice Cream Shop,Steakhouse,Fish & Chips Shop,Food Court,Fried Chicken Joint,Italian Restaurant,Japanese Restaurant,Light Rail Station
128,Jubilee Park,Park,Hostel,Italian Restaurant,Pizza Place,Pier,Pet Store,Middle Eastern Restaurant,Japanese Restaurant,Ice Cream Shop,Steakhouse
135,Blackwattle Bay Park,Park,Harbor / Marina,Hostel,Café,Seafood Restaurant,Pizza Place,Wine Bar,Sri Lankan Restaurant,Monument / Landmark,Australian Restaurant
192,5030 Glebe Point Road,Park,Pizza Place,Hostel,Café,Harbor / Marina,Italian Restaurant,American Restaurant,Pub,Fried Chicken Joint,Breakfast Spot
205,Bicentennial Park 1,Park,Harbor / Marina,Hostel,Light Rail Station,Soccer Field,Pier,Vietnamese Restaurant,Ice Cream Shop,Spanish Restaurant,Pizza Place
206,Bicentennial Park 1,Park,Harbor / Marina,Hostel,Light Rail Station,Soccer Field,Pier,Vietnamese Restaurant,Ice Cream Shop,Spanish Restaurant,Pizza Place
207,Bicentennial Park 1,Park,Harbor / Marina,Hostel,Light Rail Station,Soccer Field,Pier,Vietnamese Restaurant,Ice Cream Shop,Spanish Restaurant,Pizza Place
208,Bicentennial Park 2,Park,Light Rail Station,Café,Middle Eastern Restaurant,Spanish Restaurant,Soccer Field,Pizza Place,Pier,Pet Store,Ice Cream Shop
209,Bicentennial Park 2,Park,Light Rail Station,Café,Middle Eastern Restaurant,Spanish Restaurant,Soccer Field,Pizza Place,Pier,Pet Store,Ice Cream Shop


#### 6.3 Cluster 3

In [29]:
bubbler_merged.loc[bubbler_merged['Cluster Labels'] == 3, bubbler_merged.columns[[2] + list(range(5, bubbler_merged.shape[1]))]]

Unnamed: 0,Site,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
2,St James Park,Café,Pub,Pizza Place,Indian Restaurant,Thai Restaurant,Bakery,Japanese Restaurant,Eastern European Restaurant,Vietnamese Restaurant,Breakfast Spot
3,St James Park,Café,Pub,Pizza Place,Indian Restaurant,Thai Restaurant,Bakery,Japanese Restaurant,Eastern European Restaurant,Vietnamese Restaurant,Breakfast Spot
4,John St Reserve,Café,Pub,Pizza Place,Thai Restaurant,Indian Restaurant,Bakery,Japanese Restaurant,Coffee Shop,Paper / Office Supplies Store,Beer Garden
5,May Pitt Playground,Café,Pub,Thai Restaurant,Pizza Place,Yoga Studio,Portuguese Restaurant,Harbor / Marina,Motorcycle Shop,Gym / Fitness Center,Paper / Office Supplies Store
7,Lyons Rd Reserve,Café,Convenience Store,Pizza Place,Korean Restaurant,Motorcycle Shop,Brewery,Sushi Restaurant,Donut Shop,Furniture / Home Store,Climbing Gym
...,...,...,...,...,...,...,...,...,...,...,...
190,5040 Bridge Road,Café,Pub,Pizza Place,Bakery,Eastern European Restaurant,Thai Restaurant,Indian Restaurant,Japanese Restaurant,Lebanese Restaurant,Tapas Restaurant
193,5010 Minogue Crescent,Café,Grocery Store,Pub,Snack Place,Pizza Place,Fried Chicken Joint,Sushi Restaurant,Garden Center,Climbing Gym,Liquor Store
197,5040 Renwick Street,Café,Vietnamese Restaurant,Park,Italian Restaurant,Pub,Poke Place,Bakery,Shoe Store,Breakfast Spot,Market
210,Dr H J Foley Rest Park,Café,Pub,Pizza Place,Bakery,Indian Restaurant,Coffee Shop,Thai Restaurant,Bar,Japanese Restaurant,Italian Restaurant


#### 6.4 Cluster 4

In [30]:
bubbler_merged.loc[bubbler_merged['Cluster Labels'] == 4, bubbler_merged.columns[[2] + list(range(5, bubbler_merged.shape[1]))]]

Unnamed: 0,Site,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
0,WJ Thurbon Playground,Café,Thai Restaurant,Bar,Ice Cream Shop,Bakery,Cocktail Bar,Pub,Coffee Shop,Deli / Bodega,Japanese Restaurant
27,Pitt Street Mall,Café,Coffee Shop,Cocktail Bar,Hotel,Bar,Shopping Mall,Speakeasy,Thai Restaurant,Sandwich Place,Bookstore
28,Pitt Street Mall,Café,Coffee Shop,Cocktail Bar,Hotel,Bar,Shopping Mall,Speakeasy,Thai Restaurant,Sandwich Place,Bookstore
29,Pitt Street Mall,Café,Coffee Shop,Cocktail Bar,Hotel,Bar,Shopping Mall,Speakeasy,Thai Restaurant,Sandwich Place,Bookstore
30,Pitt Street Mall,Café,Coffee Shop,Cocktail Bar,Hotel,Bar,Shopping Mall,Speakeasy,Thai Restaurant,Sandwich Place,Bookstore
34,5010 O'Connell Street,Café,Thai Restaurant,Pizza Place,Bar,Italian Restaurant,Japanese Restaurant,Juice Bar,Park,Pie Shop,Coffee Shop
44,Hyde Park South,Café,Hotel,Thai Restaurant,Japanese Restaurant,Coffee Shop,Korean Restaurant,Italian Restaurant,Park,Chinese Restaurant,Greek Restaurant
45,Hyde Park South,Café,Hotel,Thai Restaurant,Japanese Restaurant,Coffee Shop,Korean Restaurant,Italian Restaurant,Park,Chinese Restaurant,Greek Restaurant
46,Hyde Park South,Café,Hotel,Thai Restaurant,Japanese Restaurant,Coffee Shop,Korean Restaurant,Italian Restaurant,Park,Chinese Restaurant,Greek Restaurant
47,Hyde Park South,Café,Hotel,Thai Restaurant,Japanese Restaurant,Coffee Shop,Korean Restaurant,Italian Restaurant,Park,Chinese Restaurant,Greek Restaurant


#### 6.5 Cluster 5

In [31]:
bubbler_merged.loc[bubbler_merged['Cluster Labels'] == 0, bubbler_merged.columns[[2] + list(range(5, bubbler_merged.shape[1]))]]

Unnamed: 0,Site,1st Most Common Venue,2nd Most Common Venue,3rd Most Common Venue,4th Most Common Venue,5th Most Common Venue,6th Most Common Venue,7th Most Common Venue,8th Most Common Venue,9th Most Common Venue,10th Most Common Venue
1,M J (Paddy) Dougherty Reserve,Café,Eastern European Restaurant,Italian Restaurant,Coffee Shop,Bar,Clothing Store,Movie Theater,Farmers Market,Tapas Restaurant,Dumpling Restaurant
6,Seamer St Reserve,Café,Coffee Shop,Bar,Burger Joint,Pub,Japanese Restaurant,Eastern European Restaurant,Pizza Place,Indian Restaurant,Tapas Restaurant
8,Glebe Foreshore Walk Stage 5,Seafood Restaurant,Café,Bakery,Park,Pub,Fish Market,Pizza Place,Coffee Shop,Grocery Store,Dog Run
9,Peace Park,Café,Bar,Pub,Thai Restaurant,Burger Joint,Electronics Store,Chinese Restaurant,Supermarket,Multiplex,Dumpling Restaurant
10,Oxford Square,Café,Italian Restaurant,Pizza Place,Japanese Restaurant,Hotel,Bar,Thai Restaurant,Wine Bar,Cocktail Bar,Burger Joint
...,...,...,...,...,...,...,...,...,...,...,...
202,Fitzroy Gardens,Café,Italian Restaurant,Australian Restaurant,Hotel,Japanese Restaurant,Coffee Shop,Pub,Thai Restaurant,Sushi Restaurant,Supermarket
203,Rushcutters Bay Park,Café,Harbor / Marina,Italian Restaurant,Park,Pizza Place,Australian Restaurant,Bar,Thai Restaurant,Japanese Restaurant,Wine Bar
204,Rushcutters Bay Park,Café,Harbor / Marina,Italian Restaurant,Park,Pizza Place,Australian Restaurant,Bar,Thai Restaurant,Japanese Restaurant,Wine Bar
211,Thomas Portley Reserve,Seafood Restaurant,Café,Bakery,Fish Market,Thai Restaurant,Farmers Market,Pizza Place,Park,Sushi Restaurant,Bar
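
To compare the clusters at a glance, it can help to tally the most frequent top venue category within each one. The sketch below does so with the standard library on a handful of `(cluster label, 1st most common venue)` pairs copied from the tables above; `top_category_per_cluster` is a hypothetical helper, and the sample covers only the rows listed, not the full clusters.

```python
from collections import Counter, defaultdict

def top_category_per_cluster(pairs):
    """Map each cluster label to its most frequent category and its count."""
    tallies = defaultdict(Counter)
    for label, category in pairs:
        tallies[label][category] += 1
    return {label: tally.most_common(1)[0] for label, tally in tallies.items()}

# (cluster label, 1st most common venue) pairs taken from the tables above.
pairs = [
    (1, 'Golf Course'), (1, 'Golf Course'), (1, 'Golf Course'),
    (2, 'Café'), (2, 'Park'), (2, 'Park'), (2, 'Park'), (2, 'Park'),
    (0, 'Café'), (0, 'Café'), (0, 'Seafood Restaurant'), (0, 'Café'), (0, 'Café'),
]
summary = top_category_per_cluster(pairs)
print(summary)
```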


## Results and Discussions

Our analysis shows that although there is a great number of bubblers in Sydney, there are pockets of low water bubbler density fairly close to popular venues.

The result is 5 zones containing the largest number of potential new water bubbler locations, based on the number of and distance to existing venues, mainly restaurants and cafés. This, of course, does not imply that those zones are actually optimal locations for a new water bubbler! The purpose of this analysis was only to provide information on areas close to the Sydney CBD that are not crowded with existing water bubblers. The recommended zones should therefore be considered only as a starting point for more detailed analysis.

## Conclusion

The purpose of this project was to identify Sydney CBD areas with a low number of water bubblers in order to help the Sydney Water Department narrow down the search for the optimal location of a new water bubbler. By calculating the venue density distribution from Foursquare data, we first identified general neighbourhoods and then generated an extensive collection of locations that satisfy some basic requirements regarding existing nearby bubblers. These locations were then clustered to create major zones of interest (containing the greatest number of potential locations), and addresses of those zone centres were created to be used as starting points for final exploration by the Sydney Water Department.

The final decision on the optimal bubbler location will be made by the Sydney Water Department based on the specific characteristics of the neighbourhoods and locations in every recommended zone, taking into consideration additional factors such as the attractiveness of each location (proximity to a park or restaurant).