# Peer-graded Assignment: Capstone Project - The Battle of Neighbourhoods (Week 2)


# If maps don't show in your browser

<b>Open the URL</b> https://nbviewer.jupyter.org/, paste the GitHub address of the project, and select <b>Go</b>.

The resulting page renders the project, including the maps.




## Description of the problem and discussion of the background

<b>Melbourne</b> is the cultural capital of Australia, known for its music, art centres, museums, and celebration of the arts. Melbourne also has a very strong food culture. Melburnians' dining choices are extremely diverse, with cuisine from all over the world. If you are a true foodie who loves an aesthetically pleasing meal, you will be in heaven exploring the food spots in Melbourne. From beautiful pizza and pasta in Carlton (Melbourne's version of Little Italy), to amazing sushi spots with fresh seafood caught hours prior, to vegan-friendly cafes, there is a ton of variety that will keep you on your toes. The city takes its food culture very seriously, and its restaurants constantly change their menus to keep up with culinary trends.

In this project I will identify potential locations for an <b>Italian restaurant</b> in the City of Melbourne, Australia. The City of Melbourne comprises several suburbs, including the CBD and the suburbs surrounding it. Because Carlton (Melbourne's version of Little Italy) sits right next to the CBD, it is critical to <b>identify a suitable location for an Italian restaurant</b> that takes the established restaurants in the area into account. With a significant number of restaurants in the CBD and surrounding suburbs, it will be important to identify locations that are not crowded with established restaurants, including <b>locations that do not have any current Italian restaurants</b>.

Key audience members that would be interested in understanding the outcome for this problem include:

 - <b>Business owners</b>: With the impact of restrictions associated with COVID-19, a business owner will be very interested in confirming that the location of their new restaurant gives it the best chance of success. A restaurant placed in the wrong location, where it must compete against many other food service businesses, may fail.
 - <b>Local and state government officials</b>: These officials need to understand the optimal locations for businesses. Because they approve businesses to operate in a location, they would want to know whether a type of business is proposing to enter an already saturated area.
 - <b>Residential tenants</b>: When people decide where to live, a solution that highlights the businesses located in a particular area may influence their choice.

Reflecting on these considerations, I will create maps that identify restaurant densities and areas where a new restaurant would not have to compete against other restaurants, especially other Italian restaurants. This will show potential stakeholders the locations that offer the best chance of founding a successful restaurant.

## Description of the data and how it is used to solve the problem

Based on the problem definition, the following data sources are used:

 - I used the <b>Foursquare API</b> to identify the number of restaurants, and their type and location, in every neighbourhood
 - I used <b>Nominatim geocoding</b> to get the coordinates for Melbourne and the identified neighbourhood centres
 - I used <b>Nominatim reverse geocoding</b> to generate approximate addresses of neighbourhood centres
 - I used the JSON file from <b>VIC Suburb/Locality Boundaries</b> (https://data.gov.au/dataset/ds-dga-af33dd8c-0534-4e18-9245-fc64440f742e/details?q=) to draw the borders of City of Melbourne boroughs on the map

## Neighbourhood Candidates

I need to create latitude and longitude coordinates for the centroids of our candidate neighbourhoods. We will create a grid of cells covering our area of interest, which is approx. 12x12 kilometres centred on the City of Melbourne centre.

I first need to find the latitude and longitude of Melbourne city centre by geocoding the query 'Melbourne, Australia' with Nominatim.

In [1]:
# @hidden_cell

import numpy as np # library to handle data in a vectorized manner

import pandas as pd

from bs4 import BeautifulSoup # this module helps in web scraping.
import requests  # this module helps us to download a web page

import csv

#!conda install -c conda-forge geopy --yes # uncomment this line if you haven't completed the Foursquare API lab
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

#!conda install -c conda-forge folium=0.5.0 --yes # uncomment this line if you haven't completed the Foursquare API lab
import folium # map rendering library

from pandas import json_normalize # transform JSON file into a pandas dataframe (pandas >= 1.0; formerly pandas.io.json)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('Libraries imported.')

Libraries imported.


In [2]:
#from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="my_user_agent")
city ="Melbourne"
country ="Australia"
loc = geolocator.geocode(city+','+ country)
melbourne_centre = loc.latitude, loc.longitude

print('Coordinate of {}, {}'.format(city, country), "is latitude :-" ,loc.latitude," longitude is:-" ,loc.longitude)

Coordinate of Melbourne, Australia is latitude :- -37.8142176  longitude is:- 144.9631608


I'll create a grid of equally spaced area candidates, centred on Melbourne city centre and within ~6km of it. The neighbourhoods will be defined as circular areas with a radius of 300 metres, so our neighbourhood centres will be 600 metres apart.

To accurately calculate distances we need to create a grid of locations in Cartesian 2D coordinate system which allows us to calculate distances in metres (not in latitude/longitude degrees). Then we'll project those coordinates back to latitude/longitude degrees to be shown on Folium map. So let's create functions to convert between WGS84 spherical coordinate system (latitude/longitude degrees) and UTM Cartesian coordinate system (X/Y coordinates in metres).

In [3]:
import warnings
warnings.filterwarnings('ignore')

#!pip install shapely
import shapely.geometry
from pyproj import Transformer

#!pip install pyproj
import pyproj

import math

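# NOTE: Melbourne (longitude ~145 E) lies in UTM zone 55 South, whereas zone=33 below is
# centred on 15 E. The zone is left unchanged so the code matches the recorded outputs,
# but distances computed this far from the zone's central meridian may be distorted.
# pyproj.transform is also deprecated in recent pyproj releases in favour of pyproj.Transformer.
def lonlat_to_xy(lon, lat):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    xy = pyproj.transform(proj_latlon, proj_xy, lon, lat)
    return xy[0], xy[1]

def xy_to_lonlat(x, y):
    proj_latlon = pyproj.Proj(proj='latlong',datum='WGS84')
    proj_xy = pyproj.Proj(proj="utm", zone=33, datum='WGS84')
    lonlat = pyproj.transform(proj_xy, proj_latlon, x, y)
    return lonlat[0], lonlat[1]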

def calc_xy_distance(x1, y1, x2, y2):
    dx = x2 - x1
    dy = y2 - y1
    return math.sqrt(dx*dx + dy*dy)

print('Coordinate transformation check')
print('-------------------------------')
print('Melbourne center longitude={}, latitude={}'.format(loc.longitude, loc.latitude))
x, y = lonlat_to_xy(loc.longitude, loc.latitude)
print('Melbourne center UTM X={}, Y={}'.format(x, y))
lo, la = xy_to_lonlat(x, y)
print('Melbourne center longitude={}, latitude={}'.format(lo, la))

Coordinate transformation check
-------------------------------
Melbourne center longitude=144.9631608, latitude=-37.8142176
Melbourne center UTM X=4980281.116219562, Y=-14408028.424977692
Melbourne center longitude=144.96316080000003, latitude=-37.814217600000006
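
As a cross-check on the projection zone used above, the standard UTM zone can be derived from a longitude with a short formula. The sketch below is not part of the original notebook: the helper names `utm_zone` and `utm_epsg` are hypothetical, and the special-case zones around Norway and Svalbard are ignored.

```python
import math

def utm_zone(lon_deg):
    """Standard 6-degree UTM zone number (1-60) for a longitude in degrees."""
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1

def utm_epsg(lon_deg, lat_deg):
    """EPSG code of the WGS84 UTM zone: 326xx in the north, 327xx in the south."""
    base = 32600 if lat_deg >= 0 else 32700
    return base + utm_zone(lon_deg)

# Coordinates returned by the geocoding cell above
melbourne_lat, melbourne_lon = -37.8142176, 144.9631608

print(utm_zone(melbourne_lon))                 # 55
print(utm_epsg(melbourne_lon, melbourne_lat))  # 32755
```

Under these assumptions Melbourne falls in zone 55 South, so a projection built for that zone would be the natural choice if the grid were regenerated.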


Let's create a <b>hexagonal grid of cells</b>: we offset every other row, and adjust vertical row spacing so that <b>every cell centre is equally distant from all its neighbours</b>.

In [4]:
melbourne_centre_x, melbourne_centre_y = lonlat_to_xy(loc.longitude, loc.latitude) # City center in Cartesian coordinates

k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_min = melbourne_centre_x - 6000
x_step = 600
y_min = melbourne_centre_y - 6000 - (int(21/k)*k*600 - 12000)/2
y_step = 600 * k 

latitudes = []
longitudes = []
distances_from_center = []
xs = []
ys = []
for i in range(0, int(21/k)):
    y = y_min + i * y_step
    x_offset = 300 if i%2==0 else 0
    for j in range(0, 21):
        x = x_min + j * x_step + x_offset
        distance_from_center = calc_xy_distance(melbourne_centre_x, melbourne_centre_y, x, y)
        if (distance_from_center <= 6001):
            lon, lat = xy_to_lonlat(x, y)
            latitudes.append(lat)
            longitudes.append(lon)
            distances_from_center.append(distance_from_center)
            xs.append(x)
            ys.append(y)

print(len(latitudes), 'candidate neighborhood centres generated.')

364 candidate neighborhood centres generated.
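
The equal-distance property of the hexagonal grid can be checked with a small standalone sketch. This is a simplified variant centred on the origin, not the notebook's exact loop bounds; the function names `hex_grid` and `neighbour_distances` are hypothetical.

```python
import math

def hex_grid(cx, cy, spacing, radius):
    """Return (x, y) centres of a hexagonal lattice within `radius` of (cx, cy)."""
    k = math.sqrt(3) / 2                    # row spacing as a fraction of column spacing
    n = int(radius / (spacing * k)) + 1     # rows needed on each side of the centre
    points = []
    for i in range(-n, n + 1):
        y = cy + i * spacing * k
        x_offset = spacing / 2 if i % 2 else 0.0   # shift every other row by half a cell
        for j in range(-n - 1, n + 2):
            x = cx + j * spacing + x_offset
            if math.hypot(x - cx, y - cy) <= radius:
                points.append((x, y))
    return points

def neighbour_distances(p, points, limit):
    """Distances from p to every other point closer than `limit`."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for q in points
            if q != p and math.hypot(p[0] - q[0], p[1] - q[1]) < limit]

grid = hex_grid(0.0, 0.0, spacing=600.0, radius=6000.0)
closest = neighbour_distances((0.0, 0.0), grid, limit=601.0)
print(len(closest), round(min(closest), 6), round(max(closest), 6))
```

With a column spacing of 600 and a row spacing of 600·√3/2, a point in an adjacent row is √(300² + 519.6²) ≈ 600 away, so each interior centre has six neighbours at the same distance.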


Let's visualise the data we have so far: city centre location and candidate neighbourhood centres:

In [5]:
map_melbourne = folium.Map(location=melbourne_centre, zoom_start=13)
folium.Marker(melbourne_centre, popup='Melbourne').add_to(map_melbourne)
for lat, lon in zip(latitudes, longitudes):
    folium.Circle([lat, lon], radius=238, color='blue', fill=False).add_to(map_melbourne)
map_melbourne

OK, we now have the coordinates of the centres of the neighbourhoods/areas to be evaluated, equally spaced (the distance from every point to its neighbours is exactly the same) and within ~6km of Melbourne city centre.

Let's now use Nominatim to get approximate addresses of those locations.

In [6]:
locator = Nominatim(user_agent='myGeocoder')
coordinates = loc.latitude, loc.longitude
location = locator.reverse(coordinates)
print('Reverse geocoding check')
print('-----------------------')
print('Address of [{}, {}] is: {}'.format(loc.latitude, loc.longitude, location.address))

Reverse geocoding check
-----------------------
Address of [-37.8142176, 144.9631608] is: Melbourne's GPO, 350, Bourke Street, Melbourne, Southbank, Melbourne, City of Melbourne, Victoria, 3000, Australia


In [7]:
print('Obtaining location addresses: ', end='')
addresses = []
for lat, lon in zip(latitudes, longitudes):
    coord = lat, lon
    loc = locator.reverse(coord)  # note: this reuses (overwrites) the name `loc` from the geocoding cell
    if loc is None:
        loc = 'NO ADDRESS'  # placeholder when Nominatim finds no address
    addresses.append(loc)
    print(' .', end='')
print(' done.')

Obtaining location addresses:  . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . done.
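
The loop above sends several hundred reverse-geocoding requests back to back. The public Nominatim service asks clients to stay at or below roughly one request per second, so a throttle may be worth adding. Below is a minimal sketch, assuming a hypothetical `throttled` wrapper and a hypothetical stand-in `fake_reverse` in place of `locator.reverse` so that it runs without network access; geopy also ships `geopy.extra.rate_limiter.RateLimiter` for the same purpose.

```python
import time

def throttled(func, min_interval=1.0):
    """Wrap `func` so consecutive calls start at least `min_interval` seconds apart."""
    last_call = [None]   # mutable cell holding the start time of the previous call

    def wrapper(*args, **kwargs):
        now = time.monotonic()
        if last_call[0] is not None:
            wait = min_interval - (now - last_call[0])
            if wait > 0:
                time.sleep(wait)
        last_call[0] = time.monotonic()
        return func(*args, **kwargs)

    return wrapper

def fake_reverse(coord):
    """Hypothetical stand-in for locator.reverse; returns a string, not a Location."""
    lat, lon = coord
    return 'address near {:.4f}, {:.4f}'.format(lat, lon)

# A short interval keeps the demo fast; use min_interval=1.0 against the real service
reverse = throttled(fake_reverse, min_interval=0.05)
start = time.monotonic()
results = [reverse((-37.81, 144.96 + 0.01 * i)) for i in range(3)]
elapsed = time.monotonic() - start
print(results[0])
```

In the notebook the wrapped function would replace the direct `locator.reverse(coord)` call inside the loop.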


In [8]:
addresses[150:170]

[Location(20, Stubbs Street, Kensington, Melbourne, City of Melbourne, Victoria, 3000, Australia, (-37.789566, 144.9356995, 0.0)),
 Location(Flemington Child Care Co-Op, Wellington Street, Newmarket, Flemington, Travancore, Melbourne, City of Moonee Valley, Victoria, 3031, Australia, (-37.7863002, 144.9314308, 0.0)),
 Location(John Street, Newmarket, Flemington, Travancore, Melbourne, City of Moonee Valley, Victoria, 3031, Australia, (-37.78403458787126, 144.92715222656383, 0.0)),
 Location(Alexandra Avenue, South Yarra, Toorak, Melbourne, City of Stonnington, Victoria, 3165, Australia, (-37.834085160076825, 145.00492807616075, 0.0)),
 Location(Richmond Terminal Station, CityLink, Richmond, Burnley, Melbourne, City of Yarra, Victoria, 3121, Australia, (-37.83061135, 145.00265305687753, 0.0)),
 Location(560, Church Street, Cremorne, Burnley, Melbourne, City of Yarra, Victoria, 3121, Australia, (-37.8291642, 144.9971351, 0.0)),
 Location(Dover Street, Cremorne, Abbotsford, Melbourne, Cit

Let's now place all this into a Pandas dataframe.

In [9]:
df_locations = pd.DataFrame({'Address': addresses,
                             'Latitude': latitudes,
                             'Longitude': longitudes,
                             'X': xs,
                             'Y': ys,
                             'Distance from center': distances_from_center})

df_locations.head(10)

| | Address | Latitude | Longitude | X | Y | Distance from center |
|---|---|---|---|---|---|---|
| 0 | (The Esplanade, Clifton Hill, Melbourne, City ... | -37.788782 | 145.006813 | 4978481.0 | -14413740.0 | 5992.495307 |
| 1 | (Dwyer Street, Clifton Hill, Melbourne, City o... | -37.786241 | 145.002438 | 4979081.0 | -14413740.0 | 5840.3767 |
| 2 | (Walker Street, Westgarth, Northcote, Melbourn... | -37.783699 | 144.998063 | 4979681.0 | -14413740.0 | 5747.173218 |
| 3 | (Westgarth Street/McLachlan Street, McLachlan ... | -37.781158 | 144.993689 | 4980281.0 | -14413740.0 | 5715.767665 |
| 4 | (12, Bundara Street, Fitzroy North, Melbourne,... | -37.778616 | 144.989315 | 4980881.0 | -14413740.0 | 5747.173218 |
| 5 | (58, May Street, East Brunswick Village, Fitzr... | -37.776075 | 144.984942 | 4981481.0 | -14413740.0 | 5840.3767 |
| 6 | (266, Glenlyon Road, East Brunswick Village, F... | -37.773534 | 144.98057 | 4982081.0 | -14413740.0 | 5992.495307 |
| 7 | (Yarra Bend Golf Course, Yarra Bend Road, Fair... | -37.795601 | 145.010603 | 4977581.0 | -14413220.0 | 5855.766389 |
| 8 | (Eastern Freeway, Fairfield, Clifton Hill, Mel... | -37.79306 | 145.006227 | 4978181.0 | -14413220.0 | 5604.462508 |
| 9 | (Public Toilets, Ramsden Street, Clifton Hill,... | -37.790518 | 145.001852 | 4978781.0 | -14413220.0 | 5408.326913 |


...and let's now save/persist this data to a local file.

In [10]:
df_locations.to_pickle('./locations.pkl')

### Foursquare
Now that we have our location candidates, let's use the Foursquare API to get info on restaurants in each neighbourhood.

We're interested in venues in the 'food' category, but only those that are proper restaurants: coffee shops, pizza places, bakeries etc. are not direct competitors, so we don't care about those. We will therefore include in our list only venues that have 'restaurant' (or a close equivalent such as 'diner', 'taverna' or 'steakhouse') in the category name, excluding fast food venues, and we'll make sure to detect and include all the subcategories of the specific 'Italian restaurant' category, as we need info on Italian restaurants in the neighbourhood.

In [11]:
# @hidden_cell
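# Define the Foursquare credentials. They are read from environment variables rather than
# hard-coded; the variable names FOURSQUARE_CLIENT_ID / FOURSQUARE_CLIENT_SECRET are assumptions.
import os
CLIENT_ID = os.environ['FOURSQUARE_CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = os.environ['FOURSQUARE_CLIENT_SECRET'] # your Foursquare Secret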
VERSION = '20180605' # Foursquare API version
LIMIT = 100 # A default Foursquare API limit value

In [12]:
# Category IDs corresponding to Italian restaurants were taken from Foursquare web site (https://developer.foursquare.com/docs/resources/categories):

food_category = '4d4b7105d754a06374d81259' # 'Root' category for all food-related venues

italian_restaurant_categories = ['4bf58dd8d48988d110941735','55a5a1ebe4b013909087cbb6','55a5a1ebe4b013909087cb7c',
                                 '55a5a1ebe4b013909087cba7','55a5a1ebe4b013909087cba1','55a5a1ebe4b013909087cba4',
                                 '55a5a1ebe4b013909087cb95','55a5a1ebe4b013909087cb89','55a5a1ebe4b013909087cb9b',
                                 '55a5a1ebe4b013909087cb98','55a5a1ebe4b013909087cbbf','55a5a1ebe4b013909087cb79',
                                 '55a5a1ebe4b013909087cbb0','55a5a1ebe4b013909087cbb3','55a5a1ebe4b013909087cb74',
                                 '55a5a1ebe4b013909087cbaa','55a5a1ebe4b013909087cb83','55a5a1ebe4b013909087cb8c',
                                 '55a5a1ebe4b013909087cb92','55a5a1ebe4b013909087cb8f','55a5a1ebe4b013909087cb86',
                                 '55a5a1ebe4b013909087cbb9','55a5a1ebe4b013909087cb7f','55a5a1ebe4b013909087cbbc',
                                 '55a5a1ebe4b013909087cb9e','55a5a1ebe4b013909087cbc2','55a5a1ebe4b013909087cbad']

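def is_restaurant(categories, specific_filter=None):
    """Return (restaurant, specific) flags for a venue's list of (name, id) categories."""
    restaurant_words = ['restaurant', 'diner', 'taverna', 'steakhouse']
    restaurant = False
    specific = False
    for c in categories:
        category_name = c[0].lower()
        category_id = c[1]
        # Any restaurant-like word in the category name marks the venue as a restaurant
        for r in restaurant_words:
            if r in category_name:
                restaurant = True
        # Fast food venues are not treated as direct competitors
        if 'fast food' in category_name:
            restaurant = False
        # A category ID from the filter list (e.g. Italian) always counts as a restaurant
        if specific_filter is not None and category_id in specific_filter:
            specific = True
            restaurant = True
    return restaurant, specific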

def get_categories(categories):
    return [(cat['name'], cat['id']) for cat in categories]

def format_address(location):
    address = ', '.join(location['formattedAddress'])
    # These two replacements only match German addresses; Australian addresses are left unchanged
    address = address.replace(', Deutschland', '')
    address = address.replace(', Germany', '')
    return address

def get_venues_near_location(lat, lon, category, client_id, client_secret, radius=500, limit=100):
    version = '20180724'
    url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&categoryId={}&radius={}&limit={}'.format(
        client_id, client_secret, version, lat, lon, category, radius, limit)
    try:
        results = requests.get(url).json()['response']['groups'][0]['items']
        venues = [(item['venue']['id'],
                   item['venue']['name'],
                   get_categories(item['venue']['categories']),
                   (item['venue']['location']['lat'], item['venue']['location']['lng']),
                   format_address(item['venue']['location']),
                   item['venue']['location']['distance']) for item in results]        
    except Exception:  # request failed or the response had an unexpected shape
        venues = []
    return venues

In [13]:
# Let's now go over our neighborhood locations and get nearby restaurants; we'll also maintain a dictionary of all found restaurants and all found italian restaurants

import pickle

def get_restaurants(lats, lons):
    restaurants = {}
    italian_restaurants = {}
    location_restaurants = []

    print('Obtaining venues around candidate locations:', end='')
    for lat, lon in zip(lats, lons):
        # Using radius=350 to make sure we have overlaps/full coverage so we don't miss any restaurant (we're using dictionaries to remove any duplicates resulting from area overlaps)
        venues = get_venues_near_location(lat, lon, food_category, CLIENT_ID, CLIENT_SECRET, radius=350, limit=100)
        area_restaurants = []
        for venue in venues:
            venue_id = venue[0]
            venue_name = venue[1]
            venue_categories = venue[2]
            venue_latlon = venue[3]
            venue_address = venue[4]
            venue_distance = venue[5]
            is_res, is_italian = is_restaurant(venue_categories, specific_filter=italian_restaurant_categories)
            if is_res:
                x, y = lonlat_to_xy(venue_latlon[1], venue_latlon[0])
                restaurant = (venue_id, venue_name, venue_latlon[0], venue_latlon[1], venue_address, venue_distance, is_italian, x, y)
                if venue_distance<=300:
                    area_restaurants.append(restaurant)
                restaurants[venue_id] = restaurant
                if is_italian:
                    italian_restaurants[venue_id] = restaurant
        location_restaurants.append(area_restaurants)
        print(' .', end='')
    print(' done.')
    return restaurants, italian_restaurants, location_restaurants

# Try to load from local file system in case we did this before
restaurants = {}
italian_restaurants = {}
location_restaurants = []
loaded = False
try:
    with open('restaurants_350.pkl', 'rb') as f:
        restaurants = pickle.load(f)
    with open('italian_restaurants_350.pkl', 'rb') as f:
        italian_restaurants = pickle.load(f)
    with open('location_restaurants_350.pkl', 'rb') as f:
        location_restaurants = pickle.load(f)
    print('Restaurant data loaded.')
    loaded = True
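except (OSError, EOFError, pickle.UnpicklingError):
    pass  # no usable cache on disk; fall back to the Foursquare API below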

# If load failed use the Foursquare API to get the data
if not loaded:
    restaurants, italian_restaurants, location_restaurants = get_restaurants(latitudes, longitudes)
    
    # Let's persist this in the local file system
    with open('restaurants_350.pkl', 'wb') as f:
        pickle.dump(restaurants, f)
    with open('italian_restaurants_350.pkl', 'wb') as f:
        pickle.dump(italian_restaurants, f)
    with open('location_restaurants_350.pkl', 'wb') as f:
        pickle.dump(location_restaurants, f)

Restaurant data loaded.
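
The choice of a 350 m query radius and dictionary-based de-duplication can be illustrated with a short sketch. The venue IDs and names below are hypothetical and only stand in for Foursquare results; the geometric fact used is that in a hexagonal lattice the point farthest from any centre is the centroid of three neighbouring centres, at spacing/√3.

```python
import math

spacing = 600.0   # distance between neighbouring area centres, in metres

# Farthest any point can be from its nearest centre in a hexagonal lattice
coverage_radius = spacing / math.sqrt(3)
print(round(coverage_radius, 1))   # 346.4

# Two overlapping queries returning (venue_id, name) pairs; 'v2' appears in both
area_a = [('v1', 'Trattoria Uno'), ('v2', 'Noodle Bar')]
area_b = [('v2', 'Noodle Bar'), ('v3', 'Curry House')]

# Keying by venue ID keeps a single entry per venue, whatever the overlap
unique_venues = {}
for venues in (area_a, area_b):
    for venue_id, name in venues:
        unique_venues[venue_id] = name

print(len(unique_venues))   # 3
```

Since 350 m is slightly larger than about 346.4 m, circles of that radius around every grid centre leave no gaps, at the cost of overlaps that the dictionary absorbs.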


In [14]:
print('Total number of restaurants:', len(restaurants))
print('Total number of Italian restaurants:', len(italian_restaurants))
print('Percentage of Italian restaurants: {:.2f}%'.format(len(italian_restaurants) / len(restaurants) * 100))
print('Average number of restaurants in neighborhood:', np.array([len(r) for r in location_restaurants]).mean())

Total number of restaurants: 1085
Total number of Italian restaurants: 116
Percentage of Italian restaurants: 10.69%
Average number of restaurants in neighborhood: 3.9175824175824174


In [15]:
print('List of all restaurants')
print('-----------------------')
for r in list(restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(restaurants))

List of all restaurants
-----------------------
('4b7a4d0cf964a520ea282fe3', 'Curry Cafe', -37.78049573561143, 144.99678286765194, '73 High St, Melbourne VIC 3070, Australia', 282, False, 4980059.313184422, -14414020.989189155)
('5287346211d2a8114bd6068a', 'Base Camp', -37.779783, 144.997073, '102 High St, Northcote VIC 3070, Australia', 334, False, 4980092.10779405, -14414120.163098266)
('4bc044d5461576b0e8f47932', 'Taxiboat', -37.77982, 144.9971, '100 High St, Northcote VIC 3070, Australia', 335, False, 4980086.648545134, -14414117.76453932)
('4b058746f964a5205d8822e3', 'Moroccan Soup Bar', -37.780446, 144.986512, '183 St Georges Rd, Melbourne VIC 3068, Australia', 190, False, 4980980.740782806, -14413354.998286394)
('4b95ecccf964a520bcb734e3', 'Munsterhaus', -37.77900686558561, 144.98749052600607, '371 St Georges Rd, Fitzroy North, Melbourne VIC 3068, Australia', 313, False, 4981011.923366989, -14413580.935665874)
('4b1b783df964a52090fb23e3', 'Malaymas', -37.78028361559838, 144.9864

In [16]:
print('List of Italian restaurants')
print('---------------------------')
for r in list(italian_restaurants.values())[:10]:
    print(r)
print('...')
print('Total:', len(italian_restaurants))

List of Italian restaurants
---------------------------
('4ba196cef964a52074c237e3', 'Al Albero', -37.7794, 144.98738, '354 St Georges Rd, Fitzroy North VIC 3068, Australia', 286, True, 4980989.40296849, -14413529.468202278)
('4ba87040f964a52015db39e3', 'Supermaxi', -37.780849362909166, 144.98588875436604, '305 St Georges Rd, Fitzroy North VIC, Australia', 145, True, 4981003.167237139, -14413268.849588754)
('4b6aa76bf964a520ebda2be3', 'Bar Idda', -37.77405993926217, 144.97135365465144, '132 Lygon St, Brunswick VIC 3057, Australia', 333, True, 4982860.959430872, -14413082.030837916)
('4e0d9a57d164fff335a29554', 'Pinotta', -37.783977648068635, 144.98324965235872, 'Fitzroy North VIC, Australia', 85, True, 4980981.073151112, -14412744.244328575)
('4b74f8dcf964a520c3f92de3', '400 Gradi', -37.77598922331927, 144.97097113068276, '99 Lygon St (Weston St), Brunswick East VIC 3057, Australia', 116, True, 4982736.087214403, -14412839.89899825)
('4b7fc859f964a520883d30e3', 'Zia Teresa', -37.77607,

In [17]:
print('Restaurants around location')
print('---------------------------')
for i in range(100, 110):
    rs = location_restaurants[i][:8]
    names = ', '.join([r[1] for r in rs])
    print('Restaurants around location {}: {}'.format(i+1, names))

Restaurants around location
---------------------------
Restaurants around location 101: 
Restaurants around location 102: Tom Toon Thai Noodle Cafe, Hektik Kebabs, Richmond Seafood Tavern
Restaurants around location 103: Amaretto Trattoria, Papirica, Chotto
Restaurants around location 104: Neko Neko, IDES, Charcoal Lane, Wabi Sabi Salon, Bowl Bowl, Tokushima, Biggie Smalls, Añada
Restaurants around location 105: Ichi Ni Nana Izakaya, Bon Ap' Petit Bistro, Sonido!, Cutler & Co., Blue Chillies, Smith & Daughters, Village People Hawker Foodhall
Restaurants around location 106: Mon Ami, East Imperial
Restaurants around location 107: Shakahari, Donnini's Restaurant, Gemma Simply Italian, Abla's Lebanese Restaurant, D.O.C Espresso, Trotters, Papa Gino's, Dimattina's
Restaurants around location 108: Bento King, Oriental House Chinese Takeaway, Uni Curry, Egg Sake Bistro, Zambrero
Restaurants around location 109: 
Restaurants around location 110: 


Let's now see all the collected restaurants (blue) in our area of interest on a map, and let's also show Italian restaurants in a different colour (red).

In [18]:
map_melbourne = folium.Map(location=melbourne_centre, zoom_start=13)
folium.Marker(melbourne_centre, popup='Melbourne').add_to(map_melbourne)
for res in restaurants.values():
    lat = res[2]; lon = res[3]
    is_italian = res[6]
    color = 'red' if is_italian else 'blue'
    folium.CircleMarker([lat, lon], radius=3, color=color, fill=True, fill_color=color, fill_opacity=1).add_to(map_melbourne)
map_melbourne

Now we have all the restaurants in the area within a few kilometres of Melbourne city centre, and we know which ones are Italian restaurants! We also know exactly which restaurants are in the vicinity of every neighbourhood candidate centre.

This concludes the data gathering phase. We're now ready to use this data for analysis to produce the report on optimal locations for a new Italian restaurant!

## Methodology

In this project I will direct my efforts at detecting areas of the City of Melbourne that have low restaurant density, particularly those with a low number of Italian restaurants. I will limit my analysis to the area within ~6km of the city centre.

In the first step, I collected the required <b>data: location and type (category) of every restaurant within 6km of Melbourne centre</b>. I also identified Italian restaurants (according to Foursquare categorisation).

The second step in the analysis will be the calculation and exploration of <b>'restaurant density'</b> across different areas of Melbourne. We will use <b>heatmaps</b> to identify a few promising areas close to the centre with a low number of restaurants in general (and no Italian restaurants in the vicinity) and focus our attention on those areas.

In the third and final step, I will focus on the most promising areas and within those create <b>clusters of locations that meet some basic requirements</b> established in discussion with stakeholders: we will take into consideration locations with <b>no more than two restaurants within a radius of 250 metres</b>, and we want locations <b>without Italian restaurants within a radius of 400 metres</b>. I will present a map of all such locations but also create clusters (using <b>k-means clustering</b>) of those locations to identify general zones / neighbourhoods / addresses which should be a starting point for final 'street level' exploration and search for the optimal venue location by stakeholders.
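
The two location requirements from the third step can be written as a simple predicate. This is a minimal sketch on made-up planar coordinates in metres; the function names `count_within` and `is_good_location` are hypothetical and the data is illustrative only.

```python
import math

def count_within(x, y, points, radius):
    """Number of points no farther than `radius` from (x, y)."""
    return sum(1 for px, py in points if math.hypot(px - x, py - y) <= radius)

def is_good_location(x, y, all_restaurants, italian_restaurants,
                     max_restaurants=2, restaurant_radius=250.0, italian_radius=400.0):
    """True if at most two restaurants lie within 250 m and no Italian one within 400 m."""
    few_restaurants = count_within(x, y, all_restaurants, restaurant_radius) <= max_restaurants
    no_italian_nearby = count_within(x, y, italian_restaurants, italian_radius) == 0
    return few_restaurants and no_italian_nearby

# Hypothetical restaurant positions (metres); one of them is Italian
all_restaurants = [(100.0, 0.0), (0.0, 200.0), (500.0, 500.0), (1000.0, 0.0)]
italian = [(500.0, 500.0)]

print(is_good_location(0.0, 0.0, all_restaurants, italian))      # True
print(is_good_location(400.0, 400.0, all_restaurants, italian))  # False
```

The first candidate has exactly two restaurants within 250 m and the Italian one about 707 m away; the second sits about 141 m from the Italian restaurant and is rejected.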

## Analysis

Let's perform some basic exploratory data analysis and derive some additional info from our raw data. First let's count the <b>number of restaurants in every area candidate</b>:

In [19]:
location_restaurants_count = [len(res) for res in location_restaurants]

df_locations['Restaurants in area'] = location_restaurants_count

print('Average number of restaurants in every area with radius=300m:', np.array(location_restaurants_count).mean())

df_locations.head(10)

Average number of restaurants in every area with radius=300m: 3.9175824175824174


| | Address | Latitude | Longitude | X | Y | Distance from center | Restaurants in area |
|---|---|---|---|---|---|---|---|
| 0 | (The Esplanade, Clifton Hill, Melbourne, City ... | -37.788782 | 145.006813 | 4978481.0 | -14413740.0 | 5992.495307 | 0 |
| 1 | (Dwyer Street, Clifton Hill, Melbourne, City o... | -37.786241 | 145.002438 | 4979081.0 | -14413740.0 | 5840.3767 | 0 |
| 2 | (Walker Street, Westgarth, Northcote, Melbourn... | -37.783699 | 144.998063 | 4979681.0 | -14413740.0 | 5747.173218 | 0 |
| 3 | (Westgarth Street/McLachlan Street, McLachlan ... | -37.781158 | 144.993689 | 4980281.0 | -14413740.0 | 5715.767665 | 1 |
| 4 | (12, Bundara Street, Fitzroy North, Melbourne,... | -37.778616 | 144.989315 | 4980881.0 | -14413740.0 | 5747.173218 | 3 |
| 5 | (58, May Street, East Brunswick Village, Fitzr... | -37.776075 | 144.984942 | 4981481.0 | -14413740.0 | 5840.3767 | 0 |
| 6 | (266, Glenlyon Road, East Brunswick Village, F... | -37.773534 | 144.98057 | 4982081.0 | -14413740.0 | 5992.495307 | 0 |
| 7 | (Yarra Bend Golf Course, Yarra Bend Road, Fair... | -37.795601 | 145.010603 | 4977581.0 | -14413220.0 | 5855.766389 | 0 |
| 8 | (Eastern Freeway, Fairfield, Clifton Hill, Mel... | -37.79306 | 145.006227 | 4978181.0 | -14413220.0 | 5604.462508 | 0 |
| 9 | (Public Toilets, Ramsden Street, Clifton Hill,... | -37.790518 | 145.001852 | 4978781.0 | -14413220.0 | 5408.326913 | 0 |


Let's calculate the distance to the <b>nearest Italian restaurant from every area candidate centre</b> (not only those within 300m - we want distance to closest one, regardless of how distant it is).

In [20]:
distances_to_italian_restaurant = []

for area_x, area_y in zip(xs, ys):
    min_distance = 10000
    for res in italian_restaurants.values():
        res_x = res[7]
        res_y = res[8]
        d = calc_xy_distance(area_x, area_y, res_x, res_y)
        if d<min_distance:
            min_distance = d
    distances_to_italian_restaurant.append(min_distance)

df_locations['Distance to Italian restaurant'] = distances_to_italian_restaurant
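
The nearest-restaurant search can also be written without a hard-coded starting value. The sketch below is one possible formulation, not the notebook's code: `nearest_distance` is a hypothetical helper, the coordinates are made up, and `default=math.inf` covers the case of an empty restaurant list.

```python
import math

def nearest_distance(x, y, points):
    """Distance from (x, y) to the closest point, or infinity if there are none."""
    return min((math.hypot(px - x, py - y) for px, py in points), default=math.inf)

# Hypothetical Italian restaurant positions in planar metres
italian_xy = [(300.0, 400.0), (1200.0, 0.0)]

print(nearest_distance(0.0, 0.0, italian_xy))   # 500.0
print(nearest_distance(0.0, 0.0, []))           # inf
```

Compared with starting from 10000, this does not silently cap the result when the closest restaurant happens to be farther away than the sentinel.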

In [21]:
df_locations.head(10)

| | Address | Latitude | Longitude | X | Y | Distance from center | Restaurants in area | Distance to Italian restaurant |
|---|---|---|---|---|---|---|---|---|
| 0 | (The Esplanade, Clifton Hill, Melbourne, City ... | -37.788782 | 145.006813 | 4978481.0 | -14413740.0 | 5992.495307 | 0 | 2517.460824 |
| 1 | (Dwyer Street, Clifton Hill, Melbourne, City o... | -37.786241 | 145.002438 | 4979081.0 | -14413740.0 | 5840.3767 | 0 | 1920.329373 |
| 2 | (Walker Street, Westgarth, Northcote, Melbourn... | -37.783699 | 144.998063 | 4979681.0 | -14413740.0 | 5747.173218 | 0 | 1325.790633 |
| 3 | (Westgarth Street/McLachlan Street, McLachlan ... | -37.781158 | 144.993689 | 4980281.0 | -14413740.0 | 5715.767665 | 1 | 740.119385 |
| 4 | (12, Bundara Street, Fitzroy North, Melbourne,... | -37.778616 | 144.989315 | 4980881.0 | -14413740.0 | 5747.173218 | 3 | 240.484106 |
| 5 | (58, May Street, East Brunswick Village, Fitzr... | -37.776075 | 144.984942 | 4981481.0 | -14413740.0 | 5840.3767 | 0 | 536.552427 |
| 6 | (266, Glenlyon Road, East Brunswick Village, F... | -37.773534 | 144.98057 | 4982081.0 | -14413740.0 | 5992.495307 | 0 | 1023.041392 |
| 7 | (Yarra Bend Golf Course, Yarra Bend Road, Fair... | -37.795601 | 145.010603 | 4977581.0 | -14413220.0 | 5855.766389 | 0 | 2384.652193 |
| 8 | (Eastern Freeway, Fairfield, Clifton Hill, Mel... | -37.79306 | 145.006227 | 4978181.0 | -14413220.0 | 5604.462508 | 0 | 2513.053222 |
| 9 | (Public Toilets, Ramsden Street, Clifton Hill,... | -37.790518 | 145.001852 | 4978781.0 | -14413220.0 | 5408.326913 | 0 | 2216.186881 |


In [22]:
print('Average distance to closest Italian restaurant from each area center:', df_locations['Distance to Italian restaurant'].mean())

Average distance to closest Italian restaurant from each area center: 880.8365174955554


OK, so <b>on average an Italian restaurant can be found within ~900m</b> of every area centre candidate. That's fairly close, so we need to filter our areas carefully!

Let's create a map showing a <b>heatmap / density of restaurants</b> and try to extract some meaningful info from that. Also, let's show the <b>borders of City of Melbourne boroughs</b> on our map and a few circles indicating distances of 1km, 2km and 3km from Melbourne city centre.

In [None]:
melbourne_boroughs_url = 'https://data.gov.au/geoserver/vic-suburb-locality-boundaries-psma-administrative-boundaries/wfs?request=GetFeature&typeName=ckan_af33dd8c_0534_4e18_9245_fc64440f742e&outputFormat=json'

#melbourne_boroughs_url = 'https://github.com/tcol1404/coursera-capstone-project/blob/main/City_of_Melbourne_Boundaries.geojson'
melbourne_boroughs = requests.get(melbourne_boroughs_url).json()

def boroughs_style(feature):
    return { 'color': 'blue', 'fill': False }

In [23]:
restaurant_latlons = [[res[2], res[3]] for res in restaurants.values()]

italian_latlons = [[res[2], res[3]] for res in italian_restaurants.values()]

In [24]:
from folium import plugins
from folium.plugins import HeatMap

map_melbourne = folium.Map(location=melbourne_centre, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_melbourne) #cartodbpositron cartodbdark_matter
HeatMap(restaurant_latlons).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=1000, fill=False, color='white').add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=2000, fill=False, color='white').add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=3000, fill=False, color='white').add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

It looks like the pockets of low restaurant density closest to the city centre can be found <b>north-east, east and south-east of the centre of Melbourne</b>.

<b>Let's create another map showing the heatmap / density of Italian restaurants only</b>.

In [26]:
map_melbourne = folium.Map(location=melbourne_centre, zoom_start=13)
folium.TileLayer('cartodbpositron').add_to(map_melbourne) #cartodbpositron cartodbdark_matter
HeatMap(italian_latlons).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=1000, fill=False, color='white').add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=2000, fill=False, color='white').add_to(map_melbourne)
folium.Circle(melbourne_centre, radius=3000, fill=False, color='white').add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

This map is not so 'hot' (Italian restaurants represent ~10% of all restaurants in the City of Melbourne), but it also indicates a higher density of existing Italian restaurants directly north and south of the city centre, with the closest pockets of <b>low Italian restaurant density positioned north-west, north-east and south-east of the city centre</b>.

Based on this we will now focus our analysis on areas north-west, north-east and south-east of the centre of Melbourne: we will move the centre of our area of interest and reduce its size to a radius of <b>2.5km</b>. This places our location candidates mostly in the boroughs of <b>Docklands and East Melbourne</b> (another potentially interesting borough is <b>Richmond</b>, with a large area of low restaurant density south-east of the city centre; however, this borough is less interesting to stakeholders as it's mostly a sports precinct that does not have consistent visitors throughout the week).

## Docklands and East Melbourne

Popular web sites describe Docklands and East Melbourne as follows:

- <b>Docklands</b>: The myriad public artworks make for an inspiring breadcrumb trail on a stroll around the Docklands, where there's something for everyone from foodies to footy fans, excited kids and shoppers with keen eyes for a bargain.
- <b>East Melbourne</b>: East Melbourne is marked by stately Victorian terraces, art deco buildings and parks. Spring Street offers European-style cafes and glitzy musicals at the opulent Princess Theatre. Office workers lunch in sprawling Fitzroy Gardens, while the wide steps of nearby Parliament House are a popular meeting place.

*"Docklands is a modern harbour development dominated by high-rises and the colourful Melbourne Star Observation Wheel, and popular for its shopping and waterside dining."* (google.com)

*"There's a smorgasbord of dining options at The District, Newquay, Victoria Harbour and Waterfront City, tucked in among the entertainment zone, beneath luxury apartments and lining the marina."* (https://www.visitvictoria.com/Regions/Melbourne/Destinations/Docklands)

*"East Melbourne is an established area to the east of the central city, home to many 19th century homes, iconic landmarks and the heritage listed Fitzroy, Treasury and Parliament gardens."* (https://participate.melbourne.vic.gov.au/east-melbourne-profile?_ga=2.221333784.1881709526.1626002896-1518540431.1623374114)

*"East Melbourne has long been home to many significant government, health and religious institutions, including the Parliament of Victoria and offices of the Government of Victoria in the Parliamentary and Cathedral precincts"* (https://en.wikipedia.org/wiki/East_Melbourne,_Victoria)

Popular with tourists, office workers and hippies, relatively close to city centre and well connected, those boroughs appear to justify further analysis.

Let's define a new, narrower region of interest, which will include the low-restaurant-count parts of Docklands and East Melbourne closest to Melbourne city centre.

In [27]:
roi_x_min = melbourne_centre_x - 2000
roi_y_max = melbourne_centre_y + 1000
roi_width = 5000
roi_height = 5000
roi_center_x = roi_x_min + 2500
roi_center_y = roi_y_max - 2500
roi_center_lon, roi_center_lat = xy_to_lonlat(roi_center_x, roi_center_y)
roi_center = [roi_center_lat, roi_center_lon]

map_melbourne = folium.Map(location=roi_center, zoom_start=14)
HeatMap(restaurant_latlons).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.4).add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

This nicely covers all the pockets of low restaurant density in Docklands and East Melbourne closest to Melbourne city centre.
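To see how far the region of interest has moved, the arithmetic from the cell above can be worked through on its own. This is a small sketch that uses only the offsets in that cell (`-2000`, `+1000`, `2500`); `offset_x`, `offset_y` and `shift` are names introduced here.

```python
import math

# Offsets (metres) of the region-of-interest centre from the city centre,
# derived from the cell above: x_min = cx - 2000, y_max = cy + 1000,
# centre = (x_min + 2500, y_max - 2500).
offset_x = -2000 + 2500   # +500 m along the projected X axis
offset_y = 1000 - 2500    # -1500 m along the projected Y axis

# Straight-line distance between the two centres
shift = math.hypot(offset_x, offset_y)
roi_radius = 2500

print(f"ROI centre is {shift:.0f} m from the city centre")
print("City centre inside ROI circle:", shift <= roi_radius)
```

So the new circle is shifted by roughly 1.6km, and the city centre marker still falls inside it.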

Let's also create a new, denser grid of location candidates restricted to our new region of interest (let's make our location candidates 100m apart).

In [28]:
k = math.sqrt(3) / 2 # Vertical offset for hexagonal grid cells
x_step = 100
y_step = 100 * k 
roi_y_min = roi_center_y - 2500

roi_latitudes = []
roi_longitudes = []
roi_xs = []
roi_ys = []
for i in range(0, int(51/k)):
    y = roi_y_min + i * y_step
    x_offset = 50 if i%2==0 else 0
    for j in range(0, 51):
        x = roi_x_min + j * x_step + x_offset
        d = calc_xy_distance(roi_center_x, roi_center_y, x, y)
        if (d <= 2501):
            lon, lat = xy_to_lonlat(x, y)
            roi_latitudes.append(lat)
            roi_longitudes.append(lon)
            roi_xs.append(x)
            roi_ys.append(y)

print(len(roi_latitudes), 'candidate neighborhood centres generated.')

2261 candidate neighborhood centres generated.
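The same idea can be tried in isolation, without the notebook's projection helpers. The sketch below is a stand-alone version under stated assumptions: the function name `hex_grid_in_circle` is hypothetical, the grid is centred on the origin rather than on the projected Melbourne coordinates, and the row and column counts are derived from the radius instead of the fixed `51` used above.

```python
import math

def hex_grid_in_circle(cx, cy, radius, spacing):
    """Return (x, y) points of a hexagonal grid that fall inside a circle."""
    row_height = spacing * math.sqrt(3) / 2  # vertical distance between rows
    n_rows = int(2 * radius / row_height) + 1
    n_cols = int(2 * radius / spacing) + 1
    points = []
    for i in range(n_rows):
        y = cy - radius + i * row_height
        # Shift every other row by half a step so that all six
        # neighbours of a point end up at the same distance
        x_offset = spacing / 2 if i % 2 == 0 else 0
        for j in range(n_cols):
            x = cx - radius + j * spacing + x_offset
            if math.hypot(x - cx, y - cy) <= radius:
                points.append((x, y))
    return points

points = hex_grid_in_circle(0.0, 0.0, radius=500, spacing=100)
print(len(points), "grid points inside a 500 m circle")
```

The factor `sqrt(3) / 2` is what makes points in neighbouring rows sit exactly one `spacing` apart: a horizontal shift of 50m and a vertical shift of about 86.6m give a distance of 100m.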


Now let's calculate the two most important things for each location candidate: the <b>number of restaurants in the vicinity</b> (we'll use a radius of <b>250 metres</b>) and the <b>distance to the closest Italian restaurant</b>.

In [29]:
def count_restaurants_nearby(x, y, restaurants, radius=250):    
    count = 0
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=radius:
            count += 1
    return count

def find_nearest_restaurant(x, y, restaurants):
    d_min = 100000
    for res in restaurants.values():
        res_x = res[7]; res_y = res[8]
        d = calc_xy_distance(x, y, res_x, res_y)
        if d<=d_min:
            d_min = d
    return d_min

roi_restaurant_counts = []
roi_italian_distances = []

print('Generating data on location candidates... ', end='')
for x, y in zip(roi_xs, roi_ys):
    count = count_restaurants_nearby(x, y, restaurants, radius=250)
    roi_restaurant_counts.append(count)
    distance = find_nearest_restaurant(x, y, italian_restaurants)
    roi_italian_distances.append(distance)
print('done.')

Generating data on location candidates... done.
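For readers who want to check the two measures on a small example, here is a compact variant. It is a sketch, not the notebook's own functions: it assumes venues are given as plain `(x, y)` tuples in metres rather than the `restaurants` dictionary (where X and Y sit at indices 7 and 8), and the names `count_within` and `nearest_distance` and the coordinates are made up for illustration.

```python
import math

def count_within(x, y, venues_xy, radius=250):
    """Number of venues whose distance from (x, y) is at most `radius` metres."""
    return sum(1 for vx, vy in venues_xy if math.hypot(vx - x, vy - y) <= radius)

def nearest_distance(x, y, venues_xy):
    """Distance to the closest venue, or infinity if there are no venues."""
    return min((math.hypot(vx - x, vy - y) for vx, vy in venues_xy), default=math.inf)

# Made-up projected coordinates in metres, for illustration only
all_venues = [(0, 0), (100, 0), (0, 300), (1000, 1000)]
italian_venues = [(0, 300), (1000, 1000)]

nearby = count_within(0, 0, all_venues)
closest_italian = nearest_distance(0, 0, italian_venues)
print(nearby, closest_italian)
```

Returning infinity for an empty venue list is a design choice of this sketch; the cell above instead starts from a fixed upper bound of 100000 metres.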


In [30]:
# Let's put this into dataframe
df_roi_locations = pd.DataFrame({'Latitude':roi_latitudes,
                                 'Longitude':roi_longitudes,
                                 'X':roi_xs,
                                 'Y':roi_ys,
                                 'Restaurants nearby':roi_restaurant_counts,
                                 'Distance to Italian restaurant':roi_italian_distances})

df_roi_locations.head(10)

Unnamed: 0,Latitude,Longitude,X,Y,Restaurants nearby,Distance to Italian restaurant
0,-37.789177,144.98125,4980731.0,-14412030.0,0,533.198708
1,-37.788754,144.980521,4980831.0,-14412030.0,0,443.258561
2,-37.792009,144.984797,4980181.0,-14411940.0,0,912.570716
3,-37.791585,144.984068,4980281.0,-14411940.0,0,947.487541
4,-37.791161,144.983339,4980381.0,-14411940.0,0,848.475061
5,-37.790738,144.98261,4980481.0,-14411940.0,0,749.724762
6,-37.790314,144.981881,4980581.0,-14411940.0,0,651.3559
7,-37.78989,144.981152,4980681.0,-14411940.0,1,553.571854
8,-37.789466,144.980423,4980781.0,-14411940.0,0,456.748385
9,-37.789043,144.979694,4980881.0,-14411940.0,0,361.657816


Let's now <b>filter</b> those locations: we're interested only in <b>locations with no more than two restaurants in radius of 250 metres</b>, and <b>no Italian restaurants in radius of 400 metres</b>.

In [31]:
good_res_count = np.array((df_roi_locations['Restaurants nearby']<=2))
print('Locations with no more than two restaurants nearby:', good_res_count.sum())

good_ita_distance = np.array(df_roi_locations['Distance to Italian restaurant']>=400)
print('Locations with no Italian restaurants within 400m:', good_ita_distance.sum())

good_locations = np.logical_and(good_res_count, good_ita_distance)
print('Locations with both conditions met:', good_locations.sum())

df_good_locations = df_roi_locations[good_locations]

Locations with no more than two restaurants nearby: 1239
Locations with no Italian restaurants within 400m: 1145
Locations with both conditions met: 866
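The three printed counts, together with the 2261 grid points generated earlier, are enough to work out how the two filters overlap. The sketch below only does that arithmetic; the variable names are introduced here.

```python
# Counts taken from the outputs above
total = 2261               # candidate grid points in the region of interest
few_restaurants = 1239     # no more than two restaurants within 250m
no_italian_nearby = 1145   # no Italian restaurant within 400m
both = 866                 # both conditions met

# Inclusion-exclusion: points passing at least one filter
either = few_restaurants + no_italian_nearby - both
neither = total - either
only_few_restaurants = few_restaurants - both
only_no_italian = no_italian_nearby - both

print("pass at least one filter:", either)
print("pass neither filter:     ", neither)
print("few restaurants but an Italian one within 400m:", only_few_restaurants)
print("no Italian nearby but more than two restaurants:", only_no_italian)
print(f"share of grid kept: {both / total:.1%}")
```

In other words, a little over a third of the grid survives both filters.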


Let's see how this looks on a map.

In [32]:
good_latitudes = df_good_locations['Latitude'].values
good_longitudes = df_good_locations['Longitude'].values

good_locations = [[lat, lon] for lat, lon in zip(good_latitudes, good_longitudes)]

map_melbourne = folium.Map(location=roi_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_melbourne)
HeatMap(restaurant_latlons).add_to(map_melbourne)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.6).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne) 
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

We now have a bunch of locations fairly close to Melbourne (mostly in Docklands, East Melbourne and north of city centre), and we know that each of those locations has no more than two restaurants in radius of 250m, and no Italian restaurant closer than 400m. Any of those locations is a potential candidate for a new Italian restaurant, at least based on nearby competition.

Let's now show those good locations in a form of heatmap:

In [33]:
map_melbourne = folium.Map(location=roi_center, zoom_start=14)
HeatMap(good_locations, radius=25).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

Looking good. What we have now is a clear indication of zones with a low number of restaurants in the vicinity and no Italian restaurants at all nearby.

Let us now <b>cluster</b> those locations to create <b>centres of zones containing good locations</b>. Those zones, their centres and addresses will be the final result of our analysis.

In [34]:
from sklearn.cluster import KMeans

number_of_clusters = 10

good_xys = df_good_locations[['X', 'Y']].values
kmeans = KMeans(n_clusters=number_of_clusters, random_state=0).fit(good_xys)

cluster_centres = [xy_to_lonlat(cc[0], cc[1]) for cc in kmeans.cluster_centers_]

map_melbourne = folium.Map(location=roi_center, zoom_start=14)
folium.TileLayer('cartodbpositron').add_to(map_melbourne)
HeatMap(restaurant_latlons).add_to(map_melbourne)
folium.Circle(roi_center, radius=2500, color='white', fill=True, fill_opacity=0.4).add_to(map_melbourne)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lon, lat in cluster_centres:
    folium.Circle([lat, lon], radius=500, color='green', fill=True, fill_opacity=0.25).add_to(map_melbourne) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

The clusters represent groupings of most of the candidate locations and cluster centres are placed nicely in the middle of the zones 'rich' with location candidates.

Addresses of those cluster centres will be a good starting point for exploring the neighbourhoods to find the best possible location based on neighbourhood specifics.
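To make the clustering step less of a black box, the sketch below runs the basic assign-and-update loop of k-means (Lloyd's algorithm) on a handful of made-up points. It is not what scikit-learn runs internally: `KMeans` uses k-means++ initialisation by default and further refinements, whereas here the starting centres are passed in by hand. The function name `lloyd_kmeans` and the toy coordinates are assumptions for illustration.

```python
import math

def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [
        min(range(len(centres)),
            key=lambda c: math.hypot(p[0] - centres[c][0], p[1] - centres[c][1]))
        for p in points
    ]

def lloyd_kmeans(points, initial_centres, n_iter=50):
    """Plain Lloyd iterations starting from explicitly supplied centres."""
    centres = list(initial_centres)
    for _ in range(n_iter):
        labels = assign(points, centres)
        new_centres = []
        for c in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # Move the centre to the mean of its members
                new_centres.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centres.append(centres[c])  # keep a centre that lost all points
        if new_centres == centres:
            break  # converged: centres stopped moving
        centres = new_centres
    return centres, assign(points, centres)

# Two made-up groups of candidate locations (projected metres)
toy_points = [(0, 0), (100, 0), (0, 100), (100, 100),
              (1000, 1000), (1100, 1000), (1000, 1100), (1100, 1100)]
toy_centres, toy_labels = lloyd_kmeans(toy_points, [toy_points[0], toy_points[-1]])
print(toy_centres)
print(toy_labels)
```

Because each centre is just the mean of its members, it can land on a spot that is not itself a candidate location, which is one more reason to treat the cluster centres as starting points rather than final answers.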

Let's see those zones on a city map without heatmap, using shaded areas to indicate our clusters:

In [35]:
map_melbourne = folium.Map(location=roi_center, zoom_start=14)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#00000000', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne)
for lon, lat in cluster_centres:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_melbourne) 
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

Let's zoom in on candidate areas in Docklands:

In [36]:
map_melbourne = folium.Map(location=[-37.8082, 144.9578], zoom_start=15)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lon, lat in cluster_centres:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_melbourne) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

...and candidate areas in East Melbourne:

In [37]:
map_melbourne = folium.Map(location=[-37.815018, 144.946014], zoom_start=15)
folium.Marker(melbourne_centre).add_to(map_melbourne)
for lon, lat in cluster_centres:
    folium.Circle([lat, lon], radius=500, color='green', fill=False).add_to(map_melbourne) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.07).add_to(map_melbourne)
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.CircleMarker([lat, lon], radius=2, color='blue', fill=True, fill_color='blue', fill_opacity=1).add_to(map_melbourne)
#folium.GeoJson(melbourne_boroughs, style_function=boroughs_style, name='geojson').add_to(map_melbourne)
map_melbourne

Finally, let's reverse geocode those candidate area centres to get the addresses which can be presented to stakeholders.

In [38]:
candidate_area_addresses = []
print('==============================================================')
print('Addresses of centres of areas recommended for further analysis')
print('==============================================================\n')
for lon, lat in cluster_centres:
    coord = lat, lon
    addr = locator.reverse(coord)
    candidate_area_addresses.append(addr)    
    x, y = lonlat_to_xy(lon, lat)
    d = calc_xy_distance(x, y, melbourne_centre_x, melbourne_centre_y)
    print('{}{} => {:.1f}km from Melbourne'.format(addr, ' '*(50-len(addr)), d/1000))

Addresses of centres of areas recommended for further analysis

Lansdowne Street, East Melbourne, Melbourne, City of Melbourne, Victoria, 3002, Australia                                                 => 1.7km from Melbourne
East West Link (Stage 1), Clifton Hill, Fitzroy, Melbourne, City of Yarra, Victoria, 3065, Australia                                                 => 3.2km from Melbourne
Flagstaff City Inn, 45, Dudley Street, West Melbourne, North Melbourne, Melbourne, City of Melbourne, Victoria, 3003, Australia                                                 => 1.5km from Melbourne
33, Cliveden Close, East Melbourne, Melbourne, City of Melbourne, Victoria, 3000, Australia                                                 => 1.6km from Melbourne
Center Avenue, Carlton North, Princes Hill, Melbourne, City of Melbourne, Victoria, 3054, Australia                                                 => 3.4km from Melbourne
MB Armistice, 163-165R, Alexandra Parade, Fitzroy North, Melbourn
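The wide gaps in the output above suggest that the padding `' '*(50-len(addr))` is not based on the length of the printed address. One possible cause, stated here as an assumption because `locator` is defined elsewhere in the notebook, is that `len(addr)` measures the returned object rather than its address text. A hedged alternative is to convert to a string first and let the format specification do the padding; `format_row` is a hypothetical helper, and the example call reuses the first address and distance printed above.

```python
def format_row(address, distance_m, width=50):
    """Left-align the address text in a fixed-width column, then append the distance."""
    text = str(address)  # pad based on the characters that will actually be printed
    return '{:<{w}} => {:.1f}km from Melbourne'.format(text, distance_m / 1000, w=width)

row = format_row('Lansdowne Street, East Melbourne', 1700)
print(row)
```

Addresses longer than `width` are printed in full with no padding, so the arrows only line up for short addresses unless `width` is increased.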

This concludes the analysis. I have generated 10 addresses representing the centres of zones containing locations with a low number of restaurants and no Italian restaurants nearby, all zones being fairly close to the city centre (all less than 4km from Melbourne city centre, and about half of those less than 2km away). Although the zones are shown on the map with a radius of ~500 metres (green circles), their shape is actually very irregular, and their centres/addresses should be considered only as a starting point for exploring the surrounding neighbourhoods in search of potential restaurant locations. Most of the zones are located in the Docklands and East Melbourne boroughs, which we have identified as interesting because they are popular with tourists, fairly close to the city centre and well connected by public transport.

In [39]:
map_melbourne = folium.Map(location=roi_center, zoom_start=14)
folium.Circle(melbourne_centre, radius=50, color='red', fill=True, fill_color='red', fill_opacity=1).add_to(map_melbourne)
for lonlat, addr in zip(cluster_centres, candidate_area_addresses):
    folium.Marker([lonlat[1], lonlat[0]], popup=addr).add_to(map_melbourne) 
for lat, lon in zip(good_latitudes, good_longitudes):
    folium.Circle([lat, lon], radius=250, color='#0000ff00', fill=True, fill_color='#0066ff', fill_opacity=0.05).add_to(map_melbourne)
map_melbourne

## Results and Discussion

Our analysis shows that although there is a great number of restaurants in Melbourne (~2000 in our initial area of interest, which was 12x12km around Melbourne city centre), there are pockets of low restaurant density fairly close to the city centre. The highest concentration of restaurants was detected in the city centre and directly north and south of it, so I focused my attention on areas north-west, north-east and south-east of the city centre, corresponding to the boroughs of Docklands and East Melbourne and the area north of central Melbourne. Another borough was identified as potentially interesting (Richmond, south-east of central Melbourne), but our attention was focused on Docklands and East Melbourne, which offer a combination of popularity among tourists, closeness to the city centre, strong socio-economic dynamics and a number of pockets of low restaurant density.

After directing our attention to this narrower area of interest (covering approx. 5x5km south-east from central Melbourne) we first created a dense grid of location candidates (spaced 100m apart); those locations were then filtered so that those with more than two restaurants in radius of 250m and those with an Italian restaurant closer than 400m were removed.

Those location candidates were then clustered to create zones of interest which contain greatest number of location candidates. Addresses of centres of those zones were also generated using reverse geocoding to be used as markers/starting points for more detailed local analysis based on other factors.

The result of all this is 10 zones containing the largest number of potential new restaurant locations, based on the number of and distance to existing venues: both restaurants in general and Italian restaurants in particular. This, of course, does not imply that those zones are actually optimal locations for a new restaurant! The purpose of this analysis was only to provide information on areas close to central Melbourne that are not crowded with existing restaurants (particularly Italian ones). It is entirely possible that there is a very good reason for the small number of restaurants in any of those areas, a reason which would make them unsuitable for a new restaurant regardless of the lack of competition. The recommended zones should therefore be considered only as a starting point for a more detailed analysis, which could eventually result in locations that not only have no nearby competition but also meet all other relevant conditions.

## Conclusion

The purpose of this analysis was to identify Melbourne areas close to the city centre with a low number of restaurants (particularly Italian restaurants) in order to aid stakeholders in narrowing down the search for optimal locations for a new Italian restaurant. By calculating the restaurant density distribution from Foursquare data, I first identified the general boroughs that justify further analysis (Docklands and East Melbourne), and then generated an extensive collection of locations which satisfy some basic requirements regarding existing nearby restaurants. Clustering of those locations was then performed to create major zones of interest (containing the greatest number of potential locations), and addresses of those zone centres were created to be used as starting points for final exploration by stakeholders.

The final decision on optimal restaurant location will be made by stakeholders based on specific characteristics of neighbourhoods and locations in every recommended zone, taking into consideration additional factors like attractiveness of each location (proximity to park or water), levels of noise / proximity to major roads, real estate availability, prices, social and economic dynamics of every neighbourhood etc.