<div class="usecase-title">New Business Location Use Case</div>

<div class="usecase-authors"><b>Authored by: </b> Steven Tuften</div>

<div class="usecase-duration"><b>Duration:</b> 90 mins</div>

<div class="usecase-level-skill">
    <div class="usecase-level"><b>Level: </b>Intermediate</div>
    <div class="usecase-skill"><b>Pre-requisite Skills: </b>Python</div>
</div>

<div class="usecase-section-header">Scenario</div>

#### As a Cafe, Restaurant or Bar, I am looking for commercial space in the City of Melbourne where I can open a new venue or extend my existing venue.

#### I would like to know where similar businesses are located and the density of residents and office workers in comparison.

#### I want to know the number of seats I should provide based on seating capacity at other similar establishments in the same area.

<div class="usecase-section-header">What this use case will teach you</div>

At the end of this use case you will:
- understand what CLUE data is and how to access it
- have explored a dataset derived from the CLUE survey
- have learnt how to visualise CLUE data using different mapping visualisation techniques

<div class="usecase-section-header">A brief introduction to CLUE data</div>

The City of Melbourne conducts a comprehensive bi-annual survey of its residents and businesses called the "Census of Land Use and Employment (CLUE)". CLUE captures key information on land use, employment, and economic activity across the City of Melbourne.

CLUE datasets are a valuable tool for businesses looking to invest in the City of Melbourne and for researchers wanting to understand those factors that influence and shape the social and economic dynamics of Australia's second largest metropolis and one of the world's most liveable cities.

CLUE data assists the City of Melbourne's business planning, policy development and strategic decision making. Investors, consultants, students, urban researchers, property analysts, businesses and developers can take advantage of CLUE to understand customers, the marketplace and the changing form and nature of the city.

Source: __[CLUE](https://data.melbourne.vic.gov.au/stories/s/CLUE/rt3z-vy3t?src=hdr)__

This use case utilises various CLUE datasets to illustrate their value to Data Scientists, Researchers and Software Developers.

### CLUE Geospatial Data

CLUE Data is often coded to a specific location (Latitude and Longitude) and/or to a City precinct, referred to as the "CLUE small area". Datasets may also include the individual city block within a precinct referred to by its CLUE Block ID.

The geospatial coordinates describing these areas as polygons can be downloaded in GeoJSON format and used to show shaded areas on a map, known as a choropleth map. This can be a useful technique for illustrating broad trends or statistics for a city area rather than a specific location.
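To make the polygon idea concrete, the sketch below builds a tiny, made-up GeoJSON FeatureCollection containing one square "block" and reads back its identifier and vertex count. The coordinates and the `summarise_features` helper are illustrative assumptions; only the `block_id` and `clue_area` property names are taken from the CLUE Blocks extract used later in this use case.

```python
import json

# A made-up FeatureCollection with one square "block".
# Real CLUE polygons have many more vertices.
toy_blocks = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"block_id": "1", "clue_area": "Melbourne (CBD)"},
            "geometry": {
                "type": "Polygon",
                # One outer ring of [longitude, latitude] pairs;
                # the first and last points coincide to close the ring.
                "coordinates": [[
                    [144.960, -37.815],
                    [144.962, -37.815],
                    [144.962, -37.813],
                    [144.960, -37.813],
                    [144.960, -37.815],
                ]],
            },
        }
    ],
}


def summarise_features(collection):
    """Return (block_id, clue_area, outer-ring vertex count) for each polygon feature."""
    rows = []
    for feature in collection["features"]:
        ring = feature["geometry"]["coordinates"][0]
        props = feature["properties"]
        rows.append((props["block_id"], props["clue_area"], len(ring)))
    return rows


# Round-trip through JSON text to mimic reading a downloaded file.
parsed = json.loads(json.dumps(toy_blocks))
summary = summarise_features(parsed)
print(summary)
```

Each feature pairs a geometry with a set of properties, and it is one of those properties that a choropleth map uses to join the shape to a row of data.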

A map visualisation of CLUE Blocks and small areas can be found at the following links:
- __[CLUE small areas](https://data.melbourne.vic.gov.au/Business/Small-Areas-for-Census-of-Land-Use-and-Employment-/gei8-3w86)__
- __[CLUE Blocks](https://data.melbourne.vic.gov.au/Business/Blocks-for-Census-of-Land-Use-and-Employment-CLUE-/aia8-ryiq)__


<div class="usecase-section-header">Which CLUE data should I use?</div>

To begin we shall first import the necessary libraries to support our exploratory data analysis and visualisation of the CLUE data.

The following are core packages required for this exercise:

- The plotly.express package lets us build interactive maps using Mapbox services.

In [1]:
import os
import time
import requests
from io import StringIO
from datetime import datetime
import numpy as np
import pandas as pd
import plotly.graph_objs as go
import plotly.express as px

In [2]:

def API_Unlimited(datasetname):
    """
    Download a complete dataset from the City of Melbourne Open Data API.

    Parameters:
    datasetname (string): dataset identifier as used by the City of Melbourne Open Data Portal

    Returns:
    DataFrame: the full dataset as a pandas DataFrame, or None if the request fails

    Note: no API key is passed in this version. If a key is needed, it could be
    read from a local file (for example API.txt) stored in the current workspace.
    """
    base_url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/'
    export_format = 'csv'

    url = f'{base_url}{datasetname}/exports/{export_format}'
    params = {
        'select': '*',
        'limit': -1,  # all records
        'lang': 'en',
        'timezone': 'UTC'
    }

    # GET request
    response = requests.get(url, params=params)

    if response.status_code == 200:
        # Wrap the decoded text in StringIO so pandas can read it like a file
        url_content = response.content.decode('utf-8')
        dataset = pd.read_csv(StringIO(url_content), delimiter=';')
        print(dataset.sample(10, random_state=999)) # Test
        return dataset
    else:
        print(f'Request failed with status code {response.status_code}')
        return None

In [3]:

import geopandas as gpd

def fetch_geojson_dataset_API(dataset_id): # pass in the dataset identifier

    base_url = 'https://data.melbourne.vic.gov.au/api/v2/catalog/datasets/'
    # The export endpoint also supports other formats such as json and csv
    export_format = 'geojson'

    url = f'{base_url}{dataset_id}/exports/{export_format}'
    params = {
        'select': '*',
        'limit': -1,  # all records
        'lang': 'en',
        'timezone': 'UTC'
    }
    # GET request (the params defined above are now passed with the request)
    response = requests.get(url, params=params)

    if response.status_code == 200:
        geojson_data = gpd.read_file(response.text)
        return geojson_data
    else:
        print(f'Request failed with status code {response.status_code}')
        return None


In [4]:
dataset_id_1 = 'residential-dwellings'
res_dataset = API_Unlimited(dataset_id_1)


        census_year  block_id  property_id  base_property_id  \
52936          2018       732       110482            110482   
164738         2016      2516       614680            614680   
129892         2013       328       560796            107110   
6938           2006       910       108487            108487   
78980          2002       861       107375            107375   
60032          2009       511       100305            100305   
151863         2010       101       623962            623962   
44513          2022       352       556077            556077   
42587          2012       355       106001            106001   
17224          2007       420       104712            104712   

                                      building_address  \
52936          109-117 Clarendon Street SOUTHBANK 3006   
164738             12 McConnell Street KENSINGTON 3031   
129892          21 Plane Tree Way NORTH MELBOURNE 3051   
6938                   141 Royal Parade PARKVILLE 3052   
78980

In [5]:

# Filter the residential dataset to the 2020 census year only.
# .copy() gives an independent DataFrame so the in-place edits below do not act on a slice.
res_dataset = res_dataset[res_dataset["census_year"] == 2020].copy()

# Rename the columns to match the names used in the original version of this use case (by Steven Tuften),
# then select them in the original column order
res_dataset.rename(columns={'property_id': 'pbs_property_id', 'base_property_id': 'bps_base_id',"building_address":"street_name","longitude":"x_coordinate","latitude":"y_coordinate"}, inplace=True)
columns_list = ["census_year","block_id","pbs_property_id","bps_base_id","street_name","clue_small_area","dwelling_type","dwelling_number","x_coordinate","y_coordinate"]
res_dataset = res_dataset[columns_list]

Next, we will look at one of the CLUE datasets to better understand its structure and how we can use it.

Our data requirements from this use case include the following:
- Number of Residential Dwellings per CLUE Block
- Number of Employees per CLUE Block
- Number of Seats (Indoor and Outdoor) per Venue and CLUE Block

For this exercise, we shall start by examining the Residential Dwelling dataset.
Each dataset in the Melbourne Open Data Portal has a unique identifier which can be used to retrieve the dataset through the portal's API, here via the API_Unlimited helper defined above.

This dataset is placed in a Pandas dataframe and we will inspect the first three rows.
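Before inspecting it, the following sketch shows roughly how a dataset identifier becomes an export URL and how a semicolon-delimited response body could be parsed into a DataFrame. No request is sent: `build_export_url` is a hypothetical helper, and `sample_body` is a made-up two-row stand-in for a real response.

```python
from io import StringIO
from urllib.parse import urlencode

import pandas as pd

BASE_URL = "https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/"


def build_export_url(dataset_id, export_format="csv"):
    """Assemble the export URL for a dataset identifier (no request is sent)."""
    query = urlencode({"select": "*", "limit": -1, "lang": "en", "timezone": "UTC"})
    return f"{BASE_URL}{dataset_id}/exports/{export_format}?{query}"


export_url = build_export_url("residential-dwellings")
print(export_url)

# Made-up response body; the CSV export read in this use case is semicolon-delimited.
sample_body = "census_year;block_id;dwelling_number\n2020;332;1\n2020;332;1\n"
sample_df = pd.read_csv(StringIO(sample_body), delimiter=";")
print(sample_df.shape)
```

Keeping the URL construction separate from the download makes it easy to print and check the address in a browser when a request fails.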

In [6]:
# Inspect the "CLUE Residential Dwellings 2020" dataset retrieved above

print(f'The shape of dataset is {res_dataset.shape}.')
print('Below are the first few rows of this dataset:')

# Transpose the DataFrame for easier visual comparison.
res_dataset.head(3).T

The shape of dataset is (10404, 10).
Below are the first few rows of this dataset:


| | 75881 | 75883 | 75885 |
|---|---|---|---|
| census_year | 2020 | 2020 | 2020 |
| block_id | 332 | 332 | 332 |
| pbs_property_id | 629485 | 629486 | 636678 |
| bps_base_id | 629485 | 629486 | 636678 |
| street_name | 46 Provost Street NORTH MELBOURNE VIC 3051 | 48 Provost Street NORTH MELBOURNE VIC 3051 | 28 Provost Street NORTH MELBOURNE VIC 3051 |
| clue_small_area | North Melbourne | North Melbourne | North Melbourne |
| dwelling_type | House/Townhouse | House/Townhouse | House/Townhouse |
| dwelling_number | 1 | 1 | 1 |
| x_coordinate | 144.946641 | 144.946585 | 144.947173 |
| y_coordinate | -37.801765 | -37.801759 | -37.801816 |


We can see that there are 10,404 records and 10 fields describing each record.

Each record shows us the number of dwellings for each individual property and the type of dwelling e.g. House/Townhouse, Residential Apartments, etc.

The location of each property is given using:
- Latitude and Longitude
- CLUE Small Area and Block ID
- Property Id

The Census year that the data was collected is also shown.

For our analysis of this dataset and others we will be restricting our analysis to the 2020 CLUE Census and summarising the data to CLUE Block level.

<div class="usecase-section-header">Summarising Residential Dwelling data</div>

We want to plot the density of both residential dwellings and employment at city block level rather than a specific property or address. We can use a __[choropleth map](https://en.wikipedia.org/wiki/Choropleth_map)__ to do this.

Let's start by summarising the data at CLUE small area and Block level.

*Note: We include CLUE Small Area as one of our group by fields so we can display the CLUE Small area name in the popup window when you hover over the area on the map.*

We want to summarise the data by summing the number of dwellings across all rows in the same CLUE Block.

The following cell creates a dataframe containing this summary of residential dwellings.

In [7]:
# Cast datatypes to correct type so we can summarise
res_dataset[['census_year', 'dwelling_number']] = res_dataset[['census_year', 'dwelling_number']].astype(int)
res_dataset[['x_coordinate', 'y_coordinate']] = res_dataset[['x_coordinate', 'y_coordinate']].astype(float)
res_dataset = res_dataset.convert_dtypes() # convert remaining to string
res_dataset.dtypes

# create the aggregate dataset
groupbyfields = ['block_id','clue_small_area']
aggregatebyfields = {'dwelling_number': ["sum"]}

dwellingsByBlock = pd.DataFrame(res_dataset.groupby(groupbyfields, as_index=False).agg(aggregatebyfields))

# A DataFrame groupby with a list of aggregations creates two levels of column headings,
# so we flatten the headings to make it easier to extract data for plotting
dwellingsByBlock.columns = dwellingsByBlock.columns.map(''.join) # flatten column header
dwellingsByBlock.rename(columns={'clue_small_area': 'clue_area'}, inplace=True) #rename to match GeoJSON extract
dwellingsByBlock.rename(columns={'dwelling_numbersum': 'dwelling_count'}, inplace=True)
dwellingsByBlock.head(5)

| | block_id | clue_area | dwelling_count |
|---|---|---|---|
| 0 | 1 | Melbourne (CBD) | 385 |
| 1 | 11 | Melbourne (CBD) | 690 |
| 2 | 12 | Melbourne (CBD) | 190 |
| 3 | 13 | Melbourne (CBD) | 112 |
| 4 | 14 | Melbourne (CBD) | 99 |
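As an aside, the two-level column header comes from passing a list of aggregation functions. One possible alternative, sketched here on made-up rows rather than the real dataset, is pandas named aggregation, which names the output column directly so no flattening step is needed.

```python
import pandas as pd

# Made-up property rows for three blocks (not real CLUE records).
toy = pd.DataFrame({
    "block_id": [1, 1, 11, 11, 12],
    "clue_small_area": ["Melbourne (CBD)"] * 5,
    "dwelling_number": [100, 285, 600, 90, 190],
})

# Named aggregation: output_column=(input_column, function).
dwellings_by_block = (
    toy.groupby(["block_id", "clue_small_area"], as_index=False)
       .agg(dwelling_count=("dwelling_number", "sum"))
       .rename(columns={"clue_small_area": "clue_area"})
)
print(dwellings_by_block)
```

The result has the same three columns as the summary above (block_id, clue_area, dwelling_count) with a single header level.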


<div class="usecase-section-header">Visualising Residential Dwelling on a Choropleth Map</div>

We use the __[Plotly Python Open Source Graphing Library](https://plotly.com/python/)__ to generate maps from __[Mapbox](https://www.mapbox.com/)__.

Creating a choropleth map requires us to know the geometry (shape) of each CLUE Block area as a collection of latitude and longitude points defining a polygon. This data can be downloaded from the Melbourne Open Data Portal in __[GeoJSON](https://en.wikipedia.org/wiki/GeoJSON)__ format.

We also need to supply the data to be used to highlight the CLUE Blocks and that data must include the same unique identifier for each Block contained in the GeoJSON data set.
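Because the map can only shade a block when the identifier in the data matches one in the GeoJSON, it can be worth checking the two sets of identifiers before plotting. The sketch below is one way to do that on made-up inputs; `unmatched_ids` is a hypothetical helper, and comparing the identifiers as strings is an assumption made here to sidestep differences between integer and text IDs.

```python
import pandas as pd

# Made-up GeoJSON with three blocks; geometry is omitted because only the IDs matter here.
toy_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"block_id": "1"}, "geometry": None},
        {"type": "Feature", "properties": {"block_id": "11"}, "geometry": None},
        {"type": "Feature", "properties": {"block_id": "12"}, "geometry": None},
    ],
}

# Made-up summary data; block 99 has no matching polygon.
toy_counts = pd.DataFrame({"block_id": [1, 11, 99], "dwelling_count": [385, 690, 5]})


def unmatched_ids(df, geojson, column="block_id", prop="block_id"):
    """Return IDs in df[column] that have no feature with a matching properties[prop].

    Both sides are compared as strings.
    """
    feature_ids = {str(f["properties"][prop]) for f in geojson["features"]}
    data_ids = set(df[column].astype(str))
    return sorted(data_ids - feature_ids)


missing = unmatched_ids(toy_counts, toy_geojson)
print(missing)
```

An empty result would indicate that every row in the data has a polygon to colour.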

Below we extract the Melbourne CLUE Block polygons into a GeoJSON datatype.

In [8]:
import json
import geopandas as gpd
dataset_id_2 = 'blocks-for-census-of-land-use-and-employment-clue'
block = fetch_geojson_dataset_API(dataset_id_2)
#block = gpd.read_file('https://data.melbourne.vic.gov.au/api/v2/catalog/datasets/blocks-for-census-of-land-use-and-employment-clue/exports/geojson')
block

| | geo_point_2d | block_id | clue_area | geometry |
|---|---|---|---|---|
| 0 | {'lon': 144.9421522923356, 'lat': -37.78851721... | 925 | Parkville | POLYGON ((144.94214 -37.78757, 144.94100 -37.7... |
| 1 | {'lon': 144.9428068448982, 'lat': -37.78752314... | 924 | Parkville | POLYGON ((144.94214 -37.78757, 144.94319 -37.7... |
| 2 | {'lon': 144.94536433435934, 'lat': -37.7803563... | 930 | Parkville | POLYGON ((144.94259 -37.77872, 144.94202 -37.7... |
| 3 | {'lon': 144.94336441344424, 'lat': -37.8072018... | 412 | West Melbourne (Residential) | POLYGON ((144.94271 -37.80732, 144.94370 -37.8... |
| 4 | {'lon': 144.94240782994808, 'lat': -37.8067063... | 410 | West Melbourne (Residential) | POLYGON ((144.94271 -37.80732, 144.94285 -37.8... |
| ... | ... | ... | ... | ... |
| 601 | {'lon': 144.92542680884995, 'lat': -37.7881674... | 2502 | Kensington | POLYGON ((144.92464 -37.78880, 144.92464 -37.7... |
| 602 | {'lon': 144.92690534228524, 'lat': -37.7883951... | 2506 | Kensington | POLYGON ((144.92715 -37.78854, 144.92736 -37.7... |
| 603 | {'lon': 144.94077827466833, 'lat': -37.7887369... | 2382 | North Melbourne | POLYGON ((144.94216 -37.78939, 144.94100 -37.7... |
| 604 | {'lon': 144.9399716568005, 'lat': -37.79303223... | 2388 | North Melbourne | POLYGON ((144.93945 -37.79223, 144.93919 -37.7... |


Now, with a single call to the 'choropleth_mapbox' function, we can display an interactive map using the **block** GeoJSON data to define the regions and the **dwellingsByBlock** dataframe to supply the summarised data by block.

In [9]:
# Display the choropleth map
fig = px.choropleth_mapbox(dwellingsByBlock, # pass in the summarised dwellings per block
                           geojson=block, # pass in the GeoJSON data defining the CLUE Block polygons
                           locations='block_id', # define the unique identifier for the Blocks from the dataframe
                           color='dwelling_count', # change the colour of the block region according to the dwelling count
                           color_continuous_scale=["#FFFF88", "yellow", "orange", "orange",
                                                   "orange", "darkorange", "red", "darkred"], # define custom colour scale
                           range_color=(0, dwellingsByBlock['dwelling_count'].max()), # set the numeric range for the colour scale
                           featureidkey="properties.block_id", # define the Unique polygon identifier from the GeoJSON data
                           mapbox_style="stamen-toner", # set the visual style of the map
                           zoom=12.15, # set the zoom level
                           center = {"lat": -37.813, "lon": 144.945}, # set the map centre coordinates
                           opacity=0.5, # opacity of the choropleth polygons
                           hover_name='clue_area', # the title of the hover pop up box
                           hover_data={'block_id':True,'dwelling_count':True}, # defines which dataframe fields to display
                                                                               # in the hover popup box
                           labels={'dwelling_count':'Number of Dwellings','block_id':'CLUE Block Id'}, # defines labels for
                                                                               # the hover popup box
                           title='Residential Dwellings by CLUE Block Id for 2020', # Title for plot
                           width=950, height=800 # dimensions of plot in pixels
                          )
fig.show()

You've successfully used Melbourne CLUE Open Data and Plotly to visualise residential density in the City of Melbourne!<br>
Now zoom in and out on the map above to explore the city and areas of high and low residential density.<br><br>
This is your first step to selecting a suitable location for your new business!

__[You can explore the Residential Density data here](../dataanalysis/eda-clue-residentialdwellings.ipynb)__.

<div class="usecase-section-header">Visualising Residential Density and Cafe or Restaurant Seating</div>

To build our view of cafe venue seating and how it relates to residential density we need to visualise both datasets on the same interactive map view.

We can do this by adding a new layer (or "trace" as it is called in Plotly) to our previous map of residential density.

Let's extract the Melbourne CLUE cafe, restaurant, bistro seats dataset and summarise it so it's ready to plot.

In [10]:
# Pull dataset for Cafe, restaurant and bistro seat dataset

dataset_id_3 = 'cafes-and-restaurants-with-seating-capacity'
cafe_dataset = API_Unlimited(dataset_id_3)
#Filter cafe dataset for 2020
cafe_dataset = cafe_dataset[cafe_dataset["census_year"] == 2020]

# Cast columns to correct data type
cafe_dataset.rename(columns={"longitude":"x_coordinate","latitude":"y_coordinate"},inplace=True)
integer_columns = ['census_year', 'block_id', 'property_id', 'base_property_id', 'industry_anzsic4_code', 'number_of_seats']
fp_columns = ['x_coordinate', 'y_coordinate']
cafe_dataset[integer_columns] = cafe_dataset[integer_columns].astype(int)
cafe_dataset[fp_columns] = cafe_dataset[fp_columns].astype(float)
cafe_dataset = cafe_dataset.convert_dtypes() # convert remaining to string

# Summarise venue seating by location
groupbyfields = ['clue_small_area','block_id','y_coordinate','x_coordinate']
aggregatebyfields = {'number_of_seats': ["sum"]}

seatsByLocn = pd.DataFrame(cafe_dataset.groupby(groupbyfields, as_index=False).agg(aggregatebyfields))
seatsByLocn.columns = seatsByLocn.columns.map(''.join) # flatten column header
seatsByLocn.rename(columns={'clue_small_area': 'clue_area'}, inplace=True) #rename to match GeoJSON extract
seatsByLocn.rename(columns={'number_of_seatssum': 'number_of_seats'}, inplace=True) #rename to match GeoJSON extract
seatsByLocn['number_of_seats'] = seatsByLocn['number_of_seats'].astype(int)

# Calculate scale for drawing each bubble on scatter map plot
all_data_diffq = (seatsByLocn["number_of_seats"].max() - seatsByLocn["number_of_seats"].min()) / 16
seatsByLocn['scale'] = (seatsByLocn["number_of_seats"] - seatsByLocn["number_of_seats"].min()) / all_data_diffq + 1
seatsByLocn['scale'] = seatsByLocn['scale'].astype(int)+2
seatsByLocn.head(10)

       census_year  block_id  property_id  base_property_id  \
33589         2010       354       109798            109798   
53701         2015        58       105656            105656   
12141         2016        94       105480            105480   
1603          2020       104       104084            104084   
13239         2019        68       107766            107766   
12506         2016       752       110734            110733   
49854         2018       505       100508            100508   
45230         2010        58       103621            103621   
47447         2022        28       102061            102060   
9747          2005        57       101217            101217   

                                        building_address  clue_small_area  \
33589       464-468 Victoria Street NORTH MELBOURNE 3051  North Melbourne   
53701              11-19 Liverpool Street MELBOURNE 3000  Melbourne (CBD)   
12141                 360 La Trobe Street MELBOURNE 3000  Melbourne (CBD)  

| | clue_area | block_id | y_coordinate | x_coordinate | number_of_seats | scale |
|---|---|---|---|---|---|---|
| 0 | Carlton | 203 | -37.796707 | 144.965534 | 51 | 3 |
| 1 | Carlton | 203 | -37.79668 | 144.9649 | 42 | 3 |
| 2 | Carlton | 204 | -37.797834 | 144.965174 | 50 | 3 |
| 3 | Carlton | 204 | -37.797255 | 144.965754 | 120 | 3 |
| 4 | Carlton | 205 | -37.799463 | 144.964894 | 96 | 3 |
| 5 | Carlton | 205 | -37.799001 | 144.964765 | 80 | 3 |
| 6 | Carlton | 205 | -37.798721 | 144.965257 | 41 | 3 |
| 7 | Carlton | 206 | -37.800458 | 144.966553 | 51 | 3 |
| 8 | Carlton | 206 | -37.800191 | 144.966716 | 140 | 3 |
| 9 | Carlton | 206 | -37.800046 | 144.966741 | 115 | 3 |


Above we can see our summary dataframe has calculated the total number of seats (indoor and outdoor) at each unique location (latitude and longitude).

Since there is such a wide variance in venue seating across the city we need to scale the size of the bubbles drawn on the map to just a few (16) distinct sizes.

We set the lowest scale to 3 to ensure even the smallest venue's bubble is large enough when one zooms in at block level.
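To see how the scaling behaves, the sketch below applies the same arithmetic to three made-up seat counts. The `bubble_scale` helper and its parameter names are illustrative assumptions; the formula itself follows the cell above.

```python
import pandas as pd


def bubble_scale(seats, steps=16, offset=2):
    """Map seat counts onto integer bubble sizes using the arithmetic from the cell above."""
    step_width = (seats.max() - seats.min()) / steps   # width of one size bucket
    scaled = (seats - seats.min()) / step_width + 1    # 1 for the smallest venue
    return scaled.astype(int) + offset                 # truncate, then shift so the minimum is 3


# Made-up seat counts: smallest, mid-sized and largest venue.
toy_seats = pd.Series([10, 50, 170])
sizes = bubble_scale(toy_seats)
print(sizes.tolist())
```

With these toy values the smallest venue maps to 3 and the largest to 19. Only the venue with the maximum seat count reaches that top value; every other venue falls into one of the sixteen buckets from 3 to 18.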

The next step is to display both the Choropleth and Scatter maps.
We first draw the choropleth map showing residential density.
We then draw the scatter plot assigning it as a trace (aka "layer") to the existing figure then show both.

In [11]:
# Plot residential density and venue seating
fig = px.choropleth_mapbox(dwellingsByBlock, geojson=block, locations='block_id', color='dwelling_count',
                           color_continuous_scale=["#FFFF88", "yellow", "orange", "orange",
                                                   "orange", "darkorange", "red", "darkred"],
                           range_color=(0, dwellingsByBlock['dwelling_count'].max()),
                           featureidkey="properties.block_id",
                           mapbox_style="stamen-toner", #"carto-positron",
                           zoom=12.15,
                           center = {"lat": -37.813, "lon": 144.945},
                           opacity=0.5,
                           hover_name='clue_area',
                           hover_data={'block_id':True,'dwelling_count':True},
                           labels={'dwelling_count':'Number of Dwellings','block_id':'CLUE Block Id'},
                           title='Residential Dwellings Density & Venue Seating (2020)',
                           width=950, height=800
                          )

# Plot of venue seating
fig2 = px.scatter_mapbox(seatsByLocn, lat="y_coordinate", lon="x_coordinate", size="scale",
                        mapbox_style="stamen-toner",
                        zoom=12.15,
                        center = {"lat": -37.813, "lon": 144.945},
                        opacity=0.70,
                        hover_name="clue_area",
                        hover_data={"block_id":True,"scale":False,"number_of_seats":True,"x_coordinate":False,"y_coordinate":False},
                        color_discrete_sequence=['purple'],
                        labels={'number_of_seats':'Number of Seats', 'block_id':'CLUE Block Id'},
                        width=950, height=800)
fig.add_trace(fig2.data[0])
fig.update_geos(fitbounds="locations", visible=False)

fig.show()

You've successfully used Melbourne CLUE Open Data and Plotly to visualise residential density and venue seating in the City of Melbourne in one map!<br>
Now zoom in and out on the map above to explore the city and areas of high residential density but low venue seating.<br><br>
This could be a possible location for your new business!

__[You can explore the Venue Seating data in more detail here](../dataanalysis/eda-clue-venueseats.ipynb)__.
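The same comparison can also be made numerically. As a rough sketch on made-up numbers, the code below joins dwellings and seats at block level and ranks blocks by seats per dwelling, so that blocks with many residents but little seating appear first. The `seats_per_dwelling` measure is an illustrative assumption, not something defined by CLUE.

```python
import pandas as pd

# Made-up block summaries (not real CLUE figures).
toy_dwellings = pd.DataFrame({"block_id": [1, 11, 12], "dwelling_count": [385, 690, 190]})
toy_seats = pd.DataFrame({"block_id": [1, 1, 11], "number_of_seats": [50, 150, 30]})

# Total the seats per block, since the seating data has one row per venue location.
seats_per_block = toy_seats.groupby("block_id", as_index=False)["number_of_seats"].sum()

# A left join keeps blocks that have dwellings but no venues at all.
combined = toy_dwellings.merge(seats_per_block, on="block_id", how="left")
combined["number_of_seats"] = combined["number_of_seats"].fillna(0).astype(int)
combined["seats_per_dwelling"] = combined["number_of_seats"] / combined["dwelling_count"]

# Lowest ratio first: most dwellings relative to available seating.
ranked = combined.sort_values("seats_per_dwelling").reset_index(drop=True)
print(ranked)
```

In this toy data block 12 ranks first because it has dwellings but no venue seating at all.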

<div class="usecase-section-header">Building an Interactive Visualisation for New Business Location</div>

In the previous step we saw how we can add a new layer, also called a trace, to an existing Mapbox plot in order to visualise both residential density and cafe or restaurant venue seating on the one map.

We now wish to add Employment Density to this visualisation.
Since employment density and residential density both require a choropleth map to visualise data at CLUE Block level, we cannot overlay these two layers at the same time.

We therefore need a way to select the base choropleth map to show either residential density or employment density and then optionally turn on or off the venue seating as an additional scatter Mapbox layer.

To achieve this interactivity we can use Plotly to build a drop-down menu and buttons overlaid on the map.

We will require three datasets and associated layers (traces) for this visualisation.

Let's start by extracting our third dataset titled __["Employment per industry for blocks 2020"](https://data.melbourne.vic.gov.au/Business/Employment-per-industry-for-blocks-2020/qnju-it8g)__ and performing some data preparation prior to plotting.

*Note: The ***"Employment per industry for blocks 2020"*** dataset is a summary of employment at CLUE Block level and so we do not need to perform a groupby aggregation on the dataset.*

In [12]:
# Pull dataset for the Job employment by block by clue industry
dataset_id_4 = 'employment-by-block-by-clue-industry'
jobs_dataset = API_Unlimited(dataset_id_4)
#Filter jobs dataset for 2020
jobs_dataset = jobs_dataset[jobs_dataset["census_year"] == 2020]
#rename columns
jobs_dataset.rename(columns={"total_jobs_in_block":"total_employment_in_block"}, inplace=True)
# Filter out unwanted columns
columnsToKeep = ['clue_small_area','block_id','total_employment_in_block']
employmentByBlock = jobs_dataset.filter(columnsToKeep)

# Rename to match GeoJSON extract
employmentByBlock.rename(columns={'clue_small_area': 'clue_area'}, inplace=True)

# Replace all NaNs with zero
employmentByBlock.fillna(value=0,inplace=True)

# Cast columns to correct datatype
employmentByBlock[['block_id','total_employment_in_block']] = employmentByBlock[['block_id','total_employment_in_block']].astype(int)
employmentByBlock = employmentByBlock.convert_dtypes() # convert remaining to string

# Exclude summary total for all of City of Melbourne
employmentByBlock = employmentByBlock[employmentByBlock['block_id'] > 0]

# Display sample data
employmentByBlock.head(5)

       census_year  block_id               clue_small_area  accommodation  \
2533          2014       409  West Melbourne (Residential)            0.0   
3598          2009       865                   South Yarra            0.0   
599           2003       855                   South Yarra            0.0   
2801          2013       787                Port Melbourne            0.0   
283           2004       222                       Carlton            0.0   
2109          2016       329               North Melbourne            0.0   
8275          2022        61               Melbourne (CBD)            NaN   
2158          2016       528                    Kensington            0.0   
10114         2013       114               Melbourne (CBD)            NaN   
690           2002       353               North Melbourne            NaN   

       admin_and_support_services  agriculture_and_mining  \
2533                          0.0                     0.0   
3598                          

| | clue_area | block_id | total_employment_in_block |
|---|---|---|---|
| 1231 | Melbourne (CBD) | 2 | 195 |
| 1232 | Melbourne (CBD) | 5 | 5 |
| 1233 | Melbourne (CBD) | 16 | 821 |
| 1234 | Melbourne (CBD) | 21 | 4892 |
| 1235 | Melbourne (CBD) | 23 | 2211 |
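Since this dataset is described as already summarised at block level, a quick sanity check is to confirm that each block appears only once after filtering to a single census year. The sketch below shows one way to do that on a few rows copied from the sample above; `is_block_level` is a hypothetical helper.

```python
import pandas as pd

# Three rows taken from the sample output above.
toy_employment = pd.DataFrame({
    "clue_area": ["Melbourne (CBD)", "Melbourne (CBD)", "Melbourne (CBD)"],
    "block_id": [2, 5, 16],
    "total_employment_in_block": [195, 5, 821],
})


def is_block_level(df, key="block_id"):
    """Return True when every value of `key` appears exactly once."""
    return bool(df[key].is_unique)


print(is_block_level(toy_employment))

# Appending a repeated block shows what a failed check looks like.
with_duplicate = pd.concat([toy_employment, toy_employment.iloc[[0]]], ignore_index=True)
print(is_block_level(with_duplicate))
```

If the check failed on the real data, a groupby on block_id would be needed before plotting, as was done for the residential dwellings.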


Now we have a dataset showing total number of employees by CLUE block, let's visualise it as a choropleth map and overlay venue seating.

In this map visualisation we will use a different map style called "open-street-map" which lets us identify the names of venues close to where the venue seating measures have been reported. **Note that not all venues may have been marked on OpenStreetMap.**

Mapbox styles which do not require a Mapbox API token are 'open-street-map', 'white-bg', 'carto-positron', 'carto-darkmatter', 'stamen-terrain', 'stamen-toner', 'stamen-watercolor'. Mapbox styles which do require a Mapbox API token are 'basic', 'streets', 'outdoors', 'light', 'dark', 'satellite', 'satellite-streets'.

**Source:** __[plotly.express.line_mapbox documentation](https://plotly.com/python-api-reference/generated/plotly.express.line_mapbox.html)__

In [13]:
# Plot employment density
fig = px.choropleth_mapbox(employmentByBlock, geojson=block, locations='block_id', color='total_employment_in_block',
                           color_continuous_scale="Blues",
                           range_color=(0, employmentByBlock['total_employment_in_block'].max()),
                           featureidkey="properties.block_id",
                           mapbox_style="open-street-map",
                           zoom=12.15,
                           center = {"lat": -37.813, "lon": 144.945},
                           opacity=0.5,
                           hover_name='clue_area',
                           hover_data={'block_id':True,'total_employment_in_block':True},
                           labels={'total_employment_in_block':'Number of Employees','block_id':'CLUE Block Id'},
                           title='Employment Density & Venue Seating (2020)',
                           width=950, height=800
                          )

# Plot of venue seating
fig2 = px.scatter_mapbox(seatsByLocn, lat="y_coordinate", lon="x_coordinate", size="scale",
                        mapbox_style="stamen-toner",
                        zoom=12.15,
                        center = {"lat": -37.813, "lon": 144.945},
                        opacity=0.70,
                        hover_name="clue_area",
                        hover_data={"block_id":True,"scale":False,"number_of_seats":True,"x_coordinate":False,"y_coordinate":False},
                        color_discrete_sequence=['purple'],
                        labels={'number_of_seats':'Number of Seats', 'block_id':'CLUE Block Id'},
                        width=950, height=800)
fig.add_trace(fig2.data[0])
fig.update_geos(fitbounds="locations", visible=False)

fig.show()

<div class="usecase-section-header">Combining all map layers into one interactive Mapbox visualisation</div>

Let's now build a single Mapbox visualisation using our three datasets.

Our first step is to create a base plotly figure to which we can add each individual map plot as a new layer.

The title of the visualisation and any common parameters can be set using the fig.update_layout() method.

In the cell below we also have defined two custom colorscales, one continuous for the choropleth map and the other discrete for the scatter map plot.

We then create a figure for each dataset and add it as a layer to the base figure using the fig.add_trace() method.

In [14]:
# Define custom colour scale for choropleth (continuous) and scatter (discrete)
custom_continuous_colorscale = [(0, "lightblue"), (0.25, "blue"), (1, "darkblue")]
custom_discrete_colorscale = ['red']

# Create the base figure to which layers(traces) will be added.
fig = go.Figure()

# Set the default style for the map
fig.update_layout(mapbox_style="open-street-map")
fig.update_layout(hovermode='closest')
fig.update_layout(mapbox_center_lat=-37.813, mapbox_center_lon=144.945, mapbox_zoom=12.15)
fig.update_layout(width=950, height=800)
fig.update_layout(title='Residential & Employment Density plus Venue Seating (2020)')
fig.update_layout(coloraxis_colorscale=custom_continuous_colorscale)
fig.update_layout(coloraxis_colorbar={'title':'Density'})

# Create the definition for the Residential Dwellings Layer
fig1 = px.choropleth_mapbox(dwellingsByBlock, geojson=block, locations='block_id', color='dwelling_count',
                           range_color=(0, dwellingsByBlock['dwelling_count'].max()),
                           featureidkey="properties.block_id",
                           hover_name='clue_area',
                           hover_data={'block_id':True,'dwelling_count':True},
                           labels={'dwelling_count':'Number of Dwellings','block_id':'CLUE Block Id'},
                           opacity=0.5
                          )
fig.add_trace(fig1.data[0]) # add this layer to the base figure

# Create the definition for the Employment Layer
fig2 = px.choropleth_mapbox(employmentByBlock, geojson=block, locations='block_id', color='total_employment_in_block',
                           range_color=(0, employmentByBlock['total_employment_in_block'].max()),
                           featureidkey="properties.block_id",
                           hover_name='clue_area',
                           hover_data={'block_id':True,'total_employment_in_block':True},
                           labels={'total_employment_in_block':'Number of Employees','block_id':'CLUE Block Id'},
                           opacity=0.5
                          )
fig.add_trace(fig2.data[0]) # add this layer to the base figure

# Create the definition for the Venue Seating Layer
fig3 = px.scatter_mapbox(seatsByLocn, lat="y_coordinate", lon="x_coordinate", size="scale",
                        hover_name="clue_area",
                        hover_data={"block_id":True,"scale":False,"number_of_seats":True,"x_coordinate":False,"y_coordinate":False},
                        labels={'number_of_seats':'Number of Seats', 'block_id':'CLUE Block Id'},
                        opacity=0.70, color_discrete_sequence=custom_discrete_colorscale
                        )
fig.add_trace(fig3.data[0]) # add this layer to the base figure

Finally, we define buttons and text to appear along the top of the map.

Each button shows a particular combination of layers when it is clicked. The layers are set by the 'visible' list in the button's args: it holds one boolean value per map layer, in the order the layers were added to the figure.

For example, clicking the 'Residential Density & Seating' button applies 'visible':[True, False, True], which shows the 1st and 3rd layers and hides the 2nd. The 1st layer is the Residential Dwelling density choropleth map and the 3rd layer is the Venue Seating scatter map.
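
The mapping from layer names to a 'visible' list can be sketched in plain Python. This is only an illustration of the idea, not code from the notebook: the layer names in `LAYER_ORDER` and the helpers `visibility_mask` and `make_button` are hypothetical, and the sketch assumes the three layers were added in the order residential, employment, seating.

```python
# Layer names in the order the traces were added to the base figure.
# These names are illustrative labels, not identifiers from the notebook.
LAYER_ORDER = ["residential", "employment", "seating"]


def visibility_mask(*shown, layer_order=LAYER_ORDER):
    """Return one boolean per layer, True for each layer named in `shown`."""
    unknown = set(shown) - set(layer_order)
    if unknown:
        raise ValueError(f"Unknown layer(s): {sorted(unknown)}")
    return [name in shown for name in layer_order]


def make_button(label, *shown):
    """Build a button dict whose 'visible' list matches the named layers."""
    return dict(method="update", label=label,
                args=[{"visible": visibility_mask(*shown)}])


buttons_sketch = [
    make_button("Venue Seating only", "seating"),
    make_button("Residential Density & Seating", "residential", "seating"),
    make_button("Employment Density & Seating", "employment", "seating"),
]
```

Naming the layers rather than typing the booleans by hand makes it harder to get the order wrong, and a misspelt layer name raises an error instead of silently hiding a layer.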

In [15]:
# Hide both choropleth layers so the map opens on the 'Venue Seating only' view
fig.update_traces(visible=False, selector=dict(type='choroplethmapbox'))

# Define one button per map view; 'visible' holds one boolean per layer, in the order the layers were added
buttons = [dict(method='update',
                label='Venue Seating only',  visible=True,
                args=[{'label': 'Venue Seating', 'visible':[False, False, True]}]),
           dict(method='update',
                label='Residential Density & Seating', visible=True,
                args=[{'label': 'Residential Dwelling Density','visible':[True, False, True]}]),
           dict(method='update',
                label='Employment Density & Seating', visible=True,
                args=[{'label': 'Employment Density','visible':[False, True, True]}])
          ]

# Position the menu just above the map; 'active':0 pre-selects the first button
um_buttons = [{'active':0, 'showactive':True, 'buttons':buttons,
               'direction': 'down', 'xanchor': 'left','yanchor': 'bottom', 'x': 0.71, 'y': 1.01}]
# Prompt text displayed above the map
map_annotations = [{'text':'Please select a map view to display', 'x': 1, 'y': 1.1,
                    'showarrow': False, 'font':{'family':'Arial','size':14}}]

fig.update_layout(updatemenus=um_buttons, annotations=map_annotations)

# Display the map
fig.show()

<div class="usecase-section-header">Congratulations. Our interactive map is now complete!</div>

Now you can use the controls on the map above to explore the City of Melbourne and observe the residential density and employment density of each city block in relation to venue seating capacity.<br><br>

If you would like to extend this interactive map further, please visit the __[City of Melbourne Open Data Site](https://data.melbourne.vic.gov.au/)__ and explore some of the other valuable datasets including:
- __[Off Street Parking](https://data.melbourne.vic.gov.au/Transport/Off-street-car-parking-2020/g9am-cna5)__
- __[Pedestrian Counting System](https://data.melbourne.vic.gov.au/Transport/Pedestrian-Counting-System-Monthly-counts-per-hour/b2ak-trbp)__
- __[Microclimate sensor readings](https://data.melbourne.vic.gov.au/Environment/Microclimate-Sensor-Readings/u4vh-84j8?src=featured_banner)__
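
If you add a fourth layer built from one of these datasets, each button's 'visible' list should then carry one entry per layer, so every existing list needs a fourth value. The sketch below shows one way to do that in plain Python. It is an assumption-laden illustration: `extend_visibility` is a hypothetical helper, the two buttons are stand-ins in the same dict shape as the notebook's buttons, and the new layer is assumed to have been added to the figure last.

```python
import copy


def extend_visibility(buttons, show_in=()):
    """Return a copy of `buttons` with one extra 'visible' entry per button.

    `show_in` holds the labels of the buttons that should display the new
    layer; every other button hides it.
    """
    extended = copy.deepcopy(buttons)
    for button in extended:
        button["args"][0]["visible"].append(button["label"] in show_in)
    return extended


# Stand-ins for two of the notebook's buttons (same dict shape).
buttons = [
    dict(method="update", label="Venue Seating only",
         args=[{"visible": [False, False, True]}]),
    dict(method="update", label="Employment Density & Seating",
         args=[{"visible": [False, True, True]}]),
]

# Hypothetical: a new layer was added as the 4th trace and should
# appear only in the employment view.
new_buttons = extend_visibility(buttons, show_in={"Employment Density & Seating"})
```

The helper works on a deep copy, so the original button definitions stay unchanged and can still be used with the three-layer map.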
