In [1]:
import pandas as pd
import geopandas as gpd

In [119]:
folder = "C:\\Users\\z3258367\\OneDrive - UNSW\\#PhD\\Walkability\\Other Cities\\"
# Plain string concatenation is enough here; wrapping it in ''.join() is a no-op.
meshblocks = pd.read_csv(folder + "Shared Aus Data\\MB_DZN_SA2_2016_AUST.csv")
DZNs = pd.read_csv(folder + "Melbourne Data\\2016 Victoria DZN employment.csv", dtype='int64')
mb_shapes = gpd.read_file(folder + "Melbourne Data\\2016_VIC_MBs\\MB_2016_VIC.shp")

Select only employment-generating meshblocks. 'Other' typically designates meshblocks with a mixture of land uses. These are usually larger semi-rural meshblocks that will not significantly affect walkability results either way, so we judged it more accurate to include them.

In [141]:
employ_mbs = meshblocks[meshblocks['MB_CATEGORY_NAME_2016'].isin(
    ['Commercial','Primary Production','Hospital/Medical','Education','Other','Industrial'])]

Next, sum the meshblock areas by DZN code. That is, the output lists each DZN alongside the total area of the employment meshblocks within it. This is then joined to the DZN Place of Work counts.

In [120]:
employ_areas = pd.DataFrame(employ_mbs.groupby('DZN_CODE_2016')['AREA_ALBERS_SQKM'].sum())

In [121]:
DZN_areas = DZNs.join(employ_areas, on='DZN (POW)', how='left')
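
As a toy illustration of the aggregation and left join above, here is a minimal sketch on made-up data. The column names follow the notebook, but the codes, areas and job counts are invented for illustration only.

```python
import pandas as pd

# Made-up meshblocks: two in DZN 1, one in DZN 2 (illustrative values only)
toy_mbs = pd.DataFrame({
    'MB_CODE_2016': [101, 102, 103],
    'DZN_CODE_2016': [1, 1, 2],
    'AREA_ALBERS_SQKM': [0.25, 0.75, 2.0],
})
# Made-up place-of-work counts; DZN 3 has no employment meshblocks
toy_dzns = pd.DataFrame({'DZN (POW)': [1, 2, 3], 'Number': [100, 40, 10]})

# Sum meshblock area per DZN, then left-join onto the DZN job counts
toy_areas = toy_mbs.groupby('DZN_CODE_2016')[['AREA_ALBERS_SQKM']].sum()
toy_dzn_areas = toy_dzns.join(toy_areas, on='DZN (POW)', how='left')
print(toy_dzn_areas)
```

Because the join is a left join, DZN 3 is kept with a missing area rather than dropped, which is what makes the lost-jobs check possible.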

This is the share of jobs lost with this method: jobs in DZNs made up entirely of excluded meshblocks (residential, transport, parkland, water). For the states I have done so far it is under 5%, which I considered acceptable. One potential improvement would be to manually change some meshblock categories, for example an airport from 'transport' to 'industrial'. (Most transport meshblocks are just road or rail corridors, so are better excluded.)

In [122]:
DZN_areas[(DZN_areas['Number']>0) & (DZN_areas['AREA_ALBERS_SQKM'].isna())]['Number'].sum()/DZN_areas['Number'].sum()

0.03846591307284983
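
The same calculation on a small made-up table may make the logic easier to follow. This is only a sketch: the values are invented, and `lost` and `lost_share` are names introduced here for illustration.

```python
import pandas as pd

# Made-up joined table: DZN 3 has jobs but no employment-meshblock area
toy = pd.DataFrame({
    'DZN (POW)': [1, 2, 3],
    'Number': [100, 40, 10],
    'AREA_ALBERS_SQKM': [1.0, 2.0, float('nan')],
})

# Jobs in DZNs with no matched area cannot be allocated to any meshblock
lost = toy.loc[(toy['Number'] > 0) & toy['AREA_ALBERS_SQKM'].isna(), 'Number'].sum()
lost_share = lost / toy['Number'].sum()
print(lost_share)
```

Here 10 of the 150 jobs sit in a DZN with no employment meshblocks, so about 6.7% of jobs would be lost.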

The DZN 'Job Density' is the number of people who report that DZN as their place of work, divided by the total area of the employment meshblocks within it. This density is then multiplied by each meshblock's area to estimate the number of jobs in that meshblock, so jobs are allocated in proportion to area.

In [123]:
DZN_areas['JobDensity'] = DZN_areas['Number']/DZN_areas['AREA_ALBERS_SQKM']

employ_mbs = employ_mbs.join(DZN_areas.set_index('DZN (POW)'), on='DZN_CODE_2016', how='inner', rsuffix='_DZN')

employ_mbs['Jobs'] = employ_mbs['JobDensity']*employ_mbs['AREA_ALBERS_SQKM']
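
A worked example on made-up numbers shows that the area-weighted allocation returns each DZN's original job count when summed back up. This is a sketch under the assumption that every DZN has at least one employment meshblock; `DZN_AREA` and `jobs_by_dzn` are names introduced here, not part of the notebook.

```python
import pandas as pd

# Made-up meshblocks and DZN job counts (illustrative values only)
toy_mbs = pd.DataFrame({
    'MB_CODE_2016': [101, 102, 103],
    'DZN_CODE_2016': [1, 1, 2],
    'AREA_ALBERS_SQKM': [0.25, 0.75, 2.0],
})
toy_dzns = pd.DataFrame({'DZN (POW)': [1, 2], 'Number': [100, 40]})

# Total employment-meshblock area per DZN, then jobs per square kilometre
dzn_area = toy_mbs.groupby('DZN_CODE_2016')['AREA_ALBERS_SQKM'].sum().rename('DZN_AREA')
toy_dzns = toy_dzns.join(dzn_area, on='DZN (POW)')
toy_dzns['JobDensity'] = toy_dzns['Number'] / toy_dzns['DZN_AREA']

# Attach the density to each meshblock and scale by the meshblock's own area
toy_mbs = toy_mbs.join(toy_dzns.set_index('DZN (POW)'), on='DZN_CODE_2016', how='inner')
toy_mbs['Jobs'] = toy_mbs['JobDensity'] * toy_mbs['AREA_ALBERS_SQKM']

# Summing the allocated jobs by DZN should reproduce the DZN totals
jobs_by_dzn = toy_mbs.groupby('DZN_CODE_2016')['Jobs'].sum()
print(toy_mbs[['MB_CODE_2016', 'Jobs']])
print(jobs_by_dzn)
```

DZN 1 has 100 jobs over 1.0 sq km, so its two meshblocks receive 25 and 75 jobs; DZN 2's single meshblock receives all 40.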

The employment figures are attached to the meshblock shapes and written out. Centroids are also output, as I am currently using them as the points for walkability calculations.

In [133]:
mb_shapes['MB_CODE16'] = mb_shapes['MB_CODE16'].astype('int64')
employ_mbs['MB_CODE_2016'] = employ_mbs['MB_CODE_2016'].astype('int64')

employ_shapes = mb_shapes.join(employ_mbs.set_index('MB_CODE_2016')[['DZN_CODE_2016','Jobs']], how='right', on='MB_CODE16')
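
Because this is a right join, any employment meshblock whose code is missing from the shapefile would be kept with an empty geometry. One possible way to check for such rows is sketched below using plain pandas stand-ins; the tables, codes and the `unmatched` name are made up for illustration, and the strings in `geometry` merely stand in for real shapes.

```python
import pandas as pd

# Stand-ins for the shapefile attribute table and the jobs table (made-up codes)
toy_shapes = pd.DataFrame({
    'MB_CODE16': [101, 102, 104],
    'geometry': ['poly_a', 'poly_b', 'poly_d'],
})
toy_jobs = pd.DataFrame({'MB_CODE_2016': [101, 102, 103], 'Jobs': [25.0, 75.0, 40.0]})

# indicator=True adds a '_merge' column recording where each row came from
check = toy_shapes.merge(toy_jobs, left_on='MB_CODE16', right_on='MB_CODE_2016',
                         how='right', indicator=True)

# Rows marked 'right_only' have jobs but no matching shape
unmatched = check.loc[check['_merge'] == 'right_only', 'MB_CODE_2016'].tolist()
print(unmatched)
```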

In [138]:
employ_shapes.to_file(folder + "Melbourne Data\\Vic_Employment_meshblocks.gpkg", layer='meshblocks')

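centroids = employ_shapes.copy()
# Project before taking centroids so they are computed in a projected CRS rather than in degrees.
# Note: EPSG:7856 is GDA2020 / MGA zone 56; Melbourne itself falls in MGA zone 55 (EPSG:7855).
centroids.geometry = (centroids.geometry
                         .to_crs('EPSG:7856')
                         .centroid)
centroids.to_file(folder + "Melbourne Data\\Vic_Employment_meshblocks.gpkg", layer='centroids')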