# Melbourne parkruns

Step one of my project is figuring out which parkruns are in the Greater Melbourne area as defined by ~~the State Revenue Office: https://www.sro.vic.gov.au/greater-melbourne-map-and-urban-zones#current-greater-melbourne~~ OpenStreetMap: https://www.openstreetmap.org/relation/4246124#map=9/-37.952/145.319

Luckily all parkrun events worldwide are referenced in this json file: https://images.parkrun.com/events.json so that's the first dataset we'll be playing with.

In [1]:
# Was expecting I'd be scraping the website initially, but it turns out there's a nice JSON file I can look at instead.

import geopandas as gpd

url = "https://images.parkrun.com/events.json"
gdf = gpd.read_file(url)

gdf.head()

Unnamed: 0,eventname,EventLongName,EventShortName,LocalisedEventLongName,countrycode,seriesid,EventLocation,geometry
0,bushy,Bushy parkrun,Bushy Park,,97,1,"Bushy Park, Teddington",POINT (-0.33579 51.41099)
1,wimbledon,Wimbledon Common parkrun,Wimbledon Common,,97,1,Wimbledon Common,POINT (-0.23222 51.44208)
2,banstead,Banstead Woods parkrun,Banstead Woods,,97,1,"Banstead Woods, Coulsdon",POINT (-0.18422 51.30765)
3,richmond,Richmond parkrun,Richmond Park,,97,1,"Richmond Park, Richmond upon Thames",POINT (-0.29289 51.45196)
4,woodhousemoor,Woodhouse Moor parkrun,Woodhouse Moor,,97,1,"Woodhouse Moor, Leeds",POINT (-1.56006 53.80858)


In [2]:
gdf.shape

(2727, 8)

In [3]:
gdf.columns

Index(['eventname', 'EventLongName', 'EventShortName',
       'LocalisedEventLongName', 'countrycode', 'seriesid', 'EventLocation',
       'geometry'],
      dtype='object')

In [4]:
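# Note: without parentheses this displays the bound method's repr (effectively the whole frame), not summary statistics; gdf.describe() would return those.
gdf.describe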

<bound method NDFrame.describe of                   eventname                     EventLongName  \
0                     bushy                     Bushy parkrun   
1                 wimbledon          Wimbledon Common parkrun   
2                  banstead            Banstead Woods parkrun   
3                  richmond                  Richmond parkrun   
4             woodhousemoor            Woodhouse Moor parkrun   
...                     ...                               ...   
2722  lakemaryanntingkkarli  Lake Mary Ann Tingkkarli parkrun   
2723       christchurchpark         Christchurch Park parkrun   
2724    saigawaryokuchikoen     Saigawa ryokuchi koen parkrun   
2725   mamquamspawningtrail    Mamquam Spawning Trail parkrun   
2726               comopark                 Como Park parkrun   

                EventShortName LocalisedEventLongName  countrycode  seriesid  \
0                   Bushy Park                   None           97         1   
1             Wimbledon C

**This gives me all the parkrun events worldwide, but I'm only interested in parkruns in Melbourne, so first I'm going to filter the dataset down to Australian parkruns.**

In [5]:
# From examining the dataset I can see that Australia's country code is 3.

au_gdf = gdf[gdf['countrycode'] == 3]

au_gdf.head()

Unnamed: 0,eventname,EventLongName,EventShortName,LocalisedEventLongName,countrycode,seriesid,EventLocation,geometry
21,stpeters,St Peters parkrun,St Peters,,3,1,"Sydney Park, St Peters, Sydney",POINT (151.18219 -33.90766)
46,albertmelbourne,"Albert parkrun, Melbourne",Albert Melbourne,,3,1,"Albert Park, Melbourne, Australia",POINT (144.96312 -37.84352)
177,newfarm,New Farm parkrun,New Farm,,3,1,"New Farm, Brisbane",POINT (153.05164 -27.47113)
185,mainbeach,Main Beach parkrun,Main Beach,,3,1,"Main Beach, Goldcoast",POINT (153.42888 -27.97271)
304,kirra,Kirra parkrun,Kirra,,3,1,"Kirra Beach, Gold Coast QLD, Australia",POINT (153.52278 -28.16676)


In [6]:
# According to https://www.parkrun.com.au/ there are 518 locations in Australia. Have I got all of them in my dataset?

f"Number of Australian parkruns: {len(au_gdf)}"

'Number of Australian parkruns: 520'

In [39]:
# Where is that extra (UPDATE: extra two) parkrun coming from? I will export my au_gdf to csv to manually check. 

au_gdf.to_csv("au-parkruns.csv", index=False)

In [8]:
# They all look like legit parkruns to me from a quick skim of the csv. Thought of something else. Are any of the parkruns in my dataset at the same location?

au_gdf['geometry'].nunique()

520

Nope to that last question so not sure where that extra rogue parkrun is coming from -- may need to ask parkrun Australia directly if their website is correct.

**UPDATE:** There is both a parkrun event and a junior event at Westerfolds Park. They have slightly different long/lats, hence .nunique didn't pick them up, but that could be why there's one more parkrun event than there are locations. And since I first ran the data, the number of parkruns has gone from 519 to 520 with what looks like a new one near Tennant Creek, NT, so the stats at the bottom of the website may not have been updated to reflect that.
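
Since an exact-match check like .nunique can't catch two events that sit a few metres apart, a distance-based check is one way to surface them. Below is a minimal sketch using only the standard library, assuming a haversine (great-circle) distance and a 500 m threshold that I picked arbitrarily. The helper names `haversine_m` and `nearby_pairs` are made up for this example, and the junior event's coordinates are hypothetical because they aren't shown in the output above.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearby_pairs(events, max_distance_m=500):
    """Return (name_a, name_b, metres) for every pair within max_distance_m."""
    pairs = []
    for (name_a, lon_a, lat_a), (name_b, lon_b, lat_b) in combinations(events, 2):
        distance = haversine_m(lon_a, lat_a, lon_b, lat_b)
        if distance <= max_distance_m:
            pairs.append((name_a, name_b, round(distance)))
    return pairs


# Westerfolds and Albert coordinates come from the dataset above.
# The junior event's coordinates are HYPOTHETICAL (a small made-up offset).
events = [
    ("westerfolds", 145.13399, -37.74609),
    ("westerfolds-juniors", 145.13450, -37.74650),
    ("albertmelbourne", 144.96312, -37.84352),
]

close = nearby_pairs(events)
print(close)
```

With the real data, the `events` list could be built from each row's name and its geometry's x and y values. Comparing every pair is fine for roughly 500 events but would get slow on much larger datasets.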

In [11]:
# Have identified 5 junior parkruns as part of my dataset that I want to exclude. 

au_pk_gdf = au_gdf[~au_gdf['EventLongName'].str.contains('junior parkrun', na=False)]
f"Number of Australian (not junior) parkruns: {len(au_pk_gdf)}"

'Number of Australian (not junior) parkruns: 515'

## Now to filter the dataset further to get Melbourne parkruns only

I was thinking of getting a rough approximation of events in Melbourne by creating a bounding box using the lat/long coordinates of Greater Melbourne's northernmost, southernmost, easternmost and westernmost points and then checking manually to remove events outside the city bounds. 
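
For reference, the bounding-box idea could be sketched like this in plain Python. The extents in `MELB_BBOX` are rough, hypothetical numbers I chose for illustration, not measured from any official boundary, and `in_bounding_box` is a made-up helper name.

```python
def in_bounding_box(lon, lat, west, south, east, north):
    """True if the point lies inside the rectangle (edges included)."""
    return west <= lon <= east and south <= lat <= north


# HYPOTHETICAL extents, roughly around Melbourne, for illustration only.
MELB_BBOX = {"west": 144.3, "south": -38.6, "east": 146.0, "north": -37.1}

# (eventname, longitude, latitude) taken from the au_gdf output above.
events = [
    ("albertmelbourne", 144.96312, -37.84352),
    ("stpeters", 151.18219, -33.90766),   # Sydney
    ("newfarm", 153.05164, -27.47113),    # Brisbane
    ("mainbeach", 153.42888, -27.97271),  # Gold Coast
]

candidates = [
    name for name, lon, lat in events if in_bounding_box(lon, lat, **MELB_BBOX)
]
print(candidates)
```

A rectangle will always include some events outside the real city limits, which is why this approach would still need the manual check described above. GeoPandas also offers a coordinate-based indexer, `.cx`, that slices a GeoDataFrame by a bounding box in a similar way.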

However ChatGPT suggested another option: Use a precise shapefile or GeoJSON of Greater Melbourne and filter using that. I've never done this before but I'm going to give it a try.

I've found this Major urban area geojson on the Data Vic website: https://discover.data.vic.gov.au/dataset/major-urban-area-location-polygons-and-table/resource/eea2ce6a-68b0-49d2-827c-82feb335739c

In [13]:
vic_urban_boundaries = gpd.read_file("DataVic_MAJOR_URBAN_AREA.geojson")
vic_urban_boundaries

Unnamed: 0,OBJECTID,MUA_Name,LGA_Name,Source,geometry
0,1,Bendigo,Greater Bendigo,Bendigo UGB,"POLYGON ((144.18505 -36.73521, 144.1859 -36.73..."
1,2,Bendigo,Greater Bendigo,Bendigo UGB,"POLYGON ((144.37767 -36.81894, 144.38078 -36.8..."
2,1,Melbourne,Various,Melbourne UGB,"POLYGON ((144.70389 -38.3209, 144.70586 -38.32..."
3,2,Melbourne,Various,Melbourne UGB,"POLYGON ((144.99114 -38.36436, 144.99111 -38.3..."
4,3,Melbourne,Various,Melbourne UGB,"POLYGON ((145.02212 -38.38637, 145.02294 -38.3..."
...,...,...,...,...,...
67,18,Morwell,Latrobe,Derived from UCL and PLAN_ZONE,"POLYGON ((146.38503 -38.22981, 146.38502 -38.2..."
68,15,Lara,Greater Geelong,Derived from UCL and PLAN_ZONE,"POLYGON ((144.3898 -38.03458, 144.38965 -38.03..."
69,12,Drysdale - Clifton Springs,Greater Geelong,Derived from UCL and PLAN_ZONE,"POLYGON ((144.54465 -38.16444, 144.54464 -38.1..."
70,30,Maryborough,Central Goldfields,Derived from UCL and PLAN_ZONE,"POLYGON ((143.74628 -37.06445, 143.74628 -37.0..."


In [14]:
vic_urban_boundaries['MUA_Name'].nunique()

34

In [15]:
vic_urban_boundaries.columns

Index(['OBJECTID', 'MUA_Name', 'LGA_Name', 'Source', 'geometry'], dtype='object')

In [16]:
# I'm filtering it to get only the urban areas of Melbourne.

melb_boundary = vic_urban_boundaries[vic_urban_boundaries['MUA_Name'] == 'Melbourne']
melb_boundary

Unnamed: 0,OBJECTID,MUA_Name,LGA_Name,Source,geometry
2,1,Melbourne,Various,Melbourne UGB,"POLYGON ((144.70389 -38.3209, 144.70586 -38.32..."
3,2,Melbourne,Various,Melbourne UGB,"POLYGON ((144.99114 -38.36436, 144.99111 -38.3..."
4,3,Melbourne,Various,Melbourne UGB,"POLYGON ((145.02212 -38.38637, 145.02294 -38.3..."
5,4,Melbourne,Various,Melbourne UGB,"POLYGON ((145.06583 -38.42529, 145.06423 -38.4..."
6,5,Melbourne,Various,Melbourne UGB,"POLYGON ((145.00695 -38.48095, 145.00693 -38.4..."
7,6,Melbourne,Various,Melbourne UGB,"POLYGON ((145.11631 -38.3635, 145.11268 -38.37..."
8,7,Melbourne,Various,Melbourne UGB,"POLYGON ((145.14823 -38.3923, 145.14871 -38.39..."
9,8,Melbourne,Various,Melbourne UGB,"POLYGON ((145.22083 -38.37542, 145.22173 -38.3..."
10,9,Melbourne,Various,Melbourne UGB,"POLYGON ((145.16682 -38.27577, 145.16682 -38.2..."
11,10,Melbourne,Various,Melbourne UGB,"POLYGON ((145.14254 -38.19629, 145.14255 -38.1..."


In [17]:
melb_boundary.to_file("DataVic_greater_melb.geojson", driver="GeoJSON")

## It turns out the concept of Greater Melbourne was more complicated than I initially realised!

Ask OpenStreetMap for a map of Greater Melbourne and it returns this: https://www.openstreetmap.org/relation/4246124#map=9/-37.952/145.319  
(Which I turned into a GeoJSON file here: https://polygons.openstreetmap.fr/?id=4246124)

But the SRO's version is more bitsy (area in blue): https://www.sro.vic.gov.au/greater-melbourne-map-and-urban-zones#current-greater-melbourne

It appears that OpenStreetMap takes the blue and green areas of the map as Greater Melbourne whereas the SRO clearly defines the green areas as "Councils that have land both inside and outside the Urban Growth Boundary. The green denotes the parts of these councils falling outside the Urban Growth Boundary and therefore outside greater Melbourne."

As I'm using state government data I'm following the SRO's definition of Greater Melbourne for this analysis, while noting that such things are subjective and people may consider they live in Greater Melbourne even when by this definition they don't.

**UPDATE:** After manually checking the list of Melbourne parkruns the DataVic boundaries gave me against the parkrun website, I found that they left out some parkruns that a reasonable Melburnian would likely consider part of the Greater Melbourne area. Hence I've switched to using OpenStreetMap's definition of Greater Melbourne and updated this analysis to reflect that.

In [18]:
greater_melbourne_boundary = gpd.read_file("OSM_greater_melbourne.geojson")
greater_melbourne_boundary

Unnamed: 0,geometry
0,"MULTIPOLYGON (((144.44405 -37.86413, 144.44898..."


In [19]:
greater_melbourne_boundary.columns

Index(['geometry'], dtype='object')

In [20]:
# Checking that the CRS of each of my GeoDataFrames is the same.

greater_melbourne_boundary.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [21]:
au_pk_gdf.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

**Used my GeoJSON of Greater Melbourne to spatially join and filter parkruns within that region. Using .sjoin instead of .merge because I'm joining based on spatial relationships (i.e. whether a parkrun is in Greater Melbourne or not) rather than a shared column.**
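
To make the idea of a "within" test more concrete, here is a small standalone sketch of the classic ray-casting point-in-polygon check. This is only an illustration of the concept, not a description of how GeoPandas or Shapely implement it internally. The rectangle is a hypothetical stand-in for the real boundary, and the sketch ignores holes and multipolygons, which the real libraries handle.

```python
def point_in_ring(lon, lat, ring):
    """Ray casting: True if (lon, lat) lies inside the polygon ring.

    `ring` is a list of (lon, lat) vertices; the last vertex is joined
    back to the first automatically.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray heading east from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside  # each crossing flips inside/outside
    return inside


# HYPOTHETICAL rectangle standing in for the Greater Melbourne boundary.
toy_boundary = [(144.4, -38.5), (145.9, -38.5), (145.9, -37.2), (144.4, -37.2)]

albert_inside = point_in_ring(144.96312, -37.84352, toy_boundary)    # Albert parkrun
stpeters_inside = point_in_ring(151.18219, -33.90766, toy_boundary)  # St Peters, Sydney
print(albert_inside, stpeters_inside)
```

The spatial join applies this kind of test to every parkrun point against the boundary polygon and keeps the rows where it passes.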

In [None]:
# Rename geometry column in melb_boundary gdf so that it doesn't get lost when I join the two.

# melb_boundary['boundary_geom'] = melb_boundary.geometry
# melb_boundary

# Not bothering with this any more as it's only one polygon and two geometry columns can cause problems later. Can easily combine two GeoJSON files in geojson.io if I need to. 

In [22]:
melb_parkruns = gpd.sjoin(au_pk_gdf, greater_melbourne_boundary, predicate='within')
melb_parkruns

Unnamed: 0,eventname,EventLongName,EventShortName,LocalisedEventLongName,countrycode,seriesid,EventLocation,geometry,index_right
46,albertmelbourne,"Albert parkrun, Melbourne",Albert Melbourne,,3,1,"Albert Park, Melbourne, Australia",POINT (144.96312 -37.84352),0
472,highlands,Highlands parkrun,Highlands,,3,1,Stocklandâ€™s Highlands community,POINT (144.90371 -37.59106),0
485,hastingsforeshore,Hastings Foreshore parkrun,Hastings Foreshore,,3,1,Hastings Foreshore,POINT (145.19665 -38.30731),0
504,diamondcreek,Diamond Creek parkrun,Diamond Creek,,3,1,Marngrook Oval,POINT (145.15325 -37.67358),0
520,berwicksprings,Berwick Springs parkrun,Berwick Springs,,3,1,Berwick Springs Promenade,POINT (145.32293 -38.06207),0
521,pointcook,Point Cook parkrun,Point Cook,,3,1,Arndell Park Community Centre,POINT (144.72952 -37.86209),0
536,westerfolds,Westerfolds parkrun,Westerfolds,,3,1,Westerfolds Park,POINT (145.13399 -37.74609),0
572,lillydalelake,Lillydale Lake parkrun,Lillydale Lake,,3,1,Lilydale Lake,POINT (145.35707 -37.76622),0
633,maribyrnong,Maribyrnong parkrun,Maribyrnong,,3,1,Maribyrnong,POINT (144.89786 -37.77063),0
651,pakenham,Pakenham parkrun,Pakenham,,3,1,Pakenham,POINT (145.46384 -38.06462),0


In [23]:
melb_parkruns.shape

(49, 9)

In [24]:
melb_parkruns = melb_parkruns.drop(columns=['index_right', 'LocalisedEventLongName'])
melb_parkruns

Unnamed: 0,eventname,EventLongName,EventShortName,countrycode,seriesid,EventLocation,geometry
46,albertmelbourne,"Albert parkrun, Melbourne",Albert Melbourne,3,1,"Albert Park, Melbourne, Australia",POINT (144.96312 -37.84352)
472,highlands,Highlands parkrun,Highlands,3,1,Stocklandâ€™s Highlands community,POINT (144.90371 -37.59106)
485,hastingsforeshore,Hastings Foreshore parkrun,Hastings Foreshore,3,1,Hastings Foreshore,POINT (145.19665 -38.30731)
504,diamondcreek,Diamond Creek parkrun,Diamond Creek,3,1,Marngrook Oval,POINT (145.15325 -37.67358)
520,berwicksprings,Berwick Springs parkrun,Berwick Springs,3,1,Berwick Springs Promenade,POINT (145.32293 -38.06207)
521,pointcook,Point Cook parkrun,Point Cook,3,1,Arndell Park Community Centre,POINT (144.72952 -37.86209)
536,westerfolds,Westerfolds parkrun,Westerfolds,3,1,Westerfolds Park,POINT (145.13399 -37.74609)
572,lillydalelake,Lillydale Lake parkrun,Lillydale Lake,3,1,Lilydale Lake,POINT (145.35707 -37.76622)
633,maribyrnong,Maribyrnong parkrun,Maribyrnong,3,1,Maribyrnong,POINT (144.89786 -37.77063)
651,pakenham,Pakenham parkrun,Pakenham,3,1,Pakenham,POINT (145.46384 -38.06462)


In [25]:
# Fix broken character in one of the cells, likely caused by the curly apostrophe on their page.

melb_parkruns['EventLocation'] = melb_parkruns['EventLocation'].str.replace("â€™", "'", regex=False)
melb_parkruns

Unnamed: 0,eventname,EventLongName,EventShortName,countrycode,seriesid,EventLocation,geometry
46,albertmelbourne,"Albert parkrun, Melbourne",Albert Melbourne,3,1,"Albert Park, Melbourne, Australia",POINT (144.96312 -37.84352)
472,highlands,Highlands parkrun,Highlands,3,1,Stockland's Highlands community,POINT (144.90371 -37.59106)
485,hastingsforeshore,Hastings Foreshore parkrun,Hastings Foreshore,3,1,Hastings Foreshore,POINT (145.19665 -38.30731)
504,diamondcreek,Diamond Creek parkrun,Diamond Creek,3,1,Marngrook Oval,POINT (145.15325 -37.67358)
520,berwicksprings,Berwick Springs parkrun,Berwick Springs,3,1,Berwick Springs Promenade,POINT (145.32293 -38.06207)
521,pointcook,Point Cook parkrun,Point Cook,3,1,Arndell Park Community Centre,POINT (144.72952 -37.86209)
536,westerfolds,Westerfolds parkrun,Westerfolds,3,1,Westerfolds Park,POINT (145.13399 -37.74609)
572,lillydalelake,Lillydale Lake parkrun,Lillydale Lake,3,1,Lilydale Lake,POINT (145.35707 -37.76622)
633,maribyrnong,Maribyrnong parkrun,Maribyrnong,3,1,Maribyrnong,POINT (144.89786 -37.77063)
651,pakenham,Pakenham parkrun,Pakenham,3,1,Pakenham,POINT (145.46384 -38.06462)
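
A more general alternative to replacing one specific broken sequence is to reverse the bad decoding itself. The sketch below assumes the debris came from UTF-8 bytes being read as Windows-1252, which matches the three characters seen here but is still an assumption about the source. `fix_mojibake` is a made-up helper name.

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was wrongly decoded as Windows-1252."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not this kind of damage (or already clean), so leave it untouched.
        return text


# "\u00e2\u20ac\u2122" is the broken three-character sequence from the table above.
broken = "Stockland\u00e2\u20ac\u2122s Highlands community"

repaired = fix_mojibake(broken)            # restores the curly apostrophe (U+2019)
plain = repaired.replace("\u2019", "'")    # optional: swap in a straight apostrophe
print(plain)
```

Applied to a column, this could be used with `.apply(fix_mojibake)`. Strings that are already clean pass through unchanged, so it should be safe to run over the whole column rather than one known cell.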


In [26]:
melb_parkruns.dtypes

eventname           object
EventLongName       object
EventShortName      object
countrycode          int32
seriesid             int32
EventLocation       object
geometry          geometry
dtype: object

In [27]:
melb_parkruns.to_file("melb_parkruns.geojson", driver="GeoJSON")

In [28]:
melb_parkruns['EventLongName']

46                          Albert parkrun, Melbourne
472                                 Highlands parkrun
485                        Hastings Foreshore parkrun
504                             Diamond Creek parkrun
520                           Berwick Springs parkrun
521                                Point Cook parkrun
536                               Westerfolds parkrun
572                            Lillydale Lake parkrun
633                               Maribyrnong parkrun
651                                  Pakenham parkrun
666                             Toolern Creek parkrun
682                               Frog Hollow parkrun
736                                 Parkville parkrun
751                                  Brimbank parkrun
766                                    Coburg parkrun
804                                     Jells parkrun
841                              Altona Beach parkrun
948                              Wyndham Vale parkrun
989                         

In [29]:
melb_parkruns.to_csv("melb_parkruns.csv", index=False)

In [30]:
f"The number of parkruns in the Greater Melbourne area is {len(melb_parkruns)}."

'The number of parkruns in the Greater Melbourne area is 49.'

## Want to determine if I can call Melbourne Australia's parkrun capital

So I'm going to compare the number of parkruns in Greater Melbourne with the number in Greater Sydney and Greater Brisbane respectively.

In [31]:
# Greater Sydney first.

greater_sydney_boundary = gpd.read_file("OSM_greater_sydney.geojson")
greater_sydney_boundary

Unnamed: 0,geometry
0,"MULTIPOLYGON (((150.27091 -33.67212, 150.27093..."


In [32]:
greater_sydney_boundary.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [33]:
syd_parkruns = gpd.sjoin(au_pk_gdf, greater_sydney_boundary, predicate='within')
syd_parkruns

Unnamed: 0,eventname,EventLongName,EventShortName,LocalisedEventLongName,countrycode,seriesid,EventLocation,geometry,index_right
21,stpeters,St Peters parkrun,St Peters,,3,1,"Sydney Park, St Peters, Sydney",POINT (151.18219 -33.90766),0
347,parramatta,Parramatta parkrun,Parramatta,,3,1,George Kendell Riverside Park,POINT (151.06549 -33.81977),0
366,curlcurl,Curl Curl parkrun,Curl Curl,,3,1,John Fisher Park,POINT (151.28234 -33.76648),0
502,mosman,Mosman parkrun,Mosman,,3,1,Spit West Reserve,POINT (151.24599 -33.80625),0
591,cooksriver,Cooks River parkrun,Cooks River,,3,1,Saint Mary Mackillop Reserve,POINT (151.11686 -33.91371),0
600,campbelltown,Campbelltown parkrun,Campbelltown,,3,1,Hurricane Drive,POINT (150.81482 -34.0183),0
612,penrithlakes,Penrith Lakes parkrun,Penrith Lakes,,3,1,Sydney International Regatta Centre,POINT (150.68574 -33.72592),0
622,rousehill,Rouse Hill parkrun,Rouse Hill,,3,1,Rouse Hill Regional Park,POINT (150.91111 -33.6793),0
637,lawson,Lawson parkrun,Lawson,,3,1,North Lawson Park,POINT (150.4269 -33.71363),0
718,cronulla,Cronulla parkrun,Cronulla,,3,1,Don Lucas Reserve,POINT (151.16184 -34.04094),0


In [34]:
len(syd_parkruns)

34

In [35]:
# Then Greater Brisbane.

greater_brisbane_boundary = gpd.read_file("OSM_greater_brisbane.geojson")
greater_brisbane_boundary

Unnamed: 0,geometry
0,"MULTIPOLYGON (((152.67969 -27.37226, 152.68418..."


In [36]:
greater_brisbane_boundary.crs

<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World.
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984 ensemble
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich

In [37]:
bris_parkruns = gpd.sjoin(au_pk_gdf, greater_brisbane_boundary, predicate='within')
bris_parkruns

Unnamed: 0,eventname,EventLongName,EventShortName,LocalisedEventLongName,countrycode,seriesid,EventLocation,geometry,index_right
177,newfarm,New Farm parkrun,New Farm,,3,1,"New Farm, Brisbane",POINT (153.05164 -27.47113),0
365,wynnum,Wynnum parkrun,Wynnum,,3,1,"Wynnum Manly Foreshore, QLD",POINT (153.1779 -27.44293),0
380,sandgate,Sandgate parkrun,Sandgate,,3,1,Arthur Davis Park,POINT (153.06823 -27.31129),0
381,southbank,South Bank parkrun,South Bank,,3,1,South Bank Parklands,POINT (153.02396 -27.47887),0
434,mitchelton,Mitchelton parkrun,Mitchelton,,3,1,Teralba Park,POINT (152.9793 -27.40442),0
482,rocksriverside,Rocks Riverside parkrun,Rocks Riverside,,3,1,Rocks Riverside Park,POINT (152.95924 -27.54263),0
557,minnippi,Minnippi parkrun,Minnippi,,3,1,Meadowfields Road,POINT (153.11868 -27.49331),0
567,calamvale,Calamvale parkrun,Calamvale,,3,1,Calamvale District Park,POINT (153.03929 -27.62078),0
623,stonescorner,Stones Corner parkrun,Stones Corner,,3,1,Hanlon Park,POINT (153.04247 -27.49872),0
625,wishart,Wishart parkrun,Wishart,,3,1,Wishart Community Park,POINT (153.10347 -27.56098),0


In [38]:
len(bris_parkruns)

20
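
The same read, check CRS, join and count steps were repeated for each city, so they could be wrapped in one helper. This is a sketch only: `count_events_within` is a made-up name, and the rectangles below are hypothetical stand-ins for the OSM boundary files so that the example runs on its own.

```python
import geopandas as gpd
from shapely.geometry import Point, box


def count_events_within(events, boundary):
    """Count point events that fall within a boundary GeoDataFrame."""
    if events.crs != boundary.crs:
        # Reproject the boundary so both layers share one CRS before joining.
        boundary = boundary.to_crs(events.crs)
    joined = gpd.sjoin(events, boundary, predicate="within")
    # Count distinct events in case boundary polygons overlap.
    return joined.index.nunique()


# A few events from the tables above.
events = gpd.GeoDataFrame(
    {"eventname": ["albertmelbourne", "pakenham", "stpeters", "newfarm"]},
    geometry=[
        Point(144.96312, -37.84352),
        Point(145.46384, -38.06462),
        Point(151.18219, -33.90766),
        Point(153.05164, -27.47113),
    ],
    crs="EPSG:4326",
)

# HYPOTHETICAL rectangles standing in for the real boundary GeoJSON files.
boundaries = {
    "Melbourne": gpd.GeoDataFrame(geometry=[box(144.4, -38.5, 145.9, -37.2)], crs="EPSG:4326"),
    "Sydney": gpd.GeoDataFrame(geometry=[box(150.2, -34.4, 151.4, -33.4)], crs="EPSG:4326"),
    "Brisbane": gpd.GeoDataFrame(geometry=[box(152.6, -27.9, 153.5, -27.0)], crs="EPSG:4326"),
}

counts = {city: count_events_within(events, b) for city, b in boundaries.items()}
print(counts)
```

With the real files, each value in `boundaries` would instead come from `gpd.read_file(...)` and `events` would be `au_pk_gdf`. On the counts found above (49, 34 and 20), Melbourne comes out ahead of Sydney and Brisbane.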