# Transport attributes

The following attributes are derived from the official [GTFS dataset](https://discover.data.vic.gov.au/dataset/ptv-timetable-and-geographic-information-2015-gtfs) published by Public Transport Victoria (PTV).

This dataset provides static timetable data and geographic information in the The General Transit Feed Specification (GTFS) format. It contains scheduled information for all metropolitan and regional trains, all metropolitan and regional bus (including coach) and all metropolitan tram services in Victoria.

The [General Transit Feed Specification](https://developers.google.com/transit/gtfs/reference) (GTFS) defines a common format for public transport schedules and associated geographic information. The GTFS provides scheduled information for all metropolitan and regional train, tram and bus services in Victoria.

In [1]:
import os
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
from geopy.distance import geodesic
from scipy.spatial.distance import cdist
from haversine import haversine
from time import perf_counter

import helper

# make all output interactive
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"


In [2]:
# Define universal constants
# ================================
DATA_PATH = os.path.join(os.getcwd(), "../data/")
RAW_DATA_PATH = os.path.join(DATA_PATH, "raw")
DERIVED_DATA_PATH = os.path.join(DATA_PATH, "derived")
CORR_DATA_PATH = os.path.join(DATA_PATH, "correspondence")

Within the directory there are 11 directories. Each directory is labelled a number. According to the [official release specifications](http://data.ptv.vic.gov.au/downloads/PTVGTFSReleaseNotes.pdf) by PTV, each directory equates to the following operational branches:

1. Regional Train
2. Metropolitan Train
3. Metropolitan Tram
4. Metropolitan Bus
5. <strike>Regional Coach</strike> - This branch won't be considered as it is irrelevant to the purpose of finding a livable suburb.
6. Regional Bus
7. <strike>TeleBus</strike> - This branch won't be considered as it is abnormal and not widespread.
8. <strike>Night Bus</strike> - This branch won't be considered as it contains no data (likely contained within regional/metropolitan bus branches).
10. (10) <strike>Interstate</strike> - This branch won't be considered as it is irrelevant to the purpose of finding a livable suburb.
11. (11) <strike>SkyBus</strike> - This branch won't be considered as it is irrelevant to the purpose of finding a livable suburb.

Within each branch, there are eight datasets available which fully describe the public transport network for that branch. The eight datasets are:

- Agency
- Calendar
- Calendar dates
- Routes
- Trips
- Stops
- Stop times
- Shapes

We first define constants which define the datasets available.


In [3]:
GTFS_DATA_PATH = os.path.join(RAW_DATA_PATH, "gtfs")


BRANCHES = {
    "regionalTrain": 1,
    "metropolitanTrain": 2,
    "metropolitanTram": 3,
    "metropolitanBus": 4,
    "regionalCoach": 5,
    "regionalBus": 6,
    "teleBus": 7,
    "nightBus": 8,
    "interstate": 10,
    "skyBus": 11,
}

DATASETS = {
    "agency": "agency.txt",
    "calendar": "calendar.txt",
    "calendarDates": "calendar_dates.txt",
    "routes": "routes.txt",
    "trips": "trips.txt",
    "stops": "stops.txt",
    "stopTimes": "stop_times.txt",
    "shapes": "shapes.txt",
}


VALID_BRANCHES = [
    "regionalTrain",
    "metropolitanTrain",
    "metropolitanTram",
    "metropolitanBus",
    "regionalBus",
]


We next define functions to automate access to these datasets.

In [4]:
def getBranchId(name: str) -> int:
    if name not in BRANCHES:
        raise ValueError(f"Name '{name}' is not recognised as PTV operational branch.")

    return BRANCHES[name]


def getDataset(branch_id: int, dataset_name: str) -> pd.DataFrame:
    if branch_id < 1 or branch_id > 11 or branch_id in [5, 7, 8, 9, 10, 11]:
        raise ValueError(f"Branch id '{branch_id}' is not a valid number. Try 1, 2, 3, 4, or 6.")
    if dataset_name not in DATASETS:
        raise ValueError(f"Dataset name '{dataset_name}' is not recognised as available dataset.")

    return pd.read_csv(os.path.join(GTFS_DATA_PATH, f"{branch_id}/google_transit/{DATASETS[dataset_name]}"))


In [5]:
getDataset(getBranchId("metropolitanTrain"), "routes").head()

Unnamed: 0,route_id,agency_id,route_short_name,route_long_name,route_type,route_color,route_text_color
0,2-ain-A-mjp-1,,Flemington,Showgrounds/Flemington - Flinders Street,2,0072CE,FFFFFF
1,2-ain-C-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
2,2-ain-D-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
3,2-ain-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
4,2-ALM-B-mjp-1,,Alamein,City (Flinders Street) - Alamein,2,0072CE,FFFFFF


We now initialise a structure to access the appropriate dataset (rather than recreate the dataset each time, which is highly inefficient).

In [6]:
datasets = {
    branch_name: {
        dataset_name: getDataset(getBranchId(branch_name), dataset_name)
        for dataset_name in DATASETS
    }
    for branch_name in VALID_BRANCHES
}

In [7]:
datasets["metropolitanTrain"]["routes"].head()

Unnamed: 0,route_id,agency_id,route_short_name,route_long_name,route_type,route_color,route_text_color
0,2-ain-A-mjp-1,,Flemington,Showgrounds/Flemington - Flinders Street,2,0072CE,FFFFFF
1,2-ain-C-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
2,2-ain-D-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
3,2-ain-mjp-1,,Flemington,Flinders Street,2,0072CE,FFFFFF
4,2-ALM-B-mjp-1,,Alamein,City (Flinders Street) - Alamein,2,0072CE,FFFFFF


## Attributes

The goal of this notebook is to produce an arbitrary 'score' for each suburb for the three public transport branches: Train, Bus, and Tram.

The attributes which we will use to produce these scores will, at a high-level, incorporate the number and distance of nearby stops*, the regularity of the service (time between trips), and the availability of full-time service (days of weeks and hours of day). The precise implementations will be explained in detail later in the notebook at the appropriate locations.

\* Keep in mind that the distance is calculated to the 'centre' of the suburb as determined by Google Maps using `geopy.GoogleV3` in [`0-SuburbMetadataClean.ipynb`](0-SuburbMetadataClean.ipynb).

- `tra_train` - this attribute will equally combine the scores for the following features:
  - `regional_train_distance`
  - `regional_train_regularity`
  - `regional_train_availablility`
  - `metro_train_distance`
  - `metro_train_regularity`
  - `metro_train_availablility`
- `tra_bus` - this attribute will equally combine the scores for the following features:
  - `regional_bus_distance`
  - `regional_bus_regularity`
  - `regional_bus_availablility`
  - `metro_bus_distance`
  - `metro_bus_regularity`
  - `metro_bus_availablility`
- `tra_tram` - this attribute will equally combine the scores for the following features:
  - `tram_distance`
  - `tram_regularity`
  - `tram_availablility`

For each suburb:

- The `_distance` attributes are the sum of distances to the 3 nearest 'stops' in kilometers.
- The `_regularity` attributes are the average time between services at the 3 nearest stations.
- The `_availability` attributes are the number of hours in a normal (non-holiday) week which has a service running. Naturally, the range of this number is 0 to 168 (7 * 24). As this is between the 3 nearest stations, the value will be the averge for the 3 stations.

Firstly, we need to obtain the suburb metadata file (using the [`helper.py`](helper.py) module).

In [8]:
# All localities meta data
# ========================
# get data
suburb_metadata_df = helper.getSuburbsMetadata()
# convert to numpy array for faster operations
suburb_metadata = suburb_metadata_df[["locality", "coordinates"]].to_numpy()

suburb_metadata_df.shape
suburb_metadata_df.head()

(3268, 18)

Unnamed: 0,postcode,locality,coordinates,lgaregion,sa1_code_2016,sa1_code_2021,sa2_code_2016,sa2_name_2016,sa2_code_2021,sa2_name_2021,sa3_code_2016,sa3_name_2016,sa3_code_2021,sa3_name_2021,sa4_code_2016,sa4_name_2016,sa4_code_2021,sa4_name_2021
0,3000,Melbourne,"(-37.8152065, 144.963937)",Melbourne,20604110000.0,20604150327,206041122.0,Melbourne,206041503,Melbourne CBD - East,20604.0,Melbourne City,20604,Melbourne City,206.0,Melbourne - Inner,206,Melbourne - Inner
1,3002,East Melbourne,"(-37.8161444, 144.9804594)",Yarra,20604110000.0,20604111914,206041119.0,East Melbourne,206041119,East Melbourne,20604.0,Melbourne City,20604,Melbourne City,206.0,Melbourne - Inner,206,Melbourne - Inner
2,3003,West Melbourne,"(-37.8114504, 144.9253974)",Melbourne,20604110000.0,20604112701,206041127.0,West Melbourne,206041127,West Melbourne - Industrial,20604.0,Melbourne City,20604,Melbourne City,206.0,Melbourne - Inner,206,Melbourne - Inner
3,3004,St Kilda Road Central,"(-37.8367638, 144.9756445)",Yarra,20604110000.0,20604112506,206041125.0,South Yarra - West,206041125,South Yarra - West,20604.0,Melbourne City,20604,Melbourne City,206.0,Melbourne - Inner,206,Melbourne - Inner
4,3004,St Kilda Road Melbourne,"(-37.8367638, 144.9756445)",Yarra,20604110000.0,20604112506,206041125.0,South Yarra - West,206041125,South Yarra - West,20604.0,Melbourne City,20604,Melbourne City,206.0,Melbourne - Inner,206,Melbourne - Inner


Now we define the functions used to determine each of the three features: distance, regularity, and availability.

The `modulusString` function takes a time formatted as `00:00:00` and ensures that the hour component is within the correct time frame (i.e., modulus 24) since the GTFS docs state that the time can go 'past' midnight on a given day when the trip began on the previous day and continues on past midnight.

In [9]:
def modulusString(string: str):
    string_comp = string.split(":")
    hour = int(string_comp[0])
    new_hour = str(hour % 24).rjust(2, "0")
    return f"{new_hour}:{':'.join(string_comp[1:])}"


modulusString("32:14:00")

'08:14:00'

The following loops generates dictionaries with the same keys as `datasets`. The first dictionary comprehension, `service_ids`, gives the `service_id` values from the `calendar` dataset which are longer than 31 days (to make sure no temporary timetables are included in the data analysis). The second dictionary comprehension, `schedules`,  merges many of the datasets togther (and uses the `service_ids` dictionary) to create a dataset which has all departure times for services and which days of the week those services apply to.

In [10]:
service_ids = {
    branch_name: (
        datasets[branch_name]["calendar"]
        .assign(start_date=lambda x: x["start_date"].astype(str).apply(datetime.strptime, args=("%Y%m%d",)))
        .assign(end_date=lambda x: x["end_date"].astype(str).apply(datetime.strptime, args=("%Y%m%d",)))
        .pipe(lambda x: x[x.end_date - x.start_date > timedelta(5)])  # BUG: This number is picky and can result in NaN values in the final dataframe
    )["service_id"].to_list()
    for branch_name in VALID_BRANCHES
}

schedules = {
    branch_name: (
        datasets[branch_name]["stopTimes"]
        .merge(datasets[branch_name]["stops"])  # merge stop-time data with stop geography data
        .query("pickup_type == 0")  # ensure service is picking up passengers (see GTFS docs)
        .merge(datasets[branch_name]["trips"])  # merge with trips data on trip_id
        .merge(datasets[branch_name]["calendar"])  # get days of week service is running
        .pipe(lambda x: x[x["service_id"].isin(service_ids[branch_name])])  # filter to contain only services which match the condition for permanent service, not largely temporary
        .assign(coordinates=lambda x: list(zip(x["stop_lat"], x["stop_lon"])))  # merge coordinates together
        .drop([
            "arrival_time", "stop_sequence", "stop_headsign", "pickup_type", 
            "drop_off_type", "shape_dist_traveled", "trip_id", "route_id", 
            "shape_id", "trip_headsign", "direction_id", "start_date", "end_date",
            "stop_lat", "stop_lon"
        ], axis=1)  # drop unnecessary columns
    )
    for branch_name in VALID_BRANCHES
}

This next code block is used for efficiency. It determines the 3 closest stops to each suburb and stores this result to be used in the later feature extraction functions `distanceFeature`, `regularityFeature`, and `availabilityFeature`.

In [11]:
closestStops = {}

for branch_name in VALID_BRANCHES:
    # get the coordinates of the suburbs in a two-dim array
    suburb_points = pd.DataFrame(suburb_metadata_df.coordinates.to_list()).to_numpy()
    # get the branch stop coordinates in two-dim array
    stop_points = datasets[branch_name]["stops"][["stop_lat", "stop_lon"]].to_numpy()
    
    start = perf_counter()
    closestStops[branch_name] = (
        pd.DataFrame(
            cdist(suburb_points, stop_points, metric=lambda u, v: haversine(u, v)),  # convert distance matrix to dataframe
            columns=datasets[branch_name]["stops"].stop_id,  # use stop_id as column names
            index=suburb_metadata_df.locality  # use locality names as index
        )
        .reset_index()  # move index into separate column
        .melt("locality", value_name="distance_km")  # wide-to-narrow conversion
        .groupby("locality", group_keys=False)  # groupby the locality
        .apply(pd.DataFrame.nsmallest, n=3, columns="distance_km")  # get the 3 smallest distance rows for each suburb
    )

    print(f"Completed: {branch_name:24}({round(perf_counter() - start, 2)} s)")


Completed: regionalTrain           (1.83 s)
Completed: metropolitanTrain       (2.76 s)
Completed: metropolitanTram        (14.93 s)
Completed: metropolitanBus         (161.19 s)
Completed: regionalBus             (58.71 s)


In [12]:
def distanceFeature(branch_name: str) -> list:
    """
    Determines the sum of distances of the closest three 'stops' in the 
    `branch_name` for each suburb.
    
    ## Parameters

    `branch_id` • str
        The name of the PTV branch whose 'stops' are being considered.

    ## Returns
    
    An ordered list (order is the same as that of `closestStops` dataframe which 
    is derived from `suburb_metadata_df`) which contains the sum of distances to 
    the 3 nearest 'stops' in the `branch_name` branch of service.
    """
    
    assert branch_name in VALID_BRANCHES

    return (
        closestStops[branch_name]
        .groupby("locality")
        .sum("distance_km")
    ).distance_km.to_list()


def regularityFeature(suburb_name: str, branch_name: str) -> float:
    """
    Determines the average number of minutes between services for the 
    `branch_name` service at the closest 3 'stops' to the 
    `suburb_name` suburb.
    
    ## Parameters

    `suburb_name` • tuple(float, float)
        Name of the suburb for which the distance is calculated. Valid values in 
        include those in the `locality` column in the dataframe obtained from 
        `helper.getSuburbsMetadata()`.

    `branch_id` • str
        The name of the PTV branch whose 'stops' are being considered.

    ## Returns
    
    The average number of minutes between services.
    """

    stop_ids = closestStops[branch_name].groupby("locality").agg(list).stop_id[suburb_name]
    
    temp = (
        schedules[branch_name]
        .pipe(lambda x: x[x.stop_id.isin(stop_ids)])
        .query("monday == 1")
        .assign(departure_time=lambda x: x.departure_time.apply(modulusString).apply(datetime.strptime, args=("%H:%M:%S",)))
        .sort_values("departure_time")
    )

    values = [temp.query(f"stop_id == {stop}").sort_values("departure_time").departure_time.diff().mean() for stop in stop_ids]
    new_values = []
    for v in values:
        if isinstance(v, timedelta):
            new_values.append(v)

    if len(new_values) > 0:
        return np.array(new_values).mean().total_seconds() / 60
    return np.nan


def availabilityFeature(suburb_name: tuple, branch_name: str) -> int:
    """
    Determines the sum of distances of the closest three 'stops' in the 
    `branch_name` service to the `suburb_coordinates` suburb.
    
    ## Parameters

    `suburb_name` • tuple(float, float)
        Name of the suburb for which the distance is calculated. Valid values in 
        include those in the `locality` column in the dataframe obtained from 
        `helper.getSuburbsMetadata()`.

    `branch_id` • str
        The name of the PTV branch whose 'stops' are being considered.

    ## Returns
    
    The number of hours in a week which has a service running at the three 
    closest stations to the given suburb (average between stations).
    """

    stop_ids = closestStops[branch_name].groupby("locality").agg(list).stop_id[suburb_name]

    temp = (
        schedules[branch_name]
        .pipe(lambda x: x[x.stop_id.isin(stop_ids)])  # only consider the 3 nearest stops
        .assign(departure_time=lambda x: x.departure_time.apply(modulusString).apply(lambda s: int(s.split(":")[0])))  # convert departure_time string into integer representing hour
        .melt(["departure_time", "stop_id", "stop_name", "service_id", "coordinates"], var_name="day", value_name="is_running")  # convert day columns into single column
        .query("is_running == 1")  # only consider rows where the service is running (sideproduct from melt above)
        .drop_duplicates(["departure_time", "day"])  # only keep unique hour, day pairs
    )

    return np.array([len(temp.query(f"stop_id == {stop}")) for stop in stop_ids]).mean()

In [13]:
# train based features
# ==============================================================================

start = perf_counter()
suburb_metadata_df["regional_train_distance"] = distanceFeature("regionalTrain")
print(f"Completed: {'regional_train_distance':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["regional_train_regularity"] = suburb_metadata_df["locality"].apply(regularityFeature, args=("regionalTrain",))
print(f"Completed: {'regional_train_regularity':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["regional_train_availablility"] = suburb_metadata_df["locality"].apply(availabilityFeature, args=("regionalTrain",))
print(f"Completed: {'regional_train_availablility':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_train_distance"] = distanceFeature("metropolitanTrain")
print(f"Completed: {'metro_train_distance':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_train_regularity"] = suburb_metadata_df["locality"].apply(regularityFeature, args=("metropolitanTrain",))
print(f"Completed: {'metro_train_regularity':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_train_availablility"] = suburb_metadata_df["locality"].apply(availabilityFeature, args=("metropolitanTrain",))
print(f"Completed: {'metro_train_availablility':35}({round(perf_counter() - start, 2)} s)")


# bus based features
# ==============================================================================

start = perf_counter()
suburb_metadata_df["regional_bus_distance"] = distanceFeature("regionalBus")
print(f"Completed: {'regional_bus_distance':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["regional_bus_regularity"] = suburb_metadata_df["locality"].apply(regularityFeature, args=("regionalBus",))
print(f"Completed: {'regional_bus_regularity':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["regional_bus_availablility"] = suburb_metadata_df["locality"].apply(availabilityFeature, args=("regionalBus",))
print(f"Completed: {'regional_bus_availablility':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_bus_distance"] = distanceFeature("metropolitanBus")
print(f"Completed: {'metro_bus_distance':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_bus_regularity"] = suburb_metadata_df["locality"].apply(regularityFeature, args=("metropolitanBus",))
print(f"Completed: {'metro_bus_regularity':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["metro_bus_availablility"] = suburb_metadata_df["locality"].apply(availabilityFeature, args=("metropolitanBus",))
print(f"Completed: {'metro_bus_availablility':35}({round(perf_counter() - start, 2)} s)")


# tram based features
# ==============================================================================

start = perf_counter()
suburb_metadata_df["tram_distance"] = distanceFeature("metropolitanTram")
print(f"Completed: {'tram_distance':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["tram_regularity"] = suburb_metadata_df["locality"].apply(regularityFeature, args=("metropolitanTram",))
print(f"Completed: {'tram_regularity':35}({round(perf_counter() - start, 2)} s)")

start = perf_counter()
suburb_metadata_df["tram_availablility"] = suburb_metadata_df["locality"].apply(availabilityFeature, args=("metropolitanTram",))
print(f"Completed: {'tram_availablility':35}({round(perf_counter() - start, 2)} s)")


Completed: regional_train_distance            (0.0 s)
Completed: regional_train_regularity          (77.99 s)
Completed: regional_train_availablility       (77.87 s)
Completed: metro_train_distance               (0.0 s)
Completed: metro_train_regularity             (80.88 s)
Completed: metro_train_availablility          (80.47 s)
Completed: regional_bus_distance              (0.0 s)
Completed: regional_bus_regularity            (80.58 s)
Completed: regional_bus_availablility         (81.55 s)
Completed: metro_bus_distance                 (0.0 s)
Completed: metro_bus_regularity               (115.29 s)
Completed: metro_bus_availablility            (111.92 s)
Completed: tram_distance                      (0.0 s)
Completed: tram_regularity                    (88.88 s)
Completed: tram_availablility                 (89.78 s)


We need to replace any `np.nan` values with 0 (since these were created by the absence of enough rows to determine the 'average' time between services). And once we have these intermediary features without NaN values, we can calculate the final transport 'scores' as the equal weight of each of the intermediary scores summed together.

In [14]:
temp = suburb_metadata_df.copy()

temp = temp.fillna(0)

columns = [
    "regional_train_distance",
    "regional_train_regularity",
    "regional_train_availablility",
    "metro_train_distance",
    "metro_train_regularity",
    "metro_train_availablility",
    "regional_bus_distance",
    "regional_bus_regularity",
    "regional_bus_availablility",
    "metro_bus_distance",
    "metro_bus_regularity",
    "metro_bus_availablility",
    "tram_distance",
    "tram_regularity",
    "tram_availablility",
]

for col_name in columns:
    temp[col_name] = (temp[col_name] - temp[col_name].min()) / (temp[col_name].max() - temp[col_name].min())

temp["tra_train"] = temp["regional_train_distance"] + temp["regional_train_regularity"] + temp["regional_train_availablility"] + temp["metro_train_distance"] + temp["metro_train_regularity"] + temp["metro_train_availablility"]
temp["tra_bus"] = temp["regional_bus_distance"] + temp["regional_bus_regularity"] + temp["regional_bus_availablility"] + temp["metro_bus_distance"] + temp["metro_bus_regularity"] + temp["metro_bus_availablility"]
temp["tra_tram"] = (temp["tram_distance"] + temp["tram_regularity"] + temp["tram_availablility"]) * 2  # multiple by 2 to make it 'scaled' like the other two (since tram only has 3 intermediate attributes)

temp

temp[["locality", "tra_train", "tra_bus", "tra_tram"]].to_csv(os.path.join(DERIVED_DATA_PATH, "SuburbTransportScores.csv"), index=False)


Unnamed: 0,postcode,locality,coordinates,lgaregion,sa1_code_2016,sa1_code_2021,sa2_code_2016,sa2_name_2016,sa2_code_2021,sa2_name_2021,...,regional_bus_availablility,metro_bus_distance,metro_bus_regularity,metro_bus_availablility,tram_distance,tram_regularity,tram_availablility,tra_train,tra_bus,tra_tram
0,3000,Melbourne,"(-37.8152065, 144.963937)",Melbourne,2.060411e+10,20604150327,206041122.0,Melbourne,206041503,Melbourne CBD - East,...,0.495726,0.172907,0.008289,0.985714,0.317436,0.078933,0.50,2.574785,2.060283,1.792738
1,3002,East Melbourne,"(-37.8161444, 144.9804594)",Yarra,2.060411e+10,20604111914,206041119.0,East Melbourne,206041119,East Melbourne,...,0.495726,0.000401,0.018943,0.985714,0.001671,0.255311,0.50,2.067902,1.841857,1.513963
2,3003,West Melbourne,"(-37.8114504, 144.9253974)",Melbourne,2.060411e+10,20604112701,206041127.0,West Melbourne,206041127,West Melbourne - Industrial,...,0.495726,0.000196,0.031458,0.878571,0.001877,0.116463,0.55,2.043087,1.651727,1.336680
3,3004,St Kilda Road Central,"(-37.8367638, 144.9756445)",Yarra,2.060411e+10,20604112506,206041125.0,South Yarra - West,206041125,South Yarra - West,...,0.213675,0.105058,0.032642,0.628571,0.198049,0.159692,0.70,2.443637,1.315487,2.115481
4,3004,St Kilda Road Melbourne,"(-37.8367638, 144.9756445)",Yarra,2.060411e+10,20604112506,206041125.0,South Yarra - West,206041125,South Yarra - West,...,0.213675,0.004974,0.032642,0.628571,0.137051,0.159692,0.70,2.392350,1.040909,1.993485
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3263,3988,Poowong North,"(-38.2808915, 145.7825971)",South Gippsland,2.050311e+10,20503108916,205031089.0,Korumburra,205031089,Korumburra,...,0.000000,0.210410,0.110215,0.757143,0.320048,0.521002,0.75,2.640861,1.397214,3.182100
3264,3960,Tidal River,"(-39.02979215080867, 146.3208933505564)",Wellington,2.050311e+10,20503109201,205031092.0,Wilsons Promontory,205031092,Wilsons Promontory,...,0.307692,0.002885,0.141896,0.592857,0.026640,0.521002,0.75,2.120436,1.365191,2.595285
3265,3960,Wilsons Promontory,"(-38.9572966, 146.28311)",Wellington,2.050311e+10,20503109201,205031092.0,Wilsons Promontory,205031092,Wilsons Promontory,...,0.307692,0.257113,0.057057,0.121429,0.326216,0.521002,0.75,2.584741,1.208879,3.194436
3266,3707,Bringenbrong,"(-36.1428573, 148.0518011)",Towong,1.130213e+10,11302126013,113021260.0,Tumbarumba,113021260,Tumbarumba,...,0.111111,0.216357,0.082883,0.064286,0.298453,0.378887,1.00,2.425354,1.163270,3.354680
