**NSW Toll Road Data Analytics**

***Download and Unpackage Toll Road Data***

This notebook downloads and unpacks (unzips) traffic data from the NSW Toll Road Data website: https://nswtollroaddata.com/

The datasets on this website contain traffic data for the following toll roads in Sydney, New South Wales, Australia that are wholly or partly owned by Transurban:
- Cross City Tunnel ("CCT")
- Hills M2 ("M2")
- Lane Cove Tunnel / Military Road E-Ramp ("LCT")
- M1 Eastern Distributor ("ED")
- M4 ("M4")
- M5 South West Motorway ("M5")
- M5 East & M8 ("M5E")
- NorthConnex ("NCX")
- Westlink M7 ("M7")

Quarterly traffic data is published on nswtollroaddata.com in accordance with Transurban’s obligations under an Undertaking accepted by the ACCC on 29 August 2018 under section 87B of the Competition and Consumer Act 2010 (Cth) (Undertaking) and is made available under the Creative Commons Attribution 4.0 license.

In [1]:
# Import dependency packages
import os
import pandas as pd
import glob
import zipfile
import datetime
import urllib.request
from pathlib import Path

time_start = datetime.datetime.now()
print(time_start)

2021-08-08 14:26:50.982981


In [2]:
# Specify the directory on your machine where the unzipped CSV files are saved.
# The CSV files can be downloaded from here: https://nswtollroaddata.com/data-download/
base_dir = os.getcwd() 
data_dir = base_dir
inputs_dir = os.path.join(data_dir, 'Inputs')
outputs_dir = os.path.join(data_dir, 'Outputs')
print(base_dir)

C:\Users\peter\Documents\GitHub\NSW_Toll_Road_Data_Analytics_dev
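The `Inputs` and `Outputs` paths above are only built as strings; nothing in that cell creates the folders. A minimal sketch of one way to make sure they exist, assuming the same folder layout. The helper name `ensure_dirs` is hypothetical and not part of the original notebook:

```python
from pathlib import Path


def ensure_dirs(base_dir):
    """Create the Inputs and Outputs folders under base_dir if they are missing."""
    base = Path(base_dir)
    inputs_dir = base / "Inputs"
    outputs_dir = base / "Outputs"
    for folder in (inputs_dir, outputs_dir):
        # exist_ok=True makes this safe to re-run on an existing folder
        folder.mkdir(parents=True, exist_ok=True)
    return inputs_dir, outputs_dir
```

Calling `ensure_dirs(os.getcwd())` would return the two paths as `Path` objects, which `os.path.join` and `zipfile` accept directly.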


***Download Data***

The following cells download and unzip the data from the NSW Toll Road Data website.

In [3]:
# Download all data from the NSW Toll Road Data website automatically
FinancialQuarter = 1
FinancialYearStart = 2009
FinancialYearEnd = time_start.year + 1 # Current year, plus one as the financial year starts in July (i.e. FY2022 starts in July 2021)
FinancialYear = FinancialYearStart

# List of toll road Assets to download data for.
# Comment out any Assets where you do not need to download the data.
Assets = [
    'CCT', 
    'LCT', 
    'ED', 
    'NCX', 
    'M2', 
    'M4', 
    'M5', 
    'M5E', 
    'M7'
]

print(Assets)

['CCT', 'LCT', 'ED', 'NCX', 'M2', 'M4', 'M5', 'M5E', 'M7']
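The `FinancialYearEnd` rule above always adds one to the calendar year, which overshoots by a year between January and June (the extra requests simply find no file). The exact mapping from a date to a financial year can be sketched as below. The function name is hypothetical, and the quarter numbering (Q1 = July to September) is an assumption inferred from the July start rather than something the source states:

```python
import datetime


def financial_year_and_quarter(date):
    """Return (financial_year, quarter) for a date, assuming a 1 July year start."""
    # July to December belong to the financial year named after the NEXT calendar year
    financial_year = date.year + 1 if date.month >= 7 else date.year
    # Assumption: Q1 = Jul-Sep, Q2 = Oct-Dec, Q3 = Jan-Mar, Q4 = Apr-Jun
    quarter = ((date.month - 7) % 12) // 3 + 1
    return financial_year, quarter


print(financial_year_and_quarter(datetime.date(2021, 8, 8)))  # (2022, 1)
```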


In [4]:
# Make sure the download directory exists before saving files into it
os.makedirs(inputs_dir, exist_ok=True)

for asset in Assets:
    for FinancialYear in range(FinancialYearStart, FinancialYearEnd + 1):
        print("Downloading data for {} for FY {}".format(asset, FinancialYear))
        for FinancialQuarter in range(1, 5):
            print(FinancialYear, FinancialQuarter)
            zip_name = "{}_traffic-data_FY{}_Q{}_csv.zip".format(asset, FinancialYear, FinancialQuarter)
            web_url = "https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/{}/{}".format(asset, zip_name)
            print(web_url)
            download_fp = os.path.join(inputs_dir, zip_name)
            try:
                urllib.request.urlretrieve(web_url, download_fp)
            except urllib.error.URLError:
                # HTTP errors (e.g. a quarter that was never published) and network failures land here
                print("No file available.")

Downloading data for CCT for FY 2009
2009 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2009_Q1_csv.zip
No file available.
2009 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2009_Q2_csv.zip
2009 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2009_Q3_csv.zip
2009 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2009_Q4_csv.zip
Downloading data for CCT for FY 2010
2010 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2010_Q1_csv.zip
2010 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2010_Q2_csv.zip
2010 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/CCT/CCT_traffic-data_FY2010_Q3_csv.zip
2010

No file available.
Downloading data for LCT for FY 2010
2010 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2010_Q1_csv.zip
No file available.
2010 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2010_Q2_csv.zip
No file available.
2010 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2010_Q3_csv.zip
2010 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2010_Q4_csv.zip
Downloading data for LCT for FY 2011
2011 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2011_Q1_csv.zip
2011 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/LCT_traffic-data_FY2011_Q2_csv.zip
2011 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/LCT/L

2010 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2010_Q4_csv.zip
Downloading data for ED for FY 2011
2011 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2011_Q1_csv.zip
2011 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2011_Q2_csv.zip
2011 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2011_Q3_csv.zip
2011 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2011_Q4_csv.zip
Downloading data for ED for FY 2012
2012 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2012_Q1_csv.zip
2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/ED/ED_traffic-data_FY2012_Q2_csv.zip
2012 3
https://s3.ap-southeast-2.amazon

No file available.
2011 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2011_Q3_csv.zip
No file available.
2011 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2011_Q4_csv.zip
No file available.
Downloading data for NCX for FY 2012
2012 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2012_Q1_csv.zip
No file available.
2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2012_Q2_csv.zip
No file available.
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2012_Q3_csv.zip
No file available.
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/NCX/NCX_traffic-data_FY2012_Q4_csv.zip
No file available.
Downloading data for NCX for FY 2013
2013 1
https://s3.a

2011 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2011_Q2_csv.zip
2011 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2011_Q3_csv.zip
2011 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2011_Q4_csv.zip
Downloading data for M2 for FY 2012
2012 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2012_Q1_csv.zip
2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2012_Q2_csv.zip
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2012_Q3_csv.zip
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M2/M2_traffic-data_FY2012_Q4_csv.zip
Downloading data for M2 for FY 2013
2013 1
https://s3.ap-southeast-2.amazon

No file available.
2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2012_Q2_csv.zip
No file available.
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2012_Q3_csv.zip
No file available.
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2012_Q4_csv.zip
No file available.
Downloading data for M4 for FY 2013
2013 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2013_Q1_csv.zip
No file available.
2013 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2013_Q2_csv.zip
No file available.
2013 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M4/M4_traffic-data_FY2013_Q3_csv.zip
No file available.
2013 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/da

No file available.
2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2012_Q2_csv.zip
No file available.
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2012_Q3_csv.zip
No file available.
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2012_Q4_csv.zip
Downloading data for M5 for FY 2013
2013 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2013_Q1_csv.zip
No file available.
2013 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2013_Q2_csv.zip
No file available.
2013 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5/M5_traffic-data_FY2013_Q3_csv.zip
No file available.
2013 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asse

No file available.
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2012_Q3_csv.zip
No file available.
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2012_Q4_csv.zip
No file available.
Downloading data for M5E for FY 2013
2013 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2013_Q1_csv.zip
No file available.
2013 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2013_Q2_csv.zip
No file available.
2013 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2013_Q3_csv.zip
No file available.
2013 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M5E/M5E_traffic-data_FY2013_Q4_csv.zip
No file available.
Downloading data for M5E for FY 2014
2014 1
https://s3.a

2012 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2012_Q2_csv.zip
2012 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2012_Q3_csv.zip
2012 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2012_Q4_csv.zip
Downloading data for M7 for FY 2013
2013 1
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2013_Q1_csv.zip
2013 2
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2013_Q2_csv.zip
2013 3
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2013_Q3_csv.zip
2013 4
https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset/M7/M7_traffic-data_FY2013_Q4_csv.zip
Downloading data for M7 for FY 2014
2014 1
https://s3.ap-southeast-2.amazon
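Re-running the download cell fetches every archive again, including ones already saved. One possible refinement is a small helper that skips files already on disk and reports missing quarters by returning `None`. This is a sketch under stated assumptions: the names `build_zip_name`, `download_quarter`, and the `retrieve` parameter are hypothetical, and only the URL pattern is taken from the cell above:

```python
import os
import urllib.error
import urllib.request

BASE_URL = "https://s3.ap-southeast-2.amazonaws.com/accc-assetdata-prod/data/accc/upload/asset"


def build_zip_name(asset, financial_year, quarter):
    """Return the archive file name used for one asset, year and quarter."""
    return "{}_traffic-data_FY{}_Q{}_csv.zip".format(asset, financial_year, quarter)


def download_quarter(asset, financial_year, quarter, dest_dir,
                     retrieve=urllib.request.urlretrieve):
    """Download one quarterly archive; return its path, or None if unavailable."""
    zip_name = build_zip_name(asset, financial_year, quarter)
    dest_path = os.path.join(dest_dir, zip_name)
    if os.path.exists(dest_path):
        # Already downloaded on an earlier run, so skip the request
        return dest_path
    url = "{}/{}/{}".format(BASE_URL, asset, zip_name)
    try:
        retrieve(url, dest_path)
    except urllib.error.URLError:
        # HTTPError is a subclass of URLError, so missing files end up here
        return None
    return dest_path
```

The `retrieve` argument exists only so the function can be exercised without network access; in normal use it can be left at its default.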

In [5]:
# Unzip all data
inputs_dir = os.path.join(data_dir, 'Inputs')
extension = ".zip"

for item in os.listdir(inputs_dir): # loop through items in dir
    if item.endswith(extension): # check for ".zip" extension
        file_name = os.path.join(inputs_dir, item) # build full path without changing the working directory
        with zipfile.ZipFile(file_name) as zip_ref: # open the archive; it is closed automatically
            zip_ref.extractall(inputs_dir) # extract files to dir
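An interrupted download can leave a partial `.zip` file behind, and opening it with `zipfile.ZipFile` raises an error that stops the whole loop. A hedged sketch of a more defensive variant, which checks each file with `zipfile.is_zipfile` first and reports what was extracted. The function name `extract_all_zips` is hypothetical:

```python
import os
import zipfile


def extract_all_zips(inputs_dir):
    """Extract every valid .zip archive in inputs_dir; return the extracted member names."""
    extracted = []
    for item in sorted(os.listdir(inputs_dir)):
        file_path = os.path.join(inputs_dir, item)
        if not item.endswith(".zip"):
            continue
        if not zipfile.is_zipfile(file_path):
            # e.g. an interrupted download that left a partial file behind
            print("Skipping invalid archive: {}".format(item))
            continue
        with zipfile.ZipFile(file_path) as zip_ref:
            zip_ref.extractall(inputs_dir)
            extracted.extend(zip_ref.namelist())
    return extracted
```

The returned list makes it easy to confirm how many CSV files were produced before moving on to analysis.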

In [6]:
# TODO: Re-zip all data for record-keeping
# TODO: Deal with unzipping of the older M5 South West data format
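The re-zip TODO could be approached along the following lines. This is only a sketch: `rezip_csvs` and its `archive_path` argument are hypothetical names, and it assumes the extracted files sit directly in the `Inputs` folder with a `.csv` extension, which the source does not confirm:

```python
import os
import zipfile


def rezip_csvs(inputs_dir, archive_path):
    """Write every .csv file in inputs_dir into one compressed archive."""
    csv_names = sorted(name for name in os.listdir(inputs_dir) if name.endswith(".csv"))
    with zipfile.ZipFile(archive_path, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name in csv_names:
            # arcname stores the bare file name rather than the full local path
            archive.write(os.path.join(inputs_dir, name), arcname=name)
    return csv_names
```

Storing bare file names keeps the archive portable, since it no longer encodes the directory structure of the machine that created it.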