## Getting all the historical crash data for QLD

https://www.data.qld.gov.au/dataset/crash-data-from-queensland-roads/resource/f999155b-37f7-48aa-b5dd-644838130b0b?inner_span=True 

The API here limits how much data a single request can return, i.e. it only lets you load a subset of the data at a time. So if you want all of the data, you need to extract mutually exclusive portions of the dataset and then assemble them, as demonstrated below:

In [1]:
import pandas as pd
import requests
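
Besides splitting by a column value, one generic way to work within the per-request cap is to page through the resource with the `limit` and `offset` parameters of `datastore_search`. This is a minimal sketch of that alternative, not the approach used in the rest of this notebook. The helper names `fetch_page` and `fetch_all` are made up for illustration, and the page size of 10000 is an assumption.

```python
API_URL = 'https://www.data.qld.gov.au/api/3/action/datastore_search'
RESOURCE_ID = 'e88943c0-5968-4972-a15f-38e120d72ec0'  # same resource as in this notebook


def fetch_page(offset, limit):
    """Fetch one page of records starting at `offset` (hypothetical helper, hits the network)."""
    import requests  # imported here so the paging logic below can be tried offline

    params = {'resource_id': RESOURCE_ID, 'limit': limit, 'offset': offset}
    response = requests.get(API_URL, params=params, timeout=60)
    response.raise_for_status()
    return response.json()['result']['records']


def fetch_all(get_page=fetch_page, page_size=10000):
    """Keep requesting pages until one comes back shorter than `page_size`."""
    records = []
    offset = 0
    while True:
        page = get_page(offset, page_size)
        records.extend(page)
        if len(page) < page_size:
            # A short (or empty) page means there is nothing left to fetch
            break
        offset += page_size
    return records
```

Calling `pd.DataFrame(fetch_all())` would then give a single frame. Passing your own `get_page` function makes it easy to try the loop on a small in-memory list first.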

### By postcode

If you have a list of all QLD postcodes, you could, in theory, loop through all of them, fetch the crash records for each one, and then combine the results:

In [12]:
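limit = 10000  # maximum number of records to request

# Query the datastore for crashes recorded in postcode 4650
data_url = 'https://www.data.qld.gov.au/api/3/action/datastore_search?' \
    + 'resource_id=e88943c0-5968-4972-a15f-38e120d72ec0' \
    + '&limit=' + str(limit) \
    + '&q=' + '{"Loc_Post_Code":"4650"}'

# The records sit under result -> records in the JSON response
qld_cc_json = requests.get(data_url).json()
qld_cc = pd.DataFrame(qld_cc_json['result']['records'])
# Preview 10 random rows, skipping the `_id` column
qld_cc.iloc[:, 1:20].sample(10)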

Unnamed: 0,Crash_Ref_Number,Crash_Severity,Crash_Year,Crash_Month,Crash_Day_Of_Week,Crash_Hour,Crash_Nature,Crash_Type,Crash_Longitude_GDA94,Crash_Latitude_GDA94,Crash_Street,Crash_Street_Intersecting,State_Road_Name,Loc_Suburb,Loc_Local_Government_Area,Loc_Post_Code,Loc_Police_Division,Loc_Police_District,Loc_Police_Region
1750,226257,Medical treatment,2001,August,Thursday,9,Angle,Multi-Vehicle,152.684625,-25.545896,Maryborough - Hervey Bay Rd,,Maryborough - Hervey Bay Road,Tinana,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
2182,266706,Property damage only,2005,November,Thursday,11,Angle,Multi-Vehicle,152.703175,-25.543562,Alice St,Maryborough - Cooloola Rd,,Maryborough,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
748,88484,Property damage only,2003,October,Wednesday,15,Rear-end,Multi-Vehicle,152.701301,-25.522836,Frank St,Pallas St,,Maryborough,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
357,33471,Hospitalisation,2018,September,Tuesday,19,Hit object,Single Vehicle,152.663218,-25.487785,Bruce Hwy,,Bruce Highway (Maryborough - Gin Gin),Maryborough West,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
1713,196594,Hospitalisation,2017,May,Tuesday,11,Rear-end,Multi-Vehicle,152.641241,-25.625247,Bruce Hwy,,Bruce Highway (Gympie - Maryborough),Glenorchy,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
24,2655,Property damage only,2002,February,Friday,7,Angle,Multi-Vehicle,152.699641,-25.530653,Walker St,,,Maryborough,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
1535,177066,Medical treatment,2006,June,Saturday,15,Angle,Multi-Vehicle,152.694544,-25.539674,Albert St,Fort St,,Maryborough,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
1606,184393,Property damage only,2009,June,Sunday,18,Overturned,Single Vehicle,152.562201,-25.751342,Netherby Rd,,,Tiaro,Fraser Coast Region,4650,Tiaro,Wide Bay Burnett,Central
389,43614,Medical treatment,2002,March,Saturday,14,Rear-end,Multi-Vehicle,152.715244,-25.54235,Maryborough - Cooloola Rd,,Maryborough - Cooloola Road,Granville,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central
1311,149222,Hospitalisation,2012,June,Monday,6,Hit object,Single Vehicle,152.629651,-25.65424,Bruce Hwy,,Bruce Highway (Gympie - Maryborough),Owanyilla,Fraser Coast Region,4650,Maryborough,Wide Bay Burnett,Central


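Every returned record is in postcode 4650, and the count is well below the limit of 10,000, so it seems the result was not cut off and we got all the crashes for this postcode: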

In [9]:
qld_cc["Loc_Post_Code"].value_counts()

4650    2775
Name: Loc_Post_Code, dtype: int64
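
The postcode loop described above could be sketched as follows. This is only an illustration under a few assumptions: `query_postcode` and `collect_postcodes` are hypothetical helper names, the list of postcodes has to come from somewhere else, and the check relies on the `total` field that `datastore_search` normally includes in its `result` to flag postcodes whose results were cut off by the limit.

```python
import json

API_URL = 'https://www.data.qld.gov.au/api/3/action/datastore_search'
RESOURCE_ID = 'e88943c0-5968-4972-a15f-38e120d72ec0'  # same resource as above


def query_postcode(postcode, limit=10000):
    """Return the API's `result` dict for one postcode (hypothetical helper, hits the network)."""
    import requests  # imported here so collect_postcodes can be tried offline

    params = {
        'resource_id': RESOURCE_ID,
        'limit': limit,
        'q': json.dumps({'Loc_Post_Code': str(postcode)}),
    }
    response = requests.get(API_URL, params=params, timeout=60)
    response.raise_for_status()
    return response.json()['result']


def collect_postcodes(postcodes, query=query_postcode):
    """Gather records for every postcode and note any that hit the limit."""
    records = []
    truncated = []
    for postcode in postcodes:
        result = query(postcode)
        page = result['records']
        records.extend(page)
        # If the API reports more matches than it returned, the limit cut this one short
        if result.get('total', len(page)) > len(page):
            truncated.append(postcode)
    return records, truncated
```

With a real list of postcodes, `records, truncated = collect_postcodes(postcodes)` followed by `pd.DataFrame(records)` would give the combined frame, and a non-empty `truncated` list would mean some postcodes need a higher limit or a finer split.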

### By year

We can loop through all available years (2000 to 2018), fetch each year's crashes separately, and concatenate the results to build the full dataset:

In [17]:
%%time
limit = 32000  # per-request record limit; must exceed the number of crashes in any single year
df_list = []

# One request per year, 2000 to 2018 inclusive
for year in range(2000, 2019):

    data_url = 'https://www.data.qld.gov.au/api/3/action/datastore_search?' \
        + 'resource_id=e88943c0-5968-4972-a15f-38e120d72ec0' \
        + '&limit=' + str(limit) \
        + '&q=' + '{"Crash_Year":"' + str(year) + '"}'

    qld_cc_json = requests.get(data_url).json()
    df_list.append(pd.DataFrame(qld_cc_json['result']['records']))

# Stack the yearly frames into one DataFrame
qld_cc = pd.concat(df_list)
qld_cc.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 328247 entries, 0 to 12648
Data columns (total 54 columns):
_id                                328247 non-null int64
Crash_Ref_Number                   328247 non-null int64
Crash_Severity                     328247 non-null object
Crash_Year                         328247 non-null int64
Crash_Month                        328247 non-null object
Crash_Day_Of_Week                  328247 non-null object
Crash_Hour                         328247 non-null int64
Crash_Nature                       328247 non-null object
Crash_Type                         328247 non-null object
Crash_Longitude_GDA94              328247 non-null float64
Crash_Latitude_GDA94               328247 non-null float64
Crash_Street                       328247 non-null object
Crash_Street_Intersecting          328247 non-null object
State_Road_Name                    328247 non-null object
Loc_Suburb                         328247 non-null object
Loc_Local_Government_A