<div class="usecase-title">API v2.1 Tutorial: The City of Melbourne (CoM) API is organised around REST using Opendatasoft Explore API v2. It provides access to all the data available through the platform in a hierarchical structure.</div>

<div class="usecase-authors"><b>Authored by: </b>Te' Claire</div>

<div class="usecase-date"><b>Date: </b> March-July 2024</div>

<div class="usecase-duration"><b>Duration:</b> 40 mins</div>

<div class="usecase-level-skill">
    <div class="usecase-level"><b> Level: </b>Beginner</div>
    <div class="usecase-skill"><b> Pre-requisite Skills: </b>Python; <i>optional:</i> Google Colaboratory (Colab) access</div>
</div>

<div class="usecase-subsection-blurb">
  <i>Link 1:</i> API Explore v2.1 Console
  <br>
  <a href="https://data.melbourne.vic.gov.au/api/explore/v2.1/console" target="_blank">Link</a>
  <br>
</div>
<br>

##### Context: Guidance on using the City of Melbourne (CoM) API.
1. API and GitHub (Cloud) IDE
2. exports endpoint (no limit on the number of returned records)
3. records endpoint (limited number of returned records)


---

###### CoM API endpoints allow you to:
- Enumerate datasets
- List export formats
- Export data
- List facet values
- Read individual dataset records
<br>


###### Catalog API
* `GET /catalog/datasets` <br>
`GET https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets`
- **Purpose** To list all the datasets available in the catalog
  - Used to get an overview of the datasets available in the system

* `GET /catalog/exports`
- **Purpose** To list all export formats that the catalog supports
  - Useful for understanding the formats in which the data can be exported (for example CSV, JSON)

  
* `GET /catalog/exports/{format}`
- **Purpose** To export the entire catalog in a specific format
  - Used when you want to download the entire catalog in one of the supported formats

* `GET /catalog/exports/csv`
- **Purpose**  Specifically for exporting the catalog in CSV format.
  - A direct endpoint for exporting data in a common, easily usable format.

* `GET /catalog/exports/dcat{dcat_ap_format}`
- **Purpose** To export the catalog in RDF/XML format using DCAT
  - Exporting data in a format that's suitable for integrating with other data catalogs or systems following the DCAT standard

* `GET /catalog/facets`
- **Purpose** To list all the facet values available in the catalog
  - Facets are used to filter or categorise datasets and help you understand how the catalog is organised

* `GET /catalog/datasets/{dataset_id}`
- **Purpose** To show detailed information about a specific dataset
  - When you need metadata or details about a particular dataset


###### Dataset API

* `GET /catalog/datasets/{dataset_id}/records`
- **Purpose** To query records within a specific dataset
  - To retrieve the data entries or records from a specific dataset

* `GET /catalog/datasets/{dataset_id}/exports`
- **Purpose** To list the export formats available for a specific dataset
  - Shows the formats in which you can export the data from this dataset
  
* `GET /catalog/datasets/{dataset_id}/exports/{format}`
- **Purpose** To export a specific dataset in a specified format
  - To download data from a specific dataset in a particular format

* `GET /catalog/datasets/{dataset_id}/exports/csv`
- **Purpose** To export a specific dataset in CSV format
  - Direct endpoint for exporting dataset data in CSV, a commonly used data format

* `GET /catalog/datasets/{dataset_id}/exports/gpx`
- **Purpose** To export a specific dataset in GPX format
  - Useful for datasets related to geographical data, which GPX format is well-suited for

* `GET /catalog/datasets/{dataset_id}/facets`
- **Purpose** To list the facets for a specific dataset
  - To get an understanding of the different dimensions or categories within a dataset

* `GET /catalog/datasets/{dataset_id}/attachments`
- **Purpose** To list attachments for a specific dataset
  - When datasets have additional files or documents attached, this endpoint lets you enumerate them
* `GET /catalog/datasets/{dataset_id}/records/{record_id}`
- **Purpose** To read a specific record within a dataset
  - To get detailed information about a particular entry or record in a dataset
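
The paths listed above are relative to the Explore API v2.1 base URL used in the code cells of this tutorial. The following is a minimal sketch of how a path template can be turned into a full URL; the helper name `endpoint_url` is hypothetical and only illustrates the string handling, it does not send any request.

```python
# Base URL taken from the code cells in this tutorial.
BASE_URL = 'https://data.melbourne.vic.gov.au/api/explore/v2.1'


def endpoint_url(path, **path_params):
    """Fill the {placeholders} in an endpoint path and prefix the base URL.

    Hypothetical helper for illustration only.
    """
    return BASE_URL + path.format(**path_params)


catalog_url = endpoint_url('/catalog/datasets')
records_url = endpoint_url(
    '/catalog/datasets/{dataset_id}/records',
    dataset_id='pedestrian-counting-system-monthly-counts-per-hour',
)
export_url = endpoint_url(
    '/catalog/datasets/{dataset_id}/exports/{format}',
    dataset_id='pedestrian-counting-system-monthly-counts-per-hour',
    format='csv',
)

print(catalog_url)
print(records_url)
print(export_url)
```

Keeping the templates identical to the documented paths makes it easier to check a URL against the API console.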


###### Load Dependencies

In [3]:
# Dependencies
import warnings
warnings.filterwarnings("ignore")

import requests
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)

###### Cloud or Local IDE (Run notebook/ script)
- To load the API key from a text file stored in Google Drive (mounted in Google Colab)

In [4]:
from google.colab import drive
drive.mount('/content/drive')
with open('/content/drive/My Drive/SIT378/h.txt', 'r') as file:
    api_key = file.read().strip()

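import os
# Local IDE alternative: prefer an environment variable when one is set,
# otherwise keep the key read from the file above.
# 'COM_API_KEY' is an assumed variable name; use the name you exported.
api_key = os.getenv('COM_API_KEY', api_key)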

Mounted at /content/drive
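
For a notebook that should run both in Colab and in a local IDE, the key lookup can be wrapped in one function. This is a sketch under assumptions: the function name `load_api_key` and the environment variable name `COM_API_KEY` are hypothetical, and returning `None` when no key is found relies on the public datasets being readable without a key.

```python
import os
import tempfile
from pathlib import Path


def load_api_key(key_file=None, env_var='COM_API_KEY'):
    """Return the key from the environment, else from a text file, else None."""
    key = os.getenv(env_var)
    if key:
        return key.strip()
    if key_file and Path(key_file).is_file():
        return Path(key_file).read_text().strip()
    return None


# Demo with a throwaway file so that no real key is needed.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / 'key.txt'
    demo_file.write_text('demo-key\n')
    demo_key = load_api_key(demo_file, env_var='COM_API_KEY_DEMO_UNSET')
    print(demo_key)
```

In Colab, the `key_file` argument would be the path under `/content/drive` used in the cell above.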


##### **Preferred Method**: Export Endpoint
##### Single Request for CSV File Download
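`GET /catalog/datasets/{dataset_id}/exports/{format}`
- Accepts ODSQL query parameters (for example `select` and `limit`) and exports CSV, JSON or another supported format
- The response is read directly into a dataframe
- `response.content.decode('utf-8')` decodes the binary response body into a UTF-8 string
- The CSV export uses a semicolon (;) as the delimiter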
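
The decoding and delimiter steps can be tried offline before calling the API. The sketch below uses a small byte string as a stand-in for `response.content`; the three columns are a subset chosen for illustration, with two rows copied from the dataset preview in this tutorial.

```python
from io import StringIO

import pandas as pd

# Stand-in for response.content: semicolon-delimited CSV as raw bytes.
sample_bytes = (
    b'sensor_name;timestamp;total_of_directions\n'
    b'SprFli_T;2023-04-24T21:00:00+00:00;53\n'
    b'SprFli_T;2023-04-25T00:00:00+00:00;78\n'
)

# Decode the bytes to text, then parse with the semicolon delimiter.
text = sample_bytes.decode('utf-8')
sample_df = pd.read_csv(StringIO(text), delimiter=';')

print(sample_df.shape)
print(sample_df.columns.tolist())
```

Reading the same text with the default comma delimiter would put every row into a single column, which is a quick way to spot a wrong delimiter.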

In [5]:
# **Preferred Method**: Export Endpoint
import requests
import pandas as pd
from io import StringIO

# https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour/information/
dataset_id = 'pedestrian-counting-system-monthly-counts-per-hour'

base_url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/'
export_format = 'csv'  # named export_format to avoid shadowing the built-in format()

url = f'{base_url}{dataset_id}/exports/{export_format}'
params = {
    'select': '*',
    'limit': -1,  # -1 returns all records
    'lang': 'en',
    'timezone': 'UTC',
    'apikey': api_key  # Opendatasoft expects the query parameter name 'apikey'
}

# GET request
response = requests.get(url, params=params)

if response.status_code == 200:
    # StringIO to read the CSV data
    url_content = response.content.decode('utf-8')
    pedestrian_hour = pd.read_csv(StringIO(url_content), delimiter=';')
    print(pedestrian_hour.sample(10, random_state=999)) # Test
else:
    print(f'Request failed with status code {response.status_code}')

       sensor_name                  timestamp  locationid  direction_1  \
299310    Col700_T  2023-06-11T00:00:00+00:00           9           72   
273077    SprFli_T  2024-01-16T19:00:00+00:00          75           30   
230538    Bou688_T  2023-08-22T21:00:00+00:00          58          794   
545967    Eli501_T  2024-03-08T17:00:00+00:00          49           24   
41658     BouHbr_T  2023-07-06T06:00:00+00:00          10          275   
311686    BouBri_T  2023-06-03T18:00:00+00:00          57            2   
503749      ACMI_T  2024-02-27T12:00:00+00:00          72           10   
362060     Col12_T  2023-12-08T17:00:00+00:00          18          471   
90696         AG_T  2023-09-29T13:00:00+00:00          29           71   
340770      ACMI_T  2023-08-06T06:00:00+00:00          72          352   

        direction_2  total_of_directions                    location  
299310           93                  165  -37.81982992, 144.95102555  
273077           18                   48  -

###### Check number of records in dataset (dataset_id)

In [6]:
# Check number of records in dataset (dataset_id)
num_records = len(pedestrian_hour)
print(f'The dataset contains {num_records} records.')

The dataset contains 549976 records.


In [7]:
# View dataset
pedestrian_hour.head()

| | sensor_name | timestamp | locationid | direction_1 | direction_2 | total_of_directions | location |
|---|---|---|---|---|---|---|---|
| 0 | SprFli_T | 2023-04-24T21:00:00+00:00 | 75 | 36 | 17 | 53 | -37.81515276, 144.97467661 |
| 1 | SprFli_T | 2023-04-25T00:00:00+00:00 | 75 | 28 | 50 | 78 | -37.81515276, 144.97467661 |
| 2 | SprFli_T | 2023-04-25T01:00:00+00:00 | 75 | 63 | 63 | 126 | -37.81515276, 144.97467661 |
| 3 | SprFli_T | 2023-04-25T02:00:00+00:00 | 75 | 85 | 89 | 174 | -37.81515276, 144.97467661 |
| 4 | SprFli_T | 2023-04-25T08:00:00+00:00 | 75 | 365 | 59 | 424 | -37.81515276, 144.97467661 |
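
In the CSV export the `location` column arrives as a single "latitude, longitude" string. A minimal sketch of splitting it into two numeric columns follows; the column names `latitude` and `longitude` are assumptions made for this example, and the two sample rows are copied from the outputs above.

```python
import pandas as pd

sample = pd.DataFrame({
    'sensor_name': ['SprFli_T', 'Col700_T'],
    'location': ['-37.81515276, 144.97467661', '-37.81982992, 144.95102555'],
})

# Split on the comma, strip the leftover space, and convert to floats.
parts = sample['location'].str.split(',', expand=True)
sample['latitude'] = parts[0].str.strip().astype(float)
sample['longitude'] = parts[1].str.strip().astype(float)

print(sample[['sensor_name', 'latitude', 'longitude']])
```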




---



##### **Example: Catalog API to enumerate datasets** <br>
`GET /catalog/datasets`  <br>
`GET https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets`
- Lists all datasets available in the Melbourne data catalog

###### The `limit` parameter controls the number of records or datasets returned in the response.
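
Because `limit` caps each response, listing the whole catalog takes several requests with an increasing `offset`. The sketch below only generates the parameter sets; it assumes a page size of 100, and the helper name `page_params` is hypothetical. The total of 228 is the `total_count` reported in the response shown in this section.

```python
def page_params(total_count, limit=100):
    """Yield one params dict per page needed to cover total_count items."""
    for offset in range(0, total_count, limit):
        yield {'limit': limit, 'offset': offset}


pages = list(page_params(228))
for page in pages:
    print(page)
```

Each dictionary can be merged into the `params` passed to `requests.get`, one request per page.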

In [8]:
import requests
url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets'
params = {
    'select': '*',
    # 'limit': 10,
    'offset': 0,
    'timezone': 'UTC',
    'include_links': 'false',
    'include_app_metas': 'false'
}
headers = {
    'accept': 'application/json; charset=utf-8'
}

# GET request
response = requests.get(url, headers=headers, params=params)

if response.status_code == 200: # Status code Check
    # Successful
    print(response.json())
else:
    # Error
    print(f'Request failed with status code {response.status_code}')


{'total_count': 228, 'results': [{'visibility': 'domain', 'dataset_id': 'pay-stay-parking-restrictions', 'dataset_uid': 'da_cs4ajy', 'has_records': True, 'features': ['analyze'], 'attachments': [], 'alternative_exports': [], 'data_visible': True, 'fields': [{'name': 'pay_stay_zone', 'description': 'A collection of 1 or more parking bays in which the same restrictions apply', 'annotations': {}, 'label': 'pay_stay_zone', 'type': 'int'}, {'name': 'day_of_week', 'description': 'Day of the week the restriction applies.\n1 - Sunday\n2- Monday\n3- Tuesday\n4- Wednesday\n5- Thursday\n6- Friday\n7- Saturday\n\n', 'annotations': {'facetsort': '-count'}, 'label': 'day_of_week', 'type': 'text'}, {'name': 'start_time', 'description': 'What time the restriction starts', 'annotations': {'facetsort': '-count'}, 'label': 'start_time', 'type': 'text'}, {'name': 'end_time', 'description': 'What time the restriction ends', 'annotations': {'facetsort': '-count'}, 'label': 'end_time', 'type': 'text'}, {'nam
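
The raw JSON is hard to read when printed in full. A short sketch of pulling out only the dataset identifiers is shown below; `sample_response` is a heavily trimmed stand-in for `response.json()`, keeping just the keys the example needs, and `dataset_ids` is a hypothetical helper name.

```python
# Trimmed stand-in for response.json(); real entries carry many more keys.
sample_response = {
    'total_count': 228,
    'results': [
        {'dataset_id': 'pay-stay-parking-restrictions', 'has_records': True},
        {'dataset_id': 'pedestrian-counting-system-monthly-counts-per-hour',
         'has_records': True},
    ],
}


def dataset_ids(payload):
    """Return the dataset_id of every entry in a catalog response."""
    return [item['dataset_id'] for item in payload.get('results', [])]


ids = dataset_ids(sample_response)
print(sample_response['total_count'], 'datasets in the catalog')
print(ids)
```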

##### **Example: Show dataset Information** <br>
GET /catalog/datasets/{dataset_id}  <br>
`GET https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/pedestrian-counting-system-monthly-counts-per-hour`
- Shows the metadata of a single dataset (fields, features, attachments)

In [9]:
# dataset_id
# https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour/information/
dataset_id = 'pedestrian-counting-system-monthly-counts-per-hour'

In [10]:
import requests
url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/' + dataset_id
# Portal page for this dataset (for reference only; this is not the API URL):
# https://data.melbourne.vic.gov.au/explore/dataset/pedestrian-counting-system-monthly-counts-per-hour/information/

params = {
    'select': '*',
    'lang': 'en',
    'timezone': 'UTC',
    'include_links': 'false',
    'include_app_metas': 'false'
}
headers = {
    'accept': 'application/json; charset=utf-8'
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)
if response.status_code == 200:
    # Successful
    print(response.json())
else:
    # Error
    print(f'Request failed with status code {response.status_code}')


{'visibility': 'domain', 'dataset_id': 'pedestrian-counting-system-monthly-counts-per-hour', 'dataset_uid': 'da_3k86ps', 'has_records': True, 'features': ['geo', 'analyze', 'timeserie'], 'attachments': [{'id': 'pedestrian_counting_system_monthly_counts_per_hour_may_2009_to_14_dec_2022_csv_zip', 'title': 'Pedestrian_Counting_System_Monthly_counts_per_hour_may_2009_to_14_dec_2022.csv.zip', 'mimetype': 'application/zip', 'url': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/pedestrian-counting-system-monthly-counts-per-hour/attachments/pedestrian_counting_system_monthly_counts_per_hour_may_2009_to_14_dec_2022_csv_zip'}], 'alternative_exports': [], 'data_visible': True, 'fields': [{'name': 'sensor_name', 'description': None, 'annotations': {'facet': True, 'id': True}, 'label': 'Sensor_Name', 'type': 'text'}, {'name': 'timestamp', 'description': 'Hourly sensor reading time of Direction 1 and Direction 2 sensors', 'annotations': {'facet': True, 'facetsort': '-alphanum',
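
The metadata response can be reduced to the parts that matter when planning an analysis. The sketch below works on `sample_metadata`, a trimmed stand-in for `response.json()` that keeps one field and the one attachment visible in the output above; `summarise_metadata` is a hypothetical helper name.

```python
# Trimmed stand-in for response.json(); values copied from the output above.
sample_metadata = {
    'dataset_id': 'pedestrian-counting-system-monthly-counts-per-hour',
    'features': ['geo', 'analyze', 'timeserie'],
    'attachments': [{
        'id': 'pedestrian_counting_system_monthly_counts_per_hour_may_2009_to_14_dec_2022_csv_zip',
        'title': 'Pedestrian_Counting_System_Monthly_counts_per_hour_may_2009_to_14_dec_2022.csv.zip',
        'mimetype': 'application/zip',
    }],
    'fields': [
        {'name': 'sensor_name', 'label': 'Sensor_Name', 'type': 'text'},
    ],
}


def summarise_metadata(metadata):
    """Collect field types and attachment titles from a dataset response."""
    return {
        'dataset_id': metadata.get('dataset_id'),
        'field_types': {f['name']: f['type'] for f in metadata.get('fields', [])},
        'attachment_titles': [a['title'] for a in metadata.get('attachments', [])],
    }


summary = summarise_metadata(sample_metadata)
print(summary['field_types'])
print(summary['attachment_titles'])
```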

###### Check available export formats for the catalog

In [11]:
import requests
url = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports'
params = {
    'select': '*',
    'lang': 'en',
    'timezone': 'UTC',
    'include_links': 'false',
    'include_app_metas': 'false'
}
headers = {
    'accept': 'application/json; charset=utf-8'
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)
if response.status_code == 200:
    # Successful
    print(response.json())
else:
    # Error
    print(f'Request failed with status code {response.status_code}')


{'links': [{'rel': 'self', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports'}, {'rel': 'csv', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/csv'}, {'rel': 'json', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/json'}, {'rel': 'data.json', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/data.json'}, {'rel': 'rdf', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/rdf'}, {'rel': 'ttl', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/ttl'}, {'rel': 'dcat', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/dcat'}, {'rel': 'rss', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/rss'}, {'rel': 'sitemap', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/sitemap'}, {'rel': 'xlsx', 'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/ex
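
Each entry in `links` pairs a format name (`rel`) with its export URL (`href`). A minimal sketch of turning that list into a lookup table follows; `sample_exports` keeps only the first three links from the output above, and `export_formats` is a hypothetical helper name.

```python
# First three links from the output above; the real list is longer.
sample_exports = {
    'links': [
        {'rel': 'self',
         'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports'},
        {'rel': 'csv',
         'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/csv'},
        {'rel': 'json',
         'href': 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/exports/json'},
    ]
}


def export_formats(payload):
    """Map each export format name to its URL, skipping the 'self' link."""
    return {
        link['rel']: link['href']
        for link in payload.get('links', [])
        if link['rel'] != 'self'
    }


formats = export_formats(sample_exports)
print(sorted(formats))
```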



---



###### Records Endpoint
###### The `fetch_data` function iterates over the data in chunks (`num_records` and `offset`) until all records are retrieved or the maximum offset is reached.
- This endpoint returns a limited number of records: at most 100 per request, and `offset + limit` must stay below 10,000
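
The `fetch_data` function in this section uses `num_records=99` and a maximum offset of 9900. The arithmetic behind that choice can be checked without calling the API; this sketch lists the offsets the loop would request and the end of the last window.

```python
num_records = 99    # page size used by fetch_data
max_offset = 9900   # largest offset fetch_data will request

# Offsets requested when every page comes back full.
offsets = list(range(0, max_offset + 1, num_records))
last_window_end = offsets[-1] + num_records

print(len(offsets), 'requests at most')
print('last offset:', offsets[-1])
print('offset + limit on the last request:', last_window_end)
```

With these values the final window ends at 9,999, which keeps the last request under the 10,000 cap. Datasets with more rows than that need the exports endpoint.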

In [12]:
import requests
import pandas as pd
def fetch_data(base_url, dataset, api_key, num_records=99, offset=0):
    """Page through the records endpoint and return all rows as a DataFrame."""
    all_records = []
    max_offset = 9900  # keeps offset + limit below the 10,000 cap

    url = f'{base_url}{dataset}/records'
    while offset <= max_offset:
        # requests omits parameters whose value is None, so a missing key is not sent
        params = {'limit': num_records, 'offset': offset, 'apikey': api_key}

        try:
            result = requests.get(url, params=params, timeout=10)
            result.raise_for_status()
            records = result.json().get('results')
        except requests.exceptions.RequestException as e:
            raise RuntimeError(f'API request failed: {e}') from e
        if not records:
            break  # no more data
        all_records.extend(records)
        if len(records) < num_records:
            break  # last (partial) page reached

        offset += num_records

    return pd.DataFrame(all_records)

BASE_URL = 'https://data.melbourne.vic.gov.au/api/explore/v2.1/catalog/datasets/'
API_KEY = api_key

In [13]:
# data set name
SENSOR_DATASET = 'on-street-parking-bay-sensors'
df = fetch_data(BASE_URL, SENSOR_DATASET, API_KEY)
df

| | lastupdated | status_timestamp | zone_number | status_description | kerbsideid | location |
|---|---|---|---|---|---|---|
| 0 | 2023-12-14T04:45:34+00:00 | 2023-12-14T03:41:25+00:00 | 7695.0 | Unoccupied | 22959 | {'lon': 144.95938672872117, 'lat': -37.8184477... |
| 1 | 2023-12-14T04:45:34+00:00 | 2023-12-13T06:21:58+00:00 | 7939.0 | Unoccupied | 10136 | {'lon': 144.95263753679632, 'lat': -37.8099909... |
| 2 | 2023-12-14T04:45:34+00:00 | 2023-12-13T07:44:31+00:00 | | Unoccupied | 6992 | {'lon': 144.95965213010888, 'lat': -37.8189845... |
| 3 | 2023-12-14T23:45:34+00:00 | 2023-12-14T23:35:02+00:00 | | Unoccupied | 6527 | {'lon': 144.95642622505966, 'lat': -37.8106009... |
| 4 | 2023-12-14T23:45:34+00:00 | 2023-12-14T22:39:46+00:00 | | Unoccupied | 6526 | {'lon': 144.95649292476088, 'lat': -37.8105814... |
| ... | ... | ... | ... | ... | ... | ... |
| 6226 | 2024-03-19T22:26:27+00:00 | 2024-03-19T22:08:41+00:00 | 7630.0 | Unoccupied | 51619 | {'lon': 144.96962473165303, 'lat': -37.8158507... |
| 6227 | 2024-03-19T22:26:27+00:00 | 2024-03-19T15:13:56+00:00 | 7770.0 | Unoccupied | 65359 | {'lon': 144.96329725545985, 'lat': -37.8120799... |
| 6228 | 2024-03-19T22:26:27+00:00 | 2024-03-19T22:20:09+00:00 | 7770.0 | Present | 65344 | {'lon': 144.96442650276526, 'lat': -37.8117512... |
| 6229 | 2024-03-19T22:26:27+00:00 | 2024-03-19T21:37:18+00:00 | 7603.0 | Unoccupied | 65329 | {'lon': 144.9628107705015, 'lat': -37.81202477... |
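
Unlike the CSV export, the records endpoint returns `location` as a dictionary with `lon` and `lat` keys. The sketch below flattens that column with `pd.json_normalize`; the coordinates in `sample_records` are rounded, illustrative values rather than real sensor positions.

```python
import pandas as pd

# Illustrative records shaped like the output above (coordinates rounded).
sample_records = [
    {'kerbsideid': 22959, 'status_description': 'Unoccupied',
     'location': {'lon': 144.9594, 'lat': -37.8184}},
    {'kerbsideid': 65344, 'status_description': 'Present',
     'location': {'lon': 144.9644, 'lat': -37.8118}},
]

sensors = pd.DataFrame(sample_records)

# Expand the dictionaries into 'lon' and 'lat' columns, then drop the original.
coords = pd.json_normalize(sensors['location'].tolist())
flat = pd.concat([sensors.drop(columns='location'), coords], axis=1)

print(flat)
```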
