# <b> <span style="color:white">Electricity Sector Data Streaming & Analysis</span></b>


# <b> <span style="color:white">GROUP 04</span></b>


| Name                   | SID       | Unikey   |
| ---------------------- | --------- | -------- |
| Putu Eka Udiyani Putri | 550067302 | pput0940 |
| Rengga Firmandika      | 550126632 | rfir0117 |
| Vincentius Ansel Suppa | 550206406 | vsup0468 |


## <b> <span style="color:orange">0. Configuration and Import Required Libraries</span></b>


This notebook acquires, integrates, augments, and stores Australian electricity/emissions datasets:

- [NGER](https://data.cer.gov.au/datasets/NGER/ID0243) (emissions & generation, 2014–2024)
- [CER](https://cer.gov.au/markets/reports-and-data/large-scale-renewable-energy-data) (approved/committed/probable projects)
- [ABS](https://www.abs.gov.au/methodologies/data-region-methodology/2011-24#data-downloads) (population & industry by state)
- Geocoding augmentation via OpenStreetMap Nominatim

Outputs are loaded into a DuckDB database.

**Quick start:**
1. Project structure:
   
   <pre>
   Assignment2_Tut07_G04/
   ├── Assignment_2.ipynb      # main notebook
   ├── requirements.txt        # list of required libraries to run the notebook
   └── AUGMENTED/      # geocoded data
   </pre>

   Ensure your working directory is writable.

2. Create venv & install exact dependencies<br/>
   `python -m venv .venv`<br/>
   Windows: `.\.venv\Scripts\activate` | macOS/Linux: `source .venv/bin/activate`<br/>
   `python -m pip install --upgrade pip`<br/>
   `pip install -r requirements.txt`

3. Copy `.env.template` to `.env` and replace `your_api_key` with your actual API key (read by the notebook as `OPENELECTRICITY_API_KEY`).

4. Run the full pipeline (extract -> clean -> augment -> transform -> load).

**Notes:**

1. Geocoder fallbacks are flagged in `geo_resolution` (exact -> approximated by postcode -> approximated by state).
2. Augmentation can take a considerable amount of time. Previous API responses are cached in `augmented_dataset.txt`, so reruns do not hit the API again. To redo the geocoding from scratch, delete `augmented_dataset.txt` from the folder.
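The two notes above can be sketched as follows. This is a minimal illustration, not the notebook's actual geocoding code: the helper names `load_cache` and `geocode_with_fallback`, the JSON-lines cache layout, the injected `lookup` callable (standing in for a Nominatim request), and the `"unresolved"` label are all assumptions made for the example. Only the three `geo_resolution` levels and the cache file name come from the notes.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Optional, Tuple

Coords = Tuple[float, float]
Lookup = Callable[[str], Optional[Coords]]  # hypothetical stand-in for a geocoder call


def load_cache(path: Path) -> Dict[str, dict]:
    """Read cached results (assumed format: one JSON object per line)."""
    cache: Dict[str, dict] = {}
    if path.exists():
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                record = json.loads(line)
                cache[record["key"]] = record
    return cache


def geocode_with_fallback(name: str, postcode: str, state: str,
                          lookup: Lookup, cache: Dict[str, dict],
                          cache_path: Path) -> dict:
    """Return a cached result if present; otherwise try name, postcode, then state."""
    key = f"{name}|{postcode}|{state}"
    if key in cache:
        return cache[key]  # no API call on reruns

    attempts = (
        (name, "exact"),
        (postcode, "approximated by postcode"),
        (state, "approximated by state"),
    )
    # "unresolved" is an assumed label for the case where every attempt fails
    record = {"key": key, "lat": None, "lon": None, "geo_resolution": "unresolved"}
    for query, resolution in attempts:
        coords = lookup(query) if query else None
        if coords is not None:
            record = {"key": key, "lat": coords[0], "lon": coords[1],
                      "geo_resolution": resolution}
            break

    # Persist immediately so an interrupted run keeps what it already resolved
    cache[key] = record
    with cache_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Deleting the cache file makes `load_cache` return an empty dictionary, which forces every record back through the lookup, matching the "redo from scratch" behaviour described in note 2.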

Import all the required libraries first.


In [1]:
from dotenv import load_dotenv
import os

import requests
import pandas as pd
from datetime import datetime, timedelta
import time

## <b> <span style="color:orange">1. Data Retrieval</span></b>


### <b> <span style="color:pink">1.1 National Greenhouse and Energy Reporting (NGER)</span></b>


This dataset consists of 10 annual CSV files released by the Clean Energy Regulator, covering the years 2014 to 2024.  
The files contain information on electricity generation and emissions intensity from facilities that are connected to major electricity networks in Australia.


In [None]:
# to get the API KEY from .env file
load_dotenv()

# basic configs
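API_KEY = os.getenv("OPENELECTRICITY_API_KEY")
if not API_KEY:
    # os.getenv returns None when the variable is missing; fail early with a clear message
    raise RuntimeError("OPENELECTRICITY_API_KEY is not set; check your .env file")
# remove stray whitespace and quotes that may be left over from the .env entry
API_KEY = API_KEY.strip().strip('"').strip("'")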
BASE_URL = "https://api.openelectricity.org.au/v4"
START_DATE = datetime(2025, 10, 1) 
END_DATE = START_DATE + timedelta(days=7) 
DATA_ENDPOINTS = {
    "facility": "facilities"
}
ENDPOINT = "data/facilities/NEM"
PARAMS1 = {
    'network_code': 'NEM',
    'metrics': 'power',
    'interval': '5m',
    'start_date': START_DATE.strftime('%Y-%m-%d'),  # Format dates as strings
    'end_date': END_DATE.strftime('%Y-%m-%d')
}

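def fetch_data_from_API(endpoint: str, query_params: dict):
    """GET `endpoint` under BASE_URL and return the parsed JSON, or None on failure."""
    # Join base and endpoint with exactly one slash between them
    # (trailing slash kept, as in the logged request URL)
    url = f"{BASE_URL.rstrip('/')}/{endpoint.strip('/')}/"

    headers = {
        "Authorization": f"Bearer {API_KEY}",
        "Accept": "application/json",
    }

    try:
        print(f"Making request to: {url}")
        # Mask the bearer token so the key never ends up in the notebook output
        print(f"With headers: {dict(headers, Authorization='Bearer ***')}")
        print(f"With params: {query_params}")

        # A timeout stops the cell from hanging forever on a stalled connection
        response = requests.get(url, headers=headers, params=query_params, timeout=30)

        print(f"Response status: {response.status_code}")
        print(f"Response url: {response.url}")

        if response.status_code == 200:
            return response.json()

        print(f"   API Error {response.status_code}: {response.text}")
        print(f"   Response headers: {dict(response.headers)}")
        try:
            print(f"   Error details: {response.json()}")
        except ValueError:
            # The error body was not valid JSON
            print("   Could not parse error response as JSON")
        return None
    except requests.exceptions.RequestException as e:
        print(f"[EXCEPTION] Request failed: {e}")
        return None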


In [3]:
# Confirm the key loaded without echoing the full secret
print(f"Bearer {API_KEY[:3]}***")

Bearer oe_***


In [None]:
BASE_URL = "https://api.openelectricity.org.au/v4"
ENDPOINT = "data/facilities/NEM"
url = f"{BASE_URL}/{ENDPOINT}/"  # explicit slashes: neither constant carries its own separator
url

'https://api.openelectricity.org.au/v4/data/facilities/NEM/'
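Building a URL by plain string concatenation is sensitive to whether the base ends with a slash and whether the endpoint starts with one. The sketch below shows one way to normalise this. `join_url` is a hypothetical helper introduced only for illustration and is not part of `requests` or of the OpenElectricity client. The two constants are copied from the configuration cell.

```python
def join_url(base: str, endpoint: str, trailing_slash: bool = False) -> str:
    """Join base and endpoint with exactly one '/' between them."""
    url = f"{base.rstrip('/')}/{endpoint.strip('/')}"
    return url + "/" if trailing_slash else url


BASE_URL = "https://api.openelectricity.org.au/v4"
ENDPOINT = "data/facilities/NEM"

# Plain concatenation loses the separator between "v4" and "data"
print(BASE_URL + ENDPOINT)
# -> https://api.openelectricity.org.au/v4data/facilities/NEM

# The helper inserts it regardless of how the two parts are written
print(join_url(BASE_URL, ENDPOINT))
# -> https://api.openelectricity.org.au/v4/data/facilities/NEM
```

Passing `trailing_slash=True` reproduces the form with a final `/`. Which form the API treats as canonical is not stated in this notebook, so the flag is left to the caller.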

In [6]:
data = fetch_data_from_API(endpoint=ENDPOINT, query_params=PARAMS1)
data


Making request to: https://api.openelectricity.org.au/v4/data/facilities/NEM/
With headers: {'Authorization': 'Bearer ***', 'Content-Type': 'application/json', 'Accept': '*/*', 'Connection': 'keep-alive', 'User-Agent': 'DEAssignment2'}
With params: {'network_code': 'NEM', 'metrics': 'power', 'interval': '5m', 'start_date': '2025-10-01', 'end_date': '2025-10-08'}
Response status: 403
Response url: https://api.openelectricity.org.au/v4/data/facilities/NEM?network_code=NEM&metrics=power&interval=5m&start_date=2025-10-01&end_date=2025-10-08
   API Error 403: {"version":"4.3.0","response_status":"ERROR","error":"Not authenticated","success":false}
   Response headers: {'Date': 'Sun, 19 Oct 2025 03:43:00 GMT', 'Content-Type': 'application/json', 'Transfer-Encoding': 'chunked', 'Connection': 'keep-alive', 'Report-To': '{"group":"cf-nel","max_age":604800,"endpoints":[{"url":"https://a.nel.cloudflare.com/report/v4?s=3Ux4T%2BzdiLwXs38FB6N%2F%2BGdPolqz96gWfJPtsdyyYv77qzlv5

## <b> <span style="color:orange">2. Data Integration and Caching</span></b>


## <b> <span style="color:orange">3. Data Publishing via MQTT</span></b>


## <b> <span style="color:orange">4. Dashboard and Visualisation</span></b>
