# Wind speed data download

M1 APPLIED ECONOMETRICS, Spring 2024

Applied Econometrics - Master TSE 1 - 2023/2024

> Sunlight Synchronization: Exploring the Influence of Daylight Saving Time on 
> CO2 Emissions and Electricity Consumption in Australia's Electricity Grid

LAST MODIFIED: 29/02/2024 

LAST MODIFIED BY: Matthew Davis

Script duration: a few minutes

Disk storage requirement: negligible (<10 MB), on top of the much larger requirements of the adjacent scripts

Bandwidth requirement: Negligible. Any normal internet connection will be sufficient.

Memory requirement: negligible. Any modern laptop/desktop will be sufficient.

--------------------

This script scrapes daily wind speed data from Willy Weather, using one weather station per region.

Example station page:
https://www.willyweather.com.au/climate/weather-stations/tas/hobart/hobart-ellerslie-road.html?superGraph=plots:wind-speed,wind-gust,grain:monthly,graphRange:1year&climateRecords=period:all-time&longTermGraph=plots:temperature,period:all-time,month:all&windRose=period:1-year,month:all-months

Units are km/h
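
If a later analysis step needs SI units, the km/h values can be converted to m/s by dividing by 3.6. The helper below is a minimal sketch; `kmh_to_ms` is a hypothetical name and is not used elsewhere in this script.

```python
KMH_PER_MS = 3.6  # 1 m/s equals 3.6 km/h


def kmh_to_ms(speed_kmh: float) -> float:
    """Convert a wind speed from km/h to m/s (hypothetical helper)."""
    return speed_kmh / KMH_PER_MS


print(kmh_to_ms(36.0))
```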


In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


In [2]:
import datetime as dt
import os
import sys

import requests
import pandas as pd
from tqdm import tqdm

In [3]:
assert sys.version_info >= (3, 6), "Python version too low."

In [4]:
# relative to this file
base_data_dir = '../data'

In [5]:
stations = [
    {
        'regionid': 'NSW1',
        'name': 'Dubbo',
        'id': 340,
    },
    {
        'regionid': 'SA1',
        'name': 'Coober Pedy',
        'id': 133
    },
    {
        'regionid': 'QLD1',
        'name': 'Longreach',
        'id': 236,
    },
    {
        'regionid': 'VIC1',
        'name': 'Bendigo',
        'id': 411
    },
    {
        'regionid': 'TAS1',
        'name': 'Hobart',
        'id': 501
    },
    
]
start_year = 2009
end_year = 2023
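
As a rough sanity check on the size of the download, the sketch below enumerates the (year, station) request windows implied by the settings above. The `windows` list is illustrative only; the station ids are copied from the `stations` list.

```python
from itertools import product

start_year, end_year = 2009, 2023
station_ids = [340, 133, 236, 411, 501]  # ids copied from the stations list

# One request per calendar year and station, each covering 1 Jan to 31 Dec.
windows = [
    (station_id, f"{year}-01-01", f"{year}-12-31")
    for year, station_id in product(range(start_year, end_year + 1), station_ids)
]

print(len(windows))  # 15 years x 5 stations = 75 requests
```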

Download the data

In [6]:
data = []
print("Starting")
for year in tqdm(range(start_year, end_year+1)):
    for station in stations:
        
        start_date = f"{year}-01-01"
        end_date = f"{year}-12-31"
        
        url = f"https://www.willyweather.com.au/climate/weather-stations/graphs.json?graph=station:{station['id']},startDate:{start_date},endDate:{end_date},grain:daily,longTermAverageType:all-time,series=order:5,id:avg-wind-speed,type:climate,series=order:6,id:max-wind-speed,type:climate,series=order:7,id:avg-wind-speed,type:climate,series=order:8,id:max-wind-direction,type:climate,series=order:9,id:long-term-avg-avg-wind-speed,type:climate,series=order:10,id:long-term-avg-max-wind-speed,type:climate"
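        # Fail fast on a hung connection or an HTTP error instead of parsing a bad body
        response = requests.get(url, timeout=30)
        response.raise_for_status()
        raw = response.json()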
        for metric in raw['data']['climateGraphs']:
            for day in raw['data']['climateGraphs'][metric]['dataConfig']['series']['groups']:
                for point in day['points']:
                    data.append({
                        'regionid': station['regionid'],
                        'metric': metric.replace('-', '_') + '_km_per_h',
                        'date': dt.datetime.strptime(day['dateTime'], "%Y-%m-%d %H:%M:%S").date(),
                        't': point['x'],
                        'value': point['y']
                    })

Starting


100%|██████████| 15/15 [02:06<00:00,  8.46s/it]
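
The nested loops in the download cell assume a particular JSON layout. The sketch below replays the same flattening on a hand-written payload so the expected shape is easy to see. The `sample` payload is hypothetical: it contains only the keys the loop reads, the `x` value is a placeholder, and the real response may carry additional fields. `flatten` is an illustrative name, not part of the script.

```python
import datetime as dt

# Hypothetical payload with only the keys the download loop reads.
sample = {
    "data": {
        "climateGraphs": {
            "avg-wind-speed": {
                "dataConfig": {
                    "series": {
                        "groups": [
                            {
                                "dateTime": "2009-01-01 00:00:00",
                                "points": [{"x": 0, "y": 14.8}],  # x is a placeholder
                            }
                        ]
                    }
                }
            }
        }
    }
}


def flatten(raw, regionid):
    """Turn one response into long-format rows, mirroring the download loop."""
    rows = []
    for metric, graph in raw["data"]["climateGraphs"].items():
        for group in graph["dataConfig"]["series"]["groups"]:
            date = dt.datetime.strptime(group["dateTime"], "%Y-%m-%d %H:%M:%S").date()
            for point in group["points"]:
                rows.append({
                    "regionid": regionid,
                    "metric": metric.replace("-", "_") + "_km_per_h",
                    "date": date,
                    "t": point["x"],
                    "value": point["y"],
                })
    return rows


rows = flatten(sample, "NSW1")
print(rows[0]["metric"], rows[0]["date"], rows[0]["value"])
```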


Data tidying

In [7]:
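df = pd.DataFrame(data)
# Long to wide: one row per (regionid, date), one column per metric
df = df.pivot(index=['regionid', 'date'], columns='metric', values='value').reset_index()
# Average within each (date, regionid); groupby also sorts rows by date, then region
df = df.groupby(['date', 'regionid']).agg('mean').reset_index()
df.head()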

metric,date,regionid,avg_wind_speed_km_per_h,max_wind_speed_km_per_h
0,2009-01-01,NSW1,14.8,42.5
1,2009-01-01,QLD1,17.3,70.2
2,2009-01-01,SA1,23.4,50.0
3,2009-01-01,TAS1,24.5,76.0
4,2009-01-01,VIC1,18.4,55.4
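
To illustrate the long-to-wide reshaping on a small scale, the sketch below rebuilds two of the preview rows from long-format records. It uses `pivot_table` rather than the `pivot` plus `groupby` pair in the cell above; this is a swapped-in alternative, not the script's own method, chosen because `pivot_table` averages duplicate (regionid, date, metric) rows where `DataFrame.pivot` would raise an error. The names `long_df` and `wide_df` are illustrative.

```python
import datetime as dt

import pandas as pd

# Long-format records shaped like the rows appended to `data`;
# values are copied from the preview output.
records = [
    {"regionid": "NSW1", "metric": "avg_wind_speed_km_per_h", "date": dt.date(2009, 1, 1), "value": 14.8},
    {"regionid": "NSW1", "metric": "max_wind_speed_km_per_h", "date": dt.date(2009, 1, 1), "value": 42.5},
    {"regionid": "TAS1", "metric": "avg_wind_speed_km_per_h", "date": dt.date(2009, 1, 1), "value": 24.5},
    {"regionid": "TAS1", "metric": "max_wind_speed_km_per_h", "date": dt.date(2009, 1, 1), "value": 76.0},
]
long_df = pd.DataFrame(records)

# One row per (date, regionid), one column per metric; duplicates are averaged.
wide_df = long_df.pivot_table(
    index=["date", "regionid"], columns="metric", values="value", aggfunc="mean"
).reset_index()
wide_df.columns.name = None  # drop the leftover "metric" label on the column index

# Each (date, regionid) pair should appear exactly once.
assert not wide_df.duplicated(subset=["date", "regionid"]).any()
print(wide_df)
```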


In [8]:
output_path = os.path.join(base_data_dir, '02-wind.csv')
df.to_csv(output_path, index=False)
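
A script that consumes this file will probably want the `date` column back as a datetime rather than a string. The sketch below shows one way to do that with `parse_dates`. It writes a two-row stand-in frame (values copied from the preview output) to a temporary directory, so it does not touch `../data`; `df_out` and `df_back` are illustrative names.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the frame written above; values copied from the preview rows.
df_out = pd.DataFrame({
    "date": ["2009-01-01", "2009-01-01"],
    "regionid": ["NSW1", "QLD1"],
    "avg_wind_speed_km_per_h": [14.8, 17.3],
    "max_wind_speed_km_per_h": [42.5, 70.2],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "02-wind.csv")
    df_out.to_csv(path, index=False)
    # parse_dates restores the date column as datetime64 instead of plain strings
    df_back = pd.read_csv(path, parse_dates=["date"])

print(df_back.dtypes)
```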