# Generate a list of DST transition dates

We need a list of the dates on which daylight saving transitions happened.

Python can convert from timezones with DST to ones without (e.g. `Australia/Sydney` to `Australia/Brisbane`). Therefore Python's timezone libraries (here `pytz`, which bundles the tz database) must contain the raw data about when timezone changes happen.
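As a small illustration of that conversion, here is a sketch using `pytz`. The helper name `sydney_to_brisbane` and the two sample dates are made up for this example and are not part of the script below.

```python
import datetime as dt

import pytz

sydney = pytz.timezone('Australia/Sydney')
brisbane = pytz.timezone('Australia/Brisbane')


def sydney_to_brisbane(naive_local):
    """Treat a naive datetime as Sydney wall-clock time and convert it to Brisbane time."""
    return sydney.localize(naive_local).astimezone(brisbane)


# Midday in Sydney during summer (DST in effect) and during winter (no DST).
summer = sydney_to_brisbane(dt.datetime(2020, 1, 15, 12, 0))
winter = sydney_to_brisbane(dt.datetime(2020, 7, 15, 12, 0))
print(summer.hour, winter.hour)  # 11 12
```

The one-hour difference in summer only comes out right because the library knows when Sydney is on DST.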

This script grabs that data, and writes it into a convenient form.

We do this instead of downloading from an official site such as [this](https://www.nsw.gov.au/about-nsw/daylight-saving), because the official site has one page per year. Going through each page by hand would be tedious and error-prone.

Note that over the last decade or two, all regions with DST have transitioned on the same day, and they all shift forward/back by 1 hour.
SA, however, is permanently half an hour behind VIC, NSW and TAS, so it actually moves its clocks half an hour after the others.
For now we only look at dates and ignore that, since in practice people physically change their clocks just after dinnertime. If anything, people may go to bed earlier or later a few hours before a DST transition, so the exact 2am transition time is not really relevant for us. That said, this scripted approach would let us add time-of-day easily if we have enough time.
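The same-day claim and the half-hour offset can be spot-checked with a sketch like the one below. It assumes the same private `pytz` attribute that the script in this notebook relies on. The helper `local_transition_dates`, the list `ZONES`, and the choice of 2015 as a sample year are all illustrative.

```python
import datetime as dt

import pytz

# Illustrative selection of zones that observe DST.
ZONES = ['Australia/Sydney', 'Australia/Melbourne',
         'Australia/Hobart', 'Australia/Adelaide']


def local_transition_dates(zone_name, year):
    """Local calendar dates in `year` on which `zone_name` changes its UTC offset."""
    tz = pytz.timezone(zone_name)
    dates = []
    for utc_naive in tz._utc_transition_times:  # private pytz attribute, naive UTC
        if utc_naive.year != year:
            continue
        local = pytz.utc.localize(utc_naive).astimezone(tz)
        dates.append(local.date())
    return dates


for zone in ZONES:
    print(zone, local_transition_dates(zone, 2015))

# Gap between Sydney and Adelaide UTC offsets at one instant in winter.
instant = pytz.utc.localize(dt.datetime(2015, 7, 1))
winter_gap = (instant.astimezone(pytz.timezone('Australia/Sydney')).utcoffset()
              - instant.astimezone(pytz.timezone('Australia/Adelaide')).utcoffset())
print(winter_gap)  # 0:30:00
```

For 2015 every zone should report the same two local dates, while the offsets differ by 30 minutes.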

Note that the dates are saved to CSV as `yyyy-mm-dd`. I have verified manually that R `read_csv` interprets this correctly.
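For reference, the same format also round-trips through pandas. This is a minimal sketch using an in-memory buffer and two sample rows chosen for illustration. It does not reproduce the manual R check.

```python
import datetime as dt
import io

import pandas as pd

# Two sample rows in the same shape as the output of this notebook.
sample = pd.DataFrame({
    'date': [dt.date(2015, 4, 5), dt.date(2015, 10, 4)],
    'direction': ['stop', 'start'],
})

# Write to an in-memory buffer instead of a file.
buffer = io.StringIO()
sample.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)

# Read it back, asking pandas to parse the date column.
restored = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])
print(restored.dtypes)
```

`datetime.date` values are written in ISO 8601 form (`yyyy-mm-dd`), which is unambiguous about day and month order.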

In [None]:
import datetime as dt

import pytz
import pandas as pd

In [None]:
# AEMO data is in "market time", which is Brisbane time (no DST), UTC+10.
MARKET_TIME_OFFSET = 10 

In [None]:
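data = []
tz = pytz.timezone('Australia/Sydney')
# pytz keeps two parallel private lists: naive UTC datetimes and (utcoffset, dst, tzname) tuples.
assert len(tz._utc_transition_times) == len(tz._transition_info)
start = dt.datetime(year=2000, month=1, day=1)
# Transition times are naive UTC, so compare against the current time in UTC, not local time.
now_utc = dt.datetime.now(dt.timezone.utc).replace(tzinfo=None)
for transition_utc, (utcoffset, dst, tz_acronym) in zip(tz._utc_transition_times, tz._transition_info):
    if start < transition_utc < now_utc:
        # Shift to market time first: 16:00 UTC on Saturday is 02:00 on Sunday in market time.
        transition_market = transition_utc + dt.timedelta(hours=MARKET_TIME_OFFSET)
        data.append({
            'date': transition_market.date(),
            # A positive DST offset after the transition means DST has just started.
            'direction': 'start' if (dst > dt.timedelta(0)) else 'stop',
        })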

df = pd.DataFrame(data)
df.to_csv('data/02-dst-dates.csv', index=False)