# **University of Otago Key Dates to Calendar** 

Throughout this notebook, I document the process I followed to turn the events from the [University of Otago's key dates page](https://www.otago.ac.nz/news/events/keydates/) into Google Calendar events. The techniques I used include web scraping, OAuth authentication, and calls to the Google Calendar API.

## **Problem**

The problem was that I didn't want to go through Otago's key dates website and manually add every event I needed to keep track of, so I wrote a Python script to do it for me.


## **Set Up**

First and foremost, we need to import all the required libraries into our project.

In [16]:
# Importing modules for web scraping
import requests
from bs4 import BeautifulSoup
from datetime import datetime

## **Web Scraping the Key Dates Page**

After I installed all of the required packages, I got straight on to learning the Python web scraping library `BeautifulSoup`.

Before we start processing the webpage, we need to use `requests` to fetch the page. We then create an instance of `BeautifulSoup`, passing in the text of our response.

In [17]:
# Scraping the webpage and getting the data
url = "https://www.otago.ac.nz/news/events/keydates/"
response = requests.get(url, timeout=10)
response.raise_for_status()  # Stop early if the page could not be fetched
soup = BeautifulSoup(response.text, "html.parser")
data = {}

### **Processing the web page**

From here we can start to process the page and extract the elements we want. In our case, the month headings are stored under the selector `#main > div.atoz-content > ul > a > h2`, and the definition lists (`dl`) containing the events are stored under the selector `#main > div.atoz-content > ul > dl`.
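
To see what these two selectors return, here is a minimal sketch run against a small hand-written HTML fragment rather than the live page. The fragment `SAMPLE_HTML` and the name `events_by_month` are assumptions made for illustration; the fragment only mimics the structure described above, and the real page markup may differ.

```python
from bs4 import BeautifulSoup

# Hand-written fragment that imitates the described layout (an assumption,
# not a copy of the real page): each month is an <a><h2> followed by a <dl>.
SAMPLE_HTML = """
<div id="main"><div class="atoz-content"><ul>
  <a><h2>April 2024</h2></a>
  <dl><dt>Thursday, 25 April</dt><dd>ANZAC Day</dd></dl>
  <a><h2>May 2024</h2></a>
  <dl>
    <dt>Friday, 3 May</dt><dd>Last day to withdraw from semester 1 papers</dd>
    <dt>Saturday, 11 May</dt><dd>Graduation ceremonies</dd>
  </dl>
</ul></div></div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# ">" is the CSS child combinator, so each step must be a direct child.
headings = soup.select("#main > div.atoz-content > ul > a > h2")
tables = soup.select("#main > div.atoz-content > ul > dl")

events_by_month = {}
for heading, table in zip(headings, tables):
    # Pair every <dt> (date) with the <dd> (event) at the same position.
    dates = [dt.get_text(strip=True) for dt in table.select("dt")]
    names = [dd.get_text(strip=True) for dd in table.select("dd")]
    events_by_month[heading.get_text(strip=True)] = list(zip(dates, names))

for month, events in events_by_month.items():
    print(month, events)
```

Pairing headings and lists by position only works while the page keeps one `dl` per month heading, so it is worth checking that both selections have the same length before relying on it.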

In [18]:
# Stores all the months
months = soup.select("#main > div.atoz-content > ul > a > h2") 

# Stores all the Dates and Events from each month
months_tables = soup.select("#main > div.atoz-content > ul > dl")

# Processing the events into a dictionary
for i in range(len(months)):
    month_key = months[i].text          # month heading text, e.g. "April 2024"
    month_events = months_tables[i]

    # Appending the events to the month key in the dictionary
    data[month_key] = month_events

# Iterate through each month and create a dictionary of key = date and value = event
for month in data:
    events = data[month]
    event_dict = {}

    # Iterate over each dt and dd elements
    for i in range(len(events.select("dt"))):
        date = events.select("dt")[i].text
        event = events.select("dd")[i].text

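        # Extracting the year from the last 4 characters of the month heading, e.g. "April 2024"
        current_year = int(month.strip()[-4:])

        # Convert the date, e.g. "Monday, 9 January", to a datetime object. The year is
        # parsed together with the day and month so that 29 February is accepted.
        date = datetime.strptime(f"{date.strip()} {current_year}", "%A, %d %B %Y")
        date = date.strftime("%Y-%m-%d")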

        # Append the date and event to the dictionary
        event_dict[date] = event

    # Replace the list of events with the dictionary
    data[month] = event_dict

# Printing the data in a readable format
for month in data:
    print(month)
    for date in data[month]:
        print(f"{date}: {data[month][date]}")
    print()

April 2024
2024-04-25: ANZAC Day

May 2024
2024-05-03: Last day to withdraw from semester 1 papers (11:59pm deadline)
2024-05-11: Graduation ceremonies
2024-05-18: Graduation ceremonies
2024-05-31: Lectures cease before semester 1 examinations

June 2024
2024-06-03: King's Birthday
2024-06-05: Semester 1 examinations begin
2024-06-19: Semester 1 examinations end
2024-06-25: Due date for submission of papers for course approval by students taking only semester 2 papers
2024-06-28: Matariki

July 2024
2024-07-10: Due date for payment of all fees enrolled for students registering for study commencing in semester 2 only
2024-07-15: Due date for completion of course enrolment declaration by students taking only semester 2 papers (late fee may apply)
2024-07-19: Last day to add semester 2 papers (11:59pm deadline)
2024-07-24: 2024 Sem 1 Special Exams (centrally managed) end

August 2024
2024-08-02: Last day to delete semester 2 papers with refund of fees (11:59pm deadline)
2024-08-17: Gradua
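
Two details of the date conversion are easy to miss: the year comes from the month heading, not from the date text, and `strptime` assumes the year 1900 when the format has no year, so a date such as 29 February is rejected unless the year is parsed along with it. The sketch below shows the conversion in isolation. The helper name `parse_key_date` is hypothetical, and it assumes English day and month names, as on the page.

```python
from datetime import datetime


def parse_key_date(day_text: str, month_heading: str) -> str:
    """Turn e.g. ("Thursday, 25 April", "April 2024") into "2024-04-25"."""
    # The year is the last four characters of the heading, e.g. "April 2024".
    year = int(month_heading.strip()[-4:])
    # Parse day, month and year in one call so leap days validate correctly.
    parsed = datetime.strptime(f"{day_text.strip()} {year}", "%A, %d %B %Y")
    return parsed.strftime("%Y-%m-%d")


print(parse_key_date("Thursday, 25 April", "April 2024"))
print(parse_key_date("Thursday, 29 February", "February 2024"))
```

ISO-formatted strings like these sort chronologically as plain text, which makes them convenient dictionary keys.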

### **Saving the data to JSON**

After we have processed the page, we can save the data to a JSON file. This will allow us to easily access the data later on.

In [20]:
# Saving data to a JSON file 
import json

export_data = {
    "separated_by_month": data,
    "flat": {},
}

for month in data: 
    for date in data[month]:
        export_data["flat"][date] = data[month][date]

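# Make sure the output directory exists before writing to it
import os
os.makedirs("./data", exist_ok=True)

with open("./data/events.json", "w") as f:
    json.dump(export_data, f, indent=4)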
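
Reading the file back later is the mirror image of the export. The sketch below is a round trip under stated assumptions: it uses two hand-picked sample events in place of the scraped data and writes to a temporary directory, not `./data`, so it can run on its own. The names `sample` and `loaded` are illustrative.

```python
import json
import tempfile
from pathlib import Path

# Sample data in the same shape as the scraped dictionary (illustrative only).
sample = {
    "April 2024": {"2024-04-25": "ANZAC Day"},
    "June 2024": {"2024-06-03": "King's Birthday"},
}

export_data = {
    "separated_by_month": sample,
    # Merge every month's {date: event} pairs into one flat mapping.
    "flat": {
        date: event
        for month_events in sample.values()
        for date, event in month_events.items()
    },
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "data" / "events.json"
    path.parent.mkdir(parents=True, exist_ok=True)  # create data/ if missing
    path.write_text(json.dumps(export_data, indent=4), encoding="utf-8")

    # Load the file again, as a later step in the project would.
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(loaded["flat"]["2024-04-25"])
```

The `flat` view allows a direct lookup by date, while `separated_by_month` keeps the grouping used on the page.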