In [31]:
import json
import copy
from pprint import PrettyPrinter
pp = PrettyPrinter(indent=4)

import pandas as pd
print(pd.__version__)

1.3.4


#### Turn the JSON file into a Pandas DataFrame

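We can convert the JSON data into a Pandas DataFrame, but first we need to flatten the nested values (the `Location` and `Time` lists and the `Week` dict) so that each piece of information gets its own column.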

In [35]:
with open('mytimetable(individual).json', 'r') as f:
    activities = json.load(f)

df_data = copy.deepcopy(activities)
for activity in df_data:
    # 'Location' is a two-item list (room, Google Maps link), so we split it into two columns
    room, gmaps_link = activity['Location']
    del activity['Location']
    activity['Room'] = room
    activity['Google Maps Link'] = gmaps_link

    # 'Time' is also a list, so we separate it out into the start and end times
    start, end = activity['Time']
    del activity['Time']
    activity['Start Time'] = start
    activity['End Time'] = end

    # 'Week' is a dict, so we 'flatten' it
    week = activity['Week']
    del activity['Week']
    activity['Teaching Week'] = week['Teaching Week']
    activity['Term'] = week['Term']
    activity['Week Number'] = week['Week Number']

df = pd.DataFrame(df_data)
df

,Type,Name,With,Date,Day of the Week,Timetable URL,Room,Google Maps Link,Start Time,End Time,Teaching Week,Term,Week Number
0,Seminar,CFLS1G21 - German Stage 2,MRS ZHANNA KOZMENKO-IHSSEN,2023-10-02,Monday,https://mytimetable.durham.ac.uk/weekly/activi...,D/ERA30,https://www.google.com/maps/search/?api=1&quer...,11:00,12:00,1,Michaelmas,12
1,Lecture,COMP3567 - Multimedia and Game Development,DR WOON-BOR LI,2023-10-02,Monday,https://mytimetable.durham.ac.uk/weekly/activi...,D/TLC113,https://www.google.com/maps/search/?api=1&quer...,14:00,15:00,1,Michaelmas,12
2,Lecture,COMP3647 - Human-AI Interaction Design,PROF EFFIE LAI-CHONG LAW,2023-10-02,Monday,https://mytimetable.durham.ac.uk/weekly/activi...,D/CG91,https://www.google.com/maps/search/?api=1&quer...,15:00,16:00,1,Michaelmas,12
3,Seminar,CFLS1G21 - German Stage 2,MRS ZHANNA KOZMENKO-IHSSEN,2023-10-04,Wednesday,https://mytimetable.durham.ac.uk/weekly/activi...,D/ER207,https://www.google.com/maps/search/?api=1&quer...,12:00,13:00,1,Michaelmas,12
4,Seminar,CFLS1G21 - German Stage 2,MRS ZHANNA KOZMENKO-IHSSEN,2023-10-05,Thursday,https://mytimetable.durham.ac.uk/weekly/activi...,D/ER278,https://www.google.com/maps/search/?api=1&quer...,12:00,13:00,1,Michaelmas,12
...,...,...,...,...,...,...,...,...,...,...,...,...,...
179,Lecture,COMP3687 - Data Compression,PROF MARK ROBERT HAIGH GOTHAM,2024-03-15,Friday,https://mytimetable.durham.ac.uk/weekly/activi...,D/CB-0008,https://www.google.com/maps/search/?api=1&quer...,12:00,13:00,20,Epiphany,35
180,Lecture,COMP3717 - Introduction to Music Processing,DR ROBERT LIECK,2024-04-23,Tuesday,https://mytimetable.durham.ac.uk/weekly/activi...,D/TLC101,https://www.google.com/maps/search/?api=1&quer...,10:00,11:00,21,Easter,41
181,Lecture,COMP3687 - Data Compression,DR IOANNIS IVRISSIMTZIS,2024-04-23,Tuesday,https://mytimetable.durham.ac.uk/weekly/activi...,D/TLC101,https://www.google.com/maps/search/?api=1&quer...,15:00,16:00,21,Easter,41
182,Lecture,COMP3717 - Introduction to Music Processing,PROF MARK ROBERT HAIGH GOTHAM,2024-04-30,Tuesday,https://mytimetable.durham.ac.uk/weekly/activi...,D/TLC101,https://www.google.com/maps/search/?api=1&quer...,10:00,11:00,22,Easter,42
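
The same flattening can also be written with `dict.pop`, which removes a key and returns its value in one step. The following is only a minimal sketch under stated assumptions: the record in `sample_activities` is made up for illustration (its values are not from the real timetable), and `flatten_activity` is a hypothetical helper name, not part of the original code.

```python
import pandas as pd

# Hypothetical record mirroring the nested structure described above
# (all values are invented for illustration).
sample_activities = [
    {
        'Type': 'Lecture',
        'Name': 'EXAMPLE101 - Sample Module',
        'Location': ['D/ROOM1', 'https://example.com/maps'],
        'Time': ['09:00', '10:00'],
        'Week': {'Teaching Week': 1, 'Term': 'Michaelmas', 'Week Number': 12},
    }
]

def flatten_activity(activity):
    """Return a flat copy of one activity dict (hypothetical helper)."""
    # A shallow copy is enough here because we only remove top-level keys.
    flat = dict(activity)
    # Split the two-item lists into separate columns.
    flat['Room'], flat['Google Maps Link'] = flat.pop('Location')
    flat['Start Time'], flat['End Time'] = flat.pop('Time')
    # Merge the keys of the nested 'Week' dict into the top level.
    flat.update(flat.pop('Week'))
    return flat

sample_df = pd.DataFrame([flatten_activity(a) for a in sample_activities])
print(sample_df.columns.tolist())
```

Because each record is copied before anything is removed, the original list of activities is left untouched, which is the same reason the loop above works on a `deepcopy`.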


#### Now we can begin the analysis

See my prompt to ChatGPT, and its response:

<hr>

I am a university student and I have received my timetable for the upcoming academic year. I've written some code which scrapes the timetable activity data off the university website. Now I'm wondering how i should analyse the data. As in, what stats should i investigate. One could be average number of classes per week per semester. Anything else?

<hr>

Analyzing your university timetable data can provide valuable insights into your academic schedule and help you make informed decisions about managing your time and workload. Here are some statistical analyses and insights you can derive from your timetable data:

1. **Average Number of Classes per Week per Semester:** As you mentioned, calculating the average number of classes per week per semester can give you a sense of your weekly workload. This can help you plan your study and extracurricular activities more effectively.

2. **Distribution of Classes by Day of the Week:** Determine which days of the week are the busiest in terms of classes. This information can help you schedule study sessions or other commitments on less busy days.

3. **Class Duration Analysis:** Analyze the distribution of class durations. Are most of your classes short (e.g., 1 hour) or long (e.g., 3 hours)? Understanding the duration of classes can help you plan your day and study breaks accordingly.

4. **Gaps between Classes:** Calculate the average and maximum gaps between classes on any given day. This can help you identify opportunities for productive use of your time between classes.

5. **Course Overlaps:** Identify any overlaps in your timetable, where two classes or activities occur simultaneously. This can be crucial for resolving scheduling conflicts or requesting adjustments.

6. **Class Location Analysis:** Determine the distribution of classes across different locations on campus. This information can be useful for optimizing your travel time between classes.

7. **Course Load by Semester:** Calculate the total number of credits or hours for each semester to understand how your workload varies from one semester to another.

8. **Time of Day Analysis:** Analyze the distribution of classes throughout the day. Are you more likely to have early morning or late afternoon classes? This can help you plan your daily routine and study habits.

9. **Week-to-Week Variation:** Look for patterns or irregularities in your timetable, such as weeks with significantly more or fewer classes. This can help you anticipate busy periods and plan your study schedule accordingly.

10. **Breakdown by Course Type:** Categorize your classes by type (e.g., lectures, labs, seminars) and analyze how they are distributed throughout the week. This can help you prepare differently for various types of classes.

11. **Comparison with Personal Goals:** Compare your timetable data with your personal goals and commitments. Are there conflicts between your academic schedule and other activities or responsibilities? This can help you make adjustments as needed.

12. **Long-Term Planning:** Use the data to plan ahead for future semesters. Are there courses you want to take that consistently conflict with your existing schedule? This can inform your course selection decisions.

Remember that the specific analyses you perform may vary depending on your personal goals and preferences. The key is to use the data to make informed decisions about how you manage your time and balance your academic and personal life effectively. Additionally, visualizing your timetable data using charts or graphs can make it easier to interpret and identify patterns.

#### 1. Average Number of Classes per Week per Semester

In [46]:
df1 = df.copy()

# First, group the DataFrame by two columns: 'Term' and 'Week Number'.
# Rows with the same 'Term' and 'Week Number' will be grouped together.
class_counts = df1.groupby(['Term', 'Week Number'])

# Now we need to count the number of classes in each group.
# The 'size' method does this, and gives us the result as a Series.
class_counts = class_counts.size()

# The Series has a multi-level index with 'Term' and 'Week Number'.
# To make it easier to work with, we reset the index using 'reset_index'.
# This converts the Series into a new DataFrame with 'Term', 'Week Number',
# and 'Class Count' columns
class_counts = class_counts.reset_index(name='Class Count')

# We can use this new DataFrame to calculate the average number of classes
# per week for each term.
# We group class_counts by the 'Term' column, and calculate the mean of the
# 'Class Count' within each term group. The result is a Series with 'Term'
# as the index and 'Average Classes Per Week' as the values.
average_classes_per_week = class_counts.groupby('Term')['Class Count'].mean()

# We convert the Series into a DataFrame with 'Term' and 'Average Classes Per Week'
# columns to make it a bit more organised.
average_classes_per_week = average_classes_per_week.reset_index(name='Average Classes Per Week')

average_classes_per_week

,Term,Average Classes Per Week
0,Easter,2.0
1,Epiphany,9.8
2,Michaelmas,8.2
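
The step-by-step calculation can also be condensed into one method chain. This is a minimal sketch on a tiny made-up DataFrame (`toy` and its values are assumptions for illustration, not the real timetable), so the result is easy to check by hand.

```python
import pandas as pd

# Toy data: one row per class (values invented for illustration).
toy = pd.DataFrame({
    'Term': ['Michaelmas', 'Michaelmas', 'Michaelmas', 'Epiphany', 'Epiphany'],
    'Week Number': [12, 12, 13, 25, 25],
})

toy_average = (
    toy.groupby(['Term', 'Week Number'])
       .size()                    # classes per (term, week)
       .groupby(level='Term')     # regroup the per-week counts by term
       .mean()                    # average count per week within each term
       .reset_index(name='Average Classes Per Week')
)
print(toy_average)
```

Here Michaelmas has two classes in week 12 and one in week 13, giving an average of 1.5, while Epiphany has two classes in its only week, giving 2.0. Note that a week with no classes at all produces no rows, so it never forms a group; the averages are therefore taken over weeks that have at least one class.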
