# Sort out the academic agenda and plan classes

In this notebook, you can import your agenda from `ics` files and inspect the proposed class schedule (typically in XLS files) for the next semester.

I teach at Ecole Centrale de Nantes, so this version deals with the EI1 weekly schedule and the option/small-group schedules.

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
# Import necessary libraries
import pandas as pd
from icalendar import Calendar
from jupyter_utils.agenda_ecn import * 

# Step 1: Import ICS files

In [3]:
# Read the ICS file exported from my agenda manager (Thunderbird)
file_path = "../ECN.ics"
ecn_df = read_ics(file_path)

# Step 2: Read Excel files

## Step 2.1: Read the EI1 XLS file

Although the agenda's formatting is annoying, it is fixed, so it is fairly easy to obtain the schedule of a given course for a given group.

It is built as follows:
- each sheet is a week, named accordingly
- each group is a row
- columns correspond to time slots, from Mon M1 ("C") to Fri S2 ("V")
- each cell is either empty or contains the course short name, e.g. FLUID, followed by the course type (TP, TD, CM)

Thus, the extraction function needs to know this structure, find occurrences of the course short name on all sheets, and output a dataframe with the same structure as the pandas import of the ICS file.
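
The column layout described above can be sketched as a small lookup. This is only an illustration: it assumes the 20 columns from "C" to "V" hold four slots per day (M1, M2, S1, S2) in day-major order, and `column_to_slot` is a hypothetical helper, not a function from `jupyter_utils.agenda_ecn`.

```python
from string import ascii_uppercase

DAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")
SLOTS = ("M1", "M2", "S1", "S2")


def column_to_slot(column_letter):
    """Map a single-letter column ("C" to "V") to a (day, slot) pair.

    Assumption: slots are contiguous, four per day, starting at column "C".
    """
    index = ascii_uppercase.index(column_letter) - ascii_uppercase.index("C")
    if not 0 <= index < len(DAYS) * len(SLOTS):
        raise ValueError(f"column {column_letter!r} is outside the C..V slot range")
    day, slot = divmod(index, len(SLOTS))
    return DAYS[day], SLOTS[slot]


first = column_to_slot("C")  # first slot of the week
last = column_to_slot("V")   # last slot of the week
```

With this layout, "C" maps to Monday M1 and "V" to Friday S2, which matches the bounds given in the list.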

Because of the numerous merged cells, it is better to rely on `openpyxl` directly to extract course times and groups. The extracted events are then converted by helper functions for yearly analysis. As of now, the extracted events are saved in a list structure.

```python
def extract_schedule_1sheet_format(file_path, 
                                   course_name, 
                                   date_column, 
                                   line_slot="2", 
                                   group_name=None, 
                                   course_type=None, 
                                   display_group_schedule=False):
    """
    Extracts the schedule for a specific course from a single-sheet formatted course schedule Excel file.
    
    Args:
        file_path (str): The path to the Excel file containing the course schedule.
        course_name (str): The name of the course to search for in the schedule.
        date_column (str): The column containing the dates in the schedule.
        line_slot (str, optional): The row number in the sheet to look for class slots (default is "2").
        group_name (str, optional): The group name (not used in current implementation).
        course_type (str, optional): The type of the course (not used in current implementation).
        display_group_schedule (bool, optional): If True, displays the schedule for the group (default is False).

    Returns:
        list: A list of dictionaries, each representing an event with keys 'summary', 'dtstart', and 'dtend'.
    
    Notes:
        This function is specific to the design of one-sheet course schedule xlsx files at ECN. 
        It assumes that the schedule is structured in a specific way with class slots defined as:
            - M1: 08:00-10:00
            - M2: 10:15-12:15
            - S1: 13:45-15:45
            - S2: 16:00-18:00
        The function handles merged cells and extracts the necessary information based on the course name.

        For successive courses, you can add a blank space at the end of the course name in order to disentangle these.
    """
```
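
As a rough illustration of the slot times listed in the docstring, the sketch below turns a date and a slot label into an event dictionary with the `summary`, `dtstart` and `dtend` keys. `make_event` and `SLOT_TIMES` are hypothetical names, and time zones are left out for brevity, whereas the events shown later in this notebook are timezone-aware.

```python
from datetime import date, datetime, time

# Slot boundaries as documented in the docstring above
SLOT_TIMES = {
    "M1": (time(8, 0), time(10, 0)),
    "M2": (time(10, 15), time(12, 15)),
    "S1": (time(13, 45), time(15, 45)),
    "S2": (time(16, 0), time(18, 0)),
}


def make_event(summary, day, slot):
    """Build an event dict for one class slot (naive datetimes, no time zone)."""
    start, end = SLOT_TIMES[slot]
    return {
        "summary": summary,
        "dtstart": datetime.combine(day, start),
        "dtend": datetime.combine(day, end),
    }


# Example: a TD held on Monday 30/09/24 during slot M2
event = make_event("ALGO TD 1", date(2024, 9, 30), "M2")
```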
I'm in charge of two groups for the 2024-2025 academic year (E and H). I set `display_group_schedule` to `True` so I can check that the sequence order is correct. Notice that each TPx appears twice, since the initial 4h event is split into two 2h slots.

In [4]:
file_path="../general/EDTs24-25/ET_EI1S5 _2024-2025_VF.xlsx"

In [5]:

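# Group E: display the schedule to check the sequence order
EI1_events = extract_schedule_by_group_EI1(file_path, "ALGO", "E", ["TP", "TD"], True)
# Group H: append its events without displaying them
EI1_events += extract_schedule_by_group_EI1(file_path, "ALGO", "H", ["TP", "TD"], False)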

LUNDI    30/09/24 M2 ALGO TD 1
MERCREDI 09/10/24 S2 ALGO TD 2
LUNDI    04/11/24 M1 ALGO TD 3
VENDREDI 15/11/24 S2 ALGO TD 4
MERCREDI 20/11/24 S1 ALGO TD 5
LUNDI    02/12/24 S1 ALGO TD 6
MERCREDI 18/12/24 S1 ALGO TP1
MERCREDI 18/12/24 S2 ALGO TP1
LUNDI    06/01/25 M1 ALGO TP2
LUNDI    06/01/25 M2 ALGO TP2
VENDREDI 10/01/25 S1 ALGO TP3
VENDREDI 10/01/25 S2 ALGO TP3


## Step 2.2: Read the option and BBA schedules

Both files follow the same format, where a single sheet represents the whole semester. There is only one group for now (in other years, a label in the cell text indicated which group does what).
A specific column gives the date of the first day of the week; the user has to specify which column it is. Each slot is then organized as in EI1, from Monday M1 to Friday S2.
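
To make the date arithmetic concrete, here is a minimal sketch that derives the date of every slot in a week from the date found in the week-start column. It assumes that date is a Monday and that slots run in day-major order; `week_slots` is a hypothetical helper, not part of the notebook's module.

```python
from datetime import date, timedelta

SLOTS = ("M1", "M2", "S1", "S2")


def week_slots(monday):
    """Yield (date, slot label) for the 20 slots of the week starting on `monday`."""
    for day_offset in range(5):  # Monday to Friday
        for slot in SLOTS:
            yield monday + timedelta(days=day_offset), slot


# Week starting Monday 02/09/24
slots = list(week_slots(date(2024, 9, 2)))
thursday_m2 = slots[3 * len(SLOTS) + 1]  # fourth day, second slot
```

Here `thursday_m2` corresponds to Thursday 05/09/24, slot M2.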

In [13]:
INFOIA=extract_schedule_1sheet_format("../general/EDTs24-25/Fichier_type_24_25.INFO IA.xlsx", "PAPY", "G",line_slot="2",display_group_schedule=True)

Thu 05/09/24 M2 PAPY CM - LL
Fri 06/09/24 S1 PAPY TP – LL
Fri 06/09/24 S2 PAPY TP – LL
Thu 12/09/24 M1 PAPY CM - LL
Thu 12/09/24 M2 PAPY TP – LL
Thu 26/09/24 M1 PAPY TP – LL
Thu 26/09/24 M2 PAPY TP – LL
Thu 03/10/24 M1 PAPY CM - LL
Thu 03/10/24 M2 PAPY TP – LL
Thu 10/10/24 M1 PAPY TP – LL
Thu 10/10/24 M2 PAPY TP – LL
Thu 17/10/24 M1 PAPY CM - LL
Thu 17/10/24 M2 PAPY TP – LL
Thu 24/10/24 M1 PAPY TP – LL
Thu 24/10/24 M2 PAPY TP – LL
Thu 07/11/24 M1 PAPY DS – LL


In [14]:
# To disambiguate INFO V from INFO VI, I added a trailing space to the course name.
BBA=extract_schedule_1sheet_format("../general/EDTs24-25/BBA2_24_25.xlsx", "INFO V ", "C",display_group_schedule=True)

Wed 18/09/24 S1 INFO V - CM LL
Wed 25/09/24 S1 INFO V - TP LL + ?
Wed 02/10/24 S1 INFO V - CM LL
Wed 09/10/24 S1 INFO V - TP LL + ?
Wed 16/10/24 S1 INFO V - CM LL
Wed 23/10/24 S1 INFO V - TP LL + ?
Wed 06/11/24 S1 INFO V - CM LL
Wed 13/11/24 S1 INFO V - TP LL + ?
Wed 20/11/24 S1 INFO V - CM LL
Wed 27/11/24 S1 INFO V - TP LL + ?
Wed 04/12/24 S1 INFO V - CM LL
Wed 11/12/24 S1 INFO V - TP LL + ?
Fri 20/12/24 M2 INFO V - DS LL
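
The trailing-space trick used for `"INFO V "` can be checked with plain string matching. This sketch assumes the extraction function looks for the course name as a substring of the cell text, which the docstring suggests but does not show; the `INFO VI` cell text and the `matching_cells` helper are made up for illustration.

```python
# One real cell text from the output above and one hypothetical "INFO VI" cell
cell_texts = ["INFO V - CM LL", "INFO VI - CM XX"]


def matching_cells(cells, course_name):
    """Return the cells whose text contains `course_name` as a substring."""
    return [text for text in cells if course_name in text]


without_space = matching_cells(cell_texts, "INFO V")   # matches both courses
with_space = matching_cells(cell_texts, "INFO V ")     # matches only INFO V
```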


# Step 3: Report conflicts and export the complete schedule for visualization

In [8]:
# Merging events from ecn
combined_df = pd.concat([ecn_df,pd.DataFrame(BBA), pd.DataFrame(INFOIA),pd.DataFrame(EI1_events)], ignore_index=True)
new_courses_df = pd.concat([pd.DataFrame(BBA), pd.DataFrame(INFOIA),pd.DataFrame(EI1_events)], ignore_index=True)

In [9]:
# Sort the DataFrame by the 'dtstart' column
combined_df = combined_df.sort_values(by='dtstart').reset_index(drop=True)
#combined_df

In [10]:
conflicts_df=find_conflicting_events(combined_df)

In [11]:
# displaying conflicts
conflicts_df

| | event1_summary | event1_dtstart | event1_dtend | event2_summary | event2_dtstart | event2_dtend |
|---|---|---|---|---|---|---|
| 0 | https://www.wccm2024.org/ | 2024-07-21 00:00:00+00:00 | 2024-07-27 23:59:59.999999+00:00 | Vol N°AF375 de YVR à CDG - ref:VPFG3H pour LES... | 2024-07-27 13:30:00+02:00 | 2024-07-28 08:15:00+02:00 |
| 1 | AF 7771 de Aéroport Paris–Charles de Gaulle - ... | 2024-07-28 09:45:00+02:00 | 2024-07-28 13:19:00+02:00 | Air France- AF7771- Paris 7/28/2024 9:45:00 A... | 2024-07-28 07:45:00+00:00 | 2024-07-28 11:19:00+00:00 |
| 2 | Jurys BBA 3 | 2024-09-02 13:30:00+02:00 | 2024-09-02 17:30:00+02:00 | Jurys BBA 3 | 2024-09-02 13:30:00+02:00 | 2024-09-02 17:30:00+02:00 |
| 3 | ALGO TD 1 | 2024-10-02 13:45:00+02:00 | 2024-10-02 15:45:00+02:00 | INFO V - CM LL | 2024-10-02 13:45:00+02:00 | 2024-10-02 15:45:00+02:00 |
| 4 | PAPY CM - LL | 2024-10-17 08:00:00+02:00 | 2024-10-17 10:00:00+02:00 | Matinée séminaire CA | 2024-10-17 09:30:00+02:00 | 2024-10-17 12:30:00+02:00 |
| 5 | Matinée séminaire CA | 2024-10-17 09:30:00+02:00 | 2024-10-17 12:30:00+02:00 | PAPY TP – LL | 2024-10-17 10:15:00+02:00 | 2024-10-17 12:15:00+02:00 |
| 6 | INFO V - CM LL | 2024-11-20 13:45:00+01:00 | 2024-11-20 15:45:00+01:00 | ALGO TD 5 | 2024-11-20 13:45:00+01:00 | 2024-11-20 15:45:00+01:00 |
| 7 | CA Audition des candidats Direction | 2025-04-03 00:00:00+00:00 | 2025-04-04 23:59:59.999999+00:00 | Vote | 2025-04-04 09:00:00+02:00 | 2025-04-04 10:00:00+02:00 |
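
For readers curious about how such conflicts can be detected, here is a minimal sketch of an overlap check on sorted events. It is not the notebook's `find_conflicting_events`, whose implementation is not shown; `find_overlaps` is a hypothetical stand-in that works on plain dictionaries, and it assumes all datetimes are timezone-aware so they can be compared with each other.

```python
from datetime import datetime, timedelta, timezone


def find_overlaps(events):
    """Return (summary1, summary2) pairs for events whose time ranges overlap."""
    ordered = sorted(events, key=lambda ev: ev["dtstart"])
    conflicts = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second["dtstart"] >= first["dtend"]:
                break  # sorted by start, so no later event can overlap `first`
            conflicts.append((first["summary"], second["summary"]))
    return conflicts


# Three events of 17/10/24 taken from the table above (UTC+02:00)
tz = timezone(timedelta(hours=2))
events = [
    {"summary": "PAPY TP – LL",
     "dtstart": datetime(2024, 10, 17, 10, 15, tzinfo=tz),
     "dtend": datetime(2024, 10, 17, 12, 15, tzinfo=tz)},
    {"summary": "PAPY CM - LL",
     "dtstart": datetime(2024, 10, 17, 8, 0, tzinfo=tz),
     "dtend": datetime(2024, 10, 17, 10, 0, tzinfo=tz)},
    {"summary": "Matinée séminaire CA",
     "dtstart": datetime(2024, 10, 17, 9, 30, tzinfo=tz),
     "dtend": datetime(2024, 10, 17, 12, 30, tzinfo=tz)},
]
conflicts = find_overlaps(events)
```

On this sample, the seminar overlaps both PAPY sessions, while the two PAPY sessions do not overlap each other, which mirrors rows 4 and 5 of the table.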


# Step 4: Export to ICS

In [15]:
df_to_ics(new_courses_df,"2025.ics")

The exported file is now easy to import into your calendar app to check for unforeseen incompatibilities or to preview next year's classes before "Scolarité" adds them to onboard.
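
As a rough idea of what the exported file contains, the sketch below writes event dictionaries to a minimal iCalendar string by hand. It is not the notebook's `df_to_ics`, whose implementation is not shown: `events_to_ics` is a hypothetical helper, the `PRODID` value is a placeholder, and the code skips text escaping and line folding, so a dedicated library such as `icalendar` is likely the safer choice for real use. It assumes timezone-aware datetimes.

```python
import uuid
from datetime import datetime, timedelta, timezone


def events_to_ics(events):
    """Serialize event dicts to a minimal iCalendar (RFC 5545) string."""
    def fmt(dt):
        # Writing times in UTC ("Z" suffix) avoids emitting VTIMEZONE blocks
        return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")

    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//agenda sketch//EN"]
    stamp = fmt(datetime.now(timezone.utc))
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uuid.uuid4()}",   # each event needs a unique identifier
            f"DTSTAMP:{stamp}",
            f"DTSTART:{fmt(ev['dtstart'])}",
            f"DTEND:{fmt(ev['dtend'])}",
            f"SUMMARY:{ev['summary']}",  # no escaping of commas or semicolons here
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"  # iCalendar lines end with CRLF


tz = timezone(timedelta(hours=2))
ics_text = events_to_ics([
    {"summary": "ALGO TD 1",
     "dtstart": datetime(2024, 10, 2, 13, 45, tzinfo=tz),
     "dtend": datetime(2024, 10, 2, 15, 45, tzinfo=tz)},
])
```

The resulting string can be written to a `.ics` file in binary-safe mode (for example with `newline=""`) so the CRLF line endings are preserved.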