# Sort out Academic agenda and plan classes

In this notebook, you can import your agenda from `ics` files and inspect the proposed class schedule (typically Excel files) for the next semester.

This version was written by a teacher at Ecole Centrale de Nantes; it deals with the EI1 weekly schedule and the option/small-group schedules.

In [1]:
# Import necessary libraries
import pandas as pd
from icalendar import Calendar
from datetime import datetime, time, timedelta
import calendar
import pytz

# Step 1: Import ICS files

In [2]:
# Function to read and parse an ICS file
def read_ics(file_path):
    """Return a DataFrame of the events in an ICS file that start after the current time."""
    with open(file_path, 'rb') as file:
        gcal = Calendar.from_ical(file.read())
    now = datetime.now(tz=pytz.UTC)  # timezone-aware current time

    def to_aware_datetime(value):
        # All-day events carry a date object: treat them as starting at midnight UTC
        if not isinstance(value, datetime):
            return datetime.combine(value, time.min, tzinfo=pytz.UTC)
        # Naive datetimes are assumed to be expressed in UTC
        if not value.tzinfo:
            return value.replace(tzinfo=pytz.UTC)
        return value

    events = []
    for component in gcal.walk():
        if component.name != "VEVENT":
            continue
        dtstart = to_aware_datetime(component.get('dtstart').dt)
        dtend = to_aware_datetime(component.get('dtend').dt) if component.get('dtend') else dtstart
        # Keep only the events that start after now
        if dtstart > now:
            events.append({
                'summary': component.get('summary'),
                'dtstart': dtstart,
                'dtend': dtend,
                'location': component.get('location'),
                'description': component.get('description')
            })
    return pd.DataFrame(events)

In [3]:
file_path="../ECN.ics"
ecn_df=read_ics(file_path)

# Step 2: Read Excel files

## Step 2.1: Read the EI1 xlsx file

Although the formatting of the agenda is tedious to read, it is fixed, so it is fairly easy to obtain the schedule of a given course for a given group.

It is built as follows:
- each sheet is a week, named accordingly
- each group is a row
- columns correspond to time slots, from Monday M1 (column "C") to Friday S2 (column "V")
- each cell is either empty or contains the course short name, e.g. FLUID, followed by the course type (TP, TD, CM)

Thus, the extraction function needs to know this structure, find the occurrences of the course short name on all sheets, and output a dataframe with the same structure as the ICS pandas imports.
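The cell labels printed further down in this notebook look like `ALGO CM1` or `ALGO TD 1`. As a minimal sketch, such a label could be split into course name, course type and session number with a regular expression. `parse_course_cell` and `CELL_PATTERN` are hypothetical names that are not part of the notebook, and the pattern assumes the type is always one of TP, TD or CM.

```python
import re

# Assumed label layout: "<course> <type><optional space><optional number>", e.g. "ALGO TD 1"
CELL_PATTERN = re.compile(r"^\s*(?P<course>\S+)\s+(?P<kind>TP|TD|CM)\s*(?P<number>\d+)?")


def parse_course_cell(text):
    """Return (course, kind, number) for a schedule cell, or None if it does not match."""
    match = CELL_PATTERN.match(text)
    if match is None:
        return None
    number = match.group("number")
    return match.group("course"), match.group("kind"), int(number) if number else None


for label in ["ALGO CM1", "ALGO TD 1", "FLUID TP", "FLUID"]:
    print(label, "->", parse_course_cell(label))
```

A label without a recognised course type, such as a bare `FLUID`, yields `None`, which makes it easy to skip cells that do not describe a session.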

Because the sheets contain many merged cells, it is better to rely on `openpyxl` directly to extract course times and groups. These functions will later be moved into a helper script for yearly analysis.
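To illustrate why merged cells need special care: in `openpyxl`, only the top-left cell of a merged range holds the value, and the other cells of the range read as `None`. The sketch below builds a tiny in-memory workbook (the date string and cell positions are made up) and resolves a merged cell through a hypothetical read-only helper, `merged_value`. This is an alternative to unmerging the cells, shown only to make the behaviour visible.

```python
from openpyxl import Workbook
from openpyxl.utils.cell import coordinate_to_tuple


def merged_value(sheet, coordinate):
    """Return the value shown for a cell, looking up the top-left cell of its merged range if any."""
    row, col = coordinate_to_tuple(coordinate)  # (row, column) as integers
    for rng in sheet.merged_cells.ranges:
        if rng.min_row <= row <= rng.max_row and rng.min_col <= col <= rng.max_col:
            return sheet.cell(row=rng.min_row, column=rng.min_col).value
    return sheet[coordinate].value


wb = Workbook()
ws = wb.active
ws["C3"] = "Mon 23/09/24"   # made-up date header spanning four slots
ws.merge_cells("C3:F3")
ws["G4"] = "M1"             # ordinary, unmerged cell

print(ws["E3"].value)           # raw value of a merged, non-top-left cell
print(merged_value(ws, "E3"))   # value resolved through the merged range
```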

In [4]:
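# Start and end times of the four daily teaching slots.
# Assumes four 2-hour slots per day; check against the official timetable.
class_slots={"M1": ["08:00","10:00"],"M2": ["10:15","12:15"],"S1": ["13:45","15:45"],"S2": ["16:00","18:00"]}
class_slots

{'M1': ['08:00', '10:00'],
 'M2': ['10:15', '12:15'],
 'S1': ['13:45', '15:45'],
 'S2': ['16:00', '18:00']}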
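A slot name only becomes useful once it is combined with a date. The sketch below shows one way to turn a date string and a slot into timezone-aware start and end datetimes with `pytz`. `slot_to_datetimes` and `example_slots` are hypothetical names, and the date format `%d/%m/%y` is an assumption about how dates are written in the sheets.

```python
from datetime import datetime

import pytz


def slot_to_datetimes(date_str, slot, slots, tz_name="Europe/Paris"):
    """Return timezone-aware (start, end) datetimes for a 'dd/mm/yy' date and a slot name."""
    tz = pytz.timezone(tz_name)
    start, end = (
        # localize() picks the right summer/winter offset for the given date
        tz.localize(datetime.strptime(f"{date_str} {hour}", "%d/%m/%y %H:%M"))
        for hour in slots[slot]
    )
    return start, end


# Two of the slots defined above, enough for the illustration
example_slots = {"M1": ["08:00", "10:00"], "S2": ["16:00", "18:00"]}

start, end = slot_to_datetimes("27/09/24", "S2", example_slots)
print(start.isoformat(), end.isoformat())
```

With `pytz`, `tz.localize(...)` is preferred over passing the timezone as `tzinfo=` to the `datetime` constructor, which can silently select a historical local-mean-time offset instead of the current one.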

In [5]:
file_path="../general/EDTs24-25/ET_EI1S5 _2024-2025_VF.xlsx"

In [12]:
#https://www.reddit.com/r/excel/comments/10w12bt/is_there_a_way_to_unmerge_cells_and_automatically/
import openpyxl


def create_merged_cell_lookup(sheet) -> dict:
    """
    Creates a lookup dictionary for merged cells in a given sheet.
    
    This function iterates through all merged cell ranges in the given sheet,
    and creates a dictionary where the keys are the merged cell ranges (as strings)
    and the values are the values of the top-left cell in each merged cell range.

    Args:
        sheet (openpyxl.worksheet.worksheet.Worksheet): The worksheet object to process.

    Returns:
        dict: A dictionary with merged cell ranges as keys and the top-left cell values as values.
    """
    merged_lookup = {}
    for cell_group in sheet.merged_cells.ranges:
        min_col, min_row, max_col, max_row = openpyxl.utils.range_boundaries(str(cell_group))
        #if min_col == max_col:
        top_left_cell_value = sheet.cell(row=min_row, column=min_col).value
        merged_lookup[str(cell_group)] = top_left_cell_value
    return merged_lookup

 
def unmerge_cell_copy_top_value(workbook_path: str, output_save="", verbose: bool=False):
    """
    Unmerges cells in the given workbook and copies the top-left cell value to all cells in each previously merged range.
    
    This function opens the workbook at the given path, processes each worksheet by unmerging all merged cells,
    and copies the value of the top-left cell in each merged range to all cells in that range. The modified workbook
    is written to disk only when `output_save` is given.

    Args:
        workbook_path (str): The path to the Excel workbook to process.
        output_save (str): If not empty, path where the modified workbook is saved.
        verbose (bool): If True, print debug information during processing. Default is False.

    Returns:
        openpyxl.workbook.workbook.Workbook: The modified workbook object.
    """
    wbook = openpyxl.load_workbook(workbook_path)
    
    for sheet in wbook.worksheets:
        lookup = create_merged_cell_lookup(sheet)
        if verbose:
            print(lookup)
        for cell_group, top_left_value in lookup.items():
            min_col, min_row, max_col, max_row = openpyxl.utils.range_boundaries(cell_group)
            sheet.unmerge_cells(cell_group)
            if verbose:
                print(min_col, min_row, max_col, max_row)
            # Copy the top-left value into every cell of the former merged range
            for row in sheet.iter_rows(min_col=min_col, min_row=min_row, max_col=max_col, max_row=max_row):
                for cell in row:
                    cell.value = top_left_value
                    if verbose:
                        print(cell.coordinate, top_left_value)
    if output_save:
        wbook.save(output_save)
    return wbook

def search_string_in_workbook(wbook: openpyxl.Workbook, search_string: str) -> list:
    """
    Searches for a specified string in all sheets of a workbook and extracts their coordinates along with sheet names and cell values.

    This function iterates through each worksheet in the provided openpyxl workbook, searches for the specified string in all cells,
    and stores the results, including the sheet name, cell coordinate, and cell value.

    Args:
        wbook (openpyxl.workbook.workbook.Workbook): The openpyxl workbook object to process.
        search_string (str): The string to search for in the workbook.

    Returns:
        list: A list of lists, where each inner list contains the sheet name, cell coordinate, and cell value of a match.
    """
    search_results=[]
    for sheet in wbook.worksheets:
        search_results+=search_string_in_worksheet(sheet,search_string)
    
    return search_results


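def search_string_in_worksheet(sheet, search_string):
    """
    Searches for a specified string in all cells of a single worksheet.

    Returns:
        list: A list of [sheet name, cell coordinate, cell value] lists, one per matching cell.
    """
    search_results = []
    for row in sheet.iter_rows():
        for cell in row:
            # Only text cells can contain the search string
            if isinstance(cell.value, str) and search_string in cell.value:
                search_results.append([sheet.title, cell.coordinate, cell.value])
    return search_results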


In [71]:
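def extract_schedule_by_group_EI1(file_path, course_name, group_name, course_type, display_group_schedule=False):
    """
    Extract the sessions of one course for one group from the EI1 schedule workbook.

    This function is specific to the design of the EI1 course schedule xlsx file at ECN:
    one sheet per week, group names in column B, dates in row 3 and slot names in row 4.

    Returns:
        list: One dict per session, with 'summary', 'dtstart' and 'dtend' keys.
    """
    # Assumes four 2-hour slots per day; check against the official timetable.
    class_slots={"M1": ["08:00","10:00"],"M2": ["10:15","12:15"],"S1": ["13:45","15:45"],"S2": ["16:00","18:00"]}
    tz = pytz.timezone('Europe/Paris')

    events = []
    wbook = unmerge_cell_copy_top_value(file_path)

    # Find the group's row using column B of the week-40 sheet
    groups_row = None
    for row in wbook["40"].iter_rows(min_col=2, min_row=4, max_col=2, max_row=20):
        for cell in row:
            if isinstance(cell.value, str) and group_name in cell.value:
                groups_row = cell.row
    if groups_row is None:
        raise ValueError(f"Group {group_name!r} not found in column B of sheet '40'")

    occurrences = search_string_in_workbook(wbook, search_string=course_name)
    for sheet_name, coordinate, label in occurrences:
        column, row_number = openpyxl.utils.cell.coordinate_from_string(coordinate)
        words = label.split()
        # Beware, hacky test: the second word of the cell is assumed to start with the course type
        if row_number != groups_row or len(words) < 2 or words[1][:2] not in course_type:
            continue
        sheet = wbook[sheet_name]
        date = sheet[column + "3"].value.split()[1]  # row 3 holds "<day name> <date>"; drop the day name
        slot = sheet[column + "4"].value.strip()     # row 4 holds the slot name (M1, M2, S1 or S2)

        # Build timezone-aware start and end datetimes
        dtstart = tz.localize(datetime.strptime(date + " " + class_slots[slot][0], '%d/%m/%y %H:%M'))
        dtend = tz.localize(datetime.strptime(date + " " + class_slots[slot][1], '%d/%m/%y %H:%M'))

        if display_group_schedule:
            print(date, slot, label)

        events.append({'summary': label,
                       'dtstart': dtstart,
                       'dtend': dtend})
    return events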

In [74]:
EI1_events=extract_schedule_by_group_EI1(file_path,"ALGO","B",["TP","TD", "CM"],True)

27/09/24 M2 ALGO CM1
27/09/24 S2 ALGO TD 1
04/10/24 S2 ALGO CM2
11/10/24 S1 ALGO TD 2
04/11/24 S2 ALGO TD 3
13/11/24 M2 ALGO TD 4
20/11/24 S2 ALGO TD 5
28/11/24 M2 ALGO CM3
04/12/24 M1 ALGO TD 6
13/12/24 S1 ALGO TP1
13/12/24 S2 ALGO TP1
20/12/24 S1 ALGO TP2
20/12/24 S2 ALGO TP2
10/01/25 M1 ALGO TP3
10/01/25 M2 ALGO TP3


In [75]:
# export this data in the same format as ICS extractions to facilitate review
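One possible way to do this export, sketched under assumptions: the list of event dicts is turned into a DataFrame with the same five columns as the ICS import, missing columns are left empty, and times are converted to UTC so both sources compare directly. `events_to_dataframe` and `ICS_COLUMNS` are hypothetical names, and the two sample events use fixed UTC offsets instead of `pytz` only to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# Column layout of the DataFrame built from ICS files earlier in this notebook
ICS_COLUMNS = ["summary", "dtstart", "dtend", "location", "description"]


def events_to_dataframe(events):
    """Convert a list of event dicts into a DataFrame laid out like the ICS import."""
    df = pd.DataFrame(events).reindex(columns=ICS_COLUMNS)  # adds missing columns as NaN
    for column in ("dtstart", "dtend"):
        df[column] = pd.to_datetime(df[column], utc=True)   # normalise all offsets to UTC
    return df.sort_values("dtstart").reset_index(drop=True)


# Made-up sample events, deliberately out of order
cet, cest = timezone(timedelta(hours=1)), timezone(timedelta(hours=2))
sample_events = [
    {"summary": "ALGO TP3",
     "dtstart": datetime(2025, 1, 10, 8, 0, tzinfo=cet),
     "dtend": datetime(2025, 1, 10, 10, 0, tzinfo=cet)},
    {"summary": "ALGO TD 1",
     "dtstart": datetime(2024, 9, 27, 16, 0, tzinfo=cest),
     "dtend": datetime(2024, 9, 27, 18, 0, tzinfo=cest)},
]
sample_df = events_to_dataframe(sample_events)
print(sample_df[["summary", "dtstart", "dtend"]])
```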

## Step 2.2: Read the option schedule and BBA

# Step 3: Report conflicts and export the complete schedule for visualization
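A minimal sketch of what conflict reporting could look like, assuming both sources are available as DataFrames with timezone-aware `dtstart` and `dtend` columns: two events conflict when each one starts before the other ends. `find_conflicts` and `make_frame` are hypothetical helpers and the sample events are made up. The cross join compares every pair, which is fine for a semester of events but would be slow on very large calendars.

```python
import pandas as pd


def find_conflicts(agenda_df, classes_df):
    """Return every (agenda event, class) pair whose time intervals overlap."""
    pairs = agenda_df.merge(classes_df, how="cross", suffixes=("_agenda", "_class"))
    # Strict inequalities: back-to-back events are not reported as conflicts
    overlap = (pairs["dtstart_agenda"] < pairs["dtend_class"]) & (
        pairs["dtstart_class"] < pairs["dtend_agenda"]
    )
    columns = ["summary_agenda", "summary_class", "dtstart_class", "dtend_class"]
    return pairs.loc[overlap, columns].reset_index(drop=True)


def make_frame(rows):
    """Build a small DataFrame with UTC timestamps from (summary, start, end) tuples."""
    df = pd.DataFrame(rows, columns=["summary", "dtstart", "dtend"])
    df["dtstart"] = pd.to_datetime(df["dtstart"], utc=True)
    df["dtend"] = pd.to_datetime(df["dtend"], utc=True)
    return df


# Made-up events, times given in UTC
agenda = make_frame([
    ("Committee meeting", "2024-09-27 14:30", "2024-09-27 15:30"),
    ("Seminar", "2024-10-04 08:00", "2024-10-04 09:00"),
])
classes = make_frame([
    ("ALGO TD 1", "2024-09-27 14:00", "2024-09-27 16:00"),
    ("ALGO CM2", "2024-10-04 14:00", "2024-10-04 16:00"),
])
conflicts = find_conflicts(agenda, classes)
print(conflicts)
```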