<a href="https://colab.research.google.com/github/shiva-samy/ttgen/blob/main/TTGen2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Automatic Timetable Generator Using a Genetic Algorithm

**Problem Statement:**

Develop an automated timetable generation system for a college that optimally assigns classes to students and teachers while adhering to various constraints and preferences.

**Guidelines:**
* Inputs:
 * For each class, the list of subjects along with the number of periods per week for each subject
 * Subject allotment - Handling faculty for each subject
 * Elective Subjects (Parallel allotment possibility)
 * Combined classes if any
 * Workload of each faculty
 * Classrooms with capacity
 * Laboratories with capacity
 * TWM, Seminar, etc. may each be considered as one subject and included in the workload
* Outputs:
 * Class Timetable
 * Faculty Timetable
 * Classroom wise workload
 * Lab Timetable
* Constraints and requirements to consider:
* Staff schedule:
 * Ensure that teachers do not have consecutive teaching hours. Maximum 3 teaching periods per day.
 * Lab classes require at least 2 teachers.
 * Each teacher should have a class on every day.
* Room and Lab Availability:
 * Ensure the availability of classrooms and labs.
 * Maximize the utilization of classrooms and labs.
 * Only one lab class should be scheduled per day. In unavoidable situations, two may be scheduled.
* Class Timing:
 * 5 days per week and 7 (4+3) periods per day. In some cases there can be 4 periods in the afternoon when scheduling lab classes.

In [None]:
import pandas as pd
import numpy as np
import random
from collections import OrderedDict, defaultdict
import heapq

In the above code, we are importing the libraries necessary for running the timetable scheduling algorithm.

In [None]:
print(pd.read_csv("/content/staff_workload.csv"))
print(pd.read_csv("/content/all_year_subs.csv"))

   Staff  Workload Department
0     S1         6        ENG
1     S2         6        ENG
2     S3        10        SCI
3     S4        10        SCI
4     S5        10        SCI
5     S6        10        SCI
6     S7         6        ECO
7     S8         6        ECO
8     S9        10        MAT
9    S10        10        MAT
10   S11        15        MAT
11   S12        15        MAT
12   S13        15        MAT
13   S14        15        MAT
14   S15        15         CS
15   S16        15         CS
16   S17        15         CS
17   S18        15         CS
18   S19        15         CS
19   S20        15         CS
20   S21        15         CS
21   S22        15         CS
22   S23        15         CS
23   S24        15         CS
24   S25        15         CS
25   S26        15         CS
26   S27        15         CS
27   S28        15         CS
   Subject  Year  No. of Hrs Type Dependent Subject  Combined Department  \
0   19G105     1           3  Lec               NIL   

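The two CSV files contain the following columns. The staff file has "Staff", "Workload" and "Department". The subjects file has "Subject", "Year", "No. of Hrs", "Type", "Dependent Subject", "Combined" and "Department"; the code below also reads a "Preferred Lab" column from it.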

So, we need to create a function that processes the two CSV files (subjects and staff workload) and generates a resulting dataframe with the desired structure. We need to follow these steps:

* Load the CSVs: Read the subjects and staff CSV files into pandas dataframes.
* Initialize Data Structures: Create necessary data structures to hold the allocation details.
* Assign Subjects to Staff: Implement logic to assign subjects to staff based on their workload and the constraints provided.
* Generate Output DataFrames: Create dataframes for each year and populate them with the allocation details.
* Save the DataFrames to CSVs: Save the resulting dataframes to separate CSV files for each year.

In [None]:
def read_csvs(subjects_file, staff_file):
    subjects_df = pd.read_csv(subjects_file)
    staff_df = pd.read_csv(staff_file)
    return subjects_df, staff_df

def group_faculties_by_department(staff_df):
    department_staff = {}
    for _, row in staff_df.iterrows():
        staff = row['Staff']
        workload = row['Workload']
        department = row['Department']
        if department not in department_staff:
            department_staff[department] = []
        department_staff[department].append((staff, workload))

    for department in department_staff:
        department_staff[department].sort(key=lambda x: x[1], reverse=True)

    return department_staff

def initialize_staff_allocation(staff_df):
    staff_allocation = {}
    for _, row in staff_df.iterrows():
        staff = row['Staff']
        workload = row['Workload']
        staff_allocation[staff] = {
            'initial_workload': workload,
            'remaining_workload': workload,
            'years': [],
            'subjects': [],
            'combined_class': [],
            'department': row['Department'],
            'assigned_lecture_subjects': 0,
            'assigned_lab_subjects': 0
        }
    return staff_allocation

def allocate_subjects(subjects_df, department_staff, staff_allocation):
    subject_assignment = {}

    # Handle dependent lab subjects first
    for _, row in subjects_df.iterrows():
        if row['Type'] == 'Lab' and row['No. of Hrs'] == 4 and row['Dependent Subject'] != 'NIL':
            handle_dependent_subject(row['Subject'], row.to_dict(), department_staff, staff_allocation, subject_assignment, subjects_df)

    sorted_subjects = subjects_df.sort_values(by='No. of Hrs', ascending=False)
    for _, row in sorted_subjects.iterrows():
        if row['Subject'] not in subject_assignment:
            if row['Dependent Subject'] != 'NIL':
                handle_dependent_subject(row['Subject'], row.to_dict(), department_staff, staff_allocation, subject_assignment, subjects_df)
            else:
                assign_faculties(row['Subject'], row.to_dict(), department_staff, staff_allocation, subject_assignment)

    return staff_allocation, subject_assignment

def handle_dependent_subject(subject, details, department_staff, staff_allocation, subject_assignment, subjects_df):
    dep_subs = details['Dependent Subject'].split(', ')
    department = details['Department']

    if details['Type'] == 'Lab':
        assign_faculties(subject, details, department_staff, staff_allocation, subject_assignment)
        assigned_faculties = subject_assignment[subject]

        for dep_sub in dep_subs:
            dep_details = subjects_df[subjects_df['Subject'] == dep_sub].iloc[0].to_dict()
            if dep_details['Type'] == 'Lec':
                if details['No. of Hrs'] == 2:
                    subject_assignment[dep_sub] = assigned_faculties[:2]
                elif details['No. of Hrs'] == 4:
                    if len(dep_subs) == 1:
                        subject_assignment[dep_sub] = assigned_faculties[:2]
                    elif len(dep_subs) == 2:
                        if dep_sub == dep_subs[0]:
                            subject_assignment[dep_sub] = assigned_faculties[:2]
                        else:
                            subject_assignment[dep_sub] = assigned_faculties[2:]
                for staff in subject_assignment[dep_sub]:
                    staff_allocation[staff]['remaining_workload'] -= dep_details['No. of Hrs']
                    staff_allocation[staff]['subjects'].append(dep_sub)
                    staff_allocation[staff]['combined_class'].append(dep_details['Combined'])
                    staff_allocation[staff]['years'].append(dep_details['Year'])
                    staff_allocation[staff]['assigned_lecture_subjects'] += 1

    elif details['Type'] == 'Lec':
        assign_faculties(subject, details, department_staff, staff_allocation, subject_assignment)
        assigned_faculties = subject_assignment[subject]

        for dep_sub in dep_subs:
            dep_details = subjects_df[subjects_df['Subject'] == dep_sub].iloc[0].to_dict()
            subject_assignment[dep_sub] = assigned_faculties
            # Count the dependent subject under its own type (lab or lecture)
            counter = 'assigned_lab_subjects' if dep_details['Type'] == 'Lab' else 'assigned_lecture_subjects'
            for staff in assigned_faculties:
                staff_allocation[staff]['remaining_workload'] -= dep_details['No. of Hrs']
                staff_allocation[staff]['subjects'].append(dep_sub)
                staff_allocation[staff]['combined_class'].append(dep_details['Combined'])
                staff_allocation[staff]['years'].append(dep_details['Year'])
                staff_allocation[staff][counter] += 1

def assign_faculties(subject, details, department_staff, staff_allocation, subject_assignment):
    department = details['Department']
    num_faculties = 4 if details['Type'] == 'Lab' and details['No. of Hrs'] == 4 else 2
    faculties_assigned = 0
    subject_assignment[subject] = []

    eligible_staff = [staff for staff, workload in department_staff[department] if
                      staff_allocation[staff]['remaining_workload'] >= details['No. of Hrs']]

    # Walk the eligible staff once, in order of descending workload. A single pass
    # is enough because a staff member who is skipped stays ineligible for this
    # subject, and it avoids looping forever when too few staff qualify.
    for staff in eligible_staff:
        if faculties_assigned >= num_faculties:
            break

        info = staff_allocation[staff]
        if ((details['Type'] == 'Lab' and info['assigned_lab_subjects'] < 2) or
            (details['Type'] == 'Lec' and info['assigned_lecture_subjects'] < 3)) and details['Year'] not in info['years']:
            info['remaining_workload'] -= details['No. of Hrs']
            info['years'].append(details['Year'])
            info['subjects'].append(subject)
            info['combined_class'].append(details['Combined'])
            subject_assignment[subject].append(staff)
            faculties_assigned += 1

            if details['Type'] == 'Lab':
                info['assigned_lab_subjects'] += 1
            else:
                info['assigned_lecture_subjects'] += 1

def create_output_dataframe(subjects_df, subject_assignment):
    output_data = []
    for _, row in subjects_df.iterrows():
        subject = row['Subject']
        faculties = subject_assignment.get(subject, [])
        output_data.append({
            'Subject': subject,
            'Type': row['Type'],
            'No. of Hrs': row['No. of Hrs'],
            'Combined': row['Combined'],
            'Faculties': ', '.join(faculties),
            'Lab_preference': row['Preferred Lab']
        })
    return pd.DataFrame(output_data)

def create_workload_dataframe(staff_allocation):
    workload_data = []
    for staff, info in staff_allocation.items():
        workload_data.append({
            'Staff': staff,
            'Department': info['department'],
            'Initial Workload': info['initial_workload'],
            'Remaining Workload': info['remaining_workload']
        })
    return pd.DataFrame(workload_data)

def master(subjects_file, staff_file):
    subjects_df, staff_df = read_csvs(subjects_file, staff_file)
    department_staff = group_faculties_by_department(staff_df)
    staff_allocation = initialize_staff_allocation(staff_df)

    staff_allocation, subject_assignment = allocate_subjects(subjects_df, department_staff, staff_allocation)

    years = subjects_df['Year'].unique()
    for year in years:
        year_df = subjects_df[subjects_df['Year'] == year]
        output_year_df = create_output_dataframe(year_df, subject_assignment)
        output_year_df.to_csv(f'output_year_{year}.csv', index=False)

    workload_df = create_workload_dataframe(staff_allocation)
    workload_df.to_csv('staff_remaining_workload.csv', index=False)

# Example usage
master('/content/all_year_subs.csv', '/content/staff_workload.csv')

print("Staff Remaining Workload:")
print(pd.read_csv('/content/staff_remaining_workload.csv'))
print("Year 1:")
print(pd.read_csv('/content/output_year_1.csv'))
print("Year 2:")
print(pd.read_csv('/content/output_year_2.csv'))
print("Year 3:")
print(pd.read_csv('/content/output_year_3.csv'))
print("Year 4:")
print(pd.read_csv('/content/output_year_4.csv'))

Staff Remaining Workload:
   Staff Department  Initial Workload  Remaining Workload
0     S1        ENG                 6                   3
1     S2        ENG                 6                   3
2     S3        SCI                10                   3
3     S4        SCI                10                   3
4     S5        SCI                10                   3
5     S6        SCI                10                   3
6     S7        ECO                 6                   3
7     S8        ECO                 6                   3
8     S9        MAT                10                  10
9    S10        MAT                10                  10
10   S11        MAT                15                   7
11   S12        MAT                15                   7
12   S13        MAT                15                  12
13   S14        MAT                15                  12
14   S15         CS                15                   0
15   S16         CS                15         

Now let's start generating the timetable for each year. First, let's load the CSV files.

In [None]:
df1 = pd.read_csv('/content/output_year_1.csv')
df2 = pd.read_csv('/content/output_year_2.csv')
df3 = pd.read_csv('/content/output_year_3.csv')
df4 = pd.read_csv('/content/output_year_4.csv')
df_list=[df1, df2, df3, df4]

generate_valid_timetable() Function:

* This function generates a random valid timetable by allocating classes based on certain constraints.
* It initializes an empty timetable and dictionaries to keep track of subject counts and daily workloads.
* It prioritizes lab classes by selecting slots for 4-hour labs and 2-hour labs from pre-defined priority slots.
* For lecture classes, it prioritizes slots from 8:30 to 2:30. If those slots are filled, it randomly selects from all slots.
* When allocating classes, it checks the timetables of the respective faculty members and lab, and only uses slots that are free in all of them.
* This ensures that there are no conflicts with the faculty and lab timetables.
* After allocating classes, it checks if each subject has been assigned the correct number of hours and if the workload is evenly distributed across days.
* If any constraints are violated, it regenerates the timetable.
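
The availability check described above can be sketched in isolation. This is a minimal illustration, not the notebook's code: the helper names `empty_timetable` and `slot_range_is_free` are hypothetical, and the days and slots are cut down to keep the example small.

```python
import pandas as pd

days = ['Monday', 'Tuesday']
slots = ['08:30', '09:20', '10:30', '11:20']

def empty_timetable():
    # object dtype so that string labels can be stored in the cells later
    return pd.DataFrame(index=days, columns=slots, dtype=object)

def slot_range_is_free(day, slot_range, class_tt, faculty_tts, lab_tt=None):
    """Return True if every slot in slot_range is empty in all given timetables."""
    tables = [class_tt, *faculty_tts]
    if lab_tt is not None:
        tables.append(lab_tt)
    return all(pd.isna(tt.loc[day, s]) for tt in tables for s in slot_range)

class_tt = empty_timetable()
s1_tt, s2_tt = empty_timetable(), empty_timetable()
lab_tt = empty_timetable()

# S2 already teaches on Monday at 09:20, so a Monday 08:30-09:20 lab would clash
s2_tt.loc['Monday', '09:20'] = 'OTHER'

free_monday = slot_range_is_free('Monday', ['08:30', '09:20'], class_tt, [s1_tt, s2_tt], lab_tt)
free_tuesday = slot_range_is_free('Tuesday', ['08:30', '09:20'], class_tt, [s1_tt, s2_tt], lab_tt)
print(free_monday, free_tuesday)
```

Keeping the check in one helper makes it easy to reuse for both 2-hour and 4-hour labs, since only the slot range differs.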

is_valid_timetable() Function:

* This function checks several constraints: each subject gets exactly its required number of hours, the workload is spread evenly across days, the same lecture subject does not run for too many consecutive periods, lab periods are contiguous, each lecture subject has a single faculty member, and no class clashes with a faculty timetable.
* This helps build a robust class timetable without conflicts or discrepancies.
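
Two of these checks, hours per subject and day-to-day balance, can be sketched on a toy timetable. The function names and the toy data below are assumptions for illustration; cells are assumed to follow the `CODE (staff)` format used in this notebook.

```python
import pandas as pd

def hours_per_subject(timetable):
    """Count occupied slots per subject code; cells look like 'CODE (S1, S2)'."""
    counts = {}
    for cell in timetable.to_numpy().ravel():
        if pd.notna(cell):
            code = cell.split(' (')[0]
            counts[code] = counts.get(code, 0) + 1
    return counts

def has_required_hours(timetable, required_hours):
    """True if every subject appears exactly as many times as required."""
    counts = hours_per_subject(timetable)
    return all(counts.get(sub, 0) == hrs for sub, hrs in required_hours.items())

def is_balanced(timetable, tolerance=1):
    """True if the busiest and the lightest day differ by at most `tolerance` periods."""
    per_day = timetable.notna().sum(axis=1)
    return bool(per_day.max() - per_day.min() <= tolerance)

toy = pd.DataFrame(
    [['MAT (S1)', 'MAT (S1)', None],
     ['PHY (S2)', None, None]],
    index=['Monday', 'Tuesday'],
    columns=['08:30', '09:20', '10:30'],
)
hours_ok = has_required_hours(toy, {'MAT': 2, 'PHY': 1})
balanced = is_balanced(toy)
print(hours_ok, balanced)
```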

calculate_fitness() Function:

* This function scores a timetable; a higher score is better.
* As written, it adds a point each time the subject changes between successive occupied slots within a day, and deducts 5 points for a day with no classes at all (more than 7 consecutive empty slots).
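
A score of this kind can be read as a sum of per-day terms. The sketch below is an illustration under stated assumptions, not the notebook's `calculate_fitness()`: the name `toy_fitness` and the weights (1 point per subject change, 5 points off for an empty day) are chosen only for the example.

```python
import pandas as pd

def toy_fitness(timetable, empty_day_penalty=5):
    """Reward variety within a day and penalize days without any class."""
    score = 0
    for _, row in timetable.iterrows():
        occupied = [cell for cell in row if pd.notna(cell)]
        if not occupied:
            score -= empty_day_penalty
            continue
        # One point for each change of subject between successive occupied slots
        score += sum(1 for prev, cur in zip(occupied, occupied[1:]) if prev != cur)
    return score

columns = ['08:30', '09:20', '10:30']
with_empty_day = pd.DataFrame(
    [['A (S1)', 'B (S2)', 'B (S2)'],
     [None, None, None]],
    index=['Monday', 'Tuesday'], columns=columns,
)
spread_out = pd.DataFrame(
    [['A (S1)', 'B (S2)', 'A (S1)'],
     ['B (S2)', None, 'A (S1)']],
    index=['Monday', 'Tuesday'], columns=columns,
)
print(toy_fitness(with_empty_day), toy_fitness(spread_out))
```

The second timetable scores higher because it has no empty day and alternates subjects more often.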

selection() Function:

* This function selects individuals (timetables) for crossover based on their fitness scores.
* It sorts the indices of individuals based on fitness scores in descending order and selects the best half of the population.
* If the number of selected individuals is odd, it adds one more individual chosen at random from the population so that the parents can be paired.
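
The selection step can be tried on placeholder individuals. In this sketch the strings stand in for timetables and the scores are made up; `select_top_half` is a hypothetical name.

```python
import random

def select_top_half(population, fitness_scores, rng=random):
    """Keep the best half by fitness; pad with a random individual if the count is odd."""
    ranked = sorted(range(len(population)), key=lambda i: fitness_scores[i], reverse=True)
    selected = [population[i] for i in ranked[:len(population) // 2]]
    if len(selected) % 2 != 0:
        selected.append(rng.choice(population))
    return selected

population = ['tt_a', 'tt_b', 'tt_c', 'tt_d', 'tt_e', 'tt_f']
scores = [3, 9, 1, 7, 5, 2]
parents = select_top_half(population, scores)
print(parents[:3], len(parents))
```

With six individuals the best three are kept, and a fourth is drawn at random so that two parent pairs can be formed.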

crossover() Function:

* This function performs single-point crossover between pairs of parents to produce offspring.
* It randomly selects a crossover point and combines the genetic information of two parents to create two offspring.
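
Single-point crossover on a day-by-slot grid can be sketched by treating the grid as one flat sequence, row by row. This is an assumption-laden illustration (hypothetical name `single_point_crossover`, tiny 2x2 grids, no validity check), not the notebook's function.

```python
import random
import pandas as pd

def single_point_crossover(parent1, parent2, point=None):
    """Swap every cell from flat position `point` onwards between two grids."""
    n_rows, n_cols = parent1.shape
    if point is None:
        point = random.randrange(n_rows * n_cols)
    child1, child2 = parent1.copy(), parent2.copy()
    for flat in range(point, n_rows * n_cols):
        r, c = divmod(flat, n_cols)  # flat position -> (row, column)
        child1.iat[r, c] = parent2.iat[r, c]
        child2.iat[r, c] = parent1.iat[r, c]
    return child1, child2

index, columns = ['Monday', 'Tuesday'], ['08:30', '09:20']
p1 = pd.DataFrame([['a', 'a'], ['a', 'a']], index=index, columns=columns)
p2 = pd.DataFrame([['b', 'b'], ['b', 'b']], index=index, columns=columns)
c1, c2 = single_point_crossover(p1, p2, point=1)
print(c1.values.tolist(), c2.values.tolist())
```

Working with integer positions rather than labels keeps the cut point chronological, because day names and slot strings do not sort in timetable order.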

mutation() Function:

* This function introduces random changes (mutations) to individual timetables to explore new solutions.
* It picks one random slot and overwrites it with a cell value chosen at random from the same timetable. If the result fails validation, the original timetable is kept.
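
A mutation that only rearranges existing cells leaves every subject's hour count unchanged. The sketch below shows one such operator, a swap of two random cells. It is an illustrative variant with a hypothetical name (`swap_mutation`), not the notebook's `mutation()` function, and it does no validity check.

```python
import random
import pandas as pd

def swap_mutation(timetable, rng=random):
    """Return a copy of the timetable with two randomly chosen cells swapped."""
    mutated = timetable.copy()
    n_rows, n_cols = mutated.shape
    a, b = rng.sample(range(n_rows * n_cols), 2)  # two distinct flat positions
    (r1, c1), (r2, c2) = divmod(a, n_cols), divmod(b, n_cols)
    first, second = mutated.iat[r1, c1], mutated.iat[r2, c2]
    mutated.iat[r1, c1], mutated.iat[r2, c2] = second, first
    return mutated

before = pd.DataFrame(
    [['A (S1)', 'B (S2)'],
     ['C (S3)', None]],
    index=['Monday', 'Tuesday'], columns=['08:30', '09:20'],
)
after = swap_mutation(before)
print(after)
```

Because cells are only moved, the hours-per-subject constraint still holds after the swap; the other constraints would still need to be re-checked.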

Genetic Algorithm Main Loop (genetic_algorithm() Function):

* It initializes a population of timetables.
* For a specified number of generations, it evaluates the fitness of each individual in the population, selects parents for crossover, performs crossover and mutation to create offspring, and forms the next population from the mutated offspring plus the fittest of the selected parents.
* It prints information about each generation's best fitness and average fitness.
* Finally, it selects the best individual (timetable) from the final population as the solution.
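
The overall loop can be sketched on a toy problem that is much simpler than timetabling: maximizing the number of 1s in a bit list. Everything here (the name `run_ga`, the objective, the bit-flip mutation and the parameter values) is an assumption for illustration; only the loop structure mirrors the steps listed above.

```python
import random

def run_ga(population_size=20, generations=30, genome_length=16, seed=0):
    rng = random.Random(seed)
    fitness = sum  # toy objective: number of 1s in the genome
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        # Evaluate and rank, best first
        ranked = sorted(population, key=fitness, reverse=True)
        history.append(fitness(ranked[0]))
        # Select the best half; pad so that parents can be paired
        parents = ranked[:population_size // 2]
        if len(parents) % 2:
            parents.append(rng.choice(population))
        # Single-point crossover on consecutive pairs of parents
        offspring = []
        for p1, p2 in zip(parents[0::2], parents[1::2]):
            cut = rng.randrange(1, genome_length)
            offspring += [p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]]
        # Mutation: flip one random bit in each child
        for child in offspring:
            child[rng.randrange(genome_length)] ^= 1
        # Next population: offspring plus the fittest parents
        population = offspring + parents[:population_size - len(offspring)]
    return max(population, key=fitness), history

best, history = run_ga()
print(sum(best), history[0], history[-1])
```

Since the fittest parents are carried over unchanged, the best fitness recorded per generation never decreases in this sketch.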


Overall, this algorithm aims to iteratively improve the quality of timetables by evolving populations of solutions through selection, crossover, and mutation, while ensuring adherence to constraints and maximizing fitness.

We observed that, for the given data, the following parameters gave the best output:

Generations: 10

Population: 100

In [None]:
def generate_valid_timetable():
    timetable = pd.DataFrame(index=days, columns=slots, data=None)
    subject_count = {sub: 0 for sub in df["Subject"]}
    daily_workload = {day: 0 for day in days}

    consecutive_hours = {sub: 0 for sub in df[df["Type"] == "Lec"]["Subject"]}
    weekly_hours = {day: 0 for day in days}

    lab_classes = df[df["Type"] == "Lab"]
    lec_classes = df[df["Type"] == "Lec"]

    # Define priority slots for lab classes with 2 hours duration
    lab_priority_slots = [['08:30', '09:20'], ['10:30', '11:20'], ['01:40', '02:30']]

    # Iterate through lab classes first
    for idx, row in lab_classes.iterrows():
        sub = row["Subject"]
        hours = row["No. of Hrs"]
        classtype = row["Type"]
        _staff=", ".join(course_to_faculty[sub])

        if classtype == "Lab":
            lab_room = row["Lab_preference"]
            if hours == 4:
                # 4-hour labs take either the whole morning or the whole afternoon
                valid_slots = [['08:30', '09:20', '10:30', '11:20'], ['01:40', '02:30', '03:30', '04:20']]
            else:
                # 2-hour labs take one of the priority slot pairs
                valid_slots = lab_priority_slots

            # Pick a random day and slot range; retry until the class, the lab room
            # (if any) and every assigned faculty member are free in all of its slots
            day = random.choice(days)
            slot_range = random.choice(valid_slots)
            while any(not pd.isna(timetable.loc[day, slot])
                      or (not pd.isna(lab_room) and not pd.isna(lab_dfs[lab_room].loc[day, slot]))
                      or not all(pd.isna(faculty_df[f].loc[day, slot]) for f in course_to_faculty[sub])
                      for slot in slot_range):
                day = random.choice(days)
                slot_range = random.choice(valid_slots)

            # Assign the lab class to the selected slots
            for slot in slot_range:
                timetable.loc[day, slot] = sub + f" ({_staff})"
                subject_count[sub] += 1
                daily_workload[day] += 1

    # For Lecture classes
    for idx, row in lec_classes.iterrows():
        sub = row["Subject"]
        hours = row["No. of Hrs"]
        classtype = row["Type"]
        _staff=random.choice(course_to_faculty[sub])

        for _ in range(hours):
            # Prioritize slots from 8:30 to 2:30 for lecture classes
            prioritized_slots = ['08:30','09:20', '10:30', '11:20', '01:40', '02:30']
            # Shuffle the prioritized slots to randomize selection
            random.shuffle(prioritized_slots)

            day, slot = None, None
            for ps in prioritized_slots:
                day = random.choice(days)
                if pd.isna(timetable.loc[day, ps]) and pd.isna(faculty_df[_staff].loc[day, ps]):
                    slot = ps
                    break

            if day is None or slot is None:
                # If the prioritized slots are filled, choose randomly from all slots
                valid_slots = [(d, s) for d in days for s in slots if pd.isna(timetable.loc[d, s]) and pd.isna(faculty_df[_staff].loc[d, s])]
                if valid_slots:
                    day, slot = random.choice(valid_slots)
                else:
                    # Handle the case when no valid slots are available (optional)
                    # You might want to log this situation or handle it differently based on your requirements
                    pass

            if day is not None and slot is not None:
                timetable.loc[day, slot] = sub + f" ({_staff})"
                subject_count[sub] += 1
                daily_workload[day] += 1

    # Check if all subjects have been allocated the correct number of hours
    for sub, count in subject_count.items():
        if count != df[df["Subject"] == sub]["No. of Hrs"].values[0]:
            # If the count doesn't match the number of hours, regenerate timetable
            return generate_valid_timetable()

    # Check if the workload is distributed evenly across days
    max_workload = max(daily_workload.values())
    min_workload = min(daily_workload.values())
    workload_difference = max_workload - min_workload
    if workload_difference > 1:  # Adjust as needed based on your workload balancing preference
        # If workload difference is greater than 1, regenerate timetable
        return generate_valid_timetable()

    return timetable

def is_valid_timetable(timetable, subject):
    # Constraint 1: Each subject should be allocated the correct number of hours
    # Extract the subject code of every occupied cell once, then compare counts
    tt_course_list = [_tt.split(" ")[0] for _tt in timetable.stack().tolist()]
    for sub, hours in subject.items():
        if tt_course_list.count(sub) != hours[0]:
            return False

    # Constraint 2: Workload should be distributed evenly across days
    min_workload = min(timetable.apply(lambda x: x.count(), axis=1))
    max_workload = max(timetable.apply(lambda x: x.count(), axis=1))
    if max_workload - min_workload > 1:  # Adjust as needed based on your workload balancing preference
        return False

    # Constraint 3: There should be no more than 7 consecutive hours of classes
    for day, slots in timetable.iterrows():
        consecutive_hours = 0
        for slot in slots:
            if pd.notna(slot):
                consecutive_hours += 1
                if consecutive_hours > 7:
                    return False
            else:
                consecutive_hours = 0

    # Constraint 4: There should be no more than 2 consecutive lecture classes of the same subject
    lecture_subjects = [sub for sub, details in subject.items() if details[1] == 'Lec']
    for sub in lecture_subjects:
        for day, day_slots in timetable.iterrows():
            consecutive_lectures = 0  # A run does not carry over from one day to the next
            for slot in day_slots:
                # Compare subject codes exactly; a substring test could match a longer code
                if pd.notna(slot) and slot.split()[0] == sub:
                    consecutive_lectures += 1
                    if consecutive_lectures > 2:
                        return False
                else:
                    consecutive_lectures = 0

    # Constraint 5: all periods of a lab class must be allocated sequentially
    lab_classes = {sub: hours[0] for sub, hours in subject.items() if hours[1] == 'Lab'}
    for sub in lab_classes:
        lab_class_count = 0
        non_lab_class_found = False
        for day in timetable.index:
            exit_slot_loop = False
            for slot in timetable.columns:
                if pd.notna(timetable.loc[day, slot]):
                    if timetable.loc[day, slot].split(" ")[0] == sub:
                        lab_class_count += 1
                        if non_lab_class_found:
                            # Another class interrupts this lab's block
                            return False
                    elif lab_class_count > 0:
                        non_lab_class_found = True
                    if lab_class_count == lab_classes[sub]:
                        # All periods of this lab have been seen
                        exit_slot_loop = True  # Set flag to exit the day loop as well
                        break  # Exit the slot loop
            if exit_slot_loop:
                break  # Exit the day loop if flag is set


    # Constraint 6: The same faculty member should be assigned to every period of a lecture course
    # Constraint 7: There should be no conflict between the class timetable and the faculty timetables
    course_faculty_map = {}

    for day in timetable.index:
        for slot in timetable.columns:
            cell_value = timetable.loc[day, slot]
            if pd.notna(cell_value):  # Check if the cell is not NaN
                courses=cell_value.split(' (')
                course, faculty = courses[0], courses[1].rstrip(')').split(', ')

                if course not in course_faculty_map:
                    course_faculty_map[course] = set()
                for _faculty in faculty:
                    course_faculty_map[course].add(_faculty.strip())

                #check if only 1 faculty exist for one course
                if subject[course][1]=='Lec' and len(course_faculty_map[course])>1:
                    return False

                #Check if there is no conflict in class_tt and faculty_tt
                for _faculty in course_faculty_map[course]:
                    if pd.notna(faculty_df[_faculty].loc[day,slot]):
                        return False
    return True

def calculate_fitness(timetable):
    days = timetable.index
    slots = timetable.columns

    # Calculate additional penalty points for constraints violation
    penalty_points = 0
    for day in days:
        consecutive_hours = 0
        prev_subject = None
        prev_slot = None
        for slot in slots:
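            if pd.isna(timetable.loc[day, slot]):
                # Empty slot: extend the current run of free periods
                consecutive_hours += 1
                # Cells hold "subject (staff)" strings, so this "Break" check never matches as written
                if prev_subject == "Break" and prev_slot == "01:40" and slot == "02:30":
                    penalty_points -= 1
                if consecutive_hours > 7:
                    penalty_points += 5  # Add penalty: all 8 slots of the day are empty
                    break  # No need to continue checking if limit is reached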
            else:
                subject = timetable.loc[day, slot]

                if prev_subject is not None and subject != prev_subject:
                    if prev_slot is not None and prev_slot != "04:20" and slot != "08:30":
                        penalty_points -= 1

                consecutive_hours = 0
                prev_subject = subject
            prev_slot = slot

    # Calculate total fitness
    total_fitness = -penalty_points  # Penalize for violations
    return total_fitness

def selection(population, fitness_scores):
    # Sort indices based on fitness scores in descending order
    sorted_indices = sorted(range(len(fitness_scores)), key=lambda k: fitness_scores[k], reverse=True)

    # Select the best half of the population
    selected_indices = sorted_indices[:len(population)//2]
    selected_individuals = [population[i] for i in selected_indices]

    # If the population size is odd, randomly select one more individual
    if len(selected_individuals) % 2 != 0:
        selected_individuals.append(random.choice(population))

    return selected_individuals

def crossover(parent1, parent2):
    # Initialize offspring as copies of parents
    offspring1 = parent1.copy()
    offspring2 = parent2.copy()

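    # Select a random crossover point, expressed as positions in the grid
    crossover_day = random.randrange(len(parent1.index))
    crossover_slot = random.randrange(len(parent1.columns))

    # Perform single-point crossover. Positions are compared rather than labels,
    # because day names and slot strings such as '01:40' do not sort chronologically.
    for day_pos, day in enumerate(parent1.index):
        for slot_pos, slot in enumerate(parent1.columns):
            if (day_pos, slot_pos) >= (crossover_day, crossover_slot):
                # Swap subjects between parents from the crossover point onwards
                offspring1.loc[day, slot], offspring2.loc[day, slot] = (
                    parent2.loc[day, slot],
                    parent1.loc[day, slot]
                )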

    # Ensure offspring timetables are valid
    if not is_valid_timetable(offspring1,subject):
        offspring1 = parent1.copy()  # Revert to parent1 if offspring1 is invalid
    if not is_valid_timetable(offspring2,subject):
        offspring2 = parent2.copy()  # Revert to parent2 if offspring2 is invalid

    return offspring1, offspring2

def mutation(individual):
    mutated_individual = individual.copy()

    # Select a random day and slot for mutation
    mutation_day = random.choice(individual.index)
    mutation_slot = random.choice(individual.columns)

    # Define valid subjects based on the subjects already present in the timetable
    valid_subjects = [sub for sub in individual.stack().unique() if pd.notna(sub)]

    # Select a new subject for mutation
    new_subject = random.choice(valid_subjects)

    # Perform mutation by replacing the subject at the mutation slot with the new subject
    mutated_individual.loc[mutation_day, mutation_slot] = new_subject

    # Ensure mutated individual is a valid timetable
    if not is_valid_timetable(mutated_individual,subject):
        mutated_individual = individual.copy()  # Revert to original if mutated individual is invalid

    return mutated_individual

# Genetic algorithm main loop
def genetic_algorithm(population_size, generations):
    # Initialize population
    population = initialize_population(population_size)

    for generation in range(generations):
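        # Evaluate fitness of each individual in the population
        fitness_scores = [calculate_fitness(individual) for individual in population]
        print(f"Generation {generation + 1}: best fitness = {max(fitness_scores)}, "
              f"average fitness = {sum(fitness_scores) / len(fitness_scores):.2f}")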

        # Select individuals for crossover
        selected_parents = selection(population, fitness_scores)

        # Perform crossover to create offspring
        offspring = []
        for i in range(0, len(selected_parents), 2):
            parent1 = selected_parents[i]
            parent2 = selected_parents[i+1]
            child1, child2 = crossover(parent1, parent2)
            offspring.extend([child1, child2])

        # Perform mutation on offspring
        mutated_offspring = [mutation(individual) for individual in offspring]

        # Combine the mutated offspring and the fittest parents to form the new population
        population = mutated_offspring + selected_parents[:population_size - len(mutated_offspring)]

    # Select the best individual from the final population as the solution
    best_max_individual=max(population,key=calculate_fitness)

    return best_max_individual

def initialize_population(population_size):
    population = []
    for _ in range(population_size):
        timetable = generate_valid_timetable()
        population.append(timetable)
    return population

Here we iterate over each year's subjects and generate a timetable for that year using the genetic algorithm.

* Generate key-value pairs for subjects, staff and labs
* For each staff member and lab, initialize a timetable as a separate DataFrame

After all the class timetables are generated, start filling in the staff and lab timetables.

* For each class timetable, traverse each cell and fill in the corresponding staff and lab timetables
* Split each cell value into the course and the staff names
* Pass them to their respective functions to allocate slots in the staff and lab timetables
* Repeat this process for each class timetable to obtain the final staff and lab timetables
* Finally, export them as CSVs for future use
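
The cell-splitting and allocation steps above can be sketched as follows. This is a minimal illustration, not the notebook's helper functions: `parse_cell` and `fill_staff_timetables` are hypothetical names, the class label `'Year 1'` is made up, and cells are assumed to follow the `CODE (S1, S2)` format produced earlier.

```python
import pandas as pd

def parse_cell(cell):
    """Split 'CODE (S1, S2)' into ('CODE', ['S1', 'S2'])."""
    course, _, rest = cell.partition(' (')
    return course, [name.strip() for name in rest.rstrip(')').split(', ')]

def fill_staff_timetables(class_tt, class_name, staff_tts):
    """Copy every occupied cell of a class timetable into each listed staff timetable."""
    for day in class_tt.index:
        for slot in class_tt.columns:
            cell = class_tt.loc[day, slot]
            if pd.isna(cell):
                continue
            course, staff_names = parse_cell(cell)
            for name in staff_names:
                if name not in staff_tts:
                    # Create an empty timetable with the same days and slots
                    staff_tts[name] = pd.DataFrame(
                        index=class_tt.index, columns=class_tt.columns, dtype=object)
                staff_tts[name].loc[day, slot] = f"{course} ({class_name})"
    return staff_tts

class_tt = pd.DataFrame(index=['Monday', 'Tuesday'], columns=['08:30', '09:20'], dtype=object)
class_tt.loc['Monday', '08:30'] = 'CS101 (S1, S2)'
class_tt.loc['Tuesday', '09:20'] = 'MA102 (S3)'

staff_tts = fill_staff_timetables(class_tt, 'Year 1', {})
print(sorted(staff_tts))
print(staff_tts['S1'])
```

The same pattern would apply to lab timetables, keyed by lab name instead of staff name.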

get_staff_df() function: creates and allocates a new DataFrame for a staff member.

add_staff_df() function: allocates the course in the respective faculty timetable.

get_lab_df() function: creates and allocates a new DataFrame for a lab.

add_lab_df() function: allocates the course in the respective lab timetable.

In [None]:
slots = ['08:30', '09:20', '10:30', '11:20', '01:40', '02:30', '03:30', '04:20']

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

i=1

faculty_df={}
lab_dfs={}

for df in df_list:
    # Generating key-value pairs for subject
    subject = {}
    for index, row in df.iterrows():
        key = row['Subject']
        value = [row['No. of Hrs'], row['Type']]
        subject[key] = value

    print("Subjects:")
    print(subject)

    # Generating key-value pairs for labs
    labs = {}
    for index, row in df.iterrows():
        if row['Type'] == 'Lab' and pd.notna(row['Lab_preference']):
            key = row['Subject']
            value = row['Lab_preference']
            labs[key] = value

    print("Labs:")
    print(labs)

    # Generating key-value pairs for staff
    staff = {}
    for index, row in df.iterrows():
        # Strip each name so that "S1,S2" and "S1, S2" are handled alike
        faculties = [name.strip() for name in row['Faculties'].split(',')]
        for faculty in faculties:
            staff.setdefault(faculty, []).append(row['Subject'])

    staff=dict(sorted(staff.items()))
    print("Staff:")
    print(staff)

    # Make sure every staff member of this year has a timetable in faculty_df
    for _staff in staff.keys():
        if _staff not in faculty_df:
            # Initialize as new DataFrame
            faculty_df[_staff] = pd.DataFrame(index=days, columns=slots, data=None)
        else:
            # Already created for an earlier year; keep it so allocations accumulate.
            # TODO: if a CSV file is added, add its reference to faculty_df here
            pass

    # Make sure every lab used this year has a timetable in lab_dfs
    for _lab in list(labs.values()):
        if _lab not in lab_dfs:
            # Initialize as new DataFrame
            lab_dfs[_lab] = pd.DataFrame(index=days, columns=slots, data=None)
        else:
            # Already created for an earlier year; keep it so allocations accumulate.
            # TODO: if a CSV file is added, add its reference to lab_dfs here
            pass

    course_to_faculty = {}
    for faculty, courses in staff.items():
        for course in courses:
            course_to_faculty.setdefault(course, []).append(faculty)

    best_timetable = genetic_algorithm(population_size=10, generations=2)
    print("Best Timetable:")
    print(best_timetable)
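    timetable_path = f'/content/year_{i}_timetable.csv'
    best_timetable.to_csv(timetable_path)

    # Start allocating staff and labs.
    # The CSV is read back without index_col, so the day names arrive as the first column.
    class_tt = pd.read_csv(timetable_path)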

    # Function to create or get a dataframe for a staff member
    def get_staff_df(staff_no):
        if staff_no not in faculty_df:
            faculty_df[staff_no] = pd.DataFrame(index=days, columns=slots, data=None)
        return faculty_df[staff_no]

    # Function to create or get a dataframe for a lab
    def get_lab_df(lab):
        if lab not in lab_dfs:
            lab_dfs[lab] = pd.DataFrame(index=days, columns=slots, data=None)
        return lab_dfs[lab]

    # Function to add course to the respective staff dataframe
    def add_course_to_staff(course, staff_no,time_slot,day_value):
        staff_df = get_staff_df(staff_no)
        staff_df.loc[day_value,time_slot]=course

    # Function to add course to the respective lab dataframe
    def add_course_to_lab(course, lab,time_slot,day_value):
        lab_df = get_lab_df(lab)
        lab_df.loc[day_value,time_slot]=course

    # Traverse the class timetable and copy each entry into the staff and lab timetables
    def call_function(schedule_data, subject, labs, class_name):
        for index, row in schedule_data.iterrows():
            for time_slot, cell in row.items():
                if pd.isna(cell):  # Skip empty cells
                    continue
                if cell in days:  # First column holds the day name; remember it and move on
                    day_value = cell
                    continue
                courses = cell.split()
                course_no = courses[0] + class_name
                # Everything after the course code is the staff list, e.g. "(S3, S4)"
                staff_str = ' '.join(courses[1:])
                staff_no = [name.strip() for name in staff_str.strip("()").split(",")]

                # Allocate the staff timetables (one or several staff members)
                for name in staff_no:
                    add_course_to_staff(course_no, name, time_slot, day_value)

                # Allocate the lab timetable
                if subject[courses[0]][1] == 'Lab' and courses[0] in labs:
                    lab_name = labs[courses[0]]
                    add_course_to_lab(course_no, lab_name, time_slot, day_value)

    call_function(class_tt,subject,labs," G1")

    # Print the dataframes for each staff member
    for staff_no, staff_df in faculty_df.items():
        print(f"Staff {staff_no}:")
        print(staff_df)
        staff_df.to_csv(staff_no+" timetable.csv",index=True)
        print()

    for lab_name, lab_df in lab_dfs.items():
        print(f"Lab {lab_name}:")
        print(lab_df)
        lab_df.to_csv(lab_name+" timetable.csv",index=True)
        print()
    i+=1

Subjects:
{'19G105': [3, 'Lec'], '19Z101': [4, 'Lec'], '19Z102': [3, 'Lec'], '19Z103': [3, 'Lec'], '19Z104': [3, 'Lec'], '19Z110': [4, 'Lab'], '19Z111': [2, 'Lab'], '19Z112': [4, 'Lab']}
Labs:
{'19Z110': 'OL', '19Z111': 'HL', '19Z112': 'SL'}
Staff:
{'S1': ['19G105'], 'S11': ['19Z101'], 'S12': ['19Z101'], 'S15': ['19Z104', '19Z112'], 'S16': ['19Z104', '19Z112'], 'S17': ['19Z112'], 'S18': ['19Z112'], 'S2': ['19G105'], 'S21': ['19Z111'], 'S22': ['19Z111'], 'S3': ['19Z102', '19Z110'], 'S4': ['19Z102', '19Z110'], 'S5': ['19Z103', '19Z110'], 'S6': ['19Z103', '19Z110']}
Best Timetable:
                             08:30                    09:20  \
Monday                         NaN              19Z102 (S4)   
Tuesday    19Z110 (S3, S4, S5, S6)  19Z110 (S3, S4, S5, S6)   
Wednesday                      NaN              19G105 (S1)   
Thursday         19Z111 (S21, S22)        19Z111 (S21, S22)   
Friday                 19G105 (S1)                      NaN   

                             10:30 