# Facility Booking List for FBS Booking (Ver 3.0)
>
    **Objective**: Produce a populated Facility Booking List with all cells in Text Format 
>
    **Overview**: 
- Assign a data frame *(a table-like structure with rows and columns used to manipulate data)* to progressively obtain the necessary column values from the various Excel sheets provided
- Throughout the process of obtaining values, data processing and cleaning are needed to obtain the values in the desired format.
- The data frame is then used to populate the Facility Booking List.
>
    **Step-by-Step Approach**:
1. Load all Excel files required to obtain the necessary values
    - Read and combine all the gvSession files into a singular dataframe
2. Filter Out Necessary Columns in gvSession, and for courses that are not conducted within SMU, upload the data to a separate sheet in the Facility Booking List template file
    - "Sch #" column will also be extracted from the "gvSession" file
3. Obtain Time From and Time To Values from gvSession
    - Perform data cleaning to reflect the requirements for the Booking Start Time and End Times
4. Obtain Use Type Column by referencing if courses have an IO code or not in (Latest) SSG Approved_Master Listing 
    - If the course has an IO code, use "Event". If not, use "Adhoc"
5. Obtain Purpose Column 
6. Obtain Event Code Column by referencing if courses have an IO code or not in (Latest) SSG Approved_Master Listing 
    - Event Code will be the IO Code
7. Obtain Facility Name, Building, and Floor columns
8. Remove any unnecessary columns, keeping only the columns needed in the Facility Booking List file
9. Perform all additional checks
    - For courses that run for more than 1 day, assign the same venue
    - Check for venue clashes and see if 2 or more courses have the same venue, and correct accordingly
    - Book Catering for the respective venues
    - Generate the "Venue Preference" Column
    - Generate the "No.of Course Days" Column to indicate the run duration for each course, i.e. 1, 2, 3 days or "Above 3 Days"
    - If course code cannot be found in FBS Report: Indicate "Assignment of Venue required" under "Venue" Column
    - Ensure that all the facility names follow the naming convention suitable for FBS, using Facility Names File to look up the correct naming  
10. Final Touches
    - Rename "Date of Booking", "Time Booking From" and "Time Booking To" to required format
    - Generate the "Any Comments" column for users to indicate additional comments
    - Sort the dataframe to group courses together, by "Purpose" and "Date of Booking"
    - Change the format of all column values to text format
11. Populate Facility+Booking+List.xlsx with the final DataFrame


## 1.0 Import python libraries & read all datasets and files

In [1]:
import pandas as pd
import numpy as np
import warnings
import os
import glob
import datetime
import openpyxl
import random
import logging


from dateutil.parser import parse, ParserError
from openpyxl import load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows
from openpyxl.utils import get_column_letter
from datetime import datetime, timedelta

#Ignore warnings
warnings.filterwarnings("ignore")

### 1.1 Verify The Current Working Directory, i.e. the location where all files are placed

In [2]:
# Check current directory
print(os.getcwd())  # Should be the folder that contains all the input files

c:\Users\somes\OneDrive\Desktop\PT Job Materials\FBS Pre-Booking Code\17th Feb Testing


### 1.2 Import All Required Datasets 

- **FBS Report:** BKG01+-+Details+of+Facilities+Booking+by+Individual's+Organization+Unit_Cost+Centre.xlsx
- **TMS File:** gvSession.xlsx
- **SMUA Venue Matters File:** SMUA Venue matters.xlsx
- **Facility Names File:** Facility Names.xlsx
- **SSG Master Listing File:** (Latest) SSG Approved_Master Listing.xlsx
- **Facility Booking File:** Facility+Booking+List.xlsx (Will be used later)
- **Venues Booked by Stakeholders**: Venue booked by stakeholders.xlsx (Will be used later) 
- **Facility List File**: Facility List.xlsx (Will be used later) 

In [3]:
# gvSession
gvSession_initial_df = pd.read_excel('gvSession.xlsx')
print(gvSession_initial_df.shape)

# Rooms Booked over FBS: skip the first 9 rows (so that row 10 becomes the header) and drop the last (footer) row
fbs_report_df = pd.read_excel("BKG01+-+Details+of+Facilities+Booking+by+Individual's+Organization+Unit_Cost+Centre.xlsx", skiprows=9, skipfooter=1)

# SMUA Venue Matters: sheet_name=None loads every sheet into a dict of dataframes keyed by sheet name
pillar_room_pref_df = pd.read_excel('SMUA Venue matters.xlsx', sheet_name=None, skiprows=1)

# SSG Master Listing, read the "Master" Sheet into the dataframe
ssg_master_listing_df = pd.read_excel('(Latest) SSG Approved_Master Listing.xlsx', sheet_name='Master')

# Facility Names
facility_names_df = pd.read_excel('Facility Names.xlsx')

# List of Facilities available to Book
facility_list_df = pd.read_excel('Facility List.xlsx')

# Stakeholder Courses
stakeholder_courses_df = pd.read_excel('Venue booked by stakeholders.xlsx')

print("All Files Loaded Successfully!")

(1000, 22)
All Files Loaded Successfully!


In [4]:
# import pandas as pd

# # Load the Excel file and specify the sheet name
# facility_list_df_1 = pd.read_excel('Facility+Booking+List.xlsx', sheet_name='02. Facility List')

# # Display the DataFrame
# facility_list_df_1


### 1.3 Merge All "gvSession" Files

1. There are multiple gvSession files that reside in the same directory as the python code. We need to identify all of these files, read all of these files into Python, and combine them into a singular gvSession dataframe. 

2. To identify these files, we check whether the filename in the current directory starts with the "gvSession" prefix, and then concatenate the matching files together

In [5]:
#get the current working directory
current_directory = os.getcwd()
directory_path = current_directory
file_prefix = "gvSession"
file_paths = glob.glob(os.path.join(directory_path, f"{file_prefix}*.xlsx"))

dataframes = []

# Iterate through the file paths and read Excel files
for file_path in file_paths:
    try:
        df = pd.read_excel(file_path)
        dataframes.append(df)
    except Exception as e:
        print(f"Error reading file {file_path}: {e}")

# Concatenate all dataframes if there are any
if dataframes:
    gvSession_df = pd.concat(dataframes, ignore_index=True)
    print("Concatenation successful.")
else:
    print("No files found matching the pattern.")

gvSession_df.shape

Concatenation successful.


(3145, 22)

**BEFORE PROCEEDING: The "directory" variable below defaults to the current working directory. If your gvSession files are stored elsewhere, change it to your own directory, and make sure to put the letter "r" at the start of the path, before the quotes.**

1. Next, we need to verify that the dimensions of the **combined gvSession dataframe** (the number of rows and columns) match the total number of rows and the column count of all **gvSession** files found in the current working directory.

2. Similar to above, we loop through the directory, check whether each filename starts with the "gvSession" prefix, print the shape of each matching file, and sum up the rows to see if the total tallies with the current "gvSession" dataframe



In [6]:
# Directory where the "gvSession" files are located (defaults to the current working directory)
directory = directory_path

total_cols = 0
total_rows = 0

# Identify and loop through all "gvSession" files in the directory
for filename in os.listdir(directory):
    if filename.startswith('gvSession') and filename.endswith('.xlsx'):
        file_path = os.path.join(directory, filename)
        
        # Get the shape of each "gvSession" Excel file
        xls = pd.ExcelFile(file_path)
        sheet_names = xls.sheet_names
        print(f"File: {filename}")
        num_rows = 0
        num_cols = 0
        
        for sheet_name in sheet_names:
            df = pd.read_excel(file_path, sheet_name=sheet_name)
            rows, cols = df.shape
            # Per-file totals
            num_rows += rows
            num_cols += cols
            # Grand totals, accumulated in the same pass so each sheet is read only once
            total_rows += rows
            total_cols += cols
            print(f"  - Sheet '{sheet_name}': Shape {df.shape}")
        
        print(f"Total sum of shapes for '{filename}': {num_rows} rows, {num_cols} columns")
        print()  # Empty line for better readability between files

print(f"Total sum of rows and columns for all files: {total_rows} rows, 22 columns")

print()
#Compare with the shape of the combined gvSession dataframe 
print(f"Shape of the combined gvSession Dataframe: {gvSession_df.shape}")

File: gvSession (1).xlsx
  - Sheet 'Sheet': Shape (1000, 22)
Total sum of shapes for 'gvSession (1).xlsx': 1000 rows, 22 columns

File: gvSession (2).xlsx
  - Sheet 'Sheet': Shape (1000, 22)
Total sum of shapes for 'gvSession (2).xlsx': 1000 rows, 22 columns

File: gvSession (3).xlsx
  - Sheet 'Sheet': Shape (145, 22)
Total sum of shapes for 'gvSession (3).xlsx': 145 rows, 22 columns

File: gvSession.xlsx
  - Sheet 'Sheet': Shape (1000, 22)
Total sum of shapes for 'gvSession.xlsx': 1000 rows, 22 columns

Total sum of rows and columns for all files: 3145 rows, 22 columns

Shape of the combined gvSession Dataframe: (3145, 22)
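
The manual comparison above can also be written as an assertion, so that a mismatch stops the notebook instead of relying on a visual check. The sketch below is a minimal illustration on toy frames. `check_combined_shape` is a hypothetical helper name that is not part of the original notebook, and it assumes every file contributes one sheet with the same number of columns.

```python
import pandas as pd

def check_combined_shape(frames, combined):
    """Raise AssertionError if `combined` does not match the stacked `frames`."""
    expected_rows = sum(len(frame) for frame in frames)
    column_counts = {frame.shape[1] for frame in frames}
    assert len(column_counts) == 1, f"Files disagree on column count: {column_counts}"
    expected_shape = (expected_rows, column_counts.pop())
    assert combined.shape == expected_shape, (
        f"Combined shape {combined.shape} != expected {expected_shape}"
    )
    return expected_shape

# Toy stand-ins for the per-file dataframes (made-up values)
frames = [
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}),
    pd.DataFrame({"a": [5], "b": [6]}),
]
combined = pd.concat(frames, ignore_index=True)
shape = check_combined_shape(frames, combined)
```

In the notebook, the same helper could be called with the `dataframes` list and `gvSession_df` right after the concatenation step.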


## 2.0 Extract relevant columns from gvSession

From the combined "gvSession" dataframe, we will extract the following columns

- Dept
- Sch #
- Course Code
- Course Title
- Session Type
- Session Date
- Session Day
- S-Time
- E-Time
- Venue
- Sch Status

**Note that we only need values where the "Sch Status" Column indicates "Pending" or "Confirmed"**

In [7]:
filtered_columns_df = gvSession_df.loc[:,['Dept','Sch #','Course Code','Course Title','Session Type',
                              'Session Date','Session Day','S-Time','E-Time','Venue', 'Sch Status']]

#Filter the dataframe such that the "Sch Status" column indicates either "Pending" or "Confirmed"
filtered_columns_df = filtered_columns_df[filtered_columns_df['Sch Status'].isin(['Pending', 'Confirmed'])]
 
print("Rows with courses of all dates in gvSession:", filtered_columns_df.shape[0])

Rows with courses of all dates in gvSession: 2544


### 2.1 Filter out rows

1. For all other courses that are not conducted in SMU, filter out the data and upload it to a separate sheet in the Facility Booking Template
2. Only courses that are conducted in SMU require venues to be booked, and these courses are indicated as "SMUA Room 1" in the "Venue" Column
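
A minimal sketch of this split on toy data, assuming the on-campus marker is the literal string "SMUA Room 1". The frame and its values are made up for illustration. It shows that rows with a missing venue land in the "other" group, because a missing value never equals the marker.

```python
import numpy as np
import pandas as pd

toy_df = pd.DataFrame({
    "Course Code": ["A1", "B2", "C3", "D4"],
    "Venue": ["SMUA Room 1", "Online Class", np.nan, "SMUA Room 1"],
})

on_campus_mask = toy_df["Venue"] == "SMUA Room 1"
physical_toy_df = toy_df[on_campus_mask]
other_toy_df = toy_df[~on_campus_mask]  # includes the row with a missing venue

# The two groups always partition the original rows
assert len(physical_toy_df) + len(other_toy_df) == len(toy_df)
```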

In [8]:
physical_courses_df = filtered_columns_df[filtered_columns_df["Venue"] == "SMUA Room 1"]
other_courses_df = filtered_columns_df[filtered_columns_df['Venue'] != "SMUA Room 1"]

Check the number of rows in each dataframe

In [9]:
#Check the number of rows in both dataframes
print(f"Number of rows in other_courses_df: {other_courses_df.shape[0]}")
print(f"Number of rows in physical_courses_df: {physical_courses_df.shape[0]}")

Number of rows in other_courses_df: 683
Number of rows in physical_courses_df: 1861


In [10]:
other_courses_df["Venue"].unique()

array(['Straits Interactive (Blk 43A Beach Road #02-00 Evershine & Century Complex Singapore 189681)',
       'Online Class', 'Asynchronous E-learning', 'South Korea', nan,
       'To be confirmed', 'San Francisco', 'Chongqing, China',
       'CAG Office', 'Jeju', 'Japan', 'Taiwan', 'Bangkok'], dtype=object)

In [11]:
physical_courses_df["Venue"].unique()

array(['SMUA Room 1'], dtype=object)

Upload the dataframe of courses not conducted in SMU into a separate sheet in the Facility Bookings Template File

In [12]:
#Specify the output file path, which will be FBS Template Excel File in this case, and the sheet name you want to put the data under
output_file_path = 'Facility+Booking+List.xlsx'
sheet_name = 'Courses not conducted in SMU'

# Open the existing workbook
workbook = load_workbook(output_file_path)

# Delete the sheet if it exists
if sheet_name in workbook.sheetnames:
    sheet = workbook[sheet_name]
    workbook.remove(sheet)
    workbook.save(output_file_path)

# Write the other_courses_df to the output file
with pd.ExcelWriter(output_file_path, engine='openpyxl', mode='a') as writer:
    other_courses_df.to_excel(writer, sheet_name=sheet_name, index=False)

    # Access the newly written sheet
    sheet = writer.sheets[sheet_name]

    # Autofit column widths to the longest value in each column
    for column in sheet.columns:
        column_letter = column[0].column_letter
        # Convert every value to text first, so numbers and dates are measured too
        max_length = max(
            (len(str(cell.value)) for cell in column if cell.value is not None),
            default=0,
        )
        sheet.column_dimensions[column_letter].width = max_length + 2

print("Data for all other courses not conducted within SMU has been uploaded to Sheet: Courses not conducted in SMU")

other_courses_df.head()

Data for all other courses not conducted within SMU has been uploaded to Sheet: Courses not conducted in SMU


Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status
5,Finance & Technology,SCH24-Advanced Diploma-SCTP-ACGRMDC4-00001,SCTP-ACGRMDC4,"Advanced Certificate in Governance, Risk Manag...",Assessment #05,2025-02-07,Friday,06:00 PM,07:00 PM,Straits Interactive (Blk 43A Beach Road #02-00...,Pending
7,Finance & Technology,SCH24-Not Specified-CIPPE-00005,CIPPE,Certified Information Privacy Professional/ Eu...,Assessment #02,2025-03-20,Thursday,02:00 PM,06:00 PM,Online Class,Pending
18,"Human Capital, Management & Leadership",SCH24-Graduate Certificate-HGCTASFS-00001,HGCTASFS,HR Graduate Certification - Talent Acquisition,Morning #02,2025-04-05,Saturday,09:00 AM,01:00 PM,Online Class,Pending
25,Finance & Technology,SCH24-Not Specified-ECBDAFS-00023,ECBDAFS,Executive Certificate in Blockchain and Digita...,,2025-02-28,Friday,10:59 PM,11:59 PM,Asynchronous E-learning,Confirmed
26,"Services, Operations and Business Improvement",SCH24-Advanced Diploma-ACADAM2SF-00004,ACADAM2SF,Advanced Certificate in Applied Data Analytics...,,2025-01-22,Wednesday,01:00 PM,03:30 PM,Online Class,Pending


### 2.2 Further filter out courses that are booked by stakeholders

In [13]:
print(f"Number of rows in Venues Booked By Stakeholders excel: {stakeholder_courses_df.shape[0]}")
print(f"Number of rows in physical_courses_df: {physical_courses_df.shape[0]}")
stakeholder_courses_df.head()

Number of rows in Venues Booked By Stakeholders excel: 22
Number of rows in physical_courses_df: 1861


Unnamed: 0,Course Title,Course Code,Any Comments\n
0,Coding with Python: Workshop for Accounting an...,CPWAFP,Will be booked by SOA
1,Cyber Security Risk Management for Finance Pro...,CSRMFP,Will be booked by SOA
2,"Fund Raising, IPOs & Capital Restructuring Wor...",FRICRW,Will be booked by SOA
3,Graduate Certificate in Law and Technology Mod...,GCLTM1,Will be booked by SMULA
4,Graduate Certificate in Law and Technology Mod...,GCLTM10,Will be booked by SMULA


If there are rows in physical_courses_df that match either a Course Title or a Course Code in stakeholder_courses_df, then filter out these rows from the physical courses

In [14]:
filtered_physical_df = physical_courses_df[
    ~(
        physical_courses_df['Course Title'].isin(stakeholder_courses_df['Course Title']) |
        physical_courses_df['Course Code'].isin(stakeholder_courses_df['Course Code'])
    )
]

In [15]:
#Check if they still exist in the dataframe
matching_rows = filtered_physical_df[
    filtered_physical_df['Course Title'].isin(stakeholder_courses_df['Course Title']) |
    filtered_physical_df['Course Code'].isin(stakeholder_courses_df['Course Code'])
]

# If matching_rows is not empty, there are still matching rows
if not matching_rows.empty:
    print("There are matching rows:")
    print(matching_rows)
else:
    print("No matching rows found.")
    
print(filtered_physical_df.shape[0])

No matching rows found.
1836


## 3.0 Obtain Time From and Time To Values

For each course, obtain the "Time of Booking From " and "Time of Booking To" by separating the morning, afternoon and night courses 

In [16]:
#Continue with the filtered physical courses under a new name (this refers to the same dataframe, it is not a copy)
facility_booking_df = filtered_physical_df

In [17]:
facility_booking_df.head()

Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status
1,Finance & Technology,SCH24-Not Specified-ACCMBM6V2-00002,ACCMBM6V2,Advanced Certificate in Crisis Management for ...,Assessment #02,2025-01-20,Monday,02:00 PM,06:00 PM,SMUA Room 1,Pending
2,Finance & Technology,SCH24-Not Specified-ECCRMM3-00002,ECCRMM3,Executive Certificate in Corporate Relationshi...,Assessment #02,2025-02-24,Monday,02:00 PM,06:00 PM,SMUA Room 1,Pending
3,Finance & Technology,SCH24-Not Specified-BCSC-00004,BCSC,"Blockchain, Cryptocurrencies and Smart Contracts",Afternoon #07,2025-02-04,Tuesday,06:00 PM,07:00 PM,SMUA Room 1,Pending
6,Finance & Technology,SCH24-Not Specified-ECSDF-00003,ECSDF,Executive Certificate in Successful Data Trans...,Morning #01,2025-02-28,Friday,09:00 AM,12:30 PM,SMUA Room 1,Pending
8,Finance & Technology,SCH24-Not Specified-ACGAIM2-00009,ACGAIM2,"Advanced Certificate in Generative AI, Ethics ...",Afternoon #02,2025-02-06,Thursday,01:30 PM,05:00 PM,SMUA Room 1,Pending


In [18]:
# Check if current DF has the right number of rows
print("Number of rows in current data frame: ", facility_booking_df.shape[0])

Number of rows in current data frame:  1836


Reformat the S-Time and E-Time columns into datetime format

Then we need to identify the Morning, Afternoon and Night Time Courses, and create separate dataframes for each type of session

1. Morning courses start before 12pm
2. Afternoon courses start at or after 12pm, but before 7pm
3. Night courses start at 7pm or later
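
As an alternative to three separate dataframes, the same cut-offs can be sketched as a single label column. The example below is an illustration on toy data, not the notebook's method. The helper `label_period` and the column name `Session Period` are hypothetical.

```python
import numpy as np
import pandas as pd

def label_period(start_times):
    """Label each 'HH:MM AM/PM' start time as Morning, Afternoon or Night."""
    parsed = pd.to_datetime(start_times, format="%I:%M %p")
    minutes = (parsed.dt.hour * 60 + parsed.dt.minute).to_numpy()
    labels = np.select(
        [minutes < 12 * 60, minutes < 19 * 60],  # conditions are evaluated in order
        ["Morning", "Afternoon"],
        default="Night",
    )
    return pd.Series(labels, index=start_times.index, name="Session Period")

toy_times = pd.Series(["09:00 AM", "12:00 PM", "06:59 PM", "07:00 PM"])
periods = label_period(toy_times)
```

Because every start time receives exactly one label, the three groups cannot overlap or miss a row, which is the property the row-count check in the next cell verifies.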


In [19]:
# Creates a new DataFrame that is a separate object from the original
booking_time_df = facility_booking_df.copy()

# Create new "DT S-Time" and "DT E-Time" columns to define functions later on
booking_time_df['DT S-Time'] = pd.to_datetime(booking_time_df['S-Time'], format='%I:%M %p')
booking_time_df['DT E-Time'] = pd.to_datetime(booking_time_df['E-Time'], format='%I:%M %p')

# Night Courses that start at 7pm or later
night_df = booking_time_df[(booking_time_df["DT S-Time"] >= pd.to_datetime('19:00', format='%H:%M'))]

# Morning Courses that start before 12pm
morning_df = booking_time_df[booking_time_df["DT S-Time"] < pd.to_datetime('12:00', format='%H:%M')]

# Afternoon courses that start at or after 12pm, before 7pm
afternoon_df = booking_time_df[(booking_time_df["DT S-Time"] >= pd.to_datetime('12:00', format='%H:%M')) & 
                               (booking_time_df["DT S-Time"] < pd.to_datetime('19:00', format='%H:%M'))]

#check the shapes of the different dataframes
print("booking_time_df", booking_time_df.shape)
print("night_df", night_df.shape)
print("morning_df", morning_df.shape)
print("afternoon_df", afternoon_df.shape)
print("Total rows from the time period dataframes: ", (morning_df.shape[0] + afternoon_df.shape[0] + night_df.shape[0]))


booking_time_df (1836, 13)
night_df (60, 13)
morning_df (721, 13)
afternoon_df (1055, 13)
Total rows from the time period dataframes:  1836


In [20]:
#Check the columns in booking_time_df
booking_time_df.columns

Index(['Dept', 'Sch #', 'Course Code', 'Course Title', 'Session Type',
       'Session Date', 'Session Day', 'S-Time', 'E-Time', 'Venue',
       'Sch Status', 'DT S-Time', 'DT E-Time'],
      dtype='object')

### 3.1 Night Sessions

For all Weekday 7-10pm courses, The "Time Booked From" will always be **1 hour** before S-Time, and the "Time Booked To" will be the **same** as E-Time  

In [21]:
#Check the start times for the night courses
night_df['DT S-Time'].value_counts()

DT S-Time
1900-01-01 19:00:00    55
1900-01-01 21:00:00     4
1900-01-01 20:30:00     1
Name: count, dtype: int64

In [22]:
# Function to subtract 1 hr, so that the booking starts 1 hr before the session
def subtract_hour(time):
    return time - timedelta(minutes=60)

night_df["DT S-Time"] = night_df["DT S-Time"].apply(subtract_hour)

#Define a function to populate the "Time Booked From" and "Time Booked To" columns based on the "HH:MM AM/PM" format
def populate_time_booked_from(time):
        return time.strftime("%I:%M %p")

night_df['Time Booked From'] = night_df['DT S-Time'].apply(populate_time_booked_from)
night_df['Time Booked To'] = night_df['DT E-Time'].apply(populate_time_booked_from)

print("The number of night courses: ", night_df.shape[0])

The number of night courses:  60


In [23]:
night_df['Time Booked From'].value_counts()

Time Booked From
06:00 PM    55
08:00 PM     4
07:30 PM     1
Name: count, dtype: int64

### 3.2 Morning & Afternoon Sessions

For Weekday + Weekend courses 
 - Time Booked To: Same as E-Time as reflected in gvSession
 - Time Booked From: 
    1. All morning sessions to start at **8am**
    2. Afternoon sessions to start **1 hr before** S-Time
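
The booking-window rules of sections 3.1 and 3.2 can be sketched as one pure-Python function. This is an illustration under stated assumptions: `booking_window` is a hypothetical helper, the 12pm morning cut-off is taken from the session definitions above, and every non-morning session (afternoon or night) is booked one hour before its start.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%I:%M %p"

def booking_window(s_time, e_time):
    """Return (Time Booked From, Time Booked To) for one session."""
    start = datetime.strptime(s_time, TIME_FORMAT)
    if start.hour < 12:
        booked_from = "08:00 AM"  # all morning sessions are booked from 8am
    else:
        booked_from = (start - timedelta(hours=1)).strftime(TIME_FORMAT)
    return booked_from, e_time    # the end time is kept as-is

examples = [
    booking_window("09:00 AM", "12:30 PM"),
    booking_window("01:30 PM", "05:00 PM"),
    booking_window("07:00 PM", "10:00 PM"),
]
```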

Morning Courses

In [24]:
# Check the start times for the morning courses
morning_df['DT S-Time'].value_counts()  

DT S-Time
1900-01-01 09:00:00    714
1900-01-01 08:15:00      4
1900-01-01 09:15:00      3
Name: count, dtype: int64

In [25]:
# Populate "Time Booked From" with 8am

# The booking for every morning session starts at 8am, regardless of S-Time
morning_df.loc[:,'Time Booked From'] = "08:00 AM"

#Check if all morning sessions start at 8am, by checking if there are any rows where the time booked from is not 8am
morning_df[morning_df['Time Booked From'] != "08:00 AM"]


Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status,DT S-Time,DT E-Time,Time Booked From


In [26]:
#Check the updated timings in the "Time Booked From" column
morning_df['Time Booked From'].value_counts()

Time Booked From
08:00 AM    721
Name: count, dtype: int64

In [27]:
#Time Booked To will be the same as E-Time, as reflected in gvSession
morning_df["Time Booked To"] = morning_df["E-Time"]
 
morning_df.sample(10)

Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status,DT S-Time,DT E-Time,Time Booked From,Time Booked To
1234,"Human Capital, Management & Leadership",SCH24-Not Specified-ACSUSPTGWYW-00010,ACSUSPTGWYW,Advanced Communication Strategies: Using Strat...,Morning #02,2025-01-21,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,Confirmed,1900-01-01 09:00:00,1900-01-01 13:00:00,08:00 AM,01:00 PM
2472,"Services, Operations and Business Improvement",SCH24-Professional Certificate-DMCCDCYDS-00004,DMCCDCYDS,Digital Marketing - Creating and Curating Disp...,Morning #01,2025-02-20,Thursday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
2010,Finance & Technology,SCH24-Not Specified-HMEPR-00005,HMEPR,Hedging & Management of Energy Price Risk,Morning #01,2025-02-12,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
2094,Finance & Technology,SCH24-Not Specified-ACDSCM4-00006,ACDSCM4,Advanced Certificate in Digital Supply Chain M...,Morning #01,2025-03-05,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
1700,Finance & Technology,SCH24-Not Specified-DAUPB-00040,DAUPB,Data Analytics Using Power BI,Morning #01,2025-03-24,Monday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
2238,"Services, Operations and Business Improvement",SCH24-Not Specified-CNW2-00004,CNW2,Certificate in Nutrition and Wellness: Nutriti...,Morning #01,2025-02-22,Saturday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
2564,"Services, Operations and Business Improvement",SCH24-Not Specified-AIIM-00018,AIIM,Artificial Intelligence (AI) in Marketing: The...,Morning #01,2025-03-13,Thursday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
479,"Human Capital, Management & Leadership",SCH24-Not Specified-PWRWM1-00002,PWRWM1,Personal Wellbeing and Resilience in the Workp...,Morning #01,2025-03-11,Tuesday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM
1338,Business Management,SCH24-Not Specified-ACSSBM5-00002,ACSSBM5,Module 5: Sustainable Finance And Impact Inves...,Morning #01,2025-02-18,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 13:00:00,08:00 AM,01:00 PM
947,"Services, Operations and Business Improvement",SCH24-Graduate Diploma-FBS-00004,FBS,Foundations of Brand Storytelling,Morning #01,2025-02-13,Thursday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM


In [28]:
print("Number of morning classes: ", morning_df.shape[0])

Number of morning classes:  721


Afternoon Courses

In [29]:
afternoon_df['DT S-Time'].value_counts() 

DT S-Time
1900-01-01 14:00:00    371
1900-01-01 13:30:00    307
1900-01-01 17:00:00    268
1900-01-01 16:00:00     36
1900-01-01 13:00:00     31
1900-01-01 18:00:00     27
1900-01-01 13:45:00      4
1900-01-01 17:30:00      4
1900-01-01 15:30:00      2
1900-01-01 12:30:00      2
1900-01-01 15:00:00      2
1900-01-01 16:30:00      1
Name: count, dtype: int64

In [30]:
# Populate "Time Booked From" with 1 hr before "DT S-Time"

# Define a function to subtract 1 hr
def subtract_hour(time):
    return time - timedelta(minutes=60)

afternoon_df["DT S-Time"] = afternoon_df["DT S-Time"].apply(subtract_hour)

# Format a datetime as "HH:MM AM/PM"
def populate_time_booked_from(time):
    return time.strftime("%I:%M %p")


afternoon_df['Time Booked From'] = afternoon_df['DT S-Time'].apply(populate_time_booked_from)

In [31]:
#Check the updated values in the "Time Booked From" column
afternoon_df['Time Booked From'].value_counts()

Time Booked From
01:00 PM    371
12:30 PM    307
04:00 PM    268
03:00 PM     36
12:00 PM     31
05:00 PM     27
12:45 PM      4
04:30 PM      4
02:30 PM      2
11:30 AM      2
02:00 PM      2
03:30 PM      1
Name: count, dtype: int64

In [32]:
#Time booked to is the same as the original E-Time as reflected in gvSession
afternoon_df["Time Booked To"] = afternoon_df["E-Time"]
afternoon_df.sample(10)

Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status,DT S-Time,DT E-Time,Time Booked From,Time Booked To
1979,Finance & Technology,SCH24-Not Specified-AFMCV-00004,AFMCV,Advanced Financial Modelling and Corporate Val...,Assessment #02,2025-04-01,Tuesday,02:00 PM,06:00 PM,SMUA Room 1,Pending,1900-01-01 13:00:00,1900-01-01 18:00:00,01:00 PM,06:00 PM
181,Finance & Technology,SCH24-Not Specified-ACVAUTM4-00002,ACVAUTM4,Advanced Certificate in Visual Analytics Using...,Afternoon #02,2025-03-26,Wednesday,01:30 PM,05:00 PM,SMUA Room 1,Pending,1900-01-01 12:30:00,1900-01-01 17:00:00,12:30 PM,05:00 PM
786,"Human Capital, Management & Leadership",SCH24-Not Specified-HACUAIPD-00007,HACUAIPD,HR Analytics Certificate - Using Analytics to ...,Afternoon #05,2025-02-27,Thursday,02:00 PM,06:00 PM,SMUA Room 1,Pending,1900-01-01 13:00:00,1900-01-01 18:00:00,01:00 PM,06:00 PM
2794,"Human Capital, Management & Leadership",SCH24-Not Specified-ACSUSPTGWYW-00012,ACSUSPTGWYW,Advanced Communication Strategies: Using Strat...,Afternoon #05,2025-03-10,Monday,02:00 PM,06:00 PM,SMUA Room 1,Confirmed,1900-01-01 13:00:00,1900-01-01 18:00:00,01:00 PM,06:00 PM
357,Finance & Technology,SCH24-Not Specified-DAUPB-00040,DAUPB,Data Analytics Using Power BI,Afternoon #02,2025-03-24,Monday,01:30 PM,05:00 PM,SMUA Room 1,Pending,1900-01-01 12:30:00,1900-01-01 17:00:00,12:30 PM,05:00 PM
3068,Finance & Technology,SCH24-Not Specified-GCDFM6-00003,GCDFM6,Graduate Certificate in Digital Finance Module...,,2025-02-12,Wednesday,06:00 PM,07:00 PM,SMUA Room 1,Pending,1900-01-01 17:00:00,1900-01-01 19:00:00,05:00 PM,07:00 PM
1512,"Human Capital, Management & Leadership",SCH24-Not Specified-HGCRBSFS-00001,HGCRBSFS,HR Graduate Certification - Rewards (Compensat...,Afternoon #11,2025-02-25,Tuesday,04:00 PM,06:00 PM,SMUA Room 1,Confirmed,1900-01-01 15:00:00,1900-01-01 18:00:00,03:00 PM,06:00 PM
2311,Finance & Technology,SCH24-Not Specified-CWACEWE2-00007,CWACEWE2,Clearly Write: A Course on Effective Writing f...,,2025-01-16,Thursday,01:30 PM,04:00 PM,SMUA Room 1,Confirmed,1900-01-01 12:30:00,1900-01-01 16:00:00,12:30 PM,04:00 PM
3083,Finance & Technology,SCH24-Not Specified-ACVAUTM4-00002,ACVAUTM4,Advanced Certificate in Visual Analytics Using...,Afternoon #02,2025-03-27,Thursday,01:30 PM,05:00 PM,SMUA Room 1,Pending,1900-01-01 12:30:00,1900-01-01 17:00:00,12:30 PM,05:00 PM
590,Finance & Technology,SCH24-Not Specified-ACORMBM6-00002,ACORMBM6,Advanced Certificate in Online Reputation Mana...,Assessment #05,2025-02-07,Friday,05:00 PM,06:00 PM,SMUA Room 1,Pending,1900-01-01 16:00:00,1900-01-01 18:00:00,04:00 PM,06:00 PM


In [33]:
print("Number of afternoon classes: ", afternoon_df.shape[0])


Number of afternoon classes:  1055


In [34]:
#merge all time data frames together
booking_time_df = pd.concat([morning_df,afternoon_df,night_df], ignore_index=True)

In [35]:
# Quick check to see if number of rows still tally
print("Number of rows in current working dataframe: ", booking_time_df.shape[0])
print("Number of rows in previous dataframe:", facility_booking_df.shape[0])

Number of rows in current working dataframe:  1836
Number of rows in previous dataframe: 1836


In [36]:
#Check for any null values in the Time Booked From and Time Booked To columns
print("Time Booked From Col Count: ", booking_time_df["Time Booked From"].isnull().sum())
print("Time Booked To Col Count: ", booking_time_df["Time Booked To"].isnull().sum())

Time Booked From Col Count:  0
Time Booked To Col Count:  0


In [37]:
booking_time_df['Time Booked From'].value_counts()

Time Booked From
08:00 AM    721
01:00 PM    371
12:30 PM    307
04:00 PM    268
06:00 PM     55
03:00 PM     36
12:00 PM     31
05:00 PM     27
12:45 PM      4
04:30 PM      4
08:00 PM      4
02:30 PM      2
11:30 AM      2
02:00 PM      2
03:30 PM      1
07:30 PM      1
Name: count, dtype: int64

### 3.3 Merging full day courses

Find rows with the same Course Title and Session Date, and merge them into a single booking:

1. The start time will be the earliest "Time Booked From" among the merged rows
2. The end time will be the latest "Time Booked To" among the merged rows
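
A small worked example of this merge on toy data, assuming a morning and an afternoon session of the same course on the same date. The column names mirror the notebook, while the course titles and times are made up for illustration.

```python
import pandas as pd

toy_df = pd.DataFrame({
    "Course Title": ["Data Analytics", "Data Analytics", "Brand Storytelling"],
    "Session Date": ["2025-03-24", "2025-03-24", "2025-02-13"],
    "Time Booked From": ["08:00 AM", "12:30 PM", "08:00 AM"],
    "Time Booked To": ["12:30 PM", "05:00 PM", "12:30 PM"],
})

# Parse with an explicit format so min/max compare clock times, not text
toy_df["Start Timing"] = pd.to_datetime(toy_df["Time Booked From"], format="%I:%M %p")
toy_df["End Timing"] = pd.to_datetime(toy_df["Time Booked To"], format="%I:%M %p")

# One row per course and date: earliest start, latest end
merged_df = toy_df.groupby(["Course Title", "Session Date"], as_index=False).agg(
    {"Start Timing": "min", "End Timing": "max"}
)
merged_df["Time Booked From"] = merged_df["Start Timing"].dt.strftime("%I:%M %p")
merged_df["Time Booked To"] = merged_df["End Timing"].dt.strftime("%I:%M %p")
```

The two "Data Analytics" rows collapse into one booking that runs from the morning start to the afternoon end, while the single-session course is left unchanged.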


In [38]:
merge_courses_df = booking_time_df.copy()
merge_courses_df.columns

Index(['Dept', 'Sch #', 'Course Code', 'Course Title', 'Session Type',
       'Session Date', 'Session Day', 'S-Time', 'E-Time', 'Venue',
       'Sch Status', 'DT S-Time', 'DT E-Time', 'Time Booked From',
       'Time Booked To'],
      dtype='object')

In [39]:
print(f"Number of rows before merging courses: {merge_courses_df.shape[0]}")

Number of rows before merging courses: 1836


In [40]:
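#Convert the Start Time and End Time Columns to DateTime Objects
#No date is supplied, so the parsed values carry the date of the run (see the output); only the time part is used later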
merge_courses_df['Start Timing'] = pd.to_datetime(merge_courses_df['Time Booked From'])
merge_courses_df['End Timing'] = pd.to_datetime(merge_courses_df['Time Booked To'])

merge_courses_df.head()

Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status,DT S-Time,DT E-Time,Time Booked From,Time Booked To,Start Timing,End Timing
0,Finance & Technology,SCH24-Not Specified-ECSDF-00003,ECSDF,Executive Certificate in Successful Data Trans...,Morning #01,2025-02-28,Friday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM,2025-03-03 08:00:00,2025-03-03 12:30:00
1,Finance & Technology,SCH24-Not Specified-MW3-00005,MW3,The Metaverse and Web 3.0,Morning #02,2025-01-22,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 13:00:00,08:00 AM,01:00 PM,2025-03-03 08:00:00,2025-03-03 13:00:00
2,"Services, Operations and Business Improvement",SCH24-Not Specified-TMBO-00010,TMBO,TikTok Marketing and Ads for Business,Morning #01,2025-03-28,Friday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM,2025-03-03 08:00:00,2025-03-03 12:30:00
3,Finance & Technology,SCH24-Not Specified-GSRPA-00004,GSRPA,Getting Started with Robotic Process Automation,Morning #01,2025-02-20,Thursday,09:00 AM,12:30 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 12:30:00,08:00 AM,12:30 PM,2025-03-03 08:00:00,2025-03-03 12:30:00
4,Finance & Technology,SCH24-Not Specified-ACCMBM4V2-00005,ACCMBM4V2,Advanced Certificate in Crisis Management for ...,Morning #02,2025-03-05,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,Pending,1900-01-01 09:00:00,1900-01-01 13:00:00,08:00 AM,01:00 PM,2025-03-03 08:00:00,2025-03-03 13:00:00


In [41]:
merge_courses_df = (
    merge_courses_df
    .groupby(['Course Title', 'Session Date'])
    .agg({
        'DT S-Time': 'first',
        'DT E-Time': 'first',
        'Dept': 'first',
        'Course Code': 'first',
        'Sch #': 'first',
        'Session Type': 'first',
        'Session Day': 'first',
        'S-Time': 'first',
        'E-Time': 'first',
        'Venue': 'first',
        'Start Timing': 'min',  # earliest booking start of the day
        'End Timing': 'max',    # latest booking end of the day
    })
    .reset_index()
)

In [42]:
# Convert the Start Timing and End Timing columns back to 12-hour time strings
merge_courses_df['Time Booked From'] = merge_courses_df['Start Timing'].dt.strftime('%I:%M %p')
merge_courses_df['Time Booked To'] = merge_courses_df['End Timing'].dt.strftime('%I:%M %p')

In [43]:
print(f"Number of rows after merging courses: {merge_courses_df.shape[0]}")

Number of rows after merging courses: 780


In [44]:
merge_courses_df

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To
0,A Case Approach to Modelling Corporate Acquisi...,2025-03-06,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
1,A Case Approach to Modelling Corporate Acquisi...,2025-03-07,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
2,"AI and Machine Learning: Tools, Applications a...",2025-02-14,1900-01-01 09:00:00,1900-01-01 12:30:00,Finance & Technology,FFHPAIML,SCH24-Not Specified-FFHPAIML-00004,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
3,Adaptability In the Face of Disruptive Change:...,2025-03-12,1900-01-01 09:00:00,1900-01-01 12:30:00,Finance & Technology,AFDCPV2,SCH24-Not Specified-AFDCPV2-00001,Morning #01,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
4,Advanced Cash Flow Analysis - Counterparty Cre...,2025-02-19,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,ACFACCAV2,SCH24-Not Specified-ACFACCAV2-00001,Morning #02,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
775,Workplace Automation 101: Automation at Work w...,2025-01-17,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
776,Workplace Automation 101: Automation at Work w...,2025-04-14,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Monday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
777,Workplace Automation 101: Automation at Work w...,2025-04-15,1900-01-01 09:00:00,1900-01-01 13:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
778,Writing Effectively for the Digital Age,2025-02-27,1900-01-01 09:00:00,1900-01-01 12:30:00,Finance & Technology,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Morning #01,Thursday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 17:00:00,08:00 AM,05:00 PM


Verify merge booking

- From a list of unique Course Titles, look for a Course Title with two instances of the same day
- From the Facility Booking DF, check for the earliest start time and latest end time
- In the new Merge Booking DF, check if the two instances have been merged with the earliest start time and latest end time
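
The manual check above can also be automated. The following is only a sketch on made-up data: `before` and `after` are hypothetical stand-ins for the dataframes before and after merging, and it assumes both carry parsed "Start Timing" and "End Timing" columns.

```python
import pandas as pd

keys = ["Course Title", "Session Date"]

# Hypothetical sessions before merging: "Course A" has two sessions on the same day
before = pd.DataFrame({
    "Course Title": ["Course A", "Course A", "Course B"],
    "Session Date": ["2025-03-06", "2025-03-06", "2025-03-07"],
    "Start Timing": pd.to_datetime(["08:00", "13:00", "18:00"], format="%H:%M"),
    "End Timing": pd.to_datetime(["12:30", "18:00", "22:30"], format="%H:%M"),
})
# Stand-in for the merged dataframe produced by the groupby step
after = before.groupby(keys, as_index=False).agg({"Start Timing": "min", "End Timing": "max"})

# 1. Pairs with more than one session are the ones worth inspecting by hand
multi_session = before[before.duplicated(subset=keys, keep=False)]
print(multi_session[keys].drop_duplicates())

# 2. After merging, each (Course Title, Session Date) pair should appear exactly once
assert not after.duplicated(subset=keys).any()

# 3. Every original session should fit inside its merged booking window
joined = before.merge(after, on=keys, suffixes=("", " merged"))
covered = (joined["Start Timing merged"] <= joined["Start Timing"]) & (
    joined["End Timing"] <= joined["End Timing merged"]
)
assert covered.all()
```

Step 1 also gives a list of real candidates to look up, which avoids picking a Course Title that has no rows in the current data.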

In [45]:
facility_booking_df["Course Title"].value_counts() 

Course Title
Advanced Communication Strategies: Using Strategic Persuasion To Get What You Want                                                                                30
Data Analytics Using Power BI                                                                                                                                     30
Innovation Culture Catalyst (ICC): The Game Changer                                                                                                               20
Advanced Certificate in Visual Analytics Using Tableau Module 1: Unlocking Insights with Analytics                                                                20
Graduate Certificate in Media Communication and Strategy Module 1: Media Writing Across Platforms: Video, Podcasts, Online                                        15
                                                                                                                                                                  

In [46]:
facility_booking_df[(facility_booking_df["Course Title"] == "Borrowing Base Lending")]

Unnamed: 0,Dept,Sch #,Course Code,Course Title,Session Type,Session Date,Session Day,S-Time,E-Time,Venue,Sch Status


In [47]:
merge_courses_df[(merge_courses_df["Course Title"] == "Borrowing Base Lending")]

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To


### 3.4 Weird Timings Rounded Down
In this section, we ensure that every time value in the "Time Booked From" and "Time Booked To" columns falls on a 5-minute mark. Any value that does not is rounded down to the nearest 5 minutes.

For example, 
1. If a course starts/ends at 06:01 PM, it is rounded down to 06:00 PM
2. If a course starts/ends at 05:31 PM, it is rounded down to 05:30 PM
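
As an illustration of the rounding rule, here is a short vectorized sketch using pandas' `dt.floor`. The sample values are the two examples above plus one time that is already on a 5-minute mark. This is an alternative way to express the rule, not the notebook's own implementation.

```python
import pandas as pd

# Sample time strings in the same 12-hour format as the booking columns
times = pd.Series(["06:01 PM", "05:31 PM", "08:00 AM"])

# Parse, round down to the nearest 5-minute mark, and format back to strings
parsed = pd.to_datetime(times, format="%I:%M %p")
rounded = parsed.dt.floor("5min").dt.strftime("%I:%M %p")

# Values that were changed by the rounding
changed = times[times != rounded]
print(rounded.tolist())
print(f"Number of values rounded down: {len(changed)}")
```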


Check if any of the values in the "Time Booked From" or "Time Booked To" column do not fall on a 5-minute mark

In [48]:
merge_courses_df['Time Booked To'].value_counts()

Time Booked To
06:00 PM    512
05:00 PM    168
10:30 PM     35
07:00 PM     24
10:00 PM     12
09:30 PM      8
05:30 PM      5
01:00 PM      4
06:45 PM      4
10:45 AM      3
04:30 PM      2
06:15 PM      2
10:00 AM      1
Name: count, dtype: int64

In [50]:
from datetime import datetime

# Function to convert a 12-hour time string (e.g. "06:00 PM") to a datetime object
def convert_to_datetime(time_str):
    return datetime.strptime(time_str, '%I:%M %p')

# Function to round down to the nearest multiple of 5 minutes
def round_down_to_nearest_five(time):
    minutes = (time.minute // 5) * 5
    return time.replace(minute=minutes, second=0, microsecond=0)

# Convert the two columns to datetime objects
merge_courses_df['DT S-Time'] = merge_courses_df['Time Booked From'].apply(convert_to_datetime)
merge_courses_df['DT E-Time'] = merge_courses_df['Time Booked To'].apply(convert_to_datetime)

# Keep copies of the original strings so that rounded values can be detected afterwards
merge_courses_df['Original Start Time'] = merge_courses_df['Time Booked From']  # Keep a copy of the original start times
merge_courses_df['Original End Time'] = merge_courses_df['Time Booked To']  # Keep a copy of the original end times

# Round down start and end times to the nearest 5-minute mark
merge_courses_df['DT S-Time'] = merge_courses_df['DT S-Time'].apply(round_down_to_nearest_five)
merge_courses_df['DT E-Time'] = merge_courses_df['DT E-Time'].apply(round_down_to_nearest_five)

# Convert back to string format for consistency with original data
merge_courses_df['Time Booked From'] = merge_courses_df['DT S-Time'].apply(lambda x: x.strftime('%I:%M %p'))
merge_courses_df['Time Booked To'] = merge_courses_df['DT E-Time'].apply(lambda x: x.strftime('%I:%M %p'))

# Print the number of courses with start or end times that were rounded down
rounded_down_start = len(merge_courses_df[merge_courses_df['Original Start Time'] != merge_courses_df['Time Booked From']])
rounded_down_end = len(merge_courses_df[merge_courses_df['Original End Time'] != merge_courses_df['Time Booked To']])
print("Number of courses with start times rounded down:", rounded_down_start)
print("Number of courses with end times rounded down:", rounded_down_end)

# Confirm that no timings are off a 5-minute mark after rounding
weird_timings_start = merge_courses_df[merge_courses_df['DT S-Time'].apply(lambda x: x.minute % 5) != 0]
weird_timings_end = merge_courses_df[merge_courses_df['DT E-Time'].apply(lambda x: x.minute % 5) != 0]
remaining_weird_timings_start = len(weird_timings_start)
remaining_weird_timings_end = len(weird_timings_end)
print("Are there any imperfect start timings left?", remaining_weird_timings_start > 0)
print("Are there any imperfect end timings left?", remaining_weird_timings_end > 0)


Number of courses with start times rounded down: 0
Number of courses with end times rounded down: 0
Are there any imperfect start timings left? False
Are there any imperfect end timings left? False


Check if there are any weird start or end timings left

In [51]:
merge_courses_df['Time Booked To'].value_counts()

Time Booked To
06:00 PM    512
05:00 PM    168
10:30 PM     35
07:00 PM     24
10:00 PM     12
09:30 PM      8
05:30 PM      5
01:00 PM      4
06:45 PM      4
10:45 AM      3
04:30 PM      2
06:15 PM      2
10:00 AM      1
Name: count, dtype: int64

In [52]:
merge_courses_df['Time Booked From'].value_counts()

Time Booked From
08:00 AM    720
06:00 PM     55
05:00 PM      4
11:30 AM      1
Name: count, dtype: int64

In [53]:
#drop the extra columns generated
merge_courses_df.drop(columns=['Original End Time', 'Original Start Time'], inplace=True)

In [54]:
#Verify changes made to "Time Booked To" column
merge_courses_df.sample(10)

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To
524,Graduate Certificate in Media Communication an...,2025-02-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,GCMCSM2,SCH24-Not Specified-GCMCSM2-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
198,Advanced Certificate in Venture Capital Module...,2025-04-04,1900-01-01 08:00:00,1900-01-01 18:00:00,"Human Capital, Management & Leadership",ACVCM4,SCH24-Not Specified-ACVCM4-00002,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
468,Finance for Non-Finance Professionals,2025-03-24,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,FNP,SCH24-Not Specified-FNP-00007,Morning #02,Monday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
535,Graduate Certificate in Strategic Digital Tran...,2025-04-21,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,GCSDTFM2,SCH24-Not Specified-GCSDTFM2-00003,Morning #02,Monday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
255,Applied Design Thinking: The Role of Artificia...,2025-03-20,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ADTTRAIML2,SCH24-Not Specified-ADTTRAIML2-00004,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
125,Advanced Certificate in Online Reputation Mana...,2025-02-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACORMBM6,SCH24-Not Specified-ACORMBM6-00002,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
101,Advanced Certificate in Logistics and Supply C...,2025-04-22,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACLSCMM4,SCH24-Advanced Diploma-ACLSCMM4-00002,Morning #02,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
530,Graduate Certificate in Media Communication an...,2025-04-22,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,GCMCSM5,SCH24-Not Specified-GCMCSM5-00005,Morning #02,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
663,"Module 4: Governance, Labour, Human Rights Man...",2025-02-12,1900-01-01 08:00:00,1900-01-01 18:00:00,Business Management,ACSSBM4,SCH24-Not Specified-ACSSBM4-00002,Morning #01,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM
492,Graduate Certificate in Allied Legal Professio...,2025-02-27,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,SLTIGCALPSM4,SCH24-Not Specified-SLTIGCALPSM4-00002,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM


## 4.0 Obtain Use Type Column

Use the "(Latest) SSG Approved_Master Listing" Excel file as a master list to match each IO Code to its respective Course Title/Course Code
- If an IO Code is present, Use Type is "Event"
- If it is absent, Use Type is "AdHoc"
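
The rule above can be sketched in a few lines on toy data. The titles and the IO Code below are hypothetical placeholders. The sketch keeps missing IO Codes as nulls and derives "Use Type" from them directly with `np.where`, which is one possible alternative to filling in a sentinel value first.

```python
import numpy as np
import pandas as pd

# Hypothetical bookings and master listing; "ZAC0000001" is a made-up IO Code
bookings = pd.DataFrame({"Course Title": ["Course A", "Course B", "Course C"]})
master = pd.DataFrame({
    "Course Title": ["Course A", "Course A", "Course C"],
    "IO Code": ["ZAC0000001", "ZAC0000001", None],
})

# Keep one master row per title so the left merge cannot add rows
lookup = master.drop_duplicates(subset=["Course Title"])
result = bookings.merge(lookup, on="Course Title", how="left", validate="many_to_one")

# "Event" when an IO Code was found, "AdHoc" otherwise
result["Use Type"] = np.where(result["IO Code"].notna(), "Event", "AdHoc")
print(result)
```

Passing `validate="many_to_one"` makes pandas raise an error if the lookup still contains duplicate titles, which guards the row count automatically.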

In [55]:
ssg_master_listing_df.head()

Unnamed: 0,Column1,Course Title (F2F only),Course Title (Synchronous E-Learning only),Course Title [Blended Delivery Mode],TPG Course Code (F2F only),TPG Course Code (Synchronous E-learning only),TPG Course Code\n[Blended Delivery Mode],SCN Course Code for F2F,SCN Course Code (Synchronous E-learning),IBF for F2F,...,Column8,Column9,Column10,Column11,Column12,Column13,Column14,Column15,Column16,Column17
0,1,Agile for Successful Project Implementation,Agile for Successful Project Implementation (S...,,TGS-2020501663,TGS-2020512848,,CRS-N-0045079,CRS-N-0052991,,...,,,,,,Digital,,,,
1,2,Achieving Mastery and Success Through Growth M...,Achieving Mastery and Success Through Growth M...,,TGS-2020001396,TGS-2020501599,,,CRS-N-0044320,,...,,,,,,Care,,,,
2,3,Adopting DevOps,,,TGS-2020501660,,,CRS-N-0045076,,,...,,,,,,Industry4.0,,,,
3,4,Advanced Communication Strategies: Using Strat...,Advanced Communication Strategies: Using Strat...,,TGS-2020512939,TGS-2020501602,,CRS-N-0053385,CRS-N-0044322,,...,,,,,,Care,,,,
4,5,Anti-Money Laundering and its Ecosystem,Anti-Money Laundering and its Ecosystem (Synch...,,TGS-2020501697,TGS-2020513008,,CRS-N-0045293,CRS-N-0053220,,...,,,,,,Industry4.0,,,,


This section of the code standardizes the Course Title naming format so that the master listing can be merged with the booking dataframe later on

1. First, we extract only the "Course Title (F2F only)" and "IO Code" columns from the master listing file into a dataframe
2. Then, we define a function to standardize and clean up the course titles in this dataframe before merging
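
For reference, the same clean-up can be written with pandas string methods instead of a row-wise function. This is a sketch on made-up titles. `str.partition` splits at the first occurrence of the suffix and leaves missing values as nulls.

```python
import pandas as pd

# Made-up titles: one with the delivery-mode suffix, one with stray spaces, one missing
titles = pd.Series([
    "Course A (Classroom & Asynchronous) ",
    "  Course B",
    None,
])

suffix = "(Classroom & Asynchronous)"

# Keep the text before the suffix (column 0 of the partition), then trim whitespace
cleaned = titles.str.partition(suffix)[0].str.strip()
print(cleaned.tolist())
```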

In [56]:
# Take only the necessary columns: the course title to match on and the IO Code to populate later
# .copy() avoids pandas' SettingWithCopyWarning when the new dataframe is modified
master_df = ssg_master_listing_df[["Course Title (F2F only)", "IO Code"]].copy()

# Rename to Course Title so that it can be merged on this column
master_df = master_df.rename(columns={'Course Title (F2F only)': 'Course Title'})

# Function to standardize course title naming format
def course_title_cleanup(title):
    # Drop the "(Classroom & Asynchronous)" suffix and anything after it
    if "(Classroom & Asynchronous)" in str(title):
        return title.split("(Classroom & Asynchronous)")[0]
    return title

# Apply the function to the Course Title Column
master_df['Course Title'] = master_df['Course Title'].apply(course_title_cleanup)

# Remove any leading or trailing spaces
master_df['Course Title'] = master_df['Course Title'].str.strip()

In [57]:
#Check the number of rows in the master_df
print("Number of rows in (Latest) SSG Approved_Master Listing: ", master_df.shape[0])

Number of rows in (Latest) SSG Approved_Master Listing:  1401


In [58]:
master_df.head()

Unnamed: 0,Course Title,IO Code
0,Agile for Successful Project Implementation,ZAC1D2104
1,Achieving Mastery and Success Through Growth M...,ZAC1C50055
2,Adopting DevOps,ZAC1D2020
3,Advanced Communication Strategies: Using Strat...,ZAC1C50045
4,Anti-Money Laundering and its Ecosystem,ZAC1D2030


In [59]:
# Merge the two dataframes on Course Title, keeping every booking row (left join)
io_code_df = merge_courses_df.merge(master_df.drop_duplicates(subset=['Course Title']), on='Course Title', how='left')

In [60]:
# Ensure the number of rows is still the same after merging
print("Number of rows initially before the merge: ", merge_courses_df.shape[0])
print("Number of rows after adding IO Code: ", io_code_df.shape[0])

io_code_df.head()

Number of rows initially before the merge:  780
Number of rows after adding IO Code:  780


Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code
0,A Case Approach to Modelling Corporate Acquisi...,2025-03-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029
1,A Case Approach to Modelling Corporate Acquisi...,2025-03-07,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029
2,"AI and Machine Learning: Tools, Applications a...",2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,FFHPAIML,SCH24-Not Specified-FFHPAIML-00004,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6425
3,Adaptability In the Face of Disruptive Change:...,2025-03-12,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,AFDCPV2,SCH24-Not Specified-AFDCPV2-00001,Morning #01,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6326
4,Advanced Cash Flow Analysis - Counterparty Cre...,2025-02-19,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACFACCAV2,SCH24-Not Specified-ACFACCAV2-00001,Morning #02,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6066


A function will be defined to label the "Use Type" column as "Event" for courses with an IO Code, and the rest of the courses will be labelled as "AdHoc"

In [61]:
# Find the number of sessions without IO Code
io_code_df["IO Code"].isnull().sum()

78

In [62]:
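# Sessions without IO Code will be labelled as "AdHoc"
# Assign the result back instead of using inplace=True on a column, which newer pandas versions warn about
io_code_df["IO Code"] = io_code_df["IO Code"].fillna("AdHoc")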

In [63]:
# For rows with an IO Code, Use Type will be labelled as "Event"
def populate_use_type(io_code):
    if io_code != 'AdHoc':
        return "Event"
    else:
        return "AdHoc"

io_code_df["Use Type"] = io_code_df["IO Code"].apply(populate_use_type)

In [64]:
#Check if the number of rows in the dataframe remains the same
print("Number of rows after adding IO Code: ", io_code_df.shape[0])

use_type_counts = io_code_df["Use Type"].value_counts()

#Conditional statements to check how many rows have "Event" or "AdHoc" in the "Use Type" column
if 'Event' in use_type_counts:
    print("Total Number of rows with 'Event' as Use-type: ", use_type_counts['Event'])
else:
    print("Total number of rows with 'Event' as Use-type: 0")

if 'AdHoc' in use_type_counts:
    print("Total number of rows with 'AdHoc' as Use-type: ", use_type_counts['AdHoc'])
else:
    print("Total Number of rows with 'AdHoc' as Use-type: 0")

Number of rows after adding IO Code:  780
Total Number of rows with 'Event' as Use-type:  702
Total number of rows with 'AdHoc' as Use-type:  78


In [65]:
io_code_df[io_code_df["Use Type"] == "Event"]

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code,Use Type
0,A Case Approach to Modelling Corporate Acquisi...,2025-03-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event
1,A Case Approach to Modelling Corporate Acquisi...,2025-03-07,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event
2,"AI and Machine Learning: Tools, Applications a...",2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,FFHPAIML,SCH24-Not Specified-FFHPAIML-00004,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6425,Event
3,Adaptability In the Face of Disruptive Change:...,2025-03-12,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,AFDCPV2,SCH24-Not Specified-AFDCPV2-00001,Morning #01,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6326,Event
4,Advanced Cash Flow Analysis - Counterparty Cre...,2025-02-19,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACFACCAV2,SCH24-Not Specified-ACFACCAV2-00001,Morning #02,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6066,Event
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
775,Workplace Automation 101: Automation at Work w...,2025-01-17,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event
776,Workplace Automation 101: Automation at Work w...,2025-04-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Monday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event
777,Workplace Automation 101: Automation at Work w...,2025-04-15,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event
778,Writing Effectively for the Digital Age,2025-02-27,1900-01-01 08:00:00,1900-01-01 17:00:00,Finance & Technology,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Morning #01,Thursday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 17:00:00,08:00 AM,05:00 PM,ZAC1D6295,Event


In [66]:
io_code_df[io_code_df["Use Type"] == "AdHoc"]

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code,Use Type
57,Advanced Certificate in Generative AI for Digi...,2025-03-15,1900-01-01 08:00:00,1900-01-01 13:00:00,Finance & Technology,SCTP-ACDMM3,SCH24-Not Specified-SCTP-ACDMM3-00001,Morning #02,Saturday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 13:00:00,08:00 AM,01:00 PM,AdHoc,AdHoc
58,Advanced Certificate in Generative AI for Digi...,2025-04-05,1900-01-01 08:00:00,1900-01-01 13:00:00,Finance & Technology,SCTP-ACDMM4,SCH24-Not Specified-SCTP-ACDMM4-00001,Morning #02,Saturday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 13:00:00,08:00 AM,01:00 PM,AdHoc,AdHoc
59,Advanced Certificate in Generative AI for Digi...,2025-04-26,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,SCTP-ACDMM4,SCH24-Not Specified-SCTP-ACDMM4-00001,,Saturday,09:00 AM,06:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,AdHoc,AdHoc
91,Advanced Certificate in Innovative Educational...,2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACIETM1,SCH24-Not Specified-ACIETM1-00007,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,AdHoc,AdHoc
92,Advanced Certificate in Innovative Educational...,2025-03-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACIETM2,SCH24-Not Specified-ACIETM2-00005,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,AdHoc,AdHoc
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
750,Train Your Brain: How to Improve Brain Health ...,2025-04-14,1900-01-01 08:00:00,1900-01-01 17:00:00,"Human Capital, Management & Leadership",TYB45,SCH24-Not Specified-TYB45-00003,Morning #01,Monday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 17:00:00,08:00 AM,05:00 PM,AdHoc,AdHoc
751,Train Your Brain: How to Improve Brain Health ...,2025-04-15,1900-01-01 08:00:00,1900-01-01 18:00:00,"Human Capital, Management & Leadership",TYB45,SCH24-Not Specified-TYB45-00003,Morning #01,Tuesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,AdHoc,AdHoc
769,Using A/B Tests and Econometric Methods for Da...,2025-03-05,1900-01-01 18:00:00,1900-01-01 22:30:00,"Services, Operations and Business Improvement",UATEMDDDM,SCH24-Not Specified-UATEMDDDM-00003,Evening #01,Wednesday,07:00 PM,10:30 PM,SMUA Room 1,2025-03-03 18:00:00,2025-03-03 22:30:00,06:00 PM,10:30 PM,AdHoc,AdHoc
770,Using A/B Tests and Econometric Methods for Da...,2025-03-06,1900-01-01 18:00:00,1900-01-01 22:30:00,"Services, Operations and Business Improvement",UATEMDDDM,SCH24-Not Specified-UATEMDDDM-00003,Evening #01,Thursday,07:00 PM,10:30 PM,SMUA Room 1,2025-03-03 18:00:00,2025-03-03 22:30:00,06:00 PM,10:30 PM,AdHoc,AdHoc


## 5.0 Obtain Purpose Column

The standardized format for the "Purpose" column is "Course Code ~ Course Title", for example "FNP ~ Finance for Non-Finance Professionals".
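
As a small sketch of this format, the two columns can also be joined with `Series.str.cat`, which works on the whole column at once. The two courses below appear in the outputs above. The dataframe name `courses` is only illustrative.

```python
import pandas as pd

courses = pd.DataFrame({
    "Course Code": ["FNP", "MW3"],
    "Course Title": ["Finance for Non-Finance Professionals", "The Metaverse and Web 3.0"],
})

# Join code and title with " ~ " for every row at once
courses["Purpose"] = courses["Course Code"].str.cat(courses["Course Title"], sep=" ~ ")
print(courses["Purpose"].tolist())
```

If either value is missing, `str.cat` yields a null for that row by default, so incomplete rows stay visible and can be reviewed.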

In [67]:
purpose_df = io_code_df

In [68]:
def populate_purpose(row):
    
    return row['Course Code'] + " ~ " + row['Course Title']


purpose_df['Purpose'] = purpose_df.apply(populate_purpose,axis=1)

In [69]:
print("Number of rows after adding Purpose: ", purpose_df.shape[0])

Number of rows after adding Purpose:  780


In [70]:
purpose_df.head()

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code,Use Type,Purpose
0,A Case Approach to Modelling Corporate Acquisi...,2025-03-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event,ACAMCAB ~ A Case Approach to Modelling Corpora...
1,A Case Approach to Modelling Corporate Acquisi...,2025-03-07,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event,ACAMCAB ~ A Case Approach to Modelling Corpora...
2,"AI and Machine Learning: Tools, Applications a...",2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,FFHPAIML,SCH24-Not Specified-FFHPAIML-00004,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6425,Event,"FFHPAIML ~ AI and Machine Learning: Tools, App..."
3,Adaptability In the Face of Disruptive Change:...,2025-03-12,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,AFDCPV2,SCH24-Not Specified-AFDCPV2-00001,Morning #01,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6326,Event,AFDCPV2 ~ Adaptability In the Face of Disrupti...
4,Advanced Cash Flow Analysis - Counterparty Cre...,2025-02-19,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACFACCAV2,SCH24-Not Specified-ACFACCAV2-00001,Morning #02,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6066,Event,ACFACCAV2 ~ Advanced Cash Flow Analysis - Coun...


## 6.0 Obtain Event Code Column

The Event Code will be the IO Code for each course. However, for courses with no IO Code, the column will be populated with null values instead.
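
One way to express this rule without overwriting the "IO Code" column is `Series.where`, which keeps a value where the condition holds and inserts a null elsewhere. This is a sketch on made-up rows. The "Event Code" column name and the IO Code value are assumptions for illustration.

```python
import pandas as pd

# Made-up rows: one course with an IO Code and one labelled "AdHoc"
df = pd.DataFrame({
    "IO Code": ["ZAC0000001", "AdHoc"],
    "Use Type": ["Event", "AdHoc"],
})

# Keep the IO Code only for "Event" rows; every other row becomes null
df["Event Code"] = df["IO Code"].where(df["Use Type"] == "Event")
print(df)
```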

In [71]:
event_code_df = purpose_df

In [72]:
#Replace values in the IO Code column with null values if the value is "AdHoc"
event_code_df["IO Code"] = event_code_df["IO Code"].replace('AdHoc', np.nan)
print("Number of rows after adding Event Code: ", event_code_df.shape[0])


Number of rows after adding Event Code:  780


In [73]:
event_code_df[event_code_df["Use Type"] == "AdHoc"]

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code,Use Type,Purpose
57,Advanced Certificate in Generative AI for Digi...,2025-03-15,1900-01-01 08:00:00,1900-01-01 13:00:00,Finance & Technology,SCTP-ACDMM3,SCH24-Not Specified-SCTP-ACDMM3-00001,Morning #02,Saturday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 13:00:00,08:00 AM,01:00 PM,,AdHoc,SCTP-ACDMM3 ~ Advanced Certificate in Generati...
58,Advanced Certificate in Generative AI for Digi...,2025-04-05,1900-01-01 08:00:00,1900-01-01 13:00:00,Finance & Technology,SCTP-ACDMM4,SCH24-Not Specified-SCTP-ACDMM4-00001,Morning #02,Saturday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 13:00:00,08:00 AM,01:00 PM,,AdHoc,SCTP-ACDMM4 ~ Advanced Certificate in Generati...
59,Advanced Certificate in Generative AI for Digi...,2025-04-26,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,SCTP-ACDMM4,SCH24-Not Specified-SCTP-ACDMM4-00001,,Saturday,09:00 AM,06:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,,AdHoc,SCTP-ACDMM4 ~ Advanced Certificate in Generati...
91,Advanced Certificate in Innovative Educational...,2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACIETM1,SCH24-Not Specified-ACIETM1-00007,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,,AdHoc,ACIETM1 ~ Advanced Certificate in Innovative E...
92,Advanced Certificate in Innovative Educational...,2025-03-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACIETM2,SCH24-Not Specified-ACIETM2-00005,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,,AdHoc,ACIETM2 ~ Advanced Certificate in Innovative E...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
750,Train Your Brain: How to Improve Brain Health ...,2025-04-14,1900-01-01 08:00:00,1900-01-01 17:00:00,"Human Capital, Management & Leadership",TYB45,SCH24-Not Specified-TYB45-00003,Morning #01,Monday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 17:00:00,08:00 AM,05:00 PM,,AdHoc,TYB45 ~ Train Your Brain: How to Improve Brain...
751,Train Your Brain: How to Improve Brain Health ...,2025-04-15,1900-01-01 08:00:00,1900-01-01 18:00:00,"Human Capital, Management & Leadership",TYB45,SCH24-Not Specified-TYB45-00003,Morning #01,Tuesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,,AdHoc,TYB45 ~ Train Your Brain: How to Improve Brain...
769,Using A/B Tests and Econometric Methods for Da...,2025-03-05,1900-01-01 18:00:00,1900-01-01 22:30:00,"Services, Operations and Business Improvement",UATEMDDDM,SCH24-Not Specified-UATEMDDDM-00003,Evening #01,Wednesday,07:00 PM,10:30 PM,SMUA Room 1,2025-03-03 18:00:00,2025-03-03 22:30:00,06:00 PM,10:30 PM,,AdHoc,UATEMDDDM ~ Using A/B Tests and Econometric Me...
770,Using A/B Tests and Econometric Methods for Da...,2025-03-06,1900-01-01 18:00:00,1900-01-01 22:30:00,"Services, Operations and Business Improvement",UATEMDDDM,SCH24-Not Specified-UATEMDDDM-00003,Evening #01,Thursday,07:00 PM,10:30 PM,SMUA Room 1,2025-03-03 18:00:00,2025-03-03 22:30:00,06:00 PM,10:30 PM,,AdHoc,UATEMDDDM ~ Using A/B Tests and Econometric Me...


In [74]:
event_code_df

Unnamed: 0,Course Title,Session Date,DT S-Time,DT E-Time,Dept,Course Code,Sch #,Session Type,Session Day,S-Time,E-Time,Venue,Start Timing,End Timing,Time Booked From,Time Booked To,IO Code,Use Type,Purpose
0,A Case Approach to Modelling Corporate Acquisi...,2025-03-06,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Thursday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event,ACAMCAB ~ A Case Approach to Modelling Corpora...
1,A Case Approach to Modelling Corporate Acquisi...,2025-03-07,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6029,Event,ACAMCAB ~ A Case Approach to Modelling Corpora...
2,"AI and Machine Learning: Tools, Applications a...",2025-02-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,FFHPAIML,SCH24-Not Specified-FFHPAIML-00004,Morning #01,Friday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6425,Event,"FFHPAIML ~ AI and Machine Learning: Tools, App..."
3,Adaptability In the Face of Disruptive Change:...,2025-03-12,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,AFDCPV2,SCH24-Not Specified-AFDCPV2-00001,Morning #01,Wednesday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6326,Event,AFDCPV2 ~ Adaptability In the Face of Disrupti...
4,Advanced Cash Flow Analysis - Counterparty Cre...,2025-02-19,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,ACFACCAV2,SCH24-Not Specified-ACFACCAV2-00001,Morning #02,Wednesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6066,Event,ACFACCAV2 ~ Advanced Cash Flow Analysis - Coun...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
775,Workplace Automation 101: Automation at Work w...,2025-01-17,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00003,Morning #02,Friday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event,WA1AWMPP ~ Workplace Automation 101: Automatio...
776,Workplace Automation 101: Automation at Work w...,2025-04-14,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Monday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event,WA1AWMPP ~ Workplace Automation 101: Automatio...
777,Workplace Automation 101: Automation at Work w...,2025-04-15,1900-01-01 08:00:00,1900-01-01 18:00:00,Finance & Technology,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00004,Morning #02,Tuesday,09:00 AM,01:00 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 18:00:00,08:00 AM,06:00 PM,ZAC1D6187,Event,WA1AWMPP ~ Workplace Automation 101: Automatio...
778,Writing Effectively for the Digital Age,2025-02-27,1900-01-01 08:00:00,1900-01-01 17:00:00,Finance & Technology,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Morning #01,Thursday,09:00 AM,12:30 PM,SMUA Room 1,2025-03-03 08:00:00,2025-03-03 17:00:00,08:00 AM,05:00 PM,ZAC1D6295,Event,WEFTDA ~ Writing Effectively for the Digital Age


## 7.0 Obtain Facility Name, Building and Floor

Using the FBS Report dataframe as a reference, any course in the upcoming cycle that also ran in the previous cycle is allocated the same venue it had in the previous cycle.

Things to take note of:
- The Course Code may appear more than once in FBS Report, which means that a certain course may have run more than once in the same period.
- Only the latest venue will be extracted for each course

In [75]:
print(fbs_report_df.shape)
fbs_report_df.head()
print(fbs_report_df["Facility"].unique())

(2978, 25)
['YPHSL Classroom B1-09' 'LKCSB Classroom 3-4'
 'SOE/SCIS2 Seminar Room 5-2' 'SOA Meeting Room 4-1'
 'SOE/SCIS2 Catering Area 4B (Near to SR 4-2)' 'SOE/SCIS2 Classroom 3-2'
 'YPHSL Classroom B2-03' 'SOE/SCIS2 Seminar Room 4-4'
 'SOE/SCIS2 Seminar Room 2-2' 'YPHSL Seminar Room B2-01'
 'SOE/SCIS2 Seminar Room 4-3' 'SOA Seminar Room 1-1'
 'SOE/SCIS2 Seminar Room 4-2' 'SMUC Active Learning Classroom 3-2'
 'SOA Classroom 2-1' 'SOE/SCIS2 Classroom 3-3'
 'SOE/SCIS2 Seminar Room 2-8' 'LKSLIB Meeting Pod B1-2'
 'SOE/SCIS2 Seminar Room 2-5' 'SOA Seminar Room 3-3'
 'LKCSB Seminar Room 1-1' 'SOE/SCIS2 Seminar Room 2-6'
 'SOE/SCIS2 Classroom 3-1' 'YPHSL Seminar Room 2-01' 'LKCSB Classroom 3-5'
 'LKCSB Seminar Room 3-8' 'SOA Seminar Room 2-3'
 'SOE/SCIS2 Seminar Room 2-9'
 'SOE/SCIS2 Catering Area 4A (Near to SR 4-2)'
 'SOE/SCIS2 Seminar Room 2-3' 'SOE/SCIS2 Seminar Room 5-1'
 'YPHSL Seminar Room 2-02' 'SOSS/CIS Seminar Room 1-3'
 'SOA Seminar Room 2-4' 'LKCSB Seminar Room 2-3'
 'SOE/SCIS

In this section:

1. We filter the FBS Report to keep confirmed bookings in Seminar Rooms and Classrooms*, together with bookings whose Reason is "course cancelled".
2. We rename the "Facility" column to "Facility Name" in the filtered FBS Report so that the naming convention is consistent between event_code_df and this dataframe.
3. We keep only the columns needed for the merge and extract the Course Code from the Purpose column.
4. Before merging, we sort the FBS Report by Course Code, Booking Status and Booking Date, then keep the first row per Course Code. This ensures that if a course code appears more than once in the FBS Report, only the latest venue is extracted, with confirmed bookings taking priority over cancelled ones.
5. Finally, we perform a left merge of the two dataframes on "Course Code".

**Note: The filtered FBS Report will exclude catering venues as well*
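
The "keep only the latest venue per course" idea can be sketched on a toy table. The rows and room names below are made up for illustration; only the column names mirror the FBS Report. The sketch relies on "Confirmed" sorting after "Cancelled" alphabetically, so a descending sort on the status column puts confirmed bookings first.

```python
import pandas as pd

# Hypothetical bookings: three for course AAA, one for course BBB
toy_fbs = pd.DataFrame({
    "Course Code": ["AAA", "AAA", "BBB", "AAA"],
    "Facility Name": ["Room 1", "Room 2", "Room 3", "Room 4"],
    "BookingStatus": ["Confirmed", "Confirmed", "Cancelled", "Cancelled"],
    "Booking Date": pd.to_datetime(
        ["2024-01-10", "2024-06-05", "2024-03-01", "2024-09-01"]
    ),
})

# Descending status puts "Confirmed" before "Cancelled";
# descending date puts the most recent booking first within each status
toy_sorted = toy_fbs.sort_values(
    by=["Course Code", "BookingStatus", "Booking Date"],
    ascending=[True, False, False],
)

# Keep the first (highest-priority, most recent) row per course
toy_latest = toy_sorted.drop_duplicates(subset="Course Code", keep="first")
print(toy_latest[["Course Code", "Facility Name"]])
```

Here course AAA keeps "Room 2" (its latest confirmed booking) even though a cancelled booking has a later date.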

In [76]:
# Create a copy of event_code_df
event_code_copy = event_code_df.copy()

# Filter for confirmed bookings in Seminar Room or Classroom facilities, plus bookings cancelled because the course was cancelled
# (the non-capturing group (?:...) avoids the pandas "match groups" UserWarning from str.contains)
confirmed_fbs_report = fbs_report_df[
    ((fbs_report_df["BookingStatus"] == "Confirmed") & 
     (fbs_report_df["Facility"].str.contains(r'\b(?:Seminar Room|Classroom)\b', case=False, na=False))) |
    (fbs_report_df["Reason"].str.lower() == "course cancelled")
]

# Select relevant columns and rename 'Facility' to 'Facility Name'
confirmed_fbs_report = confirmed_fbs_report[['Building', 'Floor', 'Facility', 'Purpose', 'Booking Date', 'BookingStatus', 'Reason']]
confirmed_fbs_report = confirmed_fbs_report.rename(columns={'Facility': 'Facility Name'})

# Convert 'Booking Date' to datetime format
confirmed_fbs_report['Booking Date'] = pd.to_datetime(confirmed_fbs_report['Booking Date'])

# Exclude catering venues
catering_venue_to_exclude = [
    'SOE/SCIS2 Catering Area 4A (Near to SR 4-2)', 
    'SOE/SCIS2 Catering Area 4B (Near to SR 4-2)', 
    'SOE/SCIS2 Catering Area 4C (Near to SR 4-4)',
    'SOE/SCIS2 Catering Area B1A',
    'SOE/SCIS2 Catering Area B1B',
    'SOE/SCIS2 Catering Area B1C',
]
confirmed_fbs_report = confirmed_fbs_report[~confirmed_fbs_report['Facility Name'].isin(catering_venue_to_exclude)]

# Extract the course code from the "Purpose" column
confirmed_fbs_report['Course Code'] = confirmed_fbs_report['Purpose'].str.split('~').str[0].str.strip()

# Sort by 'Course Code', 'Booking Status' (prioritizing Confirmed first), and 'Booking Date' (latest first)
confirmed_fbs_report = confirmed_fbs_report.sort_values(by=['Course Code', 'BookingStatus', 'Booking Date'], 
                                                       ascending=[True, False, False])

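# Keep the most recent entry per Course Code, prioritizing Confirmed over Cancelled.
# drop_duplicates keeps the whole first row; groupby(...).first() would instead take the
# first non-null value per column and could mix values from different bookings.
latest_confirmed_fbs_report = (
    confirmed_fbs_report.dropna(subset=['Course Code'])
    .drop_duplicates(subset='Course Code', keep='first')
)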

# Drop the 'Purpose' column as it's no longer needed
latest_confirmed_fbs_report = latest_confirmed_fbs_report.drop(columns=['Purpose'])

# Perform a left merge on 'Course Code', using the copy created at the top of this cell
merged_df = pd.merge(event_code_copy, latest_confirmed_fbs_report, on='Course Code', how='left')

In [77]:
#Print total number of rows after merge
print("Number of rows after merge: ", merged_df.shape[0])    
#Print number of rows with missing values in the "Facility Name" column
print("Number of rows with missing values in the Facility Name column: ", merged_df["Facility Name"].isnull().sum())

Number of rows after merge:  780
Number of rows with missing values in the Facility Name column:  272
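
As an optional safeguard, `pd.merge` can check that the left merge does not duplicate session rows and can report which rows found no venue. The sketch below uses small made-up stand-ins for `event_code_df` and `latest_confirmed_fbs_report`; the `validate` and `indicator` arguments are standard `pd.merge` parameters, but adding them is a suggestion rather than part of the original workflow.

```python
import pandas as pd

# Hypothetical stand-ins for event_code_df and latest_confirmed_fbs_report
sessions = pd.DataFrame({"Course Code": ["AAA", "AAA", "BBB", "CCC"]})
venues = pd.DataFrame({
    "Course Code": ["AAA", "BBB"],
    "Facility Name": ["Room 2", "Room 3"],
})

# validate="many_to_one" raises MergeError if a course code repeats on the right side;
# indicator=True adds a "_merge" column showing which rows found a match
checked = pd.merge(
    sessions, venues, on="Course Code", how="left",
    validate="many_to_one", indicator=True,
)

unmatched = checked[checked["_merge"] == "left_only"]
print(len(checked), "rows after merge;", len(unmatched), "without a venue")
print(unmatched["Course Code"].unique())
```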


## 8.0  Clean Up Data Frame

1. Remove all unnecessary columns, and keep only the columns as per what is contained in the Facility Booking List Template File
2. Change Session Date Format from YYYY-MM-DD to dd-MMM-yyyy to match the format required
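
The date reformatting in step 2 is a single `strftime` call. The sketch below, on made-up dates, also illustrates a side effect worth keeping in mind: once the dates are text, sorting or taking a minimum compares them alphabetically, so any later date logic should convert back with `pd.to_datetime` first.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2025-01-22", "2025-02-03"]))

# YYYY-MM-DD -> dd-MMM-yyyy (month abbreviations follow the active locale)
as_text = dates.dt.strftime("%d-%b-%Y")
print(as_text.tolist())

# String comparison is alphabetical, so this is not the earliest date
print(as_text.min())

# Converting back restores chronological comparison
restored = pd.to_datetime(as_text, format="%d-%b-%Y")
print(restored.min().strftime("%d-%b-%Y"))
```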

In [78]:
# Remove Unnecessary Columns (take a copy so later edits do not trigger SettingWithCopyWarning)
cleaned_df = merged_df[["Facility Name","Building","Floor","Session Date","Time Booked From","Time Booked To",
                     "Use Type","Purpose","IO Code","Course Code","Sch #"]].copy()

cleaned_df = cleaned_df.rename(columns = {'Session Date':'Date of Booking',
                           'Time Booked From':'Time of Booking From',
                           'Time Booked To':'Time of Booking To',
                           'IO Code':'Event Code'})

In [79]:
# Change the Format for the "Date of Booking" Column from YYYY-MM-DD to dd-MMM-yyyy
cleaned_df['Date of Booking'] = cleaned_df['Date of Booking'].dt.strftime('%d-%b-%Y')

In [80]:
#Check the number of rows in the cleaned dataframe
print("Number of current rows: ", cleaned_df.shape[0])

Number of current rows:  780


In [81]:
#Preview the "Facility Name" column; NaN marks courses that still need a venue
print(cleaned_df['Facility Name'])

0      SMUC Active Learning Classroom 3-3
1      SMUC Active Learning Classroom 3-3
2      SMUC Active Learning Classroom 4-1
3                                     NaN
4                                     NaN
                      ...                
775    SMUC Active Learning Classroom 4-2
776    SMUC Active Learning Classroom 4-2
777    SMUC Active Learning Classroom 4-2
778                                   NaN
779                                   NaN
Name: Facility Name, Length: 780, dtype: object


## 9.0 Additional Checks to Perform

1. For courses that were not assigned any venues from the merge above, assign randomly based on Venue Preference
2. For courses that run for more than 1 day, assign the same venue
3. Check for venue clashes and see if 2 or more courses have the same venue, and correct accordingly
4. Book catering for the respective venues
5. Generate the "Venue Preference" Column
6. If the venue assigned to a course matches what was indicated in SMUA Venue Matters: Indicate "Match", if not indicate as "Not Match"
7. If unable to find the course code in SMUA venue matters: Indicate "Preference not indicated"
8. Generate the "No.of Course Days" column
9. Ensure that all the facility names follow the naming convention suitable for FBS, using Facility Names File to look up the correct naming  
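
Items 6 and 7 can be sketched as a small labelling function. This is only an illustration under stated assumptions: the function name `venue_preference` is hypothetical, the matching rule (the preferred room type appears in the facility name, and "Seminar Room/Classroom" accepts either type) is assumed rather than taken from the notebook, and a "No preference" entry would need its own rule.

```python
import pandas as pd

def venue_preference(facility_name, preference_type):
    """Label a booking against its preferred room type (assumed matching rule)."""
    # No preference recorded for this course code
    if pd.isna(preference_type) or str(preference_type).strip() == "":
        return "Preference not indicated"
    # "Seminar Room/Classroom" is treated as accepting either room type
    accepted = [p.strip().lower() for p in str(preference_type).split("/")]
    name = str(facility_name).lower()
    return "Match" if any(p in name for p in accepted) else "Not Match"

# Made-up bookings covering the three outcomes
toy = pd.DataFrame({
    "Facility Name": [
        "LKCSB Seminar Room 2-1",
        "SOA Classroom 2-1",
        "SCIS1 Classroom 3-1",
        "YPHSL Seminar Room 3-01",
    ],
    "Preference Type": ["Seminar Room", "Seminar Room", "Seminar Room/Classroom", ""],
})
toy["Venue Preference"] = [
    venue_preference(f, p)
    for f, p in zip(toy["Facility Name"], toy["Preference Type"])
]
print(toy)
```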

In [82]:
#Check for any missing values in the Facility Name Column 
print(f"Number of missing 'Facility Name' values in the dataframe: {(cleaned_df['Facility Name'].isnull().sum())}")

#Print total number of rows
print(f"Total number of rows in the dataframe: {cleaned_df.shape[0]}") 

Number of missing 'Facility Name' values in the dataframe: 272
Total number of rows in the dataframe: 780


In [83]:
cleaned_df[cleaned_df['Course Code'] == "ACGAIM6"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #
70,,,,27-Feb-2025,08:00 AM,05:00 PM,Event,ACGAIM6 ~ Advanced Certificate in Generative A...,ZAC1D6312,ACGAIM6,SCH24-Not Specified-ACGAIM6-00007
71,,,,28-Feb-2025,08:00 AM,06:00 PM,Event,ACGAIM6 ~ Advanced Certificate in Generative A...,ZAC1D6312,ACGAIM6,SCH24-Not Specified-ACGAIM6-00007
72,,,,10-Apr-2025,08:00 AM,05:00 PM,Event,ACGAIM6 ~ Advanced Certificate in Generative A...,ZAC1D6312,ACGAIM6,SCH24-Not Specified-ACGAIM6-00008


### 9.1 Generate the Preference Type Column, Fill in Empty Rows

Now, we generate the "Preference Type" column, which pairs each course code with its preferred room type as indicated in the "SMUA Venue Matters" Excel file.

Next, courses that did not retrieve a venue from the FBS Report are assigned a random venue based on the indicated Preference Type.


#### Preference Type Column

To generate the "Preference Type" column:
1. First, we read the sheets for the different pillars into separate dataframes, then combine them into a single dataframe
2. Then, we generate the "Preference Type" column from the "Room Type" column in the Excel sheet, using a helper function
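
Step 1 can also be written as a loop over the sheets instead of one variable per pillar. The sketch assumes `pillar_room_pref_df` is a dict of dataframes keyed by sheet name, which is what `pd.read_excel(..., sheet_name=None)` returns; the two toy sheets below stand in for it and their contents are made up.

```python
import pandas as pd

# Stand-in for the dict returned by pd.read_excel(path, sheet_name=None)
toy_sheets = {
    "Rm Pref - BM": pd.DataFrame({
        "Course Code": ["hdr", "AAA"],
        "Room Type": ["hdr", "Classroom"],
    }),
    "Rm Pref - FIT": pd.DataFrame({
        "Course Code": ["hdr", "BBB", "CCC"],
        "Room Type": ["hdr", "Seminar Room", "Classroom"],
    }),
}

# Drop the first row of every sheet, then stack the sheets into one dataframe
toy_pref = pd.concat(
    [sheet.iloc[1:] for sheet in toy_sheets.values()],
    ignore_index=True,
)
print(toy_pref.shape[0])
```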

In [84]:
pref_type_df = cleaned_df.copy()

#Read all sheets in SMUA Venue Matters into separate dataframes
bm_pref = pillar_room_pref_df['Rm Pref - BM']
fit_pref = pillar_room_pref_df['Rm Pref - FIT']
hcml_pref = pillar_room_pref_df['Rm Pref - HCML']
sobi_pref = pillar_room_pref_df['Rm Pref - SOBI']

#strip the first row from each dataframe
bm_pref = bm_pref[1:]
fit_pref = fit_pref[1:]
hcml_pref = hcml_pref[1:]
sobi_pref = sobi_pref[1:]

In [85]:
#Get the number of rows in each sheet
bm_count = bm_pref.shape[0]
fit_count = fit_pref.shape[0]
hcml_count = hcml_pref.shape[0]
sobi_count = sobi_pref.shape[0]

print("Number of rows in BM Pref: ", bm_pref.shape[0])
print("Number of rows in FIT Pref: ", fit_pref.shape[0])
print("Number of rows in HCML Pref: ", hcml_pref.shape[0])
print("Number of rows in SOBI Pref: ", sobi_pref.shape[0])

Number of rows in BM Pref:  96
Number of rows in FIT Pref:  219
Number of rows in HCML Pref:  87
Number of rows in SOBI Pref:  115


In [86]:
#concatenate all the sheets together
merged_pref = pd.concat([bm_pref, fit_pref, hcml_pref, sobi_pref], ignore_index=True)
print("The total number of rows: ", (bm_count + fit_count + hcml_count + sobi_count))

#rename the Room Type column
merged_pref.rename(columns = {'Room Type\n(Seminar Room or Class Room)':'Room Type'}, inplace = True)

merged_pref

The total number of rows:  517


Unnamed: 0,Course Title,Course Code,Room Type,Any Comments,Unnamed: 4,Any Comments\n
0,Transforming Enterprises Module 3: Manage (Eng...,TEM3-2024,Seminar Room/Classroom,Preference: YPHSL\nRoom Booking until 6pm for ...,,
1,Advanced Data Analytics: Making Better Custome...,ADAMBCDUA,Seminar Room,Seminar Room is a must as the course require p...,,
2,Applying Smart Tech in a Smart Way: The Right ...,ASTSWTRADT,Classroom,Cluster set up,,
3,Art & Science of Sales Management,ASSM,Seminar Room/Classroom,Room size for at least 30 pax and to be in clu...,,
4,"Attracting, Hiring, Motivating & Rewarding Sal...",AHMRSTE,Seminar Room/Classroom,Room size for at least 30 pax and to be in clu...,,
...,...,...,...,...,...,...
512,Artificial Intelligence (AI) in Marketing: The...,AIIM,Seminar Room,,,"Prefer SOE SR5-1, SR4-1"
513,(UOB) Communicating Data with Impact: Data Sto...,CDIDSV,Classroom,,,"Require classroom at Business school, furnitur..."
514,Community Learning for Personal Development - ...,CLPDMALSETC,Classroom,,,Preference: SOA Classrooms. Classroom to be se...
515,Community Building and Leadership,CBAL,Classroom,,,Preference: SOA Classrooms. Classroom to be se...


In [87]:
#Strip the relevant columns in the merged_pref dataframe of leading and trailing spaces to ensure consistent formatting
merged_pref['Room Type'] = merged_pref['Room Type'].str.strip()
merged_pref['Course Code'] = merged_pref['Course Code'].str.strip()
merged_pref['Course Title'] = merged_pref['Course Title'].str.strip()

print(merged_pref.columns)
print(merged_pref['Room Type'].unique())

Index(['Course Title', 'Course Code', 'Room Type', 'Any Comments',
       'Unnamed: 4', 'Any Comments\n'],
      dtype='object')
['Seminar Room/Classroom' 'Seminar Room' 'Classroom' 'No preference' nan]


1. Define a function that extracts the course code from the Purpose string in pref_type_df and looks it up in the Course Code column of merged_pref.
2. Apply the function to the Purpose column to generate the Preference Type column
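
The same lookup can be expressed without a row-by-row function by building a code-to-room-type Series and using `Series.map`. This is an alternative sketch on made-up data, not the notebook's own method; like the function-based version it takes the first listed room type when a course code appears more than once, and it returns an empty string when no match is found.

```python
import pandas as pd

toy_pref = pd.DataFrame({
    "Course Code": ["AAA", "BBB", "AAA"],
    "Room Type": ["Classroom", "Seminar Room", "Seminar Room"],
})
toy_bookings = pd.DataFrame({
    "Purpose": ["AAA ~ Course A", "BBB ~ Course B", "ZZZ ~ Course Z"],
})

# First listed room type per course code, as a lookup Series
room_type_by_code = (
    toy_pref.drop_duplicates("Course Code").set_index("Course Code")["Room Type"]
)

# The course code is the text before "~" in Purpose
codes = toy_bookings["Purpose"].str.split("~").str[0].str.strip()
toy_bookings["Preference Type"] = codes.map(room_type_by_code).fillna("")
print(toy_bookings)
```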

In [88]:
def populate_venue_match(course_title):
    # Extract the course code by removing everything after "~"
    course_code = course_title.split('~')[0]
    course_code = course_code.strip()
    
    # Get rows where Course Code matches
    matching_rows = merged_pref[merged_pref['Course Code'] == course_code]
    
    # Check if the DataFrame is not empty
    if not matching_rows.empty:
        first_row = matching_rows.iloc[0]
        pref_type = first_row['Room Type']
        return pref_type
    return ""

pref_type_df['Preference Type'] =  pref_type_df['Purpose'].apply(populate_venue_match)

print(pref_type_df['Facility Name'].isnull().sum())
pref_type_df.sample(10)

272


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
275,,,,07-Feb-2025,08:00 AM,06:00 PM,Event,BAWCC ~ Building A Winning Corporate Culture,ZAC1A1034,BAWCC,SCH24-Graduate Diploma-BAWCC-00005,Seminar Room
0,SMUC Active Learning Classroom 3-3,SMU Connexion,Level 3,06-Mar-2025,08:00 AM,06:00 PM,Event,ACAMCAB ~ A Case Approach to Modelling Corpora...,ZAC1D6029,ACAMCAB,SCH24-Not Specified-ACAMCAB-00003,
224,SCIS1 Classroom 3-1,School of Computing & Information Systems 1,Level 3,21-Feb-2025,08:00 AM,06:00 PM,Event,ACW3M3 ~ Advanced Certificate in Web 3.0 Modul...,ZAC1D6252,ACW3M3,SCH24-Not Specified-ACW3M3-00005,Seminar Room/Classroom
109,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,20-Jan-2025,08:00 AM,06:00 PM,Event,ACW3M9 ~ Advanced Certificate in Metaverse and...,ZAC1D6277,ACW3M9,SCH24-Not Specified-ACW3M9-00005,
83,SOE/SCIS2 Classroom 4-1,School of Economics/School of Computing & Info...,Level 4,28-Feb-2025,08:00 AM,06:00 PM,Event,ACGAICCM7 ~ Advanced Certificate in Generative...,ZAC1D6435,ACGAICCM7,SCH24-Not Specified-ACGAICCM7-00004,
530,LKCSB Seminar Room 2-1,Lee Kong Chian School of Business,Level 2,22-Apr-2025,08:00 AM,06:00 PM,Event,GCMCSM5 ~ Graduate Certificate in Media Commun...,ZAC1D6133,GCMCSM5,SCH24-Not Specified-GCMCSM5-00005,Seminar Room/Classroom
108,SOE/SCIS2 Seminar Room 4-2,School of Economics/School of Computing & Info...,Level 4,05-Mar-2025,08:00 AM,06:00 PM,Event,ACLSCMM6 ~ Advanced Certificate in Logistics a...,ZAC1D2199,ACLSCMM6,SCH24-Advanced Diploma-ACLSCMM6-00003,
344,SOE/SCIS2 Seminar Room 3-10,School of Economics/School of Computing & Info...,Level 3,27-Mar-2025,08:00 AM,05:00 PM,Event,CTEPG ~ Critical Thinking Essentials for Profe...,ZAC1D6302,CTEPG,SCH24-Not Specified-CTEPG-00002,Seminar Room/Classroom
377,,,,21-Mar-2025,06:00 PM,10:30 PM,AdHoc,CNWDNMW ~ Diet and Nutrition for Mental Wellness,,CNWDNMW,SCH24-Not Specified-CNWDNMW-00004,Seminar Room
652,,,,18-Mar-2025,08:00 AM,05:00 PM,Event,ACSSBM1 ~ Module 1: Introduction to Sustainabi...,ZAC1E0301,ACSSBM1,SCH24-Not Specified-ACSSBM1-00003,Classroom


#### Empty Venues

For rows where no venue was assigned, we will randomly assign venues based on the following conditions:
>
1. If the Preference Type is either "Seminar Room" or "Classroom", the venue is drawn at random from facilities of that type only
2. If the Preference Type is "Seminar Room/Classroom" or anything else (including blank), the venue is drawn at random from the full filtered facility list
>
We will also ensure that there are no clashing venues with the already existing venues, and if there are, they will be resolved
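
One way to keep random assignment from clashing with existing bookings is to track the (venue, date) pairs already taken and draw only from venues that are free on every run day. The sketch below is a minimal illustration: `pick_free_venue` is a hypothetical helper, the room names are made up, and it assumes a whole-day granularity (it ignores booking times).

```python
import random

def pick_free_venue(candidates, dates, booked, rng):
    """Return a random venue that is free on every date in `dates`, or None."""
    free = [v for v in candidates if all((v, d) not in booked for d in dates)]
    if not free:
        return None
    venue = rng.choice(free)
    # Record the new booking so later picks cannot clash with it
    booked.update((venue, d) for d in dates)
    return venue

rng = random.Random(0)                    # fixed seed so the sketch is repeatable
booked = {("Room A", "06-Mar-2025")}      # (venue, date) pairs already taken
candidates = ["Room A", "Room B"]

first = pick_free_venue(candidates, ["06-Mar-2025", "07-Mar-2025"], booked, rng)
second = pick_free_venue(candidates, ["07-Mar-2025"], booked, rng)
third = pick_free_venue(candidates, ["07-Mar-2025"], booked, rng)
print(first, second, third)
```

The third request returns `None` because both rooms are taken on that date, which is the point at which a manual correction or a wider venue pool would be needed.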

In [89]:
#Check the rows for which the Facility Name column is empty
print(f"Number of rows with empty 'Facility Name' values: {pref_type_df['Facility Name'].isnull().sum()}")

#Isolate these rows from the original dataframe, into a separate dataframe
empty_df = pref_type_df[pref_type_df['Facility Name'].isnull()]

#Check the number of rows in the empty_df
print(f"Number of rows in empty_df: {empty_df.shape[0]}")

Number of rows with empty 'Facility Name' values: 272
Number of rows in empty_df: 272


In [90]:
empty_df['Preference Type'].value_counts()

Preference Type
                          148
Seminar Room               70
Seminar Room/Classroom     46
Classroom                   8
Name: count, dtype: int64

In [91]:
# Filter facility_list_df to rows whose "Facility Type" contains "Seminar Room" or "Classroom"
# (the non-capturing group (?:...) avoids the pandas "match groups" UserWarning from str.contains)
filtered_facility_list_df = facility_list_df[
    facility_list_df['Facility Type'].str.contains(r'\b(?:Seminar Room|Classroom)\b', case=False, na=False)
]

print(filtered_facility_list_df['Facility Type'].value_counts())

filtered_facility_list_df = filtered_facility_list_df.drop(columns=['Unnamed: 5', 'Unnamed: 6', 'Unnamed: 7'])
filtered_facility_list_df

Facility Type
Seminar Room              93
Classroom                 39
Executive Seminar Room     4
Name: count, dtype: int64


Unnamed: 0,Site,Building,Floor,Facility Name,Facility Type
108,Bras Basah,Lee Kong Chian School of Business,Level 1,LKCSB Seminar Room 1-1,Seminar Room
109,Bras Basah,Lee Kong Chian School of Business,Level 1,LKCSB Seminar Room 1-2,Seminar Room
114,Bras Basah,Lee Kong Chian School of Business,Level 2,LKCSB Classroom 2-1,Classroom
140,Bras Basah,Lee Kong Chian School of Business,Level 2,LKCSB Seminar Room 2-1,Seminar Room
141,Bras Basah,Lee Kong Chian School of Business,Level 2,LKCSB Seminar Room 2-2,Seminar Room
...,...,...,...,...,...
613,Stamford,Yong Pung How School of Law/Kwa Geok Choo Law ...,Level 3,YPHSL Seminar Room 3-12,Seminar Room
656,Victoria Street,Administration Building,Level 4,Admin Executive Seminar Room 4-1,Executive Seminar Room
657,Victoria Street,Administration Building,Level 4,Admin Executive Seminar Room 4-2,Executive Seminar Room
672,Victoria Street,Administration Building,Level 5,Admin Executive Media Theatre,Executive Seminar Room


In [92]:
pref_type_df[pref_type_df["Course Code"] == "ACGAICCM1"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
73,,,,11-Mar-2025,08:00 AM,06:00 PM,Event,ACGAICCM1 ~ Advanced Certificate in Generative...,ZAC1D6429,ACGAICCM1,SCH24-Not Specified-ACGAICCM1-00004,
74,,,,30-Apr-2025,08:00 AM,06:00 PM,Event,ACGAICCM1 ~ Advanced Certificate in Generative...,ZAC1D6429,ACGAICCM1,SCH24-Not Specified-ACGAICCM1-00005,


1. The first function identifies the earliest booking date of each course and assigns it a random venue according to the conditions defined above. Breaks between run days are not considered here; they are handled in Section 9.2
2. The second function then copies this venue to every other day on which the same course runs

In [93]:
import random

missing_df = pref_type_df[pref_type_df['Facility Name'].isnull()].copy()

def assign_venue_to_first_run(df_missing, venue_df):
    """
    Assign a random venue to the earliest run day of each course.
    """
    assigned_venues = {}
    
    # Separate classrooms and seminar rooms
    classroom_df = venue_df[venue_df['Facility Type'] == 'Classroom']
    seminar_room_df = venue_df[venue_df['Facility Type'] == 'Seminar Room']
    
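    # Get the earliest run date per course. Parse the 'dd-Mon-YYYY' strings first,
    # otherwise the minimum is alphabetical (e.g. '03-Feb-2025' < '22-Jan-2025')
    run_dates = pd.to_datetime(df_missing['Date of Booking'], format='%d-%b-%Y')
    first_runs = df_missing.loc[run_dates.groupby(df_missing['Purpose']).idxmin()]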

    for idx, row in first_runs.iterrows():
        pref_type = row['Preference Type']

        if pref_type == 'Classroom' and not classroom_df.empty:
            chosen_venue = random.choice(classroom_df['Facility Name'].values)
        elif pref_type == 'Seminar Room' and not seminar_room_df.empty:
            chosen_venue = random.choice(seminar_room_df['Facility Name'].values)
        else:
            chosen_venue = random.choice(venue_df['Facility Name'].values)

        # Assign the chosen venue to the earliest run date
        df_missing.at[idx, 'Facility Name'] = chosen_venue
        assigned_venues[row['Purpose']] = chosen_venue

        print(f"✅ Assigned '{chosen_venue}' to '{row['Purpose']}' on earliest date '{row['Date of Booking']}'.")

    return df_missing, assigned_venues

In [94]:
def propagate_venues(df_missing, assigned_venues):
    """
    Propagate the assigned venue from the earliest run day to all other run days of the same course.
    """
    for purpose, venue in assigned_venues.items():
        df_missing.loc[df_missing['Purpose'] == purpose, 'Facility Name'] = venue
        print(f"✅ Propagated venue '{venue}' to all run days of course '{purpose}'.")
    
    return df_missing

In [95]:
# Step 1: Assign venues to earliest run day
missing_df, assigned_venues = assign_venue_to_first_run(missing_df, filtered_facility_list_df)

# Step 2: Propagate assigned venues to other run days
missing_df = propagate_venues(missing_df, assigned_venues)

✅ Assigned 'YPHSL Seminar Room 3-01' to 'ACABPMM4 ~ Advanced Certificate in Agile Business Practices and Management Module 4: Agile Coaching (ICP-ACC)' on earliest date '22-Jan-2025'.
✅ Assigned 'SCIS1 Seminar Room 2-1' to 'ACABPMM5 ~ Advanced Certificate in Agile Business Practices and Management Module 5: Coaching Agile Transformations (ICP-CAT)' on earliest date '24-Feb-2025'.
✅ Assigned 'SCIS1 Seminar Room 3-4' to 'ACAMAPM4 ~ SMU-NAFA Advanced Certificate in Arts Management for Arts Professionals Module 4: Arts Marketing and Audience Development' on earliest date '20-Jan-2025'.
✅ Assigned 'YPHSL Seminar Room 3-09' to 'ACAMAPM5 ~ SMU-NAFA Advanced Certificate in Arts Management for Arts Professionals Module 5: Arts Venues Management' on earliest date '03-Feb-2025'.
✅ Assigned 'SOE/SCIS2 Seminar Room 2-1' to 'ACAMDLMM5 ~ SMU-NAFA Advanced Certificate in Mastering Design Literacy for Marketing Module 5: Responsive Design' on earliest date '06-Feb-2025'.
✅ Assigned 'LKCSB Seminar Room 

✅ Propagated venue 'SCIS1 Seminar Room 2-1' to all run days of course 'ACDGSM5 ~ Advanced Certificate in Data Governance Systems Module 5: Implementing a Compliance Management System ISO37301'.
✅ Propagated venue 'SOE/SCIS2 Seminar Room 2-2' to all run days of course 'ACDPOEM4 ~ Advanced Certificate in Data Protection Operational Excellence Module 4: Data Protection Management Programme (DPMP)'.
✅ Propagated venue 'SOE/SCIS2 Seminar Room B1-1' to all run days of course 'ACDPPM6 ~ Advanced Certificate in Data Protection Principles Module 6: Data Protection Framework and Standards. ISO 29100, Nymity Accountability and APEC Privacy Framework'.
✅ Propagated venue 'YPHSL Seminar Room 2-01' to all run days of course 'ACFACCAV2 ~ Advanced Cash Flow Analysis - Counterparty Credit Analysis'.
✅ Propagated venue 'SOE/SCIS2 Seminar Room 3-9' to all run days of course 'ACGAICCM1 ~ Advanced Certificate in Generative Artificial Intelligence-Enhanced Social Media Content Creation Module 1: Discovering

In [96]:
# Check if there are still any missing venues
print(f"Number of missing venues after assignment: {missing_df['Facility Name'].isnull().sum()}")

# Sample check
missing_df[missing_df['Course Code'] == "ACABPMM4"]

Number of missing venues after assignment: 0


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
11,YPHSL Seminar Room 3-01,,,22-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
12,YPHSL Seminar Room 3-01,,,23-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
13,YPHSL Seminar Room 3-01,,,24-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,


In [97]:
#Update the dataframe 
pref_type_df.update(missing_df)

print(f"No. of missing 'Facility Name' values: {pref_type_df['Facility Name'].isnull().sum()}")

No. of missing 'Facility Name' values: 0


### 9.2 Assign the same venue for courses that run for more than 1 day

1. This section starts by sorting pref_type_df by "Purpose" and "Date of Booking", so that the rows of each course are grouped together and ordered by date.
2. To ensure that multi-day courses have the same venue throughout their entire run, the code extracts the venue details for the first day of the course and assigns that venue to all the days on which the course runs.
3. It also accounts for the fact that some of these courses may not run on strictly consecutive days and may have a 1-day break in between.
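
The "consecutive days with at most a 1-day break" rule in step 3 amounts to checking that every gap between sorted dates is 1 or 2 days. The sketch below expresses this with pandas; `is_single_run` is a hypothetical name, and the example dates are chosen for illustration.

```python
import pandas as pd

def is_single_run(dates, max_gap_days=2):
    """True if every step between sorted dates is between 1 and max_gap_days days."""
    gaps = pd.Series(sorted(dates)).diff().dropna().dt.days
    return bool(gaps.between(1, max_gap_days).all())

run_a = pd.to_datetime(["2025-01-22", "2025-01-23", "2025-01-24"])  # consecutive days
run_b = pd.to_datetime(["2025-02-27", "2025-02-28", "2025-04-10"])  # later date weeks apart
run_c = pd.to_datetime(["2025-03-03", "2025-03-05"])                # one-day break

print(is_single_run(run_a), is_single_run(run_b), is_single_run(run_c))
```

A gap of 2 days corresponds to a one-day break, which is why the default `max_gap_days` is 2.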

In [98]:
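from datetime import datetime

# Define a function to parse the date string into a datetime object if not already in that format
def parse_date(date_value):
    if isinstance(date_value, datetime):  # already parsed (pd.Timestamp is a datetime subclass)
        return date_value
    return datetime.strptime(date_value, '%d-%b-%Y')

# Apply parse_date to the 'Date of Booking' column of pref_type_df
pref_type_df['Date of Booking'] = pref_type_df['Date of Booking'].apply(parse_date)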

# Function to check for consecutive dates with a possible one-day break
def has_consecutive_dates_with_one_day_break(dates):
    # No need to sort here since input is pre-sorted
    for i in range(len(dates) - 1):
        gap = (dates[i + 1] - dates[i]).days
        if gap != 1 and gap != 2:  # Allow for consecutive days or a one-day break
            return False
    return True

# Function to assign the same venue for consecutive courses
def assign_same_venue(df):
    # Create a copy and pre-sort by Purpose and Date
    df = df.copy()
    df = df.sort_values(by=['Purpose', 'Date of Booking'])
    
    # Now process each group - the dates will already be in order
    for purpose, group in df.groupby('Purpose'):
        dates = group['Date of Booking'].tolist()  # dates are already sorted
        if len(dates) > 1 and has_consecutive_dates_with_one_day_break(dates):
            first_facility_name = group.iloc[0]['Facility Name']
            first_building = group.iloc[0]['Building']
            first_floor = group.iloc[0]['Floor']
            
            # Update all rows for this purpose with the first venue details
            df.loc[df['Purpose'] == purpose, 'Facility Name'] = first_facility_name
            df.loc[df['Purpose'] == purpose, 'Building'] = first_building
            df.loc[df['Purpose'] == purpose, 'Floor'] = first_floor
    
    return df

# Apply the function to the DataFrame
same_venues_df = assign_same_venue(pref_type_df)

# Convert the 'Date of Booking' column back to string format for consistency
same_venues_df['Date of Booking'] = same_venues_df['Date of Booking'].dt.strftime('%d-%b-%Y')
same_venues_df.reset_index(drop=True, inplace=True)

We will now double-check that all courses that run for multiple days have been assigned the same venue throughout their entire run

In [99]:
# Ensure sorting before checking
same_venues_df = same_venues_df.sort_values(by=['Purpose', 'Date of Booking'])

def check_consecutive_venues(df):
    """
    Identify courses where consecutive dates (with at most a one-day break)
    have different assigned venues.
    
    Returns:
    - DataFrame of courses with inconsistencies.
    """
    df = df.copy()  # Avoid modifying the caller's dataframe

    # Convert dates to datetime first so that sorting is chronological
    # (sorting '%d-%b-%Y' strings would order them alphabetically)
    df['Date of Booking'] = pd.to_datetime(df['Date of Booking'], format='%d-%b-%Y')
    df = df.sort_values(by=['Purpose', 'Date of Booking'])  # Ensure correct order

    # Look ahead to the next booking of the same course
    df['Next Date'] = df.groupby('Purpose')['Date of Booking'].shift(-1)
    df['Next Venue'] = df.groupby('Purpose')['Facility Name'].shift(-1)

    # Calculate day gaps
    df['Gap'] = (df['Next Date'] - df['Date of Booking']).dt.days

    # Identify errors: Same course but different venues on consecutive days (1 or 2 day gap)
    conflicting_courses = df[(df['Facility Name'] != df['Next Venue']) & (df['Gap'].isin([1, 2]))]

    if conflicting_courses.empty:
        print("✅ All courses have the same venue on consecutive days.")
    else:
        print("⚠️ The following courses have different venues on consecutive days:")
        display(conflicting_courses[['Purpose', 'Date of Booking', 'Facility Name', 'Next Date', 'Next Venue', 'Gap']])

    return conflicting_courses

# Run the check on the updated dataframe
conflicting_venues_df = check_consecutive_venues(same_venues_df)


✅ All courses have the same venue on consecutive days.


In [100]:
pref_type_df[pref_type_df["Course Code"] == "ACABPMM4"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
11,YPHSL Seminar Room 3-01,,,2025-01-22,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
12,YPHSL Seminar Room 3-01,,,2025-01-23,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
13,YPHSL Seminar Room 3-01,,,2025-01-24,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,


Run the code below only if the cell above indicates that some multi-day courses still have different venues during their run.

1. (SHORTCUT) Highlight the whole code chunk and press CTRL + FORWARD SLASH to uncomment the code
2. Run the code

In [101]:
# # Function to rectify venue conflicts
# def rectify_venue_conflicts(df):
#     for purpose, group in df.groupby('Purpose'):
#         unique_venues = group['Facility Name'].dropna().unique()
#         if len(unique_venues) > 1:
#             first_venue = group['Facility Name'].dropna().iloc[0]
#             df.loc[df['Purpose'] == purpose, 'Facility Name'] = first_venue
#     return df

# # Convert 'Date of Booking' to datetime
# same_venues_df['Date of Booking'] = same_venues_df['Date of Booking'].apply(parse_date)

# # Check and rectify conflicts
# # (check_consecutive_venues returns the conflicting rows, so call it directly;
# #  groupby().filter() expects a function that returns a single True/False per group)
# conflict_purposes = check_consecutive_venues(same_venues_df)

# if not conflict_purposes.empty:
#     print("The following courses still have venue conflicts before rectification:")
#     print(conflict_purposes[['Purpose', 'Date of Booking', 'Facility Name']])

#     # Rectify conflicts
#     same_venues_df = rectify_venue_conflicts(same_venues_df)

#     # Check for conflicts again after rectification
#     conflict_purposes_after = check_consecutive_venues(same_venues_df)

#     if not conflict_purposes_after.empty:
#         print("The following courses still have venue conflicts after rectification:")
#         print(conflict_purposes_after[['Purpose', 'Date of Booking', 'Facility Name']])
#     else:
#         print("No courses have venue conflicts after rectification.")
# else:
#     print("No courses have venue conflicts.")

# # Convert the 'Date of Booking' column back to string format for consistency
# same_venues_df['Date of Booking'] = same_venues_df['Date of Booking'].dt.strftime('%d-%b-%Y')

# same_venues_df.reset_index(drop=True, inplace=True)

### 9.3 Identifying Any Venue Clashes Between Two or More Courses
Ensure that no two or more courses running on the same date have the same venue booked.

**Code logic:**

For two courses with clashing venues, e.g. courses A and B:

1. When a clash is identified, check whether either course has a next latest venue.
2. If one is found for both courses, or for only one of them, leave the venue as it is for course A and assign the next latest venue to course B.
3. If none is found for either course, leave the venue as it is for one course.

For more than two courses with clashing venues, e.g. courses A, B and C:

1. When a clash is identified, check whether the courses have a next latest venue available.
2. If only one course (course A, for example) has a next latest venue, leave the original venue as it is for course B.
3. If two courses (e.g. A and B) have a next latest venue, leave the original venue as it is for course C and change the venues for A and B accordingly.
4. If all three courses have a next latest venue, leave the original venue as it is for course A and change the venues for courses B and C accordingly.

For the remaining courses for which no next latest venue was found, cross-reference the "Facility_List.xlsx" file to find alternate venues in the same building and of the same facility type, and ensure that the newly assigned venues do not clash again.

Overall, whenever a venue clash is identified, the venue must be changed for all days of the course.
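The two key ideas above (leave the first course in place, and move a course for all of its days) can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's implementation: the `bookings` frame is made up, `move_whole_course` is a hypothetical helper, and "Room 2" stands in for whatever venue the next-latest or alternate-venue lookup would return.

```python
import pandas as pd

# Hypothetical miniature of same_venues_df: two courses share a room on 04-Feb-2025.
bookings = pd.DataFrame({
    "Sch #": ["SCH-A", "SCH-A", "SCH-B", "SCH-B"],
    "Facility Name": ["Room 1", "Room 1", "Room 1", "Room 1"],
    "Date of Booking": ["03-Feb-2025", "04-Feb-2025", "04-Feb-2025", "05-Feb-2025"],
    "Time of Booking From": ["08:00 AM"] * 4,
})

clash_keys = ["Facility Name", "Date of Booking", "Time of Booking From"]

# keep="first" leaves the first course on each clashing slot untouched
# and flags only the later bookings as needing a move
to_move = bookings[bookings.duplicated(subset=clash_keys, keep="first")]

def move_whole_course(df, schedule, new_venue):
    """Assign new_venue to every row of the given schedule (hypothetical helper)."""
    df = df.copy()
    df.loc[df["Sch #"] == schedule, "Facility Name"] = new_venue
    return df

# "Room 2" is a placeholder for the result of a venue lookup
for schedule in to_move["Sch #"].unique():
    bookings = move_whole_course(bookings, schedule, "Room 2")

print(bookings["Facility Name"].tolist())
# ['Room 1', 'Room 1', 'Room 2', 'Room 2']
```

Keying the move on "Sch #" rather than on the individual row is what keeps a multi-day course in a single venue after a clash has been resolved.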

In [102]:
same_venues_df

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,08:00 AM,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,08:00 AM,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,08:00 AM,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,
...,...,...,...,...,...,...,...,...,...,...,...,...
773,SMUC Active Learning Classroom 4-2,SMU Connexion,Level 4,17-Jan-2025,08:00 AM,06:00 PM,Event,WA1AWMPP ~ Workplace Automation 101: Automatio...,ZAC1D6187,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00003,Seminar Room/Classroom
776,SOE/SCIS2 Seminar Room 2-9,School of Economics/School of Computing & Info...,Level 2,21-Mar-2025,08:00 AM,06:00 PM,Event,WBO ~ Web3.0 as a Business Opportunity,ZAC1D6327,WBO,SCH24-Not Specified-WBO-00004,Seminar Room/Classroom
777,SOE/SCIS2 Classroom 3-2,,,27-Feb-2025,08:00 AM,05:00 PM,Event,WEFTDA ~ Writing Effectively for the Digital Age,ZAC1D6295,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Seminar Room/Classroom
778,SOE/SCIS2 Classroom 3-2,,,28-Feb-2025,08:00 AM,06:00 PM,Event,WEFTDA ~ Writing Effectively for the Digital Age,ZAC1D6295,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Seminar Room/Classroom


In [103]:
# Double check for venues being booked on the same day for two separate courses
dbl_booking_check = same_venues_df[same_venues_df.duplicated(
    subset=['Facility Name', 'Date of Booking', 'Time of Booking From'], 
    keep=False
)]

# Drop rows where the 'Building' column has missing values
dbl_booking_check = dbl_booking_check.dropna(subset=['Building'])  # reassign instead of inplace to avoid SettingWithCopyWarning

dbl_booking_check.shape[0]

50

In [104]:
same_venues_df[same_venues_df['Facility Name'] == "SCIS1 Classroom 3-2"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
122,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,13-Feb-2025,08:00 AM,06:00 PM,Event,ACORMBM4 ~ Advanced Certificate in Online Repu...,ZAC1D6194,ACORMBM4,SCH24-Not Specified-ACORMBM4-00003,Seminar Room/Classroom
123,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,14-Feb-2025,08:00 AM,06:00 PM,Event,ACORMBM4 ~ Advanced Certificate in Online Repu...,ZAC1D6194,ACORMBM4,SCH24-Not Specified-ACORMBM4-00003,Seminar Room/Classroom
130,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,03-Feb-2025,08:00 AM,06:00 PM,Event,ACPBLM2V2 ~ Advanced Certificate in Practical ...,ZAC1D2207,ACPBLM2V2,SCH24-Not Specified-ACPBLM2V2-00002,
131,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,04-Feb-2025,08:00 AM,06:00 PM,Event,ACPBLM2V2 ~ Advanced Certificate in Practical ...,ZAC1D2207,ACPBLM2V2,SCH24-Not Specified-ACPBLM2V2-00002,
259,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,10-Feb-2025,08:00 AM,05:00 PM,Event,ACW3M8 ~ Advanced Certificate in Metaverse and...,ZAC1D6276,ACW3M8,SCH24-Not Specified-ACW3M8-00004,Seminar Room/Classroom
260,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,11-Feb-2025,08:00 AM,06:00 PM,Event,ACW3M8 ~ Advanced Certificate in Metaverse and...,ZAC1D6276,ACW3M8,SCH24-Not Specified-ACW3M8-00004,Seminar Room/Classroom
261,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,20-Jan-2025,08:00 AM,06:00 PM,Event,ACW3M9 ~ Advanced Certificate in Metaverse and...,ZAC1D6277,ACW3M9,SCH24-Not Specified-ACW3M9-00005,
262,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,21-Jan-2025,08:00 AM,06:00 PM,Event,ACW3M9 ~ Advanced Certificate in Metaverse and...,ZAC1D6277,ACW3M9,SCH24-Not Specified-ACW3M9-00005,
346,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,27-Feb-2025,08:00 AM,06:00 PM,Event,CLGICPBS ~ Customer Loyalty: Gaining Insights ...,ZAC1E1027,CLGICPBS,SCH24-Advanced Diploma-CLGICPBS-00005,Classroom
347,SCIS1 Classroom 3-2,School of Computing & Information Systems 1,Level 3,28-Feb-2025,08:00 AM,06:00 PM,Event,CLGICPBS ~ Customer Loyalty: Gaining Insights ...,ZAC1E1027,CLGICPBS,SCH24-Advanced Diploma-CLGICPBS-00005,Classroom


In [105]:
print("Number of rows affected by double booking: ", dbl_booking_check.shape[0])
dbl_booking_check.sample(5)

Number of rows affected by double booking:  50


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
721,SCIS1 Classroom 3-1,School of Computing & Information Systems 1,Level 3,26-Feb-2025,08:00 AM,06:00 PM,Event,SLTIGCALPSM4 ~ Graduate Certificate in Allied ...,ZAC1D6239,SLTIGCALPSM4,SCH24-Not Specified-SLTIGCALPSM4-00002,Seminar Room/Classroom
722,SCIS1 Classroom 3-1,School of Computing & Information Systems 1,Level 3,27-Feb-2025,08:00 AM,06:00 PM,Event,SLTIGCALPSM4 ~ Graduate Certificate in Allied ...,ZAC1D6239,SLTIGCALPSM4,SCH24-Not Specified-SLTIGCALPSM4-00002,Seminar Room/Classroom
209,SMUC Active Learning Classroom 4-1,SMU Connexion,Level 4,15-Apr-2025,08:00 AM,06:00 PM,Event,ACSSBM6 ~ Module 6: Sustainability Reporting A...,ZAC1E0306,ACSSBM6,SCH24-Not Specified-ACSSBM6-00002,Classroom
155,SCIS1 Classroom 3-1,School of Computing & Information Systems 1,Level 3,26-Feb-2025,08:00 AM,05:00 PM,Event,ACSCIM4 ~ Advanced Certificate in Supply Chain...,ZAC1D6339,ACSCIM4,SCH24-Not Specified-ACSCIM4-00003,
617,YPHSL Seminar Room 2-02,Yong Pung How School of Law/Kwa Geok Choo Law ...,Level 2,10-Feb-2025,08:00 AM,06:00 PM,Event,LSEST ~ Loan Syndication: Executing a Successf...,ZAC1D6141,LSEST,SCH24-Not Specified-LSEST-00002,Seminar Room/Classroom


In [106]:
logging.basicConfig(level=logging.INFO)

def filter_facilities(df):
    """Filter for Seminar Rooms and Classrooms"""
    # Non-capturing group (?:...) avoids the pandas "match groups" warning from str.contains
    return df[df['Facility Type'].str.contains(r'\b(?:Seminar Room|Classroom)\b', case=False, na=False)]

def find_next_latest_venue(facility_name, booking_date, confirmed_fbs_report):
    """Find the next available booking for the same venue after the given date."""
    filtered_df = confirmed_fbs_report[
        (confirmed_fbs_report['Facility Name'] == facility_name) &
        (confirmed_fbs_report['Booking Date'] > booking_date)
    ].sort_values(by='Booking Date', ascending=True)

    return filtered_df.iloc[0]['Facility Name'] if not filtered_df.empty else None

def find_alternate_venue(facility_name, booking_date, facility_list_df, same_venues_df):
    """Find an alternative venue in the same building and of the same type."""
    facility_row = facility_list_df[facility_list_df['Facility Name'] == facility_name]
    if facility_row.empty:
        return None

    facility_type = facility_row.iloc[0]['Facility Type']
    building = facility_row.iloc[0]['Building']

    # Find other venues in the same building & type
    available_venues = facility_list_df[
        (facility_list_df['Building'] == building) &
        (facility_list_df['Facility Type'] == facility_type) &
        (facility_list_df['Facility Name'] != facility_name)
    ]

    # Remove venues that are already booked on this date
    booked_venues = set(same_venues_df[same_venues_df['Date of Booking'] == booking_date]['Facility Name'])
    available_venues = available_venues[~available_venues['Facility Name'].isin(booked_venues)]

    return available_venues.iloc[0]['Facility Name'] if not available_venues.empty else None

def resolve_venue_clashes(same_venues_df, confirmed_fbs_report, facility_list_df):
    """Resolve venue clashes by finding next latest or alternate venues."""
    
    # Identify clashes (all duplicate venue bookings at the same time)
    clashing_courses = same_venues_df[
        same_venues_df.duplicated(subset=['Facility Name', 'Date of Booking', 'Time of Booking From'], keep=False)
    ]

    # Process each clashing booking
    for idx, row in clashing_courses.iterrows():
        new_venue = find_next_latest_venue(row['Facility Name'], row['Date of Booking'], confirmed_fbs_report)
        
        if not new_venue:
            new_venue = find_alternate_venue(row['Facility Name'], row['Date of Booking'], facility_list_df, same_venues_df)

        if new_venue:
            logging.info(f"Updating {row['Facility Name']} to {new_venue} for {row['Date of Booking']}")
            same_venues_df.loc[idx, 'Facility Name'] = new_venue

    return same_venues_df

# Usage Example
clash_resolved_df = resolve_venue_clashes(same_venues_df, confirmed_fbs_report, facility_list_df)


INFO:root:Updating YPHSL Seminar Room 2-02 to YPHSL Seminar Room B1-01 for 10-Feb-2025
INFO:root:Updating SMUC Active Learning Classroom 4-1 to SMUC Active Learning Classroom 3-1 for 15-Apr-2025
INFO:root:Updating SOE/SCIS2 Seminar Room 5-2 to SOE/SCIS2 Seminar Room B1-1 for 16-Jan-2025
INFO:root:Updating SOE/SCIS2 Seminar Room 4-1 to SOE/SCIS2 Seminar Room B1-1 for 28-Apr-2025
INFO:root:Updating YPHSL Classroom B1-13 to YPHSL Classroom B1-09 for 04-Feb-2025
INFO:root:Updating LKCSB Classroom 3-4 to LKCSB Classroom 2-1 for 06-Mar-2025
INFO:root:Updating LKCSB Classroom 3-4 to LKCSB Classroom 3-1 for 07-Mar-2025
INFO:root:Updating SOE/SCIS2 Seminar Room 4-1 to SOE/SCIS2 Seminar Room B1-1 for 20-Mar-2025
INFO:root:Updating SOE/SCIS2 Seminar Room 4-1 to SOE/SCIS2 Seminar Room B1-1 for 21-Mar-2025
INFO:root:Updating SOE/SCIS2 Classroom 4-1 to SOE/SOSS Classroom B1-1 for 19-Feb-2025
INFO:root:Updating SOE/SCIS2 Classroom 4-1 to SOE/SOSS Classroom B1-1 for 20-Feb-2025
INFO:root:Updating YPHS

In [107]:
# Ensure 'Time of Booking From' is in datetime format
clash_resolved_df['Time of Booking From'] = pd.to_datetime(clash_resolved_df['Time of Booking From'], errors='coerce')

# Standardize 'Facility Name' to remove extra spaces
clash_resolved_df['Facility Name'] = clash_resolved_df['Facility Name'].str.strip()

# Identify double bookings based on Facility Name, Date, and Time
dbl_booking_check = clash_resolved_df[
    clash_resolved_df.duplicated(subset=['Facility Name', 'Date of Booking', 'Time of Booking From'], keep=False)
]

# Drop rows where 'Building' is NaN (to ensure valid venue assignments)
dbl_booking_check = dbl_booking_check.dropna(subset=['Building'])

# Identify unresolved venue assignments
no_venues_df = dbl_booking_check[dbl_booking_check['Facility Name'].fillna('').isin(['Assignment of venue required', ''])]

# Count affected rows (excluding unresolved assignments)
affected_rows = dbl_booking_check.shape[0] - no_venues_df.shape[0]

# Display results
print(f"Number of rows affected by double booking: {affected_rows}")
print(f"Total Number of rows in clash_resolved_df: {clash_resolved_df.shape[0]}")

Number of rows affected by double booking: 6
Total Number of rows in clash_resolved_df: 780


In [108]:
#Check for any missing values in the Facility Name Column
print(f"Number of missing 'Facility Name' values in the dataframe: {(clash_resolved_df['Facility Name'].isnull().sum())}")

Number of missing 'Facility Name' values in the dataframe: 0


In [109]:
clash_resolved_df[clash_resolved_df['Course Code'] == "ACABPMM4"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,


### 9.4 Booking of Catering Areas

1. Venues that require an additional catering venue are: 
    - SOE/SCIS2 Seminar Room 5-1
    - SOE/SCIS2 Seminar Room 5-2
    - SOE/SCIS2 Seminar Room B1-1
    - SOE/SCIS2 Seminar Room B1-2
    - SCIS1 Classroom B1-1
    - SCIS1 Seminar Room B1-1

2. For SOE/SCIS2 Seminar Room 5-1 and 5-2, one of these catering areas will be booked:
    - SOE/SCIS2 Catering Area 4A 
    - SOE/SCIS2 Catering Area 4B 
    - SOE/SCIS2 Catering Area 4C

3. For the rest, one of these catering areas will be booked:
    - SOE/SCIS2 Catering Area B1A
    - SOE/SCIS2 Catering Area B1B
    - SOE/SCIS2 Catering Area B1C

4. The row for the catering area booking will appear directly below the row of the course that has booked one of the venues listed above
5. Night courses do not require catering
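The placement rule above (each catering row directly below its course row) can be achieved with a helper sort key. The sketch below is one possible approach under stated assumptions: the `courses` and `catering` frames are hypothetical miniatures, the `_order` and `_is_catering` columns are temporary helpers introduced here, and each course is assumed to have at most one row per schedule and date.

```python
import pandas as pd

# Hypothetical miniature frames; column names follow the notebook.
courses = pd.DataFrame({
    "Facility Name": ["SOE/SCIS2 Seminar Room 5-1", "SCIS1 Seminar Room 2-1"],
    "Date of Booking": ["05-Mar-2025", "05-Mar-2025"],
    "Sch #": ["SCH-A", "SCH-B"],
})
catering = pd.DataFrame({
    "Facility Name": ["SOE/SCIS2 Catering Area 4A"],
    "Date of Booking": ["05-Mar-2025"],
    "Sch #": ["SCH-A"],
})

link_cols = ["Sch #", "Date of Booking"]

# Remember each course row's position in the list
courses = courses.assign(_order=range(len(courses)))

# Give each catering row the position of the course row it belongs to
catering = catering.merge(courses[link_cols + ["_order"]], on=link_cols, how="left")

# Sort by position, with the course row (0) ahead of its catering row (1)
combined = (
    pd.concat(
        [courses.assign(_is_catering=0), catering.assign(_is_catering=1)],
        ignore_index=True,
    )
    .sort_values(["_order", "_is_catering"])
    .drop(columns=["_order", "_is_catering"])
    .reset_index(drop=True)
)

print(combined["Facility Name"].tolist())
# ['SOE/SCIS2 Seminar Room 5-1', 'SOE/SCIS2 Catering Area 4A', 'SCIS1 Seminar Room 2-1']
```

If the list is re-sorted in Excel afterwards, the same two-level sort (course position, then catering flag) would need to be reapplied there.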

In [110]:
clash_resolved_df.shape

clash_resolved_df

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,
...,...,...,...,...,...,...,...,...,...,...,...,...
773,SMUC Active Learning Classroom 4-2,SMU Connexion,Level 4,17-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,WA1AWMPP ~ Workplace Automation 101: Automatio...,ZAC1D6187,WA1AWMPP,SCH24-Not Specified-WA1AWMPP-00003,Seminar Room/Classroom
776,SOE/SCIS2 Seminar Room 2-9,School of Economics/School of Computing & Info...,Level 2,21-Mar-2025,2025-03-03 08:00:00,06:00 PM,Event,WBO ~ Web3.0 as a Business Opportunity,ZAC1D6327,WBO,SCH24-Not Specified-WBO-00004,Seminar Room/Classroom
777,SOE/SCIS2 Classroom 3-2,,,27-Feb-2025,2025-03-03 08:00:00,05:00 PM,Event,WEFTDA ~ Writing Effectively for the Digital Age,ZAC1D6295,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Seminar Room/Classroom
778,SOE/SCIS2 Classroom 3-2,,,28-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,WEFTDA ~ Writing Effectively for the Digital Age,ZAC1D6295,WEFTDA,SCH24-Not Specified-WEFTDA-00003,Seminar Room/Classroom


In [111]:
catering_df = clash_resolved_df.copy()
print(catering_df['Time of Booking From'].value_counts())

catering_df.head()

Time of Booking From
2025-03-03 08:00:00    720
2025-03-03 18:00:00     55
2025-03-03 17:00:00      4
2025-03-03 11:30:00      1
Name: count, dtype: int64


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,


In [None]:
# Lists of venues that require catering and catering areas
catering_venues = {
    'SOE/SCIS2 Seminar Room 5-1': ["SOE/SCIS2 Catering Area 4A (Near to SR 4-2)", "SOE/SCIS2 Catering Area 4B (Near to SR 4-2)", "SOE/SCIS2 Catering Area 4C (Near to SR 4-2)"],
    'SOE/SCIS2 Seminar Room 5-2': ["SOE/SCIS2 Catering Area 4A (Near to SR 4-2)", "SOE/SCIS2 Catering Area 4B (Near to SR 4-2)", "SOE/SCIS2 Catering Area 4C (Near to SR 4-2)"],
    'SOE/SCIS2 Seminar Room B1-1': ["SOE/SCIS2 Catering Area B1A", "SOE/SCIS2 Catering Area B1B", "SOE/SCIS2 Catering Area B1C"],
    'SOE/SCIS2 Seminar Room B1-2': ["SOE/SCIS2 Catering Area B1A", "SOE/SCIS2 Catering Area B1B", "SOE/SCIS2 Catering Area B1C"],
    'SCIS1 Classroom B1-1': ["SOE/SCIS2 Catering Area B1A", "SOE/SCIS2 Catering Area B1B", "SOE/SCIS2 Catering Area B1C"],
    'SCIS1 Seminar Room B1-1': ["SOE/SCIS2 Catering Area B1A", "SOE/SCIS2 Catering Area B1B", "SOE/SCIS2 Catering Area B1C"]
}

def populate_catering_area(df, requires_catering):
    """
    Populate catering areas for venues, avoiding night sessions and preventing area conflicts.
    
    Args:
        df (pd.DataFrame): Input DataFrame with booking details
        requires_catering (list): List of venues requiring catering
    
    Returns:
        pd.DataFrame: DataFrame with added catering rows
    """
    booked_catering = {}
    add_catering_rows = []

    def assign_catering(row):
        if row['Facility Name'] in requires_catering:
            time_of_booking_from = pd.to_datetime(row['Time of Booking From'], format='%I:%M %p').time()
            if time_of_booking_from >= pd.to_datetime('6:00 PM', format='%I:%M %p').time():
                return row

            date_of_booking = row['Date of Booking']
            available_venue = None
            if date_of_booking not in booked_catering:
                booked_catering[date_of_booking] = set()

            for venue in catering_venues[row['Facility Name']]:
                if venue not in booked_catering[date_of_booking]:
                    available_venue = venue
                    booked_catering[date_of_booking].add(venue)
                    break
            
            if available_venue:
                add_catering_rows.append({
                    'Facility Name': available_venue,
                    'Building': '',
                    'Floor': '',
                    'Date of Booking': date_of_booking,
                    'Time of Booking From': row['Time of Booking From'],
                    'Time of Booking To': row['Time of Booking To'],
                    'Use Type': row['Use Type'],
                    'Purpose': row['Purpose'],
                    'Event Code': row['Event Code'],
                    'Course Code': row['Course Code'],
                    'Sch #': row['Sch #']
                })
        return row

    df = df.apply(assign_catering, axis=1)

    if add_catering_rows:
        new_rows_df = pd.DataFrame(add_catering_rows)
        final_catering_df = pd.concat([df, new_rows_df], ignore_index=True)
    else:
        final_catering_df = df

    return final_catering_df

requires_catering = ['SOE/SCIS2 Seminar Room 5-1',
                     'SOE/SCIS2 Seminar Room 5-2',
                     'SOE/SCIS2 Seminar Room B1-1',
                     'SOE/SCIS2 Seminar Room B1-2',
                     'SCIS1 Seminar Room B1-1',
                     'SCIS1 Classroom B1-1']

# Use the working copy created earlier so that clash_resolved_df is left untouched
final_catering_df = populate_catering_area(catering_df, requires_catering)

print(f"Number of current rows: {final_catering_df.shape[0]}")

Number of current rows: 844


For catering areas, ensure that Time of Booking From and Time of Booking To are set to the default booking of 8:30 AM to 5:30 PM.

In [113]:
#Set the Time of Booking From and Time of Booking To columns to 0830 to 1730 for all catering areas
final_catering_df['Facility Name'] = final_catering_df['Facility Name'].fillna('')

final_catering_df.loc[final_catering_df['Facility Name'].str.contains('Catering', na=False), 'Time of Booking From'] = '08:30 AM'
final_catering_df.loc[final_catering_df['Facility Name'].str.contains('Catering', na=False), 'Time of Booking To'] = '05:30 PM'

# Check rows where 'Facility Name' contains 'Catering' (case-insensitive)
catering_rows = final_catering_df[final_catering_df['Facility Name'].str.contains('Catering', case=False)]

# Display the rows
print(catering_rows['Time of Booking From'].value_counts())
print(catering_rows['Time of Booking To'].value_counts())

catering_rows.sample(5)

Time of Booking From
2025-03-03 08:30:00    64
Name: count, dtype: int64
Time of Booking To
05:30 PM    64
Name: count, dtype: int64


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type
781,SOE/SCIS2 Catering Area B1A,,,16-Jan-2025,2025-03-03 08:30:00,05:30 PM,Event,ACGAIM1 ~ Advanced Certificate in Generative A...,ZAC1D6307,ACGAIM1,SCH24-Not Specified-ACGAIM1-00004,
826,SOE/SCIS2 Catering Area B1A,,,27-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,PCPDP2 ~ Practitioner Certificate in Personal ...,ZAC1D2067,PCPDP2,SCH24-Not Specified-PCPDP2-00009,
798,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,07-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,
830,SOE/SCIS2 Catering Area 4B (Near to SR 4-2),,,13-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,SLTIGCALPSM5 ~ Graduate Certificate in Allied ...,ZAC1D6240,SLTIGCALPSM5,SCH24-Not Specified-SLTIGCALPSM5-00002,
827,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,03-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,PFM ~ Project Finance Modelling,ZAC1D6026,PFM,SCH24-Not Specified-PFM-00003,


### 9.5 Populate "Venue Preference" Column

1. Using the "Preference Type" column generated previously, define a function that checks whether the string "Seminar Room" or "Classroom" appears in both the Preference Type column and the Facility Name column
2. The result ("Match" or "Not Match") is recorded in a newly generated "Venue Preference" column
3. Courses with no preference indicated are marked "Preference not indicated"

Next, check whether each course code (taken from the Purpose column) exists in merged_pref, and populate the Venue Preference column accordingly

In [114]:
preference_check_df = final_catering_df.copy()

#Define a function to check if the venue matches the preference
def check_venue_match(row):
    # Extract the course code from the 'Purpose' column
    course_code = row['Purpose'].split('~')[0].strip()

    # Check if the course code exists in merged_pref
    if course_code not in merged_pref['Course Code'].values:
        return "Preference not indicated"

    preference = str(row['Preference Type']).strip()
    current_venue = str(row['Facility Name']).strip()
    
    sr_condition = ('Seminar Room' in preference) and ('Seminar Room' in current_venue)
    cr_condition = ('Classroom' in preference) and ('Classroom' in current_venue)
    
    if sr_condition or cr_condition:
        return "Match"
    else:
        return "Not Match"
    
preference_check_df['Venue Preference'] = preference_check_df.apply(check_venue_match, axis=1)

In [115]:
#Check the results
preference_check_df.sample(20)

#Use a sample Sch# to check if the function is working as expected
sample_df = preference_check_df[preference_check_df["Sch #"] == "SCH24-Advanced Diploma-ADAMBCDUA-00002"]
sample_df

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type,Venue Preference
263,SOE/SCIS2 Seminar Room 5-2,School of Economics/School of Computing & Info...,Level 5,05-Mar-2025,2025-03-03 08:00:00,06:00 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,Seminar Room,Match
264,SOE/SCIS2 Seminar Room 5-2,School of Economics/School of Computing & Info...,Level 5,06-Mar-2025,2025-03-03 08:00:00,06:00 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,Seminar Room,Match
265,SOE/SCIS2 Seminar Room 5-2,School of Economics/School of Computing & Info...,Level 5,07-Mar-2025,2025-03-03 08:00:00,06:00 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,Seminar Room,Match
796,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,05-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,,Not Match
797,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,06-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,,Not Match
798,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,07-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,ADAMBCDUA ~ Advanced Data Analytics: Making Be...,ZAC1E1012,ADAMBCDUA,SCH24-Advanced Diploma-ADAMBCDUA-00002,,Not Match


### 9.6 Populate "No.of Course Days" Column 

1. Courses that run for 1, 2 or 3 days will be labelled "1 day", "2 days" or "3 days" in the No. Of Course Days column
2. Courses that run for more than 3 days will be labelled "Above 3 days"
3. Once uploaded, the dataframe will be sorted in Microsoft Excel to prioritise venue bookings for longer-running courses
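The labelling rule above can also be expressed with a grouped count of unique dates. The sketch below is an alternative illustration rather than the code used in this notebook: the `bookings` frame is made up, and `label_days` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical miniature of the booking list: three schedules of 2, 1 and 4 days.
bookings = pd.DataFrame({
    "Sch #": ["SCH-A", "SCH-A", "SCH-B", "SCH-C", "SCH-C", "SCH-C", "SCH-C"],
    "Date of Booking": ["22-Jan-2025", "23-Jan-2025", "03-Feb-2025",
                        "10-Feb-2025", "11-Feb-2025", "12-Feb-2025", "13-Feb-2025"],
})

# Number of distinct booking dates per schedule, broadcast back to every row
unique_days = bookings.groupby("Sch #")["Date of Booking"].transform("nunique")

def label_days(n):
    """Turn a day count into the label used in the No. Of Course Days column."""
    if n > 3:
        return "Above 3 days"
    return "1 day" if n == 1 else f"{n} days"

bookings["No. Of Course Days"] = unique_days.map(label_days)
print(bookings.drop_duplicates("Sch #")["No. Of Course Days"].tolist())
# ['2 days', '1 day', 'Above 3 days']
```

Counting unique dates means that an extra row on the same date, such as a catering area booking, does not inflate the day count.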

In [None]:
# Keep a second reference to the DataFrame (note: this is an alias, not a copy)
no_course_days_df = preference_check_df

# Initialize an empty dictionary
course_days_dict = {}

# Define a function to build the dictionary
def makedict_course_days(row):
    # Get the schedule identifier
    schedule = row['Sch #']
    
    # If the schedule is not already in the dictionary, add it with the date of booking
    if row['Sch #'] not in course_days_dict:
        course_days_dict[schedule] = row["Date of Booking"] + '@'
    # If the schedule is already in the dictionary, append the date of booking to the existing value
    else:
        course_days_dict[schedule] += row["Date of Booking"] + '@'

# Apply the function to each row of the DataFrame
preference_check_df.apply(makedict_course_days, axis=1)

# Define a function to populate the "No. Of Course Days" column from the number of unique booking dates per schedule
def populate_days_column(row):
    
    schedule = row['Sch #']
    booking_dates_str = course_days_dict[schedule]
    
    list_booking_dates = booking_dates_str.split('@')
    filtered_booking_dates = [date.strip() for date in list_booking_dates if date.strip()]
    
    # Convert to a set to get unique dates
    unique_dates = list(set(filtered_booking_dates))

    # Count the number of unique dates
    number_of_unique_days = len(unique_dates)
    
    if number_of_unique_days == 1:
        return '1 day'
    elif number_of_unique_days == 2:
        return '2 days'
    elif number_of_unique_days == 3:
        return '3 days'
    elif number_of_unique_days > 3:
        return 'Above 3 days'
    
    
preference_check_df['No. Of Course Days'] = preference_check_df.apply(populate_days_column, axis=1)
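
The same labelling can also be expressed without the intermediate dictionary. The sketch below is a minimal alternative, not the notebook's own method: it counts unique booking dates per schedule with `groupby(...).transform("nunique")`. The sample rows and the helper name `label_course_days` are hypothetical; only the column names come from the cell above.

```python
import pandas as pd

# Hypothetical sample rows; only the column names are taken from the notebook
sample_df = pd.DataFrame({
    "Sch #": ["S1", "S1", "S1", "S2", "S3", "S3", "S3", "S3", "S3"],
    "Date of Booking": [
        "22-Jan-2025", "23-Jan-2025", "23-Jan-2025",  # S1: venue and catering rows share a date
        "24-Feb-2025",                                # S2: single day
        "03-Mar-2025", "04-Mar-2025", "05-Mar-2025", "06-Mar-2025", "06-Mar-2025",
    ],
})

def label_course_days(n_days):
    """Turn a count of unique booking dates into the label used in the column."""
    if n_days == 1:
        return "1 day"
    if n_days <= 3:
        return f"{n_days} days"
    return "Above 3 days"

# Count unique dates per schedule and broadcast the count back to every row
unique_days = sample_df.groupby("Sch #")["Date of Booking"].transform("nunique")
sample_df["No. Of Course Days"] = unique_days.map(label_course_days)

print(sample_df)
```

Because duplicate dates within a schedule are counted once, a venue row and a catering row on the same day still count as a single course day.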

In [None]:
preference_check_df.reset_index(drop=True)

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type,Venue Preference,No. Of Course Days
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,,Preference not indicated,2 days
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,,Preference not indicated,2 days
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
839,SOE/SCIS2 Catering Area B1A,,,28-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,TMBO ~ TikTok Marketing and Ads for Business,ZAC1A1233,TMBO,SCH24-Not Specified-TMBO-00010,,Not Match,2 days
840,SOE/SCIS2 Catering Area B1B,,,17-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days
841,SOE/SCIS2 Catering Area B1B,,,18-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days
842,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,19-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days


### 9.7 Course Code cannot be found in FBS Report

The course in gvSession may not be found in the FBS Report for the following reasons:

1. It is a new course, hence no previous bookings were made
2. The course did not run in the previous period.

Hence, the "Facility Name" values for these courses are set to "Assignment of venue required".

In [118]:
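# Assign the result back instead of calling fillna(inplace=True) on the selected column,
# which may act on a temporary copy in recent pandas versions
preference_check_df['Facility Name'] = preference_check_df['Facility Name'].fillna("Assignment of venue required")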

preference_check_df.sample(10)

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type,Venue Preference,No. Of Course Days
833,SOE/SCIS2 Catering Area B1A,,,14-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,TEM5-2024 ~ Transforming Enterprises Module 5:...,ZAC1E1809,TEM5-2024,SCH24-Advanced Diploma-TEM5-2024-00002,,Not Match,2 days
545,YPHSL Classroom B1-13,Yong Pung How School of Law/Kwa Geok Choo Law ...,Basement 1,24-Mar-2025,2025-03-03 08:00:00,06:00 PM,Event,GEM2-2024-1 ~ Growing Enterprises Module 2: Pe...,ZAC1E1802,GEM2-2024-1,SCH24-Advanced Diploma-GEM2-2024-1-00002,Seminar Room/Classroom,Match,3 days
35,SCIS1 Seminar Room 2-1,,,12-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACDGSM5 ~ Advanced Certificate in Data Governa...,ZAC1D6184,ACDGSM5,SCH24-Advanced Diploma-ACDGSM5-00002,Seminar Room/Classroom,Match,1 day
217,YPHSL Seminar Room 2-15,,,18-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACSUSPTGWYW ~ Advanced Communication Strategie...,ZAC1C50045,ACSUSPTGWYW,SCH24-Not Specified-ACSUSPTGWYW-00011,Seminar Room,Match,2 days
544,SOE/SCIS2 Seminar Room 5-2,,,21-Feb-2025,2025-03-03 08:00:00,05:00 PM,Event,GEM1-2024 ~ Growing Enterprises Module 1: Stra...,ZAC1E1801,GEM1-2024,SCH24-Advanced Diploma-GEM1-2024-00002,Seminar Room/Classroom,Match,3 days
522,SOE/SCIS2 Seminar Room 3-4,,,12-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,GCMCSM1V2 ~ Graduate Certificate in Media Comm...,ZAC1D2178,GCMCSM1V2,SCH24-Not Specified-GCMCSM1V2-00001,,Preference not indicated,2 days
474,YPHSL Classroom B2-03,Yong Pung How School of Law/Kwa Geok Choo Law ...,Basement 2,15-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ETAAGMCTPSDA ~ Essential Thinking : Adopting A...,ZAC1E1011,ETAAGMCTPSDA,SCH24-Advanced Diploma-ETAAGMCTPSDA-00003,Classroom,Match,2 days
413,YPHSL Seminar Room 2-04,,,07-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,DMSM ~ Decision Making & Stakeholder Management,ZAC1C50022,DMSM,SCH24-Not Specified-DMSM-00005,Seminar Room,Match,2 days
630,SOSS/CIS Seminar Room 2-2,,,05-Mar-2025,2025-03-03 18:00:00,09:30 PM,AdHoc,MBSAT ~ MBSAT Decision Making Skills Training,,MBSAT,SCH24-Not Specified-MBSAT-00001,,Preference not indicated,Above 3 days
332,SOE/SCIS2 Seminar Room 5-1,School of Economics/School of Computing & Info...,Level 5,20-Feb-2025,2025-03-03 18:00:00,10:30 PM,Event,CFTRMVO(CR) ~ Certificate in Financial Trading...,ZAC1A1048,CFTRMVO(CR),SCH24-Advanced Diploma-CFTRMVO(CR)-00001,Seminar Room,Match,3 days


### 9.8 Final Check on Facility Names

1. Check whether any facility names are still in TMS format rather than FBS format, and use the Facility Names mapping (`facility_names_df`, columns "TMS_Facility Name" and "Converted FBS naming") to convert them

In [119]:
final_df = preference_check_df.copy()

def update_facility_name(row, mapping):
  """
  This function updates the facility name based on the TMS name mapping.

  Args:
      row (pd.Series): A row from the dataframe.
      mapping (pd.DataFrame): Dataframe containing TMS name to FBS name mapping.

  Returns:
      str: The updated facility name.
  """
  tms_name = row['Facility Name']
  if tms_name in mapping['TMS_Facility Name'].tolist():
    return mapping.loc[mapping['TMS_Facility Name'] == tms_name, 'Converted FBS naming'].tolist()[0]
  else:
    return tms_name  # Keep the original facility name if not found in mapping

# Update facility names based on TMS name mapping
final_df['Facility Name'] = final_df.apply(update_facility_name, axis=1, args=(facility_names_df,))

#Total number of rows in the dataframe
print("Total number of rows in the dataframe: ", final_df.shape[0])

#Check if there are any more missing Facility Names in the dataframe
print("Number of missing 'Facility Name' values in the dataframe: ", final_df['Facility Name'].isnull().sum())

Total number of rows in the dataframe:  844
Number of missing 'Facility Name' values in the dataframe:  0
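
For larger dataframes, the row-by-row lookup can be replaced with a single `Series.map` call. The sketch below is one possible alternative under stated assumptions: the mapping rows and booking rows are hypothetical, and only the two mapping column names are taken from the function above.

```python
import pandas as pd

# Hypothetical mapping rows; the column names follow facility_names_df
facility_names_sample = pd.DataFrame({
    "TMS_Facility Name": ["SR 2-1 (SCIS1)", "LKCSB SR 3.1"],
    "Converted FBS naming": ["SCIS1 Seminar Room 2-1", "LKCSB Seminar Room 3-1"],
})

# Hypothetical booking rows
bookings_sample = pd.DataFrame({
    "Facility Name": [
        "SR 2-1 (SCIS1)",
        "YPHSL Seminar Room 3-01",
        "Assignment of venue required",
    ],
})

# Build a lookup Series indexed by TMS name; keep the first match if a name repeats
name_lookup = (
    facility_names_sample
    .drop_duplicates("TMS_Facility Name")
    .set_index("TMS_Facility Name")["Converted FBS naming"]
)

# Map known TMS names to FBS names and keep the original name where no mapping exists
original_names = bookings_sample["Facility Name"]
bookings_sample["Facility Name"] = original_names.map(name_lookup).fillna(original_names)

print(bookings_sample)
```

One difference to be aware of: if a mapping row has an empty "Converted FBS naming" value, this version keeps the original name, whereas the row-wise function would return the empty value.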


In [120]:
final_df

Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,Preference Type,Venue Preference,No. Of Course Days
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,ACABPMM4,SCH24-Not Specified-ACABPMM4-00002,,Preference not indicated,3 days
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,,Preference not indicated,2 days
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,ACABPMM5,SCH24-Not Specified-ACABPMM5-00002,,Preference not indicated,2 days
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
839,SOE/SCIS2 Catering Area B1A,,,28-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,TMBO ~ TikTok Marketing and Ads for Business,ZAC1A1233,TMBO,SCH24-Not Specified-TMBO-00010,,Not Match,2 days
840,SOE/SCIS2 Catering Area B1B,,,17-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days
841,SOE/SCIS2 Catering Area B1B,,,18-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days
842,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,19-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,UXDFMA,SCH24-Not Specified-UXDFMA-00002,,Not Match,3 days


## 10.0 Final Touches

With the dataframe version of the populated Facility Booking List ready, we will use it to populate the values in the actual .xlsx file. 

In this section, we will:

1. Drop any unnecessary columns, retaining only the columns required in the FBS Template plus the additional columns
2. Change the "Time of Booking From" and "Time of Booking To" columns to 24-hour format, and rename them to the format required by FBS
3. Generate the "Any Comments" column
4. Revert the "Course Code" column to empty, as FBS does not recognise it 
5. Rename some column headers to match those in the template
6. Sort the dataframe by Purpose and Date of Booking to group courses together
7. Change all column values to text

### 10.1 Dropping Unnecessary Columns and Retaining Only Those Necessary in FBS

In [121]:
# Check whether there are any missing values in the Facility Name column
print(f"Number of missing values in the Facility Name Column: {final_df['Facility Name'].isnull().sum()}")

# Keep only the required columns (.copy() so the assignments below act on an independent DataFrame)
final_df = final_df[['Facility Name', 'Building', 'Floor', 'Date of Booking', 'Time of Booking From', 'Time of Booking To', 'Use Type', 'Purpose', 'Event Code', 'Course Code', 'Sch #', 'No. Of Course Days', 'Venue Preference']].copy()

# Generate the empty "Any Comments" column
final_df['Any Comments'] = np.nan

# Revert the "Course Code" column to empty, as it is not required for FBS 
final_df['Course Code'] = np.nan

final_df


Number of missing values in the Facility Name Column: 0


Unnamed: 0,Facility Name,Building,Floor,Date of Booking,Time of Booking From,Time of Booking To,Use Type,Purpose,Event Code,Course Code,Sch #,No. Of Course Days,Venue Preference,Any Comments
0,YPHSL Seminar Room 3-01,,,22-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,,SCH24-Not Specified-ACABPMM4-00002,3 days,Preference not indicated,
1,YPHSL Seminar Room 3-01,,,23-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,,SCH24-Not Specified-ACABPMM4-00002,3 days,Preference not indicated,
2,YPHSL Seminar Room 3-01,,,24-Jan-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM4 ~ Advanced Certificate in Agile Busin...,ZAC1C50075,,SCH24-Not Specified-ACABPMM4-00002,3 days,Preference not indicated,
3,SCIS1 Seminar Room 2-1,,,24-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,,SCH24-Not Specified-ACABPMM5-00002,2 days,Preference not indicated,
4,SCIS1 Seminar Room 2-1,,,25-Feb-2025,2025-03-03 08:00:00,06:00 PM,Event,ACABPMM5 ~ Advanced Certificate in Agile Busin...,ZAC1C50076,,SCH24-Not Specified-ACABPMM5-00002,2 days,Preference not indicated,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
839,SOE/SCIS2 Catering Area B1A,,,28-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,TMBO ~ TikTok Marketing and Ads for Business,ZAC1A1233,,SCH24-Not Specified-TMBO-00010,2 days,Not Match,
840,SOE/SCIS2 Catering Area B1B,,,17-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,
841,SOE/SCIS2 Catering Area B1B,,,18-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,
842,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,19-Mar-2025,2025-03-03 08:30:00,05:30 PM,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,


### 10.2 **(OPTIONAL)** Populating the "Building" and "Floor" columns

The code below clears the "Building" and "Floor" columns, so by default they are uploaded empty. Comment it out if these columns should keep their values.

In [122]:
final_df['Building'] = np.nan
final_df['Floor'] = np.nan

### 10.3 Renaming of Date and Time Columns to Fit Naming Convention of FBS Booking Template 

In [123]:
# Convert 'Time of Booking From' and 'Time of Booking To' to 24-hour format
final_df['Time of Booking From'] = pd.to_datetime(final_df['Time of Booking From'], format='%I:%M %p').dt.strftime('%H:%M')
final_df['Time of Booking To'] = pd.to_datetime(final_df['Time of Booking To'], format='%I:%M %p').dt.strftime('%H:%M')

# Standardize Date and Time Column names
final_df.rename(columns = {'Date of Booking':'Date of Booking (dd-MMM-yyyy)',
                           'Time of Booking From':'Time of Booking From (HH:mm)',
                           'Time of Booking To':'Time of Booking To (HH:mm)'}, inplace = True)
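
As a standalone illustration of the `%I:%M %p` to `%H:%M` conversion used above, here is a minimal sketch with the standard library only. The helper name `to_24_hour` is hypothetical, and it assumes the default English AM/PM markers.

```python
from datetime import datetime

def to_24_hour(time_text):
    """Convert a 12-hour clock string such as '06:00 PM' to 'HH:MM' (24-hour)."""
    parsed = datetime.strptime(time_text.strip(), "%I:%M %p")
    return parsed.strftime("%H:%M")

for sample in ["08:00 AM", "06:00 PM", "12:00 AM", "12:30 PM"]:
    print(sample, "->", to_24_hour(sample))
```

Note the two edge cases: 12:00 AM maps to 00:00 and 12:30 PM stays 12:30.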

In [124]:
#Replace "" values with null for easier accounting purposes
final_df.replace("", np.nan, inplace=True)
final_df.replace("nan", np.nan, inplace=True)

In [125]:
final_df.isna().sum()

Facility Name                      0
Building                         844
Floor                            844
Date of Booking (dd-MMM-yyyy)      0
Time of Booking From (HH:mm)       0
Time of Booking To (HH:mm)         0
Use Type                           0
Purpose                            0
Event Code                        83
Course Code                      844
Sch #                              0
No. Of Course Days                 0
Venue Preference                   0
Any Comments                     844
dtype: int64

### 10.4 Sort the Dataframe by "Purpose" and "Date of Booking" to group courses together

In [126]:
final_df = final_df.sort_values(by=['Purpose', 'Date of Booking (dd-MMM-yyyy)']).reset_index(drop=True)
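
Because the booking dates are stored as "dd-MMM-yyyy" text, the sort above orders them alphabetically within each Purpose (for example, "03-Mar-2025" sorts before "24-Feb-2025"). Courses are still grouped together, which is the goal here. If chronological order within a course also matters, one option is to sort on a temporary parsed-date column, as sketched below with hypothetical sample rows and a hypothetical helper column `_parsed_date`.

```python
import pandas as pd

date_col = "Date of Booking (dd-MMM-yyyy)"

# Hypothetical sample rows
sample_df = pd.DataFrame({
    "Purpose": ["B course", "A course", "A course", "A course"],
    date_col: ["05-Mar-2025", "03-Mar-2025", "24-Feb-2025", "28-Jan-2025"],
})

# Text sort: dates are compared character by character
text_sorted = sample_df.sort_values(by=["Purpose", date_col]).reset_index(drop=True)

# Chronological sort: parse the dates into a helper column, sort on it, then drop it
date_sorted = (
    sample_df
    .assign(_parsed_date=pd.to_datetime(sample_df[date_col], format="%d-%b-%Y"))
    .sort_values(by=["Purpose", "_parsed_date"])
    .drop(columns="_parsed_date")
    .reset_index(drop=True)
)

print(text_sorted)
print(date_sorted)
```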

### 10.5 Standardize all the column values to text format

In [127]:
# Convert all columns to string format
final_df = final_df.astype(str)

# Replace 'nan' with empty string ('') for all columns with empty values
final_df = final_df.replace('nan', '')

# Function to check if every column in the DataFrame is in text format
def check_text_columns(df):
    non_text_columns = []
    for column in df.columns:
        if not df[column].apply(lambda x: isinstance(x, str)).all():
            non_text_columns.append(column)
    
    if non_text_columns:
        print("The following columns are not entirely in text format:")
        for col in non_text_columns:
            print(col)
    else:
        print("All columns are in text format.")

# Call the function
check_text_columns(final_df)

All columns are in text format.
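
A side effect of `astype(str)` followed by `replace('nan', '')` is that `None` values become the text "None" and are not blanked. The sketch below shows one alternative that blanks every missing value before converting to text. It is an illustration on hypothetical data, and the helper name `to_text_frame` is an assumption.

```python
import numpy as np
import pandas as pd

def to_text_frame(df):
    """Return a copy of df where every cell is a string and missing values are ''."""
    return df.apply(lambda col: col.map(lambda v: "" if pd.isna(v) else str(v)))

# Hypothetical sample with a mix of text, numbers, NaN and None
sample_df = pd.DataFrame({
    "Facility Name": ["SCIS1 Seminar Room 2-1", None],
    "Floor": [np.nan, np.nan],
    "Seats": [40, 25],
})

text_df = to_text_frame(sample_df)
print(text_df)

# Every cell is now a Python str
print(all(isinstance(v, str) for v in text_df.to_numpy().ravel()))
```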


In [128]:
#Check the final dataframe
final_df[final_df["Purpose"] == "UXDFMA ~ UX Design for Modern Applications"]

Unnamed: 0,Facility Name,Building,Floor,Date of Booking (dd-MMM-yyyy),Time of Booking From (HH:mm),Time of Booking To (HH:mm),Use Type,Purpose,Event Code,Course Code,Sch #,No. Of Course Days,Venue Preference,Any Comments
828,SOE/SCIS2 Seminar Room B1-2,,,17-Mar-2025,08:00,17:00,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Match,
829,SOE/SCIS2 Catering Area B1B,,,17-Mar-2025,08:30,17:30,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,
830,SOE/SCIS2 Seminar Room B1-2,,,18-Mar-2025,08:00,17:00,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Match,
831,SOE/SCIS2 Catering Area B1B,,,18-Mar-2025,08:30,17:30,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,
832,SOE/SCIS2 Seminar Room 5-2,,,19-Mar-2025,08:00,18:00,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Match,
833,SOE/SCIS2 Catering Area 4A (Near to SR 4-2),,,19-Mar-2025,08:30,17:30,Event,UXDFMA ~ UX Design for Modern Applications,ZAC1D6300,,SCH24-Not Specified-UXDFMA-00002,3 days,Not Match,


## 11.0 Uploading the Dataframe into "Facility Booking List" Excel Workbook

1. Specify the output file path, the name of the template sheet that holds the headers, and the name of the new sheet to upload to
2. The code copies the template sheet into a new sheet so that its formatting is retained, then writes the headers and values from the final dataframe into their respective columns
3. Additional columns will have to be formatted within the Excel file itself

In [129]:
# Specify the file paths
output_file_path = 'Facility+Booking+List.xlsx'
sheet_name = '01. Facility Booking List'
new_sheet_name = 'Final Facility Booking List'

# Load the template workbook and select the template sheet
wb = load_workbook(output_file_path)

# Check if the new sheet already exists and delete it if it does
if new_sheet_name in wb.sheetnames:
    del wb[new_sheet_name]

template_sheet = wb[sheet_name]

# Copy the template sheet to retain the formatting
# (copy_worksheet returns the new sheet, so there is no need to look it up by its default name)
copied_sheet = wb.copy_worksheet(template_sheet)
copied_sheet.title = new_sheet_name

# Append DataFrame values to the new sheet
start_row = 2  # Assuming the first row is the header row

# Write the dataframe's column names as headers in row 1 (this also covers the additional columns)
for col_idx, col_name in enumerate(final_df.columns, start=1):
    copied_sheet.cell(row=1, column=col_idx, value=col_name)

# Write data
for r_idx, row in enumerate(dataframe_to_rows(final_df, index=False, header=False), start=start_row):
    for c_idx, value in enumerate(row, start=1):
        copied_sheet.cell(row=r_idx, column=c_idx, value=value)

# Auto-fit column width
for col in copied_sheet.columns:
    max_length = 0
    column = get_column_letter(col[0].column)
    for cell in col:
        # Measure the text form of the value; empty cells contribute nothing
        if cell.value is not None:
            max_length = max(max_length, len(str(cell.value)))
    adjusted_width = (max_length + 2)
    copied_sheet.column_dimensions[column].width = adjusted_width

# Save the modified workbook
wb.save(output_file_path)
print("Facility Booking List has been populated with the necessary values and uploaded into Excel File")

Facility Booking List has been populated with the necessary values and uploaded into Excel File
