
In [1]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
from google.colab import drive
drive.mount('/content/drive')

## EDA

In [None]:
# Load the datasets
jadwal_df = pd.read_excel('/content/drive/MyDrive/Sofi Jadwal Limit 20.xlsx')
dosen_keahlian_df = pd.read_excel('/content/drive/MyDrive/dosen_keahlian_dummy.xlsx')

# Display the first few rows of each dataset
print("Thesis Defense Schedule Dataset:")
display(jadwal_df.head())
print("\nLecturer Expertise Dataset:")
display(dosen_keahlian_df.head())

# Get basic information about the datasets
print("\nThesis Defense Schedule Info:")
jadwal_df.info()
print("\nLecturer Expertise Info:")
dosen_keahlian_df.info()

Thesis Defense Schedule Dataset:


Unnamed: 0,date,time,ruang,mahasiswa_id,judul,bidang
0,2020-05-16,10:00:00,meet.google.com,1102134314,ANALISIS KELAYAKAN PEMBUKAAN CABANG TOKO ANEKA...,EDM
1,2020-05-29,09:00:00,meet.google.com,1201154091,ANALISIS BEBAN KERJA FISIK DAN PERANCANGAN KEB...,EDM
2,2020-06-04,13:00:00,meet.google.com,1201144128,ANALISIS KELAYANAKAN USAHA SERTA PERANCANGAN W...,SAG
3,2020-06-04,13:00:00,meet.google.com,1201154384,PERANCANGAN APLIKASI WEBSITE DAN ANALISIS KELA...,EIM
4,2020-06-18,13:00:00,https://meet.google.com/kup-jjhf-vbd,1202152159,PERANCANGAN ENTERPRISE ARCHITECTURE UNTUK MENI...,EISD



Lecturer Expertise Dataset:


Unnamed: 0,id,keahlian
0,122,"EISD, EDM, SAG"
1,41,"EDM, ERP"
2,128,EISD
3,149,"EIM, EDM, ERP"
4,129,"EISD, EDM, EIM"



Thesis Defense Schedule Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20 entries, 0 to 19
Data columns (total 6 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   date          20 non-null     object
 1   time          20 non-null     object
 2   ruang         20 non-null     object
 3   mahasiswa_id  20 non-null     int64 
 4   judul         20 non-null     object
 5   bidang        20 non-null     object
dtypes: int64(1), object(5)
memory usage: 1.1+ KB

Lecturer Expertise Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 35 entries, 0 to 34
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   id        35 non-null     int64 
 1   keahlian  35 non-null     object
dtypes: int64(1), object(1)
memory usage: 688.0+ bytes


In [None]:
# Convert date and time columns to datetime
jadwal_df['datetime'] = pd.to_datetime(jadwal_df['date'].astype(str) + ' ' + jadwal_df['time'].astype(str))

# Create a dictionary of lecturer expertise
lecturer_expertise = {}
for _, row in dosen_keahlian_df.iterrows():
    lecturer_expertise[row['id']] = [field.strip() for field in row['keahlian'].split(',')]

# Display unique fields/domains
print("Unique Fields/Domains in Schedule:")
print(jadwal_df['bidang'].unique())

# Display sample of processed lecturer expertise
print("\nSample of Processed Lecturer Expertise:")
for lecturer_id, expertise in list(lecturer_expertise.items())[:5]:
    print(f"Lecturer {lecturer_id}: {expertise}")

# Basic statistics
print("\nDefense Schedule Summary:")
print(jadwal_df['datetime'].dt.date.value_counts().sort_index())

Unique Fields/Domains in Schedule:
['EDM' 'SAG' 'EIM' 'EISD' 'ERP']

Sample of Processed Lecturer Expertise:
Lecturer 122: ['EISD', 'EDM', 'SAG']
Lecturer 41: ['EDM', 'ERP']
Lecturer 128: ['EISD']
Lecturer 149: ['EIM', 'EDM', 'ERP']
Lecturer 129: ['EISD', 'EDM', 'EIM']

Defense Schedule Summary:
datetime
2020-05-16    1
2020-05-29    1
2020-06-04    2
2020-06-08    2
2020-06-15    8
2020-06-16    2
2020-06-18    1
2020-06-22    1
2020-06-23    2
Name: count, dtype: int64


**Interpretation**

1. Five distinct fields (domains) appear in the schedule:

- EDM
- SAG
- EIM
- EISD
- ERP

2. The defenses span a little over five weeks, from 16 May to 23 June 2020. The busiest day is 15 June 2020, which accounts for 8 of the 20 defenses.
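
The span and the share of the busiest day can be checked directly from the per-day counts. The sketch below is a minimal illustration that hard-codes the counts printed in the "Defense Schedule Summary" output instead of reading the Excel file; the variable names (`per_day`, `span_days`, `busiest_share`) are assumptions made for this example.

```python
import pandas as pd

# Defenses per day, copied from the "Defense Schedule Summary" output.
per_day = pd.Series(
    [1, 1, 2, 2, 8, 2, 1, 1, 2],
    index=pd.to_datetime([
        "2020-05-16", "2020-05-29", "2020-06-04", "2020-06-08", "2020-06-15",
        "2020-06-16", "2020-06-18", "2020-06-22", "2020-06-23",
    ]),
)

# Number of days between the first and the last defense.
span_days = (per_day.index.max() - per_day.index.min()).days

# Day with the most defenses and its share of all defenses.
busiest_day = per_day.idxmax()
busiest_share = per_day.max() / per_day.sum()

print(f"Span: {span_days} days")
print(f"Busiest day: {busiest_day.date()} ({busiest_share:.0%} of all defenses)")
```

On the real data the same quantities could be derived from `jadwal_df['datetime'].dt.date.value_counts()`.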


### Distribution analysis

In [None]:
# Count defenses by field
print("Defense Distribution by Field:")
field_counts = jadwal_df['bidang'].value_counts()
print(field_counts)

# Create a function to find qualified lecturers for each field
def find_qualified_lecturers(field):
    qualified = []
    for lecturer_id, expertise in lecturer_expertise.items():
        if field in expertise:
            qualified.append(lecturer_id)
    return qualified

# Analyze lecturer qualification distribution
print("\nQualified Lecturers per Field:")
for field in jadwal_df['bidang'].unique():
    qualified = find_qualified_lecturers(field)
    print(f"{field}: {len(qualified)} lecturers - {qualified}")

# Calculate number of expertise areas per lecturer
expertise_counts = {lid: len(exp) for lid, exp in lecturer_expertise.items()}
print("\nNumber of Expertise Areas per Lecturer:")
print(pd.Series(expertise_counts).value_counts().sort_index())

Defense Distribution by Field:
bidang
SAG     5
EDM     4
EIM     4
EISD    4
ERP     3
Name: count, dtype: int64

Qualified Lecturers per Field:
EDM: 19 lecturers - [122, 41, 149, 129, 106, 40, 35, 50, 88, 71, 67, 52, 9, 116, 27, 109, 16, 117, 121]
SAG: 17 lecturers - [122, 106, 90, 96, 35, 88, 71, 37, 18, 17, 8, 9, 104, 83, 36, 16, 117]
EIM: 13 lecturers - [149, 129, 90, 37, 67, 51, 8, 104, 57, 22, 109, 117, 85]
EISD: 17 lecturers - [122, 128, 129, 32, 40, 90, 69, 18, 17, 51, 9, 116, 27, 57, 22, 109, 121]
ERP: 14 lecturers - [41, 149, 40, 96, 35, 50, 69, 18, 8, 104, 27, 22, 36, 121]

Number of Expertise Areas per Lecturer:
1     5
2    15
3    15
Name: count, dtype: int64


**Interpretation**

1. Defenses per field:
*   SAG: 5
*   EDM: 4
*   EIM: 4
*   EISD: 4
*   ERP: 3

2. Qualified lecturers per field:
*   SAG: 17 qualified lecturers
*   EDM: 19 qualified lecturers
*   EIM: 13 qualified lecturers
*   EISD: 17 qualified lecturers
*   ERP: 14 qualified lecturers


3. Number of fields per lecturer:
*   1 field: 5 lecturers
*   2 fields: 15 lecturers
*   3 fields: 15 lecturers
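
The same two counts (lecturers per field, fields per lecturer) can also be obtained without an explicit loop. The following is a sketch only: it uses the five lecturers visible in the `head()` output rather than all 35 rows, and the names `sample`, `pairs`, `lecturers_per_field`, and `fields_per_lecturer` are assumptions for this example.

```python
import pandas as pd

# The five lecturers shown in the head() output; the full file has 35 rows.
sample = pd.DataFrame({
    "id": [122, 41, 128, 149, 129],
    "keahlian": ["EISD, EDM, SAG", "EDM, ERP", "EISD", "EIM, EDM, ERP", "EISD, EDM, EIM"],
})

# Split the comma-separated expertise string and produce one row
# per (lecturer, field) pair, stripping stray whitespace.
pairs = (
    sample.assign(field=sample["keahlian"].str.split(","))
    .explode("field", ignore_index=True)
    .assign(field=lambda d: d["field"].str.strip())
)

# How many lecturers are qualified for each field.
lecturers_per_field = pairs.groupby("field")["id"].nunique()

# How many fields each lecturer covers.
fields_per_lecturer = pairs.groupby("id")["field"].nunique()

print(lecturers_per_field)
print(fields_per_lecturer.value_counts().sort_index())
```

Applied to `dosen_keahlian_df`, this should reproduce the figures above, provided every `keahlian` value is a comma-separated string.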







In [None]:
# If examiner/supervisor ID columns exist, summarize workload and time conflicts
print("Current schedule with supervisor/examiner assignments:")
jadwal_cols = jadwal_df.columns
if 'penguji1_id' in jadwal_cols and 'penguji2_id' in jadwal_cols:
    # Calculate workload per lecturer
    all_assignments = []
    for col in ['penguji1_id', 'penguji2_id', 'pembimbing1_id', 'pembimbing2_id']:
        if col in jadwal_cols:
            all_assignments.extend(jadwal_df[col].tolist())

    workload = pd.Series(all_assignments).value_counts()
    print("\nCurrent Workload Distribution:")
    print(workload.head(10))

    # Check time conflicts, using only the role columns that actually exist
    role_cols = [c for c in ['penguji1_id', 'penguji2_id', 'pembimbing1_id', 'pembimbing2_id']
                 if c in jadwal_cols]

    def check_time_conflicts():
        conflicts = []
        for lecturer_id in set(all_assignments):
            # Rows where this lecturer holds any role
            lecturer_schedule = jadwal_df[jadwal_df[role_cols].eq(lecturer_id).any(axis=1)]
            if len(lecturer_schedule) > 1:
                dates = lecturer_schedule['datetime'].sort_values()
                for i in range(len(dates)-1):
                    time_diff = dates.iloc[i+1] - dates.iloc[i]
                    if time_diff.total_seconds() < 3600:  # 1 hour buffer
                        conflicts.append((lecturer_id, dates.iloc[i], dates.iloc[i+1]))
        return conflicts

    conflicts = check_time_conflicts()
    print("\nTime Conflicts Found:")
    for conflict in conflicts:
        print(f"Lecturer {conflict[0]}: {conflict[1]} and {conflict[2]}")

Current schedule with supervisor/examiner assignments:


### Checking the unique values in each column

In [None]:
# Display all columns in the schedule dataframe
print("Schedule DataFrame Columns:")
print(jadwal_df.columns.tolist())

# If penguji/pembimbing columns exist, let's look at a sample row
print("\nSample Row with All Details:")
print(jadwal_df.iloc[0])

# Check if we have examiner and supervisor columns
for col in jadwal_df.columns:
    print(f"\nUnique values in {col}:")
    print(jadwal_df[col].unique()[:5])  # Show first 5 unique values

Schedule DataFrame Columns:
['date', 'time', 'ruang', 'mahasiswa_id', 'judul', 'bidang', 'datetime']

Sample Row with All Details:
date                                                   2020-05-16
time                                                     10:00:00
ruang                                             meet.google.com
mahasiswa_id                                           1102134314
judul           ANALISIS KELAYAKAN PEMBUKAAN CABANG TOKO ANEKA...
bidang                                                        EDM
datetime                                      2020-05-16 10:00:00
Name: 0, dtype: object

Unique values in date:
['2020-05-16' '2020-05-29' '2020-06-04' '2020-06-18' '2020-06-08']

Unique values in time:
['10:00:00' '09:00:00' '13:00:00' '08:00:00']

Unique values in ruang:
['meet.google.com' 'https://meet.google.com/kup-jjhf-vbd'
 'meet1.google.com' 'meet.google.com/rnn-pfni-wqf'
 'meet.google.com/gbe-njcm-wmt']

Unique values in mahasiswa_id:
[1102134314 1201154091 1

### Date and time distribution

In [None]:
# Create time slots from the schedule
jadwal_df['datetime'] = pd.to_datetime(jadwal_df['date'].astype(str) + ' ' + jadwal_df['time'].astype(str))

# Analyze time slot distribution
print("Time slot distribution:")
time_distribution = jadwal_df.groupby(['date'])['time'].count()
print(time_distribution)

# Check concurrent sessions
print("\nDays with multiple defenses:")
concurrent_days = time_distribution[time_distribution > 1]
print(concurrent_days)

# For these days, show the detailed schedule
print("\nDetailed schedule for days with multiple defenses:")
for date in concurrent_days.index:
    day_schedule = jadwal_df[jadwal_df['date'] == date].sort_values('time')
    print(f"\n{date}:")
    print(day_schedule[['time', 'bidang', 'ruang']].to_string())

Time slot distribution:
date
2020-05-16    1
2020-05-29    1
2020-06-04    2
2020-06-08    2
2020-06-15    8
2020-06-16    2
2020-06-18    1
2020-06-22    1
2020-06-23    2
Name: time, dtype: int64

Days with multiple defenses:
date
2020-06-04    2
2020-06-08    2
2020-06-15    8
2020-06-16    2
2020-06-23    2
Name: time, dtype: int64

Detailed schedule for days with multiple defenses:

2020-06-04:
       time bidang            ruang
2  13:00:00    SAG  meet.google.com
3  13:00:00    EIM  meet.google.com

2020-06-08:
       time bidang             ruang
5  13:00:00    SAG   meet.google.com
6  13:00:00   EISD  meet1.google.com

2020-06-15:
        time bidang                         ruang
8   09:00:00    EDM  meet.google.com/rnn-pfni-wqf
9   09:00:00    SAG  meet.google.com/rnn-pfni-wqf
10  09:00:00    ERP  meet.google.com/rnn-pfni-wqf
12  09:00:00    ERP  meet.google.com/gbe-njcm-wmt
13  09:00:00   EISD  meet.google.com/gbe-njcm-wmt
14  09:00:00   EISD  meet.google.com/gbe-njcm-wmt
15

**Interpretation**

1. Time slot distribution:
- 2020-06-15 has the most defenses, with 8.
- Four dates have a single defense each (2020-05-16, 2020-05-29, 2020-06-18, and 2020-06-22), so activity on those days is light.

2. Days with multiple defenses:
- Five days have more than one defense: 2020-06-04, 2020-06-08, 2020-06-15, 2020-06-16, and 2020-06-23.
- 2020-06-15 is by far the busiest, with 8 defenses.
- The other four days have two defenses each.

3. Detailed schedule for days with multiple defenses:
- Most defenses take place in June and cluster around mid-month. The busiest day, 2020-06-15, has 7 sessions at 09:00:00 and 1 session at 13:00:00.




In [None]:
# Create a session analysis
print("Analysis of Session Distribution:")
session_analysis = pd.DataFrame({
    'time': jadwal_df['time'],
    'date': jadwal_df['date'],
    'bidang': jadwal_df['bidang']
})

# Group by date and time to see concurrent sessions
concurrent_sessions = session_analysis.groupby(['date', 'time']).agg({
    'bidang': lambda x: list(x)
}).reset_index()

# Show concurrent sessions with multiple defenses
print("\nConcurrent Sessions Analysis:")
print(concurrent_sessions[concurrent_sessions['bidang'].str.len() > 1])

# Calculate time gaps between sessions on the same day
print("\nTime gaps between sessions on same day:")
for date in jadwal_df['date'].unique():
    day_sessions = jadwal_df[jadwal_df['date'] == date].sort_values('time')
    if len(day_sessions) > 1:
        time_diffs = day_sessions['datetime'].diff()
        print(f"\n{date}:")
        print(time_diffs.dropna())

Analysis of Session Distribution:

Concurrent Sessions Analysis:
         date      time                                 bidang
2  2020-06-04  13:00:00                             [SAG, EIM]
3  2020-06-08  13:00:00                            [SAG, EISD]
4  2020-06-15  09:00:00  [EDM, SAG, ERP, ERP, EISD, EISD, EIM]
6  2020-06-16  13:00:00                             [EIM, SAG]
9  2020-06-23  09:00:00                             [SAG, EDM]

Time gaps between sessions on same day:

2020-06-04:
3   0 days
Name: datetime, dtype: timedelta64[ns]

2020-06-08:
6   0 days
Name: datetime, dtype: timedelta64[ns]

2020-06-15:
9    0 days 00:00:00
10   0 days 00:00:00
12   0 days 00:00:00
13   0 days 00:00:00
14   0 days 00:00:00
15   0 days 00:00:00
11   0 days 04:00:00
Name: datetime, dtype: timedelta64[ns]

2020-06-16:
17   0 days
Name: datetime, dtype: timedelta64[ns]

2020-06-23:
19   0 days
Name: datetime, dtype: timedelta64[ns]


**Interpretation**

1. Concurrent Sessions Analysis:
- 2020-06-04 13:00:00: two sessions run at the same time, in SAG and EIM.
- 2020-06-08 13:00:00: two sessions run at the same time, in SAG and EISD.
- 2020-06-15 09:00:00: seven sessions run at the same time, covering EDM, SAG, ERP (two sessions), EISD (two sessions), and EIM. This much overlap in a single slot calls for extra care when allocating rooms and examiners.
- 2020-06-16 13:00:00: two sessions run at the same time, in EIM and SAG.
- 2020-06-23 09:00:00: two sessions run at the same time, in SAG and EDM.

2. Time gaps between sessions on same day:
- 2020-06-04 and 2020-06-08: the gap between the two sessions is 0, which means they start at the same time. They are concurrent, not back to back.
- 2020-06-15: most gaps are 0 because seven sessions start together at 09:00. The single 4-hour gap separates that block from the one session at 13:00.
- 2020-06-16 and 2020-06-23: the gap is again 0, so the two sessions on each of these days start at the same time.
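
A zero gap only says that sessions start together; whether they also share a room is a separate question. The sketch below groups by room as well as by date and time. It uses the six 2020-06-15 rows visible in the (truncated) detailed schedule output, so it is not the full day, and the names `day`, `per_room`, and `shared_rooms` are assumptions for this example. Whether a shared Meet link is a real clash depends on how the links are used, which the data does not say.

```python
import pandas as pd

# Six of the 2020-06-15 rows, copied from the detailed schedule output.
# That output is truncated, so this is an incomplete sample of the day.
day = pd.DataFrame({
    "date": ["2020-06-15"] * 6,
    "time": ["09:00:00"] * 6,
    "bidang": ["EDM", "SAG", "ERP", "ERP", "EISD", "EISD"],
    "ruang": ["meet.google.com/rnn-pfni-wqf"] * 3 + ["meet.google.com/gbe-njcm-wmt"] * 3,
})

# Count sessions that share a date, a start time, AND a room.
per_room = day.groupby(["date", "time", "ruang"]).size().rename("sessions")

# Keep only the rooms used by more than one session in the same slot.
shared_rooms = per_room[per_room > 1].reset_index()
print(shared_rooms)
```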


In [None]:
# Create a schedule complexity analysis
print("Schedule Complexity Analysis:")

# 1. Analyze concurrent sessions by time slot
time_slot_analysis = pd.DataFrame({
    'date': jadwal_df['date'],
    'time': jadwal_df['time'],
    'session_count': 1
}).groupby(['date', 'time']).count().reset_index()

print("\nTime slots with highest concurrency:")
print(time_slot_analysis.sort_values('session_count', ascending=False).head())

# 2. Analyze field combinations in concurrent sessions
def get_field_combinations():
    concurrent = []
    for date in jadwal_df['date'].unique():
        day_schedule = jadwal_df[jadwal_df['date'] == date]
        times = day_schedule['time'].unique()
        for time in times:
            fields = day_schedule[day_schedule['time'] == time]['bidang'].tolist()
            if len(fields) > 1:
                concurrent.append({
                    'date': date,
                    'time': time,
                    'field_combination': fields,
                    'num_concurrent': len(fields)
                })
    return pd.DataFrame(concurrent)

field_combinations = get_field_combinations()
print("\nField combinations in concurrent sessions:")
print(field_combinations)

Schedule Complexity Analysis:

Time slots with highest concurrency:
         date      time  session_count
4  2020-06-15  09:00:00              7
2  2020-06-04  13:00:00              2
3  2020-06-08  13:00:00              2
6  2020-06-16  13:00:00              2
9  2020-06-23  09:00:00              2

Field combinations in concurrent sessions:
         date      time                      field_combination  num_concurrent
0  2020-06-04  13:00:00                             [SAG, EIM]               2
1  2020-06-08  13:00:00                            [SAG, EISD]               2
2  2020-06-15  09:00:00  [EDM, SAG, ERP, ERP, EISD, EISD, EIM]               7
3  2020-06-16  13:00:00                             [EIM, SAG]               2
4  2020-06-23  09:00:00                             [SAG, EDM]               2


**Interpretation**

1. Schedule Complexity Analysis:

- 2020-06-15 09:00:00 is the most crowded slot, with 7 defenses scheduled at the same time. Running them in parallel requires enough rooms and enough distinct examiners and supervisors.
- The slots 2020-06-04 13:00:00, 2020-06-08 13:00:00, 2020-06-16 13:00:00, and 2020-06-23 09:00:00 each have 2 concurrent sessions. That is far lighter than the 7-session slot, but the overlap still has to be managed.

2. Field combinations in concurrent sessions:
- 2020-06-04 13:00:00: two concurrent sessions, in SAG and EIM.
- 2020-06-08 13:00:00: two concurrent sessions, in SAG and EISD.
- 2020-06-15 09:00:00: 7 concurrent sessions across all 5 fields (EDM, SAG, ERP twice, EISD twice, and EIM). Because every field is in demand at once, this slot is the hardest to staff: qualified lecturers and rooms may both run short.
- 2020-06-16 13:00:00: two concurrent sessions, in EIM and SAG.
- 2020-06-23 09:00:00: two concurrent sessions, in SAG and EDM.
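
A rough way to gauge how tight the busiest slot is: count how many distinct lecturers it needs if nobody may sit in two concurrent sessions. This is back-of-the-envelope arithmetic, not a feasibility proof. It borrows the four-roles-per-defense assumption that the workload calculation in this notebook makes, and it ignores field matching entirely; all variable names are assumptions for this example.

```python
# Figures taken from the outputs in this notebook.
peak_sessions = 7       # sessions at 2020-06-15 09:00:00
total_lecturers = 35    # rows in the lecturer expertise dataset

# ASSUMPTION: each defense needs 2 examiners + 2 supervisors,
# and a lecturer cannot attend two concurrent sessions.
roles_per_session = 4

lecturers_needed = peak_sessions * roles_per_session
spare_lecturers = total_lecturers - lecturers_needed

print(f"Distinct lecturers needed in the peak slot: {lecturers_needed}")
print(f"Lecturers left over: {spare_lecturers}")
```

Under these assumptions the peak slot alone would occupy most of the lecturer pool, which leaves little slack once expertise constraints are added.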


In [None]:
# First, let's create a test class to verify our scheduling constraints and requirements
class ThesisSchedulingTest:
    def __init__(self, schedule_df, lecturer_expertise):
        """Initialize test class with our data"""
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise

        # Store test results
        self.test_results = {}

    def test_lecturer_expertise_coverage(self):
        """Test if we have enough qualified lecturers for each field"""
        print("\nTesting Lecturer Expertise Coverage:")
        coverage = {}
        for field in self.schedule['bidang'].unique():
            qualified = [lid for lid, exp in self.lecturer_expertise.items() if field in exp]
            coverage[field] = len(qualified)
            print(f"{field}: {len(qualified)} qualified lecturers")
            print(f"Sample qualified lecturers: {qualified[:5]}")
        return coverage

    def test_concurrent_session_constraints(self):
        """Test concurrent session patterns and constraints"""
        print("\nTesting Concurrent Session Patterns:")
        concurrent = self.schedule.groupby(['date', 'time']).size()
        print("\nDays with concurrent sessions:")
        print(concurrent[concurrent > 1].sort_values(ascending=False))
        return concurrent

    def test_time_slot_distribution(self):
        """Test distribution of sessions across time slots"""
        print("\nTesting Time Slot Distribution:")
        time_dist = self.schedule['time'].value_counts()
        print(time_dist)
        return time_dist

# Let's run our tests
test_scheduler = ThesisSchedulingTest(jadwal_df, lecturer_expertise)

# Run individual tests
expertise_coverage = test_scheduler.test_lecturer_expertise_coverage()
concurrent_patterns = test_scheduler.test_concurrent_session_constraints()
time_distribution = test_scheduler.test_time_slot_distribution()

# Additional verification
print("\nVerification of test results:")
print(f"Number of unique fields: {len(jadwal_df['bidang'].unique())}")
print(f"Total number of sessions: {len(jadwal_df)}")


Testing Lecturer Expertise Coverage:
EDM: 19 qualified lecturers
Sample qualified lecturers: [122, 41, 149, 129, 106]
SAG: 17 qualified lecturers
Sample qualified lecturers: [122, 106, 90, 96, 35]
EIM: 13 qualified lecturers
Sample qualified lecturers: [149, 129, 90, 37, 67]
EISD: 17 qualified lecturers
Sample qualified lecturers: [122, 128, 129, 32, 40]
ERP: 14 qualified lecturers
Sample qualified lecturers: [41, 149, 40, 96, 35]

Testing Concurrent Session Patterns:

Days with concurrent sessions:
date        time    
2020-06-15  09:00:00    7
2020-06-04  13:00:00    2
2020-06-08  13:00:00    2
2020-06-16  13:00:00    2
2020-06-23  09:00:00    2
dtype: int64

Testing Time Slot Distribution:
time
09:00:00    10
13:00:00     8
10:00:00     1
08:00:00     1
Name: count, dtype: int64

Verification of test results:
Number of unique fields: 5
Total number of sessions: 20


**Interpretation**
1. Testing Lecturer Expertise Coverage:
- EDM: 19 qualified lecturers
- SAG: 17 qualified lecturers
- EIM: 13 qualified lecturers
- EISD: 17 qualified lecturers
- ERP: 14 qualified lecturers

2. Testing Concurrent Session Patterns:
- 2020-06-15 09:00:00: 7 concurrent sessions
- 2020-06-04 13:00:00: 2 concurrent sessions
- 2020-06-08 13:00:00: 2 concurrent sessions
- 2020-06-16 13:00:00: 2 concurrent sessions
- 2020-06-23 09:00:00: 2 concurrent sessions

3. Testing Time Slot Distribution:
- 09:00:00: 10 sessions
- 13:00:00: 8 sessions
- 10:00:00: 1 session
- 08:00:00: 1 session

### Verifying the feasibility of the schedule

In [None]:
class ThesisSchedulingConstraints:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise

    def test_field_matching_constraints(self):
        """Test if lecturers are qualified for their assigned fields"""
        print("\nTesting Field Matching Constraints:")

        # For each field, verify we have sufficient coverage throughout the schedule
        field_coverage = {}
        for field in self.schedule['bidang'].unique():
            sessions_needed = len(self.schedule[self.schedule['bidang'] == field])
            qualified_lecturers = [lid for lid, exp in self.lecturer_expertise.items() if field in exp]
            field_coverage[field] = {
                'sessions_needed': sessions_needed,
                'qualified_lecturers': len(qualified_lecturers),
                'coverage_ratio': len(qualified_lecturers) / sessions_needed
            }

        for field, stats in field_coverage.items():
            print(f"\nField: {field}")
            print(f"Sessions to cover: {stats['sessions_needed']}")
            print(f"Qualified lecturers: {stats['qualified_lecturers']}")
            print(f"Coverage ratio: {stats['coverage_ratio']:.2f}")

        return field_coverage

    def test_workload_distribution(self):
        """Test potential workload distribution"""
        print("\nTesting Workload Distribution Possibilities:")

        # Calculate minimum and maximum possible assignments per lecturer
        total_sessions = len(self.schedule)
        total_lecturers = len(self.lecturer_expertise)

        min_assignments = total_sessions // total_lecturers
        max_assignments = min_assignments + 1

        print(f"Total sessions: {total_sessions}")
        print(f"Total lecturers: {total_lecturers}")
        print(f"Minimum assignments per lecturer: {min_assignments}")
        print(f"Maximum assignments per lecturer: {max_assignments}")

        # Calculate lecturer versatility
        lecturer_versatility = {lid: len(exp) for lid, exp in self.lecturer_expertise.items()}
        print("\nLecturer versatility distribution:")
        versatility_dist = pd.Series(lecturer_versatility).value_counts().sort_index()
        print(versatility_dist)

        return {
            'min_assignments': min_assignments,
            'max_assignments': max_assignments,
            'versatility': lecturer_versatility
        }

    def test_time_feasibility(self):
        """Test if the time slots allow for feasible scheduling"""
        print("\nTesting Time Slot Feasibility:")

        # Analyze time gaps between sessions
        self.schedule['datetime'] = pd.to_datetime(self.schedule['date'].astype(str) + ' ' + self.schedule['time'].astype(str))

        for date in self.schedule['date'].unique():
            day_sessions = self.schedule[self.schedule['date'] == date].sort_values('time')
            if len(day_sessions) > 1:
                print(f"\nDate: {date}")
                print("Number of sessions:", len(day_sessions))
                print("Time distribution:")
                print(day_sessions['time'].value_counts())

        return True

# Run the constraint tests
constraint_tester = ThesisSchedulingConstraints(jadwal_df, lecturer_expertise)

# Test each constraint
field_coverage = constraint_tester.test_field_matching_constraints()
workload_stats = constraint_tester.test_workload_distribution()
time_feasibility = constraint_tester.test_time_feasibility()

# Additional verification
print("\nOverall Feasibility Summary:")
print("1. Field coverage adequate:", all(stats['coverage_ratio'] >= 2 for stats in field_coverage.values()))
print("2. Workload distribution possible:", workload_stats['min_assignments'] > 0)
print("3. Time slots feasible:", time_feasibility)


Testing Field Matching Constraints:

Field: EDM
Sessions to cover: 4
Qualified lecturers: 19
Coverage ratio: 4.75

Field: SAG
Sessions to cover: 5
Qualified lecturers: 17
Coverage ratio: 3.40

Field: EIM
Sessions to cover: 4
Qualified lecturers: 13
Coverage ratio: 3.25

Field: EISD
Sessions to cover: 4
Qualified lecturers: 17
Coverage ratio: 4.25

Field: ERP
Sessions to cover: 3
Qualified lecturers: 14
Coverage ratio: 4.67

Testing Workload Distribution Possibilities:
Total sessions: 20
Total lecturers: 35
Minimum assignments per lecturer: 0
Maximum assignments per lecturer: 1

Lecturer versatility distribution:
1     5
2    15
3    15
Name: count, dtype: int64

Testing Time Slot Feasibility:

Date: 2020-06-04
Number of sessions: 2
Time distribution:
time
13:00:00    2
Name: count, dtype: int64

Date: 2020-06-08
Number of sessions: 2
Time distribution:
time
13:00:00    2
Name: count, dtype: int64

Date: 2020-06-15
Number of sessions: 8
Time distribution:
time
09:00:00    7
13:00:00   

**Interpretation**

1. Testing Field Matching Constraints:
- Field: EDM
  - Sessions to cover: 4
  - Qualified lecturers: 19 (19 lecturers list EDM among their areas of expertise)
  - Coverage ratio: 4.75 (on average, 4.75 qualified lecturers are available per session)
- The other fields read the same way.

2. Overall Feasibility Summary:
- Field coverage adequate: True. Every field has a coverage ratio of at least 2, so there are enough qualified lecturers to cover every session.
- Workload distribution possible: False. The check evaluates `20 // 35 = 0`: with fewer sessions than lecturers, one assignment per session cannot reach every lecturer, so some lecturers would have no assignment at all while others carry the load.
- Time slots feasible: True. Note that `test_time_feasibility` always returns `True`, so this only confirms that the check ran. Several dates still have sessions stacked in the same slot.
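
The two numeric checks in the summary can be reproduced by hand. The sketch below hard-codes the per-field figures from the output above; the ceiling-based share at the end is an assumption added for comparison and is not something the notebook computes.

```python
import math

# Per-field figures copied from the "Testing Field Matching Constraints" output.
sessions = {"EDM": 4, "SAG": 5, "EIM": 4, "EISD": 4, "ERP": 3}
qualified = {"EDM": 19, "SAG": 17, "EIM": 13, "EISD": 17, "ERP": 14}

# Coverage ratio = qualified lecturers per session in that field.
coverage_ratio = {field: qualified[field] / sessions[field] for field in sessions}
field_coverage_ok = all(ratio >= 2 for ratio in coverage_ratio.values())

total_sessions, total_lecturers = 20, 35

# The notebook's workload check: floor division gives 0, so "> 0" is False.
floor_share = total_sessions // total_lecturers
workload_check = floor_share > 0

# ASSUMPTION (not in the notebook): a ceiling gives the smallest per-lecturer
# cap under which all sessions can still be assigned.
ceil_share = math.ceil(total_sessions / total_lecturers)

for field, ratio in coverage_ratio.items():
    print(f"{field}: {ratio:.2f}")
print("Field coverage adequate:", field_coverage_ok)
print("Workload check (floor > 0):", workload_check)
print("Cap per lecturer (ceiling):", ceil_share)
```

The `False` result therefore reflects the ratio of sessions to lecturers and says nothing yet about how an actual assignment would be balanced.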

### Lecturer workload calculation

Each candidate assignment is scored on several aspects.

- This step evaluates and optimizes the thesis defense schedule with two main factors in mind: how well a lecturer's expertise matches the field of the thesis, and how evenly the workload is spread across lecturers.

- Reward = the combined score for those two factors. It steers the schedule toward assigning lecturers to fields they are qualified in while keeping workloads balanced.







In [None]:
class ThesisSchedulingRewards:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise

    def test_workload_calculation(self):  # lecturer workload calculation
        """Revised workload calculation considering multiple roles per defense"""
        print("\nRevised Workload Calculations:")

        # Assuming each defense needs 2 examiners and 2 supervisors
        total_roles = len(self.schedule) * 4  # 4 roles per sidang
        num_lecturers = len(self.lecturer_expertise)

        avg_roles_per_lecturer = total_roles / num_lecturers

        print(f"Total defenses: {len(self.schedule)}")
        print(f"Total lecturer roles needed: {total_roles}")
        print(f"Number of available lecturers: {num_lecturers}")
        print(f"Average roles per lecturer: {avg_roles_per_lecturer:.2f}")

        return {
            'total_roles': total_roles,
            'avg_roles': avg_roles_per_lecturer
        }

    def calculate_expertise_matching_score(self, lecturer_id, field):  # how well a lecturer fits a field, based on the lecturer's expertise
        """Calculate how well a lecturer matches a field"""
        if lecturer_id in self.lecturer_expertise:
            if field in self.lecturer_expertise[lecturer_id]:
                num_expertise = len(self.lecturer_expertise[lecturer_id])
                # Higher score for more focused expertise
                return 1.0 / num_expertise
        return 0.0

    def test_reward_components(self):  # exercise the reward components: expertise match and workload
        """Test individual components of the reward function"""
        print("\nTesting Reward Components:")

        # Test expertise matching
        print("\nExpertise Matching Test:")
        test_cases = [
            (self.schedule['bidang'].iloc[0], list(self.lecturer_expertise.keys())[0]),
            (self.schedule['bidang'].iloc[1], list(self.lecturer_expertise.keys())[1])
        ]

        for field, lecturer_id in test_cases:
            score = self.calculate_expertise_matching_score(lecturer_id, field)
            print(f"Field: {field}, Lecturer: {lecturer_id}, Match Score: {score:.2f}")

        # Test workload balance
        print("\nWorkload Balance Test:")
        workload = self.test_workload_calculation()
        target_workload = workload['avg_roles']
        print(f"Target workload per lecturer: {target_workload:.2f}")

        return {
            'workload_target': target_workload,
            'test_cases': test_cases
        }

# Create and run reward tests
reward_tester = ThesisSchedulingRewards(jadwal_df, lecturer_expertise)

# Test workload calculations
workload_stats = reward_tester.test_workload_calculation()

# Test reward components
reward_components = reward_tester.test_reward_components()


Revised Workload Calculations:
Total defenses: 20
Total lecturer roles needed: 80
Number of available lecturers: 35
Average roles per lecturer: 2.29

Testing Reward Components:

Expertise Matching Test:
Field: EDM, Lecturer: 122, Match Score: 0.33
Field: EDM, Lecturer: 41, Match Score: 0.50

Workload Balance Test:

Revised Workload Calculations:
Total defenses: 20
Total lecturer roles needed: 80
Number of available lecturers: 35
Average roles per lecturer: 2.29
Target workload per lecturer: 2.29


**Interpretation**

1. Revised Workload Calculations:
- Total defenses: 20. There are 20 thesis defense or seminar sessions to schedule.
- Total lecturer roles needed: 80. The 20 sessions need 80 lecturer roles in total, assuming each session needs 4 roles (2 examiners and 2 supervisors).
- Number of available lecturers: 35. There are 35 lecturers available to fill these roles.
- Average roles per lecturer: 2.29. The average workload is 80 / 35 ≈ 2.29 roles per lecturer, so each lecturer should fill about 2 to 3 roles across the defense sessions.

2. Expertise Matching Test:
Match score: used to prioritize the lecturers who best match a given thesis field. The score is 1 divided by the number of expertise fields the lecturer has when the thesis field is among them, and 0 otherwise, so a more focused lecturer scores higher.

  Example: Lecturer 41 (2 fields, score 0.50) matches the EDM field better than Lecturer 122 (3 fields, score 0.33), so Lecturer 41 is preferred when available.

3. Target workload per lecturer: 2.29 (the load each lecturer should carry for the workload to be balanced)
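
The figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the helper names `average_roles_per_lecturer` and `expertise_match_score` are hypothetical, and the inputs are taken from the printed output (20 defenses, 4 roles each, 35 lecturers).

```python
def average_roles_per_lecturer(num_defenses, roles_per_defense, num_lecturers):
    """Total roles needed divided by the number of available lecturers."""
    return num_defenses * roles_per_defense / num_lecturers


def expertise_match_score(expertise_fields, thesis_field):
    """1/n for a lecturer with n fields if the thesis field is among them, else 0."""
    if thesis_field in expertise_fields:
        return 1.0 / len(expertise_fields)
    return 0.0


target = average_roles_per_lecturer(20, 4, 35)
print(f"{target:.2f}")                                                 # 2.29
print(f"{expertise_match_score(['EISD', 'EDM', 'SAG'], 'EDM'):.2f}")  # 0.33
print(f"{expertise_match_score(['EDM', 'ERP'], 'EDM'):.2f}")          # 0.50
```

Dividing by the number of fields is what makes Lecturer 41 rank above Lecturer 122 for an EDM thesis.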


## RL

### Reward calculation

a. Each assignment is scored on several aspects:

- Whether the lecturer's expertise matches the field of the defense.
- How balanced the lecturer's workload is relative to the average target.
- Whether the lecturer has a schedule conflict (two defenses at the same time).

Notes:
- Expertise Reward: favors assignments that fit the lecturer's expertise. (If the defense field matches the lecturer's expertise, the lecturer earns a positive score; if it does not match, the score is 0.)
- Workload Penalty: keeps the workload distribution fair. (The further a lecturer's load is from the target of 2.29, the larger the penalty, capped at -1.)
- Time Conflict Penalty: avoids time clashes in the schedule. (If a lecturer has two assignments at the same time, the penalty is -1.)
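
Read off the code in the next cell, the three components combine into a single reward per assignment. The notation here is introduced only for this summary and is not from the notebook: $E_\ell$ is the set of expertise fields of lecturer $\ell$, $f_d$ is the field of defense $d$, and $w_\ell$ is the lecturer's current number of assignments.

$$
r(\ell, d) \;=\; \underbrace{\frac{\mathbb{1}[f_d \in E_\ell]}{|E_\ell|}}_{\text{expertise}} \;-\; \underbrace{\min\big(|w_\ell - 2.29|,\ 1\big)}_{\text{workload}} \;-\; \underbrace{\mathbb{1}[\text{conflict}(\ell, d)]}_{\text{time conflict}}
$$

Under this form the reward lies between $-2$ and $1$. For example, a lecturer with three fields, one existing assignment, and a time conflict gets $1/3 - 1 - 1 \approx -1.67$.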


In [None]:
class ThesisDefenseRL:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.target_workload = 2.29  # from the earlier workload calculation
        self.current_assignments = {}  # track lecturer assignments

    def test_reward_calculation(self, test_assignments):
        """Test reward calculation with sample assignments"""
        print("\nTesting Reward Calculation:")

        def calculate_single_assignment_reward(lecturer_id, field, time_slot, date):
            reward = 0

            # 1. Expertise Matching Reward (0 to 1)
            expertise_score = 0
            if field in self.lecturer_expertise.get(lecturer_id, []):
                expertise_score = 1.0 / len(self.lecturer_expertise[lecturer_id])
            reward += expertise_score

            # 2. Workload Balance Penalty (-1 to 0)
            current_load = len([a for a in test_assignments if a[0] == lecturer_id])
            workload_diff = abs(current_load - self.target_workload)
            workload_penalty = -min(workload_diff, 1.0)
            reward += workload_penalty

            # 3. Time Conflict Penalty (-1 or 0)
            has_conflict = any(
                a[3] == date and a[2] == time_slot
                for a in test_assignments if a[0] == lecturer_id
            )
            if has_conflict:
                reward -= 1.0

            return reward, {
                'expertise': expertise_score,
                'workload': workload_penalty,
                'conflict': -1.0 if has_conflict else 0
            }

        # Example scenarios to test
        test_cases = [
            # Format: (lecturer_id, field, time_slot, date)
            (122, 'EDM', '09:00:00', '2020-06-15'),  # lecturer whose expertise matches the field
            (41, 'SAG', '13:00:00', '2020-06-15'),   # lecturer whose expertise does not match the field
            (122, 'EDM', '09:00:00', '2020-06-15'),  # Time conflict case
        ]

        print("\nTesting different assignment scenarios:")
        for case in test_cases:
            lecturer_id, field, time_slot, date = case
            reward, components = calculate_single_assignment_reward(lecturer_id, field, time_slot, date)

            print(f"\nScenario - Lecturer: {lecturer_id}, Field: {field}, Time: {time_slot}")
            print(f"Total Reward: {reward:.2f}")
            print("Reward Components:")
            print(f"- Expertise Score: {components['expertise']:.2f}")
            print(f"- Workload Balance: {components['workload']:.2f}")
            print(f"- Time Conflict: {components['conflict']:.2f}")

            # Verify expertise matching
            print(f"Lecturer Expertise: {self.lecturer_expertise.get(lecturer_id, [])}")

        return True

# Create test assignments
test_assignments = [
    (122, 'EDM', '09:00:00', '2020-06-15'),
    (41, 'EDM', '13:00:00', '2020-06-15'),
    (149, 'EIM', '09:00:00', '2020-06-16')
]

# Test the RL reward calculations
rl_tester = ThesisDefenseRL(jadwal_df, lecturer_expertise)
rl_tester.test_reward_calculation(test_assignments)


Testing Reward Calculation:

Testing different assignment scenarios:

Scenario - Lecturer: 122, Field: EDM, Time: 09:00:00
Total Reward: -1.67
Reward Components:
- Expertise Score: 0.33
- Workload Balance: -1.00
- Time Conflict: -1.00
Lecturer Expertise: ['EISD', 'EDM', 'SAG']

Scenario - Lecturer: 41, Field: SAG, Time: 13:00:00
Total Reward: -2.00
Reward Components:
- Expertise Score: 0.00
- Workload Balance: -1.00
- Time Conflict: -1.00
Lecturer Expertise: ['EDM', 'ERP']

Scenario - Lecturer: 122, Field: EDM, Time: 09:00:00
Total Reward: -1.67
Reward Components:
- Expertise Score: 0.33
- Workload Balance: -1.00
- Time Conflict: -1.00
Lecturer Expertise: ['EISD', 'EDM', 'SAG']


True
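
Every scenario in the run above carries a time conflict of -1.00. This follows from the code: each test case is compared against `test_assignments`, and each case shares a lecturer, date, and time with an entry already in that list. If the intent is to score the entries of a schedule against each other, one option is to skip the entry being evaluated. The sketch below assumes that reading; `has_time_conflict` and the fourth assignment are hypothetical.

```python
def has_time_conflict(index, assignments):
    """True if another entry (not the one at `index`) has the same lecturer, date, and time."""
    lecturer_id, _field, time_slot, date = assignments[index]
    return any(
        other[0] == lecturer_id and other[2] == time_slot and other[3] == date
        for i, other in enumerate(assignments)
        if i != index  # do not compare an assignment with itself
    )


assignments = [
    (122, 'EDM', '09:00:00', '2020-06-15'),
    (41, 'EDM', '13:00:00', '2020-06-15'),
    (149, 'EIM', '09:00:00', '2020-06-16'),
    (122, 'SAG', '09:00:00', '2020-06-15'),  # hypothetical double booking for lecturer 122
]
conflicts = [has_time_conflict(i, assignments) for i in range(len(assignments))]
print(conflicts)  # [True, False, False, True]
```

With this check only the two clashing entries for lecturer 122 are flagged.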

### Improved reward calculation

In [None]:
class ThesisDefenseRLImproved:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.target_workload = 2.29
        self.current_assignments = {}

    def calculate_reward(self, test_assignments):
        """Improved reward calculation with better balancing"""
        print("\nTesting Improved Reward Function:")

        def calculate_single_assignment_reward(lecturer_id, field, time_slot, date):
            # 1. Expertise Matching (0 to 2.0)
            expertise_score = 0
            if field in self.lecturer_expertise.get(lecturer_id, []):
                # Higher reward for better expertise match
                expertise_score = 2.0 * (1.0 / len(self.lecturer_expertise[lecturer_id]))

            # 2. Workload Balance (-0.5 to 0)
            current_load = len([a for a in test_assignments if a[0] == lecturer_id])
            workload_diff = abs(current_load - self.target_workload)
            workload_penalty = -0.5 * min(workload_diff, 1.0)

            # 3. Time Conflict (-2.0 or 0)
            time_conflicts = [
                a for a in test_assignments
                if a[0] == lecturer_id and a[3] == date
                and abs((pd.to_datetime(a[2]) - pd.to_datetime(time_slot)).total_seconds()) < 7200  # 2-hour buffer
            ]
            conflict_penalty = -2.0 if time_conflicts else 0

            total_reward = expertise_score + workload_penalty + conflict_penalty

            return total_reward, {
                'expertise': expertise_score,
                'workload': workload_penalty,
                'conflict': conflict_penalty
            }

        # Test realistic scenarios
        test_cases = [
            # Good case: Matching expertise, no conflicts
            (122, 'EDM', '13:00:00', '2020-06-16'),

            # Moderate case: Matching expertise but high workload
            (41, 'EDM', '09:00:00', '2020-06-15'),

            # Bad case: No expertise match and time conflict
            (128, 'SAG', '09:00:00', '2020-06-15'),

            # Optimal case: Perfect expertise match, good workload
            (149, 'EIM', '13:00:00', '2020-06-23')
        ]

        print("\nTesting Various Scheduling Scenarios:")
        for case in test_cases:
            lecturer_id, field, time_slot, date = case
            reward, components = calculate_single_assignment_reward(lecturer_id, field, time_slot, date)

            print(f"\nScenario - Lecturer: {lecturer_id}")
            print(f"Field: {field}, Time: {time_slot}, Date: {date}")
            print(f"Total Reward: {reward:.2f}")
            print("Components:")
            print(f"- Expertise Score: {components['expertise']:.2f}")
            print(f"- Workload Balance: {components['workload']:.2f}")
            print(f"- Time Conflict: {components['conflict']:.2f}")
            print(f"Lecturer Expertise: {self.lecturer_expertise.get(lecturer_id, [])}")

        return True

# Test the improved reward calculations
test_assignments = [
    (122, 'EDM', '09:00:00', '2020-06-15'),
    (41, 'EDM', '13:00:00', '2020-06-15'),
    (149, 'EIM', '09:00:00', '2020-06-16')
]

rl_tester_improved = ThesisDefenseRLImproved(jadwal_df, lecturer_expertise)
rl_tester_improved.calculate_reward(test_assignments)


Testing Improved Reward Function:

Testing Various Scheduling Scenarios:

Scenario - Lecturer: 122
Field: EDM, Time: 13:00:00, Date: 2020-06-16
Total Reward: 0.17
Components:
- Expertise Score: 0.67
- Workload Balance: -0.50
- Time Conflict: 0.00
Lecturer Expertise: ['EISD', 'EDM', 'SAG']

Scenario - Lecturer: 41
Field: EDM, Time: 09:00:00, Date: 2020-06-15
Total Reward: 0.50
Components:
- Expertise Score: 1.00
- Workload Balance: -0.50
- Time Conflict: 0.00
Lecturer Expertise: ['EDM', 'ERP']

Scenario - Lecturer: 128
Field: SAG, Time: 09:00:00, Date: 2020-06-15
Total Reward: -0.50
Components:
- Expertise Score: 0.00
- Workload Balance: -0.50
- Time Conflict: 0.00
Lecturer Expertise: ['EISD']

Scenario - Lecturer: 149
Field: EIM, Time: 13:00:00, Date: 2020-06-23
Total Reward: 0.17
Components:
- Expertise Score: 0.67
- Workload Balance: -0.50
- Time Conflict: 0.00
Lecturer Expertise: ['EIM', 'EDM', 'ERP']


True
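
The improved version treats two assignments on the same date as conflicting when they start less than 7200 seconds (2 hours) apart. That is why Lecturer 41 at 09:00 shows no conflict with the existing 13:00 assignment on the same day. The sketch below isolates that check. It uses `datetime.strptime` from the standard library instead of `pd.to_datetime`, and the helper name `within_buffer` is hypothetical.

```python
from datetime import datetime

MIN_GAP_SECONDS = 7200  # 2-hour buffer, same value as in the cell above


def within_buffer(time_a, time_b, min_gap=MIN_GAP_SECONDS):
    """True if two 'HH:MM:SS' times on the same date are less than min_gap seconds apart."""
    fmt = "%H:%M:%S"
    gap = abs((datetime.strptime(time_a, fmt) - datetime.strptime(time_b, fmt)).total_seconds())
    return gap < min_gap


print(within_buffer("09:00:00", "13:00:00"))  # False: 4 hours apart
print(within_buffer("09:00:00", "10:30:00"))  # True: 1.5 hours apart
print(within_buffer("09:00:00", "11:00:00"))  # False: exactly 2 hours is allowed
```

Because the comparison is strict (`<`), slots exactly two hours apart do not count as a conflict.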

### define environment

In [None]:
class ThesisDefenseEnvironment:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.target_workload = 2.29
        self.assignments = {}  # {(date, time_slot): [lecturer_ids]}
        self.current_state = self.get_initial_state()

        # Constants from our analysis
        self.MAX_ASSIGNMENTS_PER_SLOT = 4  # 2 examiners + 2 supervisors
        self.MIN_TIME_GAP = 7200  # 2 hours in seconds

    def get_initial_state(self):
        """Create initial state representation"""
        return {
            'scheduled_defenses': [],
            'lecturer_loads': {lid: 0 for lid in self.lecturer_expertise.keys()},
            'remaining_defenses': self.schedule.index.tolist()
        }

    def get_valid_actions(self, defense_id):
        """Get valid lecturer assignments for a defense"""
        defense = self.schedule.loc[defense_id]
        valid_actions = []

        for lecturer_id in self.lecturer_expertise.keys():
            # Check expertise match
            if defense['bidang'] in self.lecturer_expertise[lecturer_id]:
                # Check time conflicts
                has_conflict = False
                if (defense['date'], defense['time']) in self.assignments:
                    if lecturer_id in self.assignments[(defense['date'], defense['time'])]:
                        has_conflict = True

                if not has_conflict:
                    valid_actions.append(lecturer_id)

        return valid_actions

    def calculate_assignment_reward(self, lecturer_id, defense_id):
        """Calculate reward for assigning lecturer to defense"""
        defense = self.schedule.loc[defense_id]

        # 1. Expertise Matching (0 to 2.0)
        expertise_score = 0
        if defense['bidang'] in self.lecturer_expertise.get(lecturer_id, []):
            expertise_score = 2.0 * (1.0 / len(self.lecturer_expertise[lecturer_id]))

        # 2. Workload Balance (-0.5 to 0)
        current_load = self.current_state['lecturer_loads'].get(lecturer_id, 0)
        workload_diff = abs(current_load - self.target_workload)
        workload_penalty = -0.5 * min(workload_diff, 1.0)

        # 3. Time Conflict Check (-2.0 or 0)
        conflict_penalty = 0
        if (defense['date'], defense['time']) in self.assignments:
            if lecturer_id in self.assignments[(defense['date'], defense['time'])]:
                conflict_penalty = -2.0

        return expertise_score + workload_penalty + conflict_penalty

    def step(self, defense_id, lecturer_id):
        """Take a step in the environment by making an assignment"""
        defense = self.schedule.loc[defense_id]
        reward = self.calculate_assignment_reward(lecturer_id, defense_id)

        # Update state
        key = (defense['date'], defense['time'])
        if key not in self.assignments:
            self.assignments[key] = []
        self.assignments[key].append(lecturer_id)

        self.current_state['lecturer_loads'][lecturer_id] = self.current_state['lecturer_loads'].get(lecturer_id, 0) + 1

        if defense_id in self.current_state['remaining_defenses']:
            self.current_state['remaining_defenses'].remove(defense_id)
            self.current_state['scheduled_defenses'].append(defense_id)

        done = len(self.current_state['remaining_defenses']) == 0
        return self.current_state, reward, done

# Test the environment
print("Testing RL Environment:")
env = ThesisDefenseEnvironment(jadwal_df, lecturer_expertise)

# Test case: Schedule first defense
test_defense_id = jadwal_df.index[0]
print(f"\nTesting assignment for defense {test_defense_id}:")
print("Defense details:", jadwal_df.loc[test_defense_id])

valid_actions = env.get_valid_actions(test_defense_id)
print(f"\nValid lecturers for this defense: {len(valid_actions)}")
print("Sample of valid lecturers:", valid_actions[:5])

if valid_actions:
    test_lecturer = valid_actions[0]
    new_state, reward, done = env.step(test_defense_id, test_lecturer)
    print(f"\nAssignment result:")
    print(f"Assigned lecturer: {test_lecturer}")
    print(f"Reward: {reward:.2f}")
    print(f"Done: {done}")
    print(f"Remaining defenses: {len(new_state['remaining_defenses'])}")

Testing RL Environment:

Testing assignment for defense 0:
Defense details: date                                                   2020-05-16
time                                                     10:00:00
ruang                                             meet.google.com
mahasiswa_id                                           1102134314
judul           ANALISIS KELAYAKAN PEMBUKAAN CABANG TOKO ANEKA...
bidang                                                        EDM
datetime                                      2020-05-16 10:00:00
Name: 0, dtype: object

Valid lecturers for this defense: 19
Sample of valid lecturers: [122, 41, 149, 129, 106]

Assignment result:
Assigned lecturer: 122
Reward: 0.17
Done: False
Remaining defenses: 19
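
The reward of 0.17 for Lecturer 122 follows from the environment's formula: the expertise term is 2 × (1/3) ≈ 0.67, and the workload penalty is evaluated on the load before the assignment is recorded (0), giving -0.5. The sketch below tabulates the workload penalty for loads 0 to 4 under the same formula; `workload_penalty` is a hypothetical standalone helper, not a method of the class above.

```python
TARGET_WORKLOAD = 2.29  # same target as in the environment


def workload_penalty(current_load, target=TARGET_WORKLOAD):
    """-0.5 times the distance to the target, with the distance capped at 1."""
    return -0.5 * min(abs(current_load - target), 1.0)


for load in range(5):
    print(load, round(workload_penalty(load), 3))

# Reward for a first assignment to a lecturer with three expertise fields
first_reward = 2.0 * (1.0 / 3) + workload_penalty(0)
print(f"{first_reward:.2f}")  # 0.17
```

The penalty is smallest at loads 2 and 3, the integers nearest the target, and reaches its cap of -0.5 for loads of 0, 1, or 4 and above.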


### RL implementation

In [None]:
class ThesisDefenseScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.env = ThesisDefenseEnvironment(schedule_df, lecturer_expertise)
        self.learning_rate = 0.1
        self.discount_factor = 0.9
        self.epsilon = 0.1  # for exploration

    def select_action(self, defense_id, valid_actions):
        """Select action using epsilon-greedy policy"""
        if np.random.random() < self.epsilon:
            # Exploration: random selection
            return np.random.choice(valid_actions)
        else:
            # Exploitation: select best action based on expected rewards
            action_rewards = {
                lecturer_id: self.env.calculate_assignment_reward(lecturer_id, defense_id)
                for lecturer_id in valid_actions
            }
            return max(action_rewards.items(), key=lambda x: x[1])[0]

    def schedule_defenses(self, max_iterations=1000):
        """Main scheduling algorithm"""
        print("\nStarting Defense Scheduling:")
        best_schedule = None
        best_reward = float('-inf')

        for iteration in range(max_iterations):
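            # Start each iteration from a fresh environment built from this scheduler's own data
            self.env = ThesisDefenseEnvironment(self.env.schedule, self.env.lecturer_expertise)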
            total_reward = 0
            schedule = []

            # Process each defense
            while len(self.env.current_state['remaining_defenses']) > 0:
                defense_id = self.env.current_state['remaining_defenses'][0]
                valid_actions = self.env.get_valid_actions(defense_id)

                if not valid_actions:
                    print(f"No valid actions for defense {defense_id}")
                    break

                # Select and take action
                selected_lecturer = self.select_action(defense_id, valid_actions)
                new_state, reward, done = self.env.step(defense_id, selected_lecturer)

                total_reward += reward
                schedule.append((defense_id, selected_lecturer))

            # Update best schedule if current is better
            if total_reward > best_reward:
                best_reward = total_reward
                best_schedule = schedule.copy()

            if iteration % 100 == 0:
                print(f"Iteration {iteration}, Current Best Reward: {best_reward:.2f}")

        return best_schedule, best_reward

    def analyze_schedule(self, schedule):
        """Analyze the quality of the generated schedule"""
        print("\nSchedule Analysis:")

        # Analyze lecturer workload
        workload = {}
        for defense_id, lecturer_id in schedule:
            workload[lecturer_id] = workload.get(lecturer_id, 0) + 1

        print("\nLecturer Workload Distribution:")
        workload_series = pd.Series(workload)
        print(workload_series.describe())

        # Analyze expertise matching
        expertise_matches = 0
        for defense_id, lecturer_id in schedule:
            defense = self.env.schedule.loc[defense_id]
            if defense['bidang'] in self.env.lecturer_expertise[lecturer_id]:
                expertise_matches += 1

        expertise_ratio = expertise_matches / len(schedule)
        print(f"\nExpertise Matching Ratio: {expertise_ratio:.2f}")

        return {
            'workload': workload,
            'expertise_ratio': expertise_ratio
        }

# Test the scheduler
scheduler = ThesisDefenseScheduler(jadwal_df, lecturer_expertise)
print("\nRunning scheduling optimization...")
best_schedule, best_reward = scheduler.schedule_defenses(max_iterations=500)

# Analyze results
print(f"\nFinal Best Reward: {best_reward:.2f}")
analysis = scheduler.analyze_schedule(best_schedule)

# Display sample of final schedule
print("\nSample of Final Schedule (first 5 assignments):")
for defense_id, lecturer_id in best_schedule[:5]:
    defense = jadwal_df.loc[defense_id]
    print(f"Defense: {defense_id}")
    print(f"Date: {defense['date']}, Time: {defense['time']}")
    print(f"Field: {defense['bidang']}")
    print(f"Assigned Lecturer: {lecturer_id}")
    print(f"Lecturer Expertise: {lecturer_expertise[lecturer_id]}")
    print("---")


Running scheduling optimization...

Starting Defense Scheduling:
Iteration 0, Current Best Reward: 26.71
Iteration 100, Current Best Reward: 28.86
Iteration 200, Current Best Reward: 28.86
Iteration 300, Current Best Reward: 28.86
Iteration 400, Current Best Reward: 28.86

Final Best Reward: 28.86

Schedule Analysis:

Lecturer Workload Distribution:
count    7.000000
mean     2.857143
std      1.573592
min      1.000000
25%      1.500000
50%      3.000000
75%      4.000000
max      5.000000
dtype: float64

Expertise Matching Ratio: 1.00

Sample of Final Schedule (first 5 assignments):
Defense: 0
Date: 2020-05-16, Time: 10:00:00
Field: EDM
Assigned Lecturer: 52
Lecturer Expertise: ['EDM']
---
Defense: 1
Date: 2020-05-29, Time: 09:00:00
Field: EDM
Assigned Lecturer: 52
Lecturer Expertise: ['EDM']
---
Defense: 2
Date: 2020-06-04, Time: 13:00:00
Field: SAG
Assigned Lecturer: 83
Lecturer Expertise: ['SAG']
---
Defense: 3
Date: 2020-06-04, Time: 13:00:00
Field: EIM
Assigned Lecturer: 85
Lec

**Key findings**

Schedule quality:

- Final reward: 28.86 (the best reward was already 28.86 at iteration 100 and did not improve afterwards)
- Expertise matching ratio: 1.00
- No time conflicts: lecturers already booked for a slot are excluded from the valid actions

Workload distribution (across the 7 lecturers who received assignments):

- Mean: 2.86 defenses per lecturer
- Min: 1 defense
- Max: 5 defenses
- Median: 3 defenses

Sample assignments show good expertise matching:

- All lecturers are assigned to their areas of expertise
- Multiple time slots are handled correctly
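
The scheduler above sets `learning_rate` and `discount_factor`, but the code shown never reads them: each iteration is an independent epsilon-greedy pass and only the best total reward is kept. For comparison, the sketch below shows how those two parameters would enter a tabular Q-learning update. It is a generic illustration, not the notebook's method. Treating a defense id as the state and a lecturer id as the action is an assumption, and `q_table` and `q_update` are hypothetical names.

```python
from collections import defaultdict

LEARNING_RATE = 0.1
DISCOUNT_FACTOR = 0.9

# Q-table keyed by (state, action); unseen pairs default to 0.0
q_table = defaultdict(float)


def q_update(state, action, reward, next_state, next_actions):
    """One tabular Q-learning update: Q <- Q + lr * (r + gamma * max Q' - Q)."""
    best_next = max((q_table[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + DISCOUNT_FACTOR * best_next
    q_table[(state, action)] += LEARNING_RATE * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Illustrative transition: defense 0 assigned to lecturer 52 with reward 1.5
first = q_update(0, 52, 1.5, 1, [52, 83])
second = q_update(0, 52, 1.5, 1, [52, 83])
print(round(first, 3), round(second, 3))  # 0.15 0.285
```

Repeating the same transition moves the estimate toward the target a fraction `LEARNING_RATE` at a time, which is the step the random-restart loop does not perform.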

### improved RL implementation

In [None]:
class ImprovedThesisDefenseScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.env = ThesisDefenseEnvironment(schedule_df, lecturer_expertise)
        self.lecturer_expertise = lecturer_expertise  # lecturer_id -> list of expertise fields
        self.roles = ['examiner1', 'examiner2', 'supervisor1', 'supervisor2']
        self.schedule_df = schedule_df  # defenses to schedule, indexed by defense id

    def schedule_single_defense(self, defense_id, assigned_lecturers=None):
        """Schedule all roles for a single defense"""
        defense = self.schedule_df.loc[defense_id]
        assignments = assigned_lecturers or {}
        role_assignments = {}

        print(f"\nScheduling Defense {defense_id}:")
        print(f"Field: {defense['bidang']}")
        print(f"Date: {defense['date']}, Time: {defense['time']}")

        # Track current workload
        current_workload = {}
        for prev_assignments in assignments.values():
            for lecturer_id in prev_assignments.values():
                current_workload[lecturer_id] = current_workload.get(lecturer_id, 0) + 1

        for role in self.roles:
            valid_lecturers = [
                lid for lid, expertise in self.lecturer_expertise.items()
                if defense['bidang'] in expertise
                and lid not in (role_assignments.values())
                and current_workload.get(lid, 0) < 5  # Maximum 5 assignments per lecturer
            ]

            if valid_lecturers:
                # Select lecturer with least current workload
                workloads = {lid: current_workload.get(lid, 0) for lid in valid_lecturers}
                selected = min(workloads.items(), key=lambda x: x[1])[0]
                role_assignments[role] = selected
                current_workload[selected] = current_workload.get(selected, 0) + 1
                print(f"{role}: Lecturer {selected} (Expertise: {self.lecturer_expertise[selected]})")
            else:
                print(f"Warning: Could not find valid lecturer for {role}")

        return role_assignments

    def evaluate_schedule(self, schedule):
        """Evaluate the complete schedule"""
        workload = {}
        expertise_matches = 0
        total_assignments = 0

        for defense_id, assignments in schedule.items():
            defense = self.schedule_df.loc[defense_id]
            for role, lecturer_id in assignments.items():
                workload[lecturer_id] = workload.get(lecturer_id, 0) + 1
                if defense['bidang'] in self.lecturer_expertise[lecturer_id]:
                    expertise_matches += 1
                total_assignments += 1

        print("\nSchedule Evaluation:")
        print(f"Total Assignments: {total_assignments}")
        print(f"Expertise Match Ratio: {expertise_matches/total_assignments:.2f}")
        print("\nWorkload Distribution:")
        workload_series = pd.Series(workload)
        print(workload_series.describe())

        return workload_series

# Test the improved scheduler
print("Testing Improved Scheduler with Multiple Roles:")
improved_scheduler = ImprovedThesisDefenseScheduler(jadwal_df, lecturer_expertise)

# Schedule first 5 defenses with multiple roles
test_schedule = {}
for defense_id in jadwal_df.index[:5]:
    assignments = improved_scheduler.schedule_single_defense(defense_id, test_schedule)
    test_schedule[defense_id] = assignments

# Evaluate the test schedule
workload_stats = improved_scheduler.evaluate_schedule(test_schedule)

# Display time distribution
print("\nTime Distribution Analysis:")
for defense_id in test_schedule:
    defense = jadwal_df.loc[defense_id]
    print(f"\nDefense {defense_id}:")
    print(f"Date: {defense['date']}, Time: {defense['time']}")
    print("Assigned Lecturers:", list(test_schedule[defense_id].values()))

Testing Improved Scheduler with Multiple Roles:

Scheduling Defense 0:
Field: EDM
Date: 2020-05-16, Time: 10:00:00
examiner1: Lecturer 122 (Expertise: ['EISD', 'EDM', 'SAG'])
examiner2: Lecturer 41 (Expertise: ['EDM', 'ERP'])
supervisor1: Lecturer 149 (Expertise: ['EIM', 'EDM', 'ERP'])
supervisor2: Lecturer 129 (Expertise: ['EISD', 'EDM', 'EIM'])

Scheduling Defense 1:
Field: EDM
Date: 2020-05-29, Time: 09:00:00
examiner1: Lecturer 106 (Expertise: ['SAG', 'EDM'])
examiner2: Lecturer 40 (Expertise: ['ERP', 'EDM', 'EISD'])
supervisor1: Lecturer 35 (Expertise: ['SAG', 'EDM', 'ERP'])
supervisor2: Lecturer 50 (Expertise: ['EDM', 'ERP'])

Scheduling Defense 2:
Field: SAG
Date: 2020-06-04, Time: 13:00:00
examiner1: Lecturer 90 (Expertise: ['EISD', 'EIM', 'SAG'])
examiner2: Lecturer 96 (Expertise: ['SAG', 'ERP'])
supervisor1: Lecturer 88 (Expertise: ['SAG', 'EDM'])
supervisor2: Lecturer 71 (Expertise: ['SAG', 'EDM'])

Scheduling Defense 3:
Field: EIM
Date: 2020-06-04, Time: 13:00:00
examiner1:
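
The role assignments above follow from how the least-loaded lecturer is chosen: `min` returns the first candidate with the smallest workload, so when every workload is 0 the ties resolve in dictionary order. The sketch below reproduces the selection for Defense 0. The candidate ids are the sample of valid lecturers printed earlier, and `pick_least_loaded` is a hypothetical helper.

```python
def pick_least_loaded(candidates, workload):
    """Return the candidate with the smallest workload; ties go to the earliest candidate."""
    return min(candidates, key=lambda lid: workload.get(lid, 0))


workload = {}
chosen = []
for _ in range(4):  # four roles for one defense
    candidates = [lid for lid in [122, 41, 149, 129, 106] if lid not in chosen]
    selected = pick_least_loaded(candidates, workload)
    chosen.append(selected)
    workload[selected] = workload.get(selected, 0) + 1

print(chosen)  # [122, 41, 149, 129]
```

A different tie-breaking rule, such as preferring the lecturer with fewer expertise fields, would change which lecturers are picked first without changing the workload balance.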

### Final Scheduler Implementation

In [None]:
class FinalThesisDefenseScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df
        self.lecturer_expertise = lecturer_expertise
        self.roles = ['examiner1', 'examiner2', 'supervisor1', 'supervisor2']
        self.assignments = {}  # Track all assignments

    def check_time_conflict(self, lecturer_id, date, time, existing_assignments):
        """Check if lecturer has any time conflicts"""
        for d_id, assignments in existing_assignments.items():
            defense = self.schedule_df.loc[d_id]
            if defense['date'] == date and defense['time'] == time:
                if lecturer_id in assignments.values():
                    return True
        return False

    def get_lecturer_workload(self, lecturer_id, existing_assignments):
        """Get current workload for a lecturer"""
        return sum(1 for assignments in existing_assignments.values()
                  for assigned_id in assignments.values()
                  if assigned_id == lecturer_id)

    def schedule_defense(self, defense_id, existing_assignments=None):
        """Schedule a single defense with all roles"""
        defense = self.schedule_df.loc[defense_id]
        existing_assignments = existing_assignments or {}
        role_assignments = {}

        print(f"\nScheduling Defense {defense_id}:")
        print(f"Field: {defense['bidang']}")
        print(f"Date: {defense['date']}, Time: {defense['time']}")

        for role in self.roles:
            valid_lecturers = [
                lid for lid, expertise in self.lecturer_expertise.items()
                if (defense['bidang'] in expertise and  # Expertise match
                    lid not in role_assignments.values() and  # Not already assigned to this defense
                    not self.check_time_conflict(lid, defense['date'], defense['time'], existing_assignments) and  # No time conflict
                    self.get_lecturer_workload(lid, existing_assignments) < 5)  # Workload limit
            ]

            if valid_lecturers:
                # Select lecturer with minimum current workload
                workloads = {lid: self.get_lecturer_workload(lid, existing_assignments)
                           for lid in valid_lecturers}
                selected = min(workloads.items(), key=lambda x: x[1])[0]
                role_assignments[role] = selected

                print(f"{role}: Lecturer {selected}")
                print(f"  Expertise: {self.lecturer_expertise[selected]}")
                print(f"  Current workload: {workloads[selected]}")
            else:
                print(f"Warning: No valid lecturers for {role}")

        return role_assignments

    def verify_schedule(self, complete_schedule):
        """Verify the complete schedule for constraints"""
        print("\nVerifying Schedule Constraints:")

        # Violation counters
        expertise_violations = 0
        time_conflicts = 0

        for defense_id, assignments in complete_schedule.items():
            defense = self.schedule_df.loc[defense_id]

            # Check each assigned lecturer
            for role, lecturer_id in assignments.items():
                # Expertise check
                if defense['bidang'] not in self.lecturer_expertise[lecturer_id]:
                    expertise_violations += 1
                    print(f"Expertise mismatch: Defense {defense_id}, {role}")

                # Time conflict check: each clashing pair is visited from both defenses, so it is counted twice
                for other_id, other_assignments in complete_schedule.items():
                    if other_id != defense_id:
                        other_defense = self.schedule_df.loc[other_id]
                        if (other_defense['date'] == defense['date'] and
                            other_defense['time'] == defense['time'] and
                            lecturer_id in other_assignments.values()):
                            time_conflicts += 1
                            print(f"Time conflict: Lecturer {lecturer_id} in defenses {defense_id} and {other_id}")

        print(f"\nConstraint Violations:")
        print(f"Expertise mismatches: {expertise_violations}")
        print(f"Time conflicts: {time_conflicts}")

        return expertise_violations == 0 and time_conflicts == 0

# Test the final scheduler
print("Testing Final Scheduler Implementation:")
final_scheduler = FinalThesisDefenseScheduler(jadwal_df, lecturer_expertise)

# Schedule first 5 defenses
test_schedule = {}
for defense_id in jadwal_df.index[:5]:
    assignments = final_scheduler.schedule_defense(defense_id, test_schedule)
    test_schedule[defense_id] = assignments

# Verify the schedule
is_valid = final_scheduler.verify_schedule(test_schedule)
print(f"\nSchedule is valid: {is_valid}")

Testing Final Scheduler Implementation:

Scheduling Defense 0:
Field: EDM
Date: 2020-05-16, Time: 10:00:00
examiner1: Lecturer 122
  Expertise: ['EISD', 'EDM', 'SAG']
  Current workload: 0
examiner2: Lecturer 41
  Expertise: ['EDM', 'ERP']
  Current workload: 0
supervisor1: Lecturer 149
  Expertise: ['EIM', 'EDM', 'ERP']
  Current workload: 0
supervisor2: Lecturer 129
  Expertise: ['EISD', 'EDM', 'EIM']
  Current workload: 0

Scheduling Defense 1:
Field: EDM
Date: 2020-05-29, Time: 09:00:00
examiner1: Lecturer 106
  Expertise: ['SAG', 'EDM']
  Current workload: 0
examiner2: Lecturer 40
  Expertise: ['ERP', 'EDM', 'EISD']
  Current workload: 0
supervisor1: Lecturer 35
  Expertise: ['SAG', 'EDM', 'ERP']
  Current workload: 0
supervisor2: Lecturer 50
  Expertise: ['EDM', 'ERP']
  Current workload: 0

Scheduling Defense 2:
Field: SAG
Date: 2020-06-04, Time: 13:00:00
examiner1: Lecturer 90
  Expertise: ['EISD', 'EIM', 'SAG']
  Current workload: 0
examiner2: Lecturer 96
  Expertise: ['SAG', 
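
In `verify_schedule`, a lecturer booked into two defenses at the same time is found once from each defense, so the conflict counter increases by two per clash. A sketch of a pairwise check that reports each clash once is given below. The function name `find_time_conflicts`, the `slots` mapping, and the lecturer assignments in the example are all hypothetical. Only the dates and times of defenses 2, 3, and 4 come from the dataset.

```python
from itertools import combinations


def find_time_conflicts(schedule, slots):
    """Return (lecturer_id, defense_a, defense_b) for each lecturer shared by two same-slot defenses.

    schedule: {defense_id: {role: lecturer_id}}
    slots:    {defense_id: (date, time)}
    """
    conflicts = []
    for a, b in combinations(schedule, 2):  # every unordered pair of defenses, once
        if slots[a] == slots[b]:
            shared = set(schedule[a].values()) & set(schedule[b].values())
            conflicts.extend((lid, a, b) for lid in sorted(shared))
    return conflicts


schedule = {
    2: {'examiner1': 90, 'examiner2': 96},
    3: {'examiner1': 90, 'examiner2': 85},  # lecturer 90 is double booked (illustrative)
    4: {'examiner1': 96},
}
slots = {
    2: ('2020-06-04', '13:00:00'),
    3: ('2020-06-04', '13:00:00'),
    4: ('2020-06-18', '13:00:00'),
}
conflicts = find_time_conflicts(schedule, slots)
print(conflicts)  # [(90, 2, 3)]
```

Lecturer 96 appears in defenses 2 and 4, but those are on different dates, so no conflict is reported for that pair.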

### Defense schedule optimization

In [None]:
class OptimizedThesisScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df
        self.lecturer_expertise = lecturer_expertise
        self.roles = ['examiner1', 'examiner2', 'supervisor1', 'supervisor2']
        self.lecturer_workload = {lid: 0 for lid in lecturer_expertise.keys()}

    def update_workload(self, assignments):
        """Update workload tracking"""
        for role, lecturer_id in assignments.items():
            self.lecturer_workload[lecturer_id] += 1

    def get_workload_stats(self):
        """Get current workload statistics"""
        workload_series = pd.Series(self.lecturer_workload)
        return {
            'min': workload_series.min(),
            'max': workload_series.max(),
            'mean': workload_series.mean(),
            'std': workload_series.std()
        }

    def schedule_defense(self, defense_id, existing_schedule=None):
        """Schedule a single defense with workload optimization"""
        defense = self.schedule_df.loc[defense_id]
        role_assignments = {}

        print(f"\nScheduling Defense {defense_id}:")
        print(f"Field: {defense['bidang']}")
        print(f"Date: {defense['date']}, Time: {defense['time']}")

        # Get current workload stats
        stats = self.get_workload_stats()
        print(f"Current Workload Stats: Min={stats['min']}, Max={stats['max']}, Mean={stats['mean']:.2f}")

        for role in self.roles:
            valid_lecturers = [
                lid for lid, expertise in self.lecturer_expertise.items()
                if (defense['bidang'] in expertise and
                    lid not in role_assignments.values() and
                    self.lecturer_workload[lid] < 5)
            ]

            if valid_lecturers:
                # Score candidates based on expertise and workload
                scores = {}
                for lid in valid_lecturers:
                    expertise_score = 1.0 / len(self.lecturer_expertise[lid])
                    workload_score = 1.0 / (self.lecturer_workload[lid] + 1)
                    scores[lid] = expertise_score + workload_score

                selected = max(scores.items(), key=lambda x: x[1])[0]
                role_assignments[role] = selected

                print(f"{role}: Lecturer {selected}")
                print(f"  Expertise: {self.lecturer_expertise[selected]}")
                print(f"  Current workload: {self.lecturer_workload[selected]}")

        self.update_workload(role_assignments)
        return role_assignments

    def print_schedule_summary(self, complete_schedule):
        """Print comprehensive schedule summary"""
        print("\nSchedule Summary:")
        print("\nWorkload Distribution:")
        stats = self.get_workload_stats()
        print(f"Minimum assignments: {stats['min']}")
        print(f"Maximum assignments: {stats['max']}")
        print(f"Mean assignments: {stats['mean']:.2f}")
        print(f"Standard deviation: {stats['std']:.2f}")

        print("\nTop 5 Busiest Lecturers:")
        workload_series = pd.Series(self.lecturer_workload)
        print(workload_series.nlargest(5))

# Test the optimized scheduler
print("Testing Optimized Scheduler:")
optimized_scheduler = OptimizedThesisScheduler(jadwal_df, lecturer_expertise)

# Schedule first 5 defenses
test_schedule = {}
for defense_id in jadwal_df.index[:5]:
    assignments = optimized_scheduler.schedule_defense(defense_id)
    test_schedule[defense_id] = assignments

# Print final summary
optimized_scheduler.print_schedule_summary(test_schedule)

Testing Optimized Scheduler:

Scheduling Defense 0:
Field: EDM
Date: 2020-05-16, Time: 10:00:00
Current Workload Stats: Min=0, Max=0, Mean=0.00
examiner1: Lecturer 52
  Expertise: ['EDM']
  Current workload: 0
examiner2: Lecturer 41
  Expertise: ['EDM', 'ERP']
  Current workload: 0
supervisor1: Lecturer 106
  Expertise: ['SAG', 'EDM']
  Current workload: 0
supervisor2: Lecturer 50
  Expertise: ['EDM', 'ERP']
  Current workload: 0

Scheduling Defense 1:
Field: EDM
Date: 2020-05-29, Time: 09:00:00
Current Workload Stats: Min=0, Max=1, Mean=0.11
examiner1: Lecturer 88
  Expertise: ['SAG', 'EDM']
  Current workload: 0
examiner2: Lecturer 71
  Expertise: ['SAG', 'EDM']
  Current workload: 0
supervisor1: Lecturer 67
  Expertise: ['EIM', 'EDM']
  Current workload: 0
supervisor2: Lecturer 52
  Expertise: ['EDM']
  Current workload: 1

Scheduling Defense 2:
Field: SAG
Date: 2020-06-04, Time: 13:00:00
Current Workload Stats: Min=0, Max=2, Mean=0.23
examiner1: Lecturer 83
  Expertise: ['SAG']
  C
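
The selection order in the output above follows from the score in `OptimizedThesisScheduler.schedule_defense`. The sketch below reproduces that arithmetic in isolation; `candidate_score` is a hypothetical helper name, and the expertise lists are copied from the printed output.

```python
def candidate_score(expertise, workload):
    """Mirror of the scoring inside schedule_defense (hypothetical helper)."""
    expertise_score = 1.0 / len(expertise)  # fewer fields -> higher score
    workload_score = 1.0 / (workload + 1)   # fewer assignments -> higher score
    return expertise_score + workload_score

print(candidate_score(['EDM'], 0))         # 2.0: single-field lecturer, no assignments
print(candidate_score(['EDM', 'ERP'], 0))  # 1.5: two fields, no assignments
print(candidate_score(['EDM'], 1))         # 1.5: single-field lecturer after one assignment
```

A single-field lecturer with one assignment ties with an idle two-field lecturer. Since `max` returns the first maximal item it encounters, such ties are resolved by the iteration order of `lecturer_expertise`.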

### Remodelling after optimization

In [None]:
class FinalThesisScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df
        self.lecturer_expertise = lecturer_expertise
        self.roles = ['examiner1', 'examiner2', 'supervisor1', 'supervisor2']
        self.lecturer_workload = {lid: 0 for lid in lecturer_expertise.keys()}
        self.target_workload = (len(schedule_df) * 4) / len(lecturer_expertise)

    def calculate_assignment_score(self, lecturer_id, field, current_workload):
        """Calculate score for potential assignment"""
        base_score = 0

        # Expertise matching (0-3 points)
        if field in self.lecturer_expertise[lecturer_id]:
            expertise_score = 3.0 / len(self.lecturer_expertise[lecturer_id])
            base_score += expertise_score

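        # Workload balancing (0-2 points): highest when the workload is closest to the target.
        # Note: below the target the score rises with each assignment, so busy lecturers
        # can outrank idle ones.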
        workload_diff = abs(current_workload - self.target_workload)
        workload_score = 2.0 * (1.0 / (workload_diff + 1))
        base_score += workload_score

        return base_score

    def schedule_defense(self, defense_id, print_details=True):
        """Schedule a single defense with improved workload balancing"""
        defense = self.schedule_df.loc[defense_id]
        role_assignments = {}

        if print_details:
            print(f"\nScheduling Defense {defense_id}:")
            print(f"Field: {defense['bidang']}")
            print(f"Target workload per lecturer: {self.target_workload:.2f}")

        for role in self.roles:
            valid_lecturers = [
                lid for lid, expertise in self.lecturer_expertise.items()
                if defense['bidang'] in expertise
                and lid not in role_assignments.values()
                and self.lecturer_workload[lid] < max(5, self.target_workload * 1.5)
            ]

            if valid_lecturers:
                # Score candidates
                scores = {
                    lid: self.calculate_assignment_score(
                        lid,
                        defense['bidang'],
                        self.lecturer_workload[lid]
                    ) for lid in valid_lecturers
                }

                selected = max(scores.items(), key=lambda x: x[1])[0]
                role_assignments[role] = selected
                self.lecturer_workload[selected] += 1

                if print_details:
                    print(f"\n{role}: Lecturer {selected}")
                    print(f"  Expertise: {self.lecturer_expertise[selected]}")
                    print(f"  New workload: {self.lecturer_workload[selected]}")
                    print(f"  Assignment score: {scores[selected]:.2f}")

        return role_assignments

    def analyze_schedule(self, complete_schedule):
        """Analyze schedule quality"""
        print("\nSchedule Analysis:")

        # Workload statistics
        workload_series = pd.Series(self.lecturer_workload)
        print("\nWorkload Distribution:")
        print(workload_series.describe())

        # Expertise matching
        expertise_matches = 0
        total_assignments = 0
        for defense_id, assignments in complete_schedule.items():
            defense = self.schedule_df.loc[defense_id]
            for role, lecturer_id in assignments.items():
                total_assignments += 1
                if defense['bidang'] in self.lecturer_expertise[lecturer_id]:
                    expertise_matches += 1

        print(f"\nExpertise Matching: {expertise_matches}/{total_assignments} ({expertise_matches/total_assignments*100:.1f}%)")

        # Workload balance score
        workload_std = workload_series.std()
        balance_score = 1.0 / (1.0 + workload_std)
        print(f"Workload Balance Score: {balance_score:.3f} (higher is better)")

# Test the final scheduler
print("Testing Final Scheduler Implementation:")
final_scheduler = FinalThesisScheduler(jadwal_df, lecturer_expertise)

# Schedule all defenses
complete_schedule = {}
for defense_id in jadwal_df.index:
    assignments = final_scheduler.schedule_defense(defense_id)
    complete_schedule[defense_id] = assignments

# Analyze final schedule
final_scheduler.analyze_schedule(complete_schedule)

Testing Final Scheduler Implementation:

Scheduling Defense 0:
Field: EDM
Target workload per lecturer: 2.29

examiner1: Lecturer 52
  Expertise: ['EDM']
  New workload: 1
  Assignment score: 3.61

examiner2: Lecturer 41
  Expertise: ['EDM', 'ERP']
  New workload: 1
  Assignment score: 2.11

supervisor1: Lecturer 106
  Expertise: ['SAG', 'EDM']
  New workload: 1
  Assignment score: 2.11

supervisor2: Lecturer 50
  Expertise: ['EDM', 'ERP']
  New workload: 1
  Assignment score: 2.11

Scheduling Defense 1:
Field: EDM
Target workload per lecturer: 2.29

examiner1: Lecturer 52
  Expertise: ['EDM']
  New workload: 2
  Assignment score: 3.88

examiner2: Lecturer 41
  Expertise: ['EDM', 'ERP']
  New workload: 2
  Assignment score: 2.38

supervisor1: Lecturer 106
  Expertise: ['SAG', 'EDM']
  New workload: 2
  Assignment score: 2.38

supervisor2: Lecturer 50
  Expertise: ['EDM', 'ERP']
  New workload: 2
  Assignment score: 2.38

Scheduling Defense 2:
Field: SAG
Target workload per lecturer: 2.
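
The assignment scores printed above can be checked by hand. The sketch below repeats the formula from `calculate_assignment_score` as a standalone function (`assignment_score` is a hypothetical name). The lecturer count of 35 is an assumption inferred from the printed target of 2.29 with 20 defenses and four roles each.

```python
def assignment_score(n_fields, workload, target):
    """Same arithmetic as calculate_assignment_score for a lecturer who covers the field."""
    expertise_score = 3.0 / n_fields
    workload_score = 2.0 * (1.0 / (abs(workload - target) + 1))
    return expertise_score + workload_score

# Assumption: 35 lecturers, inferred from the printed target (20 * 4 / 35 is about 2.29).
target = (20 * 4) / 35

# Single-field lecturer (e.g. ['EDM']) at increasing workloads
for workload in range(4):
    print(workload, assignment_score(1, workload, target))
```

For a single-field lecturer the score is about 3.61 with no assignments and about 3.875 after one, which agrees with the values printed for Lecturer 52. Because the score keeps rising until the workload reaches the target, the same panel is selected again for Defense 1.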

### Final schedule

In [None]:
class ThesisPanelScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.final_schedule = pd.DataFrame(columns=[
            'tanggal', 'waktu', 'ruang', 'mahasiswa_id', 'judul', 'bidang',
            'penguji1_id', 'penguji2_id', 'pembimbing1_id', 'pembimbing2_id'
        ])

    def create_final_schedule(self):
        """Create schedule in the desired output format"""
        schedule_data = []

        for idx, defense in self.schedule_df.iterrows():
            assignments = self.get_optimal_panel(defense['bidang'])

            schedule_data.append({
                'tanggal': defense['date'],
                'waktu': defense['time'],
                'ruang': defense['ruang'],
                'mahasiswa_id': defense['mahasiswa_id'],
                'judul': defense['judul'],
                'bidang': defense['bidang'],
                'penguji1_id': assignments['examiner1'],
                'penguji2_id': assignments['examiner2'],
                'pembimbing1_id': assignments['supervisor1'],
                'pembimbing2_id': assignments['supervisor2']
            })

        self.final_schedule = pd.DataFrame(schedule_data)
        return self.final_schedule

    def get_optimal_panel(self, field):
        """Pick the first four lecturers whose expertise includes the field.

        Workload and time conflicts are not considered here.
        """
        available_lecturers = [
            lid for lid, expertise in self.lecturer_expertise.items()
            if field in expertise
        ]

        # Placeholder assignment: fill roles in dictionary order, None if too few qualify
        panel = {
            'examiner1': available_lecturers[0] if len(available_lecturers) > 0 else None,
            'examiner2': available_lecturers[1] if len(available_lecturers) > 1 else None,
            'supervisor1': available_lecturers[2] if len(available_lecturers) > 2 else None,
            'supervisor2': available_lecturers[3] if len(available_lecturers) > 3 else None
        }

        return panel

# Create and run the final scheduler
final_scheduler = ThesisPanelScheduler(jadwal_df, lecturer_expertise)
final_schedule = final_scheduler.create_final_schedule()

# Display the schedule in the desired format
print("\nFinal Thesis Defense Schedule:")
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
print(final_schedule)

# Save to Excel if needed
# final_schedule.to_excel('final_thesis_schedule.xlsx', index=False)


Final Thesis Defense Schedule:
       tanggal     waktu                                 ruang  mahasiswa_id  \
0   2020-05-16  10:00:00                       meet.google.com    1102134314   
1   2020-05-29  09:00:00                       meet.google.com    1201154091   
2   2020-06-04  13:00:00                       meet.google.com    1201144128   
3   2020-06-04  13:00:00                       meet.google.com    1201154384   
4   2020-06-18  13:00:00  https://meet.google.com/kup-jjhf-vbd    1202152159   
5   2020-06-08  13:00:00                       meet.google.com    1201164234   
6   2020-06-08  13:00:00                      meet1.google.com    1201164390   
7   2020-06-22  08:00:00                       meet.google.com    1201164397   
8   2020-06-15  09:00:00          meet.google.com/rnn-pfni-wqf    1202160057   
9   2020-06-15  09:00:00          meet.google.com/rnn-pfni-wqf    1202160120   
10  2020-06-15  09:00:00          meet.google.com/rnn-pfni-wqf    1202160061   
11  2020

#### With max workload 4

In [None]:
class ImprovedThesisPanelScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.lecturer_workload = {lid: {'total': 0, 'examiner': 0, 'supervisor': 0}
                                for lid in lecturer_expertise.keys()}
        self.MAX_WORKLOAD = 4  # Maximum assignments per lecturer

    def has_time_conflict(self, lecturer_id, date, time, current_assignments):
        """Check if lecturer has any assignments at the same time"""
        for data in current_assignments:
            # .get() tolerates earlier defenses whose panel could not be fully filled
            panel = [data.get('penguji1_id'), data.get('penguji2_id'),
                     data.get('pembimbing1_id'), data.get('pembimbing2_id')]
            if data['date'] == date and data['time'] == time and lecturer_id in panel:
                return True
        return False

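    def calculate_lecturer_score(self, lecturer_id, field, role_type):
        """Score a candidate (higher is better).

        Specialists score higher; total workload and examiner/supervisor
        imbalance are subtracted. `field` and `role_type` are currently unused.
        """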
        workload_penalty = self.lecturer_workload[lecturer_id]['total'] * 0.5
        expertise_score = 3.0 / len(self.lecturer_expertise[lecturer_id])
        role_balance = abs(self.lecturer_workload[lecturer_id]['examiner'] -
                         self.lecturer_workload[lecturer_id]['supervisor'])

        return expertise_score - workload_penalty - role_balance

    def get_available_lecturers(self, field, role_type, date, time, excluded_lecturers, current_assignments):
        """Get available lecturers for a specific role"""
        available = []
        for lid, expertise in self.lecturer_expertise.items():
            if (field in expertise and
                lid not in excluded_lecturers and
                self.lecturer_workload[lid]['total'] < self.MAX_WORKLOAD and
                not self.has_time_conflict(lid, date, time, current_assignments)):

                score = self.calculate_lecturer_score(lid, field, role_type)
                available.append((lid, score))

        return sorted(available, key=lambda x: x[1], reverse=True)

    def create_balanced_schedule(self):
        """Create schedule with balanced assignments"""
        schedule_data = []

        for idx, defense in self.schedule_df.iterrows():
            print(f"\nScheduling defense {idx}")
            print(f"Field: {defense['bidang']}")

            assigned_lecturers = []
            current_defense = {}

            # Copy basic defense information
            current_defense['date'] = defense['date']
            current_defense['time'] = defense['time']
            current_defense['ruang'] = defense['ruang']
            current_defense['mahasiswa_id'] = defense['mahasiswa_id']
            current_defense['judul'] = defense['judul']
            current_defense['bidang'] = defense['bidang']

            # Assign examiners
            for role in ['penguji1', 'penguji2']:
                available = self.get_available_lecturers(
                    defense['bidang'], 'examiner',
                    defense['date'], defense['time'],
                    assigned_lecturers,
                    schedule_data
                )

                if available:
                    selected_id = available[0][0]
                    current_defense[f'{role}_id'] = selected_id
                    assigned_lecturers.append(selected_id)
                    self.lecturer_workload[selected_id]['total'] += 1
                    self.lecturer_workload[selected_id]['examiner'] += 1
                    print(f"{role}: Lecturer {selected_id} (workload: {self.lecturer_workload[selected_id]['total']})")

            # Assign supervisors
            for role in ['pembimbing1', 'pembimbing2']:
                available = self.get_available_lecturers(
                    defense['bidang'], 'supervisor',
                    defense['date'], defense['time'],
                    assigned_lecturers,
                    schedule_data
                )

                if available:
                    selected_id = available[0][0]
                    current_defense[f'{role}_id'] = selected_id
                    assigned_lecturers.append(selected_id)
                    self.lecturer_workload[selected_id]['total'] += 1
                    self.lecturer_workload[selected_id]['supervisor'] += 1
                    print(f"{role}: Lecturer {selected_id} (workload: {self.lecturer_workload[selected_id]['total']})")

            schedule_data.append(current_defense)

        # Convert to DataFrame with desired column order
        columns = ['date', 'time', 'ruang', 'mahasiswa_id', 'judul', 'bidang',
                  'penguji1_id', 'penguji2_id', 'pembimbing1_id', 'pembimbing2_id']

        # reindex keeps the column order and tolerates roles that were never filled
        final_schedule = pd.DataFrame(schedule_data).reindex(columns=columns)

        # Rename columns to match desired output
        final_schedule = final_schedule.rename(columns={
            'date': 'tanggal',
            'time': 'waktu'
        })

        return final_schedule

# Create and run the improved scheduler
improved_scheduler = ImprovedThesisPanelScheduler(jadwal_df, lecturer_expertise)
improved_schedule = improved_scheduler.create_balanced_schedule()

# Display schedule and workload statistics
print("\nFinal Improved Schedule:")
print(improved_schedule)

print("\nWorkload Statistics:")
workload_df = pd.DataFrame.from_dict(improved_scheduler.lecturer_workload, orient='index')
print(workload_df.describe())


Scheduling defense 0
Field: EDM
penguji1: Lecturer 52 (workload: 1)
penguji2: Lecturer 41 (workload: 1)
pembimbing1: Lecturer 106 (workload: 1)
pembimbing2: Lecturer 50 (workload: 1)

Scheduling defense 1
Field: EDM
penguji1: Lecturer 88 (workload: 1)
penguji2: Lecturer 71 (workload: 1)
pembimbing1: Lecturer 67 (workload: 1)
pembimbing2: Lecturer 52 (workload: 2)

Scheduling defense 2
Field: SAG
penguji1: Lecturer 83 (workload: 1)
penguji2: Lecturer 96 (workload: 1)
pembimbing1: Lecturer 37 (workload: 1)
pembimbing2: Lecturer 17 (workload: 1)

Scheduling defense 3
Field: EIM
penguji1: Lecturer 85 (workload: 1)
penguji2: Lecturer 51 (workload: 1)
pembimbing1: Lecturer 57 (workload: 1)
pembimbing2: Lecturer 149 (workload: 1)

Scheduling defense 4
Field: EISD
penguji1: Lecturer 128 (workload: 1)
penguji2: Lecturer 32 (workload: 1)
pembimbing1: Lecturer 69 (workload: 1)
pembimbing2: Lecturer 116 (workload: 1)

Scheduling defense 5
Field: SAG
penguji1: Lecturer 83 (workload: 2)
penguji2: L

Workload Distribution Improvements:

- Mean: 2.29 assignments per lecturer
- Max: 4 assignments (down from 5 in the previous run)
- Min: 1 assignment, so every lecturer is used at least once
- Examiner and supervisor roles are split evenly (mean 1.14 each)

Key Improvements:

- More diverse lecturer combinations
- Workload is spread more evenly
- No lecturer exceeds 4 total assignments
- Expertise matching is maintained

Time Management:

- The June 15 sessions still hold multiple defenses, but with different lecturers
- No lecturer has a time conflict

Statistical Highlights:

- Median of 2 assignments per lecturer
- Standard deviation of 0.83 (fairly uniform)
- Equal mean number of examiner and supervisor assignments
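
The figures above come from `describe()` on the workload table. As a rough illustration of how such a summary and the earlier balance score (`1 / (1 + std)`) are derived, here is a minimal standard-library sketch. The function name `summarize_workload` and the example totals are hypothetical and are not the notebook's data.

```python
import statistics

def summarize_workload(totals):
    """Summarize total assignments per lecturer (hypothetical helper)."""
    values = list(totals.values())
    # Sample standard deviation (ddof=1), the same convention pandas uses by default
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return {
        'mean': statistics.mean(values),
        'median': statistics.median(values),
        'min': min(values),
        'max': max(values),
        'std': std,
        'balance_score': 1.0 / (1.0 + std),  # higher means more uniform
    }

# Made-up totals for five lecturers
example = {'A': 1, 'B': 2, 'C': 2, 'D': 3, 'E': 4}
print(summarize_workload(example))
```

A lower standard deviation pushes the balance score toward 1, which is the sense in which a spread of 0.83 is read as fairly uniform.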




#### With max workload 3 and min time gap of 2 hours

Here we try scheduling only one examiner first.
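
The two-hour minimum gap is meant to apply between one lecturer's own sessions. A minimal sketch of that check follows, assuming each lecturer's earlier sessions are kept as a list of `datetime` values. `respects_min_gap` and `booked_slots` are hypothetical names; the class below tracks only per-day counts.

```python
from datetime import datetime, timedelta

MIN_TIME_GAP = timedelta(hours=2)

def respects_min_gap(new_slot, booked_slots, min_gap=MIN_TIME_GAP):
    """Return True if new_slot is at least min_gap away from every booked slot."""
    return all(abs(new_slot - booked) >= min_gap for booked in booked_slots)

# One existing session at 09:00 on 2020-06-15
booked = [datetime(2020, 6, 15, 9, 0)]
print(respects_min_gap(datetime(2020, 6, 15, 10, 0), booked))  # False: only 1 hour apart
print(respects_min_gap(datetime(2020, 6, 15, 11, 0), booked))  # True: exactly 2 hours apart
print(respects_min_gap(datetime(2020, 6, 15, 9, 0), []))       # True: nothing booked yet
```

The comparison has to use the candidate's sessions from other defenses. Comparing against the slot of the defense currently being filled always gives a gap of zero.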

In [None]:
class OptimizedThesisPanelScheduler:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.lecturer_workload = {lid: {'total': 0, 'examiner': 0, 'supervisor': 0,
                                      'daily_assignments': {}}
                                for lid in lecturer_expertise.keys()}
        self.MAX_WORKLOAD = 3  # Reduced from 4 to 3
        self.MIN_TIME_GAP = pd.Timedelta(hours=2)

    def get_datetime(self, date, time):
        """Convert date and time to datetime object"""
        return pd.to_datetime(f"{date} {time}")

    def check_time_constraints(self, lecturer_id, date, time, current_assignments):
        """Check time constraints including daily workload and gaps"""
        dt = self.get_datetime(date, time)
        daily_count = self.lecturer_workload[lecturer_id]['daily_assignments'].get(date, 0)

        if daily_count >= 2:  # Maximum 2 sessions per day
            return False

        # Check time gaps with existing assignments
        for d, t in current_assignments:
            if d == date:
                existing_dt = self.get_datetime(d, t)
                time_diff = abs(dt - existing_dt)
                if time_diff < self.MIN_TIME_GAP:
                    return False

        return True

    def calculate_assignment_score(self, lecturer_id, field, role_type, date):
        """Calculate comprehensive assignment score"""
        # Expertise specificity (up to 3 points; callers only pass lecturers who cover the field)
        expertise_score = 3.0 / len(self.lecturer_expertise[lecturer_id])

        # Workload penalty: 0.5 per existing assignment
        workload_penalty = self.lecturer_workload[lecturer_id]['total'] * 0.5

        # Daily penalty: 1.0 per assignment already held on this date
        daily_penalty = self.lecturer_workload[lecturer_id]['daily_assignments'].get(date, 0) * 1.0

        # Role imbalance penalty: 0.5 per unit of examiner/supervisor difference
        role_imbalance = abs(self.lecturer_workload[lecturer_id]['examiner'] -
                           self.lecturer_workload[lecturer_id]['supervisor']) * 0.5

        return expertise_score - workload_penalty - daily_penalty - role_imbalance

    def get_optimal_panel(self, defense):
        """Get optimal panel assignment considering all constraints"""
        date, time = defense['date'], defense['time']
        field = defense['bidang']
        assigned = []
        panel = {}

        for role in ['penguji1', 'penguji2', 'pembimbing1', 'pembimbing2']:
            role_type = 'examiner' if 'penguji' in role else 'supervisor'

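            # Get available lecturers.
            # NOTE: the slots passed to check_time_constraints below belong to panel
            # members already chosen for this same defense, so after the first
            # assignment every candidate fails the gap check and only one role is filled.
            available = []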
            for lid, expertise in self.lecturer_expertise.items():
                if (field in expertise and
                    lid not in assigned and
                    self.lecturer_workload[lid]['total'] < self.MAX_WORKLOAD and
                    self.check_time_constraints(lid, date, time,
                                             [(date, time) for _, date, time in assigned])):

                    score = self.calculate_assignment_score(lid, field, role_type, date)
                    available.append((lid, score))

            if available:
                # Select lecturer with highest score
                selected_id, _ = max(available, key=lambda x: x[1])
                panel[f'{role}_id'] = selected_id
                assigned.append((selected_id, date, time))

                # Update workload
                self.lecturer_workload[selected_id]['total'] += 1
                self.lecturer_workload[selected_id][role_type] += 1
                if date not in self.lecturer_workload[selected_id]['daily_assignments']:
                    self.lecturer_workload[selected_id]['daily_assignments'][date] = 0
                self.lecturer_workload[selected_id]['daily_assignments'][date] += 1

            else:
                print(f"Warning: No available lecturer for {role} in defense {defense.name}")

        return panel

    def create_optimized_schedule(self):
        """Create fully optimized schedule"""
        schedule_data = []

        # Sort defenses by date and time
        sorted_defenses = self.schedule_df.sort_values(['date', 'time'])

        for idx, defense in sorted_defenses.iterrows():
            print(f"\nScheduling defense {idx}")
            print(f"Date: {defense['date']}, Time: {defense['time']}")
            print(f"Field: {defense['bidang']}")

            panel = self.get_optimal_panel(defense)

            schedule_data.append({
                'tanggal': defense['date'],
                'waktu': defense['time'],
                'ruang': defense['ruang'],
                'mahasiswa_id': defense['mahasiswa_id'],
                'judul': defense['judul'],
                'bidang': defense['bidang'],
                **panel
            })

            # Print assignment details
            for role, lid in panel.items():
                print(f"{role}: Lecturer {lid} "
                      f"(total: {self.lecturer_workload[lid]['total']})")

        return pd.DataFrame(schedule_data)

# Create and run optimized scheduler
print("Creating Optimized Schedule...")
optimized_scheduler = OptimizedThesisPanelScheduler(jadwal_df, lecturer_expertise)
final_schedule = optimized_scheduler.create_optimized_schedule()

# Display results
print("\nFinal Schedule:")
print(final_schedule)

print("\nWorkload Statistics:")
workload_df = pd.DataFrame.from_dict(optimized_scheduler.lecturer_workload,
                                    orient='index')
print(workload_df[['total', 'examiner', 'supervisor']].describe())

# Optional: Save to Excel
final_schedule.to_excel('optimized_thesis_schedule.xlsx', index=False)

Creating Optimized Schedule...

Scheduling defense 0
Date: 2020-05-16, Time: 10:00:00
Field: EDM
penguji1_id: Lecturer 52 (total: 1)

Scheduling defense 1
Date: 2020-05-29, Time: 09:00:00
Field: EDM
penguji1_id: Lecturer 52 (total: 2)

Scheduling defense 2
Date: 2020-06-04, Time: 13:00:00
Field: SAG
penguji1_id: Lecturer 83 (total: 1)

Scheduling defense 3
Date: 2020-06-04, Time: 13:00:00
Field: EIM
penguji1_id: Lecturer 85 (total: 1)

Scheduling defense 5
Date: 2020-06-08, Time: 13:00:00
Field: SAG
penguji1_id: Lecturer 83 (total: 2)

Scheduling defense 6
Date: 2020-06-08, Time: 13:00:00
Field: EISD
penguji1_id: Lecturer 128 (total: 1)

Scheduling defense 8
Date: 2020-06-15, Time: 09:00:00
Field: EDM
penguji1_id: Lecturer 41 (total: 1)

Scheduling defense 9
Date: 2020-06-15, Time: 09:00:00
Field: SAG
penguji1_id: Lecturer 106 (total: 1)

Scheduling defense 10
Date: 2020-06-15, Time: 09:00:00
Field: ERP
penguji1_id: Lecturer 96 (total: 1)

Scheduling defense 12
Date: 2020-06-15, Time: 

### FINAL with max workload 3 and min time gap of 2 hours

In [None]:
class ThesisPanelSchedulerFinal:
    def __init__(self, schedule_df, lecturer_expertise):
        self.schedule_df = schedule_df.copy()
        self.lecturer_expertise = lecturer_expertise
        self.lecturer_workload = {lid: {'total': 0, 'examiner': 0, 'supervisor': 0,
                                      'daily_assignments': {}}
                                for lid in lecturer_expertise.keys()}
        self.MAX_WORKLOAD = 3
        self.MIN_TIME_GAP = pd.Timedelta(hours=2)

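    def is_lecturer_available(self, lecturer_id, date, time, assigned_lecturers):
        """Check if lecturer is available for assignment.

        Only panel membership, the daily cap, and the total cap are checked;
        `time` and MIN_TIME_GAP are not used here.
        """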
        # Already assigned to this session
        if lecturer_id in assigned_lecturers:
            return False

        # Check daily workload
        daily_count = self.lecturer_workload[lecturer_id]['daily_assignments'].get(date, 0)
        if daily_count >= 2:
            return False

        # Check total workload
        if self.lecturer_workload[lecturer_id]['total'] >= self.MAX_WORKLOAD:
            return False

        return True

    def assign_panel(self, defense):
        """Assign all panel members for a defense"""
        date, time = defense['date'], defense['time']
        field = defense['bidang']
        assigned_lecturers = []
        panel = {}

        # First, find qualified lecturers for this field
        qualified_lecturers = [(lid, expertise) for lid, expertise in self.lecturer_expertise.items()
                             if field in expertise]

        # Sort by current workload (ascending)
        qualified_lecturers.sort(key=lambda x: self.lecturer_workload[x[0]]['total'])

        roles = ['penguji1_id', 'penguji2_id', 'pembimbing1_id', 'pembimbing2_id']
        for role in roles:
            available_lecturers = [
                lid for lid, _ in qualified_lecturers
                if self.is_lecturer_available(lid, date, time, assigned_lecturers)
            ]

            if available_lecturers:
                # Select lecturer with lowest workload
                selected_id = available_lecturers[0]
                panel[role] = selected_id
                assigned_lecturers.append(selected_id)

                # Update workload
                self.lecturer_workload[selected_id]['total'] += 1
                if 'penguji' in role:
                    self.lecturer_workload[selected_id]['examiner'] += 1
                else:
                    self.lecturer_workload[selected_id]['supervisor'] += 1

                if date not in self.lecturer_workload[selected_id]['daily_assignments']:
                    self.lecturer_workload[selected_id]['daily_assignments'][date] = 0
                self.lecturer_workload[selected_id]['daily_assignments'][date] += 1
            else:
                print(f"Warning: No available lecturer for {role} in defense {defense.name}")
                panel[role] = None

        return panel

    def create_schedule(self):
        """Create complete schedule"""
        schedule_data = []

        # Sort defenses by date and time
        sorted_defenses = self.schedule_df.sort_values(['date', 'time'])

        for idx, defense in sorted_defenses.iterrows():
            print(f"\nScheduling defense {idx}")
            print(f"Date: {defense['date']}, Time: {defense['time']}")
            print(f"Field: {defense['bidang']}")

            panel = self.assign_panel(defense)

            schedule_data.append({
                'tanggal': defense['date'],
                'waktu': defense['time'],
                'ruang': defense['ruang'],
                'mahasiswa_id': defense['mahasiswa_id'],
                'judul': defense['judul'],
                'bidang': defense['bidang'],
                **panel
            })

            # Print assignment details
            for role, lid in panel.items():
                if lid is not None:  # an ID of 0 would be falsy, so test for None explicitly
                    print(f"{role}: Lecturer {lid} "
                          f"(total: {self.lecturer_workload[lid]['total']})")

        return pd.DataFrame(schedule_data)

# Create and run the final scheduler
print("Creating Final Schedule...")
final_scheduler = ThesisPanelSchedulerFinal(jadwal_df, lecturer_expertise)
final_schedule = final_scheduler.create_schedule()

# Display results
print("\nFinal Schedule:")
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
print(final_schedule)

print("\nWorkload Statistics:")
workload_df = pd.DataFrame.from_dict(final_scheduler.lecturer_workload,
                                    orient='index')
print(workload_df[['total', 'examiner', 'supervisor']].describe())

Creating Final Schedule...

Scheduling defense 0
Date: 2020-05-16, Time: 10:00:00
Field: EDM
penguji1_id: Lecturer 122 (total: 1)
penguji2_id: Lecturer 41 (total: 1)
pembimbing1_id: Lecturer 149 (total: 1)
pembimbing2_id: Lecturer 129 (total: 1)

Scheduling defense 1
Date: 2020-05-29, Time: 09:00:00
Field: EDM
penguji1_id: Lecturer 106 (total: 1)
penguji2_id: Lecturer 40 (total: 1)
pembimbing1_id: Lecturer 35 (total: 1)
pembimbing2_id: Lecturer 50 (total: 1)

Scheduling defense 2
Date: 2020-06-04, Time: 13:00:00
Field: SAG
penguji1_id: Lecturer 90 (total: 1)
penguji2_id: Lecturer 96 (total: 1)
pembimbing1_id: Lecturer 88 (total: 1)
pembimbing2_id: Lecturer 71 (total: 1)

Scheduling defense 3
Date: 2020-06-04, Time: 13:00:00
Field: EIM
penguji1_id: Lecturer 37 (total: 1)
penguji2_id: Lecturer 67 (total: 1)
pembimbing1_id: Lecturer 51 (total: 1)
pembimbing2_id: Lecturer 8 (total: 1)

Scheduling defense 5
Date: 2020-06-08, Time: 13:00:00
Field: SAG
penguji1_id: Lecturer 18 (total: 1)
peng
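
Defenses 2 and 3 in the log above share the 2020-06-04 13:00 slot, so one useful sanity check on the result is that no lecturer sits on two panels at the same time. The following is a minimal sketch of such a check, not part of the scheduler itself: `find_double_bookings` is a hypothetical helper, the first two sample rows reuse the IDs printed above, and the third row is invented purely to trigger a clash.

```python
from collections import defaultdict

# Role columns as they appear in the scheduler output above
ROLE_COLUMNS = ("penguji1_id", "penguji2_id", "pembimbing1_id", "pembimbing2_id")


def find_double_bookings(rows):
    """Return {((date, time), lecturer_id): count} for lecturers used more than once in a slot."""
    counts = defaultdict(int)
    for row in rows:
        slot = (row["tanggal"], row["waktu"])
        for role in ROLE_COLUMNS:
            lecturer_id = row.get(role)
            if lecturer_id:  # skip unfilled roles (None)
                counts[(slot, lecturer_id)] += 1
    return {key: n for key, n in counts.items() if n > 1}


# Defenses 2 and 3 from the log above: same slot, disjoint panels
same_slot = [
    {"tanggal": "2020-06-04", "waktu": "13:00:00",
     "penguji1_id": 90, "penguji2_id": 96, "pembimbing1_id": 88, "pembimbing2_id": 71},
    {"tanggal": "2020-06-04", "waktu": "13:00:00",
     "penguji1_id": 37, "penguji2_id": 67, "pembimbing1_id": 51, "pembimbing2_id": 8},
]
print(find_double_bookings(same_slot))  # {}

# Invented row (not from the dataset) that reuses lecturer 90 in the same slot
clash = same_slot + [
    {"tanggal": "2020-06-04", "waktu": "13:00:00",
     "penguji1_id": 90, "penguji2_id": None, "pembimbing1_id": None, "pembimbing2_id": None},
]
print(find_double_bookings(clash))  # {(('2020-06-04', '13:00:00'), 90): 2}
```

On the real result, `final_schedule.to_dict('records')` should yield rows in this shape, assuming unfilled roles are stored as `None` rather than `NaN`. The check also flags a lecturer who holds two roles in the same defense.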

## Save to Excel

In [None]:
# Save the complete schedule with auto-sized columns
from openpyxl.utils import get_column_letter

def save_final_schedule(schedule_df, filename="final_thesis_defense_schedule.xlsx"):
    """Write the schedule to an Excel file and fit each column width to its content."""
    # The context manager saves and closes the workbook even if an error occurs
    with pd.ExcelWriter(filename, engine='openpyxl') as writer:
        # Convert the schedule to Excel
        schedule_df.to_excel(writer, sheet_name='Schedule', index=False)

        # Get the worksheet that was just written
        worksheet = writer.sheets['Schedule']

        # Auto-adjust column widths (Excel columns are 1-based)
        for idx, col in enumerate(schedule_df.columns, start=1):
            series = schedule_df[col]
            max_length = max(
                series.astype(str).apply(len).max(),  # length of longest value
                len(str(col))  # length of column name
            ) + 2
            # get_column_letter also handles columns past Z, unlike chr(65 + idx)
            worksheet.column_dimensions[get_column_letter(idx)].width = min(max_length, 50)  # cap at 50

    print(f"Schedule has been saved to {filename}")

# Save the schedule
save_final_schedule(final_schedule)

# Also save workload statistics
workload_stats = pd.DataFrame.from_dict(final_scheduler.lecturer_workload,
                                      orient='index')
workload_stats.to_excel("lecturer_workload_statistics.xlsx")

print("Workload statistics have been saved to lecturer_workload_statistics.xlsx")

Schedule has been saved to final_thesis_defense_schedule.xlsx
Workload statistics have been saved to lecturer_workload_statistics.xlsx
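
The column-width step in `save_final_schedule` needs an Excel column letter for each column position. Plain `chr(65 + idx)` arithmetic only covers A to Z, which is enough for this schedule but not for wider sheets. As a rough sketch of the general conversion, the hypothetical helper `column_letter` below maps a 0-based index to its letter; openpyxl provides an equivalent in `openpyxl.utils.get_column_letter` (which takes a 1-based index).

```python
def column_letter(index):
    """Convert a 0-based column index to an Excel column letter (0 -> 'A', 26 -> 'AA')."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # Excel letters form a bijective base-26 system on 1-based numbers
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


print(column_letter(0))    # A
print(column_letter(25))   # Z
print(column_letter(26))   # AA
print(chr(65 + 26))        # [  (what the single-character approach gives for the 27th column)
```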
