# Job Scheduling with WATC Heuristic

This notebook demonstrates the process of generating job instances and solving the scheduling problem using the WATC (Weighted Average Tardiness Cost) heuristic. It generates job data with specific distributions and calculates the corresponding weighted tardiness for each job schedule.

---

### Importing Required Libraries
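We import the necessary libraries: `csv` for writing results to disk, `random` for random number generation, `math` for the exponential in the urgency score, and `numpy` for computing mean processing times. The random seed is fixed so that repeated runs produce the same instances.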


In [1]:
import csv
import random
import math
import numpy as np

random.seed(444)

### Generating Job Instances

In this section, we define a function to generate synthetic instances of job data. Each job has the following attributes:
- **Processing Time**: A random integer drawn uniformly from 1 to 8 (inclusive).
- **Due Date**: The cumulative processing time up to and including the job, shifted by a random integer delta drawn uniformly from -10 to 14 (inclusive). The result is never allowed to fall below the job's own processing time.
- **Weights**: A random integer drawn uniformly from 1 to 6 (inclusive).

The function generates `n` jobs (50 by default) with these random attributes and returns three lists: processing times, weights, and due dates.
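
To make the due-date rule concrete, here is a small sketch on hand-picked values rather than random draws. The three processing times and the `deltas` list are illustrative assumptions, and `itertools.accumulate` is used here only as a compact way to form the running totals.

```python
from itertools import accumulate

processing_times = [3, 5, 2]  # toy values, not drawn at random
deltas = [-10, 4, 0]          # hypothetical offsets, each within [-10, 14]

# Running totals: when each job would finish if jobs ran in index order
cumulative = list(accumulate(processing_times))

# Shift each running total by its delta, then floor at the job's own processing time
due_dates = [
    max(p, c + delta)
    for p, c, delta in zip(processing_times, cumulative, deltas)
]

print(cumulative)  # [3, 8, 10]
print(due_dates)   # [3, 12, 10]
```

The first job shows the floor in action: its shifted value would be -7, so the due date is set to its processing time of 3 instead.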


In [2]:
def generate_instance(n=50):
    # Generate processing times (uniform between 1 and 8)
    processing_times = [random.randint(1, 8) for _ in range(n)]
    
    # Cumulative sums of processing times, i.e. each job's completion time if jobs run in index order
    completion_times = [sum(processing_times[:i+1]) for i in range(n)]
    
    # Generate due dates with a delta deviation from completion times (uniform between -10 and 14)
    delta_range = (-10, 14)
    due_dates = [
        # Floor each due date at the job's own processing time
        max(processing_times[i], completion_times[i] + random.randint(*delta_range))
        for i in range(n)
    ]

    # Generate weights (uniform between 1 and 6)
    weights = [random.randint(1, 6) for _ in range(n)]

    # To shuffle the job order, uncomment the following lines:
    #combined = list(zip(processing_times, weights, due_dates))
    #random.shuffle(combined)
    #processing_times_shuffled, weights_shuffled, due_dates_shuffled = zip(*combined)
    #processing_times = list(processing_times_shuffled)
    #weights = list(weights_shuffled)
    #due_dates = list(due_dates_shuffled)
    return processing_times, weights, due_dates


### WATC Heuristic

In this section, we define the WATC heuristic algorithm. The key idea is to calculate an "urgency" score for each job based on:
- **Weight of the job**
- **Processing time of the job**
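- **Slack of the job**: the time remaining until its due date, measured from the current time and treated as zero once the due date has passed

At each step, the remaining job with the highest urgency score is scheduled next, the current time advances by its processing time, and the average processing time is recalculated over the jobs still unscheduled. This continues until all jobs are scheduled.

The function returns the schedule as a list of 0-based job indices in the order they are processed.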
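
Written out, the urgency score computed by the `watc` function in this notebook takes the form below. The symbols are notation chosen here for illustration: $w_j$, $p_j$, and $d_j$ are the weight, processing time, and due date of job $j$, $t$ is the current time, $\bar{p}$ is the mean processing time of the unscheduled jobs, and $k$ is a scaling constant (set to 1 in the code).

```latex
I_j(t) = \frac{w_j}{p_j} \exp\left( -\frac{\max(d_j - t,\, 0)}{k\,\bar{p}} \right)
```

A job whose due date has already passed has zero slack, so its score reduces to $w_j / p_j$. For comparison, the Apparent Tardiness Cost rule is commonly stated with slack $d_j - p_j - t$; the code in this notebook uses $d_j - t$.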


In [3]:
def watc(processing_times, weights, due_dates):
    n = len(processing_times)
    jobs = list(range(n))
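    p_avg = np.mean(processing_times)  # Initialize p_avg with the mean of all processing times
    k = 1  # Scaling factor for the exponential slack penalty
    C = 0  # Current time (total processing time scheduled so far)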

    scheduled_jobs = []
    remaining_jobs = jobs.copy()

    for _ in range(n):
        urgency = []
        for j in remaining_jobs:
            urgency_j = (weights[j] / processing_times[j]) * math.exp(-max(due_dates[j] - C, 0) / (k * p_avg))
            urgency.append((urgency_j, j))

        urgency.sort(reverse=True)  # Sort in descending order
        next_job = urgency[0][1]
        scheduled_jobs.append(next_job)
        remaining_jobs.remove(next_job)
        C += processing_times[next_job]
        
        if remaining_jobs:  # Only update p_avg if remaining jobs exist
            p_avg = np.mean([processing_times[j] for j in remaining_jobs])

    return scheduled_jobs
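
As a rough illustration of a single selection step, the sketch below evaluates the same urgency expression at time zero for three hand-picked jobs. The job values are assumptions chosen for the example, and the dictionary-based selection is a simplification of the sort used in `watc`.

```python
import math

# Toy instance (illustrative values only)
processing_times = [2, 4, 3]
weights = [1, 3, 2]
due_dates = [3, 5, 10]

C = 0  # current time: nothing has been scheduled yet
k = 1
p_avg = sum(processing_times) / len(processing_times)  # 3.0

# Urgency of each job at time C, using the same expression as watc()
urgency = {
    j: (weights[j] / processing_times[j])
    * math.exp(-max(due_dates[j] - C, 0) / (k * p_avg))
    for j in range(len(processing_times))
}

first_job = max(urgency, key=urgency.get)
print(first_job)  # 0
```

Job 1 has the largest weight-to-processing-time ratio (0.75), but job 0 is selected because its due date is closer, so its exponential penalty is smaller.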


### Calculating Weighted Tardiness

Once a schedule is obtained, we calculate the **weighted tardiness** for the schedule. The tardiness for each job is defined as the difference between its completion time and due date, but only if the job finishes after its due date (otherwise, tardiness is zero).

The weighted tardiness is the sum of the tardiness values weighted by the job's weight.
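
The definition can be checked by hand on a small case. The sketch below uses a hypothetical helper, `weighted_tardiness_sketch`, and three made-up jobs; it accumulates completion times along the schedule and sums weight times tardiness per job.

```python
def weighted_tardiness_sketch(schedule, processing_times, weights, due_dates):
    """Sum of weight * max(0, completion - due date) over the scheduled jobs."""
    total = 0
    completion = 0
    for job in schedule:
        completion += processing_times[job]  # job finishes at this time
        tardiness = max(0, completion - due_dates[job])
        total += weights[job] * tardiness
    return total


# Toy instance (illustrative values only)
processing_times = [2, 4, 3]
weights = [1, 3, 2]
due_dates = [3, 5, 10]

result = weighted_tardiness_sketch([1, 0, 2], processing_times, weights, due_dates)
print(result)  # 3
```

With the order 1, 0, 2, the jobs complete at times 4, 6, and 9. Only job 0 is late (6 against a due date of 3), and its weight is 1, giving a total of 3.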

In [4]:
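def calculate_weighted_tardiness(schedule, processing_times, weights, due_dates):
    n = len(schedule)
    if n == 0:
        return 0  # An empty schedule has no tardiness
    C = [0] * n  # C[i]: completion time of the job at position i
    T = [0] * n  # T[i]: tardiness of the job at position i
    C[0] = processing_times[schedule[0]]
    T[0] = max(0, C[0] - due_dates[schedule[0]])
    for i in range(1, n):
        C[i] = C[i-1] + processing_times[schedule[i]]
        T[i] = max(0, C[i] - due_dates[schedule[i]])
    # T is indexed by position, so pair T[i] with the weight of the job at position i
    weighted_tardiness = sum(weights[schedule[i]] * T[i] for i in range(n))
    return weighted_tardiness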


### Generating Multiple Instances and Storing Results

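This section generates multiple job instances, schedules each one with the WATC heuristic, and writes the results to a CSV file. Each instance occupies three consecutive rows, and each row holds a single cell containing a list:
- Job data: one `[processing time, weight, due date]` triple per job
- WATC indicator: 1 for the job that WATC schedules first, 0 for every other job
- WATC sequence: the full schedule as 1-based job numbers

No header row is written, and the weighted tardiness is not stored in the file; it can be computed from any schedule with `calculate_weighted_tardiness`.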

In [6]:
num_instances = 1000
filename = "heuristics3.csv"

with open(filename, 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)

    for _ in range(num_instances):
        p, w, d = generate_instance()
        watc_schedule = watc(p, w, d)

        # Job data: one [processing time, weight, due date] triple per job
        job_data = [[p[j], w[j], d[j]] for j in range(len(p))]
        # One-hot indicator marking the job that WATC schedules first
        watc_indicator = [1 if idx == watc_schedule[0] else 0 for idx in range(len(p))]
        # Full job sequence, converted to 1-based indexing
        watc_sequence = [j + 1 for j in watc_schedule]
        # Each instance takes three rows; each row is a single cell holding a list
        writer.writerow([job_data])
        writer.writerow([watc_indicator])
        writer.writerow([watc_sequence])

print(f"Instances generated and saved to {filename}")

Instances generated and saved to heuristics3.csv
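
Because the writing cell stores each list as one CSV cell, reading the file back requires parsing those cells into Python lists. The sketch below shows one possible way, assuming `ast.literal_eval` is an acceptable parser for this data. It uses an in-memory buffer and a single toy instance in place of `heuristics3.csv`, so all values shown are illustrative.

```python
import ast
import csv
import io

# Toy instance with 3 jobs: [processing time, weight, due date] per job
job_data = [[2, 1, 3], [4, 3, 5], [3, 2, 10]]
watc_indicator = [1, 0, 0]
watc_sequence = [1, 2, 3]

# Write three single-cell rows, mirroring the layout used in this notebook
buffer = io.StringIO(newline="")
writer = csv.writer(buffer)
for row in (job_data, watc_indicator, watc_sequence):
    writer.writerow([row])

# Read the rows back; each cell holds the string form of a Python list
buffer.seek(0)
rows = [ast.literal_eval(cells[0]) for cells in csv.reader(buffer)]

# Group every three rows into one instance
instances = [rows[i:i + 3] for i in range(0, len(rows), 3)]
parsed_jobs, parsed_indicator, parsed_sequence = instances[0]
print(parsed_sequence)  # [1, 2, 3]
```

To read the real file, the buffer would be replaced by `open(filename, newline='')`.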
