# Comprehensive Tutorial on CPU Scheduling Algorithms

This Jupyter Notebook introduces **CPU Scheduling Algorithms** (FCFS, SJF, Priority, Round-Robin, Multilevel Queue) for aspiring scientists and researchers. It is written for beginners and combines theory, practical code, visualizations, real-world applications, research directions, and projects, plus a few additional topics that matter in research settings. All simulations use Python, with explanations, analogies, and Gantt charts that keep the concepts easy to follow and take notes on.

## Table of Contents
1. **Introduction to CPU Scheduling**
2. **First-Come-First-Serve (FCFS)**
3. **Shortest Job First (SJF)**
4. **Priority Scheduling**
5. **Round-Robin (RR)**
6. **Multilevel Queue Scheduling**
7. **Additional Topics for Scientists**
8. **Mini Project: Scheduling Simulator**
9. **Major Project: Real-Time Scheduling Analysis**
10. **Research Directions and Rare Insights**

## Prerequisites
- Basic Python knowledge (lists, functions, loops).
- Install Python libraries: `matplotlib` for visualizations and `numpy` for the queuing-theory plot in Section 7 (`pip install matplotlib numpy`).
- No prior OS knowledge required.

## 1. Introduction to CPU Scheduling

### Theory
CPU scheduling decides which process runs on the CPU and when, optimizing performance in an operating system. Think of a chef in a kitchen (CPU) handling multiple orders (processes). The goal is to minimize waiting time, maximize CPU use, and prioritize critical tasks.

**Key Metrics**:
- **Waiting Time**: Time a process waits in the ready queue.
- **Turnaround Time**: Total time from submission to completion (waiting + burst time).
- **Throughput**: Number of processes completed per unit time.
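
The three metrics above can be computed directly from arrival, burst, and completion times. The following is a minimal sketch; the function name `scheduling_metrics` and the sample schedule are illustrative assumptions, and throughput is taken over the span from the first arrival to the last completion.

```python
def scheduling_metrics(arrival, burst, completion):
    """Return per-process waiting and turnaround times plus overall throughput."""
    # Turnaround: time from submission (arrival) to completion
    turnaround = [c - a for c, a in zip(completion, arrival)]
    # Waiting: turnaround minus the time actually spent on the CPU
    waiting = [t - b for t, b in zip(turnaround, burst)]
    # Throughput: processes completed per unit of elapsed time
    makespan = max(completion) - min(arrival)
    throughput = len(completion) / makespan
    return waiting, turnaround, throughput

# Hypothetical schedule: three processes arrive at t=0 and run back to back.
waiting, turnaround, throughput = scheduling_metrics(
    arrival=[0, 0, 0], burst=[10, 5, 8], completion=[10, 15, 23]
)
print(waiting)       # [0, 10, 15]
print(turnaround)    # [10, 15, 23]
print(f"{throughput:.3f} processes per ms")  # 0.130 processes per ms
```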

**Why It Matters for Scientists**:
- Optimizes computational tasks in simulations (e.g., physics, AI).
- Critical for real-time systems (e.g., robotics, medical devices).
- Foundation for designing efficient algorithms in research.

**Real-World Example**: In a hospital, patient treatment scheduling prioritizes emergencies, similar to CPU scheduling prioritizing critical processes.

## 2. First-Come-First-Serve (FCFS)

### Theory
**FCFS** is the simplest scheduling algorithm: processes are executed in the order they arrive. It’s like a queue at a ticket counter—first in, first served.

**How It Works**:
- Processes are placed in a ready queue in arrival order.
- The CPU executes each process to completion (non-preemptive).

**Math**:
- Waiting Time ($WT_i$): Sum of burst times of all previous processes (assuming every process arrives at time 0).
- Turnaround Time ($TAT_i$): $WT_i + BT_i$ (Burst Time).
- Average Waiting Time: $\frac{\sum WT_i}{n}$.

**Pros**: Simple, fair for early arrivals.
**Cons**: Long waiting times if a process has a large burst time (**convoy effect**).
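
The convoy effect can be seen numerically by running the same three jobs in two different arrival orders. This is a small sketch that assumes all processes are ready at time 0; `fcfs_average_wait` is a helper name introduced here for illustration.

```python
def fcfs_average_wait(burst_times):
    """Average waiting time under FCFS when every process arrives at t=0."""
    elapsed = 0
    total_wait = 0
    for burst in burst_times:
        total_wait += elapsed  # this process waited for everything queued before it
        elapsed += burst
    return total_wait / len(burst_times)

long_first = fcfs_average_wait([24, 3, 3])   # the long job heads the queue
short_first = fcfs_average_wait([3, 3, 24])  # the long job goes last
print(long_first)   # 17.0
print(short_first)  # 3.0
```

The workload is identical in both runs; only the position of the 24 ms job changes, and the average wait drops from 17 ms to 3 ms.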

**Real-World Example**: Batch processing in early mainframe computers used FCFS to process jobs sequentially.

### Practical Code: FCFS Simulation

In [None]:
import matplotlib.pyplot as plt

def fcfs_scheduling(processes, burst_times):
    n = len(processes)
    waiting_times = [0] * n
    turnaround_times = [0] * n
    completion_times = [0] * n
    
    # Calculate completion, waiting, and turnaround times
    completion_times[0] = burst_times[0]
    turnaround_times[0] = completion_times[0]
    for i in range(1, n):
        completion_times[i] = completion_times[i-1] + burst_times[i]
        waiting_times[i] = completion_times[i-1]
        turnaround_times[i] = waiting_times[i] + burst_times[i]
    
    # Print results
    print("FCFS Scheduling Results:")
    print("Process\tBurst Time\tWaiting Time\tTurnaround Time")
    for i in range(n):
        print(f"{processes[i]}\t{burst_times[i]}\t\t{waiting_times[i]}\t\t{turnaround_times[i]}")
    print(f"Average Waiting Time: {sum(waiting_times)/n:.2f} ms")
    print(f"Average Turnaround Time: {sum(turnaround_times)/n:.2f} ms")
    
    # Visualize Gantt Chart
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.barh(y=0, width=burst_times[i], left=completion_times[i-1] if i > 0 else 0, height=0.4, label=processes[i])
    plt.title("FCFS Gantt Chart")
    plt.xlabel("Time (ms)")
    plt.yticks([])
    plt.legend()
    plt.show()

# Example
processes = ['P1', 'P2', 'P3']
burst_times = [10, 5, 8]
fcfs_scheduling(processes, burst_times)

**Output Explanation**:
- The code simulates FCFS for processes P1, P2, P3 with burst times 10, 5, 8 ms.
- It calculates waiting and turnaround times and plots a Gantt chart.
- Try changing burst times to see the convoy effect (e.g., [24, 3, 3]).

## 3. Shortest Job First (SJF)

### Theory
**SJF** selects the process with the shortest burst time to run next, minimizing average waiting time. It’s like a chef picking the quickest dish to prepare first to clear orders faster.

**How It Works**:
- Non-Preemptive: Shortest job runs to completion.
- Preemptive (SRTF): If a shorter job arrives, it interrupts the current process.
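
The preemptive variant needs arrival times, which the non-preemptive example in this section does not model. A minimal sketch of SRTF follows, assuming integer times, a 1 ms simulation step, and ties broken in favor of the earlier arrival; the function name `srtf` and the four-process workload are illustrative assumptions.

```python
def srtf(arrival, burst):
    """Simulate Shortest Remaining Time First in 1 ms steps; return completion times."""
    n = len(burst)
    remaining = list(burst)
    completion = [None] * n
    time = 0
    finished = 0
    while finished < n:
        # Processes that have arrived and still need CPU time
        ready = [i for i in range(n) if arrival[i] <= time and remaining[i] > 0]
        if not ready:
            time += 1  # CPU idles until the next arrival
            continue
        # Re-evaluating every step is what makes the policy preemptive
        i = min(ready, key=lambda k: (remaining[k], arrival[k]))
        remaining[i] -= 1
        time += 1
        if remaining[i] == 0:
            completion[i] = time
            finished += 1
    return completion

arrival = [0, 1, 2, 3]
burst = [8, 4, 9, 5]
completion = srtf(arrival, burst)
waiting = [c - a - b for c, a, b in zip(completion, arrival, burst)]
print(completion)                    # [17, 5, 26, 10]
print(sum(waiting) / len(waiting))   # 6.5
```

The first process is preempted at t=1 when a job with a shorter remaining time arrives, and it only resumes at t=10.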

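**Math**: Same as FCFS, but processes are sorted by burst time (non-preemptive case, with all processes arriving at time 0).

**Pros**: Minimizes average waiting time when all jobs are available at once.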
**Cons**: Requires burst time prediction; long jobs may starve.
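
One common textbook way to predict the next burst is exponential averaging of past bursts: $\tau_{n+1} = \alpha t_n + (1 - \alpha)\tau_n$, where $t_n$ is the most recent measured burst and $\tau_n$ the previous prediction. The sketch below assumes $\alpha = 0.5$ and an initial guess of 10 ms; both values and the function name are illustrative choices, not requirements of SJF.

```python
def predict_bursts(observed_bursts, alpha=0.5, initial_guess=10.0):
    """Return the sequence of predictions produced by exponential averaging."""
    prediction = initial_guess
    history = [prediction]
    for actual in observed_bursts:
        # Blend the latest measurement with the previous prediction
        prediction = alpha * actual + (1 - alpha) * prediction
        history.append(prediction)
    return history

predictions = predict_bursts([6, 4, 6, 4])
print(predictions)  # [10.0, 8.0, 6.0, 6.0, 5.0]
```

A larger `alpha` makes the prediction follow recent bursts more closely; a smaller one gives more weight to history.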

**Real-World Example**: In manufacturing, SJF schedules quick tasks (e.g., small parts assembly) first to maximize throughput.

### Practical Code: Non-Preemptive SJF

In [None]:
def sjf_scheduling(processes, burst_times):
    n = len(processes)
    # Sort processes by burst time
    sorted_indices = sorted(range(n), key=lambda i: burst_times[i])
    sorted_processes = [processes[i] for i in sorted_indices]
    sorted_burst_times = [burst_times[i] for i in sorted_indices]
    
    waiting_times = [0] * n
    turnaround_times = [0] * n
    completion_times = [0] * n
    
    completion_times[0] = sorted_burst_times[0]
    turnaround_times[0] = completion_times[0]
    for i in range(1, n):
        completion_times[i] = completion_times[i-1] + sorted_burst_times[i]
        waiting_times[i] = completion_times[i-1]
        turnaround_times[i] = waiting_times[i] + sorted_burst_times[i]
    
    # Map back to original process order
    original_waiting = [0] * n
    original_turnaround = [0] * n
    for i, idx in enumerate(sorted_indices):
        original_waiting[idx] = waiting_times[i]
        original_turnaround[idx] = turnaround_times[i]
    
    print("SJF Scheduling Results:")
    print("Process\tBurst Time\tWaiting Time\tTurnaround Time")
    for i in range(n):
        print(f"{processes[i]}\t{burst_times[i]}\t\t{original_waiting[i]}\t\t{original_turnaround[i]}")
    print(f"Average Waiting Time: {sum(original_waiting)/n:.2f} ms")
    print(f"Average Turnaround Time: {sum(original_turnaround)/n:.2f} ms")
    
    # Gantt Chart
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.barh(y=0, width=sorted_burst_times[i], left=completion_times[i-1] if i > 0 else 0, height=0.4, label=sorted_processes[i])
    plt.title("SJF Gantt Chart")
    plt.xlabel("Time (ms)")
    plt.yticks([])
    plt.legend()
    plt.show()

processes = ['P1', 'P2', 'P3']
burst_times = [10, 5, 8]
sjf_scheduling(processes, burst_times)

## 4. Priority Scheduling

### Theory
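**Priority Scheduling** assigns a priority to each process; the highest-priority process runs first. In this notebook's code, a lower number means a higher priority (priority 1 runs before priority 3). It’s like an airport prioritizing premium passengers.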

**How It Works**:
- Non-Preemptive: Current process finishes.
- Preemptive: Higher-priority process interrupts.

**Math**: Same as FCFS, but sorted by priority.

**Pros**: Prioritizes critical tasks.
**Cons**: Starvation for low-priority processes.

**Real-World Example**: Air traffic control prioritizes emergency landings.

### Practical Code: Non-Preemptive Priority

In [None]:
def priority_scheduling(processes, burst_times, priorities):
    n = len(processes)
    # Sort by priority value; a lower number means a higher priority
    sorted_indices = sorted(range(n), key=lambda i: priorities[i])
    sorted_processes = [processes[i] for i in sorted_indices]
    sorted_burst_times = [burst_times[i] for i in sorted_indices]
    
    waiting_times = [0] * n
    turnaround_times = [0] * n
    completion_times = [0] * n
    
    completion_times[0] = sorted_burst_times[0]
    turnaround_times[0] = completion_times[0]
    for i in range(1, n):
        completion_times[i] = completion_times[i-1] + sorted_burst_times[i]
        waiting_times[i] = completion_times[i-1]
        turnaround_times[i] = waiting_times[i] + sorted_burst_times[i]
    
    original_waiting = [0] * n
    original_turnaround = [0] * n
    for i, idx in enumerate(sorted_indices):
        original_waiting[idx] = waiting_times[i]
        original_turnaround[idx] = turnaround_times[i]
    
    print("Priority Scheduling Results:")
    print("Process\tBurst Time\tPriority\tWaiting Time\tTurnaround Time")
    for i in range(n):
        print(f"{processes[i]}\t{burst_times[i]}\t\t{priorities[i]}\t\t{original_waiting[i]}\t\t{original_turnaround[i]}")
    print(f"Average Waiting Time: {sum(original_waiting)/n:.2f} ms")
    print(f"Average Turnaround Time: {sum(original_turnaround)/n:.2f} ms")
    
    plt.figure(figsize=(10, 2))
    for i in range(n):
        plt.barh(y=0, width=sorted_burst_times[i], left=completion_times[i-1] if i > 0 else 0, height=0.4, label=sorted_processes[i])
    plt.title("Priority Gantt Chart")
    plt.xlabel("Time (ms)")
    plt.yticks([])
    plt.legend()
    plt.show()

processes = ['P1', 'P2', 'P3']
burst_times = [10, 5, 8]
priorities = [3, 1, 2]
priority_scheduling(processes, burst_times, priorities)

## 5. Round-Robin (RR)

### Theory
**Round-Robin** gives each process a fixed time slice (quantum) in a cyclic order. It’s like a teacher giving each student a few minutes to speak before moving to the next.

**How It Works**:
- Processes get CPU for a quantum (e.g., 4 ms).
- If unfinished, they return to the queue.

**Math**: Context switching overhead increases with smaller quantum.

**Pros**: Fair, responsive for interactive systems.
**Cons**: High overhead with small quantum.

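**Real-World Example**: Time-sharing systems use RR-style time slicing for multitasking. Linux, for instance, offers Round-Robin as its `SCHED_RR` real-time policy, while ordinary tasks are handled by a fairness-based scheduler rather than plain RR.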

### Practical Code: Round-Robin

In [None]:
def round_robin_scheduling(processes, burst_times, quantum):
    n = len(processes)
    remaining_times = burst_times.copy()
    waiting_times = [0] * n
    turnaround_times = [0] * n
    time = 0
    queue = list(range(n))
    gantt = []
    
    # Serve the queue in cyclic order: run the front process for at most one
    # quantum and re-queue it if it still has work left.
    while queue:
        i = queue.pop(0)
        if remaining_times[i] > quantum:
            gantt.append((processes[i], time, time + quantum))
            time += quantum
            remaining_times[i] -= quantum
            queue.append(i)
        else:
            gantt.append((processes[i], time, time + remaining_times[i]))
            time += remaining_times[i]
            turnaround_times[i] = time
            waiting_times[i] = time - burst_times[i]
            remaining_times[i] = 0
    
    print("Round-Robin Scheduling Results:")
    print("Process\tBurst Time\tWaiting Time\tTurnaround Time")
    for i in range(n):
        print(f"{processes[i]}\t{burst_times[i]}\t\t{waiting_times[i]}\t\t{turnaround_times[i]}")
    print(f"Average Waiting Time: {sum(waiting_times)/n:.2f} ms")
    print(f"Average Turnaround Time: {sum(turnaround_times)/n:.2f} ms")
    
    plt.figure(figsize=(10, 2))
    label_set = set()
    for proc, start, end in gantt:
        label = proc if proc not in label_set else ""
        plt.barh(y=0, width=end-start, left=start, height=0.4, label=label)
        label_set.add(proc)
    plt.title("Round-Robin Gantt Chart")
    plt.xlabel("Time (ms)")
    plt.yticks([])
    plt.legend()
    plt.show()

processes = ['P1', 'P2', 'P3']
burst_times = [10, 5, 8]
quantum = 4
round_robin_scheduling(processes, burst_times, quantum)

## 6. Multilevel Queue Scheduling

### Theory
**Multilevel Queue** divides processes into queues based on type/priority, each with its own algorithm. It’s like a supermarket with express, priority, and regular lanes.

**How It Works**:
- Queues have priorities (e.g., system > interactive > batch).
- Higher-priority queues are served first.

**Pros**: Tailored to process types.
**Cons**: Starvation for low-priority queues.

**Real-World Example**: Cloud platforms prioritize real-time analytics over backups.

### Practical Code: Multilevel Queue

In [None]:
def multilevel_queue_scheduling(processes, burst_times, queues, quantum1, quantum2):
    n = len(processes)
    remaining_times = burst_times.copy()
    waiting_times = [0] * n
    turnaround_times = [0] * n
    time = 0
    gantt = []
    
    # Queue 1 is drained completely before Queue 2 starts (fixed priority
    # between queues; all processes are assumed to arrive at t=0).
    # Process Queue 1 (RR, quantum1)
    q1 = [i for i in range(n) if queues[i] == 1]
    while q1:
        i = q1.pop(0)
        if remaining_times[i] > quantum1:
            gantt.append((processes[i], time, time + quantum1))
            time += quantum1
            remaining_times[i] -= quantum1
            q1.append(i)
        else:
            gantt.append((processes[i], time, time + remaining_times[i]))
            time += remaining_times[i]
            turnaround_times[i] = time
            waiting_times[i] = time - burst_times[i]
            remaining_times[i] = 0
    
    # Process Queue 2 (RR, quantum2)
    q2 = [i for i in range(n) if queues[i] == 2]
    while q2:
        i = q2.pop(0)
        if remaining_times[i] > quantum2:
            gantt.append((processes[i], time, time + quantum2))
            time += quantum2
            remaining_times[i] -= quantum2
            q2.append(i)
        else:
            gantt.append((processes[i], time, time + remaining_times[i]))
            time += remaining_times[i]
            turnaround_times[i] = time
            waiting_times[i] = time - burst_times[i]
            remaining_times[i] = 0
    
    print("Multilevel Queue Scheduling Results:")
    print("Process\tBurst Time\tQueue\tWaiting Time\tTurnaround Time")
    for i in range(n):
        print(f"{processes[i]}\t{burst_times[i]}\t\t{queues[i]}\t\t{waiting_times[i]}\t\t{turnaround_times[i]}")
    print(f"Average Waiting Time: {sum(waiting_times)/n:.2f} ms")
    print(f"Average Turnaround Time: {sum(turnaround_times)/n:.2f} ms")
    
    plt.figure(figsize=(10, 2))
    label_set = set()
    for proc, start, end in gantt:
        label = proc if proc not in label_set else ""
        plt.barh(y=0, width=end-start, left=start, height=0.4, label=label)
        label_set.add(proc)
    plt.title("Multilevel Queue Gantt Chart")
    plt.xlabel("Time (ms)")
    plt.yticks([])
    plt.legend()
    plt.show()

processes = ['P1', 'P2', 'P3', 'P4']
burst_times = [10, 6, 8, 12]
queues = [1, 1, 2, 2]
multilevel_queue_scheduling(processes, burst_times, queues, 4, 8)

## 7. Additional Topics for Scientists

### Context Switching Overhead
- **Theory**: Switching between processes incurs overhead (saving/restoring state). High in RR with small quantum.
- **Math**: Overhead $O$ per switch, total overhead = $O \times \text{number of switches}$.
- **Application**: Minimize context switches in real-time systems.
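
The relationship between quantum size and total overhead can be checked with a short simulation. This sketch counts a switch only when the CPU moves to a different process, and it assumes a hypothetical per-switch cost of 0.1 ms; `count_context_switches` is a name introduced here for illustration.

```python
from collections import deque

def count_context_switches(burst_times, quantum):
    """Count Round-Robin switches between different processes (all arrive at t=0)."""
    remaining = list(burst_times)
    queue = deque(range(len(burst_times)))
    switches = 0
    previous = None
    while queue:
        i = queue.popleft()
        if previous is not None and i != previous:
            switches += 1  # the CPU moved to a different process
        remaining[i] -= min(quantum, remaining[i])
        if remaining[i] > 0:
            queue.append(i)  # unfinished: back to the end of the queue
        previous = i
    return switches

overhead_per_switch = 0.1  # ms, assumed value for illustration
for quantum in (1, 4, 10):
    switches = count_context_switches([10, 5, 8], quantum)
    print(f"quantum={quantum}: {switches} switches, "
          f"{overhead_per_switch * switches:.1f} ms overhead")
# quantum=1: 21 switches, 2.1 ms overhead
# quantum=4: 6 switches, 0.6 ms overhead
# quantum=10: 2 switches, 0.2 ms overhead
```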

### Starvation and Aging
- **Theory**: Low-priority processes may never run. Aging increases priority over time.
- **Example**: In Priority Scheduling, a process waiting 10 seconds might gain +1 priority.
- **Application**: Critical for fair scheduling in cloud systems.
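
Aging can be sketched as an adjustment applied to a process's base priority at selection time. The example assumes the convention that a lower number means a higher priority, so aging subtracts from the value; the 10-second interval follows the example above, while the function names and the sample processes are hypothetical.

```python
def effective_priority(base_priority, waited, aging_interval=10, boost=1):
    """Priority after aging; lower number = higher priority, floored at 0."""
    return max(0, base_priority - boost * (waited // aging_interval))

def pick_next(ready, now):
    """ready: list of (name, base_priority, enqueue_time) tuples."""
    # Choose the best aged priority; break ties by who has waited longest
    best = min(ready, key=lambda p: (effective_priority(p[1], now - p[2]), p[2]))
    return best[0]

for waited in (0, 10, 25, 60):
    print(waited, effective_priority(5, waited))
# 0 5
# 10 4
# 25 3
# 60 0

# After 30 s of waiting, the batch job (base 5) outranks a fresh interactive job (base 3).
print(pick_next([("batch", 5, 0), ("interactive", 3, 28)], now=30))  # batch
print(pick_next([("batch", 5, 0), ("interactive", 3, 0)], now=5))    # interactive
```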

### Real-Time Scheduling
- **Theory**: Ensures tasks meet deadlines (e.g., Rate Monotonic, Earliest Deadline First (EDF)).
- **Relevance**: Vital for robotics, autonomous vehicles.

### Queuing Theory
- **Theory**: Models process waiting times using queues (e.g., M/M/1 model).
- **Math**: Average waiting time in queue: $W_q = \frac{\lambda}{\mu(\mu - \lambda)}$, where $\lambda$ is the arrival rate and $\mu$ is the service rate. The formula holds only for $\lambda < \mu$ (a stable queue).
- **Application**: Analyze scheduling performance in research.
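
As a worked instance of the formula, the sketch below evaluates the standard M/M/1 quantities for one arrival rate. The function name `mm1_metrics` is an assumption for illustration; the extra quantities use the usual M/M/1 relations $W = W_q + 1/\mu$ and Little's law $L_q = \lambda W_q$.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 metrics; requires arrival_rate < service_rate."""
    if arrival_rate >= service_rate:
        raise ValueError("M/M/1 is only stable when arrival_rate < service_rate")
    utilization = arrival_rate / service_rate
    wq = arrival_rate / (service_rate * (service_rate - arrival_rate))  # wait in queue
    w = wq + 1 / service_rate          # wait in queue plus service time
    lq = arrival_rate * wq             # Little's law: mean number waiting
    return {"utilization": utilization, "W_q": wq, "W": w, "L_q": lq}

metrics = mm1_metrics(arrival_rate=0.5, service_rate=1.0)
print(metrics)  # {'utilization': 0.5, 'W_q': 1.0, 'W': 2.0, 'L_q': 0.5}
```

At 50% load a job waits, on average, one service time in the queue; the wait grows without bound as the arrival rate approaches the service rate.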

**Visualization**: Plot waiting time vs. arrival rate to study system load.

In [None]:
# Queuing Theory Visualization
import numpy as np
import matplotlib.pyplot as plt

lambda_rates = np.linspace(0.1, 0.9, 100)
mu = 1.0
waiting_times = lambda_rates / (mu * (mu - lambda_rates))

plt.plot(lambda_rates, waiting_times)
plt.title("Queuing Theory: Waiting Time vs. Arrival Rate")
plt.xlabel(r"Arrival Rate ($\lambda$)")  # raw string: "\l" is not a valid escape sequence
plt.ylabel("Mean Waiting Time in Queue (time units)")
plt.grid(True)
plt.show()

## 8. Mini Project: Scheduling Simulator

**Objective**: Build a Python program to compare FCFS, SJF, Priority, and RR.

**Steps**:
1. Create a function to input processes, burst times, and priorities.
2. Implement all algorithms from above.
3. Compare average waiting and turnaround times.
4. Visualize results with bar charts.

**Code**:
- Modify the above codes to accept user input.
- Add a comparison function to plot metrics.

In [None]:
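def compare_scheduling_algorithms(processes, burst_times, priorities, quantum):
    # Placeholder values: these are the average waiting times for burst times
    # [10, 5, 8], priorities [3, 1, 2], and quantum 4 (all arrivals at t=0).
    # The arguments are not used yet; replace the list with computed values.
    avg_waiting_times = [8.33, 6.00, 6.00, 12.67]  # FCFS, SJF, Priority, RR
    plt.figure(figsize=(8, 4))
    plt.bar(['FCFS', 'SJF', 'Priority', 'RR'], avg_waiting_times)
    plt.title("Comparison of Average Waiting Times")
    plt.ylabel("Time (ms)")
    plt.show()

# Example usage. The inputs are redefined here because Section 6 reassigned
# `processes` and `burst_times` to a four-process example.
processes = ['P1', 'P2', 'P3']
burst_times = [10, 5, 8]
priorities = [3, 1, 2]
quantum = 4
compare_scheduling_algorithms(processes, burst_times, priorities, quantum)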
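
The hard-coded list in `compare_scheduling_algorithms` can be replaced by values computed from the inputs. The following is one possible sketch, under the same assumptions as the earlier sections (all processes arrive at t=0, non-preemptive SJF and Priority, lower priority number runs first); the helper names are introduced here and are not part of the code above.

```python
from collections import deque

def avg_wait_in_order(order, burst_times):
    """Average waiting time when processes run back to back in the given order."""
    elapsed = 0
    total_wait = 0
    for i in order:
        total_wait += elapsed
        elapsed += burst_times[i]
    return total_wait / len(burst_times)

def avg_wait_round_robin(burst_times, quantum):
    """Average waiting time under Round-Robin with a fixed quantum."""
    remaining = list(burst_times)
    queue = deque(range(len(burst_times)))
    time = 0
    total_wait = 0
    while queue:
        i = queue.popleft()
        run = min(quantum, remaining[i])
        time += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)
        else:
            total_wait += time - burst_times[i]  # completion time minus CPU time
    return total_wait / len(burst_times)

def average_waiting_times(burst_times, priorities, quantum):
    n = len(burst_times)
    by_burst = sorted(range(n), key=lambda i: burst_times[i])
    by_priority = sorted(range(n), key=lambda i: priorities[i])
    return {
        "FCFS": avg_wait_in_order(range(n), burst_times),
        "SJF": avg_wait_in_order(by_burst, burst_times),
        "Priority": avg_wait_in_order(by_priority, burst_times),
        "RR": avg_wait_round_robin(burst_times, quantum),
    }

results = average_waiting_times([10, 5, 8], [3, 1, 2], quantum=4)
for name, value in results.items():
    print(f"{name}: {value:.2f}")
# FCFS: 8.33
# SJF: 6.00
# Priority: 6.00
# RR: 12.67
```

Passing `list(results.values())` to `plt.bar` in place of the fixed list would make the chart reflect the actual inputs.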

## 9. Major Project: Real-Time Scheduling Analysis

**Objective**: Simulate a real-time system with Priority and Multilevel Queue scheduling.

**Steps**:
1. Define tasks with periods and execution times (e.g., T1: period 20 ms, execution 5 ms).
2. Implement Priority Scheduling with deadlines.
3. Add Multilevel Queue with real-time and background tasks.
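4. Analyze schedulability (e.g., total utilization $\sum C_i / T_i \le 1$ is the exact test for EDF when deadlines equal periods; Rate Monotonic has the stricter sufficient bound $n(2^{1/n} - 1)$).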
5. Visualize task execution and missed deadlines.
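
Step 4 can start from a simple utilization check. The sketch below assumes independent, preemptible periodic tasks on one processor with deadlines equal to periods; under those assumptions, utilization at most 1 is the exact condition for EDF, and the Liu and Layland bound $n(2^{1/n} - 1)$ is a sufficient (not necessary) condition for Rate Monotonic. Only T1 comes from the step list; the other tasks and all function names are hypothetical.

```python
def utilization(tasks):
    """tasks: list of (execution_time, period) pairs in the same time unit."""
    return sum(execution / period for execution, period in tasks)

def rm_bound(n):
    """Liu and Layland utilization bound for n tasks under Rate Monotonic."""
    return n * (2 ** (1 / n) - 1)

def check_schedulability(tasks):
    u = utilization(tasks)
    return {
        "utilization": u,
        "edf_schedulable": u <= 1.0,                 # exact under the stated assumptions
        "rm_guaranteed": u <= rm_bound(len(tasks)),  # sufficient, not necessary
    }

light = check_schedulability([(5, 20), (10, 40), (20, 80)])
heavy = check_schedulability([(5, 20), (20, 40), (20, 80)])
print(light)  # {'utilization': 0.75, 'edf_schedulable': True, 'rm_guaranteed': True}
print(heavy)  # {'utilization': 1.0, 'edf_schedulable': True, 'rm_guaranteed': False}
```

A task set that fails the Rate Monotonic bound may still be schedulable; the bound only says that no guarantee can be given from utilization alone.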

**Research Application**: Test in a robotics simulation to ensure timely sensor processing.

## 10. Research Directions and Rare Insights

**Research Directions**:
- **Hybrid Scheduling**: Combine RR and Priority for fairness and efficiency.
- **Quantum Computing**: Develop scheduling for quantum processors.
- **AI-Driven Scheduling**: Use ML to predict burst times for SJF.

**Rare Insights**:
- **Energy-Aware Scheduling**: Modern systems prioritize energy efficiency (e.g., mobile devices adjust CPU frequency).
- **Fairness Metrics**: Research fairness using Jain’s index: $\frac{(\sum x_i)^2}{n \sum x_i^2}$, where $x_i$ is CPU time for process $i$.
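
Jain's index is straightforward to compute from per-process CPU times. A minimal sketch, with `jains_index` as an illustrative function name and made-up allocations:

```python
def jains_index(cpu_times):
    """Jain's fairness index: 1.0 is perfectly fair, 1/n is maximally unfair."""
    n = len(cpu_times)
    total = sum(cpu_times)
    sum_of_squares = sum(x * x for x in cpu_times)
    return total * total / (n * sum_of_squares)

print(jains_index([5, 5, 5, 5]))   # 1.0
print(jains_index([20, 0, 0, 0]))  # 0.25
```

The index ranges from $1/n$ (one process gets everything) to 1 (all processes get equal CPU time).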

**Applications**:
- **Robotics**: Priority for sensor tasks.
- **Cloud Computing**: Multilevel Queue for diverse workloads.
- **HPC**: SJF for scientific simulations.

**Further Reading**:
- *Operating System Concepts* by Silberschatz, Chapter 5.
- IEEE papers on real-time scheduling.
- Experiment with tools like Cheddar (a real-time scheduling analysis tool) or RTEMS (a real-time operating system).