# Chapter 6: CPU Scheduling
- multiprogramming
    - goal: maximize CPU utilization
    - most processes will alternate between CPU bursts and IO bursts
        - CPU burst: process is running on the CPU
        - IO burst: process is waiting for IO to complete
    - schedule other processes while one is waiting for something
- CPU bound process
    - bounded by the CPU speed
    - spends most of its time on the CPU
    - at least a few long CPU bursts
- I/O bound process
    - bounded by I/O speed
    - spends most of its time doing I/O
    - many short CPU bursts
### CPU Scheduler
- selects from among the processes in memory that are ready to execute
    - part of the OS dispatcher
    - based on a particular scheduling algorithm
- occurs when
  1. process switches from running to waiting state
  2. process switches from running to ready state
  3. process switches from waiting/new to ready
  4. process terminates
- preemptive vs non-preemptive
    - preemptive: scheduler can interrupt a process
    - non-preemptive
        - process gives up the CPU voluntarily
        - easy, requires no special hardware
        - poor response time for interactive and real-time systems
    - preemptive
        - OS can force a process to give up the CPU
            - when the process exceeds its time slice
            - when a higher priority process becomes ready
        - requires special hardware (timer)
        - may require synchronization 
        - favored by most OSes
    - scheduling is non-preemptive under 1 and 4
        - the process voluntarily gives up the CPU 
    - scheduling is preemptive under all other conditions
        - the process is forced to give up the CPU

### Dispatcher
- functions
    - get the new process from scheduler
    - switch out the context of the current process
    - give control of the CPU to the new process
    - jump to the proper location in the new process
- dispatch latency
    - time taken by the dispatcher to stop one process and start another
### Scheduling Queues
- job queue
    - set of all processes in the system
    - scheduled by the long-term scheduler
        - some OSes may not have a long-term scheduler
            - e.g. phones, embedded systems, etc
- ready queue
    - set of all processes residing in main memory, ready and waiting to execute
    - scheduled by the short-term or CPU scheduler
- device queue
    - set of processes waiting for an I/O device
    - scheduled by the I/O scheduler
    - I/O completion moves the process to the ready queue
    - multiple processes can be waiting for the same device

# Scheduling
### Performance Metrics
- CPU utilization
    - keep the CPU as busy as possible
        - i.e. keep the CPU utilization as close to 100% as possible
    - 0% to 100%
- Throughput
    - number of processes that complete their execution per time unit
- Turnaround time
    - amount of time to execute a particular process
    - time from submission to completion
- Waiting time
    - amount of time a process has been waiting in the ready queue
- Response time
    - amount of time it takes from when a request was submitted until the first response is produced
    - for time-sharing systems
    - may not be the same as turnaround time
- Scheduling goals
    - maximize CPU utilization
    - minimize turnaround, wait, and response times
    - fairness to processes and users
        - starvation: a process may never be scheduled
        - aging: increase priority of processes that have been waiting for a long time
        - priority: some processes are more important than others
        - interactive response time: users expect a quick response

### Evaluation Methods
- Criteria
    - specify relative importance of metrics
    - consider system specific goals and measures
- Deterministic modeling
    - takes a particular predetermined workload and defines the performance of each algorithm for that workload
        - i.e. a test workload simulation
            - each algorithm is run on the same workload and the results are compared
            - the workload is usually a set of processes with known CPU and IO bursts
            - the results are usually the average of several runs
    - simple and fast
        - gives exact numbers
    - difficult to generalize
    - can recognize patterns over several inputs
    - used for explaining and predicting behavior of algorithms
### Workload Models and Gantt Charts
- workload model
    - set of processes that are submitted to the system
    - may be real or synthetic
    - <img src="images/workloadmodel.png">
        
        - the burst times are the CPU bursts of the processes
        - they are the burst lengths each process requests, not necessarily what it receives in one uninterrupted run
            - i.e. the burst time the process would like to have
- Gantt chart
    - bar chart illustrating a schedule
    - <img src="images/granttchart.png">
        
        - this figure shows a batch schedule 
            - i.e. a FCFS schedule
            - ordered from P1 to P4 because they are submitted in that order
    - chart will be different for different scheduling algorithms and different workloads
- interpreting a Gantt chart
    - <img src="images/granttinterp.png">
        
        - A, B, and C are processes submitted at $t_0$
        - turnaround time is the time from submission to completion
            - $t_f - t_0$
        - waiting time is the time spent in the ready queue
            - e.g. for A it is the sum of the lengths of other processes before its final burst
                - i.e. B & C + B & C +nC
        - response time is the time from submission to first response
            - e.g. for A it is 0 because it starts executing immediately

### Scheduling Algorithms
- First Come First Serve
    - schedules processes in the order they arrive in the ready queue
    - non-preemptive
    - implemented by FIFO queue
    - advantages
        - simple and easy to understand
        - cannot cause starvation
        - good for batch systems
    - disadvantages
        - average waiting time may be too long
            - large variation based on arrival time
        - cannot balance CPU and IO bound processes
            - convoy effect - short process behind a long process
        - not good for interactive systems or time-sharing systems
    - e.g. P1, P2, P3
        - <img src="images/FCFSeg.png">
        - <img src="images/FCFSeg1.png">
        - wait time (Pn) = completion - burst - arrival 
        - turnaround time (Pn) = completion - arrival 
        - wait time (P1) = 24 - 24 - 0 = 0
        - turnaround time (P1) = 24 - 0 = 24
        - wait time (P2) = 27 - 3 - 0 = 24
        - turnaround time (P2) = 27 - 0 = 27
        - wait time (P3) = 30 - 3 - 0 = 27
        - turnaround time (P3) = 30 - 0 = 30
        - average wait time = $\frac{0 + 24 + 27}{3} = 17$
    - e.g. P2, P3, P1
        - <img src="images/FCFSeg2.png">
        - same processes, but a different arrival order
            - changes the average wait time and turnaround time
- Shortest Job First 
    - orders processes by shortest CPU burst
    - allocates CPU to the process at the front of the list
        - i.e. the process with the shortest CPU burst
    - Advantage
        - minimizes average waiting time
        - guarantees that the average waiting time is optimal
    - disadvantage
        - requires knowledge of the length of the next CPU burst
            - not possible in most cases
            - how do you know the length of the next CPU burst for a process?
        - can cause starvation
            - long processes may never be scheduled
    - e.g.
        - <img src="images/SJFeg.png">
        - wait time (P1) = 9 - 6 = 3
        - wait time (P2) = 24 - 8 = 16

### Estimating Length of Next CPU Burst
- can only be estimated
    - many models for prediction
- we will use an exponential average of past burst length
- Formula
    1. $t_n$ = actual length of the nth CPU burst
    2. $\tau_{n+1}$ = predicted value for the next CPU burst
    3. $\alpha, 0 \leq \alpha \leq 1$ 
    4. $\tau_{n+1} = \alpha t_n + (1 - \alpha) \tau_n$
    - if $\alpha = 0$ then $\tau_{n+1} = \tau_n$
        - recent history does not matter
    - if $\alpha = 1$ then $\tau_{n+1} = t_n$
        - only the most recent burst matters
    - formula can be expanded to include each former burst explicitly
        - $\tau_{n+1} = \alpha t_n + \alpha(1 - \alpha) t_{n-1} + ... + \alpha(1 - \alpha)^j t_{n-j} + ...$
            - $\tau_{n+1}$ is the next predicted burst time
            - $t_n$ is the actual burst time
            - $\alpha$ is the weight given to the most recent burst
            - $j$ is the number of bursts ago
    - <img src="images/BurstEstEg.png">
        
        - blue is estimate
        - if actual time is constant, estimate approaches actual time
            - e.g. with $\alpha = 0.5$ and $t_n = \tau_n = 20$: $\tau_{n+1} = 0.5(20) + 0.5(20) = 20$

### Preemptive SJF
- new shorter processes can preempt longer currently running processes
- <img src="images/preSJFeg.png">

    1. P1 scheduled at $t_0$
    2. P2 arrives and scheduler is activated
        - P1 has an expected remaining burst time of 7 units
        - P2's expected burst time < 7
        - P2 preempts P1
    3. P3 arrives and scheduler is activated
        - P2 has an expected remaining burst time of 3 units
        - P3's expected burst time > 3
        - P3 is added to the ready queue
    4. P4 arrives and scheduler is activated
        - P2 has an expected remaining burst time of 2 units
        - P4's expected burst time > 2
        - P4 is added to the ready queue
    5. P2 completes, 3 processes remain in the ready queue
    - flow:
        - 0 to 1: P1
        - 1 to 5: P2
        - 5 to 10: P4
        - 10 to 17: P1
        - 17 to 26: P3
        <img src="images/preSJFeg1.png">
        
    - W(P1) = 17 - 8 - 0 = 9
    - W(P2) = 5 - 4 - 1 = 0
    - W(P3) = 26 - 9 - 2 = 15
    - W(P4) = 10 - 5 - 3 = 2
    - Ave(W) = $\frac{9 + 0 + 15 + 2}{4} = \frac{26}{4} = 6.5$

### Priority Scheduling
- priority number is associated with each process
- CPU allocated to the process with highest priority
    - ties broken in FCFS order
- internally determined priorities
    - time limit, memory requirements, etc
    - SJF uses next CPU burst time for priority
- externally determined priorities
    - process importance, user level, etc
- can be preemptive or not
- low number generally means higher priority
- advantages
    - priorities can be as general as needed
- disadvantages
    - low priority processes may never execute
- aging
    - technique to prevent starvation
    - as time progresses, increase the priority of the process
        - no one true method for implementing aging 

### Round Robin
- algorithm
    - arrange jobs in FCFS order
    - allocate CPU to first job for one time slice
    - preempt job if it does not complete in one time slice and put it at the end of the ready queue
        - if it does complete, it cedes the CPU voluntarily
    - allocate CPU to next job in FCFS order
- a time slice is called a time quantum
- by definition preemptive
    - can be considered FCFS with preemption
- advantages
    - simple
    - avoids starvation
- disadvantages
    - may involve a lot of context switching
    - average wait time is at least as high as under SJF
        - SJF is optimal for average wait time, so no other algorithm can do better
    - I/O bound processes may suffer on heavily loaded systems
        - i.e. a process that blocks for I/O gives up the rest of its time slice to other processes
- e.g. 
    <img src="images/RoundRobin.png"> 
- performance depends on the size of the time quantum
    - with long time quantum, it is essentially FCFS
    - with short time quantum, it has high context switching overhead
- general characteristics
    - time quanta are usually 10 to 100 milliseconds
    - context switching overhead is usually < 10 microseconds
    - results in longer average wait times than SJF
    - better response time in interactive systems
        - turnaround time depends on the size of the time quantum

### Lottery Scheduling
- built to address the fairness issue of priority scheduling
- algorithm
    - each process has some tickets
    - scheduler draws a random ticket each time slice
    - on average, allocated CPU time for a process is proportional to number of tickets held
- to approximate SJF, short jobs get more tickets
    - SJF is optimal
- all jobs get at least one ticket
    - to prevent starvation

### Multilevel Queue Scheduling
- most OSes have multiple scheduling algorithms
- ready queue is partitioned into separate queues
    - e.g. foreground (interactive) and background (batch)
- each queue has its own scheduling algorithm
    - foreground - RR
    - background - FCFS
- scheduling must be done between the queues
    - i.e. the schedulers must be scheduled
    - fixed priority scheduling
        - i.e. serve all from foreground then from background
        - e.g. background processes run only when the foreground queue is empty, so they may starve
    - time slice
        - i.e. each queue gets a certain amount of CPU time which it can schedule amongst its processes
        - e.g. 80% foreground RR, 20% background FCFS
- useful when
    - processes can be easily classified into groups
    - each group has different scheduling needs
- algorithm
    - partition ready queue into separate queues
    - determine some scheduling algorithm for each queue
        - e.g. RR, SJF, FCFS, etc
    - determine the inter-queue scheduling method
        - e.g. fixed priority, time slice, etc
    - permanently assign processes to one queue
        - e.g. foreground and background
- **processes do not move between queues automatically**
    - e.g. a process cannot move from foreground to background without user intervention
    - e.g. a process cannot move from background to foreground without user intervention
- e.g.
    <img src="images/MultilevelQueue.png">

### Multilevel Feedback Queue Scheduling
- allows processes to move between queues dynamically
- algorithm
    - multiple queues with different scheduling algorithms
    - round robin scheduling within each queue
    - run highest priority jobs first, then lower priority jobs
    - jobs start in the highest priority queue
    - if time slice expires, move down one queue
    - if time slice does not expire, move up one queue
- different queues may have a different number of time slices
    - e.g. priority 0 may only allow 1 slice, priority 1 may allow 2 slices, etc
- approximates SRTF (shortest remaining time first)
    - CPU bound processes will move down the queue quickly
    - I/O bound processes will move up the queue quickly
- unfair for long running processes
    - they will always be in the lowest priority queue
    - countermeasure: aging
        - increase priority of processes that have been waiting for a long time
        - difficult to tune aging parameters

### Case Study: Solaris
- priority based scheduling
    - six classes
        - real time, system, interactive, fair share, time share, and idle
            - listed in order of priority
        - different classes have different scheduling algorithms
            - e.g. real time is FCFS, interactive is round robin, etc
    - default class is time share
        - uses multilevel feedback queue scheduling
        - inverse relation between time slice and priority
            - higher priority = shorter time slice
            - lower priority = longer time slice
        - good response time for interactive processes
        - good throughput for CPU bound processes
- 60 priority levels
    - 0 to 59
    - 59 is highest priority
    - 0 is lowest priority
    - 60 is reserved for the idle process
- <img src="images/solarisdispatch.png">

    - the Solaris dispatch table
    - `time quantum` is the time slice
    - `time quantum expired` is the queue a process is sent to if the time slice expires
    - `return from sleep` is the queue a process is sent to when it returns from sleep (e.g. after I/O), i.e. if its time slice did not expire
- Note for Exams:
    - scheduler does not preempt even if round robin unless specified in question

### Thread Scheduling
- Linux only supports one to one mapping
    - one user thread per kernel thread
    - so the below is not applicable to Linux
- on systems supporting threads,
    - kernel threads are the real scheduling entities
    - user threads must be mapped to kernel threads
    - scheduling attributes may be set at thread creation
- Contention-scope
    - PTHREAD_SCOPE_PROCESS
        - group user threads to contend for kernel threads
        - threads are scheduled together at the process level
            - i.e. a thread competes only with other threads in the same process
    - PTHREAD_SCOPE_SYSTEM
        - directly assigned to kernel threads, contend with other kernel threads
        - threads are scheduled independently at the system level
            - i.e. a thread competes with all other threads in the system
- inheritsched
    - PTHREAD_INHERIT_SCHED
        - new threads inherit scheduling attributes of creating thread
    - PTHREAD_EXPLICIT_SCHED
        - new threads are created with explicitly specified attributes
- schedpolicy
    - SCHED_OTHER
        - default scheduling policy
        - regular non-real-time scheduling
        - time sharing
    - SCHED_FIFO
        - real time FCFS
        - first in first out
    - SCHED_RR
        - round robin
- schedparam
    - set/get priority of the thread
- all parameters are only relevant when
    - thread library supports many to one user level threads
    - real-time scheduling 

- pthread scheduling example
```C
#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 5

void *runner(void *param); /* each thread begins control in this function */

int main(int argc, char *argv[]){
    int i;
    pthread_t tid[NUM_THREADS];
    pthread_attr_t attr;

    pthread_attr_init(&attr); /* get the default attributes */
    /* set the contention scope to PROCESS or SYSTEM */
    pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
    /* set the scheduling policy - FIFO, RR, or OTHER */
    pthread_attr_setschedpolicy(&attr, SCHED_OTHER);

    /* create the threads, then wait for each one to finish */
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], &attr, runner, NULL);
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);

    pthread_attr_destroy(&attr);
    return 0;
}

void *runner(void *param){
    printf("I am a thread\n");
    pthread_exit(0);
}
```

### Multiprocessor Scheduling Issues
- multiprocessor scheduling
    - asymmetric multiprocessing
        - only one processor runs the OS and accesses the system data structures
        - other processors are slaves
        - master processor distributes work to slaves
        - master may become a bottleneck
    - symmetric multiprocessing (more common)
        - each processor runs the OS and accesses the system data structures
        - each processor is self scheduling
        - each processor may have its own private ready queue
        - processors may share a common ready queue
            - this is more common
        - advantage
            - no race conditions
            - load balancing
                - distribute processes evenly across processors
        - disadvantage
            - more complex
            - more overhead
- process affinity
    - process has affinity for the processor it is currently running on
    - reduces memory and cache overhead
        - caches on a given processor will already have the data for the process running on that processor
        - moving the process to another processor will cause the caches to be flushed and refilled
    - memory affinity is important for NUMA systems
        - non-uniform memory access
        - memory access time depends on the location of the memory relative to the processor
    - soft and hard processor affinity
        - how strictly the OS follows the affinity
        - soft affinity
            - process prefers to run on a particular processor
            - but can be moved to another processor
        - hard affinity
            - process can only run on a particular processor
            - cannot be moved to another processor
- load balancing
    - evenly distribute processes across processors
    - important if each processor has its own ready queue
    - uses *push migration* and *pull migration*
        - push/pull processes from busy processors to idle processors
        - push migration
            - periodically check the load on each processor
            - if a processor is overloaded, move a process to another processor
        - pull migration
            - idle processors pull processes from busy processors
            - idle processors periodically check the load on busy processors
            - if a processor is overloaded, pull a process from it
- multicore processors
    - multiple processor cores on a single chip
        - uniform memory access
        - faster intercore communication
    - may be simultaneously multithreaded (SMT)
        - each core can execute multiple threads simultaneously
        - instructions from multiple threads are simultaneously live in different pipeline stages
        - OS is given a view of one processor per hardware thread
        - may reduce memory stalls
        - may increase resource contention
        - e.g. Intel Hyperthreading