#### Here
- 27Nov23 started on Ch9 slide 9 so most notes probably actually belong in the Ch8 notebook

### Handling a Page Fault
<img src="images/pagefault.png">

- all the pages required for an instruction must be in memory at the same time for the instruction to execute


### Performance of Demand Paging
- Page Fault Rate $p$: $0 \leq p \leq 1$
    - if p = 0, no page faults
    - if p = 1, every reference is a fault
- Effective Access Time (EAT)
    - EAT = (1 - p) x memory access + p x (page fault overhead + swap page out time + swap page in time + restart overhead)
- e.g. Memory access time = 200ns, average page-fault service time = 8ms
    - EAT $= (1-p) \times 200 + p \times 8,000,000$ ns
        - $= 200 + p \times 7,999,800$ ns
    - if one access out of 1,000 causes a page fault, then p = 0.001
        - EAT $= 200 + 0.001 \times 7,999,800$ ns
        - $= 200 + 7,999.8$ ns
        - $= 8,199.8$ ns
        - $\approx 8.2 \mu s$
        - roughly 40 times slower than memory access alone (8,199.8 / 200 $\approx$ 41)
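
The EAT arithmetic can be checked by plugging the numbers into the formula. This is a minimal Python sketch; the function name `effective_access_time_ns` and its parameter names are made up for illustration, and the defaults are the 200 ns and 8 ms values from the example.

```python
def effective_access_time_ns(p, memory_access_ns=200, fault_service_ns=8_000_000):
    """EAT = (1 - p) * memory access + p * page-fault service time, in ns."""
    return (1 - p) * memory_access_ns + p * fault_service_ns

eat = effective_access_time_ns(0.001)   # one fault per 1,000 accesses
slowdown = eat / 200                    # relative to a plain memory access
print(f"EAT = {eat:.1f} ns, slowdown = {slowdown:.1f}x")
```

Trying smaller values of `p` shows how rare page faults have to be before the slowdown becomes negligible.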

### Process Creation: Copy-on-Write
- fork() duplicates the parent address space for the child
- demand paging with copy-on-write allows sharing address spaces
- Copy-on-Write
    - shared pages marked as copy-on-write
    - if either process modifies a shared page, only then is the page copied
    - only modified page is copied
- CoW allows more efficient process creation as only modified pages are copied
- free pages are allocated from a pool of zeroed-out pages
    - CoW avoids having to zero out the pages
<img src="images/CoW.png">


### Page Replacement
- review of demand paging
    - separates logical memory from physical memory
    - allows logical address space to be larger than physical address space
    - enables greater multiprogramming
        - higher CPU utilization and throughput
    - allows faster process startup
- drawbacks
    - may increase later individual process access time
        - EAT
    - potential for over-allocation of physical memory
        - if allocated memory is not used, it is wasted
        - currently active process may reference more pages than there is physical memory space
        - pages from active processes may need to be replaced
            - refer to the image under Page Fault above
- page replacement is needed when a page fault occurs and there are no free frames available
- how do you choose which page to replace?
    - terminate user process?
    - swap out an entire process?
    - find a page in memory to swap and hope it isn't needed? 
        - most common approach
- the same page may be brought into memory several times
    - page-in is expensive
    - page-out is expensive
    - page replacement is expensive

### Basic Page Replacement
1. Find the location of the desired page on disk
2. Find a free frame
    - if there is a free frame, use it
    - if there is no free frame, use a page replacement algorithm to select a *victim* frame
3. Bring the desired page into the (newly) free frame; update the page and frame tables
4. Restart the process
5. If there is no free frame, then block the process
    - use a *modify* or *dirty* bit with each frame
        - used to reduce overhead of page transfers
        - only modified pages are written to disk
        - if 0, page has not been modified from the disk copy
            - does not need to be written to disk
        - if 1, page has been modified and differs from disk copy
            - must be written to disk
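
The effect of the dirty bit can be sketched with a small simulation that counts page-ins and page-outs. This is only an illustration: the function name `run` and the `(page, is_write)` access format are hypothetical, and FIFO is assumed for victim selection purely to keep the sketch short.

```python
from collections import deque

def run(accesses, num_frames):
    """accesses: list of (page, is_write). Returns (page_ins, page_outs)."""
    queue = deque()   # resident pages in load order (FIFO victim choice, an assumption)
    dirty = {}        # resident page -> dirty bit
    page_ins = page_outs = 0
    for page, is_write in accesses:
        if page not in dirty:                 # page fault
            if len(queue) == num_frames:      # no free frame: pick a victim
                victim = queue.popleft()
                if dirty.pop(victim):         # only modified pages are written back
                    page_outs += 1
            queue.append(page)
            dirty[page] = 0                   # freshly loaded page matches the disk copy
            page_ins += 1
        if is_write:
            dirty[page] = 1                   # page now differs from the disk copy
    return page_ins, page_outs

print(run([(1, True), (2, False), (3, False), (1, False)], num_frames=2))
```

In this run, page 1 is written before it is evicted, so it costs a page-out; page 2 is clean when evicted and is simply dropped.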

### Page Replacement Algorithms
- Criteria: get the lowest page fault rate
- page replacement schemes
    - FIFO
    - Optimal
    - LRU
    - counting
- evaluation metrics
    - simulate on a string of page references
    - compute the number of page faults on each page reference string
- we will use the string
    - 7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1
- expected result:
    <img src="images/replacementgraph.png">

- more frames (more memory) should reduce page faults
    - but not always

### FIFO Page Replacement
- simplest page replacement algorithm
    - each page is assigned an *arrival* time when it is loaded into memory
    - replace the oldest page during page replacement
    - implemented with a FIFO queue of all pages
    - replacement page found at the head of the queue
    - new page is added to the tail of the queue
    <img src="images/fifo_page_alg.png">
 
    - 7 is brought into physical memory (the process has 3 frames allocated)
        - page fault but no replacement
        - 7 is the oldest page
    - 0 is brought into physical memory
        - page fault but no replacement
        - 7 is the oldest page
    - 1 is brought into physical memory
        - page fault but no replacement
        - 7 is the oldest page
        - all 3 frames are now allocated
    - 2 is brought into physical memory
        - page fault, replacement needed
        - 7 is the oldest page
        - 7 is replaced with 2
        - 0 is now the oldest page
    - **0 is already in physical memory**
        - no page fault
        - 0 is the oldest page
    - 3 is brought into physical memory
        - page fault, replacement needed
        - 0 is the oldest page
        - 0 is replaced with 3
        - 1 is now the oldest page
    - 0 is brought into physical memory
        - page fault, replacement needed
        - 1 is the oldest page
        - 1 is replaced with 0
        - 2 is now the oldest page
    - 4 is brought into physical memory
        - page fault, replacement needed
        - 2 is the oldest page
        - 2 is replaced with 4
        - 3 is now the oldest page
    - and so on...
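
The FIFO walk-through can be reproduced with a short simulation. This is a minimal sketch assuming 3 frames and the reference string from these notes; `fifo_faults` is an illustrative name, not something from the slides.

```python
from collections import deque

REFERENCE_STRING = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement (num_frames >= 1)."""
    queue = deque()      # head = oldest page, tail = newest page
    resident = set()     # pages currently in physical memory
    faults = 0
    for page in refs:
        if page in resident:
            continue                     # hit: FIFO order does not change
        faults += 1
        if len(queue) == num_frames:     # no free frame: evict the oldest page
            resident.remove(queue.popleft())
        queue.append(page)
        resident.add(page)
    return faults

print(fifo_faults(REFERENCE_STRING, 3))
```

With 3 frames the simulation reports 15 page faults for this string.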

### Belady's Anomaly
- for some page replacement algorithms, the page fault rate may increase as the number of allocated frames increases
    - can occur with FIFO
- e.g. string 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5
    <img src="images/beladyfifo.png">

    - overall, more frames should reduce page faults in most cases
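
Belady's Anomaly is easy to reproduce by running FIFO on the example string with 3 and then 4 frames. A minimal sketch (the helper name `fifo_faults` is illustrative):

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults under FIFO replacement (num_frames >= 1)."""
    queue, resident, faults = deque(), set(), 0
    for page in refs:
        if page in resident:
            continue
        faults += 1
        if len(queue) == num_frames:
            resident.remove(queue.popleft())   # evict the oldest page
        queue.append(page)
        resident.add(page)
    return faults

belady_string = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4):
    print(frames, "frames ->", fifo_faults(belady_string, frames), "faults")
```

Going from 3 to 4 frames raises the fault count from 9 to 10 on this string.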

### Optimal Page Replacement Algorithm
- an algorithm that produces the lowest page fault rate
    - replaces the page that will not be used for the longest period of time
    - provably optimal
    - does not suffer from Belady's Anomaly
    - not implementable in practice
    - can only be determined after the fact
        - requires future knowledge of the reference string
        - used as a benchmark for other algorithms
- e.g. for previously shown string from FIFO
    <img src="images/optimal_page.png">

    - 7 is brought into physical memory (the process has 3 frames allocated)
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline - \\ \hline - \\ \hline \end{array}$
    - 0 is brought into physical memory
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline 0 \\ \hline - \\ \hline \end{array}$
    - 1 is brought into physical memory
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline 0 \\ \hline 1 \\ \hline \end{array}$
    - 2 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 7 is used furthest in the future
        - 7 is replaced with 2
        - $\begin{array} {|r|} \hline 2 \\ \hline 0 \\ \hline 1 \\ \hline \end{array}$
    - 0 is already in physical memory
        - no page fault
    - 3 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 1 is used furthest in the future
        - 1 is replaced with 3
    - 0 is already in physical memory
        - no page fault
    - 4 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 0 is used furthest in the future
        - 0 is replaced with 4
    - 2 is already in physical memory
        - no page fault
    - 3 is already in physical memory
        - no page fault
    - 0 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 4 is not used again in the future
        - 4 is replaced with 0
    - 3 is already in physical memory
        - no page fault
    - 2 is already in physical memory
        - no page fault
    - 1 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 3 is not used again in the future
        - 3 is replaced with 1
    - 2 is already in physical memory
        - no page fault
    - 0 is already in physical memory
        - no page fault
    - 1 is already in physical memory
        - no page fault
    - 7 is brought into physical memory
        - page fault, replacement needed
        - look to future
        - 2 is not used in the future
        - 2 is replaced with 7
    - 0 is already in physical memory
        - no page fault
    - 1 is already in physical memory
        - no page fault
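
The "look to future" rule can be written down directly, since a simulation has the whole reference string available. This is a sketch for benchmarking only (a real OS cannot know the future); `next_use` and `optimal_faults` are illustrative names.

```python
REFERENCE_STRING = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def next_use(page, future):
    """Distance to the next reference of `page`; pages never used again sort last."""
    return future.index(page) if page in future else len(future)

def optimal_faults(refs, num_frames):
    """Count page faults when evicting the page whose next use is furthest away."""
    frames = []
    faults = 0
    for i, page in enumerate(refs):
        if page in frames:
            continue                         # hit
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)              # free frame available
            continue
        future = refs[i + 1:]
        victim = max(frames, key=lambda p: next_use(p, future))
        frames[frames.index(victim)] = page  # reuse the victim's frame
    return faults

print(optimal_faults(REFERENCE_STRING, 3))
```

For the reference string above with 3 frames this gives 9 page faults, matching the nine faults in the walk-through.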


### Least Recently Used (LRU) Page Replacement
- approximates optimal page replacement
    - detect and store when each page is used
    - replace the page that has not been used for the longest period of time
    - a very good approximation of optimal
    - requires hardware support
        - otherwise it has loads of memory overhead as it adds a timestamp to each page
    - does not suffer from Belady's Anomaly
- used most frequently in practice
- must use a counter for each page
    - every time a page is referenced through this entry, copy the clock into the counter
    - when a page needs to be replaced, look at the counters to determine which page to replace
    - requires hardware support
- e.g. 
    <img src="images/LRU_page.png">

    - 7 is brought into physical memory (the process has 3 frames allocated)
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline - \\ \hline - \\ \hline \end{array}$
        - counter for 7 = 1
    - 0 is brought into physical memory
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline 0 \\ \hline - \\ \hline \end{array}$
        - counter for 7 = 1
        - counter for 0 = 2
    - 1 is brought into physical memory
        - page fault but no replacement
        - $\begin{array} {|r|} \hline 7 \\ \hline 0 \\ \hline 1 \\ \hline \end{array}$
        - counter for 7 = 1
        - counter for 0 = 2
        - counter for 1 = 3
    - 2 is brought into physical memory
        - page fault, replacement needed
        - 7 has the lowest counter value of 1
        - 7 is replaced with 2
        - $\begin{array} {|r|} \hline 2 \\ \hline 0 \\ \hline 1 \\ \hline \end{array}$
        - counter for 2 = 4
        - counter for 0 = 2
        - counter for 1 = 3
    - 0 is already in physical memory
        - no page fault
        - counter for 2 = 4
        - counter for 0 = 5
        - counter for 1 = 3
    - 3 is brought into physical memory
        - page fault, replacement needed
        - 1 has the lowest counter value of 3
        - 1 is replaced with 3
        - $\begin{array} {|r|} \hline 2 \\ \hline 0 \\ \hline 3 \\ \hline \end{array}$
        - counter for 2 = 4
        - counter for 0 = 5
        - counter for 3 = 6
    - 0 is already in physical memory
        - no page fault
        - counter for 2 = 4
        - counter for 0 = 7
        - counter for 3 = 6
    - 4 is brought into physical memory
        - page fault, replacement needed
        - 2 has the lowest counter value of 4
        - 2 is replaced with 4
        - $\begin{array} {|r|} \hline 4 \\ \hline 0 \\ \hline 3 \\ \hline \end{array}$
        - counter for 4 = 8
        - counter for 0 = 7
        - counter for 3 = 6
    - 2 is brought into physical memory
        - page fault, replacement needed
        - 3 has the lowest counter value of 6
        - 3 is replaced with 2
        - $\begin{array} {|r|} \hline 4 \\ \hline 0 \\ \hline 2 \\ \hline \end{array}$
        - counter for 4 = 8
        - counter for 0 = 7
        - counter for 2 = 9
    - 3 is brought into physical memory
        - page fault, replacement needed
        - 0 has the lowest counter value of 7
        - 0 is replaced with 3
        - $\begin{array} {|r|} \hline 4 \\ \hline 3 \\ \hline 2 \\ \hline \end{array}$
        - counter for 4 = 8
        - counter for 3 = 10
        - counter for 2 = 9
    - 0 is brought into physical memory
        - page fault, replacement needed
        - 4 has the lowest counter value of 8
        - 4 is replaced with 0
        - $\begin{array} {|r|} \hline 0 \\ \hline 3 \\ \hline 2 \\ \hline \end{array}$
        - counter for 0 = 11
        - counter for 3 = 10
        - counter for 2 = 9
    - 3 is already in physical memory
        - no page fault
        - counter for 0 = 11
        - counter for 3 = 12
        - counter for 2 = 9
    - 2 is already in physical memory
        - no page fault
        - counter for 0 = 11
        - counter for 3 = 12
        - counter for 2 = 13
    - 1 is brought into physical memory
        - page fault, replacement needed
        - 0 has the lowest counter value of 11
        - 0 is replaced with 1
        - $\begin{array} {|r|} \hline 1 \\ \hline 3 \\ \hline 2 \\ \hline \end{array}$
        - counter for 1 = 14
        - counter for 3 = 12
        - counter for 2 = 13
    - 2 is already in physical memory
        - no page fault
        - counter for 1 = 14
        - counter for 3 = 12
        - counter for 2 = 15
    - 0 is brought into physical memory
        - page fault, replacement needed
        - 3 has the lowest counter value of 12
        - 3 is replaced with 0
        - $\begin{array} {|r|} \hline 1 \\ \hline 0 \\ \hline 2 \\ \hline \end{array}$
        - counter for 1 = 14
        - counter for 0 = 16
        - counter for 2 = 15
    - 1 is already in physical memory
        - no page fault
        - counter for 1 = 17
        - counter for 0 = 16
        - counter for 2 = 15
    - 7 is brought into physical memory
        - page fault, replacement needed
        - 2 has the lowest counter value of 15
        - 2 is replaced with 7
        - $\begin{array} {|r|} \hline 1 \\ \hline 0 \\ \hline 7 \\ \hline \end{array}$
        - counter for 1 = 17
        - counter for 0 = 16
        - counter for 7 = 18
    - 0 is already in physical memory
        - no page fault
        - counter for 1 = 17
        - counter for 0 = 19
        - counter for 7 = 18
    - 1 is already in physical memory
        - no page fault
        - counter for 1 = 20
        - counter for 0 = 19
        - counter for 7 = 18
- **see screenshot from 29Nov23 for LRU page replacement algorithm scratchwork**
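
The counter bookkeeping in the LRU example can be checked mechanically. This is a minimal sketch in which the "clock" is simply the position of the reference in the string (1, 2, 3, ...), which is the convention the walk-through uses; `lru_faults` is an illustrative name.

```python
REFERENCE_STRING = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]

def lru_faults(refs, num_frames):
    """Return (fault count, final counters) for counter-based LRU."""
    last_used = {}   # resident page -> clock value of its most recent reference
    faults = 0
    for clock, page in enumerate(refs, start=1):
        if page not in last_used:                 # page fault
            faults += 1
            if len(last_used) == num_frames:      # evict the smallest counter
                victim = min(last_used, key=last_used.get)
                del last_used[victim]
        last_used[page] = clock                   # copy the clock into the counter
    return faults, last_used

faults, counters = lru_faults(REFERENCE_STRING, 3)
print(faults, counters)
```

On the reference string with 3 frames this reports 12 page faults, between FIFO and optimal.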

### Counter Implementation of LRU
- each page entry contains a counter
- copy the clock into the counter on every page access
- for replacement, find the page with the smallest counter
- requires search of full page table
- each memory access requires additional memory access to update the counter
- counter overflow?

### Stack Implementation of LRU
- keep a stack of page numbers in a linked list
- move the page to the top of the stack on every page access
- for replacement, choose the bottom page in the stack
- each update is more expensive due to stack operations
- on the other hand, replacement is a constant time operation
    - i.e. no search required
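
The stack variant can be sketched with Python's `OrderedDict`, which is backed by a linked list and so plays the role of the stack here. This is an illustration under that assumption, not how a kernel would implement it; `lru_stack_faults` is a made-up name.

```python
from collections import OrderedDict

def lru_stack_faults(refs, num_frames):
    """Count page faults for stack-based LRU (num_frames >= 1)."""
    stack = OrderedDict()   # first item = bottom (least recent), last item = top
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)        # hit: move the page to the top
            continue
        faults += 1
        if len(stack) == num_frames:
            stack.popitem(last=False)      # evict the bottom page, no search needed
        stack[page] = None                 # new page goes on top
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(lru_stack_faults(refs, 3))
```

It produces the same fault count as the counter version; only the cost profile differs (every hit reorders the stack, but eviction needs no search).
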
### LRU Approximation Algorithms
- LRU is expensive to implement so we approximate it instead
- Reference Bit Algorithm
    - keep a reference bit for each page
    - set to 1 on every page access
    - periodically reset to 0
    - when a page needs to be replaced, choose one with reference bit = 0
    - downside: cannot store the order of page access in one bit
- Record Reference Bit Algorithm
    - each page is associated with an 8-bit field that records its recent reference bits
    - at regular intervals
        - shift the reference bits right by 1 bit
        - move reference bit into the highest order bit
    - when a page needs to be replaced, choose one with lowest 8-bit value
    - downside: requires 8 times more space than the reference bit algorithm
- Second Chance Algorithm
    - keep a single reference bit for each page
    - follow FIFO page replacement
    - find replacement page
        - if reference bit = 0, replace page
        - if reference bit = 1, set reference bit to 0 and move page to the end of the queue
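
Two of these approximations are small enough to sketch. `shift_history` shows one interval of the Record Reference Bit update, and `second_chance_faults` simulates the Second Chance queue. Both names are illustrative, and the sketch assumes a newly loaded page starts with reference bit 0 (implementations differ on this).

```python
from collections import deque

def shift_history(history, ref_bit):
    """Shift the 8-bit history right and put the reference bit in the high-order bit."""
    return ((history >> 1) | (ref_bit << 7)) & 0xFF

def second_chance_faults(refs, num_frames):
    """Count page faults under Second Chance (num_frames >= 1)."""
    queue = deque()   # FIFO order of resident pages, head = oldest
    ref_bit = {}      # resident page -> reference bit
    faults = 0
    for page in refs:
        if page in ref_bit:
            ref_bit[page] = 1                    # hit: the hardware would set the bit
            continue
        faults += 1
        if len(queue) == num_frames:
            while ref_bit[queue[0]] == 1:        # referenced: give a second chance
                ref_bit[queue[0]] = 0
                queue.rotate(-1)                 # move the head to the end of the queue
            del ref_bit[queue.popleft()]         # head has bit 0: replace it
        queue.append(page)
        ref_bit[page] = 0                        # assumption: new pages start at 0
    return faults

print(second_chance_faults([1, 2, 3, 1, 4, 1], 3))
```

On the short string in the last line, plain FIFO would evict page 1 when 4 arrives and then fault on it again; Second Chance keeps page 1 because it was referenced recently.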

# Missed half an hour, up to slide 32

### Frame Allocation Algorithms
- equal
    - not especially useful
    - divide m frames among n processes
    - each process gets about m/n frames
    - may give processes more frames than they need
- proportional
    - allocate according to the size of the process
        - but how do you define size?
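            - definable in theory but not very easily in practice
    - $s_i$ = size of process $p_i$, $S = \sum s_i$, $m$ = total number of frames
    - process $p_i$ is allocated $a_i = \frac{s_i}{S} \times m$ frames
    - e.g.
        - total frames m = 64
        - $s_1$ = 10
        - $s_2$ = 127
        - S = $\sum s_i$ = 137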
        - $a_1$ = $\frac{10}{137} \times 64 = 4.67 \approx 5$
        - $a_2$ = $\frac{127}{137} \times 64 = 59.33 \approx 59$
    - downside: small processes may not get the minimum number of frames they need
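
The proportional allocation example can be recomputed in a couple of lines. A minimal sketch, with `proportional_allocation` as an illustrative name; it rounds each share to the nearest frame, so in general the rounded shares are not guaranteed to add up to exactly m.

```python
def proportional_allocation(sizes, total_frames):
    """Give each process a share of frames proportional to its size."""
    total_size = sum(sizes)
    return [round(s / total_size * total_frames) for s in sizes]

print(proportional_allocation([10, 127], 64))
```

For the sizes 10 and 127 with 64 frames this yields 5 and 59 frames.
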
### Other Frame Allocation Issues
- global vs local replacement
    - global
        - process selects a replacement frame from the set of all frames
        - one process can take a frame from another
        - a process cannot control its own page-fault rate
    - local
        - each process selects from only its own set of allocated frames
        - each process can control its own page-fault rate
        - may hinder progress by not allowing a high priority process to take a frame from a low priority process
        - benefit is scalability
            - reduces pressure on the global frame pool
            - less effect on other processes
        - less commonly used

### Thrashing
- the process spends more time paging than executing
    - process keeps swapping pages in and out
        - the pages are in active use so those swapped out are immediately needed again
    - leads to a very high page fault rate
- causes
    - too many processes
    - process does not have enough frames allocated to it  
- vicious cycle
    - a problem in early operating systems
    - flow:
        - not enough frames -> more page faults
        - lowers CPU utilization
        - OS thinks it needs to increase the degree of multiprogramming
            - i.e. it doesn't have enough processes running
        - adds more processes to the system
        - even fewer frames are available per process
        - cycle repeats
    - not as much of a problem today
        - more memory
        - better algorithms
        - better hardware support
- becomes more likely as the degree of multiprogramming increases
- <img src="images/thrashing.png">

    - adding more processes increases CPU utilization... until it doesn't
- solution
    - user: increase physical RAM
        - generally not practical in the short term
    - OS: reduce degree of multiprogramming
        - suspend some processes
        - swap some processes out to disk
        - terminate some processes

### Preventing Thrashing
- add RAM to increase the number of frames
    - a long term hardware solution 
- give the process more frames
    - more on demand solution which the OS can (attempt to) do
- local replacement
    - other processes are not affected by thrashing
        - i.e. the thrashing process is not stealing frames from functional processes
    - but the thrashing process is still thrashing
        - i.e. it is still not getting enough frames
    - may be a good solution if the thrashing process is not important
    - mitigates thrashing but does not prevent it particularly well
- Working-Set model
    - based on the locality model of process execution
        - each process phase uses a small set of memory frames
            - 3 phases
                - decompression of relevant files
                - execution of the program
                - compression of relevant files
        - execution moves from one process phase to another
    - the set of pages a process actively uses during a phase is called its working set
        - approximates the pages the process will reference in the near future, using the pages it referenced in the most recent $\Delta$
    - <img src="images/pagefault_working_set.png">
 
        - page fault rate rises on program load and falls until the next program phase begins
        - each large tooth represents the loading of a new phase
    - implementation
        - assume a particular working-set window $\Delta$
            - $\Delta$ is some interval of time or number of page references
        - WSS$_i$ = working-set size of process $P_i$, i.e. the number of pages it referenced in the most recent $\Delta$
        - D = $\sum$ WSS$_i$ = total demand for frames
        - m = total number of frames, i.e. memory size
        - if D > m, then thrashing is likely and some process should be suspended
        - with a well defined $\Delta$, this solution functions well - per Kulkarni
- Page-Fault frequency scheme
    - working set model is based on several assumptions and is complicated
    - instead, measure page fault rate routinely
    - establish an acceptable page-fault rate
        - if the actual rate is too low, the process loses a frame (or more processes are admitted)
        - if the actual rate is too high, the process gains a frame (or a process is suspended)
    - <img src="images/page_fault_frequency.png">
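
The working-set bookkeeping can be sketched as follows, taking $\Delta$ as a number of page references. The function names and the sample trace are hypothetical; a real OS approximates this with reference bits and timer interrupts rather than storing the full trace.

```python
def working_set(refs, t, delta):
    """Pages referenced in the last `delta` references, up to and including index t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])

def thrashing_likely(working_set_sizes, total_frames):
    """D = sum of WSS_i; thrashing is likely when D exceeds m."""
    return sum(working_set_sizes) > total_frames

trace = [1, 2, 1, 3, 4, 4, 4, 3, 4, 4]   # made-up reference trace for one process
early = working_set(trace, t=3, delta=4)  # window covers 1, 2, 1, 3
late = working_set(trace, t=9, delta=4)   # window covers 4, 3, 4, 4
print(early, late, thrashing_likely([len(early), len(late)], total_frames=4))
```

The working set shrinks as the trace settles into a tighter locality, which is the behaviour the model relies on.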

### Memory-Mapped Files
- memory-mapped file I/O allows file I/O to be treated as routine memory access by mapping a disk block to a page in memory
    - removes the need for read() and write() system calls
    - converts disk access to memory access
        - i.e. the file is in RAM now 
    - simplifies disk access for the user
- mechanism
    - file is read using demand paging
    - a page-sized portion of the file is read from the file system into a physical page
    - subsequent reads/writes to/from the file are treated as ordinary memory (RAM) accesses
- memory mapping is a powerful technique
    - allows file I/O to be treated as routine memory access
    - allows file I/O to be cached
    - allows memory sharing between processes
    - allows for faster file I/O in most cases
    - allows for faster file copying
        - e.g. the extra credit lab
- <img src="images/MMIO.png">

    - allows several processes to map the same file and share data
- special I/O instructions may be available to transfer data and control messages to the I/O controller
- MMIO
    - I/O device registers are mapped to logical address space
    - convenient and fast
    - this is what we did in EECS 388
        - device monitors the address bus addresses that are in its range
            - e.g. maybe the last 4 bits on a bus belong to the device
        - if the address is in its range, the device responds to the request
    - there may be a control bit to indicate whether the access is to memory or to an I/O device
        - programmed I/O
            - CPU sets the bit to indicate the type of access
        - interrupt driven I/O
            - sends an interrupt to the CPU to indicate availability
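
Memory-mapped file access can be tried from user space with Python's standard `mmap` module. This is a small sketch that maps a scratch file (its contents are made up) and then reads and writes it with ordinary slicing instead of read()/write() calls on the mapping.

```python
import mmap
import os
import tempfile

# Create a small scratch file to map.
fd, path = tempfile.mkstemp()
os.write(fd, b"hello demand paging")
os.close(fd)

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mapped:   # length 0 maps the whole file
        first_word = mapped[:5]                # read through memory, like a bytes object
        mapped[:5] = b"HELLO"                  # write in place (same length as the slice)
        mapped.flush()                         # push modified pages back to the file

with open(path, "rb") as f:
    contents = f.read()                        # the file on disk reflects the write
os.remove(path)
print(first_word, contents)
```

The mapping cannot change the file's length through slice assignment, which is why the replacement bytes have the same length as the slice.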


### Allocating Kernel Memory
- kernel memory often allocated from a free memory pool
    - kernel memory does not use demand paging
    - kernel memory is not swapped
    - kernel memory is not allocated to user processes
- reasons for treating kernel memory differently
    - some kernel memory must be contiguous
        - e.g. page tables
    - attempts to minimize waste due to internal fragmentation
        - kernel requests memory for structures of varying sizes
        - does not use paging
- strategies
    - buddy system
        - power of 2 allocation
            - satisfies requests in units sized as powers of 2
            - request rounded up to the next power of 2
            - when a smaller allocation is needed than is available, split the current chunk into two buddies
                - continue splitting until the desired size is reached 
        - <img src="images/buddy_system.png">

            - some memory is lost to internal fragmentation
    - slab allocator
        - not on the Final
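
The rounding and splitting in the buddy system can be sketched in a few lines. The function names and the 256 KB / 21 KB figures are illustrative values, not taken from the slides, and the sketch only tracks chunk sizes rather than real addresses or free lists.

```python
def next_power_of_two(n):
    """Smallest power of 2 that is >= n (n >= 1)."""
    size = 1
    while size < n:
        size *= 2
    return size

def buddy_split(chunk_kb, request_kb):
    """Chunk sizes visited while halving `chunk_kb` until it just fits the request."""
    target = next_power_of_two(request_kb)
    if target > chunk_kb:
        raise ValueError("request does not fit in the chunk")
    sizes = [chunk_kb]
    while sizes[-1] > target:
        sizes.append(sizes[-1] // 2)   # split into two buddies, keep splitting one
    return sizes

path = buddy_split(256, 21)
wasted_kb = path[-1] - 21              # internal fragmentation in the final chunk
print(path, wasted_kb)
```

A 21 KB request ends up in a 32 KB chunk, so 11 KB is lost to internal fragmentation.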

### Other Virtual Memory Issues
- ~~prepaging~~ **not on the Final**
    - ~~reduces the number of page faults that occur on process startup~~
    - ~~prepage all or some of the pages a process will need before they are referenced~~
        - ~~can save startup time for a process~~
        - ~~I/O and memory wasted if the pages are not used~~
- issues deciding page size
    - internal fragmentation
        - wasted space within a page
        - smaller pages have less internal fragmentation
    - page table size
        - a small page size increases the size of the page table
            - more bits of the logical address are used for the page number
            - more pages can be allocated, so the table needs more entries
            - not true for inverted page tables
    - I/O overhead
        - larger pages reduce the total disk latency and seek time for the same amount of data
            - i.e. n requests suffer n seek times
                - e.g. one 4 KB request suffers only 1 seek time but two 2 KB requests suffer 2 seek times

    - page faults
        - larger page size reduces the number of page faults
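
A rough way to see how page size trades page-table size against internal fragmentation is to compute both for a few sizes. The sketch assumes a 32-bit logical address space and a single-level page table, neither of which comes from the notes; the function names are illustrative.

```python
def page_table_entries(address_bits, page_size_bytes):
    """One entry per page of the logical address space (single-level table assumed)."""
    return 2 ** address_bits // page_size_bytes

def avg_internal_fragmentation(page_size_bytes):
    """On average about half of a process's last page goes unused."""
    return page_size_bytes / 2

for size in (2048, 4096, 8192):
    print(size, page_table_entries(32, size), avg_internal_fragmentation(size))
```

Doubling the page size halves the number of page-table entries and doubles the expected waste in the last page.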

- drop 2 quizzes, add 1 100% quiz, and all quizzes are out of 10