# Persistence

With CPU/memory virtualization and concurrency, we focused much of our discussion on safely sharing resources. For file systems and disks, our major concern will be robustness.

Our goal is to "permanently" store information so that it persists across reboots.

When we write a file, we want to ensure that what we wanted to write gets written.

We need information to persist not only across reboots but also across unexpected power-offs.

An unexpected power-off could corrupt the file system, and we need to guarantee that this can't happen.

## Intro to I/O Devices

Computers are organized as a hierarchy of devices.

At the top is the CPU/memory bus.

Below this are other buses that connect to other devices.

E.g., graphics cards, hard disks, keyboards/mice.

Accessing any I/O device is at least an order of magnitude slower than performing CPU computations.

Many I/O devices have their own microprocessors or even mini CPUs and they execute independently from the main CPU.

This allows the OS to delegate the work of performing, say, a write: the OS issues a request to the hard disk and the disk carries it out on its own.

These requests are issued by writing to a set of outward-facing registers on the hard disk:
- COMMAND
- DATA
- STATUS

The OS likes to write to files in 4KB chunks. When it wants to write to a file, it copies the 4KB of data to be written from memory into the DATA register, then writes the "write" command to the COMMAND register.

Then the hard disk will write that data to the actual disk.

STATUS is used by the hard disk to post its current status (busy, done).

**A Generic Protocol**

OS Writing to Hard Disk:

```
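// Wait until the device is idle.
while (STATUS == BUSY)
    ; // spin
// Writing COMMAND is what starts the request, so DATA must be filled first.
write data -> DATA
write command -> COMMAND
// Wait until the device has finished the request.
while (STATUS == BUSY)
    ; // spin
// check status, retrieve information, etc.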
```
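To make the cost of spinning concrete, here is a minimal sketch of the protocol above against a simulated device. `SimulatedDisk`, `polls_per_request`, and `polled_write` are hypothetical names invented for this illustration, and the assumption that the device finishes after a fixed number of status reads is a simplification, not how a real disk behaves.

```python
class SimulatedDisk:
    """Hypothetical device model: stays busy for a fixed number of STATUS reads."""

    BUSY, DONE = "busy", "done"

    def __init__(self, polls_per_request=5):
        self.polls_per_request = polls_per_request
        self.remaining = 0     # STATUS reads left until the current request completes
        self.data = None       # DATA register
        self.command = None    # COMMAND register
        self.stored = []       # blocks that have reached the "platter"

    @property
    def status(self):
        # Each read of STATUS advances the simulated device by one step.
        if self.remaining > 0:
            self.remaining -= 1
            if self.remaining == 0 and self.command == "write":
                self.stored.append(self.data)
            return self.BUSY
        return self.DONE

    def write_command(self, command):
        # Writing COMMAND starts the request.
        self.command = command
        self.remaining = self.polls_per_request


def polled_write(disk, block):
    """Follow the generic protocol and return how many times the CPU spun."""
    spins = 0
    while disk.status == SimulatedDisk.BUSY:   # wait for the device to be idle
        spins += 1
    disk.data = block                          # write data -> DATA
    disk.write_command("write")                # write command -> COMMAND
    while disk.status == SimulatedDisk.BUSY:   # wait for completion
        spins += 1
    return spins


disk = SimulatedDisk(polls_per_request=5)
spins = polled_write(disk, b"x" * 4096)
print(spins, len(disk.stored))
```

Every spin counted here is a moment in which the CPU did nothing but re-read STATUS.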

This would be horribly inefficient: the CPU wastes its time spinning on STATUS while the disk does the work.

For efficiency, we use interrupts.



When a process makes an I/O request, the OS will issue the command to the hard disk and it will put the process to sleep in the BLOCKED state.

When the request is complete, the disk issues an interrupt which will trigger the OS to change that process from BLOCKED to READY.
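The BLOCKED/READY hand-off can be sketched as a tiny state machine. This is only an illustration: `Process`, `TinyScheduler`, `issue_io`, and `on_disk_interrupt` are hypothetical names, and a real OS tracks far more state than this.

```python
from collections import deque


class Process:
    def __init__(self, name):
        self.name = name
        self.state = "READY"


class TinyScheduler:
    """Hypothetical scheduler that tracks only READY and BLOCKED processes."""

    def __init__(self, processes):
        self.ready = deque(processes)
        self.blocked = {}            # request id -> process waiting on that request
        self.next_request_id = 0

    def issue_io(self, proc):
        """proc made an I/O request: hand it to the disk and block proc."""
        request_id = self.next_request_id
        self.next_request_id += 1
        self.ready.remove(proc)
        proc.state = "BLOCKED"
        self.blocked[request_id] = proc
        return request_id

    def on_disk_interrupt(self, request_id):
        """Interrupt handler: the request finished, so its process may run again."""
        proc = self.blocked.pop(request_id)
        proc.state = "READY"
        self.ready.append(proc)


a, b = Process("A"), Process("B")
sched = TinyScheduler([a, b])

req = sched.issue_io(a)
state_while_waiting = a.state
ready_while_waiting = [p.name for p in sched.ready]

sched.on_disk_interrupt(req)
state_after_interrupt = a.state
ready_after_interrupt = [p.name for p in sched.ready]

print(state_while_waiting, ready_while_waiting)
print(state_after_interrupt, ready_after_interrupt)
```

While A is blocked, B remains on the ready queue and can use the CPU, which is the gain over spinning.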



## Some more Detail

Hard disks read and write in 512-byte increments because they are organized into 512-byte sectors.

The OS likes to perform reads and writes in increments of 4KB.
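The relationship between the two sizes is simple arithmetic, sketched below. The helper `sectors_for` is a hypothetical name, and the sketch assumes sectors are numbered from 0 in byte order.

```python
SECTOR_SIZE = 512        # bytes per disk sector
BLOCK_SIZE = 4 * 1024    # bytes per OS read/write


def sectors_for(offset, length):
    """Return the sector numbers touched by the byte range [offset, offset + length)."""
    first = offset // SECTOR_SIZE
    last = (offset + length - 1) // SECTOR_SIZE
    return range(first, last + 1)


sectors_per_block = BLOCK_SIZE // SECTOR_SIZE
second_block = list(sectors_for(BLOCK_SIZE, BLOCK_SIZE))

print(sectors_per_block)
print(second_block)
```

Under these assumptions, each 4KB request from the OS spans eight consecutive sectors on the disk.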

If the OS wants to write 4KB of data to disk, it needs to copy 4KB of data from memory to the DATA register.

This can take a while: copied one byte at a time, that is 4096 memory requests!

For efficiency's sake, we don't want to waste the OS's time performing many tedious memory copies.

There is a dedicated piece of hardware whose purpose is to copy data from memory to DATA on behalf of the OS.

The **DMA** (Direct Memory Access) engine does this copying.

The OS tells the DMA engine what range of memory to write and the DMA engine copies that memory, freeing the OS to continue scheduling processes productively.
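A rough sketch of the difference, counting only how many copy steps the CPU itself performs. `DataRegister`, `DmaEngine`, `programmed_io`, and `dma_io` are hypothetical names for this illustration. The model is simplified: a real DMA transfer runs concurrently with the CPU and signals completion with an interrupt, whereas here it simply runs to completion inside `transfer`.

```python
class DataRegister:
    """Hypothetical DATA register that accumulates the bytes handed to the device."""

    def __init__(self):
        self.buffer = bytearray()

    def write(self, chunk):
        self.buffer += chunk


def programmed_io(memory, start, length, data_reg):
    """The CPU copies every byte itself; returns the number of CPU copy steps."""
    cpu_steps = 0
    for addr in range(start, start + length):
        data_reg.write(memory[addr:addr + 1])
        cpu_steps += 1
    return cpu_steps


class DmaEngine:
    """Hypothetical DMA engine: copies a memory range without involving the CPU."""

    def transfer(self, memory, start, length, data_reg):
        data_reg.write(memory[start:start + length])


def dma_io(memory, start, length, data_reg, dma):
    """The CPU only describes the transfer; returns the number of CPU steps."""
    dma.transfer(memory, start, length, data_reg)
    return 1  # one step: telling the DMA engine which range to copy


memory = bytes(range(256)) * 32          # 8KB of example memory contents
pio_reg, dma_reg = DataRegister(), DataRegister()

pio_steps = programmed_io(memory, 0, 4096, pio_reg)
dma_steps = dma_io(memory, 0, 4096, dma_reg, DmaEngine())

print(pio_steps, dma_steps, pio_reg.buffer == dma_reg.buffer)
```

Both paths deliver the same 4KB to the device; only the amount of CPU work differs.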



## An Intro to Hard Disks

A hard disk has physical spinning disks which hold the data.

The disks are aluminum platters that have magnetized layers on the top and bottom.

Magnetic patterns on these layers store the 1's and 0's.

Disks are organized into sectors that are 512 bytes large.

They have concentric tracks on them containing the sectors.

Each magnetized layer has an arm & head which read/write data to that layer.

To read some byte, we must wait for the head to **seek** the correct track and then for the disk to rotate the byte under the head.

The majority of the time reading/writing is spent seeking the correct track and waiting for the disk to rotate.

Hard disks advertise their average seek and rotation times.

`T_io = seek_time + rotation_time + transfer_time`
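As a worked example of the formula, the sketch below plugs in numbers for a hypothetical drive (4 ms average seek, 15,000 RPM, 125 MB/s transfer rate). These figures are made up for illustration, not taken from a real datasheet. It also assumes the average rotational delay is half of one full revolution.

```python
def rotation_time_ms(rpm):
    """Average rotational delay, assumed to be half a revolution, in milliseconds."""
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def transfer_time_ms(num_bytes, mb_per_s):
    """Time to move num_bytes at the given rate (1 MB taken as 10**6 bytes)."""
    return num_bytes / (mb_per_s * 1_000_000) * 1000


def t_io_ms(seek_ms, rpm, num_bytes, mb_per_s):
    """T_io = seek_time + rotation_time + transfer_time, all in milliseconds."""
    return seek_ms + rotation_time_ms(rpm) + transfer_time_ms(num_bytes, mb_per_s)


# Hypothetical drive parameters, chosen only for illustration.
seek_ms, rpm, rate_mb_per_s = 4.0, 15_000, 125

total_ms = t_io_ms(seek_ms, rpm, 4096, rate_mb_per_s)
transfer_share = transfer_time_ms(4096, rate_mb_per_s) / total_ms

print(f"T_io for one 4KB request: {total_ms:.3f} ms")
print(f"fraction spent transferring: {transfer_share:.2%}")
```

With these assumed numbers, seek and rotation account for about 6 ms of the request while the 4KB transfer itself takes well under a tenth of a millisecond.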

<br>
<img src="images/01-disk.png" width="500">
<br>