## Data Block Allocation
__Contiguous Allocation__ allocates each file as a contiguous sequence of data blocks.

 - Advantages: fast sequential access; fast allocation and deallocation; small amount of metadata
 - Disadvantages: external fragmentation (free space is split into small holes that cannot be used); needs compaction, which moves whole files around; inflexible
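
External fragmentation can be illustrated with a small sketch. The free-space bitmap and the helper `first_fit` below are hypothetical, chosen only to show an allocation failing even though enough free blocks exist in total.

```python
def first_fit(bitmap, size):
    """Return the start of the first run of `size` free blocks, or None.

    bitmap[i] is True when block i is in use (illustrative model only).
    """
    run_start, run_len = 0, 0
    for i, used in enumerate(bitmap):
        if used:
            run_len = 0              # the current hole ends here
        else:
            if run_len == 0:
                run_start = i        # a new hole starts here
            run_len += 1
            if run_len == size:
                return run_start
    return None

# 6 free blocks in total, but the largest hole is only 3 blocks long.
bitmap = [True, False, False, True, False, False, False, True, False, True]
print(first_fit(bitmap, 3))  # 4
print(first_fit(bitmap, 4))  # None: 6 blocks are free, yet no 4-block hole exists
```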
 
__Linked__ allocation treats a file as a linked list of data blocks.
 
 - Advantages: easy sequential access; disk blocks can be anywhere; no external fragmentation
 - Disadvantages: expensive direct access; if a data block is corrupted, the rest of the file could be lost
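
The cost of direct access under linked allocation can be sketched as follows. The dictionary `next_block` is a hypothetical in-memory stand-in for the per-block "next" pointers; on a real disk each hop would be a block read.

```python
def nth_block_linked(start, next_block, n):
    """Return (disk block holding logical block n, number of links followed)."""
    block, hops = start, 0
    for _ in range(n):
        block = next_block[block]    # one pointer chase per logical block skipped
        if block is None:
            raise IndexError("logical block past end of file")
        hops += 1
    return block, hops

# A 4-block file scattered across the disk: 7 -> 2 -> 9 -> 4
next_block = {7: 2, 2: 9, 9: 4, 4: None}
print(nth_block_linked(7, next_block, 3))  # (4, 3)
```

Reaching logical block `n` takes `n` hops, and a broken link at any block makes everything after it unreachable.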
 
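__Indexed__ allocation keeps a per-file index of block pointers (the inode structure).

- Advantages: handles random access well; small files get quick sequential and random access; no external fragmentation
- Disadvantages: limits the maximum file size; the cost of accessing bytes near the end of a large file grows, since more levels of indirection must be followed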

#### Unix Inodes
In the Linux ext2 file system, each inode is 128 bytes and includes 15 block pointers:
- block[0:11] are 12 direct block pointers
- block[12] is a single indirect block pointer
- block[13] is a double indirect block pointer
- block[14] is a triple indirect block pointer

<img src="assets/fs.png">

Suppose each block is 4 KB and each pointer is 4 bytes. Then one block holds $4096 / 4 = 1024$ pointers, and the largest supported file size is
$$4\,\text{KB} \times \left(\underset{\text{direct}}{12} + \underset{\text{single indirect}}{1024} + \underset{\text{double indirect}}{1024^2} + \underset{\text{triple indirect}}{1024^3}\right) \approx 4\,\text{TB}$$
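
The arithmetic above, and the way a logical block number maps onto the pointer levels, can be checked with a short sketch. It assumes the same 4 KB blocks and 4-byte pointers; the function names are hypothetical and not taken from ext2 code.

```python
BLOCK_SIZE = 4096                              # bytes per block (4 KB)
POINTER_SIZE = 4                               # bytes per block pointer
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE    # 1024
NUM_DIRECT = 12                                # block[0:11]

def max_file_blocks():
    """Number of data blocks reachable through all 15 pointers."""
    return NUM_DIRECT + PTRS_PER_BLOCK + PTRS_PER_BLOCK**2 + PTRS_PER_BLOCK**3

def pointer_level(logical_block):
    """Return (level, offset within that level).

    Level 0 is direct, 1 single indirect, 2 double indirect, 3 triple indirect.
    The level is also the number of index blocks read before the data block.
    """
    if logical_block < 0:
        raise ValueError("negative block number")
    if logical_block < NUM_DIRECT:
        return 0, logical_block
    remaining = logical_block - NUM_DIRECT
    for level in (1, 2, 3):
        span = PTRS_PER_BLOCK ** level         # blocks covered by this level
        if remaining < span:
            return level, remaining
        remaining -= span
    raise ValueError("beyond maximum file size")

print(max_file_blocks() * BLOCK_SIZE)   # 4402345721856 bytes, about 4 TB
print(pointer_level(11))                # (0, 11)
print(pointer_level(12))                # (1, 0)
print(pointer_level(12 + 1024))         # (2, 0)
```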

#### Extents
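An extent is a disk pointer plus a length (number of blocks). Instead of requiring a pointer to every block of a file, the file system needs only one pointer per contiguous run of blocks. The trade-off is that this saves metadata only when files are laid out in long contiguous runs; a heavily fragmented file still needs many extents.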
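
A minimal sketch of an extent lookup, assuming each extent is stored as a hypothetical tuple `(first_logical_block, first_disk_block, length)` and the list is sorted by logical block:

```python
from bisect import bisect_right

def lookup(extents, logical_block):
    """Translate a logical block number into a disk block number."""
    starts = [e[0] for e in extents]
    i = bisect_right(starts, logical_block) - 1   # last extent starting at or before the block
    if i >= 0:
        first_logical, first_disk, length = extents[i]
        if logical_block < first_logical + length:
            return first_disk + (logical_block - first_logical)
    raise IndexError("logical block not mapped")

# Three extents describe a 32-block file; per-block pointers would need 32 entries.
extents = [(0, 100, 8), (8, 500, 4), (12, 40, 20)]
print(lookup(extents, 9))    # 501
print(lookup(extents, 31))   # 59
```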

## Disk Scheduling
- FCFS (first come, first served)
  - long waiting times when the request queue is long, but fine under low load
- SSTF (shortest seek time first)
  - minimizes arm movement and maximizes request rate, but favors middle blocks
- SCAN (elevator): services requests in one direction until done, then reverses
- C-SCAN: like SCAN, but services requests in one direction only; the head then returns to the start and sweeps again
- LOOK/C-LOOK: like SCAN/C-SCAN, but go only as far as the last request in each direction instead of the full width of the disk
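
The policies above can be compared by total head movement with a small simulation. This is a sketch only: the request queue and starting cylinder are illustrative, all requests are assumed to be queued up front, and the function names are hypothetical.

```python
def fcfs(head, requests):
    return list(requests)                      # service in arrival order

def sstf(head, requests):
    pending, order = list(requests), []
    while pending:
        nearest = min(pending, key=lambda c: abs(c - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest                         # the next choice is relative to the new position
    return order

def look(head, requests, direction=1):
    # direction=+1 sweeps toward higher cylinders first, -1 toward lower ones
    ahead = sorted((c for c in requests if (c - head) * direction >= 0),
                   key=lambda c: abs(c - head))
    behind = sorted((c for c in requests if (c - head) * direction < 0),
                    key=lambda c: abs(c - head))
    return ahead + behind

def c_look(head, requests):
    up = sorted(c for c in requests if c >= head)
    wrapped = sorted(c for c in requests if c < head)   # served after jumping back
    return up + wrapped

def total_seek(head, order):
    total = 0
    for c in order:
        total += abs(c - head)
        head = c
    return total

queue, start = [98, 183, 37, 122, 14, 124, 65, 67], 53
print(total_seek(start, fcfs(start, queue)))        # 640
print(total_seek(start, sstf(start, queue)))        # 236
print(total_seek(start, look(start, queue, -1)))    # 208
print(total_seek(start, c_look(start, queue)))      # 322
```

Total seek distance is only one metric; it does not capture how long an individual request waits, which is where SSTF's bias toward middle blocks shows up.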

## FS Reliability
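An update that allocates a new data block for a file writes three structures: the __inode__, the __data block bitmap__, and the __data block__. If a crash lets only some of these writes reach the disk:

Only the __data block__ is written to disk → no inconsistency (the new data is simply lost)  
Only the __inode__ is written to disk → the inode points to garbage  
Only the __data block bitmap__ is written to disk → space leak (the block is marked in use, but no inode points to it)  
Only the __inode__ and __data block bitmap__ are written to disk → the inode points to garbage  
Only the __data block bitmap__ and the __data block__ are written → space leak  
Only the __inode__ and the __data block__ are written → the block still looks free, so multiple inodes may end up pointing to the same data block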
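
The cases above follow from three simple rules, which can be enumerated mechanically. This is a sketch under the assumption that the update consists of exactly three writes (inode, bitmap, data block); the function name `problems` and the wording of the messages are hypothetical.

```python
from itertools import combinations

WRITES = ("inode", "bitmap", "data")

def problems(written):
    """List what is wrong on disk if only the writes in `written` completed."""
    w = set(written)
    found = []
    if "inode" in w and "data" not in w:
        found.append("inode points to garbage")
    if "inode" in w and "bitmap" not in w:
        found.append("block looks free and may be given to another inode")
    if "bitmap" in w and "inode" not in w:
        found.append("block is leaked (marked in use, unreachable)")
    return found

# Walk through every partial outcome (one or two of the three writes).
for k in (1, 2):
    for subset in combinations(WRITES, k):
        print(subset, "->", problems(subset) or ["no inconsistency"])
```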

#### Journaling 
Each journal transaction starts with a `TxBegin(TID=x)` block containing a transaction ID.  
It is followed by blocks with the content to be written (e.g. `Updated inode`).  
It ends with a "transaction end" `TxEnd(TID=x)` block.
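
One way to read this layout: after a crash, only transactions whose `TxEnd` block reached the journal are safe to apply. The sketch below assumes a hypothetical in-memory journal of tuples; the record shapes and the function name `committed_writes` are illustrative, not an actual on-disk format.

```python
def committed_writes(journal):
    """Return the writes of every transaction that has both TxBegin and TxEnd.

    Records are ("TxBegin", tid), ("write", tid, target, content), ("TxEnd", tid).
    """
    pending = {}     # tid -> writes seen since its TxBegin
    replay = []
    for record in journal:
        kind, tid = record[0], record[1]
        if kind == "TxBegin":
            pending[tid] = []
        elif kind == "write" and tid in pending:
            pending[tid].append(record[2:])
        elif kind == "TxEnd" and tid in pending:
            replay.extend(pending.pop(tid))    # transaction is complete
    return replay                              # incomplete transactions are dropped

journal = [
    ("TxBegin", 1),
    ("write", 1, "inode", "Updated inode"),
    ("write", 1, "bitmap", "Updated bitmap"),
    ("write", 1, "data", "Updated data block"),
    ("TxEnd", 1),
    ("TxBegin", 2),
    ("write", 2, "inode", "Updated inode"),
    # crash here: TxEnd(TID=2) never reached the journal
]
for target, content in committed_writes(journal):
    print(target, content)
```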

## RAID (redundant array of independent disks)
Data are spread out across multiple disks; depending on the RAID level, the array also keeps redundancy (mirror copies or parity) so that it can survive a disk failure.

#### RAID 0 
Files are striped across the disks, which improves throughput. However, there is no redundancy: if one drive fails, the whole volume is lost.
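
As a minimal sketch of striping, assume a stripe unit of one block and round-robin placement; the function name `raid0_locate` is hypothetical.

```python
def raid0_locate(logical_block, num_disks):
    """Return (disk index, block offset on that disk) for a logical block."""
    return logical_block % num_disks, logical_block // num_disks

# Consecutive logical blocks land on different disks, so they can be read in parallel.
for b in range(6):
    print(b, raid0_locate(b, 4))
```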

#### RAID 1
Everything is copied to a second drive (mirroring). Usable capacity is halved and read throughput improves, but writes take longer because each write goes to both drives. No data is lost if one drive fails.

#### RAID 5
Block-level striping with distributed parity; a failed disk can be reconstructed from the remaining disks.

<img src="https://upload.wikimedia.org/wikipedia/commons/6/64/RAID_5.svg"/>
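
Reconstruction works because the parity block is the XOR of the data blocks in the same stripe. The sketch below shows a single stripe with made-up block contents; how parity rotates across disks is left out, and the helper name `xor_blocks` is hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# One stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])   # True
```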