# Colocating file lists and small files
* We can store small files and directory lists directly inside the inode of the directory. This makes hard links harder to support, but otherwise speeds up access

# Synchronous metadata updates
**Problem**: Some updates to metadata require synchronous writes
* data needs to "hit the disk" before anything else can be done

Examples:
* creating a file: must write new file inode to disk before the corresponding directory entry
* deleting a file: must clear the directory entry before marking inode as "free"

We have to be able to survive a crash at any time between atomic operations
* Issue here is that we have two components that can fail independently: disk and processor

## Cyclic Dependency
Given:
* Both inodes in the same disk block (Block 2)
* Both file create and file delete have occurred in the cache, but neither has hit the disk

What order do we write the disk blocks out?
* Block 1 is the directory inode
* The `file create` depends on Block 2 being written before Block 1
* The `file delete` depends on Block 1 being written before Block 2

### Solution: Soft Updates
* Write at a finer granularity
* Roll back one of the operations, write the other
* Then write the rolled back operation
* In other words, do 4 writes: e.g. Block 2 for create, Block 1 for create, Block 1 for delete, Block 2 for delete
* Performance is good: it comes close to the idealized case
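
The four-write ordering can be checked with a small simulation. This is only a sketch under stated assumptions: the file names, inode numbers 7 and 8, and the `consistent` invariant (every on-disk directory entry points at an allocated inode) are made up for illustration and are not taken from any real implementation.

```python
import copy

def consistent(disk):
    """Invariant: every directory entry on disk points at an allocated inode."""
    return all(disk["inodes"][ino] == "used" for ino in disk["dir"].values())

def apply_writes(disk, writes):
    """Apply block writes in order; record consistency at each possible crash point."""
    disk = copy.deepcopy(disk)
    results = []
    for write in writes:
        write(disk)
        results.append(consistent(disk))
    return disk, results

# Hypothetical starting state: Block 1 holds directory entries, Block 2 holds inodes.
initial = {"dir": {"old.txt": 7}, "inodes": {7: "used", 8: "free"}}

# Naive: flush Block 2 with both inode changes at once, then Block 1.
def whole_block_2(d):
    d["inodes"].update({7: "free", 8: "used"})

def whole_block_1(d):
    d["dir"].pop("old.txt")
    d["dir"]["new.txt"] = 8

_, naive_results = apply_writes(initial, [whole_block_2, whole_block_1])

# Soft updates: each write carries only the change whose dependency is satisfied;
# the other pending change in that block is rolled back in the written copy.
soft_writes = [
    lambda d: d["inodes"].update({8: "used"}),   # Block 2 for create
    lambda d: d["dir"].update({"new.txt": 8}),   # Block 1 for create
    lambda d: d["dir"].pop("old.txt"),           # Block 1 for delete
    lambda d: d["inodes"].update({7: "free"}),   # Block 2 for delete
]
final, soft_results = apply_writes(initial, soft_writes)
```

In this model the naive order leaves `old.txt` pointing at a freed inode after the first write, while every intermediate state of the soft-updates order satisfies the invariant.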

# Log-structured Filesystem (LFS)

FFS (Fast File System) had some lingering problems that LFS wanted to fix.

## Overview
Treat entire disk as **one big append-only log** for writes
* Don't lay out blocks
* Whenever a file write occurs, append to end of log
* When file metadata changes, append to end of log

Collect pending writes in memory and stream out in one big write
* Maximize disk bandwidth
* No extra disk seeks required

Actual writes occur when
* user calls `sync()` or `fsync()` and explicitly syncs the filesystem
* OS needs to reclaim dirty buffer cache pages
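
A minimal sketch of the buffering idea, assuming a hypothetical `LogBuffer` class in which a Python list stands in for the on-disk log and `max_pending` stands in for buffer cache pressure:

```python
class LogBuffer:
    """Collects pending writes in memory and appends them to the log in one batch."""

    def __init__(self, max_pending=4):
        self.log = []                    # stands in for the on-disk append-only log
        self.pending = []                # dirty records not yet on disk
        self.max_pending = max_pending   # stands in for buffer cache pressure
        self.flushes = 0                 # number of batched writes issued

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.max_pending:
            self.sync()                  # OS must reclaim dirty pages

    def sync(self):
        """Explicit sync: stream all pending records out as one sequential write."""
        if self.pending:
            self.log.extend(self.pending)
            self.pending.clear()
            self.flushes += 1

buf = LogBuffer(max_pending=4)
for i in range(5):
    buf.write(("data", "file_a", i))
buf.sync()  # user-requested sync flushes the remaining record
```

Five small writes end up as two batched appends here, which is the point: the number of disk operations depends on how often the buffer is flushed, not on how many writes the application issues.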

This was supposed to reduce seeking
* Main memories were getting larger, we can cache more things

## Writing the log

Each log write updates the inode

## Inode map
How do you find the inodes?

Use an **inode map**, which maps a file number to the location of its inode in the log
* the inode map is also written to the log
* cache the inode map in memory for performance (a fixed checkpoint region on disk records where the inode map blocks are)
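
As a rough illustration of how the inode map ties things together, here is a toy model in which a list index plays the role of a disk address. The helper names (`append`, `write_file`, `read_file`) and the record layout are assumptions made for this sketch.

```python
log = []         # append-only log; list index = "disk address"
inode_map = {}   # file number -> log address of that file's latest inode

def append(record):
    """Append a record to the end of the log and return its address."""
    log.append(record)
    return len(log) - 1

def write_file(file_no, data):
    """Every write appends a new data block and a new inode pointing at it."""
    data_addr = append(("data", data))
    inode_addr = append(("inode", file_no, data_addr))
    inode_map[file_no] = inode_addr

def read_file(file_no):
    """Follow inode map -> inode -> data block."""
    _, _, data_addr = log[inode_map[file_no]]
    return log[data_addr][1]

write_file(1, "hello")
write_file(2, "other")
write_file(1, "hello, world")  # overwrite: old data and inode stay in the log as dead space
```

After the overwrite, `read_file(1)` follows the map to the newest inode, and the first two records in the log are no longer reachable.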

## Reading from LFS
A file's blocks are scattered all over the disk.

Basic assumption: Buffer cache will handle most reads

## Log Cleaner
Eventually the disk fills up, we need to reclaim dead space
* eg. deleted files, overwritten file blocks

We do periodic log cleaning
* scan log, look for deleted or overwritten blocks (clear out stale log entries)
* copy live data to end of the log
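
A simplified cleaner sketch, assuming the same toy record layout as a log of `("data", ...)` and `("inode", file_no, data_addr)` tuples. Note the simplification: it compacts live records into a fresh log rather than cleaning one segment at a time and appending to the existing log.

```python
def clean(log, inode_map):
    """Copy live records into a fresh log; return (new_log, new_inode_map).

    A record is live only if the inode map points at it (an inode) or a
    live inode points at it (a data block). Everything else is dead space.
    """
    new_log, new_map = [], {}
    for file_no, inode_addr in inode_map.items():
        _, _, data_addr = log[inode_addr]
        new_log.append(log[data_addr])                          # live data block
        new_log.append(("inode", file_no, len(new_log) - 1))    # inode at its new address
        new_map[file_no] = len(new_log) - 1
    return new_log, new_map

# Hypothetical log: file 1 was overwritten, so addresses 0 and 1 are dead.
old_log = [
    ("data", "hello"), ("inode", 1, 0),
    ("data", "other"), ("inode", 2, 2),
    ("data", "hello, world"), ("inode", 1, 4),
]
old_map = {1: 5, 2: 3}
new_log, new_map = clean(old_log, old_map)
```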

When does the cleaner run?
* when the disk is idle
* this is a problem if the system is never idle

Cleaning a segment requires reading the whole segment!
* can reduce cost if data is already in cache

How does segment size affect performance?
* Large segments amortize access/seek time to read/write entire segment during cleaning
* Small segments introduce more variance in segment utilizations


# Journalling

## Aside: Filesystem Corruption
What happens when you are changing the FS and the system crashes?
* eg. adding a ton of new file entries in a block
* system crashes while block is being written
* files are lost!

## Journalling Filesystems
* Ensure changes are made atomically
    * eg. creating a new file

Idea: Maintain a log
* eg. "Directory 893 had inodes 123, 124, 125 added to it"

To make a change
1. Write **intent-to-commit** (start of atomic section)
2. Write the change to the log (DO NOT MODIFY FILESYSTEM DATA DIRECTLY)
3. Write a commit record to the log (end of atomic section)

This is similar to notion of database transactions

If we come back after a crash and realize our commit record is not there, then we know we can disregard all of the changes since the intent-to-commit.
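
The three steps can be sketched with a list standing in for the journal. The record format (`"begin"`, `"change"`, `"commit"` tuples) and the function names are hypothetical, chosen only to illustrate the idea.

```python
journal = []  # append-only journal; filesystem data is not touched here

def log_transaction(txn_id, changes):
    """Write intent-to-commit, then the changes, then the commit record."""
    journal.append(("begin", txn_id))               # 1. start of atomic section
    for change in changes:
        journal.append(("change", txn_id, change))  # 2. log the change only
    journal.append(("commit", txn_id))              # 3. end of atomic section

def committed_changes(journal):
    """Return only the changes whose transaction has a commit record."""
    committed = {rec[1] for rec in journal if rec[0] == "commit"}
    return [rec[2] for rec in journal
            if rec[0] == "change" and rec[1] in committed]

log_transaction(1, [("add_entry", 893, 123), ("add_entry", 893, 124)])

# Simulate a crash partway through transaction 2: no commit record is written.
journal.append(("begin", 2))
journal.append(("change", 2, ("add_entry", 893, 125)))
```

Here `committed_changes(journal)` returns only the two changes from transaction 1; the half-written transaction 2 is ignored.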

## Recovery

When the system crashes:
* FS data has not been modified, just the log
* FS itself reflects what happened before the crash

We periodically synchronize the log with the FS data
* this is a **checkpoint**
* Ensures the FS data reflects all changes in the log

So in the case of a crash, we only have to look at the log entries since the last checkpoint
* Check these log entries for a commit record. If there's no commit record, we don't perform them
* `fsck` (or the filesystem itself at mount time) performs this replay
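
A minimal recovery sketch under the same kind of assumptions: the journal is a list of hypothetical `("begin" | "change" | "commit", txn_id, ...)` records, `checkpoint` is the journal index up to which the filesystem data is already in sync, and the "filesystem" is just a dict from directory number to a set of inode numbers.

```python
def recover(fs, journal, checkpoint):
    """Replay committed transactions logged after `checkpoint` onto `fs`."""
    tail = journal[checkpoint:]  # entries before the checkpoint are already applied
    committed = {rec[1] for rec in tail if rec[0] == "commit"}
    for rec in tail:
        if rec[0] == "change" and rec[1] in committed:
            _, _, (directory, inode) = rec
            fs.setdefault(directory, set()).add(inode)
    return fs

journal = [
    ("begin", 1), ("change", 1, (893, 123)), ("commit", 1),
    # checkpoint taken here (index 3): fs already reflects transaction 1
    ("begin", 2), ("change", 2, (893, 124)), ("commit", 2),
    ("begin", 3), ("change", 3, (893, 125)),  # crash before the commit record
]
fs = {893: {123}}  # filesystem data as of the last checkpoint
recover(fs, journal, checkpoint=3)
```

Transaction 2 is replayed because its commit record is present; transaction 3 is skipped, so inode 125 never appears in directory 893.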