A pure Python simulation of a distributed file system built around a leader-follower architecture with 3 nodes. No external libraries required — just Python 3.10+.
| Concept | Description |
|---|---|
| Write-Ahead Log (WAL) | Every write is logged before being applied, enabling crash recovery and follower catch-up |
| Leader-Follower Replication | Only the leader accepts writes; followers replicate via WAL replay |
| Network Partitions | Simulates split-brain scenarios and healing (CAP theorem in practice) |
| Conflict Resolution | Last-write-wins using logical version clocks, not wall clocks |
| Delta Sync | Block-level diffing (rsync-style) to minimize bandwidth on large file updates |
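
The first two rows of the table can be illustrated with a minimal sketch. It assumes the WAL survives a crash while the in-memory store does not. `SketchNode`, `WalEntry`, and `catch_up` are hypothetical names chosen for illustration and are not taken from `wal.py` or `node.py`.

```python
from dataclasses import dataclass, field


@dataclass
class WalEntry:
    seq: int    # monotonically increasing log sequence number
    path: str
    data: bytes


@dataclass
class SketchNode:
    """Hypothetical node: a WAL plus an in-memory file store."""
    wal: list[WalEntry] = field(default_factory=list)
    files: dict[str, bytes] = field(default_factory=dict)
    applied_seq: int = 0  # highest sequence number applied to `files`

    def log(self, path: str, data: bytes) -> WalEntry:
        # Step 1: record the write in the WAL before touching storage.
        entry = WalEntry(len(self.wal) + 1, path, data)
        self.wal.append(entry)
        return entry

    def apply(self, entry: WalEntry) -> None:
        # Step 2: apply the logged write to storage.
        self.files[entry.path] = entry.data
        self.applied_seq = entry.seq

    def write(self, path: str, data: bytes) -> None:
        self.apply(self.log(path, data))

    def crash(self) -> None:
        # Volatile state is lost; the WAL is assumed to be durable.
        self.files.clear()
        self.applied_seq = 0

    def recover(self) -> None:
        # Replay, in order, every entry that has not been applied yet.
        for entry in self.wal:
            if entry.seq > self.applied_seq:
                self.apply(entry)


def catch_up(leader: SketchNode, follower: SketchNode) -> None:
    # Ship only the entries the follower has not logged, then replay them.
    for entry in leader.wal[len(follower.wal):]:
        follower.wal.append(entry)
        follower.apply(entry)


leader = SketchNode()
leader.write("a.txt", b"v1")
leader.log("b.txt", b"v2")  # crash mid-write: logged but never applied
leader.crash()
leader.recover()            # replay restores both files

follower = SketchNode()
catch_up(leader, follower)  # follower converges by replaying the leader's WAL
```

Because the entry for `b.txt` reached the log before the crash, replay restores it even though it was never applied, and the same replay path brings a lagging follower up to date.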
dfs/
├── main.py # Entry point — runs all five scenarios
├── simulator.py # Five failure/recovery scenario definitions
├── node.py # Node class handling both leader and follower roles
├── storage.py # In-memory file storage layer
├── wal.py # Write-Ahead Log implementation
├── network.py # Network simulation (latency, partitions, healing)
└── sync.py # Replication engine and consistency reporting
python main.py

This runs all five scenarios end to end and prints a summary of results.
- Basic replication — leader writes propagate to all followers
- Crash recovery — node crashes mid-write, replays WAL on restart
- Network partition — nodes diverge during a split, reconcile after healing
- Conflict resolution — concurrent writes resolved via last-write-wins
- Delta sync — large file updated with only changed blocks transmitted
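
As a rough illustration of the delta sync scenario, the sketch below compares fixed-offset blocks by hash and transmits only the ones that differ. This is a simplification: rsync itself uses rolling checksums so that it can match blocks at shifted offsets, which this version does not attempt. `BLOCK_SIZE`, `make_delta`, and `apply_delta` are hypothetical names, not the API of `sync.py`.

```python
import hashlib

BLOCK_SIZE = 4  # tiny value for illustration only (an assumption)


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    # Hash each fixed-size block of the data.
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def make_delta(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    # Collect only the blocks of `new` whose hash differs from the block
    # at the same index in `old` (or that lie beyond the end of `old`).
    old_hashes = block_hashes(old, block_size)
    changed = {}
    for index, start in enumerate(range(0, len(new), block_size)):
        block = new[start:start + block_size]
        digest = hashlib.sha256(block).hexdigest()
        if index >= len(old_hashes) or digest != old_hashes[index]:
            changed[index] = block
    return {"length": len(new), "blocks": changed}


def apply_delta(old: bytes, delta: dict, block_size: int = BLOCK_SIZE) -> bytes:
    # Start from the old content resized to the new length, then patch
    # in the changed blocks at their offsets.
    length = delta["length"]
    out = bytearray(old[:length].ljust(length, b"\0"))
    for index, block in delta["blocks"].items():
        start = index * block_size
        out[start:start + len(block)] = block
    return bytes(out)


old = b"aaaabbbbcccc"
new = b"aaaaXXXXccccdd"
delta = make_delta(old, new)
rebuilt = apply_delta(old, delta)
bytes_sent = sum(len(block) for block in delta["blocks"].values())
```

Here only blocks 1 and 3 differ, so 6 of the 14 bytes are transmitted. With realistic block sizes and mostly unchanged files the saving grows accordingly.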
- Logging every write to the WAL before applying it means a restarted node can rebuild its state by replaying the log
- Network partitions are inevitable — design for them explicitly
- Logical clocks beat wall clocks for ordering concurrent events
- Delta sync reduces bandwidth for large file updates by transmitting only the blocks that changed
- Crash ≠ partition: crashed nodes must replay WAL on restart, not just reconnect
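
The point about logical clocks can be sketched as follows. Each write carries a logical version, and last-write-wins picks the higher version regardless of when the write happened in wall-clock time. Breaking ties by node ID is an assumption made here so the outcome is deterministic. `LogicalClock`, `Versioned`, and `resolve` are hypothetical names, not taken from the project's modules.

```python
from dataclasses import dataclass


class LogicalClock:
    """Minimal Lamport-style counter (illustrative)."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Advance the clock for a local write.
        self.time += 1
        return self.time

    def observe(self, remote_time: int) -> None:
        # Never fall behind a version seen from another node.
        self.time = max(self.time, remote_time)


@dataclass(frozen=True)
class Versioned:
    data: bytes
    version: int  # logical version, not a wall-clock timestamp
    node_id: str  # tie-breaker when versions are equal (assumption)


def resolve(a: Versioned, b: Versioned) -> Versioned:
    # Last-write-wins: highest logical version, then highest node ID.
    return max(a, b, key=lambda v: (v.version, v.node_id))


# Two nodes write the same file while partitioned from each other.
clock_a, clock_b = LogicalClock(), LogicalClock()
a1 = Versioned(b"draft", clock_a.tick(), "node-a")  # version 1
a2 = Versioned(b"final", clock_a.tick(), "node-a")  # version 2
b1 = Versioned(b"other", clock_b.tick(), "node-b")  # version 1

winner = resolve(a2, b1)         # a2 wins on version
clock_b.observe(winner.version)  # node-b's next write will get version 3
tie = resolve(a1, b1)            # equal versions: node ID decides
```

Since the comparison never consults wall-clock time, clock skew between nodes cannot change which write wins.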