josh22018/distributed-file-system

Distributed File System (DFS) — Simulation

A pure-Python simulation of a distributed file system built around a three-node leader-follower architecture. It needs no external libraries, only Python 3.10+.

Concepts Demonstrated

| Concept | Description |
| --- | --- |
| Write-Ahead Log (WAL) | Every write is logged before being applied, enabling crash recovery and follower catch-up |
| Leader-Follower Replication | Only the leader accepts writes; followers replicate via WAL replay |
| Network Partitions | Simulates split-brain scenarios and healing (CAP theorem in practice) |
| Conflict Resolution | Last-write-wins using logical version clocks, not wall clocks |
| Delta Sync | Block-level diffing (rsync-style) to minimize bandwidth on large file updates |
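To make the first two concepts concrete, here is a minimal sketch of log-before-apply writes with replay. It is not taken from wal.py or node.py: `LogEntry`, `SketchNode`, and every method name are hypothetical, and the log is simply assumed to survive a crash.

```python
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    index: int   # position in the log, starting at 1
    path: str
    data: bytes


@dataclass
class SketchNode:
    """Hypothetical node: an append-only log plus an in-memory file map."""
    wal: list[LogEntry] = field(default_factory=list)
    files: dict[str, bytes] = field(default_factory=dict)
    applied_index: int = 0

    def write(self, path: str, data: bytes) -> LogEntry:
        # Log first, apply second: a crash between the two steps
        # leaves the entry in the log so it can be replayed.
        entry = LogEntry(len(self.wal) + 1, path, data)
        self.wal.append(entry)
        self.apply(entry)
        return entry

    def apply(self, entry: LogEntry) -> None:
        self.files[entry.path] = entry.data
        self.applied_index = entry.index

    def crash(self) -> None:
        # Simulate losing volatile state; the log is assumed durable.
        self.files.clear()
        self.applied_index = 0

    def replay(self) -> None:
        # Re-apply every logged entry that has not been applied yet.
        for entry in self.wal[self.applied_index:]:
            self.apply(entry)

    def catch_up(self, leader: "SketchNode") -> None:
        # A follower copies the entries it is missing, then replays them.
        self.wal.extend(leader.wal[len(self.wal):])
        self.replay()


leader, follower = SketchNode(), SketchNode()
leader.write("/a.txt", b"v1")
leader.write("/a.txt", b"v2")
follower.catch_up(leader)   # follower replicates via WAL replay
leader.crash()              # volatile state is lost
leader.replay()             # the log rebuilds it
```

In this sketch, crash recovery and follower catch-up share the same `replay` path, which is the reason for applying writes only through the log.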

Project Structure

dfs/
├── main.py        # Entry point — runs all five scenarios
├── simulator.py   # Five failure/recovery scenario definitions
├── node.py        # Node class handling both leader and follower roles
├── storage.py     # In-memory file storage layer
├── wal.py         # Write-Ahead Log implementation
├── network.py     # Network simulation (latency, partitions, healing)
└── sync.py        # Replication engine and consistency reporting

Running

python main.py

This runs all five scenarios end to end and prints a summary of results.

Scenarios

  1. Basic replication — leader writes propagate to all followers
  2. Crash recovery — node crashes mid-write, replays WAL on restart
  3. Network partition — nodes diverge during a split, reconcile after healing
  4. Conflict resolution — concurrent writes resolved via last-write-wins
  5. Delta sync — large file updated with only changed blocks transmitted
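As a rough illustration of scenario 4, last-write-wins over logical versions could look like the sketch below. The `Versioned` record, the `node_id` tie-breaker, and the Lamport-style `next_version` helper are assumptions made for this example; the repository may order writes differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    version: int   # logical counter, not a wall-clock timestamp
    node_id: str   # assumed tie-breaker when versions are equal
    data: bytes


def resolve(a: Versioned, b: Versioned) -> Versioned:
    # The higher logical version wins; node_id breaks ties so that
    # every node picks the same winner regardless of argument order.
    return max(a, b, key=lambda v: (v.version, v.node_id))


def next_version(local: int, seen: int) -> int:
    # Lamport-style rule: advance past every version observed so far.
    return max(local, seen) + 1


left = Versioned(version=3, node_id="node-1", data=b"written in partition A")
right = Versioned(version=4, node_id="node-2", data=b"written in partition B")
winner = resolve(left, right)
```

Because the comparison uses only the `(version, node_id)` pair, clock skew between machines cannot change the outcome.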

Key Takeaways

  • WAL-first writes let a restarted node recover every write that reached the log before the crash
  • Network partitions are inevitable — design for them explicitly
  • Logical clocks beat wall clocks for ordering concurrent events
  • Delta sync sends only the changed blocks, which cuts bandwidth when a large file receives a small edit
  • Crash ≠ partition: crashed nodes must replay WAL on restart, not just reconnect
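The delta sync takeaway can be illustrated with a simplified block comparison. This sketch compares SHA-256 hashes of blocks at fixed offsets, which is a simplification of rsync (rsync uses a rolling checksum so it can also match blocks that have shifted position). `BLOCK_SIZE` and all function names are hypothetical and not taken from sync.py.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose; the project's real block size is not stated


def split_blocks(data: bytes, size: int = BLOCK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def block_hashes(data: bytes) -> list[str]:
    return [hashlib.sha256(block).hexdigest() for block in split_blocks(data)]


def make_delta(new: bytes, remote_hashes: list[str]) -> dict[int, bytes]:
    """Return only the blocks whose hash differs from the receiver's copy."""
    delta: dict[int, bytes] = {}
    for i, block in enumerate(split_blocks(new)):
        digest = hashlib.sha256(block).hexdigest()
        if i >= len(remote_hashes) or digest != remote_hashes[i]:
            delta[i] = block
    return delta


def apply_delta(old: bytes, delta: dict[int, bytes], new_len: int) -> bytes:
    """Rebuild the new file from the old copy plus the changed blocks."""
    n_blocks = -(-new_len // BLOCK_SIZE)  # ceiling division
    blocks = split_blocks(old)[:n_blocks]
    blocks += [b""] * (n_blocks - len(blocks))  # room for appended blocks
    for i, block in delta.items():
        blocks[i] = block
    return b"".join(blocks)[:new_len]


old = b"aaaabbbbccccdddd"
new = b"aaaaBBBBccccdddd"
delta = make_delta(new, block_hashes(old))   # sender side
rebuilt = apply_delta(old, delta, len(new))  # receiver side
```

Here one of the four blocks changes, so the sender transmits a single block plus the new length instead of the whole file.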
