A simplified yet robust Distributed File System written in Go that supports uploading, downloading, and replicating .mp4 files across multiple nodes. This system simulates a fault-tolerant architecture with master coordination, heartbeat checks, and gRPC-based communication.
This DFS follows a centralized coordination model with three key components:
- Master Tracker Node:
  - Maintains a lookup table with: `filename`, `data node`, `file path`, `isAlive`.
  - Handles client requests for upload/download.
  - Coordinates data replication and monitors node health via heartbeats.
- Data Keeper Nodes:
  - Store actual file data.
  - Send heartbeat messages to the master.
  - Accept file uploads from clients and replicate files on command.
- Client:
  - Uploads `.mp4` files to the system.
  - Downloads files from available data nodes.
  - Interacts only with the Master Tracker initially.
- Communication is handled via gRPC.
- Every 1 second, each Data Keeper sends a heartbeat to the Master Tracker.
- Master updates its lookup table based on live status.
- If a Data Keeper becomes unresponsive, it's marked as dead in the table.
- Client requests an upload slot from the Master Tracker.
- Master returns a port of an active Data Keeper.
- Client uploads the `.mp4` file directly to the Data Keeper over TCP.
- Data Keeper notifies the Master after upload completion.
- Master updates the lookup table and sends a success response to the client.
- Master initiates replication to 2 additional nodes.
- Every 10 seconds, the Master scans its file records.
- Ensures each file is stored on at least 3 live nodes.
- If replication is needed:
- Master chooses a source and target node.
- Notifies both to transfer the file.
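
The periodic replication check above can be sketched as a pure planning function. This is one possible approach under stated assumptions: `planReplication`, `Transfer`, and the two input maps are hypothetical, and the policy of using the first live holder as the source and filling targets in sorted order is chosen only to make the output deterministic.

```go
package main

import (
	"fmt"
	"sort"
)

// replicationFactor comes from the text: each file on at least 3 live nodes.
const replicationFactor = 3

// Transfer is a hypothetical instruction: copy File from Source to Target.
type Transfer struct {
	File, Source, Target string
}

// planReplication returns the transfers needed to bring every file up to
// replicationFactor live copies. holders maps a filename to the nodes that
// store it; alive reports which nodes are currently live.
func planReplication(holders map[string][]string, alive map[string]bool) []Transfer {
	// Sort live nodes and files so the plan is deterministic.
	var liveNodes []string
	for node, ok := range alive {
		if ok {
			liveNodes = append(liveNodes, node)
		}
	}
	sort.Strings(liveNodes)

	files := make([]string, 0, len(holders))
	for f := range holders {
		files = append(files, f)
	}
	sort.Strings(files)

	var plan []Transfer
	for _, f := range files {
		has := make(map[string]bool)
		var liveHolders []string
		for _, node := range holders[f] {
			if alive[node] && !has[node] {
				has[node] = true
				liveHolders = append(liveHolders, node)
			}
		}
		if len(liveHolders) == 0 {
			continue // no live copy left, so nothing can act as a source
		}
		source := liveHolders[0]
		missing := replicationFactor - len(liveHolders)
		for _, node := range liveNodes {
			if missing <= 0 {
				break
			}
			if !has[node] {
				plan = append(plan, Transfer{File: f, Source: source, Target: node})
				missing--
			}
		}
	}
	return plan
}

func main() {
	holders := map[string][]string{"a.mp4": {"k1", "k2"}}
	alive := map[string]bool{"k1": true, "k2": false, "k3": true, "k4": true}
	for _, t := range planReplication(holders, alive) {
		fmt.Printf("%s: %s -> %s\n", t.File, t.Source, t.Target)
	}
}
```

In the running system the master would call something like this from a ticker every 10 seconds and then notify the source and target of each transfer.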
- Client requests a file from the Master Tracker.
- Master replies with a list of IPs/ports of nodes holding the file.
- Client downloads the file in parallel, fetching a different chunk from each node that holds it.
- 📡 gRPC-based communication
- 🧠 Centralized master with smart replication
- ♻️ Automatic recovery via periodic replication checks
- 🔄 Heartbeat-based node monitoring
- 🚀 Multi-threaded handling for concurrent requests
- 🔐 Fault tolerance with replicated storage
- Go (Golang)
- gRPC
- Concurrency via Goroutines and Mutex locks
- Sara Bisheer
- Rawan Mostafa
- Menna Mohammed
- Fatma Ebrahim