This project implements a distributed file storage system with replication capabilities. The system consists of a Controller and multiple Data Stores (Dstores), supporting file operations such as store, load, list, and remove with built-in fault tolerance.
The system follows a Controller-Dstore architecture:
- Controller: Orchestrates all operations and maintains an index of file locations
- Dstores: Store the actual file data with replication across multiple Dstores
- Client: Provided as a compiled JAR to interact with the system
Files are replicated across R different Dstores to ensure fault tolerance. If Dstores fail or new ones join, the system performs rebalance operations to maintain the replication factor.
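As a rough illustration of the index the Controller maintains, the sketch below maps each filename to the set of Dstore ports holding a copy and picks R Dstores for a new file. The class name `FileIndex`, its methods, and the "fewest files first" selection policy are assumptions made for this example; the project's actual data structures may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical in-memory index: filename -> ports of the Dstores holding a copy. */
final class FileIndex {
    private final int replicationFactor;
    private final Map<String, Set<Integer>> locations = new HashMap<>();
    // port -> number of files stored there; TreeMap keeps ports in ascending order
    private final Map<Integer, Integer> fileCounts = new TreeMap<>();

    FileIndex(int replicationFactor) {
        this.replicationFactor = replicationFactor;
    }

    void addDstore(int port) {
        fileCounts.putIfAbsent(port, 0);
    }

    /** Drops a failed Dstore and forgets the copies it held. */
    void removeDstore(int port) {
        fileCounts.remove(port);
        locations.values().forEach(ports -> ports.remove(port));
    }

    /** Picks the R Dstores currently holding the fewest files (assumed policy). */
    List<Integer> chooseDstores() {
        if (fileCounts.size() < replicationFactor) {
            throw new IllegalStateException("not enough Dstores");
        }
        return fileCounts.entrySet().stream()
                .sorted(Map.Entry.comparingByValue()) // stable: ties stay in port order
                .limit(replicationFactor)
                .map(Map.Entry::getKey)
                .toList();
    }

    void recordStore(String filename, List<Integer> ports) {
        locations.put(filename, new TreeSet<>(ports));
        ports.forEach(p -> fileCounts.merge(p, 1, Integer::sum));
    }

    /** Files that currently have fewer than R copies and would need re-replication. */
    List<String> underReplicated() {
        return locations.entrySet().stream()
                .filter(e -> e.getValue().size() < replicationFactor)
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
    }
}
```

With this sketch, a Dstore failure shows up as entries in `underReplicated()`, which is the kind of condition a rebalance would then repair.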
- Java 21 JDK
- client.jar (provided)
mkdir downloads to_store
echo "This is test file 1" > to_store/testfile1.txt
echo "This is test file 2" > to_store/testfile2.txt
Start the Controller:
java Controller <cport> <R> <timeout> <rebalance_period>
- cport: Controller port number
- R: Replication factor (number of copies for each file)
- timeout: Timeout in milliseconds
- rebalance_period: Time between rebalance operations in seconds
Start at least R Dstores:
java Dstore <port> <cport> <timeout> <file_folder>
- port: Port for this Dstore
- cport: Controller port
- timeout: Timeout in milliseconds
- file_folder: Folder to store files
Run the Client:
java -cp client.jar:. ClientMain <cport> <timeout>
- cport: Controller port
- timeout: Timeout in milliseconds
Example:
# Start Controller with replication factor 3
java Controller 12345 3 1000 60
# Start 3 Dstores
java Dstore 10001 12345 1000 dstore1_folder
java Dstore 10002 12345 1000 dstore2_folder
java Dstore 10003 12345 1000 dstore3_folder
# Run the client
java -cp client.jar:. ClientMain 12345 1000
The client interacts with the system through the ClientMain.java file, which uses:
- to_store/: Files in this folder are uploaded to the distributed system
- downloads/: Downloaded files are stored here
The client performs the following operations:
- Lists all files in the system
- Uploads files from the to_store/ folder
- Lists files again to verify storage
- Downloads the files to the downloads/ folder
- Removes files from the system
Basic Test:
- Store, load, and remove files using the client
- Check that files appear in the correct Dstore folders
- Verify content integrity between uploaded and downloaded files
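The content integrity check can be automated with a small helper such as the one below. It is only a sketch: the class name `IntegrityCheck` is made up for this example, and it assumes the downloaded file keeps the same name under downloads/ as it had under to_store/.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for comparing an uploaded file with its downloaded copy. */
final class IntegrityCheck {
    /** Returns true when both files contain exactly the same bytes. */
    static boolean sameContent(Path uploaded, Path downloaded) throws IOException {
        // Files.mismatch (Java 12+) returns -1 when no byte position differs.
        return Files.mismatch(uploaded, downloaded) == -1L;
    }
}
```

For example, `IntegrityCheck.sameContent(Path.of("to_store", "testfile1.txt"), Path.of("downloads", "testfile1.txt"))` should return true after a successful store and load.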
Dstore Failure Test:
- Start Controller and at least R+1 Dstores
- Store some files
- Kill one Dstore
- Verify the system still functions if at least R Dstores remain
Rebalance Test:
- Start Controller and R Dstores
- Store some files
- Add a new Dstore
- Wait for the rebalance period
- Verify files are redistributed evenly
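One way to check the last step of the rebalance test is to count the files in each Dstore folder and compare the counts. The sketch below assumes that "evenly" means no folder holds more than one file above any other, and that each Dstore keeps its files directly inside its file_folder; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical test helper for inspecting how files are spread across Dstore folders. */
final class DistributionCheck {
    /** Counts the regular files directly inside each Dstore folder. */
    static Map<Path, Integer> countFiles(List<Path> folders) {
        Map<Path, Integer> counts = new LinkedHashMap<>();
        for (Path folder : folders) {
            try (Stream<Path> entries = Files.list(folder)) {
                counts.put(folder, (int) entries.filter(Files::isRegularFile).count());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return counts;
    }

    /** Assumed meaning of "even": the largest and smallest counts differ by at most one. */
    static boolean isEven(Collection<Integer> counts) {
        return counts.isEmpty() || Collections.max(counts) - Collections.min(counts) <= 1;
    }
}
```

With the folders from the example above plus a newly added one, `isEven(countFiles(folders).values())` would be expected to hold once the rebalance period has elapsed.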
The system uses TCP connections with a text-based protocol:
Store:
- Client → Controller: STORE filename filesize
- Controller → Client: STORE_TO port1 port2 ... portR
- Client connects to each Dstore and sends the file
Load:
- Client → Controller: LOAD filename
- Controller → Client: LOAD_FROM port filesize
- Client connects to the Dstore to download the file
List:
- Client → Controller: LIST
- Controller → Client: LIST file1 file2 ...
Remove:
- Client → Controller: REMOVE filename
- Controller removes the file from all Dstores
- Controller → Client: REMOVE_COMPLETE
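Because every message is a single line of space-separated tokens, parsing and formatting can be kept very small. The following is a minimal sketch, not the project's actual code: the class `ProtocolMessages`, the `Request` record, and the choice to reply with a bare `LIST` when no files are stored are all assumptions.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helpers for the text-based protocol messages listed above. */
final class ProtocolMessages {
    /** A parsed request line: the command token followed by its arguments. */
    record Request(String command, List<String> args) {}

    /** Splits a line such as "STORE file.txt 42" on whitespace. */
    static Request parse(String line) {
        String[] tokens = line.trim().split("\\s+");
        return new Request(tokens[0], List.of(tokens).subList(1, tokens.length));
    }

    /** Builds the Controller's reply to STORE from the chosen Dstore ports. */
    static String storeTo(List<Integer> ports) {
        return "STORE_TO " + ports.stream().map(Object::toString).collect(Collectors.joining(" "));
    }

    /** Builds the Controller's reply to LOAD. */
    static String loadFrom(int port, long filesize) {
        return "LOAD_FROM " + port + " " + filesize;
    }

    /** Builds the Controller's reply to LIST (bare "LIST" for an empty system is assumed). */
    static String list(List<String> filenames) {
        return filenames.isEmpty() ? "LIST" : "LIST " + String.join(" ", filenames);
    }
}
```

A Controller loop could read one line per request, call `parse`, and switch on `command()` to decide which reply to build.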
The system handles several error conditions:
- ERROR_NOT_ENOUGH_DSTORES: When fewer than R Dstores are available
- ERROR_FILE_ALREADY_EXISTS: When trying to store a file that already exists
- ERROR_FILE_DOES_NOT_EXIST: When trying to load or remove a non-existent file
- ERROR_LOAD: When a file cannot be loaded from any available Dstore
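The checks behind these error replies might look roughly like the sketch below. The order of the checks, the decision to apply the Dstore-count check to load and remove as well as store, and all class and method names are assumptions made for illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical validation logic that maps request state to the error replies above. */
final class RequestValidator {
    /** Error reply for a STORE request, or empty if the request may proceed. */
    static Optional<String> checkStore(String filename, Set<String> storedFiles,
                                       int activeDstores, int r) {
        if (activeDstores < r) {
            return Optional.of("ERROR_NOT_ENOUGH_DSTORES");
        }
        if (storedFiles.contains(filename)) {
            return Optional.of("ERROR_FILE_ALREADY_EXISTS");
        }
        return Optional.empty();
    }

    /** Error reply for a LOAD or REMOVE request, or empty if the request may proceed. */
    static Optional<String> checkLoadOrRemove(String filename, Set<String> storedFiles,
                                              int activeDstores, int r) {
        if (activeDstores < r) {
            return Optional.of("ERROR_NOT_ENOUGH_DSTORES");
        }
        if (!storedFiles.contains(filename)) {
            return Optional.of("ERROR_FILE_DOES_NOT_EXIST");
        }
        return Optional.empty();
    }

    /** Next load reply: another Dstore to try, or ERROR_LOAD once none is left. */
    static String loadReply(List<Integer> untriedPorts, long filesize) {
        return untriedPorts.isEmpty()
                ? "ERROR_LOAD"
                : "LOAD_FROM " + untriedPorts.get(0) + " " + filesize;
    }
}
```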
The rebalance operation ensures:
- Files are replicated R times
- Files are distributed evenly across Dstores
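To make "distributed evenly" concrete, one common interpretation is that with F files, replication factor R, and N Dstores, each Dstore should hold between floor(R·F/N) and ceil(R·F/N) files. That interpretation is an assumption here rather than something this document specifies, and the names in the sketch are hypothetical.

```java
/** Hypothetical helper computing the per-Dstore file-count range targeted by a rebalance. */
final class RebalanceTargets {
    /** Inclusive lower and upper bound on the number of files one Dstore should hold. */
    record Bounds(int min, int max) {}

    static Bounds perDstoreBounds(int r, int fileCount, int dstoreCount) {
        if (dstoreCount <= 0) {
            throw new IllegalArgumentException("dstoreCount must be positive");
        }
        int totalCopies = r * fileCount;                              // R copies of every file
        int min = totalCopies / dstoreCount;                          // floor(R*F/N)
        int max = (totalCopies + dstoreCount - 1) / dstoreCount;      // ceil(R*F/N)
        return new Bounds(min, max);
    }
}
```

For instance, 4 files with R = 3 spread over 5 Dstores gives 12 copies in total, so each Dstore would hold 2 or 3 files under this interpretation.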
The Controller initiates rebalance:
- Periodically based on rebalance_period
- When a new Dstore joins the system
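Both triggers can be driven from a single scheduler thread, which also keeps two rebalances from running at once. The sketch below is one possible arrangement under that assumption; `RebalanceScheduler` and `onDstoreJoined` are hypothetical names, and the rebalance itself is passed in as a `Runnable`.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical scheduler that runs a rebalance periodically and when a Dstore joins. */
final class RebalanceScheduler implements AutoCloseable {
    // A single thread means rebalance runs are serialized and never overlap.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable rebalance;

    /** periodSeconds corresponds to the Controller's rebalance_period argument. */
    RebalanceScheduler(Runnable rebalance, long periodSeconds) {
        this.rebalance = rebalance;
        // Note: if the task throws, the executor cancels later periodic runs,
        // so a real rebalance task should catch its own exceptions.
        executor.scheduleWithFixedDelay(rebalance, periodSeconds, periodSeconds, TimeUnit.SECONDS);
    }

    /** Requests an extra rebalance as soon as the scheduler thread is free. */
    void onDstoreJoined() {
        executor.execute(rebalance);
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```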
During rebalance, client operations are queued and processed after rebalance completes.
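A simple way to picture this queueing is a gate that either runs an operation immediately or holds it until the rebalance ends. This is a simplified sketch with hypothetical names; in particular, an operation submitted at the exact moment the rebalance ends may run alongside the drained ones, which a production implementation would need to handle.

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical gate that defers client operations while a rebalance is running. */
final class RebalanceGate {
    private final Queue<Runnable> pending = new ArrayDeque<>();
    private boolean rebalancing = false;

    synchronized void beginRebalance() {
        rebalancing = true;
    }

    /** Runs the operation now, or queues it if a rebalance is in progress. */
    void submit(Runnable operation) {
        synchronized (this) {
            if (rebalancing) {
                pending.add(operation);
                return;
            }
        }
        operation.run(); // run outside the lock so operations do not block each other
    }

    /** Marks the rebalance as finished and runs queued operations in arrival order. */
    void endRebalance() {
        Queue<Runnable> toRun;
        synchronized (this) {
            rebalancing = false;
            toRun = new ArrayDeque<>(pending);
            pending.clear();
        }
        toRun.forEach(Runnable::run);
    }
}
```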