MyHDFS

This project is a minimal implementation of a distributed file system similar to HDFS.

Features

* Uses Java RMI + Google protobuf for communication
* Replication factor = 2 
* Block size = 64K 
* The NameNode and DataNodes persist their block report information so that  
  they can recover after a machine restart
* Supports 4 DataNodes running on different, interconnected machines
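
To make the numbers above concrete, here is a minimal sketch of how a file maps onto 64K blocks with two replicas each. It is not taken from the MyHDFS sources: the class `BlockPlanner`, the node names, the reading of 64K as 64 * 1024 bytes, and the random placement policy are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; names and placement policy are hypothetical. */
final class BlockPlanner {
    static final int BLOCK_SIZE = 64 * 1024;  // assumes 64K means 65536 bytes
    static final int REPLICATION_FACTOR = 2;  // value listed under Features

    /** Number of blocks needed to hold fileSize bytes (ceiling division). */
    static long blockCount(long fileSize) {
        return (fileSize + BLOCK_SIZE - 1) / BLOCK_SIZE;
    }

    /** Picks REPLICATION_FACTOR distinct DataNodes at random for one block. */
    static List<String> chooseReplicas(List<String> dataNodes, Random rng) {
        if (dataNodes.size() < REPLICATION_FACTOR) {
            throw new IllegalArgumentException(
                "need at least " + REPLICATION_FACTOR + " DataNodes");
        }
        List<String> shuffled = new ArrayList<>(dataNodes);
        Collections.shuffle(shuffled, rng);
        return new ArrayList<>(shuffled.subList(0, REPLICATION_FACTOR));
    }

    public static void main(String[] args) {
        // Hypothetical node names standing in for the 4 DataNodes.
        List<String> nodes = List.of("dn1", "dn2", "dn3", "dn4");
        long fileSize = 200_000; // bytes
        long blocks = blockCount(fileSize);
        System.out.println(fileSize + " bytes -> " + blocks + " blocks");
        Random rng = new Random();
        for (long b = 0; b < blocks; b++) {
            System.out.println("block " + b + " -> " + chooseReplicas(nodes, rng));
        }
    }
}
```

With these assumptions, a 200,000-byte file needs 4 blocks, and each block is stored on 2 of the 4 DataNodes.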

Running NameNode & DataNode

* Provide the DataNode IPs in config.properties, then start the NameNode with "java -jar NameNode.jar"
* Start each DataNode with "java -jar -Djava.rmi.server.hostname=<NameNodeIP> dataNode.jar"
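
The layout of config.properties is not documented here, so the following is only a sketch of how a list of DataNode IPs could be read with `java.util.Properties`. The key `datanode.ips`, the comma-separated format, the sample addresses, and the class `DataNodeConfig` are hypothetical; check the actual file in the repository for the real key names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Illustrative sketch only; the property key and format are assumptions. */
final class DataNodeConfig {
    static final String KEY = "datanode.ips"; // hypothetical key name

    /** Splits the comma-separated value of KEY into trimmed, non-empty IPs. */
    static List<String> dataNodeIps(Properties props) {
        String raw = props.getProperty(KEY, "");
        List<String> ips = new ArrayList<>();
        for (String part : raw.split(",")) {
            String ip = part.trim();
            if (!ip.isEmpty()) {
                ips.add(ip);
            }
        }
        return ips;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the contents of config.properties (made-up addresses).
        String sample = "datanode.ips = 10.0.0.2, 10.0.0.3, 10.0.0.4, 10.0.0.5\n";
        Properties props = new Properties();
        props.load(new StringReader(sample));
        System.out.println(dataNodeIps(props));
    }
}
```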

Protobuf Generation Command

* protoc -I<include-path> --java_out=<src-dir> <.proto file path>

Inter Docker Communication

If you run the nodes in Docker containers, check this for inter-container communication.

Future Scope

* Implement MapReduce on top of this HDFS.
* Make the block size and replication factor configurable.
* Instead of reading a block from a randomly chosen replica, fetch it from the nearest DataNode.
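
The last item can be sketched as follows: a random pick, as the item describes for the current read path, next to a nearest-replica pick. This is a rough illustration, not the project's code. The class `ReplicaSelector`, the node names, and the distance map (for example hop count or measured latency) are assumptions; how distance would actually be obtained is left open.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative sketch only; names and the distance metric are hypothetical. */
final class ReplicaSelector {
    /** Random approach: any replica holding the block may be chosen. */
    static String pickRandom(List<String> replicas, Random rng) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas available");
        }
        return replicas.get(rng.nextInt(replicas.size()));
    }

    /** Nearest approach: choose the replica with the smallest known distance. */
    static String pickNearest(List<String> replicas, Map<String, Integer> distance) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas available");
        }
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String replica : replicas) {
            // Replicas with unknown distance are treated as farthest away.
            int d = distance.getOrDefault(replica, Integer.MAX_VALUE);
            if (best == null || d < bestDistance) {
                best = replica;
                bestDistance = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("dn1", "dn3");       // hypothetical holders
        Map<String, Integer> distance = Map.of("dn1", 4, "dn3", 1); // made-up values
        System.out.println("random:  " + pickRandom(replicas, new Random()));
        System.out.println("nearest: " + pickNearest(replicas, distance));
    }
}
```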

manoj535/MyHDFS
