[Group 54] Distributed System

This project implements a basic distributed log querier and a distributed failure detector.

Description

Distributed Log Querier

We have implemented a distributed system where a client process can query for logs from a set of machines. The client is agnostic to the number of machines and the specifics of the distributed environment.

The client process first chooses a coordinator process to forward the query request to. If the connection to the chosen coordinator fails, the client randomly chooses another coordinator until a connection succeeds. As long as at least one coordinator process is running, the client's request will be processed.
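The failover described above can be sketched roughly as follows. This is a minimal illustration, not the project's client code: the function name `firstReachable`, the addresses, and the 500ms dial timeout are all assumptions. It also visits the coordinators in one random order and gives up after a single pass, whereas the real client may keep retrying.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"net"
	"time"
)

// errAllUnreachable is returned when every coordinator was tried and none accepted.
var errAllUnreachable = errors.New("all coordinators unreachable")

// firstReachable (hypothetical helper) tries the coordinators in a random
// order and returns the address of the first one for which try succeeds.
func firstReachable(addrs []string, try func(addr string) error) (string, error) {
	// rand.Perm yields a random visiting order, so no coordinator is
	// attempted twice before all others have been attempted once.
	for _, i := range rand.Perm(len(addrs)) {
		if err := try(addrs[i]); err == nil {
			return addrs[i], nil
		}
	}
	return "", errAllUnreachable
}

func main() {
	// Hypothetical coordinator addresses; a real list would come from configuration.
	addrs := []string{"127.0.0.1:9001", "127.0.0.1:9002", "127.0.0.1:9003"}

	var conn net.Conn
	addr, err := firstReachable(addrs, func(a string) error {
		c, err := net.DialTimeout("tcp", a, 500*time.Millisecond)
		if err == nil {
			conn = c // keep the successful connection for the query
		}
		return err
	})
	if err != nil {
		fmt.Println("query failed:", err)
		return
	}
	defer conn.Close()
	fmt.Println("connected to coordinator", addr)
}
```

Passing the connection attempt in as a function keeps the selection logic independent of the transport, so it can be exercised without a network.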

The coordinator then forwards the query request to the worker/service processes, each of which runs the query on the chunk of the log file that resides on it and returns the result to the coordinator. The coordinator aggregates the results from all the service processes and returns a single response to the client.
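A rough sketch of this fan-out and aggregation step is shown below. It is illustrative only: the `worker` type and `fanOut` function are hypothetical names, in-process goroutines stand in for the network calls to the service processes, and a plain substring match stands in for whatever query processing the services actually perform.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// worker stands in for one service process holding a chunk of the log.
type worker struct {
	name  string
	lines []string
}

// search returns the lines of this worker's chunk that contain the query,
// prefixed with the worker name so the origin of each hit is visible.
func (w worker) search(query string) []string {
	var hits []string
	for _, l := range w.lines {
		if strings.Contains(l, query) {
			hits = append(hits, w.name+": "+l)
		}
	}
	return hits
}

// fanOut sends the query to every worker concurrently and merges the
// results in worker order, so the combined response is deterministic.
func fanOut(workers []worker, query string) []string {
	results := make([][]string, len(workers))
	var wg sync.WaitGroup
	for i, w := range workers {
		wg.Add(1)
		go func(i int, w worker) {
			defer wg.Done()
			results[i] = w.search(query) // each goroutine writes only its own slot
		}(i, w)
	}
	wg.Wait()

	var merged []string
	for _, r := range results {
		merged = append(merged, r...)
	}
	return merged
}

func main() {
	// Hypothetical log chunks for two machines.
	workers := []worker{
		{"vm1", []string{"GET /index", "POST /login"}},
		{"vm2", []string{"GET /about", "DELETE /user"}},
	}
	for _, line := range fanOut(workers, "GET") {
		fmt.Println(line)
	}
}
```

Giving each goroutine its own slot in `results` avoids a mutex and keeps the merged output in a stable order regardless of which worker answers first.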

Distributed Group Membership

We have implemented a SWIM-style distributed group membership protocol. In each protocol period (0.5s), each machine in the ring-shaped topology pings one of its neighbors (predecessor, successor, super-successor) in cyclic order, so it monitors all 3 neighbors over three protocol periods. This design ensures that up to 3 simultaneous failures are detected. It scales well to a large number of nodes because each process monitors only 3 of its neighbors and sends a copy of the membership list to them, instead of flooding the whole topology with the list, which could increase network congestion as in all-to-all heartbeat failure detection. We also piggyback the marshaled membership list on every “pong” that a node sends out, which gives gossip-style dissemination of the membership list.
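The neighbor selection can be illustrated with a small sketch. It assumes nodes are identified by their index in the ring and that the cycle visits predecessor, successor, then super-successor; the function names and the ring size are hypothetical, and the actual ordering in the code may differ.

```go
package main

import "fmt"

// neighbors returns the ring indices of the predecessor, successor, and
// super-successor of node self in a ring of n nodes.
// For n < 4 some of these coincide; the sketch does not deduplicate them.
func neighbors(self, n int) [3]int {
	return [3]int{
		(self - 1 + n) % n, // predecessor
		(self + 1) % n,     // successor
		(self + 2) % n,     // super-successor
	}
}

// pingTarget picks the neighbor to ping in the given protocol period,
// cycling through all three neighbors over three consecutive periods.
func pingTarget(self, n, period int) int {
	return neighbors(self, n)[period%3]
}

func main() {
	const n = 10 // hypothetical ring size
	for period := 0; period < 6; period++ {
		fmt.Printf("period %d: node 0 pings node %d\n", period, pingTarget(0, n, period))
	}
}
```

With 10 nodes, node 0 pings nodes 9, 1, and 2 in turn and then repeats, so each neighbor is probed by node 0 once every three periods.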

Each entry in the membership list moves through the states “Active” -> “Suspicious” -> “Failed” -> “Delete”, after which the entry for that process is removed. If a process doesn’t respond to the UDP ping, it is marked “Suspicious”; if it remains suspicious for T_FAIL = 1s, its state is updated to “Failed”. Once the process is marked as failed, we wait another T_DELETE = 1s before marking its state as “Delete”, after which the process is removed from the list. This ensures that a failed process is removed from the list in well under 3s.
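The state transitions above can be sketched as a small timer-driven state machine. This is an assumption-laden illustration: the `entry` type and its methods are hypothetical, time is modeled as an elapsed `time.Duration` rather than a wall clock to keep the example deterministic, and the sketch does not cover how a suspicious entry might return to “Active”, which the text does not describe.

```go
package main

import (
	"fmt"
	"time"
)

type state int

const (
	active state = iota
	suspicious
	failed
	deleted
)

func (s state) String() string {
	return [...]string{"Active", "Suspicious", "Failed", "Delete"}[s]
}

const (
	tFail   = 1 * time.Second // T_FAIL from the text
	tDelete = 1 * time.Second // T_DELETE from the text
)

// entry is a hypothetical membership-list record.
type entry struct {
	st    state
	since time.Duration // elapsed time at which the entry entered its current state
}

// onMissedPing marks an active entry as suspicious.
func (e *entry) onMissedPing(now time.Duration) {
	if e.st == active {
		e.st, e.since = suspicious, now
	}
}

// tick applies the time-based transitions and reports whether the entry
// should now be removed from the membership list.
func (e *entry) tick(now time.Duration) bool {
	if e.st == suspicious && now-e.since >= tFail {
		// Start the delete timer at the moment the entry should have
		// failed, so a late tick does not stretch the total delay.
		e.st, e.since = failed, e.since+tFail
	}
	if e.st == failed && now-e.since >= tDelete {
		e.st = deleted
	}
	return e.st == deleted
}

func main() {
	e := &entry{st: active}
	e.onMissedPing(0)
	for _, now := range []time.Duration{500 * time.Millisecond, tFail, tFail + tDelete} {
		remove := e.tick(now)
		fmt.Printf("t=%v state=%v remove=%v\n", now, e.st, remove)
	}
}
```

In this sketch an entry is removed T_FAIL + T_DELETE = 2s after the missed ping, which is consistent with the bound stated above once detection time is added.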

Running the code

Setting up the code

$ git clone https://gitlab.engr.illinois.edu/shahidi3/cs425_mp1_distributedlogquerier.git
$ mkdir logs   # should contain the log pertaining to the node
$ cd cs425_mp1_distributedlogquerier
$ go mod download

Running the Distributed Log Querier

[optional] To delete the previous logs of coordinator and the service

$ ./removelogs.sh 

To start the coordinator and the service on the node

$ ./run.sh

To run the client and query for an arbitrary query

$ cd client
$ go run client.go -query "searchQuery"

To run the test client

$ cd testclient
$ go run testclient

To stop the coordinator and service processes in the end

$ ./stop.sh

Running the Distributed Group Membership

[optional] To delete the previous logs

$ ./removelogs.sh 

Starting the introducer

$ cd commands/introducer
$ go run introducer.go

Starting a process

$ cd commands/process
$ go run process.go

Authors

Lavanya Ramkumar
Shahid Ikram
