Implementing Map reduce paper in Golang. Dealing with race conditions, worker failures.

nancyp321/MapReduceImplementation

MapReduceImplementation

Implementing the MapReduce paper in Go, dealing with race conditions and worker failures.

This repo implements the word count problem using the MapReduce approach.

  • Initially, the master reads the text files provided at run time.
  • After the master starts, whenever a worker requests a map or reduce task, the master checks its available (not yet started) map and reduce tasks.
  • The master does not hand out reduce tasks until all the map tasks are complete.
  • So if a worker requests a reduce task before the map phase has finished, it still receives a map task; the custom Task struct keeps track of which kind of task was handed out.
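The hand-out rule above can be sketched roughly as follows. This is a minimal illustration, not the repo's actual code: the `Coordinator` fields, the `TaskKind`/`TaskStatus` constants, `NextTask`, and `CompleteMap` are all hypothetical names, and returning a "no task" reply while map tasks are still running is an assumption about how the waiting case might be handled.

```go
package main

import (
	"fmt"
	"sync"
)

// All type and field names below are hypothetical, chosen for illustration.
type TaskKind int

const (
	MapTask TaskKind = iota
	ReduceTask
	NoTask // nothing can be handed out right now
)

type TaskStatus int

const (
	Idle TaskStatus = iota
	InProgress
	Complete
)

type Task struct {
	Kind   TaskKind
	File   string
	Status TaskStatus
}

type Coordinator struct {
	mu          sync.Mutex
	mapTasks    []*Task
	reduceTasks []*Task
}

// NextTask ignores which kind of task the worker asked for: idle map tasks
// always go first, and reduce tasks are released only once every map task
// is complete.
func (c *Coordinator) NextTask() Task {
	c.mu.Lock()
	defer c.mu.Unlock()
	allMapsDone := true
	for _, t := range c.mapTasks {
		if t.Status == Idle {
			t.Status = InProgress
			return *t
		}
		if t.Status != Complete {
			allMapsDone = false
		}
	}
	if !allMapsDone {
		return Task{Kind: NoTask} // assumption: worker is told to retry later
	}
	for _, t := range c.reduceTasks {
		if t.Status == Idle {
			t.Status = InProgress
			return *t
		}
	}
	return Task{Kind: NoTask}
}

// CompleteMap marks the map task for the given file as complete.
func (c *Coordinator) CompleteMap(file string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, t := range c.mapTasks {
		if t.File == file {
			t.Status = Complete
		}
	}
}

func main() {
	c := &Coordinator{
		mapTasks:    []*Task{{Kind: MapTask, File: "pg-a.txt"}},
		reduceTasks: []*Task{{Kind: ReduceTask, File: "mr-1-0"}},
	}
	fmt.Println(c.NextTask().Kind == MapTask) // map task is handed out first
	fmt.Println(c.NextTask().Kind == NoTask)  // map still running, reduce is held back
	c.CompleteMap("pg-a.txt")
	fmt.Println(c.NextTask().Kind == ReduceTask) // reduce released after all maps finish
}
```

Holding a single mutex for the whole selection keeps two workers from being handed the same idle task, which is one of the race conditions this kind of coordinator has to avoid.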

For map tasks:

  • Each worker is given a unique ID at init. The output of each map task is divided into nReduce partitions, one per reduce task.
  • The worker applies the map function to the content of the file it receives from the master and writes the intermediate output, in the format (word, 1), to intermediate files named mr-workerID-reduceID.
  • Once a task is handed out, the master sleeps for 10 seconds and then checks whether the status in the task's custom Task struct has changed to complete.
  • If not, the task is marked as incomplete again and is later given to the next worker that asks.
  • If the master receives an ACK within 10 seconds, it marks the task as complete and creates Task structs for all the intermediate files.
  • All these newly created Tasks are added to the master's reduce queue.
  • Once all map tasks are done, the reduce stage starts.

For reduce tasks:

Since a single file is processed by a single worker, we group the reduce tasks by workerID.
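As a rough illustration of that grouping, the sketch below parses intermediate file names of the form mr-workerID-reduceID and buckets them by worker ID. The helpers `parseName` and `groupByWorker` are hypothetical, and it is an assumption that both IDs are plain integers.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseName splits "mr-<workerID>-<reduceID>" into its two numeric IDs.
// Assumption: both IDs are decimal integers.
func parseName(name string) (workerID, reduceID int, err error) {
	parts := strings.Split(name, "-")
	if len(parts) != 3 || parts[0] != "mr" {
		return 0, 0, fmt.Errorf("unexpected intermediate file name %q", name)
	}
	if workerID, err = strconv.Atoi(parts[1]); err != nil {
		return 0, 0, err
	}
	if reduceID, err = strconv.Atoi(parts[2]); err != nil {
		return 0, 0, err
	}
	return workerID, reduceID, nil
}

// groupByWorker buckets intermediate file names by the worker that wrote them.
func groupByWorker(names []string) (map[int][]string, error) {
	groups := make(map[int][]string)
	for _, n := range names {
		w, _, err := parseName(n)
		if err != nil {
			return nil, err
		}
		groups[w] = append(groups[w], n)
	}
	for _, files := range groups {
		sort.Strings(files) // stable ordering within each group
	}
	return groups, nil
}

func main() {
	groups, err := groupByWorker([]string{"mr-2-0", "mr-1-1", "mr-1-0"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, w := range []int{1, 2} {
		fmt.Println("worker", w, "files", groups[w])
	}
}
```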

Final done() stage

Once the reduce queue in the master is empty, the master exits.
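A minimal sketch of such an exit check is shown below. All names are hypothetical. It also assumes the check should wait for outstanding map tasks and for reduce tasks that have been handed out but not yet acknowledged, since the reduce queue is also empty before any map task has finished.

```go
package main

import (
	"fmt"
	"sync"
)

// Coordinator holds only the fields this sketch needs; all names are hypothetical.
type Coordinator struct {
	mu            sync.Mutex
	mapsRemaining int      // map tasks not yet acknowledged
	reduceQueue   []string // reduce tasks not yet handed out
	reduceRunning int      // reduce tasks handed out but not yet acknowledged
}

// Done reports whether the whole job has finished and the master can exit.
func (c *Coordinator) Done() bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.mapsRemaining == 0 && len(c.reduceQueue) == 0 && c.reduceRunning == 0
}

// TakeReduce pops the next reduce task, if the map phase is over and one exists.
func (c *Coordinator) TakeReduce() (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.mapsRemaining > 0 || len(c.reduceQueue) == 0 {
		return "", false
	}
	t := c.reduceQueue[0]
	c.reduceQueue = c.reduceQueue[1:]
	c.reduceRunning++
	return t, true
}

// AckReduce records that one running reduce task has completed.
func (c *Coordinator) AckReduce() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.reduceRunning > 0 {
		c.reduceRunning--
	}
}

func main() {
	c := &Coordinator{reduceQueue: []string{"mr-1-0"}}
	fmt.Println(c.Done()) // false: a reduce task is still queued
	c.TakeReduce()
	fmt.Println(c.Done()) // false: the task is running but not acknowledged
	c.AckReduce()
	fmt.Println(c.Done()) // true: nothing queued, nothing running
}
```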

To do

Handle the case where a worker's ACK is received more than 10 seconds after the task was handed out.

To run:

Open at least 2 terminal tabs: one for the coordinator (coordinator.go) and the rest for the different workers (worker.go).
go build -race -buildmode=plugin ../mrapps/wc.go
go run -race mrcoordinator.go pg-*.txt (To start master)
go run -race mrworker.go wc.so (To start single worker)

To test:

bash test-mr.sh
