Iterative reduce

jpatanooga edited this page Nov 1, 2012 · 4 revisions

What is Iterative Reduce?

Iterative Reduce was designed specifically for parallel iterative algorithms on Hadoop. It is implemented directly on top of YARN and provides intrinsic parallelism while abstracting away part of the "plumbing" complexity of YARN programming.

Lifecycle of an IterativeReduce Application

Client

  • Launches the YARN ApplicationMaster

Master

  • Computes required resources
  • Obtains resources from YARN
  • Launches Workers

Workers

  • Computes on partial data (an input split)
  • Synchronizes with Master
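
The compute-then-synchronize loop between Master and Workers can be illustrated with a small, self-contained sketch. It does not use the IterativeReduce or YARN APIs: the class and method names (`IterativeReduceSketch`, `workerCompute`, `masterUpdate`, `run`) are hypothetical, a thread pool stands in for YARN containers, and the algorithm (iteratively estimating a mean) is chosen only because it is easy to check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not the IterativeReduce API.
public class IterativeReduceSketch {

    // Worker step: compute a partial update from one input split.
    // Returns {sum of (x - estimate), number of records in the split}.
    static double[] workerCompute(double[] split, double estimate) {
        double sum = 0.0;
        for (double x : split) {
            sum += x - estimate;
        }
        return new double[] {sum, split.length};
    }

    // Master step: merge all partial updates into a new global estimate.
    static double masterUpdate(List<double[]> partials, double estimate, double learningRate) {
        double sum = 0.0;
        double count = 0.0;
        for (double[] p : partials) {
            sum += p[0];
            count += p[1];
        }
        return count == 0 ? estimate : estimate + learningRate * (sum / count);
    }

    // Full loop: one task per split (standing in for one worker container),
    // with the master waiting for every worker before starting the next iteration.
    static double run(double[][] splits, int iterations, double learningRate) {
        ExecutorService workers = Executors.newFixedThreadPool(Math.max(1, splits.length));
        try {
            double estimate = 0.0;
            for (int i = 0; i < iterations; i++) {
                final double current = estimate;
                List<Future<double[]>> pending = new ArrayList<>();
                for (double[] split : splits) {
                    pending.add(workers.submit(() -> workerCompute(split, current)));
                }
                List<double[]> partials = new ArrayList<>();
                for (Future<double[]> f : pending) {
                    try {
                        partials.add(f.get()); // synchronization point with the master
                    } catch (InterruptedException | ExecutionException e) {
                        throw new IllegalStateException("worker failed", e);
                    }
                }
                estimate = masterUpdate(partials, current, learningRate);
            }
            return estimate;
        } finally {
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        double[][] splits = { {1.0, 2.0, 3.0}, {4.0, 5.0}, {6.0} };
        // The estimate should approach the overall mean of the six values.
        System.out.println(run(splits, 50, 0.5));
    }
}
```

In this sketch each iteration is a barrier: the master does not update the global state until every worker has reported, which mirrors the "Synchronizes with Master" step above. How the real framework schedules and transports these updates is not shown here.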

Iterative Reduce Programming Guide

Resources

https://github.com/emsixteeen/IterativeReduce