This repository has been archived by the owner on Jun 16, 2023. It is now read-only.

Grouping

longdafeng edited this page Oct 21, 2014 · 3 revisions
  1. fieldsGrouping -- similar to "GROUP BY" in SQL: tuples with the same value in the target field are always sent to the same task.
  2. globalGrouping -- all tuples are sent to the first task of the component.
  3. shuffleGrouping -- tuples are shuffled across the tasks of the component.
  4. localOrShuffleGrouping -- if the target component has tasks in the current worker, tuples are shuffled among those tasks only; otherwise it behaves the same as shuffleGrouping.
  5. localFirst -- tasks of the target component fall into three tiers: (1) tasks in the same worker, (2) tasks on the same node but in a different worker, and (3) tasks on other nodes. Tuples are shuffled among the tier-1 tasks if any exist, otherwise among the tier-2 tasks, and among the tier-3 tasks only if neither of the first two tiers has any tasks.
  6. noneGrouping -- tuples are sent to randomly chosen tasks of the component. It is similar to shuffleGrouping, but it does not guarantee that tuples are distributed evenly across the tasks.
  7. allGrouping -- each tuple is sent to every task of the component, so every task receives its own copy.
  8. directGrouping -- each tuple is sent to the specific task named by the emitting component.
  9. customGrouping -- tuples are sent to the tasks selected by a user-defined grouping.
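
The routing rules above can be sketched in plain Java. This is only an illustration of the selection logic, not JStorm's actual implementation: the class `GroupingSketch`, its method names, and the use of integer task indexes are all hypothetical, and a uniform random pick stands in for the shuffle step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Random;

/** Hypothetical sketch of how groupings choose target tasks; not JStorm code. */
final class GroupingSketch {

    /** fieldsGrouping: equal field values always map to the same task index. */
    static int fieldsGrouping(Object fieldValue, int numTasks) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return Math.floorMod(Objects.hashCode(fieldValue), numTasks);
    }

    /** globalGrouping: every tuple goes to the first task (index 0 here). */
    static int globalGrouping(int numTasks) {
        return 0;
    }

    /** allGrouping: every task receives a copy, so all indexes are returned. */
    static List<Integer> allGrouping(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < numTasks; i++) {
            targets.add(i);
        }
        return targets;
    }

    /**
     * localFirst: pick from the first non-empty tier, in the order
     * same worker, same node, other nodes. Assumes at least one tier
     * contains a task.
     */
    static int localFirst(Random random,
                          List<Integer> sameWorker,
                          List<Integer> sameNode,
                          List<Integer> otherNodes) {
        List<Integer> candidates = !sameWorker.isEmpty() ? sameWorker
                : !sameNode.isEmpty() ? sameNode
                : otherNodes;
        // Simplification: a uniform random pick stands in for the shuffle.
        return candidates.get(random.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        Random random = new Random();
        System.out.println("fields(user-42) -> task " + fieldsGrouping("user-42", 4));
        System.out.println("fields(user-42) -> task " + fieldsGrouping("user-42", 4));
        System.out.println("global          -> task " + globalGrouping(4));
        System.out.println("all             -> tasks " + allGrouping(4));
        System.out.println("localFirst      -> task "
                + localFirst(random, List.of(), List.of(5), List.of(7, 8)));
    }
}
```

The two `fieldsGrouping` calls print the same task index because the choice depends only on the field value and the task count. In the `localFirst` call the same-worker tier is empty, so the only same-node task is chosen.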