Bagel: Large-scale graph processing on Spark #48

ankurdave · 2011-05-03T22:49:56Z

Bagel is an implementation of the Pregel graph processing framework on Spark.

Bagel currently supports basic graph computation, combiners, and aggregators. Future work includes support for mutating the graph topology. Tests exist but currently don't run due to a Spark bug.
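For readers unfamiliar with the Pregel model, the following is a minimal single-machine sketch of how supersteps, a combiner, and an aggregator fit together. It is not Bagel's API: the names `run_pregel`, `sssp_compute`, and `shortest_paths_demo`, the function signatures, and the sample graph are all hypothetical, and it is written in Python rather than Scala purely for illustration. As a simplification, a vertex runs only in supersteps where it has an incoming message.

```python
import math

def run_pregel(values, edges, initial_messages, compute, combine, aggregate):
    """Run supersteps until no messages remain (hypothetical sketch, not Bagel's API).

    values: {vertex_id: value}; edges: {vertex_id: [(target_id, weight), ...]}.
    """
    values = dict(values)
    inbox = dict(initial_messages)
    agg_history = []
    while inbox:
        # Aggregator: one global value computed over all vertices,
        # made available to every compute() call in this superstep.
        aggregated = aggregate(values)
        agg_history.append(aggregated)
        outbox = {}
        for vid in values:
            if vid not in inbox:
                continue  # no mail this superstep, so the vertex stays idle
            new_value, outgoing = compute(
                values[vid], inbox[vid], edges.get(vid, []), aggregated)
            values[vid] = new_value
            for target, msg in outgoing:
                # Combiner: merge messages bound for the same target
                # so that each vertex receives a single message.
                outbox[target] = (
                    msg if target not in outbox else combine(outbox[target], msg))
        inbox = outbox
    return values, agg_history

def sssp_compute(dist, msg, out_edges, aggregated):
    """Single-source shortest paths; the aggregated value is unused here."""
    if msg < dist:
        # Found a shorter path: adopt it and notify neighbours.
        return msg, [(target, msg + weight) for target, weight in out_edges]
    return dist, []

def shortest_paths_demo():
    edges = {"a": [("b", 1), ("c", 4)], "b": [("c", 2), ("d", 5)], "c": [("d", 1)]}
    start = {v: math.inf for v in ("a", "b", "c", "d")}
    return run_pregel(
        start, edges, {"a": 0}, sssp_compute,
        combine=min,  # only the smallest candidate distance matters
        aggregate=lambda vals: sum(1 for d in vals.values() if d < math.inf))

if __name__ == "__main__":
    distances, reached_per_superstep = shortest_paths_demo()
    print(distances)
    print(reached_per_superstep)
```

In this sketch the combiner is `min`, since a vertex only needs the best candidate distance, and the aggregator counts how many vertices have been reached so far. A real implementation would distribute vertices and messages across a cluster instead of looping over a dictionary.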

Note: This test suite currently fails for the same reason that the Spark Core test suite fails: Spark currently seems to have a bug where any test after the first one fails.

Refactored out the agg() and comp() methods from Pregel.run. Defined an implicit conversion to allow applications that don't use aggregators to avoid including a null argument for the result of the aggregator in the compute function.

tjhunter · 2011-05-04T00:46:34Z

I would recommend refactoring your code before merging; it is always harder, and less tempting, to do afterward.

mateiz · 2011-05-09T21:21:30Z

This looks great, Ankur, except for two naming things: can you change the package name from bagel to spark.bagel, and can you rename the Pregel class to Bagel?

ankurdave · 2011-05-09T22:28:25Z

Sure, I've done so.

mateiz · 2011-05-13T04:35:17Z

Looks great, thanks. One thing I should add: it would be good to write a README or a wiki page documenting the examples, and to put a comment in the code pointing to it.

ankurdave added 8 commits May 3, 2011 15:37

Add Bagel, an implementation of Pregel on Spark

c0736f6

Clean up Pregel.run, add logging

62ef620

Add Bagel classpath to run script

45ec9db

Update ShortestPath to work with controllable partitioning

19122af

Clean up Bagel source and interface

c5b3ea7

Add Bagel test suite

1c8ca0e

Note: This test suite currently fails for the same reason that the Spark Core test suite fails: Spark currently seems to have a bug where any test after the first one fails.

Package combiner functions into a trait

c18fa3e

Refactor and add aggregator support

563c5e7

Refactored out the agg() and comp() methods from Pregel.run. Defined an implicit conversion to allow applications that don't use aggregators to avoid including a null argument for the result of the aggregator in the compute function.

Move shortest path and PageRank to bagel.examples

c110405

Rename bagel to spark.bagel and Pregel to Bagel

f40a089

mateiz added a commit that referenced this pull request May 13, 2011

Merge pull request #48 from ankurdave/bagel-new

4b1f0f1

Bagel: Large-scale graph processing on Spark

mateiz merged commit 4b1f0f1 into mesos:new-rdds May 13, 2011
