ariehf edited this page Jan 2, 2017 · 6 revisions

RDDComparisons lets you compare two RDDs. The compare methods return a sample of the mismatch between the two RDDs, or None when the RDDs match:

  • With order using method compareRDDWithOrder (e.g. [1,2,3] != [3,2,1]).
  • Without order using method compareRDD.
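The difference between the two modes can be sketched on plain Scala collections, without Spark. This is only a minimal model of the semantics described above, not the library's implementation: the object `ComparisonModel`, its method names, and the shape of the returned mismatch are hypothetical choices made for illustration.

```scala
// Hypothetical model of ordered vs. unordered comparison on local sequences.
// None means "no mismatch found"; Some(...) carries one sample mismatch.
object ComparisonModel {

  // Order-sensitive: walk both sequences position by position and return
  // the first pair that differs. A side that has run out of elements is None.
  def compareWithOrder[T](expected: Seq[T], result: Seq[T]): Option[(Option[T], Option[T])] = {
    val e: Seq[Option[T]] = expected.map(Some(_))
    val r: Seq[Option[T]] = result.map(Some(_))
    e.zipAll(r, Option.empty[T], Option.empty[T])
      .find { case (left, right) => left != right }
  }

  // Order-insensitive: count occurrences of each element on both sides and
  // return one element whose counts differ, as (element, expectedCount, resultCount).
  def compareUnordered[T](expected: Seq[T], result: Seq[T]): Option[(T, Int, Int)] = {
    val expectedCounts = expected.groupBy(identity).map { case (k, v) => k -> v.size }
    val resultCounts = result.groupBy(identity).map { case (k, v) => k -> v.size }
    (expectedCounts.keySet ++ resultCounts.keySet).toSeq
      .map(k => (k, expectedCounts.getOrElse(k, 0), resultCounts.getOrElse(k, 0)))
      .find { case (_, inExpected, inResult) => inExpected != inResult }
  }
}
```

In this model, `Seq(1, 2, 3)` and `Seq(3, 2, 1)` match without order (every element occurs once on each side) but mismatch with order, because the first position already differs.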

You can also simply assert that the RDDs are equal:

  • With order using method assertRDDEqualsWithOrder.
  • Without order using method assertRDDEquals.

For Java users, the same functionality is available through JavaRDDComparisons.

Example:

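import com.holdenkarau.spark.testing.{RDDComparisons, SharedSparkContext}
import org.scalatest.FunSuite

class RDDComparisonTest extends FunSuite with SharedSparkContext with RDDComparisons {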

  test("test RDDComparisons") {
    val expectedRDD = sc.parallelize(Seq(1, 2, 3))
    val resultRDD = sc.parallelize(Seq(3, 2, 1))

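    assert(None === compareRDD(expectedRDD, resultRDD)) // passes: same elements, order ignored
    assert(None === compareRDDWithOrder(expectedRDD, resultRDD)) // fails: same elements, different order

    assertRDDEquals(expectedRDD, resultRDD) // passes: same elements, order ignored
    assertRDDEqualsWithOrder(expectedRDD, resultRDD) // fails: same elements, different order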
  }
}