@zhengruifeng
Contributor

What changes were proposed in this pull request?

Safely register class for GraphX

How was this patch tested?

Added test suites.

@SparkQA

SparkQA commented May 8, 2019

Test build #105250 has finished for PR 24555 at commit d3e5f65.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@srowen srowen left a comment

What's the upside -- that GraphX classes won't work if registration is enforced?

classOf[HighlyCompressedMapStatus],
classOf[BitSet],
classOf[CompactBuffer[_]],
classOf[OpenHashSet[Int]],
Member

Are these actually different types, if the generic type is all that varies?
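For reference, generic type arguments are erased on the JVM, so `classOf` applied to the same class with different type arguments yields the same `Class` object. A minimal sketch of that behavior, using `scala.collection.mutable.HashSet` as a stand-in for Spark's `OpenHashSet` (an assumption made only to keep the example free of Spark dependencies; `ErasureDemo` is a hypothetical name):

```scala
import scala.collection.mutable

// Hypothetical demo object; HashSet stands in for Spark's OpenHashSet.
object ErasureDemo {
  // classOf yields the erased runtime class, so the type argument is irrelevant.
  val ints: Class[_] = classOf[mutable.HashSet[Int]]
  val longs: Class[_] = classOf[mutable.HashSet[Long]]
  val wildcard: Class[_] = classOf[mutable.HashSet[_]]

  // Collecting the three into a Set removes duplicates, leaving one Class object.
  val registered: Set[Class[_]] = Set(ints, longs, wildcard)

  def main(args: Array[String]): Unit = {
    println(ints == longs)
    println(ints == wildcard)
    println(registered.size)
  }
}
```

Listing several parameterizations of one class in a registration array is therefore redundant as far as `classOf` is concerned. Specialized variants, where the compiler generates them, are separate runtime classes and are not reached this way.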

Contributor Author

// Load these classes safely; skip any class that is not found.
Seq(
"org.apache.spark.graphx.Edge",
"org.apache.spark.graphx.Edge$mcB$sp",
Member

These are synthetic classes and their name may change. Are they needed?

Contributor Author

Yes, they are needed if we want to register Edge, since type specialization is used in https://github.com/apache/spark/blob/master/graphx/src/main/scala/org/apache/spark/graphx/Edge.scala#L32

I have tested this: if we do not register org.apache.spark.graphx.Edge$mcB$sp, Edge[Boolean] will not be handled by Kryo.
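The two ideas in this thread, specialized variants being separate runtime classes and loading classes by name while ignoring missing ones, can be sketched without Spark. The example below assumes Scala 2 (Scala 3 ignores `@specialized`); `Box`, `SpecializedNamesDemo`, and `loadIfPresent` are hypothetical names, and `Box` is only a stand-in for GraphX's `Edge`, not the actual implementation in the PR.

```scala
import scala.util.Try

// Hypothetical stand-in for GraphX's Edge, specialized on two primitives only.
// Assumes Scala 2; Scala 3 does not generate specialized variants.
class Box[@specialized(Boolean, Byte) T](val value: T)

object SpecializedNamesDemo {
  // Resolve a class by name; return None instead of throwing if it is absent.
  def loadIfPresent(name: String): Option[Class[_]] =
    Try[Class[_]](Class.forName(name)).toOption

  def main(args: Array[String]): Unit = {
    val generic = classOf[Box[_]].getName

    // The compiler picks a specialized subclass when the type argument
    // is a primitive listed in @specialized; otherwise the generic class.
    println(new Box(true).getClass.getName)
    println(new Box(1.toByte).getClass.getName)
    println(new Box("text").getClass.getName)

    // Int is not in the @specialized list, so that variant does not exist
    // and is silently skipped by loadIfPresent.
    val candidates =
      Seq(generic, generic + "$mcZ$sp", generic + "$mcB$sp", generic + "$mcI$sp")
    val found = candidates.flatMap(name => loadIfPresent(name))
    println(found.map(_.getName))
  }
}
```

In this sketch the suffix follows JVM type descriptors: the Boolean variant's name ends in `$mcZ$sp` and the Byte variant's in `$mcB$sp`, so it is worth checking which suffix corresponds to which element type when choosing names to register.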

import org.apache.spark.serializer.KryoSerializer

class GraphXPrimitiveKeyOpenHashMapSuite extends SparkFunSuite {
test("Kryo class register") {
Member

Hi, @zhengruifeng. This seems to fail on Jenkins. Was this tested in your local environment?

Contributor Author

I did not test this class in my local environment; it seems that too many anonymous classes are involved here. I will remove this.

@SparkQA

SparkQA commented May 9, 2019

Test build #105273 has finished for PR 24555 at commit 928f36a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng zhengruifeng changed the title [SPARK-27656][GraphX][WIP] Safely register class for GraphX [SPARK-27656][GraphX] Safely register class for GraphX May 9, 2019
@srowen
Member

srowen commented May 9, 2019

Just for completeness, what does this achieve? Does GraphX otherwise not work at all with registration enforced? And is it faster when it's not enforced?

@zhengruifeng
Contributor Author

@srowen I tested locally on some small datasets and found that the difference is tiny.
However, Kryo registration appears to be enabled in GraphX's benchmark, SynthBenchmark, so I think this will help improve performance.

@zhengruifeng zhengruifeng deleted the graphxutils_registerKryoClasses branch August 21, 2019 06:47