Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-21928][CORE] Set classloader on SerializerManager's private kryo #19280

Closed
wants to merge 2 commits into from

Conversation

squito
Copy link
Contributor

@squito squito commented Sep 19, 2017

What changes were proposed in this pull request?

We have to make sure that SerializerManager's private instance of
kryo also uses the right classloader, regardless of the current thread
classloader. In particular, this fixes serde during remote cache
fetches, as those occur in netty threads.

How was this patch tested?

Manual tests & existing suite via jenkins. I haven't been able to reproduce this is in a unit test, because when a remote RDD partition can be fetched, there is a warning message and then the partition is just recomputed locally. I manually verified the warning message is no longer present.

We have to make sure thatthat SerializerManager's private instance of
kryo also uses the right classloader, regardless of the current thread
classloader.  In particular, this fixes serde during remote cache
fetches, as those occur in netty threads.
@SparkQA
Copy link

SparkQA commented Sep 19, 2017

Test build #81944 has finished for PR 19280 at commit acbaf8b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 20, 2017

Test build #81949 has finished for PR 19280 at commit 20e3585.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@squito
Copy link
Contributor Author

squito commented Sep 20, 2017

reaching out to some potential reviewers: @vanzin @srowen @JoshRosen @mridulm @tgravescs

@vanzin
Copy link
Contributor

vanzin commented Sep 20, 2017

Looks ok to me, assuming the "default serializer" in SerializerManager is configured correctly through other means.

Title would sound better with a possessive: "SerializerManager's private kryo"

@squito squito changed the title [SPARK-21928][CORE] Set classloader on SerializerManager private kryo [SPARK-21928][CORE] Set classloader on SerializerManager's private kryo Sep 21, 2017
@squito
Copy link
Contributor Author

squito commented Sep 21, 2017

Looks ok to me, assuming the "default serializer" in SerializerManager is configured correctly through other means.

I think that part is fine. The serializer is created here:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L279

The same instance is assigned to SparkEnv.serializer: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L374

Which has its default classloader set in Executor.scala, right by the part I'm changing: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L131

@vanzin
Copy link
Contributor

vanzin commented Sep 21, 2017

LGTM, merging to master / 2.2 / 2.1.

asfgit pushed a commit that referenced this pull request Sep 21, 2017
## What changes were proposed in this pull request?

We have to make sure that SerializerManager's private instance of
kryo also uses the right classloader, regardless of the current thread
classloader.  In particular, this fixes serde during remote cache
fetches, as those occur in netty threads.

## How was this patch tested?

Manual tests & existing suite via jenkins.  I haven't been able to reproduce this is in a unit test, because when a remote RDD partition can be fetched, there is a warning message and then the partition is just recomputed locally.  I manually verified the warning message is no longer present.

Author: Imran Rashid <irashid@cloudera.com>

Closes #19280 from squito/SPARK-21928_ser_classloader.

(cherry picked from commit b75bd17)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
@vanzin
Copy link
Contributor

vanzin commented Sep 21, 2017

Didn't merge to 2.1, please open a PR against that branch if you want it there.

@asfgit asfgit closed this in b75bd17 Sep 21, 2017
squito added a commit to squito/spark that referenced this pull request Sep 21, 2017
We have to make sure that SerializerManager's private instance of
kryo also uses the right classloader, regardless of the current thread
classloader.  In particular, this fixes serde during remote cache
fetches, as those occur in netty threads.

Manual tests & existing suite via jenkins.  I haven't been able to reproduce this is in a unit test, because when a remote RDD partition can be fetched, there is a warning message and then the partition is just recomputed locally.  I manually verified the warning message is no longer present.

Author: Imran Rashid <irashid@cloudera.com>

Closes apache#19280 from squito/SPARK-21928_ser_classloader.

(cherry picked from commit b75bd17)
@squito squito deleted the SPARK-21928_ser_classloader branch September 25, 2017 20:57
MatthewRBruce pushed a commit to Shopify/spark that referenced this pull request Jul 31, 2018
## What changes were proposed in this pull request?

We have to make sure that SerializerManager's private instance of
kryo also uses the right classloader, regardless of the current thread
classloader.  In particular, this fixes serde during remote cache
fetches, as those occur in netty threads.

## How was this patch tested?

Manual tests & existing suite via jenkins.  I haven't been able to reproduce this is in a unit test, because when a remote RDD partition can be fetched, there is a warning message and then the partition is just recomputed locally.  I manually verified the warning message is no longer present.

Author: Imran Rashid <irashid@cloudera.com>

Closes apache#19280 from squito/SPARK-21928_ser_classloader.

(cherry picked from commit b75bd17)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants