Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-5470][Core]use defaultClassLoader to load classes in KryoSerializer #4258

Closed
wants to merge 2 commits into from

Conversation

lianhuiwang
Copy link
Contributor

Now KryoSerializer load classes of classesToRegister at the time of its initialization. when we set spark.kryo.classesToRegister=class1, it will throw SparkException("Failed to load class to register with Kryo".
because in KryoSerializer's initialization, classLoader cannot include class of user's jars.
we need to use defaultClassLoader of Serializer in newKryo(), because executor will reset defaultClassLoader of Serializer after Serializer's initialization.
thank @zzcclp for reporting it to me.

@lianhuiwang lianhuiwang changed the title [SPARK-5470][Core]use defaultClassLoader to load classes of classesToRegister in KryoSeria... [SPARK-5470][Core]use defaultClassLoader to load classes in KryoSerializer Jan 29, 2015
@SparkQA
Copy link

SparkQA commented Jan 29, 2015

Test build #26281 has started for PR 4258 at commit 64cf306.

  • This patch merges cleanly.

@zzcclp
Copy link
Contributor

zzcclp commented Jan 29, 2015

LGTM, I test this with spark-branch 1.2, it works correctly. :)

@SparkQA
Copy link

SparkQA commented Jan 29, 2015

Test build #26281 has finished for PR 4258 at commit 64cf306.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class LogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasMaxIter,
    • class LogisticRegressionModel(JavaModel):
    • class Tokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class HashingTF(JavaTransformer, HasInputCol, HasOutputCol, HasNumFeatures):
    • class Param(object):
    • class Params(Identifiable):
    • template = '''class Has$Name(Params):
    • class HasMaxIter(Params):
    • class HasRegParam(Params):
    • class HasFeaturesCol(Params):
    • class HasLabelCol(Params):
    • class HasPredictionCol(Params):
    • class HasInputCol(Params):
    • class HasOutputCol(Params):
    • class HasNumFeatures(Params):
    • class Estimator(Params):
    • class Transformer(Params):
    • class Pipeline(Estimator):
    • class PipelineModel(Transformer):
    • class Identifiable(object):
    • class JavaWrapper(Params):
    • class JavaEstimator(Estimator, JavaWrapper):
    • class JavaTransformer(Transformer, JavaWrapper):
    • class JavaModel(JavaTransformer):

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26281/
Test PASSed.

@@ -97,7 +87,9 @@ class KryoSerializer(conf: SparkConf)
// Use the default classloader when calling the user registrator.
Thread.currentThread.setContextClassLoader(classLoader)
// Register classes given through spark.kryo.classesToRegister.
classesToRegister.foreach { clazz => kryo.register(clazz) }
classesToRegister.split(',')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do the splitting and filtering during initialization so that we don't incur them for each task?

@SparkQA
Copy link

SparkQA commented Jan 31, 2015

Test build #26447 has started for PR 4258 at commit 73b719f.

  • This patch merges cleanly.

@lianhuiwang
Copy link
Contributor Author

@sryza thank you. I have updated for your comment. Can you review again?

@SparkQA
Copy link

SparkQA commented Jan 31, 2015

Test build #26447 has finished for PR 4258 at commit 73b719f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26447/
Test PASSed.

@sryza
Copy link
Contributor

sryza commented Feb 2, 2015

LGTM

@SparkQA
Copy link

SparkQA commented Feb 2, 2015

Test build #26528 has started for PR 4258 at commit 73b719f.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 2, 2015

Test build #26528 has finished for PR 4258 at commit 73b719f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26528/
Test FAILed.

@lianhuiwang
Copy link
Contributor Author

Jenkins, retest this please.

@SparkQA
Copy link

SparkQA commented Feb 4, 2015

Test build #26707 has started for PR 4258 at commit 73b719f.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Feb 4, 2015

Test build #26707 has finished for PR 4258 at commit 73b719f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26707/
Test PASSed.

@srowen
Copy link
Member

srowen commented Feb 6, 2015

This looks like a logical change, since it makes sense to use the provided default classloader to register user classes, just as it's already used to create the serializer. Given reviews from 3 other people and confirmation this does what's intended, it seems like a good fix to merge.

@asfgit asfgit closed this in ed3aac7 Feb 6, 2015
@JoshRosen
Copy link
Contributor

I've cherry-picked this into branch-1.3 (1.3.0) as well.

asfgit pushed a commit that referenced this pull request Feb 6, 2015
…lizer

Now KryoSerializer load classes of classesToRegister at the time of its initialization. when we set spark.kryo.classesToRegister=class1, it will throw SparkException("Failed to load class to register with Kryo".
because in KryoSerializer's initialization, classLoader cannot include class of user's jars.
we need to use defaultClassLoader of Serializer in newKryo(), because executor will reset defaultClassLoader of Serializer after Serializer's initialization.
thank zzcclp for reporting it to me.

Author: lianhuiwang <lianhuiwang09@gmail.com>

Closes #4258 from lianhuiwang/SPARK-5470 and squashes the following commits:

73b719f [lianhuiwang] do the splitting and filtering during initialization
64cf306 [lianhuiwang] use defaultClassLoader to load classes of classesToRegister in KryoSerializer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants