
[SPARK-5494][SQL] SparkSqlSerializer Ignores KryoRegistrators #4693

Closed
wants to merge 1 commit

Conversation

hkothari

SparkSqlSerializer completely ignores custom KryoRegistrators (unlike the regular Spark serializer). This change calls super.newKryo before registering the SQL-specific classes, and removes the now-duplicate registrations and settings.
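For context, a minimal sketch of the approach described above (not the exact patch, which is in commit 8e756cc; the registered class is illustrative):

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer

// Sketch: build the SQL serializer on top of the base KryoSerializer,
// whose newKryo() already applies user-supplied KryoRegistrators
// (spark.kryo.registrator), instead of constructing a fresh Kryo and
// silently dropping those registrations.
private[sql] class SparkSqlSerializer(conf: SparkConf) extends KryoSerializer(conf) {
  override def newKryo(): Kryo = {
    // super.newKryo() runs the user's custom registrators first...
    val kryo = super.newKryo()
    // ...then SQL-specific classes are registered on top (illustrative example).
    kryo.register(classOf[org.apache.spark.sql.Row])
    kryo
  }
}
```

Because super.newKryo runs first, SQL registrations land after the user's, so the registration order stays consistent with the base KryoSerializer.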

@ash211
Contributor

ash211 commented Feb 19, 2015

Jenkins this is ok to test

@SparkQA

SparkQA commented Feb 19, 2015

Test build #27723 has finished for PR 4693 at commit 8e756cc.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@marmbrus
Contributor

There is no reason to allow custom Kryo registration. The SQL serializer is only ever used to serialize SQL types. Have you seen a bug where we miss a type?

@pwoody

pwoody commented Mar 2, 2015

I believe the bug is due to inconsistency between KryoSerializer and SqlSerializer in the order of registration. I don't know if there is a way to manage multiple Serializers, but mixed applications (sql/raw spark) can run into issues.

@marmbrus
Contributor

marmbrus commented Mar 2, 2015

@pwoody I don't think that is possible. As long as the two sides are using copies of Kryo with the same order of registration, things should work. As far as I know we never mix them up. Please let me know if you have a test case that shows otherwise.

If there are no examples of failures, then I suggest we close this issue.

@mccheah
Contributor

mccheah commented Mar 2, 2015

Talked with @pwoody offline and he's working on a test case, but I'll summarize what I think will break. If you use an SQLContext that wraps a SparkContext where spark.serializer is set, the operations that compute the SQL query will be fine, but when you collect the RDD, the results are serialized with the SQL serializer while the driver kryo-deserializes them using spark.serializer.

This was the issue that originally prompted this change.
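The scenario described above can be sketched as driver code (hypothetical app and registrator names; the exact failure depends on the registered classes):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical repro sketch for the described mismatch: the user configures
// Kryo with a custom registrator, runs a SQL query, and collects. If the SQL
// serializer ignores the registrator, executor-side serialization and
// driver-side deserialization can disagree on Kryo registration IDs.
object KryoMismatchRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("kryo-mismatch-repro")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .set("spark.kryo.registrator", "com.example.MyRegistrator") // hypothetical
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    // ... register a table "t", then collect results on the driver:
    // sqlContext.sql("SELECT * FROM t").collect() // may fail to deserialize
    sc.stop()
  }
}
```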

@marmbrus
Contributor

marmbrus commented Mar 2, 2015

@mccheah we set the serializer in SQL on a per-shuffle basis, so that would surprise me. However, if you can show it happening we should certainly fix it.

@yhuai
Contributor

yhuai commented Mar 29, 2015

@pwoody @mccheah any update on it? If you saw an error, what was the exception and error message?

@asfgit asfgit closed this in 0cc8fcb Apr 12, 2015