This repository was archived by the owner on Mar 7, 2018. It is now read-only.

Improve serialization performance #20

Merged: c-w merged 1 commit into master from improve-serialization on Jun 21, 2017

Conversation

c-w (Contributor) commented Jun 20, 2017

This PR switches our Spark setup to use the Kryo serializer for common DTO classes.

Unfortunately, we can't enforce Kryo registration for all classes (via the setting spark.kryo.registrationRequired): some classes that we depend on, such as TwitterReceiver, are non-public, so we can't easily get a handle on them to register them with Kryo.
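For readers unfamiliar with the setup, the configuration described above might look roughly like the sketch below. This is not the code from the PR: TweetDto and AnalyzedItemDto are hypothetical stand-ins for the project's DTO classes, and KryoConfSketch and buildConf are names made up for illustration. Only SparkConf, KryoSerializer, and the two settings are real Spark API.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer

// Hypothetical DTOs standing in for the project's real ones (assumption).
case class TweetDto(id: Long, text: String)
case class AnalyzedItemDto(source: String, keywords: List[String])

object KryoConfSketch {
  def buildConf(appName: String): SparkConf =
    new SparkConf()
      .setAppName(appName)
      // Use Kryo instead of the default Java serializer.
      .set("spark.serializer", classOf[KryoSerializer].getName)
      // Left at "false" (the default): serializing an unregistered class
      // does not throw, which matters for classes we cannot reference.
      .set("spark.kryo.registrationRequired", "false")
      // Register the commonly serialized DTOs so Kryo can write a compact
      // class identifier instead of the full class name.
      .registerKryoClasses(Array(classOf[TweetDto], classOf[AnalyzedItemDto]))
}
```

Setting spark.kryo.registrationRequired to "true" would instead make Spark fail on any class that was not registered, which is the mode the non-public classes prevent us from using.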

Resolves #15

Unfortunately we can't use Kryo serializer for all classes (via setting
"spark.kryo.registrationRequired") because some classes that we depend
upon are non-public (e.g. TwitterReceiver) so we can't easily get a
handle to them to register them in Kryo.

c-w force-pushed the improve-serialization branch from 43391b2 to a3ed4fc on June 21, 2017 03:29

kevinhartman (Contributor) left a comment

LGTM. Perhaps we can register some internal classes using reflection in a later PR?
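As a rough illustration of the reflection idea, one option is to look the non-public classes up by name at runtime and pass the resulting Class objects to Kryo registration. The sketch below is an assumption, not something this PR implements: the fully qualified name of TwitterReceiver is a guess that should be checked against the spark-streaming-twitter version in use, and ReflectiveKryoRegistration and registerByName are hypothetical names.

```scala
import org.apache.spark.SparkConf
import scala.util.Try

object ReflectiveKryoRegistration {
  // Assumed fully qualified name (not confirmed by this PR); verify it
  // against the spark-streaming-twitter artifact actually on the classpath.
  val internalClassNames: Seq[String] = Seq(
    "org.apache.spark.streaming.twitter.TwitterReceiver"
  )

  // Resolves each name with Class.forName, which does not need the class
  // to be visible at compile time, and registers whatever resolves.
  // Names that fail to load are skipped silently in this sketch.
  def registerByName(conf: SparkConf, names: Seq[String]): SparkConf = {
    val resolved: Seq[Class[_]] =
      names.flatMap(name => Try(Class.forName(name)).toOption)
    conf.registerKryoClasses(resolved.toArray)
  }
}
```

Usage would be along the lines of ReflectiveKryoRegistration.registerByName(conf, ReflectiveKryoRegistration.internalClassNames). Note that hard-coded class names are brittle across library upgrades, and turning on spark.kryo.registrationRequired would additionally require every class reachable from the registered ones to be registered as well.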

c-w (Contributor, Author) commented Jun 21, 2017

I did some research and didn't find anyone doing this. From my understanding, it's usually good enough to use Kryo for the commonly serialized data types (i.e., the DTOs) and keep using the Java serializer for less common types (like Receivers).

c-w merged commit 4bb54b2 into master on Jun 21, 2017
c-w deleted the improve-serialization branch on June 21, 2017 20:55
c-w removed the in progress label on Jun 21, 2017