Serialization is broken when using Scrooge-generated classes with the TypedAPI #449

Open
plaflamme opened this issue Jun 4, 2013 · 3 comments

Comments

@plaflamme

When a Tuple contains a Scrooge-generated class instance, serialization breaks: deserializing the instance fails with varying errors (StackOverflowError, OutOfMemoryError, etc.).

This reproduces only with the TypedAPI.

The workaround is to convert the instances to byte arrays wherever serialization kicks in (between mapper and reducer, between steps, etc.), or simply to use the Fields API.
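
For anyone who wants to see the shape of the byte-array workaround, here is a minimal sketch in plain Java. Everything in it is an assumption for illustration: `UserEvent` is a hypothetical stand-in for a Scrooge-generated struct, and `UserEventCodec` stands in for whatever Thrift binary encoding you actually use. The point is only that the value crossing the serialization boundary is a `byte[]`, which generic serializers handle without walking the object graph.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Scrooge-generated struct (not a real generated class).
final class UserEvent {
    final long userId;
    final String action;

    UserEvent(long userId, String action) {
        this.userId = userId;
        this.action = action;
    }
}

// Hypothetical codec; in a real job this role would be played by the
// struct's own Thrift binary encoding rather than DataOutputStream.
final class UserEventCodec {
    // Encode before the value crosses a serialization boundary.
    static byte[] toBytes(UserEvent event) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeLong(event.userId);
            out.writeUTF(event.action);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode on the other side, once the bytes have been shipped.
    static UserEvent fromBytes(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            long userId = in.readLong();
            String action = in.readUTF();
            return new UserEvent(userId, action);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a pipeline you would call the encode step right before the stage that triggers serialization and the decode step right after it, so only byte arrays travel between tasks.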

@azymnis
Contributor

azymnis commented Jun 5, 2013

I have run into this issue before, but not for all Scrooge types: it fails when a Scrooge class contains an iterable field that is too long, e.g. a list with more than a few thousand elements. It would be great if you could post your stack trace; in my case the error was happening somewhere inside Kryo (I don't have a stack trace right now). If that is the case here, then we probably need to write a ScroogeHadoopSerialization similar to this:

https://github.com/Cascading/cascading-thrift/blob/master/src/jvm/backtype/hadoop/ThriftSerialization.java

This could register for all ThriftStruct types and delegate either to the Bijection or directly to the ThriftStructSerializer class.

What do you think? Can you try this? If this solves your problem, we should make this.

@johnynek
Collaborator

johnynek commented Jun 5, 2013

The issue is that we need to make a chill subproject that has a Kryo serializer for ThriftStruct:

https://github.com/twitter/chill

because once we get into Scala tuples, Cascading's serialization is out of the picture.

@sritchie
Collaborator

sritchie commented Jun 5, 2013

For now, folks can use the InjectiveSerializer with the Scrooge injections in bijection-scrooge.
