Serialization fixes #125

Merged
merged 9 commits into from Jul 18, 2012

@johnynek
Collaborator

commented Jul 18, 2012

This does three things:

  1. Moves serialization to a new package: scalding.serialization
  2. Stops using Kryo to write classes in collections (the most questionable change, but we are seeing some issues with unregistered classes).
  3. Flushes after each object write in collections (important to avoid bloating the intermediate buffers).
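
As a rough illustration of point 3 only: the sketch below is not Scalding's serializer and does not use Kryo. It uses plain `java.io` classes, and the class name `FlushPerElement` and method `writeAll` are hypothetical. It stages each element in a growable buffer and drains that buffer to the destination after every element, so the staging buffer never holds more than one element (plus the size header on the first write).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of Scalding or Kryo.
final class FlushPerElement {
    // Writes the collection size, then each element, to `sink`.
    // Returns the largest number of bytes held in the staging buffer at once.
    static int writeAll(List<String> items, OutputStream sink) {
        ByteArrayOutputStream staging = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(staging);
        int maxStaged = 0;
        try {
            out.writeInt(items.size());
            for (String item : items) {
                out.writeUTF(item);
                out.flush();
                maxStaged = Math.max(maxStaged, staging.size());
                // Drain after every element so the buffer does not
                // accumulate the whole collection before being emptied.
                staging.writeTo(sink);
                staging.reset();
            }
            sink.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return maxStaged;
    }

    public static void main(String[] args) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int maxStaged = writeAll(Arrays.asList("ab", "cd", "ef"), sink);
        System.out.println("total bytes written: " + sink.size());
        System.out.println("largest staged chunk: " + maxStaged);
    }
}
```

Without the per-element drain, the staging buffer would grow to the size of the entire serialized collection before anything reached the destination.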
@travisbot

commented Jul 18, 2012

This pull request fails (merged b1b32b1 into 73d0450).

@travisbot

commented Jul 18, 2012

This pull request passes (merged 59e91eb into 73d0450).

@travisbot

commented Jul 18, 2012

This pull request fails (merged 999c377 into 73d0450).

@travisbot

commented Jul 18, 2012

This pull request fails (merged 77cf4da into 73d0450).

@travisbot

commented Jul 18, 2012

This pull request fails (merged b12c9aa into 73d0450).

@travisbot

commented Jul 18, 2012

This pull request fails (merged 00b9332 into 73d0450).

azymnis added a commit that referenced this pull request Jul 18, 2012

@azymnis azymnis merged commit eeaab9b into twitter:develop Jul 18, 2012

@travisbot

commented Jul 19, 2012

This pull request fails (merged dd4285a into 73d0450).

@vidma

Do you know if this still applies to both the Tuple and Typed APIs, so that one cannot reliably group by a Kryo-serialized object?

Collaborator Author

replied Feb 23, 2015

Well, it is a bit complex. 1) We have lost track of the details of this issue because we never filed a proper issue for it in the tracker. 2) I believe it refers to a bug we traced to Java enums, whose hashCode is the system (identity) hashCode rather than a hash of the enum's int value. This means grouping on a Java enum across JVMs is not safe and can produce bad data, which you can detect because the same key appears on different reducers.

So, there is nothing Kryo-related here that I know of. The problem is that you need a hashCode that is the same across all JVM instances. It cannot be the system/identity hashCode.
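
To make the hashCode point concrete, here is a minimal sketch, assuming a hypothetical `Color` enum and helper class `StableEnumHash` (neither is part of Scalding). `Enum.hashCode()` is final and falls back to `Object.hashCode()`, so its value can differ between JVM processes. The ordinal and the name's `String.hashCode()` are both determined by the class definition, so they give the same value everywhere the same class is loaded.

```java
// Hypothetical key type for illustration.
enum Color { RED, GREEN, BLUE }

final class StableEnumHash {
    // Identity-based: may differ from one JVM process to the next.
    static int identityBased(Enum<?> e) {
        return e.hashCode();
    }

    // Stable: the ordinal is fixed by declaration order.
    static int byOrdinal(Enum<?> e) {
        return e.ordinal();
    }

    // Stable: String.hashCode() is specified by the Java API docs.
    static int byName(Enum<?> e) {
        return e.name().hashCode();
    }

    // Maps a hash to a reducer index; the mask clears the sign bit
    // so the result is never negative.
    static int partition(int hash, int numReducers) {
        return (hash & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int reducers = 4;
        for (Color c : Color.values()) {
            System.out.println(c
                + " identity-based partition=" + partition(identityBased(c), reducers)
                + " ordinal-based partition=" + partition(byOrdinal(c), reducers)
                + " name-based partition=" + partition(byName(c), reducers));
        }
    }
}
```

Running this in separate JVM processes may print different identity-based partitions for the same constant, while the ordinal-based and name-based columns stay the same. One workaround under these assumptions is to group on the ordinal or the name instead of the enum value itself.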
