[SPARK-9517][SQL] BytesToBytesMap should encode data the same way as UnsafeExternalSorter #7845

Closed
wants to merge 3 commits into from

rxin commented Aug 1, 2015

BytesToBytesMap currently encodes key/value data in the following format:

8B key length, key data, 8B value length, value data

UnsafeExternalSorter, on the other hand, encodes data this way:

4B record length, data

As a result, we cannot pass records encoded by BytesToBytesMap directly into UnsafeExternalSorter for sorting. However, if we rearrange data slightly, we can then pass the key/value records directly into UnsafeExternalSorter:

4B key+value length, 4B key length, key data, value data
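
To make the proposed layout concrete, here is a minimal sketch in plain Java. It is not the code from this patch: `RecordLayoutSketch` and its methods are hypothetical names, it uses `ByteBuffer` (big-endian by default) rather than Spark's off-heap memory access, and it assumes the leading length field counts the 4-byte key-length field in addition to the key and value bytes, so that everything after the first 4 bytes can be read as one sorter record.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration of the layout described above; not Spark's implementation.
final class RecordLayoutSketch {
    static final int LEN_FIELD = 4; // size of each 4-byte length field

    // Layout: 4B record length, 4B key length, key data, value data.
    // ASSUMPTION: record length = 4 (key-length field) + key bytes + value bytes.
    static byte[] encode(byte[] key, byte[] value) {
        int recordLength = LEN_FIELD + key.length + value.length;
        ByteBuffer buf = ByteBuffer.allocate(LEN_FIELD + recordLength);
        buf.putInt(recordLength);
        buf.putInt(key.length);
        buf.put(key);
        buf.put(value);
        return buf.array();
    }

    // What a "4B record length, data" reader would see: the bytes after the first length field.
    static byte[] sorterPayload(byte[] encoded) {
        int recordLength = ByteBuffer.wrap(encoded).getInt();
        return Arrays.copyOfRange(encoded, LEN_FIELD, LEN_FIELD + recordLength);
    }

    // Recover key and value; the value length is derived from the two stored lengths.
    static byte[][] decode(byte[] encoded) {
        ByteBuffer buf = ByteBuffer.wrap(encoded);
        int recordLength = buf.getInt();
        int keyLength = buf.getInt();
        byte[] key = new byte[keyLength];
        byte[] value = new byte[recordLength - LEN_FIELD - keyLength];
        buf.get(key);
        buf.get(value);
        return new byte[][] {key, value};
    }

    public static void main(String[] args) {
        byte[] enc = encode(new byte[] {1, 2}, new byte[] {7, 8, 9});
        byte[][] kv = decode(enc);
        System.out.println(enc.length + " bytes, key=" + Arrays.toString(kv[0])
            + ", value=" + Arrays.toString(kv[1]));
    }
}
```

In this sketch the value length is never stored: it is recovered as record length minus 4 minus key length, which is what lets the same bytes serve both the map and a length-prefixed record reader.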

SparkQA commented Aug 1, 2015

Test build #39315 has finished for PR 7845 at commit 2d4ad05.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 1, 2015

Test build #39325 has finished for PR 7845 at commit 5716b59.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

rxin commented Aug 1, 2015

Going to merge this since @JoshRosen and I looked at this together.

@asfgit asfgit closed this in d90f2cf Aug 1, 2015