[SPARK-27560][Java API] Provide an hash partitioning method which supports seeding#25034

Closed
adarmiento wants to merge 1 commit into apache:master from adarmiento:SPARK-27560

Conversation

@adarmiento adarmiento commented Jul 2, 2019

What changes were proposed in this pull request?

A SeedHashPartitioner class is provided.
SeedHashPartitioner extends the existing HashPartitioner class and receives an integer seed.

Instead of using java.lang.Object.hashCode(), SeedHashPartitioner implements a simple seed-based hashing algorithm.
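
For illustration only, the idea described above might look roughly like the sketch below. It is not the code from this patch: the class name `SeedHashPartitionerSketch`, the `mix` helper, and the mixing constant are all hypothetical, and the class is kept free of Spark dependencies so it runs on its own. It assumes the same conventions as Spark's HashPartitioner, namely that a null key goes to partition 0 and that the result is a non-negative modulus of the hash.

```java
public class SeedHashPartitionerSketch {
    private final int numPartitions;
    private final int seed;

    public SeedHashPartitionerSketch(int numPartitions, int seed) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        this.numPartitions = numPartitions;
        this.seed = seed;
    }

    // Hypothetical mixing step: fold the seed into the key's hash code.
    // The patch's real algorithm is not shown in this conversation.
    static int mix(int hash, int seed) {
        int h = hash ^ seed;
        h *= 0x9E3779B9;   // odd multiplicative constant (assumption)
        h ^= h >>> 16;     // spread high bits into the low bits
        return h;
    }

    // Same contract as a hash partitioner: result is in [0, numPartitions).
    public int getPartition(Object key) {
        if (key == null) {
            return 0;
        }
        return Math.floorMod(mix(key.hashCode(), seed), numPartitions);
    }

    public static void main(String[] args) {
        SeedHashPartitionerSketch p = new SeedHashPartitionerSketch(8, 42);
        System.out.println("partition for \"spark\": " + p.getPartition("spark"));
    }
}
```

Changing the seed changes which partition a given key maps to, but for a fixed seed the mapping stays deterministic.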

How was this patch tested?

Manual testing with all the primitives and their array forms.
Manual testing with boxed case classes.

@AmplabJenkins
Can one of the admins verify this patch?

@srowen srowen (Member) commented Jul 2, 2019

I don't think I understand the purpose of this, from the JIRA. The same values will still hash to the same partition here. If you partition by a column and that column has all the same values, you will always get one partition.
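
The point in the comment above can be checked with a small standalone sketch. It assumes a hypothetical seeded hash (`seededPartition`, not the patch's actual algorithm) and a column of 1000 identical keys. Because any deterministic function maps equal keys to equal outputs, every seed still places all rows in a single partition.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class SeedSkewDemo {
    // Hypothetical seeded hash, standing in for whatever the patch implements.
    static int seededPartition(Object key, int seed, int numPartitions) {
        int h = key.hashCode() ^ seed;
        h *= 0x9E3779B9;
        h ^= h >>> 16;
        return Math.floorMod(h, numPartitions);
    }

    // Counts how many distinct partitions the given keys land in.
    static int distinctPartitions(String[] keys, int seed, int numPartitions) {
        Set<Integer> used = new HashSet<>();
        for (String key : keys) {
            used.add(seededPartition(key, seed, numPartitions));
        }
        return used.size();
    }

    public static void main(String[] args) {
        String[] column = new String[1000];
        Arrays.fill(column, "same-value"); // a column whose values are all equal
        for (int seed : new int[] {0, 1, 42}) {
            // Reports 1 partition used for every seed.
            System.out.println("seed=" + seed + " -> "
                + distinctPartitions(column, seed, 16) + " partition(s) used");
        }
    }
}
```

A seed can move the single hot partition to a different index, but it cannot split identical keys across partitions.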

4 participants