[hail][bugfix] always truncate partitioner keys in TableMultiWayZipJoin #8035

Merged
danking merged 1 commit into hail-is:master from chrisvittal:issue-8027
Feb 5, 2020

@chrisvittal
Collaborator

fixes #8027

cc @konradjk

@konradjk
Collaborator

konradjk commented Feb 5, 2020

Can confirm this fixes my issue

  val childRanges = childRVDs.flatMap(_.partitioner.rangeBounds)
  val newPartitioner = RVDPartitioner.generate(childRVDs.head.typ.kType.virtualType, childRanges)
- childRVDs.map(_.repartition(newPartitioner, ctx))
+ childRVDs.map(_.repartition(newPartitioner, ctx).truncateKey(typ.key.length))
Member

Ah, I see the problem. But you're computing a new common partitioner with the wrong key, repartitioning each child RVD, then fixing up the partitioner on each child RVD. Better to just compute the right partitioner before doing the repartitioning. I think this should work:

val childRanges = childRVDs.flatMap(_.partitioner.coarsenedRangeBounds(typ.key.length))
val newPartitioner = RVDPartitioner.generate(typ.kType, childRanges)
childRVDs.map(_.repartition(newPartitioner, ctx))
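
To make the ordering point above concrete, here is a toy sketch in plain Scala. It is not Hail code: `CoarsenSketch`, `Bound`, `coarsen`, and `mergeOverlapping` are hypothetical names, keys are modeled as `Vector[Int]`, range bounds as closed intervals, and merging overlapping intervals is only a stand-in for whatever `RVDPartitioner.generate` really does. Under those assumptions, it compares truncating the bounds after building the common ranges with coarsening them first.

```scala
// Toy model only: none of these names are Hail APIs.
object CoarsenSketch {
  type Key = Vector[Int]

  // A closed interval [start, end] over lexicographically ordered keys.
  final case class Bound(start: Key, end: Key)

  // Lexicographic order; a key that is a strict prefix of another sorts first.
  val keyOrd: Ordering[Key] = new Ordering[Key] {
    def compare(a: Key, b: Key): Int = {
      val firstDiff = a.zip(b).collectFirst {
        case (x, y) if x != y => Integer.compare(x, y)
      }
      firstDiff.getOrElse(Integer.compare(a.length, b.length))
    }
  }

  // Keep only the first `n` key fields of each endpoint.
  def coarsen(b: Bound, n: Int): Bound = Bound(b.start.take(n), b.end.take(n))

  // Sort bounds by start and merge any that overlap, giving disjoint ranges.
  def mergeOverlapping(bounds: Seq[Bound]): Vector[Bound] =
    bounds.sortBy(_.start)(keyOrd).foldLeft(Vector.empty[Bound]) { (acc, b) =>
      acc.lastOption match {
        case Some(prev) if keyOrd.lteq(b.start, prev.end) =>
          acc.init :+ Bound(prev.start, keyOrd.max(prev.end, b.end))
        case _ => acc :+ b
      }
    }

  def overlaps(a: Bound, b: Bound): Boolean =
    keyOrd.lteq(a.start, b.end) && keyOrd.lteq(b.start, a.end)

  // Child bounds keyed on two fields; the join key is only the first field.
  val children: Seq[Bound] = Seq(
    Bound(Vector(1, 0), Vector(1, 5)),
    Bound(Vector(1, 6), Vector(2, 3)),
    Bound(Vector(2, 4), Vector(3, 0))
  )

  // Build ranges under the two-field key, then truncate each bound.
  val truncatedAfter: Vector[Bound] = mergeOverlapping(children).map(coarsen(_, 1))

  // Coarsen to the join key first, then build ranges.
  val coarsenedFirst: Vector[Bound] = mergeOverlapping(children.map(coarsen(_, 1)))

  def main(args: Array[String]): Unit = {
    println(s"truncated after: $truncatedAfter")
    println(s"coarsened first: $coarsenedFirst")
  }
}
```

In this toy model the bounds that were disjoint under the two-field key overlap once truncated to one field, while coarsening first lets the range-building step see the overlap and resolve it. Whether the same holds inside Hail depends on the real behavior of `repartition` and `truncateKey`, which this sketch does not model.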

@danking danking merged commit ccbcaa5 into hail-is:master Feb 5, 2020
@chrisvittal chrisvittal deleted the issue-8027 branch March 11, 2020 21:21
Development

Successfully merging this pull request may close these issues.

Fix multi_way_zip_join key issue