Skip to content

Conversation

@danking
Copy link
Contributor

@danking danking commented Feb 2, 2024

CHANGELOG: Fix a bug, introduced in 0.2.114, in which Table.multi_way_zip_join and Table.aggregate_by_key could throw "NoSuchElementException: Ref with name __iruid_..." when one or more of the tables had a number of partitions substantially different from the desired number of output partitions.

Fixes #14245.

In both MultiWayZipJoin and TableAggregateByKey, we repartition the child but neglect to use the new globals Ref from the repartitioned child. As long as repartitionNoShuffle does not create a new TableStage with new globals, this is fine, but that is not, in general, true. It seems that recently, in lowered backends, when the repartition cost is deemed "high" we generate a fresh TableStage with a fresh globals ref.

CHANGELOG: Fix a bug, introduced in 0.2.114, in which `Table.multi_way_zip_join` and `Table.aggregate_by_key` could throw "NoSuchElementException: Ref with name __iruid_..." when one or more of the tables had a number of partitions substantially different from the desired number of output partitions.

Fixes hail-is#14245.

In both MultiWayZipJoin and TableAggregateByKey, we repartition the child but neglect to use the
new globals `Ref` from the repartitioned child. As long as `repartitionNoShuffle` does not create a
new TableStage with new globals, this is fine, but that is not, in general, true. It seems that
recently, in lowered backends, when the repartition cost is deemed "high" we generate a fresh
TableStage with a fresh globals ref.

repartitioned.mapPartition(Some(child.typ.key)) { partition =>
Let(
FastSeq("global" -> repartitioned.globals),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sneaky

@danking danking merged commit d4679eb into hail-is:main Feb 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

2 participants