Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle null partitions in P2P shuffling #8116

Merged
merged 1 commit into from Aug 17, 2023

Conversation

hendrikmakait
Copy link
Member

@hendrikmakait hendrikmakait commented Aug 17, 2023

Closes #8092
Follow-up to #7922

  • Tests added / passed
  • Passes pre-commit run --all-files

@@ -2057,6 +2058,28 @@ async def test_handle_null_partitions_p2p_shuffling(c, s, *workers):
await check_scheduler_cleanup(s)


@gen_cluster(client=True)
async def test_handle_null_partitions_p2p_shuffling_2(c, s, a, b):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails only fails 2/3 of the time on main, but that's good enough for me.

@hendrikmakait hendrikmakait added shuffle bug Something is broken labels Aug 17, 2023
@hendrikmakait hendrikmakait mentioned this pull request Aug 17, 2023
4 tasks
def make_partition(i):
"""Return null column for one partition"""
if i % 2 == 1:
return pd.DataFrame({"a": np.random.random(10), "b": [None] * 10})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make the None column null explicitly, otherwise lgtm

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fails unless I replace b in the other case with a string. That's another issue I plan to handle. As the current test mimics the original reproducer, I will keep this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

@github-actions
Copy link
Contributor

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

       21 files  ±0         21 suites  ±0   11h 36m 28s ⏱️ - 5m 4s
  3 773 tests +1    3 662 ✔️ +1     108 💤 ±0  3 ±0 
36 483 runs  +8  34 670 ✔️ +9  1 810 💤 +1  3  - 2 

For more details on these failures, see this check.

Results for commit 6243868. ± Comparison against base commit aab76d7.

@hendrikmakait hendrikmakait merged commit c72e8ae into dask:main Aug 17, 2023
26 of 29 checks passed
@hendrikmakait hendrikmakait deleted the p2p-null-partitions branch August 17, 2023 17:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something is broken shuffle
Projects
None yet
Development

Successfully merging this pull request may close these issues.

P2P with null partitions fails
2 participants