Avoid meta roundtrip in P2P shuffle #7895

Merged
merged 2 commits into from Jun 8, 2023

@hendrikmakait hendrikmakait commented Jun 8, 2023

Closes dask/dask#10335

  • Tests added / passed
  • Passes pre-commit run --all-files

@hendrikmakait
Member Author

cc @wence-, @jrbourbeau

@github-actions
Contributor

github-actions bot commented Jun 8, 2023

Unit Test Results

See test report for an extended history of previous test failures. This is useful for diagnosing flaky tests.

20 files ±0 | 20 suites ±0 | duration 10h 44m 12s (-1h 6m 57s)
3 679 tests (+4): 3 566 passed (+3), 108 skipped (±0), 5 failed (+1)
34 561 runs (-979): 32 842 passed (-930), 1 714 skipped (-50), 5 failed (+1)

For more details on these failures, see this check.

Results for commit d8c9252. ± Comparison against base commit e31c864.

This pull request removes 1 and adds 5 tests. Note that renamed tests count towards both.
Removed:
distributed.tests.test_scheduler ‑ test_cumulative_worker_metrics
Added:
distributed.tests.test_spans ‑ test_worker_metrics
distributed.tests.test_worker_metrics ‑ test_no_spans_extension
distributed.tests.test_worker_metrics ‑ test_reschedule
distributed.tests.test_worker_metrics ‑ test_send_metrics_to_scheduler
distributed.tests.test_worker_metrics ‑ test_user_metrics_weird

♻️ This comment has been updated with latest results.

@jrbourbeau jrbourbeau left a comment

Thanks @hendrikmakait -- I'm testing the dask/dask test suite against this PR over in dask/dask#10334

@@ -234,7 +234,7 @@ async def add_partition(

     @abc.abstractmethod
     async def get_output_partition(
-        self, i: T_partition_id, key: str
+        self, i: T_partition_id, key: str, **kwargs: Any
Member

Why allow a generic **kwargs instead of meta=? The code here looks functionally fine, but I have a slight preference for avoiding **kwargs if possible

Member Author

There's no meta for array rechunking. The class hierarchy doesn't play out perfectly, and we may want to get rid of it eventually.

Member

Could still use meta=None instead of **kwargs though, no? (even if array re-chunking doesn't choose to use it for now)

Member Author

If people like that better, we can change it. I don't have strong opinions and plan to refactor this soon.

Member

> I don't have strong opinions and plan to refactor this soon.

Makes sense - I don't have strong feelings either. Mostly just trying to digest the general code. That said, I am pretty interested to know about the refactoring plans.

FYI, we are currently experimenting with a decomposed version of P2PShuffleLayer (i.e. ShuffleTransfer(Blockwise) -> ShuffleBarrier(Layer) -> ShuffleUnpack(Blockwise)). I'd like to get feedback on the approach once dask/dask#10312 and a modified version of #7743 get in.
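
To make the three-stage structure mentioned above concrete, here is a toy, in-memory sketch of a transfer -> barrier -> unpack pipeline. It is not the dask implementation: the function names, the dict-based buffer, and the modulo partitioning are all assumptions chosen purely for illustration.

```python
from collections import defaultdict


def transfer(partition, npartitions, buffer):
    """Hypothetical transfer stage: route each record to its output partition."""
    for record in partition:
        buffer[record % npartitions].append(record)
    return len(partition)


def barrier(transfer_results):
    """Hypothetical barrier stage: runs only after every transfer has finished."""
    return sum(transfer_results)


def unpack(output_id, buffer, barrier_token):
    """Hypothetical unpack stage: read one output partition after the barrier."""
    # barrier_token is unused here; it only expresses the dependency on the barrier.
    return sorted(buffer[output_id])


inputs = [[0, 1, 2, 3], [4, 5, 6, 7]]
npartitions = 2
buffer = defaultdict(list)

counts = [transfer(p, npartitions, buffer) for p in inputs]
token = barrier(counts)
outputs = [unpack(i, buffer, token) for i in range(npartitions)]
```

The transfer and unpack steps each act on one partition at a time, which is why they could plausibly be expressed as blockwise operations, while the barrier is the single point that depends on all transfers.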

Member Author

I've been meaning to check those PRs out; I'll block some time for it next week.

Member Author

We're now using meta instead of **kwargs.
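
A minimal sketch of the kind of signature this thread converged on, using simplified stand-ins for the real classes. `ShuffleRunBase`, `DataFrameRun`, `ArrayRun`, and the returned dicts are hypothetical; they only illustrate how an explicit `meta` keyword with a `None` default keeps the abstract signature explicit while letting array rechunking ignore the argument. This is not the actual `distributed` code.

```python
import abc
import asyncio
from typing import Any, Generic, TypeVar

T_partition_id = TypeVar("T_partition_id")


class ShuffleRunBase(abc.ABC, Generic[T_partition_id]):
    """Hypothetical stand-in for the abstract shuffle-run base class."""

    @abc.abstractmethod
    async def get_output_partition(
        self, i: T_partition_id, key: str, meta: Any = None
    ) -> Any:
        """Return output partition ``i``; ``meta`` is optional by design."""


class DataFrameRun(ShuffleRunBase[int]):
    async def get_output_partition(
        self, i: int, key: str, meta: Any = None
    ) -> Any:
        # A dataframe shuffle can use the ``meta`` it is handed directly
        # instead of obtaining it through a separate round trip.
        return {"partition": i, "key": key, "meta": meta}


class ArrayRun(ShuffleRunBase[tuple]):
    async def get_output_partition(
        self, i: tuple, key: str, meta: Any = None
    ) -> Any:
        # Array rechunking has no meta, so the argument is simply ignored.
        return {"partition": i, "key": key}


df_result = asyncio.run(
    DataFrameRun().get_output_partition(0, "shuffle-0", meta="empty-frame")
)
array_result = asyncio.run(ArrayRun().get_output_partition((0, 1), "rechunk-0"))
```

Compared with `**kwargs`, a named keyword makes a misspelled argument fail with a `TypeError` at the call site rather than being silently accepted.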

@jrbourbeau jrbourbeau left a comment

@jrbourbeau jrbourbeau merged commit c520f1c into dask:main Jun 8, 2023
21 of 27 checks passed
Development

Successfully merging this pull request may close these issues.

dask/tests/test_distributed.py::test_map_partitions_df_input failing
3 participants