Windowing Support for the Dask Runner #32941

Merged on Nov 18, 2024 (196 commits)

@alxmrs
Contributor

alxmrs commented Oct 24, 2024

This CL adds basic windowing support to the Dask runner, including a few tests for side inputs. This is the third attempt (see #27618 and #23913).

CC: @cisaacstern
Reviewers: @jrmccluskey
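
For readers unfamiliar with the term, here is a minimal sketch of what windowing means in Beam's model, written in plain Python rather than against the Dask runner's actual code. It assumes fixed windows, where each timestamped element falls into the window starting at `timestamp - (timestamp - offset) % size`, and grouping then happens per key and per window. The names `assign_fixed_window` and `group_by_key_and_window` are hypothetical and are not part of this PR.

```python
from collections import defaultdict


def assign_fixed_window(timestamp, size, offset=0):
    """Return the (start, end) bounds of the fixed window containing timestamp."""
    start = timestamp - (timestamp - offset) % size
    return (start, start + size)


def group_by_key_and_window(elements, size):
    """Group (key, value, timestamp) triples by (key, window)."""
    groups = defaultdict(list)
    for key, value, timestamp in elements:
        window = assign_fixed_window(timestamp, size)
        groups[(key, window)].append(value)
    return dict(groups)


# Hypothetical sample data: (key, value, event timestamp in seconds).
events = [("a", 1, 0), ("a", 2, 30), ("a", 3, 65), ("b", 4, 10)]
grouped = group_by_key_and_window(events, size=60)
```

With 60-second windows, the two early `"a"` values share a window while the value at timestamp 65 lands in the next one, so a windowed group-by-key yields separate groups for the same key.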


alxmrs and others added 30 commits September 19, 2022 16:34
- CoGroupByKey is broken due to how tags are used with GroupByKey.
- GroupByKey should output `[('0', None), ('1', 1)]`, but it actually outputs `[(None, ('1', 1)), (None, ('0', None))]`.
- Once that is fixed, test pipelines may work on Dask.
@alxmrs
Contributor Author

alxmrs commented Nov 7, 2024

@Abacn May I have some help getting the Windows build to work? It looks tractable:

C:\hostedtoolcache\windows\Python\3.10.11\x64\python.exe: can't open file 
'D:\\a\\beam\\beam\\sdks\\python\\target\\.tox\\py310-win-dask\\Scripts\\pip': [Errno 2] No such file or directory

@alxmrs
Contributor Author

alxmrs commented Nov 7, 2024

OK, I think I got it working on Windows, too.

alxmrs requested a review from Abacn November 7, 2024 22:43
alxmrs force-pushed the dask-runner-windowing branch from 1b41784 to 7caba84 November 8, 2024 00:09
@alxmrs
Contributor Author

alxmrs commented Nov 8, 2024

I think the failing tests are just flaky.

@shunping
Contributor

@Abacn, could you take another look at @alxmrs's changes?

@alxmrs
Contributor Author

alxmrs commented Nov 14, 2024

Hey there! I'd love to get this merged. Please let me know if there is anything more I can do.

@Abacn
Contributor

Abacn commented Nov 14, 2024

Sorry for the late response. My comments were mostly on the infrastructure and testing-related components; that part LGTM. For the Dask runner implementation, would it be possible to find a reviewer who also knows Dask?

@alxmrs
Contributor Author

alxmrs commented Nov 15, 2024

That sounds good. Hey, @jacobtomlinson: by any chance, do you have cycles to review this PR, focusing on the Dask aspect of the code? This one has been a long time coming! I would appreciate any and all help you can offer (and no pressure if time is scarce).

@jacobtomlinson
Contributor

jacobtomlinson left a comment

Generally this looks good to me from the Dask side.

@alxmrs
Contributor Author

alxmrs commented Nov 15, 2024

Hey @Abacn! In light of Jacob's review, do you think we're good-to-go? :)

@Abacn
Contributor

Abacn commented Nov 18, 2024

Thank you, merging for now

Abacn merged commit e939be3 into apache:master Nov 18, 2024
106 checks passed