Skip to content

[Prism] Refactor stageState to a behavior interface to reduce branch combinatorics #34132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 19, 2025

Conversation

lostluck
Copy link
Contributor

@lostluck lostluck commented Feb 28, 2025

A basic no-behavior change refactoring to split the different stage kinds up more cleanly into the 3 types: ordinary, Stateful, and aggregation.

While their differences don't come up everywhere, they in particular affect when building bundles, and adding pending input elements to their stages.

  • Ordinary stages are the simplest, and handle most simple batch stages.
  • Stateful stages handle Stateful transforms, with state, timers, and a requirement for keys. They don't respect triggers.
  • Aggregations are Stateful stages that don't handle timers, have specific conditions for triggering, as well as handling late data.

The hope of this change is to reduce the combinatorics complexity parts of prism were suffering under, depending on which kind of stage was being handled. This leads to some duplication between Stateful and aggregation stages, but having clear handling of their differences should make up for it.

This is a minimal change. Not every potential portion is moved over right now, just those that had additional branching paths due to the old Stateful or aggregate booleans. Future changes should move handling of the pending elements stacks into similar methods instead of simply checking both.

This needed to work a change that was made to ElementManager.FailBundle which tried to avoid an issue with a deadlock in one of the test stream tests (testMultiStage). The deadlock problem became much worse for reasons not evident to the split, and it appears it was due to never performing the cleanup that Fail bundle performs.

Part of prism work: #29650 and was originally mentioned in #33881


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@lostluck
Copy link
Contributor Author

lostluck commented Mar 1, 2025

Alas, when I synced before starting this, my local builds stopped working, so I'm unable to dig into what I messed up in the migration (likely in the startEventTimeBundle handling). I'll be on a trip for a week, so I'll revisit after that.

@derrickaw
Copy link
Collaborator

Hi @lostluck, friendly reminder to revisit this PR. Thanks.

@lostluck
Copy link
Contributor Author

Damon helped unblock my local testing against the Java suite. The Java build wasn't working after I synced, so I needed the reminder that beam's Gradle Clean doesn't work properly, and sometimes you just need to manually remove all the build folders: find . | grep \/build$ | xargs -I{} rm -rf {}.

@lostluck
Copy link
Contributor Author

assign set of reviewers

Copy link
Contributor

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @jrmccluskey for label go.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

Copy link
Contributor

@jrmccluskey jrmccluskey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - gave this a quick pass since it's mostly re-arranging existing code, the end result seems far more readable

@jrmccluskey jrmccluskey merged commit 77841f4 into apache:master Mar 19, 2025
23 checks passed
talatuyarer pushed a commit to talatuyarer/beam that referenced this pull request Mar 20, 2025
…combinatorics (apache#34132)

* [Prism] Refactor stageState to a behavior interface to reduce branch combinatorics

* Revert on demand pending change.

* revert attempt at lock avoidance, causes hangs.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
prodriguezdefino pushed a commit to prodriguezdefino/beam-pabs that referenced this pull request Mar 26, 2025
…combinatorics (apache#34132)

* [Prism] Refactor stageState to a behavior interface to reduce branch combinatorics

* Revert on demand pending change.

* revert attempt at lock avoidance, causes hangs.

---------

Co-authored-by: lostluck <13907733+lostluck@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants