Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Prism] Refactor stageState to a behavior interface to reduce branch combinatorics #34132

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

lostluck
Copy link
Contributor

@lostluck lostluck commented Feb 28, 2025

A basic no-behavior change refactoring to split the different stage kinds up more cleanly into the 3 types: ordinary, Stateful, and aggregation.

While their differences don't come up everywhere, they in particular affect when building bundles, and adding pending input elements to their stages.

  • Ordinary stages are the simplest, and handle most simple batch stages.
  • Stateful stages handle Stateful transforms, with state, timers, and a requirement for keys. They don't respect triggers.
  • Aggregations are Stateful stages that don't handle timers, have specific conditions for triggering, as well as handling late data.

The hope of this change is to reduce the combinatorics complexity parts of prism were suffering under, depending on which kind of stage was being handled. This leads to some duplication between Stateful and aggregation stages, but having clear handling of their differences should make up for it.

This is a minimal change. Not every potential portion is moved over right now, just those that had additional branching paths due to the old Stateful or aggregate booleans. Future changes should move handling of the pending elements stacks into similar methods instead of simply checking both.

Part of prism work: #29650 and was originally mentioned in #33881


Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

  • Mention the appropriate issue in your description (for example: addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment fixes #<ISSUE NUMBER> instead.
  • Update CHANGES.md with noteworthy changes.
  • If this contribution is large, please file an Apache Individual Contributor License Agreement.

See the Contributor Guide for more tips on how to make review process smoother.

To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md

GitHub Actions Tests Status (on master branch)

Build python source distribution and wheels
Python tests
Java tests
Go tests

See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.

Copy link
Contributor

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment assign set of reviewers

@lostluck
Copy link
Contributor Author

lostluck commented Mar 1, 2025

Alas, when I synced before starting this, my local builds stopped working, so I'm unable to dig into what I messed up in the migration (likely in the startEventTimeBundle handling). I'll be on a trip for a week, so I'll revisit after that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant