[BEAM-3711] Enabling combiner lifting in Dataflow Runner. #5974
This change does two things:
1. It modifies how the Dataflow Runner transmits combines to Dataflow so
that Dataflow can support combiner lifting. When translating a
CombineGroupedValues transform, the runner now encodes the ID of the
parent Combine Per Key transform as a Serialized Fn.
2. It preemptively fixes an issue in which a CombineGroupedValues with
side inputs would be translated for combiner lifting even though its
parent is an anonymous composite transform, which indicates that the
CombineGroupedValues should be translated as a ParDo. The fix is a
slight adjustment to the PTransform Overrides in DataflowRunner.
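For reviewers, the translation rule described above can be sketched roughly as follows. This is only an illustration and not the code in this PR: CombineTranslationSketch, CombineNode, translate, and the string keys and values written into the step map are all hypothetical stand-ins, and no Beam or Dataflow classes are used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified model; none of these types come from Beam or Dataflow. */
final class CombineTranslationSketch {

  /** Hypothetical stand-in for a CombineGroupedValues node and what is known about its parent. */
  static final class CombineNode {
    final String parentId;               // ID of the enclosing composite transform
    final boolean parentIsCombinePerKey; // false when the parent is an anonymous composite

    CombineNode(String parentId, boolean parentIsCombinePerKey) {
      this.parentId = parentId;
      this.parentIsCombinePerKey = parentIsCombinePerKey;
    }
  }

  /** Builds an illustrative step description for one CombineGroupedValues node. */
  static Map<String, String> translate(CombineNode node) {
    Map<String, String> step = new LinkedHashMap<>();
    if (node.parentIsCombinePerKey) {
      // Liftable case: record the parent Combine Per Key ID in place of the
      // serialized function so the service can relate the two steps.
      step.put("kind", "CombineValues");
      step.put("serialized_fn", node.parentId);
    } else {
      // Anonymous composite parent (e.g. the side-input case): plain ParDo translation.
      step.put("kind", "ParallelDo");
    }
    return step;
  }
}
```

In this sketch a node whose parent is a Combine Per Key yields a step that carries the parent ID, while a node under an anonymous composite falls through to the ParDo branch and carries no parent ID.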
Follow this checklist to help us incorporate your contribution quickly and easily:
Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
It will help us expedite review of your Pull Request if you tag someone (e.g. @username) to look at it.
Post-Commit Tests Status (on master branch)