[BEAM-8965] Remove duplicate sideinputs in ConsumerTrackingPipelineVisitor #10901
Run Python Postcommit
Thanks! I'll take a look : )
Run Python Postcommit
Run Python 3.5 Postcommit
self.views.append(side_input)
if side_input not in self._side_input_views:
  self._side_input_views[side_input] = side_input
self.views = [
Here we may want to only append rather than rebuild all of self.views?
An idea is that we may want to keep self._views = set(), add views to the set, and add a @property that returns the set as a list? (No need to implement it this way, but we do want to make sure that we are not dropping elements from 'views' that we appended in previous visit_transform calls.)
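The set-backed idea in this comment can be sketched as follows. This is a minimal, standalone illustration and not the code merged in this PR: ViewTrackingVisitor and FakeAppliedTransform are hypothetical stand-ins for ConsumerTrackingPipelineVisitor and Beam's applied transform objects, and side inputs are modeled as plain strings so the sketch runs without Beam installed.

```python
class FakeAppliedTransform:
    """Hypothetical stand-in for an applied PTransform; only side_inputs matters here."""

    def __init__(self, side_inputs):
        self.side_inputs = tuple(side_inputs)


class ViewTrackingVisitor:
    """Hypothetical visitor that records each side-input view at most once."""

    def __init__(self):
        self._views = set()

    @property
    def views(self):
        # Expose a list so callers that expect a list keep working.
        return list(self._views)

    def visit_transform(self, applied_ptransform):
        # update() only adds elements, so views collected in earlier
        # visit_transform calls are never dropped.
        self._views.update(applied_ptransform.side_inputs)


visitor = ViewTrackingVisitor()
visitor.visit_transform(FakeAppliedTransform(["view_a"]))
visitor.visit_transform(FakeAppliedTransform(["view_a", "view_b"]))
print(sorted(visitor.views))  # ['view_a', 'view_b']
```

Note that a set does not preserve insertion order, so this approach only fits if no caller depends on the order in which views were first seen.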
Run Python PreCommit
Retest this please
retest this please
thanks @bobingm! Looks good to me. Can you fix the formatting issues, and I'll merge?
@pabloem the failure is not caused by this PR.
I see. Thanks for pointing that out.
thanks @bobingm
Summary
This PR mainly prevents BundleBasedDirectRunner from evaluating a single side_input more than once. To achieve this, it changes how ConsumerTrackingPipelineVisitor collects side_input views, and it modifies the unit tests to verify that when a single side_input is used in two different PTransforms, it is evaluated only once.
Check List
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
- Choose reviewer(s) and mention them in a comment (R: @username).
- Format the pull request title like [BEAM-XXX] Fixes bug in ApproximateQuantiles, where you replace BEAM-XXX with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- Update CHANGES.md with noteworthy changes.
See the Contributor Guide for more tips on how to make the review process smoother.
See .test-infra/jenkins/README for trigger phrase, status and link of all Jenkins jobs.