[BEAM-9073] Fixes order-dependence in PipelineVisitor #10541

Closed
rohdesamuel wants to merge 1 commit

rohdesamuel
Contributor

@rohdesamuel rohdesamuel commented Jan 9, 2020

The Python PipelineVisitor depends on the topological order of the transforms and can visit the same transform multiple times. The fix is to journal each transform we visit so that none is visited twice.
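As a rough illustration of the journaling idea (not the code in this PR), here is a minimal, self-contained sketch. `Node` and `visit_all` are hypothetical names made up for this example and are not part of the Beam API; the sketch assumes the graph is acyclic.

```python
class Node:
    """Hypothetical stand-in for an applied transform in a pipeline graph."""

    def __init__(self, name, inputs=()):
        self.name = name
        self.inputs = list(inputs)  # nodes that produce this node's inputs


def visit_all(nodes, visit):
    """Call visit(node) once per node, producers before consumers.

    The result does not depend on the order of `nodes`, because every
    visited node is recorded in a journal and skipped on later encounters.
    """
    visited = set()  # journal of nodes already handled, keyed by identity

    def _visit(node):
        if id(node) in visited:
            return
        visited.add(id(node))
        for producer in node.inputs:
            _visit(producer)
        visit(node)

    for node in nodes:
        _visit(node)


# Usage: the input list is deliberately not in topological order.
read = Node("Read")
parse = Node("Parse", [read])
write = Node("Write", [parse])

order = []
visit_all([write, read, parse], lambda n: order.append(n.name))
print(order)
```

Keying the journal on `id(node)` avoids requiring the nodes to be hashable; a real implementation would likely key on whatever stable identifier the pipeline already assigns.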


@rohdesamuel
Contributor Author

R: @lukecwik can you review this please?

Member

@lukecwik lukecwik left a comment

Wouldn't it make sense to ensure the topological visiting order in the graph instead of making the visit order possibly random?

(I ask this since some pipeline transform replacement algorithms may assume that you first visit the composites before you have visited the children so that replacing the higher level transform happens first).

@rohdesamuel
Contributor Author

> Wouldn't it make sense to ensure the topological visiting order in the graph instead of making the visit order possibly random?
>
> (I ask this since some pipeline transform replacement algorithms may assume that you first visit the composites before you have visited the children so that replacing the higher level transform happens first).

That makes sense, but unfortunately I don't have time to implement that larger change. I have a smaller workaround that fixes it in the PipelineVisitor subclass that I'm using.

Member

@lukecwik lukecwik left a comment

Definitely an improvement, since it prevents visiting the same transform multiple times, but strongly consider following up with a better fix.

@lukecwik
Member

retest this please

Member

@lukecwik lukecwik left a comment

Error Message
AttributeError: 'PipelineTest' object has no attribute 'assertCountEqual'
Stacktrace
self = <apache_beam.pipeline_test.PipelineTest testMethod=test_visitor_not_sorted>
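This error is what you would see if the test ran under Python 2, where `unittest.TestCase` names this assertion `assertItemsEqual`; `assertCountEqual` exists only in Python 3. That is an assumption about the failing environment, not something the log above states. A minimal sketch of one way to make such a test portable follows; `CompatTestCase` and `assert_same_elements` are hypothetical names for illustration, not part of Beam.

```python
import unittest


class CompatTestCase(unittest.TestCase):
    """Hypothetical base class that works on both Python 2.7 and Python 3."""

    def runTest(self):
        # No-op so the class can be instantiated directly on Python 2 as well.
        pass

    def assert_same_elements(self, actual, expected):
        # Python 3 provides assertCountEqual; Python 2.7 provides the same
        # check under the name assertItemsEqual.
        check = getattr(self, "assertCountEqual", None)
        if check is None:
            check = getattr(self, "assertItemsEqual")
        check(actual, expected)


# Usage: element order is ignored, but element counts must match.
case = CompatTestCase()
case.assert_same_elements(["a", "b", "b"], ["b", "a", "b"])
```

The `six` library's `six.assertCountEqual(self, first, second)` helper covers the same gap if the project already depends on it.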

@stale

stale bot commented Mar 21, 2020

This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@beam.apache.org list. Thank you for your contributions.

@stale stale bot added the stale label Mar 21, 2020
@stale

stale bot commented Mar 28, 2020

This pull request has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@stale stale bot closed this Mar 28, 2020