[BEAM-7923] Include side effects in p.run #11141
This test was flaky because the DataFrame columns can be built in arbitrary order. This option makes the comparison ignore column position, since we only care that the data is equivalent.
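As a rough illustration of the kind of option described above, here is a minimal sketch that assumes the comparison goes through pandas' `assert_frame_equal` with `check_like=True`, which ignores the order of the index and columns. The helper name `frames_equal_ignoring_column_order` and the sample data are hypothetical and not taken from the PR.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def frames_equal_ignoring_column_order(actual, expected):
    """Return True if both frames hold the same data, whatever the column order.

    Hypothetical helper for illustration only; the PR's test may differ.
    """
    try:
        # check_like=True tells pandas to ignore index and column ordering.
        assert_frame_equal(actual, expected, check_like=True)
    except AssertionError:
        return False
    return True


# Same data, columns built in a different order.
expected = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
actual = pd.DataFrame({'b': [3, 4], 'a': [1, 2]})

# Same columns, different data.
different = pd.DataFrame({'b': [3, 5], 'a': [1, 2]})
```

With the default `check_like=False`, the same two frames would fail the comparison purely because of column position, which is the source of flakiness the comment describes.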
|
Formatted with yapf. R: @aaltay PTAL, thx! |
Do you want to track or mark side effects differently? Do users want to track these PCollections specifically?
It's not necessary. The intended behavior is unambiguous: when the user calls the `show`, `head`, or `collect` APIs, these PCollections are excluded completely, as the user explicitly wishes. When the user invokes `p.run()`, all transforms in the pipeline should be executed as expected.
This change only ensures that the prune logic doesn't interfere with the intended behavior above.
|
Could you resolve the conflict? |
1. PCollections that are never used as inputs and are not watched, such as sinks that are not assigned to variables, were previously pruned before `p.run()`. This change makes sure that these side-effect PCollections are now treated as extended targets and are executed on `p.run()`.
2. The change does not affect `show`, `head`, and `collect`, because they have additional pipeline-fragment logic that already prunes everything unrelated before instrumenting and before the prune logic inside instrumenting.
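The behavior described in this commit message can be sketched with a toy model in plain Python. This is not Beam's instrumentation code: the pipeline is reduced to a dict mapping each PCollection name to the names of its inputs, and every function and variable name here (`ancestors`, `side_effects`, `run_targets`, `fragment_for`, `inputs_of`, `watched`) is hypothetical.

```python
def ancestors(pcoll, inputs_of):
    """Collect every upstream PCollection that pcoll depends on."""
    seen = set()
    stack = [pcoll]
    while stack:
        node = stack.pop()
        for parent in inputs_of.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def side_effects(inputs_of, watched):
    """PCollections never used as inputs and not watched (e.g. unassigned sinks)."""
    used_as_input = {p for parents in inputs_of.values() for p in parents}
    return {p for p in inputs_of if p not in used_as_input and p not in watched}


def run_targets(inputs_of, watched, include_side_effects=True):
    """PCollections kept for p.run() after pruning, in this toy model."""
    targets = set(watched)
    if include_side_effects:
        # The change: side effects become extended targets.
        targets |= side_effects(inputs_of, watched)
    kept = set(targets)
    for target in targets:
        kept |= ancestors(target, inputs_of)
    return kept


def fragment_for(target, inputs_of):
    """What show/head/collect would keep: only the target and its ancestors."""
    return {target} | ancestors(target, inputs_of)


# 'write_out' is a sink whose output was never assigned to a variable.
inputs_of = {'init': [], 'squares': ['init'], 'write_out': ['squares']}
watched = {'init', 'squares'}

before_change = run_targets(inputs_of, watched, include_side_effects=False)
after_change = run_targets(inputs_of, watched, include_side_effects=True)
shown = fragment_for('squares', inputs_of)
```

In this model, `before_change` drops `write_out`, `after_change` keeps it, and `shown` is the same either way because the fragment is computed from the requested target alone.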
Force-pushed from 34dae9d to b458cfd.
|
Rebased to resolve merge conflicts and force pushed! |
|
Merged this without noticing that the tests did not run; I was fooled by GitHub's "all green" check marks. Please watch the tests, especially the cron ones, and see if anything is failing. Or better, create an empty PR to run the tests. |