Skip to content

Conversation

@tgroh
Copy link
Member

@tgroh tgroh commented Jun 4, 2016

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

  • Make sure the PR title is formatted like:
    [BEAM-<Jira issue #>] Description of pull request
  • Make sure tests pass via mvn clean verify. (Even better, enable
    Travis-CI on your fork and ensure the whole test matrix passes).
  • Replace <Jira issue #> in the title with the actual Jira issue
    number, if there is one.
  • If this contribution is large, please file an Apache
    Individual Contributor License Agreement.

The ParDoInProcessEvaluator is provided clones of a DoFn when
appropriate, and should not serialize them.

@tgroh
Copy link
Member Author

tgroh commented Jun 4, 2016

The ThreadLocal-cloning Fn Factories were added in 483a5a4 but cloning was not removed (as appropriate) in that CR.

R: @kennknowles @bjchambers

@tgroh
Copy link
Member Author

tgroh commented Jun 4, 2016

The test failures are because TestDoFn does not permit itself to be reused (specifically, if startBundle is called after any other DoFn method is called, including finishBundle, it will fail an assertion).

There are two potential fixes. My preference would be to modify TestDoFn to be reusable, so long as the call sequence is correct (e.g. startBundle resets the state if state is FINISHED). The other is to use the ThreadLocal only when startBundle and finishBundle are the defaults (i.e. the declaring class is org.apache.beam.sdk.transforms.DoFn

This may be a discussion suitable for the mailing list. Thoughts?

@tgroh tgroh force-pushed the remove_improper_cloning branch 2 times, most recently from 7a830b7 to 338bc10 Compare June 6, 2016 20:39
@bjchambers
Copy link
Contributor

Have the tests been updated now? Should this be passing now? Also, we should state that this isn't improper cloning, but excessive cloning. It is still valid according to the model to clone these.

The ParDoInProcessEvaluator is provided clones of a DoFn when
appropriate, and should not serialize them.
@tgroh tgroh force-pushed the remove_improper_cloning branch from 338bc10 to 0cf1730 Compare June 7, 2016 21:21
@tgroh
Copy link
Member Author

tgroh commented Jun 7, 2016

Updated the commit message; The RunnableOnService tests all pass, but a future-change that replaces the DirectRunner in its entirety does not, due to the implementation of Write

@tgroh tgroh changed the title [BEAM-22] Remove improper Fn cloning [BEAM-22] Remove unneccesary Fn cloning Jun 7, 2016
@kennknowles
Copy link
Member

LGTM but I will wait until tomorrow to merge, in case there is any last-minute feedback on BEAM-38.

@kennknowles
Copy link
Member

Based on the dev list discussion, it looks like DoFn reuse might already be prevalent, so there's no downside here. Merging.

@asfgit asfgit merged commit 0cf1730 into apache:master Jun 8, 2016
asfgit pushed a commit that referenced this pull request Jun 8, 2016
dhalperi added a commit to dhalperi/beam that referenced this pull request Sep 27, 2016
Before the backoff would grow unboundedly, so we could in principle wait
1.5x to 2x the actual job time. For long running jobs this is hours.
Now, we just back off at most 1 minute between checking the job state.
Note there should be no danger of QPS overload here because we should
have very few concurrent outstanding jobs
tvalentyn pushed a commit to tvalentyn/beam that referenced this pull request May 15, 2018
pl04351820 pushed a commit to pl04351820/beam that referenced this pull request Dec 20, 2023
Co-authored-by: release-please[bot] <55107282+release-please[bot]@users.noreply.github.com>
Co-authored-by: Tres Seaver <tseaver@palladion.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants