[BEAM-1556] Make PipelineOptions a lazy-singleton and init IOs as par… #2169
Trying to avoid the need to "figure out" whether a Read/Write requires IO registration, I simply put
Run Spark RunnableOnService
R: @davorbonaci CC: @francesperry
Refer to this link for build results (access rights to CI server needed):
Failed Tests: 1
beam_PostCommit_Java_RunnableOnService_Spark/org.apache.beam:beam-runners-spark: 1
ResumeFromCheckpointStreamingTest.testWithResume flakes
Refer to this link for build results (access rights to CI server needed):
Confirmed this fixes the issue for me!
LGTM. I'll merge, but will leave a few comments afterwards.
}
}
// register IO factories.
IOChannelUtils.registerIOFactoriesAllowOverride(pipelineOptions);
Comment after the merge:
Can it happen, in some clusters, that this process on the worker runs multiple pipelines at the same time? If so, different pipelines may have different pipeline options, and the second registration may clobber the first one.
Is this relevant? If not, great. If so, perhaps we should log a JIRA issue for later.
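To make the clobbering concern concrete, here is a minimal sketch of a process-wide registry where a later registration replaces an earlier one. `SchemeRegistry` and its methods are hypothetical names invented for illustration; this is not Beam's `IOChannelUtils`, only an assumption about how an "allow override" registration keyed by URI scheme could behave.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical process-wide registry keyed by URI scheme (not Beam's IOChannelUtils). */
final class SchemeRegistry {
  // Static state is shared by everything running in the same JVM.
  private static final Map<String, String> FACTORY_BY_SCHEME = new ConcurrentHashMap<>();

  private SchemeRegistry() {}

  /** Registers a factory configuration for a scheme, replacing any earlier entry. */
  static void registerAllowOverride(String scheme, String factoryConfig) {
    FACTORY_BY_SCHEME.put(scheme, factoryConfig);
  }

  /** Returns the configuration currently registered for the scheme, or null if none. */
  static String lookup(String scheme) {
    return FACTORY_BY_SCHEME.get(scheme);
  }
}
```

If two pipelines with different options shared one worker process and each called `registerAllowOverride("gs", ...)`, the first pipeline would afterwards see the second pipeline's configuration.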
Good question!
From Spark docs: "Each application gets its own executor processes".
This page in Spark docs should cover it a lot better than me.
So I think a lazy-singleton should work (pretty common pattern for Spark users..).
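For readers unfamiliar with the pattern, a lazy singleton that also performs one-time registration might look roughly like the sketch below. `LazyOptionsHolder`, the `Map<String, String>` stand-in for pipeline options, and the `registerIoFactories` helper are all hypothetical and chosen for illustration; this is not the code in this PR. It uses double-checked locking on a `volatile` field so that the options are deserialized, and the registration run, at most once per JVM.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical per-JVM holder; a stand-in, not the Spark runner's actual class. */
final class LazyOptionsHolder {
  // volatile so a fully initialized value is visible to every thread.
  private static volatile Map<String, String> options;
  // Counts how often the one-time registration ran (for illustration only).
  static final AtomicInteger registrations = new AtomicInteger();

  private LazyOptionsHolder() {}

  /** Returns the shared options, initializing them on first use. */
  static Map<String, String> getOrInit(Supplier<Map<String, String>> deserializer) {
    Map<String, String> local = options;
    if (local == null) {
      synchronized (LazyOptionsHolder.class) {
        local = options;
        if (local == null) {
          local = deserializer.get();
          registerIoFactories(local); // runs once, together with initialization
          options = local;
        }
      }
    }
    return local;
  }

  // Placeholder for the real registration step (assumed, not Beam's API).
  private static void registerIoFactories(Map<String, String> opts) {
    registrations.incrementAndGet();
  }
}
```

Every call after the first returns the instance created by the first caller and ignores its own supplier, which is safe only under the assumption discussed above: one application per executor process.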
SGTM
And, thank you so much for fixing this quickly!
…t of it.
Be sure to do all of the following to help us incorporate your contribution
quickly and easily:
[BEAM-<Jira issue #>] Description of pull request
mvn clean verify
. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes).
<Jira issue #>
in the title with the actual Jira issue number, if there is one.
Individual Contributor License Agreement.