[BEAM-5978] Changing parallelism for wordcount to 1 #7174
Please see #6954 (comment)
This is a problem specific to batch mode. Please set the workaround accordingly (there is already another one below) and add a comment pointing to the JIRA.
Thanks. I will apply this parameter to batch only then.
e349534 to 77da4f3
Updated!
Let's wait for @mxm to take a look as well. I experimented a bit with the parallelism settings and documented it on https://issues.apache.org/jira/browse/BEAM-5167
Even though it is not currently needed for streaming, it is generally advisable to be explicit about the parallelism (predictability WRT network buffers etc.).
We could also fold this change into #7180.
Unrelated to this change, note that currently wordcount doesn't work on macOS with the docker jobserver:
> As we have 12 threads in SDKHarness, we get a deadlock on my machine.
Default should be 1. You would get auto-set 12 threads with it set to 0.
IMHO ok to merge this as a fix for as long as the underlying issue is still pending.
1a2bb38 to 538a792
Shall we merge this PR?
Go ahead :)
    ]
    if (streaming)
      options += ["--streaming"]
    else
      // workaround for local file output in docker container
      options += ["--environment_cache_millis=10000"]
      // [BEAM-5167] Workaround for scheduling issue between SDKHarness and Flink
      options += ["--parallelism=1"]
If the intention is to use it only for batch, you would need to add {..} around the else branch.
Good catch. That's why it's better to always use brackets for if blocks.
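To make the pitfall concrete, here is a minimal sketch in plain Java, whose brace-less if/else rule is the same as in the Groovy build script. The class and method names are hypothetical and exist only for illustration; they are not part of Beam or of this PR. Without braces, only the first statement after `else` belongs to the branch, so the parallelism flag is added for streaming too.

```java
import java.util.ArrayList;
import java.util.List;

public class WordcountOptions {
    // Hypothetical helper mirroring the snippet under review (no braces).
    static List<String> withoutBraces(boolean streaming) {
        List<String> options = new ArrayList<>();
        if (streaming)
            options.add("--streaming");
        else
            // only this first statement is governed by the else
            options.add("--environment_cache_millis=10000");
            // misleading indentation: this statement always runs
            options.add("--parallelism=1");
        return options;
    }

    // Same logic with braces, so both workarounds apply to batch only.
    static List<String> withBraces(boolean streaming) {
        List<String> options = new ArrayList<>();
        if (streaming) {
            options.add("--streaming");
        } else {
            options.add("--environment_cache_millis=10000");
            options.add("--parallelism=1");
        }
        return options;
    }

    public static void main(String[] args) {
        System.out.println(withoutBraces(true)); // [--streaming, --parallelism=1]
        System.out.println(withBraces(true));    // [--streaming]
    }
}
```

With braces, the batch-only intent is enforced by the syntax rather than implied by the indentation.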
Ahh, totally missed it. BTW, Python works with indentation, so it might have worked in Python :)
We should also apply this change to streaming, as we don't want parallelism in streaming either.
I wouldn't make this more restrictive than necessary, so only batch for now.
@angoenka the error posted above applies to Docker execution (the default), as shown in the command.
538a792 to bd0103c
Updated the PR
The local execution environment sets the parallelism of the pipeline to the number of cores on the machine. In my case, it's 12. As we have 12 threads in SDKHarness, we get a deadlock on my machine.
Running on a cluster does not have this issue as the default parallelism there is 1.