ARROW-15718: [C++] Increase thread limit to work around thread issues #12845
Thanks for opening a pull request! If this is not a minor PR, could you open an issue for this pull request on JIRA? https://issues.apache.org/jira/browse/ARROW Opening JIRAs ahead of time contributes to the Openness of the Apache Arrow project. Then could you also rename the pull request title in the following format?
or
See also:
Thanks for realizing we still needed to patch this up. This looks good to me but I think we can be just a touch safer (the cost of allocating these extra states should be insignificant).
It looks like that Java failure is simply flaky right now. See other recent failures:
Thanks for taking care of this.
Benchmark runs are scheduled for baseline = 1763622 and contender = 08ab8b0. 08ab8b0 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
In #12339 we added one, which enabled joining one table to one dataset using `use_threads=false`. However, I found that joining two datasets hit the thread limit.
There are plans to find a long-term fix that can run these operations synchronously with fewer threads, but that won't be ready for the next release.
As a temporary fix for 8.0.0, I propose just bumping up the `local_states_` capacity.
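For readers unfamiliar with this kind of limit, here is a minimal, self-contained sketch of how a fixed-capacity array of per-thread states can run out of slots, and why raising the capacity works around it. It is not the Arrow implementation: `LocalState`, `ThreadSlots`, and `CountFailures` are hypothetical names, and the sketch assumes the limit comes from preallocating one state per distinct thread.

```cpp
#include <atomic>
#include <cstddef>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical per-thread scratch space; not Arrow's real type.
struct LocalState {
  std::size_t rows_seen = 0;
};

// Hands out one preallocated state per distinct thread, up to a fixed capacity.
class ThreadSlots {
 public:
  explicit ThreadSlots(std::size_t capacity) : states_(capacity) {}

  // Returns the calling thread's state. Throws once more distinct threads
  // have asked for a slot than were preallocated.
  LocalState& Get() {
    std::lock_guard<std::mutex> lock(mutex_);
    const auto id = std::this_thread::get_id();
    auto it = index_.find(id);
    if (it == index_.end()) {
      if (index_.size() >= states_.size()) {
        throw std::runtime_error("thread index exceeds local state capacity");
      }
      it = index_.emplace(id, index_.size()).first;
    }
    // Safe to use after unlocking: states_ is never resized.
    return states_[it->second];
  }

 private:
  std::mutex mutex_;
  std::unordered_map<std::thread::id, std::size_t> index_;
  std::vector<LocalState> states_;
};

// Starts num_threads threads that each request a slot, and returns how many
// of them were refused.
inline std::size_t CountFailures(ThreadSlots& slots, std::size_t num_threads) {
  std::atomic<std::size_t> failures{0};
  std::atomic<std::size_t> attempted{0};
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < num_threads; ++i) {
    threads.emplace_back([&] {
      try {
        slots.Get().rows_seen += 1;
      } catch (const std::runtime_error&) {
        failures.fetch_add(1);
      }
      attempted.fetch_add(1);
      // Stay alive until every thread has tried, so thread ids are not reused.
      while (attempted.load() < num_threads) std::this_thread::yield();
    });
  }
  for (auto& t : threads) t.join();
  return failures.load();
}
```

With a capacity of 2, four concurrent threads leave two of them without a slot; with a capacity of 8 the same four threads all succeed. The extra slots cost only a few unused `LocalState` objects.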