ARROW-14702: [Doc][C++] Document threading model #12670

westonpace
Member

No description provided.

@github-actions

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@westonpace
Member Author

westonpace commented Mar 18, 2022

Todo:

  • Figure out arrow::internal::Executor
  • Review documentation for Future methods
  • Figure out why Sphinx is complaining about duplicate definitions for Future
  • Link in ARROW_IO_THREADS env var
  • Add info to python docs
  • Add info to R docs

@westonpace
Member Author

@pitrou I can't include API docs for arrow::internal::Executor because Doxygen excludes everything in the internal namespace. Would it be better to remove all references to the executor (and simply reference the management methods, like setting and retrieving capacity), or should we move Executor out of the internal namespace?

@pitrou
Member

pitrou commented Mar 21, 2022

I don't know, do we actually want to document Executor publicly or just the more general concepts around our execution model?
I think for now we should just reference the management methods.

@westonpace
Member Author

I reduced the scope to remove references to Executor. I left in a small blurb about futures (mainly so I could say "any method that returns a future probably has a synchronous variant").
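
To illustrate the async/sync pairing mentioned above, here is a minimal sketch that uses only the C++ standard library. It is not Arrow's API: `DecodeColumn`, `DecodeColumnsAsync`, and `DecodeColumns` are hypothetical names, `std::future` stands in for `arrow::Future`, and `std::async` stands in for whatever thread pool the real implementation uses.

```cpp
#include <future>
#include <numeric>
#include <vector>

// Hypothetical stand-in for per-column decoding work; not an Arrow function.
long DecodeColumn(const std::vector<int>& encoded) {
  return std::accumulate(encoded.begin(), encoded.end(), 0L);
}

// Asynchronous variant: launches one task per column and returns the futures.
// The caller must keep `columns` alive until every future has completed.
std::vector<std::future<long>> DecodeColumnsAsync(
    const std::vector<std::vector<int>>& columns) {
  std::vector<std::future<long>> futures;
  futures.reserve(columns.size());
  for (const auto& column : columns) {
    futures.push_back(std::async(std::launch::async,
                                 [&column] { return DecodeColumn(column); }));
  }
  return futures;
}

// Synchronous variant: same work, but blocks until all columns are decoded.
std::vector<long> DecodeColumns(const std::vector<std::vector<int>>& columns) {
  std::vector<std::future<long>> futures = DecodeColumnsAsync(columns);
  std::vector<long> results;
  results.reserve(futures.size());
  for (auto& future : futures) {
    results.push_back(future.get());
  }
  return results;
}
```

In this sketch the synchronous variant is just the asynchronous one plus a blocking wait, which is one common way such pairs are built; the actual Arrow methods may be structured differently.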

@westonpace westonpace marked this pull request as ready for review March 23, 2022 02:03
@westonpace westonpace requested a review from pitrou March 23, 2022 02:03
@westonpace westonpace force-pushed the feature/ARROW-14702--document-threading-model branch 2 times, most recently from cce6788 to 873f786 Compare March 23, 2022 02:05

Many Arrow operations distribute work across multiple threads to take
advantage of underlying hardware parallelism. For example, when reading a
parquet file we can decode each column in parallel. To achieve this we
Member

Nit: "Parquet" capitalized, also let's cross-reference to the corresponding doc page?

Member Author

I'm not sure exactly where the best place would be to cross-reference. I added a reference to https://arrow.apache.org/docs/cpp/api/formats.html#_CPPv4N7parquet5arrow10FileReader15set_use_threadsEb

@pitrou
Member

pitrou commented Mar 23, 2022

I really like this change, thanks for doing this!

@westonpace westonpace force-pushed the feature/ARROW-14702--document-threading-model branch from 4071a6d to 667213c Compare March 26, 2022 03:05
@westonpace
Member Author

@pitrou Thanks for the review. I think I've addressed the feedback. I also realized this information applies to Python and R, so I plan on adding a small blurb to those implementations too. I also added a small reference in the filesystems page.

@pitrou
Member

pitrou commented Mar 28, 2022

I'm going to merge this; feel free to open a new PR for the Python and R additions.

@pitrou pitrou closed this in 495eb16 Mar 28, 2022
@ursabot

ursabot commented Mar 28, 2022

Benchmark runs are scheduled for baseline = 583a02b and contender = 495eb16. 495eb16 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.54% ⬆️0.04%] test-mac-arm
[Finished ⬇️0.71% ⬆️1.07%] ursa-i9-9960x
[Finished ⬇️0.17% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
