This is required to keep Windows 32-bit RTools 3.5 builds from crashing, as they don't seem to properly implement the <threading> header.
However, it could be generally useful to anyone that wants to avoid thread creation.
Currently, the asynchronous approaches require threading. For example, even a simple call to check whether the CSVFileFormat supports a file requires peeking at the file and reading the first block. These I/O operations happen on the I/O thread pool and are then transferred to the CPU thread pool (which is NOT the same thing as the calling thread). Meanwhile, the calling thread is blocked waiting for results.
This can be avoided by treating the calling thread as a single-threaded thread pool and then using that as the CPU thread pool. This allows all CPU work to be done on the calling thread alone. In the future, this could also allow us to remove duplicate code paths (e.g. code paths that exist only to keep functions serial, such as the serial CSV reader).
This capability could be extended to include the I/O thread pool as well at some point in the future.
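To make the idea concrete, here is a minimal sketch of a calling thread acting as its own single-threaded "pool". It is an illustration only, not the implementation in the PR and not Arrow's executor API: `SerialExecutor`, `Spawn`, `RunUntil`, and `RunPipelineOnCallingThread` are hypothetical names. The sketch is also not thread-safe. A real version would need a lock and a wake-up mechanism so that I/O threads can hand continuations back to the calling thread.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical sketch: a task queue drained by whichever thread calls RunUntil().
// Not thread-safe; only the calling thread may touch it.
class SerialExecutor {
 public:
  // Queue a task. Nothing runs until RunUntil() is called.
  void Spawn(std::function<void()> task) { tasks_.push_back(std::move(task)); }

  // Run queued tasks on the calling thread until `done` returns true
  // or the queue is empty. Tasks may spawn follow-up tasks.
  void RunUntil(const std::function<bool()>& done) {
    while (!done() && !tasks_.empty()) {
      // Move the task out first so it can safely call Spawn() while running.
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

 private:
  std::deque<std::function<void()>> tasks_;
};

// Runs a two-step "CPU pipeline" and reports whether every step
// executed on the thread that called this function.
bool RunPipelineOnCallingThread() {
  SerialExecutor executor;
  const std::thread::id caller = std::this_thread::get_id();
  std::vector<std::thread::id> seen;
  bool finished = false;

  executor.Spawn([&] {
    seen.push_back(std::this_thread::get_id());  // stand-in for "parse first block"
    executor.Spawn([&] {
      seen.push_back(std::this_thread::get_id());  // stand-in for a follow-up step
      finished = true;
    });
  });

  // Instead of blocking on a result, the caller does the work itself.
  executor.RunUntil([&] { return finished; });

  if (!finished || seen.size() != 2) return false;
  for (const std::thread::id& id : seen) {
    if (id != caller) return false;
  }
  return true;
}
```

Calling `RunPipelineOnCallingThread()` from any thread runs both steps on that same thread and creates no new threads, which is the property the proposal is after for the CPU side of the work.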
Antoine Pitrou / @pitrou:
I would not bother about RTools 3.5 at this point. We're dropping it in a month or two. Just disable whatever functionality doesn't work (especially if the problems are 32-bit only).
Regardless, this is conceptually a worthwhile proposal, but I don't know if it will be easy to implement.
Weston Pace / @westonpace:
Ah, my understanding from Neal was that we cannot release 4.0 with RTools 3.5 broken. I had already been working on this as part of ARROW-12161 but ended up creating this separate Jira, as the PR is doing more than just fixing a bug. I already have an implementation.
Neal Richardson / @nealrichardson:
Correct, we cannot release 4.0 with a broken RTools 3.5 build. But if necessary we can conditionally disable features in it--we already do this for S3 support.
Reporter: Weston Pace / @westonpace
Assignee: Weston Pace / @westonpace
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-12208. Please see the migration documentation for further details.