Skip to content

Commit

Permalink
GH-36121: [R] Warn for set_io_thread_count() with num_threads < 2 (
Browse files Browse the repository at this point in the history
…#36304)

### Rationale for this change

Setting the number of threads in the IO thread pool to 1 causes a hang or crash when using some functions (notably: any Acero exec plan).

### What changes are included in this PR?

`set_io_thread_count()` now warns for `num_threads == 1`:

``` r
library(arrow, warn.conflicts = FALSE)
#> Some features are not enabled in this build of Arrow. Run `arrow_info()` for more information.

# Already errors from C++
set_io_thread_count(0)
#> Error: Invalid: ThreadPool capacity must be > 0
# New warning!
set_io_thread_count(1)
#> Warning: `arrow::set_io_thread_count()` with num_threads < 2 may
#> cause certain operations to hang or crash.
#> ℹ Use num_threads >= 2 to support all operations
# No warning!
set_io_thread_count(2)
```

<sup>Created on 2023-06-26 with [reprex v2.0.2](https://reprex.tidyverse.org)</sup>

### Are these changes tested?

Yes

### Are there any user-facing changes?

Yes: some existing code may issue a warning that previously did not. Documentation was added.
* Closes: #36121

Authored-by: Dewey Dunnington <dewey@voltrondata.com>
Signed-off-by: Dewey Dunnington <dewey@voltrondata.com>
  • Loading branch information
paleolimbot committed Jun 27, 2023
1 parent 8b4a548 commit 4a93a37
Show file tree
Hide file tree
Showing 3 changed files with 34 additions and 5 deletions.
18 changes: 17 additions & 1 deletion r/R/config.R
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,26 @@ io_thread_count <- function() {
}

#' @rdname io_thread_count
#' @param num_threads integer: New number of threads for thread pool
#' @param num_threads integer: New number of threads for thread pool. At least
#' two threads are reccomended to support all operations in the arrow
#' package.
#' @export
set_io_thread_count <- function(num_threads) {
current_io_thread_count <- io_thread_count()
SetIOThreadPoolCapacity(as.integer(num_threads))

# Warn for IO thread count < 2: Arrow C++ makes the assumption that there
# is at least one thread available and the R package uses one thread from
# the IO thread pool to support calling into R from C++.
if (num_threads < 2) {
rlang::warn(
c(
"`arrow::set_io_thread_count()` with num_threads < 2 may",
"cause certain operations to hang or crash.",
"i" = "Use num_threads >= 2 to support all operations"
)
)
}

invisible(current_io_thread_count)
}
4 changes: 3 additions & 1 deletion r/man/io_thread_count.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

17 changes: 14 additions & 3 deletions r/tests/testthat/test-config.R
Original file line number Diff line number Diff line change
Expand Up @@ -19,11 +19,22 @@ test_that("set_io_thread_count() sets the number of io threads", {
current_io_thread_count <- io_thread_count()
on.exit(set_io_thread_count(current_io_thread_count))

previous_io_thread_count <- set_io_thread_count(1)
previous_io_thread_count <- set_io_thread_count(2)
expect_identical(previous_io_thread_count, current_io_thread_count)
expect_identical(io_thread_count(), 1L)
expect_identical(io_thread_count(), 2L)

expect_identical(set_io_thread_count(current_io_thread_count), 1L)
expect_identical(set_io_thread_count(current_io_thread_count), 2L)
})

test_that("set_io_thread_count() warns for num_threads == 1", {
current_io_thread_count <- io_thread_count()
on.exit(set_io_thread_count(current_io_thread_count))

expect_warning(
set_io_thread_count(1),
"num_threads < 2",
fixed = TRUE
)
})

test_that("set_cpu_count() sets the number of CPU threads", {
Expand Down

0 comments on commit 4a93a37

Please sign in to comment.