[R] read_feather hanging on Windows #27449

Closed
asfimport opened this issue Feb 10, 2021 · 8 comments

asfimport commented Feb 10, 2021

On Windows 10, reading large feather files in R seems to hang on a repeated read.

This issue has been reproduced on 3 different Windows machines, all running Windows 10 and R 4.0.0 (or later).

read_feather does not hang when the file is written with version = 1, or written uncompressed with version = 2.

This issue does not occur in tests on Linux (Ubuntu 20.04 at least).

 

Example:

 

library(arrow)

m <- data.frame(x = rnorm(7e6), y = rnorm(5), b = rnorm(5), n = rnorm(5), c = c("a", "n"))

# does not hang with uncompressed, but does with lz4 and zstd
write_feather(m, "test.feather4", version = 2, compression = "lz4")

for (j in 1:50) {
  # hangs after an unpredictable number of reads, only on Windows
  y <- read_feather("test.feather4")
  print(paste0("feather read ", j, "..."))
}

Interestingly, a workaround is to call read_feather on just one column at a time. This has not hung so far.

For example, the following reads every column of the data frame (as a list of one-column data frames) and does not hang on repeated reads:

cols <- c("x", "y", "b", "n", "c")  # column names of m above
y <- lapply(cols, function(col) {
  read_feather("test.feather4", col_select = all_of(col))
})
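
If a single data frame is needed rather than a list, the per-column results can be bound back together. The following is only a minimal sketch: `read_by_column` is a hypothetical helper name (not part of arrow), and a small stand-in data frame replaces the large one from the report so the example runs quickly.

```r
library(arrow)

# Hypothetical helper (not an arrow function): read a feather file one
# column at a time, then bind the one-column results into one data frame.
read_by_column <- function(path, cols) {
  parts <- lapply(cols, function(col) {
    read_feather(path, col_select = all_of(col))
  })
  do.call(cbind, parts)
}

# Small stand-in for the large data frame used in the report.
m_small <- data.frame(x = rnorm(10), y = rnorm(5), c = c("a", "n"))
path <- tempfile(fileext = ".feather")
write_feather(m_small, path)

y <- read_by_column(path, names(m_small))
```

This trades one multi-column read for several single-column reads, so it is likely slower on wide files; it is a workaround, not a fix.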

 

Environment: Windows 10, R 4.0.0, arrow 3.0.0
Reporter: Claymore Marshall
Assignee: Neal Richardson / @nealrichardson

Related issues:

Note: This issue was originally created as ARROW-11579. Please see the migration documentation for further details.

Claymore Marshall:
Perhaps related: on Windows 10, write_feather(..., compression = "lz4") and similar compressed writes of large objects may also hang.

Ian Cook / @ianmcook:
[~claymoremarshall] Thank you for reporting this. I made two attempts to reproduce this error, both with version 3.0.0 of the arrow package installed from CRAN running in R x64 4.0.3. First I used Windows 10 running in a VM with VirtualBox on a macOS host. The error did not occur there even after repeated attempts. Next I used a laptop running Windows 10 natively, and there I was able to reproduce the issue immediately.

I also experimented with two things:

  • running Sys.setenv(ARROW_DEFAULT_MEMORY_POOL="system") before loading the arrow package
  • running gc() in each iteration of the loop

The hanging behavior occurred regardless.

In my tests, there was no memory starvation when the hanging occurred.

I'll keep investigating.

Ian Cook / @ianmcook:
[~claymoremarshall]  It seems this hanging behavior is being caused by a thread deadlock. In the VM where I could not reproduce the issue, arrow::cpu_count() was 2, meaning the VM had two virtual CPU cores. On the machine where the hanging did occur, arrow::cpu_count() was 8.

You can get the hanging to stop by using arrow::set_cpu_count() to reduce the number of CPU threads Arrow can use. When I set arrow::set_cpu_count(2), the hanging behavior stopped for me.
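
As a rough sketch of that workaround, the thread count can be checked, lowered, and later restored within a session. The value 2 is simply the one reported above; the right value for another machine is an assumption to verify, and the variable name `original` is only illustrative.

```r
library(arrow)

# Number of threads Arrow's CPU thread pool is currently allowed to use.
original <- cpu_count()

# Limit Arrow to two threads, the value that stopped the hang in the
# report above (other machines may need a different value).
set_cpu_count(2)
cpu_count()

# Restore the previous setting when done.
set_cpu_count(original)
```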

Claymore Marshall:
Thanks. I can confirm that lowering the thread count with set_cpu_count() stops the hanging in read_feather (and write_feather) so far.

Claymore Marshall:
Note that I was able to use set_cpu_count(2) without hanging, but I have now seen the hanging behaviour again when reading many small feather data sets in bulk. I need set_cpu_count(1) to avoid it.

Neal Richardson / @nealrichardson:
FTR, another way you can turn off multithreading is to set options(arrow.use_threads = FALSE)

There are some longstanding issues with multithreading on Windows that we unfortunately haven't been able to pin down. See also ARROW-8379.
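
A minimal sketch of that option in use, assuming it is set before the reads that need it. The temporary file and tiny data frame are placeholders for illustration only.

```r
library(arrow)

# Turn off multithreading in the arrow R package for this session.
options(arrow.use_threads = FALSE)

# Placeholder data: write a tiny feather file and read it back with
# threading disabled.
path <- tempfile(fileext = ".feather")
write_feather(data.frame(x = rnorm(10)), path)
y <- read_feather(path)
```

Putting the options() call in a project's .Rprofile would apply it to every session, at the cost of slower reads on machines that are not affected.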

Andrew C Thomas:
Same problem found here. Also applies to read_parquet in my case. Turning off multithreading seems to have done the trick.

Neal Richardson / @nealrichardson:
We believe that this has been resolved in ARROW-8379. If you still experience this with version 6.0.0 or greater (after it is released in mid-October), please open a new issue.
