I recently upgraded to Arrow 6.0.1 and am using it in R.
When reading a large file (~10 GB) on Windows, R intermittently freezes. I can see memory being allocated during the first 10-20 seconds, but then nothing happens and R stops responding (the R process goes idle too).
I have set options(arrow.use_threads=FALSE).
I didn't have this issue with the version I was using previously (0.15.1), and the same file reads fine under Linux.
I would post a reproducible example, but the hang happens at random. I even tried reading large files in pieces by first getting all the distinct sections of a specific column (with compute>collect), but that hangs too.
Any ideas would be appreciated.
Edit
Not sure if this makes sense to anyone, but after a few tries it seems that the issue only happens in RStudio. In the plain R console the file loads fine. All I'm executing is the following:
options(arrow.use_threads=FALSE)
aa <- arrow::read_arrow('.../file.arrow5')
One thing I want to point out is that the underlying Rscript process under RStudio definitely seems to use more than one core when executing the above.
Edit2
Using arrow::set_cpu_count(1) seems to solve the issue.
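For anyone hitting the same hang, here is a minimal sketch of that workaround. The temporary file and the small data frame are hypothetical stand-ins for the large file, and arrow::read_feather is used here in place of read_arrow; the only point is that the thread pool is capped before the read starts.

```r
library(arrow)

# Disable multithreading in the R bindings and cap Arrow's
# CPU thread pool at a single thread before reading anything.
options(arrow.use_threads = FALSE)
arrow::set_cpu_count(1)

# Hypothetical stand-in for the real ~10 GB file.
path <- tempfile(fileext = ".arrow")
arrow::write_feather(data.frame(x = 1:5, y = letters[1:5]), path)

# Confirm the pool size, then read the file.
n_threads <- arrow::cpu_count()
aa <- arrow::read_feather(path)
```

Checking arrow::cpu_count() after the call is a quick way to confirm the setting took effect in the RStudio session.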
Reporter: Christian
Note: This issue was originally created as ARROW-15729. Please see the migration documentation for further details.