Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-11478: [R] Consider ways to make arrow.skip_nul option more user-friendly #9899

Closed
wants to merge 5 commits into from

Conversation

nealrichardson
Copy link
Member

Following a similar approach to ARROW-11766. This also pulls in withr as a test dependency and refactors some unrelated tests to use it.

@github-actions
Copy link

github-actions bot commented Apr 5, 2021

@nealrichardson
Copy link
Member Author

The error on centos7 is something that I see on about 1 out of every 10 runs locally on the as.data.frame(RecordBatch) test. Locally it looks like:

libc++abi.dylib: Pure virtual function called!
/bin/sh: line 1: 93200 Abort trap: 6

I haven't observed this on repeated runs of the array/chunked-array tests.

Any ideas @romainfrancois @bkietz? Even though my changes are purely in the R code, googling the error message suggests it's a C++ issue.

e$message <- paste0(msg, "; to strip nuls when converting from Arrow to R, set options(arrow.skip_nul = TRUE)")
}
stop(e)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the call info in the error message at all informative? If not, you might want stop(msg, call. = FALSE) inside the conditional (after appending to the message of course).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought (though I could be remembering wrong) that when you call stop() on a caught simpleError class object, you just re-raise it and that preserves however the original source of the stop() had said call. or whatever options. I'll review/confirm tomorrow.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed:

> f <- function(...) stop(...)
> f("boo")
Error in f("boo") : boo
> f("boo", call. = FALSE)
Error: boo
> tryCatch(f("boo"), error = function(e) stop(e))
Error in f("boo") : boo
> tryCatch(f("boo", call. = FALSE), error = function(e) stop(e))
Error: boo

@ianmcook
Copy link
Member

ianmcook commented Apr 6, 2021

Looks good to me 👍

@westonpace
Copy link
Member

The error on centos7 is something that I see on about 1 out of every 10 runs locally on the as.data.frame(RecordBatch) test. Locally it looks like:

libc++abi.dylib: Pure virtual function called!
/bin/sh: line 1: 93200 Abort trap: 6

I haven't observed this on repeated runs of the array/chunked-array tests.

Any ideas @romainfrancois @bkietz? Even though my changes are purely in the R code, googling the error message suggests it's a C++ issue.

It's a bug in to_dataframe_parallel (or possibly somewhere deeper). Something is calling stop which is causing the code to bail before tg (the arrow::internal::TaskGroup) is destroyed. The converters are then destroyed first for whatever reason (I don't really know how R cleans this stuff up). When the TaskGroup is destroyed it waits until all running tasks cease. Since this hasn't happened the tasks are still running. The tasks reference this implicitly where this is the converter. Thus, the tasks have a pointer to an object that has been destroyed and they attempt to make a call on it. This yields the error you saw.

A quick patch (but I'm not sure if the right one as I don't understand R's stop well enough) could be...

diff --git a/r/src/array_to_vector.cpp b/r/src/array_to_vector.cpp
index ddcb74946..ef57fa660 100644
--- a/r/src/array_to_vector.cpp
+++ b/r/src/array_to_vector.cpp
@@ -88,11 +88,12 @@ class Converter {
   // for each array, add a task to the task group
   //
   // The task group is Finish() in the caller
-  void IngestParallel(SEXP data, const std::shared_ptr<arrow::internal::TaskGroup>& tg) {
+  void IngestParallel(SEXP data, const std::shared_ptr<arrow::internal::TaskGroup>& tg,
+                      std::shared_ptr<Converter> self) {
     R_xlen_t k = 0, i = 0;
     for (const auto& array : arrays_) {
       auto n_chunk = array->length();
-      tg->Append([=] { return IngestOne(data, array, k, n_chunk, i); });
+      tg->Append([=] { return self->IngestOne(data, array, k, n_chunk, i); });
       k += n_chunk;
       i++;
     }
@@ -1242,7 +1243,7 @@ cpp11::writable::list to_dataframe_parallel(
 
     // add a task to ingest data of that column if that can be done in parallel
     if (converters[i]->Parallel()) {
-      converters[i]->IngestParallel(column, tg);
+      converters[i]->IngestParallel(column, tg, converters[i]);
     }
   }

This copies the shared_ptr into the task and uses that (instead of an implicit this capture).

@nealrichardson nealrichardson deleted the skip-nul-message branch April 6, 2021 19:18
GeorgeAp pushed a commit to sirensolutions/arrow that referenced this pull request Jun 7, 2021
…r-friendly

Following a similar approach to ARROW-11766. This also pulls in `withr` as a test dependency and refactors some unrelated tests to use it.

Closes apache#9899 from nealrichardson/skip-nul-message

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 10, 2021
…r-friendly

Following a similar approach to ARROW-11766. This also pulls in `withr` as a test dependency and refactors some unrelated tests to use it.

Closes apache#9899 from nealrichardson/skip-nul-message

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
michalursa pushed a commit to michalursa/arrow that referenced this pull request Jun 13, 2021
…r-friendly

Following a similar approach to ARROW-11766. This also pulls in `withr` as a test dependency and refactors some unrelated tests to use it.

Closes apache#9899 from nealrichardson/skip-nul-message

Authored-by: Neal Richardson <neal.p.richardson@gmail.com>
Signed-off-by: Neal Richardson <neal.p.richardson@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants