Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with cpp11 0.4.4 changes to the preserve list structure #330

Closed
DavisVaughan opened this issue Aug 2, 2023 · 0 comments · Fixed by #331
Closed

Issues with cpp11 0.4.4 changes to the preserve list structure #330

DavisVaughan opened this issue Aug 2, 2023 · 0 comments · Fixed by #331
Assignees

Comments

@DavisVaughan
Copy link
Member

DavisVaughan commented Aug 2, 2023

The following will error if arrow is compiled with cpp11 0.4.4

conn <- duckdb::dbConnect(duckdb::duckdb())

path <- gbifdb::gbif_example_data()
dataset <- arrow::open_dataset(path)

duckdb::dbDisconnect(conn, shutdown = TRUE)

# Now restart R

It should error on restart with:

Error: bad value

This error is related to SETCAR() calls, which we confirmed with an lldb backtrace:

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.1
  * frame #0: 0x00000001014589ec libR.dylib`Rf_error(format="bad value") at errors.c:961:26 [opt]
    frame #1: 0x00000001014bded4 libR.dylib`SETCAR(x=0x000000013000a0e0, y=0x00000001309d81c8) at memory.c:4285:2 [opt]
    frame #2: 0x0000000123188814 arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::$_0::release(this=<unavailable>, cell=<unavailable>) at protect.hpp:388:5 [opt]
    frame #3: 0x00000001231887d0 arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::sexp::~sexp(this=<unavailable>) at sexp.hpp:58:23 [opt]
    frame #4: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::sexp::~sexp(this=<unavailable>) at sexp.hpp:58:11 [opt]
    frame #5: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::environment::~environment(this=<unavailable>) at environment.hpp:21:7 [opt]
    frame #6: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared() [inlined] cpp11::environment::~environment(this=<unavailable>) at environment.hpp:21:7 [opt]
    frame #7: 0x00000001231887cc arrow.so`std::__1::__shared_ptr_emplace<cpp11::environment, std::__1::allocator<cpp11::environment> >::__on_zero_shared(this=<unavailable>) at memory:3318:23 [opt]
    frame #8: 0x0000000123183c2c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::__shared_count::__release_shared(this=0x00006000001faac0) at memory:3169:9 [opt]
    frame #9: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::__shared_weak_count::__release_shared(this=0x00006000001faac0) at memory:3211:27 [opt]
    frame #10: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::shared_ptr<cpp11::environment>::~shared_ptr(this=0x000060000388c830) at memory:3884:19 [opt]
    frame #11: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType() [inlined] std::__1::shared_ptr<cpp11::environment>::~shared_ptr(this=0x000060000388c830) at memory:3882:1 [opt]
    frame #12: 0x0000000123183c1c arrow.so`RExtensionType::~RExtensionType(this=0x000060000388c790) at extension.h:33:7 [opt]
    frame #13: 0x0000000123182e8c arrow.so`RExtensionType::~RExtensionType() [inlined] RExtensionType::~RExtensionType(this=<unavailable>) at extension.h:33:7 [opt]
    frame #14: 0x0000000123182e88 arrow.so`RExtensionType::~RExtensionType(this=<unavailable>) at extension.h:33:7 [opt]
    frame #15: 0x0000000123224ea8 arrow.so`std::__1::shared_ptr<parquet::internal::RecordReader>::~shared_ptr() + 56
    frame #16: 0x0000000123507510 arrow.so`void std::__1::allocator_traits<std::__1::allocator<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*> > >::destroy<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::shared_ptr<arrow::ExtensionType> > >(std::__1::allocator<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*> >&, std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const, std::__1::shared_ptr<arrow::ExtensionType> >*) + 24
    frame #17: 0x00000001235074dc arrow.so`std::__1::__hash_table<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::__unordered_map_hasher<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::__unordered_map_equal<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::allocator<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> > > >::__deallocate_node(std::__1::__hash_node_base<std::__1::__hash_node<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, void*>*>*) + 32
    frame #18: 0x000000012350749c arrow.so`std::__1::__hash_table<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::__unordered_map_hasher<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::hash<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::__unordered_map_equal<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> >, std::__1::equal_to<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, true>, std::__1::allocator<std::__1::__hash_value_type<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::shared_ptr<arrow::ExtensionType> > > >::~__hash_table() + 24
    frame #19: 0x00000001235071d4 arrow.so`arrow::ExtensionTypeRegistryImpl::~ExtensionTypeRegistryImpl() + 44
    frame #20: 0x0000000123224ea8 arrow.so`std::__1::shared_ptr<parquet::internal::RecordReader>::~shared_ptr() + 56
    frame #21: 0x00000001a60fdec4 libsystem_c.dylib`__cxa_finalize_ranges + 476
    frame #22: 0x00000001a60fdc4c libsystem_c.dylib`exit + 44
    frame #23: 0x00000001015980b8 libR.dylib`Rstd_CleanUp(saveact=SA_NOSAVE, status=0, runLast=<unavailable>) at sys-std.c:1246:5 [opt]
    frame #24: 0x000000010159ae60 libR.dylib`R_CleanUp(saveact=<unavailable>, status=0, runLast=<unavailable>) at system.c:87:5 [opt]
    frame #25: 0x00000001014b539c libR.dylib`do_quit(call=<unavailable>, op=<unavailable>, args=0x0000000112b84bd8, rho=<unavailable>) at main.c:1487:5 [opt]
    frame #26: 0x0000000101466a24 libR.dylib`bcEval(body=0x0000000112b84f58, rho=<unavailable>, useCache=<unavailable>) at eval.c:7446:14 [opt]
    frame #27: 0x000000010145f048 libR.dylib`Rf_eval(e=0x0000000112b84f58, rho=0x0000000112b84cb8) at eval.c:1013:8 [opt]
    frame #28: 0x000000010147bccc libR.dylib`R_execClosure(call=0x0000000112b81400, newrho=0x0000000112b84cb8, sysparent=<unavailable>, rho=<unavailable>, arglist=<unavailable>, op=0x0000000112b850a8) at eval.c:0 [opt]
    frame #29: 0x000000010147a54c libR.dylib`Rf_applyClosure(call=0x0000000112b81400, op=0x0000000112b850a8, arglist=0x000000013000a0e0, rho=0x0000000130042f88, suppliedvars=<unavailable>) at eval.c:2113:16 [opt]
    frame #30: 0x000000010145f31c libR.dylib`Rf_eval(e=0x0000000112b81400, rho=0x0000000130042f88) at eval.c:1140:12 [opt]
    frame #31: 0x00000001014b33b4 libR.dylib`Rf_ReplIteration(rho=0x0000000130042f88, savestack=<unavailable>, browselevel=<unavailable>, state=0x000000016ee85b10) at main.c:262:2 [opt]
    frame #32: 0x00000001014b4928 libR.dylib`R_ReplConsole(rho=0x0000000130042f88, savestack=0, browselevel=0) at main.c:314:11 [opt]
    frame #33: 0x00000001014b4864 libR.dylib`run_Rmainloop at main.c:1200:5 [opt]
    frame #34: 0x00000001014b49d0 libR.dylib`Rf_mainloop at main.c:1207:5 [opt]
    frame #35: 0x0000000100f7bea0 R`main + 32
    frame #36: 0x00000001a5ee7f28 dyld`start + 2236

That error is due to the changes in this PR #285, which slightly tweaked the structure of the preserve list.

duckdb vendors cpp11 0.4.2, which uses the "old" structure of the preserve list. It gets loaded first (and presumably something internally calls cpp11) so the preserve list is initialized to the "old" structure. Then we load arrow and that internally uses cpp11 0.4.4 which included that PR. Since the structures differ, the insertion/release calls seem to not be aligned quite right, so as R exits and the destructors for the arrow objects are all run, we end up with an error in SETCAR() which seems to come from the release() call in ~sexp().

The PR included an attempt to be backwards compatible with old versions of cpp11, but it seems like it wasn't quite right (and is quite hard to get right, as we see here).


We have had many issues with "shared cpp11 state" over the years, and the fact that a package built with, say, cpp11 0.4.2, has to share state with a package built with cpp11 0.4.4 makes it very hard to introduce changes to the shared state objects in a backwards compatible way.

In #327 we faced a related issue, and our solution was actually to remove that shared state altogether.

Lionel, Kevin, and I have come up with a proposal to do the same thing here. Rather than having 1 "global" preserve list that is shared across multiple package and across compilation units within the same package, we propose that we go back to the original implementation of the preserve list (before it was changed here #92), which creates one preserve list per compilation unit. The crux of the implementation would look like

  static SEXP get_preserve_list() {
    static SEXP out = R_NilValue;
    if (out == R_NilValue) {
      out = new_preserve_list();
      R_PreserveObject(out);
    }
    return out;
  }

We think this should get rid of the shared state problem altogether:

  • We no longer have to try and store a global preserve list somewhere. The static variable means that the pointer to it will persist across accesses, and R_PreserveObject() handles the protection for us. This means we no longer have to deal with storing external pointers as global options (which is the only thing we could use due to needing to avoid the option being copyable and needing to avoid a stack overflow when serializing the protection list). It also means a user can't bork it accidentally.
  • There would be minimal memory increase. We estimate it at sizeof(empty preserve list) * number of compilation units, which seems reasonable.
  • It fixes the duckdb / arrow issue because duckdb can continue using their vendored version (for now) and it won't conflict with arrow if it gets built with this new cpp11 version since each arrow compilation unit has its own preserve list.
  • I tested the reprex in this issue Understanding memory usage with nest/spread/gather: removing object still leaves used memory? tidyverse/tidyr#1059 which was supposedly fixed by switching to a global preserve list, but it does not seem to reproduce in my tests when switching back to a one-list-per-compilation-unit approach.

Two very minor downsides:

  • release_all() is a helper that was intended for R < 3.5.0 that would release all cpp11 objects from the preserve list. This would no longer work as expected since it would only release objects from the list managed by that compilation unit. But we think that is okay because we don't think this function is particularly safe anyways - you could release an object that you don't own if the preserve list is a global one! A github search makes us think no one is using it, and the tidyverse / r-lib orgs have moved on to require R >= 3.6.0 at this point, so I think we can set R (>= 3.5.0) for cpp11 and remove this function safely. It seems like release_all() was one of the motivating reasons for a global preserve list according to Preserved list refactor #92, so if we remove that then that is even more reason to consider alternative approaches.
  • If you are developing, say, vroom, and have some vroom R objects (which may be ALTREP and may rely on objects in the preserve list existing) lying around and rerun load_all(), then that will reload the C++ internals resulting in new preserve lists being created. We think this is okay, as the old preserve lists will have been protected with R_PreserveObject() so the R vroom objects should continue to work, but their memory will not be released until R restarts. This seems okay since it would be a very rare developer only issue.

Some other related issues:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant