GH-35649: [R] Always call RecordBatchReader::ReadNext() from DuckDB from the main R thread
#36307
Thank you very much @paleolimbot for looking and fixing the issue! Really love the Arrow community!
This all looks good, so will approve shortly, but @paleolimbot, mainly for the sake of my own understanding, would you mind expanding on why routing calls to |
It's a good thing for all of our sanity!
arrow/r/src/safe-call-into-r-impl.cpp Lines 62 to 76 in 8b4a548
The main difference with actual code is that the call to If |
Thanks for making these changes!
Conbench analyzed the 6 benchmark runs on commit There were 9 benchmark results indicating a performance regression:
The full Conbench report has more details.
Rationale for this change
When a DuckDB result whose input was an Arrow dataset is passed to Arrow via to_arrow(), some DuckDB operations can call R code from other threads. This caused a crash or hang on Linux when attempting to combine pivot_longer() and write_dataset().
What changes are included in this PR?
RecordBatchReader that routes calls to ReadNext() through SafeCallIntoR().
Are these changes tested?
I can't find a new case that isn't covered by our existing tests, although I did remove a skip that was causing a similar problem at one point (#33033). Because it's difficult to predict/test where duckdb evaluates R code, it's hard to know exactly what to test here (I would have expected R code to be evaluated/a crash to occur with many of our existing tests, but even the pivot_longer() example does not crash on macOS and Windows 🤷).
I did verify on Ubuntu 22.04 that the reprex kindly provided by @PMassicotte errors before this PR and does not error after this PR:
Are there any user-facing changes?
There are no user-facing changes.
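For readers unfamiliar with the pattern, the idea of routing ReadNext() through the main thread can be sketched with the C++ standard library alone. This is not the code in this PR and not how SafeCallIntoR() is implemented: MainThreadQueue, MainThreadOnlyReader, RoutedReader, and DrainOnWorker are hypothetical names invented for illustration, and integers stand in for record batches. The sketch only shows the shape of the technique, assuming a main thread that is free to run queued tasks while a worker waits.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for SafeCallIntoR(): hands work to the main thread.
class MainThreadQueue {
 public:
  explicit MainThreadQueue(std::thread::id main_id) : main_id_(main_id) {}

  // Run fn on the main thread and return its result to the calling thread.
  template <typename T>
  T Call(std::function<T()> fn) {
    if (std::this_thread::get_id() == main_id_) return fn();  // already there
    auto task = std::make_shared<std::packaged_task<T()>>(std::move(fn));
    std::future<T> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result.get();  // block the caller until the main thread has run fn
  }

  // Called on the main thread: run queued tasks until Stop() has been called
  // and the queue is empty.
  void RunUntilStopped() {
    std::unique_lock<std::mutex> lock(mutex_);
    while (true) {
      cv_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
      if (tasks_.empty()) return;
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop();
      lock.unlock();
      task();
      lock.lock();
    }
  }

  void Stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopped_ = true;
    }
    cv_.notify_one();
  }

 private:
  std::thread::id main_id_;
  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> tasks_;
  bool stopped_ = false;
};

// Hypothetical reader that, like one backed by R code, is only safe to call
// from the main thread. It yields 0, 1, ..., n - 1 and then nothing.
class MainThreadOnlyReader {
 public:
  MainThreadOnlyReader(int n, std::thread::id main_id)
      : n_(n), main_id_(main_id) {}

  std::optional<int> ReadNext() {
    assert(std::this_thread::get_id() == main_id_);
    if (next_ >= n_) return std::nullopt;
    return next_++;
  }

 private:
  int n_;
  int next_ = 0;
  std::thread::id main_id_;
};

// Wrapper with the same ReadNext() shape that forwards every call through
// the queue, so the inner reader never runs on a worker thread.
class RoutedReader {
 public:
  RoutedReader(MainThreadOnlyReader* inner, MainThreadQueue* queue)
      : inner_(inner), queue_(queue) {}

  std::optional<int> ReadNext() {
    return queue_->Call<std::optional<int>>(
        [this] { return inner_->ReadNext(); });
  }

 private:
  MainThreadOnlyReader* inner_;
  MainThreadQueue* queue_;
};

// Drain the routed reader from a worker thread while the calling ("main")
// thread services the queue.
std::vector<int> DrainOnWorker(int n) {
  const std::thread::id main_id = std::this_thread::get_id();
  MainThreadQueue queue(main_id);
  MainThreadOnlyReader inner(n, main_id);
  RoutedReader reader(&inner, &queue);
  std::vector<int> values;
  std::thread worker([&] {
    while (auto v = reader.ReadNext()) values.push_back(*v);
    queue.Stop();
  });
  queue.RunUntilStopped();
  worker.join();
  return values;
}
```

In this sketch the worker plays the role of a DuckDB thread: it only ever touches RoutedReader, and the assertion inside MainThreadOnlyReader::ReadNext() would fire if a call reached the inner reader from any thread other than the one that called DrainOnWorker().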