Returning error messages to R #278
Okay it looks like we have an existing wrapper which is extendr/extendr-api/src/thread_safety.rs Lines 91 to 98 in 2103979
I note there's also an |
This sounds like something to look at. Would you like to try a PR or would you like one of us to give it a try? |
If it's as simple as implementing a simple Rust wrapper like this for each of those APIs I'm happy to do it. By the way, we have this wrapper for returning errors, but afaik none of the |
Notes for future implementation:
|
We did look at handling Result types in wrappers. It may still do so. |
The thing that worries me is this: extendr/extendr-api/src/robj/into_robj.rs Lines 37 to 45 in 7e9cac3
So if our Rust function returns a result, and that result is an error, then it will just get unwrapped in Rust. I guess the error message will get printed to the terminal maybe (?) but I suspect this isn't very user friendly. Also will this cause existing Rust objects in R to break, since I assume the memory will be reclaimed when the library crashes? |
This only works because we catch panics and convert them to R errors. A nicer way would be to have an R class "RustError" or similar. The jury is still out on this, but we could put it behind a switch. |
Huh, I didn't even know you could catch panics. Well yes I would find it infinitely cleaner to just go Rust Error → R Error and bypass the use of panic entirely. Is there some kind of way to vote on this issue? |
The magic is here.
https://github.com/extendr/extendr/blob/1bdd03f08c14a781ba7b3d0ae628b10cf2425b95/extendr-macros/src/wrappers.rs#L106
*handle_panic* uses *catch_unwind* to do the conversion.
In theory, the unwind should free any allocated variables,
especially the formal arguments which are reference counted
*Robj*s.
You are probably not alone in wanting "straight to R error",
but it is very difficult to catch R errors without the danger of
memory leaks.
*make_function_wrappers* has an *options* parameter
that can be driven from the *#[extendr(param=value)]* attribute.
This would allow some customisation.
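A minimal sketch of the catch_unwind conversion described above, using only the standard library. `panic_message` and `catch_as_error` are hypothetical names for illustration; this is not extendr's actual `handle_panic`, which goes on to raise an R error rather than return a `Result`.

```rust
use std::any::Any;
use std::panic::{self, UnwindSafe};

/// Extract a readable message from a panic payload.
/// A literal message usually arrives as `&'static str`,
/// a formatted one as `String`.
fn panic_message(payload: &(dyn Any + Send)) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "unknown panic".to_string()
    }
}

/// Run `f` and turn a panic into `Err(message)` instead of letting it
/// unwind further. Locals of `f` are dropped during the unwind.
fn catch_as_error<F, R>(f: F) -> Result<R, String>
where
    F: FnOnce() -> R + UnwindSafe,
{
    panic::catch_unwind(f).map_err(|payload| panic_message(&*payload))
}
```

The default panic hook still prints to stderr here; silencing it would need `std::panic::set_hook`, as in the attempt later in this thread.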
|
Okay so in summary what I'm hearing is that the correct way to raise an error using extendr is simply to return a |
To be clear: I don't think we'll change the part where you return a |
Sorry yes that's what I meant: that the unwrapping panic part may change but that returning a |
Sorry guys, I don't follow. Let's say we made a Rust function Now I can't understand the reason that we can't return the correct message to R because (To be clear, what I want is R calls a Rust function, which may throw Rust errors. The errors are not in R's stderr stream. Thus, it looks strange. Thanks. |
In C++ you can call We did discuss having an option to return an object with a "rust.error" class or similar. Something like
This was not popular with the team and got pushed back. |
I still don't think we've addressed my suggestion above which is to just call the R API's throw error function, because I believe unlike the other methods, it will behave how R users expect and be catchable using |
We do have a function for throwing R errors. This is very dangerous, however In the wrapper, we could assume that objects have been dropped, but we would If the error supports Display (a fair assumption) it should be possible
|
Exactly, that's why I pushed back against it. Since R uses longjumps on error, I think the Rust code needs to somehow emulate this also. Would it help to write some C intermediary that wraps everything, so that the Rust code can fully shut down and return to C, and then the C code calls the longjump? |
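One way to picture the "return to C first" idea is a shim that copies the error message into a buffer owned by the caller and returns a status code, so that no Rust value is alive when the C side decides to raise the error (for example by calling `Rf_error` on the buffer). This is only a sketch under that assumption; `fallible` and `fallible_shim` are hypothetical names and nothing here is part of extendr.

```rust
use std::os::raw::c_char;

/// Stand-in for user code that may fail (hypothetical).
fn fallible(input: i32) -> Result<i32, String> {
    let _scratch = vec![0u8; 16]; // dropped on every return path
    if input < 0 {
        Err(format!("negative input: {}", input))
    } else {
        Ok(input * 2)
    }
}

/// C-facing shim: returns 0 on success and 1 on error.
/// On error the message is copied into `buf`, truncated to fit and
/// NUL-terminated; on success the result is written to `out`.
///
/// Safety: `out` must be valid for writes and `buf` must be valid for
/// `cap` bytes.
pub unsafe extern "C" fn fallible_shim(
    input: i32,
    out: *mut i32,
    buf: *mut c_char,
    cap: usize,
) -> i32 {
    match fallible(input) {
        Ok(v) => {
            unsafe { *out = v };
            0
        }
        Err(msg) => {
            if cap > 0 {
                let n = msg.len().min(cap - 1);
                unsafe {
                    std::ptr::copy_nonoverlapping(msg.as_ptr() as *const c_char, buf, n);
                    *buf.add(n) = 0; // terminator the C side relies on
                }
            }
            // `msg` is dropped at the end of this arm, before control
            // returns to the C caller.
            1
        }
    }
}
```

The trade-off is a fixed-size buffer (long messages get truncated, possibly in the middle of a UTF-8 sequence) in exchange for never jumping over a live Rust frame.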
Using a wrapper sounds good to me (If I remember correctly, it was what @shrektan suggested on Discord). But I don't get why we need the annotation. Is there anything ambiguous without that annotation switch? |
Sorry, to clarify I mean: add a panic hook (like we already have), but in that hook we call the |
I was just trying in that direction. Unlike with normal errors, One disadvantage of this is probably we cannot add custom classes to the error, because
pub fn handle_panic<F, R>(err_str: &str, f: F) -> R
where
    F: FnOnce() -> R,
    F: std::panic::UnwindSafe,
{
    // Suppress output from the default hook (cf. https://stackoverflow.com/a/59211505)
    let prev_hook = std::panic::take_hook();
    std::panic::set_hook(Box::new(|_| {}));
    let result = match std::panic::catch_unwind(f) {
        Ok(res) => res,
        Err(e) => {
            let err_str = if let Some(e) = e.downcast_ref::<&str>() {
                e.as_ptr()
            } else {
                err_str.as_ptr()
            };
            unsafe {
                libR_sys::Rf_error(err_str as *const std::os::raw::c_char);
            }
            unreachable!("handle_panic unreachable")
        }
    };
    // Restore the previous hook
    std::panic::set_hook(prev_hook);
    result
}
This almost worked, but I don't know why I get this extra "src/lib.rs" string...
#[extendr]
fn hello_world() -> &'static str {
    panic!("Hello world!");
}

hello_world()
#> Error in hello_world() : Hello world!src\lib.rs |
Just a guess, is it because the string pointer you're creating is to a string without a null terminator, so it's reading off the end of the string onto another one? Can you convert to a |
Ah, it might be. Thanks. |
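If the null-terminator guess above is right, the usual remedy is to copy the message into a `std::ffi::CString`, which appends the terminator. A small sketch follows; `to_c_message` and `read_like_c` are hypothetical helper names, and the second one only imitates how a C callee would read the pointer.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Copy a Rust message into an owned, NUL-terminated buffer.
/// Interior NUL bytes are stripped first, because `CString::new`
/// rejects them.
fn to_c_message(msg: &str) -> CString {
    let cleaned: Vec<u8> = msg.bytes().filter(|b| *b != 0).collect();
    CString::new(cleaned).expect("interior NULs were removed above")
}

/// Read a C string the way a C callee would: up to the first NUL byte.
///
/// Safety: `ptr` must point to a valid NUL-terminated string.
unsafe fn read_like_c(ptr: *const c_char) -> String {
    unsafe { CStr::from_ptr(ptr) }.to_string_lossy().into_owned()
}
```

A `&str` obtained from a panic payload carries its length separately and has no terminator, so passing `as_ptr()` straight to a C function lets it read into whatever bytes follow in memory.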
It is worth noting that a CString will leak memory as nothing will deallocate it. This may not be a problem for occasional errors - I'm pretty certain that R also will leak |
Anyway, I agree I was too naive about allocation during handling panic. I also notice I cannot pass a non-constant string to |
I did some simple investigation but don't have time to look further yet. The cargo package does this by using a macro to wrap the Rust code and unwind the stack when panicked, then converting the message into an R error. @yutannihilation You may find the following code helpful. |
Would it be possible to do this, but let the autogenerated R wrappers raise the error?
It could be something like:
This adds an overhead for every function call, we could provide a way for users to disable this when they are sure no error is going to be raised. |
I think it's also a possible solution. So, we have plenty of options... (not sure if they all are really feasible, though) I have no idea how we can decide.
|
I thought @yutannihilation The other dimension is how we return it to R, which is an orthogonal question to where we handle it. We've had suggestions of :
|
No, I don't think so. We do want to capture all sorts of panics, but it's to protect the user's R session from crashing, not to handle it nicely. What we are discussing here is how to handle |
After some thinking, I'd vote for the R wrapper. The Rust wrapper can simply return an error condition object. Here's my attempt (I just tweaked the master...yutannihilation:poc/raise-error-in-R-wrapper While this is also possible in C code, I think it's easier to throw the error nicely, with metadata, from R code. |
That's a lot of performance overhead though if every function that dispatches to Rust has this R-side error handling. |
Yeah, it seems |
I would like again to point out that handling errors on the R side can be an opt-in, determined by the return type. If it is a
Benchmark results
rextendr::rust_source(
code = "
use extendr_api::prelude::*;
#[extendr]
fn get_with_result_impl(x : Doubles, y : Doubles) -> List {
let res : Rfloat =
x.iter().zip(y.iter())
.map(|(xx, yy)| xx + yy)
.sum();
list!(res = res)
}
#[extendr]
fn get_without_result(x : Doubles, y : Doubles) -> Rfloat {
x.iter().zip(y.iter())
.map(|(xx, yy)| xx + yy)
.sum()
}
",
patch.crates_io = list(
`extendr-api` = list(git = "https://github.com/extendr/extendr"),
`libR-sys` = list(git = "https://github.com/extendr/libR-sys")
)
)
#> i build directory: 'C:/Users/[redacted]/AppData/Local/Temp/RtmpQdCHWc/file27cc48be2316'
#> v Writing 'C:/Users/[redacted]/AppData/Local/Temp/RtmpQdCHWc/file27cc48be2316/target/extendr_wrappers.R'.
get_with_result <- function(x, y) {
res <- .Call("wrap__get_with_result_impl", x, y, PACKAGE = "rextendr1")
if (!is.null(res$err)) {
stop(res$err)
}
res$res
}
get_with_result_rev <- function(x, y) {
res <- .Call("wrap__get_with_result_impl", x, y, PACKAGE = "rextendr1")
if (!is.null(res$res)) {
return(res$res)
}
stop(res$err)
}
test_list_access <- function(x) {
if (!is.null(x$err)) {
stop(x$err)
}
x$res
}
test_return_result <- function(x) x
bench::press(
n = c(1, 1e2, 1e4, 1e6, 1e7),
{
x <- rnorm(n)
y <- rnorm(n)
reference <- sum(x + y)
dummy <- list(res = reference)
bench::mark(
test_return_result(reference),
test_list_access(dummy),
test_list_access(list(result = reference)),
get_without_result(x, y),
get_with_result(x, y),
get_with_result_rev(x, y),
)
}
) |> print(n = 100)
#> Running with:
#> n
#> 1 1
#> 2 100
#> 3 10000
#> 4 1000000
#> 5 10000000
#> # A tibble: 30 x 14
#> expression n min median `itr/sec`
#> <bch:expr> <dbl> <bch:t> <bch:t> <dbl>
#> 1 test_return_result(reference) 1 100ns 200ns 4.33e6
#> 2 test_list_access(dummy) 1 200ns 300ns 2.82e6
#> 3 test_list_access(list(result = reference)) 1 400ns 500ns 1.87e6
#> 4 get_without_result(x, y) 1 5.1us 5.5us 1.78e5
#> 5 get_with_result(x, y) 1 10.1us 10.8us 8.93e4
#> 6 get_with_result_rev(x, y) 1 9.9us 10.8us 9.13e4
#> 7 test_return_result(reference) 100 100ns 200ns 5.16e6
#> 8 test_list_access(dummy) 100 200ns 300ns 2.73e6
#> 9 test_list_access(list(result = reference)) 100 400ns 500ns 1.93e6
#> 10 get_without_result(x, y) 100 13.2us 13.9us 7.13e4
#> 11 get_with_result(x, y) 100 18.2us 19.6us 5.06e4
#> 12 get_with_result_rev(x, y) 100 18.1us 19.2us 5.17e4
#> 13 test_return_result(reference) 10000 100ns 200ns 5.03e6
#> 14 test_list_access(dummy) 10000 200ns 300ns 2.83e6
#> 15 test_list_access(list(result = reference)) 10000 400ns 500ns 1.88e6
#> 16 get_without_result(x, y) 10000 737.5us 757.5us 1.31e3
#> 17 get_with_result(x, y) 10000 747.2us 769.5us 1.30e3
#> 18 get_with_result_rev(x, y) 10000 740.8us 762.5us 1.31e3
#> 19 test_return_result(reference) 1000000 100ns 200ns 5.27e6
#> 20 test_list_access(dummy) 1000000 200ns 300ns 2.75e6
#> 21 test_list_access(list(result = reference)) 1000000 400ns 500ns 1.96e6
#> 22 get_without_result(x, y) 1000000 74.6ms 75ms 1.33e1
#> 23 get_with_result(x, y) 1000000 74.8ms 75ms 1.33e1
#> 24 get_with_result_rev(x, y) 1000000 74.9ms 75ms 1.33e1
#> 25 test_return_result(reference) 10000000 100ns 200ns 5.36e6
#> 26 test_list_access(dummy) 10000000 200ns 300ns 2.81e6
#> 27 test_list_access(list(result = reference)) 10000000 400ns 500ns 1.99e6
#> 28 get_without_result(x, y) 10000000 760.4ms 760.4ms 1.32e0
#> 29 get_with_result(x, y) 10000000 753.3ms 753.3ms 1.33e0
#> 30 get_with_result_rev(x, y) 10000000 748.7ms 748.7ms 1.34e0
#> # ... with 9 more variables: mem_alloc <bch:byt>, gc/sec <dbl>, n_itr <int>,
#> # n_gc <dbl>, total_time <bch:tm>, result <list>, memory <list>, time <list>,
#> #   gc <list>
Created on 2022-01-31 by the reprex package (v2.0.1) |
UPD: An interesting observation. It seems the fastest way (based on the number of iterations instead of measured time/ops) is to check
Benchmark results
stop_in_if <- function(res) {
if (!is.null(res$err)) {
stop(res$err)
}
res$res
}
return_in_if <- function(res) {
if (!is.null(res$res)) {
return(res$res)
}
stop(res$err)
}
cache_res <- function(res) {
tmp <- res$res
if (!is.null(tmp)) {
return(tmp)
}
stop(res$err)
}
check_err_and_return <- function(res) {
if (is.null(res$err)) {
return(res$res)
}
stop(res$err)
}
x <- rnorm(1e6)
y <- rnorm(1e6)
reference <- sum(x + y)
dummy <- list(res = reference)
bench::mark(
stop_in_if(dummy),
return_in_if(dummy),
cache_res(dummy),
check_err_and_return(dummy),
max_iterations = 1e8
) |> print(n = 100)
#> # A tibble: 4 x 13
#> expression min median `itr/sec` mem_alloc `gc/sec` n_itr
#> <bch:expr> <bch:> <bch:t> <dbl> <bch:byt> <dbl> <int>
#> 1 stop_in_if(dummy) 200ns 300ns 3083301. 0B 28.2 1.09e6
#> 2 return_in_if(dummy) 200ns 300ns 2970877. 0B 29.3 1.11e6
#> 3 cache_res(dummy) 200ns 300ns 2780485. 0B 31.6 1.06e6
#> 4 check_err_and_return(dummy) 200ns 300ns 3313410. 0B 27.4 1.21e6
#> # ... with 6 more variables: n_gc <dbl>, total_time <bch:tm>, result <list>,
#> #   memory <list>, time <list>, gc <list>
Created on 2022-01-31 by the reprex package (v2.0.1) |
I feel the need to reinforce that returning a Further, all of the condition-handling suggested by Hadley that is common in R would become irrelevant as well, because we would not be throwing errors at all. Users will expect I realise that a Rust package author could solve this in their R wrapper layer, but I feel that it's better to just produce idiomatic R in the first place, so that an extra layer is not essential. |
Just to be clear, if you are talking about the interface, the wrapper never expose it to the user. It just returns |
Right, so the Rust → R interface returns a |
I'll summarise (again 😄) the discussion so far. Please correct me if I'm wrong:
|
I'll raise the same question, if it's not about avoiding the "very little memory leak". Adding a "condition class" to the error message, from my personal experience, is not useful. I think Rcpp and cpp11 call the plain R error message, too. |
Can we not just allocate a string vector in R memory and then strcpy the Rust string into the vector, unprotect it so that the GC can handle it, and then let Rust free the Rust string memory? Some Rust equivalent of: SEXP err = PROTECT(allocVector(STRSXP, 1));
SET_STRING_ELT(err, 0, mkChar(our_error_message) );
UNPROTECT(1); |
Thanks for the core question. I too thought this is a problem of memory leak, which is not considered as "unsafe" in Rust, but it seems the problem is the undefined behavior on cross-language unwinding.
Sorry, I think I'm running out of time here as my current focus is graphics-related things... I hope Andy will give some vision on this eventually! |
EDIT: Yikes. A little testing showed this not to be a good idea. Anything not dropped before was leaked when calling throw_r_error()
great thread :) just to clarify? For an average extendr-user who just barely can wait for the new cool error handling :) , how much would one leak by doing the following until then?
A concrete example could be some method which would return a Result, but the current unwrap-panic! is not desirable. Instead here the final Result is unwrapped with Is it only one C-string or could it potentially be much more memory? |
I suggest the following pattern (git repo) to use opt-in pros
cons
*assuming error implements Rust code
R code
|
We had this discussion multiple times, and there is no consensus. There are arguments in favor of different implementations. Again, many ways to solve this problem, feel free to make an attempt at PR. |
Yeah, many ways. I think of it as an iron triangle of Speed, Safety and 'Good' error handling. Unless the C-wizards :) come up with a really nice have-it-all solution, it will be a compromise. The current default behavior, "unwrap->panic->unwind", favors Speed+Safety and might actually be a fine default. I suggest adding 2 or 3 handy generic functions and an explanation in the learning materials of how the extendr_api-user can opt in to other mixes of speed-safety-goodError. list(ok,err) favors Safety+GoodError. Your mentioned simpleCondition-option has Safety+GoodError and retains speed on the happy path by avoiding the list allocation. The generic function would then look like this
usage in rust could look like
and on R side the extendr_api-user would write a function like
I will try it out for some weeks and make a PR if I still like it :) |
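For readers following along, the list(ok, err) shape mentioned above can be sketched in plain Rust. This is not the code from the comment (which did not survive on this page); `OkErr` and `result_to_ok_err` are hypothetical names, and the struct merely stands in for an R list so the example runs without R. It assumes the error type implements `Display`.

```rust
use std::fmt::Display;

/// Plain-Rust stand-in for an R `list(ok = ..., err = ...)` (hypothetical).
/// Exactly one of the two fields is `Some`.
#[derive(Debug, PartialEq)]
struct OkErr<T> {
    ok: Option<T>,
    err: Option<String>,
}

/// Convert any `Result` whose error implements `Display` into the
/// two-slot shape, so the R wrapper can check `err` and call `stop()`.
fn result_to_ok_err<T, E: Display>(res: Result<T, E>) -> OkErr<T> {
    match res {
        Ok(v) => OkErr { ok: Some(v), err: None },
        Err(e) => OkErr { ok: None, err: Some(e.to_string()) },
    }
}
```

The extra allocation for the container on every call is the cost that the benchmarks earlier in the thread try to measure.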
The end goal would be to update the generated code as well. If a Rust fn returns a result of a supported shape, process it correctly and send it to R. On the R side, generate a wrapper that not only calls into the native lib but also throws an error if it recognizes that the returned value is indeed an error. Then, from the user's perspective, it will be super smooth. |
How about extending the #[extendr] macro to take an optional argument which decides the behavior. The default behavior can be "smooth for end-users who just wanna try writing a few functions". "Need for speed"-users and "error control"-people can just opt for other behaviors.
|
I am not sure the flag is needed, you can communicate it with a return type. If it is something like |
Let me share my proof-of-concept implementation using a tagged pointer, which is in the direction of #278 (comment). I have no intention (and bandwidth) to promote this, at least at the moment, but hope this helps the discussion here somehow. Rust code: https://github.com/yutannihilation/unextendr/blob/ec88a43411a54a720b4071b7c0adbfe597fda18e/src/rust/src/lib.rs#L33-L77 (updated after Ilia's comment) The main advantage is that the additional cost is only one bitwise operation, but I'm not sure how portable this is... |
What do you achieve with |
Aw, it was my mistake, sorry. The intention was to ensure the error flag is unset (so I should write it as |
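The tagged-pointer idea can be illustrated without R. The sketch below is not the linked proof of concept; it only assumes that the pointee is aligned to at least 2 bytes, so the lowest address bit is always zero and can carry an error flag. `ERROR_FLAG`, `tag_error`, `is_error` and `untag` are hypothetical names.

```rust
/// Lowest address bit, free to use when the pointee alignment is >= 2.
const ERROR_FLAG: usize = 1;

/// Mark an address as "this is an error".
fn tag_error(addr: usize) -> usize {
    addr | ERROR_FLAG
}

/// The check on the receiving side: a single bitwise AND.
fn is_error(tagged: usize) -> bool {
    (tagged & ERROR_FLAG) != 0
}

/// Clear the flag to recover the original address before dereferencing.
fn untag(tagged: usize) -> usize {
    tagged & !ERROR_FLAG
}
```

The receiving side must always call `untag` before using the address, whether or not the flag was set; whether R's `SEXP` pointers can be treated this way on every platform is exactly the portability question raised above.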
My toy framework has now reached a usable state. For discussing error handling, it might be a good example to play with. I have no idea how I can fit the implementation into extendr, but feedback is welcome. How to create a package: https://github.com/yutannihilation/savvy#getting-started
#[savvy]
fn raise_error() -> savvy::Result<savvy::SEXP> {
    Err(savvy::Error::new("This is my custom error"))
}

raise_error()
#> Error: This is my custom error |
Is there an easy way to return strings from Rust that will become errors, warnings or messages in R? If I remember right, simply returning an Error will just panic since it just gets unwrapped.