Improved error handling in the C API #90
Comments
I am having a hard time imagining how the first approach would work in the higher-level languages like Julia. Right now, it's very easy to use with the built-in threading in e.g. Julia, like we do in our tests: bridgestan/julia/test/model_tests.jl Lines 477 to 492 in f07232a
But it seems like it would be much more difficult to express in the
The second approach is one we considered, but never implemented. As far as my memory and searching have turned up, there wasn't a strong reason we didn't; it just never got prioritized. Freeing after the fact is a minor annoyance, but something we can nicely wrap up in our higher-level interfaces, I suppose. So I think my vote is for that style, but it's hard to be super sure without seeing the consequences in code. |
Thanks, @aseyboldt. I think this is a really important issue to get right. And I can see that we've messed this up pretty badly in the initial implementation by assuming everything would just print. There are two places where "prints" might happen.
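@aseyboldt: Why were you suggesting char** arguments rather than streams? Is it hard to take in a C stream and use it in C++? Streams just make the code a lot simpler if you are going to write to a file or handle like stderr or stdout, and they don't need to be freed like character pointers. From a Python interface, I would strongly prefer to have the exceptions percolate through to Python exceptions and then let the clients handle them. Presumably that would also make sense in Julia and R. For interfaces like C where that's not possible, I think we want to take in an output stream. |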
I've started a branch https://github.com/roualdes/bridgestan/tree/feature/return-error-messages/ where I'm experimenting with "method 2" here. It seems reasonable based on what I've done so far. I will keep looking into this next week. Either option here would be a breaking change for the C level API, but I personally don't mind having a version bump for this |
I know pretty much nothing about Julia, but the simplest implementation for option 1 would probably look something like this:

    model = load_test_model("multi")
    nt = Threads.nthreads()
    R = 1000
    ld = Vector{Bool}(undef, R)
    g = Vector{Bool}(undef, R)
    @sync for it = 1:nt
        Threads.@spawn for r = it:nt:R
            ctx = BridgeStan.context(model, seed, it)
            x = randn(BridgeStan.param_num(ctx))
            (lp, grad) = BridgeStan.log_density_gradient(ctx, x)
        end
    end

Ideally, I guess splitting the
The second option is a bit simpler, but the first one might be easier to extend if that were required for some reason. For instance, let's assume at some later point there was a reason to expose more structured error data. That could be done by simply adding an accessor function for that additional error data. That would be backward compatible; old code just would never call that function. |
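To make the accessor idea for option 1 concrete, here is a minimal C++ sketch of a shared, read-only model plus a per-thread context that remembers its last error. Everything in it is an assumption for illustration: the struct layouts, the toy density, and the functions bs_ctx_new, bs_ctx_free, bs_ctx_log_density and bs_ctx_last_error are made up and are not part of BridgeStan's API. Only the names bs_model and bs_ctx come from the discussion.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical shared part: read-only after construction, so several
// threads can hold a const pointer to it at the same time.
struct bs_model {
  double scale;
};

// Hypothetical per-thread part: owns the most recent error message.
struct bs_ctx {
  const bs_model* model;
  std::string last_error;
};

extern "C" {

// Creates a context that borrows the model; cheap, one per thread.
bs_ctx* bs_ctx_new(const bs_model* model) {
  return new bs_ctx{model, std::string()};
}

void bs_ctx_free(bs_ctx* ctx) { delete ctx; }

// Accessor for the last error; the pointer stays valid until the next
// call on the same context. Returns an empty string if there was none.
const char* bs_ctx_last_error(const bs_ctx* ctx) {
  return ctx->last_error.c_str();
}

// Toy density standing in for the real model code. Returns 0 on
// success and -1 on failure, storing the message in the context.
int bs_ctx_log_density(bs_ctx* ctx, double theta, double* out) {
  try {
    if (ctx->model->scale <= 0.0) {
      throw std::domain_error("scale must be positive");
    }
    const double z = theta / ctx->model->scale;
    *out = -0.5 * z * z;
    ctx->last_error.clear();
    return 0;
  } catch (const std::exception& e) {
    ctx->last_error = e.what();
    return -1;
  }
}

}  // extern "C"
```

In a layout like this, exposing more structured error data later would mean adding another field to the context and another accessor next to bs_ctx_last_error, which existing callers could simply ignore.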
@bob-carpenter I didn't know there were additional prints in the Stan code... That's certainly something to consider. By stream you mean a file descriptor on the C side? I don't work with C++ much, but I think the stream construct there doesn't really exist in C, does it? If that's the case, I'm not sure it really is easier. If things go bad, we might even end up with a deadlock, if the process that calls |
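C-style streams (I assume you mean FILE* from stdio.h) seem like an equally ugly API boundary to have in something as a char**. The only resources I can find about how to use them from Python all involve creating temporary files or platform-specific named pipes, which is much worse than just creating a char** |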
@aseyboldt yeah, that snippet is about what I imagined for how the Julia code would need to work for option 1. I think that it's worth noting that this makes BridgeStan strictly less usable by third-party code than a version in which log_density is thread safe "out of the box". By contrast, constraining the parameters is 1) less likely to be done as part of a black-box API (it's easy to do as a post-processing step) and 2) not a performance bottleneck, so less likely to be done in parallel at all. Moving thread safety (particularly of the log_density functions) from something which happens on the inside of the call to something which happens outside the call seems like too big a price to pay for me. Aside: If we did care about thread safety of |
I haven't done anything, seeing as I have no clue who all of you are.
…On Friday, March 24, 2023, Brian Ward ***@***.***> wrote:
Why were you suggesting char** arguments rather than streams? Is it hard
to take in a C stream and use it in C++? Streams just make the code a lot
simpler if you are going to write to a file or handle like stderr or stdout
and they don't need to be freed like character pointers.
From a Python interface, I would strongly prefer to have the exceptions
percolate through to Python exceptions and then let the clients handle
them. Presumably that would also make sense in Julia and R. For interfaces
like C where that's not possible, I think we want to take in an output
stream.
C-style streams (I assume you mean FILE* from stdio.h) seem like an
equally ugly API boundary to have in something as a char**. The only
resources I can find about how to use them from Python all involve creating
temporary files or platform-specific named pipes, which is much worse than
just creating a char**
|
@WardBrian If the rng part is gone from the |
The difference is indeed smaller in that case, but I don’t think it’s purely cosmetic. The ctx still requires a malloc to create and adds a pointer indirection to each call. Plus, it simply doubles the number of objects in our API a user needs to care about. I think that’s a big jump. I am happy to be outvoted on this particular point, but I am in general more in favor of keeping the API as simple as we can, even if that means we may need to do major version bumps in the future, rather than adding complexity we may not ever need. I think a How to handle things like |
I would prefer not to have a
It exists, but in a stripped down C interface with strange nomenclature for historical reasons. Here's an overview: https://stackoverflow.com/questions/38652953/what-does-stream-mean-in-c With streams, the API punts on freeing memory. And there is no worry on the client side about freeing memory. Ideally, we'd just be able to pipe the C++ output stream output directly to the C stream. I think we'd need something like this (from the very reliable co-author of the standard texts on all this, including templates): http://www.josuttis.com/libbook/io/outbuf2.hpp.html If I understand that code, it creates a C++ output stream that wraps a file descriptor (C is confusing in that all streams are file descriptors): https://www.gnu.org/software/libc/manual/html_node/Streams.html So we can take it as a file descriptor argument in the C interface, then convert it to a C++ stream on the back end. |
That would be nice, but I think creating a Even if we could pass something like Python's |
That's unfortunate. With Is there some other way to encapsulate those strings so they don't leak into the client interface? I'm not an expert in either Python or C, but this looks like one possible tack: |
Went and had a discussion with @WardBrian about how the interfaces don't even use streams to back up their I/O interface, so we can't interface at that level. He also let me know that the handling of memory will be encapsulated so that the Python interface encapsulates access to the character strings so that clients won't need to worry about it. |
I made a couple other issues to keep different threads of discussion separate:
The |
There was some discussion about error handling in #88, but I thought a separate issue would be a better place to discuss this.
Currently, the error messages of the C++ exceptions are printed to stderr, which works fine, but has the disadvantage that many users might not know to look there, and depending on the setup it might not even be easy to find them (for example in a Jupyter notebook that is started from JupyterHub or from a system service). Also, if several models are failing at the same time, a user would have to figure out manually which error belongs to which model.
The two (I think) most important kinds of failures would be

- bs_construct: Users will, I think, provide invalid data fairly often, and the error message should contain information about what part of the data was invalid.
- bs_log_density_gradient (or, for different sampling / VI algorithms, the other log_density functions): Error messages often contain information about why there was a divergence during sampling, and sampling algorithms might want to collect this information to help model debugging.

I can see a couple of different options for how this could be achieved; my favorites would probably be these:
1. Store the error message in the bs_model_rng struct as an extra field, and provide an accessor function for it. This unfortunately makes the bs_log_density_gradient function family no longer thread safe. We can work around that issue, however, by creating new bs_model_rng structs for each thread. With the current API this would require loading the data several times. This, again, can be avoided by splitting bs_model_rng into two separate types: one for the dataset and the potential error message from model creation, and a separate context for the rng and error messages during sampling.

   This would then look like this:

   The additional function declarations:

   In this scenario bs_model would be Sync (functions that take a const pointer to bs_model can be called concurrently from different threads), but not Send (ownership of bs_model cannot be transferred to a different thread, i.e. it has to be deallocated from the same thread where it was created. Is that actually a requirement?). bs_ctx doesn't need to be Sync or Send, as every thread can just create a separate context (which should be super cheap).

2. Add char **error_msg arguments to the fallible functions, and add a bs_free_error_msg(char *) function.

   This would then look something like this:
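The snippet below is a minimal sketch of the second option, written for illustration and not taken from the original proposal or from BridgeStan. Only the char **error_msg parameter and the name bs_free_error_msg come from the text above. The function demo_log_density, its return codes, and the toy density are assumptions. It shows one way an extern "C" boundary could catch a C++ exception and hand the message to the caller as a heap-allocated string.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <exception>
#include <stdexcept>

namespace {

// Stand-in for model code that may throw on bad input.
double toy_log_density(const double* theta, int n) {
  double acc = 0.0;
  for (int i = 0; i < n; ++i) {
    if (std::isnan(theta[i])) throw std::domain_error("parameter is NaN");
    acc += theta[i] * theta[i];
  }
  return -0.5 * acc;
}

// Copies a message into malloc'd storage so the caller can free it later.
char* copy_message(const char* what) {
  const std::size_t len = std::strlen(what) + 1;
  char* copy = static_cast<char*>(std::malloc(len));
  if (copy != nullptr) std::memcpy(copy, what, len);
  return copy;
}

}  // namespace

extern "C" {

// Hypothetical fallible function: returns 0 on success, -1 on failure.
// On failure, *error_msg receives a message the caller must release
// with bs_free_error_msg. Passing a null error_msg opts out of messages.
int demo_log_density(const double* theta, int n, double* out,
                     char** error_msg) {
  if (error_msg != nullptr) *error_msg = nullptr;
  try {
    *out = toy_log_density(theta, n);
    return 0;
  } catch (const std::exception& e) {
    if (error_msg != nullptr) *error_msg = copy_message(e.what());
  } catch (...) {
    if (error_msg != nullptr) *error_msg = copy_message("unknown error");
  }
  return -1;
}

// Releases a message returned through error_msg; null is allowed.
void bs_free_error_msg(char* error_msg) { std::free(error_msg); }

}  // extern "C"
```

Because the message travels through an argument owned by the caller, two threads calling the function at the same time never share error state, which is the property the thread-safety discussion in the comments cares about. The cost is that every caller has to remember the matching bs_free_error_msg call.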
I hope this is at least somewhat useful to you, happy to discuss more if there is interest in a change like this.