Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove RNG_BEGIN() and RNG_END() #2583

Open
szhorvat opened this issue Apr 21, 2024 · 7 comments
Open

Remove RNG_BEGIN() and RNG_END() #2583

szhorvat opened this issue Apr 21, 2024 · 7 comments
Milestone

Comments

@szhorvat
Copy link
Member

szhorvat commented Apr 21, 2024

This issue is to discuss the removal of RNG_BEGIN() and RNG_END() for igraph 1.0, which were primarily motivated by R's requirement to save/restore the random state. This is the last major change needed to fully decouple the C core from the R interface and remove the USING_R macro. It will pave the way to linking the R interface to the C library in the same way as it'd link to any other library.

Nesting these can cause issues, but avoiding any nesting is very difficult, almost impossible to manage. The current setup requires micro-managing the placement of RNG_BEGIN/END pairs, which is bug-prone, and likely has several existing bugs.

These macros serve two purposes:

  • Initial initialization of the default RNG. We need to find a solution to this, or make initialization manual.
  • Save/restore RNG state in the R interface. This should be moved to the R interface, and managed through automatic code generation.

CC @krlmlr

Ref: igraph/rigraph#1131

@szhorvat szhorvat added this to the 1.0 milestone Apr 21, 2024
@krlmlr
Copy link
Contributor

krlmlr commented Apr 29, 2024

Why is it difficult to control nesting with a flag in thread-local storage?

@szhorvat
Copy link
Member Author

szhorvat commented Apr 29, 2024

It could be, but this would still need to be implemented on the R side, where RNG_BEGIN/END have R-specific definitions. Currently nesting causes problems only in R.

But it seems to me that this would only be a temporary workaround and wouldn't address the core issue, which is that micro-managing this is difficult. Besides nesting, sometimes we forget to add the RNG_BEGIN/END, and sometimes we add too many potentially performance-killing saves/restores.

The clean solution here is to do a single save/restore at the beginning/end of functions in the generated wrappers. In the long term this is the least amount of work, and only requires a fairly small adaptation for the R interface generator.

@krlmlr
Copy link
Contributor

krlmlr commented Apr 29, 2024

In other projects, we've seen that having these wrappers everywhere also harms performance. I need a better understanding of the issue at hand, and I can't spend the time it needs right now.

@szhorvat
Copy link
Member Author

szhorvat commented Apr 29, 2024

This is why we can make it opt-out for the interface generator, and in the longer term disable it where it's not needed.

EDIT: Let's discuss it at a meeting then. If this is not done for 1.0, we'd need to wait for 2.0, which may be quite some time away ... It's not a lot of work on the R side, and will significantly simplify things. It'd be very good to do it.

@ntamas
Copy link
Member

ntamas commented Apr 30, 2024

wouldn't address the core issue, which is that micro-managing this is difficult

Note that with this we would simply be trading the micro-management of RNG_BEGIN() and RNG_END() calls with the micro-management of marking functions in functions.yaml if they happen to use the RNG. Not sure which one of this is less of a burden. For instance, we cannot mark igraph_vector_shuffle() as using the RNG because it's not in functions.yaml and we don't want to generate wrappers for it anyway. However, any igraph function that may call igraph_vector_shuffle() directly or indirectly must then be marked as potentially using the RNG.

My biggest worry right now is that we have absolutely no way to validate whether the RNG is actually used outside an RNG_BEGIN() ... RNG_END() block. Without that, we are bound to make mistakes no matter whether we are managing these calls manually or through a flag in Stimulus.

@szhorvat
Copy link
Member Author

Not sure which one of this is less of a burden.

What we have now only doesn't seem to be more of a burden because issues go unnoticed (no symptom unless R is involved, and difficult to notice symptoms even with R), and it's too easy to ignore issues.

I'm sure that igraph is full of these kinds of bugs at the moment. Here's just the last one I fixed: 3cbf7be The test suite cannot detect these bugs, so we're back to having to pay extra special attention.

There's many instances of RNG_BEGIN being called in a loop which invalidates any performance concerns about just adding these in the R interface generator. Just now as trying to make my point I found this:

while (running) {
igraph_integer_t v1, num_neis;
igraph_real_t max_count;
igraph_vector_int_t *neis;
igraph_vector_int_t *ineis;
igraph_bool_t was_zero;
IGRAPH_ALLOW_INTERRUPTION_LIMITED(iter, 1 << 8);
if (control_iteration) {
/* If we are in the control iteration, we expect in the beginning of
the iteration that all vertices meet the end condition, so 'running' is false.
If some of them does not, 'running' is set to true later in the code. */
running = false;
} else {
/* Shuffle the node ordering vector if we are in the label updating iteration */
IGRAPH_CHECK(igraph_vector_int_shuffle(&node_order));
}
RNG_BEGIN();
/* In the prescribed order, loop over the vertices and reassign labels */

How would you fix this? Move it outside the loop? Don't miss that vector_shuffle() there then! Can you recall off-hand whether vector_shuffle() calls RNG_BEGIN(), or do you have to look it up? I had to look it up.

As you can see RNG_BEGIN() cannot be moved outside of the loop (potential performance issue), and this is difficult to notice and needs careful examination to determine.

Having to pay attention to all this is a much bigger burden to me than marking functions as needing the RNG.

BUT! I never wanted to make this opt-in, but opt-out. Functions are assumed to need the RNG by default, unless it is indicated that they don't. If there's reason to believe that the unnecessary call is a performance issue for some function, only then one can put in the effort to investigate if marking it as non-RNG is possible. I expect only a few simple functions, like igraph_degree(), will end up being marked like this, which is easy and reasonably safe to manage.

@ntamas
Copy link
Member

ntamas commented May 2, 2024

Ah okay, this seems to make more sense to me. So, basically, all functions would be assumed to call the RNG at one point or another, unless explicitly specified otherwise in functions.yaml, right? This makes sense.

How shall we call the new flag in functions.yaml? NO_RNG would be a trivial choice, but maybe a more generic name like DETERMINISTIC or PURE would be better?

There remains the question of initializing the RNG. Most higher-level interfaces initialize the RNG anyway as far as I know so we can safely assume that higher-level interfaces need not provide an RNG binding that needs to take care of initializaiton. The C core would need to auto-initialize the RNG to keep the current behaviour. We could either say that in low-level C one needs to seed the RNG explicitly, or we could provide a C implementation of the RNG that auto-seeds. In order to avoid performance problems by checking whether the RNG is seeded before every call to a RNG routine, we would actually need to provide two RNG implementations in C: a private one that assumes that the RNG is already seeded, and a public one that assumes that the RNG is not seeded, seeds it when needed, and sneakily replaces the RNG function table with the private one to avoid performance problems in future calls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants