-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Poll for actions in structural equality loop. #12039
Conversation
If I understand correctly, after an interrupt the PR aborts the comparison and restarts from the initial value. If we wanted to instead continue comparing the values on the current "compare stack" (a worklist of pairs of values to compare), we would be in trouble if a GC had moved some values around or freed them. Naive question: why don't we make the compare stack a GC root, so that its elements remain alive during GCs? Is the worry that there would be a performance cost there? (Is it not clear to me that there would.) Or maybe just this was felt more complex to implement? |
That's right, we restart from the initial values because of the GC. Discussing with @stedolan, the biggest issue with marking the whole stack as a GC root is with the extra complexity it introduces. Restarting the comparison like this is much easier to verify, and should occur rarely enough that the performance penalty of restarting should not be observable by any reasonably written program. |
One worry with restarting from the start is that it introduces a risk of non-termination (or at least arbitrary delays). You could be unlucky and compare two large values that take more time to compare than the time interval between two pending action notifications (maybe there is another domain that allocates a lot in the background), and you keep restarting indefinitely and never make progress. |
This being said: the current PR has an attractive feature/weight ratio: it is simpler than registering the compare stack as a root, and it already provides what is arguably a nice improvement over the current state. I think it could make sense to review an implementation along the current lines first, consider merging, and then think about keeping the stack alive as a second step. |
If I understand the comparison code correctly, comparing two constructor blocks type nat = Z | S of nat
let rec infty = S infty
let () = ignore (infty = infty) Other sources of large delays are the cases A natural fix with the current implementation (that never returns from the poll) would be to use Note: instead of doing |
I've implemented polling for strings and double arrays in the commit above, but I don't think it's a particularly good idea, the extra returns increased the code size by a bit ( |
Actually I was mistaken, as long as the compiler behaves itself these changes should be fine performance-wise. Hopefully they are correct. |
Regarding the |
My argument is that if you increment-and-check once for each block header, instead of once for each block argument, you do it less often. But I agree that it does not matter much. (Most blocks have few arguments anyway, and the increment-and-check instructions are not costly.) |
runtime/compare.c
Outdated
if (res < 0) return LESS; | ||
if (res > 0) return GREATER; | ||
if (last_block) break; | ||
COMPARE_DO_POLL(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would expect COMPARE_INC_POLL_TIMER
here, not COMPARE_DO_POLL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was going to be a COMPARE_INC_POLL_TIMER(COMPARE_SIGNAL_POLL_PERIOD)
since that's how many bytes were checked, but of course then the comparison would always hold. I've changed it to that, since it makes intention clearer, and the compiler should be able to optimise it away.
runtime/compare.c
Outdated
signal_poll_timer = 0; \ | ||
if (compare_poll_actions(&sp, stk, &orig1, &orig2, &v1, &v2)) \ | ||
goto loop_start; \ | ||
} while (0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be just the source of the poll:
label, with COMPARE_INC_POLL_TIMER doing a goto loop
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Also: I'm not too excited by macros being defined at the toplevel that contains bound occurrences of names that are not currently in scope. I made suggestions that I think would let us get rid of COMPARE_DO_POLL
, and I would rather write the code corresponding to COMPARE_IN_POLL_TIMER
directly at each callsite, or possibly defining the macro within the body of the do_compare_val
function.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be just the source of the
poll:
label, with COMPARE_INC_POLL_TIMER doing agoto loop
, right?
That's not quite true, we need it to carry on from where it left off if the poll finds nothing while comparing strings or double arrays. I could make COMPARE_INC_POLL_TIMER
expand to an expression instead, and add goto loop_start
s and continue
s explicitly at the call site to make intent clearer:
static intnat do_compare_val(/*...*/) {
#define COMPARE_INC_POLL_TIMER(inc) \
(((signal_poll_timer += (inc)) >= COMPARE_SIGNAL_POLL_PERIOD) && \
(signal_poll_timer = 0, \
compare_poll_actions(&sp, stk, &orig1, &orig2, &v1, &v2)))
// ...
if (COMPARE_INC_POLL_TIMER(1)) goto loop_start;
// ...
}
Or of course inline it at call sites, at the cost of some code duplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah of course, I was missing the fact that we return normally in absence of pending actions. Sorry for the noise.
The new code is substantially more complex than the previous version. I think it would be nice if we could get back to a simpler version, possibly by giving up on polls in string comparisons. |
{ | ||
value o1 = *orig1, o2 = *orig2; | ||
if (caml_check_pending_actions()) { | ||
compare_free_stack(stk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
caml_check_pending_actions()
can return false positives. It is considered that spurious polls are cheap, which simplifies various aspects of the implementation. One worry of backtracking pessimistically is that spurious polls are no longer necessarily cheap. It should be possible to avoid this with an internal process_pending_actions
variant that returns whether something has actually been done; I will have to look further into this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That may complicate things too much, since we'd have to free the stack after a possible garbage collection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clearer, I have in mind doing the following: 7c104c5
However, the alternate path of scanning the compare stack would let one avoid doing this altogether.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Your commit looks like it should update the implementation of caml_do_pending_actions_exn
but does not.)
The change to caml_do_pending_actions
looks like it won't be necessary for the present PR, but could be useful for other purposes in the future, so maybe it could be an independent PR of its own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes, it's not finished indeed. At least you get the idea. I do not have time to finish it right now but I can do it if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change to caml_do_pending_actions looks like it won't be necessary for the present PR
Review-wise, are we waiting to see where #12053 is headed, or is the plan to review and merge this one (with restarting) independently?
I quite like the approach taken and do not find it too complex. Let me some time and I can review it. |
Sure. I would be interested in your opinion of whether we should try to ensure that the compare stack gets scanned by the GC, instead of restarting. |
I haven't read the code, but just to point out the obvious: if comparison restarts from zero at every interrupt, it means that with big enough data structures and frequent enough interrupts, comparison will never terminate, right? For this reason I'm not in favor of checking interrupts while comparing strings : string comparison will terminate eventually, and rather quickly even for very big strings. The most problematic case is comparison of cyclic data structures, which never terminates. In this case, being interruptible is important. No so in cases where termination is guaranteed. As to giving the comparison stack to the GC : that's not possible with the current format of these stacks, which contain pointers inside heap objects, so at the minimum a different format should be used. Then it might be possible to use one of the GC's hooks to add a scanning function for comparison stacks. |
Good point. Could we just store the two containing blocks and an offset? |
Yes, but you might also need the length (unless it's cheap enough to recompute it from the blocks?), so you now have 4 words per stack entry instead of 3. |
In the end we did a lot of polish to #12128 and merged it. I am not sure whether @eutro wants to push a bit more refining the design, or stop there (in which case the current PR can be closed). The avenue I see for potential refinement -- it was done in this PR but not in #12128 -- is to make sure that very large blocks increase the polling pace. @eutro, I think you should decide whether you want to go further with this. |
I think I'll close this PR since the main intention (of polling within a finite time) has been superseded. I may open a PR in the future to address the outstanding possible refinements. |
Fixes #3921.
This PR updates the implementation of structural equality in
compare.c
to periodically poll for actions (outstanding signals, garbage collection interrupts), since this is an unbounded loop.This ensures, for example, that the following comparison:
can appropriately be interrupted with Ctrl-C at the top level.
It is possible that handling actions does not throw an exception, this happens in the case of a gc interrupt, or in the case that the signal handler (such as one registered by a user) does not raise an exception. It is also possible that a major garbage collection occurs:
Thus, the structural equality loop pessimistically saves the two compared values across the action handling, and restarts from the beginning if no exception was thrown. This is necessary for correctness when the comparison is not a loop, but just a very large tree.
Automated tests for this would likely be finnicky and unreliable, but please let me know if and how I should write them.