Stop using integer comparison on potential pointers #11587
The reasoning is tortuous but correct and the patch itself is simple and correct. See an important style suggestion below.
This slightly pessimizes one case of pattern-matching compilation, but I don't think there is any risk of a noticeable performance regression. The pessimized case is a non-exhaustive match against only constant variant constructors (not particularly common), and the change is an extra `is_int` check (very cheap).
lambda/matching.ml
test_int_or_block arg
  (make_test_sequence_variant_constant fail arg consts)
  act
end
I think the code would be nicer if you used the same style as the case below, which (if I understand correctly) is exactly symmetric.
I've followed your suggestion. The code now follows the style of the case below.
It is cheap indeed, but it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).
I looked at it and I don't see an obvious way to do it. Matching does not operate at the abstraction level of encodings, it always either splits the immediate-or-blocks domains by using
I did look at the test generators in But the main reason why we wrote this patch is because we don't want to deal with pointer inputs to integer comparisons in the middle-end. The bug fix is just a bonus. So even if we find a way to patch the test generator in the way you suggest, I'd still ask for an option to add the extra
That's a legitimate reason. I'm not too worried about the cost of the extra "is int" test, but was curious about the alternative.
Is there any reason to not have this (small) fix in 5.0?
Feel free to cherry-pick if you want, but I don't know that it would make a big difference for the flambda2 people, and I didn't want to add extra work to the release manager.
Stop using integer comparison on potential pointers (cherry picked from commit 457ed4e)
We detected with Flambda 2 that there were some cases where the compiler generated `Lambda` terms using the integer-specific comparison on values that could contain pointers. This is forbidden in our model of the language, so we considered whether we needed to update our model or patch the compiler. It turns out that there's a potential bug anyway in the only place we're aware of that can generate such comparisons, so we've decided to keep our strict model and patch the compiler. This PR proposes the corresponding patch for trunk.
The following is an explanation of the bug, copied from the Flambda 2 PR (ocaml-flambda/flambda-backend#854).
Consider the following OCaml program:
It is compiled to the following lambda term:
The line annotated with `(* --> *)` corresponds in the pattern-matching to the point where we know that `t1` matches `#simple`, and we're trying to see if `t2` matches `#simple` too. Crucially, we don't know whether `t2` is an integer or a pointer.
If we tested only for equality with the relevant integers this would work, as no pointer can be equal to an integer thanks to the tagging bit, but you can see that for the special case of `82908052` the test is decomposed into a first inequality `(>= t2/337 82908052)`, introduced for balancing the test tree, then a final `(>= t2/337 82908053)`. This works if we suppose that there is no valid input between `82908052` and `82908053`. However, these constants stand for the tagged versions of the integers; in practice the test on machine integers will test `t2 >= 165816105` and `t2 >= 165816107`. There's a spurious value `165816106` that can't be reached if the input is guaranteed to be an integer, but that might correspond to a pointer (in this case this is unlikely for alignment reasons, but with different constructors we could end up with the right alignment).
Of course, we've never noticed any occurrences of the bug in the wild, because it's very unlikely that a pointer will get exactly the wrong value, and if it did occur it would likely go completely unnoticed (we would take the wrong branch, but it wouldn't lead to any obvious error like reading from data with the wrong type).
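The arithmetic above can be checked with a small simulation on raw machine words. This is only a sketch under stated assumptions: `tag`, `is_int`, `buggy_tree` and `guarded_tree` are hypothetical names introduced for illustration, not functions from `lambda/matching.ml`, and the tree is reduced to the two inequalities quoted above for the single constant `82908052`.

```ocaml
(* Illustrative sketch only: none of these names come from the compiler. *)

(* OCaml represents the immediate integer n as the machine word 2n+1. *)
let tag n = (2 * n) + 1

(* An immediate always has its low bit set; a pointer never does.
   This models the extra check on a raw machine word. *)
let is_int word = word land 1 = 1

(* The two inequalities from the description, applied to a raw word.
   They are only sound if no input lies strictly between
   tag c and tag (c + 1). *)
let buggy_tree c word =
  if word >= tag c then
    if word >= tag (c + 1) then `Fail else `Matches
  else `Fail

(* Same tree, but non-immediate inputs are filtered out first. *)
let guarded_tree c word =
  if is_int word then buggy_tree c word else `Fail

let () =
  let c = 82908052 in
  (* The even word between the two tagged constants: 165816106. *)
  let spurious = tag c + 1 in
  assert (tag c = 165816105 && tag (c + 1) = 165816107);
  (* A genuine immediate is classified correctly by both trees. *)
  assert (buggy_tree c (tag c) = `Matches);
  assert (guarded_tree c (tag c) = `Matches);
  (* A word that is not an immediate slips through the unguarded tree... *)
  assert (buggy_tree c spurious = `Matches);
  (* ...and is rejected once the immediate check runs first. *)
  assert (guarded_tree c spurious = `Fail)
```

The guard costs one bit test per match, which is the trade-off discussed in the review: the comparison tree keeps its assumption that inputs are immediates, and the assumption is enforced before the tree runs.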