Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop using integer comparison on potential pointers #11587

Merged
merged 2 commits into from
Oct 4, 2022

Conversation

lthls
Copy link
Contributor

@lthls lthls commented Sep 30, 2022

We detected with Flambda 2 that there were some case where the compiler generated Lambda terms using the integer-specific comparison on values that could contain pointers. This is forbidden in our model of the language, so we considered whether we needed to update our model or patch the compiler. It turns out that there's a potential bug anyway in the only place we're aware of that can generate such comparisons, so we've decided to keep our strict model and patch the compiler. This PR proposes the corresponding patch for trunk.

The following is an explanation of the bug, copied from the Flambda 2 PR (ocaml-flambda/flambda-backend#854).

Consider the following OCaml program:

type simple = [ `Black | `Blue | `Red | `Green | `Yellow ]

let[@inline never] compare_simple s1 s2 = Sys.opaque_identity 0

type t = [ simple | `RGB of int ]

let compare t1 t2 =
  match t1, t2 with
  | #simple as s1, #simple as s2 -> compare_simple s1 s2
  | `RGB i1, `RGB i2 -> Int.compare i1 i2
  | x, y -> compare x y

let example = `RGB 12345

let compare_spec t = (compare [@inlined]) t example

It is compiled to the following lambda term:

(let
  (compare_simple/262 = (function s1/264 s2/265 never_inline : int (opaque 0))
   compare/335 =
     (function t1/336 t2/337 : int
       (catch
         (catch
           (if (isint t1/336)
             (if (>= t1/336 82908052) (if (!= t1/336 737308346) (if (!= t1/336 756711075) (if (>= t1/336 82908053) (exit 1) (exit 2)) (exit 2)) (exit 2))
               (if (!= t1/336 -937474657) (if (!= t1/336 4100401) (exit 1) (exit 2)) (exit 2)))
             (if (!= (field 0 t1/336) 4093677) (exit 1) (if (isint t2/337) (exit 1) (if (!= (field 0 t2/337) 4093677) (exit 1) (apply (field 8 (global Stdlib__Int!)) (field 1 t1/336) (field 1 t2/337))))))
          with (2)
           (catch
(* --> *)    (if (>= t2/337 82908052) (if (!= t2/337 737308346) (if (!= t2/337 756711075) (if (>= t2/337 82908053) (exit 1) (exit 3)) (exit 3)) (exit 3))
               (if (!= t2/337 -937474657) (if (!= t2/337 4100401) (exit 1) (exit 3)) (exit 3)))
            with (3) (apply compare_simple/262 t1/336 (makeblock 0 t1/336 t2/337))))
        with (1) (caml_compare t1/336 t2/337)))
   example/371 = [0: 4093677 12345]
   compare_spec/372 = (function t/374 : int (apply compare/335 t/374 example/371 always_inline)))
  (makeblock 0 compare_simple/262 compare/335 example/371 compare_spec/372))

The line annotated with (* --> *) corresponds in the pattern-matching to the point where we know that t1 matches #simple, and we're trying to see if t2 matches #simple too. Crucially, we don't know whether t2 is an integer or a pointer.
If we tested only for equality with the relevant integers this would work, as no pointer can be equal to an integer thanks to the tagging bit, but you can see that for the special case of 82908052 the test is decomposed into a first inequality (>= t2/337 82908052), introduced for balancing the test tree, then a final (>= t2/337 82908053). This works if we supposed that there is no valid input between 82908052 and 82908053. However, these constants represent to tagged versions of the integers; in practice the test on machine integers will test t2 >= 165816105 and t2 >= 165816107. There's a spurious value 165816106 that can't be reached if the input is guaranteed to be an integer, but that might correspond to a pointer (in this case for alignment reasons this is unlikely, but with different constructors we could end up with the right alignment).

Of course, we've never noticed any occurrences of the bug in the wild, because it's very unlikely that a pointer will get exactly the wrong value, and if it did occur it would likely go completely unnoticed (we would take the wrong branch, but it wouldn't lead to any obvious error like reading from data with the wrong type).

Copy link
Member

@gasche gasche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasoning is tortuous but correct and the patch itself is simple and correct. See an important style suggestion below.

This pessimizes slightly one case of pattern-matching compilation, but I don't think there are risks of noticeable performance regression. The pessimized case is the case where we do a non-exhaustive match against only constant variant constructors (not particularly common), and the change is to do an extra is_int check (very cheap).

test_int_or_block arg
(make_test_sequence_variant_constant fail arg consts)
act
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the code would be nicer if you used the same style as the case below, which (if I understand correctly) is exactly symmetric.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've followed your suggestion. The code now follows the style of the case below.

@xavierleroy
Copy link
Contributor

the change is to do an extra is_int check (very cheap).

It is cheap indeed, but it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

@gasche
Copy link
Member

gasche commented Oct 1, 2022

it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

I looked at it and I don't see an obvious way to do it. Matching does not operate at the abstraction level of encodings, it always either splits the immediate-or-blocks domains by using is_int first, or (for pattern-matching on non-polymorphic variant constructors) it generates a switch instruction that is transformed to work with encoded numbers only later in Cmm. (We don't want to use a switch for polymorphic variants which are very sparse.) In particular, I don't see how we would deal with the bytecode backend that does not expose comparison opcodes for the underlying representation.

@lthls
Copy link
Contributor Author

lthls commented Oct 1, 2022

It is cheap indeed, but it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

I did look at the test generators in switch.ml while looking for the proper way to fix this, and I think that adding support for intermediate values would require significant changes to the whole Switch module (basically you would need to extend the notion of interval to handle open bounds in addition to the closed ones used currently).

But the main reason why we wrote this patch is because we don't want to deal with pointer inputs to integer comparisons in the middle-end. The bug fix is just a bonus. So even if we find a way to patch the test generator in the way you suggest, I'd still ask for an option to add the extra is_int check. I could write some code to detect if the test sequence only uses equality comparisons, and remove the extra is_int in this case though. I don't think it's worth the effort, but if that's the price for getting this patch merged I'm willing to pay it.

@xavierleroy
Copy link
Contributor

But the main reason why we wrote this patch is because we don't want to deal with pointer inputs to integer comparisons in the middle-end. The bug fix is just a bonus

That's a legitimate reason. I'm not too worried about the cost of the extra "is int" test, but was curious about the alternative.

@gasche gasche merged commit 457ed4e into ocaml:trunk Oct 4, 2022
@Octachron
Copy link
Member

Is there any reason to not have this (small) fix in 5.0?

@gasche
Copy link
Member

gasche commented Oct 5, 2022

Feel free to cherry-pick if you want, but I don't know that it would make a big difference for the flambda2 people, and I didn't want to add extra work to the release manager.

Octachron pushed a commit that referenced this pull request Oct 6, 2022
Stop using integer comparison on potential pointers

(cherry picked from commit 457ed4e)
Octachron added a commit that referenced this pull request Oct 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants