Stop using integer comparison on potential pointers #11587

lthls · 2022-09-30T14:14:37Z

We detected with Flambda 2 that there were some case where the compiler generated Lambda terms using the integer-specific comparison on values that could contain pointers. This is forbidden in our model of the language, so we considered whether we needed to update our model or patch the compiler. It turns out that there's a potential bug anyway in the only place we're aware of that can generate such comparisons, so we've decided to keep our strict model and patch the compiler. This PR proposes the corresponding patch for trunk.

The following is an explanation of the bug, copied from the Flambda 2 PR (ocaml-flambda/flambda-backend#854).

Consider the following OCaml program:

type simple = [ `Black | `Blue | `Red | `Green | `Yellow ]

let[@inline never] compare_simple s1 s2 = Sys.opaque_identity 0

type t = [ simple | `RGB of int ]

let compare t1 t2 =
  match t1, t2 with
  | #simple as s1, #simple as s2 -> compare_simple s1 s2
  | `RGB i1, `RGB i2 -> Int.compare i1 i2
  | x, y -> compare x y

let example = `RGB 12345

let compare_spec t = (compare [@inlined]) t example

It is compiled to the following lambda term:

(let
  (compare_simple/262 = (function s1/264 s2/265 never_inline : int (opaque 0))
   compare/335 =
     (function t1/336 t2/337 : int
       (catch
         (catch
           (if (isint t1/336)
             (if (>= t1/336 82908052) (if (!= t1/336 737308346) (if (!= t1/336 756711075) (if (>= t1/336 82908053) (exit 1) (exit 2)) (exit 2)) (exit 2))
               (if (!= t1/336 -937474657) (if (!= t1/336 4100401) (exit 1) (exit 2)) (exit 2)))
             (if (!= (field 0 t1/336) 4093677) (exit 1) (if (isint t2/337) (exit 1) (if (!= (field 0 t2/337) 4093677) (exit 1) (apply (field 8 (global Stdlib__Int!)) (field 1 t1/336) (field 1 t2/337))))))
          with (2)
           (catch
(* --> *)    (if (>= t2/337 82908052) (if (!= t2/337 737308346) (if (!= t2/337 756711075) (if (>= t2/337 82908053) (exit 1) (exit 3)) (exit 3)) (exit 3))
               (if (!= t2/337 -937474657) (if (!= t2/337 4100401) (exit 1) (exit 3)) (exit 3)))
            with (3) (apply compare_simple/262 t1/336 (makeblock 0 t1/336 t2/337))))
        with (1) (caml_compare t1/336 t2/337)))
   example/371 = [0: 4093677 12345]
   compare_spec/372 = (function t/374 : int (apply compare/335 t/374 example/371 always_inline)))
  (makeblock 0 compare_simple/262 compare/335 example/371 compare_spec/372))

The line annotated with (* --> *) corresponds in the pattern-matching to the point where we know that t1 matches #simple, and we're trying to see if t2 matches #simple too. Crucially, we don't know whether t2 is an integer or a pointer.
If we tested only for equality with the relevant integers this would work, as no pointer can be equal to an integer thanks to the tagging bit, but you can see that for the special case of 82908052 the test is decomposed into a first inequality (>= t2/337 82908052), introduced for balancing the test tree, then a final (>= t2/337 82908053). This works if we supposed that there is no valid input between 82908052 and 82908053. However, these constants represent to tagged versions of the integers; in practice the test on machine integers will test t2 >= 165816105 and t2 >= 165816107. There's a spurious value 165816106 that can't be reached if the input is guaranteed to be an integer, but that might correspond to a pointer (in this case for alignment reasons this is unlikely, but with different constructors we could end up with the right alignment).

Of course, we've never noticed any occurrences of the bug in the wild, because it's very unlikely that a pointer will get exactly the wrong value, and if it did occur it would likely go completely unnoticed (we would take the wrong branch, but it wouldn't lead to any obvious error like reading from data with the wrong type).

gasche

The reasoning is tortuous but correct and the patch itself is simple and correct. See an important style suggestion below.

This pessimizes slightly one case of pattern-matching compilation, but I don't think there are risks of noticeable performance regression. The pessimized case is the case where we do a non-exhaustive match against only constant variant constructors (not particularly common), and the change is to do an extra is_int check (very cheap).

gasche · 2022-10-01T12:26:35Z

lambda/matching.ml

+              test_int_or_block arg
+                (make_test_sequence_variant_constant fail arg consts)
+                act
+            end


I think the code would be nicer if you used the same style as the case below, which (if I understand correctly) is exactly symmetric.

I've followed your suggestion. The code now follows the style of the case below.

xavierleroy · 2022-10-01T16:19:31Z

the change is to do an extra is_int check (very cheap).

It is cheap indeed, but it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

gasche · 2022-10-01T19:35:36Z

it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

I looked at it and I don't see an obvious way to do it. Matching does not operate at the abstraction level of encodings, it always either splits the immediate-or-blocks domains by using is_int first, or (for pattern-matching on non-polymorphic variant constructors) it generates a switch instruction that is transformed to work with encoded numbers only later in Cmm. (We don't want to use a switch for polymorphic variants which are very sparse.) In particular, I don't see how we would deal with the bytecode backend that does not expose comparison opcodes for the underlying representation.

lthls · 2022-10-01T22:24:44Z

It is cheap indeed, but it should not be too difficult to add an option to the code that generates the tree of comparisons so that it stops assuming that there are no values between the encoding of n (2n+1) and the encoding of n+1 (2n+3).

I did look at the test generators in switch.ml while looking for the proper way to fix this, and I think that adding support for intermediate values would require significant changes to the whole Switch module (basically you would need to extend the notion of interval to handle open bounds in addition to the closed ones used currently).

But the main reason why we wrote this patch is because we don't want to deal with pointer inputs to integer comparisons in the middle-end. The bug fix is just a bonus. So even if we find a way to patch the test generator in the way you suggest, I'd still ask for an option to add the extra is_int check. I could write some code to detect if the test sequence only uses equality comparisons, and remove the extra is_int in this case though. I don't think it's worth the effort, but if that's the price for getting this patch merged I'm willing to pay it.

xavierleroy · 2022-10-04T17:07:03Z

But the main reason why we wrote this patch is because we don't want to deal with pointer inputs to integer comparisons in the middle-end. The bug fix is just a bonus

That's a legitimate reason. I'm not too worried about the cost of the extra "is int" test, but was curious about the alternative.

Octachron · 2022-10-05T13:11:31Z

Is there any reason to not have this (small) fix in 5.0?

gasche · 2022-10-05T14:32:38Z

Feel free to cherry-pick if you want, but I don't know that it would make a big difference for the flambda2 people, and I didn't want to add extra work to the release manager.

Stop using integer comparison on potential pointers (cherry picked from commit 457ed4e)

Stop using integer comparison on potential pointers

0558a3b

lthls mentioned this pull request Sep 30, 2022

Remove integer comparison involving nonconstant polymorphic variants ocaml-flambda/flambda-backend#854

Merged

gasche approved these changes Oct 1, 2022

View reviewed changes

Review

f7041e9

gasche merged commit 457ed4e into ocaml:trunk Oct 4, 2022

Octachron pushed a commit that referenced this pull request Oct 6, 2022

Merge pull request #11587 from lthls/poly-variant-pointer-comparison

4d80083

Stop using integer comparison on potential pointers (cherry picked from commit 457ed4e)

Octachron added a commit that referenced this pull request Oct 6, 2022

Move #11587 Changes entry to 5.0 .

12f199d

Octachron mentioned this pull request Jan 3, 2023

documentation: sort modules alphabetically #11860

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Stop using integer comparison on potential pointers #11587

Stop using integer comparison on potential pointers #11587

lthls commented Sep 30, 2022

gasche left a comment

gasche Oct 1, 2022

lthls Oct 1, 2022

xavierleroy commented Oct 1, 2022

gasche commented Oct 1, 2022

lthls commented Oct 1, 2022

xavierleroy commented Oct 4, 2022

Octachron commented Oct 5, 2022

gasche commented Oct 5, 2022

Stop using integer comparison on potential pointers #11587

Stop using integer comparison on potential pointers #11587

Conversation

lthls commented Sep 30, 2022

gasche left a comment

Choose a reason for hiding this comment

gasche Oct 1, 2022

Choose a reason for hiding this comment

lthls Oct 1, 2022

Choose a reason for hiding this comment

xavierleroy commented Oct 1, 2022

gasche commented Oct 1, 2022

lthls commented Oct 1, 2022

xavierleroy commented Oct 4, 2022

Octachron commented Oct 5, 2022

gasche commented Oct 5, 2022