Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bytecode miscompilation of recursive definition of empty first class module #12153

Closed
ncik-roberts opened this issue Mar 31, 2023 · 8 comments · Fixed by #12156
Closed

Bytecode miscompilation of recursive definition of empty first class module #12153

ncik-roberts opened this issue Mar 31, 2023 · 8 comments · Fixed by #12156

Comments

@ncik-roberts
Copy link
Contributor

ncik-roberts commented Mar 31, 2023

This produces segfaulting bytecode with trunk's ocamlc, but not 4.14.1. I haven't bisected the cause yet.

module type S = sig end;;

let rec k =
  (module struct end : S)

Notably, these variations do not segfault:

module type S = sig end;;

let k =
  (module struct end : S)
module type S = sig val x : int end;;

let rec k =
  (module struct let x = 3 end : S)

ocamlopt does not produce segfaulting code.

@lthls
Copy link
Contributor

lthls commented Mar 31, 2023

Blocks of size 0 are atoms in bytecode, i.e. they're all preallocated in read-only memory. The code in Rec_check assumes that blocks of any size can be pre-allocated and patched later, but this is not true for blocks of size 0.
In native code, blocks of size 0 are allocated statically too, but this occurs before the compilation of let-rec bindings so we don't generate code that will try to patch them.
I suspect that before 5.0, the atoms were allocated once too but in regular memory, so they could end up patched without too many problems (mostly because the only atoms actually in use are all of the same tag, 0).

One possible fix is to consider zero-sized blocks as RHS_nonrec in Bytegen.size_of_lambda, although @gasche may wish for a fix in Rec_check instead.

@gasche
Copy link
Member

gasche commented Mar 31, 2023

I can reproduce the issue with other block-0 constructs like

let rec (k : int array) = let _ = k in [||];;

Personally I think that the logic in Rec_check is morally correct. If the runtime wants to allocate 0-arg blocks as immutable, it should ensure that this is a correct optimization, and it is not the case here. A simple fix would be to ensure that caml_update_dummy always create a block whose tag is mutable, even when it is given 0 arguments -- but then maybe some parts of the runtime somehow assume that all 0-argument blocks are atoms / that all 0-argument blocks of a given tag are physically equal?

I am not sure that the fix you propose of using RHS_nonrec is correct. This means evaluating this part of the letrec binding in advance in an arbitrary order, in particular it may be evaluated before other RHS_nonrec that it depends on and observe the fact that they are not evaluated yet. Consider for example

let glob = ref 0
let rec (x : int array) = glob.contents <- y; [||]
and y = 1

The following behavior would be plausible: first comp_init binds y=0 in the initial environment, then x is evaluated, then y is bound to its real value 1. This would contradict the assumption that if a recursive binding x depends on another y, then the value of x is evaluated with a value for y that is (physically) equal to the final value of y.
In fact this cannot happen because r.x <- y is considered a Dereference of y and this definition is rejected statically, but this is only thanks to the weird corner case of unboxed float records. Remove this feature from the language and this example would work :-) There may be another example that is problematic and is allowed today, but I haven't thought of one quickly.

Finally: I'm not opposed to changing Rec_check to reject more programs due to this corner case. A fix would be to change the size-computation to return Dynamic when we statically see that the argument count will be 0. It is a bit ugly though because it makes the code more complex to reflect a concern that comes much later in the pipeline, but we already do this in some parts of Rec_check.

@lthls
Copy link
Contributor

lthls commented Mar 31, 2023

Personally I think that the logic in Rec_check is morally correct. If the runtime wants to allocate 0-arg blocks as immutable, it should ensure that this is a correct optimization, and it is not the case here. A simple fix would be to ensure that caml_update_dummy always create a block whose tag is mutable, even when it is given 0 arguments -- but then maybe some parts of the runtime somehow assume that all 0-argument blocks are atoms / that all 0-argument blocks of a given tag are physically equal?

If I remember correctly, the issue is that the GC cannot deal with blocks of size less than 1, so all atoms must live outside the heap. The native compiler ensures that zero-sized blocks are always statically allocated, but the bytecode compiler has to rely on the runtime's atom table.
I don't think that we can get around the fact that caml_alloc_dummy must not allocate 0-sized blocks, but we might change it so that when given a size of zero it actually allocates a block of size one. As a bonus, this links into #11895, which could do with a bit more discussion.

I am not sure that the fix you propose of using RHS_nonrec is correct. This means evaluating this part of the letrec binding in advance in an arbitrary order, in particular it may be evaluated before other RHS_nonrec that it depends on and observe the fact that they are not evaluated yet. Consider for example

I think it's correct, and my reasoning for that is that it's already used for all immediate values (integers, constant constructors) and structured constants.

let glob = ref 0
let rec (x : int array) = glob.contents <- y; [||]
and y = 1

You don't really have to use empty arrays: for the issue you're looking for, [] or [0] would work just as well.

The following behavior would be plausible: first comp_init binds y=0 in the initial environment, then x is evaluated, then y is bound to its real value 1. This would contradict the assumption that if a recursive binding x depends on another y, then the value of x is evaluated with a value for y that is (physically) equal to the final value of y. In fact this cannot happen because r.x <- y is considered a Dereference of y and this definition is rejected statically, but this is only thanks to the weird corner case of unboxed float records. Remove this feature from the language and this example would work :-) There may be another example that is problematic and is allowed today, but I haven't thought of one quickly.

I don't think removing unboxed float records would change anything. Storing a value in a mutable field escapes it, so even if the store itself doesn't read the value you must consider it dereferenced.

Finally: I'm not opposed to changing Rec_check to reject more programs due to this corner case. A fix would be to change the size-computation to return Dynamic when we statically see that the argument count will be 0. It is a bit ugly though because it makes the code more complex to reflect a concern that comes much later in the pipeline, but we already do this in some parts of Rec_check.

We could actually add a Constant case that allows the same modes as Static but is compiled the same way as Dynamic. I expect we'll want to look very carefully to make sure it's correct, but if we can do it it's one less incoherence between Rec_check and the size computations.

@gasche
Copy link
Member

gasche commented Apr 1, 2023

Good points.

We could actually add a Constant case that allows the same modes as Static but is compiled the same way as Dynamic. I expect we'll want to look very carefully to make sure it's correct.

I would be in favor of restricting the set of definitions we allow to err on the side of soundness, instead of trying to be clever to accept more programs of doubtful usefulness. If the value is Constant, then it does not need to be inside the recursive nest, and we can force it to not depend on other values.

@lthls
Copy link
Contributor

lthls commented Apr 1, 2023

I would be in favor of restricting the set of definitions we allow to err on the side of soundness, instead of trying to be clever to accept more programs of doubtful usefulness. If the value is Constant, then it does not need to be inside the recursive nest, and we can force it to not depend on other values.

I broadly agree with you. I'm less sure about what to do in the following case:

let rec x =
  let y = if false then x else 0 in
  Constr y

According to the current rules, this is accepted and classified as Static. But after simplification, this could turn into:

let rec x = Constr 0

Do you think we should stick to the original decision to pre-allocate and patch, or should we keep the current behaviour of switching to the dynamic strategy, as it is (probably) safe for constants ?

One way to solve all this would be to compile the recursive definitions right after we check them, but this is currently not possible because we cannot compute the size of functions in the typedtree. Which is a problem solved by #8956, so I guess that's one more reason to try to make progress there...

@xavierleroy
Copy link
Contributor

Recursive value definitions... the gift that keeps on giving...

My first reaction when reading @lthls analysis is that there's nothing to patch in a zero-sized block! Actually, there is no data field to patch, but the current code still wants to patch the tag.

From day 1 of OCaml, the only atom (zero-sized block) used to represent OCaml values is the one with tag 0, which is used for empty arrays and for structs containing no values. Atoms with tag != 0 are supported by the run-time system but not used to represent OCaml values.

Moreover, dummy blocks are created with tag 0. So there is really nothing to patch in a zero-sized block, as the tag is already the correct one. This could be implemented by adding a case to caml_update_dummy: if the size is 0, assert that the new tag is equal to the old tag, and do nothing.

Alternatively, we can tweak the analysis and compilation of recursive values once more to handle this case specially. Your choice.

@gasche
Copy link
Member

gasche commented Apr 1, 2023

Ah, I missed the fact that atomic blocks always use tag 0. Indeed in that case skipping the update is a dead simple fix.

gasche added a commit to gasche/ocaml that referenced this issue Apr 1, 2023
fixes ocaml#12153

Suggested-by: Xavier Leroy <xavier.leroy@college-de-france.fr>
Suggested-by: Vincent Laviron <vincent.laviron@gmail.com>
Reported-by: Nick Roberts <nroberts@janestreet.com>
@gasche
Copy link
Member

gasche commented Apr 1, 2023

I proposed the fix that @xavierleroy suggested in #12156 .

gasche added a commit to gasche/ocaml that referenced this issue Apr 1, 2023
fixes ocaml#12153

Suggested-by: Xavier Leroy <xavier.leroy@college-de-france.fr>
Suggested-by: Vincent Laviron <vincent.laviron@gmail.com>
Reported-by: Nick Roberts <nroberts@janestreet.com>
gasche added a commit to gasche/ocaml that referenced this issue Apr 1, 2023
fixes ocaml#12153

Suggested-by: Xavier Leroy <xavier.leroy@college-de-france.fr>
Suggested-by: Vincent Laviron <vincent.laviron@gmail.com>
Reported-by: Nick Roberts <nroberts@janestreet.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants