Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reference counting in OCaml API #411

Closed
martin-neuhaeusser opened this issue Jan 11, 2016 · 12 comments
Closed

Reference counting in OCaml API #411

martin-neuhaeusser opened this issue Jan 11, 2016 · 12 comments
Assignees

Comments

@martin-neuhaeusser
Copy link
Contributor

The OCaml API uses Z3's reference counting by registering a finalizer (which is called upon a major garbage collection of OCaml) with all Z3 native objects:

type z3_native_object = {
  m_ctx : context ;
  mutable m_n_obj : Z3native.ptr ;
  inc_ref : Z3native.z3_context -> Z3native.ptr -> unit;
  dec_ref : Z3native.z3_context -> Z3native.ptr -> unit }

Upon freeing a z3_native_object, the finalizer uses dec_ref to decrement Z3's reference count for the m_n_obj. If it drops to zero, the object is freed by Z3.

Now a pattern in the bindings is as follows (e.g. for simplify):

 let simplify ( x : expr ) ( p : Params.params option ) = match p with
    | None -> expr_of_ptr (Expr.gc x) (Z3native.simplify (gnc x) (gno x))
    | Some pp -> expr_of_ptr (Expr.gc x) (Z3native.simplify_ex (gnc x) (gno x) (z3obj_gno pp))

Assume a call to simplify with some term and no parameters.
When expr_of_ptr (Expr.gc x) (Z3native.simplify (gnc x) (gno x)) gets evaluated,
gnc x and gno x are evaluated to obtain the native pointers before calling Z3native.simplify.
If the garbage collector performs a major collection before the call to Z3native.simplify, the OCaml value x might have already become unreachable: This occurs if x is not used at any later point in the program. For the simplify function, this might actually occur.
In such a garbage collection, value x is freed and its finalizer decrements Z3's reference counter on its native pointer (i.e. the pointer that has previously been obtained in gno x).
If the reference pointer drops to zero, Z3 frees the term and simplify is called with an invalid pointer.

If the simplify function (which is only one example - the same behavior occurs in many other functions in the OCaml API) is rewritten to force a garbage collection as follows,

  let simplify ( x : expr ) ( p : Params.params option ) = match p with
    | None ->
      let ctx = Expr.gc x in
      let nc = gnc x in
      let no = gno x in
      Gc.full_major ();
      expr_of_ptr ctx (Z3native.simplify nc no)
    | Some pp -> expr_of_ptr (Expr.gc x) (Z3native.simplify_ex (gnc x) (gno x) (z3obj_gno pp))

a segmentation fault can be provoked by a simple test program (if built as native-code executable):

let testfun () =
  let ctx = Z3.mk_context [] in
  let term = Z3.BitVector.mk_add ctx
      (Z3.BitVector.mk_numeral ctx "2" 16)
      (Z3.BitVector.mk_numeral ctx "3" 16)
  in
  let simpler_term = Z3.Expr.simplify term None in
  Printf.printf "The simplified term is %s.\n" (Z3.Expr.to_string simpler_term)

let _ =
  ignore (Z3.Log.open_ "Z3.log");
  Printf.printf "Testing OCaml API...\n";
  testfun ()

Note that if the term term is used after the invocation of Z3.Expr.simplify,
the garbage collector does not free it, and no segfault occurs:

let testfun () =
  let ctx = Z3.mk_context [] in
  let term = Z3.BitVector.mk_add ctx
      (Z3.BitVector.mk_numeral ctx "2" 16)
      (Z3.BitVector.mk_numeral ctx "3" 16)
  in
  let simpler_term = Z3.Expr.simplify term None in
  Printf.printf "The simplified term is %s.\n" (Z3.Expr.to_string simpler_term);
  Printf.printf "The original term was %s.\n" (Z3.Expr.to_string term)

let _ =
  ignore (Z3.Log.open_ "Z3.log");
  Printf.printf "Testing OCaml API...\n";
  testfun ()

Am I correctly suspecting a bug in the way, reference counting is implemented in the OCaml API?
The above example uses simplify, but there a plenty of other functions in the API that extract
native pointers from OCaml values such that the latter might become garbage collected, and their native objects being freed before they are used in a Z3 native call.

All of the above has been tested with Z3 at commit e88ac37. The problem might be related to issue #367. A try at fixing the issue is available at https://github.com/martin-neuhaeusser/z3

@NikolajBjorner
Copy link
Contributor

Martin, I agree. Issue #367 exposed this problem and it is systemic and avoided provided expressions are referenced after the calls. I started adding shorthands for functions that decompose and then build expressions using native APIs. I don't have a good solution to pin expressions in a way I can guarantee that the compiler doesn't perform some purity analysis and wipes out the pinning code and so far just use the reference counting facility for this.

@NikolajBjorner
Copy link
Contributor

It may be a reasonable idea to introduce auxiliary functions to pin arguments. These functions may then be implemented using inc_ref/dec_ref or something better. Sprinkling the code with inc/dec refs everywhere makes maintainance more difficult. E.g.,

  let pin1 e fn arg = inc_ref e; let r = fn arg in dec_ref e; r

@wintersteiger
Copy link
Contributor

Yes, this one's all over the place, mainly because I was under the impression that OCaml isn't allowed to garbage-collect function arguments while the function is still executing, but apparently this is what's happening now (maybe only in 4.0+?). I haven't got a plan of attack to this problem either. Maybe there's a magic function that tells the GC not to collect those objects?

@wintersteiger
Copy link
Contributor

@NikolajBjorner: Actually, what happens in ML when a function is called? Does that compile to an assignment that calls inc_ref, or can we perhaps make it do that by adding some operator=s or alike?

@NikolajBjorner
Copy link
Contributor

Following up on my comment on making things a bit more encapsulated. Here is another idea.
We could ensure that objects that should not be gc-ed are touched after the relevant calls.

E.g.:

 let t = ref (Obj.repr null)
 let touch o = let t1 = !t in t := Obj.repr o; t := t1
 let protect f a = let r = f a in touch a; r
 let protect2 f a b = let r = f a b in touch a; touch b; r
 let protect3 f a b c = let r = f a b c in touch a; touch b; touch c; r
 let protect4 f a b c d = let r = f a b c d in touch a; touch b; touch c; touch d; r

Then one can define:

  let get_size = protect (fun ( x : ast_vector ) ->Z3native.ast_vector_size (z3obj_gnc x) (z3obj_gno x))

  let get ( x : ast_vector ) ( i : int ) =
    protect (fun x -> ast_of_ptr (z3obj_gc x) (Z3native.ast_vector_get (z3obj_gnc x) (z3obj_gno x) i)) x

  let set ( x : ast_vector ) ( i : int ) ( value : ast ) =
    protect2 (fun x value -> Z3native.ast_vector_set (z3obj_gnc x) (z3obj_gno x) i (z3obj_gno value)) x value

let resize ( x : ast_vector ) ( new_size : int ) =
  protect (fun x -> Z3native.ast_vector_resize (z3obj_gnc x) (z3obj_gno x) new_size) x

There must be something even better, but at least encapsulating the pinning in auxiliary functions would make post-op surgeries easier to maintain.

@martin-neuhaeusser
Copy link
Contributor Author

@NikolajBjorner
I completely agree that pinning the z3_native_object records should be as opaque as possible and that sprinking the code with inc_ref/dec_ref calls should be avoided (I did so just to test if the segfaults that we encounter are gone). However, defining a "do-nothing" function to pin the expressions seems dangerous, as well. Once, the compiler becomes clever enough to optimize the call away, the error silently reappears.

To my opinion, pinning the terms with inc_ref/dec_ref (in whatever form) is not a good solution either, as it produces many calls to the native Z3 library which are almost always unnecessary.

I posted a question on the OCaml mailing list, looking for suggestions:
https://sympa.inria.fr/sympa/arc/caml-list/2016-01/msg00044.html

@jberdine
Copy link

Perhaps a bigger change than you are interested in, but what I did in http://z3.codeplex.com/SourceControl/network/forks/jjb/Z3/latest#src/api/ml/z3.0.idl was to create ocaml wrapper objects of Z3 native pointers as what ocaml calls 'custom blocks'. The comment at the beginning of that linked file gives a brief overview of the design. Custom blocks are single ocaml values, and so atomic from the perspective of the ocaml GC. The finalizers that call Z3 dec_ref functions (plus pointers to e.g. the enclosing Z3 context object) are installed in the custom block. Calls to the ocaml allocator from C are embraced by calls to Begin_roots and End_roots to keep the temporary pointers to ocaml values live across the alloc call. See e.g. c2ml_ ## T ## _opt in the file linked above (or check the code generated by camlidl for many more instances).

Are you sure that the crashes you are seeing are due to ast objects, or could they be due to context objects? I ended up with two versions of bindings, one that leaked context objects by never calling dec_ref on them, and one that properly cleaned them up. The version that cleaned them up relied on the ocaml Gc module finalizers being guaranteed to be called in reverse-registration order, and was careful to ensure that the finalizers for contexts were registered first.

Temporarily pinning objects might be easier done with the C caml_register_generational_global_root and caml_remove_generational_global_root functions, avoiding many calls to the Z3 lib.

@wintersteiger
Copy link
Contributor

Awesome, thanks for your comments @jberdine! I agree, we could take this down into the stubs where ocaml can't look into the code, so it won't collect anything. I kind of like the idea of using caml_register_generational_global_root, but it's not particularly nice either. Do you know whether that's compile-time and thus cheap or runtime expensive?

Formally speaking, it's actually OK if OCaml collects the "high-level" list-of-ast object. By creating a second copy of the native object, we should increment the ref counts, because we do in fact hold another reference, i.e., *_lton should actually call inf_ref on each object, and we delete the list-of-native object after the function call (i.e., call dec_ref on each). We could simply introduce a new data structure which calls the dec_ref for each element when the list/array goes out of scope (e.g., put a finalizer onto the array?), but OCaml users seem not to be partial to object-orientation it seems. As Martin said, this will also create loads of API calls which isn't particularly attractive either.

Originally the idea was to create a new luxury, highest-level structure as in the other APIs, for OCaml without the need for camlidl. But, almost nothing of that idea remains, so perhaps it's best if we just revert the native layer to what camlidl used to generate, and perhaps even use it, if it comes with all the different ocaml flavours now. We could still keep the module structure in z3.ml/mli which people seem to like.

@jberdine
Copy link

I agree, we could take this down into the stubs where ocaml can't look
into the code, so it won't collect anything.

The issue is not so much about C vs OCaml at the language level but at
what the OCaml GC sees. AFAIU, currently you store native pointers to
Z3 objects directly into fields of ocaml record values. The ocaml GC
only knows that these are pointers since they are at least 2-aligned,
and not to be followed by the gc since they point out of the ocaml heap.
These present some tricky cases for the gc, and note that since 4.02
there is a configure-time option to disallow them and optimize the gc.

With custom blocks, the header tells the gc that there are no pointers
to ocaml values in the block, which simplifies things.

But the main issue seems to be to keep the native pointer to a Z3 object
from outliving the ocaml value that the finalizer is associated with.
With naked pointers, this is hard, since ocaml code can separate the
wrapper object (with the finalizer) from the native pointer just be
getting a field. With custom blocks, ocaml code cannot separate the
native pointer from the ocaml value with the finalizer. C code can, and
so pointers to wrapper objects need to be protected from collection for
the duration of C code that accesses the native pointer. Since the gc
only runs when the ocaml allocator is called, this protection is often
pretty simple. The CAMLparam macros are often enough. If the native
pointers are long-lived (e.g. across multiple api calls), then
registering roots enters the picture.

One thing to keep in mind with registering roots is that this technique
is often not enough to ensure ordering constraints such as ast
finalizers are called before context finalizers. In such cases you
would need roots which are themselves in the gc heap. The difficulty is
that many objects may die in a collection pass, and their finalizers are
called in arbitrary order. So even if an ast wrapper has a pointer to a
context, and that pointer is registered as a gc root, the ast and
context wrappers can die in the same gc pass, and the context's
finalizer can be called first. This is for the C-level finalizers
attached to custom blocks, the ocaml-level finalizers provided by the Gc
module are rather heavier and provide a strong ordering guarantee.

I kind of like the idea of using
caml_register_generational_global_root, but it's not particularly nice
either. Do you know whether that's compile-time and thus cheap or
runtime expensive?

Run time. Registering a root adds the address to the set of addresses
the gc starts scanning from. I don't know if having lots of
generational global roots is very expensive. When I tried using them I
ran into the ordering issues described above and had to switch to the
ocaml Gc module finalizers instead. I don't have the data anymore, but
the version of the bindings that leak contexts was motivated by wanting
to avoid that huge number of heavy-weight finalizers. SLAyer chewed
through way too many Z3 objects for that. The cost was both run time as
well as delaying when objects were collected, increasing memory
consumption.

Formally speaking, it's actually OK if OCaml collects the "high-level"
list-of-ast object. By creating a second copy of the native object, we
should increment the ref counts, because we do in fact hold another
reference, i.e., *_lton should actually call inf_ref on each object,
and we delete the list-of-native object after the function call (i.e.,
call dec_ref on each).

Yes, there are two approaches that at some level are equivalent. The
custom block one prevents ocaml code from being able to make copies of
native pointers, and thereby ensures that the native pointer does not
live past the finalizer call. Another approach is to allow ocaml code
to make copies, but require it to call inc_ref and dec_ref manually.

We could simply introduce a new data structure which calls the dec_ref
for each element when the list/array goes out of scope (e.g., put a
finalizer onto the array?), but OCaml users seem not to be partial to
object-orientation it seems.

Yeah, we have a real module system.

As Martin said, this will also create loads of API calls which isn't
particularly attractive either.

Originally the idea was to create a new luxury, highest-level
structure as in the other APIs, for OCaml without the need for
camlidl. But, almost nothing of that idea remains, so perhaps it's
best if we just revert the native layer to what camlidl used to
generate, and perhaps even use it, if it comes with all the different
ocaml flavours now.

Flavors? The code it generates uses type names defined in the ocaml
runtime headers, so they automatically adjust to e.g. word size with the
runtime system.

Still, with the code camlidl generates as a guide, it is hopefully not
too big a job to adjust the Z3 generator to match. The main point where
the approaches differ that is important here, IIRC, is that camlidl
allows defining per-C-type conversions between OCaml and C values in C.
I don't know how big the consequences would be of changing to handle
reference counting in the C code rather than having an ultimately-thin C
layer and doing everything else in ocaml.

We could still keep the module structure in z3.ml/mli which people
seem to like.

Yes, absolutely, that is a real win.

@wintersteiger
Copy link
Contributor

I just spent another full working day exclusively on installing an ocaml compiler on my office machine. Out of all the supported Windows ways of compiling and binary packages, only one of them yielded an ocamlopt binary that I could run, but which was unable to link any executable against my system libraries. After 8 hours of pain and cursing I gave in and ssh'ed into our OSX machine, but the ocaml installed there (4.02.1) does not reproduce the problem.

Seriously, I don't think there is enough time left in my life to compile even a single ocaml program.

@wintersteiger
Copy link
Contributor

An update for the others: I recently pushed all of those issues down into the C layer and I keep that in a separate branch called new-ml-api on my fork for now (see here). This works OK, but I'm sure there are some bugs still lurking in all of that, and Martin has sent me some new interaction logs that trigger errors, but so far I don't know whether they are really in this layer, or in Z3 proper.

@wintersteiger wintersteiger self-assigned this Mar 5, 2016
@martin-neuhaeusser
Copy link
Contributor Author

The OCaml API has been re-worked in the current master.
With the new API, the segmentation faults when using the OCaml API are gone.
Therefore I close this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants