4.04.0 changes explanation

Florian Angeletti edited this page Dec 8, 2016 · 25 revisions

This page contains the 4.04.0 changelog, quoted, with detailed explanations and comments on each item.

(This is the first page in this wiki, so we are still iterating on the preferred format and conventions.)

OCaml 4.04.0 (November 4th, 2016)

Language features:

  • MPR #7233: Support GADT equations on non-local abstract types (Jacques Garrigue)

When opening a GADT constructor gives "new type information", this information is only taken into account by the type-checker when it can be encoded as an equality between an abstract type t and a more precise type. In existing GADT code, those abstract types are introduced locally using the (type t) or type t . ... constructions; in particular, equations involving non-local abstract types were not supported. The code below would previously not type-check, and now it does:

type (_, _) eq = Refl : ('a, 'a) eq

module type S = sig
  type t
  val eql : (t, int) eq
end

module F (M : S) = struct
  let zero : M.t =
    match M.eql with
    | Refl -> 0 (* locally we have the equation (M.t = int) *)
end

  • GPR #187, GPR #578: Local opening of modules in a pattern. Syntax: M.(p), M.[p], M.[| p |], M.{p} (Florian Angeletti, Jacques Garrigue, review by Alain Frisch)

Note that M.(p) is supported, but not let open M in p which could raise various conflicts in the grammar.
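As a small illustration of the new pattern forms, here is a sketch using a made-up module Color (the module, its types and the three functions are hypothetical names, not part of the changelog):

```ocaml
module Color = struct
  type t = Red | Green | Blue
  type point = { x : int; y : int }
end

(* M.(p): constructors inside the parentheses are looked up in Color. *)
let is_warm = function
  | Color.(Red) -> true
  | Color.(Green | Blue) -> false

(* M.{p}: record pattern whose fields are looked up in Color. *)
let sum = function Color.{ x; y } -> x + y

(* M.[p]: list pattern whose elements are looked up in Color. *)
let starts_red = function
  | Color.[Red; _] -> true
  | _ -> false
```

Without the local open, each constructor or field would have to be qualified by hand (Color.Red, { Color.x; y }, and so on).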

  • GPR #301: local exception declarations let exception ... in (Alain Frisch)

A common pattern is to use exceptions for early exit:

   try
     for i = 0 to n - 1 do
       if foo then raise Exit
     done
   with Exit -> ()

but note that reasoning on this code relies on the assumption that the rest of the code will not raise Exit; in particular, if a library function used in the loop raised the exception, our assumption that there is a single raise site would be violated. Local exceptions let us make sure that this cannot happen:

   let exception Break in
   try
     for i = 0 to n - 1 do
       if foo then raise Break
     done
   with Break -> ()

another part of the code can only raise Break if we explicitly passed this local exception constructor to it.

Exceptions with arguments are of course supported: let exception Return of int * bool in ....
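For instance, a local exception carrying an argument can be used to return a result from the middle of an iteration. This is only a sketch: find_index and Found are illustrative names, not something introduced by the GPR.

```ocaml
(* Returns the index of the first element satisfying [p], if any.
   [Found] is local to each call, so no other code can raise or catch it
   unless we explicitly hand it over. *)
let find_index p arr =
  let exception Found of int in
  try
    Array.iteri (fun i x -> if p x then raise (Found i)) arr;
    None
  with Found i -> Some i
```

A call such as find_index (fun x -> x > 2) [| 1; 2; 3; 4 |] stops at the first match instead of visiting the whole array.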

  • GPR #508: Allow shortcut for extension on semicolons: ;%foo (Jeremie Dimino)

There is no builtin semantics for this new construction; it is meant for ppx extensions. As with all extensions (as opposed to attributes [@foo]), the compiler will fail if some remain in the pre-processed program. See Yaron Minsky's blog post on monadic let syntax, which makes the case for monadic semicolon.

  • GPR #606: optimized representation for immutable records with a single field, and concrete types with a single constructor with a single argument. This is triggered with a [@@unboxed] attribute on the type definition. Currently mutually recursive datatypes are not well supported, this limitation should be lifted in the future (see MPR #7364). (Damien Doligez)

This is a very cool feature for implementing cost-free abstractions. Note the limitations on mutually recursive datatypes noticed by Markus Mottl, which will hopefully be fixed in the next release. The following declaration will not be accepted:

type ('a, 'kind) tree =
  | Root : { mutable value : 'a; mutable rank : int } -> ('a, [ `root ]) tree
  | Inner : { mutable parent : 'a node } -> ('a, [ `inner ]) tree
and 'a node = Node : ('a, _) tree -> 'a node [@@ocaml.unboxed]

as the two types tree and node are checked sequentially, not in a mutually recursive way: when checking tree, node is considered unknown (maybe float), and thus the declaration is rejected. (There would be soundness issues in silently unboxing records with float fields.)
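For comparison, here is a sketch of declarations that are accepted, since they are not mutually recursive. The type names and the add function are made up for the example.

```ocaml
(* A single constructor with a single argument: no extra box is allocated. *)
type name = Name of string [@@unboxed]
type meters = Meters of float [@@unboxed]

(* An immutable record with a single field works the same way. *)
type id = { id : int } [@@unboxed]

(* The wrapper still gives a distinct type, so names, lengths and ids
   cannot be mixed up by accident. *)
let add (Meters a) (Meters b) = Meters (a +. b)
let next { id } = { id = id + 1 }
let greet (Name s) = "hello " ^ s
```

The abstraction is free at runtime: a value of type name is represented exactly like the underlying string.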

Breaking Changes:

Compiler user-interface and warnings:

  • MPR #6475, GPR #464: interpret all command-line options before compiling any files, changes (improves) the semantics of repeated -o options or -o combined with -c see the super-detailed commit message at da56cf6dfdc13 (whitequark)

This change may break code that uses the ocaml compiler as a frontend to compile .c files, but we checked that it is compatible with all OPAM packages.

There has been some back-and-forth on this issue since the initial commit (see ocaml/ocaml#758, ocaml/ocaml#761), and the final behavior unfortunately does not really correspond to what is described in the commit message. Currently the only combination of -c and -o that is allowed on C files is when -o gives exactly the path that the OCaml compiler would have chosen by default (sourcename.o as a path relative to the compilation directory).

I (@gasche) expect this issue to evolve again in the future -- hopefully we can have -o start becoming useful instead of just enforcing the status quo.

  • MPR #7147, GPR #475: add colors when reporting errors generated by ppx rewriters. Remove the Location.errorf_prefixed function which is no longer relevant (Simon Cruanes, Jérémie Dimino)

As far as I understand, this may only break your code if you used Location.errorf_prefixed through compiler-libs. You should use Location.errorf instead, as the error prefix is now included in the default reporting function.

  • GPR #591: Improve support for OCAMLPARAM: (i) do not use objects files with -a, -pack, -shared; (ii) use "before" objects in the toplevel (but not "after" objects); (iii) use -I dirs in the toplevel, (iv) fix bug where -I dirs were ignored when using threads (Marc Lasson, review by Damien Doligez and Alain Frisch)

This patch changes the semantics of cm{o,x} and cm{a,xa} in OCAMLPARAM (they are only used at link time and ignored when building archives, packages and shared libraries), so it could in theory break scripts using OCAMLPARAM and relying on the old semantics.

Build system:

  • GPR #512, GPR #587: Installed ocamlc, ocamlopt, and ocamllex are now the native-code versions of the tools, if those versions were built. The bytecode versions are now available with the .byte extension. (Demi Obenour)

This is a good change for users, as it means that ocamlc will be the fast version by default (instead of the bytecode-compiled version). Of course, ocamlrun ocamlc previously worked and now it does not anymore, so this can break scripts. You can fix those scripts by using ocamlc.byte instead: ocamlrun ocamlc.byte still works. Another reason for this change to break builds is that the native-code versions of the tools do not honor OCAMLRUNPARAM=l=... (which can be useful to avoid stack overflow in the compiler).

Bug fixes:

  • MPR #6505: Missed Type-error leads to a segfault upon record access. (Jacques Garrigue, extra report by Stephen Dolan) Proper fix required a more restrictive approach to recursive types: mutually recursive types are seen as abstract types (i.e. non-contractive) when checking the well-foundedness of the recursion.

To fix a bug in the type-checker that could lead to unsoundness, Jacques Garrigue had to make the handling of recursive types more restrictive in mutually recursive definitions. They are now seen as non-contractive, meaning that the following definition is rejected:

  type 'a u = < x : 'a>
  and 'a t = 'a t u;;

whereas the same types declared in two separate definitions are still accepted:

  type 'a u = < x : 'a>
  type 'a t = 'a t u;;

("Contractiveness" is the criterion that prevents you from defining the ill-defined type type 'a t = 'a t: whenever we have a right-hand-side of the form ('a t) u, we must check that u is "contractive", that it expands to at least one head type constructor before using its own parameter; type 'a u = 'a would be rejected here, and type 'a u = < x : 'a > is now rejected if it's part of the same mutually recursive group, as those are not considered contractive anymore.)

  • MPR #6752: Nominal types and scope escaping. Revert to strict scope for non-generalizable type variables, cf. Mantis. Note that this is actually stricter than the behavior before 4.03, cf. MPR #7313, meaning that you may sometimes need to add type annotations to explicitly instantiate non-generalizable type variables. (Jacques Garrigue, following discussion with Jeremy Yallop, Nicolas Ojeda Bar and Alain Frisch)
  • MPR #7278: Prevent private inline records from being mutated (Alain Frisch, report by Pierre Chambart)

If a type is exported as private through a module interface, then it should be possible to inspect values of this type (access fields, pattern-match on its constructors), but not build new values or change existing values. See the manual on private declarations. The idea is to use this privacy restriction to enforce additional invariants on the value, to be guaranteed by the construction and mutation functions exported by the inner module. The restriction that values accessed through a private view should not be mutable (even when the type declaration allows mutability) was not enforced on inline records, leading to code such as the following being accepted:

module Bad(M : sig type a = private A of { mutable i : int } end) = struct
  let f (A r) = r.i <- 3
end

This is now forbidden. Code mutating private inline mutable fields should be fixed by dropping the private modifier, or exporting a mutation function from within the module: M could export a change_i function for example.
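The second fix could look like the sketch below. The module Counter and the function make are hypothetical; change_i is the mutation function suggested above.

```ocaml
module Counter : sig
  type t = private A of { mutable i : int }
  val make : int -> t
  (* The only way for clients to mutate the field. *)
  val change_i : t -> int -> unit
end = struct
  type t = A of { mutable i : int }
  let make i = A { i }
  let change_i (A r) v = r.i <- v
end

(* Clients can still inspect private values by pattern matching... *)
let read (Counter.A r) = r.i

(* ...but writing [r.i <- 3] here would be rejected by the type-checker. *)
```

Since every mutation goes through change_i, the module keeps a single place where invariants on i can be checked.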

  • GPR #533: Thread library: fixed [Thread.wait_signal] so that it converts back the signal number returned by [sigwait] to an OS-independent number (Jérémie Dimino)

Other changes

Compiler user-interface and warnings:

  • MPR #7139: clarify the wording of Warning 38 (Unused exception or extension constructor) (Gabriel Scherer)

  • MPR #7169, GPR #501: clarify the wording of Warning 8 (Non-exhaustivity warning for pattern matching) (Florian Angeletti, review and report by Gabriel Scherer)

  • GPR #648: New -plugin option for ocamlc and ocamlopt, to dynamically extend the compilers at runtime. (Fabrice Le Fessant)

  • GPR #684: Detect unused module declarations (Alain Frisch)

  • GPR #706: Add a settable Env.Persistent_signature.load function so that cmi files can be loaded from other sources. This can be used to create self-contained toplevels. (Jérémie Dimino)

Standard library:

  • GPR #473: Provide Sys.backend_type so that user can write backend-specific code in some cases (for example, code generator). (Hongbo Zhang)

  • MPR #6279, GPR #553: implement Set.map (Gabriel Scherer)

A bug was quickly found in this implementation (MPR#7403, reported by @talex5). It is fixed in the 4.04 branch, but it is probably best to avoid using Set.map before the fix is delivered to users.

  • MPR #6820, GPR #560: Add Obj.reachable_words to compute the "transitive" heap size of a value (Alain Frisch, review by Mark Shinwell and Damien Doligez)

  • GPR #589: Add a non-allocating function to recover the number of allocated minor words. (Pierre Chambart, review by Damien Doligez and Gabriel Scherer)

  • GPR #626: String.split_on_char (Alain Frisch)

  • GPR #669: Filename.extension and Filename.remove_extension (Alain Frisch, request by Edgar Aroutiounian, review by Daniel Bunzli and Damien Doligez)

  • GPR#674: support unknown Sys.os_type in Filename, defaulting to Unix (Filename would previously fail at initialization time for Sys.os_type values other than "Unix", "Win32" and "Cygwin"; mirage-os uses "xen") (Anil Madhavapeddy)
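A quick tour of some of these additions, as a sketch: the module IntSet and the value names are made up, and the Set.map caveat mentioned above applies if you are running 4.04.0 itself.

```ocaml
module IntSet = Set.Make (struct
  type t = int
  let compare = compare
end)

(* String.split_on_char keeps empty fields. *)
let fields = String.split_on_char ',' "a,b,,c"

(* Only the last extension is considered, and the dot is included. *)
let ext = Filename.extension "archive.tar.gz"
let base = Filename.remove_extension "archive.tar.gz"

(* Set.map may merge elements: the result is again a set. *)
let halves =
  IntSet.elements (IntSet.map (fun x -> x / 2) (IntSet.of_list [1; 2; 3; 4; 5]))

(* Sys.backend_type tells which backend produced the running program. *)
let backend =
  match Sys.backend_type with
  | Sys.Native -> "native"
  | Sys.Bytecode -> "bytecode"
  | Sys.Other name -> name
```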

Other libraries

  • MPR#4834, GPR#592: Add a Bigarray.Genarray.change_layout function to switch bigarrays between C and Fortran layouts. (Guillaume Hennequin, review by Florian Angeletti)

Two competing memory layouts coexist for sensible multidimensional array implementations: row major (C) and column major (Fortran). Since the choice between the two is essentially arbitrary, libraries and languages tend to vary in their choice. It is therefore very useful for compatibility reasons to be able to switch between the two layouts.
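A minimal sketch of the new function follows. The value names are made up; the code assumes a recent OCaml where Bigarray is part of the standard library (with 4.04 itself the bigarray library has to be linked explicitly). As I understand the documentation, no data is copied: the dimensions are reversed, and C indices (a, b) become Fortran indices (b+1, a+1).

```ocaml
(* A 2x3 array in C (row-major) layout, with 0-based indices. *)
let c_arr =
  let open Bigarray in
  let a = Genarray.create int c_layout [| 2; 3 |] in
  Genarray.fill a 0;
  Genarray.set a [| 1; 2 |] 42;
  a

(* The same storage viewed in Fortran (column-major) layout, 1-based indices. *)
let f_arr = Bigarray.Genarray.change_layout c_arr Bigarray.fortran_layout

(* C index (1, 2) corresponds to Fortran index (3, 2). *)
let same_cell = Bigarray.Genarray.get f_arr [| 3; 2 |]
```

Because the two views share their storage, a write through f_arr is visible through c_arr as well.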

Code generation and optimizations:

  • MPR #4747, GPR #328: Optimize Hashtbl by using in-place updates of its internal bucket lists. All operations run in constant stack size and are usually faster, except Hashtbl.copy which can be much slower (Alain Frisch)

  • MPR #6217, GPR #538: Optimize performance of record update: no more performance cliff when { foo with t1 = ..; t2 = ...; ... } hits 6 updated fields (Olivier Nicole, review by Thomas Braibant and Pierre Chambart)

This change was originally listed as breaking in the Changelog, but this was a mistake: it does not change program behavior, except by making programs faster.

  • MPR #7023, GPR #336: Better unboxing strategy (Alain Frisch, Pierre Chambart)

  • MPR #7244, GPR #840: Ocamlopt + flambda requires a lot of memory to compile large array literal expressions (Pierre Chambart, review by Mark Shinwell)

  • MPR #7291, GPR #780: Handle specialisation of recursive function that does not always preserve the arguments (Pierre Chambart, Mark Shinwell, report by Simon Cruanes)

  • GPR #427: Obj.is_block is now an inlined OCaml function instead of a C external. This should be faster. (Demi Obenour)

  • GPR #580: Optimize immutable float records (Pierre Chambart, review by Mark Shinwell)

  • GPR #602: Do not generate dummy code to force module linking (Pierre Chambart, reviewed by Jacques Garrigue)

  • MPR #7328, GPR #702: Do not eliminate boxed int divisions by zero and avoid checking twice if divisor is zero with flambda. (Pierre Chambart, report by Jeremy Yallop)

  • GPR #703: Optimize some constant string operations when the "-safe-string" configure time option is enabled. (Pierre Chambart)

  • GPR #707: Load cross module information during a meet (Pierre Chambart, report by Leo White, review by Mark Shinwell)

  • GPR #709: Share a few more equal switch branches (Pierre Chambart, review by Gabriel Scherer)

  • GPR #712: Small improvements to type-based optimizations for array and lazy (Alain Frisch, review by Pierre Chambart)

  • GPR #714: Prevent warning 59 from triggering on Lazy of constants (Pierre Chambart, review by Leo White)

  • GPR #723 Sort emitted functions according to source location (Pierre Chambart, review by Mark Shinwell)

  • Lack of type normalization led to missing simple compilation for "lazy x" (Alain Frisch)

Runtime system:

  • MPR #7210, GPR #562: Allows registering finalisation functions that are called only when a value will never be reachable anymore. The drawback compared to the existing ones is that the finalisation function is not called with the value as argument. These finalisers are registered with Gc.finalise_last (François Bobot, reviewed by Damien Doligez and Leo White)
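A sketch of how such a finaliser might be registered (finalised and make_resource are made-up names). Note that the callback takes unit rather than the value, and that when it runs is up to the GC, so the example makes no claim about the final counter value.

```ocaml
(* Counts how many resources the GC has declared dead so far. *)
let finalised = ref 0

let make_resource () =
  (* The value must be heap-allocated; a reference cell is. *)
  let r = ref 0 in
  Gc.finalise_last (fun () -> incr finalised) r;
  r

let () =
  let r = make_resource () in
  r := 1;
  (* After this point [r] is dead; a major collection may run the finaliser. *)
  Gc.full_major ()
```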

  • GPR#247: In previous OCaml versions, inlining caused stack frames to disappear from stacktraces. This made debugging harder in presence of optimizations, and flambda was going to make this worse. The debugging information produced by the compiler now enables the reconstruction of the original backtrace. Use Printexc.get_raw_backtrace_next_slot to traverse the list of inlined stack frames.
    (Frédéric Bour, review by Mark Shinwell and Xavier Leroy)

One question remains: should the semantics of backtraces be guaranteed (so a program can rely on their contents, e.g. for tests)?

A specific case is the preservation of tail-call information, discussed in GPR #739. A tail-call replaces the stack frame of the caller, which disappears from the backtrace. With inlining, this loss of information is known at compile-time. The question becomes: should the compiler be as precise as possible or generate less useful information to stay close to the original program? This tension will increase as flambda gets more clever.
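To make the traversal concrete, here is a sketch that counts all frames of a raw backtrace, inlined ones included. count_frames is an illustrative name; the Printexc functions are the ones from the standard library.

```ocaml
(* Counts every frame in [bt]: each physical slot, plus the chain of
   inlined frames reachable from it through get_raw_backtrace_next_slot. *)
let count_frames (bt : Printexc.raw_backtrace) : int =
  let rec chain slot acc =
    match Printexc.get_raw_backtrace_next_slot slot with
    | None -> acc + 1
    | Some next -> chain next (acc + 1)
  in
  let total = ref 0 in
  for i = 0 to Printexc.raw_backtrace_length bt - 1 do
    total := chain (Printexc.get_raw_backtrace_slot bt i) !total
  done;
  !total
```

When nothing was inlined (or in bytecode), the result is simply the length of the raw backtrace.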

  • GPR #590: Do not perform compaction if the real overhead is less than expected (Thomas Braibant)

Tools:


  • MPR #7189: toplevel #show, follow chains of module aliases (Gabriel Scherer, report by Daniel Bünzli, review by Thomas Refis)

  • MPR #7248: have ocamldep interpret -open arguments in left-to-right order (Gabriel Scherer, report by Anton Bachin)

  • MPR #7272, GPR #798: ocamldoc, missing line breaks in type_*.html files (Florian Angeletti)

  • MPR #7290: ocamldoc, improved support for inline records (Florian Angeletti)

Inline records should no longer make ocamldoc crash. Moreover, inline record fields can now be documented like ordinary record fields, e.g.

type a = A of { x:bool (** documentation for field x *); y:bool (** documentation for field y *) }

  • MPR #7323, GPR #750: ensure "ocamllex -ml" works with -safe-string (Hongbo Zhang)

  • MPR #7350, GPR #806: ocamldoc, add viewport metadata to generated html pages (Florian Angeletti, request by Daniel Bünzli)

  • GPR #452: Make the output of ocamldep more stable (Alain Frisch)

  • GPR #548: empty documentation comments (Florian Angeletti)

  • GPR #575: Add the -no-version option to the toplevel (Sébastien Hinderer)

  • GPR #598: Add a --strict option to ocamlyacc treat conflicts as errors (this option is now used for the compiler's parser) (Jeremy Yallop)

  • GPR #613: make ocamldoc use -open arguments (Florian Angeletti)

The behavior of ocamldoc -open is now consistent with the behavior of ocamlc/opt and ocamldep. In particular, this should help to build documentation for libraries that use aliases for namespacing.

  • GPR #718: ocamldoc, fix order of extensible variant constructors (Florian Angeletti)

Extensible variant constructors are now displayed in the right order in the documentation. As an important side-effect, these constructor documentation comments are no longer discarded.

Debugging and profiling:

  • GPR #585: Spacetime, a new memory profiler (Mark Shinwell, Leo White)

Runtime system:

  • MPR #7203, GPR #534: Add a new primitive caml_alloc_float_array to allocate an array of floats (Thomas Braibant)

Manual and documentation:

  • MPR #7007, MPR #7311: document the existence of OCAMLPARAM and ocaml_compiler_internal_params (Damien Doligez, reports by Wim Lewis and Gabriel Scherer)

  • MPR #7243: warn users against using WinZip to unpack the source archive (Damien Doligez, report by Shayne Fletcher)

  • MPR #7245, GPR #565: clarification to the wording and documentation of Warning 52 (fragile constant pattern) (Gabriel Scherer, William, Adrien Nader, Jacques Garrigue)

  • MPR #7265, GPR #769: Restore 4.02.3 behaviour of Unix.fstat, if the file descriptor doesn't wrap a regular file (win32unix only) (Andreas Hauptmann, review by David Allsopp)

  • MPR #7288: flatten : Avoid confusion (Damien Doligez, report by user 'tormen')

  • MPR #7355: Gc.finalise and lazy values (Jeremy Yallop)

  • GPR #841: Document that [Store_field] must not be used to populate arrays of values declared using [CAMLlocalN] (Mark Shinwell)

Build system:

  • GPR #324: Compiler developers: Adding new primitives to the standard runtime doesn't require anymore to run make bootstrap (François Bobot)

  • GPR #384: Fix compilation using old Microsoft C Compilers not supporting secure CRT functions (SDK Visual Studio 2005 compiler and earlier) and standard 64-bit integer literals (Visual Studio .NET 2002 and earlier) (David Allsopp)

  • GPR #507: More sharing between Unix and Windows makefiles (whitequark, review by Alain Frisch)

  • GPR #687: "./configure -safe-string" to get a system where "-unsafe-string" is not allowed, thus giving stronger non-local guarantees about immutability of strings (Alain Frisch, review by Hezekiah M. Carty)

OCaml 4.02.0 introduced the -safe-string flag to turn the string type into an immutable buffer, and added a new bytes type for mutable content. Since then, many community libraries have migrated to the new model and are compatible with "safe string" mode. OCaml 4.04.0 introduces the next step to promoting immutable strings via a new configure-time option that turns safe strings into the default mode. In this mode, if the project has specified -unsafe-string, it is flagged as a compilation error.

Making immutability the default has the advantage of permitting more compiler optimisations. For instance, GPR#703 enables significant improvements in the Ctypes foreign function stub generator.
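For code that still mutates strings in place, the usual migration is to go through bytes explicitly. A sketch, with capitalize_first as a made-up example function:

```ocaml
(* Works in safe-string mode: the string is never mutated, only a
   private bytes copy of it. *)
let capitalize_first (s : string) : string =
  let b = Bytes.of_string s in
  if Bytes.length b > 0 then
    Bytes.set b 0 (Char.uppercase_ascii (Bytes.get b 0));
  Bytes.to_string b
```

The two conversions copy the data, which is the price paid for the guarantee that no one else can observe the mutation.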

Bug fixes:

  • MPR #7112: Aliased arguments ignored for equality of module types (Jacques Garrigue, report by Leo White)

  • MPR #7134: compiler forcing aliases it shouldn't while reporting type errors (Jacques Garrigue, report and suggestion by sliquister)

  • MPR #7153: document that Unix.SOCK_SEQPACKET is not really usable.

  • MPR #7165, GPR #494: uncaught exception on invalid lexer directive (Gabriel Scherer, report by KC Sivaramakrishnan using afl-fuzz)

  • MPR #7257, GPR #583: revert a 4.03 change of behavior on (Unix.sleep 0.), it now calls (nano)sleep for 0 seconds as in (< 4.03) versions. (Hannes Mehnert, review by Damien Doligez)

The longstanding behaviour of OCaml when handling a Unix.sleep 0 was to issue a system call for 0 seconds. OCaml 4.03 changed this behaviour to return immediately, which makes it difficult for profiling tools such as dtrace to accurately track sequences of sleeps. OCaml 4.04 now reverts back to the original behaviour of always issuing a nanosleep system call for a zero value.

  • MPR #7260: GADT + subtyping compile time crash (Jacques Garrigue, report by Nicolas Ojeda Bar)

  • MPR #7269: Segfault from conjunctive constraints in GADT (Jacques Garrigue, report by Stephen Dolan)

  • MPR #7276: Support more than FD_SETSIZE sockets in Windows' emulation of select (David Scott, review by Alain Frisch)

  • MPR #7284: Bug in mcomp_fields leads to segfault (Jacques Garrigue, report by Leo White)

  • MPR #7285: Relaxed value restriction broken with principal (Jacques Garrigue, report by Leo White)

  • MPR #7297: -strict-sequence turns off Warning 21 (Jacques Garrigue, report by Valentin Gatien-Baron)

  • MPR #7299: remove access to OCaml heap inside blocking section in win32unix (David Allsopp, report by Andreas Hauptmann)

  • MPR #7300: remove access to OCaml heap inside blocking in Unix.sleep on Windows (David Allsopp)

  • MPR #7305: -principal causes loop in type checker when compiling (Jacques Garrigue, report by Anil Madhavapeddy, analysis by Leo White)

  • MPR #7330: Missing exhaustivity check for extensible variant (Jacques Garrigue, report by Elarnon *)

  • MPR #7374: Contractiveness check unsound with constraints (Jacques Garrigue, report by Leo White)

  • MPR #7378: GADT constructors can be re-exposed with an incompatible type (Jacques Garrigue, report by Alain Frisch)

  • MPR #7389: Unsoundness in GADT exhaustiveness with existential variables (Jacques Garrigue, report by Stephen Dolan)

  • GPR #600: (similar to GPR #555) ensure that register typing constraints are respected at N-way join points in the control flow graph (Mark Shinwell)

  • GPR #672: Fix float_of_hex parser to correctly reject some invalid forms (Bogdan Tătăroiu, review by Thomas Braibant and Alain Frisch)

  • GPR #700: Fix maximum weak bucket size (Nicolas Ojeda Bar, review by François Bobot)

  • GPR #708 Allow more module aliases in strengthening (Leo White)

  • GPR #713, MPR #7301: Fix wrong code generation involving lazy values in Flambda mode (Mark Shinwell, review by Pierre Chambart and Alain Frisch)

  • GPR #721: Fix infinite loop in flambda due to [@@specialise] annotations

  • GPR #779: Building native runtime on Windows could fail when bootstrapping FlexDLL if there was also a system-installed flexlink (David Allsopp, report Michael Soegtrop)

  • GPR #805, GPR #815, GPR #833: check for integer overflow in String.concat (Jeremy Yallop, review by Damien Doligez, Alain Frisch, Daniel Bünzli, Fabrice Le Fessant)

  • GPR #810: check for integer overflow in Array.concat (Jeremy Yallop)

  • GPR #814: fix the Buffer.add_substring bounds check to handle overflow (Jeremy Yallop)

  • GPR #880: Fix [@@inline] with default parameters in flambda (Leo White)

  • GPR #525: fix build on OpenIndiana (Sergey Avseyev, review by Damien Doligez)

Internal/compiler-libs changes:

  • MPR #7200, GPR #539: Improve, fix, and add test for parsing/pprintast.ml (Runhang Li, David Sheets, Alain Frisch)

  • GPR #351: make driver/pparse.ml functions type-safe (Gabriel Scherer, Dmitrii Kosarev, review by Jérémie Dimino)

  • GPR #516: Improve Texp_record constructor representation, and propagate updated record type information (Pierre Chambart, review by Alain Frisch)

  • GPR #678: Graphics.close_graph crashes 64-bit Windows ports (re-implementation of MPR #3963) (David Allsopp)

  • GPR #679: delay registration of docstring after the mapper is applied (Hugo Heuzard, review by Leo White)

  • GPR #872: don't attach (**/**) comments to any particular node (Thomas Refis, review by Leo White)

(**/**) is a special documentation comment used to discard some part of the .mli interface file in the generated documentation. This change makes sure that such a special comment is always considered as a toplevel (aka floating) comment, easing the implementation of this special comment in external tools.
