The MLton repository
Clone or download
MatthewFluet Merge pull request #273 from daemanos/new-overflow
Add new overflow-checking primitives

Motivation

MLton currently implements checked arithmetic with specialized primitives.
These primitives are exposed to the Basis Library implementation as functions
that implicitly raise a primitive exception:

    val +! = _prim "WordS8_addCheck": int * int -> int;

In the XML IR, special care is taken to "remember" the primive exception
associated with these primitives in order to implement exceptions correctly.  In
the SSA, SSA2, RSSA, and Machine IRs, these primitives are treated as transfers,
rather than statements.

This pull request implements a possibly better implementation of these
operations as simple primitives that return a boolean:

    val +! = _prim "Word8_add": int * int -> int;
    val +? = _prim "WordS8_addCheckP": int * int -> bool;
    val +$ = fn (x, y) => let val r = +! (x, y)
                          in  if +? (x, y) then raise Overflow else r
                          end

This would eliminate the special cases in the XML IR and in the SSA, SSA2, RSSA,
and Machine IRs, where the primitives would be treated as statements.  Other
compilers provide overflow checking via boolean-returning functions:

 * https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
 * https://llvm.org/docs/LangRef.html#arithmetic-with-overflow-intrinsics

Implementation

This patch refactors the primitive checked-arithmetic operations such that the
suffix `?` represents a new overflow-checking predicate, a suffix `!` represents
the non-overflow-checking variant, and a suffix `$` represents the
overflow-checking variant (mnemonic: `$` for "safe" or "expensive"). The
behavior of the new `$`-operations is controlled by a compile-time constant,
`MLton.newOverflow`.  When set to false (the default), the `$`-operations make
use of the old-style implicit-`Overflow` primitives.  When set to true, the
`$`-operations are implemented as an `if`-expression that branches on the result
of the corresponding `?`-operation and either raises the `Overflow` exception or
returns the result of the corresponding `!`-operation.  Finally, the bare
operation is aliased to either the `$`-form (with overflow detection enabled) or
the `!`-form (with overflow detection disabled).  Essentially:

    val +! = _prim "Word8_add": int * int -> int;
    val +? = _prim "WordS8_addCheckP": int * int -> bool;
    val +$ = if MLton.newOverflow
             then fn (x, y) => let val r = +! (x, y)
                               in  if +? (x, y) then raise Overflow else r
                               end
             else fn (x, y) => ((_prim "WordS8_addCheckP": int * int -> int;) (x, y))
                               handle PrimOverflow => raise Overflow
    val + = if MLton.detectOverflow then +$ else +!

Note that the checked-arithmetic using `!`- and `$`-operations is careful to
perform the `!`-operation before the `$`-operation.  With the native-codegens, a
new peephole optimization combines the separate unchecked-arithmetic operation
and checked-arithmetic-predicate operation into a single instruction.  For the
C-codgen, the new checked-arithmetic-predicate primitives are translated to uses
of the `__builtin_{add,sub,mul}_overflow` intrinsics (which improves upon the
previous explicit conditional checking, but requires gcc 5 or greater).
Similarly, for the LLVM-codgen, the new checked-arithmetic-predicate primitives
are translated to uses of the `{sadd,uadd,smul,umul,ssub,usub}.with.overflow`
intrinsics.  For both the C- and LLVM-codegens, it is expected that these
intrinsics will be combined with the separate unchecked-arithmetic operation.

In addition, the `RedundantTests` optimization has been extended to eliminate
the overflow test when adding or subtracting 1 with the new primitives.

Performance

The native-codegen peephole optimization and `RedundantTests` have been mostly
sufficient to keep performance on par with the older checked-arithmetic
primitives, and in some cases performance has even significantly improved.  Below
is a summary of the exceptional runtime ratios in the different codegens (both
positive and negative):

| Benchmark       | Native | LLVM |    C |
|-----------------|--------|------|------|
| even-odd        |   1.00 | 1.00 | 1.09 |
| hamlet          |   0.98 | 0.90 | 0.93 |
| imp-for         |   0.99 | 1.50 | 0.46 |
| lexgen          |   0.92 | 1.31 | 1.24 |
| matrix-multiply |   0.99 | 1.00 | 0.87 |
| md5             |   1.06 | 1.01 | 0.97 |
| tensor          |   1.01 | 1.00 | 0.57 |

No benchmarks were consistently worse or better on all codegen, e.g., the
`imp-for` benchmark performed exceptionally badly on the LLVM codegen, but was
much faster on the C codegen and about even on the native codegen.  For this
particular benchmark, the cause of the slowdown with the LLVM codegen has yet to
be discovered.  Similarly, the cause of the slowdown in `lexgen` with the C- and
LLVM-codegens is unknown.  For the `md5` benchmark, on the other hand, the cause
of the slowdown with the native codegen seems to be a failure to eliminate
common subexpressions in certain circumstances, which can potentially be
improved in the future.  Improvements in the C-codegen are likely to be due to
the better overflow checking with builtins.

Future work

The `CommonSubexp` optimization currently handles `Arith` transfers specially;
in particular, the result of an `Arith` transfer can be used in common `Arith`
transfers that it dominates.  This was done so that code like:

    (n + m) + (n + m)

can be transformed to

    let val r = n + m in r + r end

With the new checked-arithmetic-predicate primitives, the computation of the
boolean value may be common-subexpr eliminated, but `Case` discrimination will
not.  This forces the boolean value to be reified and to be discriminated
multiple times (though, perhaps `KnownCase` could eliminate subsequent
discriminations).  Extending `CommonSubexpr` to propagate flow-sensitive
relationships at `Case` transfers to the dominated targets could improve the
performance `md5` with `MLton.newOverflow true` and potentially improve
performance elsewhere as well (e.g., by propagating the results of comparisons
as well).

Once all performance slowdowns with `MLton.newOverflow true` have been
eliminated, it would be desirable to remove the old-style implicit-`Overflow`
primitives and `Arith` transfers.  This would eliminate many previous instances
of special-case code to handle these primitives and transfers.

Finally, it may be worth investigating an implementation of the
checked-operations via

    val +$ = fn (x, y) => let val b = +? (x, y)
                              val r = +! (x, y)
                          in  if b then raise Overflow else r
                          end

rather than

    val +$ = fn (x, y) => let val r = +! (x, y)
                          in  if +? (x, y) then raise Overflow else r
                          end

The advantage of calculating the boolean first is that when `x` (or `y`) is a
loop variable and `r` will be passed as the value for the next iteration, then
both `x` and `r` could be assigned the same location (pseudo-register or stack
slot).  `x` and `r` cannot share the same location when the boolean is
calculated second, because `x` and `r` are both live at the calculation of the
boolean.  See #218.  This would require reworking the native-codegen
peephole optimization.
Latest commit 0b04074 Oct 15, 2018
Permalink
Failed to load latest commit information.
basis-library Merge pull request #273 from daemanos/new-overflow Oct 15, 2018
benchmark Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
bin Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
doc Update link to C-- on CMinusMinus page May 30, 2018
ide Replace `labels` with `cl-labels` Jun 10, 2018
include Alphabetize symbols in `include/amd64-main.h` Sep 10, 2018
lib Revert unnecessary white-space changes Aug 18, 2018
man Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
mllex Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
mlnlffigen Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
mlprof Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
mlton Merge pull request #273 from daemanos/new-overflow Oct 15, 2018
mlyacc Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
package Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
regression Merge pull request #273 from daemanos/new-overflow Oct 15, 2018
runtime Merge pull request #273 from daemanos/new-overflow Oct 15, 2018
util/cm2mlb Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
.gitignore Un-ignore vim configuration files Aug 14, 2018
.travis.yml Use GCC5 for Travis CI Aug 13, 2018
CHANGELOG.adoc Merge pull request #273 from daemanos/new-overflow Oct 15, 2018
Dockerfile Drop `apt-get install autotools` from `./Dockerfile` Feb 1, 2018
LICENSE Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
Makefile Merge branch 'master' of github.com:MLton/mlton into parse-ssa May 18, 2018
Makefile.binary Set `LIB_REL_BIN` in `mlton` script when installing Feb 15, 2018
README.adoc Minor updates to `./README.adoc` and Features page Feb 5, 2018

README.adoc

MLton

Build Status

MLton is a whole-program optimizing compiler for the Standard ML programming language.

Features

  • Portability. Runs on the following platforms:

    • ARM: Linux (Debian).

    • Alpha: Linux (Debian).

    • AMD64: Darwin (Mac OS X), FreeBSD, Linux (Debian, Fedora, Ubuntu, …​), OpenBSD, Solaris (10 and above).

    • HPPA: HPUX (11.11 and above), Linux (Debian).

    • IA64: HPUX (11.11 and above), Linux (Debian).

    • PowerPC: AIX (5.2 and above), Darwin (Mac OS X), Linux (Debian, Fedora).

    • PowerPC64: AIX (5.2 and above).

    • S390: Linux (Debian).

    • Sparc: Linux (Debian), Solaris (8 and above).

    • X86: Cygwin/Windows, Darwin (Mac OS X), FreeBSD, Linux (Debian, Fedora, Ubuntu, …​), MinGW/Windows, NetBSD, OpenBSD, Solaris (10 and above).

  • Robustness.

    • Supports the full SML 97 language as given in The Definition of Standard ML (Revised).

    • A complete implementation of the Basis Library.

    • Generates standalone executables.

    • Compiles large programs.

    • Support for large amounts of memory (up to 4G on 32-bit systems; more on 64-bit systems).

    • Support for large array lengths (up to 231 - 1 on 32-bit systems; up to 263-1 on 64-bit systems).

    • Support for large files, using 64-bit file positions.

  • Performance.

    • Executables have excellent running times.

    • Generates small executables.

    • Untagged and unboxed native integers, reals, and words.

    • Unboxed native arrays.

    • Multiple garbage collection strategies.

    • Fast arbitrary-precision arithmetic based on the GMP.

  • Tools.

    • Source-level profiling for both time and allocation.

    • MLLex lexer generator.

    • MLYacc parser generator.

    • MLNLFFIGEN foreign-function-interface generator.

  • Extensions.

    • A simple and fast C FFI that supports calling from SML to C and from C to SML.

    • The ML Basis system for programming in the very large.

    • Libraries for continuations, finalization, interval timers, random numbers, resource limits, resource usage, signal handlers, object size, system logging, threads, weak pointers, and world save and restore.

Build and Install (from source)

Requirements

Software

  • GCC or Clang (The C compiler must support -std=gnu11.)

  • GMP (GNU Multiple Precision arithmetic library)

  • GNU Make

  • GNU Bash

  • binutils (ar, ranlib, strip, …​)

  • miscellaneous Unix utilities (diff, find, grep, gzip, patch, sed, tar, xargs, …​)

  • Standard ML compiler and tools to bootstrap:

    • MLton (mlton, mllex, and mlyacc) recommended. Pre-built binary packages for MLton can be installed via an OS package manager or (for select platforms) obtained from http://mlton.org.

    • SML/NJ (sml, ml-lex, ml-yacc) supported, but not recommended.

  • (optional, for documentation only) TeX, AsciiDoc, Pygments, GraphicsMagick or ImageMagick, …​

Hardware

  • ≥ 1GB RAM (for 32-bit platforms) or ≥ 2GB RAM (for 64-bit platforms)

Build Instructions

On typical platforms, building MLton requires no configuration and can be accomplished via:

$ make all

A small set of Makefile variables can be used to customize the build:

  • CC: Specify C compiler. Can be used for alternative tools (e.g., CC=clang or CC=gcc-7).

  • WITH_GMP_DIR, WITH_GMP_INC_DIR, WITH_GMP_LIB_DIR: Specify GMP include and library paths, if not on default search paths. (If WITH_GMP_DIR is set, then WITH_GMP_INC_DIR defaults to $(WITH_GMP_DIR)/include and WITH_GMP_LIB_DIR defaults to $(WITH_GMP_DIR)/lib.)

  • MLTON_RUNTIME_ARGS, MLTON_COMPILE_ARGS: Specify runtime and compile arguments given to (the to-be-built) mlton when compiling distributed executables ((self-compiled) mlton, mllex, mlyacc, mlprof, and mlnlffigen). Can be used for testing (e.g., MLTON_COMPILE_ARGS="-codegen c") or for downstream packaging.

  • BOOTSTRAP_MLTON_RUNTIME_ARGS, BOOTSTRAP_MLTON_COMPILE_ARGS: Specify runtime and compile arguments given to "old" mlton when compiling "bootstrapped" mlton. Can be used to work around bugs in "old" mlton when compiling "bootstrapped" mlton.

For example:

$ make CC=clang WITH_GMP_DIR=/opt/gmp MLTON_COMPILE_ARGS="-codegen c" all

The build artifacts are located under ./build. The just-built mlton can be executed via ./build/bin/mlton.

Building documentation can be accomplished via:

$ make docs

Install Instructions

On typical platforms, installing MLton (after performing make all and, optionally, make docs) to /usr/local can be accomplished via:

$ make install

A small set of Makefile variables can be used to customize the installation:

  • PREFIX: Specify the installation prefix.

For example:

$ make PREFIX=/opt/mlton install

Install (from binary package)

Requirements

Software

  • GCC or Clang (The C compiler must support -std=gnu11.)

  • GMP (GNU Multiple Precision arithmetic library)

  • GNU Make

  • GNU Bash

  • miscellaneous Unix utilities (bzip2, gzip, sed, tar, …​)

Binary Package

A .tgz or .tbz binary package can be extracted at any location, yielding README.adoc (this file), CHANGELOG.adoc, LICENSE, Makefile, bin/, lib/, and share/. The compiler and tools can be executed in-place (e.g., ./bin/mlton).

A small set of Makefile variables can be used to customize the binary package via make update:

  • CC: Specify C compiler. Can be used for alternative tools (e.g., CC=clang or CC=gcc-7).

  • WITH_GMP_DIR, WITH_GMP_INC_DIR, WITH_GMP_LIB_DIR: Specify GMP include and library paths, if not on default search paths. (If WITH_GMP_DIR is set, then WITH_GMP_INC_DIR defaults to $(WITH_GMP_DIR)/include and WITH_GMP_LIB_DIR defaults to $(WITH_GMP_DIR)/lib.)

For example:

$ make CC=clang WITH_GMP_DIR=/opt/gmp update

Install Instructions

On typical platforms, installing MLton (after optionally performing make update) to /usr/local can be accomplished via:

$ make install

A small set of Makefile variables can be used to customize the installation:

  • PREFIX: Specify the installation prefix.

For example:

$ make PREFIX=/opt/mlton install

Resources

Need help? Found a bug?

Submit an issue if you need any help. We welcome pull requests with bug fixes or changes.