Globalization of arrays and refs of "small" types in ConstantPropagation #206

Closed
MatthewFluet opened this issue Oct 21, 2017 · 1 comment

MatthewFluet commented Oct 21, 2017

Space-safety prohibits ConstantPropagation from globalizing all arrays and refs that are allocated at most once by a program. In particular, because globals are live for the duration of the program, globalizing an int list ref (for example) would not be safe-for-space: an arbitrarily large list may be written to the reference and never garbage collected (whereas, when the int list ref is not globalized, it will be garbage collected when it is no longer live). On the other hand, globalizing an int ref is safe-for-space.

However, MLton currently uses a very conservative estimation for space safety. Only "small" types may be globalized, where smallness is defined as:

     fun isSmall t =
         case dest t of
            Array _ => false
          | Datatype _ => false
          | Ref t => isSmall t
          | Tuple ts => Vector.forall (ts, isSmall)
          | Vector _ => false
          | _ => true

Note that no Datatype is small; this is conservative (since a recursive datatype could represent unbounded data), but prevents globalizing bool ref. Also, no Array is small; this is correct (because an int array ref should not be globalized), but the globalization of a val a: t array = Array_alloc[t] (l) is currently conditioned on the smallness of t array, not the smallness of t. It would be correct to globalize an array if t were small; note that to globalize val a: t array = Array_alloc[t] (l), l (the length) must be globalized and must, therefore, be a constant and the array is of constant size. (This is Stephen Weeks's relaxed notion of safe-for-space, where the constant factor blowup can be chosen per program.) In practice, it may be better to limit globalization of arrays to ones with "small" length in addition to small element type.
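
The difference between the current and the suggested array condition can be sketched on a toy type representation. This is only an illustration: the `ty` datatype below is a simplified stand-in for MLton's SSA types, and `globalizeArrayCurrent`, `globalizeArrayProposed`, and the `maxLength` bound are hypothetical names and values, not part of MLton.

```sml
(* Toy model of SSA types; MLton's real Type.t has more cases. *)
datatype ty =
   Array of ty
 | Datatype of string
 | Ref of ty
 | Tuple of ty list
 | Vector of ty
 | Word32

(* The conservative definition quoted above, restated on the toy type. *)
fun isSmall t =
   case t of
      Array _ => false
    | Datatype _ => false
    | Ref t' => isSmall t'
    | Tuple ts => List.all isSmall ts
    | Vector _ => false
    | Word32 => true

(* Current condition: the result type `t array` must be small,
   which never holds because no Array is small. *)
fun globalizeArrayCurrent elt = isSmall (Array elt)

(* Suggested condition: the element type must be small and the constant
   length must not exceed a bound. maxLength is a hypothetical threshold. *)
val maxLength = 1024
fun globalizeArrayProposed (elt, len) =
   isSmall elt andalso len <= maxLength

val pair = Tuple [Word32, Word32]
val r1 = globalizeArrayCurrent pair                     (* false *)
val r2 = globalizeArrayProposed (pair, 16)              (* true *)
val r3 = globalizeArrayProposed (Datatype "list", 16)   (* false *)
val r4 = globalizeArrayProposed (pair, 1000000)         (* false under this bound *)
```

Under this sketch an array of word pairs is rejected only because of the outer `Array` constructor, which is the case the issue proposes to relax.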

RefFlatten uses a more precise notion of small/large types (although RefFlatten was meant to be safe-for-space, the initial definition of small/large types was incorrect; see b1a0a80).

MatthewFluet added a commit to MatthewFluet/mlton that referenced this issue Oct 21, 2017
Currently, no arrays are globalized; see MLton#206.

However, if "small" arrays were globalized, then an
`Array_allocRaw[t] (l)` would be globalized as `Array_alloc[t] (l)`
and the `Array_toArray` (that casts the raw array to a normal array)
would be dropped.

Now, the `Array` abstract value tracks whether the array is raw or
normal, which is used to generate either `Array_allocRaw` or
`Array_alloc` if the array is globalized.

Also, `Array_toArray` creates the destination `Array` abstract value
with `Birth.unknown()`, which prohibits globalization.

MatthewFluet commented

There are (at least) two orthogonal improvements that could be investigated:

  • Allow globalization of "small constant" arrays:

    • Allow globalization of arrays with "small type" contents; this simply requires changing the Type.isSmall resultType to Type.isSmall (Vector.first targs) in fun bear z = ... in fun primApp.
    • Restrict globalization of arrays with "small type" contents to ones with "small" length; this requires changing the fn _ => length to check if length is a "small constant size".
  • Relax notion of "small type":

    • Pre-process the datatypes and allow any datatype with all nullary constructors to be considered small. (Simple iteration of datatypes.) This would allow bool (and other "enumeration" datatypes) to be considered small.
    • Pre-process the datatypes as with RefFlatten, allowing any datatype with all constructors with small types to be considered small. (Fixed-point iteration of datatypes.) This would allow some value carrying datatypes to be considered small, for example, some closure datatypes where the environment constructors only capture small data.
    • Post-process the datatypes after the main constant propagation analysis, taking into account components of constructors that are known to be constant. For example, a datatype like datatype t = T of int array would not be considered small by either of the previous approaches, but if it was determined that the int array were itself constant (and globalized) then the datatype would be transformed to datatype t = T of unit, which would be considered small.

MatthewFluet added a commit that referenced this issue Jan 16, 2019
Alternate strategies for globalization in ConstantPropagation

Extend globalization aspect of ConstantPropagation to support
globalization of arrays and to support different "small type"
strategies.

Closes #206.

Space-safety prohibits ConstantPropagation from globalizing all arrays
and refs that are allocated at most once by a program. In particular,
because globals are live for the duration of the program, globalizing
an `int list ref` (for example) would not be safe-for-space: an
arbitrarily large list may be written to the reference and never be
garbage collected (whereas, when the `int list ref` is not globalized,
it will be garbage collected when it is no longer live). On the other
hand, globalizing an `int ref` is safe-for-space.

However, MLton previously used only a very conservative estimation for
space safety. Only "small" types may be globalized, where smallness is
defined as:

    fun isSmall t =
        case dest t of
           Array _ => false
         | Datatype _ => false
         | Ref t => isSmall t
         | Tuple ts => Vector.forall (ts, isSmall)
         | Vector _ => false
         | _ => true

Note that no `Datatype` is small; this is conservative (since a
recursive datatype could represent unbounded data), but prevents
globalizing `bool ref`. Also, no `Array` is small; this is correct
(because an `int array ref` should not be globalized), but the
globalization of a `val a: t array = Array_alloc[t] (l)` was
conditioned on the smallness of `t array`, not the smallness of
`t`. It is correct to globalize an array if `t` is small; note that
to globalize `val a: t array = Array_alloc[t] (l)`, `l` (the length)
must be globalized and must, therefore, be a constant and the array is
of constant size. (This is Stephen Weeks's relaxed notion of
safe-for-space, where the constant factor blowup can be chosen per
program.)

This pull request adds support for alternate globalization strategies:
 * `-globalize-arrays {false|true}`: globalize arrays
 * `-globalize-refs {true|false}`: globalize refs
 * `-globalize-small-int-inf {true|false}`: globalize `IntInf` as a
   small type
 * `-globalize-small-type {1|0|2|3|4|9}`: strategies for classifying a
   type as "small":
   * `0`: constant `false` function (no types considered small)
   * `1`: no `Datatype` is considered small (original strategy)
   * `2`: `Datatype`s with all nullary constructors are considered
     small
   * `3`: `Datatype`s with all constructor arguments considered small
     according to strategy `2` are considered small
   * `4`: Fixed-point analysis of `Datatype`s to determine smallness
   * `9`: constant `true` function (all types considered small; not
     safe-for-space)

The defaults correspond to the previous behavior.
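
As a rough illustration of how strategies `2`, `3`, and `4` differ, the following sketch classifies a few toy datatypes. It is not the code in ConstantPropagation: the `ty` and `datatypeDecl` representations and the function names are hypothetical, and strategy `4` is assumed here to be a least fixed point that starts with no datatype considered small (so recursive datatypes never become small).

```sml
(* Toy model: a constructor argument is a word, a tuple, or a named datatype. *)
datatype ty = Word32 | Tuple of ty list | Data of string

(* A datatype has a name and a list of constructors; each constructor
   is given by its list of argument types (empty if nullary). *)
type datatypeDecl = {name: string, cons: ty list list}

(* Smallness of a type, relative to the datatype names known to be small. *)
fun isSmallTy small t =
   case t of
      Word32 => true
    | Tuple ts => List.all (isSmallTy small) ts
    | Data n => List.exists (fn m => m = n) small

fun names (ds: datatypeDecl list) = List.map (fn d => #name d) ds

(* Datatypes whose constructor arguments are all small relative to `small`. *)
fun step (decls: datatypeDecl list) small =
   names (List.filter
          (fn (d: datatypeDecl) =>
              List.all (List.all (isSmallTy small)) (#cons d))
          decls)

(* Strategy 2: only datatypes with all nullary constructors. *)
fun strategy2 (decls: datatypeDecl list) =
   names (List.filter
          (fn (d: datatypeDecl) => List.all List.null (#cons d))
          decls)

(* Strategy 3: arguments must be small according to strategy 2. *)
fun strategy3 decls = step decls (strategy2 decls)

(* Strategy 4 (assumed least fixed point): iterate from the empty set. *)
fun strategy4 decls =
   let
      fun loop small =
         let val small' = step decls small
         in if small' = small then small else loop small'
         end
   in
      loop []
   end

val decls : datatypeDecl list =
   [{name = "color", cons = [[], [], []]},
    {name = "shape", cons = [[Data "color"], [Word32, Word32]]},
    {name = "wrap", cons = [[Data "shape"]]},
    {name = "intlist", cons = [[], [Tuple [Word32, Data "intlist"]]]}]

val s2 = strategy2 decls   (* ["color"] *)
val s3 = strategy3 decls   (* ["color", "shape"] *)
val s4 = strategy4 decls   (* ["color", "shape", "wrap"] *)
```

In this toy model the recursive `intlist` is never classified as small, which matches the space-safety concern about unbounded data.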

Unfortunately, additional globalization has little to no (positive)
effect on benchmarks:

MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4
run time ratio
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8
DLXSimulator        1.00   1.00   1.03   1.00   1.00   0.99   1.00   1.01   1.01
checksum            1.00   1.00   1.13   1.14   1.13   1.00   1.12   1.13   1.14
flat-array          1.00   1.00   1.00   1.00   1.01   1.19   1.19   1.19   1.19
hamlet              1.00   1.00   1.02   1.01   1.01   1.00   0.96   0.97   0.97
imp-for             1.00   1.00   1.06   1.05   1.06   1.00   1.05   1.06   1.06
knuth-bendix        1.00   1.00   1.03   1.03   1.03   1.00   1.03   1.03   1.03
lexgen              1.00   0.99   1.05   0.99   1.03   1.04   1.03   1.02   1.07
model-elimination   1.00   1.01   1.02   1.02   1.02   1.00   1.03   1.02   1.03
peek                1.00   1.00   1.03   1.03   1.04   1.00   1.04   1.04   1.04
ray                 1.00   1.03   0.99   0.99   1.01   0.99   0.99   0.99   0.98
raytrace            1.00   1.03   1.00   1.02   1.03   0.99   1.00   1.01   1.01
simple              1.00   0.98   0.99   1.00   0.99   0.97   0.97   0.96   0.97
tak                 1.00   1.00   1.04   1.03   1.10   1.01   1.02   1.08   1.00
wc-scanStream       1.00   1.00   1.06   1.06   1.06   1.00   1.04   1.06   1.07

Note that `MLton0` and `MLton1` generate identical code (modulo the
random magic number), so the slowdowns in `ray` and `raytrace` are
noise, which also suggests that slowdowns/speedups of <= 3% are also
likely noise.

The slowdown in `flat-array` with `-globalize-arrays true` is explained
as follows.  The `flat-array` benchmark uses `Vector.tabulate` to
allocate a vector that is used for all iterations of the benchmark.
With `-globalize-arrays false`, the array is not globalized, and in
SSA/SSA2, we have:

    x_1212: ((word32, word32) tuple) array =
      prim Array_alloc((word32, word32) tuple) (global_138 (*0xF4240*))
...
    x_757: ((word32, word32) tuple) vector =
      prim Array_toVector((word32, word32) tuple) (x_1212)
...
    x_1287: (word32, word32) tuple =
      prim Vector_sub((word32, word32) tuple) (x_757, x_1283)

but with `-globalize-arrays true`, the array is globalized, and in
SSA/SSA2, we have:

    global_490: ((word32, word32) tuple) array =
      prim Array_alloc((word32, word32) tuple) (global_138 (*0xF4240*))
...
    x_757: ((word32, word32) tuple) vector =
      prim Array_toVector((word32, word32) tuple) (global_490)
...
    x_1286: (word32, word32) tuple =
      prim Vector_sub((word32, word32) tuple) (x_757, x_1282)

At RSSA, the `Array_toVector` becomes a header update and the array
variable is cast/copy-propagated for the vector variable;
with `-globalize-arrays false`, we have

  L_531 (x_1212: Objptr (opt_11)) CReturn {func = {..., target = GC_sequenceAllocate}} =
...
    OW64 (x_1212, ~8): Word64 := opt_12
...
    x_1354: Word32 = XW32 (Cast (x_1212, Objptr (opt_12)), x_1283, 8, 0)
    x_1353: Word32 = XW32 (Cast (x_1212, Objptr (opt_12)), x_1283, 8, 4)


but with `-globalize-arrays true`, we have

  L_488 (global_490: Objptr (opt_7)) CReturn {func = {..., target = GC_sequenceAllocate}} =
...
    OW64 (global_490, ~8): Word64 := opt_12
...
    x_1353: Word32 = XW32 (Cast (global_490, Objptr (opt_12)), x_1282, 8, 0)
    x_1352: Word32 = XW32 (Cast (global_490, Objptr (opt_12)), x_1282, 8, 4)

Finally, with `-globalize-arrays false`, `x_1212` becomes a local
(because the loops to initialize and use the vector are
non-allocating):

    RW32(2): Word32
     = XW32 (Cast (RP(0): Objptr (opt_11), Objptr (opt_12)),
             RW64(0): Word64, 8, 0): Word32
    RW32(3): Word32
     = XW32 (Cast (RP(0): Objptr (opt_11), Objptr (opt_12)),
             RW64(0): Word64, 8, 4): Word32

but with `-globalize-arrays true`:

    RW32(2): Word32
     = XW32 (Cast (glob {index = 1, isRoot = true, ty = Objptr (opt_7)},
                   Objptr (opt_12)),
             RW64(0): Word64, 8, 0): Word32
    RW32(3): Word32
     = XW32 (Cast (glob {index = 1, isRoot = true, ty = Objptr (opt_7)},
                   Objptr (opt_12)),
             RW64(0): Word64, 8, 4): Word32

The innermost loop of the benchmark goes from indexing a sequence
stored in a local (`RP(0)`) to indexing a sequence stored in a global
(`GP(1)`).  All of the codegens should implement the former by using a
hardware register for `RP(0)`, but will implement the latter with a
memory read.

In light of the above, and related to #218, it may be
beneficial to "deglobalize" object pointer globals; that is, RSSA
functions that have multiple accesses through the same object pointer
global (particularly within loops) could be translated to copy the
global to a local.
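
The "deglobalize" idea can be sketched on a toy statement list. This is a hypothetical illustration rather than an existing MLton pass: the `operand` and `stmt` types are simplified stand-ins for RSSA, the temporary name `tmp` is assumed not to clash with existing locals, and the global is assumed never to be reassigned.

```sml
(* Toy operands and statements; MLton's RSSA is considerably richer. *)
datatype operand = Global of int | Local of string | Const of int

datatype stmt =
   Move of {dst: string, src: operand}
 | Load of {dst: string, base: operand, index: operand}

(* Count how many loads use global g as their base. *)
fun countUses (g, stmts) =
   List.length
   (List.filter
    (fn Load {base = Global g', ...} => g' = g | _ => false)
    stmts)

(* If g is used as a base at least twice, copy it to a local once and
   rewrite the loads to go through that local; otherwise leave the
   statements unchanged. *)
fun deglobalize (g, tmp, stmts) =
   if countUses (g, stmts) < 2
      then stmts
   else
      let
         fun rewrite s =
            case s of
               Load {dst, base = Global g', index} =>
                  if g' = g
                     then Load {dst = dst, base = Local tmp, index = index}
                  else s
             | _ => s
      in
         Move {dst = tmp, src = Global g} :: List.map rewrite stmts
      end

val body =
   [Load {dst = "a", base = Global 1, index = Local "i"},
    Load {dst = "b", base = Global 1, index = Local "i"}]

(* Both loads now index through the local copy "g1_local". *)
val body' = deglobalize (1, "g1_local", body)
```

A real transformation would also need to decide where to place the copy (for example, outside a loop) and whether the extra local increases register pressure; the sketch ignores both.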

The slowdown in `checksum` is less easily explained.  The only new
objects globalized with `-globalize-small-type 2` as compared to
`-globalize-small-type 1` are two `bool ref` objects, corresponding to
the `exiting` flag of `basis-library/mlton/exit.sml` and the
`staticIsInUse` flag of `basis-library/util/one.sml` used by
`Int.fmt`.  That small change seems to lead to code layout and cache
effects that result in the slowdown, because the assembly code is not
substantially different.  With `-enable-pass machineShuffle` and
`-seed-rand <w>`, one can perturb the code layout and observe that the
slowdowns are not universal:

MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand 42424242
MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand deadbeef
MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand 42424242
MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand deadbeef
MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand 42424242
MLton9 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand deadbeef
MLton10 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
MLton11 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand 42424242
MLton12 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand deadbeef
MLton13 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
MLton14 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand 42424242
MLton15 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand deadbeef
MLton16 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
MLton17 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand 42424242
MLton18 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand deadbeef
MLton19 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
MLton20 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand 42424242
MLton21 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand deadbeef
MLton22 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4
MLton23 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand 42424242
MLton24 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand deadbeef
run time ratio
benchmark     MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8 MLton9 MLton10 MLton11 MLton12 MLton13 MLton14 MLton15 MLton16 MLton17 MLton18 MLton19 MLton20 MLton21 MLton22 MLton23 MLton24
checksum        1.00   1.00   1.00   1.01   1.14   1.01   1.00   1.14   1.01   1.01    1.14    1.00    1.04    1.00    1.00    1.00    1.14    1.00    1.01    1.16    1.00    1.01    1.15    1.00    1.01
flat-array      1.00   1.01   1.02   1.00   1.00   1.01   1.01   1.03   1.01   1.01    1.01    1.01    1.01    1.23    1.19    1.19    1.19    1.20    1.19    1.20    1.20    1.19    1.20    1.19    1.20
hamlet          1.00   0.99   1.00   0.99   1.01   1.01   1.00   1.01   1.03   1.01    1.01    1.02    1.01    1.00    1.00    0.99    0.95    0.95    0.94    0.97    0.96    0.96    0.98    0.98    0.99
imp-for         1.00   1.00   1.05   1.05   1.05   1.02   1.00   1.05   0.99   1.00    1.05    1.00    1.01    1.00    1.05    1.05    1.05    0.99    1.00    1.05    1.00    1.00    1.05    1.00    0.99
lexgen          1.00   0.97   1.00   0.97   1.03   1.03   1.01   1.04   0.99   0.95    0.99    0.99    0.95    0.97    0.98    0.95    0.96    1.01    0.98    1.00    1.04    0.95    0.96    1.00    0.95
peek            1.00   1.00   1.00   1.01   1.03   1.01   1.04   1.03   1.01   1.04    1.03    1.01    1.03    1.00    1.00    1.01    1.04    1.01    1.05    1.03    1.01    1.04    1.04    1.00    1.04
simple          1.00   1.01   1.01   1.00   1.00   0.99   1.02   1.00   0.99   1.00    1.00    1.00    1.00    0.98    0.97    0.99    0.98    0.97    0.99    0.97    0.99    0.98    0.98    0.98    0.99
tak             1.00   0.99   0.90   1.00   1.05   0.90   0.99   1.02   0.89   0.99    1.04    0.90    1.00    0.99    0.89    0.99    0.99    0.90    0.99    1.01    0.90    1.00    1.00    0.90    1.00
wc-scanStream   1.00   1.01   1.01   1.03   1.06   1.02   1.03   1.05   1.02   1.02    1.04    1.01    1.00    1.01    1.01    1.00    1.06    1.01    1.00    1.07    1.01    1.00    1.05    1.03    1.02

Note that while `checksum` with MLton4 has a slowdown, `checksum` with
MLton5 and MLton6 (which are identical up to shuffling of the
functions and basic blocks at the MachineIR) do not have a slowdown.
Similarly `tak` with MLton0 and MLton1 have similar running time, but
`tak` with MLton2 has a speedup.  On the other hand, `flat-array`'s
slowdowns with `-globalize-arrays true` are not due to code layout
effects.

`hamlet` may have a slight speedup with `-globalize-arrays true`, but
that is significantly outweighed by the slowdown in `flat-array`.

The conclusion is to leave the defaults corresponding to the original
behavior.


Full benchmark results:

MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4
run time ratio
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8
DLXSimulator        1.00   1.00   1.03   1.00   1.00   0.99   1.00   1.01   1.01
barnes-hut          1.00   1.01   1.01   1.02   1.01   1.01   1.01   1.02   1.01
boyer               1.00   1.00   1.01   1.01   1.02   1.00   1.01   1.01   1.01
checksum            1.00   1.00   1.13   1.14   1.13   1.00   1.12   1.13   1.14
count-graphs        1.00   1.00   0.99   1.01   0.99   1.00   1.00   1.00   0.99
even-odd            1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
fft                 1.00   1.00   1.01   1.02   1.02   1.01   1.01   1.00   1.01
fib                 1.00   1.00   1.00   1.00   1.01   1.00   1.00   1.00   1.00
flat-array          1.00   1.00   1.00   1.00   1.01   1.19   1.19   1.19   1.19
hamlet              1.00   1.00   1.02   1.01   1.01   1.00   0.96   0.97   0.97
imp-for             1.00   1.00   1.06   1.05   1.06   1.00   1.05   1.06   1.06
knuth-bendix        1.00   1.00   1.03   1.03   1.03   1.00   1.03   1.03   1.03
lexgen              1.00   0.99   1.05   0.99   1.03   1.04   1.03   1.02   1.07
life                1.00   1.00   1.00   1.00   1.00   1.01   1.01   1.00   1.00
logic               1.00   1.00   0.98   0.99   0.99   1.00   0.98   0.99   1.00
mandelbrot          1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
matrix-multiply     1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.01   1.00
md5                 1.00   1.00   0.99   0.99   0.99   1.00   0.99   0.99   0.99
merge               1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
mlyacc              1.00   1.01   1.01   1.01   1.01   1.00   0.99   0.99   1.01
model-elimination   1.00   1.01   1.02   1.02   1.02   1.00   1.03   1.02   1.03
mpuz                1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
nucleic             1.00   1.01   1.00   1.00   0.99   1.00   0.99   0.99   1.00
output1             1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
peek                1.00   1.00   1.03   1.03   1.04   1.00   1.04   1.04   1.04
psdes-random        1.00   1.00   1.00   1.00   1.00   1.00   1.01   1.00   1.00
ratio-regions       1.00   0.99   0.98   0.98   1.01   1.01   0.99   1.00   1.01
ray                 1.00   1.03   0.99   0.99   1.01   0.99   0.99   0.99   0.98
raytrace            1.00   1.03   1.00   1.02   1.03   0.99   1.00   1.01   1.01
simple              1.00   0.98   0.99   1.00   0.99   0.97   0.97   0.96   0.97
smith-normal-form   1.00   1.00   1.01   0.99   1.00   1.00   1.00   1.00   1.00
string-concat       1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
tailfib             1.00   1.00   1.00   0.99   1.00   1.00   1.00   1.00   1.00
tak                 1.00   1.00   1.04   1.03   1.10   1.01   1.02   1.08   1.00
tensor              1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
tsp                 1.00   1.00   1.00   1.00   1.00   0.99   1.00   1.00   1.00
tyan                1.00   1.00   1.00   1.01   1.01   1.01   1.01   1.01   1.01
vector-rev          1.00   0.99   0.99   0.98   0.98   0.98   0.99   0.98   0.98
vector32-concat     1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
vector64-concat     1.00   1.00   1.00   1.00   1.00   0.99   1.00   1.00   0.99
vliw                1.00   0.98   1.00   1.01   1.00   1.00   0.98   0.98   0.98
wc-input1           1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
wc-scanStream       1.00   1.00   1.06   1.06   1.06   1.00   1.04   1.06   1.07
zebra               1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00   1.00
zern                1.00   1.02   1.02   0.99   1.01   0.99   0.99   1.01   0.99
size
benchmark            MLton0    MLton1    MLton2    MLton3    MLton4    MLton5    MLton6    MLton7    MLton8
DLXSimulator        209,076   209,076   209,140   209,140   209,140   209,076   208,836   208,340   208,340
barnes-hut          176,199   176,199   176,071   176,071   176,071   176,199   176,071   176,071   176,071
boyer               243,369   243,369   243,289   243,289   243,289   243,369   243,289   243,289   243,289
checksum            117,561   117,561   117,433   117,433   117,433   117,561   117,433   117,433   117,433
count-graphs        145,065   145,065   145,017   145,017   145,017   145,065   144,937   144,937   144,937
even-odd            117,529   117,529   117,433   117,433   117,433   117,529   117,433   117,433   117,433
fft                 142,307   142,307   141,315   141,315   141,315   142,307   141,315   141,315   141,315
fib                 117,449   117,449   117,321   117,321   117,321   117,449   117,321   117,321   117,321
flat-array          117,177   117,177   117,049   117,049   117,049   117,193   117,081   117,081   117,081
hamlet            1,434,228 1,434,228 1,433,220 1,433,220 1,433,220 1,434,228 1,432,564 1,427,956 1,427,396
imp-for             117,241   117,241   117,145   117,145   117,145   117,241   117,145   117,145   117,145
knuth-bendix        186,116   186,116   186,212   186,212   186,212   186,116   186,212   186,212   186,212
lexgen              290,931   290,931   290,819   290,819   290,819   290,931   290,819   290,819   290,819
life                141,113   141,113   141,065   141,065   141,065   141,113   141,065   141,065   141,065
logic               197,417   197,417   197,273   197,273   197,273   197,417   197,273   197,273   197,273
mandelbrot          117,273   117,273   117,177   117,177   117,177   117,273   117,177   117,177   117,177
matrix-multiply     119,577   119,577   119,417   119,417   119,417   119,577   119,417   119,417   119,417
md5                 144,676   144,676   144,500   144,500   144,500   144,676   144,500   144,500   144,500
merge               118,953   118,953   118,857   118,857   118,857   118,953   118,857   118,857   118,857
mlyacc              643,555   643,555   643,651   643,651   643,651   643,555   643,651   643,651   643,475
model-elimination   796,054   796,054   793,958   793,798   793,798   796,054   794,166   792,246   792,246
mpuz                123,545   123,545   123,481   123,481   123,481   123,545   123,481   123,481   123,481
nucleic             297,249   297,249   297,233   297,233   297,233   297,249   297,233   297,233   297,233
output1             151,768   151,768   149,848   149,848   149,848   151,768   149,848   149,848   149,848
peek                150,164   150,164   150,132   150,132   150,132   150,164   150,132   150,132   150,132
psdes-random        121,545   121,545   121,401   121,401   121,401   121,545   121,401   121,401   121,401
ratio-regions       144,137   144,137   144,169   144,169   144,169   144,137   144,169   144,169   144,169
ray                 250,058   250,058   250,218   250,218   250,218   250,058   249,818   249,066   249,066
raytrace            368,988   368,988   368,108   368,108   368,108   368,956   367,868   367,468   367,468
simple              345,205   345,205   345,381   345,381   345,381   329,557   329,557   329,317   329,317
smith-normal-form   279,837   279,837   279,645   279,645   279,645   279,837   279,341   279,341   279,341
string-concat       119,129   119,129   119,033   119,033   119,033   119,209   119,033   119,033   119,033
tailfib             117,273   117,273   117,177   117,177   117,177   117,273   117,177   117,177   117,177
tak                 117,449   117,449   117,321   117,321   117,321   117,449   117,321   117,321   117,321
tensor              179,292   179,292   176,908   176,908   176,908   179,292   176,908   176,908   176,908
tsp                 158,860   158,860   158,668   158,668   158,668   158,860   158,668   158,668   158,668
tyan                223,588   223,588   223,044   223,044   223,044   223,588   223,044   223,044   223,044
vector-rev          118,105   118,105   118,009   118,009   118,009   118,153   117,977   117,977   117,977
vector32-concat     118,297   118,297   118,201   118,201   118,201   118,329   118,185   118,185   118,185
vector64-concat     118,329   118,329   118,169   118,169   118,169   118,329   118,217   118,217   118,217
vliw                505,509   505,509   503,013   503,013   503,013   505,637   500,917   497,957   497,957
wc-input1           179,051   179,051   178,923   178,923   178,923   179,051   178,923   178,923   178,923
wc-scanStream       188,155   188,155   188,027   188,027   188,027   188,155   188,027   188,027   188,027
zebra               225,364   225,364   225,220   225,220   225,220   225,364   225,220   225,220   225,220
zern                153,241   153,241   152,521   152,521   152,521   153,241   152,585   152,585   152,585
compile time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8
DLXSimulator        3.19   3.14   3.45   3.51   3.02   3.27   3.18   3.45   3.24
barnes-hut          2.93   2.94   2.96   2.97   3.06   2.92   2.96   2.93   2.96
boyer               3.36   3.49   3.48   3.32   3.42   3.40   3.52   3.53   3.49
checksum            2.53   2.56   2.47   2.56   2.57   2.46   2.56   2.54   2.62
count-graphs        2.68   2.69   2.71   2.80   2.80   2.79   2.67   2.69   2.76
even-odd            2.45   2.56   2.46   2.56   2.48   2.50   2.48   2.55   2.57
fft                 2.64   2.57   2.62   2.66   2.35   2.68   2.60   2.54   2.62
fib                 2.46   2.46   2.58   2.48   2.55   2.45   2.55   2.54   2.46
flat-array          2.52   2.56   2.55   2.56   2.55   2.54   2.53   2.56   2.55
hamlet             15.27  15.74  14.48  14.35  14.48  15.63  15.21  14.88  15.04
imp-for             2.48   2.56   2.52   2.55   2.55   2.34   2.42   2.48   2.54
knuth-bendix        2.89   2.90   2.90   3.01   3.02   2.93   3.01   3.13   2.99
lexgen              3.40   3.88   3.76   3.82   3.74   3.62   3.50   3.81   3.71
life                2.66   2.71   2.70   2.74   2.72   2.64   2.74   2.68   2.60
logic               3.06   3.05   3.11   3.12   3.00   3.13   3.14   2.92   2.83
mandelbrot          2.51   2.54   2.53   2.58   2.57   2.53   2.56   2.55   2.45
matrix-multiply     2.48   2.50   2.52   2.49   2.57   2.60   2.60   2.57   2.53
md5                 2.65   2.68   2.78   2.58   2.78   2.80   2.58   2.80   2.69
merge               2.47   2.49   2.58   2.57   2.55   2.49   2.57   2.50   2.52
mlyacc              7.85   7.94   8.00   7.98   7.90   7.69   8.05   8.09   7.54
model-elimination   7.08   7.98   7.62   7.11   7.40   8.24   7.84   7.96   8.15
mpuz                2.34   2.61   2.60   2.56   2.53   2.62   2.42   2.53   2.52
nucleic             4.06   4.17   4.07   4.05   4.08   4.06   4.28   4.33   4.14
output1             2.69   2.68   2.52   2.58   2.77   2.79   2.78   2.57   2.77
peek                2.74   2.78   2.58   2.80   2.80   2.73   2.70   2.68   2.72
psdes-random        2.56   2.53   2.49   2.48   2.50   2.48   2.53   2.64   2.50
ratio-regions       2.80   2.82   2.62   2.88   2.86   2.89   2.81   2.80   2.79
ray                 3.34   3.62   3.69   3.47   3.45   3.60   3.48   3.37   3.40
raytrace            4.55   4.88   4.47   4.32   4.64   4.30   4.48   4.43   4.50
simple              4.01   4.07   4.04   4.00   3.91   3.83   3.77   3.81   3.74
smith-normal-form   3.75   3.60   3.82   3.58   3.40   3.79   3.60   3.58   3.58
string-concat       2.46   2.66   2.54   2.45   2.57   2.56   2.46   2.51   2.49
tailfib             2.44   2.54   2.53   2.57   2.50   2.57   2.57   2.36   2.54
tak                 2.45   2.57   2.56   2.63   2.47   2.44   2.43   2.44   2.52
tensor              3.05   3.16   3.07   3.15   3.10   3.32   3.18   3.16   3.13
tsp                 2.81   2.79   2.57   2.74   2.76   2.75   2.84   2.84   2.82
tyan                3.27   3.06   3.38   3.26   3.35   3.23   3.22   3.22   3.24
vector-rev          2.49   2.57   2.56   2.33   2.61   2.40   2.52   2.49   2.54
vector32-concat     2.53   2.49   2.52   2.47   2.55   2.63   2.49   2.49   2.51
vector64-concat     2.48   2.46   2.50   2.52   2.48   2.54   2.46   2.47   2.48
vliw                5.63   5.63   5.64   6.14   5.43   6.26   5.78   6.17   6.16
wc-input1           3.06   2.96   2.95   2.98   2.91   2.87   2.90   2.97   2.96
wc-scanStream       3.01   2.97   3.02   2.95   3.06   3.04   3.03   3.01   3.09
zebra               3.30   3.37   3.36   3.26   3.34   3.51   3.31   3.26   3.34
zern                2.69   2.74   2.62   2.64   2.72   2.74   2.73   2.43   2.70
run time
benchmark         MLton0 MLton1 MLton2 MLton3 MLton4 MLton5 MLton6 MLton7 MLton8
DLXSimulator       32.67  32.69  33.65  32.71  32.53  32.50  32.63  32.84  32.92
barnes-hut         28.33  28.63  28.64  28.85  28.59  28.50  28.71  28.78  28.61
boyer              55.98  56.01  56.81  56.75  56.82  56.08  56.29  56.72  56.76
checksum           25.35  25.39  28.74  28.79  28.64  25.39  28.42  28.68  28.87
count-graphs       39.96  39.98  39.51  40.23  39.38  40.13  40.16  39.92  39.75
even-odd           39.09  39.07  39.09  39.06  39.08  39.14  39.10  39.09  39.09
fft                30.56  30.69  30.75  31.05  31.20  30.98  30.86  30.62  30.77
fib                17.70  17.75  17.77  17.77  17.83  17.77  17.74  17.67  17.77
flat-array         23.60  23.54  23.65  23.70  23.82  28.18  28.09  28.02  28.01
hamlet             39.69  39.77  40.32  40.20  40.20  39.72  38.00  38.70  38.68
imp-for            24.40  24.44  25.78  25.74  25.79  24.51  25.73  25.98  25.84
knuth-bendix       34.10  34.09  35.05  34.97  35.03  34.16  35.02  35.11  35.07
lexgen             34.22  33.77  35.88  33.88  35.27  35.64  35.09  34.93  36.52
life               38.70  38.80  38.76  38.76  38.89  38.90  38.91  38.74  38.84
logic              35.01  34.93  34.44  34.82  34.82  35.06  34.47  34.77  34.93
mandelbrot         35.78  35.77  35.79  35.80  35.81  35.82  35.79  35.78  35.81
matrix-multiply    29.74  29.80  29.82  29.89  29.79  29.72  29.84  29.90  29.86
md5                28.38  28.39  28.04  28.05  28.01  28.42  28.12  28.14  27.98
merge              32.55  32.45  32.41  32.44  32.59  32.48  32.41  32.56  32.40
mlyacc             32.76  33.01  32.94  33.14  33.23  32.83  32.52  32.49  33.11
model-elimination  38.05  38.26  38.89  38.72  38.80  38.09  39.11  38.97  39.36
mpuz               29.94  29.86  29.88  29.91  29.89  29.91  29.89  29.86  29.90
nucleic            33.73  33.91  33.60  33.68  33.35  33.74  33.40  33.47  33.58
output1            30.01  30.01  29.99  30.01  30.02  30.06  29.99  30.10  29.91
peek               33.58  33.60  34.75  34.63  34.78  33.61  34.79  34.78  34.86
psdes-random       33.84  33.91  33.83  33.86  33.91  33.88  34.16  33.91  33.91
ratio-regions      49.08  48.63  48.33  48.33  49.40  49.45  48.70  48.95  49.45
ray                37.55  38.58  37.18  37.06  37.80  37.04  37.19  37.08  36.65
raytrace           34.20  35.34  34.35  34.94  35.23  34.00  34.08  34.61  34.52
simple             29.73  29.25  29.47  29.64  29.51  28.96  28.84  28.57  28.86
smith-normal-form  39.88  39.79  40.38  39.56  39.99  39.92  39.93  39.85  39.88
string-concat      91.43  91.56  91.68  91.40  91.65  91.58  91.75  91.50  91.36
tailfib            38.06  38.06  37.93  37.85  37.90  38.12  38.00  37.98  37.87
tak                30.73  30.77  31.96  31.60  33.85  30.94  31.41  33.22  30.87
tensor             39.62  39.59  39.67  39.70  39.70  39.54  39.71  39.76  39.67
tsp                37.88  37.80  37.84  37.79  37.75  37.62  37.86  37.99  37.82
tyan               30.48  30.45  30.55  30.86  30.79  30.78  30.74  30.85  30.79
vector-rev         27.00  26.60  26.61  26.44  26.55  26.37  26.68  26.53  26.39
vector32-concat    82.46  82.52  82.56  82.72  82.43  82.56  82.37  82.30  82.22
vector64-concat    91.66  91.86  91.57  91.77  91.63  91.13  91.54  91.42  91.13
vliw               28.91  28.43  28.94  29.05  28.90  28.82  28.29  28.27  28.44
wc-input1          43.95  43.81  43.87  43.95  43.86  43.93  43.95  43.89  43.79
wc-scanStream      21.63  21.64  22.96  23.03  23.02  21.65  22.51  22.89  23.17
zebra              30.39  30.46  30.32  30.29  30.28  30.37  30.37  30.36  30.39
zern               32.37  33.09  33.05  32.00  32.71  31.99  32.04  32.83  31.96