Globalization of arrays and refs of "small" types in ConstantPropagation #206
MatthewFluet added a commit to MatthewFluet/mlton that referenced this issue on Oct 21, 2017:
Currently, no arrays are globalized; see MLton#206. However, if "small" arrays were globalized, then an `Array_allocRaw[t] (l)` would be globalized as `Array_alloc[t] (l)` and the `Array_toArray` (that casts the raw array to a normal array) would be dropped. Now, the `Array` abstract value tracks whether the array is raw or normal, which is used to generate either `Array_allocRaw` or `Array_alloc` if the array is globalized. Also, `Array_toArray` creates the destination `Array` abstract value with `Birth.unknown()`, which prohibits globalization.
There are (at least) two orthogonal improvements that could be investigated:
MatthewFluet added a commit that referenced this issue on Jan 16, 2019:
Alternate strategies for globalization in ConstantPropagation

Extend globalization aspect of ConstantPropagation to support globalization of arrays and to support different "small type" strategies. Closes #206.

Space-safety prohibits ConstantPropagation from globalizing all arrays and refs that are allocated at most once by a program. In particular, because globals are live for the duration of the program, globalizing an `int list ref` (for example) would not be safe-for-space: an arbitrarily large list may be written to the reference and never be garbage collected (whereas, when the `int list ref` is not globalized, it will be garbage collected when it is no longer live). On the other hand, globalizing an `int ref` is safe-for-space.

However, MLton previously used only a very conservative estimation for space safety. Only "small" types may be globalized, where smallness is defined as:

    fun isSmall t =
       case dest t of
          Array _ => false
        | Datatype _ => false
        | Ref t => isSmall t
        | Tuple ts => Vector.forall (ts, isSmall)
        | Vector _ => false
        | _ => true

Note that no `Datatype` is small; this is conservative (since a recursive datatype could represent unbounded data), but prevents globalizing `bool ref`. Also, no `Array` is small; this is correct (because an `int array ref` should not be globalized), but the globalization of a `val a: t array = Array_alloc[t] (l)` was conditioned on the smallness of `t array`, not the smallness of `t`. It is correct to globalize an array if `t` were small; note that to globalize `val a: t array = Array_alloc[t] (l)`, `l` (the length) must be globalized and must, therefore, be a constant and the array is of constant size. (This is Stephen Weeks's relaxed notion of safe-for-space, where the constant factor blowup can be chosen per program.)
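As an illustration only, the original smallness rule and the element-type condition for arrays can be modeled on a toy type representation. The `ty` datatype and the helpers `mayGlobalizeRef` and `mayGlobalizeArray` are hypothetical and far simpler than MLton's SSA types; only the shape of `isSmall` follows the definition above.

```sml
(* Hypothetical, simplified type representation (not MLton's SSA types). *)
datatype ty =
    Array of ty
  | Datatype of string
  | Ref of ty
  | Tuple of ty list
  | Vector of ty
  | Word of int

(* The original notion of smallness: no Datatype, Array, or Vector is small. *)
fun isSmall t =
  case t of
      Array _ => false
    | Datatype _ => false
    | Ref t' => isSmall t'
    | Tuple ts => List.all isSmall ts
    | Vector _ => false
    | Word _ => true

(* Assumed helpers: a ref is judged by its own type, while an array of
   constant length is judged by its element type only. *)
fun mayGlobalizeRef contents = isSmall (Ref contents)
fun mayGlobalizeArray elt = isSmall elt

val intRef = mayGlobalizeRef (Word 32)                        (* int ref *)
val boolRef = mayGlobalizeRef (Datatype "bool")               (* bool ref *)
val intListRef = mayGlobalizeRef (Datatype "list")            (* int list ref *)
val intArrayRef = mayGlobalizeRef (Array (Word 32))           (* int array ref *)
val pairArray = mayGlobalizeArray (Tuple [Word 32, Word 32])  (* (w32 * w32) array *)
```

Under this model only `intRef` and `pairArray` are accepted; `boolRef` is rejected purely because every `Datatype` is treated as large, which is the conservatism the text describes.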
This pull request adds support for alternate globalization strategies:

* `-globalize-arrays {false|true}`: globalize arrays
* `-globalize-refs {true|false}`: globalize refs
* `-globalize-small-int-inf {true|false}`: globalize `IntInf` as a small type
* `-globalize-small-type {1|0|2|3|4|9}`: strategies for classifying a type as "small":
  * `0`: constant `false` function (no types considered small)
  * `1`: no `Datatype` is considered small (original strategy)
  * `2`: `Datatype`s with all nullary constructors are considered small
  * `3`: `Datatype`s with all constructor arguments considered small according to strategy `2` are considered small
  * `4`: Fixed-point analysis of `Datatype`s to determine smallness
  * `9`: constant `true` function (all types considered small; not safe-for-space)

The defaults correspond to the previous behavior.

Unfortunately, additional globalization has little to no (positive) effect on benchmarks:

    MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
    MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
    MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
    MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
    MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
    MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
    MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
    MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
    MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4

    run time ratio
    benchmark           MLton0  MLton1  MLton2  MLton3  MLton4  MLton5  MLton6  MLton7  MLton8
    DLXSimulator          1.00    1.00    1.03    1.00    1.00    0.99    1.00    1.01    1.01
    checksum              1.00    1.00    1.13    1.14    1.13    1.00    1.12    1.13    1.14
    flat-array            1.00    1.00    1.00    1.00    1.01    1.19    1.19    1.19    1.19
    hamlet                1.00    1.00    1.02    1.01    1.01    1.00    0.96    0.97    0.97
    imp-for               1.00    1.00    1.06    1.05    1.06    1.00    1.05    1.06    1.06
    knuth-bendix          1.00    1.00    1.03    1.03    1.03    1.00    1.03    1.03    1.03
    lexgen                1.00    0.99    1.05    0.99    1.03    1.04    1.03    1.02    1.07
    model-elimination     1.00    1.01    1.02    1.02    1.02    1.00    1.03    1.02    1.03
    peek                  1.00    1.00    1.03    1.03    1.04    1.00    1.04    1.04    1.04
    ray                   1.00    1.03    0.99    0.99    1.01    0.99    0.99    0.99    0.98
    raytrace              1.00    1.03    1.00    1.02    1.03    0.99    1.00    1.01    1.01
    simple                1.00    0.98    0.99    1.00    0.99    0.97    0.97    0.96    0.97
    tak                   1.00    1.00    1.04    1.03    1.10    1.01    1.02    1.08    1.00
    wc-scanStream         1.00    1.00    1.06    1.06    1.06    1.00    1.04    1.06    1.07

Note that `MLton0` and `MLton1` generate identical code (modulo the random magic number), so the slowdowns in `ray` and `raytrace` are noise, which also suggests that slowdowns/speedups of <= 3% are also likely noise.

The slowdown in `flat-array` with `-globalize-arrays true` is explained as follows. The `flat-array` benchmark uses `Vector.tabulate` to allocate a vector that is used for all iterations of the benchmark. With `-globalize-arrays false`, the array is not globalized, and in SSA/SSA2, we have:

    x_1212: ((word32, word32) tuple) array = prim Array_alloc((word32, word32) tuple) (global_138 (*0xF4240*))
    ...
    x_757: ((word32, word32) tuple) vector = prim Array_toVector((word32, word32) tuple) (x_1212)
    ...
    x_1287: (word32, word32) tuple = prim Vector_sub((word32, word32) tuple) (x_757, x_1283)

but with `-globalize-arrays true`, the array is globalized, and in SSA/SSA2, we have:

    global_490: ((word32, word32) tuple) array = prim Array_alloc((word32, word32) tuple) (global_138 (*0xF4240*))
    ...
    x_757: ((word32, word32) tuple) vector = prim Array_toVector((word32, word32) tuple) (global_490)
    ...
    x_1286: (word32, word32) tuple = prim Vector_sub((word32, word32) tuple) (x_757, x_1282)

At RSSA, the `Array_toVector` becomes a header update and the array variable is cast/copy-propagated for the vector variable; with `-globalize-arrays false`, we have

    L_531 (x_1212: Objptr (opt_11)) CReturn {func = {..., target = GC_sequenceAllocate}} =
      ...
      OW64 (x_1212, ~8): Word64 := opt_12
      ...
      x_1354: Word32 = XW32 (Cast (x_1212, Objptr (opt_12)), x_1283, 8, 0)
      x_1353: Word32 = XW32 (Cast (x_1212, Objptr (opt_12)), x_1283, 8, 4)

but with `-globalize-arrays true`, we have

    L_488 (global_490: Objptr (opt_7)) CReturn {func = {..., target = GC_sequenceAllocate}} =
      ...
      OW64 (global_490, ~8): Word64 := opt_12
      ...
      x_1353: Word32 = XW32 (Cast (global_490, Objptr (opt_12)), x_1282, 8, 0)
      x_1352: Word32 = XW32 (Cast (global_490, Objptr (opt_12)), x_1282, 8, 4)

Finally, with `-globalize-arrays false`, `x_1212` becomes a local (because the loops to initialize and use the vector are non-allocating):

    RW32(2): Word32 = XW32 (Cast (RP(0): Objptr (opt_11), Objptr (opt_12)), RW64(0): Word64, 8, 0): Word32
    RW32(3): Word32 = XW32 (Cast (RP(0): Objptr (opt_11), Objptr (opt_12)), RW64(0): Word64, 8, 4): Word32

but with `-globalize-arrays true`:

    RW32(2): Word32 = XW32 (Cast (glob {index = 1, isRoot = true, ty = Objptr (opt_7)}, Objptr (opt_12)), RW64(0): Word64, 8, 0): Word32
    RW32(3): Word32 = XW32 (Cast (glob {index = 1, isRoot = true, ty = Objptr (opt_7)}, Objptr (opt_12)), RW64(0): Word64, 8, 4): Word32

The innermost loop of the benchmark goes from indexing a sequence stored in a local (`RP(0)`) to indexing a sequence stored in a global (`GP(1)`). All of the codegens should implement the former by using a hardware register for `RP(0)`, but will implement the latter with a memory read.
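For orientation, the kind of source program that produces this pattern can be sketched as follows. This is not the actual `flat-array` benchmark source; the size, the names `v`, `sumOnce`, and `repeat`, and the iteration count are assumptions chosen for illustration. It only shows a vector of word pairs allocated once with a constant length and then indexed from a non-allocating inner loop.

```sml
(* Illustrative sketch only; sizes and names are assumptions. *)
val n = 1000

(* Allocated exactly once, with a constant length and a "small"
   element type (a pair of 32-bit words). *)
val v : (Word32.word * Word32.word) vector =
  Vector.tabulate (n, fn i => (Word32.fromInt i, Word32.fromInt (i + 1)))

(* Non-allocating inner loop that indexes the vector on every iteration. *)
fun sumOnce () =
  let
    fun loop (i, acc : Word32.word) =
      if i >= Vector.length v
        then acc
      else
        let
          val (a, b) = Vector.sub (v, i)
        in
          loop (i + 1, acc + a + b)
        end
  in
    loop (0, 0w0)
  end

(* Reuse the same vector for every iteration of the outer loop. *)
fun repeat (0, acc) = acc
  | repeat (k, _) = repeat (k - 1, sumOnce ())

val total = repeat (100, 0w0)
val () = print (Word32.toString total ^ "\n")
```

Whether the backing array of `v` is actually globalized depends on the compiler options and on how the basis library's `Vector.tabulate` is optimized, so this sketch should not be read as a reproduction of the measurements above.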
In light of the above, and related to #218, it may be beneficial to "deglobalize" object pointer globals; that is, in RSSA, functions that have multiple accesses through the same object pointer global (particularly within loops) could be translated to copy the global to a local.

The slowdown in `checksum` is less easily explained. The only new objects globalized with `-globalize-small-type 2` as compared to `-globalize-small-type 1` are two `bool ref` objects, corresponding to the `exiting` flag of `basis-library/mlton/exit.sml` and the `staticIsInUse` flag of `basis-library/util/one.sml` used by `Int.fmt`. That small change seems to lead to code layout and cache effects that result in the slowdown, because the assembly code is not substantially different. With `-enable-pass machineShuffle` and `-seed-rand <w>`, one can perturb the code layout and observe that the slowdowns are not universal:

    MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
    MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
    MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand 42424242
    MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand deadbeef
    MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
    MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand 42424242
    MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand deadbeef
    MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
    MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand 42424242
    MLton9 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand deadbeef
    MLton10 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
    MLton11 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand 42424242
    MLton12 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand deadbeef
    MLton13 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
    MLton14 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand 42424242
    MLton15 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1 -enable-pass machineShuffle -seed-rand deadbeef
    MLton16 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
    MLton17 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand 42424242
    MLton18 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2 -enable-pass machineShuffle -seed-rand deadbeef
    MLton19 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
    MLton20 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand 42424242
    MLton21 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3 -enable-pass machineShuffle -seed-rand deadbeef
    MLton22 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4
    MLton23 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand 42424242
    MLton24 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4 -enable-pass machineShuffle -seed-rand deadbeef

    run time ratio
    benchmark            MLton0   MLton1   MLton2   MLton3   MLton4   MLton5   MLton6   MLton7   MLton8   MLton9  MLton10  MLton11  MLton12  MLton13  MLton14  MLton15  MLton16  MLton17  MLton18  MLton19  MLton20  MLton21  MLton22  MLton23  MLton24
    checksum               1.00     1.00     1.00     1.01     1.14     1.01     1.00     1.14     1.01     1.01     1.14     1.00     1.04     1.00     1.00     1.00     1.14     1.00     1.01     1.16     1.00     1.01     1.15     1.00     1.01
    flat-array             1.00     1.01     1.02     1.00     1.00     1.01     1.01     1.03     1.01     1.01     1.01     1.01     1.01     1.23     1.19     1.19     1.19     1.20     1.19     1.20     1.20     1.19     1.20     1.19     1.20
    hamlet                 1.00     0.99     1.00     0.99     1.01     1.01     1.00     1.01     1.03     1.01     1.01     1.02     1.01     1.00     1.00     0.99     0.95     0.95     0.94     0.97     0.96     0.96     0.98     0.98     0.99
    imp-for                1.00     1.00     1.05     1.05     1.05     1.02     1.00     1.05     0.99     1.00     1.05     1.00     1.01     1.00     1.05     1.05     1.05     0.99     1.00     1.05     1.00     1.00     1.05     1.00     0.99
    lexgen                 1.00     0.97     1.00     0.97     1.03     1.03     1.01     1.04     0.99     0.95     0.99     0.99     0.95     0.97     0.98     0.95     0.96     1.01     0.98     1.00     1.04     0.95     0.96     1.00     0.95
    peek                   1.00     1.00     1.00     1.01     1.03     1.01     1.04     1.03     1.01     1.04     1.03     1.01     1.03     1.00     1.00     1.01     1.04     1.01     1.05     1.03     1.01     1.04     1.04     1.00     1.04
    simple                 1.00     1.01     1.01     1.00     1.00     0.99     1.02     1.00     0.99     1.00     1.00     1.00     1.00     0.98     0.97     0.99     0.98     0.97     0.99     0.97     0.99     0.98     0.98     0.98     0.99
    tak                    1.00     0.99     0.90     1.00     1.05     0.90     0.99     1.02     0.89     0.99     1.04     0.90     1.00     0.99     0.89     0.99     0.99     0.90     0.99     1.01     0.90     1.00     1.00     0.90     1.00
    wc-scanStream          1.00     1.01     1.01     1.03     1.06     1.02     1.03     1.05     1.02     1.02     1.04     1.01     1.00     1.01     1.01     1.00     1.06     1.01     1.00     1.07     1.01     1.00     1.05     1.03     1.02

Note that while `checksum` with MLton4 has a slowdown, `checksum` with MLton5 and MLton6 (which are identical up to shuffling of the functions and basic blocks at the MachineIR) do not have a slowdown. Similarly, `tak` with MLton0 and MLton1 have similar running time, but `tak` with MLton2 (which differs from MLton1 only by shuffling) has a speedup. On the other hand, `flat-array`'s slowdowns with `-globalize-arrays true` are not due to code layout effects.

`hamlet` may have a slight speedup with `-globalize-arrays true`, but that is significantly outweighed by the slowdown in `flat-array`.

The conclusion is to leave the defaults corresponding to the original behavior.
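The "deglobalize" idea mentioned above can be sketched on a toy instruction list. Everything here is hypothetical: the `operand` and `stmt` types, the `deglobalize` function, and the threshold of two accesses are assumptions for illustration and do not correspond to MLton's RSSA data structures or to any existing pass.

```sml
(* Toy IR: hypothetical, not MLton's RSSA. *)
datatype operand = Global of int | Local of string

datatype stmt =
    Copy of {dst: string, src: operand}
  | Load of {dst: string, base: operand, index: string}

(* Does this statement index through global g? *)
fun usesGlobal g (Load {base = Global g', ...}) = g = g'
  | usesGlobal _ _ = false

(* If a body accesses global g at least twice, copy it to a local once
   and route every access through that local. *)
fun deglobalize (g: int, body: stmt list) : stmt list =
  if List.length (List.filter (usesGlobal g) body) < 2
    then body
  else
    let
      val tmp = "g" ^ Int.toString g ^ "_local"
      fun rewrite (s as Load {dst, index, ...}) =
            if usesGlobal g s
              then Load {dst = dst, base = Local tmp, index = index}
            else s
        | rewrite s = s
    in
      Copy {dst = tmp, src = Global g} :: List.map rewrite body
    end

val body =
  [Load {dst = "a", base = Global 1, index = "i"},
   Load {dst = "b", base = Global 1, index = "i"}]
val rewritten = deglobalize (1, body)
```

A real transformation would also have to decide where to place the copy relative to loops and garbage-collection points, which this sketch ignores.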
Full benchmark results:

    MLton0 -- ~/devel/mlton/builds/20190106.115052-gfe996d4/bin/mlton
    MLton1 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 1
    MLton2 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 2
    MLton3 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 3
    MLton4 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays false -globalize-refs true -globalize-small-type 4
    MLton5 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 1
    MLton6 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 2
    MLton7 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 3
    MLton8 -- ~/devel/mlton/builds/20190111.182738-g0847620/bin/mlton -globalize-arrays true -globalize-refs true -globalize-small-type 4

    run time ratio
    benchmark           MLton0  MLton1  MLton2  MLton3  MLton4  MLton5  MLton6  MLton7  MLton8
    DLXSimulator          1.00    1.00    1.03    1.00    1.00    0.99    1.00    1.01    1.01
    barnes-hut            1.00    1.01    1.01    1.02    1.01    1.01    1.01    1.02    1.01
    boyer                 1.00    1.00    1.01    1.01    1.02    1.00    1.01    1.01    1.01
    checksum              1.00    1.00    1.13    1.14    1.13    1.00    1.12    1.13    1.14
    count-graphs          1.00    1.00    0.99    1.01    0.99    1.00    1.00    1.00    0.99
    even-odd              1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    fft                   1.00    1.00    1.01    1.02    1.02    1.01    1.01    1.00    1.01
    fib                   1.00    1.00    1.00    1.00    1.01    1.00    1.00    1.00    1.00
    flat-array            1.00    1.00    1.00    1.00    1.01    1.19    1.19    1.19    1.19
    hamlet                1.00    1.00    1.02    1.01    1.01    1.00    0.96    0.97    0.97
    imp-for               1.00    1.00    1.06    1.05    1.06    1.00    1.05    1.06    1.06
    knuth-bendix          1.00    1.00    1.03    1.03    1.03    1.00    1.03    1.03    1.03
    lexgen                1.00    0.99    1.05    0.99    1.03    1.04    1.03    1.02    1.07
    life                  1.00    1.00    1.00    1.00    1.00    1.01    1.01    1.00    1.00
    logic                 1.00    1.00    0.98    0.99    0.99    1.00    0.98    0.99    1.00
    mandelbrot            1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    matrix-multiply       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.01    1.00
    md5                   1.00    1.00    0.99    0.99    0.99    1.00    0.99    0.99    0.99
    merge                 1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    mlyacc                1.00    1.01    1.01    1.01    1.01    1.00    0.99    0.99    1.01
    model-elimination     1.00    1.01    1.02    1.02    1.02    1.00    1.03    1.02    1.03
    mpuz                  1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    nucleic               1.00    1.01    1.00    1.00    0.99    1.00    0.99    0.99    1.00
    output1               1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    peek                  1.00    1.00    1.03    1.03    1.04    1.00    1.04    1.04    1.04
    psdes-random          1.00    1.00    1.00    1.00    1.00    1.00    1.01    1.00    1.00
    ratio-regions         1.00    0.99    0.98    0.98    1.01    1.01    0.99    1.00    1.01
    ray                   1.00    1.03    0.99    0.99    1.01    0.99    0.99    0.99    0.98
    raytrace              1.00    1.03    1.00    1.02    1.03    0.99    1.00    1.01    1.01
    simple                1.00    0.98    0.99    1.00    0.99    0.97    0.97    0.96    0.97
    smith-normal-form     1.00    1.00    1.01    0.99    1.00    1.00    1.00    1.00    1.00
    string-concat         1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    tailfib               1.00    1.00    1.00    0.99    1.00    1.00    1.00    1.00    1.00
    tak                   1.00    1.00    1.04    1.03    1.10    1.01    1.02    1.08    1.00
    tensor                1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    tsp                   1.00    1.00    1.00    1.00    1.00    0.99    1.00    1.00    1.00
    tyan                  1.00    1.00    1.00    1.01    1.01    1.01    1.01    1.01    1.01
    vector-rev            1.00    0.99    0.99    0.98    0.98    0.98    0.99    0.98    0.98
    vector32-concat       1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    vector64-concat       1.00    1.00    1.00    1.00    1.00    0.99    1.00    1.00    0.99
    vliw                  1.00    0.98    1.00    1.01    1.00    1.00    0.98    0.98    0.98
    wc-input1             1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    wc-scanStream         1.00    1.00    1.06    1.06    1.06    1.00    1.04    1.06    1.07
    zebra                 1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00    1.00
    zern                  1.00    1.02    1.02    0.99    1.01    0.99    0.99    1.01    0.99

    size
    benchmark              MLton0     MLton1     MLton2     MLton3     MLton4     MLton5     MLton6     MLton7     MLton8
    DLXSimulator          209,076    209,076    209,140    209,140    209,140    209,076    208,836    208,340    208,340
    barnes-hut            176,199    176,199    176,071    176,071    176,071    176,199    176,071    176,071    176,071
    boyer                 243,369    243,369    243,289    243,289    243,289    243,369    243,289    243,289    243,289
    checksum              117,561    117,561    117,433    117,433    117,433    117,561    117,433    117,433    117,433
    count-graphs          145,065    145,065    145,017    145,017    145,017    145,065    144,937    144,937    144,937
    even-odd              117,529    117,529    117,433    117,433    117,433    117,529    117,433    117,433    117,433
    fft                   142,307    142,307    141,315    141,315    141,315    142,307    141,315    141,315    141,315
    fib                   117,449    117,449    117,321    117,321    117,321    117,449    117,321    117,321    117,321
    flat-array            117,177    117,177    117,049    117,049    117,049    117,193    117,081    117,081    117,081
    hamlet              1,434,228  1,434,228  1,433,220  1,433,220  1,433,220  1,434,228  1,432,564  1,427,956  1,427,396
    imp-for               117,241    117,241    117,145    117,145    117,145    117,241    117,145    117,145    117,145
    knuth-bendix          186,116    186,116    186,212    186,212    186,212    186,116    186,212    186,212    186,212
    lexgen                290,931    290,931    290,819    290,819    290,819    290,931    290,819    290,819    290,819
    life                  141,113    141,113    141,065    141,065    141,065    141,113    141,065    141,065    141,065
    logic                 197,417    197,417    197,273    197,273    197,273    197,417    197,273    197,273    197,273
    mandelbrot            117,273    117,273    117,177    117,177    117,177    117,273    117,177    117,177    117,177
    matrix-multiply       119,577    119,577    119,417    119,417    119,417    119,577    119,417    119,417    119,417
    md5                   144,676    144,676    144,500    144,500    144,500    144,676    144,500    144,500    144,500
    merge                 118,953    118,953    118,857    118,857    118,857    118,953    118,857    118,857    118,857
    mlyacc                643,555    643,555    643,651    643,651    643,651    643,555    643,651    643,651    643,475
    model-elimination     796,054    796,054    793,958    793,798    793,798    796,054    794,166    792,246    792,246
    mpuz                  123,545    123,545    123,481    123,481    123,481    123,545    123,481    123,481    123,481
    nucleic               297,249    297,249    297,233    297,233    297,233    297,249    297,233    297,233    297,233
    output1               151,768    151,768    149,848    149,848    149,848    151,768    149,848    149,848    149,848
    peek                  150,164    150,164    150,132    150,132    150,132    150,164    150,132    150,132    150,132
    psdes-random          121,545    121,545    121,401    121,401    121,401    121,545    121,401    121,401    121,401
    ratio-regions         144,137    144,137    144,169    144,169    144,169    144,137    144,169    144,169    144,169
    ray                   250,058    250,058    250,218    250,218    250,218    250,058    249,818    249,066    249,066
    raytrace              368,988    368,988    368,108    368,108    368,108    368,956    367,868    367,468    367,468
    simple                345,205    345,205    345,381    345,381    345,381    329,557    329,557    329,317    329,317
    smith-normal-form     279,837    279,837    279,645    279,645    279,645    279,837    279,341    279,341    279,341
    string-concat         119,129    119,129    119,033    119,033    119,033    119,209    119,033    119,033    119,033
    tailfib               117,273    117,273    117,177    117,177    117,177    117,273    117,177    117,177    117,177
    tak                   117,449    117,449    117,321    117,321    117,321    117,449    117,321    117,321    117,321
    tensor                179,292    179,292    176,908    176,908    176,908    179,292    176,908    176,908    176,908
    tsp                   158,860    158,860    158,668    158,668    158,668    158,860    158,668    158,668    158,668
    tyan                  223,588    223,588    223,044    223,044    223,044    223,588    223,044    223,044    223,044
    vector-rev            118,105    118,105    118,009    118,009    118,009    118,153    117,977    117,977    117,977
    vector32-concat       118,297    118,297    118,201    118,201    118,201    118,329    118,185    118,185    118,185
    vector64-concat       118,329    118,329    118,169    118,169    118,169    118,329    118,217    118,217    118,217
    vliw                  505,509    505,509    503,013    503,013    503,013    505,637    500,917    497,957    497,957
    wc-input1             179,051    179,051    178,923    178,923    178,923    179,051    178,923    178,923    178,923
    wc-scanStream         188,155    188,155    188,027    188,027    188,027    188,155    188,027    188,027    188,027
    zebra                 225,364    225,364    225,220    225,220    225,220    225,364    225,220    225,220    225,220
    zern                  153,241    153,241    152,521    152,521    152,521    153,241    152,585    152,585    152,585

    compile time
    benchmark           MLton0  MLton1  MLton2  MLton3  MLton4  MLton5  MLton6  MLton7  MLton8
    DLXSimulator          3.19    3.14    3.45    3.51    3.02    3.27    3.18    3.45    3.24
    barnes-hut            2.93    2.94    2.96    2.97    3.06    2.92    2.96    2.93    2.96
    boyer                 3.36    3.49    3.48    3.32    3.42    3.40    3.52    3.53    3.49
    checksum              2.53    2.56    2.47    2.56    2.57    2.46    2.56    2.54    2.62
    count-graphs          2.68    2.69    2.71    2.80    2.80    2.79    2.67    2.69    2.76
    even-odd              2.45    2.56    2.46    2.56    2.48    2.50    2.48    2.55    2.57
    fft                   2.64    2.57    2.62    2.66    2.35    2.68    2.60    2.54    2.62
    fib                   2.46    2.46    2.58    2.48    2.55    2.45    2.55    2.54    2.46
    flat-array            2.52    2.56    2.55    2.56    2.55    2.54    2.53    2.56    2.55
    hamlet               15.27   15.74   14.48   14.35   14.48   15.63   15.21   14.88   15.04
    imp-for               2.48    2.56    2.52    2.55    2.55    2.34    2.42    2.48    2.54
    knuth-bendix          2.89    2.90    2.90    3.01    3.02    2.93    3.01    3.13    2.99
    lexgen                3.40    3.88    3.76    3.82    3.74    3.62    3.50    3.81    3.71
    life                  2.66    2.71    2.70    2.74    2.72    2.64    2.74    2.68    2.60
    logic                 3.06    3.05    3.11    3.12    3.00    3.13    3.14    2.92    2.83
    mandelbrot            2.51    2.54    2.53    2.58    2.57    2.53    2.56    2.55    2.45
    matrix-multiply       2.48    2.50    2.52    2.49    2.57    2.60    2.60    2.57    2.53
    md5                   2.65    2.68    2.78    2.58    2.78    2.80    2.58    2.80    2.69
    merge                 2.47    2.49    2.58    2.57    2.55    2.49    2.57    2.50    2.52
    mlyacc                7.85    7.94    8.00    7.98    7.90    7.69    8.05    8.09    7.54
    model-elimination     7.08    7.98    7.62    7.11    7.40    8.24    7.84    7.96    8.15
    mpuz                  2.34    2.61    2.60    2.56    2.53    2.62    2.42    2.53    2.52
    nucleic               4.06    4.17    4.07    4.05    4.08    4.06    4.28    4.33    4.14
    output1               2.69    2.68    2.52    2.58    2.77    2.79    2.78    2.57    2.77
    peek                  2.74    2.78    2.58    2.80    2.80    2.73    2.70    2.68    2.72
    psdes-random          2.56    2.53    2.49    2.48    2.50    2.48    2.53    2.64    2.50
    ratio-regions         2.80    2.82    2.62    2.88    2.86    2.89    2.81    2.80    2.79
    ray                   3.34    3.62    3.69    3.47    3.45    3.60    3.48    3.37    3.40
    raytrace              4.55    4.88    4.47    4.32    4.64    4.30    4.48    4.43    4.50
    simple                4.01    4.07    4.04    4.00    3.91    3.83    3.77    3.81    3.74
    smith-normal-form     3.75    3.60    3.82    3.58    3.40    3.79    3.60    3.58    3.58
    string-concat         2.46    2.66    2.54    2.45    2.57    2.56    2.46    2.51    2.49
    tailfib               2.44    2.54    2.53    2.57    2.50    2.57    2.57    2.36    2.54
    tak                   2.45    2.57    2.56    2.63    2.47    2.44    2.43    2.44    2.52
    tensor                3.05    3.16    3.07    3.15    3.10    3.32    3.18    3.16    3.13
    tsp                   2.81    2.79    2.57    2.74    2.76    2.75    2.84    2.84    2.82
    tyan                  3.27    3.06    3.38    3.26    3.35    3.23    3.22    3.22    3.24
    vector-rev            2.49    2.57    2.56    2.33    2.61    2.40    2.52    2.49    2.54
    vector32-concat       2.53    2.49    2.52    2.47    2.55    2.63    2.49    2.49    2.51
    vector64-concat       2.48    2.46    2.50    2.52    2.48    2.54    2.46    2.47    2.48
    vliw                  5.63    5.63    5.64    6.14    5.43    6.26    5.78    6.17    6.16
    wc-input1             3.06    2.96    2.95    2.98    2.91    2.87    2.90    2.97    2.96
    wc-scanStream         3.01    2.97    3.02    2.95    3.06    3.04    3.03    3.01    3.09
    zebra                 3.30    3.37    3.36    3.26    3.34    3.51    3.31    3.26    3.34
    zern                  2.69    2.74    2.62    2.64    2.72    2.74    2.73    2.43    2.70

    run time
    benchmark           MLton0  MLton1  MLton2  MLton3  MLton4  MLton5  MLton6  MLton7  MLton8
    DLXSimulator         32.67   32.69   33.65   32.71   32.53   32.50   32.63   32.84   32.92
    barnes-hut           28.33   28.63   28.64   28.85   28.59   28.50   28.71   28.78   28.61
    boyer                55.98   56.01   56.81   56.75   56.82   56.08   56.29   56.72   56.76
    checksum             25.35   25.39   28.74   28.79   28.64   25.39   28.42   28.68   28.87
    count-graphs         39.96   39.98   39.51   40.23   39.38   40.13   40.16   39.92   39.75
    even-odd             39.09   39.07   39.09   39.06   39.08   39.14   39.10   39.09   39.09
    fft                  30.56   30.69   30.75   31.05   31.20   30.98   30.86   30.62   30.77
    fib                  17.70   17.75   17.77   17.77   17.83   17.77   17.74   17.67   17.77
    flat-array           23.60   23.54   23.65   23.70   23.82   28.18   28.09   28.02   28.01
    hamlet               39.69   39.77   40.32   40.20   40.20   39.72   38.00   38.70   38.68
    imp-for              24.40   24.44   25.78   25.74   25.79   24.51   25.73   25.98   25.84
    knuth-bendix         34.10   34.09   35.05   34.97   35.03   34.16   35.02   35.11   35.07
    lexgen               34.22   33.77   35.88   33.88   35.27   35.64   35.09   34.93   36.52
    life                 38.70   38.80   38.76   38.76   38.89   38.90   38.91   38.74   38.84
    logic                35.01   34.93   34.44   34.82   34.82   35.06   34.47   34.77   34.93
    mandelbrot           35.78   35.77   35.79   35.80   35.81   35.82   35.79   35.78   35.81
    matrix-multiply      29.74   29.80   29.82   29.89   29.79   29.72   29.84   29.90   29.86
    md5                  28.38   28.39   28.04   28.05   28.01   28.42   28.12   28.14   27.98
    merge                32.55   32.45   32.41   32.44   32.59   32.48   32.41   32.56   32.40
    mlyacc               32.76   33.01   32.94   33.14   33.23   32.83   32.52   32.49   33.11
    model-elimination    38.05   38.26   38.89   38.72   38.80   38.09   39.11   38.97   39.36
    mpuz                 29.94   29.86   29.88   29.91   29.89   29.91   29.89   29.86   29.90
    nucleic              33.73   33.91   33.60   33.68   33.35   33.74   33.40   33.47   33.58
    output1              30.01   30.01   29.99   30.01   30.02   30.06   29.99   30.10   29.91
    peek                 33.58   33.60   34.75   34.63   34.78   33.61   34.79   34.78   34.86
    psdes-random         33.84   33.91   33.83   33.86   33.91   33.88   34.16   33.91   33.91
    ratio-regions        49.08   48.63   48.33   48.33   49.40   49.45   48.70   48.95   49.45
    ray                  37.55   38.58   37.18   37.06   37.80   37.04   37.19   37.08   36.65
    raytrace             34.20   35.34   34.35   34.94   35.23   34.00   34.08   34.61   34.52
    simple               29.73   29.25   29.47   29.64   29.51   28.96   28.84   28.57   28.86
    smith-normal-form    39.88   39.79   40.38   39.56   39.99   39.92   39.93   39.85   39.88
    string-concat        91.43   91.56   91.68   91.40   91.65   91.58   91.75   91.50   91.36
    tailfib              38.06   38.06   37.93   37.85   37.90   38.12   38.00   37.98   37.87
    tak                  30.73   30.77   31.96   31.60   33.85   30.94   31.41   33.22   30.87
    tensor               39.62   39.59   39.67   39.70   39.70   39.54   39.71   39.76   39.67
    tsp                  37.88   37.80   37.84   37.79   37.75   37.62   37.86   37.99   37.82
    tyan                 30.48   30.45   30.55   30.86   30.79   30.78   30.74   30.85   30.79
    vector-rev           27.00   26.60   26.61   26.44   26.55   26.37   26.68   26.53   26.39
    vector32-concat      82.46   82.52   82.56   82.72   82.43   82.56   82.37   82.30   82.22
    vector64-concat      91.66   91.86   91.57   91.77   91.63   91.13   91.54   91.42   91.13
    vliw                 28.91   28.43   28.94   29.05   28.90   28.82   28.29   28.27   28.44
    wc-input1            43.95   43.81   43.87   43.95   43.86   43.93   43.95   43.89   43.79
    wc-scanStream        21.63   21.64   22.96   23.03   23.02   21.65   22.51   22.89   23.17
    zebra                30.39   30.46   30.32   30.29   30.28   30.37   30.37   30.36   30.39
    zern                 32.37   33.09   33.05   32.00   32.71   31.99   32.04   32.83   31.96
Space-safety prohibits ConstantPropagation from globalizing all arrays and refs that are allocated at most once by a program. In particular, because globals are live for the duration of the program, globalizing an `int list ref` (for example) would not be safe-for-space: an arbitrarily large list may be written to the reference and never garbage collected (whereas, when the `int list ref` is not globalized, it will be garbage collected when it is no longer live). On the other hand, globalizing an `int ref` is safe-for-space.

However, MLton currently uses a very conservative estimation for space safety. Only "small" types may be globalized, where smallness is defined as:

    fun isSmall t =
       case dest t of
          Array _ => false
        | Datatype _ => false
        | Ref t => isSmall t
        | Tuple ts => Vector.forall (ts, isSmall)
        | Vector _ => false
        | _ => true

Note that no `Datatype` is small; this is conservative (since a recursive datatype could represent unbounded data), but prevents globalizing `bool ref`. Also, no `Array` is small; this is correct (because an `int array ref` should not be globalized), but the globalization of a `val a: t array = Array_alloc[t] (l)` is currently conditioned on the smallness of `t array`, not the smallness of `t`. It would be correct to globalize an array if `t` were small; note that to globalize `val a: t array = Array_alloc[t] (l)`, `l` (the length) must be globalized and must, therefore, be a constant and the array is of constant size. (This is Stephen Weeks's relaxed notion of safe-for-space, where the constant factor blowup can be chosen per program.) In practice, it may be better to limit globalization of arrays to ones with "small" length in addition to small element type.

RefFlatten uses a more precise notion of small/large types (although RefFlatten was meant to be safe-for-space, the initial definition of small/large types was incorrect; see b1a0a80).
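As a rough sketch of how these restrictions could be relaxed, the following models two less conservative datatype rules (in the spirit of strategies `2` and `3` from the commit message) together with a length cap for arrays. The `ty` representation, the `consOf` constructor environment, the example datatype names, and the `maxLength` value are all assumptions made for illustration; none of them come from MLton's sources.

```sml
(* Hypothetical, simplified type representation (not MLton's SSA types). *)
datatype ty =
    Array of ty
  | Datatype of string
  | Ref of ty
  | Tuple of ty list
  | Vector of ty
  | Word of int

(* Assumed environment: for each datatype, the argument types of each
   of its constructors. *)
fun consOf "bool" = [[], []]
  | consOf "option_w32" = [[], [Word 32]]
  | consOf "list_w32" = [[], [Word 32, Datatype "list_w32"]]
  | consOf d = raise Fail ("unknown datatype: " ^ d)

(* Smallness parameterized by the rule used for datatypes. *)
fun isSmallWith smallDatatype t =
  let
    fun go t =
      case t of
          Array _ => false
        | Datatype d => smallDatatype d
        | Ref t' => go t'
        | Tuple ts => List.all go ts
        | Vector _ => false
        | Word _ => true
  in
    go t
  end

(* Rule in the spirit of strategy 2: only nullary constructors. *)
fun small2Datatype d = List.all List.null (consOf d)
val isSmall2 = isSmallWith small2Datatype

(* Rule in the spirit of strategy 3: every constructor argument is
   small according to the previous rule. *)
fun small3Datatype d = List.all (List.all isSmall2) (consOf d)
val isSmall3 = isSmallWith small3Datatype

(* Arrays: small element type and, additionally, a capped constant length.
   The cap is an arbitrary illustrative value. *)
val maxLength = 1024
fun mayGlobalizeArray (isSmall, elt, length) =
  isSmall elt andalso length <= maxLength

val boolRef2 = isSmall2 (Ref (Datatype "bool"))
val optRef2 = isSmall2 (Ref (Datatype "option_w32"))
val optRef3 = isSmall3 (Ref (Datatype "option_w32"))
val listRef3 = isSmall3 (Ref (Datatype "list_w32"))
val shortArray = mayGlobalizeArray (isSmall2, Word 32, 16)
val longArray = mayGlobalizeArray (isSmall2, Word 32, 1000000)
```

In this model a `bool ref` becomes globalizable under the first relaxed rule, an option of a word only under the second, and the recursive list type under neither, because its second constructor mentions a datatype that has non-nullary constructors.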