Whole program dead code elimination #608

Open
chambart wants to merge 158 commits

@chambart
Contributor

chambart commented Jun 8, 2016

In this PR I propose adding another link mode to the ocamlopt compiler with flambda enabled.

A new -lto compiler option marks that a file should export sufficient information for the link. When all the .cmx and .cmxa files of a project are built with this option, it is possible to link with it too. When linking, all the flambda information is concatenated and goes through a dead code elimination pass that removes all unreferenced symbols (but keeps effects) and builds one big new object file containing the whole program.
Of course, this prevents using dynlink in this program, as the modules referenced by a loaded module might have been eliminated. It may be possible to provide a mode where some interfaces are requested and only those would be available to a dynlinked module, but this is not implemented yet.
No whole-program-specific optimizations are applied yet.
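
To illustrate the idea of removing unreferenced symbols while keeping effects, here is a minimal sketch of a reachability-based elimination. It is not the flambda pass from this patch: the `def` record, the `has_effect` flag and `eliminate_dead` are hypothetical names, and for simplicity an effectful definition is kept whole instead of keeping only its effect.

```ocaml
module SSet = Set.Make (String)

(* Hypothetical model of a toplevel definition in the concatenated program. *)
type def = {
  symbol : string;       (* name of the defined symbol *)
  refs : string list;    (* symbols mentioned by the definition *)
  has_effect : bool;     (* true if evaluating it may have a side effect *)
}

(* Keep every definition reachable from the entry points. Effectful
   definitions are treated as roots, so they and their dependencies stay. *)
let eliminate_dead ~entry_points program =
  let table = Hashtbl.create 16 in
  List.iter (fun d -> Hashtbl.replace table d.symbol d) program;
  let rec mark live = function
    | [] -> live
    | s :: rest when SSet.mem s live -> mark live rest
    | s :: rest ->
      let deps =
        match Hashtbl.find_opt table s with
        | Some d -> d.refs
        | None -> []  (* symbol defined elsewhere, e.g. a C primitive *)
      in
      mark (SSet.add s live) (deps @ rest)
  in
  let effectful =
    List.filter_map
      (fun d -> if d.has_effect then Some d.symbol else None)
      program
  in
  let live = mark SSet.empty (entry_points @ effectful) in
  List.filter (fun d -> SSet.mem d.symbol live) program

(* Made-up program: "unused" is dropped, "at_exit_hook" stays for its effect. *)
let example = [
  { symbol = "main"; refs = [ "used" ]; has_effect = false };
  { symbol = "used"; refs = []; has_effect = false };
  { symbol = "unused"; refs = [ "used" ]; has_effect = false };
  { symbol = "at_exit_hook"; refs = []; has_effect = true };
]

let () =
  eliminate_dead ~entry_points:[ "main" ] example
  |> List.iter (fun d -> print_endline d.symbol)
```

In this model a dynlinked module would have no way to ask for "unused" once the link is done, which is the dynlink restriction described above.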

Apart from losing dynlink support, the other drawbacks are:

  • cmxa files are much larger (stdlib goes from 15kB to 1.9MB). Those files didn't contain much before;
    in this mode they must contain the whole code of the included modules, since the compiler
    might not have access to the cmx files while linking.
  • cmx files are not much bigger (less than 10% on the stdlib), but probably considerably more with -Oclassic.

Overall compilation time does not change significantly, but link time of course increases a lot:

without -lto

$ time ./ocamlopt.opt -dtimings -g -nostdlib -I stdlib -I otherlibs/dynlink  -ccopt "-Wl,-E" -o ocamlc.opt   compilerlibs/ocamlcommon.cmxa compilerlibs/ocamlbytecomp.cmxa   driver/main.cmx -cclib "-lm  -ldl -lcurses -lpthread" 

real    0m0.565s
user    0m0.464s
sys     0m0.088s

with -lto

$ time ./ocamlopt.opt -dtimings -g -nostdlib -I stdlib -I otherlibs/dynlink  -ccopt "-Wl,-E" -lto -o ocamlc.opt   compilerlibs/ocamlcommon.cmxa compilerlibs/ocamlbytecomp.cmxa   driver/main.cmx -cclib "-lm  -ldl -lcurses -lpthread"

real    0m8.243s
user    0m7.912s
sys     0m0.316s

all: 5.752s
flambda(concatenate)(link): 0.096s
flambda(remove_unused_program_constructs)(link): 0.512s
flambda(backend)(link): 0.452s
cmm(link): 0.604s
compile_phrases(link): 2.992s
selection(link): 0.228s
comballoc(link): 0.020s
cse(link): 0.180s
deadcode(link): 0.088s
spill(link): 0.312s
split(link): 0.136s
liveness(link): 0.264s
regalloc(link): 1.088s
linearize(link): 0.024s
scheduling(link): 0.004s
emit(link): 0.400s
assemble(link): 0.008s
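
To put these numbers in perspective, the following sketch computes the link-time slowdown and the share of a few phases. The figures are copied from the timings above; the choice of phases is arbitrary, and since the -dtimings entries may be nested, the shares are only relative to the reported "all" line and need not sum to 100%.

```ocaml
(* "real" times reported by time(1) above, in seconds. *)
let without_lto = 0.565
let with_lto = 8.243

let slowdown = with_lto /. without_lto

(* Total reported by -dtimings for the -lto link ("all"). *)
let total = 5.752

(* A few of the phases listed above, with their reported times. *)
let phases = [
  "compile_phrases(link)", 2.992;
  "regalloc(link)", 1.088;
  "cmm(link)", 0.604;
  "flambda(remove_unused_program_constructs)(link)", 0.512;
]

(* Share of the "all" time, in percent. *)
let share t = 100. *. t /. total

let () =
  Printf.printf "link slowdown: %.1fx\n" slowdown;
  List.iter
    (fun (name, t) -> Printf.printf "%-50s %5.1f%%\n" name (share t))
    phases
```

The slowdown comes out at roughly 14.6x, and the new dead code elimination pass itself accounts for under a tenth of the reported total.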

The effect is not wonderful on the compiler itself: the size of ocamlc.opt decreases by ~10% (there is not much dead code there), but on some extreme examples the reduction is much larger. There are still some cases where this does not eliminate as much as expected.

This patch is based on #602; only the commits after 88c2c8c (Also remove linking hack for bytecode) are relevant. That other PR is needed to allow removing unneeded toplevel modules.

Note that -lto, or 'link time optimization', is quite badly named. Please suggest a better option name.

@let-def

Contributor

let-def commented Jun 8, 2016

Rather than making cmxa files largely redundant with cmx files, couldn't we get rid of cmxa files (e.g. by allowing link options in cmx files)?
I am not asking for this to be mandatory; that would break compatibility very badly. But it could at least make this workflow viable, now that cmx files are needed for flambda to operate properly.

@lpw25

Contributor

lpw25 commented Jun 8, 2016

@let-def I agree and have a patch that does this as part of a larger "namespaces" proposal.

@xavierleroy

Contributor

xavierleroy commented Jun 8, 2016

Concerning @let-def's suggestion, it's exactly what Caml Light did back in the day. Then someone objected on the basis of the following scenario: you have a library "foo" containing two modules, "foo_aux" and "foo". With the proposed approach, you first compile foo.ml, obtaining foo.cmx, then build the library foo.cmx, overwriting the previous foo.cmx file...

Also, .cmx files describe .o files while .cmxa files describe .a files. What are you proposing? .cmx files that describe .a files? Get rid of .a files?

@let-def

Contributor

let-def commented Jun 8, 2016

Yes, offer a workflow without .a files. Libraries add one more level of names, and I would like to be able to do without it if possible.
The linker could receive all objects (with existing tools, it's mostly a matter of putting all .cmx files in the META description).

@lpw25

Contributor

lpw25 commented Jun 8, 2016

I had assumed @let-def was suggesting that a directory filled with .cmx files could be used in place of a .cmxa. You already need such a directory for cross-module inlining, so why not use it for linking?

@samoht

Member

samoht commented Jun 8, 2016

@chambart I'm curious to see the gain that this could have on some MirageOS binaries. Is there a way to turn -lto on by default for the whole switch (and if so, could this be added as a new experimental opam compiler)?

@chambart

Contributor

chambart commented Jun 8, 2016

@let-def why not, but this probably does not fit in this PR (which is already large enough). Currently, besides link information, there is also no way to tell using only cmx files that a file should be linked only if used. This matters if there is some initialization code. We could of course also add a flag to the cmx files to say something like that.

@samoht you can use export OCAMLPARAM=lto=1,_ or something similar to the hack in ocaml/opam-repository#6115 (which, by the way, was not propagated to the current default 4.03 compiler, as I just noticed...)

@chambart

Contributor

chambart commented Jun 8, 2016

The last patch adds a minimal optimization round when linking, which allows removing some more references to toplevel modules in situations like:

module A

initialize_symbol a (call ...)
initialize_symbol camlA (a.(0))

module B

let v = A.camlA.(0)

When concatenating A and B, B still maintains a reference to camlA, although it could have been redirected to A.a. This usually does not matter, since reaching the value through a field of one symbol or another has the same performance cost; hence no information is propagated in the cmx file to record that A.camlA.(0) is an alias of A.a.(0).

It now matters. This alias is something that inline_and_simplify knows about, so running it after concatenation allows removing all references to camlA.
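
The redirection can be pictured with a small sketch. This is only a model, not what inline_and_simplify does internally: `field_ref`, the alias table and `resolve` are hypothetical, and the alias chain is assumed to be acyclic.

```ocaml
(* A reference to one field of a toplevel symbol, e.g. camlA.(0). *)
type field_ref = { symbol : string; index : int }

module FieldMap = Map.Make (struct
  type t = string * int
  let compare = compare
end)

(* Follow known aliases until a reference that is not an alias is reached.
   Assumes the alias table contains no cycle. *)
let rec resolve aliases r =
  match FieldMap.find_opt (r.symbol, r.index) aliases with
  | Some target -> resolve aliases target
  | None -> r

(* From module A above: camlA.(0) is the same value as a.(0). *)
let aliases =
  FieldMap.empty |> FieldMap.add ("camlA", 0) { symbol = "a"; index = 0 }

(* Module B reads A.camlA.(0). *)
let refs_of_b = [ { symbol = "camlA"; index = 0 } ]

let rewritten = List.map (resolve aliases) refs_of_b

let still_references sym refs = List.exists (fun r -> r.symbol = sym) refs

let () =
  Printf.printf "B references camlA before: %b, after: %b\n"
    (still_references "camlA" refs_of_b)
    (still_references "camlA" rewritten)
```

Once no reference to camlA remains, the symbol becomes a candidate for removal by the dead code elimination pass.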

In practice this can appear, for instance, when we use stdin, which forces the whole Pervasives module to stay alive. As a quite extreme example, with this pass and sufficiently aggressive inlining options (-O3 -inline-max-unroll 4), the complete code of the program:

Printf.printf "Hello world %i\n" 3

is, in clambda (this is after un-anf, hence constants are not shown):

un-anf (init_code):
(seq
  (caml_register_named_value "camlPervasives__Pccall_arg_1389"
    "camlPervasives__Pmakeblock_1388")
  (caml_ml_open_descriptor_in 0)
  (setfield_ptr(init) 0 "camlPervasives__Pccall_1585"
    (caml_ml_open_descriptor_out 1))
  (caml_ml_open_descriptor_out 2)
  (setfield_ptr(init) 0 "camlPervasives__exit_function_1549"
    (makemutable 0 "camlPervasives__flush_all_582_closure"))
  (caml_register_named_value "camlPervasives__Pccall_arg_1460"
    "camlPervasives__do_at_exit_1213_closure")
  (setfield_ptr(init) 0 "camlTest__Pccall_282"
    (caml_format_int "camlCamlinternalFormat__const_string_10802" 3))
  (caml_ml_output (field 0 "camlPervasives__Pccall_1585")
    "camlTest__Pmakeblock_arg_34" 0 12)
  (setfield_ptr(init) 0 "camlTest__Pccall_arg_280"
    (string.length (field 0 "camlTest__Pccall_282")))
  (caml_ml_output (field 0 "camlPervasives__Pccall_1585")
    (field 0 "camlTest__Pccall_282") 0 (field 0 "camlTest__Pccall_arg_280"))
  (caml_ml_output_char (field 0 "camlPervasives__Pccall_1585") 10)
  (setfield_ptr(init) 0 "camlStd_exit__simplify_fv_22"
    (field 0 (field 0 "camlPervasives__exit_function_1549")))
  (apply (field 0 "camlStd_exit__simplify_fv_22") 0a) 0a)

un-anf (camlPervasives__iter_586):
(if param_588/1262
  (let
    (sequence_591/1264
       (try (caml_ml_flush (field 0 param_588/1262)) with exn_592/1267 0a))
    (apply* camlPervasives__iter_586  (field 1 param_588/1262)))
  0a)


un-anf (camlPervasives__do_at_exit_1213):
(apply (field 0 (field 0 "camlPervasives__exit_function_1549")) 0a)

un-anf (camlPervasives__flush_all_582):
(apply* camlPervasives__iter_586  (caml_ml_out_channels_list 0a))

and the constants are

Pervasives.camlPervasives__Pmakeblock_arg_1387: "index out of bounds"
CamlinternalFormat.camlCamlinternalFormat__const_string_10802: "%i"

Pervasives.camlPervasives__Pccall_arg_1389: "Pervasives.array_bound_error"

Test.camlTest__Pmakeblock_arg_34: "Hello world "

Pervasives.camlPervasives__Pccall_arg_1460: "Pervasives.do_at_exit"

Pervasives.camlPervasives__Pmakeblock_1388: block(0,"caml_exn_Invalid_argument","camlPervasives__Pmakeblock_arg_1387")

Notice that the majority of the code is related to do_at_exit.

@samoht

Member

samoht commented Jun 8, 2016

@chambart any chance that you could fix the 4.03 description and add your new compiler to opam-repository? :-) (if not, I'll try to do this next week)

@alainfrisch

Contributor

alainfrisch commented Jun 8, 2016

Did you consider an actual "whole program optimizer" based on flambda? I can imagine such a compiler loading e.g. .cmt files (produced "quickly" by ocamlc, or by ocamlopt in non-flambda mode) and doing all the compilation+linking globally. The closed world assumption opens many more opportunities for optimizations (including tweaks to the representation of values if they cannot be observed from C). This would support quick compilation (including with ocamlopt) for a reasonably fast edit/compile cycle and for local testing, while allowing a full optimization mode for building critical executables.

@bluddy

bluddy commented Jun 8, 2016

@alainfrisch I think we were all thinking about this and hoping for it. Whole-program compilation is the 'holy grail' in terms of optimization potential. It would be nice to introduce optimizations here (like type-specialization of functions and optimized type representation, which would provide serious performance boosts) and then slowly let some of them leak out to the open-universe case.

@mshinwell

Contributor

mshinwell commented Jun 9, 2016

@alainfrisch We haven't really thought about this much, but we're intending to spend some amount of time later this year trying to significantly improve compilation speed at Jane Street, and one thing we're considering is stopping the compiler earlier for a "type-check only" mode (to give fast feedback) with delayed "background" output of object files after that. I think this would fit in with what you propose.

I will undertake to review this patch.

@mshinwell

Contributor

mshinwell commented Jun 9, 2016

(Also, I'm not sure "-lto" is badly named. It describes what is going on and is nearly the same as the corresponding GCC option for the same thing.)

@chambart chambart referenced this pull request in OCamlPro/flambda-task-force Jun 9, 2016

Open

Whole-program compilation #150

@DemiMarie

Contributor

DemiMarie commented Jun 9, 2016

Some of the optimizations I would like will require changes to the OCaml GC – specifically, the ability to have blocks in the heap that contain a mixture of pointers and non-pointers. This would allow for floats, int32, int64, and nativeint to be unboxed everywhere, except possibly where alignment constraints for int64 proved problematic and as arguments to polymorphic functions that were not type-specialized.

Even more far-out, if type-specialization makes most functions operate on unboxed, specialized data (such as machine ints, doubles, and pointers), OCaml might benefit from an LLVM backend. But that is a ways off.

@chambart

Contributor

chambart commented Jun 10, 2016

@alainfrisch without changing anything else in the toolchain, it is probably sufficient for you to build with ocamlopt -Oclassic -opaque -lto for the majority of the builds (while developing), and link with -lto -O3 only when releasing or benchmarking.

This gives a reasonably fast build for each file and the same ability as bytecode not to rebuild the whole world when you change a file. The performance won't be marvelous of course.
And when linking for release, it will be horribly slow and eat a lot of RAM, but that shouldn't happen as often.

Currently this patch does not run the optimization passes when linking, but if you consider that workflow useful, I can change that. This requires some changes, as the passes do not expect symbols declared in different compilation units and will complain about that.

@chambart chambart referenced this pull request in ocaml/opam-repository Jun 10, 2016

Merged

Add a trunk compiler with lto enabled by default #6734

@alainfrisch

Contributor

alainfrisch commented Jun 10, 2016

@chambart If I understand correctly, you claim that most of the overhead of -Oclassic compared to the legacy pipeline is due to cross-module optimizations. Is this only your intuition, or has this been empirically confirmed?

@samoht

Member

samoht commented Jun 10, 2016

@chambart I'm getting

Fatal error: exception File "asmcomp/closure_offsets.ml", line 99, characters 6-12: Assertion failed

when I am trying your PR (but using OCAMLPARAM="lto=1,_" instead of adding a file in lib/ocaml). Is that expected?

@chambart

Contributor

chambart commented Jun 10, 2016

@alainfrisch no, I claim that what you gain from not recompiling everything when you change a given file is bigger than the overhead of flambda. This of course requires that your build system is able to see that the cmx file didn't change.

@chambart

Contributor

chambart commented Jun 10, 2016

@samoht no. I'll look into this.
I assume you are getting that when linking. This is probably related to the fact that the middle-end does not like symbols from different compilation units in the same file. It will probably need a rewriting pass to clean this up.

@samoht

Member

samoht commented Jun 10, 2016

Latest command shown in the logs is:

cd tools; /Applications/Xcode.app/Contents/Developer/usr/bin/make opt.opt
# ../boot/ocamlrun ../ocamlopt -nostdlib -I ../stdlib -absname -w +a-4-9-41-42-44-45-48 -strict-sequence
-warn-error A -safe-string -strict-formats -I ../utils -I ../parsing -I ../typing -I ../bytecomp -I ../asmcomp
-I ../middle_end -I ../middle_end/base_types -I ../driver -I ../toplevel -c - ocamldep.ml
# ../boot/ocamlrun ../ocamlopt -nostdlib -I ../stdlib -I ../utils -I ../parsing -I ../typing -I ../bytecomp -I
../asmcomp -I ../middle_end -I ../middle_end/base_types -I ../driver -I ../toplevel -I .. -o ocamldep.opt
timings.cmx misc.cmx config.cmx identifiable.cmx numbers.cmx arg_helper.cmx clflags.cmx terminfo.cmx
warnings.cmx location.cmx longident.cmx docstrings.cmx syntaxerr.cmx ast_helper.cmx parser.cmx lexer.cmx
parse.cmx ccomp.cmx ast_mapper.cmx ast_iterator.cmx builtin_attributes.cmx ast_invariants.cmx pparse.cmx
compenv.cmx depend.cmx ocamldep.cmx

It's on OSX 10.11 if that makes a difference...

@alainfrisch

Contributor

alainfrisch commented Jun 10, 2016

Ok understood. (But in fast dev mode, I already compile with -opaque.)

@chambart

Contributor

chambart commented Jun 14, 2016

@samoht if you want to test again, I fixed a few problems

@mor1

mor1 commented Jun 16, 2016

@chambart @samoht per discussion with @samoht just now, I gave the 4.04.0+forced_lto switch a try. Built a simple hello world program fine (325kB vs 484kB with OSX "system" which is 4.03.0) but hit a snag when trying to build mirage-www -- it seems that opam install ocamlbuild fails with

# Error: Modules Ocamlbuild_pack was compiled without the `-lto`
#        option. It is needed for linking with the `-lto` option.

FWIW, possibly related, opam install base-ocamlbuild also fails due to opam constraint:

[ERROR] base-ocamlbuild is not available because it requires OCaml >= 3.10 & < 4.03.

If there's an easy fix / something I can try to get the build going / you can point me to what I should try pinning and building locally, I'll try again later today :)

@chambart

Contributor

chambart commented Jun 17, 2016

@mor1, I did not seriously try packs, so I wouldn't be surprised if I missed something in the handling of packs.
The base-ocamlbuild constraint comes from the removal of ocamlbuild from the core distribution. Before 4.03 it was included, and the 'fake' package base-ocamlbuild exposes it. Since 4.03, it needs to be installed as a real package.

The reason something is failing in ocamlbuild is probably due to it being the deepest package using pack in the package dependency tree.

I'll try to add some tests for packs and lto in the testsuite.

@chambart

mor1 Jun 20, 2016

Thanks-- I'll give it a try (may not be for a few days as travelling).

I haven't tried mirage-www recently, I'll give that a look too.

Is there an easy way to try out your updates? Do I just use the same switch as before?

chambart Jun 20, 2016

Contributor

Yes, `opam switch reinstall 4.04.0+forced_lto` should do the job (no need to run `opam update` first).

mor1 Jun 21, 2016

Hi Pierre; Tried again with mirage-www -- seems to work ok for me with 4.03.0 by the way. FWIW, with this forced_lto branch, it got further but failed this time with

#=== ERROR while installing ppx_core.113.33.01+4.03 ===========================#
# opam-version 1.2.2
# os           darwin
# command      make
# path         /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03
# compiler     4.04.0+forced_lto
# exit-code    2
# env-file     /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-47576-d2c37b.env
# stdout-file  /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-47576-d2c37b.out
# stderr-file  /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-47576-d2c37b.err
### stdout ###
# Error: Error on dynamically loaded library: /Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so: dlopen(/Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so, 138): Symbol not found: _caml_extern_sp
# [...]
#  in /Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so
# Command exited with code 2.
# + ocamlfind ocamlopt unix.cmxa -I /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuildlib.cmxa -linkpkg myocamlbuild.ml /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuild.cmx -o myocamlbuild
# File "myocamlbuild.ml", line 518, characters 43-62:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
# File "myocamlbuild.ml", line 531, characters 51-70:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
### stderr ###
# [...]
# W: Cannot find source file matching module 'ppx_core' in library ppx_core
# W: Cannot find source file matching module 'Ast_builder_generated' in library ppx_core
# W: Cannot find source file matching module 'Ast_pattern_generated' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_fold' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_fold_map' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_iter' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_map' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_map_with_context' in library ppx_core
# E: Failure("Command ''/Users/mort/.opam/4.04.0+forced_lto/bin/ocamlbuild' src/ppx_core.cma src/ppx_core.cmxa src/ppx_core.a src/ppx_core.cmxs src/gen/gen.native src/gen/gen_ast_pattern.native src/gen/gen_ast_builder.native -use-ocamlfind -tag debug' terminated with error code 10")
# make: *** [build] Error 1

I'm about to board a flight so can't do any more now. Will try to look into it further after I land...

mshinwell Jun 30, 2016

Contributor

@mor1 : Did you manage to investigate any further?

mor1 Jun 30, 2016

No; I'll give it another shot with an updated switch now, in case there have been more fixes since. I don't really have a great deal of time to investigate (well, learn how the compiler is put together in order to investigate) though :(

mor1 Jun 30, 2016

...still errors building mirage-profile and ppx_core when trying to opam install tcpip:

=-=- Processing actions -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=  🐫
[ERROR] The compilation of mirage-profile failed at "make".
[ERROR] The compilation of ppx_core failed at "make".

#=== ERROR while installing mirage-profile.0.7.0 ==============================#
# opam-version 1.2.2
# os           darwin
# command      make
# path         /Users/mort/.opam/4.04.0+forced_lto/build/mirage-profile.0.7.0
# compiler     4.04.0+forced_lto
# exit-code    2
# env-file     /Users/mort/.opam/4.04.0+forced_lto/build/mirage-profile.0.7.0/mirage-profile-96257-60459a.env
# stdout-file  /Users/mort/.opam/4.04.0+forced_lto/build/mirage-profile.0.7.0/mirage-profile-96257-60459a.out
# stderr-file  /Users/mort/.opam/4.04.0+forced_lto/build/mirage-profile.0.7.0/mirage-profile-96257-60459a.err
### stdout ###
# [...]
# Command line: ppx_cstruct '/var/folders/zm/05tvwc3n57jf5q9_nsn5jkb40000gn/T/camlppxb1de03' '/var/folders/zm/05tvwc3n57jf5q9_nsn5jkb40000gn/T/camlppxaeb486'
#
# Command exited with code 2.
# + /Users/mort/.opam/4.04.0+forced_lto/bin/ocamlopt.opt unix.cmxa -I /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuildlib.cmxa myocamlbuild.ml /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuild.cmx -o myocamlbuild
# File "myocamlbuild.ml", line 518, characters 43-62:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
# File "myocamlbuild.ml", line 531, characters 51-70:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
### stderr ###
# [...]
# File "setup.ml", line 5847, characters 11-28:
# Warning 3: deprecated: String.capitalize
# Use String.capitalize_ascii instead.
# File "setup.ml", line 5848, characters 11-30:
# Warning 3: deprecated: String.uncapitalize
# Use String.uncapitalize_ascii instead.
# W: Cannot find source file matching module 'mProf' in library mProf
# W: Cannot find source file matching module 'Trace' in library mProf
# E: Failure("Command ''/Users/mort/.opam/4.04.0+forced_lto/bin/ocamlbuild' lib/mProf.cma lib/mProf.cmxa lib/mProf.a lib/mProf.cmxs unix/libmProf_unix_stubs.a unix/dllmProf_unix_stubs.so unix/mProf_unix.cma unix/mProf_unix.cmxa unix/mProf_unix.a unix/mProf_unix.cmxs -tag debug' terminated with error code 10")
# make: *** [build] Error 1


#=== ERROR while installing ppx_core.113.33.01+4.03 ===========================#
# opam-version 1.2.2
# os           darwin
# command      make
# path         /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03
# compiler     4.04.0+forced_lto
# exit-code    2
# env-file     /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-96257-b76d46.env
# stdout-file  /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-96257-b76d46.out
# stderr-file  /Users/mort/.opam/4.04.0+forced_lto/build/ppx_core.113.33.01+4.03/ppx_core-96257-b76d46.err
### stdout ###
# Error: Error on dynamically loaded library: /Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so: dlopen(/Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so, 138): Symbol not found: _caml_extern_sp
# [...]
#  in /Users/mort/.opam/4.04.0+forced_lto/lib/ocaml/stublibs/dllthreads.so
# Command exited with code 2.
# + ocamlfind ocamlopt unix.cmxa -I /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuildlib.cmxa -linkpkg myocamlbuild.ml /Users/mort/.opam/4.04.0+forced_lto/lib/ocamlbuild/ocamlbuild.cmx -o myocamlbuild
# File "myocamlbuild.ml", line 518, characters 43-62:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
# File "myocamlbuild.ml", line 531, characters 51-70:
# Warning 3: deprecated: Ocamlbuild_plugin.String.uncapitalize
# Use String.uncapitalize_ascii instead.
### stderr ###
# [...]
# W: Cannot find source file matching module 'ppx_core' in library ppx_core
# W: Cannot find source file matching module 'Ast_builder_generated' in library ppx_core
# W: Cannot find source file matching module 'Ast_pattern_generated' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_fold' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_fold_map' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_iter' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_map' in library ppx_core
# W: Cannot find source file matching module 'Ast_traverse_map_with_context' in library ppx_core
# E: Failure("Command ''/Users/mort/.opam/4.04.0+forced_lto/bin/ocamlbuild' src/ppx_core.cma src/ppx_core.cmxa src/ppx_core.a src/ppx_core.cmxs src/gen/gen.native src/gen/gen_ast_pattern.native src/gen/gen_ast_builder.native -use-ocamlfind -tag debug' terminated with error code 10")
# make: *** [build] Error 1

chambart Jul 7, 2016

Contributor

@mor1 I tried mirage a bit more. Apparently it does not build yet with any version of 4.04. In particular, cstruct has not been ported yet: the built ppx_cstruct returns an empty file.

DemiMarie Jul 8, 2016

Contributor

One series of whole program optimizations I would like to see is complete (or near-complete) defunctorization and monomorphization, followed by points-to analysis (such as 0CFA or m-CFA) and environment analysis (to eliminate unneeded closures), along with dead-code elimination, and finally lowering to SSA form. The SSA IR would be very C-like (by this point most boxing would probably have been eliminated) so this would be a good candidate for lowering to LLVM. This would be combined with runtime changes to allow for machine ints, doubles, etc. to be stored unboxed in the heap.

MLton is really the best example of such optimizations.
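
To make the terms concrete, here is a hand-written sketch of what defunctorization and monomorphization could mean on a toy functor. Nothing in this PR performs this transformation, and the names `ORDERED`, `Max`, `IntMax` and `IntMax_specialised` are made up for illustration.

```ocaml
(* Toy illustration only: all names here are hypothetical. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* Generic code: every call to [max] goes through [O.compare]. *)
module Max (O : ORDERED) = struct
  let max (a : O.t) (b : O.t) = if O.compare a b >= 0 then a else b
end

module IntMax = Max (struct
  type t = int
  (* Non-recursive [let]: the right-hand side is the standard [compare]. *)
  let compare (a : int) (b : int) = compare a b
end)

(* What a defunctorizing, monomorphizing pass could produce for the
   application above: the functor is gone and the comparison is a
   direct integer comparison. Written by hand here. *)
module IntMax_specialised = struct
  let max (a : int) (b : int) = if a >= b then a else b
end

let () =
  assert (IntMax.max 3 7 = 7);
  assert (IntMax_specialised.max 3 7 = IntMax.max 3 7)
```

Both modules compute the same result; the specialised one simply no longer needs the functor argument at run time, which is the kind of code a later points-to or unboxing analysis would find easier to work with.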

chambart Jul 8, 2016

Contributor

@DemiMarie You are basically asking for a whole compiler rewrite... I'm not sure that much effort is needed to get a big part of the benefit.

By the way, the complexity of using LLVM as a code generator mainly comes from GC interactions. But this is not the place to discuss that.

mor1 Jul 8, 2016

@chambart Ok, thanks. I'll investigate with vanilla 4.04 a bit more first.

DemiMarie Jul 11, 2016

Contributor

Other than dead code elimination, what other optimizations could be done with the entire program available that could not be done without it, given the current design of the OCaml compiler?

(This is not meant to discourage anyone – I am genuinely curious).

chambart Jul 18, 2016

Contributor

The test failure comes from the GC bug fixed in f7620cd.
I think this proposal is ready: does anyone object, want further changes, or want to review it?

@alainfrisch @xavierleroy @mshinwell

Fabrice suggested changing the user interface: remove the -lto option and instead enable the addition of link-time information through a configure option. The -use-lto option would remain, with the same effect.

shindere added some commits Apr 14, 2017

Add support for building and testing with flambda to the new CI script
If the flambda environment variable is set to true, the feature is
enabled in the build system.

(cherry picked from commit 24b028c)

Replace legacy CI build script by new one
Since there are no jobs left on Inria's CI that use the legacy
ci-build script, replace it by the new one.

(cherry picked from commit 7d6a661)

samoht May 12, 2017

Member

What is the status of that patch? We are still very interested in using and testing it in the context of MirageOS.

chambart added some commits Jun 3, 2016

Record propagate code to the cmxa files when linked with -lto
Also adds a new warning (60) for dependencies not compiled with -lto

Defines the flambda semantics of a few more C primitives
This allows removing some initialisation code from pervasives
and from unused exceptions.

Rewrite symbols for lto linking before transformations
This allows keeping the invariant that all defined symbols
are from the current compilation unit.

Keep an empty global_map symbol in lto
This allows linking the dynlink module even if it will not work

Test for pack + lto
Also checks that the evaluation order of packed modules does not change

chambart May 16, 2017

Contributor

@samoht I just rebased and updated for 4.05.

Otherwise I think the status is still the same: we need some real-world tests to validate it.

chambart referenced this pull request in ocaml/opam-repository May 23, 2017

Merged

add 4.05+trunk+lto compiler #9274

dbuenzli Jun 2, 2017

Contributor

So I tried this in the context of uucp where people have been complaining about executable size (see this issue). On this program:

let () =
  Printf.printf "%b" (Uucp.White.is_white_space (Uchar.of_int 0x0020));
  ()

The results are as follows:

9.3M	test.native # Without -use-lto
412K	test.native # With -use-lto
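
A size drop of this order is what one would expect if only the symbols reachable from the program's entry point survive the link. Below is a minimal sketch of that idea as a reachability pass over a symbol graph. It is a toy model and not the PR's implementation: the record type, the function names and the symbol names (`white_table`, `script_table`, `age_table`, `init_log`) are all invented, not taken from uucp or flambda. Symbols whose definition has an effect are treated as extra roots, mirroring the PR description's note that effects are kept.

```ocaml
(* Hypothetical model: each symbol lists the symbols it references and
   whether evaluating its definition has an effect that must be kept. *)
module SMap = Map.Make (String)
module SSet = Set.Make (String)

type symbol = { refs : string list; effectful : bool }

(* Collect every symbol reachable from [roots] by following [refs]. *)
let reachable (program : symbol SMap.t) (roots : string list) : SSet.t =
  let rec visit seen name =
    if SSet.mem name seen then seen
    else
      match SMap.find_opt name program with
      | None -> seen (* not defined in this program: nothing to follow *)
      | Some s -> List.fold_left visit (SSet.add name seen) s.refs
  in
  List.fold_left visit SSet.empty roots

(* Keep the entry point, every effectful symbol, and what they reference. *)
let eliminate program ~entry =
  let effect_roots =
    SMap.fold
      (fun name s acc -> if s.effectful then name :: acc else acc)
      program []
  in
  let live = reachable program (entry :: effect_roots) in
  SMap.filter (fun name _ -> SSet.mem name live) program

(* Invented symbol graph: one large library, one small use of it. *)
let program =
  List.fold_left
    (fun m (name, s) -> SMap.add name s m)
    SMap.empty
    [ "main", { refs = [ "is_white_space"; "printf" ]; effectful = true };
      "is_white_space", { refs = [ "white_table" ]; effectful = false };
      "white_table", { refs = []; effectful = false };
      "printf", { refs = []; effectful = false };
      "script_table", { refs = []; effectful = false };
      "age_table", { refs = [ "script_table" ]; effectful = false };
      "init_log", { refs = []; effectful = true } ]

let () =
  (* Keeps main, is_white_space, white_table, printf and init_log;
     drops script_table and age_table. *)
  let kept = eliminate program ~entry:"main" in
  SMap.iter (fun name _ -> print_endline name) kept
```

In this model `script_table` and `age_table` disappear because nothing live refers to them, while `init_log` stays even though it is unreferenced, since dropping it would change the program's behaviour.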

dbuenzli Jun 2, 2017

Contributor

open Base
let () =
  print_endline "Hello Warld!";
  ()

Here's a hello world base:

3.1M	test.native # Without -use-lto
672K	test.native # With -use-lto

mshinwell Jun 6, 2017

Contributor

I tried this on a large executable which measured 220MB when compiled with 4.03+flambda. Using 4.04+flambda with LTO enabled, the executable shrank to 105MB. It took more than ten minutes to link, which seems excessive. The ocamlopt.opt memory consumption, which I think was around 4GB, is probably more reasonable, as there is a lot of code involved here.

I have not yet investigated whether there are further opportunities for dead code elimination. I was hoping for a larger reduction in size, so there may be something about the code that prevents it. I will try to build it in bytecode so I can see what reduction is obtained by ocamlclean.

Examination of the 220MB executable has also revealed two problems with ELF string tables, which totalled 80MB (!). Firstly, especially when building without dynlink support, we shouldn't have all of the symbols in the dynamic symbol table as well as the normal one. I think the ELF "hidden" visibility support may fix this; I will investigate and submit a pull request. Secondly, we shouldn't be generating such verbose symbol names; some of them may also point at duplicate copies of code. We will look at this in due course.

damiendoligez added this to the 4.07-or-later milestone Sep 27, 2017

damiendoligez removed this from the consider-for-4.07 milestone Jun 1, 2018

samoht Jul 11, 2018

Member

Any chance to update these patches to 4.06 and/or 4.07?

damiendoligez Jul 12, 2018

Member

You mean 4.08, right?

samoht Jul 12, 2018

Member

Or 4.08 indeed. But just having an opam switch for 4.06+lto and/or 4.07+lto would already be great :-)
