significant compilation time increased after minor tweaks #7357

Closed
vicuna opened this Issue Sep 14, 2016 · 7 comments

vicuna commented Sep 14, 2016

Original bug ID: 7357
Reporter: @bobzhang
Assigned to: @alainfrisch
Status: resolved (set by @alainfrisch on 2016-12-21T13:40:29Z)
Resolution: fixed
Priority: normal
Severity: minor
Target version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Fixed in version: 4.05.0 +dev/beta1/beta2/beta3/rc1
Category: middle end (typedtree to clambda)
Tags: github

Bug description

Previously I had one big file, whole_compiler.ml, and a dummy whole_compiler.mli. I have now changed the implementation of whole_compiler.ml to get rid of the dummy interface, as below:

include (struct
(* nothing changed for the old code *)
end : sig end)

The compilation time for the native backend doubled, from 19s to 38s (OCaml 4.02.3).
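
For readers unfamiliar with the idiom, here is a minimal sketch of what the wrapper does, independent of the BuckleScript sources. The names `log` and `helper` are hypothetical and exist only for illustration. The empty signature `sig end` hides every binding made inside the `struct`, while top-level side effects inside it still run.

```ocaml
(* Hypothetical stand-in for state that lives outside the wrapper. *)
let log : string list ref = ref []

include (struct
  (* Stands in for the old contents of the file. *)
  let helper x = x * 2

  (* Top-level effects are still evaluated when the module is initialised. *)
  let () = log := string_of_int (helper 21) :: !log
end : sig end)

(* [helper] is not in scope here: the constraint [sig end] exports nothing. *)
```

After the `include`, only `log` remains visible, much as a dummy empty .mli would hide the rest of the file.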

vicuna commented Sep 14, 2016

Comment author: @alainfrisch

This could be #7067.

Can you check with the current trunk? (4.03 might be impacted by #7302)

It would also be useful to report the result of -dtimings.

vicuna commented Sep 28, 2016

Comment author: @bobzhang

I will try again with the 4.04 beta later this week.

vicuna commented Oct 1, 2016

Comment author: @bobzhang

This problem is still present with 4.04.
To reproduce, check out this branch: https://github.com/bloomberg/bucklescript/tree/mantis_7357
cd jscomp
jscomp>time ../../ocaml-wk/bin/ocamlopt.opt -dtimings -w -a -I bin ./bin/config_whole_compiler.mli ./bin/config_whole_compiler.ml ./bin/whole_compiler2.ml -o bin/bsc.exe
all: 31.103s
parsing(./bin/config_whole_compiler.mli): 0.000s
parsing(./bin/config_whole_compiler.ml): 0.000s
typing(./bin/config_whole_compiler.ml): 0.003s
transl(./bin/config_whole_compiler.ml): 0.000s
generate(./bin/config_whole_compiler.ml): 0.005s
cmm(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
compile_phrases(sourcefile(./bin/config_whole_compiler.ml)): 0.002s
selection(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
comballoc(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
cse(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
deadcode(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
spill(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
split(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
liveness(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
regalloc(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
linearize(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
scheduling(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
emit(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
assemble(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
parsing(./bin/whole_compiler2.ml): 0.394s
typing(./bin/whole_compiler2.ml): 2.176s
transl(./bin/whole_compiler2.ml): 0.457s
generate(./bin/whole_compiler2.ml): 25.250s
cmm(sourcefile(./bin/whole_compiler2.ml)): 0.287s
compile_phrases(sourcefile(./bin/whole_compiler2.ml)): 23.886s
selection(sourcefile(./bin/whole_compiler2.ml)): 0.224s
comballoc(sourcefile(./bin/whole_compiler2.ml)): 0.030s
cse(sourcefile(./bin/whole_compiler2.ml)): 0.158s
deadcode(sourcefile(./bin/whole_compiler2.ml)): 0.064s
spill(sourcefile(./bin/whole_compiler2.ml)): 0.444s
split(sourcefile(./bin/whole_compiler2.ml)): 0.194s
liveness(sourcefile(./bin/whole_compiler2.ml)): 0.294s
regalloc(sourcefile(./bin/whole_compiler2.ml)): 22.123s
linearize(sourcefile(./bin/whole_compiler2.ml)): 0.036s
scheduling(sourcefile(./bin/whole_compiler2.ml)): 0.003s
emit(sourcefile(./bin/whole_compiler2.ml)): 0.187s
assemble(sourcefile(./bin/whole_compiler2.ml)): 0.001s
selection(startup): 0.002s
comballoc(startup): 0.000s
cse(startup): 0.001s
deadcode(startup): 0.000s
spill(startup): 0.002s
split(startup): 0.001s
liveness(startup): 0.001s
regalloc(startup): 0.023s
linearize(startup): 0.000s
scheduling(startup): 0.000s
emit(startup): 0.001s
assemble(startup): 0.000s

real 0m33.968s
user 0m33.161s
sys 0m0.652s
jscomp>time ../../ocaml-wk/bin/ocamlopt.opt -dtimings -w -a -I bin ./bin/config_whole_compiler.mli ./bin/config_whole_compiler.ml ./bin/whole_compiler.mli ./bin/whole_compiler.ml -o bin/bsc.exe
all: 15.094s
parsing(./bin/config_whole_compiler.mli): 0.000s
parsing(./bin/config_whole_compiler.ml): 0.000s
typing(./bin/config_whole_compiler.ml): 0.003s
transl(./bin/config_whole_compiler.ml): 0.000s
generate(./bin/config_whole_compiler.ml): 0.005s
cmm(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
compile_phrases(sourcefile(./bin/config_whole_compiler.ml)): 0.002s
selection(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
comballoc(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
cse(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
deadcode(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
spill(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
split(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
liveness(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
regalloc(sourcefile(./bin/config_whole_compiler.ml)): 0.001s
linearize(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
scheduling(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
emit(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
assemble(sourcefile(./bin/config_whole_compiler.ml)): 0.000s
parsing(./bin/whole_compiler.mli): 0.000s
parsing(./bin/whole_compiler.ml): 0.399s
typing(./bin/whole_compiler.ml): 2.159s
transl(./bin/whole_compiler.ml): 0.534s
generate(./bin/whole_compiler.ml): 10.522s
cmm(sourcefile(./bin/whole_compiler.ml)): 0.293s
compile_phrases(sourcefile(./bin/whole_compiler.ml)): 9.206s
selection(sourcefile(./bin/whole_compiler.ml)): 0.248s
comballoc(sourcefile(./bin/whole_compiler.ml)): 0.040s
cse(sourcefile(./bin/whole_compiler.ml)): 0.150s
deadcode(sourcefile(./bin/whole_compiler.ml)): 0.120s
spill(sourcefile(./bin/whole_compiler.ml)): 0.302s
split(sourcefile(./bin/whole_compiler.ml)): 0.137s
liveness(sourcefile(./bin/whole_compiler.ml)): 0.248s
regalloc(sourcefile(./bin/whole_compiler.ml)): 7.600s
linearize(sourcefile(./bin/whole_compiler.ml)): 0.029s
scheduling(sourcefile(./bin/whole_compiler.ml)): 0.004s
emit(sourcefile(./bin/whole_compiler.ml)): 0.199s
assemble(sourcefile(./bin/whole_compiler.ml)): 0.001s
selection(startup): 0.002s
comballoc(startup): 0.000s
cse(startup): 0.002s
deadcode(startup): 0.001s
spill(startup): 0.008s
split(startup): 0.001s
liveness(startup): 0.003s
regalloc(startup): 0.022s
linearize(startup): 0.000s
scheduling(startup): 0.000s
emit(startup): 0.001s
assemble(startup): 0.000s

real 0m17.806s
user 0m17.079s
sys 0m0.614s
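
To see which pass accounts for the gap, the two listings can be compared pass by pass. The following is a minimal sketch, not part of the compiler: the helpers `parse_line`, `pass_name` and `deltas` are hypothetical, and the input is a hand-picked subset of the lines above (only the entries for the large file, so each pass name appears once per run).

```ocaml
(* Parse one -dtimings line of the form "pass(args): 1.234s". *)
let parse_line line =
  match String.rindex_opt line ':' with
  | None -> None
  | Some i ->
    let name = String.trim (String.sub line 0 i) in
    let rest =
      String.trim (String.sub line (i + 1) (String.length line - i - 1)) in
    let n = String.length rest in
    if n > 1 && rest.[n - 1] = 's' then
      match float_of_string_opt (String.sub rest 0 (n - 1)) with
      | Some t -> Some (name, t)
      | None -> None
    else None

(* Keep only the pass name, dropping the "(file)" argument. *)
let pass_name name =
  match String.index_opt name '(' with
  | Some i -> String.sub name 0 i
  | None -> name

let table lines =
  List.filter_map
    (fun l ->
      match parse_line l with
      | Some (n, t) -> Some (pass_name n, t)
      | None -> None)
    lines

(* Per-pass difference (slow run minus fast run), for passes in both. *)
let deltas slow fast =
  let fast = table fast in
  List.filter_map
    (fun (pass, t) ->
      match List.assoc_opt pass fast with
      | Some t' -> Some (pass, t -. t')
      | None -> None)
    (table slow)

(* Subset copied from the whole_compiler2.ml run (include/struct wrapper). *)
let with_include = [
  "all: 31.103s";
  "typing(./bin/whole_compiler2.ml): 2.176s";
  "spill(sourcefile(./bin/whole_compiler2.ml)): 0.444s";
  "regalloc(sourcefile(./bin/whole_compiler2.ml)): 22.123s";
]

(* Subset copied from the whole_compiler.ml run (separate dummy .mli). *)
let with_mli = [
  "all: 15.094s";
  "typing(./bin/whole_compiler.ml): 2.159s";
  "spill(sourcefile(./bin/whole_compiler.ml)): 0.302s";
  "regalloc(sourcefile(./bin/whole_compiler.ml)): 7.600s";
]

let () =
  List.iter
    (fun (pass, d) -> Printf.printf "%-10s %+.3fs\n" pass d)
    (deltas with_include with_mli)
```

On these numbers, regalloc alone differs by 14.523s out of a total difference of 16.009s, while typing is essentially unchanged.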

vicuna commented Oct 1, 2016

Comment author: @bobzhang

Note that the BuckleScript compiler is self-contained (it works with 4.02, 4.03 and 4.04), so it might be a good example for testing how flambda behaves on such a large file (100K LOC).

vicuna commented Oct 3, 2016

Comment author: @alainfrisch

I confirm the difference between whole_compiler.ml and whole_compiler2.ml; on my machine, with a version synchronized with trunk a few weeks ago:

  • whole_compiler2.ml: 20s
  • whole_compiler.ml : 9.5s

With the -linscan allocator from #375 enabled, the difference is much smaller:

  • whole_compiler2.ml: 7.5s
  • whole_compiler.ml : 7.1s

Another interesting note: the timings above have been obtained on Windows using a direct binary code emitter; using the normal assembly backend (msvc 32-bit port):

  • whole_compiler2.ml: 37s
  • whole_compiler.ml : 24s

It seems Microsoft's assembler does not like such huge files...

vicuna commented Oct 3, 2016

Comment author: @alainfrisch

Proposed fix #832. Compilation time becomes similar between whole_compiler.ml and whole_compiler2.ml.

vicuna commented Dec 21, 2016

Comment author: @alainfrisch

Fixed by commit 9dbcb1e.

@vicuna vicuna closed this Dec 21, 2016

@vicuna vicuna added the middle-end label Mar 14, 2019

@vicuna vicuna added this to the 4.05.0 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019
