Skip to content
This repository has been archived by the owner on May 12, 2018. It is now read-only.

Speed up the compilation process v5 #129

Merged
merged 3 commits into from Mar 5, 2014
Merged

Speed up the compilation process v5 #129

merged 3 commits into from Mar 5, 2014

Conversation

ghost
Copy link

@ghost ghost commented Sep 1, 2013

Sames as #113 except:

  • trivial fixes and updates
  • build info data refactoring
  • using term_to_binary compression
  • not writing build info file if the graph is unmodified
  • default enabling keep_build_info
  • unconditionally keeping build info
  • making the info file universal and storing it as <base_dir>/.rebarinfo
  • introducing <base_dir>/.rebar/ and storing erlc build info there
  • fixing regressions

@ghost ghost mentioned this pull request Sep 1, 2013
@proger
Copy link

proger commented Sep 11, 2013

I've been using eflame to profile rebar runtimes in cases when all files already compiled (so ideally rebar should finish instantly), but it seems like init_rebarinfo is taking some time to gather infos when it's not necessary — is it possible to make it not do that?

flame graph (the way to build this picture is explained at the bottom of eflame's README)

@ferd
Copy link
Contributor

ferd commented Sep 17, 2013

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

@dizzyd
Copy link
Member

dizzyd commented Sep 20, 2013

@ferd @Tuncer I'm not going to review all 5 versions of this PR - can we please pick one and close the others?

@ferd
Copy link
Contributor

ferd commented Sep 20, 2013

@dizzyd my understanding is that each branch does more than the other. You could review v5 only, and if you don't like a feature or something, it may turn out that v4, v3, v2, or v1 already have the separation for that.

@ghost
Copy link
Author

ghost commented Sep 20, 2013

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

@ferd, @proger, any technical issue with postponing that to after we merge one of the 5 variants first?

@ferd
Copy link
Contributor

ferd commented Sep 20, 2013

No issue. It's only an inefficient thing. It can be fixed without
breaking anything also.

On 09/20, Tuncer Ayaz wrote:

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

@ferd, any technical issue with postponing that to after we merge one of the 5 variants first?


Reply to this email directly or view it on GitHub:
#129 (comment)

@ghost
Copy link
Author

ghost commented Sep 22, 2013

@proger, @ferd, thanks for the init_rebarinfo suggestion. I made the necessary changes.

@archaelus
Copy link

I've started using this version of the patch day-to-day and I like it. Rebar seems super fast now.

👍

@tsloughter
Copy link
Member

I've been using it for a while now. No issues.

@jaredmorrow
Copy link
Contributor

Since this also adds the notion of a new file, I will tag rebar as it is, then test this branch as well as I can and merge it in, any problems with that @Tuncer?

@ghost
Copy link
Author

ghost commented Nov 22, 2013

Works for me.

@ghost
Copy link
Author

ghost commented Dec 6, 2013

While discussing lock-deps and erlc-speedup-v5 with @seth, he suggested to introduce a directory for all info files. That's a good idea, and with that implemented, I consider erlc-speedup-v5 ready to merge. That directory is also a good place to store a rebar lock-deps info file.

@archaelus
Copy link

We've been using this patch on our version of rebar at Heroku and it kicks ass - compilation speed is fantastic and day-to-day development feels much better with this patch in rebar.

💯 I would love to see this patch land in mainline.

@drobakowski
Copy link

Hi @Tuncer and @zinid,

thanks a lot for the great speed improvements! I’ve tried this patch for some projects but I’ve encountered some compilation errors while using it. One project has a behaviour module located in a subfolder of src/ e.g. src/behaviour/some_behaviour.erl. You can easily reproduce the error for example with poolboy:

mkdir src/behaviour && mv src/poolboy_worker.erl src/behaviour/ && cp test/poolboy_test_worker.erl src/

Errors while trying to compile:

==> poolboy (compile)
Compiling src/poolboy_test_worker.erl failed:
src/poolboy_test_worker.erl:3: behaviour poolboy_worker undefined
ERROR: compile failed while processing /private/tmp/poolboy: rebar_abort

Another error appears while trying to compile erlando:

==> erlando (compile)
Compiling src/monad_plus.erl failed:
src/monad_plus.erl:none: undefined parse transform 'do'
ERROR: compile failed while processing /private/tmp/erlando: rebar_abort

Bisecting with git located the first bad commit at 0796ffe, for both projects.

@ghost
Copy link
Author

ghost commented Jan 18, 2014

@drobakowski thanks for the bug report. I'll look into it.

@ghost
Copy link
Author

ghost commented Jan 19, 2014

@drobakowski can you please retest?

@drobakowski
Copy link

@Tuncer compiling erlando now works fine! The first example with the modified poolboy worker still doesn't compile. I've uploaded you the modified poolboy worker src on DB.

@ghost
Copy link
Author

ghost commented Jan 19, 2014

@drobakowski that's right, I didn't fix that yet.

@@ -260,7 +274,7 @@ doterl_compile(Config, OutDir) ->
doterl_compile(Config, OutDir, []).

doterl_compile(Config, OutDir, MoreSources) ->
FirstErls = rebar_config:get_list(Config, erl_first_files, []),
ErlFirstErls = rebar_config:get_list(Config, erl_first_files, []),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable name seems to have a bit of redundancy...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to ErlFirstFiles

@Vagabond
Copy link
Contributor

Vagabond commented Feb 5, 2014

If this can be squashed into a saner commit history, I think this is ready to merge.

@Vagabond
Copy link
Contributor

Vagabond commented Feb 5, 2014

As some anecdotal evidence, it takes a riak rebuild from 20 to 10 seconds and riak_cs build from 10 to 5 seconds, which is very nice for iterative development.

@ghost
Copy link
Author

ghost commented Feb 6, 2014

Before we can merge this, I first need to fix the 2nd bug reported by @drobakowski, and I'd also welcome a review of the code that skips updating the info file if there's no change. Specifically, whether the logic is correct for the case of a compile error, and if in that case a different update strategy is needed.

Once that's all done, I'll look at the commit history.

@Vagabond
Copy link
Contributor

Vagabond commented Feb 6, 2014

The test case provided by @drobakowski passed for me, when I tried it. I thought you had fixed it.

@ghost
Copy link
Author

ghost commented Feb 6, 2014

I have only fixed the first bug. The second problem happens in the 'modified poolboy worker' scenario.

@Vagabond
Copy link
Contributor

Vagabond commented Feb 6, 2014

I can't replicate the error, but maybe I don't understand how to.

@drobakowski
Copy link

@Tuncer the compile errors also occurs with -j1 and also with i386. Building the erlang vm with halfword-emu doesn't seem to work quiet well on a Mac and hence I'm unable to test it with this vm. Sorry, maybe someone else could try to reproduce this with halfword-emu?

@drobakowski
Copy link

I'm able to reproduce the error on Ubuntu Server 12.04.4 LTS (GNU/Linux 3.2.0-27-generic x86_64) with R16B03-1 64-bit halfword-emu.

@Vagabond
Copy link
Contributor

Vagabond commented Feb 7, 2014

FWIW I am on Arch Linux x64 with erlang R16B02.

Andrew

@ghost
Copy link
Author

ghost commented Feb 20, 2014

@drobakowski I have uncovered a 2nd bug and fixed both regressions. Can you please retest?

@drobakowski
Copy link

@Tuncer great, 1c8a9ae and 5e81289 are working fine now but fixing "catch all" in 70ce087 lead to some other error. See the diff note.

@ghost
Copy link
Author

ghost commented Feb 20, 2014

@drobakowski thanks for testing, backed out 70ce087. I'd prefer to not hide all erl_syntax[_lib] errors and replaced it with a TODO comment.

@drobakowski
Copy link

@Tuncer thanks a lot for your great work! :) 👍

@ghost
Copy link
Author

ghost commented Mar 4, 2014

ping

@archaelus
Copy link

This is wonderful in day-to-day use - are there any blockers to merging it into mainline?

@jaredmorrow
Copy link
Contributor

No, I was originally waiting on the bugs @drobakowski found, and hadn't gotten back around to checking in.

@Vagabond
Copy link
Contributor

Vagabond commented Mar 4, 2014

Can you squash this into a saner commit history, please?

@Vagabond
Copy link
Contributor

Vagabond commented Mar 4, 2014

Were the two bugs holding things up fixed here: https://github.com/tuncer/rebar/commit/44ff85aba3a05ad78bd93e18907b2598259a15c3 ?

Also, you fixed the bugs, but this branch adds nothing to the regression tests? Should we have tests for those edge cases?

@ghost
Copy link
Author

ghost commented Mar 4, 2014

Can you squash this into a saner commit history, please?

I could probably squash a8372f7 0303434 88dcb6f 9369311 92be8c1 31ebeb1, but I'm not sure we'd want to collapse all the regression fixes and behavioral changes into one or two commits. If that's sufficient, I'll make the changes tomorrow.

@Vagabond
Copy link
Contributor

Vagabond commented Mar 4, 2014

Are they regressions that affected rebar before this branch, or were they regressions added in the branch? I'm fine with >1 commits for this, but 26 is pretty far beyond what I'd expect for a change of this size. It isn't even easy to tie all these back to this feature being added, if someone was to do a git blame.

@ghost
Copy link
Author

ghost commented Mar 4, 2014

Were the two bugs holding things up fixed here: tuncer@44ff85a ?

Yes.

Also, you fixed the bugs, but this branch adds nothing to the regression tests? Should we have tests for those edge cases?

We don't really have a test suite for rebar_erlc_compiler, but we should create one.

@Vagabond
Copy link
Contributor

Vagabond commented Mar 4, 2014

Or we can add an inttest that mirrors the issue.

@ghost
Copy link
Author

ghost commented Mar 4, 2014

Are they regressions that affected rebar before this branch, or were they regressions added in the branch?

New regressions added by the initial commit.

I'm fine with >1 commits for this, but 26 is pretty far beyond what I'd expect for a change of this size. It isn't even easy to tie all these back to this feature being added, if someone was to do a git blame.

If you want me to collapse it into a couple commits, I can try to do that, but it would be a couple commits with long commit messages. Keeping the various fixes and behavioral changes separate helps if we have to git-bisect a problem.

Anyway, I'll see what I can squash.

@Vagabond
Copy link
Contributor

Vagabond commented Mar 4, 2014

I'd rather make git bisect and git blame work better than have so many commits.

@ghost
Copy link
Author

ghost commented Mar 4, 2014

Or we can add an inttest that mirrors the issue.

That's what I meant by test suite. I'm not sure when I'll have time for that. So, if anyone wants to speed up the merge process, here's what inttest/erlc should most most likely do:

  • single demo project that uses all features and file types supported by rebar_erlc_compiler and also requires non-trivial compile priority to successfully build
  • initial erl_opts should not have no_debug_info
  • run rebar compile
  • verify result
  • add no_debug_info to erl_opts
  • run rebar clean compile
  • verify that beam files have no debug info

zinid and others added 2 commits March 5, 2014 15:20
* Do not parse source files twice while checking for relationship.
* Keep files relationships in a graph.
* The option 'keep_build_info' is introduced. When set to 'true'
  the graph will be kept in ebin/.rebar.build.info and will be
  used by further compiler calls. The default is 'false'.
@ghost
Copy link
Author

ghost commented Mar 5, 2014

I'd rather make git bisect and git blame work better than have so many commits.

Collapsed commits as much as possible.

* update files
* fix Dialyzer warning
* unconditionally enable info fil
* clean-up inconsistencies
* use term_to_binary compression
* use try...catch instead of case...catch...of
* do not write build info file if the graph is unmodified
* store info file as <base_dir>/.rebarinfo
* properly support list of compile directives
* fix regressions:
 - Fix a bug in handling of files to compile first.
 - If a file that is depended upon itself depends on other files, make sure
   those are compiled first. While at it, rename variables for correctness.
   Reported-by: David Robakowski
 - Make sure that FirstFiles has no dupes and preserves the proper order.
 - headers referenced via -include_lib() were not properly resolved to absolute
   filenames
 - .erl files found in sub dirs of src_dirs were not properly resolved to
   absolute filenames
@Vagabond
Copy link
Contributor

Vagabond commented Mar 5, 2014

Thank you. I will look at this today.

Vagabond added a commit that referenced this pull request Mar 5, 2014
Speed up the compilation process v5
@Vagabond Vagabond merged commit 62b0062 into rebar:master Mar 5, 2014
@ghost ghost deleted the erlc-speedup-v5 branch March 5, 2014 18:47
@ghost ghost mentioned this pull request Mar 5, 2014
@ghost
Copy link
Author

ghost commented Mar 5, 2014

Test suite included in #242.

jaredmorrow added a commit that referenced this pull request May 19, 2014
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
10 participants