Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Speed up the compilation process v5 #129

Merged
merged 3 commits into from Mar 5, 2014

Conversation

Projects
None yet
Contributor

tuncer commented Sep 1, 2013

Sames as #113 except:

  • trivial fixes and updates
  • build info data refactoring
  • using term_to_binary compression
  • not writing build info file if the graph is unmodified
  • default enabling keep_build_info
  • unconditionally keeping build info
  • making the info file universal and storing it as <base_dir>/.rebarinfo
  • introducing <base_dir>/.rebar/ and storing erlc build info there
  • fixing regressions

proger commented Sep 11, 2013

I've been using eflame to profile rebar runtimes in cases when all files already compiled (so ideally rebar should finish instantly), but it seems like init_rebarinfo is taking some time to gather infos when it's not necessary — is it possible to make it not do that?

flame graph (the way to build this picture is explained at the bottom of eflame's README)

Contributor

ferd commented Sep 17, 2013

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

Owner

dizzyd commented Sep 20, 2013

@ferd @tuncer I'm not going to review all 5 versions of this PR - can we please pick one and close the others?

Contributor

ferd commented Sep 20, 2013

@dizzyd my understanding is that each branch does more than the other. You could review v5 only, and if you don't like a feature or something, it may turn out that v4, v3, v2, or v1 already have the separation for that.

Contributor

tuncer commented Sep 20, 2013

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

@ferd, @proger, any technical issue with postponing that to after we merge one of the 5 variants first?

Contributor

ferd commented Sep 20, 2013

No issue. It's only an inefficient thing. It can be fixed without
breaking anything also.

On 09/20, Tuncer Ayaz wrote:

Regarding init_rebarinfo, I'm guessing the biggest gains in time could be to confirm whether the graph has been modified or not on a run, and then avoiding re-writing the file when that happens.

Right now it looks like the entire project is serialized again every time, and flushed back to disk, even if nothing changed.

That would be a low hanging fruit, I guess.

@ferd, any technical issue with postponing that to after we merge one of the 5 variants first?


Reply to this email directly or view it on GitHub:
#129 (comment)

Contributor

tuncer commented Sep 22, 2013

@proger, @ferd, thanks for the init_rebarinfo suggestion. I made the necessary changes.

I've started using this version of the patch day-to-day and I like it. Rebar seems super fast now.

👍

Member

tsloughter commented Nov 21, 2013

I've been using it for a while now. No issues.

Contributor

jaredmorrow commented Nov 21, 2013

Since this also adds the notion of a new file, I will tag rebar as it is, then test this branch as well as I can and merge it in, any problems with that @tuncer?

Contributor

tuncer commented Nov 22, 2013

Works for me.

Contributor

tuncer commented Dec 6, 2013

While discussing lock-deps and erlc-speedup-v5 with @seth, he suggested to introduce a directory for all info files. That's a good idea, and with that implemented, I consider erlc-speedup-v5 ready to merge. That directory is also a good place to store a rebar lock-deps info file.

We've been using this patch on our version of rebar at Heroku and it kicks ass - compilation speed is fantastic and day-to-day development feels much better with this patch in rebar.

💯 I would love to see this patch land in mainline.

Hi @tuncer and @zinid,

thanks a lot for the great speed improvements! I’ve tried this patch for some projects but I’ve encountered some compilation errors while using it. One project has a behaviour module located in a subfolder of src/ e.g. src/behaviour/some_behaviour.erl. You can easily reproduce the error for example with poolboy:

mkdir src/behaviour && mv src/poolboy_worker.erl src/behaviour/ && cp test/poolboy_test_worker.erl src/

Errors while trying to compile:

==> poolboy (compile)
Compiling src/poolboy_test_worker.erl failed:
src/poolboy_test_worker.erl:3: behaviour poolboy_worker undefined
ERROR: compile failed while processing /private/tmp/poolboy: rebar_abort

Another error appears while trying to compile erlando:

==> erlando (compile)
Compiling src/monad_plus.erl failed:
src/monad_plus.erl:none: undefined parse transform 'do'
ERROR: compile failed while processing /private/tmp/erlando: rebar_abort

Bisecting with git located the first bad commit at 0796ffe, for both projects.

Contributor

tuncer commented Jan 18, 2014

@drobakowski thanks for the bug report. I'll look into it.

Contributor

tuncer commented Jan 19, 2014

@drobakowski can you please retest?

@tuncer compiling erlando now works fine! The first example with the modified poolboy worker still doesn't compile. I've uploaded you the modified poolboy worker src on DB.

Contributor

tuncer commented Jan 19, 2014

@drobakowski that's right, I didn't fix that yet.

@Vagabond Vagabond and 1 other commented on an outdated diff Feb 5, 2014

src/rebar_erlc_compiler.erl
@@ -260,7 +274,7 @@ doterl_compile(Config, OutDir) ->
doterl_compile(Config, OutDir, []).
doterl_compile(Config, OutDir, MoreSources) ->
- FirstErls = rebar_config:get_list(Config, erl_first_files, []),
+ ErlFirstErls = rebar_config:get_list(Config, erl_first_files, []),
@Vagabond

Vagabond Feb 5, 2014

Contributor

This variable name seems to have a bit of redundancy...

@tuncer

tuncer Mar 5, 2014

Contributor

Renamed to ErlFirstFiles

Contributor

Vagabond commented Feb 5, 2014

If this can be squashed into a saner commit history, I think this is ready to merge.

Contributor

Vagabond commented Feb 5, 2014

As some anecdotal evidence, it takes a riak rebuild from 20 to 10 seconds and riak_cs build from 10 to 5 seconds, which is very nice for iterative development.

Contributor

tuncer commented Feb 6, 2014

Before we can merge this, I first need to fix the 2nd bug reported by @drobakowski, and I'd also welcome a review of the code that skips updating the info file if there's no change. Specifically, whether the logic is correct for the case of a compile error, and if in that case a different update strategy is needed.

Once that's all done, I'll look at the commit history.

Contributor

Vagabond commented Feb 6, 2014

The test case provided by @drobakowski passed for me, when I tried it. I thought you had fixed it.

Contributor

tuncer commented Feb 6, 2014

I have only fixed the first bug. The second problem happens in the 'modified poolboy worker' scenario.

Contributor

Vagabond commented Feb 6, 2014

I can't replicate the error, but maybe I don't understand how to.

Contributor

tuncer commented Feb 6, 2014

Have you tried building poolboy from the archive @drobakowski linked to?

Contributor

Vagabond commented Feb 6, 2014

Yes

Contributor

tuncer commented Feb 6, 2014

I was able to reliably reproduce the following error 18 days ago, and the debugging I did back then was inconclusive.

==> poolboy (compile)
Compiling src/poolboy_test_worker.erl failed:
src/poolboy_test_worker.erl:3: behaviour poolboy_worker undefined
ERROR: compile failed while processing /private/tmp/poolboy: rebar_abort

I wasn't able to reproduce the error today. The only difference is that the erlang vm I used for testing 18 days ago was i386, and today I used a amd64 halfword-emu vm. I don't have access to a i386 vm right now, so can someone verify with i386 R16B03-1?

@drobakowski, can you still reproduce the poolboy error? If you can, what are the platform details?

Yes I can still reproduce the error with the latest commit from tuncer/rebar@5b88515 on a x86_64 / Darwin Kernel Version 13.0.0 with 64-bit R16B03-1.

Contributor

tuncer commented Feb 6, 2014

Thanks, if you have time, I'd welcome verification it fails for you with halfword-emu and i386, too.

Contributor

tuncer commented Feb 6, 2014

@drobakowski can you also test rebar -j1 compile?

Yeah sure, I will try to spend some time tomorrow morning for building the proper erlang vms.

@tuncer the compile errors also occurs with -j1 and also with i386. Building the erlang vm with halfword-emu doesn't seem to work quiet well on a Mac and hence I'm unable to test it with this vm. Sorry, maybe someone else could try to reproduce this with halfword-emu?

I'm able to reproduce the error on Ubuntu Server 12.04.4 LTS (GNU/Linux 3.2.0-27-generic x86_64) with R16B03-1 64-bit halfword-emu.

Contributor

Vagabond commented Feb 7, 2014

FWIW I am on Arch Linux x64 with erlang R16B02.

Andrew

Contributor

tuncer commented Feb 20, 2014

@drobakowski I have uncovered a 2nd bug and fixed both regressions. Can you please retest?

@tuncer great, 1c8a9ae and 5e81289 are working fine now but fixing "catch all" in 70ce087 lead to some other error. See the diff note.

Contributor

tuncer commented Feb 20, 2014

@drobakowski thanks for testing, backed out 70ce087. I'd prefer to not hide all erl_syntax[_lib] errors and replaced it with a TODO comment.

@tuncer thanks a lot for your great work! :) 👍

Contributor

tuncer commented Mar 4, 2014

ping

This is wonderful in day-to-day use - are there any blockers to merging it into mainline?

Contributor

jaredmorrow commented Mar 4, 2014

No, I was originally waiting on the bugs @drobakowski found, and hadn't gotten back around to checking in.

Contributor

Vagabond commented Mar 4, 2014

Can you squash this into a saner commit history, please?

Contributor

Vagabond commented Mar 4, 2014

Were the two bugs holding things up fixed here: tuncer/rebar@44ff85a ?

Also, you fixed the bugs, but this branch adds nothing to the regression tests? Should we have tests for those edge cases?

Contributor

tuncer commented Mar 4, 2014

Can you squash this into a saner commit history, please?

I could probably squash a8372f7 0303434 88dcb6f 9369311 92be8c1 31ebeb1, but I'm not sure we'd want to collapse all the regression fixes and behavioral changes into one or two commits. If that's sufficient, I'll make the changes tomorrow.

Contributor

Vagabond commented Mar 4, 2014

Are they regressions that affected rebar before this branch, or were they regressions added in the branch? I'm fine with >1 commits for this, but 26 is pretty far beyond what I'd expect for a change of this size. It isn't even easy to tie all these back to this feature being added, if someone was to do a git blame.

Contributor

tuncer commented Mar 4, 2014

Were the two bugs holding things up fixed here: tuncer@44ff85a ?

Yes.

Also, you fixed the bugs, but this branch adds nothing to the regression tests? Should we have tests for those edge cases?

We don't really have a test suite for rebar_erlc_compiler, but we should create one.

Contributor

Vagabond commented Mar 4, 2014

Or we can add an inttest that mirrors the issue.

Contributor

tuncer commented Mar 4, 2014

Are they regressions that affected rebar before this branch, or were they regressions added in the branch?

New regressions added by the initial commit.

I'm fine with >1 commits for this, but 26 is pretty far beyond what I'd expect for a change of this size. It isn't even easy to tie all these back to this feature being added, if someone was to do a git blame.

If you want me to collapse it into a couple commits, I can try to do that, but it would be a couple commits with long commit messages. Keeping the various fixes and behavioral changes separate helps if we have to git-bisect a problem.

Anyway, I'll see what I can squash.

Contributor

Vagabond commented Mar 4, 2014

I'd rather make git bisect and git blame work better than have so many commits.

Contributor

tuncer commented Mar 4, 2014

Or we can add an inttest that mirrors the issue.

That's what I meant by test suite. I'm not sure when I'll have time for that. So, if anyone wants to speed up the merge process, here's what inttest/erlc should most most likely do:

  • single demo project that uses all features and file types supported by rebar_erlc_compiler and also requires non-trivial compile priority to successfully build
  • initial erl_opts should not have no_debug_info
  • run rebar compile
  • verify result
  • add no_debug_info to erl_opts
  • run rebar clean compile
  • verify that beam files have no debug info

zinid and others added some commits Jul 1, 2013

@zinid @tuncer zinid Speed up the compilation process
* Do not parse source files twice while checking for relationship.
* Keep files relationships in a graph.
* The option 'keep_build_info' is introduced. When set to 'true'
  the graph will be kept in ebin/.rebar.build.info and will be
  used by further compiler calls. The default is 'false'.
fd17693
@nox @tuncer nox erlc: add support for detecting core transforms 7742d70
Contributor

tuncer commented Mar 5, 2014

I'd rather make git bisect and git blame work better than have so many commits.

Collapsed commits as much as possible.

@tuncer tuncer erlc: clean-up, enhance, and regression fix fd17693
* update files
* fix Dialyzer warning
* unconditionally enable info fil
* clean-up inconsistencies
* use term_to_binary compression
* use try...catch instead of case...catch...of
* do not write build info file if the graph is unmodified
* store info file as <base_dir>/.rebarinfo
* properly support list of compile directives
* fix regressions:
 - Fix a bug in handling of files to compile first.
 - If a file that is depended upon itself depends on other files, make sure
   those are compiled first. While at it, rename variables for correctness.
   Reported-by: David Robakowski
 - Make sure that FirstFiles has no dupes and preserves the proper order.
 - headers referenced via -include_lib() were not properly resolved to absolute
   filenames
 - .erl files found in sub dirs of src_dirs were not properly resolved to
   absolute filenames
b690842
Contributor

Vagabond commented Mar 5, 2014

Thank you. I will look at this today.

@Vagabond Vagabond added a commit that referenced this pull request Mar 5, 2014

@Vagabond Vagabond Merge pull request #129 from tuncer/erlc-speedup-v5
Speed up the compilation process v5
62b0062

@Vagabond Vagabond merged commit 62b0062 into rebar:master Mar 5, 2014

1 check passed

default The Travis CI build passed
Details

@tuncer tuncer deleted the tuncer:erlc-speedup-v5 branch Mar 5, 2014

@tuncer tuncer referenced this pull request Mar 5, 2014

Merged

Extra commits for #129 #242

Contributor

tuncer commented Mar 5, 2014

Test suite included in #242.

@jaredmorrow jaredmorrow pushed a commit that referenced this pull request May 19, 2014

Jared Morrow Merge pull request #242 from tuncer/erlc-speedup-v5-fixup
Extra commits for #129
fdb66ec
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment