-
Notifications
You must be signed in to change notification settings - Fork 296
Refactor logic and optimizations in rebar_erlc_compiler:doterl_compile/4 #467
Conversation
ghost
commented
Mar 29, 2015
- Change format of erlcinfo so that it is more robust and contains enough info for deciding which files need to be recompiled.
- Add tests.
- Remove unneeded code.
Thanks for the patch. @norton and @nevar, can you check that the changes don't break your projects? You forgot to append your name in Please do not rewrite the branch for any followup fixes, because it's a large change and, given GitHub's review system, it's the only way to review further changes. That means, make any fixes in extra commits, even though it's many commits. Can you separately explain the include_path and store_erlc_info changes in more detail, in order to leave no room for interpretation? If it makes sense, maybe as a code comment. |
Don't worry, I am used to doing review by sending just fixups. I have made a force push shortly after initial pull request only because nobody had probably looked at it at that time. Although here the fixups will bring some merge conflicts during final rebase, but I should manage. Actually I am also used to commenting on particular lines of source code instead of writing comments to conversation sections. So if you don't mind I will continue the review in particular commits, where appropriate. |
I think a have adressed all your comments. Either by sending a fixup or by commenting to corresponding commit. |
@Tuncer Here is the result with one project - ubf. I did not check the details of the failure.
|
I don't mind having the fixups as extra commits, though you'd have to fix the commit message in that case.
Of course, it's just that Github's review system is insufficient, and I need a way to keep track of review progress. There's a lot of commits, but I'll try to not miss any of the inline comments. |
needed_files(G, OutDir, [Source|Rest]) -> | ||
Target = target_base(OutDir, Source) ++ ".beam", | ||
Needed = digraph:vertex(G, Source) > {Source, filelib:last_modified(Target)}, | ||
[Source||Needed] ++ needed_files(G, OutDir, Rest); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not:
needed_files(G, OutDir, ErlFileList) ->
[File ||
File <- ErlFileList,
digraph:vertex(G, File) > {File, beam_last_modified(OutDir, File)}].
beam_last_modified(OutDir, Source) ->
filelib:last_modified(target_base(OutDir, Source) ++ ".beam").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have solved it very similarly with lists:filter
as per suggestion by @ferd.
Great work. Looked code. Not much only part related to |
So I think that I have adressed all issues, except that |
@kejv Oops ... I used a private url for the cloning recipe. ubf is not a private repository and is available at GitHub. I updated the recipe above. |
Let's get the regression(s) fixed and move this forward. We can address subjective code issues independently. |
Its case statement is noop.
Also pas only InclDirs into init_erlcinfo as this is the only thing from Config/ErlOpts needed there.
Otherwise the code cannot be safely refactored, since potential new erros slip unnoticed.
As erlcinfo is only an internal optimization mechanism, user should be only minimally bothered by issues with it. Especially it shouldn't ever cause a fatal error and if it cannot be restored properly, only a warning (not error) message should be emitted (if any). Also it probably doesn't make sense for this warning message to be too detailed - just state that something went wrong and silently delete the erlcinfo file.
Current separation of part of the logic into include_path obscures the purpose of Dirs (see expand_file_names). Moreover for each particular Erl its source file was included twice in Dirs, which is now corrected. Also InclDirs (specified in erl_opts) as part of erlcinfo since they can affect resulting graph, which needs to be therefore regenerated when InclDirs change. See added test as an example.
I have squashed all the fixups, fixed merge conflicts along the way and added one additional fixup for that |
@kejv wrote:
Yeah, I found+fixed the initial problems with include_lib only after disabling the catch-all, and then commented it shouldn't unconditionally suppress all errors. I like that we don't do that anymore, and I wonder if we have a means of reporting any other error in a nicer way. I'm thinking of anything that might throw in Other than that, this is ready to merge, isn't it? |
As I said earlier I have experimented a little with syntax errors in compile attributes, hoping that I will get exception here in process_attr/2 instead of in actual compilation, but I didn't (at least the error messages were the same as from erlc directly). So if there's a source of rebar exceptions here I don't of it. |
It would be easier to match |
Ok then. I will do one final rebase and then it can be merged. |
When the source file changes it could happen that some of its dependecies get removed. In that case we should remove these former dependencies from the graph, so that they don't influence recompilation of the source file anymore.
- use filelib:last_modified/1 instead of date/0 and time/0 combo in source file vertices - simplify store_erlcinfo/2 - document init_erlcinfo/2
Optimizing for best compression could be very slow for large projects, so optimize rather for the speed. erlcinfo file isn't that huge anyway and moreover difference between levels 2 and 9 isn't that big in practice (although there's quite big difference between level 2 and no compression at all).
The precise algorithm is now following: - Decide which files need to be compiled first (ErlFirstFiles). - Split AllErlFiles which are not in ErlFirstFiles into two groups, one consisting of files some other file depends on (DepErls) and the rest (OtherErls). - Final FirstErls are obtained as ErlFirstFiles (in original order) plus toposorted DepErls. OtherErls are left as is since they can be compiled in any (random) order. Also create the erlcinfo graph as acyclic, since otherwise the toposort could not work and we don't want to have cycles among dependencies anyway.
These times already contain max modified info of their dependencies. Therefore we no longer have to check in internal_erl_compile whether the file really needs recompiling, which simplifies the flow somewhat, because the work with dependency graph is now localized to much smaller space then before.
As per @Tuncer: "That's most likely a vestige of package module support".
Since process_attr/3 searches source code for attributes, it can happen that it finds an attribute which is eventually not needed by the compilation, e.g. hidden by ifdef macro - see enclosed test. Such attribute can reference a file which is not in the path or which even doesn't exist at all, and current code doesn't expect such a situation. We fix things by simply ignoring such files.
Its format was (repeatedly) changed in previous commits.
@kejv changes look good 👍 Thank you.
|
@ferd, okay to merge? |
Let's see how it goes and if it breaks stuff! |
Refactor logic and optimizations in rebar_erlc_compiler:doterl_compile/4
I saw the same problem reported on #479 where my project uses "include_lib" statements when I updated rebar. This one gives me error: The error is: |
@carlosedp see #485 |