Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ninja scans dependencies before order-only dependencies are run #760

Closed
mathstuf opened this issue May 6, 2014 · 25 comments
Closed

Ninja scans dependencies before order-only dependencies are run #760

mathstuf opened this issue May 6, 2014 · 25 comments

Comments

@mathstuf
Copy link
Contributor

mathstuf commented May 6, 2014

There is a CMake bug about ninja not running this properly:

rule R
    command = $COMMAND
build always: R
    COMMAND = echo 1 >> out
build out: phony out
build out-copy: R out || always
    COMMAND = cp out out-copy
build all: phony out-copy always
default all

Basically, ninja is scanning the dependencies of out-copy before always is run and therefore skips the fact that out has changed. Is it intended that dependency scanning is only guaranteed to be done after direct dependencies, not order-only dependencies or a bug? If the former, documentation that this is the case would be nice.

@nanoant
Copy link

nanoant commented Jun 17, 2014

It seems given example is kind of invalid, why are you setting out as phony if it is not. Why are you expecting Ninja to restat out after running build always if out is NOT set as output of this rule?

Putting build always out: R simply fixes your issue.

@mathstuf
Copy link
Contributor Author

I agree that adding out as an output would fix it, but it is a generated file and the fact that out is created may not be known (it is phony because no rule was known to create it).

I am wondering whether this is intended behavior or something that needed changing (hence the question at the end of the initial report). If it is working as intended, then there needs to be some update on CMake's docs about how to do add_custom_command and add_custom_target chaining properly (and hopefully Ninja would document the behavior that order-only dependencies do not block the "does this target need built" determination). I ask if it could be changed because make (at least the gmake and bmake implementations) don't even consider whether a target needs run until its order-only dependencies have been run and some consistency there would be Nice To Have.

@nanoant
Copy link

nanoant commented Jun 17, 2014

Ninja in fact documentaries this clearly in manual (hence order-only name):

When these are out of date, the output is not rebuilt until they are built, but changes in order-only dependencies alone do not cause the output to be rebuilt.

If you want build-always rule then you need explicit dependency (i.e. always which does not have to exists), so this is really what has to be done:

  rule R
      command = $COMMAND
  build always: R
      COMMAND = echo 1 >> out
  build out-copy: R | always
      COMMAND = cp out out-copy
  build all: phony out-copy
  default all

@mathstuf
Copy link
Contributor Author

I think the case where the docs are unclear is that it states "the output is not rebuilt until they are built" does not convey that Ninja may have already decided that the output does not need to be rerun before the order-only dependency has been run and it never looks twice. This is in contrast with make which doesn't consider a recipe's state before all of its dependencies are complete (including order-only). How is this:

When these are out of date, the output is not rebuilt until they are built, but changes in order-only dependencies alone do not cause the output to be rebuilt. They are also not considered when determining whether the output is out of date or not.

for a clarification in the docs?

I know that there are solutions to this, but unfortunately the information is not always known. In this case, the tool writing the ninja file does not (and, in general, cannot) know that out is written by the always rule to add the dependency from out-copy onto always since it just runs a string of arbitrary commands. On CMake's side, I think the solution is going to be for the Ninja generator to warn when this output is generated to let the user know that more information is required, but I wanted to make sure that this is the actual intended behavior before writing it off.

@bradking
Copy link
Contributor

For reference, there is also discussion on the CMake side of this issue here:

http://www.cmake.org/Bug/view.php?id=14963

It suggests a way CMake could be given more information to enable it to generate valid Ninja files for such cases. However, if Ninja were to have a way to request this behavior (e.g. "restat-dependencies-after-order-only = 1") then the problem would be solved for all existing CMake projects with no changes.

@nanoant
Copy link

nanoant commented Jun 18, 2014

Actually Ninja re-stats all outputs of rules being executed. Problem here it that you expect Ninja to re-stat some file after executing a rule that has NOT specified that file as its output. It is pure coincidence it works in make because build graph doesn't really imply that this file should be re-stat.

Therefore IMHO it is not Ninja bug or flaw and it has to be either fixed in CMake (if it is a fault of generator) by defining properly output of some rule (if it isn't already there) or fixed by developer in CMakeLists.txt properly defining OUTPUT using add_custom_command, instead add_custom_target which does not offer way to specify output of such target.

@bradking
Copy link
Contributor

What I explain in this CMake issue:

http://www.cmake.org/Bug/view.php?id=14963

is exactly what you state. Ninja does not need to be changed for this to be made to work.

Consider this Ninja issue as a feature request. If Ninja were to offer a way to ask that dependencies of a rule be stat-ed after the order-only dependencies are satisfied then CMake could be taught to use this feature and all existing projects using CMake will work unmodified.

@nanoant
Copy link

nanoant commented Jun 18, 2014

@bradking Now I understand, you want somehow make Ninja behave as make would. I got your point.

The question to you is this feature needed to make some of CMake internals (modules) work properly, or it is just for sake of developers so they don't need to worry about rewriting their add_custom_target to add_custom_command with outputs?

So if I understand correctly the reason of Kitware/CMake@539356f12 is the same as above, compatibility with make.

If it is so then I am eager to support this, unless if isn't enforced by default, but triggered either by some cmake command line define, CMakeLists.txt define or policy and described in documentation. WDYT?

As I complained in http://www.cmake.org/Bug/view.php?id=14972 forcing this by default leads to some other harmful side effects (phony rules for all source files for in-source build).

Also note that putting restat-dependencies-after-order-only = 1 by default unconditionally by CMake for all custom commands may possibly degrade rebuild performance too.

@bradking
Copy link
Contributor

@nanoant Okay. Let me explain the history of why CMake works this way. Build systems prior to ninja do not have a feature like "restat = 1". CMake needs to support side-effect outputs that are not always updated. Consider this example:

$ cat build.ninja
rule R
    command = $COMMAND
    # restat = 1 # simulate lack of feature by not specifying restat
build output-stamp output: R | input
    COMMAND = if ! test -f output || ! diff input output >/dev/null ; then cp input output; fi && touch output-stamp
build use-output: R output || output-stamp
    COMMAND = cp output use-output
default use-output

The idea is that output does not always change with input so we use a separate timestamp file to track whether the rule has been run. Let's try some examples:

$ echo 1 >input
$ ninja -v
[1/2] if ! test -f output || ! diff input output >/dev/null ; then cp input output; fi && touch output-stamp
[2/2] cp output use-output
$ ninja -v
ninja: no work to do.
$ echo 2 >input
$ ninja -v
[1/2] if ! test -f output || ! diff input output >/dev/null ; then cp input output; fi && touch output-stamp
[2/2] cp output use-output
$ ninja -v
ninja: no work to do.

So far so good. Now let's update the timestamp of input in a way that does not change output:

$ echo 2 >input
$ ninja -v
[1/2] if ! test -f output || ! diff input output >/dev/null ; then cp input output; fi && touch output-stamp
[2/2] cp output use-output
$ ninja -v
[1/2] if ! test -f output || ! diff input output >/dev/null ; then cp input output; fi && touch output-stamp
[2/2] cp output use-output

Now ninja needs to run the output rule every time because output never becomes newer than input but is listed as an output of the rule. With ninja one could use "restat = 1" to fix this, but CMake evolved long before ninja existed.

A common convention in the CMake world that works with all other supported build systems is to leave off the explicit specification of output so that the build tool considers only the timestamp of input versus output-stamp. We still want use-output to only run when output has really changed, so there is an order-only dependency on output-stamp but a real dependency on output. Then we arrive at the case explained in the original submission of this issue.

Since CMake evolved when no build tools supported restat, it never considered needing enough information to produce the above example with the explicitly listed output. Therefore the add_custom_command and add_custom_target commands do not provide an interface for specifying side-effect outputs. That is what CMake issue 14963 is about. Furthermore, since no such interface exists, no projects specify this information. Therefore fixing this on the CMake side requires both adding the feature to CMake (which is worthwhile regardless of the outcome here) and updating existing projects to use the feature before they can resolve this issue.

@bradking
Copy link
Contributor

After thinking about this further, we do not actually need a re-stat after order-only dependencies. We just need Ninja to delay its initial stat until after order-only dependencies. I think this should just be the way Ninja works and should not need any option.

IIUC, Ninja stats everything up front so it can start as many initial tasks as possible. Furthermore, one of Ninja's design goals is to do as little work as possible prior to starting the first task evaluation. Since rules with an order-only dependency cannot possibly be the first to be evaluated, why do their dependencies need to be stat-ed up front? Just consider each rule for evaluation after its order-only dependencies are satisfied and at that time stat any dependencies that have not yet been stat-ed or known to have been updated as the output of an already-evaluated rule.

For rules without order-only dependencies this approach is equivalent to the current approach. For rules with order-only dependencies this is a sensible behavior. It reduces the amount of disk access that ninja has to do before starting task evaluations. If a rule fails prior to an order-only dependency being satisfied, ninja will have done less total work and failure will be known sooner.

@nanoant
Copy link

nanoant commented Jun 25, 2014

@bradking Wow, these are amazing insights. Indeed such behavior would fix all the issues w/o any need to do some extra work on CMake side.

Looking at Ninja code however I can see this would require massive changes as nodes are stated fist time they appear in build graph and this is done top (target) to bottom (source files) in AddTarget. Then build plan is executed. The easiest workaround would be to introduce restat_rdeps setting for command (in this case CMake's custom command), that will force restat (via RecomputeDirty) on implicit inputs of reverse dependencies of such custom command. This time we would need to mark dirty rdeps bottom to the top.

This looks like smaller change. Would be great to get some opinion on this subject from @martine

@mathstuf
Copy link
Contributor Author

I'm not too familiar with ninja's internals, but when building the graph of things to do, would it be possible to have a DelayedStatNode subclass for targets which should defer their stat until after the order-only dependencies?

sbc100 pushed a commit to sbc100/native_client that referenced this issue Feb 20, 2015
scan_sources.py should always emit the original source files passed as input,
even if it cannot find them. This is necessary to create a proper dependency
arc from the build rule "mojo_nacl" to the generated source "libmojo.cc" from
the "monacl_codegen" build rule..

It's not sufficient to convey this source dependency through ninja depfiles.
This is because the emitted dependency in the mojo_nacl.ninja from "mojo_nacl"
to "monacl_codegen" is order-only and depenencies are scanned before order-only
dependencies are run (ninja-build/ninja#760).

BUG=https://code.google.com/p/chromium/issues/detail?id=440451
TEST=manually modified libmojo.cc.tmpl and rebuilt mojo_nacl.
R=bradnelson@google.com, ncbray@chromium.org

Review URL: https://codereview.chromium.org/798733002

git-svn-id: svn://svn.chromium.org/native_client/trunk/src/native_client@14212 fcba33aa-ac0c-11dd-b9e7-8d5594d729c2
@ahartmetz
Copy link

Is anybody still interested in implementing Brad's suggestion?
I use CMake with Ninja, I find the byproducts thing to be a bit of a bummer because it's hard to explain to people who just want to build something without becoming an expert in build system matters, and I enjoy working on graph algorithms. If nobody else picks this up, I'll add it to the pile of stuff I want to do in "my copious free time" 😊.

@arjanhouben
Copy link

I think this bug should be given more attention, as builds can very easily be incorrect without the user knowing it.
I use a lot of the following constructs.

add_custom_target(
   generate_version
    COMMAND
    <generate version.tmp>
    copy_if_different version.tmp version.h
)
add_executable( my_exe main.cpp version.h )
target_link_libraries( my_exe some_lib )
add_dependencies( my_exe generate_version )

This is an oversimplified example, leaving out the GENERATED property for the version.h file for example.
The problem here is that while my_exe will probably build without issues, if the only thing to change is the contents of version.h, my_exe will not be updated.
This may be obvious enough since the build step will be pretty much instant.
However, as in my case often is, if some_lib is updated which should lead to a different version.h, my_exe is NOT updated, since the only thing to happen is to re-link the executable.
Another problem with this, is that it is non-trivial to check afterwards, as the timestamps of version.h, some_lib and my_exe might actually be exactly as you expect, while still not having included the latest content of version.h in my_exe!

@mathstuf
Copy link
Contributor Author

FYI, the better way to do that is add_custom_command(OUTPUT version.h) followed by add_custom_target(generate_version DEPENDS version.h). The dependencies actually work that way.

@arjanhouben
Copy link

That would not have the same effect, as version.h is not generated if it already exists. The point of the generate_version command is to run always and check for reasons to bump the version, but only trigger a new build if something actually did change.

@mathstuf
Copy link
Contributor Author

Oh, right. Add version.h.noexist to the output list. I'll find the code which does this properly later today…

@bradking
Copy link
Contributor

In the time since this issue was reported I've become very familiar with Ninja internals while implementing features needed for Fortran support. It is pretty clear that what is requested in this issue cannot be implemented without a major re-implementation of Ninja internals. It just does not fit in Ninja's model. Furthermore, deferred stat prevents a monolithic build plan from being computed ahead of time, so progress information cannot be generated reliably. The feature's I've implemented to support Fortran (not yet integrated upstream) do allow build-time dependency discovery in the rare cases that really need it.

Since CMake 3.2 both the add_custom_command and add_custom_target commands support a BYPRODUCTS option meant specifically to add the extra dependency information that is needed to generate a proper build.ninja file without requiring changes to Ninja. With a mature solution on the CMake side, I think this issue can be closed.

@arjanhouben
Copy link

I have been able to create a sample which shows how this solution does not work for RC files.
CMakeLists.txt
generate_version.cmake.txt ( remove .txt )

If you run
cmake -GNinja <source_dir>;ninja
it will complain that it cannot find generate_version.rc
run ninja again and it will work ( because the last run actually did create them )

if you do the same with NMake
cmake -G"NMake Makefiles" <source_dir>;nmake
it will complain that it cannot generate version.cpp

removing the byproducts and instead adding them as outputs for a custom target make the sample work as expected on NMake.
But that is not a solution for Ninja as it doesn't update the files before compiling/linking, leading to a generated executable that is one iteration behind the generated source files.

Not that you can check the validity of the build by comparing the contents of version.cpp, the output of the generate_test executable and the version info ( on windows ) of the executable, for example through sigcheck.exe

@bradking
Copy link
Contributor

@arjanhouben you're missing a dependency:

add_dependencies(generate_test generate_version_files)

@arjanhouben
Copy link

arjanhouben commented Oct 14, 2016

@bradking you are right, that will fix the first time missing file dependency.
It doesn't fix the problem unfortunately, as can be seen:

cmake ...
ninja
./generate_test.exe -> YkJ3PdwtuK
sigcheck.exe generate_test.exe -> YkJ3PdwtuK

so far so good, now I run (only) Ninja again

ninja
./generate_test.exe -> Mss8Ib8WNg
sigcheck.exe generate_test.exe -> YkJ3PdwtuK

Now it is clear that the RC file was NOT used in the binary ( same string as before ) while the version.cpp WAS updated ( new string in executable )
Note that the RC file itself was updated, it just was not used in the creation of the new binary

@bradking
Copy link
Contributor

@arjanhouben I can't reproduce that. It works for me. Either way, this discussion is outside the scope of this issue so if you still have trouble please ask over in CMake land.

@nico nico closed this as completed Oct 23, 2016
@nico nico reopened this Oct 23, 2016
@nico
Copy link
Collaborator

nico commented Oct 23, 2016

Oh sorry, looks like the earlier parts of the issue are still valid.

@nico nico closed this as completed Oct 23, 2016
@nico nico reopened this Oct 23, 2016
@bradking
Copy link
Contributor

looks like the earlier parts of the issue are still valid.

@nico technically yes, but as I noted here solving this will require a fundamental shift in Ninja's model. Unless that is ever a realistic option I don't think there is a reason to keep this open. We now have a mature solution on the CMake side for the original problem that motivated this issue.

@arjanhouben
Copy link

I've created a CMake issue for the out of date RC file here: https://gitlab.kitware.com/cmake/cmake/issues/16417
As @bradking pointed out, it does not seem to be an issue in Ninja, but it is most easily reproduced by using Ninja as a generator

vgvassilev added a commit to vgvassilev/root that referenced this issue Apr 1, 2017
Using the BUILD_BYPRODUCTS directive we tell ninja that those libs will be
created and it is ok to use them as dependencies.

Further info:
 * ninja-build/ninja#760
 * https://cmake.org/Bug/view.php?id=14963
 * https://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=557aef0b
vgvassilev added a commit to vgvassilev/root that referenced this issue Apr 3, 2017
Using the BUILD_BYPRODUCTS directive we tell ninja that those libs will be
created and it is ok to use them as dependencies.

Further info:
 * ninja-build/ninja#760
 * https://cmake.org/Bug/view.php?id=14963
 * https://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=557aef0b
vgvassilev added a commit to vgvassilev/root that referenced this issue Apr 6, 2017
Using the BUILD_BYPRODUCTS directive we tell ninja that those libs will be
created and it is ok to use them as dependencies.

Further info:
 * ninja-build/ninja#760
 * https://cmake.org/Bug/view.php?id=14963
 * https://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=557aef0b
vgvassilev added a commit to root-project/root that referenced this issue Apr 6, 2017
Using the BUILD_BYPRODUCTS directive we tell ninja that those libs will be
created and it is ok to use them as dependencies.

Further info:
 * ninja-build/ninja#760
 * https://cmake.org/Bug/view.php?id=14963
 * https://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=557aef0b
zzxuanyuan pushed a commit to zzxuanyuan/root-compressor-dummy that referenced this issue Apr 13, 2017
Using the BUILD_BYPRODUCTS directive we tell ninja that those libs will be
created and it is ok to use them as dependencies.

Further info:
 * ninja-build/ninja#760
 * https://cmake.org/Bug/view.php?id=14963
 * https://cmake.org/gitweb?p=cmake.git;a=commitdiff;h=557aef0b
@jhasse jhasse closed this as completed Nov 2, 2018
apodolsk pushed a commit to memsql/llvm-project that referenced this issue Jan 5, 2023
Simply saying "every .pcm.o depends on the .pcm" in an -fmodules-codegen
build doesn't suffice. Ninja would need to check .pcm timestamps just
before it starts working on libAllModules, since clang rewrites them as a
side effect of compiling order-only dependencies of libAllModules. But in
fact Ninja is designed to only check timestamps at the very start of the
build (ninja-build/ninja#760).

So instead pipe proper -MD -MT etc. depfile generation support through the
pcm -> .o compilation path, so that Ninja can learn exactly which headers
foo.pcm.o depends on, the same way it learns it for foo.cpp.o. LLVM
already has convenient infra for this, since it needs to be able to
produce depfiles when loading modules for .cpp -> .o compilation too. We
just do the same steps in the .pcm -> .o path now.

In particular: DependencyFileGenerator is the normal depfile code, and the
ASTReader that both paths use to load modules from disk has a
ASTReaderListener::visitInputFile into which it hooks. Each module file
stores refs to its constituent headers, textual includes, and imported
modules in a header, and ASTReader recurses into imported modules. So
visitInputFile gets the full set of transitive includes.

Also, clang wasn't actually forwarding -MMD etc to the -cc1 invocation
when doing .pcm -> .o compilation (the design is: `clang` rephrases your
gcc-compatible invocation into an internal `clang -cc1` invocation), since
those are considered "preprocessor options" but there's no preprocessing
happening. The design bug is that these aren't really preprocessor options
now, but I just chase fix it in Clang.cpp and pass down all preprocessor
options.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants