Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nim doesn't recompile dependencies when it should, can be fixed via clang -M #510

Open
timotheecour opened this issue Jan 24, 2019 · 15 comments

Comments

@timotheecour
Copy link
Member

timotheecour commented Jan 24, 2019

This issue affects any nim code that interfaces with C, and is a common annoyance with nimterop, causing sometimes hard to diagnose compile time errors, or worse, silently giving wrong outputs at runtime (as shown in test case)
The issue is that nim doesn't recompile the C files if they depend on a header file that was modified since last compilation.

proposal for a fix

for every compiler invocation, eg:

clang -c  -w -I/Users/timothee/git_clone/nim/Nim/build  -I/Users/timothee/git_clone/nim/Nim/lib -I/Users/timothee/git_clone/nim/Nim/tests/misc/nimcachetest -o /Users/timothee/.cache/nim/tmain_d/stdlib_helpers2.c.o /Users/timothee/.cache/nim/tmain_d/stdlib_helpers2.c

generate a call to clang -M, eg:

clang -M -w -I/Users/timothee/git_clone/nim/Nim/build  -I/Users/timothee/git_clone/nim/Nim/lib -I/Users/timothee/git_clone/nim/Nim/tests/misc/nimcachetest /Users/timothee/.cache/nim/tmain_d/stdlib_helpers2.c
stdlib_helpers2.o: /Users/timothee/.cache/nim/tmain_d/stdlib_helpers2.c \
  /Users/timothee/git_clone/nim/Nim/lib/nimbase.h \
  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/include/limits.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/limits.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/sys/cdefs.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/sys/_symbol_aliasing.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/sys/_posix_availability.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/machine/limits.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/i386/limits.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/i386/_limits.h \
  /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include/sys/syslimits.h \
  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/include/stddef.h \
  /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/include/__stddef_max_align_t.h

this tells you the (recursive!) include dependencies, hence the things that should get tracked to validate/invalidate the cache

This should be robust.

[EDIT] note: this approach should work w most/(all?) major C compilers

these all return recursive includes:

  • clang -M
  • gcc -M
  • windows: visual studio C++: /showIncludes (see link )
timotheecour referenced this issue in timotheecour/Nim Jan 24, 2019
timotheecour referenced this issue in timotheecour/Nim Jan 24, 2019
@timotheecour timotheecour changed the title nimcache cache is not robust, should rely on clang -M nim doesn't recompile dependencies when it should, can be fixed via clang -M Jan 24, 2019
@Araq
Copy link
Member

Araq commented Jan 24, 2019

It's not Nim's business, Nim is not a build tool for C code. If the header file changed you use -f.

@timotheecour
Copy link
Member Author

timotheecour commented Jan 24, 2019

If the header file changed you use -f.

that's the thing, user may not know it the header changed. He may not even know there's a header involved, it could be coming from a third party library (say, if using nimterop or any other thing that depends on C that's pretty much guaranteed to happen, and happened many times to me and @genotrance ).
As demo'd in the PR, best case scenario it's a clang/gcc compile error, worst case scenario it's plain old undefined behavior (as in PR, wrong value read at runtime)

using a blanket defensive strategy via -f is your nim.cfg isn't an option for efficiency reasons either.

The fix isn't even hard; it's just a bug, and it's a PR away.

@Araq
Copy link
Member

Araq commented Jan 24, 2019

  1. It is not a bug at all.
  2. The fix relies on clang as far as you told me.
  3. The fix is generally the wrong idea.

If you have .compile: "file.c" in your code and you ship file.c code with your package, there is nothing that could go wrong, except missing header files maybe. You don't develop C and Nim code at the same time without a build tool, or if you do, you should know about -f.

A more general "should this be recompiled?" hook would be fine with me, but not this clang-specific "fix".

@timotheecour
Copy link
Member Author

timotheecour commented Jan 24, 2019

  1. The fix relies on clang as far as you told me.

gcc works exact same, gcc -M

A more general "should this be recompiled?" hook would be fine with me, but not this clang-specific "fix".

it's not clang-specific as mentioned above, it's a general fix that is robust

how about I hide it behind a flag --forceBuild:on|off|trackdeps ?

@Araq
Copy link
Member

Araq commented Jan 24, 2019

how about I hide it behind a flag --forceBuild:on|off|trackdeps ?

I dunno, ideally the implementation is clean enough that it can always be enabled by default.

@genotrance
Copy link

How does this relate to Nim's own incremental compilation? Can some of that effort be leveraged here?

You'd have to run this -M command once per file and cache that list as long as the file doesn't change. At least you can avoid running it for every compile. But if any of the dependencies do change then you could just run -M for that file and update the tracking list.

Looks like we'd have to do something similar in nimterop as well since we cache the Nim output only based on the header changing or not.

@genotrance
Copy link

That is not what {.compile.} does.

The Nim manual clearly states: Note: Nim computes a SHA1 checksum and only recompiles the file if it has changed. You can use the -f command line option to force recompilation of the file.

I have a similar statement in nimterop as well. We could leave it as a user's problem but now that I know I can detect dependencies, I will be using it to make the Nim code caching more robust. No doubt it is gcc/clang specific and Nim is much larger in its purview but I can bet we have superior support even within Nim for these compilers compared to tcc, vcc and others.

No. It's not the job of compile to check all dependencies if it actually needs to do it. It is your job to call the compiler in a way that it actually compiles the file every time, when you have the knowledge that every time you compile your nim program the dependencies will be regenerated.

Well, a pragma like {.compile.} is a feature of the compiler to avoid make files and the like. It may not be full-featured, but does make working C/C++ libs very convenient. If it wasn't of interest, it would never have been added.

Meanwhile, this isn't nimterop specific - we are not talking about Nim code that is dynamically generated and should be recompiled if it changes. {.compile.} is pointing to a C file whose header changed. This can happen to anyone who is working on a C lib and wants to pull it into Nim. Or even for those who do a git pull on a C dependency. If it won't track dependencies, either you nim -f all the time or we will have to use other tricks to inform gcc to compile and link these files.

All in all, the goal is to make interop better. If that is valuable then the compiler can assist with it. If not, the community will have to work around the status quo.

@Araq
Copy link
Member

Araq commented Jan 24, 2019

This can happen to anyone who is working on a C lib and wants to pull it into Nim.

It never happened to me though and it can only happen when you use a feature without reading how it really works. Ok, you cannot read about every feature that you end up using, but at the same time, we cannot put ever more effort into rewarding ignorance.

@genotrance
Copy link

We are talking about improving an existing feature since it has affected us first hand and will inevitably impact our users. Further, we are willing to contribute and improve it with consensus.

We can debate the merits of the design or whether it is worth the effort or added complexity but it isn't an RTFM issue. {.compile.} doesn't track dependencies, okay, but it surely can be better.

Finishing touches are boring, tedious and never ending but they do make a difference.

@demotomohiro
Copy link

There are C/C++ build tools. For example, CMake, Autotools, scons, ninja, etc.
These tools detect header file change used by C/C++ files and recompile them only when they really need to be recompiled.
CMake supports major c/c++ compilers and they generate Makefile for these compilers.
(CMake supports visual studio and they can generate .vcproj files)
CMake and autotools are widely used in many opensource C/C++ projects.
(CMakeLists.txt is a file for CMake)

I used CMake before I use Nim but I never used it with Nim.
I don't know this is really a good idea but it might be better than using clang -M.

Instead of using {.compile.} pragma, use dynlib pragma to import c functions and execute following cmake command to build C/C++ project as dynamic link library before compiling your nim file.
(These might be able to put on your *.nims files or nimble file)

# Configure project and generate Makefiles
#   Execute following command on build directory
cmake "path-to-parent-directory-of-CMakeLists.txt"
#   Or following command might work with CMake 3.13
cmake -S "path-to-parent-directory-of-CMakeLists.txt" -B "path-to-build-directory"

# Build project
cmake --build "path-to-build-directory"

@timotheecour
Copy link
Member Author

timotheecour commented Jan 25, 2019

@krux02

That doesn't really matter, because it doesn't address all of the other supported compilers. And you can't just use clang all the time, first of all clang isn't a dependency of Nim. Also the c file could have compiler dependent behavior. So you need to take the right compiler.

all of the other supported compilers

can you provide a link with the list?

it works with all major compilers AFAIK, eg:

  • clang,
  • gcc
  • visual studio : /showIncludes (see link )
  • it should work with mingw (via gcc)

Maybe there's a niche compiler that doesn't have such option but it's irrelevant; functionality shouldn't be dumbed down the the lowest common denominator at the expense of platforms/environments that do expose such functionality

/cc @Araq

If the header file changed you use -f

nimble doesn't have -f and I shouldn't have to remove ~/.cache/ just to make nimble test work (see nimterop/nimterop#29); not to mention cases where this leads to silent wrong values at runtime w no compile error ;
(and yes, that nimble -f can/should be supported, but that's just fixing a symptom, there will be other more complex cases where nimble itself is called by some other program, etc (eg brew etc) ; I'm interested in fixing the root cause, which is easy to fix via this proposal)

It never cause any problems

as always, it's tempting to dismiss a bug by saying: "I was never affected by it, so this must not be a real issue" but reality is different workflows lead to different bugs encountered.

as often, I'm spending more time explaining the rationale than it would take me to code up the PR

@Araq
Copy link
Member

Araq commented Jan 25, 2019

as often, I'm spending more time explaining the rationale than it would take me to code up the PR

Ok, go for it.

@timotheecour
Copy link
Member Author

timotheecour commented Jan 25, 2019

I just want to point out that the test case is not a test case where tracking of dependencies is necessary.

It is. @krux02 I think you misunderstood the test. The test illustrates what happens when a (implicity, maybe recursive) include depdency changes in between 2 compilations. When reading the test, just mentally treat "D20190123T230907.h" as if it were "/home/user/homebrew/include/opencv2/opencv.hpp"

ie, in real life, it's not generated by the nim application but exists in your filesystem. The fact it's auto-generated in this test is because I want the test to be self contained.

It is an example where it is known at compile time the the generated file needs to be recompiled every time

hopefully above explanation clears that point too.

I also want to point out that tracking every dependency every time will have an impact on compilation speed. All those system files that are very unlikely to change are not small. Tracking them will have a performance impact. Even when the programmer knows that the dependencies don't need to be tracked.

yes, I thought about that and a proper solution will have to do some benchmarking to measure added cost in compile time to check whether it's acceptable as default or whether it needs to be hidden by a flag.

if performance dicates it, It's easy to remove system files from the list of dependencies: we can run clang -E -x c - -v < /dev/null to get default includes:

 /usr/local/include
 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/10.0.0/include
 /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include
 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/usr/include
 /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.14.sdk/System/Library/Frameworks (framework directory)

and subtract out dependencies starting w a default include.

are not small.

file size should have no (amortized) impact:

  • first, check last modification time: if it's older than already compiled foo.o, that dependency is fresh; else (rare case, hence no "amortized" impact), check md5 hash of dependency (eg covers case of cd openc && git checkout foo && git checkout -)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants