Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always pass -D__BAZEL__ to C compiler #2160

Closed
jart opened this issue Dec 1, 2016 · 20 comments
Closed

Always pass -D__BAZEL__ to C compiler #2160

jart opened this issue Dec 1, 2016 · 20 comments
Assignees
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-CPP Issues for C++ rules type: feature request

Comments

@jart
Copy link
Contributor

jart commented Dec 1, 2016

As we know, C++ headers have quadratic stat() multiplied by all the -isystem flags Bazel passes to gcc. But if all include paths are vendored to the Bazel execroot, it's only quadratic, and compiling goes way faster. But rewriting include lines breaks people who want to compile the sources for a particular project outside of Bazel. It also creates headaches for people packaging libraries for Linux distros.

Maybe the solution is to write code like this:

#ifdef __BAZEL__
#  include "external/zlib/zlib.h"
#else
#  include <zlib.h>
#endif

If that is the optimal approach, then maybe it would make sense to make that a standard define, so -D__BAZEL__ doesn't need to be manually passed copts.

@iirina iirina added category: rules > C++ P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) type: feature request labels Dec 1, 2016
@abergmeier
Copy link
Contributor

Why then not take the more general approach and have a VEND_HDR_PREFIX defined, which can be used like so:

#ifdef VEND_HDR_PREFIX
#  include " VEND_HDR_PREFIX /zlib.h"
#else
#  include <zlib.h>
#endif

This way you have at least a slim chance to upstream that.

IIRC, defines with two underscores and then uppercase are reserved for the standard (implementor).

@jart
Copy link
Contributor Author

jart commented Dec 2, 2016

Well there would be a different prefix for every third party library, because they're all in their own bazel repositories.

@lberki
Copy link
Contributor

lberki commented Dec 2, 2016

Does the performance matter in practice? I was under the impression that the look-for-every-include-under-every-include-path-entry algorithm isn't a concern for Bazel, because there aren't that many include path entries.

@abergmeier-dsfishlabs
Copy link
Contributor

abergmeier-dsfishlabs commented Dec 2, 2016

Should this not perhaps be rather solved at Compiler invocation level? I would rather have a map which you feed the compiler so it no longer searches for <zlib.h> but is taught where it is. Especially since a lot of that information is already available in Bazel.

@lberki
Copy link
Contributor

lberki commented Dec 2, 2016

Does any compiler at the moment provide that functionality?

@abergmeier-dsfishlabs
Copy link
Contributor

Not that I know of. Would be awesome, though 🤓

@jart
Copy link
Contributor Author

jart commented Dec 3, 2016

Does the performance matter in practice?

I ran strace on GCC invocations for TensorFlow and determined those stat() system calls were >50% of wall time. So yes, but with a grain of salt.

Part of the problem is that TensorFlow's build configuration is not optimal. It basically globs everything into one big cc_library. This means every single one of our dependencies (nearly all of which specify the includes attribute) are part of the compilation for nearly every .cc file. If TensorFlow broke up that cc_library into more finely grained targets, it would make TensorFlow builds faster. But it would not make builds faster for downstream targets referencing TensorFlow as a single label.

Should this not perhaps be rather solved at Compiler invocation level? I would rather have a map which you feed the compiler [...]

That would be the optimal solution. I spent an hour digging through the GCC docs looking for exactly that a month ago. Couldn't find anything that would let me do that. If we know any GCC experts it would be a good idea to consult them though.

@abergmeier-dsfishlabs
Copy link
Contributor

@jart For Tensorflow then, I would expect a persistent compiler (e.g. GCC) process to be another (better) option. This one could cache include file locations on previous stat calls.
That being said - do not know whether there is a variant like that.
A quick Google found this however (at least interesting): http://clang.llvm.org/doxygen/classclang_1_1FileSystemStatCache.html

@abergmeier-dsfishlabs
Copy link
Contributor

Perhaps cc @DanAlbert?

@DanAlbert
Copy link

Not sure what I can offer here.

@abergmeier-dsfishlabs
Copy link
Contributor

@DanAlbert Maybe you could enlighten us whether there is a persistent gcc/clang compiler, with advanced caching of includes, etc?

@ulfjack
Copy link
Contributor

ulfjack commented Dec 8, 2016

I think Bazel should not put the remote packages under external, but just under the name of the remote repository. So for zlib, the files would be under zlib/. That way, you can make it
#include <zlib/zlib.h>
or
#include "zlib/zlib.h"

Ideally, upstream would install the files under /usr/lib/include/zlib/ instead of mashing everything into a single directory, which seems very non-ideal if you ask me. If they don't want to do that (backwards compat?), maybe they can at least provide an optional mechanism?

@DanAlbert
Copy link

I certainly haven't heard of anything like it. Aren't modules supposed to be the solution for this?

@jart
Copy link
Contributor Author

jart commented Dec 13, 2016

Yes but when are modules actually going to happen? And will they require existing codebases to be rewritten in order to reap the benifits? https://youtu.be/ND-TuW0KIgg

@abergmeier-dsfishlabs
Copy link
Contributor

Yes but when are modules actually going to happen?

Depending on when the different directions of Microsoft and Clang are resolved. Soonest seems C++20. And when I look at the pace that XCode ships Clang variants it way well take until 2022 on Mac.
That is actually the reason why I thought a persistent gcc compiler may exist. Because modules will not be a solution for cross platform for a LONG time.

And will they require existing codebases to be rewritten in order to reap the benifits?

To a certain extend yes. Seems more rewrite for Microsoft (<- am prefering) than Clang approach.

@hlopko
Copy link
Member

hlopko commented Oct 25, 2017

Aren't header maps solving this problem? We have an external contribution at #3712. Could you take a look at it and comment if it will help your use case?

@ghost
Copy link

ghost commented Apr 26, 2019

Another use case for implicit -D__BAZEL__ is code which, when built with Bazel, refers to auxiliary files from a Bazel-generated runfiles tree.

#ifdef __BAZEL__
#  include "tools/cpp/runfiles/runfiles.h"
#endif 

// ...

int main(int argc, char *argv[]) {

#ifdef __BAZEL__
  std::string error;
  std::unique_ptr<Runfiles> runfiles(Runfiles::Create(argv[0], &error));
  if (runfiles == nullptr) { /* ... */ }
  std::string path = runfiles->Rlocation("my_workspace/path/to/my/data.txt");
#else 
  std::string path("some/legacy/path/to/my/data.txt");
#endif // ifdef __BAZEL__

@hlopko hlopko removed their assignment Dec 6, 2019
@fmeum
Copy link
Collaborator

fmeum commented Dec 12, 2022

With Bazel 6, all Bazel cc_* rules define BAZEL_CURRENT_REPOSITORY to the name of the repository containing the target, which can be used for #2160 (comment). Relying on it for #2160 (comment) would seem a bit hacky though.

Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Feb 16, 2024
Copy link

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please post @bazelbuild/triage in a comment here and we'll take a look. Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-CPP Issues for C++ rules type: feature request
Projects
None yet
Development

No branches or pull requests

9 participants