Bad incremental build after APT update: undeclared inclusions #9213

Closed
wchargin opened this issue Aug 20, 2019 · 12 comments
Labels
P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. team-Rules-CPP Issues for C++ rules type: support / not a bug (process)

Comments

@wchargin
Contributor

Description of the problem / feature request:

Building a tree that worked fine yesterday now fails with this error:

ERROR: /HOMEDIR/.cache/bazel/_bazel_wchargin/52a95bbdd50941251730eb33b7476a66/external/zlib_archive/BUILD.bazel:5:1: undeclared inclusion(s) in rule '@zlib_archive//:zlib':
this rule is missing dependency declarations for the following files included by 'external/zlib_archive/inffast.c':
  '/usr/lib/gcc/x86_64-linux-gnu/8/include/stddef.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/syslimits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include/stdarg.h'
Target //tensorboard/components/tf_backend/test:test_chromium failed to build

These four include files are provided by libgcc-8-dev:amd64, which was
updated this morning between the successful build and the failing build.
This seems like a probable culprit?
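
To check which package owns the offending headers, the paths first have to be pulled out of the error text. The following is a minimal sketch, not part of Bazel: the helper name `undeclared_inclusions` is hypothetical, and it assumes the error keeps the layout shown above (one indented, single-quoted absolute path per line). Each printed path could then be handed to `dpkg -S` on a Debian-like system.

```python
import re

# Matches the indented, single-quoted absolute paths that Bazel lists
# under an "undeclared inclusion(s)" error. The layout is assumed from
# the error quoted in this issue.
_PATH_LINE = re.compile(r"^\s+'(/[^']+)'\s*$")


def undeclared_inclusions(error_text):
    """Return the absolute header paths listed in the error, in order."""
    paths = []
    for line in error_text.splitlines():
        match = _PATH_LINE.match(line)
        if match:
            paths.append(match.group(1))
    return paths


SAMPLE = """\
ERROR: /HOMEDIR/.cache/bazel/_bazel_wchargin/52a95bbdd50941251730eb33b7476a66/external/zlib_archive/BUILD.bazel:5:1: undeclared inclusion(s) in rule '@zlib_archive//:zlib':
this rule is missing dependency declarations for the following files included by 'external/zlib_archive/inffast.c':
  '/usr/lib/gcc/x86_64-linux-gnu/8/include/stddef.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/limits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include-fixed/syslimits.h'
  '/usr/lib/gcc/x86_64-linux-gnu/8/include/stdarg.h'
Target //tensorboard/components/tf_backend/test:test_chromium failed to build
"""

if __name__ == "__main__":
    for header in undeclared_inclusions(SAMPLE):
        # Print the path only; running dpkg -S on it is left to the reader.
        print(header)
```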

A coworker of mine, @caisq, encountered the exact same failure
yesterday, with the same undeclared inclusions (@zlib_archive//:zlib).
He ran bazel clean --expunge, which resolved the error. But my
understanding per the bazel clean docs is that any case in which
bazel clean changes the output of a build is considered a
high-priority bug.

Feature requests: what underlying problem are you trying to solve with this feature?

N/A

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

This is a cache poisoning bug. I cannot reproduce it from a clean state.
If I run bazel clean --expunge, the problem will go away. Cloning the
repository into a fresh temp directory and building there works fine.

With my cache, running bazel build //tensorboard is sufficient to
trigger the above error.

What operating system are you running Bazel on?

gLinux (like Debian)

What's the output of bazel info release?

release 0.28.1

If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.

N/A

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

git@github.com:tensorflow/tensorboard.git
master
fatal: ambiguous argument 'master': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
841bdde5ab75cd0a68fdf6b9573190e8a82fc1de

(I don’t use a master branch; I work mostly in detached HEAD states.)

Have you found anything relevant by searching the web?

I’ve found various related issues:

…but none with an actual solution. Modifying an upstream CROSSTOOL file
clearly isn’t the answer, and the others just seem to suggest running
bazel clean.

Any other information, logs, or outputs that you want to share?

This project uses --incompatible_use_python_toolchains=false, but the
builds from yesterday and today were in the same virtualenv with the
same packages. (Also, this doesn’t look like a Python problem.)

@wchargin
Contributor Author

wchargin commented Aug 20, 2019

I need to expunge this cache to proceed with my work, but I have
archived my bazel info output_base directory and can provide it to
Googlers upon request.

(note to self: ~/tmp/bazel_issue_9213_output_base.tar.gz)

@aiuto aiuto added untriaged team-Rules-CPP Issues for C++ rules labels Aug 20, 2019
@aiuto
Contributor

aiuto commented Aug 20, 2019

My hunch is that this only happens with auto-configured toolchains. Those are essentially non-hermetic, so an OS update can break us in cache-poisoning ways.

The best short term solution is bazel clean --expunge.

@wchargin
Contributor Author

I used --toolchain_resolution_debug on the relevant target:

$ bazel build --toolchain_resolution_debug @zlib_archive//:zlib
INFO: Build option --toolchain_resolution_debug has changed, discarding analysis cache.
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, 
INFO: ToolchainResolution: Looking for toolchain of type @bazel_tools//tools/cpp:toolchain_type...
INFO: ToolchainResolution:   Considering toolchain @local_config_cc//:cc-compiler-armabi-v7a...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:arm, which does not match value @platforms//cpu:x86_64 from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:     Toolchain constraint @platforms//os:os has value @platforms//os:android, which does not match value @platforms//os:linux from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:   Rejected toolchain @local_config_cc//:cc-compiler-armabi-v7a, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain @local_config_cc//:cc-compiler-k8...
INFO: ToolchainResolution:   For toolchain type @bazel_tools//tools/cpp:toolchain_type, possible execution platforms and toolchains: {@bazel_tools//platforms:host_platform -> @local_config_cc//:cc-compiler-k8}
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, type @bazel_tools//tools/cpp:toolchain_type -> toolchain @local_config_cc//:cc-compiler-k8
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, 
INFO: Analyzed target @zlib_archive//:zlib (0 packages loaded, 100 targets configured).
INFO: Found 1 target...

This is the same output that I see when I build a cc_library in a
fresh workspace:

$ cd "$(mktemp -d)"
$ >WORKSPACE
$ >BUILD printf '%s\n' 'cc_library(name = "foo", srcs = ["foo.cc"])'
$ >foo.cc
$ bazel build --toolchain_resolution_debug //:foo
Starting local Bazel server and connecting to it...
INFO: ToolchainResolution: Looking for toolchain of type @bazel_tools//tools/cpp:toolchain_type...
INFO: ToolchainResolution:   Considering toolchain @local_config_cc//:cc-compiler-armabi-v7a...
INFO: ToolchainResolution:     Toolchain constraint @platforms//cpu:cpu has value @platforms//cpu:arm, which does not match value @platforms//cpu:x86_64 from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:     Toolchain constraint @platforms//os:os has value @platforms//os:android, which does not match value @platforms//os:linux from the target platform @bazel_tools//platforms:target_platform
INFO: ToolchainResolution:   Rejected toolchain @local_config_cc//:cc-compiler-armabi-v7a, because of target platform mismatch
INFO: ToolchainResolution:   Considering toolchain @local_config_cc//:cc-compiler-k8...
INFO: ToolchainResolution:   For toolchain type @bazel_tools//tools/cpp:toolchain_type, possible execution platforms and toolchains: {@bazel_tools//platforms:host_platform -> @local_config_cc//:cc-compiler-k8}
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, type @bazel_tools//tools/cpp:toolchain_type -> toolchain @local_config_cc//:cc-compiler-k8
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, 
INFO: ToolchainResolution: Selected execution platform @bazel_tools//platforms:host_platform, 
INFO: Analyzed target //:foo (13 packages loaded, 66 targets configured).
INFO: Found 1 target...
Target //:foo up-to-date:
  bazel-bin/libfoo.a
  bazel-bin/libfoo.so

Does this mean that the default toolchains as of Bazel 0.28.1 are
non-hermetic? (I have no user-level .bazelrc.)

@katre
Member

katre commented Aug 21, 2019

Yes, the entire @local_config_cc repository is boxing up the locally installed cc compiler found on the system and making it available to Bazel. This is, unfortunately, non-hermetic. If you want a fully hermetic toolchain, you'll need to check one in reproducibly, either in the main repo or as an http_archive.

I'm not sure what we can do about the actual error: @hlopko and @lberki, is there anything we can do to checksum the compiler state and invalidate it when the local compiler is updated?

@hlopko
Member

hlopko commented Aug 21, 2019

There is no such support in Starlark repositories, but @aehlig
just made me aware of 4a1380f which adds bazel sync --configure that will be used in this situation in the future (after we teach C++ autoconfiguration to use this mechanism). I'll assign to @aehlig now to close/mark this issue as duplicate/comment further.

@hlopko hlopko added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Rules-CPP Issues for C++ rules labels Aug 21, 2019
@aiuto aiuto pinned this issue Aug 23, 2019
@aehlig
Contributor

aehlig commented Aug 23, 2019

> (after we teach C++ autoconfiguration to use this mechanism)

This will happen in https://bazel-review.googlesource.com/c/bazel/+/111372

However, to keep the property that bazel at HEAD can be built with the latest released bazel, this change will only be submitted after the release of bazel 0.29.0 (which will be the first bazel release containing the concept of configure rules).

@aehlig
Contributor

aehlig commented Aug 23, 2019

> I'm not sure what we can do about the actual error: @hlopko and @lberki, is there anything we can do to checksum the compiler state and invalidate it when the local compiler is updated?

@katre Watching the compiler binary for changes is not enough: the compiler calls other programs, or might even be a wrapper itself; auto-configured toolchains also depend on other files like, as in this case, the location of standard header files. So, a complete check would basically mean rerunning the auto-configuration rules unconditionally. This, however, is too expensive to do at every build (think of the time an autotools-like configure script takes to run; that's nothing you want to happen at an otherwise null build). So, the best solution we can offer is to tell users to run bazel sync --configure whenever they change the locally installed toolchains.
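
For users who would like a reminder of when to run bazel sync --configure, one cheap heuristic is to fingerprint the directories the toolchain reads from and compare against a stored stamp. The sketch below is hypothetical and not a Bazel feature: the function names and the stamp-file convention are assumptions, and it only notices changes under the directories it is given (for example `/usr/lib/gcc/x86_64-linux-gnu/8` from the error above). As the comment above explains, it is not a complete check.

```python
import hashlib
import os


def fingerprint_dirs(dirs):
    """Hash the path, size and mtime of every file under the given dirs."""
    digest = hashlib.sha256()
    for root in sorted(dirs):
        for dirpath, dirnames, filenames in os.walk(root):
            dirnames.sort()  # make the traversal order deterministic
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                try:
                    st = os.stat(path)
                except OSError:
                    continue  # broken symlink or file removed mid-walk
                record = f"{path}\0{st.st_size}\0{st.st_mtime_ns}\n"
                digest.update(record.encode("utf-8", "surrogateescape"))
    return digest.hexdigest()


def needs_reconfigure(dirs, stamp_file):
    """Return True if the fingerprint changed since the last call.

    The new fingerprint is written to stamp_file, which should live
    outside the watched directories.
    """
    current = fingerprint_dirs(dirs)
    try:
        with open(stamp_file) as f:
            previous = f.read().strip()
    except FileNotFoundError:
        previous = None
    if previous == current:
        return False
    with open(stamp_file, "w") as f:
        f.write(current + "\n")
    return True
```

A wrapper script could call `needs_reconfigure` before each build and print a hint to run bazel sync --configure when it returns True. Hashing metadata rather than file contents keeps the check fast, at the cost of missing edits that preserve both size and mtime.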

bazel-io pushed a commit that referenced this issue Aug 29, 2019
So that the auto configuration can be re-done by `bazel sync --configure`.

Related to #9213.

Change-Id: Ibc93015d094416a92a969e2d3656d142e6ff850b
PiperOrigin-RevId: 266099342
@dslomov dslomov unpinned this issue Sep 2, 2019
@EliRibble

Is there a workaround for this issue? I've tried bazel clean --expunge and bazel sync --configure but I still get the same error afterwards.

@hlopko
Member

hlopko commented Sep 11, 2019

Can it be that you're using local disk cache or remote cache? If so, #9296 applies to you. Have you tried Bazel 0.29.1?

@EliRibble

I'm on 0.29.1 now and ran bazel clean --expunge and bazel sync --configure and my build worked after that, thanks.
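
The recovery sequence reported above can be scripted. This is only a sketch of the two commands named in this thread, run in the order described; the function name `recover` and its `run` parameter are hypothetical, and the thread only reports the sequence working on Bazel 0.29.1.

```python
import subprocess

# Commands taken from the thread, in the order they were reported.
RECOVERY_COMMANDS = [
    ["bazel", "clean", "--expunge"],
    ["bazel", "sync", "--configure"],
]


def recover(workspace=".", run=subprocess.run):
    """Run each recovery command in the workspace, stopping on failure.

    The `run` parameter exists so the sequence can be exercised without
    Bazel installed; by default it is subprocess.run.
    """
    for cmd in RECOVERY_COMMANDS:
        run(cmd, cwd=workspace, check=True)
```

Note that bazel clean --expunge discards the whole output base, so it is slow to recover from on large projects; if no disk or remote cache is involved, running bazel sync --configure alone may be worth trying first.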

@aehlig
Contributor

aehlig commented Sep 13, 2019

> There is no such support in Starlark repositories, but @aehlig just made me aware of 4a1380f which adds bazel sync --configure that will be used in this situation in the future (after we teach C++ autoconfiguration to use this mechanism). I'll assign to @aehlig now to close/mark this issue as duplicate/comment further.

bazel sync --configure has been added by now, and the CC autoconfiguration rules were marked as configure rules in e1a6877. So I don't see anything actionable from my side any more.

@aehlig aehlig removed their assignment Sep 13, 2019
@philwo philwo added the team-OSS Issues for the Bazel OSS team: installation, release process, Bazel packaging, website label Jun 15, 2020
@philwo philwo added team-Rules-CPP Issues for C++ rules and removed team-OSS Issues for the Bazel OSS team: installation, release process, Bazel packaging, website labels Feb 8, 2021
@oquenchil oquenchil added P4 This is either out of scope or we don't have bandwidth to review a PR. (No assignee) type: support / not a bug (process) and removed untriaged labels Feb 11, 2021
antoninbas added a commit to p4lang/p4runtime that referenced this issue Mar 5, 2021
There seems to be a Bazel issue which can be resolved by expunging the
cache: bazelbuild/bazel#9213
antoninbas added a commit to p4lang/PI that referenced this issue Mar 5, 2021
There seems to be a Bazel issue which can be resolved by expunging the
cache: bazelbuild/bazel#9213
@sgowroji sgowroji added the stale Issues or PRs that are stale (no activity for 30 days) label Feb 15, 2023
@sgowroji
Member

Hi there! We're doing a clean up of old issues and will be closing this one. Please reopen if you’d like to discuss anything further. We’ll respond as soon as we have the bandwidth/resources to do so.

9 participants