bazel clean failed after interruption on Windows #3043

Closed
meteorcloudy opened this issue May 24, 2017 · 10 comments
Labels
P1 I'll work on this now. (Assignee required) platform: windows type: bug

Comments

@meteorcloudy
Member

Description of the problem / feature request / question:

When I was investigating #3025, I didn't reproduce the same error, but found this one.

Using Bazel 0.5.0-rc6 on Windows 10

If possible, provide a minimal example to reproduce the problem:

  1. Try to build //src/main/cpp:client:
bazel build --cpu=x64_windows_msvc --copt=-w --host_copt=-w //src/main/cpp:client
  2. Press Ctrl+C to interrupt the build after a few seconds:
bazel build --cpu=x64_windows_msvc --copt=-w --host_copt=-w //src/main/cpp:client
..............
INFO: Found 1 target...
[38 / 168] Compiling third_party/grpc/src/core/client_config/lb_policies/round_robin.c

Bazel caught interrupt signal; shutting down.
ERROR: C:/tools/msys64/home/pcloudy/workspace/bazel/third_party/grpc/BUILD:55:1: C++ compilation of rule '//third_party/grpc:grpc_unsecure' failed: msvc_cl.bat failed: error executing command external/local_config_cc/wrapper/bin/msvc_cl.bat /DOS_WINDOWS=OS_WINDOWS /DCOMPILER_MSVC /DNOGDI /DNOMINMAX /DPRAGMA_SUPPORTED /D_WIN32_WINNT=0x0600 /D_CRT_SECURE_NO_DEPRECATE /D_CRT_SECURE_NO_WARNINGS ... (remaining 35 argument(s) skipped): com.google.devtools.build.lib.shell.AbnormalTerminationException: Process terminated by signal 2.
Target //src/main/cpp:client failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: build interrupted.
INFO: Elapsed time: 13.840s, Critical Path: 2.99
  3. Run bazel clean, which fails with the following error:
bazel clean
INFO: Starting clean (this may take a while). Consider using --async if the clean takes more than several minutes.
ERROR: C:/tmp/_bazel_pcloudy/7uxoax_v/action_cache/action_journal_v12.blaze (Permission denied).
  4. Try to build again; this fails:
bazel build --cpu=x64_windows_msvc --copt=-w --host_copt=-w //src/main/cpp:client
INFO: Found 1 target...
ERROR: Error during action cache initialization: Failed to load filename index data. Corrupted files were renamed to 'C:/tmp/_bazel_pcloudy/7uxoax_v/action_cache/*.bad'. Blaze will now reset action cache data, causing a full rebuild.
ERROR: couldn't create action cache: Failed to load filename index data. If error persists, use 'blaze clean'.
INFO: Elapsed time: 1.194s

Environment info

  • Operating System: Windows 10

  • Bazel version (output of bazel info release): 0.5.0-rc6 (both MSYS and MSVC version)

@meteorcloudy
Member Author

The only workaround I found is to run bazel shutdown before running clean or build again.
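
The workaround above can be scripted. This is a minimal sketch, assuming `bazel` is on the PATH. The helper name `shutdown_then` and its injectable `run` parameter are hypothetical, added only so the command sequence can be checked without starting a real server.

```python
import subprocess
from typing import Callable, List, Sequence


def shutdown_then(command: Sequence[str],
                  run: Callable[[List[str]], int] = subprocess.call) -> int:
    """Stop the Bazel server, then run `bazel <command>`.

    `run` takes an argv list and returns an exit code. It defaults to
    subprocess.call and can be replaced for dry runs (hypothetical design).
    """
    # `bazel shutdown` stops the running server, which releases the
    # files it was holding open.
    code = run(["bazel", "shutdown"])
    if code != 0:
        return code
    # Only now run the requested command, e.g. ["clean"].
    return run(["bazel", *command])
```

Calling `shutdown_then(["clean"])` would run `bazel shutdown` followed by `bazel clean`.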

@meteorcloudy meteorcloudy changed the title from "bazel clean failed on Windows" to "bazel clean failed after interruption on Windows" on May 24, 2017
@meteorcloudy
Member Author

meteorcloudy commented May 24, 2017

I noticed two more things:

  1. This bug even exists in 0.4.5. :(
  2. If you don't run bazel clean and just rerun bazel build after the interruption, you won't get any error. And if you run bazel clean after the build has finished, it succeeds.
    That explains why we didn't notice this error before: we usually don't run bazel clean after interrupting a build intentionally.

@meteorcloudy
Member Author

FYI, @dslomov @laszlocsomor @damienmg

@damienmg
Contributor

It is not a regression, so let's not block 0.5.0.

@laszlocsomor
Contributor

I suspect the culprit is the same as with every other blaze clean bug on Windows (#2480, #1906, #1586): the Java server holds open file descriptors.

bazel-io pushed a commit that referenced this issue Jun 2, 2017
This allows `bazel clean` to delete this file.

See #1586
See #1906
See #2480
See #3043

Change-Id: I245f368c2f2564511bbe6f06193a3ead49724d7b
PiperOrigin-RevId: 157818284
@laszlocsomor
Contributor

I could repro this bug by interrupting Bazel during action execution, then attempting to clean. The culprit is that Bazel holds the files open, so they cannot be removed.

Workaround: bazel --batch clean. This shuts down the current server (releasing the held files), then launches a new server in batch mode, and cleans the output tree.
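
The `--batch` workaround can be used as a fallback. The sketch below assumes that a failed `bazel clean` exits with a nonzero code, which the logs in this thread do not show. The function name and the injectable `run` parameter are hypothetical.

```python
import subprocess
from typing import Callable, List


def clean_with_batch_fallback(
        run: Callable[[List[str]], int] = subprocess.call) -> int:
    """Try `bazel clean`; if it fails, retry as `bazel --batch clean`."""
    code = run(["bazel", "clean"])
    if code == 0:
        return 0
    # --batch is a startup option, so it goes before the command name.
    return run(["bazel", "--batch", "clean"])
```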

@meteorcloudy
Member Author

Suspected culprit: 184faf6

@laszlocsomor
Contributor

laszlocsomor commented Jun 14, 2017

@meteorcloudy: How did you find that? Bisect?

@meteorcloudy
Member Author

I noticed this commit by blaming PersistentMap.java. I am trying to confirm it by reverting the change.

@meteorcloudy
Member Author

OK, it looks like that commit is not the culprit: after reverting it, the problem still exists. But @ulfjack provided some ideas in #2660:

Clean should tell the action cache to close all files and remove all in-memory data. If it's not doing that right now, then that'd be a bug already. If it does clear the in-memory action cache, then it shouldn't be difficult to close any open files as part of that.
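
The idea of closing open files as part of clearing the cache can be illustrated with a toy example. This is a sketch only, not Bazel's actual Java code: the `JournaledCache` class, its methods, and the `journal.dat` file name are all hypothetical.

```python
import os
import tempfile


class JournaledCache:
    """Toy cache that keeps an append-only journal file open (hypothetical)."""

    def __init__(self, directory: str) -> None:
        self.path = os.path.join(directory, "journal.dat")
        self.entries = {}
        # The handle stays open for the lifetime of the cache.
        self._journal = open(self.path, "ab")

    def put(self, key: str, value: str) -> None:
        self.entries[key] = value
        self._journal.write(f"{key}={value}\n".encode("utf-8"))
        self._journal.flush()

    def clear(self) -> None:
        """Drop in-memory data and close the journal before deleting it."""
        self.entries.clear()
        if not self._journal.closed:
            self._journal.close()
        if os.path.exists(self.path):
            os.remove(self.path)


def demo() -> bool:
    """Return True if the journal file is gone after clear()."""
    with tempfile.TemporaryDirectory() as tmp:
        cache = JournaledCache(tmp)
        cache.put("action", "digest")
        cache.clear()
        return not os.path.exists(cache.path)
```

On Windows, calling `os.remove` while the handle is still open raises `PermissionError`, which resembles the failure reported here. On POSIX systems the removal succeeds either way, so the ordering in `clear()` only matters on Windows.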

@meteorcloudy meteorcloudy self-assigned this Jun 14, 2017