Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: corrupt output files #2774

Closed
laszlocsomor opened this issue Apr 3, 2017 · 6 comments
Closed

Windows: corrupt output files #2774

laszlocsomor opened this issue Apr 3, 2017 · 6 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) platform: windows type: bug
Milestone

Comments

@laszlocsomor
Copy link
Contributor

laszlocsomor commented Apr 3, 2017

Description of the problem / feature request / question:

Sometimes Bazel produces corrupt output files on Windows. This happens rarely and sporadically so it's hard to repro, but it's a serious problem.

This is likely the cause of #2742 and of the errors I've mentioned in #2742 (comment), as well as some of our intermittent CI failures.

If possible, provide a minimal example to reproduce the problem:

Repro-ing this was hard; I had to run a build and keep interrupting and restarting it. We finally figured out that Bazel may not be deleting outputs of killed actions when the build is interrupted, and upon the next build it uses those partial output files. If for example the C++ linking action mmaps the output file and zeroes the memory, and the linking action is killed before it's done, that can explain the all-zero output files that nonetheless have a right size.

And lo and behold here's a simple repro indeed:

genrule(
    name = "slow",
    outs = ["slow.txt"],
    cmd = "echo slow1 > $@ ; sleep 60; echo slow2 >> $@",
)

genrule(
    name = "fast",
    outs = ["fast.txt"],
    cmd = "echo fast > $@",
)

genrule(
    name = "top",
    srcs = [
        "fast",
        "slow",
    ],
    outs = ["top.txt"],
    cmd = "echo top1 > $@ ; cat $(location :fast) >> $@; cat $(location :slow) >> $@ ; echo top2 >> $@",
)

Now run:

bazel clean  # so we have a clean output tree for the demo
bazel build //foo:top -s

Interrupt the build while the slow action is sleeping, then ls bazel-genfiles/foo/.
It'll have slow.txt with contents "slow1".

Now re-run the build and watch as :slow is not rebuilt, and top is built and top.txt contains "slow1" but not "slow2".

Explanation

When a build or test action fails, it fails the whole build, so Bazel interrupts all running actions. All interrupted actions should be considered failed so even if they produced some output the next build should ignore that.

Unfortunately we ignore InterruptedException in StandaloneSpawnStrategy and, on Windows, consider the action succeeded, and use its partial output in the next build.

The fix is to add

        if (Thread.currentThread().isInterrupted()) {
          throw new InterruptedException();
        }

after cmd.execute.

I'll do that tomorrow and also write a test.

@laszlocsomor laszlocsomor added platform: windows P1 I'll work on this now. (Assignee required) type: bug labels Apr 3, 2017
@laszlocsomor laszlocsomor self-assigned this Apr 3, 2017
@laszlocsomor
Copy link
Contributor Author

Big thanks to both @philwo and @ulfjack for their help in figuring this out.

@laszlocsomor
Copy link
Contributor Author

@dslomov , @damienmg : this is a strict blocker for 0.5.0. I'll fix it today.

@laszlocsomor laszlocsomor added P0 This is an emergency and more important than other current work. (Assignee required) and removed P1 I'll work on this now. (Assignee required) labels Apr 4, 2017
@damienmg
Copy link
Contributor

damienmg commented Apr 4, 2017 via email

@damienmg damienmg added P1 I'll work on this now. (Assignee required) and removed P0 This is an emergency and more important than other current work. (Assignee required) labels Apr 4, 2017
@laszlocsomor
Copy link
Contributor Author

Fixed internally, will be pushed to github later today.

@laszlocsomor
Copy link
Contributor Author

@hlopko
Copy link
Member

hlopko commented Apr 11, 2017

This is just awesome. 0.5 will be an awesome release :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) platform: windows type: bug
Projects
None yet
Development

No branches or pull requests

3 participants