Windows repository_ctx.execute(command) sometimes returns incomplete result #2675

Closed
meteorcloudy opened this issue Mar 14, 2017 · 18 comments
Labels
P1 I'll work on this now. (Assignee required) platform: windows type: bug

@meteorcloudy
Member

This is causing #2434 and the internal Kokoro Windows flakiness: missing "LIB".

Extracting Bazel installation...
................................
____Loading package: src
____Loading package: @bazel_tools//tools/cpp
____Loading package: @bazel_tools//tools/jdk
____Loading package: @local_config_xcode//
____Loading package: @local_jdk//
ERROR: in target '//external:cc_toolchain': no such package '@local_config_cc//': key "LIB" not found in dictionary.
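
A minimal Python sketch of how truncated command output could lead to an error like the one above. It assumes, purely for illustration, that the toolchain setup parses `KEY=VALUE` lines from a command's stdout into a dictionary; `parse_env` and the sample values are hypothetical and are not taken from cc_configure.bzl.

```python
def parse_env(stdout):
    """Parse 'KEY=VALUE' lines from command output into a dict (hypothetical helper)."""
    env = {}
    for line in stdout.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            env[key.strip()] = value.strip()
    return env


# Hypothetical outputs: the second one is cut short before the LIB line.
full_output = "PATH=C:/tools\nINCLUDE=C:/inc\nLIB=C:/lib\n"
truncated_output = "PATH=C:/tools\nINCLUDE=C:/inc\n"

print("LIB" in parse_env(full_output))
print("LIB" in parse_env(truncated_output))
```

If the output is incomplete, a later lookup of `env["LIB"]` fails even though the command itself reported no error.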
@meteorcloudy meteorcloudy added platform: windows P1 I'll work on this now. (Assignee required) type: bug labels Mar 14, 2017
@meteorcloudy meteorcloudy self-assigned this Mar 14, 2017
@ulfjack
Contributor

ulfjack commented Mar 17, 2017

Any progress? This is killing my presubmits.

@meteorcloudy
Member Author

Still haven't found the root cause, but it looks like there is already a workaround for the Kokoro presubmit (cl/150360798)? I'll keep digging...

@meteorcloudy
Member Author

This issue might have something to do with #2774
Yet, I could never reproduce this on my local machine.
//cc @laszlocsomor

@meteorcloudy
Member Author

@laszlocsomor Can you check if there is a similar problem in SkylarkExecutionResult.java?

@dslomov dslomov added this to the 0.5 milestone Apr 4, 2017
@dslomov
Contributor

dslomov commented Apr 4, 2017

Dupe of #2774?

@laszlocsomor
Contributor

@dslomov : I'm not convinced of that yet. Are you sure?

@dslomov
Contributor

dslomov commented Apr 4, 2017

Missed a question mark :)

@laszlocsomor
Contributor

Ah :) I'm trying to repro it now.

@laszlocsomor
Contributor

@meteorcloudy: re #2675 (comment): I don't think so. I believe we may be losing the stdout/stderr somehow. Since you were unable to repro this locally, I thought the difference might be that you didn't redirect the output to a file, whereas on CI I believe it is redirected.

Anyway, I set up an experiment with repo rules: I was building two of them in parallel, a fast and a slow one. It seems repo rules' actions (repository_ctx.execute) cannot be interrupted -- neither with Ctrl+C nor with one of them calling fail().

@dslomov : because of the reasons above, I doubt this bug is a dupe of #2774.

@hlopko
Member

hlopko commented Apr 10, 2017

Is this still expected to be fixed in 0.5? We still have some release blockers and we will not cut the release sooner than in 2 weeks, but we should be getting ready to.

@dslomov
Contributor

dslomov commented Apr 10, 2017

I do not think we have a repro, so we do not know if it is fixed or not. I do not think we need to treat it as a release blocker.

@laszlocsomor
Contributor

I've added stricter error checks and more logging to cc_configure.bzl in hopes of catching this bug the next time it appears. We don't have a repro, sadly.

@hlopko
Member

hlopko commented Apr 10, 2017

Ok thanks, not treating as release blocker. Good luck! :)

@damienmg
Contributor

Let's move it to 0.6

@damienmg damienmg modified the milestones: 0.6, 0.5 Apr 11, 2017
bazel-io pushed a commit that referenced this issue Apr 11, 2017
Add stricter error checks in hopes of catching
occasional CI flakiness where the stdout of a
command seems to get lost.

It's now an error if the command returns a
non-zero exit code (or a zero one if it's expected
to fail) or if its stdout is empty. Previously we
only checked if stderr was empty to consider the
action successful.

See #2675

RELNOTES: none
PiperOrigin-RevId: 152685220
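
The stricter checks described in this commit message can be sketched roughly as follows. This is a Python illustration, not the code from cc_configure.bzl: `ExecResult` is a hypothetical stand-in for the value returned by `repository_ctx.execute` (which exposes `return_code`, `stdout`, and `stderr`), `check_result` is an invented name, and applying the empty-stdout check to expected failures as well is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for the result of repository_ctx.execute.
ExecResult = namedtuple("ExecResult", ["return_code", "stdout", "stderr"])


def check_result(result, expect_failure=False):
    """Return an error message, or None if the result passes all checks."""
    if expect_failure:
        if result.return_code == 0:
            return "command succeeded but was expected to fail"
    elif result.return_code != 0:
        return "command failed with exit code %d: %s" % (
            result.return_code, result.stderr)
    # Assumption: empty stdout is treated as an error in both cases.
    if not result.stdout.strip():
        return "command produced empty stdout"
    return None


print(check_result(ExecResult(0, "", "")))
```

Compared with checking only whether stderr is empty, this would flag the case where a command exits cleanly but its stdout has been lost.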
@snnn
Contributor

snnn commented Jul 17, 2017

I can reproduce it on 688dbf7 in a CI system. Like @meteorcloudy, I could never reproduce this on my local machine.

@meteorcloudy
Member Author

meteorcloudy commented Oct 10, 2017

I finally figured out the root cause; it's a one-line fix. I'll send it soon.

@snnn
Contributor

snnn commented Oct 22, 2017

Hi @meteorcloudy, how is it going? Is it solved?

@meteorcloudy
Member Author

Well, it turned out to be much more complicated than I thought. But yes! I have a fix under review, so it will be fixed soon. https://bazel-review.googlesource.com/#/c/bazel/+/18790/

7 participants