[skymeld-dogfood] build unexpectedly exits causing following commands to wait for non-existing command to complete #19211
Comments
Thanks so much for trying out skymeld and filing the issue! I noticed that you're using Bazel 6.3.1. In the last 3 months we've fixed some interrupt issues with Skymeld, and with luck this issue is among them. Would you mind trying your builds again at HEAD to see whether it still reproduces?
Thanks for the fast answer! There are some breakages at HEAD with rules_apple so we can't jump on it just yet. I will report back as soon as those get resolved and we're able to test again.
@joeleba This is reproducing on 7.2.0rc1.
I also sent
@bazel-io fork 7.2.0
I marked this as a release blocker for now. Brentley's stack trace makes this look like a regression caused by 52adf0b.
That exact stacktrace has appeared in multiple users |
CC @joeleba |
Seems like I hijacked this issue, sorry @BalestraPatrick. Is your original issue resolved? If so, we could close this once my issue is resolved too.
We haven't seen this issue after bumping to Bazel 7.0 and having skymeld enabled by default, so it can be closed after that from my side. |
@brentleyjones Can you please file a new issue for the regression? |
Description of the bug:
We added `common --experimental_merged_skyframe_analysis_execution` to our `bazelrc` a few days ago. Shortly after that, developers started reporting builds that never completed locally. Upon inspection, builds were stuck in a state such as `Another command (pid=76180) is running. Waiting for it to complete on the server (server_pid=27106)...`.
We've never seen this behavior before and, given the multiple reports by developers, we reverted our change. Since then, we haven't received any reports of builds getting stuck. From our BES, we can see that some of these developers had builds that were unexpectedly killed or stopped just before the following builds failed to start (the BES is truncated, so the builds show up as "Disconnected" in our BES service).
In the above example, there was no process with `pid=76180`, but there was a process for `server_pid=27106`. Running `jstack` against the server process reveals that it's stuck in some waiting state (not sure if that's expected, but hopefully it's helpful). The workaround for developers was to run `kill -9 27106`.
Let me know if I can somehow provide more logs or details.
cc: @joeleba
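For anyone trying to confirm they are hitting the same state, here is a minimal sketch of the check described above: parse the waiting message, then test whether the client pid and the server pid still exist. The helper names (`parse_wait_message`, `pid_alive`, `diagnose`) are hypothetical and not part of Bazel, the regex only assumes the message wording quoted in this report, and the liveness check assumes a POSIX system such as macOS or Linux.

```python
import os
import re

# Matches the client message quoted in this report. The exact wording is an
# assumption and may differ between Bazel versions.
WAIT_RE = re.compile(
    r"Another command \(pid=(\d+)\) is running\..*\(server_pid=(\d+)\)"
)


def parse_wait_message(line):
    """Return (client_pid, server_pid) from the waiting message, or None."""
    match = WAIT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def pid_alive(pid):
    """Best-effort POSIX check: signal 0 only tests whether the pid exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def diagnose(line):
    """Summarize whether the message points at a command that no longer exists."""
    parsed = parse_wait_message(line)
    if parsed is None:
        return "no waiting message found"
    client_pid, server_pid = parsed
    if pid_alive(client_pid):
        return f"command {client_pid} is still running; waiting is expected"
    if pid_alive(server_pid):
        return (
            f"command {client_pid} does not exist but server {server_pid} "
            "is alive; this matches the stuck state in this report"
        )
    return "neither process exists"
```

If `diagnose` reports the stuck state, capturing a `jstack` dump of the server pid before killing it would preserve the evidence for the maintainers.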
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
Unfortunately I don't have a reproducible example. We didn't see it on CI during the short timeframe when this flag was enabled, and we have since turned it off. Our IDE integration has multiple output bases, so it's possible that something specific to that integration is making it hit this error.
Which operating system are you running Bazel on?
macOS 13.5
What is the output of `bazel info release`?
6.3.1
If `bazel info release` returns `development version` or `(@non-git)`, tell us how you built Bazel.
No response
What's the output of `git remote get-url origin; git rev-parse master; git rev-parse HEAD`?
No response
Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
stacktrace2.txt