Pipelines that depend on RBE remote cache are failing #1055

Closed
jayconrod opened this issue Oct 22, 2020 · 7 comments

@jayconrod

Apologies if this isn't the right repo: let me know if I should file this somewhere else.

A couple examples:

Errors look like:


Received Goaway
load_shed
    at com.google.devtools.build.lib.remote.GrpcCacheClient.lambda$downloadBlob$10(GrpcCacheClient.java:296)
    at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:192)
    at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.doFallback(AbstractCatchingFuture.java:179)
    at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:124)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1174)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:969)
    at com.google.common.util.concurrent.AbstractFuture.setFuture(AbstractFuture.java:800)
    at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:203)
    at com.google.common.util.concurrent.AbstractCatchingFuture$AsyncCatchingFuture.setResult(AbstractCatchingFuture.java:179)
    at com.google.common.util.concurrent.AbstractCatchingFuture.run(AbstractCatchingFuture.java:133)
    at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
    at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1174)
    at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:969)
    at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:760)
    at com.google.common.util.concurrent.SettableFuture.setException(SettableFuture.java:53)
    at com.google.devtools.build.lib.remote.GrpcCacheClient$1.onError(GrpcCacheClient.java:344)
    at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:449)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at com.google.devtools.build.lib.remote.util.NetworkTime$NetworkTimeCall$1.onClose(NetworkTime.java:93)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:700)
    at io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
    at io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
    at io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
    at io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:399)
    at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:521)
    at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:66)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:641)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$700(ClientCallImpl.java:529)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:703)
    at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:692)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    ... 3 more
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: HTTP/2 error code: NO_ERROR
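
For anyone landing here with the same trace: it ends in a gRPC UNAVAILABLE status, which gRPC documents as most likely a transient condition that can be retried with backoff. Below is a minimal sketch of that pattern, not Bazel's actual retry logic. `UnavailableError`, `retry_with_backoff`, and every parameter name are hypothetical and exist only for illustration; the sketch deliberately avoids depending on a gRPC library.

```python
import random
import time


class UnavailableError(Exception):
    """Hypothetical stand-in for a gRPC UNAVAILABLE status (e.g. a load-shed GOAWAY)."""


def retry_with_backoff(fetch, max_attempts=5, base_delay=0.1, max_delay=5.0,
                       sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts calls have failed.

    Only UnavailableError is retried; any other exception propagates at once.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except UnavailableError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last failure
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, delay))
```

A caller would wrap the cache download in a zero-argument function and pass it as `fetch`. The jitter is there so that many clients hit by the same load-shedding event do not all retry at the same instant. On the Bazel side, `--remote_retries` and `--remote_local_fallback` are the flags to look at, though whether they help depends on how long the server keeps shedding load.
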
@philwo
Member

philwo commented Oct 22, 2020

Thank you @jayconrod. I'm looking into this now.

@ifoox

ifoox commented Oct 22, 2020

Hi Jay, this is a load-shedding issue on the RBE side. Our on-call is currently looking into it. Feel free to ping me internally if you need additional details, but hopefully we will have this resolved shortly. Are you still seeing the issue, or did it just happen once?

@jayconrod
Author

Thanks for looking into this!

I'm still seeing the issue (as of ~20 minutes ago). I first noticed it when tests on a PR failed. They were still failing after a restart. Same deal with the nightly tests on the main branch.

@keith
Member

keith commented Oct 22, 2020

We see this externally a lot as well. I filed a ticket: https://issuetracker.google.com/u/1/issues/171446864

@keith
Member

keith commented Oct 23, 2020

@jayconrod
Author

rules_go is no longer seeing any issues. BuildKite is passing again.

@bergsieker

To close the loop here: this was triggered by a server change that has mostly been rolled back. Unfortunately a handful of clusters won't receive the update until next week.

We've also root-caused and fixed the source of the failure so subsequent server releases won't cause issues.

@philwo philwo closed this as completed Feb 16, 2021
7 participants