How to debug rpc failures? #6

Open
GiantPluto opened this issue Oct 4, 2023 · 8 comments

@GiantPluto

I am trying to use reclient with Chromium, where it's already integrated. I modified the server and instance details to point to a local BuildGrid server that I have set up with no auth. There is no GCP involved here. On running the build, I see the failure below. How can I debug RPC errors, and what could be the reason?

rpc error: code = Unavailable desc = rpc error: code = Unavailable desc = retry budget exhausted (6 attempts): all SubConns are in TransientFailure, authentication type (identity) used="no authentication"

@gkousik
Collaborator

gkousik commented Oct 5, 2023

If you need no auth between the RBE server and reproxy, you need to set RBE_service_no_auth=true when starting reproxy. Do you see this error message with that option set?

@GiantPluto
Author

service_no_auth=true is already set in the config itself, and I still get the error with that config.

@gkousik
Collaborator

gkousik commented Oct 31, 2023

Could you give more details about your server config too? Which RBE server implementation are you using? Can you share your reproxy.INFO / reproxy.WARNING / reproxy.ERROR and reproxy_outerr.log files?

@GiantPluto
Author

GiantPluto commented Nov 6, 2023

@gkousik Thanks for the response. Please find attached a zip containing all the logs you asked for, along with the reproxy.cfg.
REProxy_logs.zip

By the way, I am using BuildGrid.

@gkousik
Collaborator

gkousik commented Nov 15, 2023

Thanks! From the reclient config, I'm not seeing a lot that's amiss. The service_no_auth flag is getting passed in as it should be but the error message doesn't really tell me what's wrong.

I've never worked with BuildGrid before, but based on https://buildgrid.build/user/configuration.html#authorization-section, can you confirm that authorization is set to none in your server config?

If you can confirm that the server config is also correct, can you also disable TLS with RBE_service_no_security=true and see if that works? (https://github.com/bazelbuild/remote-apis-sdks/blob/042d9851eb2847d088094bba28b4ea065be83c27/go/pkg/flags/flags.go#L45)

@GiantPluto
Author

GiantPluto commented Nov 16, 2023

@gkousik In BuildGrid, authorization is not configured, so it defaults to none. I have had Goma working with these servers before, and I only started facing this issue during the migration to reclient. I will try setting RBE_service_no_security=true and will update you. Could you let me know which RBE backend servers you have worked with?

@gkousik
Collaborator

gkousik commented Jan 8, 2024

Hi @GiantPluto, we test with a Google-internal RBE implementation, which implements the RBE API specification in https://github.com/bazelbuild/remote-apis.

Did RBE_service_no_security work for your use case?

@ap4ss3rby

ap4ss3rby commented Mar 11, 2024

I would like to chime in, as this issue also affected me while trying to build Android using RBE. Based on the settings that the Chromium kajiya reference server uses (referenced in the remote-apis readme.md), I ended up with this config in a script that I source before sourcing build/envsetup. The compiler-specific environment variables are irrelevant to the auth issue but are included for reproducibility. Afterwards, I was able to confirm that the CPU usage was not from native build tools but from Bazel Buildfarm itself. The issue appears to be due to autoauth, which is not documented at all in the readme.md.

export USE_RBE=1
export RBE_R8_EXEC_STRATEGY=remote_local_fallback
export RBE_CXX_EXEC_STRATEGY=remote_local_fallback
export RBE_D8_EXEC_STRATEGY=remote_local_fallback
export RBE_JAVAC_EXEC_STRATEGY=remote_local_fallback
export RBE_JAVAC=1
export RBE_R8=1
export RBE_D8=1
export RBE_service="localhost:8980"
export RBE_service_no_security=true
export RBE_service_no_auth=true
export RBE_automatic_auth=false
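
Before starting a build, it may help to confirm that the flags above are actually exported into the environment that the build tools inherit. The following is a minimal sketch; `show_rbe_env` and `require_true` are hypothetical helper names, not part of reclient, and the two exports are repeated only to keep the snippet self-contained.

```shell
# Hypothetical helper: list every RBE_* variable visible to child processes.
show_rbe_env() {
  env | grep '^RBE_' | sort || true
}

# Hypothetical helper: return non-zero unless the named variable is "true".
require_true() {
  name="$1"
  value="$(printenv "$name" || true)"
  if [ "$value" != "true" ]; then
    echo "warning: $name is '${value:-unset}', expected 'true'" >&2
    return 1
  fi
}

# Repeated here so the snippet runs on its own.
export RBE_service_no_auth=true
export RBE_service_no_security=true

show_rbe_env
if require_true RBE_service_no_auth && require_true RBE_service_no_security; then
  echo "no-auth and no-security flags are exported"
fi
```

If a flag shows up as unset here, the script that sets it was probably executed rather than sourced, so its exports never reached the current shell.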
