CAPI Failure When UAA Isn't Available on Internal Address Is Late and Obscure #43

Closed
anEXPer opened this issue Jan 24, 2017 · 4 comments
@anEXPer
Member

anEXPer commented Jan 24, 2017

If UAA isn't set to have uaa.service.cf.internal in uaa.zones.internal.hostnames, or is otherwise unavailable, CC emits 502s with no error message on all endpoints that attempt token validation.

It should probably fail to start, instead. Failing that, it should emit a clear error message.

Steps to Reproduce

Override uaa.zones.internal.hostnames to nil in your stub, deploy, then try to use CC. I haven't tried this myself; alternatively, deploy with the minimal-aws example manifest as it was prior to the fix here.

Expected result

The CC should probably fail to come up if it can't talk to UAA after a deploy. This would allow the canary to prevent all of the CCs from becoming non-functional in a misconfigured upgrade scenario.

If not, the message should at least say "yo I can't get the key from UAA because 404s on this wack internal address" or something.

Current result

From the user's perspective:

$ cf orgs
Getting orgs as admin...

FAILED
Server error, status code: 502, error code: 0, message:

From the logs:

{"timestamp":1485217641.3378894,"message":"Fetching uaa verification keys failed","log_level":"error","source":"cc.uaa_verification_keys","data":{},"thread_id":47193615883740,"fiber_id":47193621784980,"process_id":28295,"file":"/var/vcap/data/packages/cloud_controller_ng/aa586e54ee45aa382a0d5ebbab32e1c8aa048953.1-9face0a59275e9a96297203adbc563da1e7b8afd/cloud_controller_ng/lib/cloud_controller/uaa/uaa_verification_keys.rb","lineno":22,"method":"update_keys"}
{"timestamp":1485217641.3382447,"message":"Failed communicating with UAA: The UAA was unavailable","log_level":"error","source":"cc.security_context_setter","data":{},"thread_id":47193615883740,"fiber_id":47193621784980,"process_id":28295,"file":"/var/vcap/packages/cloud_controller_ng/cloud_controller_ng/middleware/security_context_setter.rb","lineno":22,"method":"rescue in call"}

All of this happens after the deploy has succeeded, so only post-deployment testing catches it.
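Until the API response itself carries a message, the CC logs are the only place the cause shows up. A minimal sketch of pulling the UAA-related errors out of those JSON log lines follows. `uaa_errors` is a hypothetical helper, not part of CC, and the sample entries are trimmed copies of the two lines quoted above.

```python
import json


def uaa_errors(lines):
    """Return (source, message) pairs for error-level entries that mention UAA."""
    found = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON log line.
            continue
        if not isinstance(entry, dict) or entry.get("log_level") != "error":
            continue
        text = f'{entry.get("source", "")} {entry.get("message", "")}'
        if "uaa" in text.lower():
            found.append((entry.get("source"), entry.get("message")))
    return found


# Trimmed copies of the two log lines from this issue (only the fields used here).
SAMPLE = [
    '{"message":"Fetching uaa verification keys failed",'
    '"log_level":"error","source":"cc.uaa_verification_keys"}',
    '{"message":"Failed communicating with UAA: The UAA was unavailable",'
    '"log_level":"error","source":"cc.security_context_setter"}',
]
```

Calling `uaa_errors(SAMPLE)` yields one pair per line, which is enough to tie the empty 502 back to the UAA key fetch.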

@cf-gitbot

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/138211943

The labels on this github issue will be updated when the story is started.

@anEXPer changed the title from "CAPI Failure when UAA Isn't Available on Internal Address Is Late and Obscure" to "CAPI Failure When UAA Isn't Available on Internal Address is Late and Obscure" Jan 24, 2017
@anEXPer changed the title from "CAPI Failure When UAA Isn't Available on Internal Address is Late and Obscure" to "CAPI Failure When UAA Isn't Available on Internal Address Is Late and Obscure" Jan 24, 2017
@aashah
Contributor

aashah commented Feb 18, 2017

Hey @anEXPer,

Apologies for the delay in responding. I'm curious whether the CC property cc.uaa.internal_url was updated as well. Admittedly, I don't know how someone would know to do such a thing. I'm hoping that updating it would resolve this issue, and I'm open to suggestions on how to make it easier to discover that CC uses that property to reach UAA.
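One way to surface the coupling between the two properties would be a consistency check run against the manifest before deploying. The following is only a sketch: `internal_host` and `uaa_host_is_served` are hypothetical helpers, and it assumes the values of cc.uaa.internal_url and uaa.zones.internal.hostnames have already been read out of the manifest.

```python
from urllib.parse import urlsplit


def internal_host(value):
    """Extract the hostname from a bare host, host:port, or full URL."""
    if "//" not in value:
        # urlsplit only recognises a network location after "//".
        value = "//" + value
    return urlsplit(value).hostname


def uaa_host_is_served(cc_internal_url, uaa_internal_hostnames):
    """True if the host CC will call is one UAA is configured to answer for."""
    host = internal_host(cc_internal_url)
    # A nil/None hostname list is treated as "UAA serves no internal hostnames".
    served = {name.lower() for name in (uaa_internal_hostnames or [])}
    return host is not None and host in served
```

With the hostname list overridden to nil, `uaa_host_is_served("uaa.service.cf.internal", None)` returns `False`, which is the misconfiguration described in this issue.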

@anEXPer
Member Author

anEXPer commented Feb 18, 2017

The default on that property, "uaa.service.cf.internal", takes care of it for us.

Regardless, the particulars of how CC and UAA get configured to talk to one another are secondary to the main issue.

We'd like to see jobs that can't possibly work fail in post-start, so that the deploy won't proceed. If talking to UAA is impossible, CC should fail, to prevent that config change from rolling to other instances and taking down the API for the whole deployment.
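A post-start check along those lines might look like the following minimal sketch. It assumes UAA serves its verification keys at `/token_keys` under the internal URL, and the names `fetch_status`, `check_uaa`, and `main` are hypothetical, not part of CC or any BOSH release.

```python
import sys
import urllib.error
import urllib.request


def fetch_status(url, timeout=5.0):
    """GET url and return the HTTP status code; network failures raise OSError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # A 4xx/5xx answer still tells us which status UAA returned.
        return err.code


def check_uaa(base_url, fetch=fetch_status):
    """Return (ok, message) describing whether the key endpoint is usable."""
    url = base_url.rstrip("/") + "/token_keys"  # assumed key endpoint
    try:
        status = fetch(url)
    except OSError as err:
        return False, f"cannot reach UAA at {url}: {err}"
    if status != 200:
        return False, f"UAA at {url} answered HTTP {status}; cannot fetch verification keys"
    return True, f"UAA at {url} is reachable"


def main(argv):
    """Print the result and return a process exit code (non-zero on failure)."""
    ok, message = check_uaa(argv[1])
    print(message, file=sys.stderr)
    return 0 if ok else 1
```

Wired into a job's post-start step as `sys.exit(main(sys.argv))`, a non-zero exit would stop the deploy at the canary, and the message names the address and status instead of leaving an empty 502.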

@jenspinney
Contributor

@anEXPer We made some commits to return a clearer error message when using the CC API (see the commits attached to the associated story).

We'll close this issue based on the error message fix, and we'll ask @zrob to consider whether we want to or can make the bosh deploy fail if UAA is unavailable. If we end up tackling that, we'll make a separate tracker story or bug for it.

Thanks,
CAPI Community Pair, @aashah and @jenspinney
