[Bug]: kubernetes authentication method failing on AWS EKS #2942

Closed
1 task done
tstraley opened this issue Apr 3, 2024 · 4 comments · Fixed by flipt-io/docs#200

@tstraley

tstraley commented Apr 3, 2024

Bug Description

Config includes:

authentication:
  methods:
    kubernetes:
      enabled: true
  required: true

When attempting to make an API call that uses a service account token to retrieve a Flipt client token, the request hangs and eventually returns an empty reply:

curl http://10.0.10.105:8080/auth/v1/method/kubernetes/serviceaccount --data "{\"service_account_token\":\"$token\"}" -v -H "Content-Type: application/json"
*   Trying 10.0.10.105:8080...
* Connected to 10.0.10.105 (10.0.10.105) port 8080 (#0)
> POST /auth/v1/method/kubernetes/serviceaccount HTTP/1.1
> Host: 10.0.10.105:8080
> User-Agent: curl/7.81.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 1060
> 
* Empty reply from server
* Closing connection 0
curl: (52) Empty reply from server
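
For anyone reproducing this without hand-escaping JSON in the shell, here is a minimal Python sketch of the same request. It assumes only the endpoint path and the `service_account_token` field shown in the curl call above. The helper name `build_token_request` is made up for illustration, and nothing is assumed about the response format.

```python
import json
import urllib.request


def build_token_request(base_url: str, service_account_token: str) -> urllib.request.Request:
    """Build the POST request that submits a service account token to Flipt.

    Hypothetical helper: only the path and the JSON field name come from the
    curl command in this issue.
    """
    # json.dumps takes care of quoting, so no manual escaping is needed.
    body = json.dumps({"service_account_token": service_account_token}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/auth/v1/method/kubernetes/serviceaccount",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Usage (not executed here, since it needs a reachable Flipt instance):
#   req = build_token_request("http://10.0.10.105:8080", token)
#   with urllib.request.urlopen(req, timeout=150) as resp:
#       print(resp.status, resp.read().decode())
```

The generous timeout in the usage comment is a guess based on the roughly 130 seconds the failing call took in the log below.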

And the flipt service records this error:

{"L":"ERROR","T":"2024-04-03T22:42:33Z","M":"finished unary call with code Internal","server":"grpc","grpc.start_time":"2024-04-03T22:40:23Z","system":"grpc","span.kind":"server","grpc.service":"flipt.auth.AuthenticationMethodKubernetesService","grpc.method":"VerifyServiceAccount","peer.address":"127.0.0.1:53260","error":"rpc error: code = Internal desc = verifying service account: failed to verify signature: fetching keys oidc: get keys failed Get \"https://ip-172-16-173-141.ec2.internal:443/openid/v1/jwks\": dial tcp 172.16.173.141:443: connect: connection timed out","grpc.code":"Internal","grpc.time_ms":130569.35}

The code appears to anticipate that the URL returned by the OIDC well-known endpoint (in this case the ip-172-16-173-141.ec2.internal address) may not be correct, and that the supplied discovery URL (in this case the default, kubernetes.default.svc.local) should be used instead:

// does not match the supplied discovery URL.

Despite this handling, the underlying OIDC provider still calls the advertised URL, which Flipt cannot reach. The call happens in updateKeys (https://github.com/coreos/go-oidc/blob/22dfdcabd450013b4d51ac15b6423f529d957e9f/oidc/jwks.go#L230), which is on the 'Verify' code path.
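
The mismatch described above can be checked offline once you have a copy of the discovery document. The sketch below is not Flipt's or go-oidc's code; it only compares hostnames. The helper name `jwks_host_differs`, the `https://` scheme on the discovery URL, and the `issuer` value are assumptions for illustration. The `jwks_uri` is the one from the error log.

```python
import json
from urllib.parse import urlsplit


def jwks_host_differs(discovery_url: str, discovery_document: str) -> bool:
    """Return True when jwks_uri points at a different host than the discovery URL."""
    doc = json.loads(discovery_document)
    discovery_host = urlsplit(discovery_url).hostname
    jwks_host = urlsplit(doc["jwks_uri"]).hostname
    return discovery_host != jwks_host


# Illustrative document only: jwks_uri is copied from the error log,
# the issuer is a placeholder.
sample = json.dumps({
    "issuer": "https://issuer.example",
    "jwks_uri": "https://ip-172-16-173-141.ec2.internal:443/openid/v1/jwks",
})

print(jwks_host_differs("https://kubernetes.default.svc.local", sample))  # prints True
```

A True result means key fetching will go to a host other than the one used for discovery, so that host also has to be reachable from the Flipt pod.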

Version Info

v1.39.2

Search

  • I searched for other open and closed issues before opening this

Steps to Reproduce

  1. Deploy flipt into kubernetes with helm chart as documented in https://docs.flipt.io/self-hosted/kubernetes
  2. Configure kubernetes auth method as documented in https://docs.flipt.io/configuration/authentication#kubernetes
  3. Attempt to use the kubernetes auth method to get a client token as documented in https://docs.flipt.io/authentication/methods#via-the-api

Expected Behavior

Expected to be able to get a client token response as detailed in https://docs.flipt.io/authentication/methods#via-the-api

Additional Context

Running in AWS EKS

tstraley added the bug label Apr 3, 2024
@GeorgeMac
Contributor

GeorgeMac commented Apr 4, 2024

Thanks for raising this @tstraley ! It has been a while since I implemented this, so it's taking me a hot minute to rebuild my context 😂 bear with me on this one.

Just adding a bunch of context off the top of my head:

If I remember correctly, the bit we have to work around (the issuer mismatch) is handled simply by instructing the go-oidc library not to return an error when the issuer described by the discovery endpoint does not match the URL we used to request that document. This is where oidc.InsecureIssuerURLContext comes in:
https://pkg.go.dev/github.com/coreos/go-oidc/v3/oidc#InsecureIssuerURLContext

The go-oidc library will return an error after it gets the discovery well-known endpoint if the host used to fetch that does not match the issuer URL in the response. We use the local k8s DNS address to get the discovery document, but it returns a JWKS URL and issuer that does not match that same local k8s DNS name.

However, the go-oidc library will still fetch keys from the JWKS URL advertised in that document, which is what we're seeing here, I believe.

I think this is a form of this issue: aws/containers-roadmap#2234
And it seems related to how EKS is set up to distribute service accounts for IAM roles via its own OIDC provider:
https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html#irsa-oidc-background

The linked issue suggests a workaround for EKS: start the discovery journey from the custom EKS OIDC address instead (e.g. https://oidc.eks.<region>.amazonaws.com/id/<cluster-id>/.well-known/openid-configuration).
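
To make the two URL forms explicit, here is a small sketch that builds both from a region and a cluster ID, following the pattern above. The function name `eks_oidc_urls` and the example cluster ID are hypothetical. The first value is the issuer base, and the second is the well-known document that discovery fetches from it.

```python
def eks_oidc_urls(region: str, cluster_id: str) -> tuple[str, str]:
    """Return (issuer_url, well_known_url) for an EKS cluster's OIDC provider."""
    issuer = f"https://oidc.eks.{region}.amazonaws.com/id/{cluster_id}"
    return issuer, f"{issuer}/.well-known/openid-configuration"


# "EXAMPLECLUSTERID" is a placeholder, not a real cluster ID.
issuer_url, well_known_url = eks_oidc_urls("us-east-1", "EXAMPLECLUSTERID")
print(issuer_url)
print(well_known_url)
```
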

You can currently change this in your Flipt configuration like so:

authentication:
  methods:
    kubernetes:
      enabled: true
      discovery_url: "https://oidc.eks.<region>.amazonaws.com/id/<cluster-id>"
  required: true

Could you give this a try for us 🙏

If this works we can make a docs update to explain this edge case a bit better 💯

@GeorgeMac
Contributor

This could be related to aws/containers-roadmap#2234

Haha great timing 🙌

@tstraley
Author

tstraley commented Apr 4, 2024

Thanks @GeorgeMac -- this makes a lot of sense.

I tried out your suggestion. The first attempt caused the pod to crash on startup, but I was eventually able to get the relevant error (some red-herring "context closed" errors on a couple of pod restarts were masking this one):

Error: configuring kubernetes authentication: fetching OIDC configuration: Get "https://oidc.eks.us-east-1.amazonaws.com/id/<our cluster id>/.well-known/openid-configuration": tls: failed to verify certificate: x509: certificate signed by unknown authority

This endpoint is one of AWS's public endpoints and uses a cert signed by CN=Amazon RSA 2048 M02. I could probably get this specific CA, put it in a k8s config map, and mount the volume; but to get by for now I added /etc/ssl/certs/ca-certificates.crt as the ca path, since this OS CA bundle is in the flipt container image.

authentication:
  methods:
    kubernetes:
      enabled: true
      discovery_url: "https://oidc.eks.<region>.amazonaws.com/id/<cluster-id>"
      ca_path: "/etc/ssl/certs/ca-certificates.crt"
  required: true

This started up fine, and now appears to be working properly!
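
For anyone who wants to confirm that a given CA bundle can verify the discovery endpoint before changing the config, here is a rough Python sketch. It exercises Python's TLS stack rather than Flipt's, so treat it as a sanity check only. The helper names `make_tls_context` and `fetch_discovery` are made up for illustration.

```python
import ssl
import urllib.request
from typing import Optional


def make_tls_context(ca_path: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context that trusts the given CA bundle.

    With ca_path=None, Python falls back to the system default trust store.
    """
    return ssl.create_default_context(cafile=ca_path)


def fetch_discovery(url: str, ca_path: Optional[str] = None) -> bytes:
    """Fetch a discovery document; raises urllib.error.URLError if TLS verification fails."""
    with urllib.request.urlopen(url, context=make_tls_context(ca_path), timeout=10) as resp:
        return resp.read()


# Usage (not executed here, since it needs network access and a real cluster ID):
#   fetch_discovery(
#       "https://oidc.eks.<region>.amazonaws.com/id/<cluster-id>/.well-known/openid-configuration",
#       ca_path="/etc/ssl/certs/ca-certificates.crt",
#   )
```
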

Please feel free to resolve and update docs as you see fit. Thanks for the help!

@GeorgeMac
Contributor

That's amazing, thanks for raising this and working through it!

I will open a docs issue before closing this, so we make sure to get these details in there for future folks.
