
OpenID integration issue with large tokens (caused by a long list of groups) #3874

Closed
somejfn opened this issue Apr 13, 2021 · 12 comments · Fixed by #3898
Labels
backlog (Triaged Issue added to backlog) · enhancement (This is the preferred way to describe new end-to-end features.)

Comments

@somejfn

somejfn commented Apr 13, 2021

Describe the bug

With a Kiali instance deployed with the operator, with OpenID integration (in my case Keycloak) and groups being returned for RBAC purposes, the login process fails if the returned token is larger than ~2894 characters (encoded). In my case, large tokens are returned because we have many groups, and reaching 8K characters could be common.

Ref: discussion #3870

Versions used
Kiali: 1.32
Istio: 1.9.2
Kubernetes flavour and version: Kubeadm 1.19.2

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a Kiali instance with OpenID auth against a Keycloak endpoint
  2. Make sure to return group membership information in the OpenID client so it is part of the token body
  3. Create a test user in Keycloak
  4. Create a series of groups and assign them to the test user until the groups array is ~1500 characters
  5. Try to log in. Once authenticated, rather than getting the Kiali UI, you are redirected back to the Kiali login page. No error is logged.

Notes:

  • See the ingress annotations below, which were needed to allow the large token through nginx.
  • You can reproduce the problem with RBAC disabled, as long as the large token is returned.

My Kiali CR

apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  namespace: istio-system
  name: kiali
  labels:
    helm.sh/chart: kiali-operator-1.32.0
    app: kiali-operator
    app.kubernetes.io/name: kiali-operator
    app.kubernetes.io/instance: kiali-operator
    version: "v1.32.0"
    app.kubernetes.io/version: "v1.32.0"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/part-of: "kiali-operator"
  annotations:
    ansible.sdk.operatorframework.io/verbosity: "1"
spec:
  api:
    namespaces:
      exclude:
      - "istio-operator"
      - "kube.*"
      - "default"
  auth:
    strategy: "openid"
    openid:
      client_id: "kiali-lab"
      issuer_uri: "https://keycloak-001.kd.io/auth/realms/kubernetes"
      insecure_skip_verify_tls: true
      username_claim: "email"
      disable_rbac: false
      scopes: ["openid", "groups", "email"]
  deployment:
    logger:
      log_level: debug
    accessible_namespaces:
    - '**'
    image_version: "v1.32.0"
    ingress_enabled: true
    override_ingress_yaml:
      metadata:
        annotations:
          kubernetes.io/ingress.class: my-nginx-ingress
          nginx.ingress.kubernetes.io/secure-backends: "false"
          nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
          nginx.ingress.kubernetes.io/proxy-buffer-size: "64k"
          nginx.ingress.kubernetes.io/proxy-buffers-number: "8"
      spec:
        tls:
        - hosts:
          - kiali.my-nginx-ingress.lab01.kd.io
          secretName: kiali-tls
        rules:
        - host: kiali.my-nginx-ingress.lab01.kd.io
          http:
            paths:
            - path: /kiali
              backend:
                serviceName: kiali
                servicePort: 20001

Expected behavior
The JWT token size should not be a problem and should not prevent Kiali UI login.

@somejfn somejfn added the bug (Something isn't working) label Apr 13, 2021
@jmazzitelli
Collaborator

jmazzitelli commented Apr 13, 2021

I'm going to give my suggestion for a solution.

The issue is that the token generated by the OIDC provider is abnormally large. It won't fit in a single HTTP cookie (most browsers limit each cookie to 4k).

We should split the token across multiple cookies when we get an OIDC token larger than 4k. If the token is less than 4k, we do what we do today (create "TOKEN_COOKIE" and use it as-is). If the token is larger than 4k, split it into multiple 4k chunks and store them in multiple cookies "TOKEN_COOKIE1", "TOKEN_COOKIE2", etc. When we receive the auth request, we look to see if we have TOKEN_COOKIE - if we do, use it as-is like we do today. If we don't see that cookie, look for cookies of the form "TOKEN_COOKIE<n>" and sew them back together.
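
A minimal sketch in Go of this chunking scheme (the chunk size, cookie attributes, and helper names are assumptions for illustration, not the shipped Kiali implementation):

package session

import (
	"fmt"
	"net/http"
)

// chunkSize stays safely under the ~4k per-cookie browser limit, leaving
// headroom for the cookie name and attributes.
const chunkSize = 3500

// setSessionCookies stores the token in the single TOKEN_COOKIE when it
// fits; otherwise it splits it across TOKEN_COOKIE1, TOKEN_COOKIE2, ...
func setSessionCookies(w http.ResponseWriter, token string) {
	if len(token) <= chunkSize {
		http.SetCookie(w, &http.Cookie{Name: "TOKEN_COOKIE", Value: token, HttpOnly: true})
		return
	}
	for i := 1; len(token) > 0; i++ {
		n := chunkSize
		if len(token) < n {
			n = len(token)
		}
		http.SetCookie(w, &http.Cookie{
			Name:     fmt.Sprintf("TOKEN_COOKIE%d", i),
			Value:    token[:n],
			HttpOnly: true,
		})
		token = token[n:]
	}
}

// getSessionToken reverses the split: use TOKEN_COOKIE as-is when present,
// otherwise sew the TOKEN_COOKIE<n> chunks back together in order.
func getSessionToken(r *http.Request) string {
	if c, err := r.Cookie("TOKEN_COOKIE"); err == nil {
		return c.Value
	}
	token := ""
	for i := 1; ; i++ {
		c, err := r.Cookie(fmt.Sprintf("TOKEN_COOKIE%d", i))
		if err != nil {
			break // no more chunks
		}
		token += c.Value
	}
	return token
}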

If you still hit the cookie size limit, the alternatives would be (a) change to a different browser with a more forgiving cookie size limit or (b) get the OIDC provider to not generate such an abnormally large token.

Alternative solutions I would reject:

  1. Compress and base64-encode the token to attempt to shrink it. I did some rudimentary testing using gzip and base64, and the size of the cookie would actually increase, not decrease. Even if cookies didn't need to be base64-encoded (i.e. assuming the cookie string can include binary chars as produced by zip compression, which I am not sure is allowed), the size reduction is not significant: in my tests I saw a random string of size 6,000 reduced to 4,500, and that original random string consisted solely of characters [a-zA-Z0-9], so it is probably not an accurate representation of a true OIDC token, which is likely more complex with more kinds of characters. So attempting to compress the cookie string is not a viable option (see the sketch after this list).
  2. Store the long token on the server and return the cookie with a value of a hash or key representing the token in storage. Kiali has no server-side storage other than in-memory. While we could do this, it would mean the user would only be able to talk to that single Kiali pod - if you scale Kiali across multiple pods, the user would need to log in each time they were redirected to another pod (assuming there is no request affinity on the load balancer to ensure requests are routed to the same pod each time). And if the pod is restarted, the user would need to log in again.
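
For reference, a small Go program that repeats the kind of rudimentary test described in point 1 (a random [a-zA-Z0-9] string of length 6,000, gzipped, then base64-encoded for cookie safety); the exact numbers will vary per run:

package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"fmt"
	"math/rand"
)

func main() {
	// Build a 6,000-char random [a-zA-Z0-9] string, as in the test above.
	const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
	raw := make([]byte, 6000)
	for i := range raw {
		raw[i] = letters[rand.Intn(len(letters))]
	}

	// gzip-compress, then base64-encode so the result is cookie-safe.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(raw)
	zw.Close()
	encoded := base64.StdEncoding.EncodeToString(buf.Bytes())

	// Random data barely compresses, and base64 adds ~33% back, so the
	// encoded result ends up close to (or above) the original size.
	fmt.Printf("original: %d, gzipped: %d, gzipped+base64: %d\n",
		len(raw), buf.Len(), len(encoded))
}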

@somejfn
Author

somejfn commented Apr 13, 2021

@jmazzitelli If a cookie size >4k is the problem, how do we explain that I saw Kiali log that "Debug Raw token" line with a 6204-character encoded token, and it was not truncated?

@jmazzitelli
Collaborator

jmazzitelli commented Apr 13, 2021

"Debug Raw token" line with a 6204 encoded token and it was not truncated ?

I haven't looked at the log line you specifically mention (and thus the code that produced it), but I suspect the raw token came across as a query string or perhaps in an HTTP POST field (neither of which has this 4k limit) as part of the initial OIDC login/authentication protocol. As part of the follow-up auth protocol, that token (or something similar to it) gets set into an HTTP cookie generated by Kiali, which the browser client then uses to authenticate future requests.
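
A sketch of the distinction being drawn here, with an assumed handler and field name (not Kiali's actual code): the token arrives from the IdP over a channel that has no 4k limit, and the limit only bites once the token is written into a cookie.

package session

import "net/http"

// openidCallback illustrates handling the authorization response from the IdP.
func openidCallback(w http.ResponseWriter, r *http.Request) {
	// The token arrives as an HTTP POST form field (or query string);
	// neither is subject to the per-cookie size limit, so a 6k token
	// can show up intact in a debug log at this point.
	token := r.FormValue("id_token")

	// Only here, when the token is stored in a cookie for subsequent
	// requests, does the browser's ~4k per-cookie limit apply.
	http.SetCookie(w, &http.Cookie{Name: "TOKEN_COOKIE", Value: token, HttpOnly: true})
	http.Redirect(w, r, "/kiali/", http.StatusFound)
}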

@jshaughn
Collaborator

I generally agree. I don't like the in-memory approach; it would be a bad experience to have UI auth behavior tied to Kiali pod deployment/life-cycle. We want to keep Kiali stateless, so server-side storage is not attractive. From what I can see, recent versions of FF and Chrome will not have an issue with multiple cookies. Safari may have an 8k limit. I think Edge is 10k.

So, I agree with @jmazzitelli: best-effort using multiple cookies when needed. If that fails, then it is out of scope for Kiali and the user will need to change browsers or shorten tokens. I'm not sure whether the chunks should be assembled client-side or server-side; that can be determined later.

@israel-hdez What do you think?

@jshaughn jshaughn added the area/scalability+performance and backlog (Triaged Issue added to backlog) labels Apr 13, 2021
@jshaughn jshaughn added this to Backlog in Sprint 55 (v1.33) via automation Apr 13, 2021
@jshaughn jshaughn added the enhancement (This is the preferred way to describe new end-to-end features.) label and removed the bug (Something isn't working) label Apr 13, 2021
@israel-hdez
Member

I'm OK with the multi-cookie approach.

From this page: http://browsercookielimits.iain.guru/, using several cookies is going to work OK on Chrome and Firefox. Safari and Opera may or may not work.

Also, if Kiali is exposed via ingress-nginx, this LB may have problems. But, from this SO answer, it seems it's a matter of changing configurations in the LB: https://stackoverflow.com/questions/40289225/kubernetes-nginx-ingress-controller-returns-502-but-only-for-ajax-xmlhttprequest.

So, it's probably fine to use several cookies.

@kulbhusanazad

Hello all, I have been working with @israel-hdez for some time to get the OpenID integration working for Azure Kubernetes Service (AKS). In my use case we hit the same point where the access token size is more than 4K.

So I wanted to join this discussion.

There is some guidance for this defined in RFC 2965: https://tools.ietf.org/html/rfc2965

Implementation Limits

Practical user agent implementations have limits on the number and
size of cookies that they can store. In general, user agents' cookie
support should have no fixed limits. They should strive to store as
many frequently-used cookies as possible. Furthermore, general-use
user agents SHOULD provide each of the following minimum capabilities
individually, although not necessarily simultaneously:

  *  at least 300 cookies

  *  at least 4096 bytes per cookie (as measured by the characters
     that comprise the cookie non-terminal in the syntax description
     of the Set-Cookie2 header, and as received in the Set-Cookie2
     header)

  *  at least 20 cookies per unique host or domain name

User agents created for specific purposes or for limited-capacity
devices SHOULD provide at least 20 cookies of 4096 bytes, to ensure
that the user can interact with a session-based origin server.

The information in a Set-Cookie2 response header MUST be retained in
its entirety. If for some reason there is inadequate space to store
the cookie, it MUST be discarded, not truncated.

Applications should use as few and as small cookies as possible, and
they should cope gracefully with the loss of a cookie.

5.3.1 Denial of Service Attacks

User agents MAY choose to set an upper bound on the number of cookies
to be stored from a given host or domain name or on the size of the
cookie information. Otherwise a malicious server could attempt to
flood a user agent with many cookies, or large cookies, on successive
responses, which would force out cookies the user agent had received
from other servers. However, the minima specified above SHOULD still
be supported.

@jshaughn
Collaborator

@israel-hdez @kulbhusanazad , I think Kiali should still limit the number of cookies devoted to the token storage, maybe it should be 1 by default and made configurable. Or some other small default if we want to avoid making it configurable.

@jmazzitelli
Collaborator

@israel-hdez @kulbhusanazad , I think Kiali should still limit the number of cookies devoted to the token storage, maybe it should be 1 by default and made configurable. Or some other small default if we want to avoid making it configurable.

I don't see why we need to do this. Just create what we need. No need for yet another configuration setting. If the client browser can't handle it, so be it. But I don't think we need to artificially limit it - it is not as if it's a security issue that we need to limit. If a token is 12k and we need 3 cookies, so be it. If it's 16k and we need 4, so be it. We should handle it so as not to make Kiali the bottleneck - let the browser be the limiting factor. We should create as many cookies as required.

@kulbhusanazad "There is some for this defined: https://tools.ietf.org/html/rfc2965" - that is all fine and good - but we are limited to what actual browsers implement. They may or may not follow the "should" constraints of the spec. So it isn't the spec we should be concerned about, it should be what actual browsers allow.

@kulbhusanazad

Right! Agreed, we are limited to what the browsers allow.

Also, throwing out a pointer: is it an option to have Kiali leverage what a tool like Redis (https://hub.docker.com/_/redis) offers out of the box, i.e. storing/retrieving key-values on the server side in memory? With this, I guess we then wouldn't have to build and maintain the cookie chunking/consolidation logic in Kiali.

So if Redis can be bundled with Kiali (with minimal change and complexity), it would help a lot for users like us who are hitting this issue. Or provide a way to configure Kiali to use such Redis-like infrastructure where it already exists.

@jshaughn
Collaborator

I don't see why we need to do this.

@jmazzitelli My thinking was only as a protection against a maliciously huge token. But if that is not a concern then it's not a useful limitation.

As for Redis, Kiali is stateless on the server side and we have valued not requiring any other server-side deployment. Having said that, if a community member were interested enough to make a contribution, maybe it's something we could guide. Would like to hear from @israel-hdez on that topic.

@jmazzitelli
Collaborator

FWIW: requiring (a) a second deployment and (b) server-side storage would be a huge paradigm shift for Kiali.

Just requiring a second deployment is huge from an installation perspective (e.g. Do we install it - operator and helm changes? Do we expect it to already exist? What if it doesn't exist? Do we uninstall it when we remove Kiali? More operator and helm changes.)

And having server-side storage would introduce some massive new requirements on Kiali that do not exist today (e.g. How do we create the space we need - does it require persistent volume configuration specific to our needs? How do we prune old data? How do we ensure we do not bloat the storage, so that we avoid out-of-space errors? How do we scale it across multiple Kiali pods?)

There has to be a hugely important use case and a widely required feature for us to introduce something like that. This (supporting abnormally long OIDC tokens) is not one, IMO.

@israel-hdez
Member

@israel-hdez @kulbhusanazad , I think Kiali should still limit the number of cookies devoted to the token storage, maybe it should be 1 by default and made configurable. Or some other small default if we want to avoid making it configurable.

I don't see why we need to do this. Just create what we need.

@jmazzitelli @jshaughn Perhaps put a hard-coded limit of 180?

I think it would be surprising to get a token of about half a MB, to the point where we would need 180 cookies (the Chrome limit). So I wouldn't worry too much about the token length.

Perhaps the main issue would be at the moment of reading the cookies when a request comes in. If we put no limit, it may make it easy to mount a DoS attack by sending Kiali requests with a large number of cookies, forcing memory overflows. So I would limit reads rather than writes.
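
A sketch of that read-side cap (the bound of 180 comes from the Chrome limit mentioned above; the names are assumptions for illustration):

package session

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// maxTokenCookies bounds how many chunk cookies we are willing to read,
// so a request stuffed with bogus cookies cannot force unbounded work.
const maxTokenCookies = 180

func reassembleToken(r *http.Request) (string, error) {
	var sb strings.Builder
	for i := 1; i <= maxTokenCookies; i++ {
		c, err := r.Cookie(fmt.Sprintf("TOKEN_COOKIE%d", i))
		if err != nil {
			break // no more chunks
		}
		sb.WriteString(c.Value)
	}
	if sb.Len() == 0 {
		return "", errors.New("no session token cookies found")
	}
	return sb.String(), nil
}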

As for Redis, Kiali is stateless on the server side and we have valued not requiring any other server-side deployment. Having said that, if a community member were interested enough to make a contribution, maybe it's something we could guide. Would like to hear from @israel-hdez on that topic.

@kulbhusanazad The multi-cookie approach is generic and does not require additional dependencies. This is the reason I prefer it.

Having said that, I think server-side storage would be more stable, but everybody will want their preferred storage supported in Kiali. That would be a headache similar to the one we had for authentication in the past: everybody wanted to log in with their own system, and we covered this to some extent by introducing OpenID support.

So, as @jshaughn said, I think we would look for community contributions, so that the community adds support for Kiali to use whatever storage each user prefers.

FWIW: requiring (a) a second deployment and (b) server-side storage would be a huge paradigm shift for Kiali.

@jmazzitelli We don't know if people will want the storage to live in Kubernetes, use a managed solution, or something else. So I think it is preferable that deploying the storage remains out of scope for Kiali. That doesn't mean server-side storage should be out of consideration - just let people do the job of deploying the storage themselves. Then Kiali should provide config options so it can use that storage.

@lucasponce lucasponce removed this from Backlog in Sprint 55 (v1.33) Apr 16, 2021
@lucasponce lucasponce added this to Backlog in Sprint 56 (v1.34) via automation Apr 16, 2021
israel-hdez added a commit to israel-hdez/swscore that referenced this issue Apr 16, 2021
In the authorization code flow, tokens returned by the IdP can be long enough that the session cookie also becomes long. Although RFCs don't set a maximum cookie size (see https://tools.ietf.org/html/rfc2965), in practice major browsers use the RFC minimum limit as their supported maximum, which is 4K. This means that long tokens can cause the session cookie to exceed this limit and be discarded by browsers.

This overcomes these limitations by splitting session data into chunks no longer than 4K and setting multiple cookies, if needed.

The multi-cookie approach is applied only to the "authorization code" flow of OpenID. The "implicit flow" is left unchanged because IdPs tend to omit group membership data there, which is the data users have reported as generating larger tokens.

Fixes kiali#3874
Sprint 56 (v1.34) automation moved this from Backlog to Done Apr 16, 2021
israel-hdez added a commit that referenced this issue Apr 16, 2021
* Fix login failure with long OpenId tokens

In the authorization code flow, tokens returned by the IdP can be long enough that the session cookie also becomes long. Although RFCs don't set a maximum cookie size (see https://tools.ietf.org/html/rfc2965), in practice major browsers use the RFC minimum limit as their supported maximum, which is 4K. This means that long tokens can cause the session cookie to exceed this limit and be discarded by browsers.

This overcomes these limitations by splitting session data into chunks no longer than 4K and setting multiple cookies, if needed.

The multi-cookie approach is applied only to the "authorization code" flow of OpenID. The "implicit flow" is left unchanged because IdPs tend to omit group membership data there, which is the data users have reported as generating larger tokens.

Fixes #3874
Co-authored-by: John Mazzitelli <mazz@redhat.com>