
Adding oauth2-proxy as optional alternative to oidc-authservice #2409

Merged
merged 13 commits on Aug 29, 2023

Conversation

@axel7083 (Contributor) commented Mar 16, 2023

Description

OAuth2-proxy allows multiple providers[^1] for authorizing incoming requests. It can be used to add additional providers while keeping the default Dex configuration. For example, it can validate or reject JWT tokens in the Authorization header, extract any claims from them, and map them to upstream headers.

It is possible to mimic the exact behavior of oidc-authservice, since oauth2-proxy can return a static response[^2] with filtered headers[^3] and a status code that depends on the outcome of the authorization.
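For illustration, a rough sketch of that idea using the alpha-config upstream options (field names from the oauth2-proxy docs; the id and path values are placeholders):

```yaml
# Answer every checked request with a static 200 instead of proxying,
# so the external authorizer behaves like oidc-authservice.
upstreamConfig:
  upstreams:
  - id: static_200   # placeholder id
    path: /
    static: true     # respond directly instead of proxying upstream
    staticCode: 200  # status code returned for allowed requests
```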

Use case

Easier "personal access token"

Simplifying access to kubeflow/pipelines from outside the cluster when the full Kubeflow is deployed. Currently, the official way in the documentation is to use a username/password to create a session[^4], which can then be saved, but the cookies are usually not long-lived, so they cannot be used reliably in CI/CD pipelines.

Having oauth2-proxy would offer alternatives and tools for developers to interact with Kubeflow resources in a simpler way.

The Istio AuthorizationPolicy has built-in support for claims

Currently, for each Profile and each user that should access a notebook instance (like a Jupyter notebook), we need to adapt the Istio AuthorizationPolicy.

This does not scale to many users. In the context of supporting groups: oidc-authservice provides the kubeflow-groups header, a string containing the user's groups separated by commas. Istio is not capable of parsing that.

The current best way to allow groups in the format provided by oidc-authservice[^5] is to do something hacky, like allowing all the possible values for a group.

Let's take an example: request.headers[kubeflow-groups] = group1,admin,group2. Since we cannot know the order, we have to allow the following values: admin (if only one group is provided), *,admin (if the group is at the end), *,admin,* (if the group is in the middle of others), and admin,* (at the beginning).
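To make the hack concrete, this is roughly the condition it implies (using Istio's request.headers attribute):

```yaml
# Enumerate every position the "admin" group can occupy inside the
# comma-separated kubeflow-groups header.
when:
- key: request.headers[kubeflow-groups]
  values: ["admin", "admin,*", "*,admin,*", "*,admin"]
```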

However, as shown in their documentation[^6], Istio supports verifying JWT authorization claims. So, having oauth2-proxy as the EnvoyFilter's external authorizer would provide the kubeflow-groups header and a valid JWT token that Istio can parse to extract claims.

The hacky stuff could be simply replaced with the following:

when:
    - key: request.auth.claims[groups]
      values: ["admin"]

Possible issues

Currently the logout URL is hardcoded in the central dashboard and needs to be changed to /oauth2/logout. The best solution would be to put the endpoint in a ConfigMap so it can be easily changed. A simple fix for now is to create a VirtualService that takes the /logout URL and rewrites it to /oauth2/logout.
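A minimal sketch of such a VirtualService, assuming the standard kubeflow/kubeflow-gateway and the oauth2-proxy service name used later in this thread (both are assumptions):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: logout-rewrite   # illustrative name
  namespace: istio-system
spec:
  hosts:
  - '*'
  gateways:
  - kubeflow/kubeflow-gateway   # assumed gateway
  http:
  - match:
    - uri:
        exact: /logout
    rewrite:
      uri: /oauth2/logout       # the logout endpoint described above
    route:
    - destination:
        host: oauth2-proxy.oauth2-proxy.svc.cluster.local  # assumed service
        port:
          number: 8080
```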

The default oauth2-proxy configuration sends the JWT token to the client. In this context performance could suffer, and it adds a bit of redundancy: oauth2-proxy first parses the JWT token, then adds the required headers (kubeflow-userid and kubeflow-groups), then forwards the request with the JWT token, and Istio parses it again. A simple fix could be to change the default oauth2-proxy configuration and add a Redis server so the token is cached server-side.
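A sketch of that fix in the oauth2-proxy config file; session_store_type and redis_connection_url are documented options, while the Redis address is an assumption:

```toml
# Store sessions server-side in Redis instead of pushing the whole token
# back to the client in a cookie on every request.
session_store_type = "redis"
redis_connection_url = "redis://redis.oauth2-proxy.svc.cluster.local:6379"  # assumed address
```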

Footnotes

[^1]: In the alpha-config of oauth2-proxy you can define multiple providers.

[^2]: See the upstreams configuration in their documentation.

[^3]: The injectResponseHeaders option allows extracting claims from the token and putting them in the response, which istio can then forward to the Kubeflow services. See the alpha-config for details, and the sketch after these footnotes.

[^4]: Connect the Pipelines SDK to Kubeflow Pipelines

[^5]: oidc-authservice

[^6]: Istio authz-jwt
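As referenced in footnote 3, here is a sketch of what the claim-to-header mapping could look like in the alpha-config. The header names are the ones Kubeflow expects; the claim names are assumptions:

```yaml
injectResponseHeaders:
- name: kubeflow-userid
  values:
  - claim: email    # assumed claim carrying the user id
- name: kubeflow-groups
  values:
  - claim: groups   # assumed claim carrying the group list
```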

@jbottum commented May 22, 2023

/priority p1
/area manifests
/kind feature

@google-oss-prow

@jbottum: The label(s) area/manifests cannot be applied, because the repository doesn't have them.

In response to this:

/priority p1
/area manifests
/kind feature

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@kimwnasptd (Member)

Hey @axel7083, nice work exposing all this context!

I'd split this into the following distinct efforts:

  1. Replacing oidc-authservice with oauth2-proxy
  2. Support for programmatic clients from outside the cluster
  3. Exposing group information (headers, JWT token claims etc)

Let's focus this effort on the first one for now: how to mimic AuthService's OIDC Client functionality with oauth2-proxy. We can continue the discussion about Programmatic Clients and Group Info in distinct issues.

So let's remove authservice and just keep oauth2-proxy in this PR. Indeed we'll also need to handle the logout URL in the CentralDashboard, after this change.

@axel7083 (Contributor, Author) commented Jul 25, 2023

@kimwnasptd I replaced the dependency on oidc-authservice with oauth2-proxy.

Some information

I tested it on a minikube cluster, using only a port-forward during the test. To make it work behind a real domain, the following elements should be modified:

https://github.com/axel7083/manifests/blob/9647b628d9ec8f3c2fa92adff7eba5242d61ac65/common/auth-proxy/overlays/oauth2-proxy/deployment.yaml#L38

https://github.com/axel7083/manifests/blob/9647b628d9ec8f3c2fa92adff7eba5242d61ac65/common/dex/base/config-map.yaml#L30

@kimwnasptd (Member) left a comment

Thanks for your work @axel7083!

Some high level comments:

  • Let's rename the folder to common/oauth2-proxy to be super clear to users about the component
  • Let's not remove the oidc-authservice yet, but leave them both for this release with authservice as the default

common/auth-proxy/overlays/oauth2-proxy/deployment.yaml (outdated)

README.md (outdated)

```sh
kustomize build common/oidc-authservice/base | kubectl apply -f -
kustomize build common/auth-proxy/overlays/oauth-proxy | kubectl apply -f -
```

Let's leave the oidc-authservice there by default for this release and we can make the migration in the next release to only oauth2-proxy.

But we could do something like this there:

```sh
kustomize build common/oidc-authservice/base | kubectl apply -f -
# kustomize build common/auth-proxy/overlays/oauth-proxy | kubectl apply -f -
```

common/dex/base/config-map.yaml (outdated)
@@ -39,8 +39,8 @@ resources:
- ../common/istio-1-16/istio-crds/base
- ../common/istio-1-16/istio-namespace/base
- ../common/istio-1-16/istio-install/base
# OIDC Authservice
- ../common/oidc-authservice/base
# oauth

As mentioned in a comment above, let's keep authservice as the default just for this release.

@kimwnasptd (Member)

Oh and the most important one, don't forget to have an OWNERS file with yourself as an approver for the oauth2-proxy. Especially since you've spearheaded this effort :)

@juliusvonkohout (Member) commented Aug 11, 2023

@axel7083 the default way is currently not username and password but the oidc-authservice ServiceAccount token method. Since Kubeflow 1.7 this is officially supported. You just have to send a header with the ServiceAccount/bearer token and you are authenticated. I can provide a one-liner to do this, if needed.

@axel7083 (Contributor, Author)

@axel7083 the default way is currently not username and password but the oidc-authservice ServiceAccount token method. Since Kubeflow 1.7 this is officially supported. You just have to send a header with the ServiceAccount/bearer token and you are authenticated. I can provide a one-liner to do this, if needed.

Thanks, but I already know how to use ServiceAccount tokens. The need for oauth2-proxy goes beyond that:

  • Being able to propagate the JWT token instead of a plain-text email in the header
  • Allowing multiple providers to be configured.

Internally we created a personal access token manager. Users have multiple groups, and each group is linked to a specific role (like a namespace). Users can generate JWT tokens themselves in the central dashboard that can be used for Kubeflow Pipelines with chosen scopes (group delegation). Those tokens add more controls, like a lifetime and user tracking (we are able to know who makes the requests). This is possible thanks to oauth2-proxy.

@annajung (Member)

Hi @axel7083, can you follow the steps to join the Kubeflow org? You can't be added to the OWNERS file without being an org member.

If you need a sponsor, happy to do it

@annajung (Member)

/verify-owners

@kimwnasptd (Member)

@axel7083 thanks for the great work! I'll take a final look in the next few days. @axel7083 would you be available to join the next Manifests WG call on the 7th of September? You can subscribe to the community calendar to view these events: https://www.kubeflow.org/docs/about/community/#kubeflow-community-call

cc @annajung @juliusvonkohout to also take a peek

@robbellMF

Hey @axel7083 - have you seen this issue? Looks like oauth2-proxy might not actually support multiple providers yet, even though the config lets you define multiple providers.

I don't think this needs to be a blocker for this PR - oauth2-proxy still provides a path for better programmatic access from outside the cluster and exposing group info so this change has huge value.

@axel7083 (Contributor, Author)

Hey @axel7083 - have you seen this issue? Looks like oauth2-proxy might not actually support multiple providers yet, even though the config lets you define multiple providers.

I don't think this needs to be a blocker for this PR - oauth2-proxy still provides a path for better programmatic access from outside the cluster and exposing group info so this change has huge value.

Hey! Yes, I already knew; I noticed it later while conducting some testing and forgot to mention it here, so thank you for the reminder!

The PR for the multi-tenancy feature is open and being discussed (a bit slowly, since the maintainer of oauth2-proxy manages it in their free time): oauth2-proxy/oauth2-proxy#1923. So it is probably something we will see in the future (I really hope so).

As you said, and I still agree, the benefits of having oauth2-proxy remain interesting.

@axel7083 thanks for the great work! I'll take a final look in the next few days. @axel7083 would you be available to join the next Manifests WG call on the 7th of September? You can subscribe to the community calendar to view these events: https://www.kubeflow.org/docs/about/community/#kubeflow-community-call

cc @annajung @juliusvonkohout to also take a peek

I am not sure if I will be available on the 7th of September; my calendar is not well defined and mostly not under my control for the next few weeks. I will add it to my calendar and try to join. That could be a very interesting call.

@kromanow94 (Contributor)

Hey @axel7083, I see we were working on a very similar contribution in parallel!

I also wanted to add support for oauth2-proxy, but with a slightly different configuration. I'd like to describe the differences, provide some arguments in favor, and open a discussion, if you don't mind. I think there might be some value in combining our work here.

oauth2-proxy config

I have two points:

  • I think it's worth keeping oauth2-proxy in its own namespace, because oauth2-proxy is integrated with istio but is not a part of istio. That said, I understand it was done to keep oauth2-proxy and istio close and to ease the kustomize setup, so it's rather an insignificant issue
  • skip_auth_regex is deprecated and should be replaced with skip_auth_route (see the sketch after the quote below)

(DEPRECATED for --skip-auth-route) bypass authentication for requests paths that match (may be given multiple times)

https://oauth2-proxy.github.io/oauth2-proxy/docs/configuration/overview/
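A sketch of the migration in the config file (the path regex is illustrative):

```toml
# Before (deprecated):
# skip_auth_regex = [ "^/dex/" ]
# After:
skip_auth_routes = [ "^/dex/" ]  # entries may also be scoped as "METHOD=path_regex"
```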

Istio mesh config vs EnvoyFilter

It is described in the Istio issue Better External Authorization support and in the Istio docs External Authorization that there is a new, improved, and recommended option to configure auth with mesh config using envoyExtAuthzHttp. The documentation is based on this very same use case: integrating oauth2-proxy with Istio. That said, I believe there is value in making a few changes in this PR to use the recommended configuration.

I have experience with this exact setup because I implemented it in multiple Kubeflow environments that I administer and develop. I'll go through the differences:

envoyExtAuthzHttp extension instead of EnvoyFilter

To configure the external authorizer with envoyExtAuthzHttp, the following configuration should be placed in the istio ConfigMap:

  mesh: |-
    extensionProviders:
    - name: oauth2-proxy
      envoyExtAuthzHttp:
        service: oauth2-proxy.oauth2-proxy.svc.cluster.local
        port: 8080
        headersToDownstreamOnDeny:
        - content-type
        - set-cookie
        headersToUpstreamOnAllow:
        - authorization
        - path
        - x-auth-request-email
        - x-auth-request-groups
        - x-auth-request-user
        - x-auth-request-user-groups
        includeRequestHeadersInCheck:
        - authorization
        - cookie

This configuration is very similar to the EnvoyFilter one, but it doesn't add kubeflow-userid. This is because those headers can be added by istio with RequestAuthentication, which parses them directly from the JWT.

From the end-goal perspective, which is providing the auth headers, it doesn't make a difference whether it's oauth2-proxy or istio that adds the header, but I think it makes more sense to use Istio: this way we can keep the oauth2-proxy configuration simple and not too Kubeflow-specific. From the istio perspective, the kubeflow-userid header is specific to the application, which is Kubeflow.

To follow the current convention in the repo of configuring istio with istioctl and an IstioOperator file, the following section can be added to the profile-overlay.yaml file:

  meshConfig:
    extensionProviders:
    - name: oauth2-proxy
      envoyExtAuthzHttp:
        service: oauth2-proxy.oauth2-proxy.svc.cluster.local
        port: 8080
        headersToDownstreamOnDeny:
        - content-type
        - set-cookie
        headersToUpstreamOnAllow:
        - authorization
        - path
        - x-auth-request-email
        - x-auth-request-groups
        - x-auth-request-user
        - x-auth-request-user-groups
        includeRequestHeadersInCheck:
        - authorization
        - cookie

I was thinking about the easiest way of providing this configuration to istio. We could create a dedicated istio directory with a separate profile-overlay.yaml file, but, knowing that in the end this configuration will just be added to the istio ConfigMap in the istio root namespace, a simpler option would be to create a common/istio-1-17/istio-install/overlays/oauth2-proxy overlay and add this specific patch (this also takes into account the istio-configmap-disable-tracing.yaml patch):

apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: istio-system
data:
  # Configuration file for the mesh networks to be used by the Split Horizon EDS.
  mesh: |-
    accessLogFile: /dev/stdout
    defaultConfig:
      discoveryAddress: istiod.istio-system.svc:15012
      proxyMetadata: {}
      tracing: {}
    enablePrometheusMerge: true
    rootNamespace: istio-system
    tcpKeepalive:
      interval: 5s
      probes: 3
      time: 10s
    trustDomain: cluster.local
    extensionProviders:
    - name: oauth2-proxy
      envoyExtAuthzHttp:
        service: oauth2-proxy.oauth2-proxy.svc.cluster.local
        port: 8080
        headersToDownstreamOnDeny:
        - content-type
        - set-cookie
        headersToUpstreamOnAllow: # x-auth-* headers are optional if set with RequestAuthentication.spec.jwtRules.outputClaimToHeaders
        - authorization
        - path
        - x-auth-request-email   
        - x-auth-request-groups
        - x-auth-request-user
        - x-auth-request-user-groups
        includeRequestHeadersInCheck:
        - authorization
        - cookie

RequestAuthentication to trust and use jwt

This is how we tell Istio that if a JWT is present in the request and is issued by dex, it should be trusted and usable with AuthorizationPolicy based on information from the JWT. Additionally, we can use outputClaimToHeaders to instruct istio to take a claim from the JWT and put it in a header. This is where we define the kubeflow-userid and kubeflow-groups headers.

apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: dex-jwt
  namespace: istio-system
spec:
  jwtRules:
    - forwardOriginalToken: true
      issuer: http://dex.auth.svc.cluster.local:5556/dex
      outputClaimToHeaders:
        - header: kubeflow-userid
          claim: email
        - header: kubeflow-groups
          claim: groups

Istio JWT Refresh

From my experience, it's also worth applying this patch to the istiod Deployment.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: istiod
  namespace: istio-system
spec:
  template:
    spec:
      containers:
      - name: discovery
        env:
        - name: PILOT_JWT_PUB_KEY_REFRESH_INTERVAL
          value: "1m"

This is because when we instruct Istio to trust the dex issuer with RequestAuthentication, istio calls dex's .well-known/openid-configuration; if the dex service is not available (mostly on cluster power-ups, or when a new env is created and istio comes up before dex), it will use a placeholder JWKS and note it in the logs:

2023-08-26T18:15:43.935638Z     info    model   The JWKS key is not yet fetched for issuer http://dex.auth.svc.cluster.local:5556/dex (), using a fake JWKS for now

The JWKS is refreshed based on the PILOT_JWT_PUB_KEY_REFRESH_INTERVAL config, and the default interval is 20 minutes. Setting it to 1 minute makes auth usable no more than a minute after dex is running.

If the above situation happens, a user would see the following error when trying to access the page after authenticating with dex:

Jwks doesn't have key to match kid or alg from Jwt

Clearing the cookies helps.

This patch could be a part of common/istio-1-17/istio-install/overlays/oauth2-proxy.

Delegating auth to oauth2-proxy

To delegate access control to an external authorization system, an AuthorizationPolicy specifying the CUSTOM action and the name of the extension provider must be created. For us, the extension is the envoyExtAuthzHttp provider named oauth2-proxy. Previously this was managed by the EnvoyFilter as well.

To configure an AuthorizationPolicy for a whole Istio Ingress Gateway pod (aka the Istio load balancer), the AuthorizationPolicy must be created in the namespace where the gateway pod is deployed, with a selector matching the gateway pod. Here this will be:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: oauth2-proxy-istio-ingressgateway
  namespace: istio-system
spec:
  action: CUSTOM
  provider:
    name: oauth2-proxy
  selector:
    matchLabels:
      app: istio-ingressgateway
  rules:
  - {}

This could be added to common/istio-1-17/kubeflow-istio-resources/overlays/oauth2-proxy.

Bonus: CloudFlare

If you're running your Kubeflow instance behind CloudFlare, you will probably want to disable auth for some static web page resources, like the favicon or assets, as CloudFlare will try to cache them. The following configuration can be used in such a scenario:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: oauth2-proxy-istio-ingressgateway
  namespace: istio-system
spec:
  action: CUSTOM
  provider:
    name: oauth2-proxy
  selector:
    matchLabels:
      app: istio-ingressgateway
  rules:
  - to:
    - operation:
        notPaths:
        - /favicon*
        - /webcomponentsjs*
        - /vendor.bundle.js
        - /app.bundle.js
        - /dashboard_lib.bundle.js
        - /assets*
        - /app.css

Programmatic clients

I understand that support for programmatic clients in this setup is planned as a next step, but since I have this already worked out, I'd like to describe the steps needed to enable it in a way that complies with the configuration I described above.

In summary, oauth2-proxy must be configured to skip auth for requests that carry verified JWT bearer tokens, and then istio must be configured to trust this JWT and extract the user information into the header. This JWT can, but doesn't have to, be issued by the kubernetes cluster. If it is, tokens can be created using the kubectl create token command. Otherwise, you'd have to use the other IdP to generate the tokens.

oauth2-proxy

This must be added to the configuration:

skip_jwt_bearer_tokens = true
extra_jwt_issuers = "issuer1=audience1,issuer2=audience2"  # example: "https://oidc.example=https://kubernetes.default.svc"

Istio

When oauth2-proxy skips the auth, the content of the "authorization" header is passed to istio, and then we can use RequestAuthentication again to trust this JWT bearer token and pass the client information to the kubeflow-userid header:

apiVersion: security.istio.io/v1
kind: RequestAuthentication
metadata:
  name: kubernetes-jwt
  namespace: istio-system
spec:
  jwtRules:
  - forwardOriginalToken: true
    issuer: https://oidc.example
    outputClaimToHeaders:
    - claim: sub
      header: kubeflow-userid
    - claim: sub
      header: x-auth-request-email
    - claim: sub
      header: x-auth-request-user

This could be added to common/istio-1-17/kubeflow-istio-resources/overlays/oauth2-proxy.

Usage example

With this configuration in place, assuming k8s is configured with some IdP, the following commands can be executed to test the m2m flow:

$ TOKEN="$(kubectl create token default-editor)"
$ curl https://kubeflow.example.com/ -H "Authorization: Bearer $TOKEN"
# web page content is returned instead of an auth redirect

$ curl https://kubeflow.example.com/api/workgroup/env-info -H "Authorization: Bearer $TOKEN"
{"user":"system:serviceaccount:kubeflow-user-example-com:default-editor","platform":{"kubeflowVersion":"unknown","provider":"other://","providerName":"other","logoutUrl":"/logout"},"namespaces":[],"isClusterAdmin":false}

Note

It's important to understand that if the issuer is served over https, the https certificates must be trusted by both oauth2-proxy and istio. This means that if the oidc issuer of your k8s cluster is https://kubernetes.default.svc.cluster.local, the certs will most probably be self-signed, and both istio and oauth2-proxy will have issues with that. I'm not sure there is a configuration for istio and oauth2-proxy to accept such a situation other than adding the certs to the containers. My recommendation for such cases is to deploy kubernetes with some IdP such as keycloak.
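On the oauth2-proxy side, one hedged option is pointing it at the CA that kubernetes projects into every pod; I believe the relevant option is provider_ca_files, but treat both the option name and the approach as assumptions to verify:

```toml
# Trust the in-cluster CA when talking to an https issuer.
provider_ca_files = [ "/run/secrets/kubernetes.io/serviceaccount/ca.crt" ]
```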


Let me know what you think! Also, I have some spare time if you'd like some help on the PR.

@axel7083 (Contributor, Author) commented Aug 27, 2023

Hey @axel7083, I see we were working on a very similar contribution in parallel!

Hey @kromanow94! Thanks for reaching out, I really appreciate all the details and work! Let's dive into it.

I also wanted to add support for oauth2-proxy, but with a slightly different configuration. I'd like to describe the differences, provide some arguments in favor, and open a discussion, if you don't mind. I think there might be some value in combining our work here.

This PR aims to mimic the behavior of the oidc-authservice. I knew that using the Istio configuration more in depth could potentially be better, but I had not looked into it, so thank you for this amazing work!

oauth2-proxy config

I have two points:

  • I think it's worth keeping oauth2-proxy in its own namespace, because oauth2-proxy is integrated with istio but is not a part of istio. That said, I understand it was done to keep oauth2-proxy and istio close and to ease the kustomize setup, so it's rather an insignificant issue

I totally get the point here, and agree.

  • skip_auth_regex is deprecated and should be replaced with skip_auth_route

(DEPRECATED for --skip-auth-route) bypass authentication for requests paths that match (may be given multiple times)

https://oauth2-proxy.github.io/oauth2-proxy/docs/configuration/overview/

Thanks for noticing! This can be fixed right away; I will run some tests to ensure everything runs properly.

[quotes the remainder of @kromanow94's comment above, from "Istio mesh config vs EnvoyFilter" through "Let me know what you think!"]

I really like the approach you are taking, relying more on Istio than on oauth2-proxy (Istio really has better support anyway).

As I stated at the beginning of this comment, this PR aims to mimic the existing behavior, offering an alternative. @kimwnasptd what do you think? This is probably worth a try, but maybe in another PR? He did a great job; I'm not sure if it should be integrated in this PR or later.

Edit

I will try it in a cluster next week to see it in more detail :)

@kromanow94 (Contributor)

Hey @axel7083, thanks for the positive feedback, I'm really glad you like it! :)

Sure, let’s see what the community thinks. If the decision would be to go with separate PR I can author it or help in the process, whatever is preferred.

@juliusvonkohout (Member) commented Aug 28, 2023

@kromanow94 I really like your approach. Either @axel7083 adds you as a collaborator to his repository or the other way around; this way you can work together, as I often do. You can see this way of collaborating in #2455, for example.

As you can see in #2455, we will have multiple Istio versions for some time, so please use a kustomize component instead of an overlay (see the sketch below). This component can probably live in the oauth2-related directory as well.
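A minimal sketch of such a component; the resource file names are illustrative:

```yaml
apiVersion: kustomize.config.k8s.io/v1alpha1
kind: Component
resources:
- request-authentication.yaml  # e.g. the dex-jwt RequestAuthentication
- authorization-policy.yaml    # e.g. the CUSTOM oauth2-proxy policy
```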

We must be sure that the JWT issued by kubernetes is checked for the correct signature, so that you cannot fake it. I assume that is handled automatically, but please confirm.

Furthermore, we must somehow automatically import the cluster certificates, whether self-signed or not. As far as I know, these certificates are mounted by default into each pod on a normal kubernetes cluster. Probably you just have to tell Istio to use them as well. It must work in every installation automatically.

I can definitely guide regarding the repository structure and kubernetes.

@kromanow94 (Contributor)

@juliusvonkohout thanks :).

I assume from your comment that the changes I mentioned above should be a part of this PR.

About the cluster certificates: you're right, it's in the k8s docs that the ServiceAccount admission controller adds a projected volume with the cert: https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume. With this, we can rely on the certs always being available in the istiod and oauth2-proxy pods, and we just need to add them. The path is always /run/secrets/kubernetes.io/serviceaccount/ca.crt.

Also, we can take the issuer URL from inside the token at /run/secrets/kubernetes.io/serviceaccount/token (this is always available unless ServiceAccount.automountServiceAccountToken is false). This could then be used in some k8s Job to create an instance of RequestAuthentication with the issuer URL taken from the token and outputClaimToHeaders. @juliusvonkohout does that seem like a viable option? Is there another way of automating this that you have in mind?

Using the PyJWT library, we can get the issuer automatically with:

import jwt

with open("/run/secrets/kubernetes.io/serviceaccount/token", "rb") as token_file:
    token = token_file.read().decode()

issuer = jwt.decode(token, options={"verify_signature": False})["iss"]  # don't verify because of self-signed issuer cert

We must be sure that the JWT issued by kubernetes is checked for the correct signature, so that you cannot fake it. I assume that is handled automatically, but please confirm.

Both oauth2-proxy and istiod complained when the certs were not trusted, and it always rendered the setup unusable, so I think it's safe to assume the signature is verified. If you need a better argument, I can verify further.

please use a kustomize component instead of an overlay. This component can probably live in the oauth2-related directory as well.

It would be best if you could provide a directory path that would suit this.

@axel7083 how do you feel about adding me as a collaborator to your repository?

@juliusvonkohout (Member)

@kromanow94

"Also, we can take the issuer url from inside the token value in /run/secrets/kubernetes.io/serviceaccount/token (this is always available unless ServiceAccount.automountServiceAccountToken is false). Then this could be used in some k8s job to create an instance of RequestAuthentication with the issuer url taken from token and outputClaimToHeaders. @juliusvonkohout does that seem like a viable option? Is there another way of automating that you have in mind?"

We can use something similar to this to enforce mounting:

      volumes:
      - name: volume-kf-pipelines-token
        projected:
          defaultMode: 420
          sources:
          - serviceAccountToken:
              audience: pipelines.kubeflow.org
              expirationSeconds: 7200
              path: token

It is used a lot in KFP.

@kromanow94 (Contributor)

@juliusvonkohout sure, why not. It was more a question of whether you see having a k8s Job in the istio-system namespace that creates an instance of RequestAuthentication as a viable option.

@kimwnasptd (Member)

Thanks for the detailed explanations @kromanow94, this is really good technical context and insights!

As next steps I want to suggest the following:

  • I'd like to merge this PR from @axel7083, since
    • he's worked on it for so long
    • I'd like us to have a first step merged and then we can keep iterating
  • Have follow-up issues:
    • oauth2-proxy and Istio mesh support
    • oauth2-proxy and programmatic access clients

Again the reasoning for the above is to also slightly organize the knowledge and discussion better. Although I'd expect the context to be a little bit intertwined, which we can keep discussing in the calls as well.

@axel7083 @kromanow94 WDYT?

@axel7083 (Contributor, Author)

Thanks for the detailed explanations @kromanow94, this is really good technical context and insights!

As next steps I want to suggest the following:

  • I'd like to merge this PR from @axel7083, since

    • he's worked on it for so long
    • I'd like us to have a first step merged and then we can keep iterating
  • Have follow-up issues:

    • oauth2-proxy and Istio mesh support
    • oauth2-proxy and programmatic access clients

Again the reasoning for the above is to also slightly organize the knowledge and discussion better. Although I'd expect the context to be a little bit intertwined, which we can keep discussing in the calls as well.

@axel7083 @kromanow94 WDYT?

Thanks for the reply. Yes, I agree with your view on the steps to follow; this PR is getting old and has accumulated a lot of elements, discussion, and comments. Merging it and working on a new one is probably better!

@kromanow94 (Contributor)

Sounds good. FYI, discussion on istio slack about using Kubernetes OIDC: https://istio.slack.com/archives/C3TEGNZ7W/p1693320167853579

@kimwnasptd (Member)

@axel7083 thanks again for your time on this, and really looking forward to next steps!

@axel7083 @kromanow94 @juliusvonkohout Let's discuss the future fun items in the new issues :)

/lgtm
/approve

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: axel7083, kimwnasptd

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
