This repository has been archived by the owner on Jul 11, 2023. It is now read-only.

Programmatic access and authentication/authorization #130

Open
AmadiL opened this issue Jun 14, 2021 · 9 comments

Comments

@AmadiL

AmadiL commented Jun 14, 2021

Hi,
I've been looking into the main Kubeflow project, and there seems to be a problem with authenticating and authorizing programmatically.
With Dex it seems to involve a workaround: kubeflow/kubeflow#5345
while with Cognito it's impossible (it involves pasting cookies by hand): kubeflow/pipelines#4182

Did you happen to address this issue in your project? I can see you're using oauth2-proxy - does it help here in any way?
We are trying to integrate our auth gate with an external IdP such as Ping Identity or LDAP. We are also considering moving to argoflow-aws.

@davidspek
Member

@AmadiL Am I right in assuming that the main reason you need authentication through the Gateway is Kubeflow Pipelines? Or do you also need it for KFServing? I’d like to hear more about your use case so that we can come up with a better method to handle these types of policies.

The cookie method works with OAuth2-Proxy (though the name of the cookie is different). Better authentication integration for the KFP SDK is something we’d like to work on in the future.

If you need LDAP integration, OAuth2-Proxy can’t handle this natively, but we provide overlays for Dex and Keycloak, which can. For Ping I am not sure; if it is OIDC, you can integrate it with OAuth2-Proxy directly.
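
As a rough illustration of the cookie method with OAuth2-Proxy, the sketch below attaches an existing session cookie to a request. It assumes the deployment uses OAuth2-Proxy's default cookie name, `_oauth2_proxy` (this can be changed with `--cookie-name`, so check your configuration). The helper name `authenticated_request` and the host are hypothetical, and the cookie value still has to be copied from a logged-in browser session.

```python
import urllib.request


def authenticated_request(url, cookie_value, cookie_name="_oauth2_proxy"):
    """Build a request that carries an existing OAuth2-Proxy session cookie.

    cookie_name defaults to OAuth2-Proxy's default; this is an assumption
    about the deployment, so adjust it if --cookie-name was overridden.
    """
    req = urllib.request.Request(url)
    req.add_header("Cookie", f"{cookie_name}={cookie_value}")
    return req


# Hypothetical usage; the host is a placeholder:
# req = authenticated_request("https://<kubeflow_host>/pipeline/", "<cookie value>")
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```

The same `name=value` string can be passed to `kfp.Client(host=..., cookies=...)`, as in the Cognito example further down this thread.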

@AmadiL
Author

AmadiL commented Jun 15, 2021

  1. Kubeflow Pipelines (REST and with kfp) and models (KFServing or Seldon).
    Example use cases:

    • deploy pipelines with CI/CD,
    • trigger pipelines,
    • run inference on the models.
  2. The cookie method doesn't work when we want to give the access token/credentials to the pipeline or some other application running outside the cluster.

  3. OAuth2-Proxy, Dex and Keycloak are all OAuth2/OIDC providers. Ping Identity also supports OAuth2/OIDC or SAML. Dex and Keycloak should be able to integrate with LDAP. We can put the authn/z on the cluster or on a Gateway outside it (like an external ALB or API Gateway). The problem with Cognito is that by default it uses the OAuth2 Authorization Code Grant (which doesn't support machine-to-machine authn/z), and if you switch to the Client Credentials Grant it doesn't support federation (Ping Identity or any other IdP via SAML/OIDC). So I am looking for any suitable combination of tools (OAuth2-Proxy/Dex/Keycloak/Ory/Istio-mTLS/APIGateway-Auth + Ping Identity/LDAP*) that will allow me to authenticate programmatically (M2M) using identities from our central IdP.

*Our Ping Identity is synced with LDAP

If we could find a general solution it would be great for everyone :)

@davidspek
Member

@AmadiL Could you expand a bit on how machine-to-machine authentication works in this workflow? I'm not familiar with it, but I'm reading up on it now. It does look similar to something I've seen before, which gives me hope that we can use it to provide a better authentication experience for the entire Kubeflow setup.

@AmadiL
Author

AmadiL commented Jun 15, 2021

M2M with OAuth2 is a little bit tricky. I'm not an expert; I'm just researching it to solve this particular issue.
You can ask the provider for an access token with the Client Credentials Grant, using client_id and client_secret:
https://www.oauth.com/oauth2-servers/access-tokens/client-credentials/
https://www.appsdeveloperblog.com/keycloak-client-credentials-grant-example/
There seems to also be a Password Grant:
https://www.appsdeveloperblog.com/keycloak-requesting-token-with-password-grant/
https://developer.okta.com/blog/2018/06/29/what-is-the-oauth2-password-grant

I don't have the full picture of what this workflow would look like with Kubeflow. For me it doesn't really matter, as long as I can either let my app/client generate a token that it uses to access KF, or generate a long-lived token myself, store it somewhere, and have my app/client use it to access KF.
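
For reference, a minimal sketch of what a Client Credentials Grant token request looks like on the wire (RFC 6749, section 4.4), using only the Python standard library. The function names, the token URL and the scope are hypothetical; the real token endpoint and the supported client authentication method depend on the IdP. This only obtains a token and says nothing about whether the Kubeflow gateway would accept it.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_token_request(token_url, client_id, client_secret, scope=None):
    """Build a Client Credentials Grant token request (RFC 6749, section 4.4)."""
    form = {"grant_type": "client_credentials"}
    if scope:
        form["scope"] = scope
    # The client authenticates with HTTP Basic auth. Some IdPs expect
    # client_id/client_secret in the form body instead (an IdP-specific detail).
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        token_url,
        data=urllib.parse.urlencode(form).encode(),
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_access_token(token_url, client_id, client_secret, scope=None):
    """Send the request and return the access_token field of the JSON reply."""
    req = build_token_request(token_url, client_id, client_secret, scope)
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)["access_token"]


# Hypothetical usage; the URL and credentials are placeholders:
# token = fetch_access_token("https://<idp_host>/token", "<client_id>", "<client_secret>")
```

The client secret should come from a secret store rather than source code, since anyone holding it can mint tokens for that client.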

@davidspek
Member

Obviously, using the client ID and secret is something that needs to be done very carefully. On GCP, the pipelines SDK has an authentication flow that I think is somewhat similar, and could hopefully be more generalized for a generic OIDC/OAuth setup.

https://www.kubeflow.org/docs/distributions/gke/pipelines/authentication-sdk/#connecting-to-kubeflow-pipelines-in-a-full-kubeflow-deployment

Does this setup look familiar to you?

@AmadiL
Author

AmadiL commented Jun 15, 2021

Yes, I believe this is exactly OAuth2 Client Credentials Grant with GCP's IAP:
https://cloud.google.com/iap/docs/authentication-howto#authenticating_from_a_desktop_app (found this link in KF doc you just provided)

@amybachir

@AmadiL Here is how I got around programmatically authenticating to Kubeflow with Cognito. It's an ugly workaround, but it's the only one I could come up with.

  1. I created a service account in Cognito. It is just a regular "User Pool" user, but designated for programmatic access. You can keep its credentials in AWS Secrets Manager and fetch them from there in your CI/CD pipeline.
  2. I wrote a Python package that uses Selenium and ChromeDriver, and passed the Cognito service account username/password to it to get the auth cookies. Cognito returns 2 cookies.
  3. Authenticate using the Cognito cookies:
    • Kubeflow pipelines using the sdk: you can pass the cognito cookies like this:
      •   import kfp
          alb_session_cookie0='AWSELBAuthSessionCookie-0=<cookie0>'
          alb_session_cookie1='AWSELBAuthSessionCookie-1=<cookie1>'
          client = kfp.Client(host='https://<aws_alb_host>/pipeline', cookies=f"{alb_session_cookie0};{alb_session_cookie1}")
          client.list_experiments(namespace="<your_namespace>") 
        
    • KFServing: authenticate your HTTP session. Here is a Python example:
      •   import requests

          # Cookie names must match the ones the ALB sets after the Cognito login.
          cookies = {
              "AWSELBAuthSessionCookie-0": cookie_0_value,
              "AWSELBAuthSessionCookie-1": cookie_1_value,
          }
          session = requests.Session()
          requests.utils.add_dict_to_cookiejar(session.cookies, cookies)
          session.get(host)
        

@soleares
Collaborator

soleares commented Jul 3, 2021

It looks like oauth2-proxy only supports user flows and not the client credentials flow: oauth2-proxy/oauth2-proxy#698.

This Kubeflow distribution with oauth2-proxy does work with OIDC and external IdPs like Okta.

I've used the client credentials flow in the past for API calls that are not user-scoped. I'm curious how that would work here, since I assume that with Kubeflow you would want different users to log in, not just a single machine user. But I might not be understanding your requirements.

@davidspek
Member

I believe I've figured out a way to generate API keys, and thus also enable programmatic access. The downside is that the necessary changes are not trivial. This is something I'm working on as part of my new job, and I hope to be able to port it to ArgoFlow as well. Until then, the cookie method might be the only way to go if your IdP doesn't support generating API tokens.
