Authorization with Identity Aware Proxy #11305

Open
9 tasks
mik-laj opened this issue Oct 6, 2020 · 23 comments
Labels
area:API Airflow's REST/HTTP API area:webserver Webserver related Issues good first issue kind:feature Feature Requests security Security issues that must be fixed

Comments

@mik-laj
Member

mik-laj commented Oct 6, 2020

Why?

Users expect integration with various Identity Aware Proxies (IAP) that provide authorization. The use of such proxies brings many benefits.

  • Management: provides trustworthy integration with identity providers, e.g. Google, LDAP, Kerberos, Keycloak, OAuth.
  • Auditability: writes additional metadata about the operations performed by users.
  • Security: no need to keep secrets in a container that can execute user code. All the necessary user info is provided in a push model.
  • Security standards: one proxy can protect multiple applications, so its code is held to higher security standards. The in-app part is very simple and easier to audit.

Besides, it can make using LDAP with Airflow much easier. Deficiencies in Airflow's LDAP implementation will no longer be a problem for our users (e.g. dpgaspar/Flask-AppBuilder#956).

I think we should prepare implementations for some of the most popular products:

These implementations will also serve as examples for integrating other products.

How?

In order to accomplish this task for each supported proxy, we need to prepare two authorization checks - one for Web UI, one for API.

API

Creating your own API auth backend is described in our documentation: https://airflow.readthedocs.io/en/latest/security/api.html#roll-your-own-api-authentication
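
As a rough illustration of the API side, here is a framework-free sketch of the decorator such a backend could expose. It assumes the backend boils down to a `requires_authentication` decorator, as I understand the linked documentation; `make_requires_authentication`, `get_headers`, `find_user`, and `list_dags` are hypothetical names introduced only for this example. In a real backend the headers would come from the web framework's request object and the user lookup from the security manager.

```python
from functools import wraps
from typing import Callable, Dict, Mapping, Optional

# Header name taken from the example in this issue; other proxies use other names.
USERNAME_HEADER = "X-Auth-Username"


def make_requires_authentication(
    get_headers: Callable[[], Mapping[str, str]],
    find_user: Callable[[str], Optional[object]],
):
    """Build a decorator that rejects calls lacking a known, proxy-supplied user."""

    def requires_authentication(function):
        @wraps(function)
        def decorated(*args, **kwargs):
            # Note: a plain dict lookup is case-sensitive, unlike real HTTP headers.
            username = get_headers().get(USERNAME_HEADER)
            if not username or find_user(username) is None:
                # Mirror an HTTP 401 response as a (body, status) pair.
                return "Unauthorized", 401
            return function(*args, **kwargs)

        return decorated

    return requires_authentication


# Stand-ins for the current request's headers and the user database (hypothetical).
current_headers: Dict[str, str] = {}
known_users = {"alice": object()}

requires_authentication = make_requires_authentication(
    lambda: current_headers, known_users.get
)


@requires_authentication
def list_dags():
    return "ok", 200
```

This only makes sense when the application is reachable exclusively through the proxy. Otherwise any client could set the header itself.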

FAB

Creating an integration with Flask App Builder (FAB) is less well documented, but in our case, we can extend the REMOTE_USER authentication to support product-specific headers.

To do this, create a new view based on the flask_appbuilder.security.views.AuthView class, and then set it as the authremoteuserview attribute in the airflow.www.security.AirflowSecurityManager class. You can use the flask_appbuilder.security.views.AuthRemoteUserView class as a template.

Below is a minimal example of a webserver_config.py file (save it to ~/airflow/config/) that provides authorization using the X-Auth-Username header. The goal is to support more vendor-specific headers.

from flask import flash, g, get_flashed_messages, redirect, request
from flask_appbuilder import expose
from flask_appbuilder._compat import as_unicode
from flask_appbuilder.security.views import AuthView
from flask_login import login_user, logout_user

from airflow.www.security import AirflowSecurityManager


class CustomAuthRemoteUserView(AuthView):
    login_template = ""

    @expose("/login/")
    def login(self):
        if g.user is not None and g.user.is_authenticated:
            return redirect(self.appbuilder.get_url_for_index)

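        # Read the header set by the proxy. In the raw WSGI environ the same
        # value would live under the key "HTTP_X_AUTH_USERNAME".
        username = request.headers.get("X-Auth-Username")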
        if username:
            user = self.appbuilder.sm.auth_user_remote_user(username)
            if user is None:
                flash(as_unicode(self.invalid_login_message), "warning")
            else:
                login_user(user)
        else:
            flash(as_unicode(self.invalid_login_message), "warning")

        # Flush the "Access is Denied" flash message
        get_flashed_messages()
        return redirect(self.appbuilder.get_url_for_index)

    @expose("/logout/")
    def logout(self):
        logout_user()
        return redirect("/oauth/logout")


class CustomAirflowSecurityManager(AirflowSecurityManager):
    authremoteuserview = CustomAuthRemoteUserView


SECURITY_MANAGER_CLASS = CustomAirflowSecurityManager  # pylint:

Calling get_flashed_messages() clears the "Access is Denied" flash message that appears when the user is redirected from / to /login. FAB's own view does not do this, but Airflow needs it.
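
One detail worth checking when reading proxy headers in a view: under WSGI (PEP 3333), request headers appear in `environ` upper-cased, with dashes replaced by underscores and an `HTTP_` prefix, while Flask's `request.headers` accepts the original header name. The small helper below is hypothetical and only illustrates that mapping.

```python
def wsgi_environ_key(header_name: str) -> str:
    """Return the WSGI environ key under which an HTTP request header is exposed."""
    key = header_name.upper().replace("-", "_")
    # Per PEP 3333, these two headers are stored without the HTTP_ prefix.
    if key in ("CONTENT_TYPE", "CONTENT_LENGTH"):
        return key
    return "HTTP_" + key


# A fake environ such as a WSGI server would build for a proxied request.
environ = {wsgi_environ_key("X-Auth-Username"): "alice"}
print(environ)
```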

Vendor headers

In the case of Louketo/Keycloak, we should support the following headers:

  • X-Auth-Email
  • X-Auth-Family-Name
  • X-Auth-Given-Name
  • X-Auth-Groups
  • X-Auth-Roles
  • X-Auth-Username
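
A minimal sketch of how these headers might be collected into one object before mapping them onto an Airflow user. `ProxyUser` and `user_from_headers` are hypothetical names, and the comma-separated format for X-Auth-Groups and X-Auth-Roles is an assumption that should be checked against the proxy's actual output.

```python
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass
class ProxyUser:
    """Identity details forwarded by the proxy (hypothetical container)."""

    username: str
    email: Optional[str] = None
    given_name: Optional[str] = None
    family_name: Optional[str] = None
    groups: List[str] = field(default_factory=list)
    roles: List[str] = field(default_factory=list)


def _split(value: Optional[str]) -> List[str]:
    # Assumption: multi-valued headers arrive as a comma-separated string.
    return [item.strip() for item in (value or "").split(",") if item.strip()]


def user_from_headers(headers: Mapping[str, str]) -> Optional[ProxyUser]:
    """Build a ProxyUser from X-Auth-* headers, or return None without a username."""
    username = headers.get("X-Auth-Username")
    if not username:
        return None
    return ProxyUser(
        username=username,
        email=headers.get("X-Auth-Email"),
        given_name=headers.get("X-Auth-Given-Name"),
        family_name=headers.get("X-Auth-Family-Name"),
        groups=_split(headers.get("X-Auth-Groups")),
        roles=_split(headers.get("X-Auth-Roles")),
    )
```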

In the case of Google IAP, we should use the JWT signed header: https://cloud.google.com/iap/docs/signed-headers-howto
In the case of Pomerium, we should use the JWT signed header X-Pomerium-Jwt-Assertion: https://www.pomerium.io/docs/topics/getting-users-identity.html#prerequisites
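
For the JWT-based proxies, the checks that come after signature verification can be sketched with the standard library alone. This sketch deliberately does NOT verify the signature: a real integration must first verify the token against the provider's published public keys with a JWT library, and only then trust the claims. `decode_claims_unverified` and `claims_are_acceptable` are hypothetical names, and the expected audience value depends on the deployment.

```python
import base64
import json
import time
from typing import Any, Dict, Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url-encoded with the padding stripped.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_claims_unverified(token: str) -> Dict[str, Any]:
    """Return the payload claims WITHOUT checking the signature (inspection only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a token of the form header.payload.signature")
    return json.loads(_b64url_decode(parts[1]))


def claims_are_acceptable(
    claims: Dict[str, Any], expected_audience: str, now: Optional[float] = None
) -> bool:
    """Check audience and expiry; call this only after the signature is verified."""
    now = time.time() if now is None else now
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        return False
    exp = claims.get("exp")
    return isinstance(exp, (int, float)) and exp > now
```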

Status

  • API supports Louketo Proxy
  • API supports Google IAP (@alex-kattathra-johnson)
  • API supports Pomerium (@ameyk-2409)
  • Web UI supports Louketo Proxy
  • Web UI supports Google IAP
  • Web UI supports Pomerium
  • Docs for Louketo Proxy
  • Docs for Google IAP
  • Docs for Pomerium

Disclaimer

If someone is interested in this task, I will be happy to provide all the necessary information and support.

@mik-laj mik-laj added security Security issues that must be fixed area:webserver Webserver related Issues kind:feature Feature Requests area:API Airflow's REST/HTTP API good first issue Hacktoberfest labels Oct 6, 2020
@ap-kulkarni

@mik-laj I am interested to work on this.

@mik-laj
Member Author

mik-laj commented Oct 8, 2020

@ameyk-2409 Which task do you want to focus on? We have a few tasks to do here, and I think this can be broken down into several small contributions.

@ap-kulkarni

@mik-laj By task, do you mean the ones listed under the Status section above? If yes, I am interested in implementing authorization for the API part. It can be any of the implementations mentioned in the description.

@mik-laj
Member Author

mik-laj commented Oct 8, 2020

@ameyk-2409 Fantastic. I assigned you to "API supports Pomerium". 🐈

@rafaelvargas

@mik-laj I'd like to work on the support for Google's IAP.

@mik-laj
Member Author

mik-laj commented Oct 19, 2020

@rafaelvargas I assigned you to "API supports Google IAP". I am trying to get permission to publish the integration with the webserver, but this may not happen, so I am not assigning myself to that task. However, I am happy to help with the review of the IAP integration for the API.

@mik-laj
Member Author

mik-laj commented Nov 2, 2020

@zjffdu Jarek suggested that we also provide support for Apache Knox. Can you share details about how this product works?

@mik-laj mik-laj changed the title Authorization using Identity Aware Proxy Authorization with Identity Aware Proxy Nov 2, 2020
@zjffdu

zjffdu commented Nov 2, 2020

@mik-laj Thanks for mentioning me. Apache Knox is a reverse proxy. I mean to use Knox as a reverse proxy in front of Airflow, so that we can leverage Knox's SSO. https://knox.apache.org/

@mik-laj
Member Author

mik-laj commented Nov 2, 2020

@zjffdu How is the identity from Apache Knox passed to other applications? Have you ever tried integrating other applications with Apache Knox?

@loozhengyuan
Contributor

@mik-laj FYI, not sure if this has been raised yet, but Keycloak has recently sunset the Louketo project, which is due to reach EOL on 21 Nov. Here's the relevant GitHub issue. As such, we may consider omitting Louketo from the scope of this issue.

@zjffdu

zjffdu commented Nov 12, 2020

@zjffdu How is the identity from Apache Knox passed to other applications? Have you ever tried integrating other applications with Apache Knox?

It looks like there's already a PR in the Knox project: apache/knox#182

@w4tsn

w4tsn commented Dec 17, 2020

@mik-laj @loozhengyuan Keycloak states that oauth2-proxy is the viable alternative, so maybe this project should replace Louketo in this issue.

@mik-laj
Member Author

mik-laj commented Dec 17, 2020

@w4tsn I have a working and tested implementation that uses Louketo Proxy. If time permits, I will try to update it to use a different proxy and contribute it to the community, but for now I am very short on time.

@rg2609

rg2609 commented Feb 2, 2021

@mik-laj Can you share the branch or PR with the changes you made for Keycloak?

@mik-laj
Member Author

mik-laj commented Feb 2, 2021

@rg2609 Unfortunately, this is part of a client project, and I haven't found the time to reimplement it for the community.

@rg2609

rg2609 commented Feb 2, 2021

@mik-laj So can you guide me on where to make the changes?

@ghost

ghost commented May 28, 2021

@rafaelvargas, was wondering if the Google IAP support is actively being worked on. If not, I'd be interested in giving it a shot!

@mik-laj
Member Author

mik-laj commented May 28, 2021

@alex-kattathra-johnson Assigned. I also worked on IAP support, but never finished. I managed to write some system test code to check if the integration is working fine. Feel free to use it in your PR.
https://github.com/mik-laj/airflow/pull/35/files

@brandondtb

@rafaelvargas @ap-kulkarni I wanted to check in on the progress of this one and see if either of you are actively working on it. I could really use this feature, and would be happy to help however I can.

@ap-kulkarni

ap-kulkarni commented May 15, 2022

Apologies for the long hiatus on this one. I could not work on this due to personal issues. I have started analyzing the requirements for integrating with Pomerium and have a few questions.

  1. When a request reaches Airflow, will the user already be authenticated with Pomerium? That is, when validating the request in the auth backend, should the code look directly for the X-Pomerium-Jwt-Assertion header, or will the request contain credentials that the code must first authenticate with Pomerium?
  2. To validate the JWT header we will need a JWT library. I think jwcrypto would be a good choice, since the library list on the JWT.IO page shows it supporting all facets of JWT. When I tried installing it in the Python environment created for Airflow, I found that it is already installed as a dependency of some other requirement. However, I feel we should add an explicit requirement for it. Let me know if this is okay and what criteria are used to pin a requirement to a particular version. Also, if anyone has another suggestion for a JWT library, I would like to hear it.

At this point I am concentrating on API authentication only. Once I am clear enough on the details, I will look at the FAB implementation. Again, apologies for not being able to work on this one for so long.

@mik-laj
Member Author

mik-laj commented May 16, 2022

When a request reaches Airflow, will the user already be authenticated with Pomerium? That is, when validating the request in the auth backend, should the code look directly for the X-Pomerium-Jwt-Assertion header, or will the request contain credentials that the code must first authenticate with Pomerium?

I have no experience with this platform, but a non-privileged user should not be able to log in and this is the main requirement.

Let me know if this is okay and what criteria are used to pin a requirement to a particular version.

We should create a new provider and define all requirements explicitly. Here is our doc about dependencies and the upper-bound versions of Airflow dependencies: https://github.com/apache/airflow#approach-to-dependencies-of-airflow
The lower bound should point to the version you are currently testing against.

@ap-kulkarni

Thank you @mik-laj. I will try setting up the minimal environment required for this, and will post questions here if I get stuck anywhere.

@softestplease

softestplease commented May 27, 2022

API supports Google IAP

Hello @mik-laj, has this code/feature been tested (https://github.com/mik-laj/airflow/pull/35/files)? I need this feature in my environment, as we are running Airflow in GKE and would like to trigger DAGs through the stable REST API from a Cloud Function. I believe an HTTP request can carry only one authentication header. Since IAP authentication is already used from the Cloud Function to Airflow on GKE, we are unable to also include a username/password for the basic_auth backend type. Cheers
