Support OAuth for Integrations in Airbyte UI #768

Closed
cgardens opened this issue Oct 31, 2020 · 14 comments
Labels
type/enhancement New feature or request

Comments

@cgardens
Contributor

cgardens commented Oct 31, 2020

related to #766.

Checklist

Described in #5769

Tell us about the problem you're trying to solve

  • Several of our integrations require authenticating with OAuth. The common way of doing this in Singer is to cheese the system a little bit. Essentially, you find some way to get a refresh token by extracting it from a network call in the browser's developer tools and then pass it as an argument to the integration. This is not how OAuth is intended to work, but we've followed Singer's cue here and done the same.
  • The downside of this approach is that it's really unfriendly to the user:
    • Accessing the refresh token is usually something intended to be done by developers, not your average user of a service.
    • Even if you are a developer, it's supposed to be done inside their own application, not as a series of scripts and hacks, which is what the current procedure relies on.
    • Anecdotally, it takes me on average an hour to go from creating an account to successfully extracting a refresh token for a given service. This is a pretty big source of friction!

If you're not familiar with oauth (or forget how it works every time you encounter it, like me)...

The flow looks something like this.

  • X is my application that wants to access User Y's data in Application Z.
  • A developer from X goes to Z and gets some credentials to identify their application (usually a client id and a client secret)
  • While User Y is using X, X says it needs access to User Y's data in Z.
  • User Y is redirected to Z's OAuth portal, a.k.a. that page where it says "X wants to be able to see your data, is that okay?" (under the hood, X has passed its own client id to Z to identify its application; the client secret is only used later, when X talks to Z directly).
  • Assuming the user agreed to give access, Z redirects back to X, providing it with a refresh token (strictly, in the common authorization-code flow, Z returns a short-lived code that X then exchanges for the tokens). This refresh token can then be used to create access tokens. The access tokens are how X is able to access Y's data in Z. Access tokens expire after a few hours. Refresh tokens (in ad tech) often don't expire unless they are revoked. If the refresh token does expire, it's usually after many days or months. As long as X has a non-revoked, non-expired refresh token, it will be able to access Y's data.
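
For readers who find the flow easier to follow as code, here is a minimal sketch of the three requests X ends up constructing, assuming the common authorization-code variant of OAuth 2.0 (Z hands X a short-lived code, which X exchanges for the refresh token). The `z.example.com` endpoints and all function names are placeholders made up for illustration, not any real provider's API.

```python
from urllib.parse import urlencode

# Hypothetical endpoints for "Application Z"; real providers publish their own.
AUTHORIZE_URL = "https://z.example.com/oauth/authorize"
TOKEN_URL = "https://z.example.com/oauth/token"


def build_authorization_url(client_id, redirect_uri, scope, state):
    """URL that X sends User Y's browser to so Y can approve access in Z."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,  # opaque value X checks again when Z redirects back
    })
    return f"{AUTHORIZE_URL}?{query}"


def code_exchange_body(client_id, client_secret, code, redirect_uri):
    """Form body X POSTs to TOKEN_URL to swap the code for tokens."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    }


def refresh_body(client_id, client_secret, refresh_token):
    """Form body X POSTs to TOKEN_URL whenever it needs a new access token."""
    return {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
    }
```

Only the first URL ever passes through the user's browser; the two token requests are made server to server, which is why the client secret appears only there.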

Describe the solution you’d like

  • Airbyte should provide a facility for integrations to do OAuth in Airbyte's UI.
  • The flow:
    • The user selects the integration they want to use. They input the credentials (e.g. client id, client secret), or we input our own; not sure which makes the most sense yet.
    • Airbyte uses that to construct the correct request to the integration's OAuth portal. The user will be prompted to allow Airbyte access to their data.
    • Once they hit accept, they will be redirected back to Airbyte. Behind the scenes, Airbyte will store the refresh token (this is how OAuth is normally supposed to work).
  • This is better because now the user doesn't need to worry about refresh tokens at all.

How

  • This isn't that easy to do ...
  • Right now all integration related code runs inside the workers (docker containers). This is great for creating hermetic environments to run integration code.
  • OAuth relies on the browser to work. So while we may be able to offload some of the work to the worker (e.g. constructing the correct request and extracting the refresh token from the response), ultimately the requests need to be made in the browser, not in Docker containers, so that users can properly approve the transaction.
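
One way to picture the split is a small piece of server-side bookkeeping that sits between the browser redirect and the worker: worker code builds the consent URL, the browser visits it, and the server matches the callback to the integration that started it. This is only a sketch; `PendingConsents` and its methods are hypothetical names, not existing Airbyte code.

```python
import secrets
from urllib.parse import parse_qs, urlparse


class PendingConsents:
    """Tracks consent requests that are currently out in a user's browser."""

    def __init__(self):
        self._by_state = {}

    def start(self, integration, build_url):
        """Return the URL the browser should open.

        `build_url` stands in for worker code that knows how to construct
        the provider-specific request from a `state` value.
        """
        state = secrets.token_urlsafe(16)
        self._by_state[state] = integration
        return build_url(state)

    def finish(self, callback_url):
        """Match the browser's callback to its integration and return the code.

        The code would then be handed back to the worker for the token exchange.
        """
        params = parse_qs(urlparse(callback_url).query)
        state = params.get("state", [""])[0]
        integration = self._by_state.pop(state, None)
        if integration is None:
            raise ValueError("unknown or already used state")
        return integration, params["code"][0]
```

Each `state` value is accepted once, so a replayed callback is rejected rather than triggering a second token exchange.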

┆Issue is synchronized with this Asana task by Unito

@cgardens cgardens added the type/enhancement New feature or request label Oct 31, 2020
@cgardens cgardens changed the title UI Oauth Support OAuth for Integrations in Airbyte UI Oct 31, 2020
@cgardens cgardens self-assigned this Jan 25, 2021
@cgardens cgardens removed their assignment Feb 1, 2021
@cgardens cgardens modified the milestones: 2021/02/05, 20201/02/12 Feb 8, 2021
@cgardens cgardens assigned sherifnada and cgardens and unassigned sherifnada Feb 8, 2021
@cgardens
Contributor Author

cgardens commented Feb 13, 2021

First pass of the tech spec is done and reviewed. Next steps are to describe the technical details of the approach in more depth.

@cgardens cgardens modified the milestones: 20201/02/12, 2021/02/19 Feb 15, 2021
@cgardens
Contributor Author

We have verbal agreement on the approach and need to update the spec to reflect it. On @michel-tricot's suggestion, we will start out by not putting the OAuth code in connector containers and instead add it as a switch statement in core, based on the insight that we think we will be able to write an abstraction that doesn't require per-integration work.
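
As a rough illustration of what "a switch statement in core" could look like, here is a sketch where each integration contributes only data and one generic function does the work. A dictionary lookup plays the role of the switch. The registry, its entries, and `consent_url` are all hypothetical; the actual spec may look quite different.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class OAuthSettings:
    authorize_url: str
    token_url: str
    scopes: tuple


# Hypothetical table kept in core. Names and URLs are placeholders.
OAUTH_SETTINGS = {
    "source-a": OAuthSettings(
        "https://a.example.com/authorize", "https://a.example.com/token", ("read",)
    ),
    "source-b": OAuthSettings(
        "https://b.example.com/authorize", "https://b.example.com/token", ("read", "write")
    ),
}


def consent_url(integration, client_id, redirect_uri, state):
    """One generic implementation; integrations differ only in their settings."""
    settings = OAUTH_SETTINGS.get(integration)
    if settings is None:
        raise ValueError(f"no OAuth settings registered for {integration!r}")
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(settings.scopes),
        "state": state,
    })
    return f"{settings.authorize_url}?{query}"
```

If an integration turns out to need behaviour that cannot be expressed as settings, that would be the signal to move its OAuth code back toward the connector.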

@cgardens cgardens modified the milestones: 2021-02-19, 2021-02-26 Feb 23, 2021
@cgardens cgardens removed their assignment Feb 23, 2021
@cgardens cgardens removed this from the 2021-02-26 milestone Feb 23, 2021
@cgardens cgardens modified the milestone: Core - 2021-03-12 Mar 8, 2021
@cgardens
Contributor Author

@jim-barlow

jim-barlow commented Mar 18, 2021

@davinchia asked me to add my thoughts to this issue as we have the same need, but for the Facebook Graph API, which is pretty complex as per their docs. When you generate an access token for this API via their Graph Explorer you need to specify the Facebook App, whether it's a User or Page Token and then add the specific permissions (i.e. scopes) needed to get data from the API. However it would be great if the following OAuth flow was possible via the UI:

  1. Connect to Facebook Authentication via button in UI
  2. Login popup with exact permissions pre-defined (not sure whether it makes sense to have an Airbyte Facebook app or rely on users to create their own)
  3. Initial access token passed to Airbyte and debugged for expiry
  4. Initial access token (1 hr access) exchanged for long-lived token (3 months access)

An additional step that would be great to automate is the periodic refresh of the token (e.g. every month) so that the data would not stop flowing if the user forgot to manually refresh/exchange the token. Currently we do this via a Python script in a Colab notebook, which is obviously not ideal, but we are only doing this on one account every 2-3 months.
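
Steps 3 and 4 and the periodic refresh could be sketched roughly as below. The `debug_token` and `fb_exchange_token` request shapes follow Facebook's documented Graph API endpoints as I understand them, so please check them against the current docs; the API version segment is left out, and the function names and the 30-day margin are assumptions made for illustration. The sketch only builds URLs and does not send anything.

```python
import time
from urllib.parse import urlencode

GRAPH_BASE = "https://graph.facebook.com"  # API version segment omitted here


def debug_token_url(input_token, app_access_token):
    """Step 3: URL for inspecting a token, including when it expires."""
    query = urlencode({"input_token": input_token, "access_token": app_access_token})
    return f"{GRAPH_BASE}/debug_token?{query}"


def long_lived_exchange_url(app_id, app_secret, short_lived_token):
    """Step 4: URL for swapping a short-lived token for a long-lived one."""
    query = urlencode({
        "grant_type": "fb_exchange_token",
        "client_id": app_id,
        "client_secret": app_secret,
        "fb_exchange_token": short_lived_token,
    })
    return f"{GRAPH_BASE}/oauth/access_token?{query}"


def needs_refresh(expires_at, now=None, margin_seconds=30 * 24 * 3600):
    """True once the token is within `margin_seconds` of its expiry time."""
    now = time.time() if now is None else now
    return expires_at - now <= margin_seconds
```

A scheduler could call `needs_refresh` before each sync and run the exchange again whenever it returns true, which would replace the manual notebook step.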

@cgardens
Contributor Author

Thanks @jimbeepbeep. I think the common pattern is that every time we replicate data, we will use the refresh token to get a new access token if needed. So hopefully the second half of your comment is pretty much already part of our common pattern.
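
A minimal sketch of that pattern, assuming the provider returns an access token together with its lifetime in seconds: keep the access token until shortly before it expires, then trade the refresh token for a new one. `AccessTokenCache` and the injected `fetch_access_token` callable are hypothetical names, not an existing Airbyte or CDK API.

```python
import time


class AccessTokenCache:
    """Hands out a valid access token, refreshing it only when needed."""

    def __init__(self, refresh_token, fetch_access_token, clock=time.time, skew_seconds=60):
        self._refresh_token = refresh_token
        # fetch_access_token(refresh_token) -> (access_token, expires_in_seconds)
        self._fetch = fetch_access_token
        self._clock = clock
        self._skew = skew_seconds
        self._access_token = None
        self._expires_at = 0.0

    def get(self):
        # Refresh slightly early so a token does not expire mid-request.
        if self._access_token is None or self._clock() >= self._expires_at - self._skew:
            token, expires_in = self._fetch(self._refresh_token)
            self._access_token = token
            self._expires_at = self._clock() + expires_in
        return self._access_token
```

Calling `get()` at the start of every replication run gives the "if needed" behaviour: most calls return the cached token, and the refresh request only happens near expiry.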

Login popup with exact permissions pre-defined

For this part, what values need to be predefined? And should they be predefined in an Airbyte UI or the FB UI?

(not sure whether it makes sense to have an Airbyte Facebook app or rely on users to create their own)

So far we are planning to have the user create their own. Airbyte is focused on letting a user have 100% control of their data. If we use an Airbyte FB app, then the user is giving Airbyte access to their data. The downside is it requires a little extra setup for the user.

Let me know if this all made sense or if you have any other thoughts!

@jim-barlow

I think the common pattern is that every time we replicate data we will use the refresh token to get a new access token if needed. So hopefully the second half of your comment is pretty much already part of our common pattern.

That is great, makes total sense.

Login popup with exact permissions pre-defined
For this part, what values need to be predefined? And should they be predefined in an Airbyte UI or the FB UI?

I think these could be preset in the connector config; for our use case it's a specific and unchanging set.

(not sure whether it makes sense to have an Airbyte Facebook app or rely on users to create their own)
So far we are planning to have the user create their own. Airbyte is focused on letting a user have 100% control of their data. If we use an Airbyte FB app, then the user is giving Airbyte access to their data. The downside is it requires a little extra setup for the user.

Agreed, that makes more sense. There are app-linked quotas too which should be user responsibility.

@manish-GP

manish-GP commented Jun 18, 2021

When is it getting released?

@cgardens
Contributor Author

@manish-GP thanks for your interest. It is likely something we'll be tackling in the second half of the summer.

@tweinreich

tweinreich commented Aug 31, 2021

I am currently thinking about building connectors to Personio and Weclapp which also use mechanisms similar to what @cgardens describes.

For Personio a token has to be sent to an API endpoint to obtain another token which is valid for one request to another API endpoint (documentation). Weclapp uses a token that can be generated once (documentation).
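
To make the single-use token idea concrete, here is a generic sketch with the HTTP layer injected as plain callables. It is not based on the actual Personio or Weclapp APIs; `SingleUseTokenClient`, `obtain_token`, and `send` are hypothetical, and the option for a response to hand back the next token is an assumption added for illustration.

```python
class SingleUseTokenClient:
    """Sends each request with a token that is never reused."""

    def __init__(self, obtain_token, send):
        self._obtain_token = obtain_token  # () -> token
        self._send = send                  # (path, token) -> (body, next_token or None)
        self._next_token = None

    def request(self, path):
        # Use the token handed back by the previous response if there was one,
        # otherwise ask the auth endpoint for a fresh one.
        token = self._next_token or self._obtain_token()
        self._next_token = None  # the token counts as spent once it is sent
        body, next_token = self._send(path, token)
        self._next_token = next_token
        return body
```

A connector that wraps its HTTP calls this way keeps the token handling in one place, whatever the CDK ends up providing.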

Can anyone explain to me if the "best practice" currently is to implement this in the connector on my own or does the CDK already contain methods to achieve this?

P.S. I hope this is the right place to ask, if not I can of course open a new issue for this.

@sherifnada
Contributor

@tweinreich the best way to implement oauth right now is to have the connector accept a refresh/access token and perform the oauth flow by hand outside of airbyte.

When this ticket closes it will be possible to perform the oauth flow directly in the UI.

@tweinreich

@sherifnada thank you for the clarification.

@sherifnada
Contributor

OAuth is available on Airbyte Cloud; there is currently no ETA for OSS support. Closing this issue; we should open a separate one for OSS as needed.

@thomas-vl
Contributor

@sherifnada is there an update on OAuth for the OSS version? I'm confused about why it's not possible for OSS but is possible on the Cloud version. What are the blockers?

@sherifnada
Contributor

@thomas-vl There's currently no timeline. To accurately gauge interest in this, I've opened an issue to track OAuth in OSS specifically: #13021. Please leave a 👍🏼 to help us prioritize.
