Skip to content
This repository has been archived by the owner on Sep 26, 2022. It is now read-only.

oauth: add oauth_client_id and oauth_client_secret to connectors #32

Merged
merged 19 commits into from
Aug 4, 2022

Conversation

mdibaiee
Copy link
Member

@mdibaiee mdibaiee commented Jul 1, 2022

  • Adds oauth_client_id and oauth_client_secret to connectors table
  • Revokes select access on the table as a whole, and grants select on all columns except oauth_client_secret to authenticated role
  • Adds functions that allow the UI to generate an Authorization URL for a connector as well as generating the access token once we have code and state
  • Adds a function that can be used to encrypt the spec of the connector, with the intention of adding client_id and client_secret to the spec before encryption.
  • Agent now updates connectors table with oauth2_spec of the connector

This change is Reviewable

@mdibaiee mdibaiee mentioned this pull request Jul 1, 2022
@mdibaiee mdibaiee force-pushed the mahdi/oauth-columns branch 3 times, most recently from b9a95fc to 53b5516 Compare July 7, 2022 16:05
Copy link
Member

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The core function of this looks great. A couple of comments on secondary bits.

"version": "lts"
},
// Enable docker-from-docker.
"docker-from-docker": {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I spent a bunch of time trying to get docker-from-docker working in estuary/flow and ran into a weird issue w/ connectors i was unable to resolve.

That said, i've also been playing with .devcontainer changes in an animated-carnival branch and the docker in docker feature appears to work fine? 🤷

This looks like it was copied from the flow changes I made and then backed out. I'd recommend stripping .devcontainer changes from this PR and let's land that separately. Thanks

@@ -0,0 +1,28 @@
-- OAuth Client ID and Client Secret are associated uniquely with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out this chunk of the readme -- in essence, rather than adding a new migration here, you should be updating the existing connectors "create table" statement.

Then, add an actual migration under supabase/pending which will migrate the production DB from the current schema to your updated one. This migration is checked in and PR'd but is ephemeral and will be removed once applied to the DB, leaving only the pristine schema.

Check out this PR for an example of this workflow.

@mdibaiee mdibaiee force-pushed the mahdi/oauth-columns branch 3 times, most recently from f039802 to d92870d Compare July 8, 2022 09:37
-- don't expose details of open_graph raw responses & patching and oauth2 secret
revoke select on table connectors from authenticated;
-- authenticated may select other columns for all connectors connectors.
grant select(id, detail, updated_at, created_at, image_name, external_url, open_graph, oauth2_client_id) on table connectors to authenticated;
Copy link
Member Author

@mdibaiee mdibaiee Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this locally, apparently the way we had the migration initially did not actually restrict access to open_graph_raw and open_graph_patch, you can try this by going to dashboard.estuary.dev/connectors, and re-sending the XHR request with an additional open_graph_raw, and you will get that column back.

curl 'https://eyrcnmuzzyriypdajwdk.supabase.co/rest/v1/connectors?select=open_graph_raw&offset=0&limit=10' -H 'apikey: $KEY' -H 'Authorization: Bearer $TOKEN

In contrast, if you grant only specific columns and not the ones you want accessible by users, it seems to work. So trying to fetch the restricted columns I now get a 403.

cc @jgraettinger @psFried @saterus

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed!. TIL.

nit: I think you can drop the revoke though, since select of the table is never granted.

Copy link
Member

@jgraettinger jgraettinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM % comments below

-- don't expose details of open_graph raw responses & patching and oauth2 secret
revoke select on table connectors from authenticated;
-- authenticated may select other columns for all connectors connectors.
grant select(id, detail, updated_at, created_at, image_name, external_url, open_graph, oauth2_client_id) on table connectors to authenticated;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed!. TIL.

nit: I think you can drop the revoke though, since select of the table is never granted.

@@ -145,12 +145,14 @@ impl TagHandler {
#[serde(rename = "type")]
protocol: String,
resource_spec_schema: Box<RawValue>,
oauth2_spec: Option<Box<RawValue>>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're using a catch-all oauth2_spec document in the DB schema without any constraints on its structure. We may want to have the DB more formally structure separate columns for each of access_token_url_template, access_token_body_json, etc. I'm going to assume you considered that and opted for a single-document approach -- for example, to allow for making future changes without DB migrations, though we also shouldn't be afraid of DB migrations -- and if this isn't the case, it's worth considering.

In that case, I think we should minimally have a struct with a serde-enforced schema that's parsed into, if only to then serialize it back out again. Otherwise, we open up the possibility of silent schema mismatches that occur only when someone tries to actually authorize.

What I'll specifically suggest is that we have a protobuf message type for oauth2_spec (which I believe you're working on already?), and that we then use it here through its generated Rust struct + serde implementations.

If you want to come back and do that later once that type is merged, that's fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jgraettinger Good point. We already do have an OAuth2Spec in our flow protocol now (PR here), I will update this to deserialize and serialize again just to verify 👍🏽

Copy link
Member Author

@mdibaiee mdibaiee Jul 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jgraettinger I just tried this, and there seems to be a discrepancy between how proto types are serialized/deserialized to/from JSON between Go and Rust: in Go they are snake_case, while in Rust they are camelCase.

In Go we are serializing flow.OAuth2Spec, and it comes out with snake_case keys: https://github.com/estuary/flow/pull/570/files#diff-6f46f9561151d2babc68df1479eaef0ae774c36e2eb09a4902325adc0b08dc00R28

"oauth2Spec":{"provider":"google","auth_url_template":"https://accounts.google.com/o/oauth2/auth?access_type=offline\u0026client_id={{ client_id }}\u0026redirect_uri={{ redirect_uri }}\u0026response_type=code\u0026scope=email\u0026state={{ state }}","access_token_url_template":"https://oauth2.googleapis.com/token","access_token_body_json":{"grant_type":"authorization_code","client_id":"{{ client_id }}","client_secret":"{{ client_secret }}","redirect_uri":"{{ redirect_uri }}","code":"{{ code }}"}

However, in Rust, trying to deserialize this, I get:

Caused by:
    unknown field `auth_url_template`, expected one of `provider`, `authUrlTemplate`, `accessTokenUrlTemplate`, `accessTokenBody`, `accessTokenHeaders`, `refreshTokenUrlTemplate`, `refreshTokenBody`, `refreshTokenHeaders` at line 1 column 2414)

btw, I did try removing the rename_all = "camelCase" from Spec in connector_tags.rs but I still had the same issue.

I can look into making these two compatible (i.e. both doing camelCase or both doing snake_case), but I'm afraid it might have a large blast radius. Any thoughts on this? I can also work around this by specifically parsing as snake_case in this instance in Rust and then serializing to camelCase before persisting it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've explained here what is blocking me from verifying this: #34

I think for now we can proceed with the raw JSON, but we should resolve that issue.

@mdibaiee mdibaiee force-pushed the mahdi/oauth-columns branch 2 times, most recently from 7568fe5 to 08c7e2f Compare July 11, 2022 11:04
@mdibaiee
Copy link
Member Author

Added a new operation encrypt-config which allows the frontend to send encryption requests, with the difference that this function will add credentials/client_id and credentials/client_secret to the spec before sending it to encryption service. (cc @jgraettinger @travjenkins @psFried)

It can be called like so:

curl -i --location --request POST 'http://localhost:5431/functions/v1/' \
   --header 'Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzdXBhYmFzZS1kZW1vIiwicm9sZSI6ImFub24ifQ.625_WdcF3KHqz5amU0x2X5WWHP-OEs_4qj0ssLNHzTs' \
   --header 'Content-Type: application/json' \
   --data @request.json

If the client_id and client_secret are in any way exposed in the response of the encryption service, this function will avoid sending them back to client and will error instead. Try it by removing secret: true from either client_id or client_secret.

request.json:

{
  "operation": "encrypt-config",
  "connector_id": "06:ab:80:41:d8:00:5c:00",
  "config": {"credentials": {}, "greetings": 10},
  "schema": {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "title": "Test Source Spec",
    "type": "object",
    "required": [
      "greetings"
    ],
    "properties": {
      "credentials": {
        "title": "Authentication",
        "description": "Google API Credentials",
        "type": "object",
        "required": [
          "client_id",
          "client_secret"
        ],
        "properties": {
          "client_id": {
            "title": "Client ID",
            "type": "string",
            "secret": true
          },
          "client_secret": {
            "title": "Client Secret",
            "type": "string",
            "secret": true
          },
          "refresh_token": {
            "title": "Refresh Token",
            "type": "string",
            "secret": true
          }
        },
        "x-oauth2-provider": "google"
      },
      "greetings": {
        "title": "Number of Greetings",
        "description": "Number of greeting documents to produce when running in non-tailing mode",
        "default": 1000,
        "type": "integer"
      }
    }
  }
}

@mdibaiee mdibaiee merged commit 1f8c360 into main Aug 4, 2022
@mdibaiee mdibaiee deleted the mahdi/oauth-columns branch August 4, 2022 14:48
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants