✨ airbyte-cdk - Adds JwtAuthenticator to low-code #37005

Merged
merged 47 commits into from
Apr 19, 2024
Commits
ce29fb5
Initial jwt authenticator build
pnilan Apr 9, 2024
bb0789d
Update to include config eval
pnilan Apr 10, 2024
e12254b
Update typo
pnilan Apr 10, 2024
dca7b06
Updates JWT auth, schema, and factory
pnilan Apr 11, 2024
58df22d
Update property retrieval in component factory
pnilan Apr 11, 2024
c8160c8
add token duration check
pnilan Apr 11, 2024
16f5165
Fix error with exp header setting
pnilan Apr 11, 2024
1b7b006
Update _get_jwt_payload
pnilan Apr 11, 2024
d41e009
Updates jwt.auth to interpolate inputs correctly
pnilan Apr 11, 2024
a1b2965
Clean up redundancy by removing "alg" from jwt headers
pnilan Apr 11, 2024
b25df50
Updates expiration time setting in jwt payload method
pnilan Apr 11, 2024
86809bd
Update jwt schema to remove HS256
pnilan Apr 11, 2024
63db2a6
Merge branch 'master' into pnilan/airbyte-cdk-jwt-auth
pnilan Apr 11, 2024
40717f5
chore: format code
pnilan Apr 11, 2024
0e374b3
Adds base64 secret key encoding, updates jwt.py returns, adds default…
pnilan Apr 15, 2024
70c2deb
Ran `poetry run poe build`
pnilan Apr 15, 2024
7388a46
Imports InterpolateBoolean for base64 encoding
pnilan Apr 15, 2024
b67ec03
Updates jwt.py to convert values to strings for serialization
pnilan Apr 15, 2024
0aa43a6
Updates jwt.py to convert secret key and algorithm to string
pnilan Apr 15, 2024
342cfad
Update how boolean is handled for base64 encoding
pnilan Apr 15, 2024
a927323
Adds base64_encode_secret_key to component factory
pnilan Apr 16, 2024
46ff236
Add dev defined header prefix option
pnilan Apr 16, 2024
111180a
Updates header prefix
pnilan Apr 16, 2024
5bc18dd
Updates get header prefix for brevity
pnilan Apr 16, 2024
e351f3e
Cleanup
pnilan Apr 16, 2024
00db62e
Wraps jwt.encode in try-catch to handle pyjwt errors
pnilan Apr 16, 2024
8db26a5
Adds unit testing for jwt.py and for model_to_component_factory.py
pnilan Apr 16, 2024
0c14620
Updates jwt schema, model, factory, and class to make jwt_headers and…
pnilan Apr 16, 2024
790e7ea
Merge branch 'master' into pnilan/airbyte-cdk-jwt-auth
pnilan Apr 16, 2024
ae0cef5
chore: format code
pnilan Apr 16, 2024
dd6377b
Updates Algorithm to be enumeration based on PyJWT supported algorithms
pnilan Apr 17, 2024
252f863
Updates unit tests for updated algorithm enumeration
pnilan Apr 17, 2024
0284d30
chore: format code
pnilan Apr 17, 2024
be03266
Merge branch 'master' into pnilan/airbyte-cdk-jwt-auth
pnilan Apr 17, 2024
7515a54
Remove unused import
pnilan Apr 17, 2024
2a419e7
Updates jwt.py, model_to_componet_factory, and relevant tests to reso…
pnilan Apr 17, 2024
47c4604
Update jwt.py for `Any` type when throwing exception
pnilan Apr 17, 2024
4b8512d
Adds explanations of JWT authenticator methods
pnilan Apr 17, 2024
a7559cb
chore: format code
pnilan Apr 17, 2024
2772d66
Updated for linting
pnilan Apr 17, 2024
0e4b8e2
reverts changelog, pyproject.toml, and poetry.lock to previous airbyt…
pnilan Apr 17, 2024
aeca944
Updates JwtAlgorithm to include all supported algos
pnilan Apr 18, 2024
9a32e55
Merge branch 'master' into pnilan/airbyte-cdk-jwt-auth
pnilan Apr 18, 2024
c7bd90b
Reverst CHANGELOG, updates `model_to_component_factory` method condit…
pnilan Apr 18, 2024
0723fee
Update low code documentation to include JwtAuthenticator
pnilan Apr 18, 2024
bb839c9
Fix conditional check error for `create_jwt_authenticator`
pnilan Apr 18, 2024
6075a42
Update `create_jwt_authenticator` jwt_headers and jwt_payload setting
pnilan Apr 18, 2024
@@ -3,7 +3,9 @@
#

from airbyte_cdk.sources.declarative.auth.oauth import DeclarativeOauth2Authenticator
from airbyte_cdk.sources.declarative.auth.jwt import JwtAuthenticator

__all__ = [
    "DeclarativeOauth2Authenticator",
Contributor:

I wonder why SelectiveAuthenticator isn't here.

Contributor Author (@pnilan, Apr 17, 2024):

I don't have a good answer for you. Oauth is re-exported via __init__ but all other authenticators are accessed directly from their files.

Contributor:

Probably not intentional. It shouldn't matter, as these classes should only be used through the YAML interface.

    "JwtAuthenticator"
Contributor:

For consistency, should this also be DeclarativeJWTAuthenticator?

Contributor Author:

It seems the oauth2 auth would be the outlier: ApiKeyAuthenticator, BearerAuthenticator, BasicHttpAuthenticator, etc don't explicitly note declarative.

]
170 changes: 170 additions & 0 deletions airbyte-cdk/python/airbyte_cdk/sources/declarative/auth/jwt.py
@@ -0,0 +1,170 @@
#
# Copyright (c) 2023 Airbyte, Inc., all rights reserved.
#

import base64
from dataclasses import InitVar, dataclass
from datetime import datetime
from typing import Any, Mapping, Optional, Union

import jwt
from airbyte_cdk.sources.declarative.auth.declarative_authenticator import DeclarativeAuthenticator
from airbyte_cdk.sources.declarative.interpolation.interpolated_boolean import InterpolatedBoolean
from airbyte_cdk.sources.declarative.interpolation.interpolated_mapping import InterpolatedMapping
from airbyte_cdk.sources.declarative.interpolation.interpolated_string import InterpolatedString


class JwtAlgorithm(str):
    """
    Enum for supported JWT algorithms
    """

    HS256 = "HS256"
    HS384 = "HS384"
    HS512 = "HS512"
    ES256 = "ES256"
    ES256K = "ES256K"
    ES384 = "ES384"
    ES512 = "ES512"
    RS256 = "RS256"
    RS384 = "RS384"
    RS512 = "RS512"
    PS256 = "PS256"
    PS384 = "PS384"
    PS512 = "PS512"
    EdDSA = "EdDSA"


@dataclass
class JwtAuthenticator(DeclarativeAuthenticator):
Contributor:

I have no idea what our naming convention is — do we want acronyms to be all caps? I.e. JWTAuthenticator or JwtAuthenticator?

Contributor:

I'm not attached to it but I find JwtAuthenticator a bit easier to read

Contributor Author (@pnilan, Apr 15, 2024):

I followed ApiKeyAuthenticator's lead.

    """
    Generates a JSON Web Token (JWT) based on a declarative connector configuration file. The generated token is attached to each request via the Authorization header.

    Attributes:
        config (Mapping[str, Any]): The user-provided configuration as specified by the source's spec
        secret_key (Union[InterpolatedString, str]): The secret key used to sign the JWT
        algorithm (Union[str, JwtAlgorithm]): The algorithm used to sign the JWT
        token_duration (Optional[int]): The duration in seconds for which the token is valid
        base64_encode_secret_key (Optional[Union[InterpolatedBoolean, str, bool]]): Whether to base64 encode the secret key
        header_prefix (Optional[Union[InterpolatedString, str]]): The prefix to add to the Authorization header
        kid (Optional[Union[InterpolatedString, str]]): The key identifier to be included in the JWT header
        typ (Optional[Union[InterpolatedString, str]]): The type of the JWT.
        cty (Optional[Union[InterpolatedString, str]]): The content type of the JWT.
        iss (Optional[Union[InterpolatedString, str]]): The issuer of the JWT.
        sub (Optional[Union[InterpolatedString, str]]): The subject of the JWT.
        aud (Optional[Union[InterpolatedString, str]]): The audience of the JWT.
        additional_jwt_headers (Optional[Mapping[str, Any]]): Additional headers to include in the JWT.
        additional_jwt_payload (Optional[Mapping[str, Any]]): Additional payload to include in the JWT.
    """

    config: Mapping[str, Any]
    parameters: InitVar[Mapping[str, Any]]
    secret_key: Union[InterpolatedString, str]
    algorithm: Union[str, JwtAlgorithm]
    token_duration: Optional[int]
    base64_encode_secret_key: Optional[Union[InterpolatedBoolean, str, bool]] = False
    header_prefix: Optional[Union[InterpolatedString, str]] = None
    kid: Optional[Union[InterpolatedString, str]] = None
    typ: Optional[Union[InterpolatedString, str]] = None
    cty: Optional[Union[InterpolatedString, str]] = None
    iss: Optional[Union[InterpolatedString, str]] = None
    sub: Optional[Union[InterpolatedString, str]] = None
    aud: Optional[Union[InterpolatedString, str]] = None
    additional_jwt_headers: Optional[Mapping[str, Any]] = None
    additional_jwt_payload: Optional[Mapping[str, Any]] = None

    def __post_init__(self, parameters: Mapping[str, Any]) -> None:
        self._secret_key = InterpolatedString.create(self.secret_key, parameters=parameters)
        self._algorithm = JwtAlgorithm(self.algorithm) if isinstance(self.algorithm, str) else self.algorithm
        self._base64_encode_secret_key = (
            InterpolatedBoolean(self.base64_encode_secret_key, parameters=parameters)
            if isinstance(self.base64_encode_secret_key, str)
            else self.base64_encode_secret_key
        )
        self._token_duration = self.token_duration
Contributor:

probably also need to call InterpolatedString.create since self.token_duration can also be a string

Contributor Author:

I've updated this to make token duration only an integer.

Contributor:

this means the value can't be interpolated. Is that ok?

        self._header_prefix = InterpolatedString.create(self.header_prefix, parameters=parameters) if self.header_prefix else None
        self._kid = InterpolatedString.create(self.kid, parameters=parameters) if self.kid else None
        self._typ = InterpolatedString.create(self.typ, parameters=parameters) if self.typ else None
        self._cty = InterpolatedString.create(self.cty, parameters=parameters) if self.cty else None
        self._iss = InterpolatedString.create(self.iss, parameters=parameters) if self.iss else None
        self._sub = InterpolatedString.create(self.sub, parameters=parameters) if self.sub else None
        self._aud = InterpolatedString.create(self.aud, parameters=parameters) if self.aud else None
        self._additional_jwt_headers = InterpolatedMapping(self.additional_jwt_headers or {}, parameters=parameters)
        self._additional_jwt_payload = InterpolatedMapping(self.additional_jwt_payload or {}, parameters=parameters)

    def _get_jwt_headers(self) -> dict[str, Any]:
        """ "
        Builds and returns the headers used when signing the JWT.
        """
        headers = self._additional_jwt_headers.eval(self.config)
        if any(prop in headers for prop in ["kid", "alg", "typ", "cty"]):
            raise ValueError("'kid', 'alg', 'typ', 'cty' are reserved headers and should not be set as part of 'additional_jwt_headers'")

        if self._kid:
            headers["kid"] = self._kid.eval(self.config)
        if self._typ:
            headers["typ"] = self._typ.eval(self.config)
        if self._cty:
            headers["cty"] = self._cty.eval(self.config)
        headers["alg"] = self._algorithm
        return headers

    def _get_jwt_payload(self) -> dict[str, Any]:
        """
        Builds and returns the payload used when signing the JWT.
        """
        now = int(datetime.now().timestamp())
        exp = now + self._token_duration if isinstance(self._token_duration, int) else now
Contributor:

self._token_duration can only be an int as far as I can tell

Contributor Author:

This was the result of a mypy flag. The issue is that I set token_duration as optional in the schema, so it has type Optional[int] even though I provide a default value. It will never be None, but because the auto-generated model types it as Optional[int], I can't keep mypy from flagging it. Are you aware of any workarounds?

Contributor Author:

Also, re "this means the [token_duration] value can't be interpolated. Is that ok?"

I can't think of a scenario where a connector dev would want to use a string to define the token duration. It's in the spec that the iat (issued at), exp (expires at), and nbf (not before) claims be seconds since epoch. Instead of directly defining the exp claim, I have the dev set the token duration in seconds (typically the max value allowed by the API) and use it to set exp dynamically. This seems to me a better approach than expecting a connector dev to define iat and exp directly via Jinja expressions (e.g. exp: {{ int(datetime.now().timestamp() + 1200) }}).
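
A minimal sketch of the approach described in this thread: derive iat, exp, and nbf from an integer token duration. The names `build_time_claims`, `now`, and `DEFAULT_TOKEN_DURATION` are hypothetical and not part of this PR. Unlike the diff, which sets exp to now when the duration is not an int, this sketch falls back to the schema default of 1200 seconds. The explicit `is None` check is one common way to narrow Optional[int] to int for a type checker.

```python
import time
from typing import Dict, Optional

# Assumption: mirrors the schema default for token_duration shown in this PR.
DEFAULT_TOKEN_DURATION = 1200  # seconds


def build_time_claims(token_duration: Optional[int], now: Optional[int] = None) -> Dict[str, int]:
    """Return iat/exp/nbf as integer seconds since the epoch."""
    issued_at = int(time.time()) if now is None else now
    # Narrow Optional[int] to int so the arithmetic below type-checks.
    duration = DEFAULT_TOKEN_DURATION if token_duration is None else token_duration
    return {"iat": issued_at, "exp": issued_at + duration, "nbf": issued_at}
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it would be omitted.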

        nbf = now

        payload = self._additional_jwt_payload.eval(self.config)
        if any(prop in payload for prop in ["iss", "sub", "aud", "iat", "exp", "nbf"]):
            raise ValueError(
                "'iss', 'sub', 'aud', 'iat', 'exp', 'nbf' are reserved properties and should not be set as part of 'additional_jwt_payload'"
            )

        if self._iss:
            payload["iss"] = self._iss.eval(self.config)
        if self._sub:
            payload["sub"] = self._sub.eval(self.config)
        if self._aud:
            payload["aud"] = self._aud.eval(self.config)
        payload["iat"] = now
        payload["exp"] = exp
        payload["nbf"] = nbf
        return payload

    def _get_secret_key(self) -> str:
        """
        Returns the secret key used to sign the JWT.
        """
        secret_key: str = self._secret_key.eval(self.config)
        return base64.b64encode(secret_key.encode()).decode() if self._base64_encode_secret_key else secret_key

    def _get_signed_token(self) -> Union[str, Any]:
        """
        Signed the JWT using the provided secret key and algorithm and the generated headers and payload. For additional information on PyJWT see: https://pyjwt.readthedocs.io/en/stable/
        """
        try:
            return jwt.encode(
                payload=self._get_jwt_payload(),
                key=self._get_secret_key(),
                algorithm=self._algorithm,
                headers=self._get_jwt_headers(),
            )
        except Exception as e:
            raise ValueError(f"Failed to sign token: {e}")

    def _get_header_prefix(self) -> Union[str, None]:
        """
        Returns the header prefix to be used when attaching the token to the request.
        """
        return self._header_prefix.eval(self.config) if self._header_prefix else None

    @property
    def auth_header(self) -> str:
        return "Authorization"

    @property
    def token(self) -> str:
        return f"{self._get_header_prefix()} {self._get_signed_token()}" if self._get_header_prefix() else self._get_signed_token()
@@ -257,13 +257,15 @@ definitions:
            - "$ref": "#/definitions/BearerAuthenticator"
            - "$ref": "#/definitions/CustomAuthenticator"
            - "$ref": "#/definitions/OAuthAuthenticator"
            - "$ref": "#/definitions/JwtAuthenticator"
            - "$ref": "#/definitions/NoAuth"
            - "$ref": "#/definitions/SessionTokenAuthenticator"
            - "$ref": "#/definitions/LegacySessionTokenAuthenticator"
        examples:
          - authenticators:
              token: "#/definitions/ApiKeyAuthenticator"
              oauth: "#/definitions/OAuthAuthenticator"
              jwt: "#/definitions/JwtAuthenticator"
      $parameters:
        type: object
        additionalProperties: true
@@ -833,6 +835,127 @@ definitions:
      $parameters:
        type: object
        additionalProperties: true
  JwtAuthenticator:
    title: JWT Authenticator
    description: Authenticator for requests using JWT authentication flow.
    type: object
    required:
      - type
      - secret_key
      - algorithm
    properties:
      type:
        type: string
        enum: [JwtAuthenticator]
      secret_key:
        type: string
        description: Secret used to sign the JSON web token.
        examples:
          - "{{ config['secret_key'] }}"
      base64_encode_secret_key:
        type: boolean
        description: When set to true, the secret key will be base64 encoded prior to being encoded as part of the JWT. Only set to "true" when required by the API.
        default: False
      algorithm:
        type: string
        description: Algorithm used to sign the JSON web token.
        enum:
          [
            "HS256",
            "HS384",
            "HS512",
            "ES256",
            "ES256K",
Contributor:

this enum value isn't in jwt.py

Contributor Author:

Good catch. Updated.
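
One way to catch this kind of drift between the schema and the Python class is a small consistency check. The sketch below is an assumption-laden illustration, not code from this PR: `JwtAlgorithmEnum`, `SCHEMA_ALGORITHMS`, and `missing_from_enum` are hypothetical names, and the list is copied by hand from the schema enum in this diff. Note that `JwtAlgorithm` as shown in the diff subclasses `str` only, so `JwtAlgorithm(value)` would not by itself reject an unknown name; the sketch uses `enum.Enum` so that it does.

```python
from enum import Enum
from typing import Iterable, List


class JwtAlgorithmEnum(str, Enum):
    """Hypothetical Enum-based variant; unknown values raise ValueError."""

    HS256 = "HS256"
    HS384 = "HS384"
    HS512 = "HS512"
    ES256 = "ES256"
    ES256K = "ES256K"
    ES384 = "ES384"
    ES512 = "ES512"
    RS256 = "RS256"
    RS384 = "RS384"
    RS512 = "RS512"
    PS256 = "PS256"
    PS384 = "PS384"
    PS512 = "PS512"
    EdDSA = "EdDSA"


# Copied by hand from the schema's `algorithm.enum` list in this diff.
SCHEMA_ALGORITHMS: List[str] = [
    "HS256", "HS384", "HS512", "ES256", "ES256K", "ES384", "ES512",
    "RS256", "RS384", "RS512", "PS256", "PS384", "PS512", "EdDSA",
]


def missing_from_enum(schema_values: Iterable[str]) -> List[str]:
    """Return schema values that have no matching enum member."""
    known = {member.value for member in JwtAlgorithmEnum}
    return [value for value in schema_values if value not in known]
```

A unit test asserting that `missing_from_enum(SCHEMA_ALGORITHMS)` is empty would flag a value such as ES256K being present in one place and absent in the other.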

"ES384",
"ES512",
"RS256",
"RS384",
"RS512",
"PS256",
"PS384",
"PS512",
"EdDSA",
          ]
        examples:
          - ES256
          - HS256
          - RS256
          - "{{ config['algorithm'] }}"
      token_duration:
        type: integer
        title: Token Duration
        description: The amount of time in seconds a JWT token can be valid after being issued.
        default: 1200
        examples:
          - 1200
Contributor:

is there a common default value we can use here?

Contributor Author:

I haven't seen any consistency, although 1200s (20 min) is the smallest duration I've seen -- maybe a good default.

That being said, does this mean if a sync takes longer than 20 minutes it would fail? Should I include some sort of refresh mechanism? Or, is the authenticator re-instantiated per read?

Contributor:

good question! can we refresh the token at runtime? this is how we do it for oauth authenticators

Contributor Author:

@girarda Actually, I believe we're good: each time the request is prepared it will invoke _get_jwt_headers, which will "refresh" the expiration time (and therefore refresh the token, as the token is the headers, payload, and secret_key all encoded into a single string).
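
The refresh behaviour discussed here can be sketched with a property that rebuilds the time claims on every access. In the diff as shown, the expiry is computed in `_get_jwt_payload`, which `_get_signed_token` calls each time the `token` property is read; whether that property is read once per request depends on the base authenticator and is assumed here. `FreshClaims` and its `clock` parameter are hypothetical stand-ins for illustration.

```python
import time
from typing import Callable, Dict


class FreshClaims:
    """Hypothetical stand-in for the authenticator: nothing is cached between reads."""

    def __init__(self, token_duration: int, clock: Callable[[], float] = time.time) -> None:
        self._token_duration = token_duration
        self._clock = clock  # injectable so the behaviour can be checked without sleeping

    @property
    def claims(self) -> Dict[str, int]:
        # Recomputed on every access, so exp always trails the current time by token_duration.
        now = int(self._clock())
        return {"iat": now, "exp": now + self._token_duration, "nbf": now}


# Usage: two reads 1500 seconds apart yield two different expiry times.
ticks = iter([1000, 2500])
auth = FreshClaims(1200, clock=lambda: next(ticks))
first = auth.claims
second = auth.claims
```

Under that assumption, a sync that runs longer than the token duration would still send unexpired tokens, at the cost of signing a new token for each request.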

          - 3600
      header_prefix:
        type: string
        title: Header Prefix
        description: The prefix to be used within the Authentication header.
        examples:
          - "Bearer"
          - "Basic"
      jwt_headers:
Contributor:

are any of those properties required?

Contributor Author:

No, but they are the most common properties.

        type: object
        title: JWT Headers
        description: JWT headers used when signing JSON web token.
        additionalProperties: false
        properties:
          kid:
            type: string
            title: Key Identifier
            description: Private key ID for user account.
            examples:
              - "{{ config['kid'] }}"
          typ:
            type: string
            title: Type
            description: The media type of the complete JWT.
            default: JWT
            examples:
              - JWT
Contributor:

let's set JWT as a default value

Contributor Author:

Agreed. Incorporated.

          cty:
            type: string
            title: Content Type
            description: Content type of JWT header.
            examples:
              - JWT
      additional_jwt_headers:
        type: object
        title: Additional JWT Headers
        description: Additional headers to be included with the JWT headers object.
        additionalProperties: true
      jwt_payload:
        type: object
        title: JWT Payload
        description: JWT Payload used when signing JSON web token.
        additionalProperties: false
        properties:
          iss:
            type: string
            title: Issuer
            description: The user/principal that issued the JWT. Commonly a value unique to the user.
            examples:
              - "{{ config['iss'] }}"
          sub:
            type: string
            title: Subject
            description: The subject of the JWT. Commonly defined by the API.
          aud:
            type: string
            title: Audience
            description: The recipient that the JWT is intended for. Commonly defined by the API.
            examples:
              - "appstoreconnect-v1"
      additional_jwt_payload:
        type: object
        title: Additional JWT Payload Properties
        description: Additional properties to be added to the JWT payload.
        additionalProperties: true
      $parameters:
        type: object
        additionalProperties: true
  OAuthAuthenticator:
    title: OAuth2
    description: Authenticator for requests using OAuth 2.0 authorization flow.
@@ -1311,6 +1434,7 @@ definitions:
- "$ref": "#/definitions/BearerAuthenticator"
- "$ref": "#/definitions/CustomAuthenticator"
- "$ref": "#/definitions/OAuthAuthenticator"
- "$ref": "#/definitions/JwtAuthenticator"
- "$ref": "#/definitions/NoAuth"
- "$ref": "#/definitions/SessionTokenAuthenticator"
- "$ref": "#/definitions/LegacySessionTokenAuthenticator"
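
For orientation, an authenticator definition that satisfies the JwtAuthenticator schema in this diff might look like the following, written here as a Python dict rather than YAML. The values are taken from the schema's own examples; the nesting of `jwt_headers` and `jwt_payload` is inferred from the schema as shown, and `REQUIRED_FIELDS` and `missing_required` are hypothetical helpers, not part of the CDK.

```python
from typing import Any, List, Mapping

# The three fields listed under `required` in the schema.
REQUIRED_FIELDS = ("type", "secret_key", "algorithm")

jwt_authenticator_definition: Mapping[str, Any] = {
    "type": "JwtAuthenticator",
    "secret_key": "{{ config['secret_key'] }}",
    "algorithm": "ES256",
    "token_duration": 1200,  # seconds; the schema default
    "header_prefix": "Bearer",
    "jwt_headers": {"kid": "{{ config['kid'] }}", "typ": "JWT"},
    "jwt_payload": {"iss": "{{ config['iss'] }}", "aud": "appstoreconnect-v1"},
}


def missing_required(definition: Mapping[str, Any]) -> List[str]:
    """Return the required schema fields that the definition does not set."""
    return [field for field in REQUIRED_FIELDS if field not in definition]
```

Everything other than the three required fields is optional, so a minimal definition needs only `type`, `secret_key`, and `algorithm`.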