Skip to content

Conversation

lazebnyi
Copy link
Contributor

@lazebnyi lazebnyi commented Jan 30, 2025

What

Some sources use JWT authentication and store all parameters in the 'config' field in JSON format (e.g., Google Sheets). This approach makes it problematic to natively use this JSON-format config without a custom implementation.

How

Pass json.loads to all interpolated fields in the JWT authenticator.

Summary by CodeRabbit

  • New Features

    • Enhanced JWT authentication with improved JSON data processing
    • Added test coverage for secret key retrieval from configuration
  • Tests

    • Introduced new test method to validate secret key configuration handling

@github-actions github-actions bot added the enhancement New feature or request label Jan 30, 2025
@lazebnyi
Copy link
Contributor Author

lazebnyi commented Jan 30, 2025

/autofix

Auto-Fix Job Info

This job attempts to auto-fix any linting or formating issues. If any fixes are made,
those changes will be automatically committed and pushed back to the PR.

Note: This job can only be run by maintainers. On PRs from forks, this command requires
that the PR author has enabled the Allow edits from maintainers option.

PR auto-fix job started... Check job output.

✅ Changes applied successfully.

@lazebnyi lazebnyi requested a review from maxi297 January 30, 2025 18:02
Copy link
Contributor

coderabbitai bot commented Jan 30, 2025

📝 Walkthrough

Walkthrough

The pull request introduces modifications to the JWT authentication mechanism in the Airbyte CDK. The changes primarily focus on enhancing JSON processing within the JwtAuthenticator class by adding json.loads to various evaluation methods. A corresponding unit test has been added to verify the secret key retrieval functionality from configuration, expanding the test coverage for the JWT authentication component.

Changes

File Change Summary
airbyte_cdk/sources/declarative/auth/jwt.py - Added json import
- Updated methods to include json_loads=json.loads in eval() calls
- Corrected docstring formatting
unit_tests/sources/declarative/auth/test_jwt.py - Added new test method test_get_secret_key_from_config() to validate secret key retrieval

Sequence Diagram

sequenceDiagram
    participant JwtAuthenticator
    participant Config
    participant JSON
    
    JwtAuthenticator->>Config: Retrieve secret key
    Config-->>JSON: Parse JSON string
    JSON-->>JwtAuthenticator: Return parsed secret key
    JwtAuthenticator->>JwtAuthenticator: Validate and use secret key
Loading
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@pnilan
Copy link
Contributor

pnilan commented Jan 30, 2025

Should we use orjson instead?

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
unit_tests/sources/declarative/auth/test_jwt.py (1)

129-142: The test looks good, but could we add more edge cases? wdyt?

The test effectively covers the happy path for JSON parsing. Consider adding test cases for:

  • Malformed JSON in config
  • Missing keys in the JSON object
  • Nested JSON structures

Example additional test case:

def test_get_secret_key_from_config_malformed_json(self):
    authenticator = JwtAuthenticator(
        config={"secrets": 'invalid json'},
        parameters={},
        secret_key="{{ json_loads(config['secrets'])['secret_key'] }}",
        algorithm="test_algo",
        token_duration=1200,
        base64_encode_secret_key=False,
    )
    with pytest.raises(json.JSONDecodeError):
        authenticator._get_secret_key()
airbyte_cdk/sources/declarative/auth/jwt.py (3)

111-111: Should we add some error handling for JSON parsing? wdyt?

While adding json_loads is great, we might want to catch and handle JSON parsing errors gracefully. Consider wrapping the eval calls in try-except blocks to provide more meaningful error messages.

try:
    headers = self._additional_jwt_headers.eval(self.config, json_loads=json.loads)
except json.JSONDecodeError as e:
    raise ValueError(f"Failed to parse JWT header JSON: {str(e)}")

Also applies to: 118-118, 120-120, 122-122


134-134: Should we apply the same error handling pattern here too?

For consistency with the headers method, consider adding similar JSON parsing error handling.

Also applies to: 141-141, 143-143, 145-145


6-6: Consider centralizing JSON parsing error handling, wdyt?

Since we're adding JSON parsing across multiple methods, we could create a helper method to handle JSON parsing errors consistently. This would reduce code duplication and ensure uniform error handling.

def _safe_json_eval(self, interpolated_value, config: dict, context: str = "") -> Any:
    try:
        return interpolated_value.eval(config, json_loads=json.loads)
    except json.JSONDecodeError as e:
        raise ValueError(f"Failed to parse JSON in {context}: {str(e)}")

Also applies to: 111-111, 134-134, 156-156, 181-185

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a6d55be and 7d70970.

📒 Files selected for processing (2)
  • airbyte_cdk/sources/declarative/auth/jwt.py (5 hunks)
  • unit_tests/sources/declarative/auth/test_jwt.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Publish SDM to DockerHub
  • GitHub Check: Check: 'source-pokeapi' (skip=false)
  • GitHub Check: Check: 'source-the-guardian-api' (skip=false)
  • GitHub Check: Check: 'source-shopify' (skip=false)
  • GitHub Check: Check: 'source-hardcoded-records' (skip=false)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Pytest (Fast)
  • GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/auth/jwt.py (3)

6-6: LGTM! Clean import addition.

The json import is appropriately placed with other standard library imports.


181-185: LGTM! Clean and consistent implementation.

The changes align well with the pattern used in other methods.


156-156: Should we validate the secret key after parsing? wdyt?

Consider adding validation to ensure the secret key is non-empty and of the correct type after JSON parsing.

secret_key: str = self._secret_key.eval(self.config, json_loads=json.loads)
if not secret_key:
    raise ValueError("Secret key cannot be empty")

@lazebnyi
Copy link
Contributor Author

@pnilan

Should we use orjson instead?

In that case I don't think that orjson will add a lot in performance, but it will add additional dependencies. So, I think json is preferable here for me.

Copy link
Contributor

@maxi297 maxi297 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this. We can make it global to all interpolation later if needed as it would not be a breaking change from what I can see. Thanks!

@maxi297
Copy link
Contributor

maxi297 commented Jan 30, 2025

I'm not too attached on orjson on this one because it won't be called very often so the impact on performance does not seem very high

@lazebnyi lazebnyi merged commit dea2cc9 into main Jan 30, 2025
26 of 28 checks passed
@lazebnyi lazebnyi deleted the lazebnyi/add-json-loads-to-jwt-authentication-params branch January 30, 2025 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants