Skip to content

Conversation

pedroslopez
Copy link
Contributor

@pedroslopez pedroslopez commented Sep 4, 2025

close https://github.com/airbytehq/airbyte-internal-issues/issues/14219

Adds an optional context attribute on manifest requests to include workspace ID and builder project ID, which will be attached as span attributes for datadog. These are also logged so we know when we're processing a request for a given workspace when reading through the logs.

This will help debug potential issues more easily.

I also moved the auth dependency into a helpers module

Summary by CodeRabbit

  • New Features
    • Added optional request context with workspace and project identifiers across manifest endpoints to enhance traceability and observability. These fields are optional and backward compatible.
    • Automatically applies trace tags when context is provided, enabling better correlation in tracing systems without impacting request handling.
  • Refactor
    • Internal reorganization of authentication and tracing helpers for cleaner structure (no user-facing changes).
  • Tests
    • Updated tests to reflect internal module path changes.

@github-actions github-actions bot added the enhancement New feature or request label Sep 4, 2025
Copy link

github-actions bot commented Sep 4, 2025

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

Testing This CDK Version

You can test this version of the CDK using the following:

# Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pedro/dep-trace-tags#egg=airbyte-python-cdk[dev]' --help

# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pedro/dep-trace-tags

Helpful Resources

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /autofix - Fixes most formatting and linting issues
  • /poetry-lock - Updates poetry.lock file
  • /test - Runs connector tests with the updated CDK
  • /poe build - Regenerate git-committed build artifacts, such as the pydantic models which are generated from the manifest JSON schema in YAML.
  • /poe <command> - Runs any poe command in the CDK environment

📝 Edit this welcome message.

Copy link

github-actions bot commented Sep 4, 2025

PyTest Results (Fast)

3 752 tests  ±0   3 740 ✅ ±0   6m 28s ⏱️ -1s
    1 suites ±0      12 💤 ±0 
    1 files   ±0       0 ❌ ±0 

Results for commit 73dffd0. ± Comparison against base commit 5378cef.

This pull request removes 9 and adds 9 tests. Note that renamed tests count towards both.
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_empty_secret_allows_all_requests
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_expired_token_raises_401
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_invalid_token_raises_401
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_malformed_token_raises_401
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_missing_credentials_with_secret_raises_401
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_no_secret_allows_all_requests
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_token_without_required_claims_passes
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_valid_token_passes
unit_tests.manifest_server.test_auth.TestVerifyJwtToken ‑ test_wrong_secret_raises_401
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_empty_secret_allows_all_requests
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_expired_token_raises_401
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_invalid_token_raises_401
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_malformed_token_raises_401
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_missing_credentials_with_secret_raises_401
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_no_secret_allows_all_requests
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_token_without_required_claims_passes
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_valid_token_passes
unit_tests.manifest_server.helpers.test_auth.TestVerifyJwtToken ‑ test_wrong_secret_raises_401

♻️ This comment has been updated with latest results.

Copy link

github-actions bot commented Sep 4, 2025

PyTest Results (Full)

3 755 tests  +3 755   3 743 ✅ +3 743   11m 5s ⏱️ + 11m 5s
    1 suites +    1      12 💤 +   12 
    1 files   +    1       0 ❌ ±    0 

Results for commit 73dffd0. ± Comparison against base commit 5378cef.

♻️ This comment has been updated with latest results.

@pedroslopez pedroslopez changed the title feat: add trace tags feat: add trace tags for manifest-server Sep 4, 2025
@pedroslopez pedroslopez marked this pull request as ready for review September 4, 2025 14:54
@Copilot Copilot AI review requested due to automatic review settings September 4, 2025 14:54
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds optional tracing functionality to the manifest-server by introducing context attributes that can include workspace ID and builder project ID for DataDog observability and logging.

  • Adds optional context parameter to all manifest API request models
  • Creates tracing helper functions to apply span attributes and log request context
  • Refactors auth module location to a helpers directory

Reviewed Changes

Copilot reviewed 6 out of 9 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
airbyte_cdk/manifest_server/api_models/manifest.py Adds RequestContext model and optional context field to all request models
airbyte_cdk/manifest_server/helpers/tracing.py Creates tracing utility function for applying DataDog span tags
airbyte_cdk/manifest_server/routers/manifest.py Integrates tracing calls in all endpoint handlers and updates auth import
airbyte_cdk/manifest_server/dependencies/tracing.py Contains duplicate tracing functionality
airbyte_cdk/manifest_server/api_models/init.py Exports new RequestContext model
unit_tests/manifest_server/helpers/test_auth.py Updates import path for moved auth module

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Contributor

coderabbitai bot commented Sep 4, 2025

📝 Walkthrough

Walkthrough

Introduces a RequestContext model and exposes it via api_models. Adds optional context fields to several request models. Implements a tracing helper to apply workspace_id/project_id tags to the current ddtrace span. Updates manifest router endpoints to apply trace tags when context is provided. Adjusts an auth import path in tests.

Changes

Cohort / File(s) Summary
API models export
airbyte_cdk/manifest_server/api_models/__init__.py
Re-exports RequestContext from manifest and includes it in __all__.
Manifest request models
airbyte_cdk/manifest_server/api_models/manifest.py
Adds RequestContext model; adds optional `context: RequestContext
Tracing utilities
airbyte_cdk/manifest_server/helpers/tracing.py, airbyte_cdk/manifest_server/dependencies/tracing.py
Adds apply_trace_tags_from_context(workspace_id=None, project_id=None) to set ddtrace span tags if present; logs when tags are applied; safely no-ops on errors or missing span.
Router updates
airbyte_cdk/manifest_server/routers/manifest.py
Switches auth import to ..helpers.auth; imports apply_trace_tags_from_context; invokes it in test_read, check, discover, resolve, full_resolve when request.context is provided.
Unit tests (auth import path)
unit_tests/manifest_server/helpers/test_auth.py
Updates verify_jwt_token import to airbyte_cdk.manifest_server.helpers.auth.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant Router as Manifest Router
  participant Tracing as apply_trace_tags_from_context
  participant DD as ddtrace.current_span()
  participant Core as Manifest Operations

  Client->>Router: HTTP request (e.g., test_read/check/discover/resolve)
  alt request.context present
    Router->>Tracing: apply_trace_tags_from_context(workspace_id, project_id)
    Tracing->>DD: get current span
    alt span exists
      Tracing->>DD: setTag("workspace_id", ...), setTag("project_id", ...)
    else no span
      Note over Tracing: No-op
    end
  else no context
    Note over Router: Skip tracing
  end
  Router->>Core: Continue with source/config/catalog handling
  Core-->>Router: Response payload
  Router-->>Client: HTTP response
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • aaronsteers
✨ Finishing Touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch pedro/dep-trace-tags

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

‼️ IMPORTANT
Auto-reply has been disabled for this repository in the CodeRabbit settings. The CodeRabbit bot will not respond to your replies unless it is explicitly tagged.

  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR/Issue comments)

Type @coderabbitai help to get the list of available commands.

Other keywords and placeholders

  • Add @coderabbitai ignore or @coderabbit ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Status, Documentation and Community

  • Visit our Status Page to check the current availability of CodeRabbit.
  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
airbyte_cdk/manifest_server/api_models/manifest.py (1)

23-35: Add missing context to Request models in stream.py
All request models in airbyte_cdk/manifest_server/api_models/stream.py currently lack a context: Optional[RequestContext] = None field—please add it to each *Request class to mirror the manifest models and ensure consistent request handling. wdyt?

🧹 Nitpick comments (8)
airbyte_cdk/manifest_server/helpers/tracing.py (1)

24-26: Downgrade trace logging to debug with structured metadata?

The logger.info(f"Processing request with trace tags: …") appears in both:

  • airbyte_cdk/manifest_server/helpers/tracing.py:24
  • airbyte_cdk/manifest_server/dependencies/tracing.py:32

Would you prefer

logger.debug(
    "Processing request with trace tags",
    extra={"workspace_id": workspace_id, "project_id": project_id},
)

to keep logs structured and less chatty in prod? wdyt?

airbyte_cdk/manifest_server/api_models/manifest.py (2)

16-21: Validate IDs as UUIDs to catch bad inputs early.

Workspace and project IDs are UUIDs in Airbyte; typing them as UUID improves validation and OpenAPI docs. Should we migrate to UUID here? wdyt?

@@
-from typing import Any, List, Optional
+from typing import Any, List, Optional
+from uuid import UUID
@@
-class RequestContext(BaseModel):
+class RequestContext(BaseModel):
@@
-    workspace_id: Optional[str] = None
-    project_id: Optional[str] = None
+    workspace_id: Optional[UUID] = None
+    project_id: Optional[UUID] = None

If backward-compat risk exists, we can accept strings now and add UUID parsing later with a deprecation notice—does that work?


23-35: Reduce repetition by using a ContextMixin base model.

Five request models add the same optional context. Shall we introduce a ContextMixin to DRY this and avoid omissions in future models? wdyt?

@@
 from .dicts import ConnectorConfig, Manifest
@@
+class ContextMixin(BaseModel):
+    context: Optional[RequestContext] = None
+
@@
-class StreamTestReadRequest(BaseModel):
+class StreamTestReadRequest(ContextMixin, BaseModel):
@@
-    slice_limit: int = Field(default=5, ge=1, le=20)
-    context: Optional[RequestContext] = None
+    slice_limit: int = Field(default=5, ge=1, le=20)
@@
-class CheckRequest(BaseModel):
+class CheckRequest(ContextMixin, BaseModel):
@@
-    config: ConnectorConfig
-    context: Optional[RequestContext] = None
+    config: ConnectorConfig
@@
-class DiscoverRequest(BaseModel):
+class DiscoverRequest(ContextMixin, BaseModel):
@@
-    config: ConnectorConfig
-    context: Optional[RequestContext] = None
+    config: ConnectorConfig
@@
-class ResolveRequest(BaseModel):
+class ResolveRequest(ContextMixin, BaseModel):
@@
-    manifest: Manifest
-    context: Optional[RequestContext] = None
+    manifest: Manifest
@@
-class FullResolveRequest(BaseModel):
+class FullResolveRequest(ContextMixin, BaseModel):
@@
-    stream_limit: int = Field(default=100, ge=1, le=100)
-    context: Optional[RequestContext] = None
+    stream_limit: int = Field(default=100, ge=1, le=100)

Also applies to: 40-43, 52-58, 66-71, 79-86

airbyte_cdk/manifest_server/routers/manifest.py (5)

72-78: Nice early span tagging; want to DRY this and guard against untrusted context values?

The pattern is duplicated across endpoints. Would you consider a tiny helper to both (a) avoid repetition and (b) safely handle a missing/empty context, wdyt?

Apply this localized change here:

-    # Apply trace tags from context if provided
-    if request.context:
-        apply_trace_tags_from_context(
-            workspace_id=request.context.workspace_id,
-            project_id=request.context.project_id,
-        )
+    _apply_tags_from_request_context(request)

And add this helper once in the module (outside this hunk):

def _apply_tags_from_request_context(req: Any) -> None:
    ctx = getattr(req, "context", None)
    if ctx is None:
        return
    apply_trace_tags_from_context(
        workspace_id=getattr(ctx, "workspace_id", None),
        project_id=getattr(ctx, "project_id", None),
    )

Also, since workspace_id/project_id come from the client payload, are we comfortable with potential tag spoofing in logs/traces? If not, would you want to optionally cross-check against JWT claims (when present) or add a server-side provenance tag (e.g., trace_context_source=client), wdyt?


115-121: Repeat: centralize trace-tagging in a helper to reduce duplication

Same refactor as above to call _apply_tags_from_request_context(request) here, wdyt?

-    # Apply trace tags from context if provided
-    if request.context:
-        apply_trace_tags_from_context(
-            workspace_id=request.context.workspace_id,
-            project_id=request.context.project_id,
-        )
+    _apply_tags_from_request_context(request)

131-137: Repeat: use helper for tagging

Swap the block with a helper call for consistency, wdyt?

-    # Apply trace tags from context if provided
-    if request.context:
-        apply_trace_tags_from_context(
-            workspace_id=request.context.workspace_id,
-            project_id=request.context.project_id,
-        )
+    _apply_tags_from_request_context(request)

149-155: Repeat: use helper for tagging

Same DRY suggestion here, wdyt?

-    # Apply trace tags from context if provided
-    if request.context:
-        apply_trace_tags_from_context(
-            workspace_id=request.context.workspace_id,
-            project_id=request.context.project_id,
-        )
+    _apply_tags_from_request_context(request)

167-173: Repeat: use helper for tagging

Replace with the helper to keep all endpoints consistent, wdyt?

-    # Apply trace tags from context if provided
-    if request.context:
-        apply_trace_tags_from_context(
-            workspace_id=request.context.workspace_id,
-            project_id=request.context.project_id,
-        )
+    _apply_tags_from_request_context(request)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled
  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 5378cef and 7bc84f1.

📒 Files selected for processing (6)
  • airbyte_cdk/manifest_server/api_models/__init__.py (2 hunks)
  • airbyte_cdk/manifest_server/api_models/manifest.py (5 hunks)
  • airbyte_cdk/manifest_server/dependencies/tracing.py (1 hunks)
  • airbyte_cdk/manifest_server/helpers/tracing.py (1 hunks)
  • airbyte_cdk/manifest_server/routers/manifest.py (6 hunks)
  • unit_tests/manifest_server/helpers/test_auth.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
airbyte_cdk/manifest_server/routers/manifest.py (2)
airbyte_cdk/manifest_server/helpers/auth.py (1)
  • verify_jwt_token (11-43)
airbyte_cdk/manifest_server/helpers/tracing.py (1)
  • apply_trace_tags_from_context (9-36)
airbyte_cdk/manifest_server/api_models/manifest.py (1)
airbyte_cdk/manifest_server/api_models/dicts.py (2)
  • Manifest (8-11)
  • ConnectorConfig (14-17)
airbyte_cdk/manifest_server/api_models/__init__.py (1)
airbyte_cdk/manifest_server/api_models/manifest.py (1)
  • RequestContext (16-20)
unit_tests/manifest_server/helpers/test_auth.py (1)
airbyte_cdk/manifest_server/helpers/auth.py (1)
  • verify_jwt_token (11-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Pytest (All, Python 3.11, Ubuntu)
  • GitHub Check: Check: source-shopify
🔇 Additional comments (5)
unit_tests/manifest_server/helpers/test_auth.py (1)

10-10: Import path update looks correct and non-breaking.

The move to helpers.auth aligns with the new module layout; tests should continue to pass unchanged in behavior. wdyt?

airbyte_cdk/manifest_server/helpers/tracing.py (1)

1-36: No missing ddtrace imports or dependency declarations found; pyproject.toml and poetry.lock correctly list ddtrace as an optional dependency and include it in the manifest-server extra.

airbyte_cdk/manifest_server/api_models/__init__.py (2)

15-15: Re-exporting RequestContext looks good.

Package consumers can now import it from a stable path. wdyt?


34-34: all updated correctly.

Nice touch to keep the public surface explicit. wdyt?

airbyte_cdk/manifest_server/routers/manifest.py (1)

32-33: No stale auth imports remain. Verified there are no from airbyte_cdk.manifest_server.auth import verify_jwt_token or from ..auth import verify_jwt_token usages—only the new helpers.auth path is used.

Copy link
Contributor

@aaronsteers aaronsteers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!! 👍👍

@pedroslopez pedroslopez merged commit 4e591f2 into main Sep 4, 2025
28 checks passed
@pedroslopez pedroslopez deleted the pedro/dep-trace-tags branch September 4, 2025 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants