-
Notifications
You must be signed in to change notification settings - Fork 30
feat: add trace tags for manifest-server #744
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This CDK VersionYou can test this version of the CDK using the following: # Run the CLI from this branch:
uvx 'git+https://github.com/airbytehq/airbyte-python-cdk.git@pedro/dep-trace-tags#egg=airbyte-python-cdk[dev]' --help
# Update a connector to use the CDK from this branch ref:
cd airbyte-integrations/connectors/source-example
poe use-cdk-branch pedro/dep-trace-tags Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
|
PyTest Results (Fast)3 752 tests ±0 3 740 ✅ ±0 6m 28s ⏱️ -1s Results for commit 73dffd0. ± Comparison against base commit 5378cef. This pull request removes 9 and adds 9 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds optional tracing functionality to the manifest-server by introducing context attributes that can include workspace ID and builder project ID for DataDog observability and logging.
- Adds optional
context
parameter to all manifest API request models - Creates tracing helper functions to apply span attributes and log request context
- Refactors auth module location to a
helpers
directory
Reviewed Changes
Copilot reviewed 6 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
File | Description |
---|---|
airbyte_cdk/manifest_server/api_models/manifest.py | Adds RequestContext model and optional context field to all request models |
airbyte_cdk/manifest_server/helpers/tracing.py | Creates tracing utility function for applying DataDog span tags |
airbyte_cdk/manifest_server/routers/manifest.py | Integrates tracing calls in all endpoint handlers and updates auth import |
airbyte_cdk/manifest_server/dependencies/tracing.py | Contains duplicate tracing functionality |
airbyte_cdk/manifest_server/api_models/init.py | Exports new RequestContext model |
unit_tests/manifest_server/helpers/test_auth.py | Updates import path for moved auth module |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
📝 WalkthroughWalkthroughIntroduces a RequestContext model and exposes it via api_models. Adds optional context fields to several request models. Implements a tracing helper to apply workspace_id/project_id tags to the current ddtrace span. Updates manifest router endpoints to apply trace tags when context is provided. Adjusts an auth import path in tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Client
participant Router as Manifest Router
participant Tracing as apply_trace_tags_from_context
participant DD as ddtrace.current_span()
participant Core as Manifest Operations
Client->>Router: HTTP request (e.g., test_read/check/discover/resolve)
alt request.context present
Router->>Tracing: apply_trace_tags_from_context(workspace_id, project_id)
Tracing->>DD: get current span
alt span exists
Tracing->>DD: setTag("workspace_id", ...), setTag("project_id", ...)
else no span
Note over Tracing: No-op
end
else no context
Note over Router: Skip tracing
end
Router->>Core: Continue with source/config/catalog handling
Core-->>Router: Response payload
Router-->>Client: HTTP response
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
airbyte_cdk/manifest_server/api_models/manifest.py (1)
23-35
: Add missingcontext
to Request models in stream.py
All request models inairbyte_cdk/manifest_server/api_models/stream.py
currently lack acontext: Optional[RequestContext] = None
field—please add it to each*Request
class to mirror the manifest models and ensure consistent request handling. wdyt?
🧹 Nitpick comments (8)
airbyte_cdk/manifest_server/helpers/tracing.py (1)
24-26
: Downgrade trace logging to debug with structured metadata?The
logger.info(f"Processing request with trace tags: …")
appears in both:
airbyte_cdk/manifest_server/helpers/tracing.py:24
airbyte_cdk/manifest_server/dependencies/tracing.py:32
Would you prefer
logger.debug( "Processing request with trace tags", extra={"workspace_id": workspace_id, "project_id": project_id}, )to keep logs structured and less chatty in prod? wdyt?
airbyte_cdk/manifest_server/api_models/manifest.py (2)
16-21
: Validate IDs as UUIDs to catch bad inputs early.Workspace and project IDs are UUIDs in Airbyte; typing them as UUID improves validation and OpenAPI docs. Should we migrate to UUID here? wdyt?
@@ -from typing import Any, List, Optional +from typing import Any, List, Optional +from uuid import UUID @@ -class RequestContext(BaseModel): +class RequestContext(BaseModel): @@ - workspace_id: Optional[str] = None - project_id: Optional[str] = None + workspace_id: Optional[UUID] = None + project_id: Optional[UUID] = NoneIf backward-compat risk exists, we can accept strings now and add UUID parsing later with a deprecation notice—does that work?
23-35
: Reduce repetition by using a ContextMixin base model.Five request models add the same optional context. Shall we introduce a
ContextMixin
to DRY this and avoid omissions in future models? wdyt?@@ from .dicts import ConnectorConfig, Manifest @@ +class ContextMixin(BaseModel): + context: Optional[RequestContext] = None + @@ -class StreamTestReadRequest(BaseModel): +class StreamTestReadRequest(ContextMixin, BaseModel): @@ - slice_limit: int = Field(default=5, ge=1, le=20) - context: Optional[RequestContext] = None + slice_limit: int = Field(default=5, ge=1, le=20) @@ -class CheckRequest(BaseModel): +class CheckRequest(ContextMixin, BaseModel): @@ - config: ConnectorConfig - context: Optional[RequestContext] = None + config: ConnectorConfig @@ -class DiscoverRequest(BaseModel): +class DiscoverRequest(ContextMixin, BaseModel): @@ - config: ConnectorConfig - context: Optional[RequestContext] = None + config: ConnectorConfig @@ -class ResolveRequest(BaseModel): +class ResolveRequest(ContextMixin, BaseModel): @@ - manifest: Manifest - context: Optional[RequestContext] = None + manifest: Manifest @@ -class FullResolveRequest(BaseModel): +class FullResolveRequest(ContextMixin, BaseModel): @@ - stream_limit: int = Field(default=100, ge=1, le=100) - context: Optional[RequestContext] = None + stream_limit: int = Field(default=100, ge=1, le=100)Also applies to: 40-43, 52-58, 66-71, 79-86
airbyte_cdk/manifest_server/routers/manifest.py (5)
72-78
: Nice early span tagging; want to DRY this and guard against untrusted context values?The pattern is duplicated across endpoints. Would you consider a tiny helper to both (a) avoid repetition and (b) safely handle a missing/empty context, wdyt?
Apply this localized change here:
- # Apply trace tags from context if provided - if request.context: - apply_trace_tags_from_context( - workspace_id=request.context.workspace_id, - project_id=request.context.project_id, - ) + _apply_tags_from_request_context(request)And add this helper once in the module (outside this hunk):
def _apply_tags_from_request_context(req: Any) -> None: ctx = getattr(req, "context", None) if ctx is None: return apply_trace_tags_from_context( workspace_id=getattr(ctx, "workspace_id", None), project_id=getattr(ctx, "project_id", None), )Also, since
workspace_id
/project_id
come from the client payload, are we comfortable with potential tag spoofing in logs/traces? If not, would you want to optionally cross-check against JWT claims (when present) or add a server-side provenance tag (e.g.,trace_context_source=client
), wdyt?
115-121
: Repeat: centralize trace-tagging in a helper to reduce duplicationSame refactor as above to call
_apply_tags_from_request_context(request)
here, wdyt?- # Apply trace tags from context if provided - if request.context: - apply_trace_tags_from_context( - workspace_id=request.context.workspace_id, - project_id=request.context.project_id, - ) + _apply_tags_from_request_context(request)
131-137
: Repeat: use helper for taggingSwap the block with a helper call for consistency, wdyt?
- # Apply trace tags from context if provided - if request.context: - apply_trace_tags_from_context( - workspace_id=request.context.workspace_id, - project_id=request.context.project_id, - ) + _apply_tags_from_request_context(request)
149-155
: Repeat: use helper for taggingSame DRY suggestion here, wdyt?
- # Apply trace tags from context if provided - if request.context: - apply_trace_tags_from_context( - workspace_id=request.context.workspace_id, - project_id=request.context.project_id, - ) + _apply_tags_from_request_context(request)
167-173
: Repeat: use helper for taggingReplace with the helper to keep all endpoints consistent, wdyt?
- # Apply trace tags from context if provided - if request.context: - apply_trace_tags_from_context( - workspace_id=request.context.workspace_id, - project_id=request.context.project_id, - ) + _apply_tags_from_request_context(request)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled
- Linear integration is disabled
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
airbyte_cdk/manifest_server/api_models/__init__.py
(2 hunks)airbyte_cdk/manifest_server/api_models/manifest.py
(5 hunks)airbyte_cdk/manifest_server/dependencies/tracing.py
(1 hunks)airbyte_cdk/manifest_server/helpers/tracing.py
(1 hunks)airbyte_cdk/manifest_server/routers/manifest.py
(6 hunks)unit_tests/manifest_server/helpers/test_auth.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
airbyte_cdk/manifest_server/routers/manifest.py (2)
airbyte_cdk/manifest_server/helpers/auth.py (1)
verify_jwt_token
(11-43)airbyte_cdk/manifest_server/helpers/tracing.py (1)
apply_trace_tags_from_context
(9-36)
airbyte_cdk/manifest_server/api_models/manifest.py (1)
airbyte_cdk/manifest_server/api_models/dicts.py (2)
Manifest
(8-11)ConnectorConfig
(14-17)
airbyte_cdk/manifest_server/api_models/__init__.py (1)
airbyte_cdk/manifest_server/api_models/manifest.py (1)
RequestContext
(16-20)
unit_tests/manifest_server/helpers/test_auth.py (1)
airbyte_cdk/manifest_server/helpers/auth.py (1)
verify_jwt_token
(11-43)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Check: source-shopify
🔇 Additional comments (5)
unit_tests/manifest_server/helpers/test_auth.py (1)
10-10
: Import path update looks correct and non-breaking.The move to helpers.auth aligns with the new module layout; tests should continue to pass unchanged in behavior. wdyt?
airbyte_cdk/manifest_server/helpers/tracing.py (1)
1-36
: No missing ddtrace imports or dependency declarations found; pyproject.toml and poetry.lock correctly list ddtrace as an optional dependency and include it in themanifest-server
extra.airbyte_cdk/manifest_server/api_models/__init__.py (2)
15-15
: Re-exporting RequestContext looks good.Package consumers can now import it from a stable path. wdyt?
34-34
: all updated correctly.Nice touch to keep the public surface explicit. wdyt?
airbyte_cdk/manifest_server/routers/manifest.py (1)
32-33
: No stale auth imports remain. Verified there are nofrom airbyte_cdk.manifest_server.auth import verify_jwt_token
orfrom ..auth import verify_jwt_token
usages—only the newhelpers.auth
path is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!! 👍👍
close https://github.com/airbytehq/airbyte-internal-issues/issues/14219
Adds an optional
context
attribute on manifest requests to include workspace ID and builder project ID, which will be attached as span attributes for datadog. These are also logged so we know when we're processing a request for a given workspace when reading through the logs.This will help debug potential issues more easily.
I also moved the auth dependency into a
helpers
moduleSummary by CodeRabbit