Skip to content

Conversation

@dinmukhamedm
Copy link
Member

@dinmukhamedm dinmukhamedm commented Nov 16, 2025

Note

Adds Claude Agent SDK auto-instrumentation (with local proxy and span-context handoff), updates deps, and adds comprehensive tests; bumps version to 0.7.23a1.

  • Instrumentation:
    • Introduces ClaudeAgentInstrumentor in opentelemetry/instrumentation/claude_agent/__init__.py with async, streaming, and module-level wrappers (ClaudeSDKClient.connect/query/receive_*, claude_agent_sdk.query, create_sdk_mcp_server).
    • Wires initializer in tracing/_instrument_initializers.py and registers enum in tracing/instruments.py (Instruments.CLAUDE_AGENT).
  • Proxy integration:
    • Adds opentelemetry/instrumentation/claude_agent/proxy.py to manage a local Claude proxy (start/stop, port selection, env var swap, span-context publication).
  • SDK:
    • Fixes Laminar.use_span behavior when uninitialized (uses contextmanager correctly).
  • Dependencies:
    • Adds extras claude-agent-sdk (via lmnr-claude-code-proxy) and dev dep claude-agent-sdk>=0.1.6 in pyproject.toml.
  • Versioning:
    • Bumps version to 0.7.23a1 in pyproject.toml and src/lmnr/version.py.
  • Tests:
    • Adds mock transport tests/.../mock_transport.py and proxy cleanup fixture.
    • New tests for SDK client, module query, alias import, and tools (test_claude_sdk_client.py, test_query.py, test_query_with_alias.py, test_tool.py).
    • Adjusts tests/test_initialize.py fixture to preserve/restore Laminar state.

Written by Cursor Bugbot for commit b7d751d. This will update automatically on new commits. Configure here.


Important

Add Claude Agent SDK with auto-instrumentation, proxy management, and tests, updating dependencies and version.

  • Instrumentation:
    • Add ClaudeAgentInstrumentor in __init__.py for claude_agent_sdk with span input/output and streaming support.
    • Implement proxy management in proxy.py (start/stop server, env var routing, trace propagation).
    • Register instrumentor via ClaudeAgentInstrumentorInitializer in _instrument_initializers.py and instruments.py.
  • Tests:
    • Add tests for client flow, module query, and tools in test_claude_sdk_client.py, test_query.py, test_query_with_alias.py, and test_tool.py.
    • Provide MockClaudeTransport for simulated I/O in mock_transport.py.
  • Dependencies & Version:
    • Add extras claude-agent-sdk and lmnr-claude-code-proxy; bump version to 0.7.23a1 in pyproject.toml and version.py.

This description was created by Ellipsis for b7d751d. You can customize this summary. It will automatically update as commits are pushed.

Rainhunter13 and others added 9 commits November 13, 2025 22:38
…on (#209)

* wip: initial version to support ClaudeSDKClient.query() instrumentation

* add async generator wrapping for capturing output

* chore test

* chore is_streaming field

* wip: fixing query() test

* fix wrapping standalone query() function

* fix async gen function span staying active during the next functon call
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed everything up to d35952a in 2 minutes and 2 seconds. Click for details.
  • Reviewed 1110 lines of code in 11 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. pyproject.toml:70
  • Draft comment:
    Ensure version constraints for claude-agent dependencies are consistent between optional and dev sections (e.g. 'lmnr-claude-code-proxy>=0.1.0a2' vs 'claude-agent-sdk>=0.1.6').
  • Reason this comment was not posted:
    Comment looked like it was already resolved.
2. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:374
  • Draft comment:
    Avoid 'await release_proxy()' since release_proxy() is a synchronous function. Either remove the await or convert release_proxy() to an async function.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment claims release_proxy() is synchronous and shouldn't be awaited. However, I cannot see the implementation of release_proxy in this diff - it's in a separate file (.proxy). The evidence shows inconsistent usage: in _wrap_async it's called without await (line 308), but in _wrap_async_gen it's awaited (line 375). Similarly, start_proxy() is called without await in _wrap_async (line 289) but with await in _wrap_async_gen (line 326). This could mean: (1) the functions have both sync and async versions, (2) there's a bug in one of the usages, or (3) the comment is incorrect. Without seeing the actual implementation of release_proxy, I cannot definitively say the comment is correct. The inconsistency within the same file is suspicious though. I'm making an assumption about whether release_proxy() is sync or async without seeing its implementation. The inconsistent usage patterns in the code could indicate either a bug or that there are multiple versions of these functions. I should not assume the comment is correct without strong evidence. Given the rule that I need STRONG EVIDENCE to keep a comment, and I cannot see the implementation of release_proxy() in this diff, I don't have strong evidence. The inconsistent usage could be intentional (different contexts require different approaches) or could be a bug in the existing code rather than the line being commented on. Without access to the proxy module implementation, I cannot verify the claim. I should delete this comment because I cannot verify from the diff alone whether release_proxy() is sync or async. The inconsistent usage patterns in the code are not sufficient evidence to confirm the comment is correct, and understanding this requires seeing the implementation in another file.
3. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:194
  • Draft comment:
    Consider logging exceptions in _record_input instead of silently passing to aid in debugging.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a new file being added, so technically everything in it is "new code" or a "change". However, the comment is suggesting a code quality improvement (logging exceptions instead of silently passing). The rules state that "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This comment is actionable and clear. However, I need to consider: 1) Is there strong evidence this is the right approach? The silent exception handling might be intentional for instrumentation code that shouldn't disrupt the main application flow. 2) Looking at similar patterns in the codebase, _record_output also silently passes exceptions, suggesting this is a deliberate pattern. 3) In instrumentation/observability code, it's often a best practice to fail silently to avoid breaking the instrumented application. The comment doesn't acknowledge this trade-off. The silent exception handling might be intentional design for instrumentation code - observability tooling typically should not break the application it's monitoring. The comment doesn't consider this context and may be suggesting a change that could make the instrumentation more intrusive. Without seeing other similar instrumentation files in this codebase, I can't be certain this is the wrong pattern. While the silent exception handling might be intentional, the comment suggests logging (not raising) exceptions, which would maintain the non-intrusive behavior while improving debuggability. However, there's a logger available in the file, and if the developers wanted logging here, they likely would have added it. The pattern is consistent across both _record_input and _record_output, suggesting it's deliberate. This appears to be a subjective code quality suggestion without strong evidence it's necessary. This comment suggests a code quality improvement but lacks strong evidence that it's necessary. The silent exception handling appears to be an intentional pattern for instrumentation code (used consistently in both _record_input and _record_output), and the comment doesn't acknowledge the trade-offs involved. Since I need strong evidence to keep a comment and this is more of a subjective suggestion, I should delete it.
4. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:210
  • Draft comment:
    Consider logging exceptions in _record_output rather than silently passing, to help diagnose issues with output serialization.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a code quality suggestion about error handling. The comment is about code that was added in this diff (it's a new file). However, looking at the pattern in the codebase, _record_input uses the exact same silent exception handling pattern. The comment is suggesting a refactor/improvement rather than pointing out a bug. According to the rules, "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This comment is actionable and clear. However, I need to consider: 1) Is this obviously important enough to warrant a comment? 2) Is there strong evidence this is correct? The fact that _record_input uses the same pattern suggests this might be an intentional design choice. The comment doesn't acknowledge this consistency. Also, the rules say "Do NOT comment unless there is clearly a code change required" - this is more of a "nice to have" suggestion rather than a clear requirement. The comment doesn't acknowledge that _record_input uses the exact same pattern, which suggests this might be an intentional design choice across the instrumentation code. If logging were truly important here, it would likely be needed in both places. The comment might be suggesting an inconsistency that doesn't actually exist, or suggesting a change that goes against the established pattern in this code. While the pattern is consistent between _record_input and _record_output, that doesn't necessarily mean it's the right pattern. However, the rules state that comments should point out clear code changes that are required, not suggest improvements. This is more of a "consider doing X" suggestion rather than identifying a clear problem. Without strong evidence that the silent exception handling is causing issues, this falls into the category of speculative improvements. This comment suggests a code quality improvement but doesn't identify a clear bug or required change. The silent exception handling is consistent with the pattern used in _record_input, suggesting it's intentional. Without strong evidence that this is causing problems, this is more of a speculative improvement suggestion rather than a required fix.

Workflow ID: wflow_wwEXhmESWW7Ugdxk

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

…e_agent/__init__.py

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
@dinmukhamedm dinmukhamedm merged commit 0ff547b into main Nov 17, 2025
8 checks passed
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Changes requested ❌

Reviewed b7d751d in 1 minute and 53 seconds. Click for details.
  • Reviewed 378 lines of code in 10 files
  • Skipped 0 files when reviewing.
  • Skipped posting 8 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. pyproject.toml:70
  • Draft comment:
    Dependency version bumped to 'lmnr-claude-code-proxy>=0.1.0a3'. Please ensure that any breaking changes or new features are documented (e.g. in CHANGELOG).
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to ensure that breaking changes or new features are documented, which falls under the rule of not asking the author to ensure things. It also relates to dependency changes, which should not be commented on unless it's about confirming intentionality or specific issues.
2. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/proxy.py:70
  • Draft comment:
    Using a background thread with a fixed GRACE_TO_FLUSH_SECONDS delay for stopping the proxy is acceptable, but ensure that this delay does not lead to race conditions in concurrent environments or tests.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is asking the PR author to ensure that the delay does not lead to race conditions, which is against the rules. It does not provide a specific suggestion or ask for a specific test to be written. Therefore, it should be removed.
3. tests/test_initialize.py:12
  • Draft comment:
    The fixture nicely saves and restores Laminar's internal state. Consider adding a short comment explaining why manual state management is critical in tests.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50% None
4. tests/test_instrumentations/test_claude_agent/conftest.py:4
  • Draft comment:
    The 'cleanup_claude_proxy' fixture efficiently cleans up the proxy server before and after tests. Ensure that suppressing errors in _cleanup_proxy does not hide underlying issues during debugging.
  • Reason this comment was not posted:
    Confidence changes required: 30% <= threshold 50% None
5. tests/test_instrumentations/test_claude_agent/test_claude_sdk_client.py:9
  • Draft comment:
    The test for ClaudeSDKClient verifies span creation and attributes correctly. Confirm that using 'asyncio.run' aligns with your event loop policy and that the order of spans is deterministic.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None
6. tests/test_instrumentations/test_claude_agent/test_query.py:12
  • Draft comment:
    The query test validates that the span attributes (name, path, output) are correctly set. Remove any commented-out prompt placeholders if they are no longer needed.
  • Reason this comment was not posted:
    Confidence changes required: 40% <= threshold 50% None
7. tests/test_instrumentations/test_claude_agent/test_query_with_alias.py:9
  • Draft comment:
    The alias test for 'query' confirms that both the original and aliased functions produce the same span. Ensure consistent naming conventions are maintained.
  • Reason this comment was not posted:
    Confidence changes required: 40% <= threshold 50% None
8. tests/test_instrumentations/test_claude_agent/test_tool.py:14
  • Draft comment:
    Tool tests correctly define decorated tools and validate the generated spans. Be cautious with hard-coded span counts (e.g. expecting 9 spans) which might change if instrumentation behavior is updated.
  • Reason this comment was not posted:
    Confidence changes required: 50% <= threshold 50% None

Workflow ID: wflow_vv4fQqg60Q38x9gq

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

try:
result = await wrapped(*args, **kwargs)
except Exception as e: # pylint: disable=broad-except
if original_base_url is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The async wrapper (_wrap_async) duplicates environment restoration logic for ANTHROPIC_BASE_URL. Consider extracting this into a helper function to avoid duplication and potential inconsistencies.

@dinmukhamedm dinmukhamedm deleted the feat/claude-agent-sdk branch November 17, 2025 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants