-
Notifications
You must be signed in to change notification settings - Fork 9
Feat/claude agent sdk #213
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…on (#209) * wip: initial version to support ClaudeSDKClient.query() instrumentation * add async generator wrapping for capturing output * chore test * chore is_streaming field * wip: fixing query() test * fix wrapping standalone query() function * fix async gen function span staying active during the next functon call
src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed everything up to d35952a in 2 minutes and 2 seconds. Click for details.
- Reviewed
1110lines of code in11files - Skipped
0files when reviewing. - Skipped posting
4draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. pyproject.toml:70
- Draft comment:
Ensure version constraints for claude-agent dependencies are consistent between optional and dev sections (e.g. 'lmnr-claude-code-proxy>=0.1.0a2' vs 'claude-agent-sdk>=0.1.6'). - Reason this comment was not posted:
Comment looked like it was already resolved.
2. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:374
- Draft comment:
Avoid 'await release_proxy()' since release_proxy() is a synchronous function. Either remove the await or convert release_proxy() to an async function. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment claimsrelease_proxy()is synchronous and shouldn't be awaited. However, I cannot see the implementation ofrelease_proxyin this diff - it's in a separate file (.proxy). The evidence shows inconsistent usage: in_wrap_asyncit's called without await (line 308), but in_wrap_async_genit's awaited (line 375). Similarly,start_proxy()is called without await in_wrap_async(line 289) but with await in_wrap_async_gen(line 326). This could mean: (1) the functions have both sync and async versions, (2) there's a bug in one of the usages, or (3) the comment is incorrect. Without seeing the actual implementation ofrelease_proxy, I cannot definitively say the comment is correct. The inconsistency within the same file is suspicious though. I'm making an assumption about whetherrelease_proxy()is sync or async without seeing its implementation. The inconsistent usage patterns in the code could indicate either a bug or that there are multiple versions of these functions. I should not assume the comment is correct without strong evidence. Given the rule that I need STRONG EVIDENCE to keep a comment, and I cannot see the implementation ofrelease_proxy()in this diff, I don't have strong evidence. The inconsistent usage could be intentional (different contexts require different approaches) or could be a bug in the existing code rather than the line being commented on. Without access to the proxy module implementation, I cannot verify the claim. I should delete this comment because I cannot verify from the diff alone whetherrelease_proxy()is sync or async. The inconsistent usage patterns in the code are not sufficient evidence to confirm the comment is correct, and understanding this requires seeing the implementation in another file.
3. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:194
- Draft comment:
Consider logging exceptions in _record_input instead of silently passing to aid in debugging. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a new file being added, so technically everything in it is "new code" or a "change". However, the comment is suggesting a code quality improvement (logging exceptions instead of silently passing). The rules state that "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This comment is actionable and clear. However, I need to consider: 1) Is there strong evidence this is the right approach? The silent exception handling might be intentional for instrumentation code that shouldn't disrupt the main application flow. 2) Looking at similar patterns in the codebase,_record_outputalso silently passes exceptions, suggesting this is a deliberate pattern. 3) In instrumentation/observability code, it's often a best practice to fail silently to avoid breaking the instrumented application. The comment doesn't acknowledge this trade-off. The silent exception handling might be intentional design for instrumentation code - observability tooling typically should not break the application it's monitoring. The comment doesn't consider this context and may be suggesting a change that could make the instrumentation more intrusive. Without seeing other similar instrumentation files in this codebase, I can't be certain this is the wrong pattern. While the silent exception handling might be intentional, the comment suggests logging (not raising) exceptions, which would maintain the non-intrusive behavior while improving debuggability. However, there's a logger available in the file, and if the developers wanted logging here, they likely would have added it. The pattern is consistent across both_record_inputand_record_output, suggesting it's deliberate. This appears to be a subjective code quality suggestion without strong evidence it's necessary. This comment suggests a code quality improvement but lacks strong evidence that it's necessary. The silent exception handling appears to be an intentional pattern for instrumentation code (used consistently in both_record_inputand_record_output), and the comment doesn't acknowledge the trade-offs involved. Since I need strong evidence to keep a comment and this is more of a subjective suggestion, I should delete it.
4. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py:210
- Draft comment:
Consider logging exceptions in _record_output rather than silently passing, to help diagnose issues with output serialization. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% This is a code quality suggestion about error handling. The comment is about code that was added in this diff (it's a new file). However, looking at the pattern in the codebase,_record_inputuses the exact same silent exception handling pattern. The comment is suggesting a refactor/improvement rather than pointing out a bug. According to the rules, "Comments that suggest code quality refactors are good! But only if they are actionable and clear." This comment is actionable and clear. However, I need to consider: 1) Is this obviously important enough to warrant a comment? 2) Is there strong evidence this is correct? The fact that_record_inputuses the same pattern suggests this might be an intentional design choice. The comment doesn't acknowledge this consistency. Also, the rules say "Do NOT comment unless there is clearly a code change required" - this is more of a "nice to have" suggestion rather than a clear requirement. The comment doesn't acknowledge that_record_inputuses the exact same pattern, which suggests this might be an intentional design choice across the instrumentation code. If logging were truly important here, it would likely be needed in both places. The comment might be suggesting an inconsistency that doesn't actually exist, or suggesting a change that goes against the established pattern in this code. While the pattern is consistent between_record_inputand_record_output, that doesn't necessarily mean it's the right pattern. However, the rules state that comments should point out clear code changes that are required, not suggest improvements. This is more of a "consider doing X" suggestion rather than identifying a clear problem. Without strong evidence that the silent exception handling is causing issues, this falls into the category of speculative improvements. This comment suggests a code quality improvement but doesn't identify a clear bug or required change. The silent exception handling is consistent with the pattern used in_record_input, suggesting it's intentional. Without strong evidence that this is causing problems, this is more of a speculative improvement suggestion rather than a required fix.
Workflow ID: wflow_wwEXhmESWW7Ugdxk
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py
Outdated
Show resolved
Hide resolved
…e_agent/__init__.py Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/__init__.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Changes requested ❌
Reviewed b7d751d in 1 minute and 53 seconds. Click for details.
- Reviewed
378lines of code in10files - Skipped
0files when reviewing. - Skipped posting
8draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. pyproject.toml:70
- Draft comment:
Dependency version bumped to 'lmnr-claude-code-proxy>=0.1.0a3'. Please ensure that any breaking changes or new features are documented (e.g. in CHANGELOG). - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%The comment is asking the PR author to ensure that breaking changes or new features are documented, which falls under the rule of not asking the author to ensure things. It also relates to dependency changes, which should not be commented on unless it's about confirming intentionality or specific issues.
2. src/lmnr/opentelemetry_lib/opentelemetry/instrumentation/claude_agent/proxy.py:70
- Draft comment:
Using a background thread with a fixed GRACE_TO_FLUSH_SECONDS delay for stopping the proxy is acceptable, but ensure that this delay does not lead to race conditions in concurrent environments or tests. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =0%<= threshold50%The comment is asking the PR author to ensure that the delay does not lead to race conditions, which is against the rules. It does not provide a specific suggestion or ask for a specific test to be written. Therefore, it should be removed.
3. tests/test_initialize.py:12
- Draft comment:
The fixture nicely saves and restores Laminar's internal state. Consider adding a short comment explaining why manual state management is critical in tests. - Reason this comment was not posted:
Confidence changes required:30%<= threshold50%None
4. tests/test_instrumentations/test_claude_agent/conftest.py:4
- Draft comment:
The 'cleanup_claude_proxy' fixture efficiently cleans up the proxy server before and after tests. Ensure that suppressing errors in _cleanup_proxy does not hide underlying issues during debugging. - Reason this comment was not posted:
Confidence changes required:30%<= threshold50%None
5. tests/test_instrumentations/test_claude_agent/test_claude_sdk_client.py:9
- Draft comment:
The test for ClaudeSDKClient verifies span creation and attributes correctly. Confirm that using 'asyncio.run' aligns with your event loop policy and that the order of spans is deterministic. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
6. tests/test_instrumentations/test_claude_agent/test_query.py:12
- Draft comment:
The query test validates that the span attributes (name, path, output) are correctly set. Remove any commented-out prompt placeholders if they are no longer needed. - Reason this comment was not posted:
Confidence changes required:40%<= threshold50%None
7. tests/test_instrumentations/test_claude_agent/test_query_with_alias.py:9
- Draft comment:
The alias test for 'query' confirms that both the original and aliased functions produce the same span. Ensure consistent naming conventions are maintained. - Reason this comment was not posted:
Confidence changes required:40%<= threshold50%None
8. tests/test_instrumentations/test_claude_agent/test_tool.py:14
- Draft comment:
Tool tests correctly define decorated tools and validate the generated spans. Be cautious with hard-coded span counts (e.g. expecting 9 spans) which might change if instrumentation behavior is updated. - Reason this comment was not posted:
Confidence changes required:50%<= threshold50%None
Workflow ID: wflow_vv4fQqg60Q38x9gq
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
| try: | ||
| result = await wrapped(*args, **kwargs) | ||
| except Exception as e: # pylint: disable=broad-except | ||
| if original_base_url is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The async wrapper (_wrap_async) duplicates environment restoration logic for ANTHROPIC_BASE_URL. Consider extracting this into a helper function to avoid duplication and potential inconsistencies.
Note
Adds Claude Agent SDK auto-instrumentation (with local proxy and span-context handoff), updates deps, and adds comprehensive tests; bumps version to 0.7.23a1.
ClaudeAgentInstrumentorinopentelemetry/instrumentation/claude_agent/__init__.pywith async, streaming, and module-level wrappers (ClaudeSDKClient.connect/query/receive_*,claude_agent_sdk.query,create_sdk_mcp_server).tracing/_instrument_initializers.pyand registers enum intracing/instruments.py(Instruments.CLAUDE_AGENT).opentelemetry/instrumentation/claude_agent/proxy.pyto manage a local Claude proxy (start/stop, port selection, env var swap, span-context publication).Laminar.use_spanbehavior when uninitialized (uses contextmanager correctly).claude-agent-sdk(vialmnr-claude-code-proxy) and dev depclaude-agent-sdk>=0.1.6inpyproject.toml.0.7.23a1inpyproject.tomlandsrc/lmnr/version.py.tests/.../mock_transport.pyand proxy cleanup fixture.query, alias import, and tools (test_claude_sdk_client.py,test_query.py,test_query_with_alias.py,test_tool.py).tests/test_initialize.pyfixture to preserve/restoreLaminarstate.Written by Cursor Bugbot for commit b7d751d. This will update automatically on new commits. Configure here.
Important
Add Claude Agent SDK with auto-instrumentation, proxy management, and tests, updating dependencies and version.
ClaudeAgentInstrumentorin__init__.pyforclaude_agent_sdkwith span input/output and streaming support.proxy.py(start/stop server, env var routing, trace propagation).ClaudeAgentInstrumentorInitializerin_instrument_initializers.pyandinstruments.py.query, and tools intest_claude_sdk_client.py,test_query.py,test_query_with_alias.py, andtest_tool.py.MockClaudeTransportfor simulated I/O inmock_transport.py.claude-agent-sdkandlmnr-claude-code-proxy; bump version to0.7.23a1inpyproject.tomlandversion.py.This description was created by
for b7d751d. You can customize this summary. It will automatically update as commits are pushed.