feat(openai): OpenAI responses minimal instrumentation #3052

dinmukhamedm · 2025-06-28T16:04:30Z

This is a basic instrumentation of OpenAI responses, supporting the following:

text inputs and outputs
(function) tool calls
(function) tool inputs
(function) tool definitions
computer use tool calls
async versions of all of the above

Things NOT supported, but that would be great to add:

input image parsing
other types of tool calls such as MCP etc
emitting log events alongside spans
streaming responses

I have added tests that cover my changes.
If adding a new instrumentation or changing an existing one, I've added screenshots from some observability platform showing the change.
PR name follows conventional commits format: feat(instrumentation): ... or fix(instrumentation): ....
(If applicable) I have updated the documentation accordingly.

Important

Add instrumentation for OpenAI responses, including text inputs/outputs, tool calls, and async operations, with tests for verification.

Instrumentation:
- Add support for OpenAI responses in responses_wrappers.py, handling text inputs/outputs, tool calls, and async operations.
- Update __init__.py to integrate new response wrappers for synchronous and asynchronous operations.
Testing:
- Add test_responses.py to test response creation, input history, and tool calls.
- Include VCR cassettes for test scenarios in test_responses.yaml, test_responses_tool_calls.yaml, and test_responses_with_input_history.yaml.

^{This description was created by}^{for 5714d08. You can customize this summary. It will automatically update as commits are pushed.}

ellipsis-dev

Caution

Changes requested ❌

Reviewed everything up to 5714d08 in 1 minute and 34 seconds. Click for details.

Reviewed 1289 lines of code in 6 files
Skipped 0 files when reviewing.
Skipped posting 2 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/__init__.py:303

Draft comment:
The _try_wrap calls for 'Responses.create' and 'Responses.retrieve' both use the same wrapper (responses_get_or_create_wrapper). Ensure this is intentional since these methods could potentially require different instrumentation logic. Consider adding a comment to document the rationale.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50% This comment violates several rules. It's asking for confirmation of intention ("Ensure this is intentional") and requesting documentation. It's speculative ("could potentially require different instrumentation logic"). Without seeing the implementation of responses_get_or_create_wrapper, we can't know if this is actually an issue. The author likely had a good reason for using the same wrapper. Maybe there really is a bug here and these operations should be handled differently? The shared wrapper name is unusual. Even if there was a bug, asking for confirmation isn't helpful - the comment should point out the specific issue. The shared wrapper could be intentional if these operations are similar enough to share instrumentation logic. This comment should be deleted as it's primarily asking for confirmation and documentation rather than pointing out a clear issue that needs to be fixed.

2. packages/opentelemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py:446

Draft comment:
The code uses the dictionary merge operator (|) to combine output_blocks from the parsed response and existing data. This requires Python 3.9 or later; ensure that this version requirement is documented and acceptable for the project.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 0% <= threshold 50% The comment is informative and suggests ensuring documentation of a version requirement, which is not allowed by the rules. It doesn't provide a specific code suggestion or ask for a test to be written.

Workflow ID: wflow_TgkmsYc5QmFqJGlM

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

...lemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py

ellipsis-dev · 2025-06-28T23:25:02Z

...lemetry-instrumentation-openai/opentelemetry/instrumentation/openai/v1/responses_wrappers.py

+    response_model: Optional[str] = pydantic.Field(default=None)
+
+
+responses: dict[str, TracedData] = {}


The module uses a global 'responses' dictionary to store traced data. In a multi-threaded or async environment, this could potentially lead to race conditions. Consider using a thread-safe or async-safe data structure, or document why this is acceptable in your use case.

dinmukhamedm force-pushed the openai-responses branch from 548b547 to 8ffd459 Compare June 28, 2025 16:14

feat(openai): very basic instrumentation of openai.responses

12165b7

dinmukhamedm force-pushed the openai-responses branch from 8ffd459 to 12165b7 Compare June 28, 2025 22:44

conditional imports for older sdk versions

5714d08

dinmukhamedm marked this pull request as ready for review June 28, 2025 23:23

ellipsis-dev bot reviewed Jun 28, 2025

View reviewed changes

remove debug re-raise

9680414

dinmukhamedm requested a review from nirga June 29, 2025 17:28

dinmukhamedm changed the title ~~OpenAI responses~~ feat(openai) OpenAI responses minimal instrumentation Jun 30, 2025

dinmukhamedm changed the title ~~feat(openai) OpenAI responses minimal instrumentation~~ feat(openai): OpenAI responses minimal instrumentation Jun 30, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(openai): OpenAI responses minimal instrumentation #3052

feat(openai): OpenAI responses minimal instrumentation #3052

Uh oh!

dinmukhamedm commented Jun 28, 2025 •

edited

Loading

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

Uh oh!

ellipsis-dev bot Jun 28, 2025

Uh oh!

Uh oh!

		response_model: Optional[str] = pydantic.Field(default=None)


		responses: dict[str, TracedData] = {}

feat(openai): OpenAI responses minimal instrumentation #3052

Are you sure you want to change the base?

feat(openai): OpenAI responses minimal instrumentation #3052

Uh oh!

Conversation

dinmukhamedm commented Jun 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

ellipsis-dev bot Jun 28, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

dinmukhamedm commented Jun 28, 2025 •

edited

Loading