fix(openai): correctly tag image inputs for chat completions (#7759)
Resolves #7737.

This PR stringifies each input message's content before tagging it in the
OpenAI chat completions endpoint. Previously, we assumed that
`messages.content` would always be a string (which was true until OpenAI
recently added the image input feature to the chat completions
endpoint), but it can now also be an array of str-str dictionaries.
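
A minimal sketch of the failure mode and the fix follows. The `trunc` helper here is a hypothetical stand-in for a truncation step that calls string-only methods; it is not the integration's actual implementation, and the message text and URL are made up for illustration.

```python
def trunc(text, limit=128):
    """Hypothetical truncation helper that assumes `text` is a str."""
    if not text:
        return text
    text = text.replace("\n", "\\n")  # str-only method: raises on a list
    return text[:limit] + "..." if len(text) > limit else text


# Image-input message: content is a list of dicts rather than a plain string.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": "https://example.com/image.jpg"},
    ],
}

try:
    trunc(message.get("content", ""))
    failed = False
except AttributeError:
    failed = True  # 'list' object has no attribute 'replace'

# Stringifying first, as this PR does, yields a value that is safe to truncate and tag.
tag_value = trunc(str(message.get("content", "")))
```

Calling `str()` on a value that is already a string returns it unchanged, so plain-text messages are tagged as before.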

## Testing Strategy

Regression tests have been added, and manual testing has also confirmed
that the error reported on #7737 does not appear.

## Checklist

- [x] Change(s) are motivated and described in the PR description.
- [x] Testing strategy is described if automated tests are not included
in the PR.
- [x] Risk is outlined (performance impact, potential for breakage,
maintainability, etc).
- [x] Change is maintainable (easy to change, telemetry, documentation).
- [x] [Library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
are followed. If no release note is required, add label
`changelog/no-changelog`.
- [x] Documentation is included (in-code, generated user docs, [public
corp docs](https://github.com/DataDog/documentation/)).
- [x] Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist

- [ ] Title is accurate.
- [ ] No unnecessary changes are introduced.
- [ ] Description motivates each change.
- [ ] Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes unless absolutely necessary.
- [ ] Testing strategy adequately addresses listed risk(s).
- [ ] Change is maintainable (easy to change, telemetry, documentation).
- [ ] Release note makes sense to a user of the library.
- [ ] Reviewer has explicitly acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment.
- [ ] Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
- [ ] If this PR touches code that signs or publishes builds or
packages, or handles credentials of any kind, I've requested a review
from `@DataDog/security-design-and-guidance`.
- [ ] This PR doesn't touch any of that.
Yun-Kim committed Dec 1, 2023
1 parent c976a5f commit aeb1e47
Showing 7 changed files with 298 additions and 2 deletions.
7 changes: 5 additions & 2 deletions ddtrace/contrib/openai/_endpoint_hooks.py
@@ -258,7 +258,9 @@ def _record_request(self, pin, integration, span, args, kwargs):
         super()._record_request(pin, integration, span, args, kwargs)
         for idx, m in enumerate(kwargs.get("messages", [])):
             if integration.is_pc_sampled_span(span):
-                span.set_tag_str("openai.request.messages.%d.content" % idx, integration.trunc(m.get("content", "")))
+                span.set_tag_str(
+                    "openai.request.messages.%d.content" % idx, integration.trunc(str(m.get("content", "")))
+                )
             span.set_tag_str("openai.request.messages.%d.role" % idx, m.get("role", ""))
             span.set_tag_str("openai.request.messages.%d.name" % idx, m.get("name", ""))

@@ -270,8 +272,9 @@ def _record_response(self, pin, integration, span, args, kwargs, resp, error):
             return self._handle_streamed_response(integration, span, args, kwargs, resp)
         for choice in resp.choices:
             idx = choice.index
+            finish_reason = getattr(choice, "finish_reason", None)
             message = choice.message
-            span.set_tag_str("openai.response.choices.%d.finish_reason" % idx, str(choice.finish_reason))
+            span.set_tag_str("openai.response.choices.%d.finish_reason" % idx, str(finish_reason))
             span.set_tag_str("openai.response.choices.%d.message.role" % idx, choice.message.role)
             if integration.is_pc_sampled_span(span):
                 span.set_tag_str(
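
The `getattr` change in the hunk above can be illustrated with a short sketch. The `SimpleNamespace` objects are hypothetical stand-ins for response choices (real SDK objects carry more fields), and the assumption is that some responses, such as the one recorded in the v1 cassette with `finish_details`, may not expose a `finish_reason` attribute.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for response choices.
with_reason = SimpleNamespace(index=0, finish_reason="stop")
without_reason = SimpleNamespace(index=0, finish_details={"type": "max_tokens"})


def finish_reason_tag(choice):
    """Return the string that would be tagged as the choice's finish_reason."""
    # getattr with a default avoids AttributeError when the attribute is absent.
    return str(getattr(choice, "finish_reason", None))


tags = [finish_reason_tag(c) for c in (with_reason, without_reason)]
```

A missing attribute is tagged as the string `"None"`, which matches the `openai.response.choices.0.finish_reason` value in the snapshot added by this commit.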
@@ -0,0 +1,4 @@
---
fixes:
  - |
    openai: This fix resolves an issue where tagging image inputs in the chat completions endpoint resulted in attribute errors.
87 changes: 87 additions & 0 deletions tests/contrib/openai/cassettes/v0/chat_completion_image_input.yaml
@@ -0,0 +1,87 @@
interactions:
- request:
body: '{"model": "gpt-4-vision-preview", "messages": [{"role": "user", "content":
[{"type": "text", "text": "What\u2019s in this image?"}, {"type": "image_url",
"image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}]}]}'
headers:
Accept:
- '*/*'
Accept-Encoding:
- gzip, deflate
Connection:
- keep-alive
Content-Length:
- '332'
Content-Type:
- application/json
User-Agent:
- OpenAI/v1 PythonBindings/0.27.2
X-OpenAI-Client-User-Agent:
- '{"bindings_version": "0.27.2", "httplib": "requests", "lang": "python", "lang_version":
"3.11.1", "platform": "macOS-14.1.1-arm64-arm-64bit", "publisher": "openai",
"uname": "Darwin 23.1.0 Darwin Kernel Version 23.1.0: Mon Oct 9 21:27:24
PDT 2023; root:xnu-10002.41.9~6/RELEASE_ARM64_T6000 arm64 arm"}'
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
body:
string: !!binary |
H4sIAAAAAAAAA0yPQWvDMAyF/4rweRkJLW3pfbuMwWCDjY1R1FiJvdiWsZWmpfS/DwdaehPSe5/e
Oyur1RZUa1BaH121efteP73arqZaf+HmdHoe34euHvzn9BLVAyje/1ErV89jyz46EsuhHNtEKFSI
zbpumnqxWa8eQHnW5Iqlj1Itq6apV9XBZsuhiokOlqZiHjP2pLZwVjGxj7ITHijkAmuaTaHfft2d
Cl9Y0N3LF8tL0Ru2LZXNz1l5yjd+YlcGhTnbLBhkzs5BKMzNPgyB9dgTZMNTBoRMiQJBQBkTOnAY
dG4xErQcCsOGHrgDhIlZU4A9Y9ITugHEoAAdhYLOICbx2BtV0nU22Gx2mgSty3MuOcU5l8fjtU1R
2qDpqLZQX34v/wAAAP//AwBzg8RBsQEAAA==
headers:
CF-Cache-Status:
- DYNAMIC
CF-RAY:
- 82cbdfefb813436e-EWR
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Mon, 27 Nov 2023 16:51:18 GMT
Server:
- cloudflare
Set-Cookie:
- __cf_bm=d6ov6YwrqemojSGt8iO9_0Qf3UJhwlgQMcoKdnoRbwg-1701103878-0-ASEB1DDqa/JLbWcuxqNOaYCngY4tGk9Q8m3aVpU4jAjGua7OpatkiSrzIPI9rRMHQUqGKdiJYa52zH3JS1iq5Wk=;
path=/; expires=Mon, 27-Nov-23 17:21:18 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=ALaEbe03egqFuF8IP_iPjk4QMdgVjcdkH1QEiSb2ric-1701103878609-0-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Transfer-Encoding:
- chunked
alt-svc:
- h3=":443"; ma=86400
openai-model:
- gpt-4-1106-vision-preview
openai-organization:
- datadog-4
openai-processing-ms:
- '4003'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=15724800; includeSubDomains
x-ratelimit-limit-requests:
- '100'
x-ratelimit-limit-tokens:
- '150000'
x-ratelimit-remaining-requests:
- '98'
x-ratelimit-remaining-tokens:
- '149977'
x-ratelimit-reset-requests:
- 25m47.91s
x-ratelimit-reset-tokens:
- 9ms
x-request-id:
- fed91798d778620c33cc96e1c49f0300
status:
code: 200
message: OK
version: 1
92 changes: 92 additions & 0 deletions tests/contrib/openai/cassettes/v1/chat_completion_image_input.yaml
@@ -0,0 +1,92 @@
interactions:
- request:
body: '{"messages": [{"role": "user", "content": [{"type": "text", "text": "What\u2019s
in this image?"}, {"type": "image_url", "image_url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"}]}],
"model": "gpt-4-vision-preview"}'
headers:
accept:
- application/json
accept-encoding:
- gzip, deflate
connection:
- keep-alive
content-length:
- '332'
content-type:
- application/json
host:
- api.openai.com
user-agent:
- OpenAI/Python 1.1.1
x-stainless-arch:
- arm64
x-stainless-lang:
- python
x-stainless-os:
- MacOS
x-stainless-package-version:
- 1.1.1
x-stainless-runtime:
- CPython
x-stainless-runtime-version:
- 3.11.1
method: POST
uri: https://api.openai.com/v1/chat/completions
response:
content: '{"id": "chatcmpl-8PZ7EMif0e0dXa8yyFuSkf0kmWwKp", "object": "chat.completion",
"created": 1701103876, "model": "gpt-4-1106-vision-preview", "usage": {"prompt_tokens":
1118, "completion_tokens": 16, "total_tokens": 1134}, "choices": [{"message":
{"role": "assistant", "content": "The image shows a serene natural landscape consisting of a wooden boardwalk that extends through"}, "finish_details": {"type": "max_tokens"},
"index": 0}]}'
headers:
CF-Cache-Status:
- DYNAMIC
CF-RAY:
- 82cbdb89482dc420-EWR
Connection:
- keep-alive
Content-Encoding:
- gzip
Content-Type:
- application/json
Date:
- Mon, 27 Nov 2023 16:48:19 GMT
Server:
- cloudflare
Set-Cookie:
- __cf_bm=GG9trWAPd86aqXS0Gt6guzjR2FLum2Fa5zkfYbUCNUw-1701103699-0-Aetjz9Ne380ZTQMFwc/pySteL/hStTE57jbQX8ddzByy+YCO3xU2xryQQu/rV/IAJrmhMOsMpdonkyX6+/lGwdY=;
path=/; expires=Mon, 27-Nov-23 17:18:19 GMT; domain=.api.openai.com; HttpOnly;
Secure; SameSite=None
- _cfuvid=1aDtGGCysQtwRa_0YB6rXUoELGu20x4HAE_1R_jGQU8-1701103699770-0-604800000;
path=/; domain=.api.openai.com; HttpOnly; Secure; SameSite=None
Transfer-Encoding:
- chunked
alt-svc:
- h3=":443"; ma=86400
openai-model:
- gpt-4-1106-vision-preview
openai-organization:
- datadog-4
openai-processing-ms:
- '5254'
openai-version:
- '2020-10-01'
strict-transport-security:
- max-age=15724800; includeSubDomains
x-ratelimit-limit-requests:
- '100'
x-ratelimit-limit-tokens:
- '150000'
x-ratelimit-remaining-requests:
- '98'
x-ratelimit-remaining-tokens:
- '149977'
x-ratelimit-reset-requests:
- 14m24s
x-ratelimit-reset-tokens:
- 9ms
x-request-id:
- a2dab4f27488b3acab1cfe854feb6895
http_version: HTTP/1.1
status_code: 200
version: 1
30 changes: 30 additions & 0 deletions tests/contrib/openai/test_openai_v0.py
@@ -436,6 +436,36 @@ def test_chat_completion_tool_calling(openai, openai_vcr, snapshot_tracer):
         )


+@pytest.mark.snapshot(
+    token="tests.contrib.openai.test_openai.test_chat_completion_image_input",
+    ignores=[
+        "meta.http.useragent",
+        "meta.openai.base_url",
+    ],
+)
+def test_chat_completion_image_input(openai, openai_vcr, snapshot_tracer):
+    image_url = (
+        "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk"
+        ".jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+    )
+    with openai_vcr.use_cassette("chat_completion_image_input.yaml"):
+        openai.ChatCompletion.create(
+            model="gpt-4-vision-preview",
+            messages=[
+                {
+                    "role": "user",
+                    "content": [
+                        {"type": "text", "text": "What’s in this image?"},
+                        {
+                            "type": "image_url",
+                            "image_url": image_url,
+                        },
+                    ],
+                }
+            ],
+        )
+
+
 @pytest.mark.parametrize("ddtrace_config_openai", [dict(metrics_enabled=b) for b in [True, False]])
 def test_enable_metrics(openai, openai_vcr, ddtrace_config_openai, mock_metrics, mock_tracer):
     """Ensure the metrics_enabled configuration works."""
32 changes: 32 additions & 0 deletions tests/contrib/openai/test_openai_v1.py
@@ -484,6 +484,38 @@ def test_chat_completion_tool_calling(openai, openai_vcr, snapshot_tracer):
         )


+@pytest.mark.snapshot(
+    token="tests.contrib.openai.test_openai.test_chat_completion_image_input",
+    ignores=[
+        "meta.http.useragent",
+        "meta.openai.api_type",
+        "meta.openai.api_base",
+    ],
+)
+def test_chat_completion_image_input(openai, openai_vcr, snapshot_tracer):
+    image_url = (
+        "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk"
+        ".jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
+    )
+    with openai_vcr.use_cassette("chat_completion_image_input.yaml"):
+        client = openai.OpenAI()
+        client.chat.completions.create(
+            model="gpt-4-vision-preview",
+            messages=[
+                {
+                    "role": "user",
+                    "content": [
+                        {"type": "text", "text": "What’s in this image?"},
+                        {
+                            "type": "image_url",
+                            "image_url": image_url,
+                        },
+                    ],
+                }
+            ],
+        )
+
+
 def test_chat_completion_raw_response(openai, openai_vcr, snapshot_tracer):
     with snapshot_context(
         token="tests.contrib.openai.test_openai.test_chat_completion",
@@ -0,0 +1,48 @@
[[
{
"name": "openai.request",
"service": null,
"resource": "createChatCompletion",
"trace_id": 0,
"span_id": 1,
"parent_id": 0,
"meta": {
"_dd.p.dm": "-0",
"_dd.p.tid": "6564c89d00000000",
"component": "openai",
"language": "python",
"openai.api_base": "https://api.openai.com/v1",
"openai.api_type": "open_ai",
"openai.base_url": "https://api.openai.com/v1/",
"openai.organization.name": "datadog-4",
"openai.request.endpoint": "/v1/chat/completions",
"openai.request.messages.0.content": "[{'type': 'text', 'text': 'What\u2019s in this image?'}, {'type': 'image_url', 'image_url': 'https://upload.wikimedia.org/wikipedia/c...",
"openai.request.messages.0.name": "",
"openai.request.messages.0.role": "user",
"openai.request.method": "POST",
"openai.request.model": "gpt-4-vision-preview",
"openai.response.choices.0.finish_reason": "None",
"openai.response.choices.0.message.content": "The image shows a serene natural landscape consisting of a wooden boardwalk that extends through",
"openai.response.choices.0.message.role": "assistant",
"openai.response.created": "1701103876",
"openai.response.id": "chatcmpl-8PZ7EMif0e0dXa8yyFuSkf0kmWwKp",
"openai.response.model": "gpt-4-1106-vision-preview",
"openai.user.api_key": "sk-...key>",
"runtime-id": "08f460eea4204da69b46486a3dc357f9"
},
"metrics": {
"_dd.measured": 1,
"_dd.top_level": 1,
"_dd.tracer_kr": 1.0,
"_sample_rate": 1.0,
"_sampling_priority_v1": 1,
"openai.organization.ratelimit.requests.remaining": 98,
"openai.organization.ratelimit.tokens.remaining": 149977,
"openai.response.usage.completion_tokens": 16,
"openai.response.usage.prompt_tokens": 1118,
"openai.response.usage.total_tokens": 1134,
"process_id": 45616
},
"duration": 26636000,
"start": 1701103773252008000
}]]
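
The `openai.request.messages.0.content` value in the snapshot above can be reproduced with a short sketch. The 128-character limit and the `trunc` helper are assumptions inferred from the snapshot value itself, not taken from the integration's code.

```python
SPAN_CHAR_LIMIT = 128  # assumed limit, inferred from the snapshot value


def trunc(text, limit=SPAN_CHAR_LIMIT):
    """Hypothetical truncation helper: keep `limit` characters, then append '...'."""
    return text[:limit] + "..." if len(text) > limit else text


image_url = (
    "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk"
    ".jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
)
content = [
    {"type": "text", "text": "What\u2019s in this image?"},
    {"type": "image_url", "image_url": image_url},
]

# str() renders the list with Python repr conventions (single quotes),
# which is why the tagged value is not valid JSON.
tag_value = trunc(str(content))
```

Because the tag holds a Python `repr` rather than JSON, consumers that need structured content would have to parse it differently; this sketch only shows how the snapshot string arises.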
