
Add Support for OpenAI v4 #4232

Merged
merged 24 commits into from
Apr 23, 2024

Conversation

@sabrenner (Collaborator) commented Apr 9, 2024

What does this PR do?

Updates the openai integration to support the latest major release line of the Node.js OpenAI SDK.

Notes for Reviewers

Most of the LOC in this PR are test changes, which mostly update the relevant method calls to v4 syntax (i.e., chat.completions.create instead of createChatCompletion) and account for payload differences. The tests themselves still have the same content.
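To make the syntax change concrete, here is a sketch of the call-shape difference between the two SDK majors. The stub clients below only illustrate the method paths; the real openai package is assumed, not reproduced:

```javascript
// Stub clients illustrating the v3 vs v4 call shape (illustrative only).
const v3Client = {
  createChatCompletion (req) { return { apiVersion: 3, request: req } }
}
const v4Client = {
  chat: {
    completions: {
      create (req) { return { apiVersion: 4, request: req } }
    }
  }
}

const request = { model: 'gpt-3.5-turbo', messages: [{ role: 'user', content: 'Hello!' }] }

// v3: flat camelCase method on the client
const v3Result = v3Client.createChatCompletion(request)
// v4: nested resource namespaces
const v4Result = v4Client.chat.completions.create(request)
```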

The important files changed are both the instrumentation and plugin code.

Instrumentation Updates

The instrumentation wraps the same functions as the existing support for v3 of the OpenAI SDK. However, these methods are no longer available on the top-level export; they are spread across various files in the SDK, which is reflected in the wrapping configuration (by specifying all the different locations and target classes to wrap in an object).
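A hypothetical shape for such a configuration object: each SDK file maps to the exported class and the methods to patch on its prototype. The file paths and class names below are illustrative, not dd-trace's actual config:

```javascript
// Illustrative per-file wrapping configuration (names are hypothetical).
const V4_WRAP_TARGETS = {
  'resources/chat/completions': { targetClass: 'Completions', methods: ['create'] },
  'resources/completions': { targetClass: 'Completions', methods: ['create'] },
  'resources/embeddings': { targetClass: 'Embeddings', methods: ['create'] }
}

// An instrumentation loop would walk this map and wrap each method in place:
const wrapped = []
for (const [file, { targetClass, methods }] of Object.entries(V4_WRAP_TARGETS)) {
  for (const method of methods) {
    // e.g. shimmer.wrap(moduleExports[targetClass].prototype, method, fn)
    wrapped.push(`${file}:${targetClass}.${method}`)
  }
}
```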

There are two major differences in how these methods are implemented on the SDK side that affect the instrumentation.

  1. The promises returned from the SDK are a custom promise, APIPromise, which has a method withResponse that can be called after the promise resolves.
  2. This promise resolves into only the data returned from the call. Previously, it would also return the HTTP Response object associated with the underlying call to the OpenAI API, which we used to extract headers for tagging and metrics. This is now not available off of the resolved promise.
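A minimal mock of these two behaviors (a sketch only, not the SDK's actual APIPromise implementation): awaiting yields just the parsed data, while withResponse() also surfaces the raw HTTP response.

```javascript
// Mock standing in for the SDK's APIPromise (illustrative).
class MockAPIPromise {
  constructor (responsePromise) {
    this.responsePromise = responsePromise
  }

  parse (raw) {
    return raw.data
  }

  // Thenable: awaiting resolves to the parsed data only
  then (onFulfilled, onRejected) {
    return this.responsePromise.then(raw => this.parse(raw)).then(onFulfilled, onRejected)
  }

  // Alternative resolution path that keeps the raw response
  async withResponse () {
    const raw = await this.responsePromise
    return { data: this.parse(raw), response: raw.response }
  }
}

async function demo () {
  const make = () => Promise.resolve({
    data: { id: 'chatcmpl-1' },
    response: { headers: { 'x-request-id': 'abc' } }
  })
  const data = await new MockAPIPromise(make())                        // data only
  const { response } = await new MockAPIPromise(make()).withResponse() // raw response too
  return { data, response }
}
```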

Because a custom promise type must be returned, the span end event is not published by chaining onto the returned promise. Instead, the APIPromise object itself is wrapped, which gives access to an internal variable, this.responsePromise, that holds the raw response from the external call to the OpenAI API.

However, the promise's then method cannot be wrapped either, because an APIPromise can also be resolved by calling withResponse(), which lets the user get the response as well as the data associated with the call. Both then and withResponse use a private parse method on the APIPromise, which takes the raw response and parses it into the refined data object the user typically sees. Wrapping parse allows the raw response and final output to be gathered together and sent to the plugin to populate data fields on the span.
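The parse-wrapping approach can be sketched self-contained (a mock class stands in for the SDK's APIPromise; names are illustrative). Since both then() and withResponse() funnel through parse, one patch observes every resolution path and sees the raw response and refined data together:

```javascript
// Mock standing in for the SDK's APIPromise (illustrative, not the real class).
class MockAPIPromise {
  constructor (responsePromise) {
    this.responsePromise = responsePromise
  }
  parse (raw) { return raw.data }
  then (onFulfilled, onRejected) {
    return this.responsePromise.then(raw => this.parse(raw)).then(onFulfilled, onRejected)
  }
}

// Patch parse on one promise instance; onFinish plays the role of the hook
// that would publish the span end event with both payloads.
function wrapParse (apiPromise, onFinish) {
  const originalParse = apiPromise.parse
  apiPromise.parse = function (raw) {
    const refined = originalParse.call(this, raw)
    onFinish(raw.response, refined) // raw response + refined data, together
    return refined
  }
  return apiPromise
}

const events = []
const p = wrapParse(
  new MockAPIPromise(Promise.resolve({
    data: { id: 'cmpl-1' },
    response: { statusCode: 200 }
  })),
  (response, data) => events.push({ status: response.statusCode, id: data.id })
)
```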

Plugin Updates

There are minimal changes on the plugin side. The updates made in this PR should not impact the tags/metrics/logs emitted by the integration overall. The only changes needed were to the method-name switch cases, plus a few miscellaneous adjustments for payload structure differences in the API.

Testing Updates

The content of the tests themselves (checking for specific tags/metrics, as well as calls to the mock logger and DogStatsD) remains largely unchanged. Instead, many version gates were added to use the different method calls (i.e., openai.completions.create instead of openai.createCompletion) and to check for the resulting differences in tag values (mostly resource names following the same pattern highlighted above).
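A version gate of this kind might look like the following (helper names are hypothetical, not the actual test utilities):

```javascript
// Pick the expected resource name by SDK major version (illustrative helper).
function major (version) {
  return parseInt(version.split('.')[0], 10)
}

function expectedResourceName (version) {
  // v3 exposed flat camelCase methods; v4 uses dotted resource paths
  return major(version) >= 4 ? 'completions.create' : 'createCompletion'
}
```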

Motivation

Support the latest major release line for the Node.js OpenAI SDK.

Outstanding Issues

  • There are some issues with ESM. TBD whether this can be fixed in this PR or requires a change in import-in-the-middle.
  • Adding support for streaming: need to make sure everything is tagged correctly. WIP (I have a commit locally, but this might be best left for another PR).
  • Support tooling/function calls in chat completions.
  • Fix tests to account for the change in the fine-tuning API.


github-actions bot commented Apr 9, 2024

Overall package size

Self size: 6.33 MB
Deduped: 60.83 MB
No deduping: 61.11 MB

Dependency sizes

name version self size total size
@datadog/native-iast-taint-tracking 1.7.0 16.71 MB 16.72 MB
@datadog/native-appsec 7.1.1 14.39 MB 14.4 MB
@datadog/pprof 5.2.0 8.84 MB 9.21 MB
protobufjs 7.2.5 2.77 MB 6.56 MB
@datadog/native-iast-rewriter 2.3.0 2.15 MB 2.24 MB
@opentelemetry/core 1.14.0 872.87 kB 1.47 MB
@datadog/native-metrics 2.0.0 898.77 kB 1.3 MB
@opentelemetry/api 1.4.1 780.32 kB 780.32 kB
import-in-the-middle 1.7.3 67.62 kB 731.01 kB
msgpack-lite 0.1.26 201.16 kB 281.59 kB
opentracing 0.14.7 194.81 kB 194.81 kB
semver 7.5.4 93.4 kB 123.8 kB
pprof-format 2.1.0 111.69 kB 111.69 kB
@datadog/sketches-js 2.1.0 109.9 kB 109.9 kB
lodash.sortby 4.7.0 75.76 kB 75.76 kB
lru-cache 7.14.0 74.95 kB 74.95 kB
ipaddr.js 2.1.0 60.23 kB 60.23 kB
ignore 5.2.4 51.22 kB 51.22 kB
int64-buffer 0.1.10 49.18 kB 49.18 kB
shell-quote 1.8.1 44.96 kB 44.96 kB
istanbul-lib-coverage 3.2.0 29.34 kB 29.34 kB
tlhunter-sorted-set 0.1.0 24.94 kB 24.94 kB
limiter 1.1.5 23.17 kB 23.17 kB
dc-polyfill 0.1.4 23.1 kB 23.1 kB
retry 0.13.1 18.85 kB 18.85 kB
node-abort-controller 3.1.1 16.89 kB 16.89 kB
jest-docblock 29.7.0 8.99 kB 12.76 kB
crypto-randomuuid 1.0.0 11.18 kB 11.18 kB
path-to-regexp 0.1.7 6.78 kB 6.78 kB
koalas 1.0.2 6.47 kB 6.47 kB
methods 1.1.2 5.29 kB 5.29 kB
module-details-from-path 1.0.3 4.47 kB 4.47 kB

🤖 This report was automatically generated by heaviest-objects-in-the-universe


pr-commenter bot commented Apr 9, 2024

Benchmarks

Benchmark execution time: 2024-04-18 21:37:39

Comparing candidate commit c178449 in PR branch sabrenner/openai4.x with baseline commit 0392082 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 259 metrics, 7 unstable metrics.


codecov bot commented Apr 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.19%. Comparing base (c4c01e4) to head (dd25159).
Report is 3 commits behind head on master.

❗ Current head dd25159 differs from pull request most recent head c178449. Consider uploading reports for the commit c178449 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4232      +/-   ##
==========================================
- Coverage   73.16%   69.19%   -3.98%     
==========================================
  Files         245        1     -244     
  Lines       10442      198   -10244     
  Branches       33       33              
==========================================
- Hits         7640      137    -7503     
+ Misses       2802       61    -2741     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ladislasdellinger

Hello, any ETA on when this will be merged? 🙏🏼

@sabrenner
Collaborator Author

ESM test failures related to DataDog/import-in-the-middle#60

@sabrenner sabrenner marked this pull request as ready for review April 17, 2024 17:51
@sabrenner sabrenner requested review from a team as code owners April 17, 2024 17:51
@sabrenner sabrenner requested a review from wconti27 April 17, 2024 17:51
@Qard (Collaborator) left a comment


Left some comments. Also, that test file is enormous. Can we break that up somehow? Maybe with some fixture files or something?

@@ -113,42 +113,42 @@ class OpenApiPlugin extends TracingPlugin {
}

switch (methodName) {
case 'createFineTune':
case 'createFineTune': case 'fine_tuning.jobs.create': case 'fine-tune.create':
Collaborator


Should probably not be putting multiple cases on the same line. I'm surprised the linter doesn't complain about this. 🤔

Collaborator Author


ah fair, will make these each their own line. yeah, surprised this wasn't linted (I think I had done it this way while I was working to let myself know it was the same function but different name).

Collaborator


You could use blank lines to group renames together.

Comment on lines 160 to 162
try {
headers = Object.fromEntries(headers)
} catch { /* headers are already an object */ }
Collaborator


I'd rather we check it's an object rather than try/catching our way out of this.
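The check being suggested could look something like this (a sketch, not the code that was merged). Headers may arrive as an iterable of [key, value] pairs (a Headers/Map-like object exposing entries()) or as a plain object; branching on the shape avoids the cost of a thrown-and-caught exception:

```javascript
// Normalize headers without try/catch: convert iterables, pass objects through.
function normalizeHeaders (headers) {
  if (headers && typeof headers.entries === 'function') {
    return Object.fromEntries(headers.entries())
  }
  return headers
}
```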

Comment on lines 172 to 174
try {
path = new URL(path).pathname
} catch { /* path is already a URL pathname */ }
Collaborator


As with the object case above, can we just do a check rather than a try/catch? Exceptions are expensive.
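The same idea applies to the path: a cheap shape check instead of try/catch around new URL(). Sketch only; it assumes full URLs always carry an http(s) scheme:

```javascript
// Extract a pathname without try/catch: only parse strings that look like URLs.
function toPathname (path) {
  if (/^https?:\/\//.test(path)) {
    return new URL(path).pathname
  }
  return path // already a pathname
}
```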

packages/datadog-plugin-openai/src/index.js (outdated comment, resolved)
@sabrenner
Collaborator Author

sabrenner commented Apr 17, 2024

Also, that test file is enormous. Can we break that up somehow? Maybe with some fixture files or something?

@Qard Fair point. My original intent was to not disturb the content or structure too much in this PR, as I really just wanted to isolate the changes to the instrumentation and test those accordingly. But I agree it's a completely unmanageable size and format. Will look tomorrow at breaking it up for this PR.

@Qard
Collaborator

Qard commented Apr 17, 2024

Feel free to break that up as a follow-up. That comment was more an observation than a request for changes. I won't consider that a blocker here. 😅

@sabrenner
Collaborator Author

@Qard Sounds good; in the interest of landing this PR a bit quicker, I'll do a refactor of this integration as a whole (+ its tests, maybe in two PRs, as it's a fairly large and involved integration) separately.

@Yun-Kim (Contributor) left a comment

Can we get some screenshots of manual/demo traces generated by the new OpenAI support before we merge? Otherwise, lgtm!

@sabrenner
Collaborator Author

@Yun-Kim hope the following looks ok! let me know if anything should be different (which, if there is, and it's not too concerning, I can address in a follow-up PR).

Using the latest version of OpenAI:

"openai": "^4.38.3"

With a simple chat completion like:

await openai.chat.completions.create({
  model: "gpt-3.5-turbo",
  messages: [{ "role": "user", "content": "Hello!" }]
})

We get the following trace
Screenshot 2024-04-23 at 11 00 39 AM

With the relevant tags on the trace
Screenshot 2024-04-23 at 11 03 49 AM

@sabrenner sabrenner merged commit c11fcfd into master Apr 23, 2024
108 of 109 checks passed
@sabrenner sabrenner deleted the sabrenner/openai4.x branch April 23, 2024 16:13
tlhunter pushed a commit that referenced this pull request Apr 25, 2024
* add instrumenation shims for openai v4.x

* instrumentation populates fields correctly

* fix method name

* wrap apipromise.parse to always return apipromise type

* support everything but streaming

* fix completion test and give tests generic names

* update tests + plugin for finetune api change

* tag tools

* fix logger test for tool calls

* fix tool call logger test

* streaming

* Revert "streaming"

This reverts commit ff854c2.

* request message tagging + tools log

* discard .only

* tool test

* tool test + finetune updates

* lint

* linting + fixes, clamping version for ESM tests for CI

---------

Co-authored-by: Mark Hayes <mcleodm@gmail.com>