Member

@nicohrubec nicohrubec commented Nov 17, 2025

This adds instrumentation for the OpenAI Embeddings API. Specifically, we instrument Create embeddings, which is currently the only endpoint in the Embeddings API. The implementation generally follows the same flow we already have for the completions and responses APIs. To detect embedding requests, we check whether the model name contains embeddings.

The embedding results are currently not tracked: as far as I know, we do not truncate outputs right now, and these can get large quite easily. For instance, text-embedding-3 defaults to 1536 dimensions (small) or 3072 dimensions (large), resulting in single-embedding sizes of about 6 KB and 12 KB, respectively.
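
The 6 KB and 12 KB figures above are consistent with storing each dimension as a 4-byte float32. That storage format is an assumption here, not something the description states, and a JSON-serialized vector would typically be larger. A minimal sketch of the arithmetic, with a hypothetical helper name:

```typescript
// Assumption: each embedding dimension is stored as a 4-byte float32.
const BYTES_PER_FLOAT32 = 4;

// Hypothetical helper: raw size of one embedding vector in KiB.
function embeddingSizeInKiB(dimensions: number): number {
  return (dimensions * BYTES_PER_FLOAT32) / 1024;
}

const smallKiB = embeddingSizeInKiB(1536); // 1536 * 4 / 1024 = 6
const largeKiB = embeddingSizeInKiB(3072); // 3072 * 4 / 1024 = 12
```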

Test updates:

  • Added a new scenario-embeddings.mjs file that covers the embeddings API tests (I tried to put this in the main scenario.mjs, but the linter starts complaining about the file being too long).
  • Added a new scenario file to check that truncation works properly for the embeddings API. Also moved all truncation scenarios into a folder.

Contributor

github-actions bot commented Nov 17, 2025

node-overhead report 🧳

Note: This is a synthetic benchmark with a minimal express app and does not necessarily reflect the real-world performance impact in an application.

| Scenario | Requests/s | % of Baseline | Prev. Requests/s | Change % |
|---|---|---|---|---|
| GET Baseline | 8,761 | - | 8,666 | +1% |
| GET With Sentry | 1,357 | 15% | 1,348 | +1% |
| GET With Sentry (error only) | 6,088 | 69% | 5,936 | +3% |
| POST Baseline | 1,203 | - | 1,184 | +2% |
| POST With Sentry | 515 | 43% | 501 | +3% |
| POST With Sentry (error only) | 1,048 | 87% | 1,040 | +1% |
| MYSQL Baseline | 3,329 | - | 3,255 | +2% |
| MYSQL With Sentry | 455 | 14% | 409 | +11% |
| MYSQL With Sentry (error only) | 2,696 | 81% | 2,673 | +1% |

View base workflow run

Contributor

github-actions bot commented Nov 17, 2025

size-limit report 📦

| Path | Size | % Change | Change |
|---|---|---|---|
| @sentry/browser | 24.62 kB | - | - |
| @sentry/browser - with treeshaking flags | 23.13 kB | - | - |
| @sentry/browser (incl. Tracing) | 41.37 kB | - | - |
| @sentry/browser (incl. Tracing, Profiling) | 45.69 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) | 79.82 kB | - | - |
| @sentry/browser (incl. Tracing, Replay) - with treeshaking flags | 69.52 kB | - | - |
| @sentry/browser (incl. Tracing, Replay with Canvas) | 84.5 kB | - | - |
| @sentry/browser (incl. Tracing, Replay, Feedback) | 96.73 kB | - | - |
| @sentry/browser (incl. Feedback) | 41.29 kB | - | - |
| @sentry/browser (incl. sendFeedback) | 29.29 kB | - | - |
| @sentry/browser (incl. FeedbackAsync) | 34.21 kB | - | - |
| @sentry/react | 26.32 kB | - | - |
| @sentry/react (incl. Tracing) | 43.32 kB | - | - |
| @sentry/vue | 29.11 kB | - | - |
| @sentry/vue (incl. Tracing) | 43.17 kB | - | - |
| @sentry/svelte | 24.64 kB | - | - |
| CDN Bundle | 26.94 kB | - | - |
| CDN Bundle (incl. Tracing) | 41.93 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) | 78.49 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) | 83.95 kB | - | - |
| CDN Bundle - uncompressed | 78.92 kB | - | - |
| CDN Bundle (incl. Tracing) - uncompressed | 124.3 kB | - | - |
| CDN Bundle (incl. Tracing, Replay) - uncompressed | 240.33 kB | - | - |
| CDN Bundle (incl. Tracing, Replay, Feedback) - uncompressed | 253.09 kB | - | - |
| @sentry/nextjs (client) | 45.73 kB | - | - |
| @sentry/sveltekit (client) | 41.76 kB | - | - |
| @sentry/node-core | 50.91 kB | +0.01% | +1 B 🔺 |
| @sentry/node | 159.21 kB | +0.08% | +121 B 🔺 |
| @sentry/node - without tracing | 92.78 kB | - | - |
| @sentry/aws-serverless | 106.53 kB | - | - |

View base workflow run

@nicohrubec nicohrubec marked this pull request as ready for review November 17, 2025 16:17
Member

@RulaKhaled RulaKhaled left a comment

Nice! Just a quick comment and then we're good to go

/**
* Add attributes for Embeddings API responses
*/
function addEmbeddingsAttributes(span: Span, response: OpenAICreateEmbeddingsObject): void {
Member

Can we move this to utils? If the max-lines lint rule complains, we can keep it here.

Member Author

Yes, I only put it there because the similar methods for completions and responses were in the index file. Moved them all to utils.

Member

Amazing! I'm trying to clean this up more going forward.

span.setAttributes({ [GEN_AI_RESPONSE_TEXT_ATTRIBUTE]: response.output_text });
}
} else if (isEmbeddingsResponse(response)) {
addEmbeddingsAttributes(span, response);

Bug: Array Truncation Causes Embeddings Data Loss

When embeddings input is an array of strings requiring truncation, getTruncatedJsonString incorrectly processes it through truncateGenAiMessages, which expects message objects with content or parts properties. For plain strings, truncation fails and returns an empty array, causing data loss. The embeddings API accepts both single strings and arrays of strings, but only single string truncation works correctly.
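
One way to handle both input shapes is to branch on the type before truncating, so that plain string arrays never go through message-object truncation. The sketch below is illustrative only: `truncateEmbeddingsInput` and its character budget `maxChars` are hypothetical, and this is not the SDK's `getTruncatedJsonString` or the fix that was merged.

```typescript
// Hypothetical helper for illustration; not part of the Sentry SDK.
type EmbeddingsInput = string | string[];

function truncateEmbeddingsInput(input: EmbeddingsInput, maxChars: number): EmbeddingsInput {
  // Single string: cut it to the character budget.
  if (typeof input === 'string') {
    return input.slice(0, maxChars);
  }

  // Array of plain strings: spend the budget from left to right and
  // stop once it is used up, instead of returning an empty array.
  const result: string[] = [];
  let remaining = maxChars;
  for (const item of input) {
    if (remaining <= 0) {
      break;
    }
    const kept = item.slice(0, remaining);
    result.push(kept);
    remaining -= kept.length;
  }
  return result;
}

const truncatedSingle = truncateEmbeddingsInput('hello world', 5);
const truncatedArray = truncateEmbeddingsInput(['abc', 'defgh', 'ij'], 5);
```

Here `truncatedSingle` is `'hello'` and `truncatedArray` is `['abc', 'de']`. The budget counts UTF-16 code units rather than bytes, so a real implementation would likely need a byte-based limit and care around surrogate pairs.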

Member

@RulaKhaled RulaKhaled left a comment

Thanks!

@RulaKhaled RulaKhaled merged commit 935ef55 into develop Nov 18, 2025
195 checks passed
@RulaKhaled RulaKhaled deleted the nh/openai-embeddings-api branch November 18, 2025 15:27