perf(embedding): default embedding creation to base64 #1312

manekinekko · 2025-02-08T17:09:04Z

Requesting base64-encoded embeddings returns smaller response bodies, on average ~60% smaller than float32-encoded ones. In other words, a response body containing float32 embeddings is ~2.3x larger than the same embeddings encoded as base64.

Closes #1310

I understand that this repository is auto-generated and my pull request may not be merged

Changes being requested

We always request embeddings encoded as base64, and then decode them to float32 depending on the user-provided encoding_format parameter.
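A minimal sketch of the flow described above, not the SDK's actual code. The type names, `decodeBase64Floats`, `createEmbeddings`, and the `send` callback are all hypothetical stand-ins, and a Node.js `Buffer` is assumed. The merged implementation may differ in detail.

```typescript
// Hypothetical types and names for illustration; not the SDK's internals.
type EncodingFormat = 'float' | 'base64';

interface WireEmbedding {
  index: number;
  embedding: string; // base64-encoded little-endian float32 bytes
}

interface EmbeddingParams {
  input: string;
  model: string;
  encoding_format?: EncodingFormat;
}

// Decode base64 text into plain numbers (Node.js Buffer assumed).
function decodeBase64Floats(b64: string): number[] {
  const buf = Buffer.from(b64, 'base64');
  const out: number[] = [];
  for (let i = 0; i + 4 <= buf.length; i += 4) {
    out.push(buf.readFloatLE(i));
  }
  return out;
}

// `send` stands in for the HTTP call; it is a placeholder, not a real API.
async function createEmbeddings(
  params: EmbeddingParams,
  send: (body: EmbeddingParams) => Promise<WireEmbedding[]>,
): Promise<Array<{ index: number; embedding: number[] | string }>> {
  // Always ask the server for the compact representation.
  const wire = await send({ ...params, encoding_format: 'base64' });
  // Hand base64 back untouched only if the caller asked for it explicitly.
  if (params.encoding_format === 'base64') return wire;
  return wire.map((e) => ({ index: e.index, embedding: decodeBase64Floats(e.embedding) }));
}
```

The caller sees the same shape as before (an array of numbers) unless they explicitly opt into base64, so the smaller wire format stays an internal detail.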

Additional context & links

After running a few benchmarks, requesting base64-encoded embeddings returns smaller response bodies, on average ~60% smaller than float32-encoded ones. In other words, a response body containing float32 embeddings is ~2.3x larger than the same embeddings encoded as base64.

This performance improvement could translate to:

✅ Faster HTTP responses
✅ Less bandwidth used when generating multiple embeddings
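To illustrate where the saving comes from: a float32 value is 4 raw bytes, which base64 expands by a factor of 4/3, while the same value written as JSON decimal text usually takes more characters. The sketch below uses synthetic data. The 1536-dimension vector and random values are assumptions for illustration, not the benchmark from this PR, and the exact ratio depends on how the server formats decimals.

```typescript
// Synthetic vector; 1536 dimensions is an assumption chosen for illustration.
const dims = 1536;
const values = new Float32Array(dims);
for (let i = 0; i < dims; i++) values[i] = Math.random() * 2 - 1;

// JSON float encoding: each number is serialized as decimal text.
const jsonBytes = Buffer.byteLength(JSON.stringify(Array.from(values)));

// base64 encoding: 4 raw bytes per float, expanded by a factor of 4/3.
const base64Bytes = Buffer.byteLength(
  Buffer.from(values.buffer, values.byteOffset, values.byteLength).toString('base64'),
);

console.log({ jsonBytes, base64Bytes, ratio: (jsonBytes / base64Bytes).toFixed(2) });
```

Note that `JSON.stringify` prints the full double-precision expansion of each float32, so this sketch likely overstates the gap compared to a real API response.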

This is the result of a request that creates an embedding from a 10 KB chunk, run 10 times. The numbers are response body sizes in KB, even though the column headers are labeled ms:

Benchmark	Min (ms)	Max (ms)	Mean (ms)	Min (+)	Max (+)	Mean (+)
float32 vs base64	41.742	19616.000	9848.819	40.094 (3.9%)	8351.000 (57.4%)	4206.126 (57.3%)
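As a quick arithmetic check of the headline figures, using only the two mean values from the table (reading the first Mean column as float32 and the Mean (+) column as base64, which is an interpretation of the benchmark output):

```typescript
const meanFloat32 = 9848.819; // mean response size, float32 (from the table)
const meanBase64 = 4206.126; // mean response size, base64 (from the table)

const reduction = 1 - meanBase64 / meanFloat32; // fraction saved
const ratio = meanFloat32 / meanBase64; // how many times larger float32 is

// prints "57.3% smaller, 2.34x"
console.log(`${(reduction * 100).toFixed(1)}% smaller, ${ratio.toFixed(2)}x`);
```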

Read more #1310

RobertCraigie

Thanks!

src/resources/embeddings.ts

IDisposable · 2025-03-11T03:42:08Z

This is a great idea! Who doesn't want 1/4 the network bandwidth?

RobertCraigie

Sorry for the delayed review, this looks good! Some minor comments and you left a test.only change in.

Will merge once comments have been addressed.

tests/api-resources/embeddings.test.ts

src/resources/embeddings.ts

manekinekko · 2025-03-18T13:57:02Z

> Sorry for the delayed review, this looks good! Some minor comments and you left a test.only change in.
>
> Will merge once comments have been addressed.

@RobertCraigie no worries about the delay. Thank you for the reviews. I addressed your suggestions.

manekinekko · 2025-03-27T21:42:05Z

@RobertCraigie PR is now ready for review. Thank you.

RobertCraigie

Thanks! Sorry again for the delay.

I pushed a commit removing some debug logs as our debug logging system is not particularly great right now so they'd be too verbose IMO. (logging will be fixed in the next major version)

RobertCraigie · 2025-03-28T20:39:27Z

src/core.ts

+export const toFloat32Array = (base64Str: string): Array<number> => {
+  if (typeof Buffer !== 'undefined') {
+    // for Node.js environment
+    return Array.from(new Float32Array(Buffer.from(base64Str, 'base64').buffer));


curious if you've benchmarked how much of a difference just returning the Float32Array directly would make?

if it's a big difference we should probably have an opt-in flag to just do that. (doesn't block this PR)
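For reference, a sketch of what returning the typed array directly might look like in Node.js. The names `toFloat32ArrayView` and `asArray` are hypothetical and not part of this PR. The sketch copies the decoded bytes into a fresh buffer rather than wrapping `buf.buffer`, because a Node.js `Buffer` can be a view into a larger pooled `ArrayBuffer`, in which case `.buffer` alone may cover more bytes than the decoded data.

```typescript
// Hypothetical helper: decode base64 into a Float32Array without Array.from.
// Assumes a little-endian host, since Float32Array uses the platform byte order.
function toFloat32ArrayView(base64Str: string): Float32Array {
  const buf = Buffer.from(base64Str, 'base64');
  // Copy into a fresh ArrayBuffer whose length is a multiple of 4, so the
  // Float32Array covers exactly the decoded bytes and nothing from the pool.
  const bytes = new Uint8Array(buf.length - (buf.length % 4));
  bytes.set(buf.subarray(0, bytes.length));
  return new Float32Array(bytes.buffer);
}

// Converting to a plain array (what the reviewed code does) is one extra pass.
const asArray = (base64Str: string): number[] => Array.from(toFloat32ArrayView(base64Str));
```

Whether skipping `Array.from` matters in practice would need to be measured; this sketch only shows the two shapes side by side.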

manekinekko requested a review from a team as a code owner February 8, 2025 17:09

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch from 7702d54 to 270861b Compare February 8, 2025 17:09

manekinekko mentioned this pull request Feb 8, 2025

Perf: Improve vector embeddings creation by 60% #1310

Open

1 task

RobertCraigie requested changes Feb 11, 2025

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch from 83fbd84 to 185dbe5 Compare February 24, 2025 21:10

manekinekko requested a review from RobertCraigie February 28, 2025 20:19

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch 2 times, most recently from e34c241 to fd14cdf Compare March 6, 2025 15:26

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch from fd14cdf to 362f02f Compare March 12, 2025 09:31

RobertCraigie requested changes Mar 18, 2025

manekinekko requested a review from RobertCraigie March 18, 2025 13:55

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch 2 times, most recently from c50fa5f to 84180db Compare March 25, 2025 09:02

manekinekko and others added 10 commits March 27, 2025 22:04

chore: move toFloat32Array to core.ts

992d731

chore: default to base64 if user didn't provide encoding_format

d8dde73

Closes openai#1310

chore: update tests

bb0f8da

Update tests/api-resources/embeddings.test.ts

f17bc73

Co-authored-by: Robert Craigie <robert@craigie.dev>

Update tests/api-resources/embeddings.test.ts

c3bda2e

Co-authored-by: Robert Craigie <robert@craigie.dev>

Update tests/api-resources/embeddings.test.ts

de662fa

Co-authored-by: Robert Craigie <robert@craigie.dev>

Update src/resources/embeddings.ts

ec27e05

Co-authored-by: Robert Craigie <robert@craigie.dev>

Update tests/api-resources/embeddings.test.ts

ca9a5cb

Co-authored-by: Robert Craigie <robert@craigie.dev>

fix: refactor encoding_format handling for clarity

d2bc20b

manekinekko force-pushed the perf/wassim-chegham-issue-1310 branch from 84180db to d2bc20b Compare March 27, 2025 21:04

remove some debug logs

d20005b

RobertCraigie approved these changes Mar 28, 2025

RobertCraigie changed the base branch from master to next March 28, 2025 20:45

RobertCraigie changed the title ~~perf(embedding): always request embedding creation as base64~~ perf(embedding): default embedding creation as base64 Mar 28, 2025

RobertCraigie changed the title ~~perf(embedding): default embedding creation as base64~~ perf(embedding): default embedding creation to base64 Mar 28, 2025

RobertCraigie merged commit ce2157b into openai:next Mar 28, 2025
4 of 5 checks passed

stainless-app bot mentioned this pull request Mar 28, 2025

release: 4.91.0 #1429

Open
