
[agents] compute-embeddings: Implement batch and async execution #337

Merged · 7 commits into main · Sep 5, 2023
Conversation

eolivelli (Member) commented Sep 4, 2023

Sending batches of requests to the API reduces the number of calls and mitigates the risk of hitting rate-limit errors.

Summary:

  • the compute-embeddings agent now sends API requests in batches and is fully async
  • unfortunately the OpenAI Java client is not async, so only Vertex AI and Hugging Face REST are truly async
  • introduced a new parameter batch-size in the compute-embeddings agent, with a default value of 10
  • introduced a new parameter flush-interval in the compute-embeddings agent, with a default of 0 (ms)
  • the API call is executed on reaching batch-size pending records or after flush-interval ms of waiting (see the sketch after this list)
  • the feature is disabled by default because it could add extra latency to chatbot pipelines
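
To illustrate the trigger logic, here is a minimal sketch of how such a batcher can work. This is not the actual agent code: the `EmbeddingsBatcher` class and the `EmbeddingsService` interface below are hypothetical stand-ins, assuming an async client that accepts a list of texts per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical interface; a real embeddings client would sit behind this.
interface EmbeddingsService {
    CompletableFuture<List<List<Double>>> computeEmbeddings(List<String> texts);
}

/**
 * Accumulates texts and flushes them in a single API call when batchSize
 * pending records are reached or when flushIntervalMs elapses, whichever
 * comes first. With flushIntervalMs == 0 batching is effectively disabled
 * and every record is flushed immediately (the backward-compatible default).
 */
class EmbeddingsBatcher {
    private final EmbeddingsService service;
    private final int batchSize;
    private final long flushIntervalMs;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    private List<String> pending = new ArrayList<>();
    private List<CompletableFuture<List<Double>>> pendingResults = new ArrayList<>();
    private ScheduledFuture<?> flushTimer;

    EmbeddingsBatcher(EmbeddingsService service, int batchSize, long flushIntervalMs) {
        this.service = service;
        this.batchSize = batchSize;
        this.flushIntervalMs = flushIntervalMs;
    }

    synchronized CompletableFuture<List<Double>> add(String text) {
        CompletableFuture<List<Double>> result = new CompletableFuture<>();
        pending.add(text);
        pendingResults.add(result);
        if (flushIntervalMs <= 0 || pending.size() >= batchSize) {
            flush(); // batching disabled or batch full: flush right away
        } else if (flushTimer == null) {
            // first record of a new batch: arm the flush timer
            flushTimer = scheduler.schedule(this::flush, flushIntervalMs, TimeUnit.MILLISECONDS);
        }
        return result;
    }

    private synchronized void flush() {
        if (flushTimer != null) {
            flushTimer.cancel(false);
            flushTimer = null;
        }
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = pending;
        List<CompletableFuture<List<Double>>> results = pendingResults;
        pending = new ArrayList<>();
        pendingResults = new ArrayList<>();
        // One API call for the whole batch; complete each record's future.
        service.computeEmbeddings(batch).whenComplete((vectors, error) -> {
            for (int i = 0; i < results.size(); i++) {
                if (error != null) {
                    results.get(i).completeExceptionally(error);
                } else {
                    results.get(i).complete(vectors.get(i));
                }
            }
        });
    }
}
```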

Interesting things to document:

  • you should enable batching (batch-size > 0 and flush-interval > 0) on pipelines that are not latency-sensitive (such as background text processing)
  • on chatbot pipelines it is better to keep the defaults (flush-interval at 0) in order to reduce latency

Example of a pipeline with batching enabled:
```yaml
pipeline:
  - name: "compute-embeddings"
    id: "step1"
    type: "compute-ai-embeddings"
    output: "chunks-topic"
    configuration:
      model: "text-embedding-ada-002" # This needs to match the name of the model deployment, not the base model
      embeddings-field: "value.embeddings_vector"
      text: "{{% value.text }}"
      batch-size: 10
      flush-interval: 1000
```
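
With this configuration the agent issues one API call per batch of up to 10 records, and any partially filled batch is flushed after 1000 ms, so a record waits at most one second before its embedding is computed.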

Backward compatibility
This PR is 100% backward compatible because flush-interval defaults to 0, so the agent keeps computing a single embedding at a time.

@eolivelli eolivelli closed this Sep 4, 2023
@eolivelli eolivelli reopened this Sep 4, 2023
@eolivelli eolivelli merged commit d2bb1ea into main Sep 5, 2023
8 checks passed
@eolivelli eolivelli deleted the impl/batch-embeddings branch September 5, 2023 06:40