[FEATURE] Support batch ingestion in TextEmbeddingProcessor & SparseEncodingProcessor #743

Open
chishui opened this issue May 9, 2024 · 5 comments

chishui commented May 9, 2024

Is your feature request related to a problem?

RFC: opensearch-project/OpenSearch#12457

Batch ingestion logic was implemented in OpenSearch core in version 2.14. We now want to enable the same capability in the neural-search processors TextEmbeddingProcessor and SparseEncodingProcessor, so that we can better utilize the remote ML server's GPU capacity and accelerate the ingestion process. Based on our benchmark, batching can reduce total ingestion time by 77% without throttling errors (P90, SageMaker). Please refer to the benchmark results for details.

What solution would you like?

  1. In InferenceProcessor, override Processor's batchExecute API and add a default implementation that combines the List<String> inferenceText from multiple docs, then reuses mlCommonsClientAccessor.inferenceSentences and mlCommonsClientAccessor.inferenceSentencesWithMapResult. After getting the inference results, map them back to each doc and update the docs.
  2. Sort the docs by length before sending them for inference to achieve better performance. The inference results are restored to the original order before being processed.
    (This was originally proposed in ml-commons. But as @ylwu-amzn suggested, we can reuse input_docs_processed_step_size as the max batch size, so it makes more sense to sort the docs in neural-search, where we can ensure that docs from TextImageEmbeddingProcessor are not sorted.)
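
The two steps above can be sketched roughly as follows. This is a minimal, standalone illustration and not the actual neural-search code: the class `BatchInferenceSketch`, the method `batchInfer`, and the `inference` function parameter are hypothetical stand-ins (the latter for a call such as mlCommonsClientAccessor.inferenceSentences), and the sketch assumes the model returns exactly one result per input text, in input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical sketch; names here are illustrative, not neural-search APIs.
class BatchInferenceSketch {

    /**
     * Combines the texts of all docs, sorts them by length, runs a single
     * inference call, restores the original order, and splits the results
     * back per doc. `inference` stands in for the remote model call.
     */
    static <R> List<List<R>> batchInfer(List<List<String>> textsPerDoc,
                                        Function<List<String>, List<R>> inference) {
        // Step 1: combine the inference texts from every doc into one list.
        List<String> flat = new ArrayList<>();
        for (List<String> texts : textsPerDoc) {
            flat.addAll(texts);
        }

        // Step 2: positions of `flat`, ordered by text length (stable sort).
        List<Integer> order = IntStream.range(0, flat.size()).boxed()
            .sorted(Comparator.comparingInt(i -> flat.get(i).length()))
            .collect(Collectors.toList());
        List<String> sorted = new ArrayList<>();
        for (int position : order) {
            sorted.add(flat.get(position));
        }

        // One inference call for the whole batch.
        List<R> sortedResults = inference.apply(sorted);
        if (sortedResults.size() != sorted.size()) {
            throw new IllegalStateException("expected one result per input text");
        }

        // Restore the original order: result k belongs to position order[k].
        List<R> restored = new ArrayList<>(Collections.<R>nCopies(flat.size(), null));
        for (int k = 0; k < order.size(); k++) {
            restored.set(order.get(k), sortedResults.get(k));
        }

        // Map the results back to each doc by walking the docs in order.
        List<List<R>> perDoc = new ArrayList<>();
        int offset = 0;
        for (List<String> texts : textsPerDoc) {
            perDoc.add(new ArrayList<>(restored.subList(offset, offset + texts.size())));
            offset += texts.size();
        }
        return perDoc;
    }

    public static void main(String[] args) {
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("a much longer sentence", "hi"),
            Arrays.asList("medium text"));
        // Fake model for the demo: the "embedding" is just the text length.
        List<List<Integer>> results = batchInfer(docs,
            texts -> texts.stream().map(String::length).collect(Collectors.toList()));
        System.out.println(results);
    }
}
```

Sorting an index list rather than the texts themselves keeps the permutation around, so restoring the original order is a single pass and duplicate texts are handled without extra bookkeeping.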

What alternatives have you considered?

N/A

Do you have any additional context?

N/A

@zhichao-aws
Member

Hi @chishui, could you please provide an example API request body to create a batched ingest processor?

@martin-gaievski
Member

We are not adding this feature for TextImageEmbeddingProcessor. Is there a plan to do it later?

@chishui
Author

chishui commented May 20, 2024

@zhichao-aws there are no changes to how TextEmbeddingProcessor and SparseEncodingProcessor are created. The processors only see documents in batches when the user calls the _bulk API with the batch_size parameter.
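
As a rough illustration of such a call, the sketch below builds (but does not send) a _bulk request with Java's built-in HTTP client. It assumes batch_size is passed as a query parameter on _bulk; the host, the index name `my-index`, the pipeline name `my-nlp-pipeline`, the field `passage_text`, and the class and method names are all hypothetical.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical sketch; host, index, pipeline, and field names are made up.
class BulkBatchRequestSketch {

    /** Builds a _bulk request that asks for documents to be ingested in batches. */
    static HttpRequest buildBulkRequest(String host, String pipeline,
                                        int batchSize, String ndjsonBody) {
        // Assumption: batch_size is a query parameter on the _bulk endpoint.
        URI uri = URI.create(host + "/_bulk?pipeline=" + pipeline
            + "&batch_size=" + batchSize);
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/x-ndjson")
            .POST(HttpRequest.BodyPublishers.ofString(ndjsonBody))
            .build();
    }

    public static void main(String[] args) {
        // The bulk body is newline-delimited JSON and must end with a newline.
        String body = String.join("\n",
            "{\"index\":{\"_index\":\"my-index\",\"_id\":\"1\"}}",
            "{\"passage_text\":\"hello world\"}",
            "{\"index\":{\"_index\":\"my-index\",\"_id\":\"2\"}}",
            "{\"passage_text\":\"a second document\"}") + "\n";
        HttpRequest request = buildBulkRequest(
            "http://localhost:9200", "my-nlp-pipeline", 2, body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The pipeline definition itself is unchanged in this sketch; only the bulk call carries the batch size.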

@chishui
Author

chishui commented May 21, 2024

@martin-gaievski we don't have a plan to support TextImageEmbeddingProcessor as it requires text and image to be grouped in one request.

@chishui
Author

chishui commented May 21, 2024

Updated description in "What solution would you like?" section.
