Replies: 1 comment
I'm also having this issue and did a similar workaround to yours, but I'm not sure how well it will work for some Embeddings. I'm currently using `BedrockEmbeddings`, and while it seems to work, I'm not 100% sure it behaves as expected, since Boto3 (the AWS SDK) is not thread/process-safe in some cases and doesn't properly support async. LangChain's async methods usually just call `run_in_executor` to achieve async behavior; I don't know how this works in depth, but I assume it effectively makes the code multithreaded. It would be good to get input from the LangChain team on this.
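For context, the `run_in_executor` pattern mentioned above can be sketched like this. This is my own minimal illustration, not LangChain's actual code; `embed_sync` is a hypothetical stand-in for a blocking SDK call:

```python
import asyncio
from functools import partial

def embed_sync(texts):
    # Hypothetical stand-in for a blocking SDK call (e.g. a Boto3 invoke).
    return [f"vec({t})" for t in texts]

async def embed_async(texts):
    # The common pattern: delegate the sync call to the default
    # thread-pool executor, so "async" here really means
    # "run the blocking call on a worker thread".
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(embed_sync, texts))

print(asyncio.run(embed_async(["a", "b"])))  # ['vec(a)', 'vec(b)']
```

This is why the thread-safety caveat matters: awaiting such an "async" method still runs the underlying SDK call on a worker thread.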
Feature request
I would like to propose multithreading when initializing a VectorStore or adding texts/documents to it.
Currently, the sync and async methods of `add_texts`, `add_documents`, `from_texts`, and `from_documents` all process texts sequentially. This does not fully utilize the Embeddings API throughput and becomes a bottleneck.

My workaround to this problem was to split the documents into N groups and run `aadd_documents` on each group in parallel, which speeds up the overall embedding process. I think it would be a great feature if `VectorStore` supported something like this internally, so users could get it as a one-liner. One option is to support a `concurrency` parameter in `VectorStore` that defaults to 1. I also noticed `ContextThreadPoolExecutor` already exists, so we could probably leverage that in `VectorStore`.

Let me know if there is already a better way to achieve this!
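As a rough illustration of the workaround described above (the original snippet was not preserved here), a sketch might look like the following. The function name `add_documents_concurrently` and the splitting logic are my own assumptions; only `aadd_documents` is the real async `VectorStore` method:

```python
import asyncio

def split_into_groups(items, n_groups):
    # Split items into n_groups roughly equal contiguous slices.
    k, m = divmod(len(items), n_groups)
    return [items[i * k + min(i, m):(i + 1) * k + min(i + 1, m)]
            for i in range(n_groups)]

async def add_documents_concurrently(vectorstore, docs, concurrency=4):
    # Run aadd_documents on each non-empty group in parallel, then
    # flatten the per-group ID lists back into one list.
    groups = [g for g in split_into_groups(docs, concurrency) if g]
    results = await asyncio.gather(
        *(vectorstore.aadd_documents(g) for g in groups)
    )
    return [doc_id for group in results for doc_id in group]
```

A built-in `concurrency` parameter on `VectorStore` could wrap essentially this logic, falling back to the current sequential behavior when it is 1.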
Motivation
Adding a large number of chunks to a VectorStore currently takes a very long time and easily becomes a bottleneck. There is a workaround, but it is cumbersome to hand-roll the concurrent processing every time.
Proposal (If applicable)
No response