Multiple vector stores and concurrency #76
Hi, thanks for the question!
For in-memory: The chromem-go DB contains multiple "collections", and querying takes place per collection. Concurrent queries on separate collections don't affect each other: a write can be in progress on one collection while you still query another. Concurrent queries on the same collection are also fine as long as no write happens in between, because a read lock guards against data race conditions. Multiple concurrent reads can all access the in-memory data structure at the same time. But when a write operation asks for the lock, the scheduler waits for ongoing reads to finish and then gives the write operation the exclusive lock. A query arriving at that point has to wait until the write is finished, so there can be a delay.
Querying currently uses a number of goroutines matching the number of CPU threads. Because goroutines are "cheap" / "green threads", running concurrent queries (and thus more goroutines than you have CPU threads) doesn't add much overhead, but the CPU is still shared, so you might see a performance decrease. I haven't measured this yet, but could do so if it's a blocking question for you before moving on with chromem-go. So far I did one round of performance improvements in v0.5.0 for what I think is the most common use case (non-concurrent querying without metadata or document filters). Based on users' needs I can do more optimizations, or optimize other use cases.
For persistence: Currently the persistence is a fairly naive implementation, with one file per document, while the data is also kept in memory. This means there's no performance penalty for querying, because the data isn't read from disk; only writes go to disk. Persistent data is only read on DB initialization. I have plans for several enhancements around persistence. One is to use a write-ahead or append-only log, with one file for the DB or collection instead of one per document; this will make concurrent writes more performant. Another is to offload document contents (not embeddings) to files instead of keeping them in memory. This will lead to a huge reduction in memory usage for the entire DB, and querying should stay almost as fast as before as long as no document filtering is used, because only the final n documents will require a read from disk. These should be optional features so users can opt in to them.
For document filtering: Filters on document content slow down a query a lot, because I haven't done any optimization for that use case yet. There are probably some low-hanging fruits to improve the performance here. The next level of performance (for document content filtering) could be achieved with something like roaring bitmaps, but I'd only look into that when users start voicing a need for it, or when I need it myself.
So far this is a side-project and I maintain it while working a regular job. Thanks to the library being dependency-free there's a lower need for regular updates (library version bumps for security fixes or general improvements), so I'm confident that I'm able to maintain it for a while, but without any sponsorship I won't make any promises. There's always the option to make someone from the community a maintainer, or move the project to a GitHub organization with members from the community.
Thanks for the kind words! 🙂 I hope I was able to answer most of your questions. Otherwise feel free to ask in more detail, or to ask follow-up questions.
Ah, P.S.: If we can land #48, it should lead to another performance improvement for queries.
P.P.S.: Another feature on the roadmap for persistence is to not be file-based at all, but allow the user to pass an implementation of any kind of key-value store, for example with https://github.com/philippgille/gokv, and store documents there. But just like with the write-ahead or append-only log, this would only affect writes, not reads.
All great info, thank you so much. I'm looking to make a knowledge base creation and curation app, with the ability to query the data directly as well as chat with it, all in Go. I'm planning to go very deep with it. Happy to meet up and discuss if you'd like.
That's particularly great info about the filters and wheres. That means I probably need to keep related source docs from different source mediums in separate collections with similar names, to be able to query all the web-crawled material from a particular knowledge base as opposed to the GitHub code. Trying to get granular and orthogonal with the options. The data sources would be an abstraction on top of the vector stores and collections.
I've thought about creating something similar, without curation, but with pluggable data sources. Some apps exist in the space, like Danswer (in Python). Frameworks like LlamaIndex and Haystack (also both Python) show that this is a popular use case for vector stores.
Do you mean the code example in the GitHub repo that uses "knowledge-base" as the collection name? Then yes, that's just a simplified example, and I would suggest using one collection per data source.
We're building something in the "knowledge base" realm as well over here: gptscript-ai/knowledge.
May I consider the questions as answered? I'll close the issue, but feel free to ask for it to be reopened, or I can enable the "discussion" feature in this repo to have a separate section for topics that aren't issues.
Curious about the expected performance under concurrent loads and/or with multiple persistent (long-term) and in-memory (short-term) vector stores being queried simultaneously. Not that I have something to throw at it that would test its limits; am just curious. Trying to make an informed decision for maybe a 10+ year runway on upgrades. Other folks might be curious too. This project is vastly underrated in my opinion.