VectorDB and Judge Parallelism #55
AlexCuadron
left a comment
Quite nice PR! The biggest issues I found are the widespread use of RLock, even where it is not needed, and docstrings that do not follow the Google style.
vcache/vcache_core/cache/embedding_store/vector_db/strategies/faiss.py
Outdated
vcache/vcache_core/cache/embedding_store/vector_db/strategies/faiss.py
Outdated
vcache/vcache_core/cache/embedding_store/embedding_metadata_storage/strategies/in_memory.py
Thank you for the comments @AlexCuadron. I fixed them.
AlexCuadron
left a comment
Is this correct?
AlexCuadron
left a comment
Quite nice PR. There are some potential threading errors; better to fix them now than later.
    self._init_vector_store(len(embedding))
    if self.collection.count() == 0:
        return []
    k_ = min(k, self.collection.count())
self.index.ntotal could change between the check and usage if another thread adds/removes embeddings.
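One way to close this check-then-act gap is to read the size once while holding a lock and reuse that snapshot. The sketch below is only an illustration: `TinyVectorStore` and its brute-force search are hypothetical stand-ins, not the vcache, FAISS, or Chroma API.

```python
import threading


class TinyVectorStore:
    """Hypothetical stand-in for a vector index; not the vcache or FAISS API."""

    def __init__(self):
        self._lock = threading.Lock()
        self._vectors = []

    def add(self, vector):
        with self._lock:
            self._vectors.append(vector)

    def get_knn(self, query, k):
        # Read the size once, inside the lock, and reuse that snapshot so the
        # emptiness check and the clamp on k both see the same value.
        with self._lock:
            total = len(self._vectors)
            if total == 0:
                return []
            k_ = min(k, total)
            # Brute-force squared-distance ranking, kept inside the lock so
            # the set of vectors cannot change while it is being searched.
            scored = sorted(
                (sum((a - b) ** 2 for a, b in zip(query, v)), i)
                for i, v in enumerate(self._vectors)
            )
            return [i for _, i in scored[:k_]]
```

Holding the lock for the whole search is the simplest option; whether that is acceptable for latency depends on the real index.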
vcache/vcache_core/cache/embedding_store/embedding_metadata_storage/strategies/in_memory.py
    self._init_vector_store(len(embedding))

    # Atomic ID generation and assignment
    embedding_id = self.__next_embedding_id
If add_with_ids() fails, __next_embedding_id is not incremented, but the ID was already assigned and returned, causing ID reuse.
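A possible fix, sketched under assumptions: reserve the ID by advancing the counter before calling the index, all under one lock. A failed insert then leaves a gap in the ID sequence, but no ID is handed out twice. `FlakyIndex` and `IdAllocatingStore` are hypothetical names for illustration; only the `add_with_ids` name is taken from the comment above, and its signature here is simplified.

```python
import threading


class FlakyIndex:
    """Hypothetical index whose first insert fails, to exercise the error path."""

    def __init__(self):
        self.ids = []
        self._fail_next = True

    def add_with_ids(self, vector, embedding_id):
        if self._fail_next:
            self._fail_next = False
            raise RuntimeError("simulated insert failure")
        self.ids.append(embedding_id)


class IdAllocatingStore:
    def __init__(self, index):
        self._index = index
        self._lock = threading.Lock()
        self._next_embedding_id = 0

    def add(self, vector):
        with self._lock:
            # Reserve the ID before touching the index. If the insert raises,
            # the ID is skipped (a gap), but it is never reused.
            embedding_id = self._next_embedding_id
            self._next_embedding_id += 1
            self._index.add_with_ids(vector, embedding_id)
            return embedding_id
```

If gaps in the ID sequence are undesirable, the alternative is to increment only after a successful insert, still inside the same lock.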
        embedding_id (int): The ID of the embedding to update.
        observation (Tuple[float, int]): The observation tuple (similarity, label).
    """
    entry_lock = self._get_entry_lock(embedding_id)
Lock ordering violation: if another thread holds _store_lock and tries to acquire entry_lock, a deadlock can occur. Using `with entry_lock` should already fix this, I think (?)
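To illustrate the ordering concern, here is a minimal sketch in which the store-level lock is held only long enough to look up or create the per-entry lock and is released before the entry lock is acquired. Since the two locks are never nested, they cannot form a cycle. `MetadataStore` and its methods are hypothetical; only the `_store_lock` and `entry_lock` names echo the comment above.

```python
import threading


class MetadataStore:
    """Hypothetical store; the names echo the review, but this is not vcache code."""

    def __init__(self):
        self._store_lock = threading.Lock()
        self._entries = {}  # embedding_id -> (entry_lock, observations)

    def _get_entry(self, embedding_id):
        # _store_lock guards only the dictionary lookup and creation, and is
        # released before the caller acquires the per-entry lock.
        with self._store_lock:
            entry = self._entries.get(embedding_id)
            if entry is None:
                entry = (threading.Lock(), [])
                self._entries[embedding_id] = entry
            return entry

    def add_observation(self, embedding_id, observation):
        entry_lock, observations = self._get_entry(embedding_id)
        with entry_lock:
            # No code path takes _store_lock while holding an entry lock.
            observations.append(observation)

    def get_observations(self, embedding_id):
        entry_lock, observations = self._get_entry(embedding_id)
        with entry_lock:
            return list(observations)
```

A plain `threading.Lock` suffices here because no method re-acquires a lock it already holds; `RLock` would only be needed if that changed.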
This async logic is too complicated. We have a better proposal in #65, so I am closing this PR.
The LLMSimilarityEvaluator requires async execution to keep end-to-end latency low.
To achieve that, I multi-threaded the exploration logic. To ensure thread safety, the metadata objects and Vector DB ID generation are guarded by RW locks.