Force-pushed from aebc34a to e622b4f.
Pull request overview
This PR removes the persisted in-memory “index” from PersistentKB state and shifts chunk lookup/deletion to be driven by engine metadata (source) plus filesystem scanning of assetDir. It also introduces a lightweight in-memory MockEngine and a Ginkgo/Gomega test suite to exercise persistence behavior without embeddings.
Changes:
- Drop `CollectionState.Index`/`PersistentKB.index` and derive document listing from the on-disk UUID layout.
- Extend the `Engine` interface with `GetBySource` and implement it for Postgres/Chromem/Mock (LocalAI stubbed).
- Add mock-based persistency tests covering store/list/search/reset/remove and external sources.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| rag/persistency.go | Removes state/index map and switches document/chunk operations to filesystem keys + Engine.GetBySource; updates reset/repopulate/migration logic. |
| rag/engine.go | Extends Engine interface with GetBySource. |
| rag/engine/postgres.go | Adds GetBySource query implementation for Postgres-backed engine. |
| rag/engine/chromem.go | Adds GetBySource implementation using a metadata-filtered query. |
| rag/engine/localai.go | Adds GetBySource stub returning “not implemented”. |
| rag/engine/mock.go | Adds new in-memory MockEngine for tests (no embeddings/external deps). |
| rag/persistency_mock_test.go | Adds a comprehensive mock-engine test suite for PersistentKB. |
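
As a rough illustration of what a source-based lookup looks like in an in-memory engine, here is a sketch that filters stored chunks by their `source` metadata. The `Result` and `memEngine` types and both method signatures are assumptions made for this example. They are not the PR's `MockEngine` or the real `Engine` interface.

```go
package main

import "fmt"

// Result is a hypothetical stand-in for the engine's chunk type.
type Result struct {
	ID       string
	Content  string
	Metadata map[string]string
}

// memEngine is a hypothetical in-memory engine used only for illustration.
type memEngine struct {
	docs []Result
}

// Store appends a chunk and returns its generated ID.
func (m *memEngine) Store(content string, metadata map[string]string) string {
	id := fmt.Sprintf("chunk-%d", len(m.docs)+1)
	m.docs = append(m.docs, Result{ID: id, Content: content, Metadata: metadata})
	return id
}

// GetBySource returns every stored chunk whose "source" metadata matches.
func (m *memEngine) GetBySource(source string) []Result {
	var out []Result
	for _, d := range m.docs {
		if d.Metadata["source"] == source {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	e := &memEngine{}
	e.Store("first chunk", map[string]string{"source": "uuid-1/a.txt"})
	e.Store("second chunk", map[string]string{"source": "uuid-1/a.txt"})
	e.Store("unrelated", map[string]string{"source": "uuid-2/b.txt"})

	fmt.Println(len(e.GetBySource("uuid-1/a.txt"))) // prints 2
}
```

Because the lookup key lives in chunk metadata, removing a document only needs the source key and the engine, with no separate map from document to chunk IDs.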
Comments suppressed due to low confidence (1)
rag/persistency.go:200
- PersistentKB.Reset ignores errors from os.RemoveAll/os.MkdirAll/db.save and os.RemoveAll(db.path), so Reset can return nil even when the on-disk state wasn’t actually cleared. Please handle and propagate these errors (and consider ordering so Engine.Reset failures don’t leave disk/state partially reset).

      func (db *PersistentKB) Reset() error {
          db.Lock()
          os.RemoveAll(db.assetDir)
          os.MkdirAll(db.assetDir, 0755)
          db.sources = []*ExternalSource{}
          db.save()
          db.Unlock()
          if err := db.Engine.Reset(); err != nil {
              return err
          }
          os.RemoveAll(db.path)
          return nil
      }

        os.Remove(oldPath)
        xlog.Info("Migrated entry", "old_key", fileName, "new_key", filepath.Join(fileUUID, fileName))
    }

Comment on lines +172 to +182

    count := c.collection.Count()
    if count == 0 {
        return nil, nil
    }

    // Use Query with a where filter to find documents by source metadata.
    // We use a dummy query and request all documents, relying on the where
    // filter to narrow results.
    res, err := c.collection.Query(ctx, ".", count, map[string]string{"source": source}, nil)
    if err != nil {
        return nil, fmt.Errorf("error querying by source: %v", err)

    Expect(filepath.Base(docs[0])).To(Equal("replace.txt"))

    // Count should be roughly the same (old chunks removed, new added)
    Expect(kb.Count()).To(BeNumerically("~", countAfterFirst, countAfterFirst))
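
For context on the assertion above: Gomega's `~` comparator treats the third argument as an absolute threshold, so passing `countAfterFirst` as its own tolerance accepts any count from 0 up to twice the expected value. The sketch below reproduces that check in plain Go. `within` is a hypothetical helper that mirrors the comparison, not Gomega's code, and the count of 4 is made up.

```go
package main

import (
	"fmt"
	"math"
)

// within reports whether actual lies inside expected ± tolerance.
// Hypothetical helper that mirrors an absolute-threshold comparison.
func within(actual, expected, tolerance float64) bool {
	return math.Abs(actual-expected) <= tolerance
}

func main() {
	const countAfterFirst = 4.0 // made-up chunk count for illustration

	// Tolerance equal to the expected value: almost anything passes.
	fmt.Println(within(0, countAfterFirst, countAfterFirst)) // true
	fmt.Println(within(8, countAfterFirst, countAfterFirst)) // true

	// A tolerance of one chunk would catch a doubled count.
	fmt.Println(within(5, countAfterFirst, 1)) // true
	fmt.Println(within(8, countAfterFirst, 1)) // false
}
```

A tighter threshold, or an exact equality check if chunking is deterministic, would make the test fail when old chunks are not removed on replace.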

    )

    // newMockKB creates a PersistentKB backed by a MockEngine.
    // It writes a minimal state file so the constructor skips the embedding check.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Force-pushed from e622b4f to 36b3173.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Force-pushed from c27f9de to 2c4903b.
This PR drops the in-memory index and uses the filesystem instead.