@enitrat commented Oct 5, 2025

Ingestion: Incremental Vector Store Updates (TypeScript)

  • `postgresVectorStore.ts`
    • Added `updateDocumentsMetadata()` to update metadata (and the `source` column) in place by `uniqueId`, without re-embedding or replacing content.
  • `vectorStoreUtils.ts`
    • `findChunksToUpdateAndRemove()` now returns three buckets: `contentChanged`, `metadataOnlyChanged`, and `chunksToRemove`.
    • `updateVectorStore()` now:
      • Removes documents that no longer exist.
      • Upserts content-changed documents via `addDocuments()` (re-embeds as before).
      • Updates metadata-only documents via `updateDocumentsMetadata()` (no re-embed).
  • Tests
    • Updated `ingesters/__tests__/vectorStoreUtils.test.ts` to cover content-changed vs metadata-only cases; all tests pass.
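
For readers skimming the change, the three-bucket split can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Chunk` shape, the `contentHash` field, the `diffChunks` and `applyDiff` helpers, and the `removeDocuments` method name are all assumptions made for the example. Only `addDocuments()`, `updateDocumentsMetadata()`, and the bucket names come from the description above.

```typescript
// Hypothetical chunk shape; the real types in vectorStoreUtils.ts may differ.
interface Chunk {
  uniqueId: string;
  contentHash: string; // assumed: some digest of the chunk's text
  metadata: Record<string, unknown>;
}

// Bucket names mirror the ones listed in the PR description.
interface ChunkDiff {
  contentChanged: Chunk[];
  metadataOnlyChanged: Chunk[];
  chunksToRemove: string[]; // uniqueIds present in the store but not in the fresh set
}

// Illustrative stand-in for findChunksToUpdateAndRemove(); not the project's code.
function diffChunks(fresh: Chunk[], stored: Chunk[]): ChunkDiff {
  const storedById = new Map<string, Chunk>(
    stored.map((c): [string, Chunk] => [c.uniqueId, c]),
  );
  const freshIds = new Set(fresh.map((c) => c.uniqueId));
  const diff: ChunkDiff = {
    contentChanged: [],
    metadataOnlyChanged: [],
    chunksToRemove: [],
  };

  for (const chunk of fresh) {
    const existing = storedById.get(chunk.uniqueId);
    if (!existing || existing.contentHash !== chunk.contentHash) {
      // New chunk or changed text: needs a fresh embedding.
      diff.contentChanged.push(chunk);
    } else if (
      // Naive comparison; sensitive to key order, which is fine for a sketch.
      JSON.stringify(existing.metadata) !== JSON.stringify(chunk.metadata)
    ) {
      // Same text, different metadata: no re-embedding required.
      diff.metadataOnlyChanged.push(chunk);
    }
  }

  for (const c of stored) {
    if (!freshIds.has(c.uniqueId)) diff.chunksToRemove.push(c.uniqueId);
  }
  return diff;
}

// Minimal store surface for the sketch; removeDocuments is a hypothetical name.
interface VectorStoreLike {
  removeDocuments(ids: string[]): Promise<void>;
  addDocuments(chunks: Chunk[]): Promise<void>;
  updateDocumentsMetadata(chunks: Chunk[]): Promise<void>;
}

// Applies the three buckets in the order given in the description.
async function applyDiff(store: VectorStoreLike, diff: ChunkDiff): Promise<void> {
  if (diff.chunksToRemove.length > 0) {
    await store.removeDocuments(diff.chunksToRemove);
  }
  if (diff.contentChanged.length > 0) {
    await store.addDocuments(diff.contentChanged); // re-embeds
  }
  if (diff.metadataOnlyChanged.length > 0) {
    await store.updateDocumentsMetadata(diff.metadataOnlyChanged); // no re-embed
  }
}
```

Under these assumptions, a chunk whose text is unchanged but whose metadata differs lands only in `metadataOnlyChanged`, so it never reaches the embedding step.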

@enitrat force-pushed the feat-ingester-partial-update branch from 8cef519 to 781742f on October 5, 2025 08:56
@enitrat merged commit 445dc12 into main on Oct 5, 2025
1 of 3 checks passed