Add staged background ingest promotion #34
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce589e1e65
    except Exception:
        self._backend.delete_index_batch(batch_id)
        raise
Make batch-failure cleanup safe after partial promotion
This handler always calls delete_index_batch when promote_index_batch raises, but promotion itself updates visibility across multiple tables in sequence (activate new rows, then deactivate old rows). If an error happens mid-promotion (for example after old embeddings are deactivated but before all tables finish), this cleanup can delete the newly activated batch and leave part of the old memory already hidden, producing an inconsistent or unreadable memory state. The failure path should either be transactional/rollback-safe or avoid deleting staged rows once promotion has started mutating active rows.
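One way to read the second option is to narrow the try block so that cleanup only covers the staging phase. The sketch below is a minimal illustration under stated assumptions, not the PR's code: `FakeBackend`, `stage_index_batch`, `ingest_batch`, and `write_rows` are hypothetical names, and only `promote_index_batch` and `delete_index_batch` come from the diff above.

```python
class FakeBackend:
    """Hypothetical in-memory stand-in for the real backend (illustration only)."""

    def __init__(self, fail_promotion=False):
        self.fail_promotion = fail_promotion
        self.staged = set()
        self.active = set()
        self.deleted = []

    def stage_index_batch(self, batch_id):
        # Hypothetical staging step; the real backend's API may differ.
        self.staged.add(batch_id)

    def promote_index_batch(self, batch_id):
        # New rows become visible first, mirroring the multi-step promotion.
        self.active.add(batch_id)
        if self.fail_promotion:
            raise RuntimeError("failed before old rows were deactivated")
        self.staged.discard(batch_id)

    def delete_index_batch(self, batch_id):
        self.deleted.append(batch_id)
        self.staged.discard(batch_id)
        self.active.discard(batch_id)


def ingest_batch(backend, batch_id, write_rows):
    """Stage a batch, then promote it; delete only if staging itself fails."""
    try:
        backend.stage_index_batch(batch_id)
        write_rows(batch_id)
    except Exception:
        # Safe to delete here: no active rows have been touched yet.
        backend.delete_index_batch(batch_id)
        raise
    # Promotion runs outside the try block. If it fails midway, the batch is
    # left in place so a retry or repair step can finish or roll back.
    backend.promote_index_batch(batch_id)
```

With this shape, a failure while writing staged rows still cleans up, while a failure inside promotion propagates without deleting rows that may already be active. A fully transactional promotion would be stronger, but that depends on what the storage layer supports.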
    self._embeddings_table.search()
        .where(active_row_filter())
        .select(["hash_seq"])
        .limit(10_000_000)
        .to_list()
Avoid capped in-memory scans for count APIs
This count implementation pulls row IDs into Python and calls len(...) on the result, with a hard .limit(10_000_000). It underreports any store larger than 10M rows and can cause large memory and latency spikes before falling into the broad except path. That turns counting into an O(N) materialization step and can return an incorrect 0 on large datasets. Counting should stay backend-side (a filtered count or aggregate) with no hard truncation cap.
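A backend-side count could look roughly like the sketch below. It assumes the table object exposes a `count_rows(filter)` method, as LanceDB tables do; the review does not name the backend, so treat that as an assumption. `CountableTable`, `count_active_rows`, `FakeTable`, and the filter string `"is_active = true"` are hypothetical names used only to keep the example runnable.

```python
from typing import Optional, Protocol


class CountableTable(Protocol):
    """Assumed interface: a table that can count rows matching a filter."""

    def count_rows(self, filter: Optional[str] = None) -> int: ...


def count_active_rows(table: CountableTable, active_filter: str) -> int:
    """Ask the backend for the count; no rows are materialized in Python."""
    return table.count_rows(active_filter)


class FakeTable:
    """Hypothetical stand-in so the sketch runs without a real store."""

    def __init__(self, rows):
        self._rows = rows

    def count_rows(self, filter=None):
        if filter is None:
            return len(self._rows)
        if filter == "is_active = true":
            return sum(1 for row in self._rows if row["is_active"])
        raise ValueError(f"unsupported filter in this fake: {filter!r}")


rows = [{"is_active": True}, {"is_active": False}, {"is_active": True}]
active_count = count_active_rows(FakeTable(rows), "is_active = true")
```

Because the count is computed by the store, there is no truncation cap to outgrow and no row list to hold in memory. Errors from the backend also surface as exceptions rather than being folded into a silent 0.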
Useful? React with 👍 / 👎.
Summary
Validation