Fix OOM: Only keep deltas in memory for a pending txn (#3349)
Currently, we load the entire posting list into memory for each pending txn and keep it there until the txn is committed or aborted. Mutations can easily touch a lot of data (including indices), which gets very expensive in terms of memory usage and causes OOMs during data loads.

This PR fixes that issue by keeping only the deltas that need to be applied to the lists, and discarding the lists as soon as mutation application is done. On a commit, these deltas are written directly to disk. On a read from the same txn, we apply the delta onto a newly read posting list, so a pending txn can read back its own writes.

This dramatically reduces memory usage while mutations are in progress, avoiding OOMs.

Changes:

* Instead of keeping the entire posting list in memory, only keep the deltas. This significantly reduces memory usage, making it negligible.
* Keep track of the max version per posting list and use that to avoid repeat commits.
* Revert changes to the increment tool. They cause two counters to get created.
* Add txn.Update in the right places, so any PLs in cache get converted to diffs.
* Remove CommitToMemory.
Showing with 96 additions and 98 deletions.