[Backport stable/8.4] [Backport main] Fix Migration to 8.3 OOM Failure #16142

Merged
merged 7 commits into from
Jan 30, 2024

Conversation

backport-action
Collaborator

Description

Backport of #16113 to stable/8.4.

relates to #16090 #14975
original author: @backport-action

berkaycanbc and others added 7 commits January 30, 2024 10:40
Add a new MemoryChecker class. It will later be used to limit the memory
usage of each RocksIterator loop in the DbDecisionMigrationState and
DbProcessMigrationState classes.

(cherry picked from commit 9dd6cea)
(cherry picked from commit d580506)
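
To illustrate the idea behind the commit above, here is a minimal sketch of a byte-budget tracker that a loop could consult to decide when to stop. It is not the MemoryChecker class from this PR: the name `MemoryBudget`, its methods, and the accounting scheme are all assumptions made for the example.

```java
/** Hypothetical byte budget; NOT the MemoryChecker class added in this PR. */
final class MemoryBudget {
  private final long limitBytes;
  private long usedBytes;

  MemoryBudget(final long limitBytes) {
    if (limitBytes <= 0) {
      throw new IllegalArgumentException("limitBytes must be positive");
    }
    this.limitBytes = limitBytes;
  }

  /** Records the given number of bytes; returns true while usage is still below the limit. */
  boolean tryAccount(final long bytes) {
    usedBytes += bytes;
    return usedBytes < limitBytes;
  }

  /** Clears the recorded usage, for example before starting a new loop. */
  void reset() {
    usedBytes = 0;
  }

  long usedBytes() {
    return usedBytes;
  }
}
```

A caller would add the size of each key-value pair via `tryAccount` and end the current loop as soon as it returns false.
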
…ation state classes

In the previous implementation, a single RocksIterator instance iterated through
the whole ColumnFamily. Since each RocksIterator instance opens its own transaction,
the previous implementation occupied memory equal to the total size of the key-value
pairs.

RocksDB manages memory with malloc/free. With that allocator, if enough memory is
left in the total allocated memory, freed memory is kept unused, because the allocator
expects it to be used again by the same object. For that reason, when a RocksIterator
occupies too much memory, the memory it frees once it is done might not become
available to other threads.

In the updated implementation, the memory usage per iterator is limited to a fixed size.
This lets the OS reuse memory once an iterator no longer needs it, or at least keeps the
amount of retained but unused memory smaller.

The limit defaults to 50MB, because 50MB appeared to be a sweet spot while testing the
fix. With 100MB, memory usage almost doubled; with 25MB, migration time increased
without any improvement in memory usage.

(cherry picked from commit 4a0de14)
(cherry picked from commit 3232747)
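
The bounded-iterator approach described above can be sketched roughly as follows. This is a simulation under stated assumptions, not the code from this PR: a `NavigableMap` stands in for the ColumnFamily, each pass of the outer loop stands in for one short-lived RocksIterator, and the names `ChunkedScan` and `forEachInChunks` are hypothetical.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.function.BiConsumer;

/** Hypothetical sketch: scan a sorted store in chunks bounded by a byte limit. */
final class ChunkedScan {

  /**
   * Visits every entry in key order and returns the number of chunks used.
   * The visitor must not modify the store while a chunk is being scanned.
   */
  static int forEachInChunks(
      final NavigableMap<String, byte[]> store,
      final long limitBytes,
      final BiConsumer<String, byte[]> visitor) {
    int chunks = 0;
    String resumeAfter = null;
    while (true) {
      // A fresh view per chunk plays the role of a fresh iterator.
      final NavigableMap<String, byte[]> remaining =
          resumeAfter == null ? store : store.tailMap(resumeAfter, false);
      if (remaining.isEmpty()) {
        return chunks;
      }
      chunks++;
      long usedBytes = 0;
      for (final Map.Entry<String, byte[]> entry : remaining.entrySet()) {
        visitor.accept(entry.getKey(), entry.getValue());
        // Approximate size: key length in chars plus value length in bytes.
        usedBytes += entry.getKey().length() + entry.getValue().length;
        resumeAfter = entry.getKey();
        if (usedBytes >= limitBytes) {
          break; // limit reached: end this chunk and resume after the last key
        }
      }
    }
  }
}
```

With the default mentioned in the commit message, a caller would pass `50L * 1024 * 1024` as `limitBytes`. A smaller limit means more chunks and more restarts, which is consistent with the longer migration time observed at 25MB.
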
For the same reason as in the previous commit, we should keep the memory usage of
each transaction low. Otherwise, the retrieved data is kept in memory, and once it
is freed, that memory might not be reused.

If we do not remove this transaction, we will not benefit from limiting the memory of
each `columnFamily.foreach` loop, because the changes stay uncommitted. As a result,
the whole migration data would be held in memory until the transaction is committed.

(cherry picked from commit d94a32b)
(cherry picked from commit e6053f6)
(cherry picked from commit cabdd2a)
(cherry picked from commit e228cf9)
(cherry picked from commit ba60e6d)
(cherry picked from commit 30d0b42)
(cherry picked from commit c4d6466)
(cherry picked from commit 229996a)
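
The effect of removing the enclosing transaction can be illustrated with a small simulation. This is only a sketch under simplified assumptions: writes are modeled as byte counts, a "commit" simply clears the pending total, and `CommitGranularity` and `peakUncommitted` are hypothetical names that do not appear in this PR.

```java
import java.util.List;

/** Hypothetical sketch: peak uncommitted bytes for a given commit threshold. */
final class CommitGranularity {

  /**
   * Replays the writes in order and commits whenever the pending bytes reach
   * the threshold. Returns the highest number of uncommitted bytes observed.
   */
  static long peakUncommitted(final List<Integer> writeSizes, final long commitThresholdBytes) {
    long pending = 0;
    long peak = 0;
    for (final int size : writeSizes) {
      pending += size;
      peak = Math.max(peak, pending);
      if (pending >= commitThresholdBytes) {
        pending = 0; // commit: pending changes no longer need to be held
      }
    }
    return peak;
  }
}
```

Passing `Long.MAX_VALUE` as the threshold models a single transaction around the whole migration, so the peak equals the sum of all writes. A finite threshold models one commit per bounded loop, which keeps the peak close to the threshold.
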
@camundait camundait added this pull request to the merge queue Jan 30, 2024
@lenaschoenburg
Member

Backport-ception 🥳

Merged via the queue into stable/8.4 with commit 208ee85 Jan 30, 2024
28 of 29 checks passed
@camundait camundait deleted the backport-16113-to-stable/8.4 branch January 30, 2024 12:02