Hi, I'm working on FoundationDB and trying to use RocksDB as the underlying storage engine.
OOM was observed in two scenarios.
Expected behavior
Flush memtable should complete without error.
Actual behavior
Out of memory while flushing a memtable with ~1000 deletes.
Sample heap profile (using massif) during recovery
Steps to reproduce the behavior
I reproduced it in our unit tests, but haven't had a chance to write a RocksDB test.
What I did to reproduce the issue
On the flush path
1. open database
2. create a new CF
3. generate range deletions, e.g.
for (int i = 0; i < numRangeDeletions; ++i) {
  writeBatch->Put(cf, prefix + std::to_string(i + 1), value);
  // Every deletion starts at prefix, so all tombstones overlap.
  writeBatch->DeleteRange(cf, prefix, prefix + std::to_string(i));
  db->Write(options, writeBatch);
  writeBatch->Clear();  // reset the batch so ops don't accumulate across iterations
}
4. issue flush to the CF: db->Flush(options, cf);
Depending on the memory limit of the process, you may hit the OOM with fewer range deletes.
On the recovery path
Steps 1-3 are the same as above.
4. close database
5. reopen database
6. OOM during recovery
I believe the flush and the recovery actually use the same code path; both OOMed when creating FragmentedRangeTombstoneList.
I tried issuing a flush to the CF after a certain number of range deletes (e.g. 1000), and it seems to resolve the OOM. However, there are no metrics to track the number of range deletes in a memtable. I have to count the deletes sent to a CF, which is inaccurate because some deletes may already have been flushed.
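For reference, a minimal sketch of that counting workaround; the wrapper class and threshold are hypothetical (not RocksDB API), and the count is approximate for exactly the reason above:

#include <atomic>
#include <rocksdb/db.h>

// Hypothetical helper: count DeleteRange calls per CF and issue a manual
// flush once a threshold is reached. The counter resets on flush, so it may
// over-count deletes that an automatic flush has already cleared.
class RangeDelCountingDB {
 public:
  RangeDelCountingDB(rocksdb::DB* db, size_t threshold)
      : db_(db), threshold_(threshold) {}

  rocksdb::Status DeleteRange(rocksdb::ColumnFamilyHandle* cf,
                              const rocksdb::Slice& begin,
                              const rocksdb::Slice& end) {
    rocksdb::Status s =
        db_->DeleteRange(rocksdb::WriteOptions(), cf, begin, end);
    if (s.ok() && ++count_ >= threshold_) {
      count_ = 0;
      // Bound how many range tombstones get fragmented at once.
      s = db_->Flush(rocksdb::FlushOptions(), cf);
    }
    return s;
  }

 private:
  rocksdb::DB* db_;
  const size_t threshold_;
  std::atomic<size_t> count_{0};
};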
Another thing I tried was reopening the database using ldb_tools; it also OOMed during recovery. The issue was originally discovered in 7.2.6. Since then, we have tried several newer versions, including 8.1.1, and still see the same issue.
Hi, thanks for reporting the issue. Currently, FragmentTombstones() can take more memory if the input range tombstones are overlapping, especially in your case where all range tombstones overlap; the memory overhead is roughly O(N^2) in this case (see the back-of-the-envelope estimate after the test below). I doubt the threshold is 1000 range deletions though; I tried the following test and it consumes about 40MB of memory. It's probably closer to 10k, depending on the amount of available memory.
...
const int kNumRangeDel = 1000;
for (int i = 0; i < kNumRangeDel; ++i) {
  ASSERT_OK(Put(Key(i + 1), std::to_string(i + 1)));
  // All range deletions are overlapping
  ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
                             Key(0), Key(i)));
}
// Flush will trigger fragmentation, which could OOM
ASSERT_OK(Flush());
...
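To make the O(N^2) concrete, here is a rough back-of-the-envelope estimate; the per-entry size is my assumption (a lower bound), not RocksDB's actual layout:

#include <cstdio>

int main() {
  // N fully overlapping tombstones fragment into O(N) disjoint intervals,
  // and each interval keeps a sequence number for every tombstone covering
  // it, so roughly N^2/2 (fragment, seqno) entries get materialized.
  const double n = 10000;            // number of overlapping DeleteRanges
  const double entries = n * n / 2;  // ~5e7 entries at N = 10k
  const double bytes = entries * 8;  // assume 8 bytes per seqno (lower bound)
  std::printf("~%.0f MB just for sequence numbers\n", bytes / 1e6);  // ~400 MB
  return 0;
}

With real per-entry overhead (keys, container capacity) the total is several times larger, which is consistent with ~40MB at N=1000 growing to gigabytes near N=10k.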
There are some potential ways to optimize this that I plan to work on. In the meantime, as a workaround, are you able to issue non-overlapping range deletions?
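For example, a variant of the test above where each deletion only covers the keys written since the previous one, so the tombstones are pairwise disjoint (a sketch reusing the same test helpers):

for (int i = 0; i < kNumRangeDel; ++i) {
  ASSERT_OK(Put(Key(i + 1), std::to_string(i + 1)));
  // Deletes only [Key(i), Key(i + 1)): no overlap with earlier tombstones,
  // so fragmentation stays linear in the number of deletions.
  ASSERT_OK(db_->DeleteRange(WriteOptions(), db_->DefaultColumnFamily(),
                             Key(i), Key(i + 1)));
}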
However, there are no metrics to track the number of range deletes in a memtable. I have to count the deletes sent to a CF, which is inaccurate because some deletes may already have been flushed.
You are right that manually tracking the count can sometimes be inaccurate, but a manual flush should still help. There is an open PR (#11358) that adds this functionality as an option.
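If that PR lands, usage would presumably look something like the following; the option name below is my guess from the PR's description, so check the merged change for the actual API:

rocksdb::ColumnFamilyOptions cf_opts;
// Hypothetical option: flush the memtable once it holds this many range
// deletions (0 = disabled). Verify the real name/semantics against #11358.
cf_opts.memtable_max_range_deletions = 1000;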
I doubt the threshold is 1000 range deletions though; I tried the following test and it consumes about 40MB of memory. It's probably closer to 10k, depending on the amount of available memory.
Right. In the unit test, it OOMed at around 10k deletes. I misread the number when going through the WAL files; the count there is also close to 10k.
There are some potential ways to optimize this that I plan to work on. In the meantime, as a workaround, are you able to issue non-overlapping range deletions?
It's hard for us to issue non-overlapping range deletions. We plan to convert some DeleteRange calls into read-then-delete, sketched below.
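A rough sketch of that conversion (the function name and plumbing are mine; only standard RocksDB calls are used):

#include <memory>
#include <rocksdb/db.h>
#include <rocksdb/write_batch.h>

// Replace one DeleteRange with a scan over [begin, end) plus point deletes.
// This trades the tombstone-fragmentation cost for writing N point deletes.
rocksdb::Status DeleteRangeAsPointDeletes(rocksdb::DB* db,
                                          rocksdb::ColumnFamilyHandle* cf,
                                          const rocksdb::Slice& begin,
                                          const rocksdb::Slice& end) {
  rocksdb::ReadOptions ro;
  ro.iterate_upper_bound = &end;  // stop the scan at `end`
  std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(ro, cf));
  rocksdb::WriteBatch batch;
  for (it->Seek(begin); it->Valid(); it->Next()) {
    batch.Delete(cf, it->key());  // point delete instead of a range tombstone
  }
  if (!it->status().ok()) {
    return it->status();
  }
  return db->Write(rocksdb::WriteOptions(), &batch);
}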
I think #11358 could help with our situation. I also left some comments there.
Will #11358 be included in the next RocksDB release?