HDDS-8195. RDBStore.getUpdatesSince() throws RocksDBException: Requested array size exceeds VM limit#4459
@jojochuang @sumitagrawl @dombizita - pls review.
Please create PR only after reasonably clean CI run in fork. Checkstyle is failing:
Ran both checkstyle and findbugs locally and all issues fixed. Kindly enable workflows.
public class RDBStore implements DBStore {
  private static final Logger LOG =
      LoggerFactory.getLogger(RDBStore.class);
  public static final int MAX_DB_UPDATES_SIZE_THRESHOLD = 1024 * 1024;
Shouldn't this be 1024 * 1024 * 1024?
@jojochuang - Thanks for the review. Yes you are right, I have changed that to 1024 * 1024 * 1024.
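For context on the constant discussed above, here is a minimal arithmetic sketch. The class and constant names are hypothetical and not part of the patch. It only illustrates that `1024 * 1024 * 1024` still fits in an `int`, while anything at or above 2 GiB would need a `long`.

```java
// Hypothetical illustration only; none of these names come from the patch.
class ThresholdArithmeticSketch {
  // 1 MiB, the value originally proposed in the diff.
  static final int ONE_MIB = 1024 * 1024;

  // 1 GiB, the value suggested in review; 2^30 still fits in a signed 32-bit int.
  static final int ONE_GIB = 1024 * 1024 * 1024;

  // 2 GiB does not fit in an int, so the multiplication is done in long arithmetic.
  static final long TWO_GIB = 2L * 1024 * 1024 * 1024;

  // Shows the silent wrap-around when 2 GiB is computed with int arithmetic.
  static int overflowedTwoGiB() {
    int gib = ONE_GIB;
    return gib * 2;
  }
}
```

If the threshold were ever raised beyond 1 GiB, declaring it as `long` (as the constructor parameter in this PR already is) would avoid the wrap-around.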
Thanks, I ran the existing test cases for delta OM DB snapshots that cover this feature, and they ran fine. If you are looking for specific coverage of any use case, please let me know.
sumitagrawl
left a comment
@devmadhuu thanks for working on this, LGTM. A UT for this scenario could be added, e.g. crossing the size limit of the sequence log files.
jojochuang
left a comment
The code looks good to me. But it involves a few moving parts, and a test is really needed to ensure the different components respond properly. For example, does Recon respond to OM properly when the DB update is not successful?
@sumitagrawl Thanks for the review. Additional UT and assertions have been updated. Pls re-review.
@jojochuang - Thanks for the review. Additional UT and assertions have been updated. Pls re-review.
sumitagrawl
left a comment
@devmadhuu Please handle one minor comment.
this(dbFile, options, new ManagedWriteOptions(), families,
    new CodecRegistry(), false, 1000, null, false,
    TimeUnit.DAYS.toMillis(1), TimeUnit.HOURS.toMillis(1));
this.maxDbUpdatesSizeThreshold = MAX_DB_UPDATES_SIZE_THRESHOLD;
Please use the common default value; do not use another default value (80), which will create confusion.
@sumitagrawl - I have handled this comment. Pls re-review.
@SuppressWarnings("checkstyle:ParameterNumber")
@VisibleForTesting
public RDBStore(File dbFile, ManagedDBOptions rocksDBOption,
                ManagedWriteOptions writeOptions,
                Set<TableConfig> tableConfigs, CodecRegistry registry,
                boolean openReadOnly, int maxFSSnapshots,
                String dbJmxBeanNameName, boolean enableCompactionLog,
                long maxTimeAllowedForSnapshotInDag,
                long pruneCompactionDagDaemonRunInterval,
                long maxDbUpdatesSizeThreshold) throws IOException {
  this(dbFile, rocksDBOption, writeOptions, tableConfigs, registry,
      openReadOnly, maxFSSnapshots, dbJmxBeanNameName,
      enableCompactionLog, maxTimeAllowedForSnapshotInDag,
      pruneCompactionDagDaemonRunInterval);
  this.maxDbUpdatesSizeThreshold = maxDbUpdatesSizeThreshold;
}
Please add the new param to the existing constructor instead of creating a new one.
@adoroszlai - I have handled this review comment. Pls re-review.
LOG.info("Number of updates received from OM : {}, " +
    "SequenceNumber diff: {}, SequenceNumber Lag from OM {}.",
    numUpdates, getCurrentOMDBSequenceNumber() - fromSequenceNumber, lag);
return null != dbUpdates ? dbUpdates.isDBUpdateSuccess() : false;
nit:
- return null != dbUpdates ? dbUpdates.isDBUpdateSuccess() : false;
+ return null != dbUpdates && dbUpdates.isDBUpdateSuccess();
@adoroszlai - I have handled this suggestion. Pls re-review.
Thanks @devmadhuu for the patch, @jojochuang, @sumitagrawl for the review.
Observations about the following code path in org.apache.hadoop.hdds.utils.db.RDBStore#getUpdatesSince(long, long):
The call "result.writeBatch().data()" sits inside a while loop that iterates over all log files. Each log file batch allocates a byte[] array, and every array is added to the dataList of the dbUpdatesWrapper object. The JVM heap therefore grows with each iteration, and a later call to "result.writeBatch().data()" in the log iterator loop may fail to allocate its byte[] array. If Recon has limited heap memory, this may fail frequently. In the worst case even the first call to "result.writeBatch().data()" fails to allocate the byte[] array and throws "org.rocksdb.RocksDBException: Requested array size exceeds VM limit".
So, to reduce the chance of this byte[] array allocation failure, we need to ensure the following 3 points:
https://issues.apache.org/jira/browse/HDDS-8195
The patch was tested locally on a Docker setup by generating keys at a high write rate with the freon tool and then waiting for Recon to receive OM DB sync updates.
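The size-threshold idea discussed in this PR can be sketched roughly as below. This is a simplified illustration under stated assumptions, not the code from the patch: `BoundedUpdatesSketch`, `UpdatesWrapper`, and `collect` are hypothetical names, and a plain `Iterator<byte[]>` stands in for the RocksDB log iterator so that the example runs without RocksDB. Only the names `dataList`, `isDBUpdateSuccess`, and the notion of a max DB updates size threshold are taken from the conversation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; an Iterator<byte[]> stands in for the RocksDB log iterator.
class BoundedUpdatesSketch {

  /** Hypothetical stand-in for the wrapper that accumulates write batches. */
  static final class UpdatesWrapper {
    private final List<byte[]> dataList = new ArrayList<>();
    private long cumulativeSize;
    private boolean dbUpdateSuccess = true;

    void addWriteBatch(byte[] data) {
      dataList.add(data);
      cumulativeSize += data.length;
    }

    List<byte[]> getData() {
      return dataList;
    }

    long getCumulativeSize() {
      return cumulativeSize;
    }

    boolean isDBUpdateSuccess() {
      return dbUpdateSuccess;
    }
  }

  /**
   * Collects batches until the accumulated size reaches the threshold.
   * The check runs before the next batch is fetched, so no further
   * byte[] is materialized once the limit has been reached.
   */
  static UpdatesWrapper collect(Iterator<byte[]> batches,
      long maxSizeThreshold) {
    UpdatesWrapper wrapper = new UpdatesWrapper();
    try {
      while (batches.hasNext()) {
        if (wrapper.cumulativeSize >= maxSizeThreshold) {
          break; // the caller can request the remaining updates later
        }
        wrapper.addWriteBatch(batches.next());
      }
    } catch (RuntimeException e) {
      // Report the failure to the caller instead of propagating it.
      wrapper.dbUpdateSuccess = false;
    }
    return wrapper;
  }
}
```

With this shape, a caller such as Recon could inspect `isDBUpdateSuccess()` and the number of returned batches, then ask for the remaining updates in a follow-up call rather than holding every batch in memory at once.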