Fix LSM net additions count and add WAL threshold #1855
Merged
Changes
Count Memtable Net Additions
Prior to these changes, the approximate size of the memtable was calculated by adding the length in bytes of every key+value written to it, including writes that overwrote an existing key. This led to the bug described in this PR's related issue, where the memtable flushes too early because the parent bucket overestimates the memtable's size.
This PR introduces counting of net additions: if an existing key is updated, the size of the memtable is incremented only by the difference between the length in bytes of the new value and that of the original value.
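As a rough illustration of the net-additions idea, here is a minimal Go sketch. The `memtable` type, its map-backed storage, and the `put` method are hypothetical stand-ins made up for this example, not the code changed in this PR; the choice to count the key length only on first insert is also an assumption.

```go
package main

import "fmt"

// memtable is a hypothetical, simplified stand-in used only for illustration.
type memtable struct {
	data map[string][]byte
	size int // approximate size in bytes, counting net additions only
}

func newMemtable() *memtable {
	return &memtable{data: make(map[string][]byte)}
}

// put stores value under key and returns the net change in tracked size.
// A new key adds len(key)+len(value). An update to an existing key adds only
// the difference between the new and the old value length, which can be
// negative when the new value is shorter.
func (m *memtable) put(key string, value []byte) int {
	var delta int
	if old, ok := m.data[key]; ok {
		delta = len(value) - len(old)
	} else {
		delta = len(key) + len(value)
	}
	m.data[key] = value
	m.size += delta
	return delta
}

func main() {
	m := newMemtable()
	m.put("k1", []byte("hello"))   // new key: size grows by 2+5
	m.put("k1", []byte("hello!!")) // update: size grows by 2
	m.put("k1", []byte("hi"))      // update: size shrinks by 5
	fmt.Println(m.size)            // matches len("k1")+len("hi")
}
```

With gross counting, the same three writes would have been recorded as 7+9+4 bytes, even though only one key+value pair is held in memory.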
WAL Threshold
A WAL threshold is now monitored in addition to the existing memtable threshold. The reason is that, with net additions to the memtable being counted instead of gross additions, the WAL can grow without bound in certain situations.
For example, if the majority of writes to the memtable are updates to existing keys with values of similar size, the bucket's active memtable will grow extremely slowly, if at all. The memtable threshold is therefore never reached, and the WAL grows without bound.
To mitigate this problem, the WAL threshold will stat the current WAL and check to ensure that its size is below the threshold. Otherwise, like with the memtable threshold, the log is flushed to a segment and switched.
Closes #1830