-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplify single version compaction #1303
Conversation
@manishrjain @jarifibrahim Any chance you could review this? This was another fix that we needed along with #1302. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for fixing this @ou05020 ! The fix works for the most part but I think we might have to make some changes. The following test fails on this branch (and master) . Let me know if you need help fixing them. I can help in fixing the failures.
diff --git a/levels_test.go b/levels_test.go
index 8c7df15..1703141 100644
--- a/levels_test.go
+++ b/levels_test.go
@@ -17,6 +17,7 @@
package badger
import (
+ "fmt"
"math"
"testing"
"time"
@@ -139,11 +140,11 @@ func getAllAndCheck(t *testing.T, db *DB, expected []keyValVersion) {
defer it.Close()
i := 0
for it.Rewind(); it.Valid(); it.Next() {
- require.Less(t, i, len(expected), "DB has more number of key than expected")
+ // require.Less(t, i, len(expected), "DB has more number of key than expected")
item := it.Item()
v, err := item.ValueCopy(nil)
require.NoError(t, err)
- // fmt.Printf("k: %s v: %d val: %s\n", item.key, item.Version(), v)
+ fmt.Printf("k: %s v: %d val: %s\n", item.key, item.Version(), v)
expect := expected[i]
require.Equal(t, expect.key, string(item.Key()), "expected key: %s actual key: %s",
expect.key, item.Key())
@@ -155,7 +156,7 @@ func getAllAndCheck(t *testing.T, db *DB, expected []keyValVersion) {
"key: %s expected meta: %d meta %d", item.key, expect.meta, item.meta)
i++
}
- require.Equal(t, len(expected), i, "keys examined should be equal to keys expected")
+ require.Equal(t, len(expected), i, "keys examined %d should be equal to keys expected %d", i, len(expected))
return nil
})
@@ -254,6 +255,58 @@ func TestCompaction(t *testing.T) {
getAllAndCheck(t, db, []keyValVersion{{"foo", "bar", 3, 0}, {"fooz", "baz", 1, 0}})
})
})
+ t.Run("level 1 to level 2 with delete", func(t *testing.T) {
+ t.Run("with overlap", func(t *testing.T) {
+ runBadgerTest(t, &opt, func(t *testing.T, db *DB) {
+ l1 := []keyValVersion{{"foo", "bar", 3, bitDelete}, {"fooz", "baz", 1, bitDelete}}
+ l2 := []keyValVersion{{"foo", "bar", 2, 0}}
+ l3 := []keyValVersion{{"foo", "bar", 1, 0}}
+ createAndOpen(db, l1, 1)
+ createAndOpen(db, l2, 2)
+ createAndOpen(db, l3, 3)
+
+ // Set a high discard timestamp so that all the keys are below the discard timestamp.
+ db.SetDiscardTs(10)
+
+ getAllAndCheck(t, db, []keyValVersion{
+ {"foo", "bar", 3, 1}, {"foo", "bar", 2, 0}, {"foo", "bar", 1, 0}, {"fooz", "baz", 1, 1},
+ })
+ cdef := compactDef{
+ thisLevel: db.lc.levels[1],
+ nextLevel: db.lc.levels[2],
+ top: db.lc.levels[1].tables,
+ bot: db.lc.levels[2].tables,
+ }
+ require.NoError(t, db.lc.runCompactDef(1, cdef))
+ // foo version 2 should be dropped after compaction.
+ getAllAndCheck(t, db, []keyValVersion{{"foo", "bar", 3, bitDelete}, {"foo", "bar", 1, 0}})
+ })
+ })
+ t.Run("without overlap", func(t *testing.T) {
+ runBadgerTest(t, &opt, func(t *testing.T, db *DB) {
+ l1 := []keyValVersion{{"foo", "bar", 3, bitDelete}, {"fooz", "baz", 1, bitDelete}}
+ l2 := []keyValVersion{{"fooo", "barr", 2, 0}}
+ createAndOpen(db, l1, 1)
+ createAndOpen(db, l2, 2)
+
+ // Set a high discard timestamp so that all the keys are below the discard timestamp.
+ db.SetDiscardTs(10)
+
+ getAllAndCheck(t, db, []keyValVersion{
+ {"foo", "bar", 3, 1}, {"fooo", "barr", 2, 0}, {"fooz", "baz", 1, 1},
+ })
+ cdef := compactDef{
+ thisLevel: db.lc.levels[1],
+ nextLevel: db.lc.levels[2],
+ top: db.lc.levels[1].tables,
+ bot: db.lc.levels[2].tables,
+ }
+ require.NoError(t, db.lc.runCompactDef(1, cdef))
+ // foo version 2 should be dropped after compaction.
+ getAllAndCheck(t, db, []keyValVersion{{"fooo", "barr", 2, 0}})
+ })
+ })
+ })
}
func TestHeadKeyCleanup(t *testing.T) {
3462d8b
to
c52e68a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jarifibrahim I have incorporated your test with modifications.
The "level 1 to level 2 with delete/with overlap" test was missing "fooz/baz" in the expected results.
That entry still exists after compaction because hasOverlap
in the compactBuildTables
method is true which is expected because that is the purpose of the test and hasOverlap
is not checked per key. hasOverlap
is set true if any keys in the compaction tables overlap with lower layers which happened to be the case for the foo
key in this test.
During the first compaction, all L1 items were moved to L2. Running a subsequent compaction with L2 and L3 results in removal of all keys.
Reviewable status: 0 of 2 files reviewed, all discussions resolved (waiting on @ashish-goswami and @manishrjain)
@ou05020 Thanks! I didn't realize there was an overlap. |
// When `NumVersionsToKeep == 1` the skipKey should always be set | ||
// and there is no need for `lastValidVersion`. The only required check | ||
// is to determine if a deleted key should be written due to `hasOverlap`. | ||
if s.kv.opt.NumVersionsToKeep == 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ou05020 I think all the changes in levels.go
file can be replaced with the following
diff --git a/levels.go b/levels.go
index 764fcd9..0121650 100644
--- a/levels.go
+++ b/levels.go
@@ -578,6 +578,16 @@ func (s *levelsController) compactBuildTables(
// only valid version for a running transaction.
numVersions++
+ if isDeletedOrExpired(vs.Meta, vs.ExpiresAt) && !hasOverlap &&
+ s.kv.opt.NumVersionsToKeep == 1 {
+ skipKey = y.SafeCopy(skipKey, it.Key())
+
+ numSkips++
+ updateStats(vs)
+ continue
+ }
+
// Keep the current version and discard all the next versions if
// - The `discardEarlierVersions` bit is set OR
// - We've already processed `NumVersionsToKeep` number of versions
I tried this and everything seems to be working fine. I'll push some more tests on this PR to ensure we've covered all the cases. We have very few tests for compaction which were added recently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 2 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
levels.go, line 579 at r2 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
@ou05020 I think all the changes in
levels.go
file can be replaced with the followingdiff --git a/levels.go b/levels.go index 764fcd9..0121650 100644 --- a/levels.go +++ b/levels.go @@ -578,6 +578,16 @@ func (s *levelsController) compactBuildTables( // only valid version for a running transaction. numVersions++ + if isDeletedOrExpired(vs.Meta, vs.ExpiresAt) && !hasOverlap && + s.kv.opt.NumVersionsToKeep == 1 { + skipKey = y.SafeCopy(skipKey, it.Key()) + + numSkips++ + updateStats(vs) + continue + } + // Keep the current version and discard all the next versions if // - The `discardEarlierVersions` bit is set OR // - We've already processed `NumVersionsToKeep` number of versionsI tried this and everything seems to be working fine. I'll push some more tests on this PR to ensure we've covered all the cases. We have very few tests for compaction which were added recently.
@jarifibrahim That will work.
But I think the fix should also be expanded to address deleted/expired keys when NumVersionsToKeep > 1.
Consider the following compaction scenarios with num versions = 3 assuming no overlap:
Before Compaction -> After Compaction
v3, v2 delete, v1 -> v3
v3, v2, v1 delete -> v3, v2, v1 delete
v3 delete, v2, v1 ->
When the deleted/expired key is the last valid version it is kept even when there is no overlap. The second scenario above should have resulted in v3, v2
keys remaining.
The changes proposed in issue 1228 or PR #1354 will handle NumVersionsToKeep >= 1.
@ou05020 I'm going to close this PR in favor of PR #1354 since PR #1354 has simpler changes. Thank you so much for fixing this issue, we appreciate your contribution. We have cherry-picked your commit in PR #1354 and so you would be the co-author of that PR. Thank you so much for your help 🎉 |
When
NumVersionsToKeep
is1
, deleted and expired entries should always be removed if there is no overlap with lower levels. Otherwise, deleted entries will continue to grow for heavy write/delete use cases with many transient keys.This PR adds simplified logic when
NumVersionsToKeep
is1
that always sets theskipKey
and only writes deleted or expired entries whenhasOverlap
istrue
.This change is