-
Notifications
You must be signed in to change notification settings - Fork 1.2k
[breaking/format] Add key-offset index to the end of SST table #881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit adds the key-offset index used for searching the blocks in a table to the end the SST file. The length of the index is stored in the last 4 bytes of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 14 of 14 files at r1.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)
db2_test.go, line 105 at r1 (raw file):
var manual = flag.Bool("manual", false, "Set when manually running some tests.") // Badger dir to be used for performing db.Open benchmark
minor: period at end of comments.
table/builder.go, line 160 at r1 (raw file):
b.addHelper([]byte{}, y.ValueStruct{}) y.AssertTrue(b.buf.Len() < math.MaxUint32) // Add key to the block index
minor: period at the end.
table/builder.go, line 233 at r1 (raw file):
} // Write bloom filter.
why is the bloom filter being written before the block index now?
table/builder.go, line 242 at r1 (raw file):
y.AssertTrue(n < math.MaxUint32) // Write index size
nit: period at the end.
table/builder.go, line 253 at r1 (raw file):
func (b *Builder) writeChecksum(data []byte) { // Build checksum for the index
nit: period at the end
table/builder.go, line 267 at r1 (raw file):
} // Write checksum to the file
minor: period at the end.
table/builder.go, line 274 at r1 (raw file):
y.AssertTrue(n < math.MaxUint32) // Write checksum size
minor: period at the end.
table/builder_test.go, line 25 at r1 (raw file):
"github.com/dgraph-io/badger/options" "github.com/dgraph-io/badger/y"
@campoy is making a change to move to v2: #882
If that is submitted before your change, merge and make sure you update the references. So your imports here would become:
"github.com/dgraph-io/badger/v2/options"
"github.com/dgraph-io/badger/v2/y"
Same for other badger imports you added.
table/builder_test.go, line 36 at r1 (raw file):
require.Len(t, table.blockIndex, 1) }) t.Run("multiple keys", func(t *testing.T) {
If these tests are independent, just have them in separate methods.
table/table.go, line 223 at r1 (raw file):
readPos := t.tableSize // Read checksum len from the last 4 bytes
minor: period at the end
table/table.go, line 228 at r1 (raw file):
checksumLen := binary.BigEndian.Uint32(buf) // Read checksum
minor: period at the end
table/table.go, line 236 at r1 (raw file):
} // Read index size from the footer
minor: period at the end
table/table.go, line 240 at r1 (raw file):
buf = t.readNoFail(readPos, 4) indexLen := int(binary.BigEndian.Uint32(buf)) // Read index
minor: period at the end
table/table.go, line 265 at r1 (raw file):
ko := t.blockIndex[idx] blk := block{ offset: int(ko.Offset),
I see a lot of conversion between int types in this file. Is there a way to avoid that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 3 of 14 files reviewed, 4 unresolved discussions (waiting on @ashish-goswami, @campoy, @manishrjain, and @martinmr)
table/builder.go, line 233 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
why is the bloom filter being written before the block index now?
We're writing bloom filter along with the index. They're part of the same protobuf.
I'll reword the comment so that it makes more sense.
table/builder_test.go, line 25 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
@campoy is making a change to move to v2: #882
If that is submitted before your change, merge and make sure you update the references. So your imports here would become:
"github.com/dgraph-io/badger/v2/options" "github.com/dgraph-io/badger/v2/y"Same for other badger imports you added.
Yes. I've changed it. Thanks!
table/builder_test.go, line 36 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
If these tests are independent, just have them in separate methods.
Both the tests are related to validation of the table index and this looks like a logical grouping to me.
Keeping it this way makes it more readable as well.
table/table.go, line 265 at r1 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
I see a lot of conversion between int types in this file. Is there a way to avoid that?
Not really. t.read(..) needs ints and protobuf has int16, int32, int64. No matter which type we use, we'll have to convert it to int.
One thing we can do is change the param type of t.read(..) from int to int64. The function internally converts int to int64 and then we can use int64 everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 5 of 11 files at r2.
Reviewable status: 8 of 14 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @campoy, @jarifibrahim, and @martinmr)
table/builder.go, line 191 at r1 (raw file):
// ReachedCapacity returns true if we... roughly (?) reached capacity? func (b *Builder) ReachedCapacity(cap int64) bool { estimateSz := b.buf.Len() + 8 /* empty header */ + 4 /* Index length */ +
Move each + to a new line and use // instead of /*.
table/table.go, line 265 at r1 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
Not really.
t.read(..)needsintsand protobuf hasint16, int32, int64. No matter which type we use, we'll have to convert it toint.One thing we can do is change the param type of
t.read(..)from int to int64. The function internally converts int to int64 and then we can use int64 everywhere.
That sounds like a good idea.
…o/badger into ibrahim/footer-protobuf
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 7 of 14 files reviewed, 5 unresolved discussions (waiting on @ashish-goswami, @campoy, @manishrjain, and @martinmr)
table/builder.go, line 191 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
Move each + to a new line and use // instead of /*.
Done.
table/table.go, line 265 at r1 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
That sounds like a good idea.
I'll defer the variable type change to a separate PR. The int => int64 change requires changes at multiple places in the code.
If we do not stop the existing one, the go routines will keep running in the background.
When the opened DB is not closed, all the running go routines (watermark, compaction, etc) are not stopped and they keep running in the background
|
I had to push ec7de22 and 127ff52 to reduce error logs in the build output. Without those commits, the PR build was failing and the errors were not visible because Travis has a limit on the number of characters in the logs. See https://travis-ci.org/dgraph-io/badger/jobs/549661427 for an example of failed build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 11 files at r2, 3 of 3 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ashish-goswami and @manishrjain)
The original PR #816 got closed because I deleted the
release/v2.1.0 branch(we didn't need it) and Github won't let me reopen that PR or change the target branch.This is a breaking change. Older versions of badger data will not work after this change
This commit adds the key-offset index used for searching the blocks in a table to the end of the SST file. The length of the index is stored in the last 4 bytes of the file.
Earlier we used to build the index at runtime by reading the keys stored at the
offsetspecified in the footer of the file. With this change, we won't build the index, we'll read it from the file.Benchmarks
The following test was run on a badger DB with 300 million sorted key value pairs (DB size 57GB)
The following code was used for benchmarking (also added to db2_test.go file)
This change is