Skip to content

[BREAKING] feat(index): Use flatbuffers instead of protobuf #1546

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Oct 3, 2020

Conversation

jarifibrahim
Copy link
Contributor

@jarifibrahim jarifibrahim commented Sep 29, 2020

This PR

  • Uses flatbuffers instead of protobufs for table index and directly stores byte slices in the cache.
  • Uses MaxVersion to pick the oldest tables for compaction first.
  • Uses leveldb/bloom so that we can test it without unmarshal
  • Adds uncompressed size and key count in table index.
  • Updates write bench tool to use managed mode.

This change is Reviewable

Use flatbuffers instead of protobuf for storing table index.
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 13 of 13 files at r1.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @ashish-goswami and @jarifibrahim)


flatbuffer.fbs, line 1 at r1 (raw file):

namespace fb;

add license.


table/builder.go, line 78 at r1 (raw file):

// blockOffset is used to create the index in builder.Finish
type blockOffset struct {

Don't need this struct. Create blockoffsets via Flatbuffers, and lay them out on z.Calloc or something.


table/builder.go, line 96 at r1 (raw file):

	entryOffsets  []uint32 // Offsets of entries present in current block.
	indexBuidler  *flatbuffers.Builder
	offsets       []*blockOffset

Don't use Go memory for this.


table/builder.go, line 404 at r1 (raw file):

			off := b.offsets[i]
			// Length of the block is end minus the start.
			off.len = bl.end - bl.start

don't call it len. Len was ok, because of capital L. Maybe call it sz now.


table/table.go, line 107 at r1 (raw file):

	var buf []byte
	for j := 0; j < ti.BloomFilterLength(); j++ {
		buf = append(buf, byte(ti.BloomFilter(j)))

this is weird code.


table/table.go, line 125 at r1 (raw file):

// Key retuns the key for the blockoffset.
func (bo *fBlockOffset) key() []byte {

Why are you allocating memory in these functions? The point of FBs are to access the bytes directly off the marshaled structures.


table/table.go, line 542 at r1 (raw file):

	if blk.data, err = t.read(blk.offset, int(ko.Len())); err != nil {
		return nil, errors.Wrapf(err,
			"failed to read from file: %s at offset: %d, len: %d", t.fd.Name(), blk.offset, ko.Len())

100 chars.

@CLAassistant
Copy link

CLAassistant commented Oct 3, 2020

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Good to merge.

Reviewed 1 of 10 files at r2, 1 of 5 files at r5, 2 of 7 files at r6, 4 of 9 files at r7, 1 of 2 files at r9, 1 of 1 files at r10, 6 of 8 files at r11, 1 of 7 files at r13, 11 of 11 files at r14.
Reviewable status: 27 of 28 files reviewed, 2 unresolved discussions (waiting on @ashish-goswami, @jarifibrahim, and @manishrjain)


fb/gen.sh, line 5 at r4 (raw file):

set -e

# You will need the flatbuffer compiler (flatc) https://google.github.io/flatbuffers/flatbuffers_guide_building.html

100 chars.

@jarifibrahim jarifibrahim changed the title feat(index): Use flatbuffers instead of protobuf {BREAKING] feat(index): Use flatbuffers instead of protobuf Oct 3, 2020
@jarifibrahim jarifibrahim changed the title {BREAKING] feat(index): Use flatbuffers instead of protobuf [BREAKING] feat(index): Use flatbuffers instead of protobuf Oct 3, 2020
@jarifibrahim jarifibrahim merged commit 599363b into master Oct 3, 2020
@jarifibrahim jarifibrahim deleted the ibrahim/flatbuffers branch October 3, 2020 16:35
manishrjain added a commit that referenced this pull request Dec 3, 2020
NewKeyIterator uses pickTables which was optimized in the past. But, a
recent PR: #1546 removed this
optimization, which is now making NewKeyIterator quite expensive.

This PR brings that optimization back.
manishrjain added a commit to outcaste-io/outserv that referenced this pull request Jul 6, 2022
NewKeyIterator uses pickTables which was optimized in the past. But, a
recent PR: hypermodeinc/badger#1546 removed this
optimization, which is now making NewKeyIterator quite expensive.

This PR brings that optimization back.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

5 participants