Decompression block pool is inefficient #1378

@damz

Description

What version of Go are you using (go version)?

$ go version
go version go1.14.4 linux/amd64

What operating system are you using?

Linux

What version of Badger are you using?

master

Steps to Reproduce the issue

Profile an application that uses Badger; you will likely see a number of allocations in both table.decompress and zstd.Decompress, something like this:

[Screenshot, 2020-06-20: allocation profile highlighting table.decompress and zstd.Decompress]
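
For reference, a profile like the one above can be captured from inside the process with the standard runtime/pprof package. This is a generic sketch, not Badger code; the function name and output path are placeholders:

```go
package main

import (
	"os"
	"runtime/pprof"
)

// dumpAllocProfile writes the allocation profile to the given path.
// Opening the result with `go tool pprof -alloc_objects` surfaces the
// table.decompress / zstd.Decompress allocations shown above.
func dumpAllocProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return pprof.Lookup("allocs").WriteTo(f, 0)
}

func main() {
	// ... exercise the database here, then:
	if err := dumpAllocProfile("allocs.pprof"); err != nil {
		panic(err)
	}
}
```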

What's going on?

There are actually three issues:

The block size is hardcoded

The block pool expects blocks to be 4 kB, regardless of the database's Options.BlockSize setting (it actually allocates 5 kB instead of 4 kB in an effort to avoid spurious allocations during decompression).
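
A simplified sketch of the mismatch (not Badger's actual code; the names and sizes follow the description above): the pool's New function bakes in the 4 kB assumption, so a database opened with a larger Options.BlockSize never gets a usable buffer from the pool:

```go
package table

import "sync"

// Simplified sketch, not Badger's actual code: the pool always hands out
// 5 kB buffers (4 kB block + 1 kB of slack for decompression). If
// Options.BlockSize is larger than 4 kB, every pooled buffer is too
// small and decompression falls back to a fresh allocation.
var blockPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 5<<10) // hardcoded, ignores Options.BlockSize
		return &b
	},
}
```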

The zstd library expects a slice of the correct length, not just capacity

table.decompress passes zstd a block that has a capacity of 5 kB, but a length that can be anything. zstd, on the other hand, expects the block's length, not just its capacity, to be large enough; see how it passes the length to the C function here:

https://github.com/DataDog/zstd/blob/89f69fb7df32e0513d85ac76df1c2a13df58e5e7/zstd.go#L106-L110
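
In practice this means a pooled buffer has to be resliced to its full capacity before being handed to the library, since only len(dst) is consulted. A sketch of the pitfall (zstd.Compress and zstd.Decompress are the real DataDog API; the wrapper and buffer handling are illustrative):

```go
package main

import (
	"fmt"

	"github.com/DataDog/zstd"
)

// decompressInto is a hypothetical wrapper illustrating the pitfall:
// zstd.Decompress only looks at len(dst), so a buffer with cap 5 kB but
// len 0 is treated as too small and the library allocates a new one.
func decompressInto(dst, src []byte) ([]byte, error) {
	dst = dst[:cap(dst)] // expose the full capacity as length
	return zstd.Decompress(dst, src)
}

func main() {
	compressed, _ := zstd.Compress(nil, []byte("hello, badger"))
	buf := make([]byte, 0, 5<<10) // len 0, cap 5 kB, like a pooled block
	out, err := decompressInto(buf, compressed)
	fmt.Println(string(out), err)
}
```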

The block pool keeps a reference to the block structure alive

In addition (though this is a minor issue), the way the block pool is used keeps a reference to the block struct alive:

https://github.com/dgraph-io/badger/blob/d37ce36911ae349fbedba342f693184aa434cc30/table/table.go#L241-L243

&b.data is a pointer into the block struct, so the whole struct is kept alive, including everything it itself references. One possible fix is sketched below.
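
The sketch, with simplified stand-ins for Badger's internals: copying the slice header into a local variable before pooling it means the pool retains only the backing array, not the struct:

```go
package table

import "sync"

var blockPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 5<<10)
		return &b
	},
}

// block is a simplified stand-in for Badger's block struct.
type block struct {
	data []byte
	// ... other fields, all kept reachable while the pool holds &b.data
}

// releaseBlock sketches the fix: pooling &b.data directly would pin the
// whole struct, because that *[]byte points into b. Copying the slice
// header into a local breaks the link; the pool then holds a pointer to
// a fresh variable that shares only the backing array.
func releaseBlock(b *block) {
	data := b.data
	b.data = nil
	blockPool.Put(&data)
}
```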
