VLOGs total size ~1gb with no stored data on default config #1297

@jsign

What version of Go are you using (go version)?

$ go version
go version go1.14.2 darwin/amd64

What version of Badger are you using?

github.com/dgraph-io/badger/v2 v2.0.1-rc1.0.20200409094109-809725940698

Does this issue reproduce with the latest master?

Yes.

What are the hardware specifications of the machine (RAM, OS, Disk)?

8 GB RAM, macOS Catalina (latest), 512 GB disk

What did you do?

I created a GH repo to reproduce the problem:

  1. git clone https://github.com/jsign/go-badger2-size.git
  2. go run main.go
  3. Wait for output.

The program tries different configurations to see how they affect the final SST and VLOG total sizes.
Some points about the program:

  • Creates a Badger DB from scratch on a temp clean folder.
  • Puts 1 million values. Each key 16 bytes, each value 1024 bytes. Both random.
  • After the insertion, all keys are removed in the same order.
  • Runs value log GC with a discard ratio of 0.01 until it returns ErrNoRewrite.
  • Closes the DB.
  • Opens the DB and closes it again (just to see whether anything that runs on open or close cleans things up).
  • Finally, counts the SST and VLOG files and sums their sizes.
  • Prints that info.

The above flow runs for several scenarios, each defined by the opts used to open the DB.
There are four scenarios, and they run in parallel. Running them concurrently is OK since I'm not testing for performance, only inspecting the resulting files. It might take <5min to run.

What did you expect to see?

The process creates a million keys and then deletes all of them. I'd expect at least one configuration to bring the total size down to ~negligible.

What did you see instead?

The output of the program (stripping badger logs):

NumVersionToKeep0:
        main.Metrics{NumSST:1, SizeSSTs:0, NumVLOG:4, SizeVLOGs:1132702}
CompactL0OnClose:
        main.Metrics{NumSST:1, SizeSSTs:36301, NumVLOG:4, SizeVLOGs:1132702}
Default config:
        main.Metrics{NumSST:1, SizeSSTs:36301, NumVLOG:4, SizeVLOGs:1132702}
Aggressive:
        main.Metrics{NumSST:1, SizeSSTs:0, NumVLOG:3, SizeVLOGs:599607}

Above sizes are in KB.
In the default config, the total size is greater than 1 GB.
With an aggressive setup it is ~600 MB.
To be clear, the DB has no stored data but takes up ~600 MB on disk in the best scenario.

I'm looking for a configuration that achieves this without stopping the DB. Is there any configuration that makes the DB's on-disk size ~negligible when it isn't storing any keys?
