Description
What version of Go are you using (go version)?
$ go version
go version go1.14.2 darwin/amd64
What version of Badger are you using?
github.com/dgraph-io/badger/v2 v2.0.1-rc1.0.20200409094109-809725940698
Does this issue reproduce with the latest master?
Yes.
What are the hardware specifications of the machine (RAM, OS, Disk)?
8 GB RAM, latest macOS Catalina, 512 GB disk
What did you do?
I created a GH repo to reproduce the problem:
- git clone https://github.com/jsign/go-badger2-size.git
- go run main.go
- Wait for output.
The program tries different configurations to see how they affect the final SST and VLOG total sizes.
Some points about the program:
- Creates a Badger DB from scratch on a temp clean folder.
- Puts 1 million values. Each key 16 bytes, each value 1024 bytes. Both random.
- After the insertion, all keys are removed in the same order.
- Runs GC with a 0.01 rate until it returns ErrNoRewrite.
- Closes the DB.
- Opens the DB and closes it again. (Just to check whether reopening triggers any further cleanup.)
- Finally, counts numbers of SST and VLOG files, with their sizes.
- Prints that info.
The above flow runs for different scenarios, each defined by the opts used to open the DB.
There are four scenarios that run in parallel. Running them concurrently is OK since I'm not testing for performance, only inspecting the resulting files. It might take <5 min to run.
What did you expect to see?
The process creates a million keys and then deletes all of them. I'd expect that some config could bring the total size down to ~negligible.
What did you see instead?
The output of the program (stripping badger logs):
NumVersionToKeep0:
main.Metrics{NumSST:1, SizeSSTs:0, NumVLOG:4, SizeVLOGs:1132702}
CompactL0OnClose:
main.Metrics{NumSST:1, SizeSSTs:36301, NumVLOG:4, SizeVLOGs:1132702}
Default config:
main.Metrics{NumSST:1, SizeSSTs:36301, NumVLOG:4, SizeVLOGs:1132702}
Aggressive:
main.Metrics{NumSST:1, SizeSSTs:0, NumVLOG:3, SizeVLOGs:599607}
Above sizes are in KB.
With the default config, the total size is greater than 1 GB.
With the aggressive setup, it is ~600 MB.
To be clear, the DB has no stored data but still occupies ~600 MB on disk in the best scenario.
I'm looking for a configuration that does this without stopping the DB. Is there any possible configuration that can make the DB have ~negligible size if it isn't storing any keys?