very large database #1320
Yes, it is. I observe a similar thing. I don't think the actual data stored is that large; I wonder if triggering a compaction would fix things.
I wonder how this is triggered in the first place. I mean, we just write the pinset to the cluster, right? In that case, we shouldn't really discard anything from the database, so a compaction should not have to clean up anything, I suppose?
Depends. Pins that are removed from the cluster are also removed from the pins state (which is separate from the consensus state). The crdt part is mostly write-only, except for the heads: when a head is replaced by a new one, the old one is deleted. And then, I don't know what sort of accounting badger is carrying internally or how it is handling tables etc. In general I'm growing very weary of badger. As said, triggering a compaction on open would give a first clue as to whether there is indeed a lot of trash.
Maybe not on startup; that already takes quite long. Compaction should be possible at any time, no? So why not just do it randomly every 2 hours or so? :)
Ah yeah, I was not suggesting that as a permanent solution, I only mentioned it for testing. But anyway, we cannot run compactions regularly, as they essentially make everything crawl to a stop, and that is not something that should happen randomly. Afaik, Filecoin compacts on start.
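For reference, a minimal sketch of what "compaction on open" could look like against badger v2's Go API; the path and worker count are placeholders, not cluster's actual code:

```go
// Sketch only: force a compaction right after opening the datastore.
// Assumes badger v2's Go API; path and worker count are placeholders.
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func openFlattened(path string) (*badger.DB, error) {
	db, err := badger.Open(badger.DefaultOptions(path))
	if err != nil {
		return nil, err
	}
	// Flatten forces LSM-tree compactions so all tables end up on the
	// last level; 2 is the number of concurrent compaction workers.
	if err := db.Flatten(2); err != nil {
		db.Close()
		return nil, err
	}
	return db, nil
}
```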
I have checked one of our cluster nodes with 160k pins and 70GB of badger datastore. The badger tool gave the following:
That is 74GB of value logs. After backing up and restoring using the badger tool:
I have no idea how badger manages to put so much trash on disk. There are effectively no Deletes happening on this cluster.
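For anyone wanting to reproduce that backup+restore test, here is a sketch of the same cycle done programmatically instead of with the badger CLI (badger v2 assumed; all paths are placeholders):

```go
// Sketch of a full backup into a file, then a restore into a fresh DB.
package main

import (
	"os"

	badger "github.com/dgraph-io/badger/v2"
)

func backupRestore(oldPath, newPath, bakFile string) error {
	src, err := badger.Open(badger.DefaultOptions(oldPath))
	if err != nil {
		return err
	}
	defer src.Close()

	f, err := os.Create(bakFile)
	if err != nil {
		return err
	}
	// since=0 requests a full backup of all live data.
	if _, err := src.Backup(f, 0); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}

	dst, err := badger.Open(badger.DefaultOptions(newPath))
	if err != nil {
		return err
	}
	defer dst.Close()

	r, err := os.Open(bakFile)
	if err != nil {
		return err
	}
	defer r.Close()
	// 16 caps the number of pending writes while replaying the backup.
	return dst.Load(r, 16)
}
```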
@RubenKelevra are you able to test by setting
Yeah sure
Settings previously:
Size of
Hey @jarifibrahim, mind having a look at this?
This is a known issue with Badger v1 and v2. The value log GC wasn't very effective at cleaning up the vlog files, and for this exact reason we have badger v3. Badger v1 and v2 use vlog (value log) files as the Write-Ahead Log (WAL), and they also store big values in those files. The vlog files keep accumulating, and the vlog GC isn't able to clean them up fast enough. Badger v3 uses vlog files only to store values greater than 1 MB; the memtables are the WAL, and they're removed as soon as the data reaches the disk. I would recommend migrating over to badger v3. v2 and v3 are incompatible, but you can do a backup/restore to migrate.
Set the valuethreshold to 1 MB. Doing this will reduce the number of vlog files that are generated. It would also help if you cannot migrate to badger v3 right now.
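A minimal sketch of that suggestion, shown directly against badger v2's Go options for clarity (presumably this is what cluster's `value_threshold` option maps to; the path is a placeholder and 1 MiB is the value suggested in this thread):

```go
// Sketch: raise the value threshold so small values live in the LSM
// tree instead of the value log. Assumes badger v2.
package main

import (
	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	opts := badger.DefaultOptions("/path/to/badger") // placeholder path
	// Values smaller than this stay in the LSM tree rather than the
	// value log, so far fewer vlog files get written.
	opts.ValueThreshold = 1 << 20 // 1 MiB, as suggested above
	db, err := badger.Open(opts)
	if err != nil {
		panic(err)
	}
	defer db.Close()
}
```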
@jarifibrahim do you know why value logs keep growing with otherwise gc-able data when there are not many deletes? I think that most data is add-only and never deleted, so unless I'm missing something, I'm not sure what makes up those 69GB that can just be freed. Also, can you confirm that increasing the value_threshold will improve things, as fewer values go onto the value log? I know we can migrate to v3, but we have no go-datastore wrapper for it yet (whereas we do for leveldb).
The vlog file is also the write-ahead log. You might have a lot of old vlog files which backup+restore cleaned up.
Step 2 might not find useless data easily if you have only added data or your adds and deletions are interleaved (which means the sampling will find all valid data or all stale data).
It should improve things, but you would still have some vlog files lying around. We use 1 MB as the default in badger master: https://github.com/dgraph-io/badger/blob/74ade987faa5561e4704ce568f1b265d168b3e95/options.go#L182
Is this something @RubenKelevra would be able to help with?
@jarifibrahim if I understand you right, we could reduce the threshold to rewrite those files more often. Can we increase the sample size from 10% to 100%? Performance isn't a key factor in this application, since the daemon is async to latency-critical operations. :)
@RubenKelevra This would make GC very, very slow. If we're going to sample 100% of the data, we're better off GC-ing the file directly instead of sampling it. However, we could expose an option to set the sample size if that would help you. Note: v3's GC doesn't do sampling, so it wouldn't have this problem.
Well, at least in my case I could basically "shut down" the database for a while, say after midnight when I know there isn't any new data going to be added, and let the GC run once. This would be better than shutting down the daemon, since in that case all the network connections and the current status of the other peers would have to be exchanged again. If sampling is slow, a forced GC of the whole database is maybe the better option here. :) V3 sounds nice btw, we have to look into it.
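A sketch of that scheduled-GC idea (badger v2 assumed; the 24h interval and the 0.5 discard ratio are illustrative choices, not tested values):

```go
// Sketch: run value-log GC once per day, off-peak, until badger
// reports there is nothing left worth rewriting.
package main

import (
	"time"

	badger "github.com/dgraph-io/badger/v2"
)

func nightlyGC(db *badger.DB) {
	for range time.Tick(24 * time.Hour) {
		// RunValueLogGC samples one value-log file and rewrites it when
		// at least half of the sampled data is stale. Loop until it
		// returns ErrNoRewrite, i.e. no file is worth rewriting.
		for {
			if err := db.RunValueLogGC(0.5); err != nil {
				break // badger.ErrNoRewrite, or a real error
			}
		}
	}
}
```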
Badger can take 1000x the amount of needed space if not GC'ed or compacted (#1320), even for non-heavy usage. Cluster has no provisions to run datastore GC operations, and while they could be added, they are not guaranteed to help. Improvements in Badger v3 might help, but it would still need to GC explicitly. Cluster was, however, designed to support any go-datastore as its backend. This commit adds LevelDB support. The LevelDB go-datastore wrapper is mature, does not need GC, and should work well for most cluster use cases, which are not overly demanding. A new `--datastore` flag has been added to init. The store backend is selected based on the value in the configuration, similar to how raft/crdt is. The default is set to leveldb. From now on it should be easier to add additional backends, e.g. badger v3.
Hey @RubenKelevra, I no longer have access to the dgraph-io/badger repository. You'll have to tag someone else from dgraph to help fix this.
The DB has a StreamDB API that allows you to stream the DB and create a fresh one. This would be better than running GC, since streaming the data removes all stale data (and very quickly, too).
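A sketch of that approach, assuming the "StreamDB API" refers to badger v3's `DB.StreamDB` method (paths are placeholders):

```go
// Sketch: stream all live data into a brand-new DB, leaving stale
// versions and deleted entries behind. Assumes badger v3.
package main

import (
	badger "github.com/dgraph-io/badger/v3"
)

func rewriteFresh(oldPath, newPath string) error {
	db, err := badger.Open(badger.DefaultOptions(oldPath))
	if err != nil {
		return err
	}
	defer db.Close()
	// Writes a compacted copy of the DB to newPath.
	return db.StreamDB(badger.DefaultOptions(newPath))
}
```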
Oh, sorry to hear that. Hope you're alright. Thanks for all the fish!
Thanks @RubenKelevra. I am happy to help debug any issues you encounter with Badger DB.
Fix #1320: Add automatic GC to Badger datastore
@hsanjuan great! It reduced my space consumption from 19.3 GB to 0.7 GB :)
Describe the bug:
I think the garbage collection in the database is broken/turned off. Otherwise, this database size is quite excessive for ~50k changes on a cluster.
Additional information: