Description
What version of Go are you using (go version)?
$ go version
go version go1.12.6 linux/amd64
What version of Badger are you using?
1.6.0, with local patches to fix value log GC (TxnTooBig) and bloom filter memory use. (Without the local patches, vlog GC doesn't work and memory use OOMs the machine far too easily.)
Does this issue reproduce with the latest master?
Dunno, master binary format is different so the broken DB I created is incompatible.
What are the hardware specifications of the machine (RAM, OS, Disk)?
SLES Linux in a VM, 8GB RAM
What did you do?
Ran a test program which does concurrent writes and value log GC, then performs reads to validate that the data is accessible. This program generates the busted database: main.go.txt.
Within 24 hours, that test program spewed out some warnings about "This entry should have been caught". I copied the DB to another machine (Ubuntu, same arch) for analysis and tried to open it with a dumb test program.
What did you expect to see?
I expect to be able to Open() the database.
What did you see instead?
Attempt to open the database failed with error:
badger 2019/09/03 13:42:31 INFO: All 8 tables opened in 1.084s
badger 2019/09/03 13:42:31 INFO: Replaying file id: 48 at offset: 744099210
badger 2019/09/03 13:42:31 INFO: Replay took: 230.700769ms
open failed Unable to find log file. Please retry
github.com/dgraph-io/badger.init.ializers
/home/marka/src/badger-dataloss/thirdparty/github.com/dgraph-io/badger/errors.go:66
runtime.main
/usr/local/go/src/runtime/proc.go:188
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1337
failed to read value pointer from vlog file: {Fid:33 Len:334 Offset:347265756}
github.com/dgraph-io/badger.(*valueLog).populateDiscardStats
/home/marka/src/badger-dataloss/thirdparty/github.com/dgraph-io/badger/value.go:1449
github.com/dgraph-io/badger.(*valueLog).open
/home/marka/src/badger-dataloss/thirdparty/github.com/dgraph-io/badger/value.go:859
github.com/dgraph-io/badger.Open
/home/marka/src/badger-dataloss/thirdparty/github.com/dgraph-io/badger/db.go:318
main.main
/home/marka/src/badger-dataloss/cmd/lookup/main.go:14
runtime.main
/usr/local/go/src/runtime/proc.go:200
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1337
panic: failed to read value pointer from vlog file: {Fid:33 Len:334 Offset:347265756}: Unable to find log file. Please retry
Indeed, value log file 33 is no longer present.
Repeated attempts to open the database always failed with the same error. On inspection of the populateDiscardStats function, it looks like it doesn't properly handle the case where the !badger!discard value has been moved to a later value log file.
When I hacked some code into populateDiscardStats to handle ErrRetry, it seemed to fix the problem.