completeFilename causes invalid memory address or nil pointer dereference #49

rabbitstack · 2018-09-17T15:21:18Z

Hi

We're running a component in Kubernetes that uses diskv under the hood. The problem is that the process occasionally crashes when it attempts to remove the key from the store. Here is the relevant stack trace:

github.com/org/vendor/github.com/peterbourgon/diskv.(*Diskv).Erase(0xc42041c2d0, 0x0, 0x1b, 0x0, 0x0) /home/rabbit/org/vendor/github.com/peterbourgon/diskv/diskv.go:409 +0xe7
github.com/org/vendor/github.com/peterbourgon/diskv.(*Diskv).completeFilename(0xc42041c2d0, 0x0, 0x1b, 0x1b, 0x27fdf00) /home/rabbit/org/vendor/github.com/peterbourgon/diskv/diskv.go:525 +0x98
path/filepath.Join(0xc42110b970, 0x2, 0x2, 0xc4208e18c0, 0x11) /usr/lib/go/src/path/filepath/path.go:210
path/filepath.join(0xc42110b970, 0x2, 0x2, 0x0, 0x0) /usr/lib/go/src/path/filepath/path_unix.go:45 +0x96
strings.Join(0xc42110b970, 0x2, 0x2, 0x18d6236, 0x1, 0xc42110b918, 0x2) /usr/lib/go/src/strings/strings.go:424

Data directory is mounted as a regular host path (/opt/spm/agent) and file names are ksuid-compatible identifiers.

diskv is initialized with the following configuration:

d := diskv.New(diskv.Options{
	BasePath:     c.Dir,
	Transform:    func(s string) []string { return []string{} },
	CacheSizeMax: 1024 * 1024,
})

Do you have any pointers or ideas why this would happen?


rabbitstack · 2018-09-25T13:21:14Z

@peterbourgon Do you have any smart ideas or suggestions where I should start looking in the code to find the root cause?

peterbourgon · 2018-09-25T16:00:57Z

That stack trace identifies the callstack, but it doesn't give the specific error that gets bubbled up. Do you have that?

rabbitstack · 2018-09-25T16:16:49Z

All I have is the panic error message:

panic: runtime error: invalid memory address or nil pointer dereference

peterbourgon · 2018-09-25T16:28:43Z

And which version of diskv?

rabbitstack · 2018-09-25T16:56:20Z

I'm using the latest version: 2.0.1

peterbourgon · 2018-09-25T18:24:37Z

Hmm. There is not much to go on. According to your stack trace, the nil pointer dereference is in the stdlib filepath.Join, and I'm not sure how that makes sense. Does this happen when you pass an empty key, for example? Can you paste the complete verbatim crash log?

rabbitstack · 2018-09-26T08:49:33Z

Yeah, the error is quite bizarre. We had 5 crashes of the k8s pod during the last 2 days. Here is the complete stack trace (some names are slightly modified to avoid exposing private data):

panic: runtime error: invalid memory address or nil pointer dereference

goroutine 16 [running]:
github.com/org/agent/outputs/lb.(*Output).Publish(0xc4204391e0, 0x1a8da00, 0xc420e72600, 0x0, 0x0) /home/usr/agent/src/github.com/org/agent/outputs/lb/lb.go:157 +0xa5d
github.com/org/agent/cmd/agent/app/pipeline.(*eventConsumer).ackJournal(0xc420690360, 0xc420e72600) /home/usr/agent/src/github.com/org/agent/cmd/agent/app/pipeline/consumer.go:211 +0xdd
github.com/org/agent/cmd/agent/app/journal/kv.(*dkv).ACK(0xc4202fa5a0, 0x0, 0x1b, 0x1a61d80, 0xc4200d9930) /home/usr/agent/src/github.com/org/agent/cmd/agent/app/journal/kv/kv.go:204 +0x42
github.com/org/agent/vendor/github.com/peterbourgon/diskv.(*Diskv).Erase(0xc4201d5050, 0x0, 0x1b, 0x0, 0x0) /home/usr/agent/src/github.com/org/agent/vendor/github.com/peterbourgon/diskv/diskv.go:409 +0xe7
github.com/org/agent/vendor/github.com/peterbourgon/diskv.(*Diskv).completeFilename(0xc4201d5050, 0x0, 0x1b, 0x1b, 0x27fcf00) /home/usr/agent/src/github.com/org/agent/vendor/github.com/peterbourgon/diskv/diskv.go:525 +0x98
path/filepath.Join(0xc42110b970, 0x2, 0x2, 0xc4208e18c0, 0x11) /usr/lib/go/src/path/filepath/path.go:210
path/filepath.join(0xc42110b970, 0x2, 0x2, 0x0, 0x0) /usr/lib/go/src/path/filepath/path_unix.go:45 +0x96
strings.Join(0xc42110b970, 0x2, 0x2, 0x18d6236, 0x1, 0xc42110b918, 0x2) /usr/lib/go/src/strings/strings.go:424

I already tried to reproduce it by passing an empty key, but the crash doesn't occur. For what it's worth, I also identified the exact place in the Go stdlib (path_unix.go) where it panics:

func join(elem []string) string {
	// If there's a bug here, fix the logic in ./path_plan9.go too.
	for i, e := range elem {
		if e != "" {
			return Clean(strings.Join(elem[i:], string(Separator))) // <- panic
		}
	}
	return ""
}
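For anyone repeating the empty-key check, here is a minimal sketch of it. The `completePath` helper is hypothetical: it only approximates how a store might combine the base path, the Transform output, and the key, and is not copied from diskv. It assumes a Unix-style path separator. With a well-formed (even empty) key, `filepath.Join` simply returns a cleaned path and does not panic:

```go
package main

import (
	"fmt"
	"path/filepath"
)

// flatTransform mirrors the Transform from the configuration above:
// it returns no path segments, so every key lands directly in the base dir.
func flatTransform(key string) []string { return []string{} }

// completePath is a hypothetical approximation of building the on-disk
// path for a key: base dir + transformed segments + the key itself.
func completePath(base, key string) string {
	dir := filepath.Join(base, filepath.Join(flatTransform(key)...))
	return filepath.Join(dir, key)
}

func main() {
	// An empty key yields the base directory; Join skips empty elements.
	fmt.Println(completePath("/opt/spm/agent", ""))
	// A non-empty key is appended as the file name.
	fmt.Println(completePath("/opt/spm/agent", "some-key"))
}
```

So an empty key alone would not be expected to explain a nil pointer dereference inside `strings.Join`.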

The last thing that occurs to me is that this could be something k8s-specific, since the data dir is mapped to a host volume inside a DaemonSet.

peterbourgon · 2018-09-29T22:14:23Z

I feel safe asserting that crashing in strings.Join is a red herring, the problem is elsewhere. If it's happening in a Kubernetes volume mount, my guess is something at the filesystem layer. My suggestion would be to come up with some patch for Erase that logged a bunch of debug information, and have you run that build for a day or two, and see what it says when it goes kaput. Would this work?
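One possible shape for that debug instrumentation, done on the caller side rather than as a patch to diskv itself, is sketched below. Everything here is an assumption for illustration: the `eraser` interface, the `debugErase` wrapper, and the `fakeStore` type are hypothetical names, and the only thing taken from the thread is that the store exposes an `Erase` method taking a string key. The wrapper logs the key length first (which does not touch the key's bytes), then the key itself, and turns any panic into an error so the process can report more context before going down:

```go
package main

import (
	"fmt"
	"log"
)

// eraser is a hypothetical stand-in for the single store method used here.
type eraser interface {
	Erase(key string) error
}

// debugErase logs details about the key and converts a panic raised while
// logging or erasing into an error that carries the key length.
func debugErase(s eraser, key string) (err error) {
	// Install the recover first so that even formatting a corrupted key
	// cannot take the process down without a log line.
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("erase panicked (key length %d): %v", len(key), r)
			log.Print(err)
		}
	}()
	log.Printf("erase: key length=%d", len(key)) // len does not read the bytes
	log.Printf("erase: key=%q", key)             // this does read the bytes
	return s.Erase(key)
}

// fakeStore is a test double; it panics on demand to exercise the wrapper.
type fakeStore struct{ panics bool }

func (f fakeStore) Erase(key string) error {
	if f.panics {
		panic("simulated failure inside Erase")
	}
	return nil
}

func main() {
	fmt.Println(debugErase(fakeStore{}, "example-key"))
	fmt.Println(debugErase(fakeStore{panics: true}, "example-key"))
}
```

Recovering here is only meant for gathering evidence; whether the process should keep running after such a panic is a separate decision.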

rabbitstack · 2018-10-01T09:46:21Z

Sounds good. Let me reach out to you again after I gather some relevant debug info.
Thanks!

rabbitstack · 2018-10-29T11:41:10Z

Hi @peterbourgon

Our conclusion is that a nasty race condition in our code makes the key invalid (the runtime attempts to dereference a nil string pointer), which leads to the crash. I'll close this issue as it seems it's not related to diskv.
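For readers who land here with a similar trace: a Go string value is a (pointer, length) pair, and the `0x0, 0x1b` arguments shown for `Erase` in the trace appear consistent with a key whose data pointer is nil while its length is 27. An unsynchronized write racing with a read is one way such a mismatched pair could be observed. The sketch below is not the agent's actual code, which is not shown in this thread; `keyBox` is a hypothetical holder that guards a shared key with a mutex so readers always see a consistent value:

```go
package main

import (
	"fmt"
	"sync"
)

// keyBox is a hypothetical holder for a key shared between goroutines.
// The mutex ensures the string header is never read while being written.
type keyBox struct {
	mu  sync.Mutex
	key string
}

// Set replaces the stored key under the lock.
func (b *keyBox) Set(k string) {
	b.mu.Lock()
	b.key = k
	b.mu.Unlock()
}

// Get returns a copy of the stored key under the lock.
func (b *keyBox) Get() string {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.key
}

func main() {
	var b keyBox
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			b.Set(fmt.Sprintf("key-%d", i)) // concurrent writers
			_ = b.Get()                     // concurrent readers
		}(i)
	}
	wg.Wait()
	fmt.Println(b.Get())
}
```

Building or testing with the race detector (`go run -race`, `go test -race`) is usually the quickest way to locate the unsynchronized access in the first place.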

Thanks for being supportive.

peterbourgon · 2018-10-29T19:47:21Z

🙇

rabbitstack changed the title ~~completeFilename causes invalid memory address or nil pointer dereference~~ completeFilename causes invalid memory address or nil pointer dereference Sep 17, 2018

peterbourgon added the bug label Sep 17, 2018

rabbitstack closed this as completed Oct 29, 2018
