Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

completeFilename causes invalid memory address or nil pointer dereference #49

Closed
rabbitstack opened this issue Sep 17, 2018 · 11 comments
Closed
Labels

Comments

@rabbitstack
Copy link

rabbitstack commented Sep 17, 2018

Hi

We're running a component in Kubernetes that uses diskv under the hood. The problem is that the process occasionally crashes when it attempts to remove the key from the store. Here is the relevant stack trace:

github.com/org/vendor/github.com/peterbourgon/diskv.(*Diskv).Erase(0xc42041c2d0, 0x0, 0x1b, 0x0, 0x0) /home/rabbit/org/vendor/github.com/peterbourgon/diskv/diskv.go:409 +0xe7
github.com/org/vendor/github.com/peterbourgon/diskv.(*Diskv).completeFilename(0xc42041c2d0, 0x0, 0x1b, 0x1b, 0x27fdf00) /home/rabbit/org/vendor/github.com/peterbourgon/diskv/diskv.go:525 +0x98
path/filepath.Join(0xc42110b970, 0x2, 0x2, 0xc4208e18c0, 0x11) /usr/lib/go/src/path/filepath/path.go:210
path/filepath.join(0xc42110b970, 0x2, 0x2, 0x0, 0x0) /usr/lib/go/src/path/filepath/path_unix.go:45 +0x96
strings.Join(0xc42110b970, 0x2, 0x2, 0x18d6236, 0x1, 0xc42110b918, 0x2) /usr/lib/go/src/strings/strings.go:424

Data directory is mounted as a regular host path (/opt/spm/agent) and file names are ksuid-compatible identifiers.

diskv is initialized with following configuration:

d := diskv.New(diskv.Options{
		BasePath:     c.Dir,
		Transform:    func(s string) []string { return []string{} },
		CacheSizeMax: 1024 * 1024,
})

Do you have any pointers or ideas why this would happen?

@rabbitstack rabbitstack changed the title completeFilename causes invalid memory address or nil pointer dereference completeFilename causes invalid memory address or nil pointer dereference Sep 17, 2018
@rabbitstack
Copy link
Author

@peterbourgon Do you have any smart ideas or suggestions where I should start looking in the code to find the root cause?

@peterbourgon
Copy link
Owner

That stack trace identifies the callstack, but it doesn't give the specific error that gets bubbled up. Do you have that?

@rabbitstack
Copy link
Author

All I have is the panic error message:

panic: runtime error: invalid memory address or nil pointer dereference

@peterbourgon
Copy link
Owner

And which version of diskv?

@rabbitstack
Copy link
Author

I'm using the latest version: 2.0.1

@peterbourgon
Copy link
Owner

peterbourgon commented Sep 25, 2018

Hmm. There is not much to go on. According to your stack trace, the nil pointer dereference is in the stdlib filepath.Join, and I'm not sure how that makes sense. Does this happen when you pass an empty key, for example? Can you paste the complete verbatim crash log?

@rabbitstack
Copy link
Author

rabbitstack commented Sep 26, 2018

yeah, the error is quite bizarre. We had 5 crashes of k8s pod during last 2 days. Here is the complete stack trace (there are slightly naming modifications to avoid exposing private data):

panic: runtime error: invalid memory address or nil pointer dereference

goroutine 16 [running]:
github.com/org/agent/outputs/lb.(*Output).Publish(0xc4204391e0, 0x1a8da00, 0xc420e72600, 0x0, 0x0) /home/usr/agent/src/github.com/org/agent/outputs/lb/lb.go:157 +0xa5d
github.com/org/agent/cmd/agent/app/pipeline.(*eventConsumer).ackJournal(0xc420690360, 0xc420e72600) /home/usr/agent/src/github.com/org/agent/cmd/agent/app/pipeline/consumer.go:211 +0xdd
github.com/org/agent/cmd/agent/app/journal/kv.(*dkv).ACK(0xc4202fa5a0, 0x0, 0x1b, 0x1a61d80, 0xc4200d9930) /home/usr/agent/src/github.com/org/agent/cmd/agent/app/journal/kv/kv.go:204 +0x42
github.com/org/agent/vendor/github.com/peterbourgon/diskv.(*Diskv).Erase(0xc4201d5050, 0x0, 0x1b, 0x0, 0x0) /home/usr/agent/src/github.com/org/agent/vendor/github.com/peterbourgon/diskv/diskv.go:409 +0xe7
github.com/org/agent/vendor/github.com/peterbourgon/diskv.(*Diskv).completeFilename(0xc4201d5050, 0x0, 0x1b, 0x1b, 0x27fcf00) /home/usr/agent/src/github.com/org/agent/vendor/github.com/peterbourgon/diskv/diskv.go:525 +0x98
path/filepath.Join(0xc42110b970, 0x2, 0x2, 0xc4208e18c0, 0x11) /usr/lib/go/src/path/filepath/path.go:210
path/filepath.join(0xc42110b970, 0x2, 0x2, 0x0, 0x0) /usr/lib/go/src/path/filepath/path_unix.go:45 +0x96
strings.Join(0xc42110b970, 0x2, 0x2, 0x18d6236, 0x1, 0xc42110b918, 0x2) /usr/lib/go/src/strings/strings.go:424

I already tried to reproduce by passing an empty key, but the crash doesn't occur. For what's worth, I also identified the exact place in Go stdlib (path_unix.go) where it panics:

func join(elem []string) string {
	// If there's a bug here, fix the logic in ./path_plan9.go too.
	for i, e := range elem {
		if e != "" {
			return Clean(strings.Join(elem[i:], string(Separator))) <- panic
		}
	}
	return ""
}

The last that occur to me is that this could be something k8s-specific, since data dir is mapped to host volume inside daemonset.

@peterbourgon
Copy link
Owner

I feel safe asserting that crashing in strings.Join is a red herring, the problem is elsewhere. If it's happening in a Kubernetes volume mount, my guess is something at the filesystem layer. My suggestion would be to come up with some patch for Erase that logged a bunch of debug information, and have you run that build for a day or two, and see what it says when it goes kaput. Would this work?

@rabbitstack
Copy link
Author

Sounds good. Let me reach out to you again after I gather some relevant debug info.
Thanks!

@rabbitstack
Copy link
Author

rabbitstack commented Oct 29, 2018

Hi @peterbourgon

The conclusion we have is that a nasty race condition is occurring in the code and thus making the key invalid (runtime attempts to dereference a nil string pointer) which leads to the crash. I'll close this issue as it seems it's not related to diskv.

Thanks for being supportive.

@peterbourgon
Copy link
Owner

🙇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants