Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Occasional runtime/cgo: misuse of an invalid Handle panic when using vsock #131

Closed
cfergeau opened this issue Jul 3, 2023 · 6 comments · Fixed by #135
Closed

Occasional runtime/cgo: misuse of an invalid Handle panic when using vsock #131

cfergeau opened this issue Jul 3, 2023 · 6 comments · Fixed by #135
Labels
bug Something isn't working

Comments

@cfergeau
Copy link
Contributor

cfergeau commented Jul 3, 2023

Describe the bug

When using vsock to connect from the VM to the host, I sometimes get a panic:

panic: runtime/cgo: misuse of an invalid Handle

goroutine 17 [running, locked to thread]:
runtime/cgo.Handle.Value(...)
	/usr/local/go/src/runtime/cgo/handle.go:124
github.com/Code-Hex/vz/v3.shouldAcceptNewConnectionHandler(0x14000182000?, 0x0?, 0x14000000001?)
	/Users/teuf/go/pkg/mod/github.com/!code-!hex/vz/v3@v3.0.6/socket.go:231 +0x11c

To Reproduce

I'm afraid I don't have a clear reproducer.

Expected behavior
No panic
Screenshots
If applicable, add screenshots to help explain your problem.

Environment that you use to compile (please complete the following information):

  • Xcode version: [xcodebuild -version | pbcopy]
  • macOS Version: [sw_vers | pbcopy]
ProductName:		macOS
ProductVersion:		13.4.1
BuildVersion:		22F82
  • mac architecture: [arm64]
  • Go Version: [go version | pbcopy]
% go version
go version go1.20.5 darwin/arm64

Additional context
The fix might be similar to some of the recent fixes in the v3 branch (?)

@cfergeau cfergeau added the bug Something isn't working label Jul 3, 2023
cfergeau added a commit to cfergeau/vz that referenced this issue Jul 7, 2023
Using `make test/run TARGET=TestRunIssue131`, I can reproduce Code-Hex#131 reliably
on my M1:

Run socat as vsock ssh proxyserver (port=2222)

Login with root and no password.

localhost login: [    3.053740] random: crng init done
finalizing handler 0x140000a2260 value: 0x100a3b020
panic: runtime/cgo: misuse of an invalid Handle

goroutine 17 [running, locked to thread]:
runtime/cgo.Handle.Value(...)
        /usr/local/go/src/runtime/cgo/handle.go:124
github.com/Code-Hex/vz/v3.shouldAcceptNewConnectionHandler(0x14000102000?, 0x0?, 0x14000000001?)
        /Users/teuf/dev/vz/socket.go:235 +0x11c
2023/07/07 13:49:05 exit status 2
exit status 1
FAIL    github.com/Code-Hex/vz/v3       5.220s
testing: warning: no tests to run
PASS
ok      github.com/Code-Hex/vz/v3/internal/progress     0.351s [no tests to run]
FAIL
make: *** [test/run] Error 1
@cfergeau
Copy link
Contributor Author

cfergeau commented Jul 7, 2023

I've managed to reproduce it in this branch: https://github.com/cfergeau/vz/tree/issue_131:

$ make test/run TARGET=TestRunIssue131
[...]
Run socat as vsock ssh proxyserver (port=2222)

Login with root and no password.

localhost login: [    3.053740] random: crng init done
finalizing handler 0x140000a2260 value: 0x100a3b020
panic: runtime/cgo: misuse of an invalid Handle

goroutine 17 [running, locked to thread]:
runtime/cgo.Handle.Value(...)
        /usr/local/go/src/runtime/cgo/handle.go:124
github.com/Code-Hex/vz/v3.shouldAcceptNewConnectionHandler(0x14000102000?, 0x0?, 0x14000000001?)
        /Users/teuf/dev/vz/socket.go:235 +0x11c
2023/07/07 13:49:05 exit status 2
exit status 1
FAIL    github.com/Code-Hex/vz/v3       5.220s
testing: warning: no tests to run
PASS
ok      github.com/Code-Hex/vz/v3/internal/progress     0.351s [no tests to run]
FAIL
make: *** [test/run] Error 1

cfergeau added a commit to cfergeau/vz that referenced this issue Jul 7, 2023
Using `make test/run TARGET=TestRunIssue131`, I can reproduce Code-Hex#131 reliably
on my M1:

Run socat as vsock ssh proxyserver (port=2222)

Login with root and no password.

localhost login: [    3.053740] random: crng init done
finalizing handler 0x140000a2260 value: 0x100a3b020
panic: runtime/cgo: misuse of an invalid Handle

goroutine 17 [running, locked to thread]:
runtime/cgo.Handle.Value(...)
        /usr/local/go/src/runtime/cgo/handle.go:124
github.com/Code-Hex/vz/v3.shouldAcceptNewConnectionHandler(0x14000102000?, 0x0?, 0x14000000001?)
        /Users/teuf/dev/vz/socket.go:235 +0x11c
2023/07/07 13:49:05 exit status 2
exit status 1
FAIL    github.com/Code-Hex/vz/v3       5.220s
testing: warning: no tests to run
PASS
ok      github.com/Code-Hex/vz/v3/internal/progress     0.351s [no tests to run]
FAIL
make: *** [test/run] Error 1
@mpoindexter
Copy link
Contributor

I also see this, it seems like it's timing dependent.

I think the problem here is how cgo.Handle values are passed/used between the Go side and the native side. Everywhere this project passes a cgo.Handle to the native side it's done by the pattern unsafe.Pointer(&cgoHandle). This pattern is recommended in the docs as the way to pass a cgo.Handle to native code that expects a void *, but when the native side then stores this value for use in callback it violates one of the cgo pointer use rules at https://pkg.go.dev/cmd/cgo#hdr-Passing_pointers - namely C code may not keep a copy of a Go pointer after the call returns. Violation of this rule seems to create a pointer to a memory location that can be stomped on in a timing dependent way.

I think a better way to handle this is to pass the cgo.Handle value to native code as a uintptr_t since all the methods called by the Go side are defined by this project. In a few cases this means that the uintptr_t will have to be cast to a void * on the native side, but this is OK - it's just the cast in Go to an unsafe.Pointer that is not safe, see https://groups.google.com/g/golang-nuts/c/nL2FGeIWpfQ

I've made the changes above locally, and it makes the test created by @cfergeau pass (well, actually just time out after 5 minutes of running, but no longer crash with the invalid handle error)

I can create a PR with my changes if this seems like a good change, please let me know @Code-Hex

@mpoindexter
Copy link
Contributor

@Code-Hex
Copy link
Owner

@cfergeau Thanks to create this issue! and I'm sorry for the late.

Good catch @mpoindexter !!
Could you create PR? Thank you so much!

@cfergeau
Copy link
Contributor Author

Great to see this fixed! Do you want a PR to add the test case from https://github.com/cfergeau/vz/tree/issue_131 to this repo to prevent future regressions?

@mpoindexter
Copy link
Contributor

mpoindexter commented Aug 16, 2023 via email

cfergeau added a commit to cfergeau/vfkit that referenced this issue Oct 3, 2023
The fix for Code-Hex/vz#131 is not in a tagged
Code-Hex/vz release, but this can trigger crashes in vfkit.
This commit updates Code-Hex/vz to latest git main to get this fix.
praveenkumar pushed a commit to crc-org/vfkit that referenced this issue Oct 3, 2023
The fix for Code-Hex/vz#131 is not in a tagged
Code-Hex/vz release, but this can trigger crashes in vfkit.
This commit updates Code-Hex/vz to latest git main to get this fix.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants