Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vsock: Use net.FileConn to create VirtioSocketConnection #63

Merged
merged 4 commits into from
Oct 12, 2022

Conversation

cfergeau
Copy link
Contributor

Currently, newVirtioSocketConnection manually sets its objc file descriptor
to non-blocking, calls dup() on it, ... so that it matches what go expects
from a file descriptor.

go provides a helper doing all of this for us: net.FileConn() This commit
makes use of that helper. This will allow for more code simplifications in
the next commits.

However this removes vz.VirtioSocketConnection.FileDescriptor() since the
file descriptor was created internally by net.FileConn(). It would be
possible to keep it, but I'm not sure it's really useful. It would be more
consistent with other go APIs to add a File() method if we want to expose
this.

One side-effect of this change is to fix connections returned by
ConnectToPort. At the moment, they are missing a call to Dup() for the file
descriptor they use. Without this, their file descriptor is closed too
early, and the connection cannot be used.

Which issue(s) this PR fixes:

This fixes #54

Currently, newVirtioSocketConnection manually sets its objc file
descriptor to non-blocking, calls dup() on it, ... so that it
matches what go expects from a file descriptor.

go provides a helper doing all of this for us: net.FileConn()
This commit makes use of that helper. This will allow for more code
simplifications in the next commits.

However this removes vz.VirtioSocketConnection.FileDescriptor() since
the file descriptor was created internally by net.FileConn().
It would be possible to keep it, but I'm not sure it's really useful.
It would be more consistent with other go APIs to add a File() method if
we want to expose this.

One side-effect of this change is to fix connections returned by
ConnectToPort. At the moment, they are missing a call to Dup() for the
file descriptor they use. Without this, their file descriptor is closed
too early, and the connection cannot be used.

This fixes Code-Hex#54
The source and destination vsock ports for VirtioSocketConnection are
set both as `sourcePort` and `destinationPort`, but also as part of
`laddr` and `raddr`. This commit only keeps the latter as it's redundant
to have both.
socket.go Outdated Show resolved Hide resolved
vzVirtioSocketConnection := C.convertVZVirtioSocketConnection2Flat(ptr)
err := unix.SetNonblock(int(vzVirtioSocketConnection.fileDescriptor), true)
file := os.NewFile((uintptr)(vzVirtioSocketConnection.fileDescriptor), "")
defer file.Close()
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is unnecessary because of closed the file descriptor in the virtualization framework.

Suggested change
defer file.Close()

Copy link
Contributor Author

@cfergeau cfergeau Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't explicitly close the file, go garbage collector may do it for us: https://pkg.go.dev/os#File.Fd

If f is garbage collected, a finalizer may close the file descriptor, making it invalid;

This is problematic if the following sequence happens:

  • file := os.NewFile((uintptr)(vzVirtioSocketConnection.fileDescriptor), "")
  • virtualization framework closes vzVirtioSocketConnection.fileDescriptor
  • another parts of the code needs a file descriptor, the OS/system library decides to reuse the same fd as vzVirtioSocketConnection.fileDescriptor
  • the go garbage collector closes the file descriptor used by file, which is no longer the expected one and will cause unexpected issues

Copy link
Owner

@Code-Hex Code-Hex Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why needs to call close method from your comment.

I think garbage collection is unrelated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://gist.githubusercontent.com/cfergeau/46a4a4fda70a83dbe2a9bf519cbabd28/raw/530c7d9a97e7a823fb324552bda470ad11fbbab9/fd.go
This shows the race which can happen when you call os.NewFile() on a file descriptor, and then other code closes the file descriptor.

By calling Close() in newVirtioSocketConnection, we know we are closing the right file descriptor. The garbage collector won't call Close() on it later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If vzVirtioSocketConnection.fileDescriptor is owned by the virtualization framework, the code below would even be more correct. Since net.FileConn() will also Dup() the file descriptor, this feels a bit overkill :-/

diff --git a/socket.go b/socket.go
index 173cae6..5770618 100644
--- a/socket.go
+++ b/socket.go
@@ -12,6 +12,7 @@ import (
        "os"
        "runtime"
        "runtime/cgo"
+       "syscall"
        "unsafe"
 
        "golang.org/x/sys/unix"
@@ -192,7 +193,11 @@ type VirtioSocketConnection struct {
 
 func newVirtioSocketConnection(ptr unsafe.Pointer) (*VirtioSocketConnection, error) {
        vzVirtioSocketConnection := C.convertVZVirtioSocketConnection2Flat(ptr)
-       file := os.NewFile((uintptr)(vzVirtioSocketConnection.fileDescriptor), "")
+       fd, err := syscall.Dup(int(vzVirtioSocketConnection.fileDescriptor))
+       if err != nil {
+               return nil, err
+       }
+       file := os.NewFile(uintptr(fd), "")
        defer file.Close()
        rawConn, err := net.FileConn(file)
        if err != nil {

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

net.FileConn will dup file descriptor. https://pkg.go.dev/net#FileConn

FileConn returns a copy of the network connection corresponding to the open file f.

Copy link
Contributor Author

@cfergeau cfergeau Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I believe the original file descriptor should be closed in the framework (in Objective-C) instead of Go side.

The problem is that when os.NewFile(fd) is used, go will always call Close(fd). If our code don't call Close(), then the go runtime/go garbage collector will call Close() automatically. If the go garbage collector closes the file, we don't control when it will do so.

But I agree to call Close method for duplicated file descriptor in Go

Are you referring to net.FileConn(file) ? Or to another file descriptor?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If our code don't call Close(), then the go runtime/go garbage collector will call Close() automatically.

This is done here: https://github.com/golang/go/blob/4bcf94b0232db65ed5df47e0127cdbc8866aec64/src/os/file_unix.go#L196

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfergeau I know the Go call Close method in finalizer. Bu the timing is different with using defer.

Yes. about net.FileConn which dup from original file descriptor.

Are you referring to net.FileConn(file) ? Or to another file descriptor?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind either now. Think about it after we have a problem.

Since VirtioSocketConnection is now backed by a net.Conn instance,
we can let go reuse the net.Conn methods from the backing instance
instead of reimplementing them ourselves.
The dup/dupCh types are now badly named as the dup() function is gone.
They seem redundant though.
shouldAcceptNewConnectionHandler will queue (conn, err) in dupCh.
NewVirtioSocketListener() started a go routine which iterates on
dupCh, and runs `handler(conn, err)` in its own go routine.

It seems shouldAcceptNewConnectionHandler could directly run
`handler(conn, err)` in a new go routine instead of going through an
intermediate channel before doing this. This makes the code simpler.
@Code-Hex
Copy link
Owner

@cfergeau Could you check also this issue has been fixed or not by your PR?

#13

@balajiv113
Copy link
Contributor

@Code-Hex
I just gave a try with this PR. Facing same issue, getting segmentation fault after ~1min.

OS - macOS 12
VM - Ubuntu focal

@cfergeau
Copy link
Contributor Author

I just gave a try with this PR. Facing same issue, getting segmentation fault after ~1min.

If you try the workaround from #13 (comment) (remove the SetFinalizer call in socket.go line 81), does this also segfault?

@Code-Hex
Copy link
Owner

Code-Hex commented Oct 12, 2022

Ah... I see. this is required a keepalive.
@balajiv113 Could you try this one after applying this patch by pbpaste | git apply?

diff --git a/socket.go b/socket.go
index 9c9dbf9..d38b07e 100644
--- a/socket.go
+++ b/socket.go
@@ -91,6 +91,7 @@ func newVirtioSocketDevice(ptr, dispatchQueue unsafe.Pointer) *VirtioSocketDevic
 // see: https://developer.apple.com/documentation/virtualization/vzvirtiosocketdevice/3656679-setsocketlistener?language=objc
 func (v *VirtioSocketDevice) SetSocketListenerForPort(listener *VirtioSocketListener, port uint32) {
 	C.VZVirtioSocketDevice_setSocketListenerForPort(v.Ptr(), v.dispatchQueue, listener.Ptr(), C.uint32_t(port))
+	runtime.KeepAlive(v)
 }
 
 // RemoveSocketListenerForPort removes the listener object from the specfied port.

@balajiv113
Copy link
Contributor

@Code-Hex Sure, will give a try

@balajiv113
Copy link
Contributor

I tried both the options,

  1. Workaround here zsh: segmentation fault ./virtualization #13 (comment) - This is working, no seg fault for more than 30mins
  2. runtime.KeepAlive(v) - This ended with seg fault around ~2mins

@cfergeau
Copy link
Contributor Author

cfergeau commented Oct 12, 2022

I tried both the options,

1. Workaround here

Apple engineers have the same fix in their fork of Code-Hex/vz kata-container@1b1a7e9
I don't fully understand why this makes a difference :-/
kata-container@5587aa9 explains why this is done:

It is owned and maintained by the VMM. As a caller, we should not release it. If we do, we'll make it a dangling pointer.

The VMM in this context must be Apple's virtualization framework.

@Code-Hex
Copy link
Owner

@cfergeau I see. I didn't know that. I introduce those fixes.
Thanks

Copy link
Owner

@Code-Hex Code-Hex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@Code-Hex Code-Hex merged commit 3d5cf50 into Code-Hex:master Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ConnectToPort() needs to Dup() its VirtioSocketConnection
3 participants