Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle bind failures when running NIO in a tight sandbox #2616

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
3 changes: 3 additions & 0 deletions Sources/NIOCore/Channel.swift
Original file line number Diff line number Diff line change
Expand Up @@ -371,6 +371,9 @@ public enum ChannelError: Error {

/// An attempt was made to remove a ChannelHandler that is not removable.
case unremovableHandler

/// An attempt to bind a `Channel` failed.
case bindFailed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sadly this is a breaking change but I don't think we need this case at all.

}

extension ChannelError: Equatable { }
Expand Down
18 changes: 11 additions & 7 deletions Sources/NIOPosix/SocketChannel.swift
Original file line number Diff line number Diff line change
Expand Up @@ -266,14 +266,18 @@ final class ServerSocketChannel: BaseSocketChannel<ServerSocket> {
promise?.fail(error)
}
executeAndComplete(p) {
switch target {
case .socketAddress(let address):
try socket.bind(to: address)
case .vsockAddress(let address):
try socket.bind(to: address)
do {
switch target {
case .socketAddress(let address):
try socket.bind(to: address)
case .vsockAddress(let address):
try socket.bind(to: address)
}
self.updateCachedAddressesFromSocket(updateRemote: false)
try self.socket.listen(backlog: backlog)
} catch {
promise?.fail(ChannelError.bindFailed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should definitely still fail the promise with the error produced by the socket.bind class. Otherwise it is hard to tell what went wrong. Could you explain why this approach works around the assert? I would like to understand how we actually ended up in the assert in the first place since something must have caused the call to readable and we just have to avoid that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after re thinking my approach the way I set up my sandbox to test this was not correct, thus nor is this solution. In the actual testChannelInactiveCancelScheduled I set up a class MockServerSocket: ServerSocket {} that essentially always simulate bind failure. However I did this by overriding bind but by doing that I am not hitting all of NIOs binding process.

Now I am overriding the standard bind sys call thats called by NIO by creating a bind function that always returns EPERM

int bind(int sockfd, const struct sockaddr *addr, socklen_t addrlen) {
    int (*original_bind)(int, const struct sockaddr *, socklen_t);
    original_bind = dlsym(RTLD_NEXT, "bind");
    errno = EPERM;
    return -1;
}

compile it down to a shared object and preload it and run the test. Going to use this approach to create a new solution here.

In the actual testChannelInactiveCancelScheduled we call readable since we want to test whether scheduled reads are closed when the channel is closed. The issue here is readable checks if the lifecycleManager is active, however at this stage it's only transitioned from fresh to preRegistered thus triggering the assert.

}
self.updateCachedAddressesFromSocket(updateRemote: false)
try self.socket.listen(backlog: backlog)
}
}

Expand Down