Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cqueues.socket.fdopen always uses dup(); which still shares things with the original socket #13

Closed
daurnimator opened this issue Oct 22, 2014 · 8 comments

Comments

@daurnimator
Copy link
Collaborator

I encountered an issue today, where the descriptor that got dup()d was setting !O_NONBLOCK on close


From here

I'll accept a patch that adds an option to tell .fdopen to take ownership of the existing descriptor number.

First and foremost, this issue is to track this potential patch

However, contrary to what you wrote:

because stale descriptors are perhaps the biggest source of bugs in asynchronous networking software, I prefer that the default behavior is to duplicate the descriptor.

I think it is a more sensible default to have cqueue's fdopen just take over the existing file descriptor; as we can't know what is going to happen to the old one.


My current 'take over from luasocket' code:
local sock = cqueues_socket.fdopen(conn:getfd())
-- cqueues uses dup(), so we close original socket
conn:close()
-- However, luasocket's close() sets it back into 'blocking' mode, as flags are shared between dup()'d fds.
do -- But if we re-dup cqueues' socket, it can re-establish it's flags.
    local newsock = cqueues_socket.fdopen(sock:pollfd())
    sock:close()
    sock = newsock
end
What it could be instead:
local sock = cqueues_socket.fdopen(conn:getfd())
conn:setfd(-1) -- Means invalid; luasocket will not clean up.
@wahern
Copy link
Owner

wahern commented Oct 23, 2014

That's an interesting case. Although a socket having O_NONBLOCK unset is arguably less problematic than a descriptor becoming invalidated, at least from the perspective of the library. OTOH, with luasocket it's an inevitability rather than a possibility.

I'm already committed to adding the option to take ownership; still not sure about changing the default behavior.

I would also file a bug report with luasocket. O_NONBLOCK has nothing to do with whether close blocks, nor whether it returns without closing the descriptor. Per POSIX, "the requirement for close() on a socket to block for up to the current linger interval is not conditional on the O_NONBLOCK setting."

Also, neither POSIX, Linux, OS X, FreeBSD, NetBSD, OpenBSD, Solaris, nor AIX document an EAGAIN/EWOULDBLOCK error for the close system call. The next issue of POSIX excludes EAGAIN/EWOULDBLOCK as a possibility, entirely. (See http://austingroupbugs.net/view.php?id=529) That really drives home the point that O_NONBLOCK is irrelevant in the context of close.

@daurnimator
Copy link
Collaborator Author

I would also file a bug report with luasocket.

@diegonehab, are you able to comment?

neither POSIX, Linux, OS X, FreeBSD, NetBSD, OpenBSD, Solaris, nor AIX document an EAGAIN/EWOULDBLOCK error for the close system call. The next issue of POSIX excludes EAGAIN/EWOULDBLOCK as a possibility, entirely. (See http://austingroupbugs.net/view.php?id=529) That really drives home the point that O_NONBLOCK is irrelevant in the context of close.

Looking at the original commit, that might not be the case on windows?
https://github.com/diegonehab/luasocket/blob/0c9f420a3549df3fb331bb24157b65a3301641d4/src/wsocket.c#L57

Which seems to have been copied across to https://github.com/diegonehab/luasocket/blob/0c9f420a3549df3fb331bb24157b65a3301641d4/src/usocket.c#L51

In any case, that code has been there since 2004; and will likely live on in packages for a long time.

@diegonehab
Copy link

This was added long enough ago that I do not remember why it was added. You seem to be using LuaSocket in a way it was not designed for. Can you explain the usage scenario where this is a problem?

@daurnimator
Copy link
Collaborator Author

This was added long enough ago that I do not remember why it was added.

It looks like it was in the initial commit of new code, so quite possibly it was just what came first to mind at the time?
As mentioned in this thread, close() cannot block on any unix system we know of, so it should be safe to remove.

Can you explain the usage scenario where this is a problem?

I'm writing additional implementations of a couple of apis using cqueues.
Some of these in-use apis have methods that take luasocket objects.
I'm trying to convert these luasockets to cqueues sockets.

@diegonehab
Copy link

Can it block on Windows? I would accept a pull request for this change.

@daurnimator
Copy link
Collaborator Author

On windows it appers that it can nonblock; so I'll leave the call in there for now.
(see http://msdn.microsoft.com/en-us/library/windows/desktop/ms737582(v=vs.85).aspx)

daurnimator added a commit to daurnimator/luasocket that referenced this issue Oct 27, 2014
This results in unexpected behaviour if the socket has been `dup()`d, as O_NONBLOCK is shared.
Close is always 'blocking' anyway

See wahern/cqueues#13 for an example use case
@wahern
Copy link
Owner

wahern commented Jan 12, 2015

Renamed undocumented socket.fdopen to socket.dup, and implemented a socket.fdopen which takes ownership of the specified file descriptor number.

See 5073dfe and 3295ea6.

@daurnimator
Copy link
Collaborator Author

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants