non-blocking connects with multiple DNS entries return EALREADY #99

Closed
daurnimator opened this issue Jun 27, 2014 · 41 comments

Comments

@daurnimator
Contributor

Non-blocking connects to a host with multiple DNS entries keep iterating through the entries after getting EINPROGRESS and end up returning EALREADY instead.

s=require"socket"
print(s._VERSION)
m=s.tcp()
m:settimeout(0)
print(m:connect("google.com",80))
Output:
LuaSocket 3.0-rc1
nil     Operation already in progress

Expected nil, "timeout" instead.

@deryni

deryni commented Jun 27, 2014

I did a little digging into this when daurnimator first mentioned it in the Prosody MUC. It looked to me like the address loop simply fails to treat timeout as a valid response for an address and incorrectly continues looping.

@daurnimator
Contributor Author

Yep, the bug is here: https://github.com/diegonehab/luasocket/blob/master/src/inet.c#L418

Though a proper fix might require a bit of a refactor. Calling socket_strerror instead of checking the result of socket_connect directly is a bit weird IMO

@diegonehab
Contributor

How do you propose we do a non-blocking connect when there are multiple addresses? I mean, what would the semantics be?

@daurnimator
Contributor Author

How do you propose we do a non-blocking connect when there are multiple addresses? I mean, what would the semantics be?

If connect() returns ETIMEDOUT then return nil, "timeout".

It's the only thing that makes sense in the current API.

@diegonehab
Contributor

Without testing the other ones? What if one of them would return connected immediately?

@daurnimator
Contributor Author

Is calling connect() multiple times supported?
Most usages of luasocket I know of call connect(), select() until it's writable, then proceed.

If that's the case, the proper procedure should probably be that they getaddrinfo() themselves, and try the multiple possible results. They may even want to do this in tandem.

@diegonehab
Contributor

I think that users should write their own loops when connecting in non-blocking mode. On the other hand, the current behavior is not very informative, so perhaps we should change the implementation to wait after the call to connect? Whatever gets done, we should make sure that it also works on Windows.

@okroth

okroth commented Dec 21, 2014

I found that the POSIX documentation specifies that connect(), when called on a (non-blocking) socket that is currently connecting, shall fail with EALREADY.
socket_connect in usocket.c, however, checks for EAGAIN, which, according to the Linux man pages, indicates "Insufficient entries in the routing cache."
I guess that if this is corrected, things will go much more smoothly.

@okroth

okroth commented Dec 21, 2014

Update for the Windows implementation:
The wsocket.c socket_connect() function lacks the test for WSAEINVAL as indication that the socket is connecting.

Microsoft specifies (http://msdn.microsoft.com/en-US/library/windows/desktop/ms737625%28v=vs.85%29.aspx):

"Until the connection attempt completes on a nonblocking socket, all subsequent calls to connect on the same socket will fail with the error code WSAEALREADY, and WSAEISCONN when the connection completes successfully. Due to ambiguities in version 1.1 of the Windows Sockets specification, error codes returned from connect while a connection is already pending may vary among implementations. As a result, it is not recommended that applications use multiple calls to connect to detect connection completion. If they do, they must be prepared to handle WSAEINVAL and WSAEWOULDBLOCK error values the same way that they handle WSAEALREADY, to assure robust operation."

@okroth

okroth commented Jan 7, 2015

POSIX says this about connect() with non-blocking sockets:
"If the connection cannot be established immediately and O_NONBLOCK is set for the file descriptor for the socket, connect() shall fail and set errno to [EINPROGRESS], but the connection request shall not be aborted, and the connection shall be established asynchronously. Subsequent calls to connect() for the same socket, before the connection is established, shall fail and set errno to [EALREADY]."
So it may be best to change line 169 in usocket.c from:
if (err != EINPROGRESS && err != EAGAIN) return err;
to
if (err != EINPROGRESS && err != EALREADY) return err;
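
The proposed change amounts to treating EALREADY, like EINPROGRESS, as "still connecting" rather than as a failure. A minimal sketch of that classification in isolation follows. The helper name connect_still_pending is hypothetical and is not part of usocket.c.

```c
#include <errno.h>

/* Hypothetical helper (not in LuaSocket): returns 1 when an errno value
 * from connect() on a non-blocking socket means the attempt is still in
 * flight, 0 for anything else. */
int connect_still_pending(int err)
{
    /* EINPROGRESS: first call, the handshake was started asynchronously.
     * EALREADY:    repeated call while that handshake is still running. */
    return err == EINPROGRESS || err == EALREADY;
}
```

With such a helper, the line in question would read roughly `if (!connect_still_pending(err)) return err;`.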

@Tieske
Member

Tieske commented Mar 3, 2015

The Windows case might be a bit more complicated. When connect() is called several times (on a non-connected socket), it will eventually return WSAEISCONN to indicate that the async connect was successful. So in this case the error should be ignored, and success should be returned.

But, if you call connect() again, on this socket (which is now connected), then it should not ignore the WSAEISCONN, and it should return the error.

I don't know how unix handles this scenario.
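
A minimal sketch of the rule described above, written with POSIX errno names as stand-ins for the Winsock codes (EISCONN for WSAEISCONN, EALREADY for WSAEALREADY). Whether Unix systems return these values in the same situations is an assumption here, and the enum and the function interpret_connect are hypothetical, not part of LuaSocket.

```c
#include <errno.h>

enum connect_outcome { CONNECT_OK, CONNECT_PENDING, CONNECT_FAILED };

/* Hypothetical helper: interpret the error code of a connect() call made
 * on a non-blocking socket. `retried` is non-zero when this is a repeat
 * call made after an earlier one reported that the connect was pending. */
enum connect_outcome interpret_connect(int err, int retried)
{
    if (err == 0)
        return CONNECT_OK;
    if (err == EINPROGRESS || err == EALREADY)
        return CONNECT_PENDING;
    /* "Already connected" only counts as success when we were the ones
     * waiting for the connect to complete. On a first call it means the
     * socket was connected to something else, so it stays an error. */
    if (err == EISCONN && retried)
        return CONNECT_OK;
    return CONNECT_FAILED;
}
```

The caller has to remember whether a connect is pending, which is the extra per-socket state this approach requires.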

@diegonehab
Contributor

It would be great if somebody could figure out the portable way for doing non-blocking connects in LuaSocket so that users can forget if they are on Windows or Unix. Is this possible?

@daurnimator
Contributor Author

It would be great if somebody could figure out the portable way for doing non-blocking connects

I think the pattern (for a single socket) should be call connect => poll until writable => call connect again.
If that fails, and you have a 2nd socket you want to try (e.g. IPv4 after IPv6) you'll need to call connect again, poll until writable (again) and then call connect (again)

This flow will be quite uncomfortable with luasocket, as the polling primitive is left to the user (to roll their own multi-tasking).

@Tieske
Member

Tieske commented Mar 3, 2015

I have been working on Copas, and fixed the connect like so:

function copas.connect(skt, host, port)
  skt:settimeout(0)
  local ret, err, tried_more_than_once
  repeat
    ret, err = skt:connect (host, port)
    -- non-blocking connect on Windows results in error "Operation already
    -- in progress" to indicate that it is completing the request async. So essentially
    -- it is the same as "timeout"
    if ret or (err ~= "timeout" and err ~= "Operation already in progress") then
      -- Once the async connect completes, Windows returns the error "already connected"
      -- to indicate it is done, so that error should be ignored. Except when it is the 
      -- first call to connect, then it was already connected to something else and the 
      -- error should be returned
      if (not ret) and (err == "already connected" and tried_more_than_once) then
        ret = 1
        err = nil
      end
      return ret, err
    end
    tried_more_than_once = tried_more_than_once or true
    coroutine.yield(skt, _writing)
  until false
end

The coroutine.yield(skt, _writing) at the end basically puts the socket in a select, waiting for it to become writeable.
I did some quick tests on Windows and a Raspberry Pi, and it seems to work fine. But I must say that I do not fully understand the initial issue as posted by @daurnimator and hence don't know whether this also handles that case.

@okroth

okroth commented Mar 4, 2015

I found a note on a Microsoft page saying that there are several
different return values for a second call to connect() on a socket that
is still connecting asynchronously.
Luckily, all of them can be handled the same way. But they need to be checked
in the wsocket.c file, which they currently are not.

With Linux, the situation is 100% POSIX: connect() returns EINPROGRESS
on the first call, EALREADY on the second and any further call to a
socket that is still connecting, and EISCONN once the socket has
connected in the meantime.

Actually, no repeated connect() is necessary to complete the connection.
When the socket becomes writeable, the connection is either established or
has failed completely, which can be checked with a call to getsockopt()
with SO_ERROR at level SOL_SOCKET.
Unfortunately, this isn't exposed as a method in the socket library,
so it is not possible to follow this route (until someone adds this
function to the socket library).
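
For reference, the check described here can be sketched in C for POSIX systems as follows. The function name pending_connect_error is hypothetical. On Windows the option value would be passed as a char pointer, which this sketch does not cover.

```c
#include <errno.h>
#include <sys/socket.h>

/* Hypothetical helper: call once select()/poll() reports the socket as
 * writable. Returns 0 if the asynchronous connect succeeded, otherwise
 * the errno-style code explaining why it failed. */
int pending_connect_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    /* SO_ERROR reads and clears the pending error on the socket. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return errno;   /* getsockopt itself failed, e.g. EBADF */
    return err;
}
```

A result of 0 means the connection is established; anything else (for example ECONNREFUSED) is the reason the connect failed.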

Side Note:
On the OS I have to work on, further calls to connect() return an error
code that indicates an illegal socket handle value, so there ARE (a few)
OS that are worse than Windows :-)

Oliver


@Tieske
Member

Tieske commented Mar 4, 2015

From that same page (http://msdn.microsoft.com/en-US/library/windows/desktop/ms737625%28v=vs.85%29.aspx):

It reads (before the excerpt you quoted above):

With a nonblocking socket, the connection attempt cannot be completed immediately. In this case, connect will return SOCKET_ERROR, and WSAGetLastError will return WSAEWOULDBLOCK. In this case, there are three possible scenarios:

  • Use the select function to determine the completion of the connection request by checking to see if the socket is writeable.
  • If the application is using WSAAsyncSelect to indicate interest in connection events, then the application will receive an FD_CONNECT notification indicating that the connect operation is complete (successfully or not).
  • If the application is using WSAEventSelect to indicate interest in connection events, then the associated event object will be signaled indicating that the connect operation is complete (successfully or not).

Note the first point. The way I read that is that the problem you mention, regarding the different return values, occurs only if one just repeatedly calls connect. If you take the approach from the first point above, you only call connect again after it became writeable, and hence it will return WSAEISCONN indicating the connection is complete, or something else on an error. If you don't retry for WSAEISCONN, then you won't know about a possible failure.
So no need to use getsockopt().

If you want this transparently handled within LuaSocket, then you must retain connection status in the socket object. Tracking for whether the call to connect is the first or a repeat, checking whether the connection target on the subsequent calls hasn't changed, etc.
It can be done, but I don't know whether it is worth the effort. One might be better off documenting the right way to deal with it.

@Tieske
Member

Tieske commented Mar 4, 2015

Side Note:
On the OS I have to work on, further calls to connect() return an error
code that indicates an illegal socket handle value, so there ARE (a few)
OS that are worse than Windows :-)

What happens on that OS if you call connect again only after it became writeable?

@okroth

okroth commented Mar 4, 2015

Thijs,

it does the same: returns "invalid handle" error. It's a weird OS.

Oliver


@okroth

okroth commented Mar 4, 2015

On 04.03.2015 at 10:07, Thijs Schreijer wrote:

"If you want this transparently handled within LuaSocket, then you must
retain connection status in the socket object. Tracking for whether the
call to connect is the first or a repeat, checking whether the
connection target on the subsequent calls hasn't changed, etc.
"

It may be worth the effort to handle this within the socket library
(namely wsocket.c and usocket.c) to diminish the differences between the
different OSes.

Oliver

@okroth

okroth commented Mar 4, 2015

Hi Thijs,

this looks very promising, especially the fact that TCP sockets have a
connect() and that SSL is taken into account!

I will give it a try and let you know the outcome.

Oliver


@Tieske
Member

Tieske commented Mar 4, 2015

To use it with LuaSocket, I think something like the following pseudocode should be used:

-- assumes fields
-- skt.host
-- skt.port
-- skt.connectinprogress

function connect_non_block(skt, host, port)
  if not skt.connectinprogress then
    -- new connect
    err = skt:connect(host, port)
    if err == isconnected then return nil, isconnected end
    if err == inprogress then
      -- mark connection as started
      skt.connectinprogress = true
      skt.host = host
      skt.port = port
      return nil, timeout
    end
    if err == success then return 1 end
    return nil, errormsg
  else
    -- already busy
    if host~=skt.host or port~=skt.port then
      return nil, already_in_progress  -- don't allow changing the target halfway through a connect
    end
    -- if we're not writeable, then it's a timeout
    iswriteable = select(nil, {skt}, 0) -- 0 timeout, check if writeable
    if not iswriteable then return nil, timeout end
    -- so try again to connect
    err = skt:connect(host, port)
    if err == inprogress then return nil, timeout end
    skt.host = nil
    skt.port = nil
    skt.connectinprogress = nil
    if err == isconnected then  -- successfully connected
      return 1
    else
      return nil, errormsg
    end
  end
end

It has the overhead of doing a select on each subsequent call, and it assumes that the "connect -> wait-writeable -> connect again" cycle works on all platforms. Worse, it requires a second call to connect to clear the connectinprogress flag on the socket.

@diegonehab
Contributor

Is there a minimum number of changes that would unify Windows and Unix behavior without completely changing the way we do things in LuaSocket? This has always been a source of frustration.

@Tieske
Member

Tieske commented Mar 4, 2015

Well... I'm not sure there really is a cross-platform problem.

This code (#99 (comment)) works for me so far. Apparently there are some exceptional OSes where it doesn't, but Windows, POSIX and Unix in general ought to work. But then you rely on the client code to do the right thing.

The one thing I don't understand is the initial message regarding "multiple DNS entries". First hunch is that it is the same as without the "multiple", but it was handled incorrectly. But maybe someone can explain the problem better, or try my code to see whether that works in the "multiple DNS entries" scenario.

@diegonehab
Contributor

The issue with non-blocking connects with names that resolve to multiple addresses is: what does that mean? When the connect is blocking, we go over each of the addresses in order until one of them succeeds. When connect is non-blocking, we can't know immediately if the first connect will fail or not, so we have to return immediately. The current code fails because of EINPROGRESS. This should be fixed.
In my opinion, non-blocking connects with names that resolve to multiple addresses should be handled by the client, because it has to be put into the async level and that is outside LuaSocket's control. The user can call the name resolver, and write his own loop over the returns. But LuaSocket has to identify the situation and behave in a consistent way across platforms. For example, we can specify that it will try all addresses until the first EINPROGRESS. Or we can try only the first one.

@Tieske
Member

Tieske commented Mar 4, 2015

Sorry for my silly questions below; I'm not an expert on this matter, just learning on the go...

The issue with non-blocking connects with names that resolve to multiple addresses is: what does that mean? When the connect is blocking, we go over each of the addresses in order until one of them succeeds. When connect is non-blocking, we can't know immediately if the first connect will fail or not, so we have to return immediately. The current code fails because of EINPROGRESS. This should be fixed.

To check my understanding: when connecting, the DNS lookup returns multiple addresses. Now who is responsible for iterating over those? The OS or LuaSocket?

How does trying multiple addresses work with only a single socket? You can only try one at a time, no?

How about an extra timeout? If the first address doesn't connect within that timeframe, then go to the second. Passing EINPROGRESS to the async client, telling him to try again.
Then again, that would not work because the client will not retry calling connect if it didn't first signal 'writeable' from a select statement.

In my opinion, non-blocking connects with names that resolve to multiple addresses should be handled by the client, because it has to be put into the async level and that is outside LuaSocket's control. The user can call the name resolver, and write his own loop over the returns. But LuaSocket has to identify the situation and behave in a consistent way across platforms. For example, we can specify that it will try all addresses until the first EINPROGRESS. Or we can try only the first one.

I tend to agree, this being a client issue, as basically the whole DNS lookup stuff is completely blocking anyway. The only way to circumvent that is by doing DNS lookups async yourself, and not through the OS.
But that is going to be quite an effort I guess...

@diegonehab
Contributor

The "standard way of doing things" states that these addresses should be tried in sequence. The OS doesn't do this because each call to the OS connect receives a single address, and not domain names. So either LuaSocket or the client has to do it. The typical LuaSocket use is blocking. So LuaSocket does it for the user. For non-blocking, things fall outside of LuaSocket's responsibility, in my opinion. It is possible to replicate the internal C behavior of connect using Lua code by simply resolving the addresses and calling LuaSocket's connect directly with IP addresses instead of a domain name. And this would work already in the current version of the code. What we need to do is to fail in a consistent way when there is a non-blocking connect attempt to a domain name that resolves to multiple addresses. That's all.

@Tieske
Member

Tieske commented Mar 4, 2015

So, if a non-blocking connect receives multiple addresses, then let it fail with a newly defined error message, e.g. "multi-address connect would block", and to make it easier to handle:

return nil, "multi-address connect would block", resolved_address_list

to prevent having to resolve the address again.
Wouldn't that be easiest?

@okroth

okroth commented Mar 4, 2015

Thijs,

possibly not.
You may already suffer from a delay between calling connect() and
getting the result EINPROGRESS on some OSes when the DNS server needs some
time to respond.
I.e., although the TCP SYN-ACK handshake is asynchronous, the DNS lookup
may be blocking.
The DNS lookup is done in the inet_tryconnect() function in inet.c (and
is blocking by design).

The result is a list of IP addresses which are passed to
socket_connect() and from there to the OS's connect() function.
How this handles more than one IP address on the list isn't generally
defined.

To handle this in a common manner, I guess it's best to manually call
gethostbyname() first and call connect() for each address in the list
until success or final failure.

Oliver


@Tieske
Member

Tieske commented Mar 4, 2015

Obviously the DNS is blocking, so connect will also block temporarily while it looks up a domain name. But if you feed it merely an IP address, then it wouldn't go through DNS would it? no?

So using connect non-blocking with a single address returned, might still block on the DNS part. But that is a given, as Diego mentioned, async can only be resolved outside LuaSocket.

So if a client needs truly async, then he should provide his own async dns resolver. Sounds like it sucks, but it would already greatly improve current state of affairs. By returning the multiple addresses and let the client decide.

@diegonehab
Contributor

To be absolutely sure, we would need to distinguish between numeric and named addresses ourselves, and then pass the appropriate flag to getaddrinfo to make sure DNS is not being engaged.

"If the AI_NUMERICHOST flag is specified, then a non-null nodename string supplied shall be a numeric host address string. Otherwise, an [EAI_NONAME] error is returned. This flag shall prevent any type of name resolution service (for example, the DNS) from being invoked.

If the AI_NUMERICSERV flag is specified, then a non-null servname string supplied shall be a numeric port string. Otherwise, an [EAI_NONAME] error shall be returned. This flag shall prevent any type of name resolution service (for example, NIS+) from being invoked."
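
A minimal sketch of what passing those flags could look like, assuming a POSIX system. The function name resolve_numeric_only is hypothetical and this is not LuaSocket's actual code.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: resolve host/port strictly as numeric strings, so
 * that no name resolution service is consulted. Returns 0 and stores the
 * result list in *out on success (release it with freeaddrinfo()), or the
 * getaddrinfo() error code, e.g. EAI_NONAME for a non-numeric host. */
int resolve_numeric_only(const char *host, const char *port,
                         struct addrinfo **out)
{
    struct addrinfo hints;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;       /* accept IPv4 and IPv6 literals */
    hints.ai_socktype = SOCK_STREAM;   /* TCP */
    hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;

    return getaddrinfo(host, port, &hints, out);
}
```

A caller could try this first and fall back to a regular (blocking) getaddrinfo() call only when it reports EAI_NONAME.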

Since distinguishing between numeric addresses and named addresses would better be performed by inet_pton() anyway, we might as well avoid the call to getaddrinfo. This is how LuaSocket did it before there was IPv6. I suppose we should move it back to how it was and add a pton test before trying getaddrinfo. What do you think?

@daurnimator
Contributor Author

Since distinguishing between numeric addresses and named addresses would better be performed by inet_pton() anyway, we might as well avoid the call to getaddrinfo. This is how LuaSocket did it before there was IPv6. I suppose we should move it back to how it was and add a pton test before trying getaddrinfo. What do you think?

Sounds like a good idea. See the cqueues implementation: https://github.com/wahern/cqueues/blob/master/src/lib/socket.c#L498

@Tieske
Member

Tieske commented Mar 5, 2015

Is that really necessary? If I do socket.connect("123.45.6.7", 80), then the underlying OS will be smart enough to resolve the address locally, detecting it as an IP address and return immediately. No? Anything else would be silly.

But if you want to be certain... add it.

@diegonehab
Contributor

You'd be surprised. I don't think we need to add the test. What we definitely need to add is that error checking that you suggested. I.e., if the socket is in non-blocking mode and if the resolver returns multiple addresses, we need to report an error.

@okroth

okroth commented Mar 6, 2015

Possibly, it may be useful to report an error only if the application
has explicitly requested this via a parameter, and otherwise
to just use the first IP address.

Oliver


@Tieske
Member

Tieske commented Mar 6, 2015

Assume a non-blocking socket doing a connect;

  • on an IP address: do not invoke DNS and connect on the provided address, returning the INPROGRESS and alike error messages --> completely non-blocking
  • on a single addressed name: resolve DNS and connect --> DNS is already blocking!
  • on a multi addressed name: resolve DNS and connect to first --> DNS is already blocking! Might as well try all addresses as it is already blocking....

So basically, the socket being non-blocking is the parameter that @okroth mentions, so I don't think there is any gain in @okroth's suggestion. And we also don't need the extra return parameters listing the possible addresses. Or am I missing something?

It probably warrants a utility function like isAddress( [target] ) that would return a Boolean indicating whether a target is a named or numbered address.
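
At the C level, such a check could be sketched with inet_pton(), which never performs a lookup. The name is_numeric_address is a hypothetical counterpart of the proposed isAddress() and is not an existing LuaSocket function.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical counterpart of the proposed isAddress(): returns 1 when
 * `target` is an IPv4 or IPv6 literal, 0 when it would need a name lookup. */
int is_numeric_address(const char *target)
{
    struct in_addr v4;
    struct in6_addr v6;

    /* inet_pton() returns 1 only if the text parses as an address of the
     * given family; it never consults DNS. */
    return inet_pton(AF_INET, target, &v4) == 1 ||
           inet_pton(AF_INET6, target, &v6) == 1;
}
```

For example, "123.45.6.7" and "::1" are reported as numeric, while "google.com" is not.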

The alternative being on a non-blocking socket;

  • on an IP address: do not invoke DNS and connect on the provided address, returning the INPROGRESS and alike error messages --> completely non-blocking (same as above)
  • on a name (either single or multi): return nil + "name based target would block"

Question: what constitutes a socket being in 'non-blocking' mode in LuaSocket land? I couldn't find it in the docs. Is it a timeout of 0?

@okroth

okroth commented Mar 6, 2015

Hi Thijs,

the reason why I mentioned the parameter is based on the idea that
usually an application is written in a simple way to defer all the
look-up and connecting handshake to the OS.

The DNS lookup takes only a few milliseconds if the DNS is set up
properly, and there is no method (that I know of) to avoid this. It's the
final TCP connect that may take quite a while, depending on the target
host's configuration (and existence...).

If there is more than one address, and the OS did not already test them,
the socket_connect() function may return an indication that it failed.
The same applies if the OS only tried the first; either way, the
application has no code at hand to handle the situation that the second
or a later entry may succeed.

We may implement this code in the socket_connect() function, but then
things won't get easier:
The socket_connect() function needs to be told which entry of the list
returned by gethostbyname() it should pick (default: the first).
For a retry, the previous call needs to have returned a value
indicating something like "failed on this IP, try this index"... So
gethostbyname() gets called more than once, and applications must be
aware of possibly multiple entries. Actually, I do not know whether an
OS tries all entries, so I do not know a safe way to signal that
the first entry failed, which may spoil the whole method altogether.

On the other hand, there may be applications that really want to use
multiple IP addresses from a DNS lookup. These may also simply do the
lookup themselves and iterate over the IP address list returned from
gethostbyname(). The application then has full control over the way
it tries to establish the TCP connection. It could even try all possible
targets at the same time and cancel all but the first established (bad
behaviour :-) ).

I would go the second way.

BTW: The DNS lookup is always (shortly) blocking, so there is no way
around this short blocking anyway.

Oliver


@Tieske
Member

Tieske commented Mar 7, 2015

Although I think my 2nd proposal (error out when resolving a name on a non-blocking socket) is the cleanest from an API perspective, it is also the most 'breaking' option, so probably not the 'best' option overall.

Consolidating: when connecting on a non-blocking socket:

  1. when providing an IP address: connect and do the regular INPROGRESS error messaging so the client can perform the connect async/non-blocking.
  2. when by name with a single target: resolve DNS (short block) and then do INPROGRESS as per nr. 1
  3. when by name with multiple targets: the same as nr. 2, connecting to the first address returned, but with an extra return value after the error message: a list of resolved IP addresses

Does that make sense?

P.S. Does anyone have an answer to my "what constitutes a non-blocking socket" question?

@diegonehab
Contributor

I think in 3 we should simply try the first address. Simpler to code. The only people that would care about this would know what to do anyway. Nobody's code is broken by this change, because at this moment this simply doesn't work. The behavior becomes consistent and reproducible. Whoever wants to write their own non-blocking connect for hosts that resolve to multiple addresses can do it with the API.

A non-blocking socket is a socket that has had settimeout() called on it. If the user does that before a call to connect, we can test by looking at the tm structure.
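
The "first address only" policy for non-blocking sockets can be sketched as below. This only illustrates the rule and is not the change that went into inet.c. The function addresses_to_try and its nonblocking flag are hypothetical, and how the flag would be derived from the tm structure is left open.

```c
#define _POSIX_C_SOURCE 200112L
#include <netdb.h>
#include <stddef.h>

/* Hypothetical policy helper: how many entries of a resolved address list
 * a connect attempt should walk. A blocking socket may try every entry in
 * turn; a non-blocking one only gets the first, because the outcome of
 * that attempt is not known when connect() returns. */
size_t addresses_to_try(const struct addrinfo *resolved, int nonblocking)
{
    size_t n = 0;
    const struct addrinfo *it;

    for (it = resolved; it != NULL; it = it->ai_next) {
        n++;
        if (nonblocking)
            break;      /* stop after the first candidate */
    }
    return n;
}
```

Callers that want to try the remaining addresses in non-blocking mode would resolve the name themselves and connect to each IP address in turn.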

@okroth

okroth commented Mar 9, 2015

To me this looks sensible.
Maybe the function simply returns the result value of gethostbyname(),
so that all information is preserved.

Oliver


@daurnimator
Contributor Author

What was the outcome on this?

@diegonehab
Contributor

I implemented the idea of trying just the first returned address if the socket is set to non-blocking.
