Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gopher: check remaining time left during write busy loop #5214

Closed
wants to merge 1 commit into from

Conversation

mback2k
Copy link
Member

@mback2k mback2k commented Apr 10, 2020

Comparing SOCKET_WRITABLE in schannel.c and gopher.c, gopher never considered the transfer time out before.

@mback2k mback2k requested a review from bagder April 10, 2020 15:58
@mback2k mback2k self-assigned this Apr 10, 2020
@mback2k mback2k force-pushed the add-gopher-timeout branch 2 times, most recently from b2058a7 to 0611d61 Compare April 10, 2020 16:26
@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

I guess this should be adapted to #5199 as well. I will wait for @jay's PR to be merged first.

@jay
Copy link
Member

jay commented Apr 11, 2020

I'm in the process of redoing schannel_send to be non-blocking, hopefully it will help with #5177. The socket writable check there is likely going be removed. It calls select which will block on a non-blocking socket. AFAICS if the timeout is 0 and that means no timeout and the non-blocking socket is not ready then 0 will be returned. If that is the case then returning CURLE_OPERATION_TIMEDOUT is wrong. In other words, if the user has not given a timeout, and therefore the timeout is 0, then we shouldn't timeout if the socket is not ready.

@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

Okay, I will wait for your enhancements to schannel_send then. I guess gopher.c can then be adjusted in a similar way.

@jay
Copy link
Member

jay commented Apr 11, 2020

Maybe Curl_timeleft should not return 0 when there is no timeout set? Return TIMEDIFF_T_MAX?

curl/lib/connect.c

Lines 214 to 221 in b81e0b0

default:
/* use the default */
if(!duringconnect)
/* if we're not during connect, there's no default timeout so if we're
at zero we better just return zero and not make it a negative number
by the math below */
return 0;
break;

@ajax16384
Copy link
Contributor

ajax16384 commented Apr 11, 2020

Maybe Curl_timeleft should not return 0 when there is no timeout set? Return TIMEDIFF_T_MAX?

Just checked Curl_timeleft("unspecifed")=> TIMEDIFF_T_MAX with schannel problem and it's works fine.
But personally i would prefer solution when sequentially pushed buffer to nonblock socket will not
use select-write check at all:

  • select-write for non-blocked socket was always used to detect "is-connected" state.
    When we trying to push buffer, socket already passed connection-success state - no reason to check again.
  • write-send method for nonblock socket will return E_AGAIN for retry or some fatal error. So there is no reason to make two syscalls insead of one.
    (like it's done here )

@bagder
Copy link
Member

bagder commented Apr 11, 2020

Maybe Curl_timeleft should not return 0 when there is no timeout set? Return TIMEDIFF_T_MAX?

I think that sounds like a pretty clever way to enhance the function and remove the need to special-case when it returns zero...

@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

  • select-write for non-blocked socket was always used to detect "is-connected" state.
    When we trying to push buffer, socket already passed connection-success state - no reason to check again.

I don't think this is correct, because:

If the socket is not processing a connect call, writability means a send, sendto, or WSASendto are guaranteed to succeed.

Source: https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select#remarks

@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

  • write-send method for nonblock socket will return E_AGAIN for retry or some fatal error. So there is no reason to make two syscalls insead of one.
    (like it's done here )

I don't know if this is a good idea, because this is basically busy-looping and putting pressure on the socket write buffers. But I am not sure, maybe it is an alternative solution? cc @bagder @jay?

@bagder
Copy link
Member

bagder commented Apr 11, 2020

I don't think I understand the question. Should we avoid select() ? If we would do it entirely correctly, it should loop at all and then it shouldn't do select()...

@bagder
Copy link
Member

bagder commented Apr 11, 2020

But you're right, we should not busy-loop. That's the worst.

@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

I don't think I understand the question. Should we avoid select() ?

I was referring to @ajax16384 points above and the specific reference to "(like it's done here)" there Curl_write_plain is called in a for-loop without waiting or select'ing, therefore busy-looping.

curl/lib/security.c

Lines 174 to 185 in b81e0b0

while(len > 0) {
result = Curl_write_plain(conn, fd, to_p, len, &written);
if(!result) {
len -= written;
to_p += written;
}
else {
if(result == CURLE_AGAIN)
continue;
return result;
}
}

If we would do it entirely correctly, it should loop at all and then it shouldn't do select()...

Is that even possible inside schannel_send, due to the following?

curl/lib/vtls/schannel.c

Lines 1625 to 1628 in 6d13ef5

Here's the catch with this - if we tell the client that all the
bytes have been sent, will the client call this method again to
send the buffered data? Looking at who calls this function, it
seems the answer is NO.

And I am not sure if that would also apply to gopher_do, which this PR is actually about.

jay added a commit to jay/curl that referenced this pull request Apr 11, 2020
- Add threaded resolver cleanup and GSSAPI for FTP to the TODO list of
  known blocking operations.

- New known bugs entry 'Blocking socket operations in non-blocking API'
  that directs to the TODO's list of known blocking operations.

Ref: curl#5214 (comment)

Reported-by: Marc Hoersken

Closes #xxxx
@jay
Copy link
Member

jay commented Apr 11, 2020

  • write-send method for nonblock socket will return E_AGAIN for retry or some fatal error. So there is no reason to make two syscalls insead of one.
    (like it's done here )

...

I don't know if this is a good idea, because this is basically busy-looping and putting pressure on the socket write buffers. But I am not sure, maybe it is an alternative solution? cc @bagder @jay?

Yes you're right it's a busy loop if the socket isn't available. Though we shouldn't block, if we must then we should block waiting for the socket rather than call the underlying write repeatedly and eat CPU. That code in security.c needs to be improved, even if it can't be made non-blocking. I've added GSSAPI for FTP (afaict this is what security.c does) as a blocking operation to the TODO list. #5216

@ajax16384
Copy link
Contributor

Source: https://docs.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-select#remarks

@mback2k well noticed: "If TIMEVAL is initialized to {0, 0}, select will return immediately; this is used to poll the state of the selected sockets.".
It shows that underlying socket is ready for send-write (without error or E_AGAIN) but almost the same way as calling to send-write (no cpu time release).
And that's why "SOCKET_WRITABLE(sockfd, 0)" useless to avoid buzy looping and i proposed that it might be removed.
But i did not realized (my bad) that You also deal with cpu hogging in THIS PR.
so "Curl_timeleft(unspecifed)=> TIMEDIFF_T_MAX - really good idea: fix buzy loop, fix schannel send timeout.
(Sorry for missleading proposal about removing SOCKET_WRITABLE before send-write)

@jay
Copy link
Member

jay commented Apr 11, 2020

Maybe Curl_timeleft should not return 0 when there is no timeout set? Return TIMEDIFF_T_MAX?

I think that sounds like a pretty clever way to enhance the function and remove the need to special-case when it returns zero...

I'm having second thoughts. I've since I looked through the codebase and it is depended on elsewhere that Curl_timeleft uses 0 to signal there is no timeout, so it may not be as easy as I thought. The crux of the issue is it is not immediately clear to contributors that Curl_timeleft returns 0 to signal there is no timeout set, though there is some comment at the end of the function that implies it. However most of the code seems to have been written with an understanding of that. Example:

curl/lib/asyn-ares.c

Lines 428 to 435 in b81e0b0

timeout = Curl_timeleft(data, &now, TRUE);
if(timeout < 0) {
/* already expired! */
connclose(conn, "Timed out before name resolve started");
return CURLE_OPERATION_TIMEDOUT;
}
if(!timeout)
timeout = CURL_TIMEOUT_RESOLVE * 1000; /* default name resolve timeout */

It would be easier to update the documentation and then address those cases individually where 0 is wrongly treated as time remaining. For example in this PR

    timeleft = Curl_timeleft(conn->data, NULL, FALSE);
    if(timeleft < 0) {
      result = CURLE_OPERATION_TIMEDOUT;
      break;
    }
    if(!timeleft || timeleft > TIME_T_MAX)
      timeleft = TIME_T_MAX;

Otherwise we'd have to trace every Curl_timeleft to find which code depends on 0 to signal no timeout and then walk through that logic, and change Curl_timeleft to return 0 to mean no time left. Possibly renaming the function Curl_calculate_timeout or something would help to make it clear that 0 may have some special value?

mback2k added a commit to mback2k/curl that referenced this pull request Apr 11, 2020
Assisted-by: Jay Satiro
Reviewed-by: Daniel Stenberg

Closes curl#5214
@mback2k
Copy link
Member Author

mback2k commented Apr 11, 2020

@jay thanks, I updated this PR accordingly.

@mback2k mback2k marked this pull request as ready for review April 11, 2020 22:43
@mback2k mback2k requested a review from jay April 11, 2020 22:43
lib/gopher.c Outdated Show resolved Hide resolved
jay added a commit to jay/curl that referenced this pull request Apr 12, 2020
- Document in Curl_timeleft's comment block that returning 0 signals no
  timeout (ie there's infinite time left).

- Fix SOCKS' Curl_blockread_all for the case when no timeout was set.

Prior to this change if the timeout had a value of 0 and that was passed
to SOCKET_READABLE it would return right away instead of blocking. That
was likely because it was not well understood that when Curl_timeleft
returns 0 it is not a timeout of 0 ms but actually means no timeout.

Ref: curl#5214 (comment)

Closes #xxxx
@jay
Copy link
Member

jay commented Apr 12, 2020

I found a similar issue in socks #5220 and I renamed timeleft to timeout_ms, since I think that may be easier to understand.

@mback2k
Copy link
Member Author

mback2k commented Apr 12, 2020

Thanks, looks good! I will also rename the variable here, because that makes a lot more sense.

mback2k added a commit to mback2k/curl that referenced this pull request Apr 12, 2020
Assisted-by: Jay Satiro
Reviewed-by: Daniel Stenberg

Similar to curl#5220 and curl#5221
Closes curl#5214
@mback2k mback2k requested a review from jay April 12, 2020 12:04
Assisted-by: Jay Satiro
Reviewed-by: Daniel Stenberg

Similar to curl#5220 and curl#5221
Closes curl#5214
@jay jay closed this in be28bc2 Apr 12, 2020
jay added a commit that referenced this pull request Apr 12, 2020
- Document in Curl_timeleft's comment block that returning 0 signals no
  timeout (ie there's infinite time left).

- Fix SOCKS' Curl_blockread_all for the case when no timeout was set.

Prior to this change if the timeout had a value of 0 and that was passed
to SOCKET_READABLE it would return right away instead of blocking. That
was likely because it was not well understood that when Curl_timeleft
returns 0 it is not a timeout of 0 ms but actually means no timeout.

Ref: #5214 (comment)

Closes #5220
@jay
Copy link
Member

jay commented Apr 12, 2020

Thanks

jay added a commit that referenced this pull request Apr 14, 2020
- Add threaded resolver cleanup and GSSAPI for FTP to the TODO list of
  known blocking operations.

- New known bugs entry 'Blocking socket operations in non-blocking API'
  that directs to the TODO's list of known blocking operations.

Ref: #5214 (comment)

Reported-by: Marc Hoersken

Closes #5216
mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Related to curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Closes curl#5240
mback2k added a commit to mback2k/curl that referenced this pull request Apr 20, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Related to curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Closes curl#5240
mback2k added a commit to mback2k/curl that referenced this pull request May 30, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Related to curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Closes curl#5240
mback2k added a commit to mback2k/curl that referenced this pull request May 30, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Taken from curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343
Closes curl#5490
mback2k added a commit to mback2k/curl that referenced this pull request May 31, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Based upon curl#5262
Related to curl#5479
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343
Closes curl#5490
mback2k added a commit to mback2k/curl that referenced this pull request May 31, 2020
Remove obsolete type conversions to time_t or int:

Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Based upon curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343 and curl#5479
Closes curl#5490
mback2k added a commit to mback2k/curl that referenced this pull request May 31, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Based upon curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343 and curl#5479
Closes curl#5490
mback2k added a commit to mback2k/curl that referenced this pull request Jun 1, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited time_t or int, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Based upon curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343 and curl#5479
Closes curl#5490
mback2k added a commit to mback2k/curl that referenced this pull request Jun 5, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited int or long, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Avoiding conversions from time_t was already done in 842f73d.

Based upon curl#5262
Supersedes curl#5214, curl#5220 and curl#5221
Follow up to curl#5343 and curl#5479
Closes curl#5490
mback2k added a commit that referenced this pull request Jun 6, 2020
Now that all functions in select.[ch] take timediff_t instead
of the limited int or long, we can remove type conversions
and related preprocessor checks to silence compiler warnings.

Avoiding conversions from time_t was already done in 842f73d.

Based upon #5262
Supersedes #5214, #5220 and #5221
Follow up to #5343 and #5479
Closes #5490
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

4 participants