timeouts are lost (not taken into account) during the disconnection step in pingpong protocols #739

Closed
nick309h opened this Issue Mar 29, 2016 · 8 comments

Projects

None yet

3 participants

@nick309h

libcurl version: 7.48.0, OS: Windows (but I think the problem will appear in all systems).
I've explored the source code after I got this issue so I will describe the sequence of steps leading to it with the explanation of how the issue occurs. It is as follows:

  1. I used easy interface so the start is for example
    CURL *curl = curl_easy_init();
  2. Setup the timeouts:
    curl_easy_setopt(curl, CURLOPT_TIMEOUT, 10);
    curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 10);
    curl_easy_setopt(curl, CURLOPT_SERVER_RESPONSE_TIMEOUT, 10);
    and other parameters such as URL, login, password, etc.
  3. Call curl_easy_perform(curl)
    What happens: this function calls easy_perform which calls Curl_multi_handle (easy.c, line 789) and the latter creates another easy handle (multi.c, line 326), named closure_handle with all the settings default including timeouts.
    Then some protocol-depending work is executed.
  4. Call curl_easy_cleanup(curl)
    What happens: the first part of the call stack is as follows: curl_easy_cleanup->Curl_close->curl_multi_cleanup->close_all_connections . This function takes the closure_handle (multi.c, line 1876) and tries to disconnect it (line 1880). Please take into account that closure_handle->set which contains all the settings is still set to defaults unlike curl->set which was tuned by previous curl_easy_setopt calls. The following part of the call stack is (I used SMTP in this example) : close_all_connections->Curl_disconnect->smtp_disconnect->smtp_block_statemach->Curl_pp_statemach->Curl_pp_state_timeout. This function tries to calculate timeout but can not do it correctly since it is actually working with closure_handle which settings were not tuned. So data->set.server_response_timeout (pingpong.c, line 53) and data->set.timeout (pingpong.c, line 66) are always zero and default value RESP_TIMEOUT is used. It is set to 1800 seconds so the application may hang for a half of an hour despite the timeouts were set to 10 seconds. The real reason of it may be different, for example remote server or proxy hangs or some packets are lost in an unreliable network.

Possible solution: copy some timeout settings to closure_handle from the main one in easy_perform after Curl_multi_handle was called (easy.c, line 789) or even somewhere in disconnection call stack (to avoid affecting not-pingpong protocols). For example, adding the following in easy.c starting from line 793 (after data->multi_easy = multi;) solves the problem:
data->multi_easy->closure_handle->set.timeout = data->set.timeout; data->multi_easy->closure_handle->set.server_response_timeout = data->set.server_response_timeout;

@bagder bagder self-assigned this Mar 29, 2016
@bagder
Member
bagder commented Mar 29, 2016

I agree that this is a problem we need to address.

However, the closure_handle exists and is used because there will/may not be any other easy handles left and we don't have any easy handles associated with the specific connection at the time of the cleanup so it isn't immediately obvious from which easy handle at what point in time the timeout values should be copied...

@bagder
Member
bagder commented Apr 2, 2016

Yeah, ok by copying the timeout values like that we would at least get the values from the most recently added handle and that seems deterministic and fine enough to me, we should even document it once done and I would expect that applications will usually use the same timeout values for all handles anyway.

Would you be so kind and make a pull request with this for us?

@nick309h
nick309h commented Apr 3, 2016

OK, sorry for the slow response, so did we decide to copy timeout and server_response_timeout values just after the creation of multi handle in easy_perform function? I think it will fix the problem for easy interface because it looks like in this case one easy handle can own at most one multi handle (and this multi owns only one closure_handle). But I'm not so sure about working with multi interface (because I've always worked with easy interface only). It looks like curl_multi_add_handle function should copy timeout values to it's closure_handle too, shouldn't it? But maybe I'm not right because I've not explored the multi interface deeply.

@bagder
Member
bagder commented Apr 9, 2016

I'm fine with doing that for the multi interface as well, which then makes it use the timeout values for the last added easy handles but I think that's still more sensible than just using the defaults like now.

@bagder
Member
bagder commented Jun 28, 2016

so @nick309h, I got the impression you worked on a fix/work-around for this? Any chance you can post it for us as a patch or pull-request?

@burakdulger
burakdulger commented Sep 6, 2016 edited

I had the same problem in SMTP application and the application got stuck 30 minutes long. I use multi interface in the application. and applied the nick309h's change to curl_multi_perform() as adding

+ multi->closure_handle->set.timeout = data->set.timeout; 
+ multi->closure_handle->set.server_response_timeout = data->set.server_response_timeout;

before multi_runsingle(multi,now,data);

Up to now, we didn't observe the problem after the modification. Is it a proper way to fix the problem on multi_interface?

@bagder
Member
bagder commented Sep 6, 2016

Everything is "the multi interface" since the easy interface is implemented as a wrapper around the multi interface.

I'd still appreciate a full patch/pull-request for this change so that I know I'm understanding it properly and can consider it for merging and shipping in a future release.

@bagder
Member
bagder commented Oct 18, 2016

How about doing that operation when a handle is added to the multi handle instead? Seems to be a more controlled moment and will work for all APIs etc. See my patch here:

diff --git a/lib/multi.c b/lib/multi.c
index 8e40916..cac1d44 100644
--- a/lib/multi.c
+++ b/lib/multi.c
@@ -460,10 +460,18 @@ CURLMcode curl_multi_add_handle(struct Curl_multi *multi,
      The work-around is thus simply to clear the 'lastcall' variable to force
      update_timer() to always trigger a callback to the app when a new easy
      handle is added */
   memset(&multi->timer_lastcall, 0, sizeof(multi->timer_lastcall));

+  /* the closure handle only has default timeouts set, but to improve the
+     state somewhat we clone the timeouts from each added handle so that the
+     closure handle's timeout always have timeouts of the most recently added
+     handle */
+  multi->closure_handle->set.timeout = data->set.timeout;
+  multi->closure_handle->set.server_response_timeout =
+    data->set.server_response_timeout;
+
   update_timer(multi);
   return CURLM_OK;
 }

 #if 0
@bagder bagder added a commit that closed this issue Oct 19, 2016
@bagder bagder curl_multi_add_handle: set timeouts in closure handles
The closure handle only ever has default timeouts set. To improve the
state somewhat we clone the timeouts from each added handle so that the
closure handle always has the same timeouts as the most recently added
easy handle.

Fixes #739
5c3d8d2
@bagder bagder closed this in 5c3d8d2 Oct 19, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment