multi: fix queueing of pending easy handles #1358

Closed

@bakaid
bakaid commented Mar 27, 2017 edited

Multi handles "try and maintain a FIFO queue so the pipelined requests are in order". That is why newly added easy handles are added to the end of the multi->easyp linked list.
When using the multi_socket API with the total number of connections limited by CURLMOPT_MAX_TOTAL_CONNECTIONS, this FIFO behaviour can be lost. This can be demonstrated by the slightly modified "multi-uv" example. We create a multi handle, and limit CURLMOPT_MAX_TOTAL_CONNECTIONS to 1 (this wouldn't make much sense in real life, but it helps to demonstrate the problem clearly). Then we add 16 easy handles (HTTP GET requests to the same URL) and log when these requests are finished (see: multi-pending-fix-test.txt).

It turns out that the requests are finished in this order:

0 ADDED
1 ADDED
2 ADDED
3 ADDED
4 ADDED
5 ADDED
6 ADDED
7 ADDED
8 ADDED
9 ADDED
10 ADDED
11 ADDED
12 ADDED
13 ADDED
14 ADDED
15 ADDED
0 DONE
1 DONE
15 DONE
2 DONE
14 DONE
3 DONE
13 DONE
4 DONE
12 DONE
5 DONE
11 DONE
6 DONE
10 DONE
7 DONE
9 DONE
8 DONE

This is caused by the repeated inversion of the list of easy handles waiting for connections (multi->pending) in a multistep process.
Let's begin from a state where one easy handle is executing and the others are in STATE_CONNECT_PEND.

  • the currently executing easy handle finishes
  • Curl_multi_process_pending_handles gets called
  • multi->pending is iterated from head to tail, adding a timeout to the splay tree by Curl_expire_latest(data, 0) for every pending handle
  • this means that these timeouts for the easy handles are monotonically increasing from the first member of multi->pending to the last (not strictly increasing, because more handles can and will have the same timeout, but this makes no difference, as we will see later)
  • every easy handle's state is set from STATE_CONNECT_PEND to STATE_CONNECT
  • soon curl_multi_socket_action (and by that, multi_socket) will be called due to a timeout
  • it is clear that the first easy handle for which multi_runsingle is called will "win" and take the available connection, while all the others will go back to STATE_CONNECT_PEND and will be added to multi->pending
  • multi_socket will process all expired timers and call multi_runsingle on their associated easy_handle in an order determined by Curl_splaygetbest
  • Curl_splaygetbest will choose the "best-fit" node which means "the node with the given or lower key", the key being, in this instance, the current time
  • this means that the first timeout to be chosen and removed from the splay tree will be the largest one not greater than the current time and, as a result, the expired timeouts will be iterated from the largest (most recently expired) to the smallest (least recently expired)
  • when multiple timeouts (easy handles) share the same key (timestamp) in the splay tree, a linked list of timeouts is used
  • when a node is inserted into the splay tree with a key already existing in the tree, it will be inserted in place of the current timeout (at the head of the list of same timeouts) and when removed, the head of this list will be removed first, and replaced by the next element of the list
  • this means that the first element removed from the list of same timeouts will be the one added last, which is consistent with the demonstrated behaviour of Curl_splaygetbest
  • therefore Curl_splaygetbest exhibits a FILO behaviour
  • therefore multi_socket will call multi_runsingle for easy handles in a reverse order of their respective timeouts
  • the last easy_handle added will take the available connection, the rest will be added to multi->pending in a reverse order
    This process repeats until no more handles are left, inverting the list of pending handles at every iteration and giving the available connection to the first easy handle in the list, causing the observed 1-15,2-14,3-13,... pattern.
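The repeated inversion can be reproduced with a toy model. The sketch below is not libcurl code: `simulate_lifo_rounds` and `MAX_HANDLES` are hypothetical names, and the model simply assumes that timers are added head to tail, expire in reverse order, and that the losers are re-queued in the order they expired.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 64

/* Toy model, not libcurl code. 'pending_in' holds the ids waiting for the
   single connection, head first. Each round, timers are added head to tail
   and then expire in reverse order (LIFO); the first expired handle wins the
   connection and the rest are re-queued in the order they expired.
   Writes the winners to 'done' in order and returns how many were written. */
static size_t simulate_lifo_rounds(const int *pending_in, size_t n, int *done)
{
  int pending[MAX_HANDLES];
  int expired[MAX_HANDLES];
  size_t npending = n;
  size_t ndone = 0;
  size_t i;

  assert(n <= MAX_HANDLES);
  for(i = 0; i < n; i++)
    pending[i] = pending_in[i];

  while(npending > 0) {
    /* LIFO expiry: the timer added last (tail of pending) fires first */
    for(i = 0; i < npending; i++)
      expired[i] = pending[npending - 1 - i];

    done[ndone++] = expired[0]; /* the winner takes the connection */

    /* the losers go back to pending in expiry order, i.e. reversed */
    for(i = 1; i < npending; i++)
      pending[i - 1] = expired[i];
    npending--;
  }
  return ndone;
}
```

Assuming handle 0 already holds the connection and the initial pending list is 15, 14, ..., 1 (an assumption chosen to match the log above), the model yields 1, 15, 2, 14, 3, 13, ..., 7, 9, 8, which is the observed order after "0 DONE".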

This particular problem could probably be solved by a quick-and-dirty solution: iterating on the multi->pending list in a reverse order in Curl_multi_process_pending_handles, negating the effect of the described process.

I would argue, however, that this problem is inherent to the workings of Curl_splaygetbest and the definition of "best" in the context of expired timeouts.
I think we should deal with timeouts which expired the longest time ago first, and define "best" in the context of expired timeouts as "smallest not larger than now".
This would fix this issue, and it seems generally the logical approach to me. However, I recognize that this may disable some optimization I do not yet comprehend.
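To make the two definitions of "best" concrete, here is a minimal sketch over a sorted array of timer keys. It is only an illustration: libcurl keeps its timers in a splay tree, and the function names `pick_most_recently_expired` and `pick_least_recently_expired` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Both helpers take timer keys sorted in ascending order and return the
   index of the timer to serve next, or -1 if none has expired yet. */

/* Behaviour described above: the largest key not greater than 'now'. */
static int pick_most_recently_expired(const long *keys, size_t n, long now)
{
  int best = -1;
  size_t i;

  assert(keys != NULL || n == 0);
  for(i = 0; i < n && keys[i] <= now; i++)
    best = (int)i;
  return best;
}

/* Proposed definition: the smallest key, provided it is not greater
   than 'now'. */
static int pick_least_recently_expired(const long *keys, size_t n, long now)
{
  assert(keys != NULL || n == 0);
  return (n > 0 && keys[0] <= now) ? 0 : -1;
}
```

With keys 10, 20, 30, 40 and now = 35, the first helper picks index 2 (key 30) while the second picks index 0 (key 10), so repeated calls to the second one serve expired timers oldest first.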

The patch that I propose contains the following modifications:

  • splay nodes with the same key are stored in a doubly-linked circular list instead of a non-circular one to enable O(1) insertion to the tail of the list
  • this necessitates a new pointer in the Curl_tree struct, because smaller could not be reused; this means a 4/8 byte memory increase per easy handle, which seems acceptable to me
  • Curl_splayinsert inserts nodes with the same key to the tail of the same list
  • Curl_splaygetbest implements the proposed definition of "best", choosing the smallest node of the tree, if it is not greater than the supplied key
  • in case of multiple nodes with the same key, the one on the head of the list gets selected, maintaining the FIFO behaviour of the new Curl_splaygetbest
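The same-key list can be sketched as follows. This is a simplified illustration under my own naming, not the patch itself: `struct samenode` and the `same_*` helpers are hypothetical, and the real Curl_tree struct carries more fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for nodes sharing one key. */
struct samenode {
  struct samenode *next; /* next node with the same key */
  struct samenode *prev; /* previous node with the same key */
  int id;
};

/* Turn a single node into a one-element circular list. */
static void same_init(struct samenode *n)
{
  assert(n != NULL);
  n->next = n;
  n->prev = n;
}

/* Append 'n' at the tail in O(1): the tail is always head->prev. */
static void same_append(struct samenode *head, struct samenode *n)
{
  assert(head != NULL && n != NULL);
  n->next = head;
  n->prev = head->prev;
  head->prev->next = n;
  head->prev = n;
}

/* Unlink the head in O(1). Returns the new head, or NULL if the list
   became empty. */
static struct samenode *same_remove_head(struct samenode *head)
{
  struct samenode *newhead;

  assert(head != NULL);
  newhead = head->next;
  if(newhead == head)
    return NULL; /* it was the only node */
  newhead->prev = head->prev;
  head->prev->next = newhead;
  same_init(head); /* leave the removed node self-linked */
  return newhead;
}
```

Appending at the tail and removing from the head gives FIFO order among nodes with equal keys, and neither operation needs to walk the list.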

@bakaid, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @yangtse and @dfandrich to be potential reviewers.

@bagder bagder self-assigned this Mar 29, 2017
Owner
bagder commented Mar 29, 2017

Thank you for your contribution!

You're suggesting changes to a core part of libcurl that basically has been stable and untouched for almost a decade. I think we must add some test cases for this that make sure the added functionality works as intended.

But first maybe we should discuss whether this change is really worth it. Is there any documentation anywhere that would suggest that libcurl will really try the added handles in a FIFO order? I always stress to users that libcurl does not maintain any order among the transfers, and in the case where there are N transfers waiting, we have not promised a queue order to the user. I'm a little concerned that adding such a promise to docs and to users will imply that we need to work harder to maintain that, to very little gain I think. I gather you have a particular use case where FIFO order is desired?

bakaid commented Mar 29, 2017

On the issue of testing the new behaviour: you're completely right and I'd be happy to add these test cases.

The particular use case is the streaming download of a large file using multiple HTTP range requests on multiple connections (there must be multiple connections to achieve maximum throughput, but it is not feasible to use as many connections as the number of chunks in the file). It doesn't really matter in what particular order the transfers are finished, but it does matter that the chunks are downloaded in a relative order, meaning that the pending requests should be scheduled in the order of their addition. Otherwise a large buffer needs to be maintained and work on the file stream is blocked while chunks from the end of the file are being downloaded that could not be processed anyway until we've processed the previous chunks.
This could of course be solved by an external scheduler, but then the number of currently downloading chunks must be maintained externally from the curl library, and a new request must be scheduled when a connection becomes available, which means a lot of locking, switching threads, and using (in the case of libuv) uv_async_send requests. This complicates the otherwise practical and elegant async code which can be achieved using the curl_multi_socket interface and degrades performance.
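The buffering cost of out-of-order scheduling can be estimated with a small helper. This is a rough sketch that assumes chunks are numbered 0..n-1, complete in the order they were started, and can only be consumed in ascending order; `peak_buffered_chunks` and `MAX_CHUNKS` are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CHUNKS 64

/* Given the order in which chunks 0..n-1 finish downloading, return the
   largest number of finished chunks that had to be held back at any one
   time because an earlier chunk was still missing. */
static size_t peak_buffered_chunks(const int *order, size_t n)
{
  unsigned char held[MAX_CHUNKS] = {0};
  size_t next = 0; /* next chunk the stream consumer needs */
  size_t buffered = 0;
  size_t peak = 0;
  size_t i;

  assert(n <= MAX_CHUNKS);
  for(i = 0; i < n; i++) {
    size_t c = (size_t)order[i];
    assert(c < n);
    if(c == next) {
      next++;
      /* drain any chunks that were waiting for this one */
      while(next < n && held[next]) {
        held[next] = 0;
        buffered--;
        next++;
      }
    }
    else {
      held[c] = 1;
      buffered++;
      if(buffered > peak)
        peak = buffered;
    }
  }
  return peak;
}
```

For the 16-request order shown in the PR description (0, 1, 15, 2, 14, ...), this returns 7, while strictly in-order completion returns 0.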

It is completely understandable that no order among the transfers can or should be guaranteed, but I do think that scheduling pending transfers in the order they were added is the intuitive behaviour, and it would be a great improvement to curl.

Owner
bagder commented Mar 31, 2017

I think you make a good case for why we want to try harder to maintain the order. I agree.

Owner
bagder commented Mar 31, 2017

@bakaid, it would be great if you could look into extending unit test 1309 somewhat to verify this enhanced behavior!

bakaid commented Apr 1, 2017

Thank you, I'm on it!

Dániel Bakai added some commits Mar 24, 2017
Dániel Bakai multi: fix queueing of pending easy handles
Multi handles repeatedly invert the queue of pending easy handles when
used with CURLMOPT_MAX_TOTAL_CONNECTIONS. This is caused by a multistep
process involving Curl_splaygetbest and violates the FIFO property of
the multi handle.
This patch fixes this issue by redefining the "best" node in the
context of timeouts as the "smallest not larger than now", and
implementing the necessary data structure modifications to do this
effectively, namely:
 - splay nodes with the same key are now stored in a doubly-linked
   circular list instead of a non-circular one to enable O(1)
   insertion to the tail of the list
 - Curl_splayinsert inserts nodes with the same key to the tail of
   the same list
 - in case of multiple nodes with the same key, the one on the head of
   the list gets selected
0285eeb
Dániel Bakai tests: added test for Curl_splaygetbest to unit1309
This checks the new behavior of Curl_splaygetbest, so that the smallest
node not larger than the key is removed, and FIFO behavior is kept even
when there are multiple nodes with the same key.
89a14a3
bakaid commented Apr 3, 2017

@bagder, I've modified the test and rebased. If there is anything else I can do, please tell me.

@bagder bagder added a commit that closed this pull request Apr 4, 2017
@bagder Dániel Bakai + bagder tests: added test for Curl_splaygetbest to unit1309
This checks the new behavior of Curl_splaygetbest, so that the smallest
node not larger than the key is removed, and FIFO behavior is kept even
when there are multiple nodes with the same key.

Closes #1358
6193770
@bagder bagder closed this in 6193770 Apr 4, 2017
Owner
bagder commented Apr 4, 2017

Excellent work, thank you!

@jay jay referenced this pull request Apr 5, 2017
@bagder Dániel Bakai + bagder multi: fix queueing of pending easy handles
de05bcb