-
-
Notifications
You must be signed in to change notification settings - Fork 6.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
multi: fix queueing of pending easy handles #1358
Conversation
@bakaid, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @yangtse and @dfandrich to be potential reviewers. |
Thank you for your contribution! You're suggesting changes to a core part of libcurl that basically has been stable and untouched for almost a decade. I think we must add some test cases for this that make sure the added functionality works as intended. But first maybe we should discuss whether this change is really worth it. Is there any documentation anywhere that would suggest that libcurl will really try the added handles in a FIFO order? I always stress to users that libcurl does not maintain any order among the transfers an in the case where there are N transfers waiting, we have not promised a queue order to the user. I'm a little concerned that adding such a promise to docs and to users will imply that we need to work harder to maintain that, to very little gain I think. I gather you have a particular use case where FIFO order is desired? |
On the issue of testing the new behaviour: you're completely right and I'd be happy to add these test cases. The particular use case is the streaming download of a large file using multiple HTTP range requests on multiple connections (there must be multiple connections to achieve maximum throughput, but it is not feasible to use as many connections as the number of chunks in the file). It doesn't really matter in what particular order the transfers are finished, but it does matter that the chunks are downloaded in a relative order, meaning that the pending requests should be scheduled in the order of their addition. Otherwise a large buffer needs to be maintained and work on the file stream is blocked while chunks from the end of the file are being downloaded that could not be processed anyway until we've processed the previous chunks. It is completely understandable, that no order among the transfers can and should be guaranteed, but I do think that it is the intuitive behaviour to schedule pending transfers in the order they were added and it would be a great improvement to curl. |
I think you make a good case for why we want to try harder to maintain the order. I agree. |
@bakaid, it would be great if you could look into extending unit test 1309 somewhat to verify this enhanced behavior! |
Thank you, I'm on it! |
Multi handles repeatedly invert the queue of pending easy handles when used with CURLMOPT_MAX_TOTAL_CONNECTIONS. This is caused by a multistep process involving Curl_splaygetbest and violates the FIFO property of the multi handle. This patch fixes this issue by redefining the "best" node in the context of timeouts as the "smallest not larger than now", and implementing the necessary data structure modifications to do this effectively, namely: - splay nodes with the same key are now stored in a doubly-linked circular list instead of a non-circular one to enable O(1) insertion to the tail of the list - Curl_splayinsert inserts nodes with the same key to the tail of the same list - in case of multiple nodes with the same key, the one on the head of the list gets selected
This checks the new behavior of Curl_splaygetbest, so that the smallest node not larger than the key is removed, and FIFO behavior is kept even when there are multiple nodes with the same key.
d7fb86b
to
89a14a3
Compare
@bagder, I've modified the test and rebased. If there is anything else I can do, please tell me. |
Excellent work, thank you! |
Multi handles repeatedly invert the queue of pending easy handles when used with CURLMOPT_MAX_TOTAL_CONNECTIONS. This is caused by a multistep process involving Curl_splaygetbest and violates the FIFO property of the multi handle. This patch fixes this issue by redefining the "best" node in the context of timeouts as the "smallest not larger than now", and implementing the necessary data structure modifications to do this effectively, namely: - splay nodes with the same key are now stored in a doubly-linked circular list instead of a non-circular one to enable O(1) insertion to the tail of the list - Curl_splayinsert inserts nodes with the same key to the tail of the same list - in case of multiple nodes with the same key, the one on the head of the list gets selected
Multi handles "try and maintain a FIFO queue so the pipelined requests are in order". That is why newly added easy handles are added to end of the multi->easyp linked list.
When using the multi_socket API with the total number of connections limited by CURLMOPT_MAX_TOTAL_CONNECTIONS, this FIFO behaviour can be lost. This can be demonstrated by the slightly modified "multi-uv" example. We create a multi handle, and limit CURLMOPT_MAX_TOTAL_CONNECTIONS to 1 (this wouldn't make much sense in real life, but it helps to demonstrate the problem clearly). Then we add 16 easy handles (HTTP GET requests to the same URL) and log when these requests are finished (see: multi-pending-fix-test.txt).
It turns out that the requests are finished in this order:
This is caused by the repeated inversion of the list of easy handles waiting for connections (multi->pending) in a multistep process.
Let's begin from a state where one easy handle is executing and the others are in STATE_CONNECT_PEND.
- this means that the first timeout to be chosen and removed from the splay tree will be the one which is the largest smaller than the current time and, as a result, the expired timeouts will be iterated from the largest (most recently expired) to the smallest (least recently expired)
- in case when multiple timeouts (easy handles) belong to the same key (timestamp) in the splay tree, a linked list of timeouts are used
- when a node is inserted to the splay tree with a key already existing in the tree, it will be inserted in place of the current timeout (to the head of the list of same timeouts) and when removed, the head of this list will be removed first, and replaced by the next element of the list
- this means that the first element removed from the list of same timeouts will the one added last which is consistent with the demonstrated behaviour of Curl_splaygetbest
- therefore Curl_splaygetbest exhibits a FILO behaviour
- the last easy_handle added will take the available connection, the rest will be added to multi->pending in a reverse order
This process repeats until no more handles are left, inverting the list of pending handles at every iteration and giving the available connection to the first easy handle in the list, causing the observed 1-15,2-14,3-13,... pattern.
This particular problem could probably be solved by a quick-and-dirty solution: iterating on the multi->pending list in a reverse order in Curl_multi_process_pending_handles, negating the effect of the described process.
I would argue, however, that this problem is inherent to the workings of Curl_splaygetbest and the definition of "best" in the context of expired timeouts.
I think we should deal with timeouts which expired the longest time ago first, and define "best" in the context of expired timeouts as "smallest not larger than now".
This would fix this issue, and it seems generally the logical approach to me. However, I recognize that this may disable some optimalization I do not yet comprehend.
The patch that I propose contains the following modifications: