Skip to content

multi: add curl_multi_wakeup() #4608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

multi: add curl_multi_wakeup() #4608

wants to merge 1 commit into from

Conversation

ngg
Copy link
Contributor

@ngg ngg commented Nov 17, 2019

This commit adds curl_multi_wakeup() which was previously in the TODO
list under the curl_multi_unblock name.

On some platforms and with some configurations this feature might not be
available or can fail, in these cases a new error code
(CURLM_WAKEUP_FAILURE) is returned from curl_multi_wakeup()

Fixes #4418

This PR is a work-in-progress, I'm submitting it to receive feedback about
the design choices and the implementation.
I've tested it using our application on Windows, Linux, macOS and Android
platforms, and it seems to work well.
I'm not sure how to add a test case for this in the curl test suite, do you have
an idea? Or probably some example should be added as well.

@bagder
Copy link
Member

bagder commented Nov 17, 2019

Why do we need CURLMOPT_ENABLE_UNBLOCK for this? Why not just enable this unconditionally ?

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

If it is enabled then every multi uses two additional fds and the initialization time will be a bit slower. If someone never uses unblock, these are really unnecessary.

Secondly, because Curl_socketpair can fail (mostly on Windows), we'd like to know before calling curl_multi_poll() whether this unblock feature will work, so we can adjust the timeout based on that.

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

My second reason in the previous reply is really weak: we can simply call curl_multi_unblock() and see if it succeeda before the first poll to adjuat timeouts.

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

On second thought, this perf penalty doesn't seem that huge considering every DNS lookup will also create a socketpair when using the threaded resolver. I'll update this PR to have it enabled automatically.

Also it'd be easier to not include a new error code, but I couldn't find any that looked good. Should I reuse one of the existing codes and if so then which one?

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

I've removed the setopt option, now it is enabled if USE_SOCKETPAIR is defined and USE_BLOCKING_SOCKETS is not (blocking write socket can cause a deadlock if unblock is called from the same thread that will call poll).

Idea: After this is merged probably the threaded resolver can be refactored to call unblock after it is done instead of creating a new socketpair every time.

@bagder
Copy link
Member

bagder commented Nov 18, 2019

I'm not sure how to add a test case for this in the curl test suite, do you have an idea?

Two ideas:

  1. we can easily add test a case or two that use unblock when nothing is blocking, and make sure the behavior is correct for that case.
  2. we can use pthread (inspiration from test 1541) to fire up a test program that waits in one thread and we unblock it in a separate one. We just need to make some clever ways to provide the feedback when it works that doesn't depend on very exact timing.

Or probably some example should be added as well.

Yes and once we have the docs and tests worked out, that should be really easy!

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

Thanks for the quick feedback! I'll try to address the implementation and docs issues today, and I'll try to add tests after that.

@ngg
Copy link
Contributor Author

ngg commented Nov 18, 2019

I have another question:
On platforms where unblock is available and initialization succeeds, inside Curl_multi_wait there will always be at least that fd to poll for. This means that basically curl_multi_wait() also becomes like curl_multi_poll() by always waiting for something. As that is technically almost the same breaking change for curl_multi_wait() for current API users as it would be to simply replace it with curl_multi_poll(). Probably we should enable polling this socketpair only inside curl_multi_poll() where it doesn't cause different behavior if the feature is not used. What do you think?

@bagder
Copy link
Member

bagder commented Nov 18, 2019

Ah, good thinking - I hadn't considered that subtlety!

Yes, I would be fine with making it only work for curl_multi_poll() since it helps us not break the existing behavior of curl_multi_wait(). The poll version is the one we rather direct new users to anyway.

@ngg ngg force-pushed the unblock branch 2 times, most recently from aeb5ce0 to 2251546 Compare November 20, 2019 17:34
@ngg
Copy link
Contributor Author

ngg commented Nov 20, 2019

I've updated the implementation and docs, and also added the first test which tests unblocks and poll from a single thread without any easy handles attached. I'll add more tests (including a multi-threaded one) and an example later.
I have some questions:

  • Test case numbering is not clear to me, is 1564 a good number for this new test? (other curl_multi_* tests were in 15xx range)
  • This unblock feature is really about timing, which will make the tests timing dependent a bit. If I increase the timeouts then the tests will become slower but less flaky, how much time is acceptable for these new tests to take?

@ngg ngg force-pushed the unblock branch 2 times, most recently from 9b2da0a to 6ff353a Compare November 20, 2019 21:49
/*
* Name: curl_multi_unblock()
*
* Desc: unblocks a "hanging" curl_multi_poll call.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might improve the description, as unblocks a "hanging" curl_multi_poll call. - sounds like a call that hung (eg deadlocked).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "unblock" word itself can sound a bit horrifying I chose this because it was already in the TODO sectiom, personally I'd prefer to call this "wakeup". The description then could be something like " wakes up a sleeping curl_multi_poll call". Ideas? Or if we leave it named unblock, what description should we use?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is hard. I don't have any problems with renaming it *_wakeup if you think that's better.

Is there any previous art somewhere for a similar API that we could copy the name from?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mainly referring to "hanging" in the description. not the _unblock part. Description perhaps should make it clear that the api unblocks/wake up another thread that's blocked on/waiting for curl_multi_poll calls

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it is called curl_multi_wakeup and the short description says wakes up a sleeping curl_multi_poll call, I hope this is better.

lib/multi.c Outdated
@@ -367,6 +368,21 @@ struct Curl_multi *Curl_multi_handle(int hashsize, /* socket hash */

/* -1 means it not set by user, use the default value */
multi->maxconnects = -1;

#ifdef ENABLE_UNBLOCK
if(Curl_socketpair(AF_UNIX, SOCK_STREAM, 0, multi->unblock_pair) < 0) {
Copy link
Contributor

@pps83 pps83 Nov 21, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • wouldn't that be confusing to have AF_UNIX even on windows? (I know that it's ignored inside Curl_socketpair)
  • why sock_stream? udp should to be lighter than tcp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't that be confusing to have AF_UNIX even on windows? (I know that it's ignored inside Curl_socketpair)

Won't it be equally confusing to setup an #ifdef for the value when it isn't even used on that platform?

why sock_stream? udp should to be lighter than tcp

Well, we want the pipe to be a reliable stream. I'm not sure what to expect from this if we use datagrams instead... besides, does it really matter (can the "overhead" even be measured?) for this simple use case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose these parameters because this is how it is already used in the threaded resolver. These can be later improved in both places in a separate PR. Based on https://stackoverflow.com/questions/13953912/difference-between-unix-domain-stream-and-datagram-sockets I think SOCK_STREAM is better here because if we always send 1-byte messages then we would only be able to read a single byte at a time with SOCK_DGRAM (Note: I didn't test if this is really the case)

@bagder
Copy link
Member

bagder commented Nov 21, 2019

Test case numbering is not clear to me, is 1564 a good number for this new test? (other curl_multi_* tests were in 15xx range)

The number just have to be unique, the grouping or ranges are only for humans so you can go however you think is fine. 1564 looks perfect to me!

This unblock feature is really about timing, which will make the tests timing dependent a bit. If I increase the timeouts then the tests will become slower but less flaky, how much time is acceptable for these new tests to take?

If we can't provide any other hints to help figure that out in the test, I guess we need to test with something and see what works...

@ngg ngg force-pushed the unblock branch 2 times, most recently from 68a865f to 85f590f Compare November 21, 2019 11:21
@ngg
Copy link
Contributor Author

ngg commented Nov 22, 2019

The latest CI run shows it seems to work everywhere except on Windows when compiled with MSYS.

In that case, the tests can fail at libtest/lib1564.c:120 curl_multi_poll returned too early which tests that even if there are a huge amount of unblock calls, those all wakes up the first poll only (poll should read every byte from the socketpair).
I have no idea why this happens, I think there is some difference between normal Windows winsock and MSYS winsock calls (probably related to (WSA)EINTR handling or something similar).

I have currently no access to a Windows machine with MSYS, so I'll add some debug logging of error codes inside curl and let appveyor build that.

@ngg ngg force-pushed the unblock branch 4 times, most recently from f847531 to e5f088a Compare November 22, 2019 12:19
@ngg
Copy link
Contributor Author

ngg commented Nov 22, 2019

Strange, based on these debug messages it seems that writing to these socket is much more asynchronous than it is on any other platform, meaning that if we write to this socket then later on that same thread it's possible the data is not even ready to be read.
The good thing is that it means it's a problem with the test and not with the implementation.
For a test workaround, as it only seems to happen only on Windows and only when we write a lot to the socket, I think it might be fixed by adding a single sleep (only on Windows) after the large cycle that calls unblock and before the poll call. I'm trying that now.

@ngg ngg force-pushed the unblock branch 2 times, most recently from 54575e5 to 2710e15 Compare November 23, 2019 10:28
@ngg ngg changed the title [WIP] add curl_multi_unblock() [WIP] add curl_multi_wakeup() Nov 23, 2019
@ngg ngg force-pushed the unblock branch 5 times, most recently from f6262b7 to c553574 Compare November 24, 2019 20:24
This commit adds curl_multi_wakeup() which was previously in the TODO
list under the curl_multi_unblock name.

On some platforms and with some configurations this feature might not be
available or can fail, in these cases a new error code
(CURLM_WAKEUP_FAILURE) is returned from curl_multi_wakeup().

Fixes curl#4418
@ngg ngg changed the title [WIP] add curl_multi_wakeup() multi: add curl_multi_wakeup() Nov 25, 2019
@ngg
Copy link
Contributor Author

ngg commented Nov 25, 2019

I think I'm ready with this PR, please review. In the last CI run all failures seemed to be unrelated to this change. I've started another one just to make sure.

Notable changes since initial version:

  • fixed initial issues that came up on first review
  • the socketpair is created during curl_multi_init (and not with a setopt)
  • function has been renamed to curl_multi_wakeup
  • wakeup only affects curl_multi_poll (and not curl_multi_wait)
  • calling wakeup a lot of times during (or before) a single poll call should only wake up that call and not later calls. This however is not guaranteed and there are cases (on Windows where the socketpair is just too asynchronous and we cannot reliably read out all data from it)
  • docs has been updated to reflect these changes
  • one test (1564) has been added that tests call wakeup and poll from a single thread to see how they interact
  • a multi-threaded test (1565) has been added that tests a more realistic scenario

@bagder bagder closed this in f3c35e3 Nov 25, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Feb 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

Successfully merging this pull request may close these issues.

Add portable way to unblock/alert blocking curl_multi_wait from another thread
3 participants