Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow user to await async transports #799

Closed
wants to merge 9 commits into from
Closed

Conversation

stayallive
Copy link
Collaborator

I wanted to open this PR to have a discussion about how to implement this and if this is the way to go but we need to come up with a solution of some sorts.

The problem is that the HttpTransport uses an async client that because of how php works never actually starts the sending of the events until ->wait is called. Previously in 1.x this was fixed by an sendUnsentErrors and this attempts to replace that with Hub::getCurrent()->await() that a user or a framework integration can use to send unsent errors (if there are any).

We can for now go with not changing the Client interface so this is a non-bc and we can release it sooner so we can fix the problems people might encounter in longer running processes like a Laravel queue worker.

My intention is to get a fix/workaround for this issue crafted sooner rather than later if possible, it ofcourse doesn't have to be this exact way but also doesn't need to take use a few week to come up with a decent solution :)

Copy link
Collaborator

@Jean85 Jean85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally agree that we need to provide a way for the users (or the integrations) to call a "flush" on the event spool. But I fear that exposing it in this way is not useful, since it's not applicable to current transports otuside the normal HTTP one.

We already have the spool as a solution for this use case, isn't that the right approach?

src/ClientInterface.php Show resolved Hide resolved
{
$transport = $this->getClient()->getTransport();

if ($transport instanceof AsyncTransportInterface) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like exposing this method at the Hub level here, especially because it fails silently if the transport does not implement that interface...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did this so you won't have to consider what client the user is using. It always works just is a noop if it's not an async client. This prevents a try/catch block or checking the transport yourself and the Hub seemed like a logical place to have this bit since the Hub is well... the Hub.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't be enough to leave the exposed call on the client an leave it to the user?

Copy link
Collaborator Author

@stayallive stayallive Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving the transport exposed you mean but remove this method?

We definitely could, but I think we should provide an easy method of waiting for async transports without the user needing to know about how that internally works or guard against transports not using that method.

But if this convenience method is not liked, happy to remove it. Possibly it could be moved elsewhere where it's better suited (on our own client possibly even)?

src/Transport/HttpTransport.php Outdated Show resolved Hide resolved
@ghost
Copy link

ghost commented Apr 8, 2019

What is the actual specification of the await method? Is it supposed to just tell the transport to send all events or does it also specify that these must be serialized and the method should only return when all requests have been sent? Just noting that latter is a very bad idea for asynchronous applications, as you force them to be synchronous, blocking.

@stayallive
Copy link
Collaborator Author

@CharlotteDunois the point is to make the execution block, that is the main purpose of most implementations of the await keyword as far as I know.

This method is meant to make it possible for the implementor to force the requests to be sent.

Since PHP is a single threaded application there is no concept of asynchronous execution (without extensions) and thus "async" is really a way to defer heavy tasks to later.

For example the curl-client from httplug implements async by "queueing" requests and executing them in parallel later to make them "async".

I hope I explained it correctly but yes, the meaning of this await implementation is to make the execution block until al events are sent. This way in long running processes we can periodically force events to be sent so they won't delay until the process exits (or is killed and the events are lost).

Please note that this method is never called in the SDK except for in the shutdown handler (which indicates the end of an execution and we should wait for event to be sent before the process is killed). So the user can call the await to block execution.

This was made for solving: getsentry/sentry-laravel#222

@stayallive stayallive changed the title [WIP] Allow user to await async transports Allow user to await async transports Apr 8, 2019
@stayallive stayallive requested a review from Jean85 April 8, 2019 18:28
@ghost
Copy link

ghost commented Apr 8, 2019

@stayallive Asynchronous does not mean multithreaded, so does not mean multithreaded asynchronous. In PHP and JavaScript, asynchronous means non-blocking I/O, and I/O is the great devil here. The point of asynchronous, non-blocking I/O is not to block the thread simply to wait for I/O to happen. Simply do something else and check if it's ready yet periodically and get back to it when it's really ready.

Defering tasks does not make anything asynchronous. cURL multi relies onto a separate process (or thread) to do blocking I/O, that does not make it asynchronous. It just means you're telling someone else to do it, which means you are free to do something else, which is also to some degree the point of asynchronous frameworks (such as ReactPHP and Amphp). cURL multi relies onto the fact that you poll cURL meanwhile periodically to see if cURL is done, where you have now two paths 1) do a blocking loop until cURL is ready, or 2) just do something else and check again later.

Asynchronous frameworks such as ReactPHP do it similarily, they do the necessary I/O and then just wait until the I/O is ready using an event loop. As long as I/O isn't ready, the event loop can do something else. Computation isn't really a problem in PHP, blocking I/O is. That's why we rely on event loops to minimize the impact of I/O, and the largest impact is waiting for I/O.

I don't really see the point of an await method for asynchronous applications, nor do I think it's wise to promote it for asynchronous transports. Instead you should probably make it return some sort of awaitable (depends naturally onto the transport) that users can await themself (may that be with some sort of coroutines or not, does not really matter here).
But if the point of await is to force an asynchronous transport to be synchronous, this should be throughrougly documented, as from the current PR state that's not clear. This can lead to various misunderstandings and misapplications of the provided methods.

Extensions such as ext-async provide here naturally a great help because they already implement an underlying libuv event loop to do non-blocking I/O and task scheduling (pretty much like coroutines).

@stayallive
Copy link
Collaborator Author

I'm not really trying to discuss the very inner workings of php/async/await etc. and maybe I explained myself badly what I was trying to say, allow me to take another try to explain why this is currently need, if you still feel otherwise please let me know how to proceed :)

I also wan't to mention again I am not married to the name await if that causes confusion because I am using the await word wrong, happy to call it whatever 😄


$promise = $this->httpClient->sendAsyncRequest($request);

We currently receive a promise from an async HttpClient which is supplied by httplug, for example let's take the current preferred one and how it implements their async.

The CurlPromise we get back is implemented using a MultiRunner which uses the curl_multi_* functions under the hood.

To execute a curl_multi_ request you need to call curl_multi_exec. In the current MultiRunner this is done in the wait() method.

If you call wait on a CurlPromise you execute wait on the MultiRunner. A request will not be sent until you call wait on the CurlPromise as far as I can understand it right now.

This presents a problem. We never call wait until the php shutdown handler is called (currently and with this PR). This presents issues for long running processes like the Laravel queue worker since they run indefinitely and are sometimes killed to restart (no shutdown handler) which makes the curl-client queue up curl multi requests until the shutdownn handler is finally called when the worker is stopped, if the worker is killed no events will have ever be sent.

This PR tries to provide an interface to make sure all requests were sent before what's coming next. We would hook in the queue events from Laravel and make sure all events are submitted before the worker is allowed to handle a new event. This way we can be sure all events are correctly submitted to sentry and are not lost.

@ghost
Copy link

ghost commented Apr 8, 2019

@stayallive I'm not saying this is not needed, I'm saying if it's merged, it must at least be documented properly, that means the method description must say what it does and what it tries to achieve, so that it can't be misused. Since it's a public SDK method, it's open to misapplication, and good documentation can mitigate this.

And just in my own opinion, I think that the async implementation of httplug curl is broken, because that's not how you implement asynchronous, a better example of how you do it is implemented in guzzle (however that's still bad for async applications, but it actually sends the request at any time asynchronously, with no need to synchronize it).

If you want to resolve this the actual good way, then it should be properly solved in httplug curl, however I do see where you're coming from and I agree that sentry should provide a standardized way to "await" outstanding requests (maybe even with being able to return an actual "awaitable" from the transport?).

Copy link
Member

@HazAT HazAT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make sure to call the function flush / close as defined in the Unified SDK spec: https://docs.sentry.io/development/sdk-dev/unified-api/#client

These function should live directly on the client and the hub shouldn't know about this.

Copy link
Collaborator

@Jean85 Jean85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this change, it's clean and simple, but I've found two thing that need intervention.

src/State/HubInterface.php Outdated Show resolved Hide resolved
src/Transport/HttpTransport.php Outdated Show resolved Hide resolved
@ste93cry
Copy link
Collaborator

Closing this as it has been superseded by #813

@ste93cry ste93cry closed this Aug 17, 2019
@stayallive stayallive deleted the async-await branch August 19, 2019 07:53
@Jean85 Jean85 mentioned this pull request Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants