user_done_callback fires too early on cancellation or timeout #105
Comments
The goal of a process pool is to abstract the management of the worker processes away from the main application or service. Hence, there is no proper moment to run a callback, as the nature of the problem is asynchronous. The main reason callbacks were run before terminating the processes was to execute the post-processing as fast as possible, with the idea of handling the process termination in a later phase. How is the main loop supposed to know what resources to clean up?
In my case the process does its work in a temporary directory that I have to clean up manually if the process is cancelled. On Windows the process is simply killed, with no signal that can be caught, so I have to do this in the main process that launched the worker task. So far I have tried a simple check: if the process was cancelled, delete the directory. That fails because files in the directory are still in use, since the process has not been killed yet. Indeed, in general pebble can't know what should be cleaned up, but I, as the developer receiving the callback, conceivably can. That is why I want to receive the callback in the first place. I understand now that in this case it is a notification that the process will be cancelled, but that is inconsistent with the other two states, failed (an exception is set and it is not due to a timeout) and done, whose callbacks are called after the process has moved on to the next job.
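The ordering problem described above can be reproduced without pebble. The following is a minimal sketch that uses the standard library's `subprocess` module as a stand-in for a pool worker; `WORKER_CODE`, `cancel_and_clean`, and the `scratch.bin` file name are hypothetical names chosen for illustration, not part of pebble. It shows the order that makes cleanup reliable: kill the process, wait until it has actually exited, and only then delete the directory.

```python
import shutil
import subprocess
import sys
import tempfile
import time
from pathlib import Path

# Hypothetical worker: keeps a file open inside the working directory.
WORKER_CODE = (
    "import os, sys, time\n"
    "handle = open(os.path.join(sys.argv[1], 'scratch.bin'), 'wb')\n"
    "handle.write(b'partial result')\n"
    "handle.flush()\n"
    "time.sleep(60)\n"
)


def cancel_and_clean(workdir: Path) -> int:
    """Kill the worker, wait for it to exit, then delete its directory."""
    worker = subprocess.Popen([sys.executable, "-c", WORKER_CODE, str(workdir)])

    # Bounded wait so the worker has a chance to create and open its file.
    deadline = time.monotonic() + 10
    while not (workdir / "scratch.bin").exists() and time.monotonic() < deadline:
        time.sleep(0.05)

    worker.kill()  # requests termination; the process may still be alive here
    worker.wait()  # returns only once the process has actually exited
    shutil.rmtree(workdir)  # no live process holds files in workdir any more
    return worker.returncode


workdir = Path(tempfile.mkdtemp(prefix="job-"))
exit_code = cancel_and_clean(workdir)
```

Moving `shutil.rmtree` above `worker.wait()` recreates the situation from this comment: on Windows the removal can fail because the dying process still holds `scratch.bin` open.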
The next line of the code that I linked to, the one stopping the worker, calls stop_process, which just kills the worker. Am I correct that this is not asynchronous, so that calling stop_worker and only then firing the callback would never result in the callback being received before the process is stopped?
Whether there is a proper moment to run the callback is almost philosophical (since, to my understanding, this is not an asynchronous problem in the case of the code in the `update_tasks` function). I would argue that it would be good to make it consistent with the other situations in which the callback is called, or at least to make that configurable.
Thank you for a fantastic library, this is the only problem I am running into :)
…On Sun, Oct 2, 2022 at 5:31 PM Matteo Cafasso ***@***.***> wrote:
The goal of a process pool is to abstract the management of the worker processes away from the main application or service.
Hence, there is no proper moment to run a callback, as the nature of the problem is asynchronous. The main reason callbacks were run before terminating the processes was to execute the post-processing as fast as possible, with the idea of handling the process termination in a later phase.
How is the main loop supposed to know what resources to clean up?
This is not the first time something similar has come up, and I now see a valid use case for it. I cannot think of cases where the current behaviour (running a callback while the timed-out or cancelled process is still running) is expected, so I think it is safe to change it. I ran some tests over the weekend and did not identify any issues. I will make a new release soon with this enhancement.
@noxdafox: thank you very much! Glad to hear!
A legitimate use case for callbacks is resources cleanup. This cannot happen while the processes are still running as they might be holding the resources to cleanup. Signed-off-by: Matteo Cafasso <noxdafox@gmail.com>
Issue resolved in |
Thanks very much! However, I find this is not working as I expected (and I think this is possibly an issue with my expectations), because cancellation of a running task is not notified to the user callback through task_done (the calling of which we just reordered in
All of 2 will happen after 1, since it runs in a different thread that will only be assigned a processing slice after 1 is done (if I understand Python correctly; at least this ordering may occur). This is not easy to fix. Fixing it to behave like a finished or failed task (callback invoked after this status is actually reached) would involve adding an extra state to the PebbleFuture (e.g. All rather complicated in any case, I guess. Perhaps I would have more success installing my own waiter (implementation detail as that is), which would get invoked at the right time for cancellation and for normal or exceptional finishes, now in 5.0.1! I'll give that a try. I'm happy to think along and try out the above, however, should you wish to pursue it. Thanks in any case!
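The timing concern in this comment can be illustrated with the standard library alone. The sketch below uses a plain `concurrent.futures.Future` as a stand-in for a task future; that pebble's future behaves the same way on cancellation is an assumption here, and `on_done`, `pending`, and `events` are illustrative names only. It shows that `cancel()` on a pending standard-library future runs the done callbacks immediately, in the thread that called `cancel()`, without waiting for anything else to happen.

```python
import threading
from concurrent.futures import Future

events = []
caller_thread = threading.current_thread()


def on_done(future):
    # Record whether the future reports cancellation and whether the
    # callback runs on the same thread that called cancel().
    ran_on_caller = threading.current_thread() is caller_thread
    events.append(("callback", future.cancelled(), ran_on_caller))


pending = Future()  # stand-in for a task future; created directly for the demo
pending.add_done_callback(on_done)

events.append("cancel() called")
was_cancelled = pending.cancel()
events.append("cancel() returned")
```

The callback entry lands between the two markers in `events`, so any cleanup placed in such a callback runs as part of the `cancel()` call itself, before whatever termination work follows it.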
Yes, registering a waiter and (ab)using it as a way to get my callback run at the right time does the trick! So my problem is solved, if in a slightly dirty way.
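For readers unfamiliar with waiters, here is a minimal sketch of the mechanism on a plain `concurrent.futures.Future`. It relies on the private `_waiters` and `_condition` attributes of the CPython implementation, so it may break between Python versions; `RecordingWaiter` and `attach` are hypothetical helpers, and when pebble itself acknowledges a cancellation is not shown here. In the standard library, waiters are notified of a cancellation by `set_running_or_notify_cancel()`, not by `cancel()`.

```python
from concurrent.futures import Future


class RecordingWaiter:
    """Hypothetical waiter exposing the three hooks the stdlib calls."""

    def __init__(self):
        self.events = []

    def add_result(self, future):
        self.events.append("result")

    def add_exception(self, future):
        self.events.append("exception")

    def add_cancelled(self, future):
        self.events.append("cancelled")


def attach(future):
    """Register a waiter on a future (uses private attributes)."""
    waiter = RecordingWaiter()
    with future._condition:  # private: the lock guarding the future's state
        future._waiters.append(waiter)  # private: the list of waiters
    return waiter


cancelled_future = Future()
cancel_waiter = attach(cancelled_future)
cancelled_future.cancel()
events_right_after_cancel = list(cancel_waiter.events)

# Normally called by the executor; it acknowledges the cancellation.
cancelled_future.set_running_or_notify_cancel()
events_after_acknowledgement = list(cancel_waiter.events)

finished_future = Future()
finish_waiter = attach(finished_future)
finished_future.set_result(42)  # a normal finish notifies the waiter directly
```

The waiter stays silent when `cancel()` is called and only hears about the cancellation once the owner of the future acknowledges it, which is why it can serve as a later notification point than a done callback.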
I actually forgot that callbacks are executed on The callback should work on Glad you found a workaround for your need. |
@noxdafox, fully agree, you shouldn't change future's behavior. It may however be good to document that Thanks again for the |
…y, now (ab)use a waiter as a way to get notified, implementation detail as it is. Now we get notified at the right time and can finally clean up any mess we make when canceling or when the process fails. See noxdafox/pebble#105
See the code around line 248 in pebble/pebble/pool/process.py (at commit 706966a).
Here the user is notified of the cancellation before the worker process is actually killed. That is an issue for me because, when the worker is cancelled, I would like to clean up some file system objects it created, but I cannot: they are still in use, since the worker has not been stopped yet.
I do not see a reason to notify about cancellation before it has actually occurred, but perhaps I am shortsighted :). If there is such a reason, could you consider making the notification timing configurable?