Task cancellation should use same TCP channel as request to be cancelled #92532
Labels:
- `>bug`
- `:Distributed/Task Management`: Issues for anything around the Tasks API - both persistent and node level.
- `Team:Distributed`: Meta label for distributed team
When a parent task is cancelled we use the `internal:admin/tasks/ban` action to notify all its descendant tasks to stop running too. It's important that the cancellation request arrives after the request that started the descendant task: if the cancellation arrives first then it will appear to be trivial, so it will complete immediately and be forgotten, and then the arrival of the request will start the supposedly-cancelled task.

Today we track child tasks for each `Transport.Connection`, and make sure to send the cancellation request out over the same `Transport.Connection` as the original request. A `Transport.Connection` represents a bundle of TCP channels to a remote node, and this does not give the ordering guarantee needed. I think we need to use the same individual TCP channel for the cancellation. It's ok if this channel is closed, because in that case #56620 will already have cancelled the child task.
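
To illustrate the race, here is a minimal toy model. It is not Elasticsearch code: `ToyConnection`, `ReceivingNode`, `Message` and `childSurvivesCancellation` are hypothetical names invented for this sketch. It assumes only what is described above: each channel delivers in FIFO order, there is no ordering between different channels of the same connection, and a cancellation that finds no child task completes and is forgotten.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

public class CancellationOrderingSketch {

    // Hypothetical message: either starts a child task or cancels the children of a parent.
    static final class Message {
        final boolean isCancel;
        final String parentTaskId;

        Message(boolean isCancel, String parentTaskId) {
            this.isCancel = isCancel;
            this.parentTaskId = parentTaskId;
        }
    }

    // Hypothetical stand-in for the remote node's task bookkeeping.
    static final class ReceivingNode {
        private final Set<String> runningChildren = new HashSet<>();

        void handle(Message m) {
            if (m.isCancel) {
                // A cancellation that finds no child is trivial: it completes and leaves no trace.
                runningChildren.remove(m.parentTaskId);
            } else {
                runningChildren.add(m.parentTaskId);
            }
        }

        boolean hasRunningChild(String parentTaskId) {
            return runningChildren.contains(parentTaskId);
        }
    }

    // Toy connection: a bundle of independent FIFO channels to one remote node.
    static final class ToyConnection {
        private final List<Queue<Message>> channels = new ArrayList<>();

        ToyConnection(int channelCount) {
            for (int i = 0; i < channelCount; i++) {
                channels.add(new ArrayDeque<>());
            }
        }

        void send(int channel, Message m) {
            channels.get(channel).add(m);
        }

        // Drains the channels in the given order. Order within a channel is preserved,
        // but the order between channels is whatever the caller (the "network") picks.
        void deliver(ReceivingNode node, int... drainOrder) {
            for (int c : drainOrder) {
                Queue<Message> q = channels.get(c);
                while (!q.isEmpty()) {
                    node.handle(q.poll());
                }
            }
        }
    }

    // Sends the start request, then the cancellation, and reports whether the child is still running.
    public static boolean childSurvivesCancellation(int startChannel, int cancelChannel) {
        ToyConnection connection = new ToyConnection(2);
        ReceivingNode node = new ReceivingNode();
        connection.send(startChannel, new Message(false, "parent-1"));
        connection.send(cancelChannel, new Message(true, "parent-1"));
        // Adversarial but legal schedule: channel 1 is drained before channel 0.
        connection.deliver(node, 1, 0);
        return node.hasRunningChild("parent-1");
    }

    public static void main(String[] args) {
        System.out.println("same channel, child survives: " + childSurvivesCancellation(0, 0));
        System.out.println("different channels, child survives: " + childSurvivesCancellation(0, 1));
    }
}
```

In this model, sending both messages on the same channel always cancels the child, while sending them on different channels of the same connection can leave the supposedly-cancelled child running.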