New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Possible race on task dispatch #5071
Comments
Wild suggestion here, perhaps not even something we can do currently: UUIDv7 embeds a timestamp and is designed to have a monotonically increasing unique value. Theoretically we could work with the PK instead of raw timestamps (so long as no ancient tasks utilizing UUIDv4 PKs remain) to obtain a complete table ordering. Postgresql has functions for generating UUIDs (but not UUIDv7 currently, you need an extension for it), so if [0] we were able to utilize those for sourcing the PK then we could mostly [1] eliminate clock skew as an issue, as only the database timestamp would matter [0] we currently "expect" the PK to have been generated in Python prior to any attempts to save IIRC |
I'm not entirely convinced this would solve the issue, which is basically that there is a time gap between creating the task and committing the transaction it is in. And while once all tasks are committed (after say 5 s) we do have a definite order, by that time we'd hope some worker would have started on it already. |
This will acquire a transaction based exclusive advisory lock before creating a new task object in the database and check the actual validity of the new tasks "pulp_created" field. fixes pulp#5071
This will acquire a transaction based exclusive advisory lock before creating a new task object in the database and check the actual validity of the new tasks "pulp_created" field. fixes pulp#5071
This will acquire a transaction based exclusive advisory lock before creating a new task object in the database and check the actual validity of the new tasks "pulp_created" field. fixes pulp#5071
This will acquire a transaction based exclusive advisory lock before creating a new task object in the database and check the actual validity of the new tasks "pulp_created" field. fixes #5071
We identified that there may be a possible race when dispatching new tasks.
With two processes adding a new task they become visible to the workers after finishing their transaction, but the created_at fields may actually report a different order. This can lead to two workers picking up conflicting tasks concurrently.
This may be even worse on distributed systems where you can have a clockscrew on top.
We believe the window to already be very small, so any attempt to fix is should be like a vacuum seal.
Current working idea: Wrap adding a task to the table in a transaction-based-exclusive-wait-for advisory-lock. Check that the current latest task is actually from the past and add the new task afterwards.
The text was updated successfully, but these errors were encountered: