Describe the bug
When syncing collections, the collection version with the highest version number is marked with the is_highest=True field. A uniqueness constraint on this field allows only one collection version per collection to be marked as the highest at a time. Because of how the sync pipeline is designed, there is a scenario in which the current update logic tries to save two collection versions as the highest, which triggers the constraint and fails the sync. Here's how:
Sync a collection and its versions. Let's say collection A and versions 1, 2 & 3.
CV A-3 will be marked is_highest=True
Collection A has a new version released upstream, 4.
Re-sync collection A
Sync task creates potential CVs A-1, A-2, A-3, & A-4 and sends them down the sync pipeline. A-4 gets sent before the others.
Pipeline discovers that A-1, A-2, & A-3 already exist and replaces them with their current records; these are now in-memory objects.
A-4 reaches the save stage and gets saved. update_highest_version properly sets A-4 as the new highest and updates the record of A-3 to no longer be highest.
A-3 reaches the save stage, but its in-memory representation is now out of sync with its DB representation. The custom save stage then tries to save the instance again (even if nothing has changed), which triggers the constraint because the model is saved with the stale is_highest=True.
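The sequence above can be reproduced outside the plugin with a small simulation. This is only a sketch: it uses an in-memory SQLite table with a partial unique index as a stand-in for the real model and constraint, and the names `make_db`, `save`, `resync`, and `skip_existing` are hypothetical, not part of the actual code base. The `skip_existing` guard shows one possible way to avoid the failure and is not necessarily the fix applied in #481.

```python
import sqlite3


def make_db():
    """Create a table holding collection A with versions 1-3, A-3 being highest."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cv (id INTEGER PRIMARY KEY, collection TEXT, "
        "version INTEGER, is_highest INTEGER NOT NULL DEFAULT 0)"
    )
    # Stand-in for the uniqueness constraint: one highest row per collection.
    conn.execute(
        "CREATE UNIQUE INDEX one_highest ON cv (collection) WHERE is_highest = 1"
    )
    conn.executemany(
        "INSERT INTO cv (id, collection, version, is_highest) VALUES (?, 'A', ?, ?)",
        [(1, 1, 0), (2, 2, 0), (3, 3, 1)],
    )
    return conn


def save(conn, obj):
    """Write every field of the in-memory object back, like a full model save."""
    conn.execute(
        "UPDATE cv SET version = ?, is_highest = ? WHERE id = ?",
        (obj["version"], obj["is_highest"], obj["id"]),
    )


def resync(conn, skip_existing):
    """Replay the re-sync; return the versions flagged as highest afterwards."""
    # A-3 is loaded into memory while it is still the highest version.
    stale_a3 = {"id": 3, "version": 3, "is_highest": 1}

    # A-4 reaches the save stage first and the highest flag moves from A-3 to A-4.
    conn.execute(
        "INSERT INTO cv (id, collection, version, is_highest) VALUES (4, 'A', 4, 0)"
    )
    conn.execute("UPDATE cv SET is_highest = 0 WHERE id = 3")
    conn.execute("UPDATE cv SET is_highest = 1 WHERE id = 4")

    # A-3 reaches the save stage with its stale is_highest value.
    if not skip_existing:
        save(conn, stale_a3)  # raises sqlite3.IntegrityError

    rows = conn.execute("SELECT version FROM cv WHERE is_highest = 1")
    return [row[0] for row in rows]
```

Calling `resync(make_db(), skip_existing=False)` raises `sqlite3.IntegrityError`, mirroring the failed sync, while `skip_existing=True` leaves A-4 as the only highest version.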
To Reproduce
Sync a repository, wait for some new collection-versions to be uploaded, and then sync again. I think the error might occur only sporadically due to the async nature of the sync pipeline.
Note this issue is a regression and has been fixed before in #481.
I am going to apply the same fix, but a refactoring of the CollectionSaver stage might be needed to better account for assumptions made on when to save/update collections.
Version
Since this commit: 8ad72ba