New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed lost CollectionItems on backup promotion in ISet and IList #12061
Fixed lost CollectionItems on backup promotion in ISet and IList #12061
Conversation
@@ -39,6 +39,10 @@ | |||
@SuppressWarnings("checkstyle:methodcount") | |||
public abstract class CollectionContainer implements IdentifiedDataSerializable { | |||
|
|||
public static final int INVALID_ITEM_ID = -1; | |||
|
|||
public static final int ID_PROMOTION_OFFSET = 100000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This offset was introduced by Jaromir for the IQueue
to create a buffer for in-flight operations, so we ensure that there are no duplicate itemIDs being created.
I put these commonly used constants to CollectionContainer
, which is directly extended by ISet
and IList
. Its generic name also fits well for IQueue
.
Collections.sort(itemList); | ||
sort(itemList); | ||
CollectionItem lastItem = itemList.get(itemList.size() - 1); | ||
setId(lastItem.getItemId() + ID_PROMOTION_OFFSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We enter this branch, when backups of ILIst
items are promoted. The backups are stored in the itemMap
and are used to populate the itemList
. The issue was to not call setId()
with the maximum itemId
of the existing items.
Since the itemList
is already sorted (by the itemId
its CollectionItems
), we can just pick the lastItem
to retrieve the maximum itemId
.
} | ||
itemSet.add(collectionItem); | ||
} | ||
setId(maxItemId + ID_PROMOTION_OFFSET); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We enter this branch, when backups of ISet
items are promoted. The backups are stored in the itemMap
and are used to populate the itemSet
. The issue was to not call setId()
with the maximum itemId
of the existing items.
Collection<?> items = getCollectionItems(backupInstance, collectionName); | ||
assertEqualsStringFormat("internal collection should contain %d items, but found %d", expectedCount, items.size()); | ||
for (Object item : items) { | ||
long itemId = INVALID_ITEM_ID; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks fragile. the test is making quite strong assumptions about the implementation.
any idea how to improve it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe we could start another member, add new items and see if all backups are created. I'll try this later.
the impl looks ok, but I am not crazy about coupling tests to the internal impl too much. |
I changed the tests to just rely on the
|
* fixed lost CollectionItems in ISet and IList by setting correct itemId on backup promotion * added constant INVALID_ITEM_ID to replace magic number -1 * added tests to check backup count and content when adding new backups after backup promotion
Fixed the commit message (test description was wrong) |
CollectionItem
s inISet
andIList
by setting correctitemId
on backup promotionINVALID_ITEM_ID
to replace magic number-1
The issue just occurs on backup promotions, so not on a graceful shutdown (which will trigger a replication/migration). This is why we see it in the split-brain scenario and when using
terminate()
.This issue was already fixed for
IQueue
via #2844 and #4275 and applies the same fix forISet
andIQueue
. After the review I also plan to backport this (which needs to include my first PR with the collection backup tests).Fixes #11930
Backport #12064