Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed lost CollectionItems on backup promotion in ISet and IList #12061

Merged
merged 1 commit into from Jan 8, 2018
Merged

Fixed lost CollectionItems on backup promotion in ISet and IList #12061

merged 1 commit into from Jan 8, 2018

Conversation

Donnerbart
Copy link
Contributor

@Donnerbart Donnerbart commented Jan 2, 2018

  • fixed lost CollectionItems in ISet and IList by setting correct itemId on backup promotion
  • added constant INVALID_ITEM_ID to replace magic number -1
  • added tests to check backup count and content when adding new backups after backup promotion

The issue just occurs on backup promotions, so not on a graceful shutdown (which will trigger a replication/migration). This is why we see it in the split-brain scenario and when using terminate().

This issue was already fixed for IQueue via #2844 and #4275 and applies the same fix for ISet and IQueue. After the review I also plan to backport this (which needs to include my first PR with the collection backup tests).

Fixes #11930
Backport #12064

@Donnerbart Donnerbart added this to the 3.10 milestone Jan 2, 2018
@Donnerbart Donnerbart self-assigned this Jan 2, 2018
@@ -39,6 +39,10 @@
@SuppressWarnings("checkstyle:methodcount")
public abstract class CollectionContainer implements IdentifiedDataSerializable {

public static final int INVALID_ITEM_ID = -1;

public static final int ID_PROMOTION_OFFSET = 100000;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This offset was introduced by Jaromir for the IQueue to create a buffer for in-flight operations, so we ensure that there are no duplicate itemIDs being created.

I put these commonly used constants to CollectionContainer, which is directly extended by ISet and IList. Its generic name also fits well for IQueue.

Collections.sort(itemList);
sort(itemList);
CollectionItem lastItem = itemList.get(itemList.size() - 1);
setId(lastItem.getItemId() + ID_PROMOTION_OFFSET);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We enter this branch, when backups of ILIst items are promoted. The backups are stored in the itemMap and are used to populate the itemList. The issue was to not call setId() with the maximum itemId of the existing items.

Since the itemList is already sorted (by the itemId its CollectionItems), we can just pick the lastItem to retrieve the maximum itemId.

}
itemSet.add(collectionItem);
}
setId(maxItemId + ID_PROMOTION_OFFSET);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We enter this branch, when backups of ISet items are promoted. The backups are stored in the itemMap and are used to populate the itemSet. The issue was to not call setId() with the maximum itemId of the existing items.

Collection<?> items = getCollectionItems(backupInstance, collectionName);
assertEqualsStringFormat("internal collection should contain %d items, but found %d", expectedCount, items.size());
for (Object item : items) {
long itemId = INVALID_ITEM_ID;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks fragile. the test is making quite strong assumptions about the implementation.
any idea how to improve it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, maybe we could start another member, add new items and see if all backups are created. I'll try this later.

@jerrinot
Copy link
Contributor

jerrinot commented Jan 4, 2018

the impl looks ok, but I am not crazy about coupling tests to the internal impl too much.

@Donnerbart
Copy link
Contributor Author

Donnerbart commented Jan 4, 2018

I changed the tests to just rely on the getBackupSet|List|Queue() methods (as the migration test before). So we check the count and content of the backups, like in the migration test. If I reverse the fix, the failure looks like:

java.lang.AssertionError: backupCollection should contain 1000 items, but found 500 
Expected :1000
Actual   :500

* fixed lost CollectionItems in ISet and IList by setting correct
  itemId on backup promotion
* added constant INVALID_ITEM_ID to replace magic number -1
* added tests to check backup count and content when adding new
  backups after backup promotion
@Donnerbart
Copy link
Contributor Author

Fixed the commit message (test description was wrong)

@Donnerbart Donnerbart merged commit 9153f72 into hazelcast:master Jan 8, 2018
@Donnerbart Donnerbart deleted the fixedLostCollectionItemsOnBackupPromotion branch January 11, 2018 16:44
@mmedenjak mmedenjak added the Source: Internal PR or issue was opened by an employee label Apr 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Backup mismatch in SetSplitBrainTest and ListSplitBrainTest
3 participants