Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEODE-9617: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size #6976

Closed
wants to merge 3 commits into from

Conversation

mhansonp
Copy link
Contributor

Did some code cleanup up and made a change to Latch calls to check return values.


LATCH.get().await(getTimeout().toMillis(), MILLISECONDS);
assertThat(LATCH.get().await(getTimeout().toMillis(), MILLISECONDS)).isTrue();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

important change. A timeout on a latch is possible and will return false.

@@ -952,16 +915,16 @@ public void testClientMetadataForPersistentPrs() throws Exception {
});
}

LATCH.get().await(getTimeout().toMillis(), MILLISECONDS);
assertThat(LATCH.get().await(getTimeout().toMillis(), MILLISECONDS)).isTrue();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

important change. A timeout on a latch is possible and will return false.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be misunderstanding something here, but am I right in thinking that this change won't prevent the test from failing, just make it fail at a different point than described in GEODE-9617? If the failure described in GEODE-9617 is due to this await silently timing out, then asserting here that it doesn't time out will just move the failure to here.

Also, GEODE-9617 seems to describe two distinct failures; one in this test, and one in testMetadataServiceCallAccuracy_FromGetOp(). Since this PR doesn't modify testMetadataServiceCallAccuracy_FromGetOp() beyond minor refactoring, is it accurate to say that any flakiness in that test won't be affected by it?

Finally, could the title of this PR be changed to meet the criteria described on the wiki: https://cwiki.apache.org/confluence/display/GEODE/Commit+Message+Format

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes to nothing to affect that. That will be a separate PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though that does not appear to reproduce at this time...

@mhansonp mhansonp marked this pull request as draft October 12, 2021 01:17
@mhansonp
Copy link
Contributor Author

Seems like the latch changes made the sticking point a little more obvious...

Copy link
Contributor

@kirklund kirklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know it's still in DRAFT, but it looks good to me!

@onichols-pivotal
Copy link
Contributor

this PR appears to be abandoned, can it be closed?

Copy link
Contributor

@onichols-pivotal onichols-pivotal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment deleted

@mhansonp mhansonp marked this pull request as ready for review May 11, 2022 21:30
- found a bug in my changes.
@mhansonp mhansonp changed the title Bugfix/GEODE-9617 Bugfix/GEODE-9617 CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size May 11, 2022
@@ -956,16 +935,16 @@ public void testClientMetadataForPersistentPrs() throws Exception {
});
}

LATCH.get().await(getTimeout().toMillis(), MILLISECONDS);

assertThat(LATCH.get().await(getTimeout().toMillis(), MILLISECONDS))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another latch change

@mhansonp mhansonp changed the title Bugfix/GEODE-9617 CI Failure: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size GEODE-9617: PartitionedRegionSingleHopDUnitTest fails with ConditionTimeoutException waiting for server to bucket map size May 13, 2022
Copy link
Contributor

@DonalEvans DonalEvans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per the commit message format guidelines found on the wiki, could this PR name (or the final commit message when merging) be changed to describe what the commit is actually changing rather than just being a restatement of the ticket title? Maybe something like "GEODE-9617: Assert on successful latch await", since as far as I can see, none of the changes are intended to address the flaky test, just make it more obvious when it does fail?

public class PartitionedRegionSingleHopDUnitTest implements Serializable {

protected static Logger logger = LogService.getLogger();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logger is not used in the test so should be removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks!


vm0.invoke(new CacheSerializableRunnable("testSimplePutAll1") {
vm0.invoke("testSimplePutAll1", new CacheSerializableRunnable() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This (and other uses of CacheSerializableRunnable below) can be simplified by using a lambda instead:

    vm0.invoke("testSimplePutAll1", () -> {
      int cntr = 0, cntr1 = 0;
      for (int i = 1; i < 6; i++) {
        region.put(i, "testSimplePutAll" + i);
        cntr++;
      }
...
    });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot do that because CacheSerializableRunnable throws a CacheException

@Override public void run() { try { run2(); } catch (CacheException exception) { String message = "While invoking \"" + this + "\""; throw new CacheSerializableRunnableException(message, exception); } }

@mhansonp mhansonp closed this May 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants