Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GEODE-8579: Stop waiting locator-wait-time if all locators are available #5598

Conversation

upthewaterspout
Copy link
Contributor

If we can contact all other locators, we should stop waiting for
locator-wait-time to elapse.

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

If we can contact all other locators, we should stop waiting for
locator-wait-time to elapse.
@upthewaterspout
Copy link
Contributor Author

BTW, the actual product change is just shifting some parenthesis around in an if statement.

Most of these changes have to do with moving AvailablePort so that I could write a good test. We need to be able to write a test at the membership level where two locators are started up that both know about each other (including their ports) at startup. The best way to do that is to use AvailablePort to get the ports before starting the locators.

Copy link
Contributor

@Bill Bill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine as-is. If you have time and appetite please consider the test/testability improvement I'm asking for in the comments.

@@ -57,7 +56,7 @@
@BeforeClass
public static void startLocator() throws IOException {
rootPath = gfshRule.getTemporaryFolder().getRoot().toPath();
locatorPort = getRandomAvailablePort(SOCKET);
locatorPort = AvailablePortHelper.getRandomAvailableTCPPort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better!

@@ -32,7 +32,7 @@
import java.util.Random;

import org.apache.geode.annotations.Immutable;
import org.apache.geode.distributed.internal.DistributionConfig;
import org.apache.geode.distributed.internal.membership.api.MembershipConfig;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better (narrower) dependency!

// Make sure the members see each other in the view
assertThat(membership0.getView().getMembers()).hasSize(2);
assertThat(membership1.getView().getMembers()).hasSize(2);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good test and it's acceptable as-is. I wonder though, if you'd be willing to entertain reducing run-time and increasing determinism by by injecting the (millisecond) time keeper and sleeper into the locator, the way it's done for PrimaryHandler?

Here is the PR from August that added that injection: #5422

As of that PR (and now on develop) there are a couple functional interfaces defined in PrimaryHander that could be hoisted a little higher and used in MembershipLocator for your purposes, i.e. Sleeper, MillisecondProvider.

With those injected, this test could control time similar to how PrimaryHandlerTest does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on your comment I started down this road. But this is a test of the whole membership system, so this will require injecting virtual time into a whole lot of classes, including classes that are doing things like wait(sometimeout) etc. I feel like this needs a little bit more time and careful development to get right, and more than I want to shoehorn into this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok @upthewaterspout. After this PR merges I'll write a ticket to make this improvement. Thanks for trying it though!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.printStackTrace();
ExitCode.FATAL.doSystemExit();
}
addr = InetAddress.getByName(addrString);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verified when address cannot be resolved the exception causes exit code 1 (on macos) ✓

…stem

We should only become the coordinator if we have not contacted any joined members.
Copy link
Contributor

@bschuchardt bschuchardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Grouping the time-based and retry-based checks together makes sense to me. This lets us invoke becomeCoordinator() faster, similar to what we had before the locatorGiveUpTime was introduced into this conditional.

AvailablePort moved, so is no longer needed as a sanctioned class in this test.
@upthewaterspout upthewaterspout merged commit faef811 into apache:develop Oct 8, 2020
@upthewaterspout upthewaterspout deleted the feature/stop-waiting-if-all-locators-contacted branch October 8, 2020 21:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants