Skip to content

Conversation

@kamilla1201
Copy link
Contributor

@kamilla1201 kamilla1201 commented Nov 17, 2021

I closed the previous PR (#6714) because of concerns that by changing the member id we may break the client code. For example, region name is based on the member id, so changing it might cause issues similar to GEODE-3152.

This fix is more conservative and potentially safer because instead of changing member id we fall back to the old logic for hostname validation if the remote member is older than 1.15.0.

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

kamilla1201 and others added 4 commits November 15, 2021 21:54
…ificates

Co-authored-by: Bill Burcham <bill.burcham@gmail.com>
Co-authored-by: Kamilla Aslami <kaslami@vmware.com>
@kamilla1201 kamilla1201 changed the title Geode 9396 GEODE-9396: Upgrades using SSL fail with mismatch of hostname in certificates Nov 17, 2021
@kamilla1201 kamilla1201 marked this pull request as ready for review November 18, 2021 17:12
Copy link
Contributor

@Bill Bill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved!

private final String conduitIdStr;

private InternalDistributedMember remoteAddr;
private InternalDistributedMember remoteMember;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big improvement in naming here!

updateEngineWithParameters = true;
}
} else {
engine.setNeedClientAuth(sslConfig.isRequireAuth());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This refactor improves readability.

}
return hostNameOrIP;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crux of this PR. This looks good to me. Thanks for the clear comment. I feel like whoever (eventually) deprecates the 1.15 line will notice this when they grep for "1.15". Not quite as elegant as a version-specific deserialization method on MemberIdentifier, but this approach avoids problems with our client code (using MemberIdentifiers in region names).

int jmxPort = ports[2];
GfshExecution execute =
GfshScript.of(startLocatorCommand("test", locatorPort, jmxPort, httpPort, 0))
GfshScript.of(startLocatorCommand("test", "localhost", locatorPort, jmxPort, httpPort, 0))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't localhost default value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed to use a different value for RollingUpgradeWithSslDUnitTest.

private void upgradeServer(String name, int serverPort, int locatorPort,
GfshExecution startupExecution) {
oldGfsh.stopServer(startupExecution, name);
GfshScript.of(startServerCommandWithConfig(name, serverPort, locatorPort))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be nice to add verification to see old locator, server was successfully stopped and new locator and server started. I am thinking of case where the old locator and servers were not stopped. And in the verifyMembers we are only looking at any servers or locators (not version specific).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll consider these improvement suggestions in a future PR.


// make sure servers can do rolling upgrade too
upgradeServer("server1", server1Port, locatorPort, startupExecution);
causeP2PTraffic(locatorPort);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not checking if the entry was made into server1 and read from server2; i am thinking thats the expectation here...Please correct me if I am missing anything.


private SSLEngine createSslEngine(final InetSocketAddress remoteAddress) {
final String hostName;
final boolean isSender = remoteMember != null;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not expert in this area of the code...Earlier the caller used to pass true or pass based on sender or receiver connection; is the assumption is right; the earlier argument "clientSocket" passed reflects (or based on) remoteMember setting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

createIoFilter is used by both sender and receiver and that creates some confusion: it isn’t obvious what clientSocket really means, who and why sets it to true, and in what cases remoteMember can be null.
Yes, before this change, the sender set clientSocket to true and the receiver set it to false. The sender also sets the remoteMember while the receiver doesn’t (it only knows the remote address from the socket channel). So there is a correlation between remoteMember and clientSocket, and we wanted to “document” it in the code.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be a possibility: by the time createSslEngine is called the remote member has left the cluster (or forced-disconnected)? or SocketChannel is closed...
Is it a good idea to have a test, where the remote member is made to exit; right before createSslEngine is called.
Sorry if I am suggesting cases thats not practical...Please ignore if thats the case...

Copy link
Contributor

@kirklund kirklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is geode-assembly/src/upgradeTest/resources/org/apache/geode/management/gemfire.properties used by all upgrade tests in geode-assembly?

I recommend renaming it to match the test such as MyTest_gemfire.properties and then copy it from resources to a TemporaryFolder for the test to use. It can be renamed to gemfire.properties as it's copied. The utils class ResourceUtils covers this use-case. Look for uses of ResourceUtils to see various examples.

@Bill Bill merged commit a24fceb into apache:develop Nov 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants