Skip to content

Conversation

@kamilla1201
Copy link
Contributor

@kamilla1201 kamilla1201 commented Jul 22, 2021

The fix for GEODE-9139 has changed the way hostnames are handled when SNI endpoint validation is enabled. Because of that, when upgrading from some older versions of Geode to a version that contains the fix for GEODE-9139, the SSL handshake may fail to complete. This happens in cases where the call of MemberData.getHostName() method on an older member returns an IP address. In the fix for GEODE-9139, Connection.createIoFilter() uses the hostname for endpoint identification, and if the certificate contains a symbolic name, endpoint identification will fail with "java.security.cert.CertificateException: No subject alternative names matching IP address" error.

This change adds fromDataPre_GEODE_1_15_0_0 method to MemberIdentifierImpl that converts the hostname of an older member to a hostname if it contains an IP address. This change also adds toDataPre_GEODE_1_15_0_0 but it doesn't convert the hostname back to an IP address. We think there is no need to do that because the member is about to be upgraded and will have a hostname in hostname field anyway.

Thank you for submitting a contribution to Apache Geode.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?

  • Has your PR been rebased against the latest commit within the target branch (typically develop)?

  • Is your initial contribution a single, squashed commit?

  • Does gradlew build run cleanly?

  • Have you written or updated unit tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

Note:

Please ensure that once the PR is submitted, check Concourse for build issues and
submit an update to your PR as soon as possible. If you need help, please send an
email to dev@geode.apache.org.

@kamilla1201 kamilla1201 marked this pull request as ready for review September 10, 2021 22:26
@kamilla1201 kamilla1201 changed the title [DRAFT] GEODE-9396 GEODE-9396: Upgrades using SSL fail with mismatch of hostname in certificates Sep 10, 2021
Copy link
Contributor

@Bill Bill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor request for reordering a couple methods. Asking for clarification in some comments. And wondering whether we should add another condition in one place.

@kamilla1201 kamilla1201 marked this pull request as draft September 16, 2021 00:24
kamilla1201 and others added 2 commits October 13, 2021 11:47
…ificates

Co-authored-by: Bill Burcham <bill.burcham@gmail.com>
Co-authored-by: Kamilla Aslami <kaslami@vmware.com>
@kamilla1201 kamilla1201 marked this pull request as ready for review October 21, 2021 16:27
Copy link
Contributor

@Bill Bill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this complicated ticket @kamilla1201!

Copy link
Contributor

@upthewaterspout upthewaterspout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if someone would have a reason to specify an IP address as the bind address that this might break. Cases I could think about:

  • Maybe there is no DNS name? The call to getHost() might fail
  • Maybe the DNS server isn't updated at the time the member starts. This might be the case in a IaaS where VMs or containers are launched as soon as the gemfire process starts. I guess for this case an old member would be less likely to not be resolvable?

I'm not sure if this should reject this approach or not, if there are no better options. Or maybe we just handle failures from getHost and retain the IP address.

@upthewaterspout
Copy link
Contributor

I'm wondering if someone would have a reason to specify an IP address as the bind address that this might break. Cases I could think about:

* Maybe there is no DNS name? The call to `getHost()` might fail

* Maybe the DNS server isn't updated at the time the member starts. This might be the case in a IaaS where VMs or containers are launched as soon as the gemfire process starts. I guess for this case an old member would be less likely to not be resolvable?

I'm not sure if this should reject this approach or not, if there are no better options. Or maybe we just failures from getHost and retain the IP address.

I wonder if it would be possible to fall back to the old logic for hostname validation if the member is from an old member, rather than changing the bind address when deserializing. Would that be doable or any safer?

@kamilla1201
Copy link
Contributor Author

@upthewaterspout If there’s no DNS, then getHost() should return the textual representation of an IP address. getHost() calls InetAddress.getCanonicalHostName() which is a best effort method, it should return an IP address if there is no DNS name. I’m not sure about the second scenario you described, but it sounds unlikely.

I wonder if it would be possible to fall back to the old logic for hostname validation if the member is from an old member, rather than changing the bind address when deserializing. Would that be doable or any safer?

I think we could do that in Connection.createIoFilter(), by replacing this:

      if (remoteAddr != null) {
        hostName = remoteAddr.getHostName();
      } else {
        hostName = SocketCreator.getHostName(address.getAddress());
      }

with something like:

      if (remoteAddr != null) {
        if (remoteAddr.getVersion().isOlderThan(KnownVersion.GEODE_1_15_0)) {
          hostName = address.getHostString();

          if (owner.getDM().getConfig().getSSLEndPointIdentificationEnabled()
              && InetAddressValidator.getInstance().isValid(hostName)) {
            // attempt to get a hostname instead of the proffered numeric address
            try {
              hostName = InetAddress.getByName(hostName).getCanonicalHostName();
            } catch (UnknownHostException e) {
              // ignore - we'll see what happens with endpoint validation using a numeric address
            }
          }
        } else {
          hostName = remoteAddr.getHostName();
        }
      } else {
        hostName = SocketCreator.getHostName(address.getAddress());
      }

Here we know whether endpoint identification is enabled, which should help us avoid problems with IP numbers.
But I’m not sure whether this code will work correctly if we decide to back-port this change (along with GEODE-9139).

@kamilla1201
Copy link
Contributor Author

Closing this PR because we decided to implement the fix that Dan suggested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants