Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NIFI-7407 Refactored SSL context generation throughout framework and extensions. #4263

Closed
wants to merge 12 commits into from

Conversation

alopresto
Copy link
Contributor

@alopresto alopresto commented May 11, 2020

Thank you for submitting a contribution to Apache NiFi.

Please provide a short description of the PR here:

Description of PR

In order to streamline the creation of SSLContext and SSLSocket objects throughout the application, I refactored the various near-duplicate but slightly-different factory objects. This also involved refactoring duplicate enums and removing legacy modules which were no longer necessary. Additional comments, unit tests, and regression tests were introduced.

In order to streamline the review of the contribution we ask you
to ensure the following steps have been taken:

For all changes:

  • Is there a JIRA ticket associated with this PR? Is it referenced
    in the commit message?

  • Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.

  • Has your PR been rebased against the latest commit within the target branch (typically master)?

  • Is your initial contribution a single, squashed commit? Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not squash or use --force when pushing to allow for clean monitoring of changes.

For code changes:

  • Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
  • Have you written or updated unit tests to verify your changes?
  • Have you verified that the full build is successful on both JDK 8 and JDK 11?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
  • If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
  • If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?

For documentation related changes:

  • Have you ensured that format looks appropriate for the output in which it is rendered?

Note:

Please ensure that once the PR is submitted, you check GitHub Actions CI for build issues and submit an update to your PR as soon as possible.

alopresto added 2 commits May 11, 2020
…2" (in shared constant).

Changed JettyServer default SSL initialization and updated unit test.
Removed SecurityStoreTypes (unused).
Added StringUtils inverted blank and empty checks.
Added TlsConfiguration container object.
Enhanced KeystoreType enum.
Added clean #createSSLContext() method to serve as base method for special cases/other method signatures.
Added utility methods in KeyStoreUtils.
Added generic TlsException for callers that cannot resolve TLS-specific exceptions.
Added utility methods for component object debugging.
Enforced TLS protocol version on cluster comms socket creation.
Added utility method for SSL server socket creation.
Refactored (Server)SocketConfigurationFactoryBean to store relevant NiFiProperties in TlsConfiguration instead of stateful SSLContextFactory (Cluster comms now enforce modern TLS protocol version).
Removed duplicate SSLContextFactory.
Switched duplicate SslContextFactory to wrap shared SSLContextFactory.
Refactored SslContextFactoryTest for clarity (will move any unique tests to nifi-security-utils class test).
Added further validation & boundary checking in uses of TlsConfiguration.
Provided SSLSocketFactory accessor in SslContextFactory.
Refactored OkHttpReplicationClient tuple method.
Refactored OcspCertificateValidator TLS logic.
Added utility method to apply TLS configs to OkHttpClientBuilder.
Removed references to duplicate SslContextFactory.
Removed unnecessary SslContextFactory.
Moved OkHttpClientUtils to nifi-web-util module.
Updated module dependencies.
Removed now empty nifi-security module.
Enforced TLS protocol selection on LB server socket.
Enforced TLS protocol selection on S2S server socket.
Applied specified TLS protocol versions to S2S socket creation.
Completed removal of legacy SSLContext creation methods from only remaining SslContextFactory.
Replaced references to creation methods throughout codebase.
Replaced references to unnecessary NiFiProperties file reads throughout tests.
Removed duplicate ClientAuth enum from SSLContextService and changed all references to SslContextFactory.ClientAuth.
Suppressed repeated TLS exceptions in cluster, S2S, and load balance socket listeners.
Cleaned up legacy code.
@alopresto
Copy link
Contributor Author

alopresto commented May 11, 2020

High level description of changes:

  • Previously there were SslContextFactory implementations in multiple modules. I enhanced the one in nifi-security-utils which is now used throughout the project, and removed the implementations in nifi-framework and nifi-socket-utils.
    • Part of this refactoring was removing public static methods which created an SSLContext object from various combinations of explicit keystore and truststore properties. These were being used in an inconsistent manner. I introduced a container object called TlsConfiguration which wraps the state of the configuration and provides internal validation checks. This encapsulates the need to check for different combinations of configuration presence/validity in each use case (components, framework, etc.) and relieves the calling developer of re-implementing this logic every time.
    • I also provided static convenience methods like getX509TrustManager() and createSSLSocketFactory() because in most cases that is what the calling code needs, rather than an intermediate SSLContext object they need to further configure. This reduced the need for Tuple<> return values throughout the code.
    • Duplicate code to transform the various return values and configure the OkHttpClient and its Builder were refactored to utility methods.
  • Duplicate enums were refactored.
  • Unnecessary code dealing with client authentication settings when creating a client connection/socket were removed (these settings would be ignored, as only an SSL/TLS server can decide to enforce/request client authentication).
  • Some tests were refactored to make mocking easier.
  • Removed extraneous file loading during NiFiProperties construction in many tests.
  • Enforced modern TLS protocol versions in various internal socket creations.

The easiest way to test these changes is to configure and deploy a secured cluster (see Apache NiFi Walkthroughs: Creating and Securing a NiFi Cluster with the TLS Toolkit) and run a flow which handles incoming secured connections such as ListenHTTP, HandleHttpRequest, etc.

Copy link
Contributor

@pvillard31 pvillard31 left a comment

Some unit tests failing. It looks like order matters when asserting that two arrays are equal. I suggested to use Set instead. (Note that's the only thing I looked at for now - just to get green builds)

@alopresto
Copy link
Contributor Author

alopresto commented May 11, 2020

Still some failures because the Java 8 on the GitHub Actions containers does not support TLSv1.3. Will resolve.

@thenatog
Copy link
Contributor

thenatog commented May 13, 2020

Reviewing..

Added external timing check to timing test assertion.
@alopresto
Copy link
Contributor Author

alopresto commented May 14, 2020

I made the dropdown for RestrictedSSLContextService more explicit where it now provides TLS, TLSv1.2 on Java 8 and TLS, TLSv1.2, TLSv1.3 on Java 11. Selecting TLS will allow connections over TLSv1.2 and TLSv1.3 (on Java 11 only. Java 8 does not support TLSv1.3).

With TLSv1.2 selected:


# TLSv1.2 is successful

 ..oolkit-1.11.4   master ●  echo Q | openssl s_client -connect node1.nifi:9999 -key nifi-key.key -cert nifi-cert.pem -CAfile nifi-cert.pem -tls1_2
CONNECTED(00000003)
depth=1 OU = NIFI, CN = ca.nifi
verify return:1
depth=0 OU = NIFI, CN = node1.nifi
verify return:1
---
Certificate chain
 0 s:OU = NIFI, CN = node1.nifi
   i:OU = NIFI, CN = ca.nifi
 1 s:OU = NIFI, CN = ca.nifi
   i:OU = NIFI, CN = ca.nifi
---
...
---
SSL handshake has read 2289 bytes and written 1464 bytes
Verification: OK
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: BA2FC4...0D2790
    Session-ID-ctx:
    Master-Key: C773AC...A85A19
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1589478477
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: yes
---
DONE

# TLSv1.3 fails

 ..oolkit-1.11.4   master ●  echo Q | openssl s_client -connect node1.nifi:9999 -key nifi-key.key -cert nifi-cert.pem -CAfile nifi-cert.pem -tls1_3
CONNECTED(00000003)
4570201536:error:1409442E:SSL routines:ssl3_read_bytes:tlsv1 alert protocol version:ssl/record/rec_layer_s3.c:1544:SSL alert number 70
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 7 bytes and written 234 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
 ✘  ..oolkit-1.11.4   master ● 

With TLS selected:


### TLSv1.3 is successful

 ..oolkit-1.11.4   master ●  echo Q | openssl s_client -connect node1.nifi:9999 -key nifi-key.key -cert nifi-cert.pem -CAfile nifi-cert.pem -tls1_3
CONNECTED(00000003)
depth=1 OU = NIFI, CN = ca.nifi
verify return:1
depth=0 OU = NIFI, CN = node1.nifi
verify return:1
---
Certificate chain
 0 s:OU = NIFI, CN = node1.nifi
   i:OU = NIFI, CN = ca.nifi
 1 s:OU = NIFI, CN = ca.nifi
   i:OU = NIFI, CN = ca.nifi
---
...
---
SSL handshake has read 2510 bytes and written 1800 bytes
Verification: OK
---
New, TLSv1.3, Cipher is TLS_AES_128_GCM_SHA256
Server public key is 2048 bit
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
DONE

# TLSv1.2 is successful

 ..oolkit-1.11.4   master ●  echo Q | openssl s_client -connect node1.nifi:9999 -key nifi-key.key -cert nifi-cert.pem -CAfile nifi-cert.pem -tls1_2
CONNECTED(00000003)
depth=1 OU = NIFI, CN = ca.nifi
verify return:1
depth=0 OU = NIFI, CN = node1.nifi
verify return:1
---
Certificate chain
 0 s:OU = NIFI, CN = node1.nifi
   i:OU = NIFI, CN = ca.nifi
 1 s:OU = NIFI, CN = ca.nifi
   i:OU = NIFI, CN = ca.nifi
---
...
---
SSL handshake has read 2293 bytes and written 1464 bytes
Verification: OK
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES256-GCM-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : ECDHE-RSA-AES256-GCM-SHA384
    Session-ID: 7E5D46...1F4E63
    Session-ID-ctx:
    Master-Key: AB80DE...4FCC9A
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1589478427
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: yes
---
DONE

Copy link
Contributor

@markap14 markap14 left a comment

Thanks @alopresto! I know this was a huge and tedious undertaking. But it dramatically simplifies things when dealing with TLS and will help to ensure that we are both correct and consistent in dealing with these configurations in the future.

I reviewed the code and all looks good from that perspective. I created a 2-node cluster that was secured with self-signed certificates. Tested clustering and UI access. Tested load-balanced connections and secure site-to-site (both http-based and raw socket based). Everything appears to work exactly as expected.

I'm a +1 but @thenatog indicated that he is reviewing also so will not merge it to master yet. Thanks again!

@thenatog
Copy link
Contributor

thenatog commented May 14, 2020

Looks like there's currently a test error for JDK11.

My testing:

Java 8

Java 11

Saw errors with site to site when using the HTTP protocol. I'm not certain if it's related to these changes or not:
"2020-05-14 15:16:06,799 WARN [Timer-Driven Process Thread-9] o.apache.nifi.remote.client.PeerSelector Could not communicate with node0.com:9551 to determine which nodes exist in the remote NiFi cluster, due to javax.net.ssl.SSLPeerUnverifiedException: Certificate for <node0.com> doesn't match any of the subject alternative names: [node1.com]"
It's possible these errors only happen for a cluster hosted on the same machine/localhost.

@alopresto
Copy link
Contributor Author

alopresto commented May 15, 2020

Thanks @markap14 and @thenatog for the extensive testing. I pushed another commit which enables TLSv1.3 for the Java 11 UI/API port and should resolve the test error.

I can reproduce the S2S issue mentioned above on a secure 3 node cluster pointing back to itself when all nodes are hosted on the same machine. I don't think the PR changed how this worked, so I suspect this existed previously, but I'll try to address it here as well. I also encountered trouble retrieving S2S peers so I will add some unit tests there to see what I can isolate and fix.

@alopresto
Copy link
Contributor Author

alopresto commented May 15, 2020

Force pushed as I had to fix the referenced Jira number in the recent commit messages.

@alopresto
Copy link
Contributor Author

alopresto commented May 18, 2020

I decided to do the S2S refactor in a separate Jira as it grew larger than anticipated. @thenatog if you're satisfied with what's here, please give a +1 and I'll merge. Thanks.

@thenatog
Copy link
Contributor

thenatog commented May 18, 2020

Looks like there might be a small issue with JDK11 tests, if we can fix that I'll +1. Thanks for the huge contribution, Andy! Definitely valuable changes here - the SSL Contexts have needed this improvement for a long time.

@alopresto
Copy link
Contributor Author

alopresto commented May 18, 2020

There was another Java 11 unit test failure. Resolved that.

@alopresto
Copy link
Contributor Author

alopresto commented May 18, 2020

Running a full build this time because some of the tests were failing on ordering.

… classes that don't support TLSv1.3. Filed NIFI-7468 as follow on task.
@alopresto
Copy link
Contributor Author

alopresto commented May 19, 2020

Thanks for finding all the edge cases @thenatog & @markap14. I think this is ready for your +1. I'll then merge.

@thenatog
Copy link
Contributor

thenatog commented May 19, 2020

+1, looks good. Thanks Andy!

@asfgit asfgit closed this in 441781c May 19, 2020
phuthientran pushed a commit to FerrelBurn/nifi that referenced this pull request Jan 8, 2021
…2" (in shared constant).

Changed JettyServer default SSL initialization and updated unit test.
Removed SecurityStoreTypes (unused).
Added StringUtils inverted blank and empty checks.
Added TlsConfiguration container object.
Enhanced KeystoreType enum.
Added clean #createSSLContext() method to serve as base method for special cases/other method signatures.
Added utility methods in KeyStoreUtils.
Added generic TlsException for callers that cannot resolve TLS-specific exceptions.
Added utility methods for component object debugging.
Enforced TLS protocol version on cluster comms socket creation.
Added utility method for SSL server socket creation.
Refactored (Server)SocketConfigurationFactoryBean to store relevant NiFiProperties in TlsConfiguration instead of stateful SSLContextFactory (Cluster comms now enforce modern TLS protocol version).
Removed duplicate SSLContextFactory.
Switched duplicate SslContextFactory to wrap shared SSLContextFactory.
Refactored SslContextFactoryTest for clarity (will move any unique tests to nifi-security-utils class test).
Added further validation & boundary checking in uses of TlsConfiguration.
Provided SSLSocketFactory accessor in SslContextFactory.
Refactored OkHttpReplicationClient tuple method.
Refactored OcspCertificateValidator TLS logic.
Added utility method to apply TLS configs to OkHttpClientBuilder.
Removed references to duplicate SslContextFactory.
Removed unnecessary SslContextFactory.
Moved OkHttpClientUtils to nifi-web-util module.
Updated module dependencies.
Removed now empty nifi-security module.
Enforced TLS protocol selection on LB server socket.
Enforced TLS protocol selection on S2S server socket.
Applied specified TLS protocol versions to S2S socket creation.
Completed removal of legacy SSLContext creation methods from only remaining SslContextFactory.
Replaced references to creation methods throughout codebase.
Replaced references to unnecessary NiFiProperties file reads throughout tests.
Removed duplicate ClientAuth enum from SSLContextService and changed all references to SslContextFactory.ClientAuth.
Suppressed repeated TLS exceptions in cluster, S2S, and load balance socket listeners.
Cleaned up legacy code.
Added external timing check to timing test assertion.
Made RestrictedSSLContextService TLS protocol versions allowable values explicit.
Enabled TLSv1.3 on Java 11.
Added explanations of TLS protocol versions in StandardSSLContextService and StandardRestrictedSSLContextService.
Resolved additional Java 11 test failures for NiFi internal classes that don't support TLSv1.3. Filed NIFI-7468 as follow on task.

This closes apache#4263.

Signed-off-by: Nathan Gough <thenatog@gmail.com>
Signed-off-by: Mark Payne <markap14@hotmail.com>
driesva pushed a commit to driesva/nifi that referenced this pull request Mar 19, 2021
…2" (in shared constant).

Changed JettyServer default SSL initialization and updated unit test.
Removed SecurityStoreTypes (unused).
Added StringUtils inverted blank and empty checks.
Added TlsConfiguration container object.
Enhanced KeystoreType enum.
Added clean #createSSLContext() method to serve as base method for special cases/other method signatures.
Added utility methods in KeyStoreUtils.
Added generic TlsException for callers that cannot resolve TLS-specific exceptions.
Added utility methods for component object debugging.
Enforced TLS protocol version on cluster comms socket creation.
Added utility method for SSL server socket creation.
Refactored (Server)SocketConfigurationFactoryBean to store relevant NiFiProperties in TlsConfiguration instead of stateful SSLContextFactory (Cluster comms now enforce modern TLS protocol version).
Removed duplicate SSLContextFactory.
Switched duplicate SslContextFactory to wrap shared SSLContextFactory.
Refactored SslContextFactoryTest for clarity (will move any unique tests to nifi-security-utils class test).
Added further validation & boundary checking in uses of TlsConfiguration.
Provided SSLSocketFactory accessor in SslContextFactory.
Refactored OkHttpReplicationClient tuple method.
Refactored OcspCertificateValidator TLS logic.
Added utility method to apply TLS configs to OkHttpClientBuilder.
Removed references to duplicate SslContextFactory.
Removed unnecessary SslContextFactory.
Moved OkHttpClientUtils to nifi-web-util module.
Updated module dependencies.
Removed now empty nifi-security module.
Enforced TLS protocol selection on LB server socket.
Enforced TLS protocol selection on S2S server socket.
Applied specified TLS protocol versions to S2S socket creation.
Completed removal of legacy SSLContext creation methods from only remaining SslContextFactory.
Replaced references to creation methods throughout codebase.
Replaced references to unnecessary NiFiProperties file reads throughout tests.
Removed duplicate ClientAuth enum from SSLContextService and changed all references to SslContextFactory.ClientAuth.
Suppressed repeated TLS exceptions in cluster, S2S, and load balance socket listeners.
Cleaned up legacy code.
Added external timing check to timing test assertion.
Made RestrictedSSLContextService TLS protocol versions allowable values explicit.
Enabled TLSv1.3 on Java 11.
Added explanations of TLS protocol versions in StandardSSLContextService and StandardRestrictedSSLContextService.
Resolved additional Java 11 test failures for NiFi internal classes that don't support TLSv1.3. Filed NIFI-7468 as follow on task.

This closes apache#4263.

Signed-off-by: Nathan Gough <thenatog@gmail.com>
Signed-off-by: Mark Payne <markap14@hotmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants