ZOOKEEPER-4415 + ZOOKEEPER-4912: enable TLSv1.3 and remove default cipher overrides#143
Conversation
…pher overrides Combined backport of two upstream commits to branch-3.6: - ZOOKEEPER-4415 (Apache 66771be, in 3.9.2+): adds TLSv1.3 support to the server by picking TLSv1.3 as the default SSLContext protocol when the running JDK supports it, and returning the JDK's default enabled protocols (TLSv1.3 + TLSv1.2 on JDK 11+; TLSv1.2-only on older JDKs) when `ssl.enabledProtocols` is unset. - ZOOKEEPER-4912 (Apache master 2aaeff8, not yet in any 3.9.x release): removes the hardcoded TLSv1.2-only default cipher list from X509Util. SSLContextAndOptions.getCipherSuites now returns null when `ssl.ciphersuites` is unset, which lets the JDK supply its own default cipher list at SSLEngine construction time. Motivation ========== With branch-3.6 today, the server installs a fixed TLSv1.2-only cipher set on every SSLEngine even when the JDK supports more recent ciphers and protocols. Any attempt to enable TLSv1.3 from configuration runs into two problems: 1. The hardcoded cipher list contains no TLSv1.3 ciphers, so TLSv1.3 handshakes have no overlap on the server side and fail. 2. The default protocol is pinned to TLSv1.2 regardless of what the JDK supports. The MP-side workaround of shipping `ssl.ciphersuites` with explicit TLSv1.3 cipher names is brittle against JDK drift: if the running JDK does not recognise a cipher name (e.g. TLS_CHACHA20_POLY1305_SHA256 on OpenJDK 11.0.8, which only added it in 11.0.11), SSLEngine throws `IllegalArgumentException: Unsupported CipherSuite` at engine setup and every handshake fails. With this backport, neither X509Util nor any downstream config has to carry a cipher list at all. The JDK is the single source of truth for both protocol selection and cipher selection, so JDK upgrades automatically pull in new ciphers and protocols without an MP change. Per-deployment overrides via `ssl.enabledProtocols` / `ssl.ciphersuites` continue to work exactly as before. Changes ======= zookeeper-server/.../X509Util.java - Add TLS_1_1 / TLS_1_2 / TLS_1_3 string constants. - Replace `DEFAULT_PROTOCOL = "TLSv1.2"` with `defaultTlsProtocol()`, which returns TLSv1.3 when the JDK supports it, TLSv1.2 otherwise. - Delete getGCMCiphers / getCBCCiphers / concatArrays / DEFAULT_CIPHERS_JAVA8 / DEFAULT_CIPHERS_JAVA9 / getDefaultCipherSuites / getDefaultCipherSuitesForJavaVersion. - Drop the now-irrelevant "Default cipher suites" class doc-comment and the unused `Objects` import. zookeeper-server/.../SSLContextAndOptions.java - getCipherSuites: return null when `ssl.ciphersuites` is unset instead of falling back to X509Util.getDefaultCipherSuites(). - Constructor: null-guard cipherSuitesAsList (Collections.asList of null would NPE; configureSslParameters and createNettyJdkSslContext both already accept null = "use JDK defaults"). - getEnabledProtocols: return sslContext.getDefaultSSLParameters() .getProtocols() (the JDK's curated default list for the chosen protocol) instead of new String[]{sslContext.getProtocol()}. zookeeper-server/.../X509UtilTest.java - Augment testCreateSSLContextWithoutCustomProtocol to assert TLSv1.3 is selected on JDKs that support it (and the enabled protocol list contains both TLSv1.2 and TLSv1.3); otherwise assert TLSv1.2-only. - Switch "TLSv1.1" string literal to X509Util.TLS_1_1 constant. - Pin testClientRenegotiationFails to TLSv1.2 (renegotiation is not a TLSv1.3 feature, so this test must explicitly force 1.2 to remain meaningful once 1.3 becomes the default). - Delete the six obsolete testGetDefaultCipherSuites* tests that exercised the now-deleted Java-version-aware cipher selector. zookeeper-docs/.../zookeeperAdmin.md - Document the new defaults: TLSv1.3 when JDK supports it; JDK default protocol/cipher list when properties are unset. Local verification ================== X509UtilTest: 304/304 pass, including the new TLSv1.3 assertions. QuorumSSLTest: 12/13 pass; testOCSP failure is pre-existing on branch-3.6 (verified against unmodified HEAD) and unrelated to TLS protocol/cipher logic. Full zookeeper-server module compiles (384 source files, BUILD SUCCESS).
testProtocolVersion still used the raw "TLSv1.2" / "TLSv1.1" strings even after the ZOOKEEPER-4415 backport introduced the TLS_1_1 / TLS_1_2 constants on X509Util. Switch the two System.setProperty calls to use the constants for consistency with the new convention. No behaviour change. Verified by running testProtocolVersion locally (1/1 pass).
Add a focused regression guard for the null-handling chain introduced by the ZOOKEEPER-4912 backport. testCreateSSLContextWithoutCustomProtocol exercises the unset-ssl.ciphersuites path implicitly, but does not pin the assertion to that condition — a future regression that reintroduces a hardcoded default cipher list would still let it pass. testCreateSSLContextWithoutCipherSuites explicitly clears the property, rebuilds the X509Util, and asserts SSLContextAndOptions is constructed without NPE. Pre-fix, the constructor would fail on Collections.unmodifiableList(Arrays.asList(null)) before returning.
laxman-ch
left a comment
There was a problem hiding this comment.
- Test it locally and provide a report with combinations (JDK version, Client version, Client side protocols version).
- Do we need to backport any more related changes (bug fixes, etc) after this initial set of commits from 3.9 branch?
|
1. Local test report with combinations — published as a Google Doc: TLSv1.3 Enablement — Test Plan. The Local testing and Local performance testing sections at the end cover:
Bottom line: every JDK × protocol combination succeeds end-to-end. No |
|
@laxman-ch — triage done. Scanned all commits on Nothing critical to backport. No bug fix on What's left on
|
adityaagg09
left a comment
There was a problem hiding this comment.
Reviewed the backport — overall LGTM. Clean, well-documented change and the substantive TLS logic is correct. The one genuinely risky part — the null-cipher chain from ZOOKEEPER-4912 — checks out: getCipherSuites() returning null is correctly handled by both consumers (configureSslParameters guards with if (cipherSuites != null), and createNettyJdkSslContext hands cipherSuitesAsList to Netty's JdkSslContext, which reads null as "use JDK defaults"). The constructor null-guard is therefore necessary and right.
Left a batch of inline comments — all minor / non-blocking. Summary:
getEnabledProtocolsnow widens the enabled set to JDK defaults when onlyssl.protocolis set (doc note worth adding).testCreateSSLContextWithoutCipherSuiteshardcodesClientX509Util, so the parametrizedQuorumX509Utilrun never exercises a distinct path.- A couple of small nits in
defaultTlsProtocol().
None of these block the merge.
…note Address two non-blocking review comments from @adityaagg09 on PR linkedin#143. * X509UtilTest.testCreateSSLContextWithoutCipherSuites: loop the regression guard over both ClientX509Util (zookeeper.ssl.* prefix) and QuorumX509Util (zookeeper.ssl.quorum.* prefix) so a future change to either subclass's config prefix or defaults path is caught. The null-handling lives in the shared SSLContextAndOptions, but exercising both subclasses removes the prior duplication where all eight parametrized invocations ran the identical ClientX509Util path. Verified locally: X509UtilTest 312/312 pass on JDK 17. * SSLContextAndOptions.getEnabledProtocols and zookeeperAdmin.md: document that setting only ssl.protocol selects the SSLContext protocol but does not by itself restrict the enabled-protocol set (the JDK's default enabled list for the chosen context applies). Operators who want strict single-protocol pinning must set ssl.enabledProtocols explicitly. The behavior matches upstream ZOOKEEPER-4415; this just makes the implication explicit in the operator-facing docs and in the source comment so future readers understand the widening. No functional change.
|
Thanks @adityaagg09 for the careful review. Pushed
Skipped the other nits as you suggested ( |
Summary
Combined backport of two upstream Apache ZooKeeper fixes to
branch-3.6:66771be4d, released in 3.9.2): server-side TLSv1.3 support. Picks TLSv1.3 as the default SSLContext protocol when the running JDK supports it, and returns the JDK's default enabled protocols (TLSv1.3 + TLSv1.2 on JDK 11+; TLSv1.2-only on older JDKs) whenssl.enabledProtocolsis unset.2aaeff840, not yet in any 3.9.x release): removes the hardcoded TLSv1.2-only default cipher list.SSLContextAndOptions.getCipherSuites()now returnsnullwhenssl.ciphersuitesis unset, letting the JDK supply its own default cipher list at SSLEngine construction time.These two changes were a clean hand-craft against
branch-3.6rather than cherry-picks, because the upstream commits live on top of the JUnit 5 migration thatbranch-3.6does not carry — cherry-picking pulled in unrelated test-framework refactoring noise. The substantive TLS logic is unchanged from upstream.Motivation
With
branch-3.6today, the server installs a fixed TLSv1.2-only cipher set on every SSLEngine even when the JDK supports more recent ciphers and protocols. Two problems block TLSv1.3:A downstream workaround of shipping
ssl.ciphersuiteswith explicit TLSv1.3 cipher names is brittle against JDK drift: if the running JDK does not recognise a cipher name (e.g.TLS_CHACHA20_POLY1305_SHA256on OpenJDK 11.0.8, which only added it in 11.0.11),SSLEngine.setEnabledCipherSuites()throwsIllegalArgumentException: Unsupported CipherSuiteat engine setup and every TLS handshake fails. We hit exactly this in an internal deployment.After this backport, neither X509Util nor any downstream config has to carry a cipher list at all. The JDK is the single source of truth for both protocol selection and cipher selection, so JDK upgrades automatically pull in new ciphers and protocols without any code change. Per-deployment overrides via
ssl.enabledProtocols/ssl.ciphersuitescontinue to work exactly as before.Changes
X509Util.javaTLS_1_1/TLS_1_2/TLS_1_3string constants.DEFAULT_PROTOCOL = "TLSv1.2"withdefaultTlsProtocol(), which returns TLSv1.3 when the JDK supports it, TLSv1.2 otherwise.getGCMCiphers/getCBCCiphers/concatArrays/DEFAULT_CIPHERS_JAVA8/DEFAULT_CIPHERS_JAVA9/getDefaultCipherSuites/getDefaultCipherSuitesForJavaVersion.Objectsimport.SSLContextAndOptions.javagetCipherSuites(): returnnullwhenssl.ciphersuitesis unset, instead of falling back toX509Util.getDefaultCipherSuites().cipherSuitesAsList(Collections.unmodifiableList(Arrays.asList(null))would NPE; bothconfigureSslParametersandcreateNettyJdkSslContextalready acceptnull= "use JDK defaults").getEnabledProtocols(): returnsslContext.getDefaultSSLParameters().getProtocols()(the JDK's curated default list for the chosen protocol) instead ofnew String[]{sslContext.getProtocol()}.X509UtilTest.javatestCreateSSLContextWithoutCustomProtocolto assert TLSv1.3 is selected on JDKs that support it (and the enabled-protocol list contains both TLSv1.2 and TLSv1.3); otherwise assert TLSv1.2-only. Mirrors the upstream ZOOKEEPER-4415 test additions, in JUnit 4 style."TLSv1.1"string literal intestCreateSSLContextWithCustomProtocolto the newX509Util.TLS_1_1constant.testClientRenegotiationFailsto TLSv1.2 — renegotiation is not a TLSv1.3 feature, so this test must explicitly force 1.2 to stay meaningful once 1.3 becomes the default.testGetDefaultCipherSuites*tests that exercised the now-deleted Java-version-aware cipher selector.testCreateSSLContextWithoutCipherSuitesas a focused regression guard for the null-handling chain — clearsssl.ciphersuites, rebuilds the X509Util, and asserts theSSLContextAndOptionsconstructor returns without NPE. Pre-fix,Collections.unmodifiableList(Arrays.asList(null))would throw before the SSLContext was produced.QuorumSSLTest.javatestProtocolVersion: adopt the newX509Util.TLS_1_1/X509Util.TLS_1_2constants in place of raw"TLSv1.x"literals, for consistency with the new convention introduced in this PR. No behaviour change.zookeeperAdmin.mdCompatibility notes
X509Util.DEFAULT_PROTOCOLis no longer a compile-time constant. It is still declaredpublic static final Stringbut its value is computed at class-load time viaSSLContext.getDefault().getSupportedSSLParameters(). Callers that were compiled against an earlierbranch-3.6with the literal"TLSv1.2"inlined byjavacwill keep seeing"TLSv1.2"until recompiled against the new JAR. For this fork, all consumers ship together as part of the same release, so this is a non-issue — flagged for anyone backporting further.X509Util.getDefaultCipherSuites()andgetDefaultCipherSuitesForJavaVersion()are removed. Both were package-private (static, notpublic static). The only intra-repo caller wasSSLContextAndOptions.getCipherSuites(), which is updated in the same commit. Agrep -rn 'getDefaultCipherSuites' zookeeper-server/src/mainconfirms no other callers. Anyone calling these via reflection from outside this repo will break; we are not aware of such callers in the LinkedIn ZK ecosystem.ssl.enabledProtocolsandssl.ciphersuitescontinue to honour user-supplied values exactly as before. Settingssl.enabledProtocols=TLSv1.2on a specific ensemble pins that ensemble to TLSv1.2 even on a JDK that supports TLSv1.3.Testing done
Verified locally with Maven 3.6.3 / OpenJDK 17.0.5:
X509UtilTestQuorumSSLTesttestOCSPfails — confirmed pre-existing on unmodifiedbranch-3.6HEAD (same failure, same elapsed time), unrelated to TLS protocol/cipher logicX509SpiffeAuthIntegrationTestNoSuchMethodError— pre-existing test-classpath issue, unrelatedzookeeper-servermodule compileThe
ClientX509Util.javaparts of upstream ZOOKEEPER-4912 are not included becausebranch-3.6'sClientX509Utilis a 39-line stub that does not own any of the NettySslContextBuildercalls that the upstream commit modifies — there is nothing inbranch-3.6to update there.Diff stat