Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-9320: Enable TLSv1.3 by default (KIP-573) #8695

Merged
merged 35 commits into from
Jun 2, 2020

Conversation

nizhikov
Copy link
Collaborator

@nizhikov nizhikov commented May 19, 2020

  1. Enables TLSv1.3 by default with Java 11 or newer.
  2. Add unit tests that cover the various TLSv1.2 and TLSv1.3 combinations.
  3. Extend benchmark_test.py and replication_test.py to run with 'TLSv1.2'
    or 'TLSv1.3'.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@nizhikov
Copy link
Collaborator Author

nizhikov commented May 20, 2020

@ijuma Looks like tests are OK.
Please, take a look at this preliminary PR.

String cipherSuite = "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384";

sslServerConfigs.put(SslConfigs.SSL_PROTOCOL_CONFIG, "TLSv1.2");
sslServerConfigs.put(SslConfigs.SSL_ENABLED_PROTOCOLS_CONFIG, Arrays.asList(SslConfigs.DEFAULT_SSL_ENABLED_PROTOCOLS.split(",")));
Copy link
Contributor

@ijuma ijuma May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you want to leave this as the default and see if it works correctly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello. Sorry, I don't understand your concern :)

  1. DEFAULT_SSL_ENABLED_PROTOCOLS = TLSv1.2,TLSv1.3 for java11+
  2. DEFAULT_SSL_ENABLED_PROTOCOLS = TLSv1.2 for others jdk.

This property modified inside this test so I forcefully set it as default value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, you are forcefully setting it to the default. Makes sense. OK, so this test shows that we can negotiate successfully even though we have no cipher suites that work with TLS 1.3. Can we also test that if the client sets TLS 1.3, it will fail?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests added.

@ijuma
Copy link
Contributor

ijuma commented May 20, 2020

One question: any downside to setting ssl.protocol=TLSv1.3 instead of ssl.protocol=TLSv1.2?

@nizhikov
Copy link
Collaborator Author

@ijuma I can't see downside in forcing usage of the latest TLS version.
Added this change to PR.

@nizhikov
Copy link
Collaborator Author

retest this, please

@ijuma
Copy link
Contributor

ijuma commented May 21, 2020

@nizhikov Thanks. Can you update the KIP and start the voting on it?

@nizhikov
Copy link
Collaborator Author

@ijuma KIP-573 updated.
Actually, I'm started one (link), but didn't get any votes :)
Should I start another?

@ijuma
Copy link
Contributor

ijuma commented May 21, 2020

Since the vote passed, can we flesh out the PR to include more tests that exercise TLS 1.3? A few things to think about:

  1. Unit tests like the ones included in the PR currently. Can we go through the various possible combinations of client and server configuration and check that they all work or fail in the way we expect.

  2. Make sure the integration tests use the same TLS configuration we use by default (if they don't already). Since Java 8 sticks to TLS 1.2 for now, we will get coverage of the old and new approach this way.

  3. Adjust system tests to use TLS 1.3 by default, but also include variants where client uses TLS 1.2 and broker uses 1.3, the reverse and finally where TLS 1.2 is used for both.

@ijuma
Copy link
Contributor

ijuma commented May 25, 2020

A few failures seem related to the changes in this PR:

kafka.network.SocketServerTest.testConnectionIdReuse
kafka.network.SocketServerTest.remoteCloseWithBufferedReceivesFailedSend
kafka.network.SocketServerTest.remoteCloseSendFailure
kafka.network.SocketServerTest.remoteCloseWithoutBufferedReceives
kafka.network.SocketServerTest.remoteCloseWithCompleteAndIncompleteBufferedReceives
kafka.network.SocketServerTest.remoteCloseWithIncompleteBufferedReceive
kafka.network.SocketServerTest.closingChannelWithBufferedReceives
kafka.network.SocketServerTest.closingChannelSendFailure
kafka.network.SocketServerTest.idleExpiryWithBufferedReceives
kafka.network.SocketServerTest.closingChannelWithBufferedReceivesFailedSend
kafka.network.SocketServerTest.remoteCloseWithBufferedReceives
kafka.network.SocketServerTest.closingChannelWithCompleteAndIncompleteBufferedReceives

@nizhikov
Copy link
Collaborator Author

retest this, please

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates. We're closer. :) A few comments below.

@nizhikov nizhikov changed the title KAFKA-9320: KIP-573 - Enable TLSv1.3 by default [WIP] KAFKA-9320: KIP-573 - Enable TLSv1.3 by default May 28, 2020
@nizhikov
Copy link
Collaborator Author

nizhikov commented Jun 2, 2020

@ijuma I found explanation of the test behavior.

Full information can be found in the guide Please, navigate to the "Send ClientHello Message". You may want to take a look at the "client version" and "supported_versions (43)" fields.

The root of the "strange" behavior is the structure of the SSL ClientHello message(quote from tutorial):

Client version: For TLS 1.3, this has a fixed value, TLSv1.2; TLS 1.3 uses the extension supported_versions and not this field to negotiate protocol version
...
supported_versions: Lists which versions of TLS the client supports. In particular, if the client
requests TLS 1.3, then the client version field has the value TLSv1.2 and this extension
contains the value TLSv1.3; if the client requests TLS 1.2, then the client version field has the
value TLSv1.2 and this extension either doesn’t exist or contains the value TLSv1.2 but not the value TLSv1.3.

This means we can't connect with the following configuration:
client:

ssl.protocol=TLSv1.2 #this will be used for ClientHello
ssl.enabled.protocols=TLSv1.2,TLSv1.3 #TLS v1.3 will be ignored in ClientHello message.

Server:

ssl.protocol=TLSv1.3
ssl.enabled.protocols=TLSv1.3 # Accept only TLSv1.3 

You can see all details of the SSL connection process in the javax.net log.
It can be enabled like the following:

    public SslVersionsTransportLayerTest(List<String> serverProtocols, List<String> clientProtocols) {
        System.setProperty("javax.net.debug", "ssl:handshake"); //This will turn on the log from jdk SSL system classes.
        this.serverProtocols = serverProtocols;
        this.clientProtocols = clientProtocols;
    }

So correct check should be:

    private boolean isCompatible(List<String> serverProtocols, List<String> clientProtocols) {
        return serverProtocols.contains(clientProtocols.get(0)) ||
            (clientProtocols.get(0).equals("TLSv1.3") && clientProtocols.contains("TLSv1.2"));
    }

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for figuring this out. Very helpful. Can we update SSL_PROTOCOL_DOC to include what we've learned here? It's not particularly intuitive. :) Aside from that and a couple of nits below, I think we're good to merge.

@ijuma
Copy link
Contributor

ijuma commented Jun 2, 2020

Oh, one more thing, let's please add an entry to upgrade.html.

@ijuma
Copy link
Contributor

ijuma commented Jun 2, 2020

retest this please

@nizhikov
Copy link
Collaborator Author

nizhikov commented Jun 2, 2020

I think, currently, the trunk is broken with the c6633a1
I prepared oneliner fix for it - #8779

+ "Allowed values in recent JVMs are TLSv1.2 and TLSv1.3. TLS, TLSv1.1, SSL, SSLv2 and SSLv3 "
+ "may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities.";
+ "may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities."
+ "Please, note, TLSv1.2 clients can't connect to the servers with TLSv1.3 only even if ssl.enabled.protocols contains TLSv1.3";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

"The SSL protocol used to generate the SSLContext. "
            + "The default is TLSv1.3 when running with Java 11 or newer, TLSv1.2 otherwise. "
            + "This value should be fine for most use cases. "
            + "Allowed values in recent JVMs are TLSv1.2 and TLSv1.3. TLS, TLSv1.1, SSL, SSLv2 and SSLv3 "
            + "may be supported in older JVMs, but their usage is discouraged due to known security vulnerabilities. ";
            + "With the default value for this config and ssl.enabled.protocols, clients will downgrade to TLSv1.2 if "
            + "the server does not support TLSv1.3. If this config is set to TLSv1.2, clients will not use TLSv1.3 even "
            + "if it is one of the values in ssl.enabled.protocols and the server only supports TLSv1.3."

@@ -64,7 +66,17 @@

public static final String SSL_ENABLED_PROTOCOLS_CONFIG = "ssl.enabled.protocols";
public static final String SSL_ENABLED_PROTOCOLS_DOC = "The list of protocols enabled for SSL connections.";
Copy link
Contributor

@ijuma ijuma Jun 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

The list of protocols enabled for SSL connections. The default is 'TLSv1.2,TLSv1.3' when running with Java 11 or newer, 'TLSv1.2' otherwise. With the default value for Java 11, clients and servers will prefer TLSv1.3 if both support it and fallback to TLSv1.2 otherwise (assuming both support at least TLSv1.2). This default should be fine for most cases. Also see the `ssl.protocol` config documentation.

@@ -18,6 +18,10 @@
<script><!--#include virtual="js/templateData.js" --></script>

<script id="upgrade-template" type="text/x-handlebars-template">
<h5><a id="upgrade_270_notable" href="#upgrade_270_notable">Notable changes in 2.7.0</a></h5>
<ul>
<li>TLSv1.3 have been enabled by default for JDK11+. See <a href="https://cwiki.apache.org/confluence/display/KAFKA/KIP-573%3A+Enable+TLSv1.3+by+default">KIP-573</a> for full details.</li>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more nit: "TLSv1.3 has been enabled by default for Java 11 or newer. The client and server will negotiate TLSv1.3 if both support it and fallback to TLSv1.2 otherwise. See...

@ijuma
Copy link
Contributor

ijuma commented Jun 2, 2020

@nizhikov I think we're good to merge this after the non code suggestions above are addressed (assuming we can get a Jenkins build, I merged your other PR fixing the build issue).

Copy link
Contributor

@ijuma ijuma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@nizhikov
Copy link
Collaborator Author

nizhikov commented Jun 2, 2020

@ijuma Thanks for your time! Appreciate it.
Try my best to waste fewer rounds of review next time :)

@ijuma
Copy link
Contributor

ijuma commented Jun 2, 2020

@nizhikov These changes are pretty tricky to validate. Thanks for being thorough on the tests. :)

@ijuma
Copy link
Contributor

ijuma commented Jun 2, 2020

The failures are unrelated:

org.apache.kafka.clients.consumer.CooperativeStickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges
org.apache.kafka.clients.consumer.CooperativeStickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges
org.apache.kafka.clients.consumer.StickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges
org.apache.kafka.clients.consumer.StickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges

Also failing in master:

org.apache.kafka.clients.consumer.CooperativeStickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges
org.apache.kafka.clients.consumer.StickyAssignorTest.testReassignmentWithRandomSubscriptionsAndChanges

@ijuma ijuma merged commit 8b22b81 into apache:trunk Jun 2, 2020
ijuma pushed a commit that referenced this pull request Jun 2, 2020
1. Enables `TLSv1.3` by default with Java 11 or newer.
2. Add unit tests that cover the various TLSv1.2 and TLSv1.3 combinations.
3. Extend `benchmark_test.py` and `replication_test.py` to run with 'TLSv1.2'
or 'TLSv1.3'.

Reviewers: Ismael Juma <ismael@juma.me.uk>
ijuma added a commit to confluentinc/kafka that referenced this pull request Jun 3, 2020
* apache-github/2.6: (32 commits)
  KAFKA-10083: fix failed testReassignmentWithRandomSubscriptionsAndChanges tests (apache#8786)
  KAFKA-9945: TopicCommand should support --if-exists and --if-not-exists when --bootstrap-server is used (apache#8737)
  KAFKA-9320: Enable TLSv1.3 by default (KIP-573) (apache#8695)
  KAFKA-10082: Fix the failed testMultiConsumerStickyAssignment (apache#8777)
  MINOR: Remove unused variable to fix spotBugs failure (apache#8779)
  MINOR: ChangelogReader should poll for duration 0 for standby restore (apache#8773)
  KAFKA-10030: Allow fetching a key from a single partition (apache#8706)
  Kafka-10064 Add documentation for KIP-571 (apache#8760)
  MINOR: Code cleanup and assertion message fixes in Connect integration tests (apache#8750)
  KAFKA-9987: optimize sticky assignment algorithm for same-subscription case (apache#8668)
  KAFKA-9392; Clarify deleteAcls javadoc and add test for create/delete timing (apache#7956)
  KAFKA-10074: Improve performance of `matchingAcls` (apache#8769)
  KAFKA-9494; Include additional metadata information in DescribeConfig response (KIP-569) (apache#8723)
  KAFKA-10056; Ensure consumer metadata contains new topics on subscription change (apache#8739)
  KAFKA-10029; Don't update completedReceives when channels are closed to avoid ConcurrentModificationException (apache#8705)
  KAFKA-10061; Fix flaky `ReassignPartitionsIntegrationTest.testCancellation` (apache#8749)
  KAFKA-9130; KIP-518 Allow listing consumer groups per state (apache#8238)
  KAFKA-9501: convert between active and standby without closing stores (apache#8248)
  MINOR: Relax Percentiles test (apache#8748)
  MINOR: regression test for task assignor config (apache#8743)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants