
KAFKA-4493: Throw InvalidTransportLayerException when a plaintext client receives a TLS response #5940

Open · wants to merge 2 commits into trunk
Conversation

@stanislavkozlovski (Contributor) commented Nov 22, 2018

Currently, if a client is misconfigured to not use SSL and connects to a broker with SSL listeners, the broker responds with a TLS alert.
The client interprets the first 4 bytes of that TLS alert as the size of the response, and so tries to allocate 352518912 bytes.
This either raises an OutOfMemory exception or stalls on the allocate() call. Either way, we should fail fast with a clearer exception so the user can tell what is wrong.
I personally spent more time than I'm comfortable sharing debugging this issue and I'd like to see it improved.
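
To make the failure mode concrete, here is a small standalone illustration (not part of this patch) of why the size comes out to exactly that number: a TLS alert record starts with the content type byte 0x15 followed by the record version (0x0303 for TLS 1.2), and reading those bytes as a big-endian 4-byte size prefix yields ~352 MB.

import java.nio.ByteBuffer;

public class TlsAlertSizeDemo {
    public static void main(String[] args) {
        // First five bytes of a TLS 1.2 alert record: content type 0x15, version 0x0303, length 0x0002
        ByteBuffer tlsAlertStart = ByteBuffer.wrap(new byte[]{0x15, 0x03, 0x03, 0x00, 0x02});
        int misreadSize = tlsAlertStart.getInt(); // the first 4 bytes, read as a big-endian size prefix
        System.out.println(misreadSize);          // prints 352518912 (0x15030300), i.e. ~352 MB
    }
}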

Implementation

On the first read of a PlaintextTransportLayer channel, we check against the pre-defined TLS alert header, which consists of the alert byte 0x15 followed by a TLS version of 0x0301, 0x0302, or 0x0303.

I believe it should be alright to close such connections, since we would never expect to receive a response this large (~352 MB) as the first response on a newly-established connection; we should at least negotiate API_VERSIONS first.
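
A minimal sketch of that first-read check, under the assumption that the helper below is called once from PlaintextTransportLayer.read() before the first read completes (names are illustrative, not the exact code in this PR):

import java.nio.ByteBuffer;

final class TlsAlertDetector {
    private static final byte TLS_ALERT_CONTENT_TYPE = 0x15;

    // Returns true if the bytes read so far look like the start of a TLS alert record.
    // Uses absolute get(index), so the buffer's position is not disturbed.
    static boolean looksLikeTlsAlert(ByteBuffer dst) {
        if (dst.position() < 3)
            return false; // not enough bytes yet to decide
        int recordVersion = ((dst.get(1) & 0xFF) << 8) | (dst.get(2) & 0xFF);
        return dst.get(0) == TLS_ALERT_CONTENT_TYPE
                && (recordVersion == 0x0301 || recordVersion == 0x0302 || recordVersion == 0x0303);
    }
}

On the first read, the channel could call looksLikeTlsAlert(dst) and, if it returns true, throw InvalidTransportLayerException instead of handing the bogus size to the allocator.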

I'm not entirely certain whether this warrants creation of a KIP since we are raising a new exception - InvalidTransportLayerException. That is not publicly exposed, right?

cc @rajinisivaram for review

@ijuma (Contributor) commented Nov 22, 2018

It would be better if the broker did the right thing here. Have we considered that? Given that client upgrades take forever, any fix on the client side is less effective in the short/medium term.

@stanislavkozlovski (Contributor, Author) commented Nov 23, 2018

I had not considered that. Thanks for the suggestion @ijuma.

After some digging around, I found that the first place where the server detects that something is wrong is a raised SSLException. As the docstring of maybeProcessHandshakeFailure(SSLException sslException, boolean flush, IOException ioException) says:

// SSL handshake failures are typically thrown as SSLHandshakeException, SSLProtocolException,
// SSLPeerUnverifiedException or SSLKeyException if the cause is known. These exceptions indicate
// authentication failures (e.g. configuration errors) which should not be retried. But the SSL engine
// may also throw exceptions using the base class SSLException in a few cases:
// a) If there are no matching ciphers or TLS version or the private key is invalid, client will be
// unable to process the server message and an SSLException is thrown:
// javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
// b) If server closes the connection gracefully during handshake, client may receive close_notify
// and an SSLException is thrown:
// javax.net.ssl.SSLException: Received close_notify during handshake
// We want to handle a) as a non-retriable SslAuthenticationException and b) as a retriable IOException.
// To do this we need to rely on the exception string. Since it is safer to throw a retriable exception
// when we are not sure, we will treat only the first exception string as a handshake exception.

javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection? is exactly what gets raised here. Since we do not know the root cause of the issue (and there doesn't seem to be a way to find out), we flush whatever bytes we have down the wire to the client. These bytes happen to represent a TLS alert message (any one of https://www.gnutls.org/manual/html_node/The-TLS-Alert-Protocol.html); I could not get Java to print out the whole bytes and decode what type of alert the SSLEngine produced.

It seems like we want to preserve the behavior of flushing the bytes to the client. For example, if the client's private key is invalid, we would want to send the alert message back.
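
For readers following along, a hedged, simplified illustration of the string-based classification that docstring describes (the real logic lives in SslTransportLayer and is more involved; the class and helper names below are made up):

import javax.net.ssl.SSLException;
import javax.net.ssl.SSLHandshakeException;
import javax.net.ssl.SSLKeyException;
import javax.net.ssl.SSLPeerUnverifiedException;
import javax.net.ssl.SSLProtocolException;

final class HandshakeFailureClassifier {
    // True when the exception should be treated as a non-retriable authentication failure,
    // false when it should be surfaced as a retriable IOException (e.g. close_notify).
    static boolean isHandshakeFailure(SSLException e) {
        if (e instanceof SSLHandshakeException || e instanceof SSLProtocolException
                || e instanceof SSLPeerUnverifiedException || e instanceof SSLKeyException)
            return true; // cause is known: configuration/authentication error, do not retry
        String message = e.getMessage();
        // case (a) from the docstring: plaintext bytes arriving where SSL was expected
        return message != null && message.contains("Unrecognized SSL message");
    }
}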

@ijuma (Contributor) commented Dec 8, 2018

Thanks for the investigation. My concern is that this is a bit of a layering violation. The plaintext code now has to know about TLS. What about other protocols that can cause the same issue? It would be better if the plaintext code could validate based on what it knows about its own protocol.

@stanislavkozlovski (Contributor, Author) commented

I agree. What do you think would be a better approach? Maybe we could throw an exception if we receive a message size above a certain threshold. That would catch this scenario, where the plaintext layer thinks it is receiving a ~352 MB message. It sounds reasonable to me, since newly-established connections should first negotiate API_VERSIONS, and that does not need big messages.

@ijuma (Contributor) commented Dec 9, 2018

That could work, but the current code is checking all reads, right? Not just the first response.

@stanislavkozlovski (Contributor, Author) commented

No, the current code checks only the very first read, due to the completedRead flag. Since we store one TransportLayer per connection (KafkaChannel), this should only catch the very first read on each connection.

@ijuma (Contributor) commented Jan 9, 2019

One thing to keep in mind is that we don't use ApiVersionRequest for inter-broker communication.

@stanislavkozlovski (Contributor, Author) commented

@ijuma From reading the code and quickly testing locally, I believe that inter-broker communication starts with either:

  • UPDATE_METADATA request when it is from a controller to another broker, which contains all the partitions present in the cluster:
[2019-07-13 23:08:32,238] ERROR [KafkaApi-0] Handling request:RequestHeader(apiKey=UPDATE_METADATA, apiVersion=4, clientId=0, correlationId=32) -- {controller_id=0,controller_epoch=3,partition_states=[{topic=test,partition=0,controller_epoch=3,leader=2,leader_epoch=6,isr=[2],zk_version=8,replicas=[1,2],offline_replicas=[1]}],live_brokers=[{id=2,end_points=[{port=9094,host=localhost,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null},{id=0,end_points=[{port=9092,host=stanislozlovski,listener_name=PLAINTEXT,security_protocol_type=0}],rack=null}]} from connection 192.168.1.104:9092-192.168.1.104:61588-0;securityProtocol:PLAINTEXT,principal:User:ANONYMOUS (kafka.server.KafkaApis)
  • OFFSET_FOR_LEADER_EPOCH request when it is from a follower to a leader, which contains all the partitions the follower is interested in:
[2019-07-13 23:07:35,463] ERROR [KafkaApi-0] Handling request:RequestHeader(apiKey=OFFSET_FOR_LEADER_EPOCH, apiVersion=2, clientId=broker-2-fetcher-0, correlationId=161) -- {topics=[{topic=test,partitions=[{partition=0,current_leader_epoch=3,leader_epoch=0}]}]} from connection 192.168.1.104:9092-192.168.1.104:61715-4;securityProtocol:PLAINTEXT,principal:User:ANONYMOUS (kafka.server.KafkaApis)

It seems like the UpdateMetadataRequest is likely to go over 1 MB in size. From a very rough calculation (basically multiplying out the most frequently-variable piece of that request, UpdateMetadataPartitionState), I estimate that an UpdateMetadataRequest will go over a 1 MB threshold once the cluster reaches around 20k partitions.
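
As a sanity check on that estimate: assuming each serialized UpdateMetadataPartitionState is on the order of 50 bytes (an assumption, not a measurement), 20,000 partitions × 50 bytes ≈ 1,000,000 bytes, which is right around a 1 MB threshold.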

Perhaps we could raise the size limit of the check here? My initial message said that in such a misconfiguration the client thinks it needs to allocate ~352 MB, so a simple 100 MB check should suffice, I think.

@stanislavkozlovski (Contributor, Author) commented

cc @ijuma thoughts on this?

int readSize = socketChannel.read(dst);

if (!completedRead && ((ByteBuffer) dst.rewind()).getInt() > 1_000_000) // first responses should negotiate API_VERSION and be small in size
throw new IllegalTransportLayerStateException("Received a first response larger than 1MB (Is this a plaintext response?)");
Contributor commented on the snippet above:

For your consideration: the message could be more direct.

Is this a plaintext response? --> The broker expects SSL, is your client configured for SSL correctly? (or something of that nature)

stanislavkozlovski (Author) replied:

At this layer we don't know whether the broker expects SSL. The broker is actually configured to expect plaintext in this scenario and we don't know whether the response here is SSL. As @ijuma said earlier, that is a layering violation so I think it's better to keep the current message

Contributor replied:

@stanislavkozlovski understood. My proposed wording was aggressively making assumptions, but perhaps we could find a middle ground that still directs users to check their port? The issue with Received a first response larger than 1MB (Is this a plaintext response?) is that users won't know what to DO with that response. It would be great to reword it accurately to give some proposed action/suggestions to the user...

stanislavkozlovski (Author) replied:

Ack. Any other suggestions? Would something like (Please verify that the configuration is correct. It may be that we are not connecting to a plaintext port) work better?

Contributor commented:

Adding an extra "if" statement on every call to read() seems like a very heavy price to pay for this feature. We should look at where we're using the total length and check there, I think.

@stanislavkozlovski (Author) replied on Nov 14, 2019:

@cmccabe @ijuma Doesn't the compiler optimize the if check?

I do not know how to proceed with this change.

@astubbs's example wouldn't work for the common case where we try to allocate ~352 MB, and it would also be applied to every receive we get, whereas here we are trying to scope the check to the first receive on a specific channel.

Contributor replied:

Are you referring to my fix, or my problem? I just would like my problem solved :) I'm not concerned about my fix - I was mainly trying to demonstrate the problem.

Contributor replied:

If you wanted to minimize the cost, you could move the exception and second part of the if to a separate method so that it's not in the hot path. Btw, calling rewind on the ByteBuffer mutates it, which seems generally unsafe?
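
For illustration, one shape that suggestion could take (names are assumed; note that ByteBuffer.getInt(int index) reads at an absolute offset and does not move the buffer's position, unlike rewind()):

int readSize = socketChannel.read(dst);
if (!completedRead)
    maybeThrowOnSuspiciousFirstRead(dst); // cheap, almost-always-false branch on the hot path

// elsewhere in the same class, off the hot path:
private void maybeThrowOnSuspiciousFirstRead(ByteBuffer dst) {
    if (dst.position() >= 4 && dst.getInt(0) > 1_000_000)
        throw new IllegalTransportLayerStateException(
                "Received a first response larger than 1MB (Is this a plaintext response?)");
}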

Contributor added:

Also, I thought the previous discussion indicated that 1 MB might be too low a threshold?

Contributor commented:

I think the right time to check this would be when we originally create the NetworkReceive object, not in the hot path for every read. Also, whatever we choose as the "upper limit" for a size to be interpreted as valid needs to become the new upper limit for message.max.bytes (which in turn, means that maybe this needs a KIP, since it changes a public config)
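
A hedged sketch of that alternative, validating the size prefix once where the receive is created rather than on every read (the class and parameter names are illustrative, not the actual NetworkReceive code):

import java.nio.ByteBuffer;

final class SizePrefixValidator {
    // maxReceiveSize would be derived from (and would therefore bound) message.max.bytes, per the comment above.
    static int validatedReceiveSize(ByteBuffer sizeBuffer, int maxReceiveSize) {
        int receiveSize = sizeBuffer.getInt(0);
        if (receiveSize < 0 || receiveSize > maxReceiveSize)
            throw new IllegalStateException("Invalid receive size " + receiveSize
                    + " (max allowed " + maxReceiveSize + "); is the peer really using plaintext?");
        return receiveSize;
    }
}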

@guozhangwang (Contributor) commented:

test this please
