New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SSLEngine.unwrap seems to lose data if the source network data being unwrapped spans two/multiple TCP packets #26

Closed
jaikiran opened this Issue Oct 24, 2017 · 11 comments

Comments

Projects
None yet
3 participants
@jaikiran
Contributor

jaikiran commented Oct 24, 2017

I have started experimenting the usage of WildFly OpenSSL library as a SSL provider in context of a runtime server/project (not related to WildFly server). I have stumbled upon, what appears to be some kind of issue, in WildFly OpenSSL or OpenSSL. Given my lack of knowledge in either of these projects or the underlying code, I haven't been able to fully narrow this issue down in terms of what pieces of code contributes to it. However, here's what I think is happening based on analyzing logs and wireshark captures from different completes across multiple runs:

  • The server in question uses WildFly OpenSSL provider and internally uses SSLEngine API to wrap/unwrap data on an underlying SocketChannel using Java NIO selectors.

  • A typical "write" looks something like this which uses the SSLEngine:


	final ByteBuffer networkWriteBuffer = ...
	final ByteBuffer[] srcs = ...
    for (i = 0;  i < length; i++) {
        if (srcs[i].hasRemaining()) {
			SSLEngineResult wrapResult = sslEngine.wrap(src[i], networkWriteBuffer);
    		// deal with the result...
    		// write out to the underlying socketchannel, the "wrapped" data
    		socketChannel.write(networkWriteBuffer);
    		...
        }
        ....
    }

Please consider this a pseudo-code and assume that the necessary NIO selector semantics and the ByteBuffer usage semantics are being propely handled in the real code.

So what's essentially going on in the send side is that for each source data, the SSLEngine is used to wrap the data and write it over the socket channel.

  • A typical "read" on the other side of the SocketChannel uses the SSLEngine to unwrap the data and looks like:
        final ByteBuffer networkReadBuffer = ...
	final ByteBuffer appData = ....
	// read the data from the socket channel
	socketChannel.read(networkReadBuffer);
	// now use the SSLEngine to unwrap the data
        SSLEngineResult unwrapResult = sslEngine.unwrap(networkReadBuffer, appData);
        // deal with the result....

So the read side just keeps reading of the socket channel (whenever the relevant Selector is notified about data being available) and then unwrapping that data into a internal application specific buffer.

The issue I keep running into seems to be on the unwrapping part (or at least that's where I can "notice" it). Based on my wireshark captures and log analysis what I see is happening is that if the write side happens to (wrap and) send more than one packet of TCP data on the socket channel (I have tested consistently with two packets) and if the read side happens to read both those TCP packets in one go and pass it to sslEngine for unwrapping, the engine seems to unwrap just one packet and return back a OK result and the second packet is never returned back into the target buffer that's sent to the unwrap call. To explain a bit more concisely, consider the following sequence of events:

- Sender:
	T1: SSLEngine.wrap(dataOne) generates 33 bytes of data
	T2: write dataOne (==33 bytes) on to the socket channel
	T3: SSLEngine.wrap(dataTwo) generates 65 bytes of data
	T4: write dataTwo (==65 bytes) on to the socket channel

- Receiver:
	T5: read into incomingData from socket channel. Reads 98 bytes of data into the incomingData
	T6: SSLEngine.unwrap(incomingData of 98 bytes, dest). This ends up with the `dest` buffer being updated with 4 bytes of data and a SSLEngineResult which looks like:
		
		Status = OK HandshakeStatus = NOT_HANDSHAKING bytesConsumed = 98 bytesProduced = 4

The unwrap ends up producing wrong (lesser amount) of data in the above case. The important piece here is that this seems to happen only when the unwrap is fed with incoming data that belonged to 2 separate TCP packets that were read off the socket channel (since it was available on the socket). If it so happens that the socket read returns just one packet and that then is passed on to the unwrap API, one each at a time, then it ends up processing it correctly. In other words, the following sequence works:

- Sender:
	T1: SSLEngine.wrap(dataOne) generates 33 bytes of data
	T2: write dataOne (==33 bytes) on to the socket channel
	T3: SSLEngine.wrap(dataTwo) generates 65 bytes of data
	T4: write dataTwo (==65 bytes) on to the socket channel

- Receiver:
	T5: read into incomingData from socket channel. Reads 33 bytes of data into the incomingData
	T6: SSLEngine.unwrap(incomingData of 33 bytes, dest). This ends up with the `dest` buffer being updated with (correct) 4 bytes of data and a (correct) SSLEngineResult which looks like:
		
		Status = OK HandshakeStatus = NOT_HANDSHAKING bytesConsumed = 33 bytesProduced = 4

	T7: go back to reading off the socket channel. Reads 65 bytes of data into the incomingData
	T8: SSLEngine.unwrap(incomingData of 65 bytes, dest). This ends up with the `dest` buffer being updated with (correct) 36 bytes of data and a (correct) SSLEngineResult which looks like:

		Status = OK HandshakeStatus = NOT_HANDSHAKING bytesConsumed = 65 bytesProduced = 36

I hope this information is good enough to probably understand where the issue might be. So far, based on what I see in the server's usage of SSLEngine APIs and its usage of Java NIO selector APIs, I don't see anything obviously incorrect that it's doing. In fact, it's currently fully functional with the regular SSLEngine that's shipped in Java runtime (currently using Java 8). Having said that, if this appears to be some kind of an issue with the API usage, do let me know, I'll try and dig deeper in that part of their code. Unfortunately, I don't have an easy way to reproduce this outside of that library to narrow it down to this implementation of the SSL engine.

Environment details:

  • Operating system: Mac OS
  • Java version:
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
  • OpenSSL version (both 1.0.2 and 1.1.0): OpenSSL 1.0.2l 25 May 2017 and OpenSSL 1.1.0f 25 May 2017
  • WildFly OpenSSL version: Latest upstream (1.0.3.Final-SNAPSHOT)
@stuartwdouglas

This comment has been minimized.

Show comment
Hide comment
@stuartwdouglas

stuartwdouglas Oct 24, 2017

Contributor

I think the issue may be that we do not 100% honour the contract of SSLEngine, and can potentially buffer data in the engine. Can you test out this branch:

https://github.com/stuartwdouglas/ssl-experiments/commits/fixes

I had someone else report this, but they never responded when I asked them to test out the fix and I forgot about it (as Undertow will defensivly always unwrap till the engine returns no more data this issue should not affect undertow).

Contributor

stuartwdouglas commented Oct 24, 2017

I think the issue may be that we do not 100% honour the contract of SSLEngine, and can potentially buffer data in the engine. Can you test out this branch:

https://github.com/stuartwdouglas/ssl-experiments/commits/fixes

I had someone else report this, but they never responded when I asked them to test out the fix and I forgot about it (as Undertow will defensivly always unwrap till the engine returns no more data this issue should not affect undertow).

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 25, 2017

Contributor

Can you test out this branch:

https://github.com/stuartwdouglas/ssl-experiments/commits/fixes

I just did a very quick test building that branch from that repo. That indeed helped and I no longer see this issue. I am going to do a more thorough test in a few hours, but I think those fixes indeed gets me past this. Thanks very much Stuart.

P.S: I see that, that branch is behind on some of the commits that have happened here in this repo. If my testing goes fine, I will try and merge it on top of this repo and see if it still passes fine (right now, my attempts to merge it showed some conflicts and I was in a hurry so didn't pay much attention to the nature of the conflicts).

Contributor

jaikiran commented Oct 25, 2017

Can you test out this branch:

https://github.com/stuartwdouglas/ssl-experiments/commits/fixes

I just did a very quick test building that branch from that repo. That indeed helped and I no longer see this issue. I am going to do a more thorough test in a few hours, but I think those fixes indeed gets me past this. Thanks very much Stuart.

P.S: I see that, that branch is behind on some of the commits that have happened here in this repo. If my testing goes fine, I will try and merge it on top of this repo and see if it still passes fine (right now, my attempts to merge it showed some conflicts and I was in a hurry so didn't pay much attention to the nature of the conflicts).

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 25, 2017

Contributor

P.S: I see that, that branch is behind on some of the commits that have happened here in this repo. If my testing goes fine, I will try and merge it on top of this repo and see if it still passes fine (right now, my attempts to merge it showed some conflicts and I was in a hurry so didn't pay much attention to the nature of the conflicts).

So the merge conflict was due to my local testing related changes than a real one. So I went ahead and brought in those commits from the ssl-experiments/fixes branch and applied it to latest upstream master. It merged cleanly. Here's the branch in my repo which contains them https://github.com/jaikiran/wildfly-openssl/commits/ssl-unwrap-fixes. If necessary, I can open a PR with these changes, if it looks fine.

My testing so far has gone well and I will soon be creating some kind of initial basic performance comparison between this and the regular Java SSLEngine, which was the whole goal of this exercise.

Contributor

jaikiran commented Oct 25, 2017

P.S: I see that, that branch is behind on some of the commits that have happened here in this repo. If my testing goes fine, I will try and merge it on top of this repo and see if it still passes fine (right now, my attempts to merge it showed some conflicts and I was in a hurry so didn't pay much attention to the nature of the conflicts).

So the merge conflict was due to my local testing related changes than a real one. So I went ahead and brought in those commits from the ssl-experiments/fixes branch and applied it to latest upstream master. It merged cleanly. Here's the branch in my repo which contains them https://github.com/jaikiran/wildfly-openssl/commits/ssl-unwrap-fixes. If necessary, I can open a PR with these changes, if it looks fine.

My testing so far has gone well and I will soon be creating some kind of initial basic performance comparison between this and the regular Java SSLEngine, which was the whole goal of this exercise.

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 30, 2017

Contributor

@stuartwdouglas , my initial tests show that WildFly OpenSSL (with the fixes noted in the above comments) performs better than the SSL engine shipped in JRE. Here's a more detailed explanation of what was tested and what the numbers look like https://jaitechwriteups.blogspot.com/2017/10/kafka-with-openssl.html

Do you consider the fixes you have in that other branch, good enough to be merged to upstream? If there's any help that I can provide in testing any additional fixes that you might have in mind, in the context of this issue, do let me know.

Contributor

jaikiran commented Oct 30, 2017

@stuartwdouglas , my initial tests show that WildFly OpenSSL (with the fixes noted in the above comments) performs better than the SSL engine shipped in JRE. Here's a more detailed explanation of what was tested and what the numbers look like https://jaitechwriteups.blogspot.com/2017/10/kafka-with-openssl.html

Do you consider the fixes you have in that other branch, good enough to be merged to upstream? If there's any help that I can provide in testing any additional fixes that you might have in mind, in the context of this issue, do let me know.

@stuartwdouglas

This comment has been minimized.

Show comment
Hide comment
@stuartwdouglas

stuartwdouglas Oct 31, 2017

Contributor

I have merged them upstream

Contributor

stuartwdouglas commented Oct 31, 2017

I have merged them upstream

@n1hility

This comment has been minimized.

Show comment
Hide comment
@n1hility

n1hility Oct 31, 2017

Hi Jaikiran,

Would you mind if we aggregated your blog entry on wildfly.org?

-Jason

n1hility commented Oct 31, 2017

Hi Jaikiran,

Would you mind if we aggregated your blog entry on wildfly.org?

-Jason

@n1hility

This comment has been minimized.

Show comment
Hide comment
@n1hility

n1hility Oct 31, 2017

Oops left off the reference, sorry @jaikiran , see comment above

n1hility commented Oct 31, 2017

Oops left off the reference, sorry @jaikiran , see comment above

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 31, 2017

Contributor

@n1hility

Hi Jaikiran,

Would you mind if we aggregated your blog entry on wildfly.org?

Hi Jason,

Feel free to include it :)

Contributor

jaikiran commented Oct 31, 2017

@n1hility

Hi Jaikiran,

Would you mind if we aggregated your blog entry on wildfly.org?

Hi Jason,

Feel free to include it :)

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 31, 2017

Contributor

I have merged them upstream

Hi @stuartwdouglas, I don't see them in this upstream repo commits. Forgot to push maybe?

Contributor

jaikiran commented Oct 31, 2017

I have merged them upstream

Hi @stuartwdouglas, I don't see them in this upstream repo commits. Forgot to push maybe?

@stuartwdouglas

This comment has been minimized.

Show comment
Hide comment
@stuartwdouglas

stuartwdouglas Oct 31, 2017

Contributor
Contributor

stuartwdouglas commented Oct 31, 2017

@jaikiran

This comment has been minimized.

Show comment
Hide comment
@jaikiran

jaikiran Oct 31, 2017

Contributor

Thank you!

Closing this, now that the changes have been committed 91a15b7

Contributor

jaikiran commented Oct 31, 2017

Thank you!

Closing this, now that the changes have been committed 91a15b7

@jaikiran jaikiran closed this Oct 31, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment