Copy data from multiple TLS records to userspace buffer #53

Closed
lancerchao opened this issue Jun 20, 2016 · 9 comments

@lancerchao
Contributor

If the server sends "abcde", then sends "12345", a client calling recv() with a read size of 10 would get "abcde" instead of "abcde12345".

@fridex fridex changed the title recv stream yields incorrect data Copy data from multiple TLS records to userspace buffer Jun 21, 2016
@nmav
Member

nmav commented Jun 21, 2016

I'm not sure whether that is an issue. If the data are sent in different TLS records, then data would be received on the TLS record boundary. I believe that's very similar to the TCP handling.

@fridex
Member

fridex commented Jun 21, 2016

From my POV, this is definitely not an issue. I'm not sure this enhancement is worth it, since it can be complicated to implement if we want to keep consistency with the bound socket.

Consider the following scenario with two TLS records:

TLS record data size:
Record1: 1000B
Record2: 1000B

  1. A user supplies a buffer of size S, where 1000B < S < 2000B:
    In this case we shouldn't cache partial data in the AF_KTLS socket, since the second record was not fully read (same as when a user supplies a buffer of size < 1000B: ENOMEM is returned and no partial data). We should simply propagate only Record1's 1000B. If we cached decrypted data, switching from AF_KTLS to OpenSSL/GnuTLS would be inconsistent.
  2. A user supplies a buffer of size S, where S > 2000B:
    We could fill the supplied buffer with the payload of Record1 and Record2 and pop those TLS records from the TCP socket. Personally, I don't see any benefit in doing so. Are there any benefits I am not aware of?

@lancerchao
Contributor Author

I'm not sure whether that is an issue. If the data are sent in different TLS records, then data would be received on the TLS record boundary. I believe that's very similar to the TCP handling.

I don't quite follow. Can you explain a little more?

A user supplies a buffer of size S, where 1000B < S < 2000B:
In this case we shouldn't cache partial data in the AF_KTLS socket, since the second record was not fully read (same as when a user supplies a buffer of size < 1000B: ENOMEM is returned and no partial data). We should simply propagate only Record1's 1000B. If we cached decrypted data, switching from AF_KTLS to OpenSSL/GnuTLS would be inconsistent.

Do you mean that we will not support partial reads? In such a case, how would a client know how big the supplied buffer needs to be, without knowing the size of the TLS record that it's about to receive?

My impression is that if the client opened the original socket with SOCK_STREAM, then it should expect that af_ktls socket operations bound to that socket could work exactly as a stream socket is expected to behave. In the case of a TCP socket, it should be able to read and write data of arbitrary length without worrying about record boundaries, record sizes, etc.

@fridex
Member

fridex commented Jun 21, 2016

Do you mean that we will not support partial reads? In such a case, how would a client know how big the supplied buffer needs to be, without knowing the size of the TLS record that it's about to receive?

Actually, you do know it: it will not exceed KTLS_MAX_PAYLOAD_SIZE.

My impression is that if the client opened the original socket with SOCK_STREAM, then it should expect that af_ktls socket operations bound to that socket could work exactly as a stream socket is expected to behave. In the case of a TCP socket, it should be able to read and write data of arbitrary length without worrying about record boundaries, record sizes, etc.

TCP can be fragmented; even there, it is not guaranteed that if you request N bytes, N bytes are actually received in the RX buffer and available.

I understand your approach. The issue is with user space sync. I can imagine a getsockopt(2) option that would propagate the cache size to user space so user space can pick up the data. The original approach was to pop records from the receive queue only once we know they were fully read by user space.

@djwatson
Member

If the data are sent in different TLS records, then data would be received on the TLS record boundary. I believe that's very similar to the TCP handling.

I don't think this is true? read() will happily read across TCP message boundaries.

TCP can be fragmented; even there, it is not guaranteed that if you request N bytes, N bytes are actually received in the RX buffer and available.

Correct - a SOCK_STREAM read() where insufficient bytes are available will usually block waiting for them, unlike a SOCK_DGRAM.

Not supporting small partial reads means this will never be a drop-in replacement for existing code, and anyone using af_ktls would need to reimplement some sort of buffering mechanism on top of the socket - the application may be layering its own message framing on top of TCP, for example:

int size;
read(fd, &size, 4);   // or SSL_read
char buf[size];
read(fd, buf, size);  // or SSL_read

Note that OpenSSL (and I presume GnuTLS) does this buffering for you, so this would even be a change with respect to existing TLS implementations:

SSL_read() works based on the SSL/TLS records. The data are received in records (with a maximum record size of 16kB for SSLv3/TLSv1). Only when a record has been completely received can it be processed (decryption and check of integrity). Therefore data that was not retrieved at the last call of SSL_read() can still be buffered inside the SSL layer and will be retrieved on the next call to SSL_read().

@djwatson
Member

Yeah, it looks like gnutls_record_recv is roughly the same and buffers data - so yeah, I think we do need this complexity. For userspace sync we'd need the equivalent of:

Function: size_t gnutls_record_check_pending (gnutls_session_t session)
session: is a gnutls_session_t type.

This function checks if there are unread data in the gnutls buffers. If the return value is non-zero the next call to gnutls_record_recv() is guaranteed not to block.

Returns: Returns the size of the data or zero.

@djwatson djwatson mentioned this issue Jun 23, 2016
@djwatson
Member

I think this was fixed with the buffer changes?

@fridex
Member

fridex commented Aug 18, 2016

I'm not sure about this (but I believe so); Lance can give a proper status of it (btw, this is correct for DTLS).

@lancerchao
Contributor Author

Yes, this was fixed.
