Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not mandate direct bytes in SslHandler queue #9656

Merged
merged 2 commits into from Oct 12, 2019

Conversation

@tbrooks8
Copy link
Contributor

tbrooks8 commented Oct 12, 2019

Motivation:

Currently when the SslHandler coalesces outbound bytes it always
allocates a direct byte buffer. This does not make sense if the JDK
engine is being used as the bytes will have to be copied back to heap
bytes for the engine to operate on them.

Modifications:

Inspect engine type when coalescing outbound bytes and allocate heap
buffer if heap bytes are preferred by the engine.

Result:

Improved performance for JDK engine. Better performance in environments
without direct buffer pooling.

tbrooks8 added 2 commits Oct 12, 2019
Motivation:

Currently when the SslHandler coalesces outbound bytes it always
allocates a direct byte buffer. This does not make sense if the JDK
engine is being used as the bytes will have to be copied back to heap
bytes for the engine to operate on them.

Modifications:

Inspect engine type when coalescing outbound bytes and allocate heap
buffer if heap bytes are preferred by the engine.

Result:

Improved performance for JDK engine. Better performance in environments
without direct buffer pooling.
@netty-bot

This comment has been minimized.

Copy link

netty-bot commented Oct 12, 2019

Can one of the admins verify this patch?

@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 12, 2019

@netty-bot test this please

@johnou

This comment has been minimized.

Copy link
Contributor

johnou commented Oct 12, 2019

@tbrooks8 out of interest, do you have any benchmark results you can share?

@johnou
johnou approved these changes Oct 12, 2019
@tbrooks8

This comment has been minimized.

Copy link
Contributor Author

tbrooks8 commented Oct 12, 2019

@johnou unfortunately not. The reason this came to my attention was not because of an obvious performance regression, but because I was seeing direct buffers being allocated in an environment without direct buffer pooling enabled. Netty normal does its best to ensure pooled direct buffers are enabled before using them.

As I investigated more I saw this was only happening in the SslHandler. The reasoning behind my decision to always use a heap buffer for the JDK engine, is that if you pass the JDK engine a direct buffer it will internally allocate a byte array and copy the data to that for wrap. I just presume that is worse than using a heap buffer in the first place.

@normanmaurer normanmaurer self-assigned this Oct 12, 2019
@normanmaurer normanmaurer added this to the 4.1.43.Final milestone Oct 12, 2019
@normanmaurer normanmaurer merged commit ec8d43c into netty:4.1 Oct 12, 2019
3 checks passed
3 checks passed
pull request validation (centos6-java11) Build finished.
Details
pull request validation (centos6-java12) Build finished.
Details
pull request validation (centos6-java8) Build finished.
Details
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 12, 2019

@tbrooks8 great catch!

normanmaurer added a commit that referenced this pull request Oct 12, 2019
Motivation:

Currently when the SslHandler coalesces outbound bytes it always
allocates a direct byte buffer. This does not make sense if the JDK
engine is being used as the bytes will have to be copied back to heap
bytes for the engine to operate on them.

Modifications:

Inspect engine type when coalescing outbound bytes and allocate heap
buffer if heap bytes are preferred by the engine.

Result:

Improved performance for JDK engine. Better performance in environments
without direct buffer pooling.
@normanmaurer

This comment has been minimized.

Copy link
Member

normanmaurer commented Oct 19, 2019

@tbrooks8 btw are you sure the copy to heap byte array is still a thing in java11 and later ?

@tbrooks8

This comment has been minimized.

Copy link
Contributor Author

tbrooks8 commented Oct 19, 2019

@normanmaurer Obviously it is pretty complicated to read through the JDK code with all the indirection. But here is what I currently see (JDK 12):

The SSLEngine will eventually call to Cipher#doFinal(ByteBuffer input, ByteBuffer output) or Ciper#update(ByteBuffer input, ByteBuffer output).

    public final int doFinal(ByteBuffer input, ByteBuffer output)
            throws ShortBufferException, IllegalBlockSizeException,
            BadPaddingException {
        checkCipherState();

        if ((input == null) || (output == null)) {
            throw new IllegalArgumentException("Buffers must not be null");
        }
        if (input == output) {
            throw new IllegalArgumentException("Input and output buffers must "
                + "not be the same object, consider using buffer.duplicate()");
        }
        if (output.isReadOnly()) {
            throw new ReadOnlyBufferException();
        }

        chooseFirstProvider();
        return spi.engineDoFinal(input, output);
    }

This calls The CipherSpi:

protected int engineDoFinal(ByteBuffer input, ByteBuffer output)
            throws ShortBufferException, IllegalBlockSizeException,
            BadPaddingException {
        return bufferCrypt(input, output, false);
    }

bufferCrypt does the copying.

    /**
     * Implementation for encryption using ByteBuffers. Used for both
     * engineUpdate() and engineDoFinal().
     */
    private int bufferCrypt(ByteBuffer input, ByteBuffer output,
            boolean isUpdate) throws ShortBufferException,
            IllegalBlockSizeException, BadPaddingException {
        if ((input == null) || (output == null)) {
            throw new NullPointerException
                ("Input and output buffers must not be null");
        }
        int inPos = input.position();
        int inLimit = input.limit();
        int inLen = inLimit - inPos;
        if (isUpdate && (inLen == 0)) {
            return 0;
        }
        int outLenNeeded = engineGetOutputSize(inLen);

        if (output.remaining() < outLenNeeded) {
            throw new ShortBufferException("Need at least " + outLenNeeded
                + " bytes of space in output buffer");
        }

        boolean a1 = input.hasArray();
        boolean a2 = output.hasArray();
        int total = 0;
        byte[] inArray, outArray;
        if (a2) { // output has an accessible byte[]
            outArray = output.array();
            int outPos = output.position();
            int outOfs = output.arrayOffset() + outPos;

            if (a1) { // input also has an accessible byte[]
                inArray = input.array();
                int inOfs = input.arrayOffset() + inPos;
                if (isUpdate) {
                    total = engineUpdate(inArray, inOfs, inLen, outArray, outOfs);
                } else {
                    total = engineDoFinal(inArray, inOfs, inLen, outArray, outOfs);
                }
                input.position(inLimit);
            } else { // input does not have accessible byte[]
                inArray = new byte[getTempArraySize(inLen)];
                do {
                    int chunk = Math.min(inLen, inArray.length);
                    if (chunk > 0) {
                        input.get(inArray, 0, chunk);
                    }
                    int n;
                    if (isUpdate || (inLen > chunk)) {
                        n = engineUpdate(inArray, 0, chunk, outArray, outOfs);
                    } else {
                        n = engineDoFinal(inArray, 0, chunk, outArray, outOfs);
                    }
                    total += n;
                    outOfs += n;
                    inLen -= chunk;
                } while (inLen > 0);
            }
            output.position(outPos + total);
        } else { // output does not have an accessible byte[]
            if (a1) { // but input has an accessible byte[]
                inArray = input.array();
                int inOfs = input.arrayOffset() + inPos;
                if (isUpdate) {
                    outArray = engineUpdate(inArray, inOfs, inLen);
                } else {
                    outArray = engineDoFinal(inArray, inOfs, inLen);
                }
                input.position(inLimit);
                if (outArray != null && outArray.length != 0) {
                    output.put(outArray);
                    total = outArray.length;
                }
            } else { // input also does not have an accessible byte[]
                inArray = new byte[getTempArraySize(inLen)];
                do {
                    int chunk = Math.min(inLen, inArray.length);
                    if (chunk > 0) {
                        input.get(inArray, 0, chunk);
                    }
                    int n;
                    if (isUpdate || (inLen > chunk)) {
                        outArray = engineUpdate(inArray, 0, chunk);
                    } else {
                        outArray = engineDoFinal(inArray, 0, chunk);
                    }
                    if (outArray != null && outArray.length != 0) {
                        output.put(outArray);
                        total += outArray.length;
                    }
                    inLen -= chunk;
                } while (inLen > 0);
            }
        }
        return total;
    }

It looks to me that the P11Cipher overrides engineDoFinal and engineUpdate with implementations that do support direct byte buffers. But that is the only Cipher. For example ChaCha20Cipher and AESCipher appear to only operate on byte arrays to me.

So I think my understanding is correct. And I have seen in debugging allocations happening when working with AES. But I can investigate more this week if you think there is something I am missing.

@carl-mastrangelo

This comment has been minimized.

Copy link
Member

carl-mastrangelo commented Oct 19, 2019

Slightly related, but Conscrypt had a very similar issue a few years back, which never ended up getting resolved: google/conscrypt#317

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.