Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8253821: Improve ByteBuffer performance with GCM #411

Closed
wants to merge 18 commits into from

Conversation

ascarpino
Copy link
Contributor

@ascarpino ascarpino commented Sep 29, 2020


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Download

$ git fetch https://git.openjdk.java.net/jdk pull/411/head:pull/411
$ git checkout pull/411

@bridgekeeper
Copy link

@bridgekeeper bridgekeeper bot commented Sep 29, 2020

👋 Welcome back ascarpino! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

@openjdk openjdk bot commented Sep 29, 2020

@ascarpino The following label will be automatically applied to this pull request:

  • security

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the security label Sep 29, 2020
@ascarpino
Copy link
Contributor Author

@ascarpino ascarpino commented Oct 5, 2020

I'd like a review of this change. They are two performance improvements to AES-GCM, the larger being the usage with ByteBuffers. Below are the details of the change and are listed in the JBS bug description, any future comments will be applied to the bug:

There were two areas of focus, the primary is when direct bytebuffers are used with some crypto algorithms, data is copied to byte arrays numerous times, causing unnecessary memory allocation and bringing down performance. The other focus was the non-direct bytebuffer output arrays.

This change comes in multiple parts:

  1. Changing CipherCore to not allocate a new output array if the existing array is large enough. Create a new array only if the length is not enough. The only SunJCE algorithm that has special output needs is GCM which can be dealt with elsewhere.
  2. AESCipher has a one-size-fits-all approach to bytebuffers. All encryption and decryption is done in byte arrays. When the input data is a byte array or a bytebuffer backed by a byte array, this is ok. However when it is a direct buffer, the data is copied into a new byte array. Unfortunately, this hurts SSLEngine which uses direct buffers causing multiple copies of data down to the raw algorithm. Additionally GCM code and other related classes had to be changed to allow ByteBuffers down to the algorithm where it can be copied into a fixed size byte array that can be reused. Code without this modifications running JFR with Flink, a performance test, shows ~150GB of byte array allocation in one minute of operation, afterward 7GB.
  3. GCM needed some reworking of the logic. Being an authenticated cipher, if the GHASH check fails, the decryption fails and no data is returned. The existing code would perform the decryption at the same time as the GHASH check, which current design offers no parallel performance advantage. Performing GHASH fully before decryption prevents allocating output data and perform unneeded operations if the GHASH is failed. If GHASH is successful, in-place operations can be performed directly to the buffer without allocating an intermediary buffer and then copying that data.
  4. GCTR and GHASH allocating a fixed buffer size if the data size is over 1k when going into an intrinsic. At this time copying data from the bytebuffer into a byte array for the intrinsic to work on it is required. We cannot eliminate the copy, but we can reduce the size of the allocated buffer. There is little harm in creating a maximum size this buffer can be and copy data into that buffer repeatedly until it is finished. Having the maximum size at 4k does produce slightly faster top-end performance at times, but inconsistent results and an increase in memory usage from 7GB to 17GB have been inconclusive to increase the buffer size.
  5. Using bytebuffers allows for using duplicate() which lets the code easier chop up the data without unnecessary copying

The CipherCore change provided a 6% performance gains for GCM with byte array based data, such as SSLSocket and direct API calls. Similar performance gains should be evident with other algorithms using this method.

The GCM bytebuffer and logic changes produced a 16% increase in performance in the Flink test. This is limited to only GCM as the other algorithms still use bytebuffer-to-byte array copy method. Doing similar work on other algorithms would provide less of a performance gain because of the complexities of GCM and are have diminishing usage in TLS.

@ascarpino ascarpino marked this pull request as ready for review Oct 6, 2020
@openjdk openjdk bot added the rfr label Oct 6, 2020
@mlbridge
Copy link

@mlbridge mlbridge bot commented Oct 6, 2020

Webrevs

Copy link
Member

@XueleiFan XueleiFan left a comment

Impressive update and performance improvement! I have no major concerns, all comments are just about trivial details like indents.

@openjdk
Copy link

@openjdk openjdk bot commented Oct 7, 2020

@ascarpino This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8253821: Improve ByteBuffer performance with GCM

Reviewed-by: xuelei, valeriep

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1036 new commits pushed to the master branch:

  • 3e89981: 8257623: vmTestbase/nsk/jvmti/ResourceExhausted/resexhausted001/TestDescription.java shouldn't use timeout
  • 93b6ab5: 8256818: SSLSocket that is never bound or connected leaks socket resources
  • 692b273: 8257189: Handle concurrent updates of MH.form better
  • 6704266: 8257565: epsilonBarrierSet.hpp should not include barrierSetAssembler
  • 0b8c780: 8256256: UL should not use heap allocation for output string
  • 2508bc7: 8257140: Crash in JvmtiTagMap::flush_object_free_events()
  • cfb50a9: 8253916: ResourceExhausted/resexhausted001 crashes on Linux-x64
  • 287b829: 8254877: GCLogPrecious::_lock rank constrains what locks you are allowed to have when crashing
  • 1fd0ea7: 8256382: Use try_lock for hs_err EventLog printing
  • bff68f1: 8257533: legacy-jre-image includes jpackage and jlink tools
  • ... and 1026 more: https://git.openjdk.java.net/jdk/compare/fb206908b431f3691a64d3e58fda86da5636c99a...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready label Oct 7, 2020
Copy link

@valeriepeng valeriepeng left a comment

I will take a look as well.

@Override
protected int engineDoFinal(ByteBuffer input, ByteBuffer output)
throws ShortBufferException, IllegalBlockSizeException,
BadPaddingException {
return bufferCrypt(input, output, false);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the override of this method for using a different bufferCrypt impl? There is also engineUpdate(ByteBuffer, ByteBuffer) in CipherSpi, is there a reason for not overriding that here?

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. thanks.. The IDE covered that up by going to the one in CipherSpi and I thought it was calling the AES one. TLS doesn't use update() so the perf numbers won't change. I'll have to run the tests again.

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not going to do the update() method, leaving it as is. There is some complications with the Encrypt.java test were the update is done with a direct bytebuffer, but the doFinal() is an empty buffer, which sends it to the byte[] doFinal(). CipherCore mitigates this situation inefficiently and I'd rather optimize that in future changeset that I'm already planning for byte[] methods.

} else { // input does not have an accessible byte[]
if (core.getMode() == CipherCore.GCM_MODE) {
if (isUpdate) {
return core.update(input, output);
} else {
return core.doFinal(input, output);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems this block is the only difference between this method and CipherSpi.bufferCrypt(). Have you considered moving this special handling to the overridden engineDoFinal(...) method and not duplicating the whole CipherSpi.bufferCrypt() method here?
BTW, instead of using the generic update/doFinal name and then commenting them for GCM usage only, perhaps it's more enticing to name them as gcmUpdate/gcmDoFinal?

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a way to override this because CipherSpi is a public class, any methods I added would become a new API.
Also bufferCrypt is private, so I had to copy it. CipherSpi does not know which mode is being used, but AESCipher does. Maybe I'm missing something, I'd rather it not be a copy, but I couldn't see a better way. If you have a specific idea, please give me details.

Copy link

@valeriepeng valeriepeng Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about overriding

protected int engineDoFinal(ByteBuffer input, ByteBuffer output) throws ... {
    if (core.getMode() == CipherCore.GCM_MODE && !input.hasArray()) {
        // call your own optimized byte buffer code path
    } else {
        super.engineDoFinal(input, output);
    }
}

Would this work? If yes, then no need to duplicate the whole bufferCrypt() method.

Copy link
Contributor Author

@ascarpino ascarpino Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call.. I wouldn't have thought about super.

if (decrypting) {
return cipher.decrypt(src, dst);
}
return cipher.encrypt(src, dst);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about return (decrypting? cipher.decrypt(src, dst) : cipher.encrypt(src, dst));

@@ -812,6 +813,13 @@ int update(byte[] input, int inputOffset, int inputLen, byte[] output,
return outLen;
}

int update(ByteBuffer src, ByteBuffer dst) throws ShortBufferException {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this also one of the "GCM only" methods? If so, same comment as doFinal(ByteBuffer, ByteBuffer)?
Maybe the name should be more specific to avoid misuse.

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.. I see what you mean by renaming the method. Yeah, I suppose it's better since it's not truely generic

};

int encryptFinal(ByteBuffer src, ByteBuffer dst)
throws IllegalBlockSizeException, AEADBadTagException,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tag is generated during encryption, this can't possibly throw AEADBadTagException, copy-n-paste error maybe?

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. copied decryptFinal.


if (decrypting) {
if (buffered > 0) {
cipher.decrypt(buffer, 0, buffered, new byte[0], 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a bit strange? The output buffer is 0-length which would lead to ShortBufferException when the buffered bytes is enough to produce some output.

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is right. decrypt() puts the data into GaloisCounterMode.ibuffer and never uses the output

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, right, I forgot that GCM cipher has that internal buffering.

return cipher.decryptFinal(src, dst);
} else {
if (buffered > 0) {
cipher.encrypt(buffer, 0, buffered, new byte[0], 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above?

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there is a problem here, but not a short buffer. I hadn't realized that encrypt(byte[]..) didn't put data in the GaloisCounterMode.ibuffer because I did that with encrypt(ByteBuffer..). I added a GCM only method that will cover this and place the data in the ibuffer for doFinal(ByteBuffer..) can complete the op.

// create temporary output buffer if the estimated size is larger
// than the user-provided buffer.
internalOutput = new byte[estOutSize];
offset = 0;
}
// create temporary output buffer so that only "real"
// data bytes are passed to user's output buffer.
outWithPadding = new byte[estOutSize];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is somewhat dangerous. When using the supplied output buffer directly, you may corrupt its content w/ padding bytes. This optimization should only be applied when no padding is involved. In addition, input and output can point to the same buffer with all sorts of index combinations, i.e. inOfs == outOfs, inOfs < outOfs, inOfs > outOfs. With "outWithPadding" approach, no need to check. However, for "internalOutput", data corruption may happen. This kind of problem can be very hard to diagnose. Have to be very very careful here as this code may impact all ciphers...

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I do not understand where the corruption comes from. The user provides a buffer that output is suppose to be placed into, what could it be corrupting? The existing tests (SameBuffer, in particular) works fine with this and the ByteBuffer calls. I spent a lot of time trying to get those same buffer tests to pass.
  • It was my expectation that checkOutputCapacity() is making sure all the inOfs ==<> outOfs are going to work. Does that not catch all cases?
  • outWithPadding" is excessive because it doubles the allocation for every operation followed by a copy to the original buffer, even if the original buffer was adequate. I'm ok with doing the extra alloc & copy in certain situations, but not everytime. Can you be more specific what things may fail that we don't already check for in the regression tests?

Copy link
Contributor Author

@ascarpino ascarpino Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote a whole new tests to exercise all the buffers with and without offsets. Hopefully that covers the concern. The test found 4 bugs..

Copy link

@valeriepeng valeriepeng Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* I do not understand where the corruption comes from.  The user provides a buffer that output is suppose to be placed into, what could it be corrupting?  

Existing tests may not catch/check all error cases. Especially, in the past, all calls w/ ByteBuffer inputs are converted to calls w/ byte[] inputs. Thus, testing is focused on byte[] scenarios. In addition, since for decryption, internal buffering is done, there may be no test checking if padding bytes are written to the output buffer. Please see my comment follows. The corruption that I refer to is about the padding bytes which are now written into the output buffer with this change.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* It was my expectation that checkOutputCapacity() is making sure all the inOfs ==<> outOfs are going to work. Does that not catch all cases?

checkOutputCapacity() as the name has, only changes that the output buffer is large enough.

* outWithPadding" is excessive because it doubles the allocation for every operation followed by a copy to the original buffer, even if the original buffer was adequate.  I'm ok with doing the extra alloc & copy in certain situations, but not everytime.  Can you be more specific what things may fail that we don't already check for in the regression tests?

For example,

  1. various input/output offset combinations, e.g. inOfs <,=,> outOfs,
  2. Given a output buffer of 200-byte and outOfs == 0, assuming the returned decrypted data length is 160 bytes, the bytes from index 160 and onward should not be overwritten. GCM has no padding, so you may not notice any difference if this is what you are testing. This is generic CipherCore code and changes impact all ciphers.

Copy link
Contributor Author

@ascarpino ascarpino Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checkOutputCapacity: Yes.. The method includes the offsets for the output buffer, which I believe would verify that the output area in the buffer with offsets is large enough.

outWithPadding: I understand the situation and I am assuming there are tests that cover this case. Given it's a generic situation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested the outWithPadding situation? Given that the existing impl only write out the final result, I don't think you can assume that existing tests cover it. I have wrote a simple test to check it if you have not done so, can you try it out to be sure?

import java.io.PrintStream;
import java.util.*;
import java.security.*;
import java.security.spec.*;

import javax.crypto.*;
import javax.crypto.spec.*;

public class TestDoFinal {

    private static String ALGO = "AES";
    private static int BLK_SIZE = 16;

    public static void main(String args[]) throws Exception {

        byte[] in = new byte[32];
        Arrays.fill(in, (byte)8);
        KeyGenerator kg = KeyGenerator.getInstance(ALGO, "SunJCE");
        SecretKey skey = kg.generateKey();
        Cipher ci = Cipher.getInstance(ALGO + "/CBC/PKCS5Padding", "SunJCE");
        ci.init(Cipher.ENCRYPT_MODE, skey);
        int inLen = in.length - BLK_SIZE;
        byte[] out = ci.doFinal(in, 0, inLen);
        System.out.println("=> enc " + inLen + " bytes, ret " +
                (out == null? "null":(out.length + " byte")));

        AlgorithmParameters param = ci.getParameters();
        ci.init(Cipher.DECRYPT_MODE, skey, param);
        int rLen = ci.doFinal(out, 0, out.length, in);
        System.out.println("=> dec " + out.length + " bytes, ret " +
                rLen + " byte");
        // check if more than rLen bytes are written into 'in'
        for (int j = rLen; j < in.length; j++) {
            if (in[j] != (byte)8) {
                throw new Exception("Value check failed at index " + j);
            }
        }
        System.out.println("Test Passed");
    }
}

Copy link
Contributor Author

@ascarpino ascarpino Nov 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to fix this, and I did for this test, but there other situations with update() that weren't working. It would take some reworking of a few common methods during the doFinal process to handle this right. I'm going to put an 'if()" so non-GCM modes create a new buffer like it did before. It was a "nice to have' for this rfe that can be done with future work for other mode optimizations.

return len;
}
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the above red icon? It warns about missing newline at end of this file.

return 0;
}

ArrayUtil.blockSizeCheck(src.remaining(), blockSize);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I am not sure if this check still applies in ByteBuffer case. You are passing the ByteBuffer objs directly from AESCipher->CipherCore->GaloisCounterMode. This is different from the byte[] case where CipherCore would chop up the data into blocks and pass the blocks to the underlying FeedbackCipher impl. Perhaps no existing regression tests covers ByteBuffer inputs w/ non-blocksize data? Otherwise, this should be caught?
BTW, why not just use 'len' again? Seems unnecessary to keep calling src.remaining() in various places in this method.

Copy link
Contributor Author

@ascarpino ascarpino Oct 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the check is unnecessary

I suspect not using len was simply a mistake

// Process en/decryption all the way to the last block. It takes both
// For input it takes the ibuffer which is wrapped in 'buffer' and 'src'
// from doFinal.
void doLastBlock(ByteBuffer buffer, ByteBuffer src, ByteBuffer dst)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ordering of these new ByteBuffer-arg methods seems random? Perhaps, either group them altogether or move them so that they are together with the byte[] counterpart methods?

@@ -462,6 +505,54 @@ int encrypt(byte[] in, int inOfs, int len, byte[] out, int outOfs) {
return len;
}

int decrypt(ByteBuffer src, ByteBuffer dst) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to above. It seems a bit strange to see decrypt(...) method in between encrypt(...) methods.

if (src.remaining() > 0) {
byte[] b = new byte[src.remaining()];
src.get(b);
try {
ibuffer.write(b);
} catch (IOException e) {
throw new ProviderException(
"Unable to add remaining input to the buffer", e);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existing usage for ibuffer is only for decryption. If you are storing into ibuffer for encryption, the comment about it should be updated. The earlier code in this method should also check ibuffer and if non-empty, apply its content before applying src?

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the comment. I had redone this method before your comment. Take a look at that on the next webrev

}

processed += len;
ghashAllToS.update(src, len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't input to ghashAllToS always be the produced cipher text? Did I miss something?

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

method is removed

GCTR gctrForSToTag = new GCTR(embeddedCipher, this.preCounterBlock);
gctrForSToTag.doFinal(s, 0, s.length, sOut, 0);
gctrForSToTag.doFinal(s, 0, s.length, block, 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since GCTR output the same length of output as input, (e.g. 'sOut' is same length as 's'), can't we just re-use 's' as the output buffer instead of 'block'?

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I can take it one step further.. I think I can remove 's' and use 'block' for everything. I'll have to make sure no unexpected overwriting happens.

if (tagLenBytes > block.length) {
block = new byte[tagLenBytes];
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is tagLenBytes ever larger than AES block size?

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah.. I see what you mean.. Yeah that's not necessary

byte[] sOut = new byte[s.length];
//byte[] sOut = new byte[s.length];
if (tagLenBytes != block.length) {
block = new byte[tagLenBytes];
}
GCTR gctrForSToTag = new GCTR(embeddedCipher, this.preCounterBlock);
gctrForSToTag.doFinal(s, 0, s.length, sOut, 0);
gctrForSToTag.doFinal(s, 0, tagLenBytes, block, 0);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as earlier.

mismatch |= tag[i] ^ sOut[i];
mismatch |= tag[i] ^ block[i];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

block->s?

throw new ShortBufferException("Output buffer too small");
}

checkDataLength(processed, len);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that both checks (line 605 and 614) can be combined into:
checkDataLength(processed, Math.addExact(len, tagLenBytes));

Copy link
Contributor Author

@ascarpino ascarpino Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed both encryptFinal for byte[] and bytebuffer, as I got that check from copying the byte[] method. They do look like mostly redundant checks, and I moved them to the top.

if (tagLenBytes > block.length) {
block = new byte[tagLenBytes];
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, will this ever happen? Can just use 's' instead of 'block' from here and below?

gctrForSToTag.doFinal(s, 0, s.length, block, 0);
dst.put(block, 0, tagLenBytes);

return (processed + tagLenBytes);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it supposed to return "all data processed + tag len"? Normally, we return the output size produced by this particular call instead of all accumulated.

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that should be len + tagLenBytes. We obviously don't check that in our tests

checkDataLength(processed, len);

processAAD();
if (len > 0) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if (len == 0), we should still process the data stored into 'ibuffer'? It seems that both of the encrypt(ByteBuffer) and encryptFinal(ByteBuffer) are adapted from their counterpart with byte[] arguments. However, the byte[] methods have different entrant conditions due to the buffering in CipherCore. So the impl of the ByteBuffer ones may need additional logic to handle all possible calling sequence.

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes for encryptFinal. encrypt is removed

doLastBlock((ibuffer == null || ibuffer.size() == 0) ?
null : ByteBuffer.wrap(ibuffer.toByteArray()), src, dst);
dst.reset();
ghashAllToS.doLastBlock(dst, processed);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure about using "processed" here? I'd expect the value is the number of bytes written into dst in the doLastBlock(...) call on line 618. Is the "processed" variable used differently in ByteBuffer case?

Copy link
Contributor Author

@ascarpino ascarpino Oct 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should probably be 'len'

// Maximum buffer size rotating ByteBuffer->byte[] intrinsic copy
private static final int MAX_LEN = 1024;

int doFinal(ByteBuffer src, ByteBuffer dst) throws IllegalBlockSizeException {
Copy link

@valeriepeng valeriepeng Oct 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conventionally, we put doFinal() method after update() method. Perhaps can we reorder it in the same way?

Copy link
Contributor Author

@ascarpino ascarpino Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

return len;
}

int update(ByteBuffer src, ByteBuffer dst) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the impl in GaloisCounterMode, this is only called when src.remaining() >= 128. Perhaps documenting the conditions here as there are no checks on the src/dst sizes as in the byte[] case.

Copy link
Contributor Author

@ascarpino ascarpino Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good

int len = inLen % AES_BLOCK_SIZE;
int processed = len;
byte[] out = new byte[Math.min(MAX_LEN, len)];
int offset = inOfs;
while (processed > MAX_LEN) {
encrypt(in, offset, MAX_LEN, out, 0);
dst.get(out, 0, MAX_LEN);
processed -= MAX_LEN;
offset += MAX_LEN;
}
encrypt(in, offset, processed, out, 0);
dst.get(out, 0, processed);
return len;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code looks strange?
len = inLen % AES_BLOCK_SIZE => len must be between 0...AES_BLOCK_SIZE-1, I think you meant to use len = (inLen - inLen% AES_BLOCK_SIZE)
dst.get(...) should be dst.put(...)

Copy link
Contributor Author

@ascarpino ascarpino Oct 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. Not surprising no test failed given it's a rollover situation

// if there it not enough data in the ibuffer and 'in', then append
// that to the ibuffer.
int remainder = inLen % blockSize;
if (remainder > 0) {
if (ibuffer == null) {
ibuffer = new ByteArrayOutputStream(len % blockSize);
ibuffer = new ByteArrayOutputStream(inLen % blockSize);
}
len -= remainder;
ibuffer.write(in, len, remainder);
inLen -= remainder;
// remainder offset is based on original buffer length
ibuffer.write(in, inOfs + inLen, remainder);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this can be moved down for better readability, i.e. process data in multiple of blocks, and store the remaining into 'ibuffer'?

Copy link
Contributor Author

@ascarpino ascarpino Nov 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to, but I don't like how the variable line up doing the remainder afterwards. I put some hopefully better comments above each section

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

// 'len' stores the length to use with buffer 'in'.
// 'inLen' stores the length returned by the method.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line 551 has "return len;" which seems conflicting with these two-line comments here?

Copy link
Contributor Author

@ascarpino ascarpino Nov 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is backwards. I wrote the comment before I decided to switch the meaning of the variables and never updated the comment

return 0;
} else {
System.arraycopy(buffer, 0, block, 0, buflen);
System.arraycopy(in, buflen, block, buflen,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use 'bufLen' as the offset for 'in' looks incorrect?

Copy link
Contributor Author

@ascarpino ascarpino Nov 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should be in's offset

if (buflen >= blockSize) {
System.arraycopy(buffer, 0, block, 0, blockSize);
buflen -= block.length;
return 0;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will bufLen > blockSize? Judging from the context of this method, buffer.length should always <= blockSize and never be larger? If bufLen == blockSize, perhaps we can just use buffer directly and no need to copy.

Copy link
Contributor Author

@ascarpino ascarpino Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is being removed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

int buflen = buffer.length;
if (buflen >= blockSize) {
System.arraycopy(buffer, 0, block, 0, blockSize);
buflen -= block.length;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bufLen isn't used, why bother updating its value?

// the data to fail with a short buffer exception
if ((src.remaining() + ((ibuffer != null) ? ibuffer.size() : 0) -
tagLenBytes) > dst.remaining()) {
throw new RuntimeException("output buffer too small");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be ShortBufferException instead of RuntimeException?

Copy link
Contributor Author

@ascarpino ascarpino Nov 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought so too, but that isn't what GCTR returns. All the GCTR checks are RuntimeExceptions. This check was original inside of GCTR, but I had to bring it out because of the ibuffer lengths. I don't mind changing, it's just a strange inconsistency.