[AVRO-2245] Improve java tests for compression codecs #351
nandorKollar merged 1 commit into apache:master
Conversation
assertTrue(codecClass.isInstance(codecInstance));
assertTrue(codecInstance.getName().equals(codec));

ByteBuffer inputByteBuffer = ByteBuffer.wrap(input);
Here, the original tests allocated a buffer twice as big as the original input, which was supposed to verify that AVRO-1326 is fixed. If I'm not mistaken, by refactoring this test the way you did, we lose this verification.
Yes, that's true. Although I'd argue that this test wasn't really successfully testing anything before. (The test doesn't even clearly verify AVRO-1326, although it sort of does... it would be much more straightforward to allocate a larger buffer and call rewind and limit. Instead it's actually compressing the second (empty) half of the ByteBuffer. If we allocated inputSize * 2 + 1, the assert(inputSize == outputSize) check would fail.) I honestly can't tell whether that assertion passes by lucky coincidence or whether the test was written to be confusing. I think the former. :)
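To make the "allocate a larger buffer and call rewind and limit" idea concrete, here is a minimal, self-contained sketch. It uses java.util.zip.Deflater/Inflater as a stand-in for an Avro Codec (the class names and sizes here are illustrative, not from the PR): the input lives in a buffer twice its own size, and limit() restricts compression to the real payload, which is the straightforward way to check the round-trip preserves exactly inputSize bytes.

```java
import java.nio.ByteBuffer;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class Avro1326Sketch {
    public static void main(String[] args) throws Exception {
        int inputSize = 1000; // illustrative size
        byte[] data = new byte[inputSize];
        for (int i = 0; i < inputSize; i++) data[i] = (byte) (i % 16);

        // Allocate a buffer twice as large as the payload, then restrict the
        // readable window with rewind() + limit() so only the real payload
        // is visible to the compressor.
        ByteBuffer input = ByteBuffer.allocate(inputSize * 2);
        input.put(data);
        input.rewind();
        input.limit(inputSize);

        // Compress only the window [position, limit) of the buffer.
        Deflater deflater = new Deflater();
        deflater.setInput(input.array(), input.position(), input.remaining());
        deflater.finish();
        byte[] compressed = new byte[inputSize * 2];
        int compressedLen = deflater.deflate(compressed);

        // Decompress and verify the round-trip restores exactly inputSize bytes.
        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, compressedLen);
        byte[] restored = new byte[inputSize];
        int restoredLen = inflater.inflate(restored);
        System.out.println(restoredLen == inputSize);
    }
}
```

With this shape, growing the allocation to inputSize * 2 + 1 changes nothing, because the limit (not the allocation size) determines what gets compressed.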
Anyway: there are actually still a couple of little bugs in that code similar to what AVRO-1326 fixes. (With how the compression code is called currently, I don't think these bugs would ever get exercised, but still... they should probably get fixed.) There's another PR here: #352 to fix those.
As part of that PR I was going to add another test in TestAllCodecs to verify the changes. Without the changes in #352 the test fails, so I'm not adding it here; I'll add it there once this is merged. (Basically, this tests that we do the right thing for both compression and decompression, even if we start somewhere in the middle of the input ByteBuffer.)
@Test
public void testCodecSlice() throws IOException {
int inputSize = 500_000;
byte[] input = generateTestData(inputSize);
Codec codecInstance = CodecFactory.fromString(codec).createInstance();
ByteBuffer partialBuffer = ByteBuffer.wrap(input);
partialBuffer.position(17);
ByteBuffer inputByteBuffer = partialBuffer.slice();
ByteBuffer compressedBuffer = codecInstance.compress(inputByteBuffer);
int compressedSize = compressedBuffer.remaining();
// Make sure something returned
assertTrue(compressedSize > 0);
// Create a slice from the compressed buffer
ByteBuffer sliceBuffer = ByteBuffer.allocate(compressedSize + 100);
sliceBuffer.position(50);
sliceBuffer.put(compressedBuffer);
sliceBuffer.limit(compressedSize + 50);
sliceBuffer.position(50);
// Decompress the data
ByteBuffer decompressedBuffer = codecInstance.decompress(sliceBuffer.slice());
    // Validate that the input and output are equal.
    inputByteBuffer.rewind();
    Assert.assertEquals(decompressedBuffer, inputByteBuffer);
}
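The slicing in the test above matters because ByteBuffer.slice() creates a view whose position 0 corresponds to a nonzero arrayOffset() in the backing array. A codec implementation that calls array() and indexes from position() without adding arrayOffset() reads the wrong bytes, which is exactly the class of bug AVRO-1326 addresses. A minimal stand-alone demonstration (plain JDK, no Avro classes):

```java
import java.nio.ByteBuffer;

public class SliceDemo {
    public static void main(String[] args) {
        byte[] backing = new byte[10];
        for (int i = 0; i < 10; i++) backing[i] = (byte) i;

        ByteBuffer whole = ByteBuffer.wrap(backing);
        whole.position(4);
        ByteBuffer slice = whole.slice();

        // The slice's position is 0, but its first byte is backing[4].
        System.out.println(slice.get(0));        // prints 4
        System.out.println(slice.arrayOffset()); // prints 4

        // Buggy pattern: ignoring arrayOffset() reads backing[0] instead.
        System.out.println(slice.array()[slice.position()]); // prints 0
    }
}
```

The correct index into the backing array is always arrayOffset() + position(), which is what the slice-based test forces an implementation to get right.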
Thanks for spotting this problem! I support refactoring these into parametrized tests. Could you please address my only concern? Also, Trevni (despite being a probably not very often used part of Avro) has similar meaningless tests (with the same names as these); if it requires only low effort, could you please fix those too?

I can make the changes in Trevni in a separate PR if that's OK. Yes, they also don't make much sense:

@jacobtolar Can you rebase onto master as well?

@nandorKollar Final thoughts?
Remove TestBZip2Codec.java and TestZstandardCodec.java, which were implemented incorrectly, and add a new parametrized test that validates all codecs correctly.