Fix issue #20: write uncompressed bytes if uncompressed size==compressed size #21

Merged
merged 3 commits into from Jun 3, 2011
2 changes: 1 addition & 1 deletion build.xml
@@ -28,7 +28,7 @@

<property name="Name" value="Hadoop GPL Compression"/>
<property name="name" value="hadoop-lzo"/>
- <property name="version" value="0.4.10"/>
+ <property name="version" value="0.4.11"/>
<property name="final.name" value="${name}-${version}"/>
<property name="year" value="2008"/>

2 changes: 1 addition & 1 deletion src/java/com/hadoop/compression/lzo/LzopOutputStream.java
@@ -177,7 +177,7 @@ protected void compress() throws IOException {
// the LZO specification says that we should write the uncompressed bytes rather
// than the compressed bytes. The decompressor understands this because both sizes
// get written to the stream.
- if (compressor.getBytesRead() < compressor.getBytesWritten()) {
+ if (compressor.getBytesRead() <= compressor.getBytesWritten()) {
// Compression actually increased the size of the buffer, so write the uncompressed bytes.
byte[] uncompressed = ((LzoCompressor)compressor).uncompressedBytes();
rawWriteInt(uncompressed.length);
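
The one-character change above can be illustrated in isolation. The following is a minimal sketch, not the project's code: `BlockWriterSketch`, `shouldStoreRaw`, and `writeBlock` are hypothetical names, and the framing is simplified to two length fields followed by the payload (the real stream also writes checksums, which are omitted here). It only shows how the `<=` comparison selects the raw bytes when the two sizes tie.

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical helper for illustration; not part of hadoop-lzo.
final class BlockWriterSketch {

    /** Returns true when the block should be stored raw instead of compressed. */
    static boolean shouldStoreRaw(int uncompressedLen, int compressedLen) {
        // '<=' rather than '<': a tie is also stored raw, mirroring the fix in this PR.
        return uncompressedLen <= compressedLen;
    }

    /** Writes both length fields, then whichever payload the rule selects. */
    static void writeBlock(DataOutputStream out, byte[] raw, byte[] compressed)
            throws IOException {
        byte[] payload = shouldStoreRaw(raw.length, compressed.length) ? raw : compressed;
        out.writeInt(raw.length);      // uncompressed size
        out.writeInt(payload.length);  // size of the bytes that follow
        out.write(payload);
    }
}
```

With the old `<` comparison, a block whose compressed form happens to be exactly as long as its input would have been written in compressed form, while both length fields in the stream carried the same value.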
20 changes: 20 additions & 0 deletions src/test/com/hadoop/compression/lzo/TestLzopOutputStream.java
@@ -46,6 +46,7 @@ public class TestLzopOutputStream extends TestCase {
private final String bigFile = "100000.txt";
private final String mediumFile = "1000.txt";
private final String smallFile = "100.txt";
private final String issue20File = "issue20-lzop.txt";

@Override
protected void setUp() throws Exception {
@@ -84,6 +85,25 @@ public void testSmallFile() throws NoSuchAlgorithmException, IOException,
runTest(smallFile);
}

/**
* The LZO specification says that we should write the uncompressed bytes
* rather than the compressed bytes if the compressed buffer is actually
* larger ('>') than the uncompressed buffer.
*
* To conform to the standard, this means we have to write the uncompressed
* bytes also when they have exactly the same size as the compressed bytes.
* (the '==' in '<=').
*
* The input data of this test is known to compress to the same size as the
* uncompressed data. Hence we verify that we handle the boundary condition
* correctly.
*
*/
public void testIssue20File() throws NoSuchAlgorithmException, IOException,
InterruptedException {
runTest(issue20File);
}

/**
* Test that reading an lzo-compressed file produces the same lines as reading the equivalent
* flat file. The test opens both the compressed and flat file, successively reading each
6 changes: 6 additions & 0 deletions src/test/data/issue20-lzop.txt
@@ -0,0 +1,6 @@
0.5 74 25425
0.9 200 25384
0.95 203 4
0.98 211 2
0.99 219 3
0.995 240 5
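
The boundary case that `testIssue20File` exercises matters mostly on the read side. The sketch below assumes a reader that, given only the two length fields, treats equal sizes as "stored, not compressed". That behaviour is an assumption about lzop-style readers, not something shown in this diff. `BlockReaderSketch`, `readBlock`, and the `Decompressor` interface are hypothetical names, and the simplified framing again omits checksums.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper for illustration; not part of hadoop-lzo.
final class BlockReaderSketch {

    /** Stand-in for a real LZO decompressor (assumed interface). */
    interface Decompressor {
        byte[] decompress(byte[] compressed, int uncompressedLen) throws IOException;
    }

    /** Reads one block framed as: uncompressed length, stored length, payload. */
    static byte[] readBlock(DataInputStream in, Decompressor decompressor)
            throws IOException {
        int uncompressedLen = in.readInt();
        int storedLen = in.readInt();
        byte[] payload = new byte[storedLen];
        in.readFully(payload);
        if (storedLen == uncompressedLen) {
            // Equal sizes are the only signal the reader has: the payload is
            // taken as raw bytes and the decompressor is never invoked.
            return payload;
        }
        return decompressor.decompress(payload, uncompressedLen);
    }
}
```

Under that assumption, a writer that emits compressed bytes for a same-size block would have them handed back to the caller undecoded, which is why the writer must store the raw bytes in the tie case.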