Updated benchmark results.
More tidying up.
ckolivas committed Nov 5, 2010
1 parent 2965349 commit a66dafe
Showing 5 changed files with 73 additions and 66 deletions.
74 changes: 40 additions & 34 deletions doc/README.benchmarks
@@ -1,28 +1,29 @@
These are benchmarks performed on a 3GHz quad core Intel Core2 with 8GB ram
using lrzip v0.42.

The first comparison is that of a linux kernel tarball (2.6.31). In all cases
the default options were used. 3 other common compression apps were used for
comparison, 7z which is an excellent all-round lzma based compression app,
gzip which is the benchmark fast standard that has good compression, and bzip2
which is the most common linux used compression.

In the following tables, lrzip means lrzip default options, lrzip(lzo) means
lrzip using the lzo backend, lrzip(gzip) means using the gzip backend,
lrzip(bzip2) means using the bzip2 backend and lrzip(zpaq) means using the zpaq
In the following tables, lrzip means lrzip default options, lrzip -l means
lrzip using the lzo backend, lrzip -g means using the gzip backend,
lrzip -b means using the bzip2 backend and lrzip -z means using the zpaq
backend.


linux-2.6.31.tar

These are benchmarks performed on a 3GHz quad core Intel Core2 with 8GB ram
using lrzip v0.42.

Compression Size Percentage Compress Decompress
None 365711360 100
7z 53315279 14.6 2m4.770s 0m5.360s
lrzip 52372722 14.3 2m48.477s 0m8.336s
lrzip(zpaq) 43455498 11.9 10m11.335 10m14.296
lrzip(lzo) 112151676 30.7 0m14.913s 0m5.063s
lrzip(gzip) 73476127 20.1 0m29.628s 0m5.591s
lrzip(bzip2) 60851152 16.6 0m43.539s 0m12.244s
lrzip -z 43455498 11.9 10m11.335 10m14.296
lrzip -l 112151676 30.7 0m14.913s 0m5.063s
lrzip -g 73476127 20.1 0m29.628s 0m5.591s
lrzip -b 60851152 16.6 0m43.539s 0m12.244s
bzip2 62416571 17.1 0m44.493s 0m9.819s
gzip 80563601 22.0 0m14.343s 0m2.781s

@@ -37,29 +38,34 @@ What lrzip offers at this end of the spectrum is extreme compression if
desired.


Let's take two kernel trees one version apart as a tarball, linux-2.6.31 and
linux-2.6.32-rc8. These will show lots of redundant information, but hundreds
Let's take six kernel trees one version apart as a tarball, linux-2.6.31 to
linux-2.6.36. These will show lots of redundant information, but hundreds
of megabytes apart, which lrzip will be very good at compressing. For
simplicity, only 7z will be compared since that's by far the best general
purpose compressor at the moment:

These are benchmarks performed on a 2.53Ghz dual core Intel Core2 with 4GB ram
using lrzip v0.5.1. Note that it was running with a 32 bit userspace so only
2GB addressing was posible. However the benchmark was run with the -U option
allowing the whole file to be treated as one large compression window.

Tarball of two kernel trees, one version apart.
Tarball of 6 consecutive kernel trees.

Compression Size Percentage Compress Decompress
None 749066240 100
7z 108710624 14.5 4m4.260s 0m11.133s
lrzip 57943094 7.7 3m08.788s 0m10.747s
lrzip(lzo) 124029899 16.6 0m18.997s 0m7.107s
None 2373713920 100
7z 344088002 14.5 17m26s 1m22s
lrzip -U 73356070 3.1 08m53s 43s
lrzip -Ul 158851141 6.7 04m31s 35s

Things start getting very interesting now when lrzip is really starting to
shine. Note how it's not that much larger for 2 kernel trees than it was for
shine. Note how it's not that much larger for 6 kernel trees than it was for
one. That's because all the similar data in both kernel trees is being
compressed as one copy and only the differences really make up the extra size.
All compression software does this, but not over such large distances. If you
copy the same data over multiple times, the resulting lrzip archive doesn't
get much larger at all.


Using the first example (linux-2.6.31.tar) and simply copying the data multiple
times over gives these results with lrzip(lzo):

@@ -70,7 +76,7 @@ Copies Size Compressed Compress Decompress


I had the amusing thought that this compression software could be used as a
bullshit detector if you were to compress peoples' speeches because if their
bullshit detector if you were to compress people's speeches because if their
talks were full of catchphrases and not much actual content, it would all be
compressed down. So the larger the final archive, the less bullshit =)

@@ -83,31 +89,31 @@ system and some basic working software on it. The default options on the

10GB Virtual image:

These benchmarks were done on the quad core with version 0.5.1

Compression Size Percentage Compress Time Decompress Time
None 10737418240 100.0
gzip 2772899756 25.8 05m47.35s 2m46.77s
bzip2 2704781700 25.2 20m34.269s 7m51.362s
xz 2272322208 21.2 58m26.829s 4m46.154s
7z 2242897134 20.9 29m28.152s 6m35.952s
lrzip* 1354237684 12.6 29m13.402s 6m55.441s
lrzip M* 1079528708 10.1 23m44.226s 4m05.461s
lrzip(lzo)* 1793312108 16.7 05m13.246s 3m12.886s
lrzip(lzo)M* 1413268368 13.2 04m18.338s 2m54.650s
lrzip(zpaq)* 1299844906 12.1 04h32m14s 04h33m
lrzip(zpaq)M* 1066902006 9.9 04h07m14s 04h08m
lrzip 1354237684 12.6 29m13.402s 6m55.441s
lrzip -M 1079528708 10.1 23m44.226s 4m05.461s
lrzip -l 1793312108 16.7 05m13.246s 3m12.886s
lrzip -lM 1413268368 13.2 04m18.338s 2m54.650s
lrzip -z 1299844906 12.1 04h32m14s 04h33m
lrzip -zM 1066902006 9.9 04h07m14s 04h08m

(The benchmarks with * were done with version 0.5.1)

At this end of the spectrum things really start to heat up. The compression
advantage is massive, with the lzo backend even giving much better results than
7z, and over a ridiculously short time. Note that it's not much longer than it
takes to just *read* a 10GB file. What appears to be a big disappointment is
actually zpaq here which takes more than 8 times longer than lzma for a measly
.2% improvement. The reason is that most of the advantage here is achieved by
the rzip first stage since there's a lot of redundant space over huge distances
on a virtual image. The -M option which works the memory subsystem rather hard
making noticeable impact on the rest of the machine also does further wonders
for the compression and times.
7z, and over a ridiculously short time. What appears to be a big disappointment
is actually zpaq here which takes more than 8 times longer than lzma for a
measly .2% improvement. The reason is that most of the advantage here is
achieved by the rzip first stage since there's a lot of redundant space over
huge distances on a virtual image. The -M option which works the memory
subsystem rather hard making noticeable impact on the rest of the machine also
does further wonders for the compression and times.

This should help govern what compression you choose. Small files are nicely
compressed with zpaq. Intermediate files are nicely compressed with lzma.
@@ -117,4 +123,4 @@ Or, to make things easier, just use the default settings all the time and be
happy as lzma gives good results. :D

Con Kolivas
Tue, 4th Nov 2010
Tue, 5th Nov 2010
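
The Percentage column in the tables above appears to be the compressed size expressed as a share of the uncompressed size. A minimal sketch of that arithmetic follows; `percent_of_original` is a hypothetical helper written for illustration and is not part of lrzip.

```c
#include <stdint.h>

/* Hypothetical helper (not from lrzip): compressed size as a percentage of
 * the original size. Returns 0.0 for an empty or invalid original so that
 * the division is always defined. */
double percent_of_original(int64_t compressed, int64_t original)
{
	if (original <= 0)
		return 0.0;
	return 100.0 * (double)compressed / (double)original;
}
```

Called with the default lrzip row of the single kernel tarball, `percent_of_original(52372722, 365711360)`, this gives a value that rounds to the 14.3 shown in the table.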
4 changes: 2 additions & 2 deletions main.c
@@ -312,7 +312,7 @@ static void decompress_file(void)
print_output("Output filename is: %s: ", control.outfile);
print_progress("[OK] - %lld bytes \n", expected_size);

if (unlikely(close(fd_hist) != 0 || close(fd_out) != 0))
if (unlikely(close(fd_hist) || close(fd_out)))
fatal("Failed to close files\n");

if (TEST_ONLY | STDOUT) {
@@ -501,7 +501,7 @@ static void compress_file(void)
if (STDOUT)
dump_tmpoutfile(fd_out);

if (unlikely(close(fd_in) != 0 || close(fd_out)))
if (unlikely(close(fd_in) || close(fd_out)))
fatal("Failed to close files\n");

if (STDOUT) {
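
Most of the code changes in this commit drop an explicit `!= 0` and rely on the truth value of the return code inside `unlikely()`. The sketch below shows the pattern in isolation. The macro definition is an assumption in the style of the Linux kernel (it needs GCC or Clang), and `close_both` is a hypothetical helper, not a function from lrzip.

```c
#include <unistd.h>

/* Assumed definition: hint to the compiler that the condition is rarely
 * true. The !! normalises any non-zero value to 1. GCC/Clang only. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: close two descriptors.
 * Returns 0 if both closed cleanly, -1 otherwise. */
int close_both(int fd_a, int fd_b)
{
	/* close() returns 0 on success and -1 on error, so the bare call is
	 * equivalent to comparing against 0. Note that || short-circuits:
	 * if the first close() fails, the second descriptor is left open. */
	if (unlikely(close(fd_a) || close(fd_b)))
		return -1;
	return 0;
}
```

The hint only affects code layout and branch prediction; the result of the condition is the same with or without it.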
2 changes: 1 addition & 1 deletion runzip.c
@@ -179,7 +179,7 @@ static i64 runzip_chunk(int fd_in, int fd_out, int fd_hist, i64 expected_size, i
if (unlikely(ofs == -1))
fatal("Failed to seek input file in runzip_fd\n");

if (fstat(fd_in, &st) != 0 || st.st_size - ofs == 0)
if (fstat(fd_in, &st) || st.st_size - ofs == 0)
return 0;

ss = open_stream_in(fd_in, NUM_STREAMS);
35 changes: 18 additions & 17 deletions rzip.c
@@ -124,10 +124,11 @@ static void remap_low_sb(void)

static inline void remap_high_sb(i64 p)
{
if (unlikely(munmap(sb.buf_high, sb.size_high) != 0))
if (unlikely(munmap(sb.buf_high, sb.size_high)))
fatal("Failed to munmap in remap_high_sb\n");
sb.size_high = sb.high_length; /* In case we shrunk it when we hit the end of the file */
sb.offset_high = p;
/* Make sure offset is rounded to page size of total offset */
sb.offset_high -= (sb.offset_high + sb.orig_offset) % 4096;
if (unlikely(sb.offset_high + sb.size_high > sb.orig_size))
sb.size_high = sb.orig_size - sb.offset_high;
@@ -138,10 +139,10 @@ static inline void remap_high_sb(i64 p)

/* We use a "sliding mmap" to effectively read more than we can fit into the
* compression window. This is done by using a maximally sized lower mmap at
* the beginning of the block, and a one-page-sized mmap block that slides up
* and down as is required for any offsets beyond the lower one. This is
* 100x slower than mmap but makes it possible to have unlimited sized
* compression windows. */
* the beginning of the block which slides up once the hash search moves beyond
* it, and a 64k mmap block that slides up and down as is required for any
* offsets outside the range of the lower one. This is much slower than mmap
* but makes it possible to have unlimited sized compression windows. */
static uchar *get_sb(i64 p)
{
i64 low_end = sb.offset_low + sb.size_low;
@@ -152,14 +153,14 @@ static uchar *get_sb(i64 p)
return (sb.buf_low + p - sb.offset_low);
if (p >= sb.offset_high && p < (sb.offset_high + sb.size_high))
return (sb.buf_high + (p - sb.offset_high));
/* (p > sb.size_low && p < sb.offset_high) */
/* p is not within the low or high buffer range */
remap_high_sb(p);
return (sb.buf_high + (p - sb.offset_high));
}

static inline void put_u8(void *ss, int stream, uchar b)
{
if (unlikely(write_stream(ss, stream, &b, 1) != 0))
if (unlikely(write_stream(ss, stream, &b, 1)))
fatal("Failed to put_u8\n");
}
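
The sliding mmap hunks above round the high buffer's offset so that the absolute file offset passed to mmap (`sb.offset_high + sb.orig_offset`) is a multiple of the page size, then test whether a position falls inside a mapped range. A minimal sketch of those two calculations, assuming non-negative offsets and using hypothetical helper names and a local `i64` typedef that are not taken from lrzip:

```c
#include <stdint.h>

typedef int64_t i64; /* assumption: a 64-bit signed type, as in the diff */

/* Hypothetical helper: round p down so that (p + orig_offset), the
 * absolute offset within the file, is a multiple of page_size. */
i64 align_high_offset(i64 p, i64 orig_offset, i64 page_size)
{
	return p - (p + orig_offset) % page_size;
}

/* Hypothetical helper mirroring the range tests in get_sb(): is position p
 * inside the mapping that starts at offset and spans size bytes? */
int in_window(i64 p, i64 offset, i64 size)
{
	return p >= offset && p < offset + size;
}
```

For example, with a 4096-byte page, `align_high_offset(5000, 100, 4096)` yields 3996, and 3996 + 100 is exactly one page into the file. The diff hard-codes 4096; a portable program might query the page size at run time instead.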

@@ -226,7 +227,7 @@ int write_sbstream(void *ss, int stream, i64 p, i64 len)
p += n;
len -= n;
if (sinfo->s[stream].buflen == sinfo->bufsize) {
if (unlikely(flush_buffer(sinfo, stream) != 0))
if (unlikely(flush_buffer(sinfo, stream)))
return -1;
}
}
@@ -407,7 +408,7 @@ static inline i64 match_len(struct rzip_state *st, i64 p0, i64 op, i64 end,
if (end < st->last_match)
end = st->last_match;

while (p > end && op > 0 && *get_sb(op - 1) == *get_sb(p-1)) {
while (p > end && op > 0 && *get_sb(op - 1) == *get_sb(p - 1)) {
op--;
p--;
}
@@ -673,7 +674,7 @@ static void init_sliding_mmap(struct rzip_state *st, int fd_in, i64 offset)
i64 size = st->chunk_size;

if (sizeof(long) == 4 && size > two_gig) {
print_verbose("Limiting to 2G due to 32 bit limitations\n");
print_verbose("Limiting to 2GB due to 32 bit limitations\n");
size = two_gig;
}
sb.orig_offset = offset;
@@ -689,14 +690,14 @@ static void init_sliding_mmap(struct rzip_state *st, int fd_in, i64 offset)
/* Better to shrink the window to the largest size that works than fail */
if (sb.buf_low == MAP_FAILED) {
size = size / 10 * 9;
size -= size % 4096; /* Round to page size */
size -= size % 4096;
if (unlikely(!size))
fatal("Unable to mmap any ram\n");
goto retry;
}
print_maxverbose("Succeeded in preallocating %lld sized mmap\n", size);
if (!STDIN) {
if (unlikely(munmap(sb.buf_low, size) != 0))
if (unlikely(munmap(sb.buf_low, size)))
fatal("Failed to munmap\n");
} else
st->chunk_size = size;
@@ -707,7 +708,7 @@ static void init_sliding_mmap(struct rzip_state *st, int fd_in, i64 offset)
sb.buf_low = (uchar *)mmap(sb.buf_low, size, PROT_READ, MAP_SHARED, fd_in, offset);
if (sb.buf_low == MAP_FAILED) {
size = size / 10 * 9;
size -= size % 4096; /* Round to page size */
size -= size % 4096;
if (unlikely(!size))
fatal("Unable to mmap any ram\n");
goto retry;
@@ -718,7 +719,7 @@ static void init_sliding_mmap(struct rzip_state *st, int fd_in, i64 offset)

if (size < st->chunk_size) {
if (UNLIMITED && !STDIN)
print_verbose("File is beyond window size, will proceed MUCH slower in unlimited mode with a sliding_mmap buffer\n");
print_verbose("File is beyond window size, will proceed in unlimited mode with a sliding_mmap buffer but may be much slower\n");
else {
print_verbose("Needed to shrink window size to %lld\n", size);
st->chunk_size = size;
@@ -866,10 +867,10 @@ void rzip_fd(int fd_in, int fd_out)
eta_hours = (unsigned int)(finish_time - elapsed_time) / 3600;
eta_minutes = (unsigned int)((finish_time - elapsed_time) - eta_hours * 3600) / 60;
eta_seconds = (unsigned int)(finish_time - elapsed_time) - eta_hours * 60 - eta_minutes * 60;
chunkmbs=(last_chunk / 1024 / 1024) / (double)(current.tv_sec-last.tv_sec);
chunkmbs = (last_chunk / 1024 / 1024) / (double)(current.tv_sec-last.tv_sec);
print_verbose("\nPass %d / %d -- Elapsed Time: %02d:%02d:%02d. ETA: %02d:%02d:%02d. Compress Speed: %3.3fMB/s.\n",
pass, passes, elapsed_hours, elapsed_minutes, elapsed_seconds,
eta_hours, eta_minutes, eta_seconds, chunkmbs);
pass, passes, elapsed_hours, elapsed_minutes, elapsed_seconds,
eta_hours, eta_minutes, eta_seconds, chunkmbs);
}
last.tv_sec = current.tv_sec;
last.tv_usec = current.tv_usec;
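
The ETA lines in the hunk above split a remaining duration into hours, minutes and seconds. The context line for `eta_seconds` subtracts `eta_hours * 60`, where a conventional breakdown would subtract `eta_hours * 3600`. The sketch below shows the conventional split using remainders; `struct hms` and `split_seconds` are hypothetical names written for illustration and are not part of lrzip.

```c
/* Hypothetical type: a duration broken into display fields. */
struct hms {
	unsigned int hours;
	unsigned int minutes;
	unsigned int seconds;
};

/* Hypothetical helper: split a duration given in whole seconds. */
struct hms split_seconds(unsigned int total)
{
	struct hms t;

	t.hours = total / 3600;          /* whole hours */
	t.minutes = (total % 3600) / 60; /* whole minutes left after the hours */
	t.seconds = total % 60;          /* seconds left after the minutes */
	return t;
}
```

Using remainders keeps each field within its display range, so a duration of 3725 seconds comes out as 1 hour, 2 minutes and 5 seconds.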
24 changes: 12 additions & 12 deletions stream.c
@@ -102,7 +102,7 @@ static void zpaq_compress_buf(struct stream *s, int *c_type, i64 *c_len)

zpipe_compress(in, out, control.msgout, s->buflen, (int)(SHOW_PROGRESS));

if (unlikely(memstream_update_buffer(out, &c_buf, &dlen) != 0))
if (unlikely(memstream_update_buffer(out, &c_buf, &dlen)))
fatal("Failed to memstream_update_buffer in zpaq_compress_buf");

fclose(in);
@@ -387,7 +387,7 @@ static int lzma_decompress_buf(struct stream *s, size_t c_len)
/* With LZMA SDK 4.63 we pass control.lzma_properties
* which is needed for proper uncompress */
lzmaerr = LzmaUncompress(s->buf, &dlen, c_buf, &c_len, control.lzma_properties, 5);
if (unlikely(lzmaerr != 0)) {
if (unlikely(lzmaerr)) {
print_err("Failed to decompress buffer - lzmaerr=%d\n", lzmaerr);
return -1;
}
@@ -675,11 +675,11 @@ void *open_stream_in(int f, int n)
if (control.major_version == 0 && control.minor_version < 4) {
u32 v132, v232, last_head32;

if (read_u32(f, &v132) != 0)
if (unlikely(read_u32(f, &v132)))
goto failed;
if (read_u32(f, &v232) != 0)
if (unlikely(read_u32(f, &v232)))
goto failed;
if (read_u32(f, &last_head32) != 0)
if ((read_u32(f, &last_head32)))
goto failed;

v1 = v132;
Expand Down Expand Up @@ -708,11 +708,11 @@ void *open_stream_in(int f, int n)
print_err("Unexpected initial tag %d in streams\n", c);
goto failed;
}
if (unlikely(v1 != 0)) {
if (unlikely(v1)) {
print_err("Unexpected initial c_len %lld in streams %lld\n", v1, v2);
goto failed;
}
if (unlikely(v2 != 0)) {
if (unlikely(v2)) {
print_err("Unexpected initial u_len %lld in streams\n", v2);
goto failed;
}
@@ -791,11 +791,11 @@ static int fill_buffer(struct stream_info *sinfo, int stream)
if (control.major_version == 0 && control.minor_version < 4) {
u32 c_len32, u_len32, last_head32;

if (read_u32(sinfo->fd, &c_len32) != 0)
if (unlikely(read_u32(sinfo->fd, &c_len32)))
return -1;
if (read_u32(sinfo->fd, &u_len32) != 0)
if (unlikely(read_u32(sinfo->fd, &u_len32)))
return -1;
if (read_u32(sinfo->fd, &last_head32) != 0)
if (unlikely(read_u32(sinfo->fd, &last_head32)))
return -1;
c_len = c_len32;
u_len = u_len32;
@@ -911,13 +911,13 @@ int close_stream_out(void *ss)

/* reallocate buffers to try and save space */
for (i = 0; i < sinfo->num_streams; i++) {
if (sinfo->s[i].buflen != 0) {
if (sinfo->s[i].buflen) {
if (unlikely(!realloc(sinfo->s[i].buf, sinfo->s[i].buflen)))
fatal("Error Reallocating Output Buffer %d\n", i);
}
}
for (i = 0; i < sinfo->num_streams; i++) {
if (unlikely(sinfo->s[i].buflen != 0 && flush_buffer(sinfo, i)))
if (unlikely(sinfo->s[i].buflen && flush_buffer(sinfo, i)))
return -1;
if (sinfo->s[i].buf)
free(sinfo->s[i].buf);
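
The reallocation loop in the hunk above tests the return value of `realloc` but does not store it. The C standard allows `realloc` to move the block even when shrinking, in which case the old pointer is no longer valid. A minimal sketch of the usual idiom follows; `shrink_buffer` is a hypothetical helper, not a function from lrzip.

```c
#include <stdlib.h>

/* Hypothetical helper: shrink *buf to new_len bytes.
 * On success *buf is updated and 0 is returned. On failure *buf is left
 * untouched (and still valid) and -1 is returned. */
int shrink_buffer(unsigned char **buf, size_t new_len)
{
	unsigned char *tmp;

	if (!new_len)
		return 0; /* nothing to keep; the caller frees the buffer */
	tmp = realloc(*buf, new_len);
	if (!tmp)
		return -1;
	*buf = tmp; /* the block may have moved, so keep the new address */
	return 0;
}
```

Assigning through a temporary means a failed call does not lose the original allocation, which the caller can still flush and free.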