Please add gzip/bzip compression for delta files in rdiff #8

Open
pavel-odintsov opened this Issue Jul 31, 2014 · 7 comments

pavel-odintsov commented Jul 31, 2014

Hello!

I tried to use the --gzip/--bzip flags with rdiff but got an error:

rdiff: ERROR: (rdiff_options) sorry, compression is not really implemented yet

For my data (VPS disks), compressing the delta files gives really excellent results:

source size: 4.6 GB, delta size: 2093.0 MB, compressed size: 223.0 MB
source size: 14.8 GB, delta size: 2205.0 MB, compressed size: 998.7 MB

Thank you!

pavel-odintsov commented Jul 31, 2014

I tried compressing the signatures, but it's really useless:

du -sh /root/rdiff_signatures_25_june/
20M /root/rdiff_signatures_25_june/

tar -cpzf /root/rdiff_signatures_25_june.tar.gz /root/rdiff_signatures_25_june/
ls -alh /root/rdiff_signatures_25_june.tar.gz
-rw-r--r-- 1 root root 19M Aug  1 00:41 /root/rdiff_signatures_25_june.tar.gz

tar -cpjf /root/rdiff_signatures_25_june.tar.bz2 /root/rdiff_signatures_25_june/
ls -alh /root/rdiff_signatures_25_june.tar.bz2
-rw-r--r-- 1 root root 19M Aug  1 00:41 /root/rdiff_signatures_25_june.tar.bz2

But compression for deltas is really useful, please add it :)

Contributor

sourcefrog commented Aug 1, 2014

You can just pipe it into gzip.

pavel-odintsov commented Aug 1, 2014

Hello!

Thank you for the answer!

Yes, I use rdiff delta this way:

rdiff delta signature.dat data.dat - | pigz > signature.gz 

But out-of-the-box support for compressed deltas would be a nice feature.

Member

dbaarda commented Oct 10, 2014

Note that rsync uses a modified zlib for delta compression: it feeds matching data that is not included in the delta through the compressor to "prime" its tables, then throws away the compressed output for those matches. In general this gives slightly better compression than just gzipping the resulting delta. For an example of how this can be done with an unmodified zlib, look at pysync: http://minkirri.apana.org.au/~abo/projects/pysync.
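
As a rough illustration of the priming idea with an unmodified zlib, here is a minimal Python sketch. It is not rsync's or pysync's actual code: the `encode`/`decode` names, the `("hit" | "miss", bytes)` op format, and the test data are all made up for this example. It assumes both sides run the same zlib build at the same level, so the receiver can regenerate the compressed bytes the sender discarded for each hit.

```python
import random
import zlib

RAW = -15  # raw deflate stream: no zlib header or checksum trailer


def _squeeze(comp, data):
    """Compress data and sync-flush so the output ends on a byte boundary."""
    return comp.compress(data) + comp.flush(zlib.Z_SYNC_FLUSH)


def encode(ops, level=9):
    """ops is a list of ("hit" | "miss", bytes); returns the toy delta."""
    comp = zlib.compressobj(level, zlib.DEFLATED, RAW)
    delta = []
    for kind, data in ops:
        packed = _squeeze(comp, data)  # always compress, to update the window
        # Hits are already in the basis, so their compressed output is dropped.
        delta.append((kind, packed if kind == "miss" else None))
    return delta


def decode(delta, hit_blocks, level=9):
    """Rebuild the data from the toy delta plus the hit blocks from the basis."""
    comp = zlib.compressobj(level, zlib.DEFLATED, RAW)  # mirrors the sender
    decomp = zlib.decompressobj(RAW)
    hits = iter(hit_blocks)
    out = []
    for kind, packed in delta:
        if kind == "hit":
            data = next(hits)
            # Regenerate the bytes the sender threw away and feed them in.
            decomp.decompress(_squeeze(comp, data))
        else:
            data = decomp.decompress(packed)
            _squeeze(comp, data)  # keep the mirror compressor in step
        out.append(data)
    return b"".join(out)


# Artificial data: a random "basis" block, then a miss that resembles it.
rng = random.Random(0)
basis = bytes(rng.getrandbits(8) for _ in range(4096))
changed = basis[100:3000] + b"a few new bytes"

delta = encode([("hit", basis), ("miss", changed)])
primed_size = len(delta[1][1])               # miss compressed with context
plain_size = len(zlib.compress(changed, 9))  # miss compressed on its own
restored = decode(delta, [basis])
print(primed_size, plain_size, restored == basis + changed)
```

The miss here was deliberately built to resemble the hit, which is the case where priming helps most; on real deltas the gain depends on how much the new data looks like nearby matching data.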

Member

dbaarda commented Oct 17, 2017

I'm considering tackling this next. Either that or Rabin-Karp rollsums... whichever people prefer.

Note that signature files, being collections of hash values, probably don't compress well at all unless they have long runs of identical blocks. I'm planning to add compression only to the deltas, with optional "context compression" support (which compresses hits as well as misses, to prime the compressor with context from matching blocks).
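
The point about signatures can be checked with a small Python sketch. The signature layout below is a toy, not rdiff's real file format: `toy_signature`, the 2048-byte block size, and the choice of `adler32` plus a truncated BLAKE2b digest are assumptions made only for this illustration.

```python
import hashlib
import random
import zlib

BLOCK = 2048  # hypothetical block size for this toy example


def toy_signature(data, block=BLOCK):
    """Concatenate a weak checksum and a truncated strong hash per block."""
    parts = []
    for i in range(0, len(data), block):
        chunk = data[i:i + block]
        weak = zlib.adler32(chunk).to_bytes(4, "big")
        strong = hashlib.blake2b(chunk, digest_size=8).digest()
        parts.append(weak + strong)
    return b"".join(parts)


def ratio(blob):
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(blob, 9)) / len(blob)


rng = random.Random(1)
varied = bytes(rng.getrandbits(8) for _ in range(256 * BLOCK))  # all blocks differ
zeros = bytes(256 * BLOCK)                                      # all blocks identical

varied_ratio = ratio(toy_signature(varied))
zeros_ratio = ratio(toy_signature(zeros))
print(f"varied blocks: {varied_ratio:.2f}, identical blocks: {zeros_ratio:.2f}")
```

When every block differs, the hashes look like random bytes and the ratio stays close to 1; only the run of identical blocks produces repeated records that the compressor can exploit.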

yxj1992 commented Nov 28, 2017

I ran cmake -D ENABLE_COMPRESSION=ON ., but it doesn't work. I still get 'ERROR: (rdiff_options) sorry, compression is not really implemented yet'.
Can anyone tell me why?

Member

dbaarda commented Feb 11, 2018

yxj1992: because that feature hasn't been implemented yet.
