Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

excess cpu use in rs_mdfour / rs_search_for_block / rs_findmatch #11

Closed
vps2fast opened this issue Sep 30, 2014 · 39 comments
Closed

excess cpu use in rs_mdfour / rs_search_for_block / rs_findmatch #11

vps2fast opened this issue Sep 30, 2014 · 39 comments

Comments

@vps2fast
Copy link

Hello!

I'm using rdiff for creating incremental backups and it hangs every day with one file with 5GB in size.

OS: CentOS 6

ps aux|grep rdiff
root      771334  0.0  0.0 103248   892 pts/0    S+   14:39   0:00 grep rdiff
root      945509 99.2  0.0  78444 59484 ?        R    Sep29 1207:21 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature /vz/private/4822/root.hdd/root.hdd /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

Source files:

ls -al /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
-rw-r--r-- 1 root root 191889408 Sep 30 13:41 /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

ls -alh /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
-rw-r--r-- 1 root root 183M Sep 30 13:41 /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

ls -la /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature
-rw-r--r-- 1 root root 49752 Sep 29 18:23 /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature

Strace show nothing:

strace -f -p 945509
Process 945509 attached - interrupt to quit

But gdb show so much very interesting info:

debuginfo-install librsync-0.9.7-15.el6.x86_64
(gdb) bt
#0  0x00007f2d96a53bbf in rs_mdfour64 (m=0x7fff58b760a0, p=0x7f2d946d33a4) at mdfour.c:113
#1  0x00007f2d96a5413f in rs_mdfour_block (md=0x7fff58b760a0, in_void=<value optimized out>, n=<value optimized out>) at mdfour.c:263
#2  rs_mdfour_update (md=0x7fff58b760a0, in_void=<value optimized out>, n=<value optimized out>) at mdfour.c:359
#3  0x00007f2d96a5442a in rs_mdfour (out=0x7fff58b76170 "\251\017\226/\035B\253\212\311lH<\222\222t\230", in=0x7f2d946c63a4, n=1048576) at mdfour.c:389
#4  0x00007f2d96a55bbc in rs_search_for_block (weak_sum=0, inbuf=<value optimized out>, block_len=<value optimized out>, sig=0xb3dd80, stats=<value optimized out>, match_where=<value optimized out>)
    at search.c:143
#5  0x00007f2d96a52b01 in rs_findmatch (job=0xb3dbd0) at delta.c:272
#6  rs_delta_s_scan (job=0xb3dbd0) at delta.c:156
#7  0x00007f2d96a5391e in rs_job_work (job=0xb3dbd0, buffers=0x7fff58b762e0) at job.c:187
#8  rs_job_iter (job=0xb3dbd0, buffers=0x7fff58b762e0) at job.c:145
#9  0x00007f2d96a53a39 in rs_job_drive (job=0xb3dbd0, buf=0x7fff58b762e0, in_cb=0x7f2d96a51ee0 <rs_infilebuf_fill>, in_opaque=<value optimized out>, out_cb=0x7f2d96a51d90 <rs_outfilebuf_drain>, 
    out_opaque=<value optimized out>) at job.c:241
#10 0x00007f2d96a56b72 in rs_whole_run (job=0xb3dbd0, in_file=<value optimized out>, out_file=0xb3d990) at whole.c:80
#11 0x00007f2d96a56cf7 in rs_delta_file (sig=<value optimized out>, new_file=0xb3d750, delta_file=0xb3d990, stats=0x7fff58b76380) at whole.c:154
#12 0x000000000040149b in rdiff_delta (argc=-1804782684, argv=0xb208f4c2) at rdiff.c:298
#13 rdiff_action (argc=-1804782684, argv=0xb208f4c2) at rdiff.c:358
#14 main (argc=-1804782684, argv=0xb208f4c2) at rdiff.c:374

With disk system everything looks fine:

cat  /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature |pv >/dev/null
48.6kB 0:00:00 [ 452MB/s] [   <=>                  
cat /vz/private/4822/root.hdd/root.hdd  |pv >/dev/null
4.05GB 0:00:02 [1.36GB/s] [          <=>                  

Please help me :(

@sourcefrog
Copy link
Contributor

Thanks for providing a traceback etc. If it's in rs_mdfour64 and using CPU, it's not hung, just perhaps slower than it should be. This might be the same as #6.

@sourcefrog sourcefrog changed the title rdiff hangs without any reasons excess cpu use in rs_mdfour / rs_search_for_block / rs_findmatch Sep 30, 2014
@vps2fast
Copy link
Author

I reproduced this bug. I move this files on another server and rdiff hangs everytime on same delta size:

-rw-r--r-- 1 root root  118489088 Sep 30 18:36 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

And I got strace also:

strace -f rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
execve("/usr/bin/rdiff", ["rdiff", "--block-size=1048576", "--input-size=16777216", "--output-size=16777216", "delta", "extracted_backup_34bdf94b-43eb-4"..., "root.hdd", "extracted_backup_34bdf94b-43eb-4"...], [/* 23 vars */]) = 0
brk(0)                                  = 0x1cd8000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f0a000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=22142, ...}) = 0
mmap(NULL, 22142, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fae95f04000
close(3)                                = 0
open("/usr/lib64/librsync.so.1", O_RDONLY) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320)\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=53976, ...}) = 0
mmap(NULL, 2149160, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fae95adf000
mprotect(0x7fae95aec000, 2093056, PROT_NONE) = 0
mmap(0x7fae95ceb000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xc000) = 0x7fae95ceb000
close(3)                                = 0
open("/lib64/libpopt.so.0", O_RDONLY)   = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\20\33\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=36360, ...}) = 0
mmap(NULL, 2131536, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fae958d6000
mprotect(0x7fae958de000, 2097152, PROT_NONE) = 0
mmap(0x7fae95ade000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x8000) = 0x7fae95ade000
close(3)                                = 0
open("/lib64/libbz2.so.1", O_RDONLY)    = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\26\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=67592, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f03000
mmap(NULL, 2162768, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fae956c5000
mprotect(0x7fae956d5000, 2093056, PROT_NONE) = 0
mmap(0x7fae958d4000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x7fae958d4000
close(3)                                = 0
open("/lib64/libz.so.1", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0 !\0\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=88600, ...}) = 0
mmap(NULL, 2183696, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fae954af000
mprotect(0x7fae954c4000, 2093056, PROT_NONE) = 0
mmap(0x7fae956c3000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14000) = 0x7fae956c3000
close(3)                                = 0
open("/lib64/libc.so.6", O_RDONLY)      = 3
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0p\356\1\0\0\0\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1921096, ...}) = 0
mmap(NULL, 3750152, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x7fae9511b000
mprotect(0x7fae952a5000, 2097152, PROT_NONE) = 0
mmap(0x7fae954a5000, 20480, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x18a000) = 0x7fae954a5000
mmap(0x7fae954aa000, 18696, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7fae954aa000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f02000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f01000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f00000
arch_prctl(ARCH_SET_FS, 0x7fae95f01700) = 0
mprotect(0x7fae954a5000, 16384, PROT_READ) = 0
mprotect(0x7fae956c3000, 4096, PROT_READ) = 0
mprotect(0x7fae95f0b000, 4096, PROT_READ) = 0
munmap(0x7fae95f04000, 22142)           = 0
brk(0)                                  = 0x1cd8000
brk(0x1cf9000)                          = 0x1cf9000
open("extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature", O_RDONLY) = 3
open("root.hdd", O_RDONLY)              = 4
open("extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 5
mmap(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae9411a000
fstat(3, {st_mode=S_IFREG|0644, st_size=49752, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f09000
read(3, "rs\0016\0\20\0\0\0\0\0\10\221\273\200\340\32\246\241\324\1773\244v\274\0230\312\210]-u"..., 16777216) = 49752
read(3, "", 16723968)                   = 0
munmap(0x7fae9411a000, 16781312)        = 0
brk(0x1d52000)                          = 0x1d52000
open("/proc/meminfo", O_RDONLY|O_CLOEXEC) = 6
fstat(6, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f08000
read(6, "MemTotal:       396066784 kB\nMem"..., 1024) = 1024
close(6)                                = 0
munmap(0x7fae95f08000, 4096)            = 0
brk(0x2d5a000)                          = 0x2d5a000
brk(0x3d5a000)                          = 0x3d5a000
fstat(4, {st_mode=S_IFREG|0600, st_size=4347396096, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f08000
read(4, "WithouFreSpacExt\2\0\0\0\20\0\0\0\200\273\0\0\0\10\0\0"..., 16777216) = 16777216
mmap(NULL, 33558528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae9311a000
fstat(5, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fae95f07000
write(5, "rs\0026C\0\20\0\0WithouFreSpacExt\2\0\0\0\20\0\0"..., 4194304) = 4194304
read(4, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8384512) = 8384512
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\2\0>\0\1\0\0\0\363'@\0\0\0\0\0"..., 16777216) = 16777216
write(5, "n(uwsgi.snmp_community) > 72) {\n"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\1\0\0L\211\362H\211\336L\211\347I\211\204$\30-\0\0H\211+\306\204$?\1\0\0\0L"..., 16777216) = 16777216
read(4, "\307\203\200:g\0\0\0\0\0A\307E\10\0\0\0\0I\213n\10\213Uh\205\322t*Lc\372"..., 16777216) = 16777216
read(4, "\0_ZN2v88internal8Counters19compu"..., 16777216) = 16777216
read(4, "\377\377\377\350\270\275\25\377\205\300\177\334H\215\264$\377\1\0\0H\211\337\350\264\273\25\377\353\312H\215"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\203\307\10H\211\376\377\320H\213\204$p\2\0\0H\307\0\0\0\0\0L\211\367\350\201\30\306\377\351"..., 16777216) = 16777216
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\2\0>\0\1\0\0\0\34\23@\0\0\0\0\0"..., 16777216) = 16777216
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\2\0>\0\1\0\0\0\340\272u\0\0\0\0\0"..., 16777216) = 16777216
read(4, "}\376\377\377H\215=ft/\0\271\7\0\0\0L\211\306\363\246\271\f\0\0\0\17\227\302\17\222\300"..., 16777216) = 16777216
read(4, "\0\0\17\204\16\5\0\0H\205\366trH\213\224$h\1\0\0H\213\204$`\1\0\0H\205\322"..., 16777216) = 16777216
read(4, "m\274[\0\0\0\0\0\372\252A\0\0\0\0\0\1\0\0\0\0\0\0\0\0\341\213\0\0\0\0\0"..., 16777216) = 16777216
read(4, "\350\233\375\377\377H\213S\20H\213;D\211\346H\213\\$\10L\213d$\20H\203\304\30\351~\375"..., 16777216) = 16777216
read(4, "5\335A\206\0H\211\347\350C\301\225\377H\211\346H\211\337\3508%\0\0H\213,$H\203\355\30"..., 16777216) = 16777216
read(4, "a\0\0\0\0\0\0\0\246G\23\0\22\0\f\0`\27:\1\0\0\0\0\277\0\0\0\0\0\0\0"..., 16777216) = 16777216
read(4, "$D\17\266\204$\203\0\0\0\210D$C\17\266\204$\202\0\0\0\17\266L$C\210D$B\17"..., 16777216) = 16777216
read(4, "\7\0\0\0\0H\205\377tH\270\377\377\377\377\360\17\301\7\203\350\1\211\204$\200\0\0\0\213\204$"..., 16777216) = 16777216
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\2\0>\0\1\0\0\0f\235@\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0G\3\0361\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\320\215\4\200\211\302\301\352\0041\302i\322\t\10\0\0A\211\324A\301\354\20A1\324A\213U\20\215"..., 16777216) = 16777216
read(4, "\350[\332\356\377I\213\204$\300\0\0\0H\205\300\17\204\257\2\0\0M\215l$\10H\213L$\30"..., 16777216) = 16777216
read(4, "\216\2\0\0\0\0\0\0\323\21\5\0\"\0\f\0 \336J\1\0\0\0\0\23\0\0\0\0\0\0\0"..., 16777216) = 16777216
read(4, "E1\3441\3331\355H\213T$\30H\213|$\20A\203\304\1H\213\2\2134\3\350\260\303\3\0"..., 16777216) = 16777216
read(4, "//  Boost string_algo library sp"..., 16777216) = 16777216
write(5, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "/*=============================="..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "//  Copyright (c) 2001-2011 Hart"..., 16777216) = 16777216
write(5, "\3\0\0\0\f\206\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, " (geographic)\n}\n */\n\n/*\nNote, in"..., 16777216) = 16777216
read(4, "\300;9\230\0\0\0\2\0\5\370\237\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\4\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "////////////////////////////////"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\\S\204u\21S\0\0\0\0\0\0\1\0 \0\0\0\0\0\10\0\1\0\0\0\n\363\1\0\4\0"..., 3141632) = 3141632
read(4, "ash_value, this->priv_real_bucke"..., 16777216) = 16777216
read(4, "rg4,Arg5> type;\n    };\n};\nString"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 2093056) = 2093056
read(4, "gument of the elliptic function."..., 16777216) = 16777216
read(4, "/*!\n * jQuery JavaScript Library"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "R)\0\0\0R*\0\0\0R\2\2\0\0(\0\0\0\0(\5\0\0\0R\221\0\0\0Rq"..., 16777216) = 16777216
read(4, "ea39176a4a7\n1069559093df0243f498"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4190208) = 4190208
read(4, "\336\22\4\225\0\0\0\0\251\4\0\0\34\0\0\0d%\0\0=\6\0\0\254J\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "\200\321\205\321\226\320\262\321\203\321\224\321\202\321\214\321\201\321\217\0\320\272\320\265\321\200\321\203\320\262\320\260"..., 16777216) = 16777216
read(4, "\5\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0\5\0\0\0\5\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "e patches\0Termin\303\251\0Termin\303\251 (r\303\251"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "big-endian, -b, -be\n            "..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "\t!\10\0\f\0\1\2.\0\0\0006 \10\0\f\0\2\2..\0\0\n!\10\0\30\0\r\7"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\361\271\\S\214\201\341\261\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 7335936) = 7335936
read(4, "\4\35\3\201\365u\202'\2\205\324\201\241\2\0253\2\25\7\301z\230\0036\211\240\202\22\3\7\f\301"..., 16777216) = 16777216
read(4, "\336\22\4\225\0\0\0\0N\7\0\0\34\0\0\0\214:\0\0\307\t\0\0\374t\0\0\0\0\0\0"..., 16777216) = 16777216
read(4, "bcd5-00c04fd8d5b6}\nadminProperty"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\34\0\0\0T\271=\vT\271=\vT\271=\v\275Q\4T"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "\0%\377\272\0\240\0'\377\342\0\240\0+\377\342\0\240\0.\377\354\0\240\0003\377\342\0\240\0005"..., 16777216) = 16777216
read(4, "<li style=\"font-size: 120%;\">\n  "..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\37\213\10\10\366a\25S\2\3objdump.1\0\254[\353r\0337\262\376?O\201R"..., 16777216) = 16777216
read(4, "# CATEGORY(XLC|XIM|OM)\tSHARED_LI"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0[\240\366\331\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "# !!!!!!!   DO NOT EDIT THIS FIL"..., 16777216) = 16777216
read(4, "\37\213\10\0\0\0\0\0\2\3\234[\333\222\333F\222}\367WT\310\16\273{LR\4xi6W"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "\37\213\10\0\0\0\0\0\2\3\355\275k\227\33\311\221%\370=~E\10\3329\371\30\0\254\207\244\356"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\22\26c\237\252[Ok\332\365\330\350Uy\331kV\0262\263_\21d\215\27V\236,\343\224\251}"..., 1044480) = 1044480
read(4, "\300;9\230\0\0\0\2\0\5\377T\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "RICT LIABILITY,\n   OR TORT (INCL"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "<?xml version=\"1.0\" encoding=\"IS"..., 16777216) = 16777216
read(4, "\327\323Gm\326\357}\26\347]\26\0079rhI\24<\226\372\325\353\277\220\344\313\352~\235j}\337"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\37\213\10\0\0\0\0\0\2\3\225\234]\217\333\270\25\206\357\363+tS`\366\"\255\365-\25\301\2"..., 16777216) = 16777216
read(4, "e789c7ba33d1d8a\n2d496631ced03949"..., 16777216) = 16777216
write(5, "Z\3\0\0\207\202\340S$\272\\S\311,\\P\0\0\0\0\0\0\1\0\10\0\0\0\0\0\10\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\244\201\0\0\0\0\0\0\223\37\362RH\267\343SH\267\343SH\267"..., 3141632) = 3141632
read(4, "\360=\6\0\f\0\1\2.\0\0\0\357=\6\0\f\0\2\2..\0\0J5\6\0\20\0\7\1"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\244\201\0\0\3519\0\0"..., 3141632) = 3141632
read(4, "c15CentralFreeList18ReleaseListT"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "H;D$\30s\30I\213T\334\10H\2155\212%\2\0H\211\3571\300\350I\360\377\377\353\333H"..., 3141632) = 3141632
read(4, "\316g\30\0\0\0\0\0hZ<\0\0\0\0\0\10\0\0\0\0\0\0\0 ?>\0\0\0\0\0"..., 16777216) = 16777216
read(4, "# Autogenerated by Sphinx on Fri"..., 16777216) = 16777216
read(4, "C\n    2045 aware mail clients.\n\n"..., 16777216) = 16777216
read(4, "\234\214s\21\25\"6\21\371\177c\203\307\215\347 \6\0230\257?\354\365{\311\24\3221\3757\275\362"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\227=l[U\24\307\357\275\366\363\307\263\375\362l?\1776\rN\224FIp\235\17\271!\ti\224"..., 16777216) = 16777216
read(4, "\0\1\0\0\0\3\0\0\0C\0\0\0s\21\0\0\0t\0\0d\1\0d\2\0\203\2\0\1d"..., 16777216) = 16777216
read(4, "\0\203C\4\1@\366\305@u\35@\366\305\20u#\203\345 uk\213C\4\203\350\1\211C\4["..., 16777216) = 16777216
read(4, "\24\1\0\0H\215=\235\6#\0\350@\260\376\377H\215L$\f1\322L\211\346H\211\357\350\256\370"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\213@\30H\211D$(\350\303\245\377\377H\213@\20H\307D$8\0\0\0\0L\213h H\213"..., 16777216) = 16777216
read(4, "\213|$\34Hc\367H\1\306I9\362\17\203\315\0\0\0A\1\377H\211\306H\211\360\351\372\375\377"..., 16777216) = 16777216
read(4, "M\215d\326\370I\213\24$H!\323H\323\352H\211\321\215P\377Hc\322\350\343z\377\377H\1\303"..., 16777216) = 16777216
read(4, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\5\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\366\17\204\231\1\0\0H\215D$PM\205\355L\215\204$\220\0\0\0H\215\25-\274\6\0H\211"..., 16777216) = 16777216
read(4, "\211\311D\1\301Hc\311\17\266\211 \207\335\0D\17\266\301A\203\370\177\17\217\343\374\377\377D\17\266"..., 16777216) = 16777216
read(4, "|\0\0\17\205\311\0\0\0H\213=`Mt\0\350\313\371\273\3771\377\350\24u\376\377\200=\205\307"..., 16777216) = 16777216
read(4, "\201\376\32\1\0\0uX\271\1\0\0\0\353\243\220H\2155\231{\1\0H\215=\342\201\1\0E1"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 3141632) = 3141632
read(4, "9*\n ID_OUI_FROM_DATABASE=FSN DIS"..., 16777216) = 16777216
read(4, "H\215\204$\240\0\0\0H\213t$(E1\311E1\300\2713\0\0\0H\211\352H\211\337H\307"..., 16777216) = 16777216
read(4, "(\213\263\2612Vw?\227s+B\330<s]\17\211\264\247\323\247\2327\30h\244e?\201\253\2"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\35Hed8 \203\220s_@|\r\3265]an[C\324^\343\304\220\26\232]>U`\267"..., 3141632) = 3141632
read(4, "\352k\223\323'i\17\317\270\22:\37\353H\17T\266\277\224\21\270\365\343\6\271\17\267J\344 \4("..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\233\30\200\6A\347>\206\252]Y\343\0\262\252\223\375\356I\355W\242~`\212\237\331\27\f\307\r\345"..., 1044480) = 1044480
read(4, "\377\330\377\341!\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
write(5, "\7\35\306\3361\353\232\351\213\215W\312\376#\5\26\233\227C~\313P\275\263\272\206\356+o\264Ip"..., 4096) = 4096
write(5, "\10\361\220O\"\233\320{\324\321\215\335EsX\324|k\220\10\0007|\323\300\fq\301\367\240G\236"..., 2093056) = 2093056
read(4, "\377\330\377\341)\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
read(4, "\377\330\377\341\37\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\322\245\2328w3c{\365\3061\236\236\370\352E&\365&\332\35^\247\341\266\325\364\33x\274\2670"..., 1044480) = 1044480
read(4, "\360G\31\254+\210\310^@<s\223L\315\25-\356m\344*$\\\20N\1\3\256\177\306\252\370\242"..., 16777216) = 16777216
read(4, "\377\330\377\340\0\20JFIF\0\1\1\1\0\264\0\264\0\0\377\341\21{Exif\0\0II"..., 16777216) = 16777216
write(5, "\306\352M\304\257*8\365\244\300s\273\0p\331\250\201' \367\367\353@<\221\266\232\315\223\317j\0"..., 4096) = 4096
write(5, "?\f\370sS\325e %\245\273\315\317r\7\3\363\305'\242\21\361\177\355\251\361\17\373o\305\321\350"..., 1044480) = 1044480
read(4, "\377\330\377\341\30\216Exif\0\0II*\0\10\0\0\0\17\0\17\1\2\0\6\0\0\0\302\0"..., 16777216) = 16777216
write(5, "\332\240\23\216\347\334\222k\263\5N\355\315\364\330\347\304O^S\304/,\374\335:\25\22Kk\34\200"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2093056) = 2093056
read(4, "\377\330\377\341!\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
read(4, "\377\330\377\341E\204Exif\0\0II*\0\10\0\0\0\v\0\17\1\2\0\6\0\0\0\222\0"..., 16777216) = 16777216
read(4, "\377\330\377\341\37\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
write(5, "x\254\243\314{\270\343E,%\337\345\247\312}j\365\304\377\0i\267x\220\244\214c\22\10\330mn"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\377\330\377\341#\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
read(4, "R\356\21\252\240\223\215\235\267\6{\230\246\23\"\211\16\4o.v\357\\`\26\37\303U\204\266\236t"..., 16777216) = 16777216
write(5, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\377\330\377\341+\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\244\201\0\0\0\0\0\0V\260\rTV\260\rTV\260\rT\0\0\0\0\0\0"..., 4096) = 4096
write(5, "Windows NT 6.1; WOW64) Presto/2."..., 1044480) = 1044480
read(4, "\377\330\377\340\0\20JFIF\0\1\1\1\0`\0`\0\0\377\333\0C\0\1\1\1\1\1\1\1"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "x\255XO\247Z\241\32\366\371\255\210{z\320#n\333\212\332\206U\214\256\346\3'\3\236\246\230\215"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\230\20y[q\234\217sK\270)\0a\273\340T\245W\31<\322,A\277,S\2\26\221\211\306\335"..., 1044480) = 1044480
read(4, "\377\330\377\341#\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
read(4, "{\206\347\361\315\0K\r\343\202\252\256[\323kc\6\260\356\354\255Rx\332\356\322\31\371$,\261\217"..., 16777216) = 16777216
read(4, "\6\0\0\0P>\0\0ssh:notty\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\303\237\366Y\361\365\233\31#\260\227h\377\0\236WI\374\203W\335\264v\251\372\274z6_\326\347\325"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
read(4, "\6\0\0\0\307v\0\0ssh:notty\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\377\330\377\340\0\20JFIF\0\1\1\1\0\264\0\264\0\0\377\341\r\6Exif\0\0II"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8384512) = 8384512
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "h:notty\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4190208) = 4190208
read(4, "\377\330\377\340\0\20JFIF\0\1\1\1\0`\0`\0\0\377\333\0C\0\2\1\1\1\1\1\2"..., 16777216) = 16777216
read(4, "\377\330\377\340\0\20JFIF\0\1\1\1\0`\0`\0\0\377\333\0C\0\1\1\1\1\1\1\1"..., 16777216) = 16777216
read(4, "\30\307|\2322wc\3h\35}M\0#\34\2\10\r\221\220\277\336\250\331\243\0o*A\341U{"..., 16777216) = 16777216
read(4, "\331\36u\213\310\361x|E\f*X:\361\223\302\322\372\316\36x>jm/k\f\276\206\17\222\223"..., 16777216) = 16777216
read(4, "\273\213)\4\202\10\337\205\351\330\37\233\274_\373[\374[\361N\370\356\274W\252\\\2431\300\211V\23"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1044480) = 1044480
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16777216) = 16777216
read(4, "4m$\311\344\373\352w\367 \363CUV\246h\210o\30F\347\6\333\224\365'\247\5$%\320\22"..., 16777216) = 16777216
read(4, "\333\330<\213\251\247\323\326X\346S\214\251\3\335W\234\3\201\214\343\212\271Z\254\\'\267qR\255*"..., 16777216) = 16777216
read(4, "\306\17\3735\306j\277\21\340\370\317\361\213So\27\210,mm\356\226\301'\26\206\351\355\fd\341\\"..., 16777216) = 16777216
read(4, "\351\34{\326\314RO\344 Gdl\36s\2220k'\246\2057\314\215\233W\226V9%\231\270\25"..., 16777216) = 16777216
read(4, "\377\330\377\341!\376Exif\0\0II*\0\10\0\0\0\t\0\17\1\2\0\6\0\0\0z\0"..., 16777216) = 16777216
read(4, "\2563Um\t\261\f\214\240m\333\363\16\224\324\270\17\362\251\346\246\303e\245\233\354\344\35\240\234\365&"..., 16777216) = 16777216
read(4, "\32\331^\210\255\355Z\34\333Kh8Q\366\2002\254W \22x q\336\263\2177\331)F\351\376"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\34\0\0\0\270x\211O\0\0\0\0\234I"..., 2093056) = 2093056
read(4, "!T\226h\360%\362\245\7\207^1Y0\351\3222\264\233\361\214\202[\276\16*o\320[\225\26\304"..., 16777216) = 16777216
read(4, "\247\317\33\243\320\241;\304\373.\316\363\314\201@q\363\16\307\255iF\f-\203\3631\352=+\311o"..., 16777216) = 16777216
read(4, "_i\27\344\254\225\255\246\236\227\365~f\220\312h\244\271i\246\265\263\344r\335\353\253_\360z\31\322"..., 16777216) = 16777216
read(4, "\264rk4\352\16g%\334\0G\335Q\222\177\317\265]\256F\346<\236l\254\301w\257\241\177\360\352"..., 16777216) = 16777216
read(4, "\6\245\246\312\271I\341`G\320\372\21\307\7\221]\364w\377\0i\371\t^\234\341\271\256\244\357\252<"..., 16777216) = 16777216
read(4, "\20\352v\327\32\274V\202\315\245*\250\233\tg\36\247?\215k\370s\340'\301\277\33\370\337\376\22\213"..., 16777216) = 16777216
read(4, "F\312\226\21\265\326\365diO\5\235\273\373LJ\362\264?;\237\2326\332<\210B\240\347\330T\332"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
write(5, "\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377\377"..., 1044480) = 1044480
read(4, "\274\334\200\245Sw\2125a\263\237\275\314\244g\360\374+\321\241\204\245SW\30\375\307,\261\365\251K"..., 16777216) = 16777216
read(4, "\3\270\343\232[\217\232\346&\253\345,\245\0357.\7\3\245eG=\274l\251\271w\1\304y\344\217"..., 16777216) = 16777216
read(4, "\373\27!\3109?\257z\235,\245\334n!f\225%\343s08\366\377\0\353{UIt\32n\366"..., 16777216) = 16777216
read(4, "7_\267Y-\356#\2326\31\00409\241;\210\326{;k\241\3063\327\3455J]\4\34\25\220"..., 16777216) = 16777216
read(4, "\200\33#\263g\3\31\355Q\6\31\340\3\216\375\350\2\274\310\351\301\31?\312\236\343\34\234P\7\310\n"..., 16777216) = 16777216
write(5, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\35\0\0\0\0\0\0\0\0\0\0\0!\0\0\0"..., 4096) = 4096
write(5, "^h\3Bc=\306\322&HN98\315R\236\2\216\21\356\345\345q\225\340Q\250\5\255\245\231\302"..., 1044480) = 1044480
read(4, "\247}A`Gb\252X\365\33s\307\322\251{\253\310\233_C\36\347V\347\0\202q\367}j\203\310"..., 16777216) = 16777216
....

And rdiff do nothing according to strace but it eat so much processor time:

    PID USER      PRI  NI  VIRT   RES   SHR S CPU% MEM%   TIME+  Command
 691051 root       20   0 78444 44132   608 R 100.  0.0  5:01.11 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature
ps aux|grep rdiff
root      691048  0.0  0.0   4404   760 pts/0    S+   18:36   0:00 strace -f rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
root      691051 99.8  0.0  78444 44132 pts/0    R+   18:36   5:10 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
root      692333  0.0  0.0 103248   892 pts/1    S+   18:41   0:00 grep rdiff

@vps2fast
Copy link
Author

Thank you for an answer! But this rdiff task executed about 35 hours for 5Gb file and was killed by me... But for next backup iteration this bug occurs again...

@vps2fast
Copy link
Author

And even iotop show no I/O operations for rdiff:

otal DISK READ: 0.00 B/s | Total DISK WRITE: 0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                         
691051 be/4 root        0.00 B/s    0.00 B/s  ?unavailable?  rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 ~hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

@pavel-odintsov
Copy link

Tool worked 3 hours on CPU E5-2620 0 @ 2.00GHz and did some progress, delta file was larger for 4 mbytes:

-rw-r--r-- 1 root root  122683392 Сен 30 20:31 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

But CPU consumption is really enormous!

root      698496 99.9  0.0  78444 46180 pts/2    R+   19:02 187:30 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

And tool still working :(

@pavel-odintsov
Copy link

After 2 hours no change in delta file:

-rw-r--r-- 1 root root 122683392 Сен 30 20:31 broken_rdiff/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

And tool still works:

root      698496 97.9  0.0  78444 46180 pts/2    R+   Sep30 316:43 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

@vps2fast
Copy link
Author

vps2fast commented Oct 1, 2014

After ~12 hours we have smaaaalll progress:

168820736 Oct  1 10:25 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

But rdiff works about 13 hours and still working now:

root      698496 99.2  0.0  78444 54368 pts/2    R+   Sep30 941:09 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

:(

@pavel-odintsov
Copy link

Tool still working about 27 hours on modern 24-core Xeon server:

root      698496 99.5  0.0  78444 59488 pts/2    R+   Sep30 1562:11 rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

But only 30 mb was added and no any ETA when tool generated complete delta :(

 /bin/ls -la /root/broken_rdiff/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta 
-rw-r--r-- 1 root root 191889408 Окт  1 17:48 /root/broken_rdiff/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

@therealmik
Copy link
Contributor

This could be a result of excessive weak sum collisions. There are a maximum of 2**32 weak sums, but due to the non-prime modulus there's bound to be tons of collisions.

Once a weak sum match is found, an md4 of the block (the next 'blocksize' bytes) is made, then compared against an array of { weaksum, md4sum }, until either the md4 sum matches or the weaksum doesn't.

The odd thing is, that at-least in the example you had, it was probably matching 0-bytes (sparse files?):
#4 0x00007f2d96a55bbc in rs_search_for_block (weak_sum=0, ....
(Although that could be an optimisation problem - can you try with a '-O0 -g' build?)

If that were the case, it shouldn't be slowing down here - it should be skipping along pretty fast.

@vps2fast
Copy link
Author

vps2fast commented Oct 3, 2014

I checked my source files with this script https://gist.github.com/vps2fast/51e4c3ead5a9f2f563bb and it is not sparsed on two different test servers:

perl check_sparse.pl root.hdd 
File root.hdd is not sparsed, size: 4347396096, actual space used: 4347400192

perl check_sparse.pl root.hdd 
File root.hdd is not sparsed, size: 4347396096, actual space used: 4347400192

But idea about collisions looks like solution and I will check it.

Thank you!

@vps2fast
Copy link
Author

vps2fast commented Oct 3, 2014

If anybody interested, rdiff was finished! It worked about 72 hours on Xeon e5-2620v2 (24 cores) and generated delta 400mb delta file for 4 Gb original file :)

-rw-r--r-- 1 root root  394266685 Oct  3 02:52 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta
-rw------- 1 root root 4347396096 Sep 30 18:34 root.hdd

@vps2fast
Copy link
Author

vps2fast commented Oct 3, 2014

I recompiled rdiff with -O0 flags but without any results. Rdiff hangs at 113 MB of delta file same as -O2 version again.

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fff347fe000
0x0000000000406089 in rs_mdfour64 (m=0x7fff346cc1a0, p=0x7fa274fdce18) at mdfour.c:155
155     ROUND3(C, D, A, B, 7, 11);
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.132.el6_5.4.x86_64 popt-1.13-7.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x0000000000406089 in rs_mdfour64 (m=0x7fff346cc1a0, p=0x7fa274fdce18) at mdfour.c:155
#1  0x0000000000406258 in rs_mdfour_block (md=0x7fff346cc1a0, p=0x7fa274fdce18) at mdfour.c:263
#2  0x0000000000406420 in rs_mdfour_update (md=0x7fff346cc1a0, in_void=0x7fa274f15d98, n=233344) at mdfour.c:359
#3  0x000000000040652f in rs_mdfour (out=0x7fff346cc270 "\251\017\226/\035B\253\212\311lH<\222\222t\230\240\302l4\377\177", in=0x7fa274f15d98, n=1048576) at mdfour.c:389
#4  0x00000000004057d0 in rs_calc_strong_sum (buf=0x7fa274f15d98, len=1048576, sum=0x7fff346cc270) at checksum.c:80
#5  0x0000000000404249 in rs_search_for_block (weak_sum=0, inbuf=0x7fa274f15d98 "", block_len=1048576, sig=0x1818d80, stats=0x1818ca0, match_where=0x7fff346cc330) at search.c:143
#6  0x000000000040254d in rs_findmatch (job=0x1818bd0, match_pos=0x7fff346cc330, match_len=0x7fff346cc328) at delta.c:272
#7  0x0000000000401f74 in rs_delta_s_scan (job=0x1818bd0) at delta.c:156
#8  0x000000000040322c in rs_job_work (job=0x1818bd0, buffers=0x7fff346cc470) at job.c:187
#9  0x00000000004030d7 in rs_job_iter (job=0x1818bd0, buffers=0x7fff346cc470) at job.c:145
#10 0x00000000004032f9 in rs_job_drive (job=0x1818bd0, buf=0x7fff346cc470, in_cb=0x4019e3 <rs_infilebuf_fill>, in_opaque=0x1818dc0, out_cb=0x401c38 <rs_outfilebuf_drain>, out_opaque=0x2879430) at job.c:241
#11 0x000000000040543e in rs_whole_run (job=0x1818bd0, in_file=0x1818750, out_file=0x1818990) at whole.c:80
#12 0x00000000004055aa in rs_delta_file (sig=0x1818d80, new_file=0x1818750, delta_file=0x1818990, stats=0x7fff346cc520) at whole.c:154
#13 0x0000000000401678 in rdiff_delta (opcon=0x1818010) at rdiff.c:298
#14 0x0000000000401848 in rdiff_action (opcon=0x1818010) at rdiff.c:358
#15 0x00000000004018c7 in main (argc=8, argv=0x7fff346cc718) at rdiff.c:374

@vps2fast
Copy link
Author

vps2fast commented Oct 3, 2014

I executed rdiff under /usr/bin/time but outout is not very interesting:

    User time (seconds): 434.26
    System time (seconds): 0.68
    Percent of CPU this job got: 99%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 7:15.02
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 176512
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 0
    Minor (reclaiming a frame) page faults: 11084
    Voluntary context switches: 3
    Involuntary context switches: 480
    Swaps: 0
    File system inputs: 0
    File system outputs: 231424
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

@vps2fast
Copy link
Author

vps2fast commented Oct 3, 2014

Hello again!

I tried rdiff --verbose and got very interesting output:

rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682
rdiff: (rs_search_for_block) this was a false positive, the strong sig doesn't match
rdiff: (rs_search_for_block) found weak match for 00000000 in token 2682

It's really looks like collision but code is very hard to understand for me :(

/* 
 * See if there is a match for the specified block INBUF..BLOCK_LEN in
 * the checksum set, using precalculated WEAK_SUM.
 *
 * If we don't find a match on the weak checksum, then we just give
 * up.  If we do find a weak match, then we proceed to calculate the
 * strong checksum for the current block, and see if it will match
 * anything.
 */
int
rs_search_for_block(rs_weak_sum_t weak_sum,
                    char const *inbuf, size_t block_len,
                    rs_signature_t const *sig, rs_stats_t * stats,
                    rs_long_t * match_where)
{
    int                     hash_tag = gettag(weak_sum);
    int                     j = sig->tag_table[hash_tag];
    rs_strong_sum_t         strong_sum;
    int                     got_strong = 0;

    if (j == NULL_TAG) {
        return 0;
    }   

    for (; j < sig->count && sig->targets[j].t == hash_tag; j++) {
        int                     i = sig->targets[j].i;
        int                     token;

        if (weak_sum != sig->block_sigs[i].weak_sum)
            continue;

        token = sig->block_sigs[i].i;

        rs_trace("found weak match for %08x in token %d", weak_sum, token);

        if (!got_strong) {
            rs_calc_strong_sum(inbuf, block_len, &strong_sum);
            got_strong = 1;
        }   

        /* FIXME: Use correct dynamic sum length! */
        if (memcmp(strong_sum, sig->block_sigs[i].strong_sum,
                   sig->strong_sum_len) == 0) {
            /* XXX: This is a remnant of rsync: token number 1 is the
             * block at offset 0.  It would be good to clear this
             * up. */
            *match_where = (rs_long_t)(token - 1) * sig->block_len;
            return 1;
        } else {
            rs_trace("this was a false positive, the strong sig doesn't match");
            stats->false_matches++;
        }   
    }   
    return 0;
}

@pavel-odintsov
Copy link

therealmik, I'm used pypy3 with your pyrdiff and it stuck in same place at 118th MB of delta file like original rdiff :)

/usr/bin/time /usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py delta extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature root.hdd extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta.pyrdiff
-rw-r--r--  1 root root  49K Сен 30 18:34 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature
-rw-r--r--  1 root root 118M Окт  4 16:50 extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta.pyrdiff
-rw-------  1 root root 4,1G Сен 30 18:34 root.hdd

@pavel-odintsov
Copy link

I found source of problem, pyrdiff and rdiff do one byte shift even for 1MB blocks:

Global offset is 2109734912 block size: 1048576 offset: 1262076
Global offset is 2109734912 block size: 1048576 offset: 1262077
Global offset is 2109734912 block size: 1048576 offset: 1262078
Global offset is 2109734912 block size: 1048576 offset: 1262079
Global offset is 2109734912 block size: 1048576 offset: 1262080
Global offset is 2109734912 block size: 1048576 offset: 1262081
Global offset is 2109734912 block size: 1048576 offset: 1262082
Global offset is 2109734912 block size: 1048576 offset: 1262083
Global offset is 2109734912 block size: 1048576 offset: 1262084
Global offset is 2109734912 block size: 1048576 offset: 1262085
Global offset is 2109734912 block size: 1048576 offset: 1262086
Global offset is 2109734912 block size: 1048576 offset: 1262087
Global offset is 2109734912 block size: 1048576 offset: 1262088
Global offset is 2109734912 block size: 1048576 offset: 1262089
Global offset is 2109734912 block size: 1048576 offset: 1262090
Global offset is 2109734912 block size: 1048576 offset: 1262091
Global offset is 2109734912 block size: 1048576 offset: 1262092
Global offset is 2109734912 block size: 1048576 offset: 1262093
Global offset is 2109734912 block size: 1048576 offset: 1262094
Global offset is 2109734912 block size: 1048576 offset: 1262095
Global offset is 2109734912 block size: 1048576 offset: 1262096
Global offset is 2109734912 block size: 1048576 offset: 1262097
Global offset is 2109734912 block size: 1048576 offset: 1262098

@therealmik
Copy link
Contributor

In my pyrdiff I have a hidden command debugsigs - try running it like this:
pyrdiff.py debugsigs sigfile | grep "SIGNATURE: 00000000"

@pavel-odintsov
Copy link

Hello!

With weaksumms everything fine, only one zero checksumm found:

/usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py debugsigs extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature|awk '{print $2}'|grep 00000000|wc -l
1

And number of collisions is very small:

/usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py debugsigs extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature|awk '{print $2}'|wc -l
4146

/usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py debugsigs extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature|awk '{print $2}'|sort|uniq -c |wc -l
4141

And with md4 collisions everything fine too:

/usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py debugsigs extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature|awk '{print $3}'| wc -l
4146

/usr/src/pypy3-2.3.1-src/pypy/goal/pypy-c pyrdiff.py debugsigs extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature|awk '{print $3}'| sort | uniq -c |wc -l
4141

@therealmik
Copy link
Contributor

I was specifically checking what the md4 sum of that 0 block was. I
suspect it's incorrect somehow.

@pavel-odintsov
Copy link

With signature files everything fine! But I found about 128 zero checksumms in new file:

./a.out |grep 00000000|wc -l
128

And with details:

weak: 00000000 offset: 2110783488 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2111832064 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2112880640 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2113929216 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2114977792 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2116026368 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2117074944 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2118123520 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2119172096 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2120220672 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2121269248 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2122317824 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2123366400 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2124414976 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2125463552 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2126512128 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2127560704 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2128609280 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2129657856 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2130706432 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2131755008 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2132803584 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2133852160 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2134900736 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2135949312 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2136997888 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2138046464 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2139095040 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2140143616 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2141192192 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2142240768 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2190475264 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2200961024 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2203058176 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2205155328 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2206203904 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2207252480 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2208301056 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2209349632 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2210398208 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2211446784 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2212495360 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2213543936 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2214592512 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2215641088 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2217738240 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2218786816 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2242904064 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2243952640 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2245001216 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2246049792 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2247098368 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2248146944 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2249195520 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2250244096 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2251292672 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2252341248 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2253389824 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2254438400 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2255486976 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2256535552 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2257584128 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2258632704 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2332033024 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2336227328 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2338324480 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2340421632 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2341470208 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2342518784 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2343567360 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2344615936 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2345664512 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2346713088 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2347761664 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2348810240 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2349858816 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2350907392 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2353004544 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2354053120 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2355101696 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2356150272 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2357198848 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2358247424 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2359296000 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2360344576 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2361393152 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2362441728 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2363490304 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2364538880 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2365587456 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2366636032 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2367684608 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2368733184 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2369781760 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2370830336 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2371878912 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2372927488 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2373976064 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2375024640 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2376073216 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2377121792 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2378170368 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2379218944 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2403336192 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2404384768 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2405433344 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2406481920 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2407530496 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2408579072 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2409627648 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2411724800 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2412773376 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2413821952 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2414870528 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2415919104 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2416967680 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2418016256 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2419064832 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2420113408 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2588934144 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2589982720 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2591031296 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2592079872 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2593128448 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2594177024 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2595225600 md4: a90f962f1d42ab8a
weak: 00000000 offset: 2596274176 md4: a90f962f1d42ab8a

It's normal for my data because I have clean blocks sometimes. And this blocks broke rdiff algorithm?

@therealmik
Copy link
Contributor

So that looks good:
dd if=/dev/zero bs=1048576 count=1 | openssl md4

If you have an all-zeros block it should match that md4, which would cause it to skip forward. For some reason either the md4 doesn't match or it's not skipping forward for some other reason (perhaps integer overflow somewhere?)

@vps2fast
Copy link
Author

vps2fast commented Oct 6, 2014

I suppose everything fine with rdiff/pyrdiff.

Excessive cpu usage is related with byte-to-byte md4 checksumm calculation which is useless for my data.

My data aligned in strong 1MB blocks but with --block-size rdiff still iterate byte-to-byte :(

@sourcefrog
Copy link
Contributor

It shouldn't be recomputing the md4 sum on every single byte - I suspect this is due to too many collisions on the weak checksum, or some similar issue.

@pavel-odintsov
Copy link

I could provide data and signature file to you and you can check :) For your convenience I can provide VPS with debian and test input files.

@therealmik
Copy link
Contributor

Out of interest, did you change the block size between signature and delta operations?

@vps2fast
Copy link
Author

vps2fast commented Oct 8, 2014

Hello!

No, I generated deltas with 1MB block and run delta with 1MB blocksize too:

rdiff --block-size=1048576 --input-size=16777216 --output-size=16777216 delta /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature /vz/private/4822/root.hdd/root.hdd /vz/tmp/extracted_backup_34bdf94b-43eb-4491-bc65-196d8f624d48.signature.delta

@dbaarda
Copy link
Member

dbaarda commented Oct 10, 2014

librsync finds deltas by quickly checking a weak 32bit rolling checksum at every byte offset, and then using a (truncated md4) strong checksum to verify that it's actually a match. This way it can find matches at non-block-size offsets, for example after someone has inserted/deleted a single byte.

Lots of weak checksum collisions means lots of md4sum calculations. There might also be some sort of hashtable limitation being hit inside librsync too for files this big (it's been a long time since I looked at this code).

Also note that for very large files, there is a small risk of getting a strong checksum collision that will result in corrupt data. In rsync there is an algorithm that dynamically calculates a reasonable strongsum and blocksize. You may want to set --sum-size to set the strongsum size based on your file size and --block-size using the following info;

http://marc.info/?l=rsync&m=104840197612370&w=2

Note that rsync also includes a whole-file checksum so it can detect failures from strongsum collisions and retry with a different strongsum seed up to 3 times before falling back to just sending the whole file. librsync doesn't yet have a while-file checksum, so you probably want a larger --sum-size than recommended in the above post.

@vps2fast
Copy link
Author

Hello, folks!

Thank you for your code (and especially thanks to therealmik) :) I completely rewrite rdiff in C++ and remove rollling and weak checksums and everything works perfectly:

./fastrdiff delta root.hdd_with_changed_data root.hdd.signature.fastrdiff  delta_for_one_block
Total time consumed by delta generation is: 7 seconds generation speed: 592.3 MB/s:

7 seconds looks nice instead 72 hours :)

@yobert
Copy link

yobert commented Nov 2, 2014

For comparing two images that always have the same size (or are aligned in fixed blocks), skipping the rolling checksum will certainly be superior. The problem is, in a general purpose sense, if you inserted a single byte into the start of your image and compared without rolling checksum, the delta would be the size of the whole original file.

@sourcefrog
Copy link
Contributor

Yep, just hashing every block will be unbeatable. But, the difference is only this extreme because of a bug in dealing with large files.

@pavel-odintsov
Copy link

sourcefrog, what kind of bug you say?

@sourcefrog
Copy link
Contributor

As Donovan says above, probably the weak sums are colliding too much either
in their actual domain or in the in memory hash table.

On Mon, Nov 3, 2014, 1:13 AM Pavel Odintsov notifications@github.com
wrote:

sourcefrog, what kind of bug you say?


Reply to this email directly or view it on GitHub
#11 (comment).

@cgm999
Copy link

cgm999 commented May 6, 2015

I think the issue is due SRC missing 1 continous block of zeros(1MB in this case), but DST having quite a big area (which is common for VM disk image), and some other block that have the weak sum same with the one of 1MB of \x00 . rdiff would keep doing STRONG_SUM for 1MB(when the large area of \0xx is hit) ,while moving the window 1 byte ,which is sloow (on my CPU is no moving the window with 1000bytes every few seconds)

Using the default block size of 2KB should not hit this issue due 2 reasons: STRONG_SUM(2KB) is way more fast and also the chance of a not having 2KB \0xx is lower.

vps2fast => if you still have the data that triggers this , try to create a new SRC = ORIG_SRC + 1MB of \x00 (begin or end of the file) then rerun rdiff. Also try running rdiff with ORIG_SRC and a default block size (-b 2048).

One possible fix that I see is to detect areas of zero (when SRC is missing it) and issue literal commands to delta output . I have some initial work done , if there is interest I can try get it completed :)

@pavel-odintsov
Copy link

I'm vps2fast. I fixed this issue by complete rdiff rewrite in C++ with disabled rolling checksumms.

@cgm999
Copy link

cgm999 commented May 7, 2015

Hi pavel, it would be good if the issue you encounter is the one from my theory, so if you can try to test as suggessted .

@pavel-odintsov
Copy link

Unfortunately, I can't because my tool is incompatible with rdiff anymore :(

@cgm999
Copy link

cgm999 commented May 7, 2015

Well there is the rdiff from librsync :). That means generating a signature with it and then delta, if you still have the data from when this issue started.

@dbaarda
Copy link
Member

dbaarda commented Mar 1, 2017

Looking at this again, I think the problem was collisions for a weaksum of 0. In particular, there was long runs of zeros which the rolling checksum gave a zero weaksum for at every byte offset. The signature must have had one non-zero block with a zero weaksum and no all-zero's blocks.

A contributing factor here is the blocksize of 1M. The rollsum we use for the weaksum has s0=sum(inbytes+31),s1=sum(s0), with both s0 and s1 truncated to 16 bits and then appended together. The +31 added to each byte is to ensure that the block length ends up encoded in the sum so eg blocks of zeros of different sizes don't end up with the same sum. When the blocksize is 1M, the +31 just cleanly overflows out of both s0 and s1, so it reduces to s0=sum(inbytes),s1=sum(s0), which gives you zero for a 1Mblock of zero's. This happens for any blocksize that is a power of 2 greater than 2^16.

I suspect if you had chosen a blocksize of 1M-1 the problem would have gone away.

Note the recommended blocksize is sqrt(filesize), so a 1M blocksize is pretty large.

@dbaarda
Copy link
Member

dbaarda commented Jun 2, 2017

Performance issues for large files have been improved with the new hashtable. I think the best fix for the remaining points is to use a better rolling hash. That is being tracked in #3, so I'm going to close this as a duplicate of that.

@dbaarda dbaarda closed this as completed Jun 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants