rsync 3.2.3 + xxh128/xxh3 support? #122

Closed
centminmod opened this issue Dec 2, 2020 · 13 comments

@centminmod

@Cyan4973 suggested this question was better asked here :)

I tried building 2 rsync 3.2.3 RPMs on CentOS 7. One on a system with avx2 support and one without avx2 support.

According to the rsync --version output of the two resulting rsync 3.2.3 binaries, xxh128 and xxh3 are supported only by the build from the non-avx2 system and not by the one from the avx2 system. So what on the system determines whether rsync with xxHash will support xxh128 and xxh3? I'm trying to determine if I missed a build step/flag/configuration somewhere.

Both systems had xxhash-devel installed from EPEL repo

yum -q info xxhash-devel
Installed Packages
Name        : xxhash-devel
Arch        : x86_64
Version     : 0.8.0
Release     : 1.el7
Size        : 183 k
Repo        : installed
From repo   : epel
Summary     : Extremely fast hash algorithm - development files
URL         : http://www.xxhash.com/
License     : BSD
Description : Development files for the xxhash library

avx2 support system

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                8
On-line CPU(s) list:   0-7
Thread(s) per core:    2
Core(s) per socket:    4
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 60
Model name:            Intel(R) Core(TM) i7-4790K CPU @ 4.00GHz
Stepping:              3
CPU MHz:               4199.707
CPU max MHz:           4400.0000
CPU min MHz:           800.0000
BogoMIPS:              7982.22
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              8192K
NUMA node0 CPU(s):     0-7
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm epb invpcid_single ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt dtherm ida arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
rsync --version
rsync  version 3.2.3  protocol version 31
Copyright (C) 1996-2020 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, hardlink-specials, symlinks, IPv6, atimes,
    batchfiles, inplace, append, ACLs, xattrs, optional protect-args, iconv,
    symtimes, prealloc, stop-at, no crtimes
Optimizations:
    SIMD, asm, openssl-crypto
Checksum list:
    xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.
free -mlt
              total        used        free      shared  buff/cache   available
Mem:          31973        7163       13891         165       10918       24252
Low:          31973       18081       13891
High:             0           0           0
Swap:          2045           0        2045
Total:        34019        7163       15937

no avx2 support system

lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                1
On-line CPU(s) list:   0
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 45
Model name:            Intel(R) Xeon(R) CPU E5-2643 0 @ 3.30GHz
Stepping:              7
CPU MHz:               3299.998
BogoMIPS:              6599.99
Hypervisor vendor:     KVM
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              4096K
NUMA node0 CPU(s):     0
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx hypervisor lahf_lm tsc_adjust xsaveopt arat
rsync --version
rsync  version 3.2.3  protocol version 31
Copyright (C) 1996-2020 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, hardlink-specials, symlinks, IPv6, atimes,
    batchfiles, inplace, append, ACLs, xattrs, optional protect-args, iconv,
    symtimes, prealloc, stop-at, no crtimes
Optimizations:
    SIMD, asm, openssl-crypto
Checksum list:
    xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.
free -mlt
              total        used        free      shared  buff/cache   available
Mem:           3788        1228         334         192        2226        2093
Low:           3788        3454         334
High:             0           0           0
Swap:          2047         142        1905
Total:         5836        1370        2239
@centminmod
Author

centminmod commented Dec 2, 2020

I think I found the difference. On the avx2 system I had previously compiled zstd, and it seems the rsync build picked that up.

On the avx2 system, the rsync build without xxh128 and xxh3 is using /usr/local/lib/libxxhash.so.0

ls -lah /usr/local/lib/libxxhash.so.0
lrwxrwxrwx 1 root root 18 Nov  2  2019 /usr/local/lib/libxxhash.so.0 -> libxxhash.so.0.7.2
ldd /usr/local/bin/rsync
        linux-vdso.so.1 =>  (0x00007ffdb1179000)
        libattr.so.1 => /lib64/libattr.so.1 (0x00007ff938bf3000)
        libacl.so.1 => /lib64/libacl.so.1 (0x00007ff9389ea000)
        libpopt.so.0 => /lib64/libpopt.so.0 (0x00007ff9387e0000)
        liblz4.so.1 => /usr/local/lib/liblz4.so.1 (0x00007ff9385a8000)
        libzstd.so.1 => /usr/local/lib/libzstd.so.1 (0x00007ff938f4a000)
        libxxhash.so.0 => /usr/local/lib/libxxhash.so.0 (0x00007ff938f3d000)
        libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007ff938145000)
        libc.so.6 => /lib64/libc.so.6 (0x00007ff937d77000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007ff937b73000)
        libz.so.1 => /lib64/libz.so.1 (0x00007ff93795d000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff938df8000)

On the non-avx2 system, the rsync build with xxh128 and xxh3 is using /lib64/libxxhash.so.0

ls -lah /lib64/libxxhash.so.0
lrwxrwxrwx 1 root root 18 Dec  2 02:27 /lib64/libxxhash.so.0 -> libxxhash.so.0.8.0
ldd /usr/local/bin/rsync
        linux-vdso.so.1 =>  (0x00007ffffebc3000)
        libattr.so.1 => /lib64/libattr.so.1 (0x00007fb1bf4ae000)
        libacl.so.1 => /lib64/libacl.so.1 (0x00007fb1bf2a5000)
        libpopt.so.0 => /lib64/libpopt.so.0 (0x00007fb1bf09b000)
        liblz4.so.1 => /lib64/liblz4.so.1 (0x00007fb1bee8c000)
        libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fb1bebb9000)
        libxxhash.so.0 => /lib64/libxxhash.so.0 (0x00007fb1be9b0000)
        libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007fb1be54d000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fb1be17f000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fb1bdf7b000)
        libz.so.1 => /lib64/libz.so.1 (0x00007fb1bdd65000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fb1bf6b3000)
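
The manual ldd comparison above could be scripted. The following is a minimal sketch, not part of rsync or xxHash: the function name `resolved_library` is hypothetical, and the sample text is an excerpt of the ldd output from the avx2 system.

```python
import re

def resolved_library(ldd_output, soname_prefix="libxxhash.so"):
    """Return the path ldd resolved for the first library whose soname
    starts with soname_prefix, or None if no such library is listed."""
    for line in ldd_output.splitlines():
        # Typical line: "libxxhash.so.0 => /lib64/libxxhash.so.0 (0x...)"
        match = re.match(r"\s*(\S+)\s+=>\s+(\S+)", line)
        if match and match.group(1).startswith(soname_prefix):
            return match.group(2)
    return None

# Excerpt of the ldd output from the avx2 system in this thread.
sample = """\
        libzstd.so.1 => /usr/local/lib/libzstd.so.1 (0x00007ff938f4a000)
        libxxhash.so.0 => /usr/local/lib/libxxhash.so.0 (0x00007ff938f3d000)
        libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007ff938145000)
"""

print(resolved_library(sample))  # /usr/local/lib/libxxhash.so.0
```

On a live system the input could come from `subprocess.run(["ldd", path], capture_output=True, text=True).stdout`, and `os.path.realpath` on the returned path would follow the symlink to the versioned file (libxxhash.so.0.7.2 in the case above).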

Once I updated xxHash to 0.8.0 and rebuilt the rsync 3.2.3 RPM, the checksum list shows properly:

rsync --version
rsync  version 3.2.3  protocol version 31
Copyright (C) 1996-2020 by Andrew Tridgell, Wayne Davison, and others.
Web site: https://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, hardlink-specials, symlinks, IPv6, atimes,
    batchfiles, inplace, append, ACLs, xattrs, optional protect-args, iconv,
    symtimes, prealloc, stop-at, no crtimes
Optimizations:
    SIMD, asm, openssl-crypto
Checksum list:
    xxh128 xxh3 xxh64 (xxhash) md5 md4 none
Compress list:
    zstd lz4 zlibx zlib none

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the G

@WayneD
Member

WayneD commented Dec 3, 2020

Yeah, Cyan4973 could have told you that the 128-bit xxhash only just stabilized in its 0.8.0 release, so anything older than that isn't compatible.
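
The version rule described here can be illustrated with a short sketch. It assumes the cut-off is exactly xxHash 0.8.0 (version number 800, using the same packing as `XXH_VERSION_NUMBER` in xxhash.h). The function names are hypothetical, and a real build makes this decision at compile time in C, not in Python.

```python
def xxh_version_number(major, minor, release):
    # Same packing that xxhash.h uses for XXH_VERSION_NUMBER.
    return major * 100 * 100 + minor * 100 + release

def xxhash_checksums(major, minor, release):
    """Hypothetical helper: which xxhash-based checksum names a build
    against the given library version would be expected to offer."""
    names = ["xxh64"]
    # Assumption from this thread: the 128-bit/XXH3 variants are only
    # stable from 0.8.0 onward, so older versions offer xxh64 only.
    if xxh_version_number(major, minor, release) >= 800:
        names = ["xxh128", "xxh3"] + names
    return names

print(xxhash_checksums(0, 7, 2))  # ['xxh64']
print(xxhash_checksums(0, 8, 0))  # ['xxh128', 'xxh3', 'xxh64']
```

The two printed lists match the xxhash entries in the checksum lists of the two builds shown earlier in this issue.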

@WayneD WayneD closed this as completed Dec 3, 2020
@wdoekes

wdoekes commented Jul 8, 2021

For the record:

Cross-link to https://bugs.launchpad.net/ubuntu/+source/rsync/+bug/1934992

ERROR: .pwd.lock failed verification -- update discarded.
rsync error: some files/attrs were not transferred (see previous errors)
(code 23) at main.c(1816) [generator=3.2.3]

Caused by libxxhash 0.7.3 with rsync 3.2.3.

@WayneD
Member

WayneD commented Jul 8, 2021

What is the version of libxxhash on the other side of the connection? You can also use --debug=nstr to have rsync mention what checksum it is negotiating. If it only fails on that one file it would be helpful if you could send me a copy of it for my testing.

@wdoekes

wdoekes commented Jul 9, 2021

What is the version of libxxhash on the other side of the connection?

I'm sorry that was implied by the Groovy. On Ubuntu/Groovy you get 0.8 (at the moment).

But if you manually install the rsync 3.2.3 deb package it will allow any version >= 0.7.1. And on Focal, you have 0.7.3 in the main repository. So if you don't pay attention to this, you get one rsync side with libxxhash 0.7 and one with 0.8.

If it only fails on that one file [...]

It fails on all files.

Verbose steps to reproduce are now attached to the launchpad report.

(For an rsync source fix for this, it might be an option to do an xxh128 validity assertion before adding it to the list of allowed hashes. In a better world, the changed hashes should not have been made available through the same symbols in libxxhash0.)

@utkarsh2102

Hey,

I'm sorry that was implied by the Groovy. On Ubuntu/Groovy you get 0.8 (at the moment).
But if you manually install the rsync 3.2.3 deb package it will allow any version >= 0.7.1. And on Focal, you have 0.7.3 in the main repository. So if you don't pay attention to this, you get one rsync side with libxxhash 0.7 and one with 0.8.

I am sorry but what you're trying to do is not correct! You cannot just decide to take one of the package from another release and the other from another release. That is not how it works, I am afraid :/

If you do a fresh installation of rsync on Groovy, everything works fine! See here:

- $ lxc launch images:ubuntu/groovy groovy-rsync
- $ lxc shell groovy-rsync
- # apt update && apt install rsync
  - here you'll see that the right version of libxxhash gets installed (cf: `Setting up libxxhash0:amd64 (0.8.0-1ubuntu1.20.10.1) ...`)
- # ls -lah /usr/lib/x86_64-linux-gnu/libxxhash.so.0
lrwxrwxrwx 1 root root 18 Jan 12 11:17 /usr/lib/x86_64-linux-gnu/libxxhash.so.0 -> libxxhash.so.0.8.0

@wdoekes

wdoekes commented Jul 9, 2021

You cannot just decide to take one of the package from another release and the other from another release. That is not how it works, I am afraid :/

Yes you can. It might not be recommended, but this generally works fine.

The problem here is that libxxhash reused symbols for changed functionality without (1) bumping the soname version or (2) using different symbols.

But @utkarsh2102: I'm not here to debate whether what I did was right or wrong. The fact is that people do this, and the debian package in Groovy is not equipped to handle this situation. If it's only supposed to work on Groovy it should depend on packages that are available on Groovy only: in this case libxxhash>=0.8. (Also: when upgrading a system, you get the inverse situation with mixed packages, and Debian and Ubuntu work just fine with that.)

And again: I don't have an issue with rsync. I just reported it here in case people came looking for this exact problem.

@WayneD
Member

WayneD commented Jul 9, 2021

Rsync already handles proper negotiation of available hashes. The only possible bad thing I can imagine is that someone with the 0.7 xxhash changed the source code to enable the 128-bit hash when they shouldn't have. If an xxhash 0.7-based rsync says in the --debug=nstr output that it negotiated xxh128 then someone did something very bad.
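
As a rough model of the negotiation mentioned here, the sketch below assumes the rule is "take the first name in the client's list that the server also offers". The function `negotiate` and the lists are illustrative only; this is not rsync's actual code.

```python
def negotiate(client_list, server_list):
    """Return the first checksum name in client_list that also appears
    in server_list, or None if the two sides share no algorithm."""
    offered = set(server_list)
    for name in client_list:
        if name in offered:
            return name
    return None

# Checksum lists as printed by the two rsync builds earlier in this issue.
with_xxh3 = ["xxh128", "xxh3", "xxh64", "md5", "md4", "none"]
without_xxh3 = ["xxh64", "md5", "md4", "none"]

print(negotiate(with_xxh3, without_xxh3))  # xxh64
print(negotiate(with_xxh3, with_xxh3))     # xxh128
```

Negotiation works on names only. Two binaries that both advertise xxh128 will agree on it even if the libraries they load at runtime compute different 128-bit digests.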

@wdoekes

wdoekes commented Jul 9, 2021

Compare:

# dpkg -S `which xxh128sum`
xxhash: /usr/bin/xxh128sum
# apt-cache policy xxhash | grep Installed
  Installed: 0.8.0-1ubuntu1.20.10.1
# for bits in 32 64 128; do xxh${bits}sum <(echo -n); done
02cc5d05  /dev/fd/63
ef46db3751d8e999  /dev/fd/63
99aa06d3014798d86001c324468d497f  /dev/fd/63

Against:

$ dpkg -S `which xxh128sum`
xxhash: /usr/bin/xxh128sum
$ apt-cache policy xxhash | grep Installed
  Installed: 0.7.3-1
$ for bits in 32 64 128; do xxh${bits}sum <(echo -n); done
02cc5d05  /dev/fd/63
ef46db3751d8e999  /dev/fd/63
07fd4e968e916ae11f17545bce1061f1  /dev/fd/63

I don't know who enabled what. But this looks like a problem.
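
The comparison above can be turned into a quick lookup. This sketch only knows the two empty-input digests reported in this comment; the table and function name are hypothetical, and any other digest is reported as unknown.

```python
# Empty-input xxh128sum digests as reported in this thread, keyed by digest.
EMPTY_INPUT_XXH128 = {
    "99aa06d3014798d86001c324468d497f": "0.8.0",
    "07fd4e968e916ae11f17545bce1061f1": "0.7.3",
}

def classify_xxh128sum_line(line):
    """Map one line of `xxh128sum <(echo -n)` output to the xxhash package
    version that produced it in this thread, or None if unrecognized."""
    fields = line.split()
    if not fields:
        return None
    return EMPTY_INPUT_XXH128.get(fields[0].lower())

print(classify_xxh128sum_line("99aa06d3014798d86001c324468d497f  /dev/fd/63"))  # 0.8.0
print(classify_xxh128sum_line("07fd4e968e916ae11f17545bce1061f1  /dev/fd/63"))  # 0.7.3
```

This inspects the command-line tool's output only, so it says which xxhash package is installed, not what a given rsync binary was built against.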

Sounds to me like neither xxh128sum nor all those 128bit symbols (*) should have been in those 0.7.x packages.
(*)

 XXH3_128bits@Base 0.7.0
 XXH3_128bits_digest@Base 0.7.1
 XXH3_128bits_reset@Base 0.7.1
 XXH3_128bits_reset_withSecret@Base 0.7.1
 XXH3_128bits_reset_withSeed@Base 0.7.1
 XXH3_128bits_update@Base 0.7.1
 XXH3_128bits_withSecret@Base 0.7.1
 XXH3_128bits_withSeed@Base 0.7.0

@norbusan ?

@norbusan

norbusan commented Jul 9, 2021

Hi everyone,
sorry I cannot really contribute here, since I am purely a Debian maintainer and have no influence over what Ubuntu carries. Debian carries 0.6 in stable (soon oldstable), and 0.8 in testing (soon to be stable) and unstable.

If there have been strange changes to symbols between 0.7 and 0.8, that is something I cannot deal with now.

What might be a good idea is adding symbol tracking to the package. Help welcomed here.

@WayneD
Member

WayneD commented Jul 10, 2021

Note that the /usr/bin utilities don't have any effect on rsync. The only things to check with rsync are what rsync -V says and what rsync --debug=nstr ... says when doing a copy. Since a (stock) rsync will never support xxh128 unless the xxhash lib is 0.8 or newer, any sign of xxh128 with an xxhash 0.7 lib is a sign of a badly patched rsync.

@wdoekes

wdoekes commented Jul 10, 2021

Hi Wayne,

this is beginning to become a big misunderstanding. Let me try to explain what I think is going on:

  • rsync build checks (during compile time) what to enable;
  • if it compiles against libxxhash 0.8 it enables xxh128;
  • when building binary packages for debian/ubuntu this was done against 0.8.

(All fine up to this point.)

So, now you -- or in this case Ubuntu Groovy -- have an rsync 3.2.3 binary package that thinks it lives alongside a libxxhash 0.8 library. But because both are in packages, they can be individually swapped out for older/newer versions:

  • rsync can freely be upgraded, for example from 3.2.3 to 3.2.4;
  • libxxhash can be upgraded from 0.8.0 to 0.8.1.

But they can also be downgraded. At that point, when someone attempts to load libxxhash 0.7.3 onto their system, we want the package manager to complain that libxxhash-0.7.3 is not compatible with the rsync-3.2.3 binary package because it depends on libxxhash-0.8.0 functionality.

That's what these do:

$ apt-cache show rsync | grep ^Depends -A2
Depends: lsb-base, libacl1 (>= 2.2.23), libc6 (>= 2.15), libpopt0 (>= 1.14)

If you attempt to install this alongside some super ancient libc 2.14, then the package manager will suggest the removal of rsync because it needs libc 2.15 or higher.

For the rsync 3.2.3 Groovy package, this looks like this:

$ apt-cache show rsync | grep ^Depends
Depends: lsb-base, libacl1 (>= 2.2.23), libc6 (>= 2.15), liblz4-1 (>= 0.0~r130),
  libpopt0 (>= 1.14), libssl1.1 (>= 1.1.0), libxxhash0 (>= 0.7.1),
  libzstd1 (>= 1.3.8), zlib1g (>= 1:1.1.4)

Of interest here is that it says:

libxxhash0 (>= 0.7.1)

Instead of:

libxxhash0 (>= 0.8.0)

So, when trying to install the rsync 3.2.3 binary deb package alongside any binary libxxhash package you get no complaints from the (apt) package manager, as long as the version is >=0.7.1.
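
To see why apt stays silent, here is a deliberately simplified version check. It compares only plain dotted upstream versions and ignores epochs, tildes and the rest of Debian's real ordering rules; `upstream_tuple` and `satisfies_minimum` are hypothetical names.

```python
def upstream_tuple(version):
    """Turn e.g. '0.8.0-1ubuntu1.20.10.1' into (0, 8, 0).
    Simplification: drops the Debian revision and assumes the upstream
    part is purely numeric and dot-separated."""
    upstream = version.split("-")[0]
    return tuple(int(part) for part in upstream.split("."))

def satisfies_minimum(installed, minimum):
    """True if the installed version meets a '>= minimum' dependency."""
    return upstream_tuple(installed) >= upstream_tuple(minimum)

# Focal's libxxhash0 against the dependency the Groovy rsync package declares:
print(satisfies_minimum("0.7.3-1", "0.7.1"))  # True
# The same library against the dependency the package arguably should declare:
print(satisfies_minimum("0.7.3-1", "0.8.0"))  # False
```

For real comparisons, `dpkg --compare-versions 0.7.3-1 ge 0.7.1` applies the full Debian rules.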

How?

At the last stage of building the binary deb packages there is the dh_shlibdeps step, which resolves all external/library symbols to packages/versions. It looks like this:

rsync-3.2.3$ dpkg-shlibdeps -v -v debian/rsync/usr/bin/rsync | grep -i symbol.*xxh
dpkg-shlibdeps: debug: Using symbols file /var/lib/dpkg/info/libxxhash0:amd64.symbols for libxxhash.so.0
dpkg-shlibdeps: debug:  Looking up symbol XXH3_64bits_reset@Base
dpkg-shlibdeps: debug:  Found in symbols file of libxxhash.so.0 (minver: 0.7.1, dep: libxxhash0 #MINVER#)
dpkg-shlibdeps: debug:  Looking up symbol XXH3_128bits_withSeed@Base
dpkg-shlibdeps: debug:  Found in symbols file of libxxhash.so.0 (minver: 0.7.0, dep: libxxhash0 #MINVER#)
...

You see, on this system with libxxhash 0.8, the binary package creation process is told that any version from 0.7.1 upward is fine. And thus the resulting package will depend on libxxhash >= 0.7.1 instead of on >= 0.8.0.
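
The lookup in the debug output can be modelled as taking the highest "minver" among the symbols the binary uses. This is a simplified sketch of that idea, not dpkg-shlibdeps itself; `minimum_dependency` is a hypothetical name, and the table holds only the two symbols shown in the log above.

```python
def minimum_dependency(symbols_file, used_symbols):
    """Return the highest minimum version among the used symbols.
    symbols_file maps symbol name -> version in which it is recorded."""
    def as_tuple(version):
        return tuple(int(part) for part in version.split("."))
    return max((symbols_file[name] for name in used_symbols), key=as_tuple)

# minver values taken from the dpkg-shlibdeps debug output above.
symbols = {
    "XXH3_64bits_reset": "0.7.1",
    "XXH3_128bits_withSeed": "0.7.0",
}

print(minimum_dependency(symbols, symbols.keys()))  # 0.7.1
```

Because the result is a maximum, raising the recorded minver of any symbol the binary uses raises the generated dependency accordingly.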

How to remedy the situation?

There is nothing for rsync to do. It could check xxh128 sanity at runtime. But it's really not its job.

Instead, the symbols in the libxxhash packages should be updated to reflect that XXH3_128bits_withSeed in libxxhash0 0.7.1 is not the same as the XXH3_128bits_withSeed in 0.8.0.

This is something that @norbusan might be able to help with. But according to his feedback (thanks!), Debian is not affected by this, because Debian will only have 0.6 and 0.8 libxxhash versions in (at one point) stable releases.

Ubuntu is affected by this because it has taken the 0.7.3 package from debian and landed it into (stable) Ubuntu/Focal.

I think that we should attempt to alter the symbol file I pasted above to show:

 XXH3_128bits@Base 0.8.0
 XXH3_128bits_digest@Base 0.8.0
 XXH3_128bits_reset@Base 0.8.0
 XXH3_128bits_reset_withSecret@Base 0.8.0
 XXH3_128bits_reset_withSeed@Base 0.8.0
 XXH3_128bits_update@Base 0.8.0
 XXH3_128bits_withSecret@Base 0.8.0
 XXH3_128bits_withSeed@Base 0.8.0

If a new binary deb package is built against a newer libxxhash with that symbols file, I think the dependency problem would be fixed. After that, every project that links against libxxhash only needs a rebuild so that its package manager dependencies get updated. And this problem would fade away as the newer packages get more prominence.

I'll be moving this discussion to norbusan/debian-xxhash, as I think any changes should be done there.

Thanks all for listening :)

@WayneD
Member

WayneD commented Jul 10, 2021

Thanks for the clarification. That does indeed sound plausible. One solution is to have debian/ubuntu patch the rsync code to define XXH_INLINE_ALL and then rsync will be fine because it won't use the external xxhash library. Otherwise, if rsync is compiled with xxhash 0.8 the binary must be marked with 0.8 as the minimum version.
