
Make the build reproducible #2

Closed
lamby wants to merge 1 commit

Conversation

@lamby
Contributor

lamby commented Sep 6, 2019

Whilst working on the Reproducible Builds effort [0] we noticed
that libnbd could not be built reproducibly.

This is due to it shipping a pod generation wrapper that
does not use/respect SOURCE_DATE_EPOCH [1] and additionally
varies the output depending on the build user's current
timezone.

(This was originally filed in Debian as #939546 [2].)

 [0] https://reproducible-builds.org/
 [1] https://reproducible-builds.org/docs/source-date-epoch/
 [2] https://bugs.debian.org/939546

Signed-off-by: Chris Lamb <lamby@debian.org>
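
For illustration only, here is a minimal C-flavoured sketch of the convention referred to above (libnbd's actual documentation wrapper is a script, so this shows the principle rather than a patch): prefer SOURCE_DATE_EPOCH over the wall clock, and format the timestamp in UTC so the build user's timezone cannot leak into the generated output.

    /* Sketch only: use SOURCE_DATE_EPOCH if the build environment sets it,
     * otherwise fall back to the current time, and format the date with
     * gmtime_r() so the result does not depend on the local timezone. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    static void format_build_date(char *buf, size_t len)
    {
      const char *sde = getenv("SOURCE_DATE_EPOCH");
      time_t t = sde ? (time_t) strtoll(sde, NULL, 10) : time(NULL);
      struct tm tm;

      gmtime_r(&t, &tm);                    /* UTC, never local time */
      strftime(buf, len, "%Y-%m-%d", &tm);  /* e.g. "2019-09-06" */
    }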
@rwmjones
Collaborator

rwmjones commented Sep 6, 2019

rwmjones closed this Sep 6, 2019
libguestfs pushed a commit that referenced this pull request Mar 1, 2021
Allow the user to control the maximum request size. This can improve
performance and minimize memory usage. With the new option, it is easy
to test and tune the tool for a particular environment.

I tested this on our scale lab with FC storage, copying a 100 GiB image
with 66 GiB of data from a local fast SSD (Dell Express Flash PM1725b
3.2TB SFF) to a qcow2 preallocated volume on FC storage domain
(NETAPP,LUN C-Mode).

The source and destination images are served by qemu-nbd, using the same
configuration used in oVirt:

    qemu-nbd --persistent --shared=8 --format=qcow2 --cache=none --aio=native \
        --read-only /scratch/nsoffer-v2v.qcow2 --socket /tmp/src.sock

    qemu-nbd --persistent --shared=8 --format=qcow2 --cache=none --aio=native \
        /dev/{vg-name}/{lv-name} --socket /tmp/dst.sock

Tested with hyperfine using 10 runs for every request size.

Benchmark #1: ./nbdcopy --request-size=262144 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     113.299 s ±  1.160 s    [User: 7.427 s, System: 23.862 s]
  Range (min … max):   112.332 s … 115.598 s    10 runs

Benchmark #2: ./nbdcopy --request-size=524288 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     107.952 s ±  0.800 s    [User: 10.085 s, System: 24.392 s]
  Range (min … max):   107.023 s … 109.368 s    10 runs

Benchmark #3: ./nbdcopy --request-size=1048576 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     105.992 s ±  0.442 s    [User: 11.809 s, System: 24.215 s]
  Range (min … max):   105.391 s … 106.853 s    10 runs

Benchmark #4: ./nbdcopy --request-size=2097152 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     107.625 s ±  1.011 s    [User: 11.767 s, System: 26.629 s]
  Range (min … max):   105.650 s … 109.466 s    10 runs

Benchmark #5: ./nbdcopy --request-size=4194304 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     111.190 s ±  0.874 s    [User: 11.160 s, System: 27.767 s]
  Range (min … max):   109.967 s … 112.442 s    10 runs

Benchmark #6: ./nbdcopy --request-size=8388608 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     117.950 s ±  1.051 s    [User: 10.570 s, System: 28.344 s]
  Range (min … max):   116.077 s … 119.758 s    10 runs

Benchmark #7: ./nbdcopy --request-size=16777216 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     125.154 s ±  2.121 s    [User: 10.213 s, System: 28.392 s]
  Range (min … max):   122.395 s … 129.108 s    10 runs

Benchmark #8: ./nbdcopy --request-size=33554432 nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     130.694 s ±  1.315 s    [User: 4.459 s, System: 38.734 s]
  Range (min … max):   128.872 s … 133.255 s    10 runs

For reference, the same copy using qemu-img convert with the maximum number
of coroutines:

Benchmark #9: qemu-img convert -n -f raw -O raw -W -m 16 \
              nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock          
  Time (mean ± σ):     106.093 s ±  4.616 s    [User: 3.994 s, System: 24.768 s]
  Range (min … max):   102.407 s … 115.493 s    10 runs

We can see that the current default 32 MiB request size is 23% slower and
uses 17% more CPU time compared with the 1 MiB request size.

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
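
For readers unfamiliar with the option, --request-size simply caps how much data each NBD request carries. A hedged sketch of that idea using libnbd's synchronous API is below (nbdcopy itself keeps several asynchronous requests in flight; copy_chunked and request_size are illustrative names, and error reporting is trimmed):

    /* Hedged sketch, not nbdcopy: copy 'size' bytes in chunks of at most
     * 'request_size' bytes over libnbd's synchronous API. */
    #include <stdint.h>
    #include <stdlib.h>
    #include <libnbd.h>

    static int copy_chunked(struct nbd_handle *src, struct nbd_handle *dst,
                            uint64_t size, size_t request_size)
    {
      char *buf = malloc(request_size);

      if (buf == NULL)
        return -1;

      for (uint64_t offset = 0; offset < size; offset += request_size) {
        size_t n = size - offset < request_size ? size - offset : request_size;

        if (nbd_pread(src, buf, n, offset, 0) == -1 ||
            nbd_pwrite(dst, buf, n, offset, 0) == -1) {
          free(buf);
          return -1;
        }
      }

      free(buf);
      return 0;
    }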
libguestfs pushed a commit that referenced this pull request Mar 1, 2021
Add an example for copying an image between NBD servers using the libev
event loop. Currently it supports only dumb copying, without using extents
or trying to detect zeroes.

The main motivation for adding this example is testing the efficiency
of the home-brew event loop in nbdcopy. This example shows performance
similar to qemu-img convert. nbdcopy performs worse by default, but
tweaking the request size gives similar performance at the cost of more
CPU time.

I tested this only with nbdkit, using the pattern and memory plugins:

    nbdkit -f -r pattern size=1G -U /tmp/src.sock
    nbdkit -f memory size=1g -U /tmp/dst.sock

I used hyperfine to run all benchmarks with --warmup=3 and --runs=10.

Benchmark #1: ./copy-libev nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     552.9 ms ±  47.4 ms    [User: 76.4 ms, System: 456.3 ms]
  Range (min … max):   533.8 ms … 687.6 ms    10 runs

qemu-img shows the same performance, using slightly less CPU time:

Benchmark #2: qemu-img convert -n -W nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     554.6 ms ±  42.4 ms    [User: 69.1 ms, System: 456.6 ms]
  Range (min … max):   535.5 ms … 674.9 ms    10 runs

nbdcopy is 78% slower and uses 290% more CPU time:

Benchmark #3: ./nbdcopy --flush nbd+unix:///?socket=/tmp/src.sock \
              nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     935.8 ms ±  37.8 ms    [User: 206.4 ms, System: 1340.8 ms]
  Range (min … max):   890.5 ms … 1017.6 ms    10 runs

Disabling extents and sparse does not make a difference, but changing
the request size shows similar performance:

Benchmark #4: ./nbdcopy --flush --no-extents --sparse=0 --request-size=1048576 \
              nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     594.5 ms ±  39.2 ms    [User: 250.0 ms, System: 1197.7 ms]
  Range (min … max):   578.2 ms … 705.8 ms    10 runs

Decreasing the number of requests is a little faster and uses less CPU time,
but nbdcopy is still 5% slower and uses 240% more CPU time.

Benchmark #5: ./nbdcopy --flush --no-extents --sparse=0 --request-size=1048576 --requests=16 \
              nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     583.0 ms ±  30.7 ms    [User: 243.9 ms, System: 1051.5 ms]
  Range (min … max):   566.6 ms … 658.3 ms    10 runs

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
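
The core technique the example relies on is mapping a libnbd handle onto a libev watcher. A minimal hedged fragment of that mapping (not the actual copy-libev source, which also manages multiple in-flight requests) looks roughly like this:

    /* Hedged fragment: watch the libnbd handle's fd with an ev_io watcher and
     * translate the handle's AIO direction into EV_READ/EV_WRITE interest. */
    #include <ev.h>
    #include <libnbd.h>

    static struct nbd_handle *nbd;   /* assumed to be connected elsewhere */
    static ev_io watcher;

    static void update_events(struct ev_loop *loop)
    {
      unsigned dir = nbd_aio_get_direction(nbd);
      int events = 0;

      if (dir & LIBNBD_AIO_DIRECTION_READ)  events |= EV_READ;
      if (dir & LIBNBD_AIO_DIRECTION_WRITE) events |= EV_WRITE;

      ev_io_stop(loop, &watcher);
      ev_io_set(&watcher, nbd_aio_get_fd(nbd), events);
      ev_io_start(loop, &watcher);
    }

    static void io_cb(struct ev_loop *loop, ev_io *w, int revents)
    {
      if (revents & EV_READ)
        nbd_aio_notify_read(nbd);
      if (revents & EV_WRITE)
        nbd_aio_notify_write(nbd);
      update_events(loop);           /* direction can change after each notify */
    }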
libguestfs pushed a commit that referenced this pull request Apr 23, 2021
If the destination supports zeroing, try to zero the entire extent in one
request. This speeds up copying of large sparse images. The same logic is
used by nbdcopy.

Here is an example benchmark, copying an empty 1 TiB qcow2 image:

$ qemu-img create -f qcow2 src.qcow2 1t
$ qemu-img create -f qcow2 dst.qcow2 1t
$ qemu-nbd --persistent --socket=/tmp/src.sock --format=qcow2 --read-only src.qcow2
$ qemu-nbd --persistent --socket=/tmp/dst.sock --format=qcow2 dst.qcow2
$ export SRC=nbd+unix:///?socket=/tmp/src.sock
$ export DST=nbd+unix:///?socket=/tmp/dst.sock

$ hyperfine -w3 \
    "./copy-libev $SRC $DST" \
    "qemu-img convert -n -W $SRC $DST" \
    "../copy/nbdcopy --request-size=1048576 --requests=16 --connections=1 $SRC $DST"

Benchmark #1: ./copy-libev nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):     940.9 ms ±  36.3 ms    [User: 80.8 ms, System: 120.0 ms]
  Range (min … max):   892.8 ms … 1005.3 ms    10 runs

Benchmark #2: qemu-img convert -n -W nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      2.848 s ±  0.087 s    [User: 241.7 ms, System: 253.9 ms]
  Range (min … max):    2.740 s …  3.035 s    10 runs

Benchmark #3: ../copy/nbdcopy --request-size=1048576 --requests=16 --connections=1 nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock
  Time (mean ± σ):      1.082 s ±  0.041 s    [User: 77.6 ms, System: 100.9 ms]
  Range (min … max):    1.043 s …  1.148 s    10 runs

Summary
  './copy-libev nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock' ran
    1.15 ± 0.06 times faster than '../copy/nbdcopy --request-size=1048576 --requests=16 --connections=1 nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'
    3.03 ± 0.15 times faster than 'qemu-img convert -n -W nbd+unix:///?socket=/tmp/src.sock nbd+unix:///?socket=/tmp/dst.sock'

Signed-off-by: Nir Soffer <nsoffer@redhat.com>
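
To make the zeroing logic above concrete, here is a hedged synchronous sketch (the example itself issues the request asynchronously from the event loop; zero_extent is an illustrative name): when the destination advertises write-zeroes support, the whole extent is zeroed with a single request instead of writing buffers of zeroes.

    /* Hedged sketch of the fast path: one zero request per extent when the
     * destination supports it. */
    #include <stdint.h>
    #include <libnbd.h>

    static int zero_extent(struct nbd_handle *dst, uint64_t offset, uint64_t length)
    {
      if (nbd_can_zero(dst) == 1)
        return nbd_zero(dst, length, offset, 0);  /* count, offset, flags */

      /* No server-side zero support: the caller falls back to writing
       * explicit zero buffers (not shown). */
      return 1;
    }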