
Add buffering to send-stream #154

Closed
diraimondo opened this issue May 9, 2017 · 20 comments

@diraimondo commented May 9, 2017

btrfs send/receive transfers are usually slowed down by the non-constant speed of the two endpoints (I suspect mostly the destination, which applies the receive step). Using a reasonably large buffer on the destination side (or even on both sides) should improve the transfer time; this is at least true for ZFS: http://everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/. The pv command already has a small default buffer, but I think pv is only used when the --progress option is enabled, and it may (not verified) only be applied on the send side.
I would suggest optionally using a configurable buffer on the destination side (or on both sides), via pv or, better, a more powerful tool like mbuffer (if installed). A sketch of the idea follows.
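For illustration, this is the kind of pipeline I have in mind (host names, paths and the buffer size here are just examples; the mbuffer options are the ones used in the tests later in this thread):

# unbuffered: receive-side stalls propagate straight back to the sender
btrfs send /mnt/pool/snapshots/TEST | ssh root@backuphost "btrfs receive /mnt/backup/"

# buffered: mbuffer decouples the network from the bursty receive step
btrfs send /mnt/pool/snapshots/TEST | ssh root@backuphost "mbuffer -q -m 300M | btrfs receive /mnt/backup/"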

@digint (Owner) commented May 10, 2017

I agree this could very well also apply to btrfs send streams.

I'd love to have some numbers on this before thinking about implementing a buffering feature. My problem here is that I don't have any high-performance machines to run tests on (and not much time for this...). I agree that it probably has more impact on the receiving side, but there might also be an improvement from having a buffer on the sending side.

As far as I know, ssh does no buffering by itself. So when running btrfs send | ssh btrfs receive, I believe the only buffering (on both sides) happens in the pipe buffers, which are 64 KiB by default. See also the discussion in #105, where I optimised writing to disk using buffering with dd.

One more thing: do you know of any core command that we could use for buffering? dd would definitely not do the trick, since it copies synchronously block by block and adds no decoupling beyond the pipe buffers. I'd love not to depend on any extra packages on remote hosts. For now, pv is only used for the --progress and rate_limit options, and it always runs on the local side.

@diraimondo (Author) commented May 10, 2017

I did some tests on my local network using a subvolume snapshot of about 6 GiB:

# first attempt without any buffering: 6 minutes and 7 seconds
time sudo btrfs send TEST | ssh root@sauron "btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,26s user 26,36s system 7% cpu 6:03,76 total
ssh root@sauron   32,53s user 13,54s system 12% cpu 6:07,20 total

# second attempt without buffering: 6 minutes and 29 seconds
time sudo btrfs send TEST | ssh root@sauron "btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,18s user 22,28s system 5% cpu 6:24,09 total
ssh root@sauron   30,79s user 10,80s system 10% cpu 6:29,92 total

# first attempt using a 300MiB buffer on the receive-side using mbuffer: 5 minutes and 31 seconds... good!
time sudo btrfs send TEST | ssh root@sauron "mbuffer -m 300M | btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
in @  0.0 KiB/s, out @ 13.0 MiB/s, 6007 MiB total, buffer   0% full
summary: 6008 MiByte in  5min 28.3sec - average of 18.3 MiB/s
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,26s user 25,30s system 8% cpu 5:16,39 total
ssh root@sauron   31,74s user 13,20s system 13% cpu 5:31,69 total

# second attempt using a 300MiB buffer on the receive-side using mbuffer: 5 minutes and 8 seconds... even better!
time sudo btrfs send TEST | ssh root@sauron "mbuffer -m 300M | btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
in @  0.0 KiB/s, out @ 5354 KiB/s, 6007 MiB total, buffer   0% full
summary: 6008 MiByte in  5min 06.7sec - average of 19.6 MiB/s
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,19s user 19,20s system 8% cpu 3:44,97 total
ssh root@sauron   29,98s user 7,62s system 12% cpu 5:08,00 total

# an attempt to replace mbuffer with pv on the receive-side: more than 8 minutes... very bad!
time sudo btrfs send TEST | ssh root@sauron "pv -B 300m | btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,16s user 19,56s system 4% cpu 8:09,15 total
ssh root@sauron   29,95s user 8,05s system 7% cpu 8:10,88 total

# attempt to use a 300MiB buffer on the send-side: 6 minutes and 21 seconds... no difference, as the buffer is always full
time sudo btrfs send TEST | mbuffer -m 300M | ssh root@sauron "btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
in @  0.0 KiB/s, out @ 12.7 MiB/s, 6006 MiB total, buffer   1% full
summary: 6008 MiByte in  6min 18.2sec - average of 15.9 MiB/s
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,22s user 22,43s system 6% cpu 5:52,47 total
mbuffer -m 300M  1,34s user 10,78s system 3% cpu 6:18,27 total
ssh root@sauron   31,23s user 9,78s system 10% cpu 6:21,71 total

# final attempt to use a 300MiB buffer on both sides: ...no further advantage
time sudo btrfs send TEST | mbuffer -m 300M | ssh root@sauron "mbuffer -m 300M | btrfs receive /mnt/btrfs-data-pool/ ; btrfs subvolume delete /mnt/btrfs-data-pool/TEST"
At subvol TEST
At subvol TEST
in @ 25.0 MiB/s, out @ 96.3 MiB/s, 5737 MiB total, buffer  88% full
summary: 6008 MiByte in  4min 51.2sec - average of 20.6 MiB/s
summary: 6008 MiByte in  5min 01.2sec - average of 19.9 MiB/s
Delete subvolume (no-commit): '/mnt/btrfs-data-pool/TEST'
sudo btrfs send TEST  0,32s user 25,79s system 10% cpu 4:04,33 total
mbuffer -m 300M  1,36s user 13,46s system 5% cpu 4:51,33 total
ssh root@sauron   32,83s user 13,87s system 15% cpu 5:04,52 total

@diraimondo (Author) commented May 10, 2017

About the use of standard commands: as far as I know, none of the cited commands (except dd) ship by default on any distribution. mbuffer is very powerful: it does double buffering, can use a percentage of main memory as its buffer size, and can even use a file on disk as the buffer. buffer is another alternative (less powerful than mbuffer). pv does not look suitable for our goal. Example invocations are sketched below.
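To illustrate the buffer-size forms (sketches only; see the mbuffer man page for the full option list):

# fixed-size in-memory buffer, quiet (no status line):
... | mbuffer -q -m 300M | btrfs receive /mnt/backup/

# buffer sized as a percentage of physical memory:
... | mbuffer -q -m 30% | btrfs receive /mnt/backup/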

@digint (Owner) commented May 10, 2017

Looks great! What I can see from your tests is a ~20% improvement.
I pushed a quick and dirty implementation using mbuffer to the receive_buffer branch (commit 7d2060b).

Simply set receive_buffer 500m in your btrbk.conf, and it should add the needed mbuffer command.

I put it AFTER decompression, hoping this makes the most sense.

@diraimondo it would be great if you had time to test this a bit, especially with different stream_compress configs. I suspect the improvement might drop with processor-intensive compression algorithms like xz, but it will probably still look good with lzo. The expected pipeline shape is sketched below.
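For clarity, with compression enabled the generated pipeline should look roughly like this (a sketch with hypothetical paths, assuming lzop as the lzo compressor; the exact command is whatever btrbk generates):

btrfs send /mnt/pool/snap | lzop | ssh root@target "lzop -d | mbuffer -q -m 500M | btrfs receive /mnt/backup/"

Buffering after decompression means mbuffer feeds btrfs receive, the slow and bursty consumer, at the full uncompressed stream rate.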

@diraimondo (Author) commented May 10, 2017

@digint I'm testing your branch, but in my test I can't see any mbuffer instance on the receive machine. I'm pulling the same subvolume TEST from my laptop to the server, with the following config:

transaction_syslog daemon
#lockfile /var/lock/btrbk.lock
timestamp_format long

ssh_identity /root/.ssh/id_rsa
ssh_user root

volume ssh://nazgul-dock-eth/mnt/btrfs-pool
  group test
  snapshot_dir snapshots
  snapshot_preserve_min all
  snapshot_create no
  #resume_missing yes
  receive_buffer 500m
  target_preserve_min no
  target_preserve 6d 5w 6m
  target send-receive /mnt/btrfs-data-pool/TEST
  subvolume TEST

From the process list I can spot the command used; note there is no mbuffer in it: sh -c { ssh -i /root/.ssh/id_rsa root@nazgul-dock-eth 'btrfs send /mnt/btrfs-pool/snapshots/TEST.20170510T0000' 2>&3 | btrfs receive /mnt/btrfs-data-pool/TEST/ 2>&3 ; } 3>&1
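What I expected to see instead (my guess at the intended command, with mbuffer inserted in front of the local receive):

sh -c { ssh -i /root/.ssh/id_rsa root@nazgul-dock-eth 'btrfs send /mnt/btrfs-pool/snapshots/TEST.20170510T0000' 2>&3 | mbuffer -q -m 500M | btrfs receive /mnt/btrfs-data-pool/TEST/ 2>&3 ; } 3>&1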

@digint (Owner) commented May 10, 2017

Sorry, I forgot about the case where the source is remote and the target is local. Commit 9b00045 should now work for you as well.

@diraimondo (Author) commented May 10, 2017

Pulling the same 6 GiB subvolume TEST from my laptop to the server:

  • without buffer:
    sudo /tmp/btrbk/btrbk -v run nazgul-dock-eth:/mnt/btrfs-pool/TEST 27,97s user 18,85s system 12% cpu 6:29,36 total

  • with option 'receive_buffer 500m':
    sudo /tmp/btrbk/btrbk -v run nazgul-dock-eth:/mnt/btrfs-pool/TEST 27,38s user 22,44s system 16% cpu 4:53,65 total

  • with options 'receive_buffer 500m; stream_compress lzo': light CPU usage on the send side, with the buffer on the receive side almost full
    sudo /tmp/btrbk/btrbk -v run nazgul-dock-eth:/mnt/btrfs-pool/TEST 31,99s user 24,43s system 21% cpu 4:24,57 total

  • with options 'receive_buffer 500m; stream_compress xz': one core of the laptop on the send side is fully loaded, and this creates a bottleneck that leaves the buffer on the receive side almost empty!
    ... I stopped the test after 11 minutes with only 1.5GiB transferred...

Note that the CPU usage figures above refer to the server pulling the data from my laptop, i.e. the receive side.
Furthermore, mbuffer prints a live progress line by default, just like pv: this looks confusing/disturbing when combined with the --progress option (which activates pv on the send side); mbuffer should be invoked with its quiet option (-q).
mbuffer also accepts a percentage of physical memory (e.g. -m 30%) as the buffer size: it would be nice to allow this syntax in your parser (see the sketch below).
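A minimal sketch of such size validation (illustrative shell only, not btrbk's actual parser):

# accept sizes like 500m, 2G or 30% (percentage of physical memory)
validate_bufsize() {
  [[ "$1" =~ ^[0-9]+([kKmMgG]|%)?$ ]]
}
validate_bufsize 500m && echo ok
validate_bufsize 30%  && echo ok

Anything that passes can be handed straight to mbuffer, since mbuffer itself understands these size suffixes (both the 300M and 30% forms appear in this thread).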

I'll run a few more tests to verify whether it makes sense to use a small buffer in a same-machine send/receive. My guess is that it could help a bit.

@diraimondo (Author) commented May 11, 2017

As promised, tests using buffering on the local machine (no network): even a small buffer helps!

time sudo btrfs send TEST-ro | sudo btrfs receive snapshots
sudo btrfs send TEST-ro  0,05s user 13,99s system 28% cpu 48,672 total
sudo btrfs receive snapshots  20,03s user 13,32s system 68% cpu 48,827 total

time sudo btrfs send TEST-ro | sudo btrfs receive snapshots
sudo btrfs send TEST-ro  0,05s user 13,52s system 28% cpu 47,191 total
sudo btrfs receive snapshots  19,81s user 12,94s system 69% cpu 47,316 total

time sudo btrfs send TEST-ro | mbuffer -q -m 64M | sudo btrfs receive snapshots
sudo btrfs send TEST-ro  0,15s user 13,88s system 32% cpu 42,765 total
mbuffer -q -m 64M  0,53s user 5,72s system 14% cpu 43,175 total
sudo btrfs receive snapshots  20,25s user 12,76s system 76% cpu 43,366 total

time sudo btrfs send TEST-ro | mbuffer -q -m 128M | sudo btrfs receive snapshots
sudo btrfs send TEST-ro  0,16s user 13,64s system 32% cpu 42,766 total
mbuffer -q -m 128M  0,57s user 5,47s system 13% cpu 43,632 total
sudo btrfs receive snapshots  19,98s user 12,77s system 74% cpu 43,766 total

time sudo btrfs send TEST-ro | mbuffer -q -m 512M | sudo btrfs receive snapshots
sudo btrfs send TEST-ro  0,15s user 13,91s system 35% cpu 40,071 total
mbuffer -q -m 512M  0,61s user 5,75s system 14% cpu 43,892 total
sudo btrfs receive snapshots  20,14s user 13,14s system 75% cpu 43,992 total

I did the tests on the same pool for lack of a second disk on the laptop; I expect similar behavior using multiple pools on different disks.

@digint (Owner) commented May 11, 2017

@diraimondo thanks for testing! I suspect the "xxx total" output from the time command is the total wall-clock time the command needed to complete, but no matter: it's faster with mbuffer :)

I just pushed some changes you suggested.

Note to self: rename this to stream_buffer for the final implementation; this seems more consistent.

digint added the enhancement label May 11, 2017

digint changed the title from "buffering" to "Add buffering to send-stream" May 11, 2017

@diraimondo (Author) commented May 11, 2017

The current code doesn't allow using the buffer without an ssh session, right? As the tests I reported today show, there seems to be a small advantage in using a small buffer even on a local transfer.

@digint (Owner) commented May 11, 2017

Yes, it does work without ssh. I just double-checked:

btrbk.conf.receive_buffer:

snapshot_dir    _btrbk_snap
receive_buffer 12%
volume /mnt/btr_system
  subvolume root_gentoo
    target send-receive /mnt/btr_data

test run:

./btrbk -c btrbk.conf.receive_buffer -l debug dryrun
[...]
Creating incremental backup...
[send/receive] source: /mnt/btr_system/_btrbk_snap/root_gentoo.20170511
[send/receive] parent: /mnt/btr_system/_btrbk_snap/root_gentoo.20170510
[send/receive] target: /mnt/btr_data/root_gentoo.20170511
### (dryrun) btrfs send -p /mnt/btr_system/_btrbk_snap/root_gentoo.20170510 /mnt/btr_system/_btrbk_snap/root_gentoo.20170511 | mbuffer -q -m 12% | btrfs receive /mnt/btr_data/
[...]
@diraimondo (Author) commented May 11, 2017

While using the buffering branch regularly, I spotted two strange behaviors:

  • in local-machine usage with a 64MB buffer, I can see the mbuffer process keeping one core at 100% usage... 8-|
  • if I cancel btrbk with CTRL+C during such runs, the btrfs and mbuffer processes are not stopped;

I also suspect that in a send/receive over an ssh session such processes stay alive even after the ssh connection drops: I've found many still-alive btrfs and mbuffer processes...

note: I'm not sure whether this behavior was also present on the main branch; I'm new here!
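An illustrative way to check for (and clean up) such leftovers after an interrupted run:

# list surviving pipeline members with their full command lines
pgrep -af 'mbuffer|btrfs (send|receive)'

# and, if needed, get rid of them
pkill -f 'mbuffer|btrfs (send|receive)'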

@digint (Owner) commented May 11, 2017

in local-machine usage with a 64MB buffer, I can see the mbuffer process keeping one core at 100% usage... 8-|

bad.

if I cancel btrbk with CTRL+C during such runs, the btrfs and mbuffer processes are not stopped

bad, bad, BAD! Looks like mbuffer does some ugly forks (it says something about multithreading in the man page, which already scared me in the first place...)
Edit: on second thought, mbuffer needs to be multithreaded in order to read and write concurrently.

Maybe we'll also have to check other tools, like "buffer". One common way to make cleanup reliable is sketched below.
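On the cleanup front, a common shell-level pattern (a sketch of the general technique with placeholder paths, not what btrbk currently does) is to give the pipeline its own process group and signal the whole group on interrupt:

# bash: with job control (set -m) each background job gets its own
# process group, whose ID equals the job leader's PID ($!)
set -m
btrfs send /src | mbuffer -q -m 64M | btrfs receive /dst &
pgid=$!
# forward CTRL+C / termination to every process in the pipeline
trap 'kill -TERM -- "-$pgid"' INT TERM
wait "$pgid"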

@diraimondo (Author) commented May 11, 2017

I did a test using buffer and, at least in the same-machine test, it didn't make the transfer any quicker... On the other hand, it does seem to handle the SIGINT signal (CTRL-C) properly.

@digint (Owner) commented Jun 8, 2017

Final implementation is on master (flagged experimental): 315b3f2

Merged from the receive_buffer branch; renamed the configuration option to stream_buffer.

@diraimondo (Author) commented Jun 8, 2017

Did you solve the problem of killing the mbuffer process on btrbk exit?

@digint (Owner) commented Jun 8, 2017

No. That's the main reason why I'm leaving it "experimental".

I did some tests though (local and remote transfers), and all of them resulted in the mbuffer process being properly killed after SIGINT (ctrl-c).

Maybe you use a different shell? I'm on Gentoo here, and use bash on all machines.

@diraimondo (Author) commented Jun 8, 2017

I'm using Arch Linux with zsh.

@diraimondo (Author) commented Jun 19, 2017

I'm trying to use the new mbuffer-based feature, but sometimes I find the mbuffer process hogging a CPU core on the server during transfers from the laptop.

@digint (Owner) commented Jul 30, 2017

stream_buffer is included in btrbk-v0.25.1; closing this issue.

digint closed this Jul 30, 2017

digint added a commit that referenced this issue Aug 21, 2017

Merge tag 'v0.25.1' into qgroup
Version 0.25.1

  * Support for btrfs-progs v4.12: fix parsing of "btrfs sub show"
    output, which now prints relative paths (close #171).
  * Add "stream_buffer" configuration option (close #154).
  * Bugfix: accept "no" for "transaction_log", "transaction_syslog"
    and "lockfile" configuration options.
  * Show "up-to-date" status for backups in "stats" command.
  * Show "correlated" status instead of "orphaned" in "stats" command.
  * Check source subvolumes for readonly and received_uuid flags, and
    abort if one of them is set.