speed: borg vs rsync for cPanel account #4190

Closed
enboig opened this issue Nov 30, 2018 · 21 comments

@enboig

enboig commented Nov 30, 2018

  • Adding version: borg 1.1.7

I have a script that backs up my VPS webserver accounts (web + mail + mysqldumps) using rsync. I am exploring switching from rsync to borg to keep a longer version history, but borg is much slower:

small account (478.06 MB):
borg: 57.91 seconds
rsync: 0m3.762s

mid size (2.18 GB):
borg: 2 minutes 18.17 seconds
rsync: 0m5.974s

big size (15.87 GB):
borg: 14 minutes 59.75 seconds
rsync: 0m25.863s

huge (133.50 GB):
borg: 2 hours 7 minutes 38.05 seconds
rsync: 4m40.872s

I create my backups with

borg create --stats --verbose --filter AME --list --show-rc --compression auto,zstd ::'{hostname}-account-{now}'  /home/account

Both backups go from server A to B and are written in the same RAID disk.

The backup involves lots of small text files. Should I have set special parameters when creating the repository?

Are there SSH options to improve the speed? It is a fibre connection, so bandwidth shouldn't be a problem.

@enboig
Author

enboig commented Nov 30, 2018

I have just discovered #3039, but I don't know whether it is related to my case or how to check it.

@ThomasWaldmann
Member

rsync doesn't do much with the data, it just copies them from a to b.

borg chunks, hashes/deduplicates, compresses, encrypts, authenticates the data, so it is expected to need more resources (cpu, ram, time, ...) than rsync.

for a practical comparison, you should not only compare the first run (which is slowest), but also subsequent runs (which might be faster if there is duplicate data).

yes, it can also be that borg is slower because it cares more for file metadata than rsync, not sure about that.

@enboig
Author

enboig commented Dec 1, 2018

I assumed that, as both only transmit changed data, they would get similar speeds. Usually bandwidth is the bottleneck, and compression should help borg. I didn't think about encryption. Deduplication should also help borg, as should having a big repository. I could imagine borg being slower, but not by that much.

And it was the fifth daily run.

I will try an unencrypted and uncompressed backup in a week, and try to find the bottleneck.

@enboig
Author

enboig commented Dec 3, 2018

Checking the logs, I notice that unchanged files are uploaded again:

A /home/acount/public_html/......gif
A /home/acount/public_html/......jpg

It is a VPS and home is mounted, so maybe it is related to inodes... I will check different values for --files-cache
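
One way to quantify this is to tally the status letters that `--list` prints at the start of each line (A = added, M = modified, U = unchanged, E = error). A minimal sketch, assuming the borg output was saved to a log file; the function name `count_status` and the log path in the usage comment are hypothetical:

```shell
#!/bin/sh
# Tally the per-file status letters in a saved `borg create --list` log.
# count_status is an illustrative helper, not part of borg.
count_status() {
    # Count only lines of the form "<single status char> <path>";
    # stats lines such as "Duration: ..." are ignored.
    awk '/^[A-Za-z?-] / { n[$1]++ } END { for (s in n) print s, n[s] }' "$1" | sort
}

# Usage (hypothetical log path):
#   count_status /var/log/borg-account.log
```

If nearly every line is counted under A on a repeat run, the files cache is not recognising the files; a healthy repeat run should show few A or M entries.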

@ThomasWaldmann
Member

Please add the borg version to your top post.

@enboig
Author

enboig commented Dec 3, 2018

done: 1.1.7

The next backup is tonight; tomorrow I will post new times.

@enboig
Author

enboig commented Dec 4, 2018

Again, "static" files (PHP files, images, etc.) are added again.
I run borg with:

/root/bin/borg create               \
    --stats                         \
    --files-cache ctime,size        \
    --verbose                       \
    --filter AME                    \
    --list                          \
    --show-rc                       \
    --compression auto,zstd         \
    --exclude "/home/$1/tmp/*"      \
    "::{hostname}-$1-{now}"         \
    "/home/$1"

The script is called in a loop, once for each server account.

@infectormp
Contributor

@enboig which file system is used?

@enboig
Author

enboig commented Dec 4, 2018

The filesystem in the VPS is:

# cat mtab | grep home
/dev/mapper/vg_label-lv_home /home ext4 rw,usrjquota=quota.user,jqfmt=vfsv0 0 0

# findmnt | grep home
├─/home                                                                       /dev/mapper/vg_label-lv_home                                                  ext4        rw,relatime,barrier=1,data=ordered,jqfmt=vfsv0,usrjquota=quota.user

@infectormp
Contributor

To complete the experiment, you now need to try --ignore-inode.

@knutov

knutov commented Dec 4, 2018

@enboig It's slow because you are using zstd to compress. Change it to lz4 and it will be pretty fast.

Also, the first borg run will always be a bit slow (in my case, twice as slow as tar.gzip to a remote sshfs), but any additional snapshot is much faster than tar/rsync.

Also, I think it is better to add --nobsdflags --files-cache=ctime,size, especially on Linux/VPS.

@enboig
Author

enboig commented Dec 4, 2018

I get a warning, and I applied the extra flags.

Warning: "--ignore-inode" has been deprecated. Use "--files-cache=ctime,size" or "...=mtime,size" instead.

@knutov I changed the compression; it may help, but my problem is that all files appear to be new every time.

I think --nobsdflags did the trick. I don't know why, but in my last backup log, there wasn't a single file marked as "M", all were "A". Now I am running it manually and everything seems fine.

Thanks a lot. I will wait a few days and report back on whether everything is working as expected.

@enboig enboig closed this as completed Dec 4, 2018
@enboig
Author

enboig commented Dec 10, 2018

After checking the weekend's logs, it is slow again. The command I am using is:

/root/bin/borg create               \
    --stats                         \
    --files-cache ctime,size        \
    --verbose                       \
    --filter AME                    \
    --list                          \
    --show-rc                       \
    --nobsdflags                    \
    --compression auto,lz4          \
    --exclude '*/.cpan/*'           \
    --exclude "/home/$1/tmp/*"      \
    --exclude '*/.cache/*'          \
    "::{hostname}-$1-{now}"         \
    "/home/$1"

Where $1 is the account name. All static files appear again as A. For some reason borg thinks all files are new.

------------------------------------------------------------------------------
Archive name: hostname-biggest account-2018-12-10T06:32:55
Archive fingerprint: 95f250d234e0c2023674b4f768af0cb441eafcf2f171685ecc3c10f77c35de57
Time (start): Mon, 2018-12-10 06:32:57
Time (end):   Mon, 2018-12-10 08:39:45
Duration: 2 hours 6 minutes 48.23 seconds
Number of files: 677658
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              133.61 GB             92.07 GB             78.49 MB
All archives:                4.03 TB              2.93 TB            174.39 GB

                       Unique chunks         Total chunks
Chunk index:                 1645125             30188099
------------------------------------------------------------------------------
terminating with success status, rc 0

If I run it for the same account twice in a row, old files are not added again, but when running from cron, one account after the other, it fails to detect already uploaded files.
Maybe the cache is pruned on "nearly full" partitions? My root partition is quite full (95%); I moved /root/.cache/borg to another partition and symlinked it (I couldn't find a better way to change the cache location). Any ideas are welcome.
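
On the cache location: borg reads the `BORG_CACHE_DIR` environment variable (default `~/.cache/borg`), which avoids the need for a symlink. A sketch, assuming a hypothetical mount point `/mnt/data` with enough free space; the variable has to be exported so that the borg process inherits it:

```shell
#!/bin/sh
# /mnt/data/borg-cache is a hypothetical path; use any partition with room.
BORG_CACHE_DIR=/mnt/data/borg-cache
export BORG_CACHE_DIR

# Show how full the partition holding the cache is before a run.
# Falls back to / if the cache directory does not exist yet.
df -P "$BORG_CACHE_DIR" 2>/dev/null || df -P /
```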

@enboig enboig reopened this Dec 10, 2018
@infectormp
Contributor

@enboig could your problem be related to #4192?

@enboig
Author

enboig commented Dec 10, 2018

It seems to be the same issue; I just raised BORG_FILES_CACHE_TTL.
I will run the script twice and check the times and cache size.

@enboig
Author

enboig commented Dec 11, 2018

I back up 45 sets in a row daily, and I set BORG_FILES_CACHE_TTL=50. What would be an advisable value? I thought "2X + [some margin]" would suffice, but after looking at the first log it doesn't appear so.

@ThomasWaldmann
Member

X should be ok already.
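
As far as I understand the borg documentation, the files cache drops entries for files it has not seen during the last `BORG_FILES_CACHE_TTL` backup runs (default 20), so with X data sets going into the same repository the TTL needs to be at least X. A sketch of deriving that value, assuming one account per directory under `/home`; the helper `ttl_for_sets`, the wrapper `backup_account.sh` and the margin of 5 are all hypothetical choices:

```shell
#!/bin/sh
# Count the backup sets (one per directory under the given root) and
# derive a TTL with a small safety margin. The margin is arbitrary.
ttl_for_sets() {
    root=$1
    margin=${2:-5}
    count=0
    for dir in "$root"/*/; do
        # If the glob matches nothing it stays literal and fails this test.
        [ -d "$dir" ] && count=$((count + 1))
    done
    echo $((count + margin))
}

# Usage (backup_account.sh is a hypothetical wrapper around `borg create`):
#   BORG_FILES_CACHE_TTL=$(ttl_for_sets /home)
#   export BORG_FILES_CACHE_TTL
#   for dir in /home/*/; do ./backup_account.sh "$(basename "$dir")"; done
```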

@enboig
Author

enboig commented Dec 11, 2018

I forgot to use export for BORG_FILES_CACHE_TTL; I am running a new batch of tests.

@enboig
Author

enboig commented Dec 12, 2018

Now everything works as expected, with borg being faster than rsync. It wasn't a problem with --files-cache or --nobsdflags.

Thanks a lot for your help!

@enboig enboig closed this as completed Dec 12, 2018
@ThomasWaldmann
Member

So, what was the problem?

@enboig
Author

enboig commented Dec 13, 2018

I forgot to use export for BORG_FILES_CACHE_TTL, so the variable didn't make it to the borg process.
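
The difference can be reproduced without borg: a plain assignment creates a shell variable that child processes do not inherit, while `export` places it in their environment. A minimal sketch that uses `sh -c` as a stand-in for the borg process (the helper name `child_view` is made up for illustration):

```shell
#!/bin/sh
# Report what a child process sees for BORG_FILES_CACHE_TTL.
child_view() {
    sh -c 'printf "%s\n" "${BORG_FILES_CACHE_TTL:-unset}"'
}

unset BORG_FILES_CACHE_TTL     # start from a clean state
BORG_FILES_CACHE_TTL=50        # shell variable only, not in the environment
before=$(child_view)           # the child does not see it

export BORG_FILES_CACHE_TTL    # now part of the environment
after=$(child_view)            # the child sees the value

echo "without export: $before"
echo "with export:    $after"
```

The first line reports `unset` and the second reports `50`, which matches the symptom here: borg kept using its default TTL until the variable was exported.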
