Same Thing In Python? #329

Closed
jojomi opened this Issue Oct 28, 2015 · 23 comments

@jojomi

jojomi commented Oct 28, 2015

Perhaps there are interesting ideas to borrow; I have not had a look at it yet.

https://github.com/borgbackup/borg

@fd0

Member

fd0 commented Oct 28, 2015

Hey, thanks for the hint. borg is a fork of attic (https://attic-backup.org/ and https://github.com/jborg/attic), and I know the author. We're in contact regularly to talk about ideas and algorithms.

@fd0 fd0 closed this Oct 28, 2015

@jojomi

jojomi commented Oct 28, 2015

Thank you for explaining.

Is the additional python dependency the main difference between attic/borg and restic?

@fd0

Member

fd0 commented Oct 28, 2015

There are many more, I think, mostly in the details. As far as I understand, borg/attic is mainly written in Python, with the core component (the chunking algorithm) in C. And it uses OpenSSL for crypto (not sure if that's good or bad) ;)

@pvgoran

pvgoran commented Oct 28, 2015

For me, the essential difference was that in restic a repository consists of unnamed "snapshots" (referenced by hashes), while in attic/borg a repository consists of named "archives". Also, attic/borg has a client-server mode that is the preferred way of accessing remote repositories via SSH.

And of course, there are differences in speed, space consumption and currently implemented features. :)
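
The difference between hash-referenced snapshots and named archives can be illustrated with a minimal Python sketch. It is not restic's or borg's actual code or on-disk format: the snapshot fields, `snapshot_id`, `save_snapshot` and `save_archive` are hypothetical names, and SHA-256 over canonical JSON is only an assumption chosen to show the idea of deriving the ID from the content.

```python
import hashlib
import json

# Hash-addressed repository: the key is computed, never chosen by the user.
hash_repo = {}
# Name-addressed repository: the user must supply a unique name per archive.
named_repo = {}


def snapshot_id(snapshot):
    """Derive an ID from the snapshot's own content (SHA-256 of canonical JSON)."""
    encoded = json.dumps(snapshot, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def save_snapshot(snapshot):
    """Store a snapshot under its content-derived ID and return that ID."""
    sid = snapshot_id(snapshot)
    hash_repo[sid] = snapshot
    return sid


def save_archive(name, archive):
    """Store an archive under a user-chosen name; names must not repeat."""
    if name in named_repo:
        raise ValueError(f"archive name already used: {name}")
    named_repo[name] = archive
```

With the hash-addressed variant nobody has to invent a name per backup, but the IDs carry no meaning, which is why reusable tags would be a useful addition on top.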

@fd0

Member

fd0 commented Oct 28, 2015

@pvgoran did you give borg a try? How does it compare to restic to you as a user?

@pvgoran

pvgoran commented Oct 28, 2015

I considered borg, but I was put off by the, you know, "hasty" and "too open" nature of its development, where new versions are released often and seemingly without much thought. I was also scared by one bug for which the then-latest version offered a weird work-around instead of a fix. So I'll compare with attic, which I currently use.

  • I like restic's approach to repository structure more. Having to come up with a different archive name each time is boring (though easily manageable). On the other hand, I miss the possibility to mark snapshots with a meaningful name (something like reusable - not unique - tags).
  • restic seems to be much faster, especially for incremental backups. attic, on the other hand, has compression that nicely halves the needed space and bandwidth.
  • I don't particularly like the way encryption is handled in either program, from the usability point of view. I don't like that I can't just specify the keyfile on the command line. With restic, I have to either enter the passphrase with every invocation (which is awkward), or use the environment variable, which is inconvenient and won't play nicely with sudo (I didn't even try it, in fact). With attic, I can have keyfiles, but their location is automatically derived from the repository path, and I don't want them in /root, where they turn up unless I set HOME to something else.
  • Right now, restic doesn't handle hard links, ACLs and xattrs, while attic works with all of these. This prevents me from using restic for actual backups. I'm thinking of adding support for ACLs and xattrs myself.
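
One possible workaround for the keyfile complaint above is a small wrapper that reads the passphrase from a file and hands it to restic through the environment. This is a sketch only: `RESTIC_PASSWORD` and `RESTIC_REPOSITORY` are restic's environment variables, while `restic_env`, `run_restic` and the keyfile location are hypothetical. Because sudo typically resets the environment, the wrapper itself would have to run as root instead of prefixing each restic call with sudo.

```python
import os
import subprocess


def restic_env(keyfile, repository):
    """Build an environment for restic with the passphrase read from a keyfile."""
    with open(keyfile, "r", encoding="utf-8") as handle:
        passphrase = handle.read().rstrip("\n")
    env = dict(os.environ)  # copy, so the caller's environment stays untouched
    env["RESTIC_REPOSITORY"] = repository
    env["RESTIC_PASSWORD"] = passphrase
    return env


def run_restic(args, keyfile, repository):
    """Run `restic <args>` with the prepared environment (requires restic on PATH)."""
    return subprocess.run(
        ["restic", *args],
        env=restic_env(keyfile, repository),
        check=True,
    )
```

The passphrase is passed only to the child process and never appears on the command line, so it does not show up in the process list.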
@fd0

Member

fd0 commented Oct 29, 2015

Thanks for the interesting insight!

A few of the things you mentioned are already on the todo list:

  • tags for snapshots #55
  • extended attributes/ACL #25
  • read password from key file #278

If you'd like to try working on one of these issues (I'd recommend starting with an easier one like #55), I'd be glad to assist!

@alexeymuranov

alexeymuranov commented May 22, 2017

I think this question would be more appropriate for a mailing list, but I did not find a mailing list for restic.

Trying to compare restic with borg, I decided to compare the amount of source code (wondering about the maintainability of the two projects). I downloaded the latest snapshots of the master branches and discovered that borg has 2.2 MB of code in 207 files and folders, while restic has 32.2 MB of code in 1150 files and folders. Why such a big difference?

@zcalusic

Member

zcalusic commented May 22, 2017

Have you been careful to exclude vendored libraries and testdata?

A naive zsh -c "du -sc src/**/*.go" returns only 1.2MB for me.

@fd0

Member

fd0 commented May 22, 2017

As @zcalusic wrote so nicely, the source code is only about 1.2MiB in 219 files. The rest is vendored libraries (everything below vendor/, 738 files, 8.4MiB) and testdata (47 files, 21MiB).
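
A breakdown like the one above can be reproduced with a short Python sketch. It assumes that vendored code lives under a directory named `vendor` and test fixtures under a directory named `testdata`; `classify` and `size_by_category` are hypothetical helper names. Note that it sums apparent file sizes, so the numbers can differ slightly from what `du` reports as disk usage.

```python
import os
from collections import defaultdict


def classify(rel_path):
    """Assign a file to a category based on the directory names in its path."""
    parts = rel_path.split(os.sep)
    if "vendor" in parts:
        return "vendor"
    if "testdata" in parts:
        return "testdata"
    return "source"


def size_by_category(root):
    """Walk `root` and return {category: {"files": n, "bytes": total}}."""
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full):
                continue  # skip symlinks so targets are not counted twice
            bucket = totals[classify(os.path.relpath(full, root))]
            bucket["files"] += 1
            bucket["bytes"] += os.path.getsize(full)
    return dict(totals)
```

Calling `size_by_category` on a checkout of each project and comparing only the "source" bucket gives a fairer picture than comparing the size of the whole download.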

@fd0

Member

fd0 commented May 22, 2017

cloc is also interesting, both projects have approximately the same code size:

$ cloc restic/src
     254 text files.
     247 unique files.                           
      36 files ignored.

http://cloc.sourceforge.net v 1.60  T=0.65 s (334.9 files/s, 47274.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Go                             217           5401           1802          23430
-------------------------------------------------------------------------------
SUM:                           217           5401           1802          23430
-------------------------------------------------------------------------------

vs. borg:

$ cloc borg/src
      71 text files.
      70 unique files.
       4 files ignored.

http://cloc.sourceforge.net v 1.60  T=0.24 s (293.4 files/s, 129775.3 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          49           3377           3200          16846
HTML                             1            499             11           1931
C                                6            292            234           1791
Cython                          10            375            281           1320
C/C++ Header                     3             51             29            282
-------------------------------------------------------------------------------
SUM:                            69           4594           3755          22170
-------------------------------------------------------------------------------
@alexeymuranov

alexeymuranov commented May 22, 2017

Thank you for the detailed clarification. I was too lazy to investigate myself.

@nicolas17

nicolas17 commented Sep 16, 2017

Restic can upload backups to dumb cloud storage such as S3. As far as I can tell, Borg needs custom software running on the server side to do remote backups. Is this correct?

@fd0

Member

fd0 commented Sep 16, 2017

I haven't looked into borg deep enough to answer that. Sorry.

@Crest

Crest commented Sep 19, 2017

Yes, attic/borg is implemented as two processes talking msgpack RPC over a socket/pipe. It supports running both processes locally and spawning the server on a remote system over SSH. The latter requires borg to be installed on the backup server.
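
The general shape of such a two-party RPC over a socket can be sketched in Python. This is not borg's protocol: JSON is used as a stand-in for msgpack so the example needs only the standard library, the 4-byte length prefix is an assumed framing, and all function names and the `put`/`get` methods are hypothetical.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one message: 4-byte big-endian length, then the encoded payload."""
    payload = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed message and decode it."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


def serve_one(sock, handlers):
    """Server side: read one request, dispatch it, send back the result."""
    request = recv_msg(sock)
    result = handlers[request["method"]](*request["args"])
    send_msg(sock, {"id": request["id"], "result": result})


def call(sock, method, *args):
    """Client side: send a request and wait for its result."""
    send_msg(sock, {"id": 1, "method": method, "args": list(args)})
    return recv_msg(sock)["result"]


def demo():
    """Run client and server in one process over a connected socket pair."""
    client, server = socket.socketpair()
    store = {}

    def put(key, value):
        store[key] = value
        return True

    handlers = {"put": put, "get": store.get}
    worker = threading.Thread(
        target=lambda: [serve_one(server, handlers) for _ in range(2)]
    )
    worker.start()
    try:
        call(client, "put", "chunk-1", "data")
        return call(client, "get", "chunk-1")
    finally:
        worker.join()
        client.close()
        server.close()
```

In the remote case described above, the socket pair would be replaced by the standard input and output of a server process started over SSH, which is why the server program has to be installed on the backup host.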

@TheAMM

TheAMM commented Oct 1, 2017

@nicolas17 borg only does "local" and ssh backups. "Local" means you can mount a remote location (with sshfs, for example) and back up to that "local" path, although this is obviously less efficient than the client-server model. Naturally, you can upload the backup repository elsewhere (rsync and others can do this efficiently), but borg doesn't natively support any other targets.

@minusf

minusf commented Feb 19, 2018

a non-scientific benchmark from a new user of both borg and restic:
MacBook Pro (Retina, 15-inch, Mid 2015), macOS High Sierra 10.13.3 (17D47)
/Volumes/backup is a 2TB external USB3 disk.

$ borg --version
borg 1.1.4
$ borg init -e authenticated /Volumes/backup/borg
# borgmatic setup: no compression, excludes are the same as restic
$ borgmatic --verbosity 1
/Volumes/backup/borg: Creating archive
------------------------------------------------------------------------------
Archive name: 2018-02-19T15:45:40
Archive fingerprint: 0e5912175b30fb8431737143e232bf0e6e0b2882b498371f46ee880aaeacb960
Time (start): Mon, 2018-02-19 15:45:41
Time (end):   Mon, 2018-02-19 16:54:34
Duration: 1 hours 8 minutes 53.73 seconds
Number of files: 337637
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:               10.91 GB             10.91 GB              7.46 GB
All archives:               10.91 GB             10.91 GB              7.46 GB

                       Unique chunks         Total chunks
Chunk index:                  141328               337642
------------------------------------------------------------------------------

$ restic init -r /Volumes/backup/restic
$ restic backup -r /Volumes/backup/restic --exclude-caches \
    -e ~/.Trash -e ~/.cache -e ~/Library/Caches ~
password is correct
scan [/Users/user]
scanned 76950 directories, 344828 files in 0:23
[20:06] 99.96%  10.066 GiB / 10.071 GiB  421780 / 421778 items  0 errors  ETA 0:00
duration: 20:06
snapshot ded296df saved

interesting to see the quite big difference between the number of processed files with the same excludes. i'll compare the filelists later.

so restic is 3x faster even while encrypting. i don't need this encryption btw, the external drive is an encrypted partition, but restic insists :}

but i like borg's summary data more :D

restic is very promising indeed.
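
As a rough check of the "3x faster" figure, the two durations and sizes reported above can be compared directly. The sketch below uses only numbers from the output shown; it assumes that borg's "GB" is decimal (10^9 bytes) while restic's "GiB" is binary (2^30 bytes), and `to_seconds` is a hypothetical helper.

```python
def to_seconds(hours=0, minutes=0, seconds=0.0):
    """Convert a duration to seconds."""
    return hours * 3600 + minutes * 60 + seconds


# Durations reported by the two tools above.
borg_seconds = to_seconds(hours=1, minutes=8, seconds=53.73)
restic_seconds = to_seconds(minutes=20, seconds=6)
speedup = borg_seconds / restic_seconds  # about 3.4

# Data volume: borg reports decimal GB, restic reports binary GiB (assumption).
borg_gb = 10.91
restic_gb = 10.071 * 2**30 / 10**9  # about 10.8 GB

# Share of the original data that borg actually stored after deduplication.
borg_dedup_share = 7.46 / 10.91  # about 0.68

print(f"restic was {speedup:.1f}x faster")
print(f"borg processed {borg_gb:.2f} GB, restic {restic_gb:.2f} GB")
print(f"borg stored {borg_dedup_share:.0%} of the original size")
```

The two tools processed roughly the same amount of data, so the difference in file counts does not explain the difference in run time.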

@tobecwb

tobecwb commented Mar 12, 2018

I tried borg and am now trying restic with a huge data set (more than 8 million files, 15 TB of data)...
borg works great, but it's very slow!

My initial backup took 3 weeks with borg (without encryption, with default compression - lzma, I think), while restic took only 2 days.

I think the big problem with borg is the cache system. It's very slow, and the backup spends more time reading and writing the cache than backing up files (this only occurs with a huge number of files, as in my case)...

Personally, I prefer some things in borg that I miss in restic, like the summary at the end.
But right now, I'm using restic as my primary backup method.

If borg changes the way the cache is implemented or improves its performance, maybe I'll try it again.

@RipperFox

RipperFox commented Mar 15, 2018

What about the performance comparison (memory usage, speed of incremental backups and especially pruning) described here: https://blog.stickleback.dk/borg-or-restic/ ? Anyone got further tests?

@fd0

Member

fd0 commented Mar 15, 2018

Please be aware that the blog entry does not take recent developments for restic into account, like the metadata cache which greatly speeds up non-initial backups :)

@tobecwb

tobecwb commented Mar 15, 2018

@RipperFox I didn't make a "scientific" comparison between borg and restic, but in the link you posted, this caught my attention:
In our experience Borg also has a significant advantage when it comes to performance

In my case, it is exactly the opposite. Like I said, borg is fast when the filesystem is small or has few files, but when things get big, like more than 9 million files (3 days ago it was 8 million, now it is 9 million) and 15 TB of data, borg's performance becomes horrible.

Same data, same files... initial backup:

  • borg takes almost 3 weeks to complete with default compression;
  • restic takes 2 days to complete;

I think the big problem here is that borg reads the whole cache into memory before and after each backup, and this cache is huge because of the number of files.

Take a look at this issue: borgbackup/borg#3394

As for memory and CPU, restic is hungrier (much more so).
While borg uses about 2.5GB of memory, restic can take more than 4GB to back up the same data.
The same goes for CPU: borg uses less CPU than restic (sorry, I don't have any numbers on this).

I really like borg more because of some of its features.
As for backup size, even though restic doesn't support compression, I got similar results.

While borg stored everything in 8 TB, restic used 9 TB.

But because of borg's horrible performance, I prefer to use restic.

@nicolas17

nicolas17 commented Mar 15, 2018

I suspect the main reason why Borg is slower is compression!

@tobecwb

tobecwb commented Mar 15, 2018

@nicolas17 I am sure that compression is not the problem. I tried without compression and got the same results. Borg takes a long time reading and writing the file cache before it starts backing up files and after the backup has finished.
I'm pretty sure that what makes Borg slow is the "cache" system. When a new backup starts, the cache file is read entirely into memory, using msgpack to transform the data into a Python dict.
The same happens when writing the "cache" to disk.

And if I'm not wrong, Borg saves this cache every 30 minutes (it pauses the backup, writes the cache, then continues)... just saving the cache file can take almost 2 hours on my server!
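
The scaling behaviour described here can be illustrated with a minimal Python sketch of a cache that is always loaded and saved as a whole. It is not borg's implementation: JSON stands in for msgpack, the entry layout (size, mtime, chunk IDs) is an assumed example, and `make_cache`, `save_cache` and `load_cache` are hypothetical names.

```python
import json


def make_cache(n_files):
    """Build a hypothetical files cache: path -> [size, mtime, chunk ids]."""
    return {
        f"/data/file-{i:08d}": [4096, 1520000000 + i, [f"chunk-{i:08d}"]]
        for i in range(n_files)
    }


def save_cache(cache):
    """Serialize the entire cache in one go, regardless of how little changed."""
    return json.dumps(cache).encode("utf-8")


def load_cache(blob):
    """Deserialize the entire cache into an in-memory dict."""
    return json.loads(blob.decode("utf-8"))


if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        blob = save_cache(make_cache(n))
        print(f"{n:>7} files -> {len(blob):>10} bytes per full save")
```

With this design, every save and every load costs time proportional to the total number of files, even if only a handful of entries changed since the last checkpoint, which matches the observation that the overhead only becomes painful with millions of files.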
