Dry Run Statistics, please implement or document #3298

Closed
Code7R opened this issue Nov 7, 2017 · 9 comments
Code7R commented Nov 7, 2017

Hi,

this is a follow-up to #265.

I don't agree with the way it was closed. I was tricked by the same assumption shown there, and there is STILL neither documentation nor an implementation of this functionality.

Seriously, I do expect that "--stats -C none ..." would give me some useful information. That is what I get from rsync when used for a similar purpose ("rsync -a --stats --checksum ..."), and such a limitation in borg seems weird, frankly speaking.


ThomasWaldmann commented Nov 8, 2017

The major use case for borg create --dry-run was to optimize the exclude options.

Thus, it doesn't do much except recurse over the input directories and do pattern matching on the file names. It does that rather fast, though, by bailing out quickly before the real processing of a file begins (reading, chunking, hashing, updating the repo). Adding --stats thus gives no useful information.

Adding a useful --dry-run --stats functionality is not easily possible, I think:

  • dry-run requires no changes to the repo
  • the deduplication processing (and stats) only works if the repo, chunks index, etc. is updated
  • compression processing (and stats) only works if stuff is actually compressed (taking quite some time)

So, there is only one slightly "dirty" option coming to mind:
It could do a normal borg create run and just not commit the result.
That would give correct stats, but also potentially need a lot of resources (time, space, ...), so I am not sure we want that. If one wants that, one could also run a normal borg create and delete the created archive again, which would be a lot cleaner (and already works with the current code).
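
A minimal sketch of that "create and delete again" workaround (the repo path, archive name and data path are just placeholders):

    # real run, only to obtain the stats (this reads, chunks, compresses and stores everything)
    borg create --stats /path/to/repo::stats-test /home/user/data
    # afterwards, throw the test archive away again
    borg delete /path/to/repo::stats-test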

So, we could just add code that outputs that --stats is not available together with --dry-run.

ThomasWaldmann added this to the 1.1.3 milestone Nov 8, 2017

Code7R commented Nov 8, 2017

The major use case for borg create --dry-run was to optimize the exclude options.

Well, this is exactly what I had in mind. Make a dry run and estimate whether the data would fit on the target filesystem (even in the worst case, without compression).

Actually, since I was starting with an existing borg archive (which contained an old/similar version of the same source data), I would also like to know whether it's worth keeping the old repo or restarting from scratch.

If you say that making a "test commit" and dropping it later is totally harmless, then I agree with your conclusion. But please document it in a more appropriate way, ideally in the manpage and also in the program output.

PS: and I still don't understand what you tried to say with "The major use case was ... to optimize the exclude options". Was this a use case for regular users or for developers? From a user's point of view, this feature basically does NOT work. There is no way for me to make a QUICK check to see whether my exclude file worked or not. I.e. even if I do RTFM and run something like borg create --stats --exclude-from exclude.stuff.txt --verbose $PWD::$(date +%F) /src, it runs for hours (the storage medium is slow) and only then can I see whether the patterns in exclude.stuff.txt were correct or not. It's just a PITA.

@ThomasWaldmann

If you run borg create --dry-run --list --exclude ... you will get a list of all files it would back up.

By just looking through that list, you can usually:

  • spot some files you don't want to back up: downloads, ISOs here and there, unimportant VM images, cache directories, ...
  • check whether any include/exclude patterns you have used do really work (it's easy to have typos or syntax issues there)

That's what it was made for. Not for estimating dedup or compression efficiency.
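
For instance, a quick sketch of that kind of check, reusing the exclude file name mentioned above (repo path, archive name and source directory are placeholders):

    # dry run: only walks the directories and matches the patterns, file contents are not read
    borg create --dry-run --list --exclude-from exclude.stuff.txt /path/to/repo::pattern-test ~/src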


milkey-mouse commented Nov 10, 2017

So has this issue been reduced to:

  • telling argparse that --dry-run and --stats are mutually exclusive, and
  • documenting (in the FAQ or borg create --help) that one should use a temp archive to check dedup?

I do see the use case of running compression/dedup and throwing away the result as we go, though. As documented (and in my personal experience), borg does not handle running out of space well (if we do add a --dry-run & --stats mode, that should probably be mentioned there). Being able to estimate the amount of space an archive would take (or documenting some workaround for doing such a test) would mitigate the space problem.

@ThomasWaldmann

@milkey-mouse yes, those are the 2 TODOs here.

Trying to cope with too little repo space just by tuning dedup or compression is a somewhat futile attempt anyway. One might be able to fit the initial backup in there, but usually one wants to use a repo over a longer time, and if it is that tight, it will run full rather soon.

borg 1.1 handles (near and really) out-of-space situations better than 1.0.

@milkey-mouse

Trying to cope with too little repo space just by tuning dedup or compression is a somewhat futile attempt anyway

So are you saying resistance is futile? ;)

@ThomasWaldmann

Yes. And we need more space. :D


Code7R commented Nov 10, 2017

2 todos... almost there.

I added #3306 too. The main part of the use case for this whole issue was to test exclude patterns, and to do that quickly. And there is apparently no way to do that. Just imagine having a huge directory and a slow backup disk. Now it takes you a couple of hours, and then you see: oh sh*t, the exclude patterns didn't work and it started adding the useless binary folders which you wanted to exclude... and what now? Let it run for another five hours? Anyhow, two hours of time and energy wasted. :-(

@ThomasWaldmann

@Code7R to see whether exclude patterns work, you can proceed as I have already described there: #3298 (comment) - and this is very quick (as far as borg is concerned, since it does not process file contents). So I suggest you try that.
