Add new mode 'repository' to restic stats #2543

Open
wants to merge 3 commits into
base: master
from

@aawsome
Contributor

aawsome commented Jan 12, 2020

What is the purpose of this change? What does it change?

Adds a new mode that shows various repository statistics without walking any trees, so it should be very fast.

The output looks like:

restic -r /path/to/repo stats --mode repository
enter password for repository: 
repository f1a1cc2c opened successfully, password is correct
scanning...

Repository content:
==================
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
key files:                1 |         0 B |       450 B |       450 B
snapshot files:           1 |       214 B |        32 B |       246 B
index files:              1 |  10.323 MiB |        32 B |  10.323 MiB
pack files:              11 |  41.284 MiB |         0 B |  41.284 MiB
---------------------------------------------------------------------
all files:               14 |  51.606 MiB |       514 B |  51.607 MiB

percentage:
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
key files:            7.14% |       0.00% |      87.55% |       0.00%
snapshot files:       7.14% |       0.00% |       6.23% |       0.00%
index files:          7.14% |      20.00% |       6.23% |      20.00%
pack files:          78.57% |      80.00% |       0.00% |      80.00%
---------------------------------------------------------------------
all files:          100.00% |     100.00% |     100.00% |     100.00%

statistics - file size:
                        min |         max |         avg
-------------------------------------------------------
key files:            450 B |       450 B |       450 B
snapshot files:       246 B |       246 B |       246 B
index files:     10.323 MiB |  10.323 MiB |  10.323 MiB
pack files:       2.545 MiB |   4.023 MiB |   3.753 MiB
-------------------------------------------------------
all files:            246 B |  10.323 MiB |   3.686 MiB

Index content:
==============
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
tree blobs:            1004 |  34.648 MiB |  31.375 KiB |  34.678 MiB
data blobs:           91981 | 529.232 KiB |   2.807 MiB |   3.324 MiB
---------------------------------------------------------------------
all blobs:            92985 |  35.164 MiB |   2.838 MiB |  38.002 MiB

percentage:
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
tree blobs:           1.08% |      98.53% |       1.08% |      91.25%
data blobs:          98.92% |       1.47% |      98.92% |       8.75%
---------------------------------------------------------------------
all blobs:          100.00% |     100.00% |     100.00% |     100.00%

statistics - raw blobs:
                        min |         max |         avg
-------------------------------------------------------
tree blobs:           362 B | 352.109 KiB |  35.337 KiB
data blobs:             3 B |         8 B |         5 B
-------------------------------------------------------
all blobs:              3 B | 352.109 KiB |       396 B

pack size by index:
                 # packs |   raw blobs | pack header |      crypto |       total
--------------------------------------------------------------------------------
tree blobs:            9 |  34.648 MiB |  36.312 KiB |  31.656 KiB |  34.714 MiB
data blobs:            2 | 529.232 KiB |   3.246 MiB |   2.807 MiB |   6.570 MiB
--------------------------------------------------------------------------------
all blobs:            11 |  35.164 MiB |   3.281 MiB |   2.838 MiB |  41.284 MiB

percentage:
                 # packs |   raw blobs | pack header |      crypto |       total
--------------------------------------------------------------------------------
tree blobs:              |      99.81% |       0.10% |       0.09% |     100.00%
data blobs:              |       7.87% |      49.40% |      42.73% |     100.00%
--------------------------------------------------------------------------------
all blobs:               |      85.18% |       7.95% |       6.87% |     100.00%

Overhead:
=========
index:        10.323 MiB ( 20.00%)
snapshots:         214 B (  0.00%)
pack header:   3.281 MiB (  6.36%)
crypto:        2.839 MiB (  5.50%)
total:        16.443 MiB ( 31.86%)

Total:
======
      92985 blobs
         14 files
 51.607 MiB total repository size
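
The per-type tables above (count, size, percentage, min/max/avg) only need one size per file or blob, which is why no tree walk is required. A minimal sketch of that kind of aggregation follows. It is not the code from this PR: `sizeStats`, `percent`, and the sizes in `main` are hypothetical and stand in for whatever the backend file listing and the index provide.

```go
package main

import "fmt"

// sizeStats accumulates count, total, minimum and maximum for a set of sizes.
// Hypothetical helper for illustration; not taken from the PR.
type sizeStats struct {
	count   uint64
	total   uint64
	minSize uint64
	maxSize uint64
}

// add records one file (or blob) of the given size in bytes.
func (s *sizeStats) add(size uint64) {
	if s.count == 0 || size < s.minSize {
		s.minSize = size
	}
	if size > s.maxSize {
		s.maxSize = size
	}
	s.count++
	s.total += size
}

// avg returns the mean size, or 0 if nothing has been recorded.
func (s *sizeStats) avg() uint64 {
	if s.count == 0 {
		return 0
	}
	return s.total / s.count
}

// percent returns part as a percentage of whole, or 0 when whole is 0.
func percent(part, whole uint64) float64 {
	if whole == 0 {
		return 0
	}
	return 100 * float64(part) / float64(whole)
}

func main() {
	// Made-up sizes standing in for a backend file listing.
	files := map[string][]uint64{
		"key":      {450},
		"snapshot": {246},
		"pack":     {1000, 3000, 2000},
	}

	perType := map[string]*sizeStats{}
	all := &sizeStats{}
	for typ, sizes := range files {
		st := &sizeStats{}
		for _, sz := range sizes {
			st.add(sz)
			all.add(sz)
		}
		perType[typ] = st
	}

	p := perType["pack"]
	fmt.Printf("pack files: count=%d min=%d max=%d avg=%d share=%.2f%%\n",
		p.count, p.minSize, p.maxSize, p.avg(), percent(p.total, all.total))
}
```

The same accumulator can be reused for the "all files" row by feeding every size into a second instance, as `all` does here.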

Was the change discussed in an issue or in the forum before?

No, but I needed some statistics while developing and thought this might be a good extension.
There is an issue with restic stats: it walks all trees of all snapshots and is therefore pretty slow, see #2126. This extension does not solve that issue, but it can report repository statistics very quickly.

Please let me know whether this is useful or should be complemented by something. I will then finish this PR (and add documentation, etc.).
If it is not useful, feel free to close this PR.

Checklist

  • I have read the Contribution Guidelines
  • I have added tests for all changes in this PR
  • I have added documentation for the changes (in the manual)
  • There's a new file in changelog/unreleased/ that describes the changes for our users (template here)
  • I have run gofmt on the code in all commits
  • All commit messages are formatted in the same style as the other commits in the repo
  • I'm done, this Pull Request is ready for review
Alexander Weiss added 2 commits Jan 12, 2020
Alexander Weiss
Adds the possibility to show various repository statistics
without walking trees, i.e. should be very fast.
@rawtaz

Contributor

rawtaz commented Feb 15, 2020

Can you replace the example, which comes from a minimal test repo with just one snapshot, with a more real-world example from a bigger repo?

@aawsome

Contributor Author

aawsome commented Feb 15, 2020

@rawtaz here is the output from a real-life repository:

Repository content:
==================
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
key files:                1 |         0 B |       448 B |       448 B
snapshot files:         182 |  46.834 KiB |   5.688 KiB |  52.521 KiB
index files:             59 |  32.043 MiB |   1.844 KiB |  32.044 MiB
pack files:           36299 | 170.479 GiB |         0 B | 170.479 GiB
---------------------------------------------------------------------
all files:            36541 | 170.511 GiB |   7.969 KiB | 170.511 GiB

percentage:
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
key files:            0.00% |       0.00% |       5.49% |       0.00%
snapshot files:       0.50% |       0.00% |      71.37% |       0.00%
index files:          0.16% |       0.02% |      23.14% |       0.02%
pack files:          99.34% |      99.98% |       0.00% |      99.98%
---------------------------------------------------------------------
all files:          100.00% |     100.00% |     100.00% |     100.00%

statistics - file size:
                        min |         max |         avg
-------------------------------------------------------
key files:            448 B |       448 B |       448 B
snapshot files:       235 B |       311 B |       295 B
index files:          240 B |   3.163 MiB | 556.159 KiB
pack files:           216 B |  11.988 MiB |   4.809 MiB
-------------------------------------------------------
all files:            216 B |  11.988 MiB |   4.778 MiB

Index content:
==============
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
tree blobs:           19432 |  99.245 MiB | 607.250 KiB |  99.838 MiB
data blobs:          236533 | 170.073 GiB |   7.218 MiB | 170.080 GiB
---------------------------------------------------------------------
all blobs:           255965 | 170.170 GiB |   7.811 MiB | 170.178 GiB

percentage:
                      count |    raw size |      crypto |   encr size
---------------------------------------------------------------------
tree blobs:           7.59% |       0.06% |       7.59% |       0.06%
data blobs:          92.41% |      99.94% |      92.41% |      99.94%
---------------------------------------------------------------------
all blobs:          100.00% |     100.00% |     100.00% |     100.00%

statistics - raw blobs:
                        min |         max |         avg
-------------------------------------------------------
tree blobs:            13 B | 754.551 KiB |   5.229 KiB
data blobs:             1 B |   8.000 MiB | 753.953 KiB
-------------------------------------------------------
all blobs:              1 B |   8.000 MiB | 697.112 KiB

pack size by index:
                 # packs |   raw blobs | pack header |      crypto |       total
--------------------------------------------------------------------------------
tree blobs:          229 |  99.245 MiB | 703.027 KiB | 614.406 KiB | 100.531 MiB
data blobs:        36070 | 170.073 GiB |   8.484 MiB |   8.319 MiB | 170.090 GiB
--------------------------------------------------------------------------------
all blobs:         36299 | 170.170 GiB |   9.170 MiB |   8.919 MiB | 170.188 GiB

percentage:
                 # packs |   raw blobs | pack header |      crypto |       total
--------------------------------------------------------------------------------
tree blobs:              |      98.72% |       0.68% |       0.60% |     100.00%
data blobs:              |      99.99% |       0.00% |       0.00% |     100.00%
--------------------------------------------------------------------------------
all blobs:               |      99.99% |       0.01% |       0.01% |     100.00%

Calculated packsize is 298.536 MiB smaller than the actual total size of pack files!
This means there are packs that contain blobs which are not referenced.

This is most likely due to aborted backup operations or aborted 'prune'.
It can also indicate that something is not correct with your repository.
You may consider running 'restic check' or 'restic prune'.

Overhead:
=========
index:                  32.043 MiB (  0.02%)
snapshots:              46.834 KiB (  0.00%)
pack header:             9.170 MiB (  0.01%)
crypto:                  8.927 MiB (  0.01%)
unused blobs in packs: 298.536 MiB (  0.17%)
total:                 348.722 MiB (  0.20%)

Total:
======
     255965 blobs
      36541 files
170.511 GiB total repository size

Please note that I'm using cleanup-index and cleanup-packs from #2513, so some packs contain both used and unused blobs. I don't repack those packs because this is a "cold" storage where retrieving packs is much more expensive than storing some unused data.
I added reporting for this case in this PR.
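
The warning in the output above boils down to comparing two totals: the actual size of all pack files and the pack size calculated from the index (raw blobs plus pack header plus crypto). A rough sketch of that comparison, assuming both totals are already known in bytes; `unusedBytes` and the numbers in `main` are made up for illustration and are not the PR's code or the values from the report:

```go
package main

import "fmt"

// unusedBytes reports how many bytes of the pack files are not explained by
// the index. actual is the summed size of the pack files as listed by the
// backend; calculated is the size derived from the index entries.
// Hypothetical helper for illustration.
func unusedBytes(actual, calculated uint64) (uint64, bool) {
	if actual <= calculated {
		return 0, false
	}
	return actual - calculated, true
}

func main() {
	// Made-up byte counts, not the values from the report above.
	const mib = 1 << 20
	actual := uint64(500 * mib)
	calculated := uint64(480 * mib)

	if diff, ok := unusedBytes(actual, calculated); ok {
		fmt.Printf("calculated pack size is %d MiB smaller than the actual pack size\n", diff/mib)
		fmt.Println("some packs contain blobs that are not referenced by the index")
	}
}
```

Returning a boolean alongside the difference keeps the unsigned subtraction from underflowing if the calculated size ever exceeds the actual one.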

- f.StringVar(&countMode, "mode", countModeRestoreSize, "counting mode: restore-size (default), files-by-contents, blobs-per-file, or raw-data")
- f.StringVarP(&snapshotByHost, "host", "H", "", "filter latest snapshot by this hostname")
+ f.StringVar(&countMode, "mode", countModeRestoreSize, "counting mode: restore-size (default), files-by-contents, blobs-per-file, raw-data or repository")
+ f.StringVarP(&snapshotByHost, "host", "H", "", "filter latest snapshot by this hostname (not used for mode repository)")

@greatroar

greatroar Feb 19, 2020

Instead of ignoring the flag, wouldn't it be better to check it and bail if it's set?
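
For illustration, the suggestion could look roughly like the sketch below: reject the combination up front instead of silently ignoring the flag. The `statsOptions` struct, `validate`, and the error text are hypothetical; only the flag names `--mode` and `--host` and the mode name `repository` come from the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// statsOptions holds the two flag values from the diff above.
// Hypothetical struct for illustration; the PR uses package-level variables.
type statsOptions struct {
	mode string // value of --mode
	host string // value of --host / -H
}

const modeRepository = "repository"

// validate returns an error if --host is set together with --mode repository,
// since that mode does not look at snapshots at all.
func validate(o statsOptions) error {
	if o.mode == modeRepository && o.host != "" {
		return errors.New("--host cannot be combined with --mode repository")
	}
	return nil
}

func main() {
	// Rejected: a host filter has no meaning for repository-wide statistics.
	fmt.Println(validate(statsOptions{mode: "repository", host: "laptop"}))
	// Accepted: the other modes still filter snapshots by host.
	fmt.Println(validate(statsOptions{mode: "restore-size", host: "laptop"}))
}
```

Failing early makes the mismatch visible to the user, whereas a help-text note such as "not used for mode repository" is easy to miss.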
