[Question] Why is snapshot_hash used instead of snapshot ID? #8

Closed
Enrico204 opened this issue Feb 10, 2023 · 3 comments
Labels
question Further information is requested

Comments

@Enrico204
Contributor

I was looking at the restic data in my Grafana deployment. After doing a manual backup yesterday, the client appears duplicated in the "Total backup size" item in the provided dashboard. Investigating further, I discovered that the "snapshot hash" is calculated here:

def calc_snapshot_hash(self, snapshot: dict) -> str:

My question is: why is the snapshot hash used instead of the snapshot ID, which is already a kind of hash?

@ngosang
Owner

ngosang commented Feb 10, 2023

For each snapshot restic provides several hashes:

"parent":"caf9137c9bccc0d2b266a0fc02be7a71652f3de4916034a234e535a3e9ef3a11",
"tree":"446e26b7a79b7b08f52d2e183cbbc9be948e0e56787a69aae3c2f9e63fbbd07c",
"id":"100c00902e666e25570773885e82200a982c4334268f1d608f138384de629915",
"short_id":"100c0090"

But none of them is useful for grouping all snapshots of the same user. In the first implementation I thought that the "parent" hash was common across all snapshots of a user, but it's not.
I decided to compute my own hash, which has no meaning on its own but is useful for grouping snapshots of the same user.
I'm not publishing the restic hashes because they are not useful and they would generate too many time series in Prometheus.
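
To illustrate the idea, here is a minimal sketch, not the exporter's actual code. It assumes the grouping hash is derived from the hostname, username and backup paths of a snapshot; the function name `grouping_hash` and the sample snapshots are hypothetical. It shows why a per-snapshot "id" label grows with every backup while a grouping hash stays constant per client.

```python
import hashlib


def grouping_hash(snapshot: dict) -> str:
    """Hypothetical stand-in for calc_snapshot_hash."""
    # Assumption: hostname, username and paths together identify a client.
    text = snapshot["hostname"] + snapshot["username"] + ",".join(snapshot["paths"])
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Made-up snapshots: two from the same client, one from another.
snapshots = [
    {"id": "aaa1", "hostname": "laptop", "username": "alice", "paths": ["/home/alice"]},
    {"id": "bbb2", "hostname": "laptop", "username": "alice", "paths": ["/home/alice"]},
    {"id": "ccc3", "hostname": "server", "username": "root", "paths": ["/etc", "/srv"]},
]

# Labelling by restic "id" yields one label value per snapshot ...
ids = {s["id"] for s in snapshots}
# ... while the grouping hash yields one label value per client.
groups = {grouping_hash(s) for s in snapshots}

print(len(ids), len(groups))
```

With these sample values, the two snapshots from the same client collapse into one group, so the number of label values depends on the number of clients rather than the number of backups.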

@ngosang ngosang added the question Further information is requested label Feb 10, 2023
@Enrico204
Contributor Author

Ok, that makes sense. Thanks!

I suppose I will investigate why my client appears duplicated, and I will open a new issue or PR if I find a possible enhancement :-)

@ferrarimarco

ferrarimarco commented Jun 4, 2023

As explained in #16, in my case I get duplicates because of path changes (I added a couple of directories for a given host and user).
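
A small sketch of how a path change could produce such a duplicate, assuming the backup paths are part of the hash input (the function `grouping_hash` and all values here are hypothetical, not taken from the exporter):

```python
import hashlib


def grouping_hash(hostname: str, username: str, paths: list[str]) -> str:
    # Assumption: the hash input concatenates hostname, username and paths.
    text = hostname + username + ",".join(paths)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Same host and user, but a directory was added between backups.
before = grouping_hash("laptop", "alice", ["/home/alice"])
after = grouping_hash("laptop", "alice", ["/home/alice", "/etc"])

# Different digests mean the dashboard would show two separate clients.
paths_changed_hash = before != after
print(paths_changed_hash)
```

Under this assumption, any edit to the path list starts a new group, even though the host and user are unchanged.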
