Backup speed getting worse after cleanup #94

Open
jure1 opened this issue Jun 20, 2014 · 7 comments

Comments

@jure1

jure1 commented Jun 20, 2014

I've been using attic backup for a month and a half now. I've been very happy with it, but I've run into an issue. At first, the encrypted remote backups of 400G were taking around 1 hour on a 10Mbit line. Then I started cleaning up old archives from the repository (using attic delete -v -s repo:archive) and the backup times got a lot worse: today's took 8 hours.

Start time: Fri Jun 20 02:03:25 2014
End time: Fri Jun 20 10:12:38 2014
Duration: 8 hours 9 minutes 12.86 seconds
Number of files: 1168340

                 Original size   Compressed size   Deduplicated size
This archive:        414.77 GB         334.49 GB             2.14 GB
All archives:          4.58 TB           3.72 TB           272.92 GB


Here are the durations for the last 30 days:
1h 14m 51.72s
1h 2m 44.56s
1h 6m 26.34s
56m 43.38s
58m 20.30s
1h 4m 41.44s
1h 6m 17.06s
1h 36.73s
1h 47.97s
1h 2m 24.76s
2h 53m 0.03s
2h 16m 45.68s
2h 59m 45.16s
2h 47m 52.60s
2h 45m 16.81s
4h 23m 42.96s
2h 53m 50.14s
2h 15m 29.59s
2h 1m 28.54s
2h 56m 5.82s
5h 4m 34.30s
4h 54m 47.90s
7h 44m 42.41s
5h 44m 1.25s
6h 24m 26.47s
6h 1m 7.58s
7h 20m 30.36s
7h 33m 34.42s
8h 29m 44.88s
8h 9m 12.86s

The amount of changed data has stayed about the same, roughly 2-3 GB/day:
Original size Compressed size Deduplicated size
This archive: 426.20 GB 347.28 GB 2.12 GB
This archive: 426.10 GB 347.16 GB 1.98 GB
This archive: 426.29 GB 347.25 GB 2.35 GB
This archive: 426.55 GB 347.46 GB 2.08 GB
This archive: 425.77 GB 346.65 GB 1.87 GB
This archive: 425.79 GB 346.67 GB 1.73 GB
This archive: 425.79 GB 346.67 GB 1.90 GB
This archive: 426.04 GB 346.84 GB 1.94 GB
This archive: 426.08 GB 346.82 GB 1.76 GB
This archive: 426.33 GB 347.05 GB 1.92 GB
This archive: 426.81 GB 347.41 GB 1.99 GB
This archive: 427.17 GB 347.70 GB 1.96 GB
This archive: 428.70 GB 349.23 GB 3.14 GB
This archive: 427.20 GB 347.72 GB 1.80 GB
This archive: 426.22 GB 346.61 GB 869.51 MB
This archive: 427.88 GB 348.10 GB 1.92 GB
This archive: 427.81 GB 348.01 GB 1.73 GB
This archive: 427.80 GB 347.92 GB 1.83 GB
This archive: 428.88 GB 348.85 GB 3.63 GB
This archive: 428.90 GB 348.86 GB 1.92 GB
This archive: 428.87 GB 348.83 GB 1.91 GB
This archive: 429.06 GB 348.92 GB 2.30 GB
This archive: 432.09 GB 351.84 GB 5.02 GB
This archive: 432.27 GB 351.98 GB 2.13 GB
This archive: 413.95 GB 333.78 GB 2.05 GB
This archive: 413.81 GB 333.76 GB 2.13 GB
This archive: 413.79 GB 333.74 GB 1.74 GB
This archive: 413.78 GB 333.71 GB 1.71 GB
This archive: 414.24 GB 334.04 GB 1.88 GB
This archive: 414.33 GB 334.11 GB 1.95 GB
This archive: 414.45 GB 334.23 GB 1.97 GB
This archive: 414.77 GB 334.49 GB 2.14 GB
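
As a rough sanity check on the figures above, the reported durations can be converted to seconds and compared. This is only a sketch: `parse_duration` and `effective_mbit_per_s` are hypothetical helper names, 1 GB is assumed to mean 10**9 bytes, and the deduplicated size is assumed to approximate what was actually sent over the line, which may not hold exactly.

```python
import re

def parse_duration(text):
    """Convert a duration such as '1h 2m 44.56s' or '56m 43.38s' to seconds."""
    units = {"h": 3600.0, "m": 60.0, "s": 1.0}
    parts = re.findall(r"(\d+(?:\.\d+)?)\s*([hms])", text)
    if not parts:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(float(value) * units[unit] for value, unit in parts)

def effective_mbit_per_s(gigabytes, seconds):
    """Average rate in Mbit/s, assuming 1 GB = 10**9 bytes."""
    return gigabytes * 8 * 1000 / seconds

# Two runs taken from the lists above: an early one and the latest one.
fast = parse_duration("1h 2m 44.56s")
slow = parse_duration("8h 9m 12.86s")

print(f"slowdown factor: {slow / fast:.1f}x")
# Assumption: roughly the deduplicated size (2.14 GB) went over the wire.
print(f"average rate on the slow run: {effective_mbit_per_s(2.14, slow):.2f} Mbit/s")
```

If the computed rate is far below the nominal 10Mbit line, the line is either not being used fully or is not delivering its nominal speed.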

Any ideas about what I could do?
I'm on attic v. 0.12.

@jborg
Owner

jborg commented Jun 21, 2014

Are you performing the deletes from the same machine and user account?
Or are you getting "Initializing cache..." messages when creating a new backup after another one has been deleted?

@jure1
Author

jure1 commented Jun 21, 2014

Yes, it's on the same machine under the same account. When I create a backup I just get the list of files and the stats at the end.
The command I'm using is:
attic create -v --stats --exclude $EXCLUDE $DEST $SRC
Maybe I'll try skipping the deletes for a few days, or doing them just once a week.

@jborg
Owner

jborg commented Jun 22, 2014

This is really strange. Deleting archives should actually make things slightly faster, not slower.

For instance, when the backup time jumped from "1h 2m" to "2h 53m", did that directly coincide with the first time you deleted an archive?

And what about resource usage during the backup? Is the internet connection saturated, is the server swapping, or is the CPU pegged? What is the bottleneck?
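
One way to put a number on the "is the CPU pegged or just waiting" question on Linux is to compare two samples of the aggregate `cpu` line in `/proc/stat`. The sketch below is an illustration rather than anything Attic provides: the function names are hypothetical, and the sample lines are made-up values used only to show the arithmetic.

```python
def read_cpu_line(path="/proc/stat"):
    """Read the aggregate 'cpu' line (the first line of /proc/stat on Linux)."""
    with open(path) as f:
        return f.readline()

def parse_cpu_line(line):
    """Return the numeric fields of the aggregate 'cpu' line."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]

def iowait_share(before, after):
    """Fraction of elapsed CPU time spent in iowait between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    # Fields: user nice system idle iowait irq softirq steal (guest is
    # already included in user/nice, so it is left out of the total).
    total = sum(deltas[:8])
    return deltas[4] / total if total else 0.0

# Made-up samples, purely to illustrate the calculation.
before = "cpu 100 0 50 800 50 0 0 0 0 0"
after = "cpu 150 0 70 1000 280 0 0 0 0 0"
print(f"iowait share: {iowait_share(before, after):.0%}")
```

For live use, call `read_cpu_line()` twice a few seconds apart during a backup and pass both results to `iowait_share`. A high share suggests the process is waiting on I/O rather than computing; note that on Linux iowait counts waiting on block (disk) I/O, so a slow network link would not necessarily show up there.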

@jure1
Author

jure1 commented Jun 22, 2014

Actually, it's the next jump in time that coincides with when I started deleting old archives:
2h 45m 16.81s
4h 23m 42.96s
but now that you mention it, the jump from 1h to 2h is also rather odd. I'm going to try a backup without deleting anything tomorrow to see if it makes a difference.

I've checked the resource graphs during the backup. Load average went up when I started deleting old archives. Usually one CPU is fully busy, but that's because of iowait. The disks are much faster than the network line, so it's probably the network. I'll try to monitor it more closely today.

Is there a debug setting or something like that so I could see whether it's using the cache you mentioned?
And there probably isn't any difference between using attic delete for the old archives and using attic prune to do it automatically?

@jborg
Owner

jborg commented Jun 22, 2014

Attic uses a cache directory located at $HOME/.cache/attic/REPOID/. This cache is used and updated whenever an archive is created or deleted. If Attic detects that this cache is out of date (an archive has been created or deleted using a different cache directory) the cache will be automatically rebuilt and the following will be printed to stdout:

Initializing cache...
Analyzing archive: archivename1
Analyzing archive: archivename2
...

After that, the create/delete command will continue as usual. This rebuild is quite time-consuming and CPU-intensive, but it should not happen unless a single repository is modified by more than one system/user.

Anyway, unless you're seeing the above output this must be something else...
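
For unattended backups where nobody watches the terminal, the check described above can be automated by scanning the captured output for the rebuild message. A minimal sketch, assuming the message text is exactly as quoted above; `cache_was_rebuilt`, `archives_analyzed` and `run_and_check` are hypothetical helper names, not part of Attic.

```python
import subprocess

REBUILD_MARKER = "Initializing cache..."
ANALYZE_PREFIX = "Analyzing archive: "

def cache_was_rebuilt(output):
    """True if the captured output contains the cache rebuild message."""
    return REBUILD_MARKER in output

def archives_analyzed(output):
    """Names of the archives listed as analyzed during a rebuild."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        if line.startswith(ANALYZE_PREFIX):
            names.append(line[len(ANALYZE_PREFIX):])
    return names

def run_and_check(command):
    """Run a backup command (a list of arguments) and report a rebuild.

    stderr is merged into stdout so the marker is found either way.
    """
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, text=True)
    return result.returncode, cache_was_rebuilt(result.stdout)

# The sample output quoted above, used here only as test input.
sample_output = """Initializing cache...
Analyzing archive: archivename1
Analyzing archive: archivename2
"""
print(cache_was_rebuilt(sample_output), archives_analyzed(sample_output))
```

`run_and_check` would be called with the same argument list used for the nightly `attic create` run; it is not invoked here because it needs a real repository.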

Some additional questions:

  • Does the performance improve if you allow two backups to take place without deleting any archives in-between?
  • Is the extra "slowness" evenly distributed throughout the backup (each individual file now takes a lot longer to back up), or is the bulk of the extra time spent before or after the actual archive creation takes place, or on just a few files?
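
The second question can be answered from the verbose output by timestamping each line as it arrives and looking for the largest gaps. The following is a sketch under assumptions: `timestamp_lines` and `slowest_gaps` are hypothetical names, and whether a gap belongs to the file printed before or after it depends on when the tool prints each name, which is not checked here. Output buffering in a pipe may also blur the timings.

```python
import time

def timestamp_lines(stream):
    """Yield (monotonic_seconds, line) for each line read from a stream."""
    for line in stream:
        yield time.monotonic(), line.rstrip("\n")

def slowest_gaps(events, top=3):
    """Return the `top` largest (gap_seconds, line) pairs.

    The gap for a line is the time elapsed since the previous line appeared.
    """
    events = list(events)
    gaps = [(t1 - t0, line)
            for (t0, _), (t1, line) in zip(events, events[1:])]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]

# Synthetic events, purely to illustrate the calculation.
events = [(0.0, "home/a"), (0.5, "home/b"), (10.5, "home/c"), (11.0, "home/d")]
for gap, line in slowest_gaps(events, top=2):
    print(f"{gap:6.1f}s before {line}")

# Live use (not run here): pipe the verbose backup output into a script
# that calls slowest_gaps(timestamp_lines(sys.stdin)).
```

A few very large gaps point at specific files or at the start or end of the run, while uniformly larger gaps point at something that slows every transfer.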

@jure1
Author

jure1 commented Jul 6, 2014

It took a while to diagnose, but I found out that something was wrong with the internet connection. I tried all sorts of things (stopping the deletion of old backups, enabling/disabling ssh compression, keepalives, etc.) until I started monitoring the internet connection and found that it was the cause. Meanwhile the provider probably changed something, as backups are now as fast as they were at the start.
Maybe you could add some more statistics or a verbose option for debugging in the future if you're out of ideas for what to add.
Attic is a great program, thanks :)

@ThomasWaldmann
Contributor

Please close (maybe open a new issue about the suggestion in the last comment).
