
Expiration strategy deleting backups that should be kept #121

Open
yorkday opened this issue May 19, 2018 · 26 comments

@yorkday

commented May 19, 2018

Hi there,

I previously reported an issue with the expiration strategy; however, that issue was closed when the branch was merged to master: #105

Because the expiration strategy uses absolute timestamp differences (in seconds) to calculate the age between backups, it deletes backups that are not spaced at the exact durations required.

The two examples below show different ways rsync_tmbackup removes backups that would be expected to be kept.

I do not like the current expiration strategy because it uses an absolute duration based on timestamp seconds. If there is a backup taken at 11pm one day and another at 2am the next day, and the user wishes to keep daily backups, both should be retained as the daily backup points if they are the only backups taken on each day.

Preferred Solution:
rsync_tmbackup should retain backups for days, weeks, months and years based on those calendar periods, not on the number of seconds between backup points.
I prefer the method used by restic (http://restic.readthedocs.io/en/latest/060_forget.html). This allows users to specify the number of days, weeks, months or years to retain. This way users can retain the last backup for a given month, rather than a fixed window like 30 days, which in some cases could hold 2 backups for a month (say on the 1st and the 31st of the month) or no backups for a month (e.g. February which has 28/29 days).

Example 1 - Daily backups removed:
Command used: rsync_tmbackup.sh ./source/ ./target/

Before:

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-13-000010
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-14-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-15-000008
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-16-000007
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-17-000006
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-18-000005
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-190009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 19:47 latest -> 2018-05-19-194725

After:
Note: Backups from the 17th, 15th and 13th were deleted because they were not exactly 24 hours apart, as measured in seconds since the last backup.

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-14-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-16-000007
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-18-000005
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-190009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:03 latest -> 2018-05-19-200316
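Concretely, the gap behind that deletion can be computed as a quick sketch (GNU `date` is assumed here; the script itself uses platform-specific date parsing):

```shell
# Gap between 2018-05-17-000006 and 2018-05-18-000005 from the listing above:
# one second short of a full day, so a strict 86,400-second rule treats the
# older backup as expendable.
t1=$(date -u -d "2018-05-17 00:00:06" +%s)
t2=$(date -u -d "2018-05-18 00:00:05" +%s)
echo $(( t2 - t1 ))   # 86399 -- less than 86400, so 2018-05-17 is expired
```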

Example 2 - Monthly backups removed with 30 day strategy:
Command Used: rsync_tmbackup.sh --strategy "1:1 30:30" ./source/ ./target/

Before:

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-01-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-02-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-02-28-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-30-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:03 latest -> 2018-05-19-200316

After:
Note: Backups from January and February were removed altogether, despite them being the only backup points for those months.

drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-03-31-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-04-30-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-01-000009
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194507
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-194725
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-200316
drwxr-xr-x   3 user  staff    96 19 May 18:58 2018-05-19-202723
-rw-r--r--   1 user  staff     0 19 May 18:59 backup.marker
lrwxr-xr-x   1 user  staff    17 19 May 20:27 latest -> 2018-05-19-202723
@yorkday

Author

commented Jun 29, 2018

Unfortunately I think there is a serious bug with the way backup expiration currently works.

The test cases presented are flawed because they assume a large existing history of backup points, and because the time at which the test is run differs from the timestamp of each backup.

In my experience running the tool so far, I am continually finding gaps in daily backups wherever there are not exactly 86,400 seconds between backups. Worse than that, I am finding that zero weekly backups are kept after 30 days.

For example, after running daily backups since 2018-04-17 with the default strategy (1:1 30:7 365:30), here is what I have across several backup targets:

Example 1:

d--------- 25 root root 4096 May 22 21:53 2018-06-02-000012
d--------- 25 root root 4096 May 22 21:53 2018-06-04-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-06-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-07-000019
d--------- 25 root root 4096 May 22 21:53 2018-06-09-000016
d--------- 25 root root 4096 May 22 21:53 2018-06-11-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-13-000018
d--------- 25 root root 4096 May 22 21:53 2018-06-15-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-17-000018
d--------- 25 root root 4096 May 22 21:53 2018-06-18-000020
d--------- 25 root root 4096 May 22 21:53 2018-06-20-000014
d--------- 25 root root 4096 May 22 21:53 2018-06-21-000021
d--------- 25 root root 4096 May 22 21:53 2018-06-23-000013
d--------- 25 root root 4096 May 22 21:53 2018-06-24-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-25-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-27-000012
d--------- 25 root root 4096 May 22 21:53 2018-06-29-000017
d--------- 25 root root 4096 May 22 21:53 2018-06-30-000018
drwxr-xr-x  2 root root 4096 Jun 30 00:00 backup_log
-rw-r--r--  1 root root    0 Apr 17 11:06 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:00 latest -> 2018-06-30-000018

Example 2:

d--------- 31 root root 4096 May 24 00:28 2018-06-02-000129
d--------- 31 root root 4096 May 24 00:28 2018-06-04-000155
d--------- 31 root root 4096 May 24 00:28 2018-06-06-000122
d--------- 31 root root 4096 May 24 00:28 2018-06-08-000141
d--------- 31 root root 4096 May 24 00:28 2018-06-09-000209
d--------- 31 root root 4096 May 24 00:28 2018-06-11-000120
d--------- 31 root root 4096 May 24 00:28 2018-06-13-000155
d--------- 31 root root 4096 May 24 00:28 2018-06-15-000138
d--------- 31 root root 4096 May 24 00:28 2018-06-17-000145
d--------- 31 root root 4096 May 24 00:28 2018-06-18-003218
d--------- 31 root root 4096 May 24 00:28 2018-06-19-010608
d--------- 31 root root 4096 May 24 00:28 2018-06-20-010710
d--------- 31 root root 4096 May 24 00:28 2018-06-22-000118
d--------- 31 root root 4096 May 24 00:28 2018-06-23-000126
d--------- 31 root root 4096 May 24 00:28 2018-06-25-000112
d--------- 31 root root 4096 May 24 00:28 2018-06-27-000104
d--------- 31 root root 4096 May 24 00:28 2018-06-29-000112
d--------- 31 root root 4096 May 24 00:28 2018-06-30-000140
drwxr-xr-x  2 root root 4096 Jun 30 00:02 backup_log
-rw-r--r--  1 root root    0 Apr 17 11:07 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:02 latest -> 2018-06-30-000140

Example 3:

d--------- 20 root root 4096 Oct 16  2017 2018-06-02-000002
d--------- 20 root root 4096 Oct 16  2017 2018-06-03-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-04-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-05-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-06-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-07-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-08-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-09-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-10-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-11-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-12-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-13-000004
d--------- 20 root root 4096 Oct 16  2017 2018-06-14-000012
d--------- 20 root root 4096 Oct 16  2017 2018-06-16-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-18-000002
d--------- 20 root root 4096 Oct 16  2017 2018-06-19-000005
d--------- 20 root root 4096 Oct 16  2017 2018-06-21-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-22-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-23-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-25-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-27-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-28-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-29-000003
d--------- 20 root root 4096 Oct 16  2017 2018-06-30-000002
drwxr-xr-x  2 root root 4096 Jun 30 00:00 backup_log
-rw-r--r--  1 root root    0 Apr 17 10:26 backup.marker
lrwxrwxrwx  1 root root   17 Jun 30 00:00 latest -> 2018-06-30-000002

Note the issues:

  1. Gaps in daily backups
  2. Not a single weekly backup kept after 30 days

I know there are gaps, because my logs show that backups have run every single day - at varying levels of minutes/seconds past midnight.

This is a serious fault introduced by the recent code allowing strategy customisation: it requires exactly 86,400 seconds between daily backups, and the strategy test cases assume a pre-existing history of backups (whereas normally backups are added one per day, and each prune compares against the time it is run).

I think the only workaround at this time is to retain all daily backups (i.e. --strategy "1:1") and prune manually until this is resolved.

I would love to help further, but I am really not familiar with scripting - nor do I know how to write code that would remain correct across all the platforms that this script supports.

@laurent22

Owner

commented Jun 29, 2018

I see it's indeed an issue for the daily backups. Making the interval a bit less strict, like 85,000 seconds, would handle this better, but I'm not sure that's a good solution.

Do you have any idea though why there's no backup after 30 days? Any way to replicate this bug?

@laurent22 laurent22 added the bug label Jun 29, 2018

@yorkday

Author

commented Jul 2, 2018

Hi Laurent,

I have replicated the bug through some hacking at the code.

Environment: macOS High Sierra 10.13.5

Steps to replicate bug in new backup:

  1. Download latest version of rsync_tmbackup
  2. Create empty source and target folders
  3. Touch backup.marker in target folder
  4. Modify rsync_tmbackup.sh
  • set NOW and EPOCH variables manually to replicate as-was backup timing
  • place NOW and EPOCH variables on lines 4 and 5 of the script and comment out the originals lower in the script
  • NOW=$(date +"%Y-06-10-000001")
  • EPOCH=$(date -j -f "%d-%B-%y-%T" 10-JUN-18-00:00:01 "+%s")
  5. Run script with the following parameters:
  • Strategy is keep daily after 1 day, keep weekly after 8 days
  • ./rsync_tmbackup.sh --strategy "1:1 8:7" --log-dir /Users/yday/Downloads/rsync-time-backup-master ./source ./target
  6. After each run of the script, increment the dates for NOW and EPOCH by 1 day
  7. After 8 runs, the target directory looks like this:
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-10-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-11-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-12-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-13-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-14-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-15-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-16-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-17-000001
-rw-r--r--   1 yday  staff     0  2 Jul 21:24 backup.marker
lrwxr-xr-x   1 yday  staff    17  2 Jul 21:27 latest -> 2018-06-17-000001
  8. After the 9th run, the first backup is pruned, rather than being kept as the first weekly backup
rsync_tmbackup: Creating destination ./target/2018-06-18-000001
rsync_tmbackup: Expiring ./target//2018-06-10-000001
rsync_tmbackup: Starting backup...
  • Target directory looks like:
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-11-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-12-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-13-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-14-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-15-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-16-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-17-000001
drwxr-xr-x   2 yday  staff    64 29 Jun 15:57 2018-06-18-000001
-rw-r--r--   1 yday  staff     0  2 Jul 21:24 backup.marker
lrwxr-xr-x   1 yday  staff    17  2 Jul 21:28 latest -> 2018-06-18-000001
  9. Now every subsequent run prunes the oldest daily backup, never keeping any weekly backups
  10. After 10 more runs, the target directory looks like:
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-23-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-24-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-25-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-26-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-27-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-28-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-29-000001
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-30-000001
-rw-r--r--    1 yday  staff     0  2 Jul 21:24 backup.marker
lrwxr-xr-x    1 yday  staff    17  2 Jul 21:31 latest -> 2018-06-30-000001

Bug does not occur for existing backups with history:
If I pre-populate the target directory with a large history of existing directories, it correctly prunes dailies and weeklies as expected:

  1. Target directory:
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-17-102937
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-17-170951
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-18-183109
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-20-220247
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-170314
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-222356
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-21-223504
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-22-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-23-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-24-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-25-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-26-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-27-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-28-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-29-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-30-000012
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-01-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-02-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-03-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-04-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-05-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-06-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-07-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-08-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-09-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-10-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-11-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-12-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-13-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-14-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-15-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-16-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-17-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-18-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-19-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-20-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-21-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-22-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-22-222223
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-23-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-24-000008
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-25-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-26-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-27-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-28-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-29-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-30-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-31-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-01-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-02-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-03-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-04-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-05-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-06-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-07-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-08-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-09-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-10-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-11-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-12-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-13-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-14-000015
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-15-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-16-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-17-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-18-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-19-000010
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-20-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-21-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-22-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-23-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-24-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-25-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-26-000008
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-27-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-28-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-29-000006
-rw-r--r--    1 yday  staff     0  2 Jul 21:32 backup.marker
  2. After one run:
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-18-183109
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-04-26-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-03-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-10-000005
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-18-000003
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-05-25-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-01-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-08-000006
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-16-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-23-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-24-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-25-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-27-000004
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-28-000007
drwxr-xr-x    2 yday  staff    64  2 Jul 21:33 2018-06-29-000006
drwxr-xr-x    2 yday  staff    64 29 Jun 15:57 2018-06-30-000001
-rw-r--r--    1 yday  staff     0  2 Jul 21:32 backup.marker
lrwxr-xr-x    1 yday  staff    17  2 Jul 21:33 latest -> 2018-06-30-000001
  3. Note the 26/06 daily is still incorrectly pruned (however, this is the absolute-seconds issue previously described).

Summary:
So, to summarise:

  1. Pruning strategy is not working for new backups with no history
  2. Pruning strategy works when there is existing backup history

It would appear the cause of this issue for new backups is that there are no historical backups old enough to meet the current strategy, so the first backup to roll off the previous strategy tier is pruned when it should be kept. In my example, on the 9th run the oldest backup is pruned because it is not 7 x 86,400 seconds older than the next backup. This is why no backups are kept after 8 days (or 30 days for the default strategy): no new daily backup will ever meet the age rule.

The reason this bug was not picked up was because:

  1. The test scripts assumed an existing history of directories
  2. The test scripts did not set the NOW and EPOCH times as-was to verify how the backup would behave over time (they only ran at the current date/time)

In the above example, the 2018-06-10 backup should have been kept as the first weekly backup, and backups between 2018-06-11 to 2018-06-16 should have been pruned.

One possible solution to deal with daily backups being pruned, is not to measure the distance between backups in absolute seconds, but to measure the difference in days between individual backups.

For example, a backup at 2018-06-10-235959 and one at 2018-06-11-000000 should each be kept as daily backups, even though they are 1 second apart (currently the script would prune the first backup, even though it was the only backup taken for that day). What matters is not that they are 86,400 seconds apart (distance between backups), but that they fall on different days (absolute date of backups), so both are valid to retain under a daily backup strategy.
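That rule can be sketched as a small shell function. `keep_if_new_day` is a hypothetical helper, not part of the script, and GNU `date` is assumed:

```shell
# Keep a backup whenever its calendar day differs from the day of the last
# kept backup. With this rule, 2018-06-10-235959 and 2018-06-11-000000 are
# both kept even though they are only one second apart.
keep_if_new_day() {
    last=-1
    while read -r name; do
        # Strip the -HHMMSS suffix and convert the date to a whole-day index
        day=$(( $(date -u -d "${name%-*}" +%s) / 86400 ))
        if [ "$day" -ne "$last" ]; then
            echo "$name"   # first backup seen for this day: keep it
            last=$day
        fi
    done
}

printf '%s\n' 2018-06-10-235959 2018-06-11-000000 2018-06-11-120000 | keep_if_new_day
```

The real script processes backups newest-first and per strategy token; this only illustrates the day-bucket comparison in isolation.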

I would also like to see the strategy adjusted to use human-readable time periods (e.g. weeks, months, years) rather than absolute days. People generally do not want to keep backups every 30 days; they want to keep monthly backups, which has to deal with months of 28, 29, 30 or 31 days. Unfortunately this would mean a complete re-write of the pruning strategy - which is also why I suggested the approach used by restic:
http://restic.readthedocs.io/en/latest/060_forget.html
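As a minimal sketch of that kind of calendar-bucket selection over the script's YYYY-MM-DD-HHMMSS directory names ("keep the last backup of each month"; `keep_monthly` is illustrative only, not restic's or the script's actual code):

```shell
# For each calendar month, keep only the newest backup name. Relies on the
# YYYY-MM-DD-HHMMSS naming scheme sorting chronologically.
keep_monthly() {
    sort | awk -F- '{ latest[$1 "-" $2] = $0 } END { for (m in latest) print latest[m] }' | sort
}

# January and February each keep exactly one backup, regardless of spacing:
printf '%s\n' 2018-01-31-000009 2018-02-01-000009 2018-02-28-000009 | keep_monthly
```

Under this rule the Example 2 scenario above would retain 2018-01-31 and 2018-02-28 instead of deleting both.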

I hope this helps.

@Albert444


commented Oct 26, 2018

Hi,

first of all, a big thank you to laurent22 for writing this very nice and handy script! It is most useful for doing all kinds of regular, reliable backups.

Although I love the script, I ran into the bug as well in daily use (note, it does not occur with your test script, for the reasons discussed above). I am not a pro at scripting and could not do it as elegantly as it should be done, but I think I found a solution, altering the script in two locations, that seems to work for me.

First, I changed linux*) date -d "${1:0:10} ${1:11:2}:${1:13:2}:${1:15:2}" +%s ;; to linux*) date -d "${1:0:10}" +%s ;; in order to artificially snap each backup's time to midnight, so that slightly different backup times per day (due to differing amounts of backup work when you run the script several times for different directories) do not matter.
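The effect of that first change can be checked in isolation (GNU `date`, i.e. the `linux*)` branch of the script, is assumed; the timestamp is illustrative):

```shell
# Parsing only the date part snaps a backup's timestamp to midnight,
# discarding the intra-day drift that the strict seconds comparison trips on.
full=$(date -u -d "2018-06-10 21:53:07" +%s)
day_only=$(date -u -d "2018-06-10" +%s)
echo $(( full - day_only ))   # 78787 seconds of intra-day drift removed
```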

Second, I changed the contents of the fn_expire_backups function, using a modulo calculation:

fn_expire_backups() {
	local current_timestamp=$EPOCH
	local start_timestamp=$(fn_parse_date "2018-10-10-000000") # arbitrary fixed reference date
	local last_kept_timestamp=9999999999

	# Process each backup dir from most recent to oldest
	for backup_dir in $(fn_find_backups | sort -r); do
		local backup_date=$(basename "$backup_dir")
		local backup_timestamp=$(fn_parse_date "$backup_date")

		# Skip if failed to parse date...
		if [ -z "$backup_timestamp" ]; then
			fn_log_warn "Could not parse date: $backup_dir"
			continue
		fi

		# Find which strategy token applies to this particular backup
		for strategy_token in $(echo $EXPIRATION_STRATEGY | tr " " "\n" | sort -r -n); do
			IFS=':' read -r -a t <<< "$strategy_token"

			# After which date (relative to today) this token applies (X)
			local cut_off_timestamp=$((current_timestamp - ${t[0]} * 86400))

			# Every how many days should a backup be kept past the cut off date (Y)
			local cut_off_interval=$((${t[1]} * 86400))

			# If we've found the strategy token that applies to this backup
			if [ "$backup_timestamp" -le "$cut_off_timestamp" ]; then

				# Special case: if Y is "0" we delete every time
				if [ $cut_off_interval -eq "0" ]; then
					fn_expire_backup "$backup_dir"
					break
				fi

				# Instead of only checking the interval since the last kept
				# backup, keep backups whose day count since the reference
				# date is a multiple of the keep interval (Y days)
				local interval_since_last_kept=$((last_kept_timestamp - backup_timestamp))

				dayssincestart=$(( (backup_timestamp - start_timestamp) / 86400 ))
				modulo=$(( dayssincestart % (cut_off_interval / 86400) ))

				if [ "$modulo" -ne "0" ]; then
					# Not on a keep boundary: delete it
					fn_expire_backup "$backup_dir"
				else
					if [ "$interval_since_last_kept" -lt "86400" ]; then # if the difference is less than one day
						fn_expire_backup "$backup_dir"
					else
						# On a keep boundary: keep it
						last_kept_timestamp=$backup_timestamp
					fi
				fi
				break
			fi
		done
	done
}

It is just a suggestion; maybe laurent22 or anybody else can check it and build it into the script in a more elegant way.

I hope this helps somebody...

@hazelra


commented Jan 17, 2019

I appreciate all the work done here! I think this is a great way to make backups. I've had the same issue with the expiration strategy and pruning not quite working as expected. Has the issue been corrected in the latest code from 7 months ago?

@yorkday

Author

commented Jan 18, 2019

Hi @hazelra, unfortunately I don't think the issue has been corrected since this issue was raised.
I have lived with the issue by choosing to keep all daily backups forever. Even so, not all daily backups are kept, due to the absolute duration in seconds required between two backups run on separate days.

From a backup perspective, I continue to use this tool for its unencrypted, local backup capabilities. Restoration is as simple as browsing to the directory and copying files back when I need anything. For more robust backups, I use http://restic.net because it is under constant development and improvement and has great encryption and cloud backup capabilities. restic has a much more capable and flexible pruning capability, but it enforces encryption, which isn't what I want for my local backup solution.

I think if you want something simple and easy to use without needing encryption, this tool could be OK. For anything else, I'd recommend something like restic which is more robust and feature rich.

Unfortunately I don't have the skills to modify the code, so I'm stuck with what it provides (but still 100% grateful for the work that went into it).

@Albert444


commented Jan 18, 2019

Hi,
I think this simple script is really good - especially the ability to have space-saving incremental backups while retaining the full browsing experience in the file explorer within the different backups.

I suggested a rough solution for the bug three comments above, and so far it is working for me. You could open the script, e.g. with nano, and replace the original code at the two locations with my suggestions. Then check over a few days, with a test setup, whether this serves your needs...

I am well aware that a really good coder would perhaps do it in a more elegant way. So maybe someone is willing to take my idea and build it into the official script, if the solution is fit for it...

Best greetings....

@hazelra


commented Jan 22, 2019

Thanks, yorkday. That's definitely a good solution, to save all backups and have restic do the pruning. If you have any code or scripts showing how you use restic, I and maybe others would find that useful.

Albert, I'm not much of a coder, but I may be able to figure out which lines of code to replace from your post. The first change seems straightforward. The second, which I'll have to check, I'm not sure how much of the original code to cut out. Maybe there is a more elegant way of fixing it, but in my experience there is nothing wrong with inelegant code that gets the job done.
Hazel

@kapitainsky


commented Mar 31, 2019

I find this script very useful, and the expiration bug very annoying. As I have not seen much activity here, I decided to DIY it. I think I now have a working fix. Anybody interested, please have a look at my fork at https://github.com/kapitainsky/rsync-time-backup, in the bugfix-fn_expire_backups branch.
The only change is a modified fn_expire_backups function. Here is the new one:

fn_expire_backups() {
	local current_timestamp=$EPOCH
	local last_kept_timestamp=9999999999

	# Process each backup dir from the oldest to the most recent
	for backup_dir in $(fn_find_backups | sort); do

		local backup_date=$(basename "$backup_dir")
		local backup_timestamp=$(fn_parse_date "$backup_date")

		# Skip if failed to parse date...
		if [ -z "$backup_timestamp" ]; then
			fn_log_warn "Could not parse date: $backup_dir"
			continue
		fi

		# If this is the first "for" iteration backup_dir points to the oldest backup
		if [ "$last_kept_timestamp" == "9999999999" ]; then
			# We don't want to delete the oldest backup. It becomes the first "last kept" backup
			last_kept_timestamp=$backup_timestamp
			# As we keep it we can skip processing it and go to the next oldest one
			continue
		fi

		# Find which strategy token applies to this particular backup
		for strategy_token in $(echo $EXPIRATION_STRATEGY | tr " " "\n" | sort -r -n); do
			IFS=':' read -r -a t <<< "$strategy_token"

			# After which date (relative to today) this token applies (X) - we use seconds to get exact cut off time
			local cut_off_timestamp=$((current_timestamp - ${t[0]} * 86400))

			# Every how many days should a backup be kept past the cut off date (Y) - we use days (not seconds)
			local cut_off_interval_days=$((${t[1]}))

			# If we've found the strategy token that applies to this backup
			if [ "$backup_timestamp" -le "$cut_off_timestamp" ]; then

				# Special case: if Y is "0" we delete every time
				if [ $cut_off_interval_days -eq "0" ]; then
					fn_expire_backup "$backup_dir"
					break
				fi

				# Calculate the number of days since the last kept backup
				local last_kept_timestamp_days=$((last_kept_timestamp / 86400))
				local backup_timestamp_days=$((backup_timestamp / 86400))
				local interval_since_last_kept_days=$((backup_timestamp_days - last_kept_timestamp_days))

				# Check if the current backup is in the interval between
				# the last backup that was kept and Y
				# to determine what to keep/delete we use days difference
				if [ "$interval_since_last_kept_days" -lt "$cut_off_interval_days" ]; then

					# Yes: Delete that one
					fn_expire_backup "$backup_dir"
					# backup deleted; no point checking shorter-timespan strategies - go to the next backup
					break
				else
					# No: Keep it.
					# as we keep it, this is now the last kept backup
					last_kept_timestamp=$backup_timestamp
					# and go to the next backup
					break
				fi
			fi
		done
	done
}

I would appreciate any comments. I am sure it can be coded in a more elegant way, but I just needed a fix.
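
To illustrate the difference: the keep/delete decision now compares whole-day buckets (timestamp / 86400) instead of raw seconds. A minimal sketch with made-up timestamps (not from a real backup set):

```shell
#!/bin/sh
# Two backups only two hours apart (23:00 and 01:00 the next day)
# land in different day buckets, so a "keep one per day" strategy
# can keep both. The epoch timestamps below are illustrative only.
t1=1546124400   # 2018-12-29 23:00 UTC
t2=1546131600   # 2018-12-30 01:00 UTC
d1=$((t1 / 86400))
d2=$((t2 / 86400))
echo $((d2 - d1))   # prints 1: different day buckets, though t2-t1 is only 7200s
```

With the old seconds-based comparison, these two backups are less than a day apart and one of them gets deleted; with day buckets, each can fill its own daily retention slot.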

@kjyv

commented Apr 5, 2019

I'm trying out your updated expiry logic with my daily backup, and I'm now seeing the script expire the last backup. There are two backups from that day, but I think it should expire the earlier one, not the latest one? Also, there should be a proper runnable automated test for the expiry logic with all edge cases. Your approach of using full days instead of seconds sounds good, though.

@kapitainsky

commented Apr 5, 2019

It will always keep the oldest backup. This is intentional - I agree it might not suit everybody's taste :) The logic behind it is that people often start a backup service and, when the first backup is finished, feel that their files are safely stored in the backup. Then they delete them from the source to free some space... and then, when pruning kicks in, the data is gone.

@kapitainsky

commented Apr 5, 2019

I posted it here to get some feedback - details can be easily changed. Now that I've got the right way to handle the expiry logic, it is quite simple. I did a reasonable amount of testing, including many edge cases, before publishing it - I am pretty confident in this function's logic. But of course some bugs are always possible.

@kjyv

commented Apr 5, 2019

You mean it will always keep the initial backup, or the oldest backup in the time span? In my example, the previous backup on that day is not the initial backup; there are other backups before it. You could argue it is not that important within one day, but if it is within a week or month, I guess the newer backup is more important.

@kapitainsky

commented Apr 5, 2019

Initial one.

@kjyv

commented Apr 5, 2019

Initial is fine and makes sense, but that is not what I meant - see above.
In general, btw, it's great you took the time to improve the expiry logic; I hope it works well otherwise :)

@kapitainsky

This comment has been minimized.

Copy link

commented Apr 5, 2019

This is all relative - a "day" is just a time measure. It will keep one backup per day (if requested in the strategy), but it won't be exactly the last backup from the calendar day. If you run hourly backups, IMHO it is not so important whether the backup from 23:00 or from 01:00 is kept.

@kapitainsky

This comment has been minimized.

Copy link

commented Apr 5, 2019

I might modify it and make it more "human" - to keep the last backup from each calendar day. I'll give it some time, though, to see if any other bugs are discovered.

@kjyv

This comment has been minimized.

Copy link

commented Apr 5, 2019

A real issue I noticed now is that it uses the last backup as --link-dest in the rsync command, even though that backup directory has just been expired.
I guess expiry should generally happen after doing the backup, to prevent re-transferring files that were deleted just before.
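
For example, the main flow could be reordered roughly like this (the fn_* names just mirror the script's naming style; these are hypothetical stubs, not the real functions):

```shell
#!/bin/sh
# Hypothetical stubs illustrating the suggested ordering; the real
# script's functions do much more than these one-liners.
fn_run_rsync()      { echo "rsync with --link-dest to previous backup"; }
fn_expire_backups() { echo "pruning expired backups"; }

fn_run_rsync        # the previous backup still exists for --link-dest
fn_expire_backups   # prune only after the new backup has completed
```

That way --link-dest always points at a directory that still exists, and hard links are made before any candidate source of links is pruned.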

@kjyv

commented Apr 5, 2019

I'm not running hourly backups, btw; I guess that is only one (maybe not very likely) use case. I'm not always on the same network as my backup location, so I might not have any backups for multiple days, and expiry should still work sensibly.

@kapitainsky

commented Apr 5, 2019

Good point, thanks for this comment. I will look into making it more reasonable.

@kapitainsky

commented Apr 6, 2019

> A real issue I noticed now is that it uses the last backup as --link-dest in the rsync command, though this backup directory has now been expired.
> I guess expiry should in general happen after doing the backup to prevent transferring files again that have just been deleted before.

Agreed. In general it is a real shame that this repo is not maintained anymore. The script does a fantastic job, but a few things should be fixed.

@laurent22

Owner

commented Apr 6, 2019

The issue is that it's difficult to test these pull requests, and I don't want to push a change that will break people's backup scripts. If there's a consensus on an expiration-strategy PR and it's been tested by a few users, I'm happy to merge it.

@filippocarletti

commented Jun 11, 2019

@laurent22 the NethServer community is testing @kapitainsky version of the expiration strategy.
https://community.nethserver.org/t/rsync-engine-old-backups-missing/
It's working as expected.

@hazelra

commented Jun 12, 2019

@laurent22

Owner

commented Jun 12, 2019

That's great, thanks for the feedback @filippocarletti. I've just realised I'm not using that pull request myself for my backups so I'm going to start doing so as well, just in case I notice any issue. I think we can probably merge quite soon.

@kjyv

commented Jun 13, 2019

Works fine for me, too. I made expiry happen after the backup, though.
Adding a proper test for the behaviour shouldn't be too hard and should be a prerequisite for adding new code like this, since it can delete valuable data for people using this in production. Just create a few empty directories and check that the right ones are being deleted.
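
Something like this sketch, for example (directory names follow the script's YYYY-MM-DD-HHMMSS convention; the actual call into the expiry logic is left as a placeholder, and the expected survivors are placeholders too):

```shell
#!/bin/sh
# Sketch of an automated expiry test: build a fake destination with
# empty, datestamped backup dirs, run the expiry logic against it,
# then assert which dirs survived.
set -e
dest=$(mktemp -d)
for d in 2019-01-01-120000 2019-02-01-120000 2019-06-01-120000; do
    mkdir "$dest/$d"
done

# ... source the script and run its expiry logic against "$dest" here ...

# Assert the survivors we expect for the chosen strategy (placeholder
# list; with the expiry call above still a stub, everything survives):
for kept in 2019-01-01-120000 2019-06-01-120000; do
    [ -d "$dest/$kept" ] || { echo "FAIL: $kept was deleted"; exit 1; }
done
echo "OK"
```

Run it against a few different strategies and clock offsets and the edge cases discussed above (backups straddling midnight, multi-day gaps) become cheap to pin down.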
