- I use duplicity to back up a set of node/volume trees on a primary backup storage host
- I use rsync to copy the primary backups to a secondary backup storage host at a distant location
- The duplicity setup takes care of removing old backups, but sometimes the balance between the available backup space and the growth of the data to back up makes it necessary to manually remove some of the older and bigger backups to avoid backup space exhaustion
- The secondary backup is not managed by duplicity but still requires removal of older backups and examination of which backup sets take how much storage space
Provided the backup space is organised as a two-level node/volume file tree containing the usual duplicity file sets for full and incremental backups (see the layout sketch after this list), the tools allow:
- To segment backup sets into three groups (small, medium and big) depending on the size of the set
- To list backup sets with their size and summed size
- To list the duplicity file sets of a backup set
- To select duplicity file sets from a backup set older or newer than a date
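For illustration, such a tree could look like the sketch below (the /backup prefix and the node/volume names are hypothetical; the file names follow the usual duplicity naming scheme, with a .gpg suffix when encryption is enabled):

/backup/node1/var/
    duplicity-full.20240101T020000Z.manifest.gpg
    duplicity-full.20240101T020000Z.vol1.difftar.gpg
    duplicity-full-signatures.20240101T020000Z.sigtar.gpg
    duplicity-inc.20240101T020000Z.to.20240108T020000Z.manifest.gpg
    duplicity-inc.20240101T020000Z.to.20240108T020000Z.vol1.difftar.gpg
    duplicity-new-signatures.20240101T020000Z.to.20240108T020000Z.sigtar.gpg
/backup/node2/home/
    ...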
./dupli-install
dupli help
dupli sets help
dupli files help
dupli sync help
dupli loop help
$ dupli sets help
dupli-sets [list] [sum] [rev] [help|all|small|medium|big]
list sets of volumes (<node>/<vol>) by size segment
use cmd
"small" to list vol <= 9G
"medium" to list vol > 9G && <= 99G
"big" to list vol > 99G
"all" to list all vol
default to list "all" volumes
default to list reverse sorted by size
default to show size and volume
use option "list" to show volume only (to pipe to other dupli module)
use option "rev" to reverse sort (lower first)
use option "sum" to sum all volume size
options must precede cmd
when using options cmd is required
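For example, combining an option with a cmd (options precede the cmd, which is then required):

dupli sets rev medium    # medium volumes with sizes, smallest first
dupli sets list big      # big volume names only, ready to pipe into another dupli module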
$ dupli files help
dupli-files [short] [newer] [info] [verb] [debug] help|raw|size|list|sets|fulls set=<node>/<vol> [weeks=<n(12)>]
select files older (default) or "newer" (option) than the last full backup made <n> "weeks" ago (default 12)
use cmd "raw" for bare full path list
use cmd "list" for sorted "ls(1) -lt"
option "short" suppress store prefix
use cmd "size" to get the total size of selected files
use cmd "sets" to reduce output to <node>/<vol> when selected file set is not empty
use cmd "full" to see all full starting points
use option info to show <node>/<vol> and date for "weeks"
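For example (the set name node1/var is hypothetical):

dupli files size set=node1/var weeks=8          # total size of files older than the full made 8 weeks ago
dupli files newer list set=node1/var weeks=8    # "ls -lt" of the files newer than that full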
$ echo 'secondary.backup := node.domain.tld' | sudo tee /etc/local/duply-local.mk
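The := assignment and the .mk suffix suggest the file is read as a GNU make fragment; assuming that, the value can be checked with something like:

$ make -sf /etc/local/duply-local.mk --eval 'print: ; @echo $(secondary.backup)' print
node.domain.tld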
$ dupli sync help
dupli-sync [time] [dry] [run] [help|main] [BWP=<power of two exponent sequence("8 9 10 11")>]
read a list of volumes (<node>/<vol>) on stdin and generate rsync cmd lines
use cmd "main" to allow options
default to show generated lines
use option "dry" to get dry-run rsync
use option "time" to insert time wrapper
use option "run" to pipe exec generated lines
option time must precede option run
argument BWP is used to compute bandwidth limit in KiB for rsync (eg. BWP="$(seq 8 11)" gives 3840 KiB or 30720 Kibit)
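To check the arithmetic for the default sequence, each exponent contributes 2^n KiB (the limit is presumably handed to rsync via its --bwlimit option):

$ echo $(( (1<<8) + (1<<9) + (1<<10) + (1<<11) ))
3840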
Infinite rsync loop using flock to enforce synchronization of concurrent backup sets
No help yet
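A minimal sketch of the pattern described above, assuming a lock file under /var/lock and a fixed sleep between passes (the real loop module may differ):

while true; do
    # flock(1) serializes concurrent syncs so their rsync bandwidth limits do not add up
    flock /var/lock/dupli-sync.lock \
        sh -c 'dupli sets list big | dupli-sync run main'
    sleep 3600    # hypothetical sleep duration
done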
dupli sets sum big # summed size of the big backup sets
dupli sets list big | sed -ne 2p | xargs -i dupli files list set={} weeks=3 # "ls -lt" of the second biggest set's files older than the full made 3 weeks ago
dupli sets list small | xargs -i dupli files raw set={} weeks=3 | xargs rm # remove, in every small set, the files older than the full made 3 weeks ago
dupli sets list medium | xargs -i dupli files size set={} weeks=2 # reclaimable size per medium set
dupli sets list medium | xargs -i dupli files newer size set={} weeks=2 # size of the files that would be kept
dupli sets list big | dupli-sync dry run main # dry run first
dupli sets list big | dupli-sync time run main # wrapped with time
- Mainly to avoid bypassing the bandwidth limit when several syncs run concurrently
- Uses the default bandwidth limit and sleep duration between syncs
for i in small medium big; do dupli loop $i& done # one background loop per size segment