How to find which backups reference a particular pool file

Craig Barratt edited this page Nov 27, 2018 · 4 revisions

Sometimes you might want to find out which backups reference a particular pool file. Pool files are referenced in the attrib files (one per backup directory), but there is no direct way to go in the other direction, from a pool file back to the backups that reference it.

Let's work through an example for a 4.x installation. Imagine you are looking for backups that reference a pool file with a digest (file name) of e73d8ba4dbc119e8040dc3103c7fd33f. Let's assume the top-level data directory is set like this:

TOPDIR=/data/BackupPC       # bash, or...
set TOPDIR=/data/BackupPC   # tcsh

If compression is on (i.e., $Conf{CompressLevel} > 0), the pool file lives under cpool (with compression off it would be under pool instead), and you can list it with:

ls -l $TOPDIR/cpool/e6/3c/e73d8ba4dbc119e8040dc3103c7fd33f

Notice that the two subdirectory names are the first two bytes of the digest, each ANDed with 0xfe (i.e., rounded down to an even hex number): e7 becomes e6 and 3d becomes 3c.
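
The masking rule above can be sketched as a small shell function. This is only an illustration: pool_path is a hypothetical helper name, not part of BackupPC, and it assumes the digest is a plain hex string as in the example.

```shell
# Hypothetical helper (not a BackupPC command): print the path of a pool
# file relative to the pool or cpool directory, given its hex digest.
pool_path() {
    digest=$1
    # The first two bytes of the digest are its first four hex characters.
    b1=$(printf '%s' "$digest" | cut -c1-2)
    b2=$(printf '%s' "$digest" | cut -c3-4)
    # AND each byte with 0xfe to clear the low bit (round down to even).
    printf '%02x/%02x/%s\n' "$(( 0x$b1 & 0xfe ))" "$(( 0x$b2 & 0xfe ))" "$digest"
}
```

With that defined, ls -l "$TOPDIR/cpool/$(pool_path e73d8ba4dbc119e8040dc3103c7fd33f)" should list the same file as the command above.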

Next we can see how many times this file is referenced in total by looking in the poolCnt file, which is stored below the first-level directory:

BackupPC_poolCntPrint $TOPDIR/cpool/e6/poolCnt | egrep e73d8ba4dbc119e8040dc3103c7fd33f
    e73d8ba4dbc119e8040dc3103c7fd33f => 1436

That file is referenced 1436 times across all the hosts and all the backups.

Now let's find which hosts reference that file. Each host has 128 reference count files (one for each first-level directory, although empty ones will be missing) stored in the refCnt subdirectory in each host's directory (eg, $TOPDIR/pc/HOST/refCnt). In our example, the first-level directory is e6, so the pool count file for a specific host HOST8 is:

BackupPC_poolCntPrint $TOPDIR/pc/HOST8/refCnt/poolCnt.1.e6 | egrep e73d8ba4dbc119e8040dc3103c7fd33f
    e73d8ba4dbc119e8040dc3103c7fd33f => 796

The "1" suffix to poolCnt is the compression on/off flag (it would be "0" if compression were off), and e6 is the first byte of the digest ANDed with 0xfe. This particular host references that file 796 times.
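
As a rough sketch of that naming rule, the function below builds the per-host count file path from a digest. refcnt_file is a hypothetical name, and the compression level is passed in by hand on the assumption that it matches $Conf{CompressLevel} for the backups in question.

```shell
# Hypothetical helper (not a BackupPC command): print the per-host
# reference count file that would hold the given digest.
# Usage: refcnt_file TOPDIR HOST COMPRESS_LEVEL DIGEST
refcnt_file() {
    topdir=$1 host=$2 compress_level=$3 digest=$4
    # Flag is 1 when compression is on (level > 0), otherwise 0.
    if [ "$compress_level" -gt 0 ]; then flag=1; else flag=0; fi
    # First byte of the digest, ANDed with 0xfe.
    byte=$(printf '%s' "$digest" | cut -c1-2)
    printf '%s/pc/%s/refCnt/poolCnt.%d.%02x\n' \
        "$topdir" "$host" "$flag" "$(( 0x$byte & 0xfe ))"
}
```

For the example digest, refcnt_file "$TOPDIR" HOST8 3 e73d8ba4dbc119e8040dc3103c7fd33f gives the poolCnt.1.e6 path used in the command above.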

Now let's do this for all hosts:

BackupPC_poolCntPrint $TOPDIR/pc/*/refCnt/poolCnt.1.e6 | egrep "Pool Ref|e73d8ba4dbc119e8040dc3103c7fd33f"

That might produce output like this:

Pool Ref Count file /data/BackupPC/pc/HOST1/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST2/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST3/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST4/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 108
Pool Ref Count file /data/BackupPC/pc/HOST5/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 12
Pool Ref Count file /data/BackupPC/pc/HOST6/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST7/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST8/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 796
Pool Ref Count file /data/BackupPC/pc/HOST9/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST10/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 29
Pool Ref Count file /data/BackupPC/pc/HOST11/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 6
Pool Ref Count file /data/BackupPC/pc/HOST12/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 448
Pool Ref Count file /data/BackupPC/pc/HOST13/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 1
Pool Ref Count file /data/BackupPC/pc/HOST14/refCnt/poolCnt.1.e6:

Count files that are listed without a digest line below them don't reference the pool file at all. You can see that HOST4, HOST5, HOST8, HOST10, HOST11, HOST12 and HOST13 reference that file and the others do not.
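
With many hosts that output gets long, so it can help to keep only the count files that actually reference the digest. Here is a minimal sketch using awk. refs_for_digest is a hypothetical name, and it assumes the output format is exactly as shown above (a "Pool Ref Count file PATH:" header followed by "DIGEST => COUNT" lines) and that paths contain no spaces.

```shell
# Hypothetical filter (not a BackupPC command): read BackupPC_poolCntPrint
# output on stdin and print "FILE COUNT" for each count file that
# references the digest, followed by the sum of those counts.
refs_for_digest() {
    awk -v d="$1" '
        # Header line: remember the count file path, minus the trailing colon.
        /^Pool Ref Count file / { file = $5; sub(/:$/, "", file); next }
        # Matching digest line: report it under the current file.
        $1 == d && $2 == "=>"   { print file, $3; total += $3 }
        END                     { print "total", total + 0 }
    '
}
```

It would be used as BackupPC_poolCntPrint $TOPDIR/pc/*/refCnt/poolCnt.1.e6 | refs_for_digest e73d8ba4dbc119e8040dc3103c7fd33f, and the same filter works on per-backup count files.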

Now let's see which individual backups for HOST10 reference that file:

BackupPC_poolCntPrint $TOPDIR/pc/HOST10/*/refCnt/poolCnt.1.e6 | egrep "Pool Ref|e73d8ba4dbc119e8040dc3103c7fd33f"
Pool Ref Count file /data/BackupPC/pc/HOST10/4/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 1
Pool Ref Count file /data/BackupPC/pc/HOST10/6/refCnt/poolCnt.1.e6:
Pool Ref Count file /data/BackupPC/pc/HOST10/7/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 2
Pool Ref Count file /data/BackupPC/pc/HOST10/8/refCnt/poolCnt.1.e6:
    e73d8ba4dbc119e8040dc3103c7fd33f => 26

You can see that backups #4, #7 and #8 reference that file, for a total of 29 times. Backup #6 does not.

Finally, to find exactly where a specific backup references that file, you'll need to search the entire backup tree for the digest. Unfortunately you'll have to do this separately for each share being backed up:

BackupPC_ls -R -h HOST10 -n 7 -s SHARE / | egrep e73d8ba4dbc119e8040dc3103c7fd33f 

This step can be quite slow since it has to traverse the full backup tree.
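
Since this has to be repeated per share, a small loop can save some typing. The sketch below is an assumption-laden convenience wrapper: find_digest_in_backup is a hypothetical name, the share names must be supplied by hand, and it only uses the BackupPC_ls options shown above.

```shell
# Hypothetical wrapper (not a BackupPC command): search each listed share
# of one backup for a digest.
# Usage: find_digest_in_backup HOST BACKUP_NUM DIGEST SHARE...
find_digest_in_backup() {
    host=$1 num=$2 digest=$3
    shift 3
    for share in "$@"; do
        echo "== share: $share"
        # Same command as above, with a fixed-string match on the digest.
        BackupPC_ls -R -h "$host" -n "$num" -s "$share" / | grep -F -- "$digest"
    done
}
```

For example, find_digest_in_backup HOST10 7 e73d8ba4dbc119e8040dc3103c7fd33f SHARE1 SHARE2, where SHARE1 and SHARE2 stand in for the real share names. Each share is still a full traversal, so expect the run time to add up.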