Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpc_fileOpen: can't open file #124

Closed
Shadow4711 opened this issue Jun 21, 2017 · 21 comments
Closed

bpc_fileOpen: can't open file #124

Shadow4711 opened this issue Jun 21, 2017 · 21 comments

Comments

@Shadow4711
Copy link

Hello,

since three days I get a lot of these errors (only on one client):

G bpc_fileOpen: can't open file /var/lib/backuppc/pc/monitor/41/f%2f/fusr/fshare/fterminfo/fd/fd210 (from usr/share/terminfo/d/d210, 5, 0, 0)
rsync_bpc: failed to open "/usr/share/terminfo/d/d210", continuing: No such file or directory (2)

Any ideas?

These versions are installed:
BackupPC 4.1.3
BackupPC-XS 0.56
rsync-bpc 3.0.9.8

Any advises?

Regards
Shadow4711

@rfrush
Copy link

rfrush commented Sep 5, 2017

I, too get a similar error on only one of my ~100 rsync clients. Also with BackupPC 4.1.3

 G bpc_fileOpen: can't open file /mnt/backups/BackupPC/pc/isasp160/100/f%2f/fbin/fed (from 
 bin/ed, 3, 0, 0)
 rsync_bpc: failed to open "/bin/ed", continuing: No such file or directory (2)

This error repeats for each backup, referencing the current backup path. (example above is from backup '100'.

@jaccol
Copy link

jaccol commented Nov 5, 2017

I get similar errors. In my case I found that all of the files BackupPC complains about are hardlinks.
I also have a gut feeling, that it started after doing updates while doing backups. I guess it could not read the file once (it was not there because of the update) and since then it "remembered" that this file is not readable.

I think I saw a couple of hardlink related fixes in rsync 3.1.2. Maybe we should make a release of rsync-bpc 3.1.2 so the packagers can make updates?

@craigbarratt
Copy link
Contributor

It looks like several people have reported this, and I'd like to get to the bottom of it.

Ray and Shadow, are your files hardlinks too?

@Shadow4711
Copy link
Author

At least in my case the files are hardlinks. What's really interesting that backuping these files worked for months and for some reason it started complaining about these files.

@rfrush
Copy link

rfrush commented Nov 6, 2017

On all of our Oracle Linux 6 boxes, which is pretty much just RHEL 6, /bin/ed is a hard link:

# ls -l /bin/ed
-rwxr-xr-x. 2 root root 53400 Jul  4  2010 /bin/ed

I only have one out of ~100 systems that actually trigger this

```

$ for j in $(for i in $(ls -1); do ls -tr $i/XferLOG.* | tail -1; done); do /usr/local/BackupPC/bin/BackupPC_zcat $j | grep bpc_fileOpen; done

G bpc_fileOpen: can't open file /mnt/backups/BackupPC/pc/isasp160/171/f%2f/fbin/fed (from bin/ed, 3, 0, 0)


@craigbarratt
Copy link
Contributor

It's likely that the inode is messed up somehow.

Can you tell me the output from BackupPC_attribPrint:

BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp160/171/f%2f/fbin/attrib

You should see an entry for 'ed', something like this:

  'ed' => {
    'compress' => 3,
    'digest' => '',
    'gid' => 20,
    'inode' => 58021,
    'mode' => 436,
    'mtime' => 1278369395,
    'name' => 'ed',
    'nlinks' => 2,
    'size' => 12143,
    'type' => 0,
    'uid' => 501
  },

What output do you get for 'ed'?

Next, grab the inode entry for ed (58021 in the example above) and run this command, with NNNN replaced by the inode number:

 BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp160/171/inode/NNNN

What output do you get?

@jaccol
Copy link

jaccol commented Nov 8, 2017

In my case I miss a lot of files within /usr/share/zoneinfo/
Good news is that that is good testable over different systems.
The system where I miss the files, also does not mention them in the attribPrint
on a random other system they are there.

@jaccol
Copy link

jaccol commented Nov 8, 2017

What is strange though, is that the "browse backup" webUI does show the right file, with the right timestamp and size. But when downloading I get an empty file.
If I browse to a previous backup (before the install of this version of tzdata) I do see the same file, with another timestamp (as expected) which when downloaded results in the right file (md5 matches)

I'm not sure if this is a clue, but in this case it is a file that got reinstalled, got a new timestamp, but did not change content wise.

I'll try to find the hardlink of this file and see how that gets backup-ed or not.

@jaccol
Copy link

jaccol commented Nov 8, 2017

looking further into this:
/usr/share/zoneinfo/Cuba is a hardlink of /usr/share/zoneinfo/America/Havana (who would have guessed)

The attribPrint thing works for me differently on incremental backups than on full backups. When I look at the last full backup I get:
'Cuba' => {
'compress' => 3,
'digest' => '',
'gid' => 0,
'inode' => 204798,
'mode' => 164260,
'mtime' => 1508934825,
'name' => 'Cuba',
'nlinks' => 0,
'size' => 2411,
'type' => 0,
'uid' => 0
},

'Havana' => {
'compress' => 3,
'digest' => '1877fc45317e144ea206ddc622cebba2',
'gid' => 0,
'inode' => 204798,
'mode' => 164260,
'mtime' => 1508934825,
'name' => 'Havana',
'nlinks' => 0,
'size' => 2411,
'type' => 0,
'uid' => 0
},

And this for inode 204798:

Attrib digest is 59b2a460cc3b3fccbe502b4902a8e8cd
$VAR1 = {
'compress' => 3,
'digest' => '9e75aa0afeff12c06d8102a0a60b2cad',
'gid' => 0,
'inode' => 204798,
'mode' => 164260,
'mtime' => 1501860490,
'name' => 'fe1f03',
'nlinks' => 2,
'size' => 19706,
'type' => 0,
'uid' => 0
};

The digest for Havana is something I recognise, that is the right md5 of the file. Havana is also not generating any xfer errors and can be downloaded correctly. Cuba does give errors and downloads as an empty file

@rfrush
Copy link

rfrush commented Nov 8, 2017

Craig:

Definitely something up. I get an unexpected result:

$ /usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp160/171/f%2f/fbin/attrib $VAR1 = {};

The same command against another host:
$ /usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp170/171/f%2f/fbin/attrib Attrib digest is ae34b5abdff81ad35389c2b4981c91af $VAR1 = { 'alsaunmute' => { 'compress' => 3, 'digest' => '915f47a4ce27382b57507a07994a724a', 'gid' => 0, 'inode' => 5, 'mode' => 493, 'mtime' => 1422291852, 'name' => 'alsaunmute', 'nlinks' => 0, 'size' => 123, 'type' => 0, 'uid' => 0 }, 'arch' => { 'compress' => 3, 'digest' => '48268c725b4a9090f3814197ea40a614', 'gid' => 0, ...

@Shadow4711
Copy link
Author

Shadow4711 commented Nov 8, 2017

It seems that I have different cases:

  • Some of the complained files are not in the attrib output
  • Other cases are like this:

G bpc_fileOpen: can't open file /var/lib/backuppc/pc/internet/169/f%2f/fusr/fshare/fterminfo/fd/fdiablo-lm (from usr/share/terminfo/d/diablo-lm, 5, 0, 2)

[backuppc@server ~]$ BackupPC_attribPrint /var/lib/backuppc/pc/internet/169/f%2f/fusr/fshare/fterminfo/fd/attrib
'diablo-lm' => {
'compress' => 5,
'digest' => '',
'gid' => 0,
'inode' => 175014,
'mode' => 420,
'mtime' => 1507619605,
'name' => 'diablo-lm',
'nlinks' => 2,
'size' => 425,
'type' => 0,
'uid' => 0
},
[backuppc@server ~]$ BackupPC_attribPrint /var/lib/backuppc/pc/internet/169/inode/175014
Reading inode attrib file /var/lib/backuppc/pc/internet/169/inode/01/attrib2a; inodeName = a6ab02
Attrib digest is 4ff82ed625d232f13ca6a82d02a2fbe5
$VAR1 = {
'compress' => 5,
'digest' => 'bb0c3a8f81851fbd37cda64af9f57b93',
'gid' => 0,
'inode' => 175014,
'mode' => 420,
'mtime' => 1507619605,
'name' => 'a6ab02',
'nlinks' => 3,
'size' => 425,
'type' => 0,
'uid' => 0
};

[root@internet ~]# ls -i /usr/share/terminfo/d/diablo-lm
560561 /usr/share/terminfo/d/diablo-lm
[root@internet ~]# find / -inum 560561
/usr/share/terminfo/x/xerox-lm
/usr/share/terminfo/d/diablo-lm
/usr/share/terminfo/d/diablo1640-lm

But only diablo-lm is in the XFer errors.

@jaccol
Copy link

jaccol commented Nov 8, 2017

I only now realise that the errors I see are about the previous backup:

G bpc_fileOpen: can't open file /var/lib/BackupPC//pc/odroid-rs7/764/f%2f/fusr/fshare/fzoneinfo/fCuba (from usr/share/zoneinfo/Cuba, 3, 0, 0)
rsync_bpc: failed to open "/usr/share/zoneinfo/Cuba", continuing: No such file or directory (2)

is from the errorlog of backup # 765

@craigbarratt
Copy link
Contributor

Thanks for the various updates. In all cases, the bpc_fileOpen error shows that the digest length is 0 (which is obviously wrong). That means there are two issues - it shouldn't get into that state in the first place, and then it should recover in the next full backup.

I'm responding to all three cases below.

@Shadow4711: in your case the attrib entries for diablo-lm look correct. The entry in backup 169 has nlinks > 0, which means the inode has the correct info. For some reason it's not looking up and using the inode information. What is the entry for diablo1640-lm in 169 (which should also reference the same inode)? What happens when you run these commands:

BackupPC_zcat /var/lib/backuppc/pc/internet/169/f%2f/fusr/fshare/fterminfo/fd/fdiablo-lm
BackupPC_zcat /var/lib/backuppc/pc/internet/169/f%2f/fusr/fshare/fterminfo/fd/fdiablo1640-lm
BackupPC_zcat bb0c3a8f81851fbd37cda64af9f57b93

Any of these that succeed will print the (binary) file output to stdout, so you might want to pipe into md5sum rather than splatter binary output into your terminal (although it would be good to check for any errors that might accidentally go to stdout).

@jaccol: the Cuba attribute is wrong: it has an empty digest, and since nlinks == 0, the inode won't be checked, so that's why bpc_fileOpen gives an error. Also, are your backups 764 and 765 incrementals or fulls?

@rfrush: is backup 171 on isasp160 an incremental? You'll need to run BackupPC_attribPrint on a full.

@jaccol
Copy link

jaccol commented Nov 13, 2017

765 is a full
764 was a partial, I don't have the log anymore
763 is an incremental.

I was a bit surprised with the numbers of the backup mentioned in the errors.
in XferLOG.763 it references the same backup in the 'can't open file' error
in XferLOG.765 the previous backup was referenced

a previous full (#754) has the same error and here the error references the same backup.
Apparently It only references the previous number if the previous was a partial backup

@Shadow4711
Copy link
Author

All three commands give the same output and hash, no errors.

@rfrush
Copy link

rfrush commented Nov 20, 2017

Craig-
(Sorry this took so long)

From our 'problem' system:

/usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp160/185/f%2f/fbin/attrib  | grep -A 12 \'ed\'
  'ed' => {
    'compress' => 3,
    'digest' => '',
    'gid' => 0,
    'inode' => 243505,
    'mode' => 493,
    'mtime' => 1278243683,
    'name' => 'ed',
    'nlinks' => 0,
    'size' => 53400,
    'type' => 0,
    'uid' => 0
  },
$ /usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp160/185/inode/243505
Reading inode attrib file /mnt/backups/BackupPC/pc/isasp160/185/inode/01/attrib6d; inodeName = 31b703
$VAR1 = undef;

From a similar system without this issue:

$ /usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp170/186/f%2f/fbin/attrib  | grep -A 12 \'ed\'
  'ed' => {
    'compress' => 3,
    'digest' => '',
    'gid' => 0,
    'inode' => 65,
    'mode' => 493,
    'mtime' => 1278243683,
    'name' => 'ed',
    'nlinks' => 2,
    'size' => 53400,
    'type' => 0,
    'uid' => 0
  },

$ /usr/local/BackupPC/bin/BackupPC_attribPrint /mnt/backups/BackupPC/pc/isasp170/186/inode/65
Reading inode attrib file /mnt/backups/BackupPC/pc/isasp170/186/inode/00/attrib00; inodeName = 41
Attrib digest is 8af1a1da3850a4ba828d3dd925ab31b1
$VAR1 = {
'compress' => 3,
'digest' => 'c85bf68458a76aeb7d67f28eff53c241',
'gid' => 0,
'inode' => 65,
'mode' => 493,
'mtime' => 1278243683,
'name' => '41',
'nlinks' => 2,
'size' => 53400,
'type' => 0,
'uid' => 0
};

@craigbarratt
Copy link
Contributor

The same issue was reported in issue #123, and there is a lot of detailed information on that issue. I've debugged it and the fixes have been pushed to git. Specifically:

  • commit ef4a7d4 fixes BackupPC_dump so it gets the updated inodeLast on multiple shares
  • commit 9debb7b updates BackupPC_refCountUpdate so it finds the highest inode during an fsck and updates inodeLast in the backups file if necessary. Requires BackupPC-XS version 0.57 (not released, but head of backuppc-xs git is update to date).
  • rsync-bpc 3.0.9 branch has fixes that check the inode nlink count and updates if it's not correct.

It would be great if you could test these fixes. Thanks for your help!

@SebastianS90
Copy link

When testing the fixes, you might need to run more than one backup to have one without error messages. In my case, the first two full backups after the update still showed No such file or directory errors. The third one went through without errors.

@craigbarratt
Copy link
Contributor

I've released BackupPC 4.1.5, BackupPC::XS 0.57 and rsync-bpc 3.0.9.9 with these fixes.

@rfrush
Copy link

rfrush commented Dec 6, 2017

I know this is closed, but wanted to confirm that the fix resolved the observed issue in our environment. Like SebastianS90, I observed that it took three backups (one full, two incrementals) to make the error go away.

Thanks!

@craigbarratt
Copy link
Contributor

@rfrush - thanks for confirming. That's good to know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants