
Error "file inode changed" on ZFS snapshot #6652

Open
benurb opened this issue Apr 26, 2022 · 17 comments

Comments

@benurb

benurb commented Apr 26, 2022

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes. #6650 is about the same problem, but I'm not backing up a network filesystem.

Is this a BUG / ISSUE report or a QUESTION?

QUESTION (originally filed as a BUG)

System information. For client/server mode post info for both machines.

Your borg version (borg -V).

borg 1.2.0

Operating system (distribution) and version.

Ubuntu 20.04.4 LTS

Hardware / network configuration, and filesystems used.

Core i3 7320
2x 16GB Kingston DDR4-2400 ECC RAM
Filesystem for backup source: ZFS (snapshot)
Filesystem for backup target: ext4
Backup is not sent over network

How much data is handled by borg?

Around 2 TB per archive, around 152 TB in repository (2.5 TB deduplicated)

Full borg commandline that lead to the problem (leave away excludes and passwords)

BORG_RELOCATED_REPO_ACCESS_IS_OK=yes borgbackup create --compression lz4 --verbose --stats ${PATH_BACKUP}::$(/bin/date '+%Y-%m-%d_%H-%M') \
        /root/ \
        /storage/iobroker/.zfs/snapshot/borg/
        # ... + a few more ZFS snapshot locations

Describe the problem you're observing.

UPDATE: As it turns out, this is probably intentional behaviour by ZFS. While investigating this I found out that when accessing snapshots via the hidden .zfs directory, a temporary mount is created. That leads to the error described below, because the inode changes between the unmounted and the mounted snapshot.

My fix is to not rely on the automount feature, but instead mount the snapshots manually. I changed my backup script to mount all snapshots I want to back up into /mnt/zfs/<snapshot name>, and the problem is fixed.
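For illustration, a minimal sketch of that approach (not the actual script), assuming the datasets are named storage/iobroker and storage/music, the snapshot is called borg, and PATH_BACKUP is set; adjust names to your own pool layout:

    #!/bin/bash
    # Mount each ZFS snapshot at a stable path, back it up, then unmount.
    # Dataset names and PATH_BACKUP are placeholders.
    SNAP=borg
    DATASETS="storage/iobroker storage/music"

    for ds in $DATASETS; do
        mountpoint="/mnt/zfs/${ds##*/}"              # e.g. /mnt/zfs/iobroker
        mkdir -p "$mountpoint"
        mount -t zfs "${ds}@${SNAP}" "$mountpoint"   # snapshots mount read-only
    done

    borgbackup create --compression lz4 --verbose --stats \
        "${PATH_BACKUP}::$(/bin/date '+%Y-%m-%d_%H-%M')" \
        /mnt/zfs/iobroker /mnt/zfs/music

    for ds in $DATASETS; do
        umount "/mnt/zfs/${ds##*/}"
    done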

Maybe this info helps someone; that's why I didn't delete the issue but changed it to a question.

Original Text:
As far as I can see, starting with borgbackup 1.2.0 the backup keeps failing with "file inode changed" errors:

Creating archive at "/mnt/backup_borg/files::2022-04-25_10-12"
/storage/backup/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/iobroker/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/minecraft/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/music/.zfs/snapshot/borg: file inode changed (race condition), skipping file
/storage/scans/.zfs/snapshot/borg: file inode changed (race condition), skipping file
------------------------------------------------------------------------------
Repository: /mnt/backup_borg/files
Archive name: 2022-04-25_10-12
Archive fingerprint: 9e3e209c74de5a120eeccb34d301da677f2d6eb3c1fb95c5c930910fe8ddb7b5
Time (start): Mon, 2022-04-25 10:12:57
Time (end):   Mon, 2022-04-25 10:15:47
Duration: 2 minutes 49.51 seconds
Number of files: 1016125
Utilization of max. archive size: 0%

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

Yes. It doesn't happen for all snapshots on every invocation, as ZFS seems to recycle inodes. But on each invocation it happens for 2 to 6 of the 10 snapshots I'm backing up.

Include any warning/errors/backtraces from the system logs

@ThomasWaldmann
Member

Thanks for the interesting report. Yeah, guess just mounting the stuff beforehand is the best way to fix this.

Would be bad if we had to add an additional stat for each directory just to work around this. And then we might even have more timing issues if the mount takes some time and the next stat still would not yield the final/stable values.

@ThomasWaldmann
Member

ThomasWaldmann commented Apr 27, 2022

Considering this likely happens with any "auto mounter", guess we should add this to docs / FAQ.

cprussin added a commit to cprussin/dotfiles that referenced this issue Aug 14, 2022
Previously, we would use .zfs/snapshot, but it seems there's some kind of error
caused by how zfs automounts those paths, see
borgbackup/borg#6652.  So this PR moves away from
backing up the paths in .zfs/snapshot, and instead we now explicitly mount the
snapshots.  As an added benefit, this gives me a way to do some cool things:

- I can now tag zfs volumes with net.prussin:backup and they will be
automatically backed up, rather than having to update my nixos config to add
volumes to the backup

- I can now structure the borg backup to match my zfs volume layout instead of
matching the structure of where the zfs volumes are mounted, which makes more
sense to me and makes consuming the backups simpler
@mwalliczek

I got the same error when migrating from borgbackup 1.1.7 to 1.2.1 with a CIFS mount via autofs. After changing back to version 1.1.7 the error does not occur any more.

@ThomasWaldmann
Member

@mwalliczek that's because 1.1.x works a lot based on filenames and doesn't check much (so it's quite open to all sorts of race conditions, but also tolerant to autofs / automounter i guess).

1.2.x works based on fd (file descriptors) and makes pretty sure that it only opens what it intended to open and not something different suddenly appearing at the same place. good to avoid race conditions, bad for automounter.
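For illustration, the inode change that this fd-based check trips over is easy to observe from the shell (the dataset path is a placeholder; the snapshot is assumed to be named borg):

    # Stat the snapshot directory before the automount kicks in ...
    stat -c 'inode=%i mounted-on=%m' /storage/music/.zfs/snapshot/borg
    # ... listing its contents triggers ZFS's temporary automount ...
    ls /storage/music/.zfs/snapshot/borg > /dev/null
    # ... and the same path now reports a different inode and its own mount point,
    # which is exactly the change borg 1.2.x detects and refuses to follow.
    stat -c 'inode=%i mounted-on=%m' /storage/music/.zfs/snapshot/borg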

can you just mount before running borg?

@iansmirlis

Is it too much to ask to be able to disable this behavior, i.e. use filenames instead of inodes, with a filesystem option during borg create?

@ThomasWaldmann
Member

@iansmirlis the problem is the magic behaviour of the mountpoint and the solution was already found, see above posts.

@iansmirlis

iansmirlis commented Oct 24, 2022

@ThomasWaldmann, sure, thanks for the clarification. See, my issue is that automount is there for a reason, and it behaves like this, magic or not, for a reason too.

borg also has good reasons to work on fds; however, in this case I have to take care of mounting and unmounting snapshots manually, without any actual gain, i.e. I do not see a way to have a race condition on a read-only zfs snapshot.

Imho, it would be more convenient for me to have the option to disable this behavior, instead of having to manage the mounts myself.

Having said that, I will not insist. You are far more experienced and better placed to judge whether this is clean behavior.

@logitab

logitab commented Jan 7, 2023

I propose a command line switch to select the behavior of stat_update_check(st_old, st_curr). Automounting is a common technique; there shouldn't be a need for a workaround just to back up a filesystem that uses it. I got the impression that this is all meant to increase security, but there are environments where security is not the first concern.

Can't the whole issue be solved by adding an option like --nofdcheck and placing an if statement within stat_update_check(), or are there deeper implications?

@fbicknel

My fix is to not rely on the automount feature, but instead mount the snapshots manually. I changed my backup script to mount all snapshots I want to back up into /mnt/zfs/<snapshot name>, and the problem is fixed.

I'm so glad you posted this. Until I found this, I had no clue as to what was causing this.

I thought about it a little bit, and instead of going to the trouble of mounting to some other location, I tried the following -- and it worked:

            info "($TARGET) Mounting snapshot"
            cd ${TARGET} && cd -

All it took was a chdir to the target location (e.g., /.zfs/snapshot/today/var) and zfs mounted it for me.

laziness. that's the stuff.

@ThomasWaldmann
Member

@fbicknel thanks for adding that here.

maybe even a ls -d ${TARGET} would work (just one command and not changing the cwd)?

@fbicknel

fbicknel commented Jan 19, 2023 via email

@jdchristensen
Contributor

@fbicknel What about ls ${TARGET}/, with a trailing slash?

@fbicknel

I tried it. It works, too. So I guess take your choice.
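For several snapshots, the trailing-slash trick could be wrapped in a small loop right before the borg call (paths are placeholders; note benurb's comment further down that the automounts can expire again before borg reaches them on longer runs):

    # Touch each snapshot directory once so ZFS automounts it before borg starts.
    for fs in /storage/iobroker /storage/music /storage/scans; do
        ls "${fs}/.zfs/snapshot/borg/" > /dev/null
    done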

@gzagatti

gzagatti commented Jul 12, 2023

I'd like to back up directly from the snapshot location ~/.zfs/snapshot/ because as a normal user I'm not allowed to mount or clone files.

Is it indeed the case that file inodes will be unstable when doing backups from say ~/.zfs/snapshot/my-snap1 and ~/.zfs/snapshot/my-snap2 even when doing a relative backup?

Is the only alternative then to use --files-cache ctime,size?

I'm particularly concerned about the answers to these two FAQs: "I am seeing 'A' (added) status for an unchanged file!?" and "It always chunks all my files, even unchanged ones!".

@ThomasWaldmann
Member

@gzagatti

- ls -i shows the inode, so just check that (see the example below)
- the full absolute path of a file must stay the same, because that is used as the index into the files cache
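For example, with the snapshot names from the question above (the file path is a placeholder):

    # Compare the inode of the same file in two snapshots; if the inode (and ctime/size)
    # match and the absolute path passed to borg create is identical between runs,
    # the files cache treats the file as unchanged, since that path is the cache key.
    ls -i ~/.zfs/snapshot/my-snap1/projects/notes.txt
    ls -i ~/.zfs/snapshot/my-snap2/projects/notes.txt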

@gzagatti

@ThomasWaldmann

Perfect! I checked the inode of a single file in two different snapshots and they are the same. However, it is difficult to tell whether that will always be the case.

In any case, point 2 means that I should back up from the same path, even when using relative backups as I was doing.

@benurb
Author

benurb commented Sep 16, 2023

@fbicknel What about ls ${TARGET}/, with a trailing slash?

Just to add here why I didn't use that approach (or cd into the directory): I'm backing up snapshots of 10 ZFS file systems. When I cd into the snapshots before starting the backup, a few of them are already unmounted again by the time the backup process reaches them. Compared to the hassle of writing a script that constantly touches the mountpoints while borgbackup is running, mounting the snapshots manually seems like the more reasonable solution 😀

8 participants