fuse: Mix inode hashes in a non-symmetric way #4255

Merged
merged 1 commit into from Apr 7, 2023

greatroar
Contributor

What does this PR change? What problem does it solve?

Since 0.15 (#4020), inodes are generated as hashes of names, xor'd with the parent inode. That means that the inode of a/b/b is

h(a/b/b) = h(a) ^ h(b) ^ h(b) = h(a).

I.e., the grandchild has the same inode as the grandparent. GNU find trips over this because it thinks it has encountered a loop in the filesystem, and fails to search a/b/b. This happens more generally when the same name occurs an even number of times.

Fix this by multiplying the parent inode by a large prime before xor'ing in the name hash, so the combining operation is no longer symmetric in its arguments. This is what the FNV hash does, which we used prior to 0.15. The hash is now

h(a/b/b) = h(b) ^ p*(h(b) ^ p*h(a))

Note that we already ensure that h(x) is never zero.

Collisions can still occur, but they should be much less likely to occur within a single path.
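
To make the difference between the two schemes concrete, here is a minimal sketch in Go. It is not the code from this PR: `nameHash`, `xorInode`, `mixInode` and `walk` are hypothetical names, FNV-1a from the standard library stands in for the real name hash, and the multiplier is the 64-bit FNV prime chosen only for illustration (the constant used in the patch is not quoted in this thread).

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// prime is an illustrative multiplier (the 64-bit FNV prime).
// Assumption: the constant in the actual patch may differ.
const prime = 1099511628211

// nameHash stands in for h(x): FNV-1a of the name, forced to be non-zero.
func nameHash(name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(name))
	if v := h.Sum64(); v != 0 {
		return v
	}
	return 1
}

// xorInode is the symmetric scheme: inode = h(name) ^ parent.
func xorInode(parent uint64, name string) uint64 {
	return nameHash(name) ^ parent
}

// mixInode is the non-symmetric scheme: inode = h(name) ^ p*parent.
// The multiplication wraps around modulo 2^64.
func mixInode(parent uint64, name string) uint64 {
	return nameHash(name) ^ prime*parent
}

// walk folds a combining function over the path components,
// starting from a parent inode of 0 so that the inode of "a" is h(a).
func walk(combine func(uint64, string) uint64, names ...string) uint64 {
	var inode uint64
	for _, n := range names {
		inode = combine(inode, n)
	}
	return inode
}

func main() {
	// With plain xor, the two b's cancel and a/b/b collides with a.
	fmt.Println(walk(xorInode, "a") == walk(xorInode, "a", "b", "b")) // true
	// With the prime multiplier, the cancellation no longer happens.
	fmt.Println(walk(mixInode, "a") == walk(mixInode, "a", "b", "b")) // false
}
```

The xor collision holds for any name hash, since `h(b) ^ h(b)` is always zero. The second result is specific to these inputs and this stand-in hash; as noted above, collisions remain possible in general.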

Was the change previously discussed in an issue or on the forum?

Fixes #4253.

Checklist

  • I have read the contribution guidelines.
  • I have enabled maintainer edits.
  • I have added tests for all code changes.
  • I have added documentation for relevant changes (in the manual).
  • There's a new file in changelog/unreleased/ that describes the changes for our users (see template).
  • I have run gofmt on the code in all commits.
  • All commit messages are formatted in the same style as the other commits in the repo.
  • I'm done! This pull request is ready for review.

@greatroar
Contributor Author

The changelog entry needs some reviewing. Also, there is an alternative solution that solves the original issue perfectly, which is to generate the inode number of a directory as its depth:

if node.Type == "dir" {
    inode = 1 + parent
}

but that has the obvious downside that a/b and a/x have the same inode, and similarly a/b/c and a/x/y. I bet find is going to trip over that too, especially when following symlinks to directories.
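
The trade-off of the depth-based alternative can be sketched in a few lines of Go. This is only an illustration of the snippet above, assuming a hypothetical root inode of 0; `depthInode` and `dirInode` are made-up helper names, not part of restic.

```go
package main

import "fmt"

// depthInode mirrors the snippet above: a directory's inode is 1 + parent.
func depthInode(parent uint64) uint64 {
	return 1 + parent
}

// dirInode computes the inode of a directory path by walking down
// from a hypothetical root inode of 0. The names are ignored on
// purpose: only the depth matters in this scheme.
func dirInode(path ...string) uint64 {
	var inode uint64
	for range path {
		inode = depthInode(inode)
	}
	return inode
}

func main() {
	// Ancestors and descendants never collide: a, a/b, a/b/b get 1, 2, 3.
	fmt.Println(dirInode("a"), dirInode("a", "b"), dirInode("a", "b", "b")) // 1 2 3
	// Directories at the same depth always collide.
	fmt.Println(dirInode("a", "b") == dirInode("a", "x")) // true
}
```

Inodes along a single path are strictly increasing, which is why this variant cannot produce the apparent loop, but every directory at a given depth shares one inode.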

@rawtaz
Contributor

rawtaz commented Mar 20, 2023

Even I can understand the changelog 👌

Member

@MichaelEischer MichaelEischer left a comment

LGTM. Thanks for debugging this issue!

@MichaelEischer MichaelEischer merged commit 26a3c47 into restic:master Apr 7, 2023
11 checks passed
Development

Successfully merging this pull request may close these issues.

Filesystem loop in mount when folder has same name as parent
3 participants