Support for hard links by using backing IDs #68

Closed
khanhtranngoccva wants to merge 17 commits into Alogani:devel from khanhtranngoccva:hard-link-support

Contributor

@khanhtranngoccva khanhtranngoccva commented Jan 13, 2026

This pull request adds a new inode resolver and mapper (InodeMultiMapper). Instead of resolving to a single PathBuf, the resolver now deals with HybridId structures. A HybridId is a hashable wrapper over a raw inode that can also search the mapper for every known path to the same inode, up to a given limit.

The user can define and supply the ID of the backing (underlying) file, usually a combination of the device number dev_t, the filesystem ID fsid_t and the inode number ino_t. When a lookup yields a backing ID that is already known, the inode associated with that ID is reused instead of a new inode being allocated.

For example, if /path/to/A and /path/to/B share the same underlying inode on the same FS, and therefore the same backing ID such as {dev: 15u64, fsid: 100u64, ino: 243}, then looking up either path yields the same inode. During resolution, both map to the same HybridId, and both paths can be retrieved with HybridId::paths when the limit is set to at least 2.
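To make the reuse rule concrete, here is a minimal sketch of an allocator keyed by backing ID. It is not the code from this PR: `BackingId`, `InodeTable` and `lookup` are hypothetical names chosen for illustration, and the field layout only mirrors the example above.

```rust
use std::collections::HashMap;

/// Hypothetical backing ID; the fields follow the example in the description.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct BackingId {
    pub dev: u64,
    pub fsid: u64,
    pub ino: u64,
}

/// Toy allocator (not the crate's mapper): at most one inode per backing ID.
pub struct InodeTable {
    next_inode: u64,
    by_backing: HashMap<BackingId, u64>,
}

impl InodeTable {
    pub fn new() -> Self {
        // Inode 1 is conventionally the FUSE root, so allocation starts at 2.
        Self { next_inode: 2, by_backing: HashMap::new() }
    }

    /// Returns the inode already bound to `id`, or allocates a fresh one.
    pub fn lookup(&mut self, id: BackingId) -> u64 {
        if let Some(&inode) = self.by_backing.get(&id) {
            return inode;
        }
        let inode = self.next_inode;
        self.next_inode += 1;
        self.by_backing.insert(id, inode);
        inode
    }
}
```

In this sketch, looking up /path/to/A and /path/to/B with the same `BackingId` returns the same inode number, while a different backing ID gets a new one.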

The mapper's resolve and resolve_all methods check whether an inode has already been visited and never visit the same inode twice, which prevents infinite loops.
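The visited check can be sketched as follows. This is an assumption about the general shape of such a walk, not the PR's implementation: `Links`, `ROOT` and `resolve_first` are hypothetical, and the link table is modeled as a plain map from an inode to its (parent inode, name) entries.

```rust
use std::collections::{HashMap, HashSet};
use std::path::PathBuf;

/// Hypothetical root inode number.
pub const ROOT: u64 = 1;

/// Hypothetical layout: inode -> list of (parent inode, entry name) links.
pub type Links = HashMap<u64, Vec<(u64, String)>>;

/// Walks parent links up to the root, following the first link of each inode.
/// Returns None for an orphaned inode or when a cycle is detected.
pub fn resolve_first(links: &Links, inode: u64) -> Option<PathBuf> {
    let mut visited = HashSet::new();
    let mut components = Vec::new();
    let mut current = inode;
    while current != ROOT {
        // `insert` returns false if the inode was already seen: stop instead of looping.
        if !visited.insert(current) {
            return None;
        }
        // An inode without any link is orphaned and has no path.
        let (parent, name) = links.get(&current)?.first()?;
        components.push(name.clone());
        current = *parent;
    }
    let mut path = PathBuf::from("/");
    for name in components.iter().rev() {
        path.push(name);
    }
    Some(path)
}
```

Returning `None` rather than panicking also covers the orphaned-inode case, at the cost of pushing that decision to the caller.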

The caveat of this resolution scheme is that PathBufs are no longer directly comparable, and the result of first_path may vary between requests to the same inode. An inode may also become orphaned and no longer resolve to any path, which happens when rename operations place other inodes at all of its locations. Additionally, setting the limit too high in HybridId::paths can use a lot of memory and noticeably slow down filesystems where hard links are used extensively on multiple levels, since the algorithm resembles a Cartesian product.
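The cost of a large limit is easier to see in a small sketch. The code below is an illustration under assumptions, not the PR's algorithm: `Links`, `ROOT`, `paths` and `collect` are hypothetical, and the traversal simply enumerates every combination of links from the inode up to the root until the limit is reached.

```rust
use std::collections::{HashMap, HashSet};
use std::path::PathBuf;

/// Hypothetical root inode number.
pub const ROOT: u64 = 1;

/// Hypothetical layout: inode -> list of (parent inode, entry name) links.
pub type Links = HashMap<u64, Vec<(u64, String)>>;

/// Collects up to `limit` paths that lead to `inode`.
pub fn paths(links: &Links, inode: u64, limit: usize) -> Vec<PathBuf> {
    let mut out = Vec::new();
    collect(links, inode, limit, &mut HashSet::new(), &mut Vec::new(), &mut out);
    out
}

fn collect(
    links: &Links,
    inode: u64,
    limit: usize,
    on_stack: &mut HashSet<u64>,
    suffix: &mut Vec<String>,
    out: &mut Vec<PathBuf>,
) {
    if out.len() >= limit {
        return;
    }
    if inode == ROOT {
        // `suffix` holds the names from leaf to root, so reverse it.
        let mut path = PathBuf::from("/");
        for name in suffix.iter().rev() {
            path.push(name);
        }
        out.push(path);
        return;
    }
    // Skip inodes already on the current walk to avoid cycles.
    if !on_stack.insert(inode) {
        return;
    }
    if let Some(parents) = links.get(&inode) {
        for (parent, name) in parents {
            if out.len() >= limit {
                break;
            }
            suffix.push(name.clone());
            collect(links, *parent, limit, on_stack, suffix, out);
            suffix.pop();
        }
    }
    on_stack.remove(&inode);
}
```

With two aliases at the leaf and two aliases one level up, the sketch already produces 2 x 2 = 4 paths, which is why the limit acts as the only bound on memory use.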

This PR should not be breaking, since the resolver structures are private and the InodeMapper structure is retained. The old resolver structures have also had their internal mapper migrated to the new MultiInodeMapper.

@khanhtranngoccva khanhtranngoccva marked this pull request as draft January 13, 2026 09:35
Contributor Author

khanhtranngoccva commented Jan 13, 2026

I have marked this as draft for now because inodes can now be orphaned (no links to a parent, which is correct behavior), and attempting to resolve them for a forget() call leads to a panic. This also prevents the old resolvers, which always expect a path even if the inode occupying that path was orphaned by a rename, from migrating to the new MultiInodeMapper (the name might be a misnomer; let me know if it should be renamed).

@khanhtranngoccva khanhtranngoccva marked this pull request as ready for review January 13, 2026 11:34
Contributor Author

khanhtranngoccva commented Jan 14, 2026

Additional tested behavior: if the user specifies an unrecognized backing ID for a location that already has a (guaranteed to be different) backing ID, a new inode is created. A debug-only assertion guards against these two backing IDs being the same.
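A rough sketch of that behavior, under assumptions: `Mapper`, `bind` and the tuple-based `BackingId` are hypothetical stand-ins, not the types from this PR. The point is only the control flow: a known backing ID reuses its inode, an unknown one allocates a fresh inode and takes over the location, and the assertion runs in debug builds only.

```rust
use std::collections::HashMap;

/// Hypothetical backing ID: (dev, fsid, ino).
pub type BackingId = (u64, u64, u64);

/// Toy mapper (not the crate's type) tracking locations and backing IDs.
pub struct Mapper {
    next_inode: u64,
    by_backing: HashMap<BackingId, u64>,
    backing_of: HashMap<u64, BackingId>,
    by_location: HashMap<(u64, String), u64>,
}

impl Mapper {
    pub fn new() -> Self {
        Self {
            next_inode: 2, // inode 1 is reserved for the root
            by_backing: HashMap::new(),
            backing_of: HashMap::new(),
            by_location: HashMap::new(),
        }
    }

    /// Binds `name` under `parent` to the inode that owns `backing`.
    pub fn bind(&mut self, parent: u64, name: &str, backing: BackingId) -> u64 {
        let key = (parent, name.to_owned());
        let inode = match self.by_backing.get(&backing) {
            // Known backing ID: reuse its inode (hard link semantics).
            Some(&known) => known,
            None => {
                if let Some(old) = self.by_location.get(&key) {
                    // The ID is unknown, so it cannot match the current occupant's ID.
                    debug_assert_ne!(self.backing_of.get(old), Some(&backing));
                }
                let fresh = self.next_inode;
                self.next_inode += 1;
                self.by_backing.insert(backing, fresh);
                self.backing_of.insert(fresh, backing);
                fresh
            }
        };
        // The previous occupant (if any) loses this location and may become orphaned.
        self.by_location.insert(key, inode);
        inode
    }
}
```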

Owner

Alogani commented Jan 27, 2026

Hi khanhtranngoccva,

Is this PR related to the support for the passthrough feature that you want to use?

Because if not, I don't see a clear use case for HybridResolver, and I'm not convinced of the value proposition:

  • FuseHandler<PathBuf>: Simple, traditional filesystem semantics
  • FuseHandler<Inode>: Lightweight, maximum control
  • FuseHandler<HybridId<BackingId>>: Adds significant complexity for what seems to be a niche use case (hard link support). What real-world scenarios require this over the existing options?

Could you clarify the use case or provide examples (even in pseudocode) of where this would be beneficial?

Contributor Author

khanhtranngoccva commented Jan 28, 2026

Including this PR in the other features was a mistake: I forgot to run git checkout devel first before doing anything else, such as creating a new branch.

As for the use case, this is to ensure POSIX compliance on mirror filesystems: after hard linking path A to path B, path B must use the inode of path A. This does not happen with FuseHandler<PathBuf>, where a new inode is always created after a link call.

The HybridId aims to solve this while still allowing the user to take advantage of the simplicity of PathBuf if they so choose (by calling HybridId::first_path or HybridId::paths). The code for the filesystem I am developing extensively uses PathBuf to compute the opening policies for a file.

type FilesystemId = HybridId<BackingFileId>;

impl LinkFS {
    fn open(&self, id: FilesystemId) -> Result<OwnedFileHandle, PosixError> {
        // Since an inode may have more than one path alias, e.g. `/path/to/A` and `/path/to/B`,
        // if the user tries to open `/path/to/A`, the FUSE driver might see the open request as
        // `/path/to/B`, thus incorrectly altering the policy interpretation result in the
        // `is_protected_view_allowed` call if we did not exhaustively scan for all path aliases.
        let candidate_paths = id.paths(100);
        if candidate_paths.is_empty() {
            return Err(PosixError::from_raw_os_error(libc::EINVAL));
        }
        // Borrow the list so that it can still be used below.
        let protected_view_allowed = self.is_protected_view_allowed(&candidate_paths)?;
        let protected_view_args = protected_view_allowed.then_some(self.get_protected_view_args());
        // The list is non-empty (checked above), so indexing cannot panic.
        let backing_path = &candidate_paths[0];
        let handle = Handle::open(backing_path, protected_view_args)?;
        todo!();
    }

    // POSIX compliance requires the newly generated entry to have the same inode number
    // as the inode to be hard linked. This allows use cases in mirror filesystems like file
    // locking on two paths that resolve to the same inode on the underlying file system
    // (e.g. /path/to/A -> /source/path/to/A, /path/to/B -> /source/path/to/B,
    // both pointing to inode N on the underlying ext4 filesystem).

    // This method should also be resilient to a race condition scenario where the underlying
    // backing path in the mount source directory is suddenly overridden with a new inode
    // (/source/path/to/file: N -> N') without the FUSE driver's knowledge - the returned
    // BackingFileId in that case should be N'.
    fn link(&self, id: FilesystemId, target_dir_id: FilesystemId, name: &OsStr) -> Result<(Option<BackingFileId>, FileAttribute), PosixError> {
        let old_handle = self.internal_open(id)?;
        let target_dir_handle = self.internal_open(target_dir_id)?;
        // Link against the opened directory handle; `target_dir_id` was consumed by `internal_open`.
        self.internal_link_at(&old_handle, &target_dir_handle, name)?;
        // Returning the old handle's backing ID allows the new link to use the old handle's inode value
        Ok((Some(old_handle.backing_file_id()), old_handle.attributes()?))
    }
}
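The POSIX behavior a mirror filesystem has to reproduce can be observed directly on the underlying filesystem. The sketch below uses only the Rust standard library on Unix; `backing_id` and `hard_link_shares_inode` are hypothetical helper names, and the (device, inode) pair is just one possible choice of backing ID.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt;
use std::path::Path;

/// Hypothetical minimal backing ID: the (device, inode) pair of a path.
pub fn backing_id(path: &Path) -> io::Result<(u64, u64)> {
    // `symlink_metadata` does not follow symlinks, so the entry itself is inspected.
    let meta = fs::symlink_metadata(path)?;
    Ok((meta.dev(), meta.ino()))
}

/// Creates A, hard links it to B, and reports whether both share one backing ID.
pub fn hard_link_shares_inode() -> io::Result<bool> {
    let dir = std::env::temp_dir().join(format!("hardlink-demo-{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let a = dir.join("A");
    let b = dir.join("B");
    // Remove leftovers from a previous run, ignoring "not found" errors.
    let _ = fs::remove_file(&a);
    let _ = fs::remove_file(&b);
    fs::write(&a, b"data")?;
    fs::hard_link(&a, &b)?;
    let same = backing_id(&a)? == backing_id(&b)?;
    fs::remove_dir_all(&dir)?;
    Ok(same)
}
```

On a filesystem that supports hard links, both paths report the same pair, which is the property the returned BackingFileId in `link` is meant to preserve through the FUSE layer.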

Owner

Alogani commented Jan 29, 2026

OK, I will take time to review it before merging.

I still don't fully understand the rationale, but you have a legitimate use case and it doesn't change the current implementation.
I would also like to look more into HybridId (why it is generic and whether another name would make more sense).

@Alogani Alogani closed this Jan 31, 2026
Alogani added a commit that referenced this pull request Feb 1, 2026
HardLink support by the introduction of the new type `HybridId<BackingId>` and a specialized Inode to Path mapper called InodeMultiMapper