% The FUSE Wire Protocol
This document tries to summarize and structure what I have learned about the FUSE (Filesystem in Userspace) protocol and Linux kernel internals during the development of gocryptfs.
The Markdown source code of this document is available at https://github.com/rfjakob/the-fuse-wire-protocol - pull requests welcome!
The rendered HTML should always be available at https://nuetzlich.net/the-fuse-wire-protocol/.
To understand how FUSE works, it is important to know what the Linux filesystem stack looks like. FUSE is designed to fit seamlessly into the existing model.
Let's take unlink("/tmp/foo") on an ext4 filesystem as an example. Like many other system calls, unlink() operates on a file path, while Linux internally operates on dentry ("directory entry") structs (definition). Each dentry has a pointer to an inode struct (definition) that is filled by the filesystem (in our example, ext4). Each inode struct in turn points to an inode_operations struct (definition), which holds a list of function pointers.
The overall structure looks like this:
- dentry
  - inode
    - inode_operations
      - lookup()
      - unlink()
      - ...
The Linux VFS layer splits the path into segments. In our case, these are /, tmp and foo. The / (root directory) dentry is created at mount-time and serves as the starting point for the recursive walk:
- The VFS calls lookup("tmp") on the dentry corresponding to / and receives the dentry for tmp
- The VFS calls lookup("foo") on the dentry corresponding to tmp and receives the dentry for foo
- The VFS calls unlink() on the dentry corresponding to foo
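The walk above can be modelled in a few lines of Python. This is only a toy sketch, not kernel code: the class names mirror the kernel structs, but toy_lookup, toy_unlink, vfs_unlink and the calls list are hypothetical names made up for illustration, and the real inode_operations functions have different signatures.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class InodeOperations:
    """Function pointers supplied by the filesystem (toy stand-in)."""
    lookup: Callable
    unlink: Callable


@dataclass
class Inode:
    ops: InodeOperations
    children: Dict[str, "Inode"] = field(default_factory=dict)


@dataclass
class Dentry:
    name: str
    inode: Inode
    parent: Optional["Dentry"] = None


calls = []  # records which filesystem functions the walk triggered


def toy_lookup(directory: Dentry, name: str) -> Dentry:
    """Filesystem side: resolve one path segment inside a directory."""
    calls.append(f"lookup({name})")
    return Dentry(name, directory.inode.children[name], parent=directory)


def toy_unlink(entry: Dentry) -> None:
    """Filesystem side: remove the entry from its parent directory."""
    calls.append(f"unlink({entry.name})")
    del entry.parent.inode.children[entry.name]


def vfs_unlink(root: Dentry, path: str) -> None:
    """VFS side: walk the path segment by segment, then unlink the final dentry."""
    current = root
    for segment in path.strip("/").split("/"):
        current = current.inode.ops.lookup(current, segment)
    current.inode.ops.unlink(current)


ops = InodeOperations(lookup=toy_lookup, unlink=toy_unlink)
foo = Inode(ops)
tmp = Inode(ops, {"foo": foo})
root = Dentry("/", Inode(ops, {"tmp": tmp}))  # created at "mount time"

vfs_unlink(root, "/tmp/foo")
print(calls)
```

The VFS part (vfs_unlink) never touches the directory contents itself. It only calls through the function pointers, which is what allows a different filesystem to plug in its own implementations.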
The lookup() and unlink() functions are, in our example, implemented by the ext4 filesystem.
For a FUSE filesystem, the functions in inode_operations are implemented in the userspace filesystem. The FUSE module in the Linux kernel provides stub implementations (definition) that forward the requests to the userspace filesystem and convert between kernel API and FUSE wire protocol.
Translating paths to dentry structs is a performance-critical operation. To avoid calling the filesystem's lookup() function for each segment, the Linux kernel implements a directory entry cache called dcache.
For local filesystems like ext4, the cached entries never expire. For FUSE filesystems, the default timeout is 1 second, but it can be set to an arbitrary value using the entry_timeout mount option in libfuse (see man 8 fuse) or the EntryTimeout field in go-fuse.
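To illustrate the effect of the entry timeout, here is a minimal sketch of a name cache with expiry. It is not how the kernel's dcache is implemented. EntryCache, slow_lookup and the hand-controlled clock are hypothetical names invented for this example; a timeout of None stands in for the "never expire" behaviour of local filesystems.

```python
import time


class EntryCache:
    """Toy name cache: remembers lookup results for `timeout` seconds."""

    def __init__(self, lookup, timeout=1.0, clock=time.monotonic):
        self.lookup = lookup      # the filesystem's (slow) lookup function
        self.timeout = timeout    # None means "never expire"
        self.clock = clock
        self.entries = {}         # (parent, name) -> (result, expiry)

    def get(self, parent, name):
        key = (parent, name)
        cached = self.entries.get(key)
        now = self.clock()
        if cached is not None and (cached[1] is None or now < cached[1]):
            return cached[0]      # cache hit: the filesystem is not consulted
        result = self.lookup(parent, name)
        expiry = None if self.timeout is None else now + self.timeout
        self.entries[key] = (result, expiry)
        return result


lookups = []      # names that actually reached the filesystem
fake_now = [0.0]  # hand-controlled clock so the example is deterministic


def slow_lookup(parent, name):
    lookups.append(name)
    return f"{parent}/{name}"


cache = EntryCache(slow_lookup, timeout=1.0, clock=lambda: fake_now[0])
cache.get("", "tmp")   # miss: calls slow_lookup
cache.get("", "tmp")   # hit: served from the cache
fake_now[0] = 1.5      # more than one second later
cache.get("", "tmp")   # expired: calls slow_lookup again
print(len(lookups))
```

A longer timeout means fewer round trips to the filesystem, at the cost of the cache possibly holding stale entries for longer if the underlying files change behind its back.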
The Linux kernel and the userspace filesystem communicate by sending messages through the /dev/fuse device. On the kernel side, message parsing and generation are handled by the FUSE module. On the userspace side, this is usually handled by a FUSE library. libfuse is the reference implementation and is developed in lockstep with the kernel. Alternative FUSE libraries like go-fuse follow the developments in libfuse.
Note: the excellent manual page fuse.4 has more details.
The kernel and userspace each define the message format in a C header file, and the two definitions correspond to each other:
- Userspace: libfuse/include/fuse_kernel.h
- Kernel: linux/include/uapi/linux/fuse.h
Every message from the kernel to userspace starts with the fuse_in_header struct (definition). The most interesting fields are:
- opcode ... the operation the kernel wants to perform (a uint32 from enum fuse_opcode)
- nodeid ... the file or directory to operate on (arbitrary uint64 identifier)
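As a rough sketch of what decoding this header involves, the snippet below unpacks it with Python's struct module. The field list follows struct fuse_in_header in linux/include/uapi/linux/fuse.h; only opcode and nodeid are discussed in the text, the remaining names come from that header. Newer kernels split the last 32-bit padding field into two 16-bit fields, which does not change the total size. InHeader, parse_in_header and the sample values are hypothetical and exist only for illustration.

```python
import struct
from collections import namedtuple

# struct fuse_in_header: len, opcode (uint32), unique, nodeid (uint64),
# uid, gid, pid, padding (uint32).
# "=" selects native byte order without extra alignment padding.
IN_HEADER = struct.Struct("=IIQQIIII")

InHeader = namedtuple("InHeader", "len opcode unique nodeid uid gid pid padding")

# A few values from enum fuse_opcode.
FUSE_LOOKUP = 1
FUSE_UNLINK = 10
FUSE_RENAME = 12


def parse_in_header(message: bytes) -> InHeader:
    """Decode the fixed-size header at the start of a kernel-to-userspace message."""
    return InHeader(*IN_HEADER.unpack_from(message))


# Hand-built sample message: the header followed by 4 payload bytes.
# unique, uid, gid and pid are made-up values.
payload = b"foo\x00"
sample = IN_HEADER.pack(
    IN_HEADER.size + len(payload), FUSE_UNLINK, 7, 2, 1000, 1000, 4242, 0
) + payload

header = parse_in_header(sample)
print(header.opcode, header.nodeid, header.len)
```

The len field covers the whole message including the header, so a reader can tell how many bytes of opcode-specific data follow.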
The opcode defines the data that follows the header. An opcode-specific struct and up to two filenames may follow. A RENAME message uses all of those fields and looks like this:
- fuse_in_header struct
- fuse_rename_in struct
- filename
- filename
Whereas an UNLINK message looks like this:
- fuse_in_header struct
- filename
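The two layouts above can be sketched with Python's struct module. The snippet assumes that filenames are NUL-terminated and that fuse_rename_in consists of a single uint64 (the nodeid of the target directory), as declared in the kernel header. build_message, split_names and parse_rename are hypothetical helper names, and most header fields are zeroed for brevity.

```python
import struct

IN_HEADER = struct.Struct("=IIQQIIII")  # struct fuse_in_header, 40 bytes
RENAME_IN = struct.Struct("=Q")         # struct fuse_rename_in: uint64 newdir
FUSE_UNLINK = 10
FUSE_RENAME = 12


def build_message(opcode, nodeid, body=b"", names=()):
    """Assemble header + optional opcode-specific struct + NUL-terminated names."""
    tail = body + b"".join(name.encode() + b"\x00" for name in names)
    # unique, uid, gid, pid and padding are zeroed here for brevity.
    header = IN_HEADER.pack(IN_HEADER.size + len(tail), opcode, 0, nodeid, 0, 0, 0, 0)
    return header + tail


def split_names(data):
    """Split a run of NUL-terminated filenames into a list of strings."""
    return [part.decode() for part in data.split(b"\x00")[:-1]]


def parse_rename(message):
    """Return (old directory nodeid, new directory nodeid, old name, new name)."""
    nodeid = IN_HEADER.unpack_from(message)[3]
    (newdir,) = RENAME_IN.unpack_from(message, IN_HEADER.size)
    old_name, new_name = split_names(message[IN_HEADER.size + RENAME_IN.size:])
    return nodeid, newdir, old_name, new_name


# Rename "foo" in directory 2 to "bar" in directory 3.
rename_msg = build_message(FUSE_RENAME, 2, RENAME_IN.pack(3), ["foo", "bar"])
# Unlink "foo" in directory 2: no opcode-specific struct, one filename.
unlink_msg = build_message(FUSE_UNLINK, 2, names=["foo"])

print(parse_rename(rename_msg))
print(split_names(unlink_msg[IN_HEADER.size:]))
```

Note that the header's nodeid refers to the directory containing the file, and the filename selects the entry inside it.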
The go-fuse library has two nice tables listing what data follows the header for each opcode. Due to Go naming conventions, the struct names differ slightly from the C names, but the correlation should be clear enough.
The nodeid field in fuse_in_header identifies which file or directory the operation should be performed on. The kernel has to obtain the nodeid from the userspace filesystem before it can perform any other operation.
The process is the same for in-kernel filesystems: See the section "The Inode Object" in https://www.kernel.org/doc/Documentation/filesystems/vfs.txt.
The LOOKUP opcode allows the kernel to get a nodeid for a filename in a directory. A LOOKUP message looks like this:
- fuse_in_header struct
- filename
The userspace filesystem replies with the nodeid corresponding to the filename in the directory identified by the nodeid in the header.
The root directory has a fixed nodeid of 1.
The nodeid is an arbitrary value that is chosen by the userspace filesystem. The userspace filesystem must remember which file or directory the nodeid corresponds to.
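One simple way for a userspace filesystem to keep this bookkeeping is a table that maps each nodeid to a path and hands out fresh nodeids on demand. The sketch below is a toy under that assumption, not how libfuse or go-fuse do it. NodeTable and its attributes are hypothetical names, and a real filesystem would also have to check that the file exists and report errors.

```python
ROOT_NODEID = 1  # fixed by the protocol


class NodeTable:
    """Toy userspace-side bookkeeping: nodeid <-> path of the backing file."""

    def __init__(self):
        self.paths = {ROOT_NODEID: "/"}
        self.ids = {"/": ROOT_NODEID}
        self.next_id = ROOT_NODEID + 1

    def lookup(self, parent_nodeid, name):
        """Handle LOOKUP: return the nodeid for `name` inside `parent_nodeid`."""
        parent_path = self.paths[parent_nodeid]
        path = parent_path.rstrip("/") + "/" + name
        if path not in self.ids:
            # First time the kernel asks about this file: pick a new nodeid.
            self.ids[path] = self.next_id
            self.paths[self.next_id] = path
            self.next_id += 1
        return self.ids[path]


table = NodeTable()
tmp_id = table.lookup(ROOT_NODEID, "tmp")  # LOOKUP "tmp" in the root directory
foo_id = table.lookup(tmp_id, "foo")       # LOOKUP "foo" in the directory tmp
print(tmp_id, foo_id, table.paths[foo_id])
```

Later messages such as UNLINK only carry the nodeid of the directory, so the filesystem uses the table to translate it back to the file or directory it stands for.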
- fuse(4), Linux manual page
  https://man7.org/linux/man-pages/man4/fuse.4.html
- Writing a FUSE Filesystem: a Tutorial
  Joseph J. Pfeiffer Jr.
  https://www.cs.nmsu.edu/~pfeiffer/fuse-tutorial/
- Overview of the Linux Virtual File System
  Richard Gooch, Pekka Enberg
  https://www.kernel.org/doc/Documentation/filesystems/vfs.txt
- To FUSE or Not to FUSE: Performance of User-Space File Systems
  Vangoor, Tarasov, Zadok; 2017
  https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf