
@@ -216,7 +216,6 @@ had ->revalidate()) add calls in ->follow_link()/->readlink().
->d_parent changes are not protected by BKL anymore. Read access is safe
if at least one of the following is true:
* filesystem has no cross-directory rename()
* dcache_lock is held
* we know that parent had been locked (e.g. we are looking at
->d_parent of ->lookup() argument).
* we are called from ->rename().
@@ -318,3 +317,87 @@ if it's zero is not *and* *never* *had* *been* enough. Final unlink() and iput(
may happen while the inode is in the middle of ->write_inode(); e.g. if you blindly
free the on-disk inode, you may end up doing that while ->write_inode() is writing
to it.

---
[mandatory]

.d_delete() now only advises the dcache as to whether or not to cache
unreferenced dentries, and is now only called when the dentry refcount goes to
0. Even on 0 refcount transition, it must be able to tolerate being called 0,
1, or more times (e.g. constant, idempotent).

---
[mandatory]

.d_compare() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.txt (and
look at examples of other filesystems) for guidance.

---
[mandatory]

.d_hash() calling convention and locking rules are significantly
changed. Read updated documentation in Documentation/filesystems/vfs.txt (and
look at examples of other filesystems) for guidance.

---
[mandatory]
dcache_lock is gone, replaced by fine grained locks. See fs/dcache.c
for details of what locks to replace dcache_lock with in order to protect
particular things. Most of the time, a filesystem only needs ->d_lock, which
protects *all* the dcache state of a given dentry.

--
[mandatory]

Filesystems must RCU-free their inodes, if they can have been accessed
via rcu-walk path walk (basically, if the file can have had a path name in the
vfs namespace).

i_dentry and i_rcu share storage in a union, and the vfs expects
i_dentry to be reinitialized before it is freed, so an:

INIT_LIST_HEAD(&inode->i_dentry);

must be done in the RCU callback.
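
A minimal userspace sketch of the rule above, not kernel code: the
structures are simplified stand-ins, example_inode, example_i_callback and
example_destroy_inode are hypothetical names, and the callback is invoked
directly where a real filesystem would hand it to call_rcu(). It only shows
that i_dentry has to be reinitialized inside the callback, because the
shared storage was last used as i_rcu.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption: layout only). */
struct list_head { struct list_head *next, *prev; };
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* Hypothetical inode: i_dentry and i_rcu share storage, as in the text. */
struct example_inode {
	union {
		struct list_head i_dentry;
		struct rcu_head i_rcu;
	};
	int returned_to_cache;
};

/* RCU callback: reinitialize i_dentry before the object is freed. */
static void example_i_callback(struct rcu_head *head)
{
	struct example_inode *inode =
		container_of(head, struct example_inode, i_rcu);

	INIT_LIST_HEAD(&inode->i_dentry);
	/* A real filesystem would free the inode to its slab cache here. */
	inode->returned_to_cache = 1;
}

/* Stand-in for ->destroy_inode(); runs the callback immediately. */
static void example_destroy_inode(struct example_inode *inode)
{
	/* Using i_rcu overwrites the storage that held i_dentry. */
	inode->i_rcu.next = NULL;
	inode->i_rcu.func = example_i_callback;
	inode->i_rcu.func(&inode->i_rcu);
	assert(inode->returned_to_cache);
}
```

In the kernel the callback runs only after a grace period, so the
reinitialization cannot be done earlier in ->destroy_inode() itself.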

--
[recommended]
vfs now tries to do path walking in "rcu-walk mode", which avoids
atomic operations and scalability hazards on dentries and inodes (see
Documentation/filesystems/path-lookup.txt). d_hash and d_compare changes
(above) are examples of the changes required to support this. For more complex
filesystem callbacks, the vfs drops out of rcu-walk mode before the fs call, so
no changes are required to the filesystem. However, this is costly and loses
the benefits of rcu-walk mode. We will begin to add filesystem callbacks that
are rcu-walk aware, shown below. Filesystems should take advantage of this
where possible.

--
[mandatory]
d_revalidate is a callback that is made on every path element (if
the filesystem provides it), which used to require dropping out of rcu-walk
mode. It may now be called in rcu-walk mode (nd->flags & LOOKUP_RCU). -ECHILD
should be returned if the filesystem cannot handle rcu-walk. See
Documentation/filesystems/vfs.txt for more details.

permission and check_acl are inode permission checks that are called
on many or all directory inodes on the way down a path walk (to check for
exec permission). These must now be rcu-walk aware (flags & IPERM_FLAG_RCU).
See Documentation/filesystems/vfs.txt for more details.
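
The simplest way to satisfy the d_revalidate rule above is to bail out of
rcu-walk with -ECHILD. A userspace sketch follows; example_nameidata and
example_d_revalidate are hypothetical names, and EXAMPLE_LOOKUP_RCU is a
stand-in for LOOKUP_RCU whose value is assumed here, not taken from the
kernel headers.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for LOOKUP_RCU; the numeric value is an assumption. */
#define EXAMPLE_LOOKUP_RCU 0x0040

/* Reduced stand-in for struct nameidata: only the flags matter here. */
struct example_nameidata {
	unsigned int flags;
};

/*
 * Hypothetical ->d_revalidate(): refuses to run in rcu-walk mode, so
 * the caller has to retry the lookup in ref-walk mode.
 */
static int example_d_revalidate(const struct example_nameidata *nd)
{
	if (nd->flags & EXAMPLE_LOOKUP_RCU)
		return -ECHILD;

	/* Ref-walk mode: blocking revalidation work would go here. */
	return 1;	/* positive: dentry is still valid */
}

/* Small usage check for both modes. */
static void example_d_revalidate_usage(void)
{
	struct example_nameidata rcu_walk = { EXAMPLE_LOOKUP_RCU };
	struct example_nameidata ref_walk = { 0 };

	assert(example_d_revalidate(&rcu_walk) == -ECHILD);
	assert(example_d_revalidate(&ref_walk) == 1);
}
```

This keeps the filesystem correct but gives up the rcu-walk fast path for
every dentry it revalidates.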

--
[mandatory]
In ->fallocate() you must check the mode option passed in. If your
filesystem does not support hole punching (deallocating space in the middle of a
file) you must return -EOPNOTSUPP if FALLOC_FL_PUNCH_HOLE is set in mode.
Currently you can only have FALLOC_FL_PUNCH_HOLE with FALLOC_FL_KEEP_SIZE set,
so the i_size should not change when hole punching, even when punching the end of
a file off.
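
For a filesystem without hole punching support, the check described above
can be sketched as follows. This is a userspace illustration:
example_fallocate_check_mode is a hypothetical helper, and the flag values
are defined locally (matching <linux/falloc.h>) so the block builds on its
own.

```c
#include <assert.h>
#include <errno.h>

/* Defined locally so the sketch does not depend on kernel headers. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE	0x01
#endif
#ifndef FALLOC_FL_PUNCH_HOLE
#define FALLOC_FL_PUNCH_HOLE	0x02
#endif

/*
 * Hypothetical helper for the top of ->fallocate(): returns 0 if the
 * requested mode can be handled, or a negative errno if it cannot.
 */
static int example_fallocate_check_mode(int mode)
{
	/* No hole punching support: reject the request up front. */
	if (mode & FALLOC_FL_PUNCH_HOLE)
		return -EOPNOTSUPP;

	return 0;
}

/* Small usage check for the three interesting mode values. */
static void example_fallocate_usage(void)
{
	assert(example_fallocate_check_mode(0) == 0);
	assert(example_fallocate_check_mode(FALLOC_FL_KEEP_SIZE) == 0);
	assert(example_fallocate_check_mode(FALLOC_FL_PUNCH_HOLE |
					    FALLOC_FL_KEEP_SIZE) == -EOPNOTSUPP);
}
```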

--
[mandatory]
->get_sb() is gone. Switch to use of ->mount(). Typically it's just
a matter of switching from calling get_sb_... to mount_... and changing the
function type. If you were doing it manually, just switch from setting ->mnt_root
to some pointer to returning that pointer. On errors return ERR_PTR(...).
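
The change of return convention can be illustrated in userspace. In this
sketch ERR_PTR(), PTR_ERR() and IS_ERR() are local re-implementations in the
style of the kernel helpers, and example_dentry, example_vfsmount,
example_get_sb and example_mount are hypothetical names. It only contrasts
"set ->mnt_root and return an int" with "return the pointer or
ERR_PTR(...)".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local stand-ins modelled on the kernel's err.h helpers. */
#define MAX_ERRNO 4095

static void *ERR_PTR(long error)
{
	return (void *)error;
}

static long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (unsigned long)ptr >= (unsigned long)-MAX_ERRNO;
}

/* Reduced stand-ins for the kernel structures. */
struct example_dentry { const char *name; };
struct example_vfsmount { struct example_dentry *mnt_root; };

static struct example_dentry example_root = { "/" };

/* Old style: store the root in the vfsmount, return 0 or -errno. */
static int example_get_sb(const char *dev_name, struct example_vfsmount *mnt)
{
	if (!dev_name)
		return -EINVAL;
	mnt->mnt_root = &example_root;
	return 0;
}

/* New style: return the root itself, or ERR_PTR(-errno) on failure. */
static struct example_dentry *example_mount(const char *dev_name)
{
	if (!dev_name)
		return ERR_PTR(-EINVAL);
	return &example_root;
}

/* Both conventions report the same outcome for the same input. */
static void example_mount_usage(void)
{
	struct example_vfsmount mnt = { NULL };

	assert(example_get_sb(NULL, &mnt) == -EINVAL);
	assert(PTR_ERR(example_mount(NULL)) == -EINVAL);
	assert(example_get_sb("dev", &mnt) == 0);
	assert(example_mount("dev") == mnt.mnt_root);
}
```

The caller of the new style tests the result with IS_ERR() and fills in the
vfsmount itself, so the filesystem no longer needs to see the vfsmount at
all.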
@@ -95,10 +95,11 @@ functions:
extern int unregister_filesystem(struct file_system_type *);

The passed struct file_system_type describes your filesystem. When a
request is made to mount a device onto a directory in your filespace,
the VFS will call the appropriate get_sb() method for the specific
filesystem. The dentry for the mount point will then be updated to
point to the root inode for the new filesystem.
request is made to mount a filesystem onto a directory in your namespace,
the VFS will call the appropriate mount() method for the specific
filesystem. A new vfsmount referring to the tree returned by ->mount()
will be attached to the mountpoint, so that when pathname resolution
reaches the mountpoint it will jump into the root of that vfsmount.

You can see all filesystems that are registered to the kernel in the
file /proc/filesystems.
@@ -107,14 +108,14 @@ file /proc/filesystems.
struct file_system_type
-----------------------

This describes the filesystem. As of kernel 2.6.22, the following
This describes the filesystem. As of kernel 2.6.39, the following
members are defined:

struct file_system_type {
const char *name;
int fs_flags;
int (*get_sb) (struct file_system_type *, int,
const char *, void *, struct vfsmount *);
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
void (*kill_sb) (struct super_block *);
struct module *owner;
struct file_system_type * next;
@@ -128,11 +129,11 @@ struct file_system_type {

fs_flags: various flags (e.g. FS_REQUIRES_DEV, FS_NO_DCACHE)

get_sb: the method to call when a new instance of this
mount: the method to call when a new instance of this
filesystem should be mounted

kill_sb: the method to call when an instance of this filesystem
should be unmounted
should be shut down

owner: for internal VFS use: you should initialize this to THIS_MODULE in
most cases.
@@ -141,7 +142,7 @@ struct file_system_type {

s_lock_key, s_umount_key: lockdep-specific

The get_sb() method has the following arguments:
The mount() method has the following arguments:

struct file_system_type *fs_type: describes the filesystem, partly initialized
by the specific filesystem code
@@ -153,32 +154,39 @@ The get_sb() method has the following arguments:
void *data: arbitrary mount options, usually comes as an ASCII
string (see "Mount Options" section)

struct vfsmount *mnt: a vfs-internal representation of a mount point
The mount() method must return the root dentry of the tree requested by
caller. An active reference to its superblock must be grabbed and the
superblock must be locked. On failure it should return ERR_PTR(error).

The get_sb() method must determine if the block device specified
in the dev_name and fs_type contains a filesystem of the type the method
supports. If it succeeds in opening the named block device, it initializes a
struct super_block descriptor for the filesystem contained by the block device.
On failure it returns an error.
The arguments match those of mount(2) and their interpretation
depends on filesystem type. E.g. for block filesystems, dev_name is
interpreted as block device name, that device is opened and if it
contains a suitable filesystem image the method creates and initializes
struct super_block accordingly, returning its root dentry to caller.

->mount() may choose to return a subtree of an existing filesystem - it
doesn't have to create a new one. The main result from the caller's
point of view is a reference to dentry at the root of (sub)tree to
be attached; creation of new superblock is a common side effect.

The most interesting member of the superblock structure that the
get_sb() method fills in is the "s_op" field. This is a pointer to
mount() method fills in is the "s_op" field. This is a pointer to
a "struct super_operations" which describes the next level of the
filesystem implementation.

Usually, a filesystem uses one of the generic get_sb() implementations
and provides a fill_super() method instead. The generic methods are:
Usually, a filesystem uses one of the generic mount() implementations
and provides a fill_super() callback instead. The generic variants are:

get_sb_bdev: mount a filesystem residing on a block device
mount_bdev: mount a filesystem residing on a block device

get_sb_nodev: mount a filesystem that is not backed by a device
mount_nodev: mount a filesystem that is not backed by a device

get_sb_single: mount a filesystem which shares the instance between
mount_single: mount a filesystem which shares the instance between
all mounts

A fill_super() method implementation has the following arguments:
A fill_super() callback implementation has the following arguments:

struct super_block *sb: the superblock structure. The method fill_super()
struct super_block *sb: the superblock structure. The callback
must initialize this properly.

void *data: arbitrary mount options, usually comes as an ASCII
@@ -843,23 +843,6 @@ struct dentry *mount_bdev(struct file_system_type *fs_type,
}
EXPORT_SYMBOL(mount_bdev);

int get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt)
{
struct dentry *root;

root = mount_bdev(fs_type, flags, dev_name, data, fill_super);
if (IS_ERR(root))
return PTR_ERR(root);
mnt->mnt_root = root;
mnt->mnt_sb = root->d_sb;
return 0;
}

EXPORT_SYMBOL(get_sb_bdev);

void kill_block_super(struct super_block *sb)
{
struct block_device *bdev = sb->s_bdev;
@@ -897,22 +880,6 @@ struct dentry *mount_nodev(struct file_system_type *fs_type,
}
EXPORT_SYMBOL(mount_nodev);

int get_sb_nodev(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt)
{
struct dentry *root;

root = mount_nodev(fs_type, flags, data, fill_super);
if (IS_ERR(root))
return PTR_ERR(root);
mnt->mnt_root = root;
mnt->mnt_sb = root->d_sb;
return 0;
}
EXPORT_SYMBOL(get_sb_nodev);

static int compare_single(struct super_block *s, void *p)
{
return 1;
@@ -943,22 +910,6 @@ struct dentry *mount_single(struct file_system_type *fs_type,
}
EXPORT_SYMBOL(mount_single);

int get_sb_single(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt)
{
struct dentry *root;
root = mount_single(fs_type, flags, data, fill_super);
if (IS_ERR(root))
return PTR_ERR(root);
mnt->mnt_root = root;
mnt->mnt_sb = root->d_sb;
return 0;
}

EXPORT_SYMBOL(get_sb_single);

struct vfsmount *
vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void *data)
{
@@ -988,19 +939,13 @@ vfs_kern_mount(struct file_system_type *type, int flags, const char *name, void
goto out_free_secdata;
}

if (type->mount) {
root = type->mount(type, flags, name, data);
if (IS_ERR(root)) {
error = PTR_ERR(root);
goto out_free_secdata;
}
mnt->mnt_root = root;
mnt->mnt_sb = root->d_sb;
} else {
error = type->get_sb(type, flags, name, data, mnt);
if (error < 0)
goto out_free_secdata;
root = type->mount(type, flags, name, data);
if (IS_ERR(root)) {
error = PTR_ERR(root);
goto out_free_secdata;
}
mnt->mnt_root = root;
mnt->mnt_sb = root->d_sb;
BUG_ON(!mnt->mnt_sb);
WARN_ON(!mnt->mnt_sb->s_bdi);
mnt->mnt_sb->s_flags |= MS_BORN;
@@ -236,7 +236,7 @@ static int yaffs_file_flush(struct file *file, fl_owner_t id);
static int yaffs_file_flush(struct file *file);
#endif

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
static int yaffs_sync_object(struct file *file, loff_t start, loff_t end, int datasync);
#elif (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 34))
static int yaffs_sync_object(struct file *file, int datasync);
@@ -1827,7 +1827,7 @@ static int yaffs_symlink(struct inode *dir, struct dentry *dentry,
return -ENOMEM;
}

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
static int yaffs_sync_object(struct file *file, loff_t start, loff_t end, int datasync)
#elif (LINUX_VERSION_CODE > KERNEL_VERSION(2, 6, 34))
static int yaffs_sync_object(struct file *file, int datasync)
@@ -2976,7 +2976,7 @@ static int yaffs_internal_read_super_mtd(struct super_block *sb, void *data,
return yaffs_internal_read_super(1, sb, data, silent) ? 0 : -EINVAL;
}

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
static struct dentry *yaffs_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data)
{
@@ -3005,7 +3005,7 @@ static struct super_block *yaffs_read_super(struct file_system_type *fs,
static struct file_system_type yaffs_fs_type = {
.owner = THIS_MODULE,
.name = "yaffs",
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
.mount = yaffs_mount,
#else
.get_sb = yaffs_read_super,
@@ -3032,7 +3032,7 @@ static int yaffs2_internal_read_super_mtd(struct super_block *sb, void *data,
return yaffs_internal_read_super(2, sb, data, silent) ? 0 : -EINVAL;
}

#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
static struct dentry *yaffs2_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data)
{
@@ -3060,7 +3060,7 @@ static struct super_block *yaffs2_read_super(struct file_system_type *fs,
static struct file_system_type yaffs2_fs_type = {
.owner = THIS_MODULE,
.name = "yaffs2",
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 39))
#if (LINUX_VERSION_CODE >= KERNEL_VERSION(2, 6, 38))
.mount = yaffs2_mount,
#else
.get_sb = yaffs2_read_super,
@@ -1796,8 +1796,6 @@ int sync_inode_metadata(struct inode *inode, int wait);
struct file_system_type {
const char *name;
int fs_flags;
int (*get_sb) (struct file_system_type *, int,
const char *, void *, struct vfsmount *);
struct dentry *(*mount) (struct file_system_type *, int,
const char *, void *);
void (*kill_sb) (struct super_block *);
@@ -1820,24 +1818,12 @@ extern struct dentry *mount_ns(struct file_system_type *fs_type, int flags,
extern struct dentry *mount_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int));
extern int get_sb_bdev(struct file_system_type *fs_type,
int flags, const char *dev_name, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt);
extern struct dentry *mount_single(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int));
extern int get_sb_single(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt);
extern struct dentry *mount_nodev(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int));
extern int get_sb_nodev(struct file_system_type *fs_type,
int flags, void *data,
int (*fill_super)(struct super_block *, void *, int),
struct vfsmount *mnt);
void generic_shutdown_super(struct super_block *sb);
void kill_block_super(struct super_block *sb);
void kill_anon_super(struct super_block *sb);