Merge branch 'nfs-for-2.6.32'
Trond Myklebust authored and Trond Myklebust committed Sep 11, 2009
2 parents 332a339 + 2ecda72 commit ab3bbaa
Showing 48 changed files with 4,064 additions and 1,805 deletions.
98 changes: 98 additions & 0 deletions Documentation/filesystems/nfs.txt
@@ -0,0 +1,98 @@

The NFS client
==============

The NFS version 2 protocol was first documented in RFC1094 (March 1989).
Since then two more major releases of NFS have been published, with NFSv3
being documented in RFC1813 (June 1995), and NFSv4 in RFC3530 (April
2003).

The Linux NFS client currently supports all the above published versions,
and work is in progress on adding support for minor version 1 of the NFSv4
protocol.

This document describes some of the upcall interfaces that supply the NFS
client with information it needs in order to comply fully with the NFS
specification.

The DNS resolver
================

NFSv4 allows for one server to refer the NFS client to data that has been
migrated onto another server by means of the special "fs_locations"
attribute. See
http://tools.ietf.org/html/rfc3530#section-6
and
http://tools.ietf.org/html/draft-ietf-nfsv4-referrals-00

The fs_locations information can take the form of either an IP address and
a path, or a DNS hostname and a path. The latter requires the NFS client to
do a DNS lookup in order to mount the new volume, and hence the need for an
upcall to allow userland to provide this service.

Assuming that the user has the 'rpc_pipefs' filesystem mounted in the usual
/var/lib/nfs/rpc_pipefs, the upcall consists of the following steps:

 (1) The process checks the dns_resolve cache to see if it contains a
     valid entry. If so, it returns that entry and exits.

 (2) If no valid entry exists, the helper script '/sbin/nfs_cache_getent'
     (may be changed using the 'nfs.cache_getent' kernel boot parameter)
     is run, with two arguments:
		- the cache name, "dns_resolve"
		- the hostname to resolve

 (3) After looking up the corresponding IP address, the helper script
     writes the result into the rpc_pipefs pseudo-file
     '/var/lib/nfs/rpc_pipefs/cache/dns_resolve/channel'
     in the following (text) format:

		"<ip address> <hostname> <ttl>\n"

     Where <ip address> is in the usual IPv4 (192.0.2.10) or IPv6
     (ffee:ddcc:bbaa:9988:7766:5544:3322:1100, ffee::1100, ...) format.
     <hostname> is identical to the second argument of the helper
     script, and <ttl> is the 'time to live' of this cache entry (in
     units of seconds).

     Note: If <ip address> is invalid, say the string "0", then a negative
     entry is created, which will cause the kernel to treat the hostname
     as having no valid DNS translation.




A basic sample /sbin/nfs_cache_getent
=====================================

#!/bin/bash
#
ttl=600
#
cut=/usr/bin/cut
getent=/usr/bin/getent
rpc_pipefs=/var/lib/nfs/rpc_pipefs
#
die()
{
	echo "Usage: $0 cache_name entry_name" >&2
	exit 1
}

[ $# -lt 2 ] && die
cachename="$1"
cache_path="${rpc_pipefs}/cache/${cachename}/channel"

case "${cachename}" in
	dns_resolve)
		name="$2"
		# Keep only the first field (the address) of the getent output.
		result="$(${getent} hosts "${name}" | ${cut} -f1 -d' ')"
		# A failed lookup becomes "0", which creates a negative entry.
		[ -z "${result}" ] && result="0"
		;;
	*)
		die
		;;
esac
echo "${result} ${name} ${ttl}" >"${cache_path}"
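
The record format that nfs.txt describes can also be sketched in C, for readers who would rather not write the helper in shell. This is a minimal sketch and not part of the commit: the function name format_dns_record is hypothetical, and the only details taken from the document are the "<ip address> <hostname> <ttl>\n" layout and the use of the string "0" to request a negative entry.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the "<ip address> <hostname> <ttl>\n" record.
 * A NULL or empty address is replaced by "0", which asks the kernel to
 * create a negative entry for the hostname.
 * Returns the record length, or -1 if the record does not fit in buf.
 */
static int format_dns_record(char *buf, size_t len, const char *addr,
			     const char *hostname, unsigned int ttl)
{
	int n;

	if (addr == NULL || addr[0] == '\0')
		addr = "0";	/* negative entry: no valid DNS translation */

	n = snprintf(buf, len, "%s %s %u\n", addr, hostname, ttl);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* output error or truncated record */
	return n;
}
```

A caller would then write the resulting buffer to the dns_resolve channel file in a single write, as the sample script does with its final echo.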

29 changes: 29 additions & 0 deletions Documentation/kernel-parameters.txt
@@ -1503,6 +1503,14 @@ and is between 256 and 4096 characters. It is defined in the file
			[NFS] set the TCP port on which the NFSv4 callback
			channel should listen.

	nfs.cache_getent=
			[NFS] sets the pathname to the program which is used
			to update the NFS client cache entries.

	nfs.cache_getent_timeout=
			[NFS] sets the timeout after which an attempt to
			update a cache entry is deemed to have failed.

	nfs.idmap_cache_timeout=
			[NFS] set the maximum lifetime for idmapper cache
			entries.
@@ -2395,6 +2403,18 @@ and is between 256 and 4096 characters. It is defined in the file
	stifb=		[HW]
			Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]]

	sunrpc.min_resvport=
	sunrpc.max_resvport=
			[NFS,SUNRPC]
			SunRPC servers often require that client requests
			originate from a privileged port (i.e. a port in the
			range 0 < portnr < 1024).
			An administrator who wishes to reserve some of these
			ports for other uses may adjust the range that the
			kernel's sunrpc client considers to be privileged
			using these two parameters to set the minimum and
			maximum port values.

	sunrpc.pool_mode=
			[NFS]
			Control how the NFS server code allocates CPUs to
@@ -2411,6 +2431,15 @@ and is between 256 and 4096 characters. It is defined in the file
			pernode	    one pool for each NUMA node (equivalent
				    to global on non-NUMA machines)

	sunrpc.tcp_slot_table_entries=
	sunrpc.udp_slot_table_entries=
			[NFS,SUNRPC]
			Sets the upper limit on the number of simultaneous
			RPC calls that can be sent from the client to a
			server. Increasing these values may allow you to
			improve throughput, but will also increase the
			amount of memory reserved for use by the client.

	swiotlb=	[IA-64] Number of I/O TLB slabs

	switches=	[HW,M68k]
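
As an illustration of how the parameters documented in this hunk might be combined, here is a kernel command line fragment. It is only a sketch: every value is an arbitrary example rather than a recommendation, and the helper path /usr/local/sbin/nfs_cache_getent is hypothetical. The fragment assumes the nfs and sunrpc code is built into the kernel; when they are built as modules, the same names would normally be given as module options without the "nfs." or "sunrpc." prefix.

```
nfs.cache_getent=/usr/local/sbin/nfs_cache_getent nfs.cache_getent_timeout=30
sunrpc.min_resvport=700 sunrpc.max_resvport=1000
sunrpc.tcp_slot_table_entries=64 sunrpc.udp_slot_table_entries=64
```

Narrowing the reserved port range leaves the remaining privileged ports free for other services, at the cost of fewer ports for concurrent RPC transports.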
14 changes: 1 addition & 13 deletions fs/lockd/host.c
@@ -87,18 +87,6 @@ static unsigned int nlm_hash_address(const struct sockaddr *sap)
	return hash & (NLM_HOST_NRHASH - 1);
}

-static void nlm_clear_port(struct sockaddr *sap)
-{
-	switch (sap->sa_family) {
-	case AF_INET:
-		((struct sockaddr_in *)sap)->sin_port = 0;
-		break;
-	case AF_INET6:
-		((struct sockaddr_in6 *)sap)->sin6_port = 0;
-		break;
-	}
-}
-
/*
 * Common host lookup routine for server & client
 */
@@ -177,7 +165,7 @@ static struct nlm_host *nlm_lookup_host(struct nlm_lookup_host_info *ni)
	host->h_addrbuf = nsm->sm_addrbuf;
	memcpy(nlm_addr(host), ni->sap, ni->salen);
	host->h_addrlen = ni->salen;
-	nlm_clear_port(nlm_addr(host));
+	rpc_set_port(nlm_addr(host), 0);
	memcpy(nlm_srcaddr(host), ni->src_sap, ni->src_len);
	host->h_version = ni->version;
	host->h_proto = ni->protocol;
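
The hunk above drops the lockd-private nlm_clear_port() in favour of a call to rpc_set_port() with a port of 0. The definition of rpc_set_port() is not shown on this page, so the following userspace sketch is only an assumption about what a family-agnostic port setter could look like; the name sketch_set_port is hypothetical and the code is not taken from the kernel.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical userspace analogue of a generic port setter.
 * The port is given in host byte order and stored in network byte order.
 * Addresses of any other family are left untouched.
 */
static void sketch_set_port(struct sockaddr *sap, unsigned short port)
{
	switch (sap->sa_family) {
	case AF_INET:
		((struct sockaddr_in *)sap)->sin_port = htons(port);
		break;
	case AF_INET6:
		((struct sockaddr_in6 *)sap)->sin6_port = htons(port);
		break;
	}
}
```

Passing 0 reproduces what the removed nlm_clear_port() did, while a single shared helper also covers callers that need a non-zero port.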
44 changes: 5 additions & 39 deletions fs/lockd/mon.c
@@ -61,43 +61,6 @@ static inline struct sockaddr *nsm_addr(const struct nsm_handle *nsm)
	return (struct sockaddr *)&nsm->sm_addr;
}

-static void nsm_display_ipv4_address(const struct sockaddr *sap, char *buf,
-				     const size_t len)
-{
-	const struct sockaddr_in *sin = (struct sockaddr_in *)sap;
-	snprintf(buf, len, "%pI4", &sin->sin_addr.s_addr);
-}
-
-static void nsm_display_ipv6_address(const struct sockaddr *sap, char *buf,
-				     const size_t len)
-{
-	const struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)sap;
-
-	if (ipv6_addr_v4mapped(&sin6->sin6_addr))
-		snprintf(buf, len, "%pI4", &sin6->sin6_addr.s6_addr32[3]);
-	else if (sin6->sin6_scope_id != 0)
-		snprintf(buf, len, "%pI6%%%u", &sin6->sin6_addr,
-				sin6->sin6_scope_id);
-	else
-		snprintf(buf, len, "%pI6", &sin6->sin6_addr);
-}
-
-static void nsm_display_address(const struct sockaddr *sap,
-				char *buf, const size_t len)
-{
-	switch (sap->sa_family) {
-	case AF_INET:
-		nsm_display_ipv4_address(sap, buf, len);
-		break;
-	case AF_INET6:
-		nsm_display_ipv6_address(sap, buf, len);
-		break;
-	default:
-		snprintf(buf, len, "unsupported address family");
-		break;
-	}
-}
-
static struct rpc_clnt *nsm_create(void)
{
	struct sockaddr_in sin = {
@@ -307,8 +270,11 @@ static struct nsm_handle *nsm_create_handle(const struct sockaddr *sap,
	memcpy(nsm_addr(new), sap, salen);
	new->sm_addrlen = salen;
	nsm_init_private(new);
-	nsm_display_address((const struct sockaddr *)&new->sm_addr,
-				new->sm_addrbuf, sizeof(new->sm_addrbuf));
+
+	if (rpc_ntop(nsm_addr(new), new->sm_addrbuf,
+					sizeof(new->sm_addrbuf)) == 0)
+		(void)snprintf(new->sm_addrbuf, sizeof(new->sm_addrbuf),
+				"unsupported address family");
	memcpy(new->sm_name, hostname, hostname_len);
	new->sm_name[hostname_len] = '\0';
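
The new call site treats a return value of 0 from rpc_ntop() as "could not format this address" and falls back to a fixed string. The userspace sketch below illustrates that contract using the formatting rules of the removed nsm_display_*() helpers (IPv4, IPv4-mapped IPv6 printed as IPv4, and IPv6 with an optional "%scope" suffix). It is an assumption-laden illustration, not the kernel's rpc_ntop(): the name sketch_ntop is hypothetical, and inet_ntop() stands in for the kernel's %pI4 and %pI6 format specifiers, so the exact IPv6 text may differ from what the kernel prints.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical presentation-format helper.
 * Returns the length of the string written to buf, or 0 if the address
 * family is unsupported or the result does not fit.
 */
static size_t sketch_ntop(const struct sockaddr *sap, char *buf, size_t len)
{
	char tmp[INET6_ADDRSTRLEN];
	int n;

	switch (sap->sa_family) {
	case AF_INET: {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)sap;

		if (inet_ntop(AF_INET, &sin->sin_addr, tmp, sizeof(tmp)) == NULL)
			return 0;
		n = snprintf(buf, len, "%s", tmp);
		break;
	}
	case AF_INET6: {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)sap;

		if (IN6_IS_ADDR_V4MAPPED(&sin6->sin6_addr)) {
			/* The IPv4 address lives in the last four bytes. */
			if (inet_ntop(AF_INET, &sin6->sin6_addr.s6_addr[12],
				      tmp, sizeof(tmp)) == NULL)
				return 0;
			n = snprintf(buf, len, "%s", tmp);
		} else {
			if (inet_ntop(AF_INET6, &sin6->sin6_addr,
				      tmp, sizeof(tmp)) == NULL)
				return 0;
			if (sin6->sin6_scope_id != 0)
				n = snprintf(buf, len, "%s%%%u", tmp,
					     (unsigned int)sin6->sin6_scope_id);
			else
				n = snprintf(buf, len, "%s", tmp);
		}
		break;
	}
	default:
		return 0;	/* unsupported address family */
	}

	if (n < 0 || (size_t)n >= len)
		return 0;	/* output error or truncation */
	return (size_t)n;
}
```

Returning 0 on failure lets the caller choose its own fallback text, which is how the hunk above keeps the old "unsupported address family" message.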

3 changes: 2 additions & 1 deletion fs/nfs/Makefile
@@ -6,7 +6,8 @@ obj-$(CONFIG_NFS_FS) += nfs.o

nfs-y 			:= client.o dir.o file.o getroot.o inode.o super.o nfs2xdr.o \
			   direct.o pagelist.o proc.o read.o symlink.o unlink.o \
-			   write.o namespace.o mount_clnt.o
+			   write.o namespace.o mount_clnt.o \
+			   dns_resolve.o cache_lib.o
nfs-$(CONFIG_ROOT_NFS) += nfsroot.o
nfs-$(CONFIG_NFS_V3) += nfs3proc.o nfs3xdr.o
nfs-$(CONFIG_NFS_V3_ACL) += nfs3acl.o
140 changes: 140 additions & 0 deletions fs/nfs/cache_lib.c
@@ -0,0 +1,140 @@
/*
 * linux/fs/nfs/cache_lib.c
 *
 * Helper routines for the NFS client caches
 *
 * Copyright (c) 2009 Trond Myklebust <Trond.Myklebust@netapp.com>
 */
#include <linux/kmod.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/mount.h>
#include <linux/namei.h>
#include <linux/sunrpc/cache.h>
#include <linux/sunrpc/rpc_pipe_fs.h>

#include "cache_lib.h"

#define NFS_CACHE_UPCALL_PATHLEN 256
#define NFS_CACHE_UPCALL_TIMEOUT 15

static char nfs_cache_getent_prog[NFS_CACHE_UPCALL_PATHLEN] =
				"/sbin/nfs_cache_getent";
static unsigned long nfs_cache_getent_timeout = NFS_CACHE_UPCALL_TIMEOUT;

module_param_string(cache_getent, nfs_cache_getent_prog,
		sizeof(nfs_cache_getent_prog), 0600);
MODULE_PARM_DESC(cache_getent, "Path to the client cache upcall program");
module_param_named(cache_getent_timeout, nfs_cache_getent_timeout, ulong, 0600);
MODULE_PARM_DESC(cache_getent_timeout, "Timeout (in seconds) after which "
		"the cache upcall is assumed to have failed");

int nfs_cache_upcall(struct cache_detail *cd, char *entry_name)
{
	static char *envp[] = { "HOME=/",
		"TERM=linux",
		"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
		NULL
	};
	char *argv[] = {
		nfs_cache_getent_prog,
		cd->name,
		entry_name,
		NULL
	};
	int ret = -EACCES;

	if (nfs_cache_getent_prog[0] == '\0')
		goto out;
	ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
	/*
	 * Disable the upcall mechanism if we're getting an ENOENT or
	 * EACCES error. The admin can re-enable it on the fly by using
	 * sysfs to set the 'cache_getent' parameter once the problem
	 * has been fixed.
	 */
	if (ret == -ENOENT || ret == -EACCES)
		nfs_cache_getent_prog[0] = '\0';
out:
	return ret > 0 ? 0 : ret;
}

/*
 * Deferred request handling
 */
void nfs_cache_defer_req_put(struct nfs_cache_defer_req *dreq)
{
	if (atomic_dec_and_test(&dreq->count))
		kfree(dreq);
}

static void nfs_dns_cache_revisit(struct cache_deferred_req *d, int toomany)
{
	struct nfs_cache_defer_req *dreq;

	dreq = container_of(d, struct nfs_cache_defer_req, deferred_req);

	complete_all(&dreq->completion);
	nfs_cache_defer_req_put(dreq);
}

static struct cache_deferred_req *nfs_dns_cache_defer(struct cache_req *req)
{
	struct nfs_cache_defer_req *dreq;

	dreq = container_of(req, struct nfs_cache_defer_req, req);
	dreq->deferred_req.revisit = nfs_dns_cache_revisit;
	atomic_inc(&dreq->count);

	return &dreq->deferred_req;
}

struct nfs_cache_defer_req *nfs_cache_defer_req_alloc(void)
{
	struct nfs_cache_defer_req *dreq;

	dreq = kzalloc(sizeof(*dreq), GFP_KERNEL);
	if (dreq) {
		init_completion(&dreq->completion);
		atomic_set(&dreq->count, 1);
		dreq->req.defer = nfs_dns_cache_defer;
	}
	return dreq;
}

int nfs_cache_wait_for_upcall(struct nfs_cache_defer_req *dreq)
{
	if (wait_for_completion_timeout(&dreq->completion,
			nfs_cache_getent_timeout * HZ) == 0)
		return -ETIMEDOUT;
	return 0;
}

int nfs_cache_register(struct cache_detail *cd)
{
	struct nameidata nd;
	struct vfsmount *mnt;
	int ret;

	mnt = rpc_get_mount();
	if (IS_ERR(mnt))
		return PTR_ERR(mnt);
	ret = vfs_path_lookup(mnt->mnt_root, mnt, "/cache", 0, &nd);
	if (ret)
		goto err;
	ret = sunrpc_cache_register_pipefs(nd.path.dentry,
			cd->name, 0600, cd);
	path_put(&nd.path);
	if (!ret)
		return ret;
err:
	rpc_put_mount();
	return ret;
}

void nfs_cache_unregister(struct cache_detail *cd)
{
	sunrpc_cache_unregister_pipefs(cd);
	rpc_put_mount();
}
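
The deferred request helpers in cache_lib.c follow a reference-counting pattern: the allocator hands the caller one reference, the defer callback takes a second one on behalf of the cache, the revisit callback signals completion and drops that second reference, and whichever put comes last frees the object. The userspace sketch below restates that lifetime with C11 atomics. All names are hypothetical, a plain flag stands in for the kernel completion, and nothing here is kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct nfs_cache_defer_req. */
struct sketch_defer_req {
	atomic_int count;	/* number of outstanding references */
	bool completed;		/* stands in for the kernel completion */
};

/* Allocate a request that starts with one reference, owned by the caller. */
static struct sketch_defer_req *sketch_req_alloc(void)
{
	struct sketch_defer_req *dreq = calloc(1, sizeof(*dreq));

	if (dreq)
		atomic_init(&dreq->count, 1);
	return dreq;
}

/* Take an extra reference, as the defer callback does for the cache. */
static void sketch_req_get(struct sketch_defer_req *dreq)
{
	atomic_fetch_add(&dreq->count, 1);
}

/* Drop one reference; returns true if this call freed the object. */
static bool sketch_req_put(struct sketch_defer_req *dreq)
{
	if (atomic_fetch_sub(&dreq->count, 1) == 1) {
		free(dreq);
		return true;
	}
	return false;
}

/* Mark the request complete, then drop the cache's reference. */
static bool sketch_req_revisit(struct sketch_defer_req *dreq)
{
	dreq->completed = true;
	return sketch_req_put(dreq);
}
```

Because the waiter and the cache each hold their own reference, the object stays valid even if the waiter times out and drops its reference before the cache revisits the request.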
