Skip to content

Commit

Permalink
devpts: Make each mount of devpts an independent filesystem.
Browse files Browse the repository at this point in the history
The /dev/ptmx device node is changed to lookup the directory entry "pts"
in the same directory as the /dev/ptmx device node was opened in.  If
there is a "pts" entry and that entry is a devpts filesystem /dev/ptmx
uses that filesystem.  Otherwise the open of /dev/ptmx fails.

The DEVPTS_MULTIPLE_INSTANCES configuration option is removed, so that
userspace can now safely depend on each mount of devpts creating a new
instance of the filesystem.

Each mount of devpts is now a separate and equal filesystem.

Reserved ttys are now available to all instances of devpts where the
mounter is in the initial mount namespace.

A new vfs helper path_pts is introduced that finds a directory entry
named "pts" in the directory of the passed in path, and changes the
passed in path to point to it.  The helper path_pts uses a function
path_parent_directory that was factored out of follow_dotdot.

In the implementation of devpts:
 - devpts_mnt is killed as it is no longer meaningful if all mounts of
   devpts are equal.
 - pts_sb_from_inode is replaced by just inode->i_sb as all cached
   inodes in the tty layer are now from the devpts filesystem.
 - devpts_add_ref is rolled into the new function devpts_ptmx.  And the
   unnecessary inode hold is removed.
 - devpts_del_ref is renamed devpts_release and reduced to just a
   deacrivate_super.
 - The newinstance mount option continues to be accepted but is now
   ignored.

In devpts_fs.h definitions for when !CONFIG_UNIX98_PTYS are removed as
they are never used.

Documentation/filesystems/devices.txt is updated to describe the current
situation.

This has been verified to work properly on openwrt-15.05, centos5,
centos6, centos7, debian-6.0.2, debian-7.9, debian-8.2, ubuntu-14.04.3,
ubuntu-15.10, fedora23, magia-5, mint-17.3, opensuse-42.1,
slackware-14.1, gentoo-20151225 (13.0?), archlinux-2015-12-01.  With the
caveat that on centos6 and on slackware-14.1 that there wind up being
two instances of the devpts filesystem mounted on /dev/pts, the lower
copy does not end up getting used.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Greg KH <greg@kroah.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Serge Hallyn <serge.hallyn@ubuntu.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: One Thousand Gnomes <gnomes@lxorguk.ukuu.org.uk>
Cc: Jann Horn <jann@thejh.net>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Florian Weimer <fw@deneb.enyo.de>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  • Loading branch information
ebiederm authored and torvalds committed Jun 5, 2016
1 parent 049ec1b commit eedf265
Show file tree
Hide file tree
Showing 7 changed files with 126 additions and 296 deletions.
145 changes: 15 additions & 130 deletions Documentation/filesystems/devpts.txt
Original file line number Diff line number Diff line change
@@ -1,141 +1,26 @@
Each mount of the devpts filesystem is now distinct such that ptys
and their indicies allocated in one mount are independent from ptys
and their indicies in all other mounts.

To support containers, we now allow multiple instances of devpts filesystem,
such that indices of ptys allocated in one instance are independent of indices
allocated in other instances of devpts.
All mounts of the devpts filesystem now create a /dev/pts/ptmx node
with permissions 0000.

To preserve backward compatibility, this support for multiple instances is
enabled only if:
To retain backwards compatibility the a ptmx device node (aka any node
created with "mknod name c 5 2") when opened will look for an instance
of devpts under the name "pts" in the same directory as the ptmx device
node.

- CONFIG_DEVPTS_MULTIPLE_INSTANCES=y, and
- '-o newinstance' mount option is specified while mounting devpts

IOW, devpts now supports both single-instance and multi-instance semantics.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=n, there is no change in behavior and
this referred to as the "legacy" mode. In this mode, the new mount options
(-o newinstance and -o ptmxmode) will be ignored with a 'bogus option' message
on console.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and devpts is mounted without the
'newinstance' option (as in current start-up scripts) the new mount binds
to the initial kernel mount of devpts. This mode is referred to as the
'single-instance' mode and the current, single-instance semantics are
preserved, i.e PTYs are common across the system.

The only difference between this single-instance mode and the legacy mode
is the presence of new, '/dev/pts/ptmx' node with permissions 0000, which
can safely be ignored.

If CONFIG_DEVPTS_MULTIPLE_INSTANCES=y and 'newinstance' option is specified,
the mount is considered to be in the multi-instance mode and a new instance
of the devpts fs is created. Any ptys created in this instance are independent
of ptys in other instances of devpts. Like in the single-instance mode, the
/dev/pts/ptmx node is present. To effectively use the multi-instance mode,
open of /dev/ptmx must be a redirected to '/dev/pts/ptmx' using a symlink or
bind-mount.

Eg: A container startup script could do the following:

$ chmod 0666 /dev/pts/ptmx
$ rm /dev/ptmx
$ ln -s pts/ptmx /dev/ptmx
$ ns_exec -cm /bin/bash

# We are now in new container

$ umount /dev/pts
$ mount -t devpts -o newinstance lxcpts /dev/pts
$ sshd -p 1234

where 'ns_exec -cm /bin/bash' calls clone() with CLONE_NEWNS flag and execs
/bin/bash in the child process. A pty created by the sshd is not visible in
the original mount of /dev/pts.
As an option instead of placing a /dev/ptmx device node at /dev/ptmx
it is possible to place a symlink to /dev/pts/ptmx at /dev/ptmx or
to bind mount /dev/ptx/ptmx to /dev/ptmx. If you opt for using
the devpts filesystem in this manner devpts should be mounted with
the ptmxmode=0666, or chmod 0666 /dev/pts/ptmx should be called.

Total count of pty pairs in all instances is limited by sysctls:
kernel.pty.max = 4096 - global limit
kernel.pty.reserve = 1024 - reserve for initial instance
kernel.pty.reserve = 1024 - reserved for filesystems mounted from the initial mount namespace
kernel.pty.nr - current count of ptys

Per-instance limit could be set by adding mount option "max=<count>".
This feature was added in kernel 3.4 together with sysctl kernel.pty.reserve.
In kernels older than 3.4 sysctl kernel.pty.max works as per-instance limit.

User-space changes
------------------

In multi-instance mode (i.e '-o newinstance' mount option is specified at least
once), following user-space issues should be noted.

1. If -o newinstance mount option is never used, /dev/pts/ptmx can be ignored
and no change is needed to system-startup scripts.

2. To effectively use multi-instance mode (i.e -o newinstance is specified)
administrators or startup scripts should "redirect" open of /dev/ptmx to
/dev/pts/ptmx using either a bind mount or symlink.

$ mount -t devpts -o newinstance devpts /dev/pts

followed by either

$ rm /dev/ptmx
$ ln -s pts/ptmx /dev/ptmx
$ chmod 666 /dev/pts/ptmx
or
$ mount -o bind /dev/pts/ptmx /dev/ptmx

3. The '/dev/ptmx -> pts/ptmx' symlink is the preferred method since it
enables better error-reporting and treats both single-instance and
multi-instance mounts similarly.

But this method requires that system-startup scripts set the mode of
/dev/pts/ptmx correctly (default mode is 0000). The scripts can set the
mode by, either

- adding ptmxmode mount option to devpts entry in /etc/fstab, or
- using 'chmod 0666 /dev/pts/ptmx'

4. If multi-instance mode mount is needed for containers, but the system
startup scripts have not yet been updated, container-startup scripts
should bind mount /dev/ptmx to /dev/pts/ptmx to avoid breaking single-
instance mounts.

Or, in general, container-startup scripts should use:

mount -t devpts -o newinstance -o ptmxmode=0666 devpts /dev/pts
if [ ! -L /dev/ptmx ]; then
mount -o bind /dev/pts/ptmx /dev/ptmx
fi

When all devpts mounts are multi-instance, /dev/ptmx can permanently be
a symlink to pts/ptmx and the bind mount can be ignored.

5. A multi-instance mount that is not accompanied by the /dev/ptmx to
/dev/pts/ptmx redirection would result in an unusable/unreachable pty.

mount -t devpts -o newinstance lxcpts /dev/pts

immediately followed by:

open("/dev/ptmx")

would create a pty, say /dev/pts/7, in the initial kernel mount.
But /dev/pts/7 would be invisible in the new mount.

6. The permissions for /dev/pts/ptmx node should be specified when mounting
/dev/pts, using the '-o ptmxmode=%o' mount option (default is 0000).

mount -t devpts -o newinstance -o ptmxmode=0644 devpts /dev/pts

The permissions can be later be changed as usual with 'chmod'.

chmod 666 /dev/pts/ptmx

7. A mount of devpts without the 'newinstance' option results in binding to
initial kernel mount. This behavior while preserving legacy semantics,
does not provide strict isolation in a container environment. i.e by
mounting devpts without the 'newinstance' option, a container could
get visibility into the 'host' or root container's devpts.

To workaround this and have strict isolation, all mounts of devpts,
including the mount in the root container, should use the newinstance
option.
11 changes: 0 additions & 11 deletions drivers/tty/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -120,17 +120,6 @@ config UNIX98_PTYS
All modern Linux systems use the Unix98 ptys. Say Y unless
you're on an embedded system and want to conserve memory.

config DEVPTS_MULTIPLE_INSTANCES
bool "Support multiple instances of devpts"
depends on UNIX98_PTYS
default n
---help---
Enable support for multiple instances of devpts filesystem.
If you want to have isolated PTY namespaces (eg: in containers),
say Y here. Otherwise, say N. If enabled, each mount of devpts
filesystem with the '-o newinstance' option will create an
independent PTY namespace.

config LEGACY_PTYS
bool "Legacy (BSD) PTY support"
default y
Expand Down
15 changes: 8 additions & 7 deletions drivers/tty/pty.c
Original file line number Diff line number Diff line change
Expand Up @@ -668,7 +668,7 @@ static void pty_unix98_remove(struct tty_driver *driver, struct tty_struct *tty)
else
fsi = tty->link->driver_data;
devpts_kill_index(fsi, tty->index);
devpts_put_ref(fsi);
devpts_release(fsi);
}

static const struct tty_operations ptm_unix98_ops = {
Expand Down Expand Up @@ -733,10 +733,11 @@ static int ptmx_open(struct inode *inode, struct file *filp)
if (retval)
return retval;

fsi = devpts_get_ref(inode, filp);
retval = -ENODEV;
if (!fsi)
fsi = devpts_acquire(filp);
if (IS_ERR(fsi)) {
retval = PTR_ERR(fsi);
goto out_free_file;
}

/* find a device that is not in use. */
mutex_lock(&devpts_mutex);
Expand All @@ -745,7 +746,7 @@ static int ptmx_open(struct inode *inode, struct file *filp)

retval = index;
if (index < 0)
goto out_put_ref;
goto out_put_fsi;


mutex_lock(&tty_mutex);
Expand Down Expand Up @@ -789,8 +790,8 @@ static int ptmx_open(struct inode *inode, struct file *filp)
return retval;
out:
devpts_kill_index(fsi, index);
out_put_ref:
devpts_put_ref(fsi);
out_put_fsi:
devpts_release(fsi);
out_free_file:
tty_free_file(filp);
return retval;
Expand Down

0 comments on commit eedf265

Please sign in to comment.