Commits on Aug 16, 2018
  1. signal: Don't send signals to tasks that don't exist

    ebiederm committed Aug 16, 2018
    Recently syzbot reported crashes in send_sigio_to_task and
    send_sigurg_to_task in linux-next.  Despite finding a reproducer
    syzbot apparently did not bisect this or otherwise track down the
    offending commit in linux-next.
    I happened to see this report and examined the code because I had
    recently changed these functions as part of making PIDTYPE_TGID a real
    pid type so that fork does not need to restart when receiving a
    signal.  By examination I spotted a bug in the code
    that could explain the reported crashes.
    When I took Oleg's suggestion and optimized send_sigurg and send_sigio
    to only send to a single task when type is PIDTYPE_PID or PIDTYPE_TGID
    I failed to handle pids that no longer point to tasks.  The macro
    do_each_pid_task simply iterates for zero iterations.  With pid_task
    an explicit NULL test is needed.
    Update the code to include the missing NULL test.
    Fixes: 0191913 ("signal: Use PIDTYPE_TGID to clearly store where file signals will be sent")
    Signed-off-by: "Eric W. Biederman" <>
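The difference between iterating over the tasks attached to a pid and looking up a single task can be sketched in user space. This is a minimal model, not the kernel code: `struct task`, `struct pid`, `lookup_task` and `send_to_single_task` are hypothetical stand-ins invented for illustration. A loop over zero tasks is harmless by construction, while a single lookup can return NULL and must be tested.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
struct task { int signals_sent; };
struct pid  { struct task *task; /* NULL once the task has exited */ };

/* Stand-in for a single-task lookup: may return NULL. */
static struct task *lookup_task(const struct pid *pid)
{
    return pid ? pid->task : NULL;
}

/* Returns the number of tasks signalled (0 or 1). */
static int send_to_single_task(const struct pid *pid)
{
    struct task *p = lookup_task(pid);

    if (!p)             /* the kind of test the fix adds */
        return 0;
    p->signals_sent++;
    return 1;
}
```

With a pid that no longer points to a task, `send_to_single_task` returns 0 instead of dereferencing NULL.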
Commits on Aug 9, 2018
  1. signal: Don't restart fork when signals come in.

    ebiederm committed Jul 23, 2018
    Wen Yang <> and majiang <>
    report that a periodic signal received during fork can cause fork to
    continually restart preventing an application from making progress.
    The code was being overly pessimistic.  Fork needs to guarantee that a
    signal sent to multiple processes is logically delivered before the
    fork and just to the forking process or logically delivered after the
    fork to both the forking process and its newly spawned child.  For
    signals like periodic timers that are always delivered to a single
    process fork can safely complete and let them appear to be logically
    delivered after the fork().
    While examining this issue I also discovered that fork today will miss
    signals delivered to multiple processes during the fork and handled by
    another thread.  Similarly the current code will also miss blocked
    signals that are delivered to multiple processes, as those signals will
    not appear pending during fork.
    Add a list of each thread that is currently forking, and keep on that
    list a signal set that records all of the signals sent to multiple
    processes.  When fork completes initialize the new process's
    shared_pending signal set with it.  The calculate_sigpending function
    will see those signals and set TIF_SIGPENDING causing the new task to
    take the slow path to userspace to handle those signals.  Making it
    appear as if those signals were received immediately after the fork.
    It is not possible to send real time signals to multiple processes and
    exceptions don't go to multiple processes, which means that there are
    no signals sent to multiple processes that require siginfo.  This
    means it is safe to not bother collecting siginfo on signals sent
    during fork.
    The sigaction of a child of fork is initially the same as the
    sigaction of the parent process.  So a signal the parent ignores the
    child will also initially ignore.  Therefore it is safe to ignore
    signals sent to multiple processes and ignored by the forking process.
    Signals sent to only a single process or only a single thread and delivered
    during fork are treated as if they are received after the fork, and generally
    not dealt with.  They won't cause any problems.
    V2: Added removal from the multiprocess list on failure.
    V3: Use -ERESTARTNOINTR directly
    V4: - Don't queue both SIGCONT and SIGSTOP
        - Initialize signal_struct.multiprocess in init_task
        - Move setting of shared_pending to before the new task
          is visible to signals.  This prevents signals from coming
          in before shared_pending.signal is set to delayed.signal
          and being lost.
    V5: - rework list add and delete to account for idle threads
    V6: - Use sigdelsetmask when removing stop signals
    Reported-by: Wen Yang <> and
    Reported-by: majiang <>
    Fixes: 4a2c7a7 ("[PATCH] make fork() atomic wrt pgrp/session signals")
    Signed-off-by: "Eric W. Biederman" <>
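The bookkeeping described in this commit message can be modelled with plain bitmasks. The sketch below is an illustration under stated assumptions, not the kernel implementation: `sigmask_t`, `struct forking_task`, `note_signal_during_fork` and `child_initial_shared_pending` are hypothetical names, and one bit per signal number stands in for a real signal set. It shows the three rules from the text: single-target signals are not recorded, signals the parent ignores are dropped, and everything else seeds the child's shared pending set.

```c
#include <stdint.h>

/* Hypothetical model: one bit per signal number 1..63. */
typedef uint64_t sigmask_t;
#define SIG_BIT(sig) ((sigmask_t)1 << ((sig) - 1))

struct forking_task {
    sigmask_t ignored;   /* signals the forking process currently ignores */
    sigmask_t delayed;   /* multiprocess signals seen during the fork */
};

/* Record a signal that arrives while the fork is in progress. */
static void note_signal_during_fork(struct forking_task *f, int sig,
                                    int multiprocess)
{
    if (!multiprocess)               /* single target: treated as after the fork */
        return;
    if (f->ignored & SIG_BIT(sig))   /* the child would ignore it as well */
        return;
    f->delayed |= SIG_BIT(sig);
}

/* When the fork completes, the child's shared pending set starts as 'delayed'. */
static sigmask_t child_initial_shared_pending(const struct forking_task *f)
{
    return f->delayed;
}
```

No siginfo is stored in this model, which mirrors the argument in the message that signals sent to multiple processes never require it.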
Commits on Aug 4, 2018
  1. fork: Have new threads join on-going signal group stops

    ebiederm committed Jul 23, 2018
    There are only two signals that are delivered to every member of a
    signal group: SIGSTOP and SIGKILL.  Signal delivery requires every
    signal appear to be delivered either before or after a clone syscall.
    SIGKILL terminates the clone so does not need to be considered.  Which
    leaves only SIGSTOP that needs to be considered when creating new
    Today in the event of a group stop TIF_SIGPENDING will get set and the
    fork will restart ensuring the fork syscall participates in the group
    A fork (especially of a process with a lot of memory) is one of the
    most expensive system calls so we really only want to restart a fork when
    It is easy to check to see if a SIGSTOP is ongoing and have the new
    thread join it immediately after the clone completes.  Making it appear
    the clone completed just before the SIGSTOP.
    The calculate_sigpending function will see the bits set in jobctl and
    set TIF_SIGPENDING to ensure the new task takes the slow path to userspace.
    V2: The call to task_join_group_stop was moved before the new task is
        added to the thread group list.  This should not matter as
        sighand->siglock is held over both the addition of the threads,
        the call to task_join_group_stop and do_signal_stop.  But the change
        is trivial and it is one less thing to worry about when reading
        the code.
    Signed-off-by: "Eric W. Biederman" <>
  2. fork: Skip setting TIF_SIGPENDING in ptrace_init_task

    ebiederm committed Aug 4, 2018
    The code in calculate_sigpending will now handle this so
    it is just redundant and possibly a little confusing
    to continue setting TIF_SIGPENDING in ptrace_init_task.
    Suggested-by: Oleg Nesterov <>
    Signed-off-by: "Eric W. Biederman" <>
  3. signal: Add calculate_sigpending()

    ebiederm committed Jul 23, 2018
    Add a function calculate_sigpending to test to see if any signals are
    pending for a new task immediately following fork.  Signals have to
    happen either before or after fork.  Today our practice is to push
    all of the signals to before the fork, but that has the downside that
    frequent or periodic signals can make fork take much much longer than
    normal or prevent fork from completing entirely.
    So we need to move the signals that we can to after the fork to prevent that.
    This updates the code to set TIF_SIGPENDING on a new task if there
    are signals or other activities that have moved so that they appear
    to happen after the fork.
    As the code today restarts if it sees any such activity this won't
    immediately have an effect, as there will be no reason for it
    to set TIF_SIGPENDING immediately after the fork.
    Adding calculate_sigpending means the code in fork can safely be
    changed to not always restart if a signal is pending.
    The new calculate_sigpending function sets sigpending if there
    are pending bits in jobctl, pending signals, the freezer needs
    to freeze the new task or the live kernel patching framework
    needs the new thread to take the slow path to userspace.
    I have verified that setting TIF_SIGPENDING does make a new process
    take the slow path to userspace before it executes its first userspace
    I have looked at the callers of signal_wake_up and the code paths
    setting TIF_SIGPENDING and I don't see anything else that needs to be
    handled.  The code probably doesn't need to set TIF_SIGPENDING for the
    kernel live patching as it uses a separate thread flag as well.  But
    at this point it seems safer to reuse the recalc_sigpending logic and get
    the kernel live patching folks to sort out their story later.
    V2: I have moved the test into schedule_tail where siglock can
        be grabbed and recalc_sigpending can be reused directly.
        Further as the last action of setting up a new task this
        guarantees that TIF_SIGPENDING will be properly set in the
        new process.
        The helper calculate_sigpending takes the siglock and
        unconditionally sets TIF_SIGPENDING and lets recalc_sigpending
        clear TIF_SIGPENDING if it is unnecessary.  This allows reusing
        the existing code and keeps maintenance of the conditions simple.
        Oleg Nesterov <>  suggested the movement
        and pointed out the need to take siglock if this code
        was going to be called while the new task is discoverable.
    Signed-off-by: "Eric W. Biederman" <>
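The "set unconditionally, then let the recalculation clear it" pattern from the V2 note can be illustrated with a small model. Everything here is hypothetical: `struct new_task`, `recalc` and `calculate` are stand-ins, the boolean fields approximate the conditions listed in the message (the live patching condition is left out for brevity), and locking is reduced to a comment.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the recalculation consults. */
struct new_task {
    bool jobctl_pending;
    bool signals_pending;
    bool freezing;
    bool flag_sigpending;   /* models TIF_SIGPENDING */
};

/* Models the recalculation: keep the flag only if some condition needs it. */
static void recalc(struct new_task *t)
{
    if (!(t->jobctl_pending || t->signals_pending || t->freezing))
        t->flag_sigpending = false;
}

/* Set the flag unconditionally, then let recalc clear it if unnecessary. */
static void calculate(struct new_task *t)
{
    /* The message says the real helper takes siglock around this sequence. */
    t->flag_sigpending = true;
    recalc(t);
}
```

The appeal of this shape is that the list of conditions lives in one place, so the new helper cannot drift out of sync with the existing recalculation.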
Commits on Jul 23, 2018
  1. fork: Unconditionally exit if a fatal signal is pending

    ebiederm committed Jul 23, 2018
    In practice this does not change anything as testing for fatal_signal_pending
    and exiting with an error code duplicates the work of the next clause
    which recalculates pending signals and then exits fork if any are pending.
    In both cases the pending signal will trigger the slow path when exiting
    to userspace, and the fatal signal will cause do_exit to be called.
    The advantage of making this a separate test is that it makes it clear
    processing the fatal signal will terminate the fork, and it allows the
    rest of the signal logic to be updated without fear that this important
    case will be lost.
    Signed-off-by: "Eric W. Biederman" <>
  2. fork: Move and describe why the code examines PIDNS_ADDING

    ebiederm committed Jul 13, 2018
    Normally this would be something that would be handled by handling
    signals that are sent to a group of processes but in this case the
    forking process is not a member of the group being signaled.  Thus
    special code is needed to prevent a race with pid namespaces exiting,
    and fork adding new processes within them.
    Move this test up before the signal restart just in case signals are
    also pending.  Fatal conditions should take precedence over restarts.
    Signed-off-by: "Eric W. Biederman" <>
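The ordering argued for in the two commits above, fatal conditions first and restarts last, can be sketched as a chain of early returns. This is a simplified model: `enum fork_outcome`, `struct fork_state` and `check_before_commit` are hypothetical, the real code returns kernel error codes rather than an enum, and the relative order of the two fatal checks is an assumption of this sketch. Only the rule that both come before the restart check is taken from the text.

```c
/* Hypothetical outcomes; the real code returns kernel error codes. */
enum fork_outcome {
    FORK_PROCEED,
    FORK_FAIL_PIDNS,
    FORK_FAIL_FATAL,
    FORK_RESTART
};

struct fork_state {
    int fatal_signal_pending;
    int pidns_exiting;      /* pid namespace no longer accepts new processes */
    int signal_pending;     /* a signal that would force a restart */
};

static enum fork_outcome check_before_commit(const struct fork_state *s)
{
    /* Fatal conditions are tested first (their mutual order is assumed). */
    if (s->pidns_exiting)
        return FORK_FAIL_PIDNS;
    if (s->fatal_signal_pending)
        return FORK_FAIL_FATAL;

    /* Only then is a restart considered. */
    if (s->signal_pending)
        return FORK_RESTART;

    return FORK_PROCEED;
}
```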
Commits on Jul 21, 2018
  1. signal: Push pid type down into complete_signal.

    ebiederm committed Jul 14, 2018
    This is the bottom and by pushing this down it simplifies the callers
    and otherwise leaves things as is.  This is in preparation for allowing
    fork to implement better handling of signals sent to groups of processes.
    Signed-off-by: "Eric W. Biederman" <>
  2. signal: Push pid type down into __send_signal

    ebiederm committed Jul 14, 2018
    This information is already available in the callers and by pushing it
    down it makes the code a little clearer, and allows implementing
    better handling of signals sent to a group of processes in fork.
    Signed-off-by: "Eric W. Biederman" <>
  3. signal: Push pid type down into send_signal

    ebiederm committed Jul 20, 2018
    This information is already available in the callers and by pushing it
    down it makes the code a little clearer, and allows better group
    signal behavior in fork.
    Signed-off-by: "Eric W. Biederman" <>
  4. signal: Pass pid type into do_send_sig_info

    ebiederm committed Jul 21, 2018
    This passes the information we already have at the call site into
    do_send_sig_info.  Ultimately allowing for better handling of signals
    sent to a group of processes during fork.
    Signed-off-by: "Eric W. Biederman" <>
  5. signal: Pass pid type into send_sigio_to_task & send_sigurg_to_task

    ebiederm committed Jul 21, 2018
    This information is already present and using it directly simplifies the logic
    of the code.
    Signed-off-by: "Eric W. Biederman" <>
  6. signal: Pass pid type into group_send_sig_info

    ebiederm committed Jul 13, 2018
    This passes the information we already have at the call site
    into group_send_sig_info.  Ultimately allowing for better handling of
    signals sent to a group of processes.
    Signed-off-by: "Eric W. Biederman" <>
  7. signal: Pass pid and pid type into send_sigqueue

    ebiederm committed Jul 20, 2018
    Make the code more maintainable by performing more of the signal
    related work in send_sigqueue.
    A quick inspection of do_timer_create will show that this code path
    does not lookup a thread group by a thread's pid.  Making it safe
    to find the task pointed to by it_pid with "pid_task(it_pid, type)";
    This supports the changes needed in fork to tell if a signal was sent
    to a single process or a group of processes.
    Having the pid to task transition in signal.c will also make it easier
    to sort out races with de_thread and the thread group leader
    exiting when it comes time to address that.
    Signed-off-by: "Eric W. Biederman" <>
  8. posix-timers: Normalize good_sigevent

    ebiederm committed Jul 21, 2018
    In good_sigevent directly compute the default return value as
    "task_tgid(current)".  This is exactly the same as
    "task_pid(current->group_leader)" but written more clearly.
    In the thread case first compute the thread's pid.  Then verify that
    attached to that pid is a thread of the current thread group.
    This has the net effect of making the code a little clearer, and
    making it obvious that posix timers never look up a process by the
    pid of a thread.
    Signed-off-by: "Eric W. Biederman" <>
  9. signal: Use PIDTYPE_TGID to clearly store where file signals will be …

    ebiederm committed Jul 17, 2017
    When f_setown is called a pid and a pid type are stored.  Replace the use
    of PIDTYPE_PID with PIDTYPE_TGID as PIDTYPE_TGID goes to the entire thread
    group.  Replace the use of PIDTYPE_MAX with PIDTYPE_PID as PIDTYPE_PID now
    is only for a thread.
    Update the users of __f_setown to use PIDTYPE_TGID instead of
    For now the code continues to capture task_pid (when task_tgid would
    really be appropriate), and iterate on PIDTYPE_PID (even when type ==
    PIDTYPE_TGID) out of an abundance of caution to preserve existing
    Oleg Nesterov suggested using the test to ensure we use PIDTYPE_PID
    for tgid lookup also be used to avoid taking the tasklist lock.
    Suggested-by: Oleg Nesterov <>
    Signed-off-by: "Eric W. Biederman" <>
  10. pid: Implement PIDTYPE_TGID

    ebiederm committed Jun 4, 2017
    Everywhere except in the pid array we distinguish between a task's pid and
    a task's tgid (thread group id).  Even in the enumeration we want that
    distinction sometimes so we have added __PIDTYPE_TGID.  With leader_pid
    we almost have an implementation of PIDTYPE_TGID in struct signal_struct.
    Add PIDTYPE_TGID as a first class member of the pid_type enumeration and
    into the pids array.  Then remove the __PIDTYPE_TGID special case and the
    leader_pid in signal_struct.
    The net size increase is just an extra pointer added to struct pid and
    an extra pair of pointers of an hlist_node added to task_struct.
    The effect on code maintenance is the removal of a number of special
    cases today and the potential to remove many more special cases as
    PIDTYPE_TGID gets used to its fullest.  The long term potential
    is allowing zombie thread group leaders to exit, which will remove
    a lot more special cases in the code.
    Signed-off-by: "Eric W. Biederman" <>
  11. pids: Move the pgrp and session pid pointers from task_struct to sign…

    ebiederm committed Sep 26, 2017
    To access these fields the code always has to go to group leader so
    going to signal struct is no loss and is actually a fundamental simplification.
    This saves a little bit of memory by only allocating the pid pointer array
    once instead of once for every thread, and even better this removes a
    few potential races caused by the fact that group_leader can be changed
    by de_thread, while signal_struct can not.
    Signed-off-by: "Eric W. Biederman" <>
  12. kvm: Don't open code task_pid in kvm_vcpu_ioctl

    ebiederm committed Jul 17, 2017
    Signed-off-by: "Eric W. Biederman" <>
  13. pids: Compute task_tgid using signal->leader_pid

    ebiederm committed Sep 26, 2017
    The cost is the same and this removes the need
    to worry about complications that come from de_thread
    and group_leader changing.
    __task_pid_nr_ns has been updated to take advantage of this change.
    Signed-off-by: "Eric W. Biederman" <>
  14. pids: Move task_pid_type into sched/signal.h

    ebiederm committed May 5, 2017
    The function is general and inline so there is no need
    to hide it inside of exit.c
    Signed-off-by: "Eric W. Biederman" <>
  15. pids: Initialize leader_pid in init_task

    ebiederm committed May 5, 2017
    This is cheap and no cost so we might as well.
    Signed-off-by: "Eric W. Biederman" <>
Commits on May 31, 2018
  1. fuse: Allow fully unprivileged mounts

    ebiederm authored and Miklos Szeredi committed May 29, 2018
    Now that the fuse and the vfs work is complete, allow the fuse filesystem
    to be mounted by the root user in a user namespace.
    Signed-off-by: "Eric W. Biederman" <>
    Signed-off-by: Miklos Szeredi <>
  2. fuse: Ensure posix acls are translated outside of init_user_ns

    ebiederm authored and Miklos Szeredi committed May 4, 2018
    Ensure the translation happens by failing to read or write
    posix acls when the filesystem has not indicated it supports
    posix acls.
    This ensures that modern cached posix acl support is available
    and used when dealing with posix acls.  This is important
    because only that path has the code to convert the uids and
    gids in posix acls into the user namespace of a fuse filesystem.
    Signed-off-by: "Eric W. Biederman" <>
    Signed-off-by: Miklos Szeredi <>
Commits on May 29, 2018
  1. signal/sh: Stop gcc warning about an impossible case in do_divide_error

    ebiederm committed May 29, 2018
    Geert Uytterhoeven <> reported:
    >   HOSTLD  scripts/mod/modpost
    >   CC      arch/sh/kernel/traps_32.o
    > arch/sh/kernel/traps_32.c: In function 'do_divide_error':
    > arch/sh/kernel/traps_32.c:606:17: error: 'code' may be used uninitialized in this function [-Werror=uninitialized]
    > cc1: all warnings being treated as errors
    It is clear from inspection that do_divide_error is only called with
    TRAP_DIVZERO_ERROR or TRAP_DIVOVF_ERROR, as that is the way
    set_exception_table_vec is called.  So let gcc know the other cases
    should not be considered by returning in all other cases.
    This removes the warning and lets the code continue to build.
    Reported-by: Geert Uytterhoeven <>
    Fixes: c65626c ("signal/sh: Use force_sig_fault where appropriate")
    Signed-off-by: "Eric W. Biederman" <>
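The shape of this fix can be shown with a small stand-alone example. The names and values below (`TRAP_DIVZERO`, `TRAP_DIVOVF`, `CODE_INTDIV`, `CODE_INTOVF`, `divide_error`, `last_code_sent`) are placeholders chosen for the sketch, not the identifiers or constants used in arch/sh. The point is that returning in the default case means `code` is assigned on every path that reaches its use, so the compiler has nothing to warn about.

```c
/* Hypothetical constants; the values are placeholders for this sketch. */
enum { TRAP_DIVZERO = 1, TRAP_DIVOVF = 2 };
enum { CODE_INTDIV = 10, CODE_INTOVF = 11 };

static int last_code_sent = 0;   /* records what would have been signalled */

static void divide_error(int trap)
{
    int code;

    switch (trap) {
    case TRAP_DIVZERO:
        code = CODE_INTDIV;
        break;
    case TRAP_DIVOVF:
        code = CODE_INTOVF;
        break;
    default:
        /* Never expected; returning here keeps 'code' initialized below. */
        return;
    }
    last_code_sent = code;
}
```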
Commits on May 24, 2018
  1. capabilities: Allow privileged user in s_user_ns to set security.* xa…

    ebiederm committed Apr 22, 2017
    A privileged user in s_user_ns will generally have the ability to
    manipulate the backing store and insert security.* xattrs into
    the filesystem directly. Therefore the kernel must be prepared to
    handle these xattrs from unprivileged mounts, and it makes little
    sense for commoncap to prevent writing these xattrs to the
    filesystem. The capability and LSM code have already been updated
    to appropriately handle xattrs from unprivileged mounts, so it
    is safe to loosen this restriction on setting xattrs.
    The exception to this logic is that writing xattrs to a mounted
    filesystem may also cause the LSM inode_post_setxattr or
    inode_setsecurity callbacks to be invoked. SELinux will deny the
    xattr update by virtue of applying mountpoint labeling to
    unprivileged userns mounts, and Smack will deny the writes for
    any user without global CAP_MAC_ADMIN, so loosening the
    capability check in commoncap is safe in this respect as well.
    Signed-off-by: Seth Forshee <>
    Acked-by: Serge Hallyn <>
    Acked-by: Christian Brauner <>
    Signed-off-by: Eric W. Biederman <>
  2. fs: Allow superblock owner to access do_remount_sb()

    ebiederm committed Sep 18, 2017
    Superblock level remounts are currently restricted to global
    CAP_SYS_ADMIN, as is the path for changing the root mount to
    read only on umount. Loosen both of these permission checks to
    also allow CAP_SYS_ADMIN in any namespace which is privileged
    towards the userns which originally mounted the filesystem.
    Signed-off-by: Seth Forshee <>
    Acked-by: "Eric W. Biederman" <>
    Acked-by: Serge Hallyn <>
    Acked-by: Christian Brauner <>
    Signed-off-by: Eric W. Biederman <>
  3. fs: Allow superblock owner to replace invalid owners of inodes

    ebiederm committed Oct 15, 2016
    Allow users with CAP_CHOWN over the superblock of a filesystem to
    chown files when inode owner is invalid.  Ordinarily the
    capable_wrt_inode_uidgid check is sufficient to allow access to files
    but when the underlying filesystem has uids or gids that don't map to
    the current user namespace it is not enough, so the chown permission
    checks need to be extended to allow this case.
    Calling chown on filesystem nodes whose uid or gid don't map is
    necessary if those nodes are going to be modified as writing back
    inodes which contain uids or gids that don't map is likely to cause
    filesystem corruption of the uid or gid fields.
    Once chown has been called the existing capable_wrt_inode_uidgid
    checks are sufficient to allow the owner of a superblock to do anything
    the global root user can do with an appropriate set of capabilities.
    An ordinary filesystem mountable by a userns root will limit all uids
    and gids in s_user_ns or the INVALID_UID and INVALID_GID to flag all
    others.  So having this added permission limited to just INVALID_UID
    and INVALID_GID is sufficient to handle every case on an ordinary filesystem.
    Of the virtual filesystems at least proc is known to set s_user_ns to
    something other than &init_user_ns, while at the same time presenting
    some files owned by GLOBAL_ROOT_UID.  Those files the mounter of proc
    in a user namespace should not be able to chown to get access to.
    Limiting the relaxation in permission to just the minimum of allowing
    changing INVALID_UID and INVALID_GID prevents problems with cases like
    The original version of this patch was written by: Seth Forshee.  I
    have rewritten and rethought this patch enough so it's really not the
    same thing (certainly it needs a different description), but he
    deserves credit for getting out there and getting the conversation
    started, and finding the potential gotcha's and putting up with my
    semi-paranoid feedback.
    Inspired-by: Seth Forshee <>
    Acked-by: Seth Forshee <>
    Signed-off-by: Eric W. Biederman <>
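The narrow scope of the relaxed check can be illustrated with a toy predicate. This is a sketch under assumptions: `INVALID_ID`, `may_replace_owner_id` and the boolean `has_cap_over_sb` are hypothetical stand-ins for the invalid uid/gid markers and for the capability test against the superblock's user namespace. It only models the extra permission this commit describes, not the ordinary chown checks.

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_ID ((uint32_t)-1)   /* models INVALID_UID / INVALID_GID */

/*
 * current_id:      the uid or gid currently stored on the inode.
 * has_cap_over_sb: the caller holds the chown capability relative to the
 *                  user namespace that owns the superblock (an input to
 *                  this sketch, not computed here).
 */
static bool may_replace_owner_id(uint32_t current_id, bool has_cap_over_sb)
{
    /* Only ids that do not map are covered by the added permission. */
    return current_id == INVALID_ID && has_cap_over_sb;
}
```

A file owned by id 0, such as the proc files mentioned in the message, is not covered: `may_replace_owner_id(0, true)` is false.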
  4. vfs: Allow userns root to call mknod on owned filesystems.

    ebiederm committed May 23, 2018
    These filesystems already always set SB_I_NODEV so mknod will not be
    useful for gaining control of any devices no matter their permissions.
    This will allow overlayfs and applications like fakeroot to use
    device nodes to represent things on disk.
    Acked-by: Seth Forshee <>
    Signed-off-by: "Eric W. Biederman" <>
  5. vfs: Don't allow changing the link count of an inode with an invalid …

    ebiederm committed Sep 14, 2017
    …uid or gid
    Changing the link count of an inode via unlink or link will cause a
    write back of that inode.  If the uids or gids are invalid (aka not known
    to the kernel) writing the inode back may change the uid or gid in the
    filesystem.   To prevent possible filesystem corruption and to avoid the need for
    filesystem maintainers to worry about it don't allow operations on
    inodes with an invalid uid or gid.
    Acked-by: Seth Forshee <>
    Signed-off-by: "Eric W. Biederman" <>
Commits on Apr 28, 2018
  1. signal/um: More carefully relay signals in relay_signal.

    ebiederm committed Apr 16, 2018
    There is a bug in relay_signal.  It assumes that when a signal is
    relayed the signal never uses a signal independent si_code, such
    as SI_USER, SI_KERNEL, SI_QUEUE, ... SI_SIGIO etc.  In practice
    siginfo was assuming it was relaying a signal with the SIL_FAULT
    layout.  As that is the common case for the signals it supported
    that is a reasonable assumption.
    Further user mode linux must be very careful when relaying different
    kinds of signals to prevent an information leak.  This means simply
    increasing the kinds of signals that are handled in relay_signal
    is non-trivial.
    Therefore use siginfo_layout and force_sig_fault to simplify
    the signal relaying in relay_signal.
    By taking advantage of the fact that user mode linux only works
    on x86 and x86_64 we can assume that si_trapno can be ignored,
    and that si_errno is always zero.
    For the signals SIGILL, SIGFPE, SIGSEGV, SIGBUS, and SIGTRAP the only
    fault handler I know of that sets si_errno is SIGTRAP TRAP_HWBKPT on a
    few oddball architectures.  Those architectures have been modified to
    use force_sig_ptrace_errno_trap.
    Similarly only a few architectures set __ARCH_SI_TRAPNO.
    At the point uml supports those architectures again these additional
    cases can be examined and supported if desired in relay_signal.
    Cc: Jeff Dike <>
    Cc: Richard Weinberger <>
    Cc: Anton Ivanov <>
    Cc: Martin Pärtel <>
    Fixes: d3c1cfc ("um: pass siginfo to guest process")
    Signed-off-by: "Eric W. Biederman" <>
Commits on Apr 27, 2018
  1. signal: Extend siginfo_layout with SIL_FAULT_{MCEERR|BNDERR|PKUERR}

    ebiederm committed Apr 25, 2018
    Update the siginfo_layout function and enum siginfo_layout to represent
    all of the possible field layouts of struct siginfo.
    This allows the uses of siginfo_layout in um and arm64 where they are testing
    for SIL_FAULT to be more accurate as this rules out the other cases.
    Further this allows the switch statements on siginfo_layout to be simpler
    if perhaps a little more wordy.  Making it easier to understand what is
    actually going on.
    As SIL_FAULT_BNDERR and SIL_FAULT_PKUERR are never expected to appear
    in signalfd just treat them as SIL_FAULT.  To include them would take
    20 extra bytes and pretty much fill up what is left of
    Signed-off-by: "Eric W. Biederman" <>
  2. signal: Remove unnecessary #ifdef SEGV_PKUERR in 32bit compat code

    ebiederm committed Apr 25, 2018
    The only architecture that does not support SEGV_PKUERR is ia64 and
    ia64 has not had 32bit support since some time in 2008.  Therefore
    copy_siginfo_to_user32 and copy_siginfo_from_user32 do not need to
    include support for a missing SEGV_PKUERR.
    Compile tested on ia64.
    Signed-off-by: "Eric W. Biederman" <>
  3. signal/signalfd: Add support for SIGSYS

    ebiederm committed Apr 25, 2018
    I don't know why signalfd has never grown support for SIGSYS but grow it now.
    This corrects an oversight and removes a need for a default in the
    switch statement.  Allowing gcc to warn when future members are added
    to the enum siginfo_layout, and signalfd does not handle them.
    Signed-off-by: "Eric W. Biederman" <>
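The benefit of dropping the default case can be seen in a minimal example. The enum and function here (`enum layout`, `layout_name`) are hypothetical and much smaller than the real siginfo_layout. With no default label, gcc's -Wswitch (part of -Wall) reports any enumerator that lacks a case, so adding a new member without handling it is flagged at compile time.

```c
/* Hypothetical enum; the real siginfo_layout has more members. */
enum layout { LAYOUT_KILL, LAYOUT_FAULT, LAYOUT_SYS };

static const char *layout_name(enum layout l)
{
    /* No default label: a missing enumerator triggers -Wswitch. */
    switch (l) {
    case LAYOUT_KILL:
        return "kill";
    case LAYOUT_FAULT:
        return "fault";
    case LAYOUT_SYS:
        return "sys";
    }
    return "unknown";   /* reached only for values outside the enum */
}
```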
  4. signal/signalfd: Remove __put_user from signalfd_copyinfo

    ebiederm committed Apr 25, 2018
    Put a signalfd_siginfo structure on the stack, fully initialize
    it, and then copy it to userspace.
    The code is a little less wordy, and this avoids a long series
    of the somewhat costly __put_user calls.
    Signed-off-by: "Eric W. Biederman" <>
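The "build on the stack, copy once" approach can be sketched in user space. This is an illustration only: `struct info_record` is a hypothetical record far smaller than the real signalfd_siginfo, and `copy_out` uses memcpy as a stand-in for a single copy to userspace. Zeroing the whole record first means padding and unused fields are well defined before the copy.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record; the real structure has many more fields. */
struct info_record {
    uint32_t signo;
    int32_t  code;
    uint32_t pid;
    uint8_t  pad[116];
};

/* Stand-in for one copy to userspace; returns 0 on success. */
static int copy_out(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
    return 0;
}

static int copy_info(void *user_buf, uint32_t signo, int32_t code, uint32_t pid)
{
    struct info_record rec;

    memset(&rec, 0, sizeof(rec));   /* every byte is initialized */
    rec.signo = signo;
    rec.code  = code;
    rec.pid   = pid;

    /* One bulk copy replaces a series of per-field stores. */
    return copy_out(user_buf, &rec, sizeof(rec));
}
```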