Commits on Nov 18, 2018
  1. merge branch 'pr-1930'

    cyphar committed Nov 18, 2018
      add missing intelRdt parameters in 'runc update' manpage
    
    LGTMs: @crosbymichael @cyphar
    Closes #1930
Commits on Nov 13, 2018
  1. merge branch 'pr-1928'

    cyphar committed Nov 13, 2018
      rootless: fix potential panic in shouldUseRootlessCgroupManager
    
    LGTMs: @crosbymichael @cyphar
    Closes #1928
Commits on Nov 2, 2018
  1. merge branch 'pr-1923'

    cyphar committed Nov 2, 2018
      readme: add nokmem build tag
    
    LGTMs: @crosbymichael @cyphar
    Closes #1923
Commits on Nov 1, 2018
  1. merge branch 'pr-1921'

    cyphar committed Nov 1, 2018
      libcontainer: ability to compile without kmem
    
    LGTMs: @mrunalp @cyphar
    Closes #1921
Commits on Oct 24, 2018
  1. merge branch 'pr-1903'

    cyphar committed Oct 24, 2018
      clarify license information
    
    LGTMs: @hqhq @cyphar
    Closes #1903
Commits on Oct 23, 2018
  1. libcontainer: implement CLONE_NEWCGROUP

    cyphar authored and crosbymichael committed Apr 26, 2016
    This is a very simple implementation because it doesn't require any
    configuration unlike the other namespaces, and in its current state it
    only masks paths.
    
    This feature is available in Linux 4.6+ and is enabled by default for
    kernels compiled with CONFIG_CGROUPS=y.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
    Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
Commits on Oct 13, 2018
  1. merge branch 'pr-1908'

    cyphar committed Oct 13, 2018
      fix build break
    
    LGTMs: @crosbymichael @cyphar
    Closes #1908
Commits on Oct 8, 2018
  1. merge branch 'pr-1894'

    cyphar committed Oct 8, 2018
      Move spec.Linux.IntelRdt check to spec.Linux != nil block
    
    LGTMs: @crosbymichael @cyphar
    Closes #1894
Commits on Sep 21, 2018
  1. tty: clean up epollConsole closing

    cyphar committed Sep 21, 2018
    ec0d23a ("tty: close epollConsole on errors") fixed a significant
    issue, but the cleanup was not ideal (especially if the function is
    changed in future to add additional error conditions to those currently
    present). Using the defer-named-error trick avoids this issue and makes
    the code more readable.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Sep 19, 2018
  1. merge branch 'pr-1817'

    cyphar committed Sep 19, 2018
      Fix duplicate entries and missing entries in getCgroupMountsHelper
      Add test for testing cgroup mounts on bedrock linux
      Stop relying on number of subsystems for cgroups
    
    LGTMs: @crosbymichael @cyphar
    Closes #1817
Commits on Sep 17, 2018
  1. keyring: handle ENOSYS with keyctl(KEYCTL_JOIN_SESSION_KEYRING)

    cyphar committed Sep 17, 2018
    While all modern kernels (and I do mean _all_ of them -- this syscall
    was added in 2.6.10 before git had begun development!) have support for
    this syscall, LXC has a default seccomp profile that returns ENOSYS for
    this syscall. For most syscalls this would be a deal-breaker, and our
    use of session keyrings is security-based, but there are a few mitigating
    factors that make this change not-completely-insane:
    
      * We already have a flag that disables the use of session keyrings
        (for older kernels that had system-wide keyring limits and so
        on). So disabling it is not a new idea.
    
      * While the primary justification of using session keys *is*
        security-based, it's more of a security-by-obscurity protection.
        The main defense keyrings have is VFS credentials -- which is
        something that users already have better security tools for
        (setuid(2) and user namespaces).
    
      * Given the security justification you might argue that we
        shouldn't silently ignore this. However, the only ways for the
        kernel to return -ENOSYS are being ridiculously old (at
        which point we wouldn't work anyway) or having a seccomp
        profile in place that blocks it.
    
        Given that the seccomp profile (if malicious) could very easily
        just return 0 or a silly return code (or something even more
        clever with seccomp-bpf) and trick us without this patch, there
        isn't much of a significant change in how much seccomp can trick
        us with or without this patch.
    
    Given all of that over-analysis, I'm pretty convinced there isn't a
    security problem in this very specific case and it will help out the
    ChromeOS folks by allowing Docker to run inside their LXC container
    setup. I'd be happy to be proven wrong.
    
    Ref: https://bugs.chromium.org/p/chromium/issues/detail?id=860565
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Aug 15, 2018
  1. merge branch 'pr-1867'

    cyphar committed Aug 15, 2018
      Revert "libcontainer/rootfs_linux: minor cleanup"
    
    LGTMs: @hqhq @cyphar
    Closes #1867
Commits on Jul 5, 2018
  1. merge branch 'pr-1836'

    cyphar committed Jul 5, 2018
      Add osusergo flag to static build
    
    LGTMs: @crosbymichael @cyphar
    Closes #1836
Commits on Jun 24, 2018
  1. docs: add information about terminals

    cyphar and deitch committed Feb 22, 2018
    Users can get very confused by how terminals work with runc, and the
    quite confusing "terminal: ..." option. Add a document which goes
    through all of the important parts of terminal handling in runc, in the
    hopes that we can just point people to this as an explanation.
    
    Signed-off-by: Avi Deitcher <avi@deitcher.net>
    [cyphar: quite a large rewrite to fix factual errors and structure]
    Co-authored-by: Avi Deitcher <avi@deitcher.net>
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Jun 18, 2018
  1. libcontainer: improve "kernel.{domainname,hostname}" sysctl handling

    cyphar committed Jun 18, 2018
    These sysctls are namespaced by CLONE_NEWUTS, and we need to use
    "kernel.domainname" if we want users to be able to set an NIS domainname
    on Linux. However we disallow "kernel.hostname" because it would
    conflict with the "hostname" field and cause confusion (but we include a
    helpful message to make it clearer to the user).
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
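
The rule described above might look something like the following validation sketch. `checkSysctl` and its `hasUTSNamespace` parameter are hypothetical. Requiring a new UTS namespace for "kernel.domainname" is an assumption drawn from the statement that these sysctls are namespaced by CLONE_NEWUTS.

```go
package main

import "fmt"

// checkSysctl is a hypothetical validator covering only the two sysctls
// discussed in the commit message.
func checkSysctl(key string, hasUTSNamespace bool) error {
	switch key {
	case "kernel.domainname":
		// Assumption: only safe when the container has its own UTS namespace.
		if !hasUTSNamespace {
			return fmt.Errorf("sysctl %q requires a new UTS namespace", key)
		}
		return nil
	case "kernel.hostname":
		// Rejected because it would conflict with the "hostname" field.
		return fmt.Errorf("sysctl %q is not allowed: use the \"hostname\" field instead", key)
	default:
		return fmt.Errorf("sysctl %q is not covered by this sketch", key)
	}
}

func main() {
	fmt.Println(checkSysctl("kernel.domainname", true))
	fmt.Println(checkSysctl("kernel.hostname", true))
}
```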
Commits on Jun 17, 2018
  1. libcontainer: devices: fix mips builds

    cyphar committed Jun 17, 2018
    It turns out that MIPS uses uint32 for the device number returned by
    stat(2), so explicitly convert everything to make the compiler happy. I
    really wish that Go had C-like numeric type promotion.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
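
To illustrate the type issue, here is a small sketch. `statNarrow` and `statWide` are hypothetical stand-ins for the per-architecture stat structures, and `major`/`minor` follow the usual Linux device-number encoding. Go will not pass a uint32 where a uint64 is expected, so the conversion has to be written out.

```go
package main

import "fmt"

// Hypothetical stand-ins: the device field is 32 bits wide on some
// architectures and 64 bits wide on others.
type statNarrow struct{ Rdev uint32 }
type statWide struct{ Rdev uint64 }

// major extracts the major number from a Linux device number.
func major(dev uint64) uint32 {
	m := uint32((dev & 0x00000000000fff00) >> 8)
	m |= uint32((dev & 0xfffff00000000000) >> 32)
	return m
}

// minor extracts the minor number from a Linux device number.
func minor(dev uint64) uint32 {
	m := uint32(dev & 0x00000000000000ff)
	m |= uint32((dev & 0x00000ffffff00000) >> 12)
	return m
}

func main() {
	narrow := statNarrow{Rdev: 0x0801}
	wide := statWide{Rdev: 0x0801}

	// The explicit uint64(...) conversion is required for the narrow field
	// and is a harmless no-op for the wide one, so the same line compiles on
	// both kinds of architecture.
	fmt.Println(major(uint64(narrow.Rdev)), minor(uint64(narrow.Rdev)))
	fmt.Println(major(uint64(wide.Rdev)), minor(uint64(wide.Rdev)))
}
```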
Commits on Jun 15, 2018
  1. merge branch 'pr-1816'

    cyphar committed Jun 15, 2018
      runc: not require uid/gid mappings if euid()==0
    
    LGTMs: @mrunalp @cyphar
    Closes #1816
Commits on Jun 4, 2018
  1. merge branch 'pr-1812'

    cyphar committed Jun 4, 2018
      Fix race in runc exec
    
    LGTMs: @dqminh @cyphar
    Closes #1812
Commits on May 25, 2018
  1. cgroup: clean up isIgnorableError for skippable EROFS

    cyphar committed May 25, 2018
    Include a rootless argument for isIgnorableError to avoid people
    accidentally using isIgnorableError when they shouldn't (we don't ignore
    any errors when running as root as that really isn't safe).
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Mar 19, 2018
  1. rootless: set sticky bit if using XDG_RUNTIME_DIR

    cyphar committed Mar 18, 2018
    According to the XDG specification[1], in order to avoid the possibility of
    our container states being auto-pruned every 6 hours we need to set the
    sticky bit. Rather than handling all of the users of --root, we just
    create the directory and set the sticky bit during detection, as it's
    not expensive.
    
    [1]: https://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Mar 17, 2018
  1. libcontainer: handle unset oomScoreAdj correctly

    cyphar committed Mar 16, 2018
    Previously if oomScoreAdj was not set in config.json we would implicitly
    set oom_score_adj to 0. This is not allowed according to the spec:
    
    > If oomScoreAdj is not set, the runtime MUST NOT change the value of
    > oom_score_adj.
    
    Change this so that we do not modify oom_score_adj if oomScoreAdj is not
    present in the configuration. While this modifies our internal
    configuration types, the on-disk format is still compatible.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
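
One way to distinguish "unset" from "set to 0" is a pointer field, sketched below. The `process` struct is a simplified, hypothetical stand-in for the configuration types, and `oomScoreAdjToWrite` is an illustrative helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// process is a simplified stand-in for the relevant part of config.json.
// A nil pointer means the field was absent.
type process struct {
	OOMScoreAdj *int `json:"oomScoreAdj,omitempty"`
}

// oomScoreAdjToWrite returns the value to write to oom_score_adj and whether
// anything should be written at all.
func oomScoreAdjToWrite(raw []byte) (value int, write bool, err error) {
	var p process
	if err := json.Unmarshal(raw, &p); err != nil {
		return 0, false, err
	}
	if p.OOMScoreAdj == nil {
		// Not set: leave oom_score_adj untouched.
		return 0, false, nil
	}
	return *p.OOMScoreAdj, true, nil
}

func main() {
	for _, raw := range []string{`{}`, `{"oomScoreAdj": 0}`, `{"oomScoreAdj": -500}`} {
		v, write, err := oomScoreAdjToWrite([]byte(raw))
		fmt.Println(raw, "->", v, write, err)
	}
}
```

An explicit 0 is still written, which is what separates it from the unset case.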
  2. rootless: cgroup: treat EROFS as a skippable error

    cyphar committed Mar 16, 2018
    In some cases, /sys/fs/cgroup is mounted read-only. In rootless
    containers we can consider this effectively identical to having cgroups
    that we don't have write permission to -- because the user isn't
    responsible for the read-only setup and cannot modify it. The rules are
    identical to when /sys/fs/cgroup is not writable by the unprivileged
    user.
    
    An example of this is the default configuration of Docker, where cgroups
    are mounted as read-only as a preventative security measure.
    
    Reported-by: Vladimir Rutsky <rutsky@google.com>
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Mar 7, 2018
  1. merge branch 'pr-1752'

    cyphar committed Mar 7, 2018
      cgroups/fs: fix NPE on Destroy when no cgroups are set
    
    LGTMs: @crosbymichael @cyphar
    Closes #1752
  2. merge branch 'pr-1751'

    cyphar committed Mar 7, 2018
      Minor wording enhancement in readme
    
    LGTMs: @crosbymichael @cyphar
    Closes #1751
Commits on Feb 28, 2018
  1. makefile: make "release" PHONY

    cyphar committed Feb 28, 2018
    This just makes it nicer to do "make release" if you have to do builds
    for more than one release.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Feb 27, 2018
  1. release v1.0.0~rc5

    cyphar committed Feb 27, 2018
      VERSION: back to development
      VERSION: bump to v1.0.0-rc5
    
    Votes: +5 -0 #2
    LGTMs: @crosbymichael @cyphar @dqminh @hqhq @mrunalp
    Closes #1739
  2. VERSION: back to development

    cyphar committed Feb 24, 2018
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
  3. VERSION: bump to v1.0.0-rc5

    cyphar committed Feb 24, 2018
    This is planned to be the last -rc release before 1.0.0.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
  4. merge branch 'pr-1743'

    cyphar committed Feb 27, 2018
      The setupUserNamespace function is always called.
    
    LGTMs: @crosbymichael @mrunalp @cyphar
    Closes #1743
Commits on Jan 25, 2018
  1. nsenter: move namespace creation after userns creation

    cyphar committed Jan 8, 2018
    Technically, this change should not be necessary, as the kernel
    documentation claims that if you call clone(flags|CLONE_NEWUSER), the
    new user namespace will be the owner of all other namespaces created in
    @flags. Unfortunately this isn't always the case, due to various
    additional semantics and kernel bugs.
    
    One particular instance is SELinux, which acts very strangely towards
    the IPC namespace and mqueue. If you unshare the IPC namespace *before*
    you map a user in the user namespace, the IPC namespace's internal
    kern-mount for mqueue will be labelled incorrectly and the container
    won't be able to access it. The only way of solving this is to unshare
    IPC *after* the user has been mapped and we have changed to that user.
    I've also heard of this happening to the NET namespace while talking to
    some LXC folks, though I haven't personally seen that issue.
    
    This change makes our handling of user namespaces match how LXC
    handles these problems.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Nov 27, 2017
  1. merge branch 'pr-1661'

    cyphar committed Nov 27, 2017
      Ensure container tests do not write on the host
    
    LGTMs: @hqhq @cyphar
    Closes #1661
Commits on Oct 24, 2017
  1. tests: add various !terminal tests

    cyphar committed Mar 4, 2017
    Previously we weren't testing that detached io works properly -- which
    will be quite important for rootless containers.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
  2. init: correctly handle unmapped stdio with multiple mappings

    cyphar committed Oct 24, 2017
    Previously we would handle the "unmapped stdio" case by just doing a
    simple check, however this didn't handle cases where the overflow_uid
    was actually mapped in the user namespace. Instead of doing some
    userspace checks, just try to do the fchown(2) and ignore EINVAL
    (unmapped) or EPERM (lacking privilege over inode) errors.
    
    Signed-off-by: Aleksa Sarai <asarai@suse.de>
Commits on Oct 18, 2017
  1. merge branch 'pr-1615'

    cyphar committed Oct 18, 2017
      libcontainer: intelrdt: fix a GetStats() issue
    
    LGTMs: @crosbymichael @cyphar
    Closes #1615
Commits on Oct 16, 2017
  1. merge branch 'pr-1453'

    cyphar committed Oct 16, 2017
      propagate argv0 when re-execing from /proc/self/exe
    
    LGTMs: @crosbymichael @cyphar
    Closes #1453