set default seccomp profile #18780

Merged
merged 4 commits into from Dec 29, 2015

Contributor

jessfraz commented Dec 18, 2015

This provides a default seccomp profile... I am still testing with various images, ping @rhatdan

Contributor

jessfraz commented Dec 18, 2015

also @ewindisch who might have ideas

Contributor

rhatdan commented Dec 18, 2015

I think Matt has a larger black list.

Contributor

jessfraz commented Dec 18, 2015

great thanks!

Contributor

diogomonica commented Dec 18, 2015

This is awesome. Is there a way we can run the top X most popular images on Docker Hub to see if this works there?

Contributor

jessfraz commented Dec 18, 2015

I'm already doing this...

Contributor

ewindisch commented Dec 19, 2015

Design LGTM. I'm okay expanding the blacklist slowly as this gets testing. Particularly, I'd like to see blacklists for certain calls to clone(2) (creating user namespaces in particular).

Contributor

jessfraz commented Dec 19, 2015

Why? The man page says you don't need privileges to create a userns, and I actually rely on this to run Chrome in a container without privileges ;) So I'm just curious. Obviously Chrome is an exception lol, so I'd be willing to break my container.

Contributor

ewindisch commented Dec 19, 2015

We arguably break user namespaces already by denying mount. The reason we block mount is related to why I don't think user namespaces should be allowed to be created inside of containers. Honestly, if I had more trust in the kernel then it should be fine.

Contributor

jessfraz commented Dec 19, 2015

True, I can add clone and unshare. I specifically excluded these because I was being biased hehe 0:)

Contributor

ewindisch commented Dec 19, 2015

I'd just block the CLONE_NEW* flags. The syscall itself is fine.
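
The idea of filtering on flags rather than on the syscall can be sketched as a bitmask test. This is only an illustration, not the code in this PR: the helper name `requestsNewNamespace` is hypothetical, and the flag values are the ones defined in the Linux `sched.h` header (CLONE_NEWNS is included here as well, which is an assumption about what "CLONE_NEW*" should cover).

```go
package main

import "fmt"

// Namespace-creating clone(2) flags, values from <linux/sched.h>.
const (
	cloneNewNS   uint64 = 0x00020000
	cloneNewUTS  uint64 = 0x04000000
	cloneNewIPC  uint64 = 0x08000000
	cloneNewUser uint64 = 0x10000000
	cloneNewPID  uint64 = 0x20000000
	cloneNewNet  uint64 = 0x40000000

	// namespaceMask is the union of every CLONE_NEW* flag listed above.
	namespaceMask = cloneNewNS | cloneNewUTS | cloneNewIPC |
		cloneNewUser | cloneNewPID | cloneNewNet
)

// requestsNewNamespace reports whether a clone(2) flags word asks for at
// least one new namespace. Hypothetical helper, not part of Docker.
func requestsNewNamespace(flags uint64) bool {
	return flags&namespaceMask != 0
}

func main() {
	// CLONE_VM (0x100) plus SIGCHLD (17 on x86) as an ordinary clone call.
	const cloneVM, sigchld uint64 = 0x00000100, 17

	fmt.Println(requestsNewNamespace(cloneVM | sigchld))      // ordinary clone
	fmt.Println(requestsNewNamespace(cloneVM | cloneNewUser)) // asks for a userns
}
```

With a test like this, plain thread or process creation keeps working and only calls that carry a namespace flag are rejected.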

Contributor

jessfraz commented Dec 19, 2015

The tests failed, but in the best way possible lol:

----------------------------------------------------------------------
FAIL: docker_cli_run_test.go:2856: DockerSuite.TestRunUnshareProc

docker_cli_run_test.go:2862:
    c.Fatalf("unshare with --mount-proc should have failed with permission denied, got: %s, %v", out, err)
... Error: unshare with --mount-proc should have failed with permission denied, got: unshare: mount /proc failed: Operation not permitted
, exit status 1


----------------------------------------------------------------------

I'll update the test hahaha
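
The failure above looks like a wording mismatch: the test apparently looks for a "permission denied" message, while the tool reported "Operation not permitted". One way a test could accept either refusal is sketched below. The helper name `looksLikeDenied` is hypothetical and this is not the actual fix that went into the test suite.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeDenied reports whether command output indicates that the kernel
// refused the operation, whether it was worded as EACCES ("permission
// denied") or EPERM ("operation not permitted"). Hypothetical helper.
func looksLikeDenied(out string) bool {
	lower := strings.ToLower(out)
	return strings.Contains(lower, "permission denied") ||
		strings.Contains(lower, "operation not permitted")
}

func main() {
	// Output copied from the failing test run above.
	out := "unshare: mount /proc failed: Operation not permitted\n"
	fmt.Println(looksLikeDenied(out))
}
```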

Contributor

rhatdan commented Dec 21, 2015

A use case of running an X Window app inside of a container is waaaayyyy too loose. Allowing an app to connect to the X server allows it to totally attack the user session. We should lock down for server use cases and then loosen for a desktop use case.

SELinux would totally block any access to the desktop/X session. BTW, once we get to Wayland, we could start to consider containers on the desktop, like xdg-app is attempting.

Contributor

jessfraz commented Dec 21, 2015

Ya I mean I didn't mind breaking it by any means

Contributor

mheon commented Dec 21, 2015

As @rhatdan noted, I do have a larger default blocklist configured. A few syscalls from it that we should consider blocking by default in this PR:

  • kexec_file_load - Sister syscall of kexec_load that does the same thing, slightly different arguments
  • uselib - Older syscall related to shared libraries, unused for a long time
  • add_key, keyctl, request_key - Prevent containers from using the kernel keyring, which is not namespaced
  • ioperm, iopl - Prevent containers from modifying kernel I/O privilege levels. Already restricted as containers drop CAP_SYS_RAWIO by default.
  • modify_ldt - Old syscall only used in 16-bit code, and a potential information leak
  • adjtimex - Similar to clock_settime and settimeofday
  • mbind, get_mempolicy, set_mempolicy, move_pages, migrate_pages - Terrifying syscalls that modify kernel memory and NUMA settings. They're gated by CAP_SYS_NICE, which we do not retain by default in containers.
  • reboot - Probably a bad idea to let containers reboot the host
  • perf_event_open, lookup_dcookie - Tracing/profiling syscalls which could leak a lot of information on the host
  • open_by_handle_at - Cause of an old container breakout in Docker, might as well restrict it to be on the safe side
  • quotactl, acct - Quota and accounting syscalls which could let containers disable their own resource limits or process accounting
  • personality - Prevent container from enabling BSD emulation. Not inherently dangerous, but poorly tested, potential for a lot of kernel vulns in this.

One other note: now that a default Seccomp ruleset exists, it's a good idea to make privileged containers automatically disable Seccomp (clear the ruleset before starting the container) to preserve their previous behavior.
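
A list like the one above could be kept as data, with the reason stored next to each syscall so that someone debugging a denied call later can see why it is blocked. The sketch below covers only a few of the entries, and the names `blockedSyscalls` and `reasonFor` are hypothetical. It is not how the PR's profile is actually structured.

```go
package main

import (
	"fmt"
	"sort"
)

// blockedSyscalls maps a syscall name to the reason it is denied.
// Only a subset of the list above is shown; the layout is an assumption.
var blockedSyscalls = map[string]string{
	"kexec_file_load":   "sister syscall of kexec_load",
	"keyctl":            "kernel keyring is not namespaced",
	"reboot":            "containers should not reboot the host",
	"open_by_handle_at": "cause of an old container breakout",
	"personality":       "BSD emulation is poorly tested",
}

// reasonFor returns the recorded reason and whether the syscall is blocked.
func reasonFor(name string) (string, bool) {
	reason, ok := blockedSyscalls[name]
	return reason, ok
}

func main() {
	// Sort the names so the listing is stable between runs.
	names := make([]string, 0, len(blockedSyscalls))
	for name := range blockedSyscalls {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		fmt.Printf("%-18s %s\n", name, blockedSyscalls[name])
	}
}
```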

Contributor

rhatdan commented Dec 21, 2015

I agree.

--privileged should turn off ALL security: user namespaces and seccomp. That is what users expect. Anything less and we will need to add lots of other customizations. We should probably also add an option like

--security-opt seccomp:disable

if we don't have one now.

Matt, didn't we have a list of seccomp settings to block access to old network stuff like DECnet, AppleTalk, ...?

Also, is there a way to block access to floppy drives? I know that this was used for a breakout against KVM/QEMU.
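
The rule discussed here (privileged containers run without the default profile, and an explicit option can switch it off too) can be sketched as a small decision function. Everything below is hypothetical: the `containerOpts` type and its fields are made up for illustration, and `SeccompDisable` merely stands in for the option spelling proposed in the comment above.

```go
package main

import "fmt"

// containerOpts holds the two inputs from the discussion. Hypothetical type.
type containerOpts struct {
	Privileged     bool // container started in privileged mode
	SeccompDisable bool // stands in for the proposed seccomp "disable" option
}

// seccompEnabled reports whether the default seccomp profile would be
// applied under the rule described above.
func seccompEnabled(o containerOpts) bool {
	return !o.Privileged && !o.SeccompDisable
}

func main() {
	cases := []containerOpts{
		{},
		{Privileged: true},
		{SeccompDisable: true},
	}
	for _, c := range cases {
		fmt.Printf("privileged=%-5v disable=%-5v -> seccomp=%v\n",
			c.Privileged, c.SeccompDisable, seccompEnabled(c))
	}
}
```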

Contributor

jessfraz commented Dec 21, 2015

Awesome, I'll run the tests on all official images with your list, thanks.

And wrt privileged, what you noted is the behavior of this PR :)

Contributor

mheon commented Dec 21, 2015

@jfrazelle Cool, good to hear that it's already handled!

@rhatdan Yeah, I should still have that socket conditional block somewhere, let me see about finding it.

Contributor

jessfraz commented Dec 21, 2015

I really appreciate the comments on each as well. I think I will add them to the PR next to each syscall, in case someone comes back and needs to debug later.

thaJeztah added this to the 1.10 milestone Dec 21, 2015

Contributor

jessfraz commented Dec 21, 2015

OK, so I added a cool little C binary that clones a new userns, etc., and it's in the integration tests.

Contributor

jessfraz commented Dec 22, 2015

ok the current profile works on all the official images

Contributor

jessfraz commented Dec 22, 2015

so docker-py is unhappy... will look into it

daemon/execdriver/native/seccomp_default.go (outdated diff)
},
{
// Deny cloning new namespaces
Name: "clone",

mheon Dec 22, 2015

Contributor

It looks like this can be combined with the previous Clone entry - just add the argument here to the array in the first one

jessfraz Dec 22, 2015

Contributor

oh ya sorry meant to clean that up

On Tue, Dec 22, 2015 at 10:04 AM, Matthew Heon wrote:

In daemon/execdriver/native/seccomp_default.go:

    +           {
    +               // flags from sched.h
    +               // CLONE_NEWUTS     0x04000000
    +               // CLONE_NEWIPC     0x08000000
    +               // CLONE_NEWUSER    0x10000000
    +               // CLONE_NEWPID     0x20000000
    +               // CLONE_NEWNET     0x40000000
    +               Index: 0,
    +               Value: uint64(0x04000000),
    +               Op:    configs.GreaterThanOrEqualTo,
    +           },
    +       },
    +   },
    +   {
    +       // Deny cloning new namespaces
    +       Name:   "clone",
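
To see what a rule of the quoted shape would match, the sketch below evaluates it by hand. It assumes the rule is a plain unsigned comparison of the first clone argument against 0x04000000, which is how the quoted fields read; the merged code may differ. The function name `deniedByThreshold` is hypothetical and the flag values come from the Linux `sched.h` header.

```go
package main

import "fmt"

// clone(2) flag values from <linux/sched.h>.
const (
	cloneVM      uint64 = 0x00000100
	cloneNewNS   uint64 = 0x00020000
	cloneNewUTS  uint64 = 0x04000000
	cloneNewUser uint64 = 0x10000000
	cloneIO      uint64 = 0x80000000
)

// deniedByThreshold mimics the quoted rule: deny when argument 0 is
// greater than or equal to 0x04000000. Hypothetical helper.
func deniedByThreshold(flags uint64) bool {
	return flags >= cloneNewUTS
}

func main() {
	cases := []struct {
		name  string
		flags uint64
	}{
		{"CLONE_VM", cloneVM},
		{"CLONE_NEWUSER", cloneNewUser},
		{"CLONE_NEWNS", cloneNewNS},
		{"CLONE_IO", cloneIO},
	}
	for _, c := range cases {
		fmt.Printf("%-14s 0x%08x denied=%v\n", c.name, c.flags, deniedByThreshold(c.flags))
	}
}
```

Under that assumption, a threshold catches the five listed flags, lets CLONE_NEWNS through when it is used alone because its value is lower, and also rejects CLONE_IO, which is not a namespace flag. A masked comparison would avoid both effects.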

Contributor

jessfraz commented Dec 22, 2015

ping @shin- idk about docker-py :(

Contributor

shin- commented Dec 22, 2015

Does this PR change the behavior of stop? The tests currently expect a stopped container to have a non-zero exit code, but it looks like this is not the case here.

Contributor

jessfraz commented Dec 22, 2015

Hmm, I don't think so, because I tested the command docker run busybox sleep 9999 with the PR and it seems fine, but I will try again.

Contributor

jessfraz commented Dec 22, 2015

docker run -d busybox sleep 9999
docker stop amazing_ritchie
echo $?
0
Contributor

shin- commented Dec 22, 2015

Sorry, to be more accurate, the test stops the container, then inspects it, and tests the value of State.ExitCode with the expectation that it has a non-zero value.

With 1.9.0:

$ docker run -d busybox sleep 9999
c13f57f9915e7978be19206e19352b28cb11dc879b889c3c6286f487ef82e73f
$ docker stop -t 1 c13
c13
$ docker inspect c13 | grep ExitCode
        "ExitCode": 137,
$
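
The check described here can be sketched in Go by decoding the inspect JSON instead of grepping it. In `docker inspect` output the exit code sits under the `State` object of each array entry. The sample document below is abbreviated and made up for illustration, and the `inspectEntry` type and `exitCodes` function are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// inspectEntry models only the part of an inspect result needed here.
type inspectEntry struct {
	State struct {
		ExitCode int `json:"ExitCode"`
	} `json:"State"`
}

// exitCodes decodes an inspect JSON array and returns each entry's exit code.
func exitCodes(raw []byte) ([]int, error) {
	var entries []inspectEntry
	if err := json.Unmarshal(raw, &entries); err != nil {
		return nil, err
	}
	codes := make([]int, 0, len(entries))
	for _, e := range entries {
		codes = append(codes, e.State.ExitCode)
	}
	return codes, nil
}

func main() {
	// Abbreviated, made-up sample shaped like the output shown above.
	sample := []byte(`[{"State": {"Running": false, "ExitCode": 137}}]`)

	codes, err := exitCodes(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(codes)
}
```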
Contributor

jessfraz commented Dec 23, 2015

Hmmm, still seems to be zero. I'll try to pull the docker-py tests and run them locally.

Contributor

jessfraz commented Dec 23, 2015

oh just kidding it expects a non-zero exit code :/

Contributor

shin- commented Dec 23, 2015

Yes. Is that an incorrect assumption? I don't know if other people rely on it being that way or not.

Contributor

jessfraz commented Dec 23, 2015

Weird thing is I don't know why this would make it change haha, I'll look into it.

Contributor

jessfraz commented Dec 23, 2015

OK, so here is my explanation:
The old behavior produced 137 (128 + 9), which means the process was terminated by a SIGKILL. The way docker stop works is to try a SIGTERM and then try a SIGKILL if that fails, see: https://github.com/docker/docker/blob/master/daemon/stop.go#L50

The SIGTERM is actually working now, hence the exit code 0... so that's all I have, but it seems harmless.
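
The 137 comes from the usual convention that a process killed by a signal is reported with exit status 128 plus the signal number. A minimal sketch of that arithmetic follows. The function name `shellExitCode` is hypothetical, and the printed values assume Linux signal numbering.

```go
package main

import (
	"fmt"
	"syscall"
)

// shellExitCode returns the conventional exit status reported for a process
// that was terminated by the given signal: 128 plus the signal number.
func shellExitCode(sig syscall.Signal) int {
	return 128 + int(sig)
}

func main() {
	fmt.Println(shellExitCode(syscall.SIGKILL)) // killed after the stop timeout
	fmt.Println(shellExitCode(syscall.SIGTERM)) // killed by the first signal
}
```

A process that handles SIGTERM and exits on its own reports whatever status it chooses, which can be 0, so this convention applies only when the signal itself ends the process.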

@shin-

shin- Dec 23, 2015

Contributor

@jfrazelle 57512760c83fbe41302891aa51e34a86f4db74de docker/docker-py@5751276

@jessfraz

jessfraz Dec 23, 2015

Contributor

@shin- you da bomb thanks so much

this PR should be good to go now and I hope we can merge it sooner rather
than later mostly so we can have people test on master / experimental and
make sure nothing breaks :)

@tophj-ibm

tophj-ibm Dec 23, 2015

Contributor

@jfrazelle: your unshare image has an old version of unshare that has a bug causing the "Operation not permitted" error. I just tested this with the latest version of unshare and it threw the normal error.

@jessfraz

jessfraz Dec 23, 2015

Contributor

we specifically use that version of unshare for the bug for testing the apparmor profile but I have added additional tests with a debian:jessie image :)

@jessfraz

jessfraz Dec 23, 2015

Contributor

I actually am trying to remove the jess/unshare image altogether and find another way to test the apparmor profile, but I think it can be a different PR. I know @tianon wants this :)

@tophj-ibm

tophj-ibm Dec 23, 2015

Contributor

ahh interesting I didn't know that. tmyk :)

@tianon

tianon Dec 23, 2015

Member

testRequires(c, SameHostDaemon, seccompEnabled)
// apt-key uses setrlimit & getrlimit, so we want to make sure we don't break it
runCmd := exec.Command(dockerBinary, "run", "debian:jessie", "apt-key", "adv", "--keyserver", "hkp://p80.pool.sks-keyservers.net:80", "--recv-keys", "E871F18B51E0147C77796AC81196BA81F6B0FC61")

@jessfraz

jessfraz Dec 24, 2015

Contributor

added a little integration test for this, because this little bugger was hard to track down :P
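
For anyone who wants to check the same thing without pulling apt-key into the picture, a minimal sketch of an rlimit round trip is shown below. It is an assumption-laden probe, not the integration test from this PR: `probeRlimit` is a hypothetical name, and depending on Go version and architecture the runtime may issue `prlimit64` rather than `getrlimit`/`setrlimit` directly, so a pass here does not prove those exact syscalls are allowed.

```go
package main

import (
	"fmt"
	"syscall"
)

// probeRlimit (hypothetical helper) reads the open-files limit and writes
// the same value back. Nothing changes on success; a seccomp profile that
// denies the underlying syscalls would surface here as an error.
func probeRlimit() error {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return fmt.Errorf("getrlimit: %v", err)
	}
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return fmt.Errorf("setrlimit: %v", err)
	}
	return nil
}

func main() {
	if err := probeRlimit(); err != nil {
		fmt.Println("rlimit calls blocked or failed:", err)
		return
	}
	fmt.Println("rlimit round trip succeeded")
}
```

Running a binary like this inside a container would be one way to narrow a failure down to the rlimit calls before reaching for strace.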

@jessfraz

jessfraz Dec 24, 2015

Contributor

It's green! Early xmas gift? I solemnly swear to strace down anything bad if we break something, but this works on all official images and I tried a bunch of my weird images :P

@cpuguy83

cpuguy83 Dec 24, 2015

Contributor

The only thing I think may be a problem is mount.

Also note that these unshare tests fail on 1.9 as is, so not sure they are good tests.

@cpuguy83

cpuguy83 Dec 24, 2015

Contributor

And when I say the tests fail, I mean the actual docker run commands fail, not those specific tests.

@jessfraz

jessfraz Dec 24, 2015

Contributor

Ya that's why I added the userns tests because the unshare tests should be covered by apparmor

@jessfraz

jessfraz Dec 24, 2015

Contributor

But I can add more

@ewindisch

ewindisch Dec 24, 2015

Contributor

@jfrazelle for complete coverage the unshare/apparmor tests we have today should disable seccomp to ensure that AppArmor is blocking those calls, not seccomp, and then have a second test for apparmor-disabled/seccomp-enabled. We may similarly need to disable seccomp for other AppArmor and SELinux tests to make sure the tests pass for the right reason.
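
The two-sided test idea above can be sketched as a pair of `docker run` invocations, each with one mechanism switched off so a denial can only come from the other. This is a hypothetical illustration, not the test code in this PR: `isolationArgs` is an invented helper, the `unshare --user true` command is a placeholder, and the `NAME:unconfined` spelling of `--security-opt` is an assumption based on the syntax of this era (newer Docker releases also accept `NAME=unconfined`).

```go
package main

import (
	"fmt"
	"strings"
)

// isolationArgs (hypothetical helper) builds `docker run` arguments with
// one security mechanism ("seccomp" or "apparmor") set to unconfined, so
// that only the other mechanism can be responsible for a denial.
func isolationArgs(disable, image string, cmd ...string) []string {
	args := []string{"run", "--security-opt", disable + ":unconfined", image}
	return append(args, cmd...)
}

func main() {
	// AppArmor-only check: seccomp is disabled.
	apparmorOnly := isolationArgs("seccomp", "jess/unshare", "unshare", "--user", "true")
	// Seccomp-only check: AppArmor is disabled.
	seccompOnly := isolationArgs("apparmor", "debian:jessie", "unshare", "--user", "true")

	fmt.Println("docker", strings.Join(apparmorOnly, " "))
	fmt.Println("docker", strings.Join(seccompOnly, " "))
}
```

In an integration test each slice would be handed to something like `exec.Command(dockerBinary, args...)`, with the assertion that the command fails in both configurations.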

@jessfraz

jessfraz Dec 24, 2015

Contributor

Ah will do I can add ones that test just apparmor and just seccomp ;)

@cpuguy83

cpuguy83 Dec 24, 2015

Contributor

With 1.9 I get permission denied on these unshare tests without apparmor or seccomp, so it doesn't really seem to be testing that seccomp is working, is what I'm trying to say.

@ewindisch

ewindisch Dec 24, 2015

Contributor

@cpuguy83 it could be that your kernel doesn't have the feature.

@jessfraz

jessfraz Dec 24, 2015

Contributor

Ah ok, ya, then it isn't testing apparmor either. I'll redo them

jessfraz added some commits Dec 18, 2015

set default seccomp profile
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
add default seccomp profile tests
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
add docs and unconfined to run a container without the default seccomp profile
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
bump docker-py
Signed-off-by: Jessica Frazelle <acidburn@docker.com>
@jessfraz

jessfraz Dec 28, 2015

Contributor

ok i updated the tests so for unshare only apparmor is being tested for some, and only seccomp is being tested for others. the only tests that use the old jess/unshare image are the apparmor tests so then we can use debian jessie for everything else

@cpuguy83

cpuguy83 Dec 28, 2015

Contributor

LGTM

@calavera

calavera Dec 29, 2015

Contributor

LGTM

calavera added a commit that referenced this pull request Dec 29, 2015

@calavera calavera merged commit 78ce43b into moby:master Dec 29, 2015

6 checks passed

docker/dco-signed All commits signed
documentation success 2 tests run, 0 skipped, 0 failed.
experimental Jenkins build Docker-PRs-experimental 12744 has succeeded
janky Jenkins build Docker-PRs 21539 has succeeded
userns Jenkins build Docker-PRs-userns 3994 has succeeded
windows Jenkins build Windows-PRs 19329 has succeeded

@jessfraz jessfraz deleted the jessfraz:seccomp-default branch Dec 29, 2015

@jessfraz

jessfraz Dec 29, 2015

Contributor

thankssss

@jessfraz

jessfraz Dec 29, 2015

Contributor

this is now up on https://master.dockerproject.org and soon will be in experimental.docker.com for testing :)

@phemmer

phemmer Dec 29, 2015

Contributor

This PR is causing issues for me. #18946

},
{
// meta, deny seccomp
Name: "seccomp",

@cyphar

cyphar Dec 29, 2015

Contributor

@jfrazelle AFAICS, this is not necessary. Seccomp does not allow a process running under a seccomp context to remove restrictions (otherwise it would be a pointless security feature). A Docker container could only make its seccomp context more restrictive (which I'm not sure if we should be blocking).
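
The "can only get more restrictive" property can be modelled in a few lines. The sketch below is a simplified, hypothetical model and not kernel or Docker code: it assumes that every installed filter is consulted for each syscall and that the most restrictive verdict wins, and it only covers a subset of the real seccomp actions. The type and function names are invented for illustration.

```go
package main

import "fmt"

// Action is a simplified seccomp verdict. Lower values are more
// restrictive and take precedence when filters are stacked.
type Action int

const (
	Kill Action = iota
	Trap
	Errno
	Trace
	Allow
)

// Filter maps syscall names to verdicts; unlisted syscalls get Default.
type Filter struct {
	Default Action
	Rules   map[string]Action
}

func (f Filter) verdict(name string) Action {
	if a, ok := f.Rules[name]; ok {
		return a
	}
	return f.Default
}

// effective returns the verdict across all installed filters: the most
// restrictive one wins, so adding a filter can never loosen the result.
func effective(stack []Filter, name string) Action {
	result := Allow
	for _, f := range stack {
		if v := f.verdict(name); v < result {
			result = v
		}
	}
	return result
}

func main() {
	// Filter installed by the container runtime (hypothetical contents).
	runtimeFilter := Filter{Default: Allow, Rules: map[string]Action{"request_key": Errno}}
	// Filter the application installs on itself later, trying to allow everything.
	appFilter := Filter{Default: Allow, Rules: map[string]Action{"request_key": Allow}}

	stack := []Filter{runtimeFilter, appFilter}
	fmt.Println("request_key still denied:", effective(stack, "request_key") == Errno)
}
```

In this model the application's permissive filter has no effect on `request_key`, which is the point being made: a contained process can add restrictions but cannot lift the ones it started with.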

@jessfraz

jessfraz Dec 29, 2015

Contributor

I feel like if an app is setting its own seccomp rules, the dev can and should make their own custom seccomp profile... Who knows if down the line this will have a bypass. Also, you can't load a new apparmor profile inside an unprivileged container; even though this is different and can only restrict it more, I don't trust it

On Dec 29, 2015, 07:23 -0800, Aleksa Sarai notifications@github.com wrote:

In daemon/execdriver/native/seccomp_default.go (#18780 (comment)):

{
    // Probably a bad idea to let containers restart
    Name:   "restart_syscall",
    Action: configs.Errno,
    Args:   []*configs.Arg{},
},
{
    // Prevent containers from using the kernel keyring,
    // which is not namespaced
    Name:   "request_key",
    Action: configs.Errno,
    Args:   []*configs.Arg{},
},
{
    // meta, deny seccomp
    Name:   "seccomp",
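
To make the shape of the quoted hunk easier to follow, here is a self-contained sketch of a deny-list profile and how a lookup against it might behave. The types below are local stand-ins that only mirror the field names visible in the hunk (`Name`, `Action`, `Args`); they are assumptions, not the real libcontainer `configs` package, and `denyList` and `actionFor` are hypothetical helpers.

```go
package main

import "fmt"

// Local stand-ins for the types referenced in the quoted hunk.
type Action int

const (
	Allow Action = iota
	Errno
)

type Arg struct{}

type Syscall struct {
	Name   string
	Action Action
	Args   []*Arg
}

type Seccomp struct {
	DefaultAction Action
	Syscalls      []*Syscall
}

// denyList builds a profile that allows everything except the listed
// syscalls, using the three entries visible in the hunk.
func denyList() *Seccomp {
	return &Seccomp{
		DefaultAction: Allow,
		Syscalls: []*Syscall{
			{Name: "restart_syscall", Action: Errno, Args: []*Arg{}},
			{Name: "request_key", Action: Errno, Args: []*Arg{}},
			{Name: "seccomp", Action: Errno, Args: []*Arg{}},
		},
	}
}

// actionFor returns the action for a syscall name, falling back to the
// profile default when the name is not listed.
func (s *Seccomp) actionFor(name string) Action {
	for _, sc := range s.Syscalls {
		if sc.Name == name {
			return sc.Action
		}
	}
	return s.DefaultAction
}

func main() {
	p := denyList()
	fmt.Println("seccomp denied:", p.actionFor("seccomp") == Errno)
	fmt.Println("read allowed:", p.actionFor("read") == Allow)
}
```

With the `seccomp` entry present, a program that tries to install its own filter through that syscall gets an error instead, which is the behaviour under discussion in this thread.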

@cyphar

cyphar Dec 29, 2015

Contributor

Fair enough, I guess. Just felt I should mention it. ;)

@ewindisch

ewindisch Dec 29, 2015

Contributor

Chrome is a perfect example of an application that sets its own seccomp profiles and won't be setting them via Docker. I think it's a valid use-case (Chrome is likely to be broken for other reasons per this policy, but I digress)

@jessfraz

jessfraz Dec 29, 2015

Contributor

ya i broke chrome when I added clone userns to deny ;)

@rhatdan

rhatdan Dec 29, 2015

Contributor

qemu.

@cyphar

cyphar Dec 29, 2015

Contributor

Firefox also uses seccomp (allegedly). OpenSSH and vsftpd have support for it.

@rmuir

rmuir Dec 29, 2015

chrome of course.

@jessfraz

jessfraz Dec 29, 2015

Contributor

ok lets remove it, anyone want to make a pr or i can

@jessfraz

jessfraz Dec 29, 2015

Contributor

opened #18974

@grossws

grossws Jun 27, 2016

Contributor

Why is it stated that set_mempolicy is gated by CAP_SYS_NICE? I can't find anything about that. Also, adding CAP_SYS_NICE doesn't allow the set_mempolicy syscall.

Simple way to check is to run numactl --interleave=all true in container on NUMA-enabled host with more than one node.
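
The check described above could be scripted roughly as follows. This is only a sketch: `probe` is a hypothetical helper, it assumes `numactl` is installed in the container image, and a failure by itself does not say whether seccomp, a missing binary, or a single-node host is the cause.

```go
package main

import (
	"fmt"
	"os/exec"
)

// probe (hypothetical helper) runs a command and returns its combined
// output together with any error from starting or running it.
func probe(name string, args ...string) (string, error) {
	out, err := exec.Command(name, args...).CombinedOutput()
	return string(out), err
}

func main() {
	// Same command as in the comment above; meant to be run inside the
	// container on a NUMA host with more than one node.
	out, err := probe("numactl", "--interleave=all", "true")
	if err != nil {
		fmt.Printf("numactl failed (set_mempolicy may be denied, or numactl is missing): %v\n%s", err, out)
		return
	}
	fmt.Println("numactl succeeded, so set_mempolicy appears to be permitted")
}
```

Running it once with the default profile and once with seccomp unconfined would show whether the profile, rather than the capability set, is what blocks the call.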
