Add support for --security-opt, --syscall, --ulimit...to swarm mode #25209

mostolog · 2016-07-29T06:23:15Z

Hi

It looks like docker service create doesn't have any kernel configuration options, e.g. --security-opt, --sysctl, --ulimit..., which are sometimes required.
This is stopping us from using swarm mode to deploy ELK 5 on our testing servers.

Could you add at least a --container-args option? For example:
--container-args="--security-opt seccomp=unconfined --ulimit memlock=-1 --ulimit nofile=102400"

If this can already be done somehow, sorry for the mistake. Please let me know how to do it.

Regards.

gittycat · 2016-09-18T12:48:24Z

The --security-opt option is also needed by Elasticsearch. Currently, starting Elasticsearch gives this error: "unable to install syscall filter: seccomp unavailable: your kernel is buggy and you should upgrade".
The workaround given is to start containers using --security-opt=seccomp=unconfined, but that's not available for services.

thaJeztah · 2016-09-19T21:21:21Z

ping @justincormack perhaps you have thoughts on this. As I commented on #25303 (comment) - one challenge will be "where to put the custom profile file" in a Swarm setup (unless the definition is stored in the Swarm service definition)

mostolog · 2016-09-20T07:46:48Z

@thaJeztah Excuse me, but my limited English doesn't allow me to properly understand what you are talking about...
Regarding our needs, each service has their own, so every parameter should be service independent (defined on service create/update), instead of swarm-level

thaJeztah · 2016-09-20T07:52:37Z

@mostolog I was thinking how a custom profile should be set (see https://docs.docker.com/engine/security/seccomp/#/passing-a-profile-for-a-container), because I think docker needs to have access to the file that contains that profile (on each node in the swarm)

mostolog · 2016-09-20T08:11:39Z

Please, let me know if I understood properly, despite my far-too-brief description.

I guess when you specify --security-opt for a container, it inherits the default profile and adds the given parameters at run time. I suppose the same would happen with services.

If services created under swarm are deployed on other swarm nodes, a "Dockerfile" has to be sent to those nodes in order to run them, so this template could be part of the Dockerfile, couldn't it?
Is that what you mean by "stored in the Swarm service definition"?

thaJeztah · 2016-09-20T08:17:41Z

@mostolog no, not a Dockerfile, the (contents of) the profile.json from the example I linked above. Docker (in "swarm mode") stores the definition of services (what command they are running, which options are passed); the profile itself would have to be stored as part of that definition.

mostolog · 2016-09-20T08:21:27Z

@thaJeztah Clear as water. Thanks a lot.
And yes, I agree with you: those parameters should also be part of service definition sent to swarm nodes.

justincormack · 2016-09-20T12:52:59Z

@thaJeztah seccomp is not an issue - the file is not used, the json contents of it are passed by the client to the daemon.
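
To illustrate the point above: as far as I understand the docker CLI, when --security-opt seccomp=<path> is given, the client reads the file locally and sends the compacted JSON in place of the path, so the nodes never need the file itself. A minimal Python sketch of that substitution, assuming this behaviour (resolve_seccomp_opt is a hypothetical helper name, not part of any Docker API):

```python
import json
import tempfile
from pathlib import Path


def resolve_seccomp_opt(opt: str) -> str:
    """Turn 'seccomp=<path>' into 'seccomp=<compact JSON>'.

    Any other option (including 'seccomp=unconfined') is returned unchanged.
    Hypothetical helper for illustration only.
    """
    key, sep, value = opt.partition("=")
    if key != "seccomp" or not sep or value == "unconfined":
        return opt
    profile = json.loads(Path(value).read_text())
    # Compact encoding: only this JSON text is sent on, never the file path.
    return "seccomp=" + json.dumps(profile, separators=(",", ":"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "profile.json"
        path.write_text('{\n  "defaultAction": "SCMP_ACT_ERRNO"\n}\n')
        print(resolve_seccomp_opt(f"seccomp={path}"))
        print(resolve_seccomp_opt("seccomp=unconfined"))
```

If a service definition carried the resolved string, every node would receive the full profile along with the rest of the definition.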

However, this just seems to be a workaround for elasticsearch trying to set its own seccomp profile and failing, which seems really odd. I will look into what the cause is; it looks like a bug in elasticsearch.

justincormack · 2016-09-20T13:08:12Z

The seccomp error in elasticsearch was fixed here elastic/elasticsearch@f77e8a5 - we return EPERM as we already filter the unknown syscalls, they were expecting ENOSYS. I don't think we are that crazy here like the comments suggest. Looks like this will be in 5.0.0 when it is released.

thaJeztah · 2016-09-20T14:29:24Z

@justincormack ah, I was mistaken then, I assumed the file was needed on the daemon side, but thinking more that would only be for a default profile. 😅

PatrickLang · 2017-01-03T20:50:10Z

This is also going to be especially important for Windows containers that need to run as service accounts. We need the --security-opt "credentialspec=..." to be passed through without modifications for this to work.

CC @anweiss @friism

PatrickLang · 2017-01-03T20:52:30Z

--isolation=... is also going to be important. When someone deploys a service, they may need to use --isolation=hyperv for compliance or compatibility reasons. This setting should also be service-specific and not host-wide.

mostolog · 2017-01-13T09:09:54Z

Are --ulimit or --sysctl already implemented in 1.13.0-RC5 for docker service or docker stack? I'm not able to get them working...

cpuguy83 · 2017-01-14T13:16:33Z

@mostolog Nope.

xiaohai2016 · 2017-01-26T12:53:31Z

Are we expecting this issue to be fixed soon? It is really important!

macjl · 2017-02-02T13:54:55Z

I also have this problem. I'm not able to run systemd-based containers without the security_opt option.

ehazlett · 2017-02-10T02:22:43Z

FYI I've opened #30894 to address some of these and would love feedback. If that PR is agreed upon, I'm planning to do the same for "resources" which should address the other things (ulimits, isolation, pids-limit, etc).

titpetric · 2017-02-26T17:10:07Z

I'd love to set --sysctl net.core.somaxconn=4096 somehow on a swarm service. The container the swarm service starts has some kind of default (128), and it doesn't seem to be tunable. Redis, for example, tries to set it to 511 or something, and gives a warning if this can't be set.

1.) I assume --sysctl will be "ported" to service create,
2.) is there some work-around currently?

brandonroyal · 2017-03-02T02:07:32Z

We're seeing lots of asks for use of domain identities using --security-opt "credentialspec=...". Not having this available will be a blocker for using integrated auth for SQL Server (significant blocker for a number of lift&shift .NET apps). Any chance this is being prioritized?

aluzzardi · 2017-03-02T02:14:02Z

/cc @ehazlett @diogomonica @cyli FYI

diogomonica · 2017-03-03T23:31:27Z

@ehazlett and I chatted, we think that this would be a good opportunity to introduce either a secret-type or a good use case for random blobs that have to be delivered to tasks.

For example, this could operate in the following manner:
echo "BLA" | docker secret create --type credential-spec my-cred-spec
and then we could:
docker service create --secrets=my-cred-spec
removing the need for this --security-opt.

We would have to switch on secret types, and then internally pass the contents of that secret to it.

Thoughts @cyli @aaronlehmann @aluzzardi

aluzzardi · 2017-03-03T23:55:22Z

Sorry I don't know what a credential spec is.

Is its content secret in the literal sense?

What's the problem with --security-opt?

diogomonica · 2017-03-04T00:35:49Z

@aluzzardi I don't think we want to propagate any of the security flags of docker run to docker service create

aluzzardi · 2017-03-04T00:53:30Z

But here we are as well - except they're encapsulated into a secret which is even worse to deprecate?

I might be getting off topic, but I think we have to fix docker run rather than considering it totaled and trying to get a better docker service. 99.9% of our users are using docker run.

I think we should really fix docker run and just have a 1:1 mapping with docker service.

If we continue down this path:

- docker run, used by the vast majority, has the wrong security model and there is no incentive to fix this
- docker service lacks basic features that other orchestration platforms, docker run and classic swarm have supported for years
- docker run and docker service get farther apart every time, while in fact we are trying to do the opposite with convergence
- It leads to a subpar UX. You have to learn two products at once. First you experiment with docker run to get your container up and running, then when you want to run it for "real" as a service, you'll soon find the same flags don't work and you have to learn a new way. Which is the worst of both worlds

I believe the number one advantage of built-in orchestration is it feels natural to go from dev (single machine) to prod (cluster) - same tools, same UI, same platform.

However, if we go ahead with this, we're basically creating a fracture where it's going to feel like using different tools.

Let's put ourselves in the shoes of an average user deploying SQL server. You'll probably start by doing a docker run to get things going, tweaking the config, and so on. Then you move to a docker service create (or stack deploy), and you'll notice the CLI spitting out errors like --security-opt: no such flag. Then you have to spend some time on Google, only to find out it's not supported and you have to use an entirely different workflow. Then you flip the table :)

(╯°□°)╯︵ ┻━┻

Just to re-iterate, I think the way forward is:

- We fix stuff that is broken in docker run. Caps, security opts, privileged? Let's fix those.
- Docker service is a 1:1 copy of docker run. When we fix run, we fix service.

xificurC · 2017-08-15T07:37:07Z

@imyoungyang IIUC that's a workaround on how to set the ulimits for the docker daemon. Changing those settings changes them for every container. Just because elasticsearch needs e.g. 65k file descriptors doesn't mean we should let everyone have such fun.

I guess we need to wait for libentitlement to land? @n4ss any advance in the last month?

n4ss · 2017-08-15T15:27:16Z

@xificurC yes, we're having more entitlements implemented and images such as nginx or dind are starting to work with it :)

dliappis · 2017-09-06T07:31:57Z

> IIUC that's a workaround on how to set the ulimits for the docker daemon. Changing those settings changes them for every container. Just because elasticsearch needs e.g. 65k file descriptors doesn't mean we should let everyone have such fun.

@xificurC Since 8db6109 the Docker Engine ships with high default limits (for performance reasons), so with recent versions of docker-ce/ee you don't need to raise them to meet increased requirements such as those of Elasticsearch. However, you'd need to do the reverse, i.e. reduce the limits per container, if you feel that a specific one may abuse resources, so entitlements would still be needed for that case.

darklow · 2018-01-22T21:31:49Z

It would be great if some workaround could be provided, at least at a low level or at the daemon.json level (~~btw setting default-ulimits in daemon.json still doesn't work on latest docker, docker daemon doesn't start~~). So many services have degraded performance because of multiple options missing when running in docker swarm mode. I am still having elasticsearch issues because of memory lock and ulimit problems (I ended up removing the swap partition, which is not nice). I am having performance problems on load balancers and webservers because I couldn't find any way of increasing net.core.somaxconn beyond the default 128 (even after increasing it on the host machine and trying multiple other ideas without success). Almost every single performance issue I had came down to running in docker swarm mode. Unfortunately I'm already in production and wasn't aware of so many limitations, so I'm looking for workarounds, or maybe this issue could be prioritised. Thank you.

eyz · 2018-01-22T21:40:44Z

Additionally, there are also some cases where other non-Swarm flags like --privileged are required, such as running docker-in-docker for CI

thaJeztah · 2018-01-23T00:27:15Z

> btw setting default-ulimits in daemon.json still doesn't work on latest docker, docker daemon doesn't start

Could you elaborate? This should work; for example:

{
	"default-ulimits": {
		"nofile": {
			"Name": "nofile",
			"Hard": 2048,
			"Soft": 1024
		}
	}
}
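
For anyone generating this file from a script, here is a minimal Python sketch that builds the same "default-ulimits" structure and rejects a soft limit above the hard limit. The helper name default_ulimits is made up for illustration, and treating -1 as "unlimited" is an assumption based on the --ulimit memlock=-1 usage earlier in this thread:

```python
import json


def default_ulimits(limits: dict[str, tuple[int, int]]) -> dict:
    """Build the "default-ulimits" section from {name: (soft, hard)}.

    Hypothetical helper; mirrors the JSON layout shown above.
    """
    section = {}
    for name, (soft, hard) in limits.items():
        # Assumption: -1 means "unlimited". Otherwise soft must not exceed hard.
        if hard != -1 and (soft == -1 or soft > hard):
            raise ValueError(f"{name}: soft limit {soft} exceeds hard limit {hard}")
        section[name] = {"Name": name, "Hard": hard, "Soft": soft}
    return {"default-ulimits": section}


if __name__ == "__main__":
    # Prints a fragment that could be merged into daemon.json.
    print(json.dumps(default_ulimits({"nofile": (1024, 2048)}), indent=2))
```

Keep in mind that these remain daemon-wide defaults, so they apply to every container on the node rather than to a single service.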

darklow · 2018-01-23T00:39:48Z

@thaJeztah Sorry, I must have copied wrong syntax, yours does work indeed, thank you.

jmarcos-cano · 2018-01-31T16:34:07Z

For anyone struggling with net.core.somaxconn in swarm, here is a workaround:

redis:
    image: redis:3
    ports:
      - "6379"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /proc:/writable-proc
    entrypoint: [ "/bin/bash", "-c", "echo 1024 > /writable-proc/sys/net/core/somaxconn && exec docker-entrypoint.sh redis-server" ]

grabbed the idea from stack overflow

unfortunately options are limited

raarts · 2018-03-06T11:53:45Z

I am deeply worried by the fact that the moby/libentitlement repo (which is supposed to fix this issue) has been at a standstill for 3 months now...

zicklag · 2018-06-19T01:33:14Z

I managed a very limited workaround that I used to run a Docker volume plugin container that needed to do a FUSE mount. I created a Docker image, kadimasolutions/docker-run-d, that is meant to run another container using the Docker CLI. You run this container as a swarm service and mount the Docker socket into it. You pass in a Docker run command and it will use the Docker CLI to run the command against the Docker socket mounted into the container. For example:

...
privileged-nginx:
    image: kadimasolutions/docker-run-d:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - "--privileged -p 80:80 nginx"
...

The docker-run-d container will start the nginx container when the swarm service is run and it will stop the nginx container when the service is stopped. This has a whole lot of limitations and nuances and is in no way a good workaround, but it was the only option for my use case.

thaJeztah · 2018-08-23T10:43:42Z

WIP Pull request for setting sysctl for swarm services: #37701 / moby/swarmkit#2729

olljanat · 2019-11-27T17:48:57Z

@thaJeztah Sysctl support for services was added in 19.03, so can we actually close this one?

thaJeztah · 2019-12-04T09:34:24Z

Hm, I think I left this one open because --security-opt and --ulimit are also listed here, but not yet implemented; perhaps someone should open separate tickets for those 🤔

kadahl · 2020-06-03T09:35:38Z

Is this being worked on (specifically --security-opt), or is there any workaround?

Our current project uses gmsa accounts and we would like to use swarm but it does not seem possible at this point.

thaJeztah · 2020-06-08T13:13:53Z

For gmsa, I recall #38632 was added

thaJeztah · 2020-08-19T12:35:26Z

--sysctl was implemented in #37701
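
With that in place, something like docker service create --sysctl net.core.somaxconn=4096 redis should work on 19.03 and later. For those talking to the Engine API directly, here is a minimal Python sketch of a service-create payload carrying per-service sysctls. The field names (TaskTemplate, ContainerSpec, Sysctls) reflect my understanding of the API and should be checked against the API reference for your version; service_spec is a hypothetical helper:

```python
import json


def service_spec(name: str, image: str, sysctls: dict[str, object]) -> dict:
    """Build a minimal service-create payload with per-service sysctls.

    Hypothetical helper; field names are assumed from the Engine API.
    """
    return {
        "Name": name,
        "TaskTemplate": {
            "ContainerSpec": {
                "Image": image,
                # Sysctl values are sent as strings keyed by parameter name.
                "Sysctls": {key: str(value) for key, value in sysctls.items()},
            }
        },
    }


if __name__ == "__main__":
    spec = service_spec("redis", "redis:3", {"net.core.somaxconn": 4096})
    print(json.dumps(spec, indent=2))
```

Because the setting lives in the service definition, it travels to whichever node runs the task, which is what the original request asked for.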

For the remaining options:

- "Add support for --ulimit...to swarm mode" #40639 was opened to track/discuss support for --ulimit
- "docker service create "--security-opt" option" #41371 was opened to track/discuss support for --security-opt

Let me close this one

martin-marko · 2021-10-31T17:31:56Z

--security-opt is still not implemented, any workaround for setting seccomp for services in a swarm?

icecrime added the kind/feature and area/swarm labels Jul 29, 2016

thaJeztah mentioned this issue Aug 1, 2016

[epic] add more options to service create / service update #25303

Open

thaJeztah added this to backlog in maintainers-session Jan 26, 2017

thaJeztah removed this from backlog in maintainers-session Jan 26, 2017

yosifkit mentioned this issue Feb 27, 2017

WARNING: /proc/sys/net/core/somaxconn is set to the lower value of 128. redis/docker-library-redis#35

Closed

dnephin mentioned this issue Aug 29, 2017

elasticsearch on swarm cluster, ulimits ignored docker/for-linux#88

Closed

benoitm974 mentioned this issue Oct 24, 2017

Pauses/delays with overlay network on swarm #31746

Open

mneedham mentioned this issue Jan 11, 2018

Using unpriviliged user to run Neo4j neo4j/docker-neo4j#22

Closed

vassilvk mentioned this issue Jan 12, 2018

Idle connections over overlay network ends up in a broken state after 15 minutes #31208

Closed

moby deleted a comment from 13428282016 Jun 1, 2018

jefflill mentioned this issue Jul 31, 2018

HAProxy/pfSense client ephemeral port exhaustion nforgeio/neonKUBE#275

Closed

dperny mentioned this issue Aug 23, 2018

Add support for sysctl options in services #37701

Merged

robotdan mentioned this issue Oct 5, 2018

Elasticsearch container fails in swarm mode FusionAuth/fusionauth-containers#1

Closed

thaJeztah mentioned this issue Jun 11, 2019

Add ulimits to unsupported compose fields docker/cli#482

Merged

gm0neyl0ve mentioned this issue Mar 6, 2020

Add support for --ulimit...to swarm mode #40639

Closed

thaJeztah mentioned this issue Aug 19, 2020

docker service create "--security-opt" option #41371

Open

thaJeztah closed this as completed Aug 19, 2020

ilyam8 mentioned this issue Feb 18, 2022

Netdata Docker/Docker-compose installs fail to see attached Volumes/contents of /var/lib/netdata/registry/netdata.public.unique.id netdata/netdata#11933

Closed
