Add support for --security-opt, --syscall, --ulimit...to swarm mode #25209

Closed
mostolog opened this issue Jul 29, 2016 · 73 comments

Comments

@mostolog mostolog commented Jul 29, 2016

Hi

Looks like docker service create doesn't have any kernel configuration options. eg: --security-opt, --sysctl, --ulimit... which are sometimes required.
This is stopping us from using swarm mode to deploy ELK 5 on our testing servers.

Could you add at least a --container-args option? eg:
--container-args="--security-opt seccomp=unconfined --ulimit memlock=-1 --ulimit nofile=102400"

If this can already be done somehow, sorry for the mistake. Please let me know how to do it.

Regards.

@gittycat gittycat commented Sep 18, 2016

--security-opt is also needed by Elasticsearch. Currently, starting Elasticsearch gives this error: "unable to install syscall filter: seccomp unavailable: your kernel is buggy and you should upgrade".
The workaround given is to start containers using --security-opt=seccomp=unconfined, but that's not available for services.

Member

@thaJeztah thaJeztah commented Sep 19, 2016

ping @justincormack perhaps you have thoughts on this. As I commented on #25303 (comment) - one challenge will be "where to put the custom profile file" in a Swarm setup (unless the definition is stored in the Swarm service definition)

Author

@mostolog mostolog commented Sep 20, 2016

@thaJeztah Excuse me, but my limited English doesn't allow me to properly understand what you are talking about...
Regarding our needs, each service has its own, so every parameter should be service-independent (defined on service create/update) instead of swarm-level.

Member

@thaJeztah thaJeztah commented Sep 20, 2016

@mostolog I was thinking how a custom profile should be set (see https://docs.docker.com/engine/security/seccomp/#/passing-a-profile-for-a-container), because I think docker needs to have access to the file that contains that profile (on each node in the swarm)

Author

@mostolog mostolog commented Sep 20, 2016

Please, let me know if I understood properly, despite my far-too-brief description.

I guess that when you specify --security-opt for a container, it inherits the default profile and adds the given parameters at run time. I suppose the same would happen with services.

If services created under swarm are deployed on other swarm nodes, a "Dockerfile" has to be sent to those nodes in order to run them, so this template could be part of the Dockerfile, couldn't it?
That's what you mean when you say "stored in the Swarm service definition", right?

Member

@thaJeztah thaJeztah commented Sep 20, 2016

@mostolog no, not a Dockerfile, the (contents of) the profile.json from the example I linked above. Docker (in "swarm mode") stores the definition of services (what command they are running, which options are passed); the profile itself would have to be stored as part of that definition.

Author

@mostolog mostolog commented Sep 20, 2016

@thaJeztah Clear as water. Thanks a lot.
And yes, I agree with you: those parameters should also be part of service definition sent to swarm nodes.

Contributor

@justincormack justincormack commented Sep 20, 2016

@thaJeztah seccomp is not an issue - the file is not used, the json contents of it are passed by the client to the daemon.

However, this just seems to be a workaround for elasticsearch trying to set its own seccomp profile and failing, which seems really odd. I will look into what the cause is; it looks like a bug in elasticsearch.

Contributor

@justincormack justincormack commented Sep 20, 2016

The seccomp error in elasticsearch was fixed here elastic/elasticsearch@f77e8a5: we return EPERM because we already filter the unknown syscalls, whereas they were expecting ENOSYS. I don't think we are as crazy here as the comments suggest. Looks like this will be in 5.0.0 when it is released.

Member

@thaJeztah thaJeztah commented Sep 20, 2016

@justincormack ah, I was mistaken then, I assumed the file was needed on the daemon side, but thinking more that would only be for a default profile. 😅

@PatrickLang PatrickLang commented Jan 3, 2017

This is also going to be especially important for Windows containers that need to run as service accounts. We need the --security-opt "credentialspec=..." to be passed through without modifications for this to work.

CC @anweiss @friism

@PatrickLang PatrickLang commented Jan 3, 2017

--isolation=... is also going to be important. When someone deploys a service, they may need to use --isolation=hyperv for compliance or compatibility reasons. This setting should also be service-specific and not host-wide.

Author

@mostolog mostolog commented Jan 13, 2017

Are --ulimit or --sysctl already implemented in 1.13.0-RC5 for docker service or docker stack? I'm not able to get them working...

Collaborator

@cpuguy83 cpuguy83 commented Jan 14, 2017

@mostolog Nope.

@xiaohai2016 xiaohai2016 commented Jan 26, 2017

Are we expecting this issue to be fixed soon? It is really important!

@thaJeztah thaJeztah added this to backlog in maintainers-session Jan 26, 2017
@thaJeztah thaJeztah removed this from backlog in maintainers-session Jan 26, 2017

@macjl macjl commented Feb 2, 2017

I also have this problem. I'm not able to run systemd-based containers without the security_opt option.

Contributor

@ehazlett ehazlett commented Feb 10, 2017

FYI I've opened #30894 to address some of these and would love feedback. If that PR is agreed upon, I'm planning to do the same for "resources" which should address the other things (ulimits, isolation, pids-limit, etc).

@titpetric titpetric commented Feb 26, 2017

I'd love to set --sysctl net.core.somaxconn=4096 on a swarm service somehow. The container that the swarm service starts has some kind of default (128); isn't it tunable somehow? Redis, for example, tries to set it to 511 or something, and gives a warning if this can't be set.

1.) I assume --sysctl will be "ported" to service create,
2.) is there some work-around currently?

@brandonroyal brandonroyal commented Mar 2, 2017

We're seeing lots of asks for use of domain identities using --security-opt "credentialspec=...". Not having this available will be a blocker for using integrated auth for SQL Server (significant blocker for a number of lift&shift .NET apps). Any chance this is being prioritized?

Member

@aluzzardi aluzzardi commented Mar 2, 2017

Contributor

@diogomonica diogomonica commented Mar 3, 2017

@ehazlett and I chatted, we think that this would be a good opportunity to introduce either a secret-type or a good use case for random blobs that have to be delivered to tasks.

For example, this could operate in the following manner:
echo "BLA" | docker secret create --type credential-spec my-cred-spec
and then we could:
docker service create --secrets=my-cred-spec
removing the need for this --security-opt.

We would have to switch on secret types, and then internally pass the contents of that secret to it.

Thoughts @cyli @aaronlehmann @aluzzardi

Member

@aluzzardi aluzzardi commented Mar 3, 2017

Sorry I don't know what a credential spec is.

Is its content secret in the literal sense?

What's the problem with --security-opt?

Contributor

@diogomonica diogomonica commented Mar 4, 2017

@aluzzardi I don't think we want to propagate any of the security flags of docker run to docker service create

Member

@aluzzardi aluzzardi commented Mar 4, 2017

But here we are as well - except they're encapsulated into a secret which is even worse to deprecate?

I might be getting off topic, but I think we have to fix docker run rather than considering it totaled and trying to get a better docker service. 99.9% of our users are using docker run.

I think we should really fix docker run and just have a 1:1 mapping with docker service.

If we continue down this path:

  • docker run, used by the vast majority, has the wrong security model and there is no incentive to fix this
  • docker service lacks basic features that other orchestration platforms, docker run and classic swarm have supported for years
  • docker run and docker service drift farther apart every time, while in fact we are trying to do the opposite with convergence
  • It leads to a subpar UX. You have to learn two products at once. First you experiment with docker run to get your container up and running, then when you want to run it for "real" as a service, you'll soon find the same flags don't work and you have to learn about a new way. Which is the worst of both worlds

I believe the number one advantage of built-in orchestration is it feels natural to go from dev (single machine) to prod (cluster) - same tools, same UI, same platform.

However, if we go ahead with this, we're basically creating a fracture where it's going to feel like using different tools.

Let's put ourselves in the shoes of an average user deploying SQL server. You'll probably start by doing a docker run to get things going, tweaking the config, and so on and so on. Then you move to a docker service create (or stack deploy), and you'll notice the CLI spitting out errors like --security-opt: no such flag. Then you have to spend some time on Google, only to find out it's not supported and have to use an entirely different workflow. Then you flip the table :)

(╯°□°)╯︵ ┻━┻

Just to re-iterate, I think the way forward is:

  1. We fix stuff that is broken in docker run. Caps, security opts, privileged? Let's fix those.
  2. Docker service is a 1:1 copy of docker run. When we fix run, we fix service.

@bitgandtter bitgandtter commented Jul 18, 2017

Any progress on the --ulimits flag for swarm stack deploy? Without it, elasticsearch can't be deployed as part of a stack.

@imyoungyang imyoungyang commented Jul 26, 2017

Hi @bitgandtter
@dliappis's comments give very clear instructions for adjusting the ulimit of the Docker service.

You can refer to the Vagrant file to set the Docker service's max locked memory to unlimited, and to the Docker image to set up an Elasticsearch cluster.

@xificurC xificurC commented Aug 15, 2017

@imyoungyang IIUC that's a workaround on how to set the ulimits for the docker daemon. Changing those settings changes them for every container. Just because elasticsearch needs e.g. 65k file descriptors doesn't mean we should let everyone have such fun.

I guess we need to wait for libentitlement to land? @n4ss any advance in the last month?

@n4ss n4ss commented Aug 15, 2017

@xificurC yes, we're having more entitlements implemented and images such as nginx or dind are starting to work with it :)

@dliappis dliappis commented Sep 6, 2017

IIUC that's a workaround on how to set the ulimits for the docker daemon. Changing those settings changes them for every container. Just because elasticsearch needs e.g. 65k file descriptors doesn't mean we should let everyone have such fun.

@xificurC Since 8db6109, the Docker Engine ships with high default limits (for performance reasons). Therefore, with recent versions of docker-ce/ee etc., you don't need to change them (to meet the increased requirements of, say, Elasticsearch). However, you'd need to do the reverse, i.e. reduce the limits per container, if you feel that a specific one may abuse resources, so entitlements would be needed for this case.

@darklow darklow commented Jan 22, 2018

It would be great if some workaround could be provided, at least at a low level or at the daemon.json level (btw setting default-ulimits in daemon.json still doesn't work on latest docker, docker daemon doesn't start). So many services have degraded performance because of multiple options missing when running in docker swarm mode. I am still having elasticsearch issues because of memory lock and ulimit problems (ended up removing the swap disk partition, which is not nice). I am having performance problems on load balancers and webservers because I couldn't find any way of increasing net.core.somaxconn beyond the default 128 (even after I increased it on the host machine and tried multiple other ideas without success). Almost every single performance issue I had came down to running in docker swarm mode. Unfortunately I'm already in production and wasn't aware of so many limitations, so I am looking for some workarounds, or maybe this issue could be prioritised. Thank you.

@eyz eyz commented Jan 22, 2018

Additionally, there are also some cases where other non-Swarm flags like --privileged are required, such as running docker-in-docker for CI

Member

@thaJeztah thaJeztah commented Jan 23, 2018

btw setting default-ulimits in daemon.json still doesn't work on latest docker, docker daemon doesn't start

Could you elaborate? This should work; for example:

{
	"default-ulimits": {
		"nofile": {
			"Name": "nofile",
			"Hard": 2048,
			"Soft": 1024
		}
	}
}
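
To check that daemon-wide defaults like these actually reach containers after the daemon restarts, the limits can be read from inside a throwaway container. A minimal sketch, assuming the `busybox` image can be pulled and the daemon has been restarted with the `daemon.json` above; the function name and the `DOCKER` override variable are made up for illustration:

```shell
# Hypothetical helper: print the soft and hard open-file limits seen inside a container.
# DOCKER may be overridden (for example DOCKER=echo for a dry run); it defaults to the docker CLI.
show_nofile_limits() {
  "${DOCKER:-docker}" run --rm busybox sh -c 'ulimit -Sn; ulimit -Hn'
}
```

Calling `show_nofile_limits` prints the soft limit followed by the hard limit. Keep in mind that `default-ulimits` changes the default for every container on that daemon, not for a single service.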

@darklow darklow commented Jan 23, 2018

@thaJeztah Sorry, I must have copied the wrong syntax; yours does indeed work, thank you.

@jmarcos-cano jmarcos-cano commented Jan 31, 2018

For anyone stumbling over net.core.somaxconn in swarm, here is a workaround:

redis:
    image: redis:3
    ports:
      - "6379"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /proc:/writable-proc
    entrypoint: [ "/bin/bash", "-c", "echo 1024 > /writable-proc/sys/net/core/somaxconn && exec docker-entrypoint.sh redis-server" ]

grabbed the idea from stack overflow

unfortunately options are limited

@raarts raarts commented Mar 6, 2018

I am deeply worried by the fact that the moby/libentitlement repo (which is supposed to fix this issue) has been at a standstill for 3 months now...

@zicklag zicklag commented Jun 19, 2018

I managed a very limited workaround that I used to run a Docker volume plugin container that needed to do a FUSE mount. I created a Docker image, kadimasolutions/docker-run-d, that is meant to run another container using the Docker CLI. You run this container as a swarm service and mount the Docker socket into it. You pass in a Docker run command and it will use the Docker CLI to run the command against the Docker socket mounted into the container. For example:

...
privileged-nginx:
    image: kadimasolutions/docker-run-d:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    command:
      - "--privileged -p 80:80 nginx"
...

The docker-run-d container will start the nginx container when the swarm service is run and it will stop the nginx container when the service is stopped. This has a whole lot of limitations and nuances and is in no way a good workaround, but it was the only option for my use case.

Member

@thaJeztah thaJeztah commented Aug 23, 2018

WIP Pull request for setting sysctl for swarm services: #37701 / docker/swarmkit#2729

Contributor

@olljanat olljanat commented Nov 27, 2019

@thaJeztah Sysctl support for services was added in 19.03, so can we actually close this one?
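
For readers who arrived here for the `net.core.somaxconn` case raised earlier in the thread: with 19.03 or later it can be set per service via `--sysctl` on `docker service create`. A minimal sketch; the service name, image, and value are placeholders, and the helper name and `DOCKER` override variable are hypothetical:

```shell
# Hypothetical helper: create a swarm service with a namespaced sysctl set for its tasks.
# Needs Docker 19.03+ in swarm mode. DOCKER may be overridden (e.g. DOCKER=echo for a dry run).
create_service_with_somaxconn() {
  name="$1"            # service name
  image="$2"           # image to run
  value="${3:-1024}"   # desired net.core.somaxconn (placeholder default)
  "${DOCKER:-docker}" service create \
    --name "$name" \
    --sysctl "net.core.somaxconn=$value" \
    "$image"
}
```

For example, `create_service_with_somaxconn redis redis:3 4096` would replace the writable `/proc` mount workaround posted above.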

Member

@thaJeztah thaJeztah commented Dec 4, 2019

Hm, I think I left this one open because --security-opt and --ulimit are also listed here, but not yet implemented; perhaps someone should open separate tickets for those 🤔

@kadahl kadahl commented Jun 3, 2020

Is this being worked on (specifically --security-opt), or is there any workaround?

Our current project uses gmsa accounts and we would like to use swarm but it does not seem possible at this point.

Member

@thaJeztah thaJeztah commented Jun 8, 2020

For gmsa, I recall #38632 was added
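
As a rough illustration, `docker service create` has a `--credential-spec` flag (Windows only). The sketch below assumes a Windows swarm node with the credential spec file already placed in the engine's CredentialSpecs directory; the file name, service name, and image are hypothetical, as are the helper name and the `DOCKER` override variable:

```shell
# Hypothetical helper: create a Windows service that runs under a gMSA credential spec.
# spec_file is looked up on the node via the file:// form of --credential-spec.
# DOCKER may be overridden (e.g. DOCKER=echo for a dry run).
create_gmsa_service() {
  name="$1"        # service name
  spec_file="$2"   # credential spec file name, e.g. webapp01.json (made up)
  image="$3"       # Windows image to run
  "${DOCKER:-docker}" service create \
    --name "$name" \
    --credential-spec "file://$spec_file" \
    "$image"
}
```

A call might look like `create_gmsa_service web webapp01.json example/webapp:latest`, run from a client connected to a manager node.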

Member

@thaJeztah thaJeztah commented Aug 19, 2020

--sysctl was implemented in #37701

For the remaining options:

  • #40639 was opened to track/discuss support for --ulimit
  • #41371 was opened to track/discuss support for --security-opt

Let me close this one

@thaJeztah thaJeztah closed this Aug 19, 2020