Add support for sysctl options in services #37701

Merged
merged 1 commit into from Sep 26, 2018

Conversation

@dperny
Contributor

dperny commented Aug 22, 2018

- What I did
Adds support for sysctl options in docker services.

  • Adds API plumbing for creating services with sysctl options set.
  • Adds swagger.yaml documentation for new API field.
  • Changes executor package to make use of the Sysctls field on objects
  • Includes integration test to verify that new behavior works.

Essentially, everything needed to support the equivalent of docker run's --sysctl option except the CLI.

Depends on docker/swarmkit#2729, which is not merged yet, and so has my fork branch of swarmkit vendored in to demonstrate passing integration test.

- How I did it

Altered all of the API objects and plumbing to accommodate the new field, which swarmkit blindly passes into the executor.

- How to verify it

Includes an integration test.

Related issues:

  • partially addresses #25209 Add support for --security-opt, --syscall, --ulimit...to swarm mode
  • fixes/addresses #31961 Configure namespaced kernel parameters use docker service
  • fixes/addresses #33649 Daemon default values of namespaced kernel parameters
  • fixes/addresses #31208 Idle connections over overlay network ends up in a broken state after 15 minutes
  • fixes/addresses #37466 #37466 (comment) connections get "stuck" in swarm between wildfly and postgres
  • fixes/addresses #35082 [SWARM] Very poor performance for ingress network with lots of parallel requests
  • fixes/addresses #31746 Pauses/delays with overlay network on swarm
  • fixes/addresses docker-library/redis#35 WARNING: /proc/sys/net/core/somaxconn is set to the lower value of 128
  • relates to moby/libentitlement#35 Support sysctl / kernel params configuration

- Description for the changelog

Add support for sysctl options on docker services.
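
As a rough illustration of where the new field lives, the sketch below builds a JSON body for a service-create request with `Sysctls` nested under `TaskTemplate.ContainerSpec`. The struct types are hypothetical, trimmed-down stand-ins written for this example (they are not the real `api/types/swarm` types), and the service name and image are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, trimmed-down mirrors of the service spec types; only the
// fields needed for this illustration are included.
type containerSpec struct {
	Image   string            `json:"Image"`
	Sysctls map[string]string `json:"Sysctls,omitempty"`
}

type taskTemplate struct {
	ContainerSpec *containerSpec `json:"ContainerSpec,omitempty"`
}

type serviceSpec struct {
	Name         string       `json:"Name"`
	TaskTemplate taskTemplate `json:"TaskTemplate"`
}

// buildCreateBody returns a JSON body for a service-create request that
// sets the given sysctls on the service's containers.
func buildCreateBody(name, image string, sysctls map[string]string) ([]byte, error) {
	spec := serviceSpec{
		Name: name,
		TaskTemplate: taskTemplate{
			ContainerSpec: &containerSpec{Image: image, Sysctls: sysctls},
		},
	}
	return json.Marshal(spec)
}

func main() {
	body, err := buildCreateBody("web", "busybox:latest",
		map[string]string{"net.ipv4.ip_nonlocal_bind": "1"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Because the field is a plain string-to-string map, the values are sent as strings (`"1"`, not `1`), matching the `additionalProperties: type: "string"` definition discussed in the review.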

@thaJeztah

Thanks! Left comments/thoughts inline. We should also look at the docker/cli side for this (and add this option to the docker-compose file format)

// confident won't be modified by the container runtime, and won't blow
// anything up in the test environment
func TestCreateServiceSysctls(t *testing.T) {
defer setupTest(t)()

@thaJeztah

thaJeztah Aug 23, 2018

Member

Can you add a skip for older daemon versions? (assuming this is planned for inclusion in API 1.39)

	skip.If(t, versions.LessThan(testEnv.DaemonAPIVersion(), "1.39"), "feature was added in API version 1.39")
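
To show why a dedicated version comparison is used rather than a plain string comparison, here is a minimal sketch of a numeric, component-wise check. `lessThan` is a simplified stand-in written for this example, not the real `versions.LessThan` implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// lessThan reports whether dotted version a is lower than b, comparing
// numeric components left to right. Missing or non-numeric components
// are treated as 0 (a simplification for this sketch).
func lessThan(a, b string) bool {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) || i < len(bs); i++ {
		var x, y int
		if i < len(as) {
			x, _ = strconv.Atoi(as[i])
		}
		if i < len(bs) {
			y, _ = strconv.Atoi(bs[i])
		}
		if x != y {
			return x < y
		}
	}
	return false
}

func main() {
	for _, v := range []string{"1.9", "1.38", "1.39", "1.40"} {
		fmt.Printf("daemon API %s: skip=%v\n", v, lessThan(v, "1.39"))
	}
}
```

A lexical comparison would order `"1.9"` after `"1.39"`; comparing the components as integers avoids that.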

@thaJeztah

thaJeztah Aug 23, 2018

Member

Do we need a separate test for service update? (Perhaps overkill.)

@dperny

dperny Aug 23, 2018

Contributor

Yes, we should make one, actually, to verify that service update goes through correctly when Sysctls are changed.

// net.ipv4.ip_nonlocal_bind is, we can verify that setting the sysctl
// options works
for _, expected := range []string{"0", "1"} {

@thaJeztah

thaJeztah Aug 23, 2018

Member

Wondering if a subtest would make this clearer 🤔

for _, expected := range []string{"0", "1"} {
	t.Run(fmt.Sprintf("net.ipv4.ip_nonlocal_bind = %s", expected), func(t *testing.T) {
		// store the map we're going to be using everywhere.
		sysctlOpts := map[string]string{"net.ipv4.ip_nonlocal_bind": expected}
		...
	})
}

@dperny

dperny Aug 23, 2018

Contributor

Honestly, I don't think a subtest is useful here: it's just one test, and we only try two values so that we don't have to know what the default is.

@thaJeztah

thaJeztah Aug 23, 2018

Member

Was looking if we should have a more generic service create test where we define a list of specs to create services with. Agreed that it's just a minor issue; advantage would be to more easily find which of the test-cases failed

)
// remove the service
err = client.ServiceRemove(ctx, serviceID)

@thaJeztah

thaJeztah Aug 23, 2018

Member

Services should already be removed when the test completes, so I think we can skip this, and just continue with the next iteration

// Create the service with the sysctl options
var instances uint64 = 1
serviceName := "TestService_" + t.Name() + "_" + expected

@thaJeztah

thaJeztah Aug 23, 2018

Member

We don't need a name for this service, so better remove it, and just use the serviceID (which is already used)

var instances uint64 = 1
serviceName := "TestService_" + t.Name() + "_" + expected
serviceID := swarm.CreateService(t, d,
swarm.ServiceWithName(serviceName),

@thaJeztah

thaJeztah Aug 23, 2018

Member

Remove the name

// avoid that case, we're using poll.WaitOn with a closure to wait
// until logs actually respond
poll.WaitOn(t, func(t poll.LogT) poll.Result {
body, err := client.ContainerLogs(

@thaJeztah

thaJeztah Aug 23, 2018

Member

Should we use service logs instead?

@dperny

dperny Aug 23, 2018

Contributor

As the person ostensibly responsible for service logs, no.

@thaJeztah

thaJeztah Aug 23, 2018

Member

In that case, we must skip the test if it's running on a multi-node swarm.

// if the value is empty, we may not have gotten the output of the
// container logs yet, and we should try again.
if val == "" {
return poll.Continue("output of container logs is empty")

@thaJeztah

thaJeztah Aug 23, 2018

Member

Let me think about this;

Instead of using the container's / service's logs to check if the value was set, could we abuse the container's command for this (something like if cat /proc/sys/net/ipv4/ip_nonlocal_bind != expected value; exit 1)?

Thinking a bit further about this; I'm wondering if we need to check this at all. Creating a container with custom sysctl settings is an existing feature. The only thing being added in this PR is that that container is now created through SwarmKit instead of docker run.

Because of that, I think all we need to verify is if the container-spec is correct (the expected sysctl options are set).

@dperny

dperny Aug 23, 2018

Contributor

Instead of using the container's / service's logs to check if the value was set, could we abuse the container's command for this (something like if cat /proc/sys/net/ipv4/ip_nonlocal_bind != expected value; exit 1)?

What do you think is better about doing it this way? Having containers exit in a swarm service test leaves us dealing with swarmkit rescheduling those containers.

Thinking a bit further about this; I'm wondering if we need to check this at all. Creating a container with custom sysctl settings is an existing feature. The only thing being added in this PR is that that container is now created through SwarmKit instead of docker run.

We do, actually, to make sure that the value of the Sysctls is plumbed through correctly. When writing this test, I actually missed a location where the value was plumbed through and so the value was making its way into the container spec, but not into the actual container.

@thaJeztah

thaJeztah Aug 23, 2018

Member

We do, actually, to make sure that the value of the Sysctls is plumbed through correctly.

Discussed this with @dperny on Slack; inspecting the container should provide enough information to confirm that the correct container-spec was created. Creating containers from a spec/config is already covered by other tests, so for this feature we don't have to test that part 👍
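
The approach agreed on here (inspect the container and compare its sysctls with what the service requested) could be sketched roughly as follows. `hostConfig` is a hypothetical stand-in for the relevant part of a container inspect result, and `sysctlsMatch` is a helper invented for this example.

```go
package main

import (
	"fmt"
	"reflect"
)

// hostConfig is a hypothetical stand-in for the part of a container
// inspect result that carries the sysctls.
type hostConfig struct {
	Sysctls map[string]string
}

// sysctlsMatch reports whether the inspected container carries exactly
// the sysctls that the service spec asked for.
func sysctlsMatch(expected map[string]string, inspected hostConfig) bool {
	// reflect.DeepEqual treats a nil map and an empty map as different,
	// so handle the "no sysctls" case explicitly.
	if len(expected) == 0 && len(inspected.Sysctls) == 0 {
		return true
	}
	return reflect.DeepEqual(expected, inspected.Sysctls)
}

func main() {
	want := map[string]string{"net.ipv4.ip_nonlocal_bind": "1"}
	got := hostConfig{Sysctls: map[string]string{"net.ipv4.ip_nonlocal_bind": "1"}}
	fmt.Println(sysctlsMatch(want, got))
	fmt.Println(sysctlsMatch(want, hostConfig{}))
}
```

A check like this would catch the plumbing gap @dperny describes, where the value reached the service's container spec but not the created container, without depending on container output.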

  description: "Set kernel namedspaced parameters (sysctls) in the container."
  type: "object"
  additionalProperties:
    type: "string"

@thaJeztah

thaJeztah Aug 23, 2018

Member

Can you add an example here?

Sysctls:
  description: "Set kernel namedspaced parameters (sysctls) in the container."
  type: "object"
  additionalProperties:
    type: "string"
  example:
    net.ipv4.tcp_keepalive_time: "600"
    net.ipv4.tcp_keepalive_intvl: "60"
    net.ipv4.tcp_keepalive_probes: "3"
    net.ipv4.tcp_timestamps: "0"

@dperny

dperny Aug 23, 2018

Contributor

Is there a way to reference or crosslink to the documentation for this option on the container HostConfig?

@thaJeztah

thaJeztah Aug 23, 2018

Member

Hm; good one, perhaps just refer to it in the description 🤔 ("this option accepts the same sysctls as can be specified for containers", or something along those lines 😂)

@@ -2725,6 +2725,11 @@ definitions:
description: "Run an init inside the container that forwards signals and reaps processes. This field is omitted if empty, and the default (as configured on the daemon) is used."
type: "boolean"
x-nullable: true
Sysctls:

@thaJeztah

thaJeztah Aug 23, 2018

Member

Can you update the API version history, and add a bullet for this new option? https://github.com/moby/moby/blob/master/docs/api/version-history.md#v139-api-changes

(make sure to mention all of POST /services/create, POST /services/{id}/update, and GET /services/{id})

@dperny

dperny Aug 23, 2018

Contributor

Ah I always forget the API version history.

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from 3ff99e4 to a8e023b Aug 23, 2018

@@ -30,6 +30,12 @@ keywords: "API, Docker, rcli, REST, documentation"
on the node.label. The format of the label filter is `node.label=<key>`/`node.label=<key>=<value>`
to return those with the specified labels, or `node.label!=<key>`/`node.label!=<key>=<value>`
to return those without the specified labels.
* `GET /services` now returns `Sysctls as part of the `ContainerSpec`.

@thaJeztah

thaJeztah Aug 23, 2018

Member

Missed a closing backtick after `Sysctls`.

queryRegistry = true
}
if versions.LessThan(cliVersion, "1.39") {
if service.TaskTemplate.ContainerSpec != nil {

@thaJeztah

thaJeztah Aug 23, 2018

Member

👍 Good one; wondering now if creating a service without ContainerSpec is actually a valid request.

Not something to address in this PR, but just wondering that; and if we should validate/error in that situation 🤔
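
The hunk under discussion guards on both the client's API version and a non-nil `ContainerSpec`; elsewhere in this PR's diff, the guarded block clears `Sysctls`. A simplified sketch of that gating is below. The types are hypothetical stand-ins, and passing only the minor version as an integer is an assumption made to keep the example short.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the service spec types.
type containerSpec struct {
	Image   string
	Sysctls map[string]string
}

type taskSpec struct {
	ContainerSpec *containerSpec
}

type serviceSpec struct {
	TaskTemplate taskSpec
}

// stripUnsupported clears fields that did not exist in the API version the
// client asked for. minor is the minor component of a 1.x API version.
func stripUnsupported(minor int, spec *serviceSpec) {
	if minor < 39 {
		// ContainerSpec is a pointer and may be absent from the request,
		// so check it before touching its fields.
		if spec.TaskTemplate.ContainerSpec != nil {
			spec.TaskTemplate.ContainerSpec.Sysctls = nil
		}
	}
}

func main() {
	spec := serviceSpec{TaskTemplate: taskSpec{ContainerSpec: &containerSpec{
		Image:   "busybox",
		Sysctls: map[string]string{"net.ipv4.ip_nonlocal_bind": "1"},
	}}}
	stripUnsupported(38, &spec)
	fmt.Println(spec.TaskTemplate.ContainerSpec.Sysctls == nil)

	// A spec without a ContainerSpec must not cause a nil dereference.
	empty := serviceSpec{}
	stripUnsupported(38, &empty)
}
```

Without the nil check, a request that omits `ContainerSpec` would panic on the field assignment, which is the situation the comment above is wondering about.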

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from a8e023b to 6a24f7b Aug 23, 2018

@codecov

codecov bot commented Aug 23, 2018

Codecov Report

Merging #37701 into master will increase coverage by 0.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #37701      +/-   ##
==========================================
+ Coverage   36.12%   36.14%   +0.01%     
==========================================
  Files         610      610              
  Lines       45083    45086       +3     
==========================================
+ Hits        16288    16297       +9     
+ Misses      26555    26551       -4     
+ Partials     2240     2238       -2
@thaJeztah

LGTM after the t.Skip() is added for API < 1.39
https://github.com/moby/moby/pull/37701/files#r212267159

of course, depends on the upstream swarmkit PR to be merged

@dperny

Contributor

dperny commented Aug 23, 2018

I forgot to add the t.Skip(), so I'll go back and do so when I revendor swarmkit

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from 6a24f7b to 1f8f6c3 Aug 24, 2018

@dperny

Contributor

dperny commented Aug 24, 2018

@thaJeztah updated the swagger.yaml description of the Sysctls option in container specs. It should be much clearer now. PTAL.

vendor.conf Outdated
@@ -125,7 +125,7 @@ github.com/containerd/ttrpc 94dde388801693c54f88a6596f713b51a8b30b2d
github.com/gogo/googleapis 08a7655d27152912db7aaf4f983275eaf8d128ef
# cluster
github.com/docker/swarmkit cfa742c8abe6f8e922f6e4e920153c408e7d9c3b
github.com/docker/swarmkit 0e685ac60386c15cfefbf6626421bf0468b9444e git://github.com/dperny/swarmkit-1

@thaJeztah

thaJeztah Aug 27, 2018

Member

Upstream was merged; can you un-fork the dependency?

@thaJeztah

Member

thaJeztah commented Aug 27, 2018

OH, can you also add the example to the swagger YAML (see my comment here; #37701 (comment))

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from 1f8f6c3 to dd7434e Aug 27, 2018

queryRegistry = true
}
if versions.LessThan(cliVersion, "1.39") {
if service.TaskTemplate.ContainerSpec != nil {

@anshulpundir

anshulpundir Aug 28, 2018

Contributor

Move this to a function and use it here and below.

@dperny

dperny Sep 10, 2018

Contributor

I disagree, but mostly because I'm unsure what the best signature for that function would be; and considering that only about 8 lines are duplicated, I don't think there's much benefit to the refactoring.

@anshulpundir

anshulpundir Sep 12, 2018

Contributor

every duplicated line of code counts :)
But that's borderline bikeshedding

@wk8

wk8 Sep 12, 2018

Contributor

I would tend to agree with @anshulpundir here (though the question of the function's signature is indeed interesting!)

Also, I'm a little worried that this file is gonna get littered with version checks really fast if we really want to ensure never adding new features to frozen versions; don't get me wrong, it makes sense, but I'm actually quite surprised there's only one check as of now. There's certainly going to be a lot more if we do all of #25303 . There might be a better way of doing this?

@vdemeester

vdemeester Sep 26, 2018

Member

just for the sake of it, I tend to agree with @dperny : sometimes a little duplication is better than the wrong abstraction… once we find a correct abstraction for those, we'll refactor 😉

@@ -159,6 +159,14 @@ func ServiceWithEndpoint(endpoint *swarmtypes.EndpointSpec) ServiceSpecOpt {
}
}
// ServiceWithSysctls sets the Sysctls option of the service's ContainerSpec.
func ServiceWithSysctls(sysctls map[string]string) ServiceSpecOpt {

@anshulpundir

anshulpundir Aug 28, 2018

Contributor

Is this only used in the unit-test?

@dperny

dperny Sep 10, 2018

Contributor

Yes (well, it's technically an integration test), but a lot of these functions are only used in one test.
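
For readers unfamiliar with the pattern behind `ServiceWithSysctls`, the sketch below shows a functional option of that shape. Only the `ServiceWithSysctls(sysctls map[string]string) ServiceSpecOpt` signature comes from the diff; the types, the lowercase names, and the function bodies are assumptions made for this example and may differ from the actual helper.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the swarm service spec types.
type containerSpec struct {
	Sysctls map[string]string
}

type taskSpec struct {
	ContainerSpec *containerSpec
}

type serviceSpec struct {
	Name         string
	TaskTemplate taskSpec
}

// serviceSpecOpt mutates a service spec while it is being built.
type serviceSpecOpt func(*serviceSpec)

// withSysctls sets the Sysctls option of the service's container spec,
// creating the container spec first if it does not exist yet.
func withSysctls(sysctls map[string]string) serviceSpecOpt {
	return func(s *serviceSpec) {
		if s.TaskTemplate.ContainerSpec == nil {
			s.TaskTemplate.ContainerSpec = &containerSpec{}
		}
		s.TaskTemplate.ContainerSpec.Sysctls = sysctls
	}
}

// newServiceSpec applies the options in order to an empty spec.
func newServiceSpec(opts ...serviceSpecOpt) serviceSpec {
	var s serviceSpec
	for _, opt := range opts {
		opt(&s)
	}
	return s
}

func main() {
	spec := newServiceSpec(
		withSysctls(map[string]string{"net.ipv4.ip_nonlocal_bind": "1"}),
	)
	fmt.Println(spec.TaskTemplate.ContainerSpec.Sysctls)
}
```

Each option stays tiny and independent, which is why such helpers tend to accumulate even when only a single test uses them.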

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from dd7434e to cb5c0de Sep 10, 2018

@dperny

Contributor

dperny commented Sep 10, 2018

Reworded the api/swagger.yaml to more clearly state that the same sysctls as containers are supported, and hopefully clear up @thaJeztah's request for examples.

@thaJeztah thaJeztah changed the title from [WIP] Add support for sysctl options in services to Add support for sysctl options in services Sep 10, 2018

if service.TaskTemplate.ContainerSpec != nil {
// Sysctls for docker swarm services weren't supported before
// API version 1.39
service.TaskTemplate.ContainerSpec.Sysctls = nil

@wk8

wk8 Sep 12, 2018

Contributor

I don't understand why this is needed?

@wk8

wk8 Sep 12, 2018

Contributor

(to elaborate a bit more: shouldn't this be nil anyway if it's not present in the request?)

@@ -309,6 +311,101 @@ func TestCreateServiceConfigFileMode(t *testing.T) {
assert.NilError(t, err)
}
// TestServiceCreateSysctls tests that a service created with sysctl options in

@wk8

wk8 Sep 12, 2018

Contributor

Shouldn't we have a similar test for updates too?

@dperny

dperny Sep 24, 2018

Contributor

There isn't one for the other things tested here.

@cballou

cballou commented Sep 15, 2018

The pre-req, docker/swarmkit#2729, was already merged into master 22 days ago.

How close are we to seeing this in nightly? I'm really looking to utilize the addition of sysctls to my docker compose files to run in combination with swarm and docker stack deploy. It's a performance blocker for high IO applications.

@cballou

cballou commented Sep 18, 2018

@vdemeester Just checking if it's possible to get this in docker-ce 18.9.0 beta. I'm literally in the process of migrating to k8s because they have --allowed-unsafe-sysctls implemented already. It's not feasible to run a production level swarm stack under high throughput without this fixed. Customization at the sysctl level is necessary.

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch 2 times, most recently from 8e0ce06 to 5785f17 Sep 18, 2018

@dperny

Contributor

dperny commented Sep 19, 2018

I had an error from not running swagger generation. However, when I ran it, container_wait.go was altered. I do not know if this change is of any impact @thaJeztah.

@dave-receptiviti

dave-receptiviti commented Sep 19, 2018

During a load test against one of our containers, we found that hitting the container directly, outside of swarm mode, resulted in 1.8x more throughput than when that container was running as one service instance in a one node Swarm.

Like @dperny and @cballou, we are really looking forward to having performance fixes / sysctl options expedited

Add support for sysctl options in services
Adds support for sysctl options in docker services.

* Adds API plumbing for creating services with sysctl options set.
* Adds swagger.yaml documentation for new API field.
* Updates the API version history document.
* Changes executor package to make use of the Sysctls field on objects
* Includes integration test to verify that new behavior works.

Essentially, everything needed to support the equivalent of docker run's
`--sysctl` option except the CLI.

Includes a vendoring of swarmkit for proto changes to support the new
behavior.

Signed-off-by: Drew Erny <drew.erny@docker.com>

@dperny dperny force-pushed the dperny:add-swarmkit-sysctl-support branch from 5785f17 to 14da20f Sep 20, 2018

@vdemeester

LGTM 🐯

@vdemeester vdemeester merged commit 9f296d1 into moby:master Sep 26, 2018

8 checks passed

  • codecov/patch: 100% of diff hit (target 50%)
  • codecov/project: 36.14% (+0.01%) compared to 9ad4ef7
  • dco-signed: All commits are signed
  • experimental: Jenkins build Docker-PRs-experimental 42234 has succeeded
  • janky: Jenkins build Docker-PRs 51019 has succeeded
  • powerpc: Jenkins build Docker-PRs-powerpc 11441 has succeeded
  • windowsRS1: Jenkins build Docker-PRs-WoW-RS1 22302 has succeeded
  • z: Jenkins build Docker-PRs-s390x 11308 has succeeded

@marzlarz

marzlarz commented Oct 23, 2018

So are we finally able to set "sysctl" options for Docker services or is this just the underlying updates required to eventually expose this to the CLI ?

@cballou

cballou commented Oct 29, 2018

So I see that the API version was changed from 1.39 to 1.40. Is there a tentative release schedule for that version in the near future? The documentation currently points to a non-existent page for that version:

https://docs.docker.com/engine/api/v1.40/

@thaJeztah

Member

thaJeztah commented Oct 29, 2018

@cballou API v1.40 is not finalised yet (as in; changes will still arrive). That version will be included in the next Docker release (after Docker 18.09, which has API v1.39)
