Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[manager/orchestrator/reaper] Clean out the task reaper dirty set at the end of tick() #2675

Merged
merged 1 commit into from Jun 26, 2018

Conversation

anshulpundir
Copy link
Contributor

@anshulpundir anshulpundir commented Jun 22, 2018

Addresses the following from from #2672:

task reaper dirty set is not cleaned up in various different scenarios at the end of tick() and causes high CPU as each additional task causes tick() to be called.

For testing, we should get coverage for cases where the tasks are not expected to be removed from the task reaper dirty set e.g. service already deleted, size of dirty set 1000 but per slot retention limit not reached etc.

@GordonTheTurtle
Copy link

Please sign your commits following these rules:
https://github.com/moby/moby/blob/master/CONTRIBUTING.md#sign-your-work
The easiest way to do this is to amend the last commit:

$ git clone -b "reaper" git@github.com:anshulpundir/swarmkit.git somewhere
$ cd somewhere
$ git rebase -i HEAD~842354429024
editor opens
change each 'pick' to 'edit'
save the file and quit
$ git commit --amend -s --no-edit
$ git rebase --continue # and repeat the amend for each commit
$ git push -f

Amending updates the existing PR. You DO NOT need to open a new one.

@codecov
Copy link

codecov bot commented Jun 22, 2018

Codecov Report

Merging #2675 into master will increase coverage by 0.11%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #2675      +/-   ##
==========================================
+ Coverage   62.06%   62.18%   +0.11%     
==========================================
  Files         134      134              
  Lines       21740    21742       +2     
==========================================
+ Hits        13494    13520      +26     
+ Misses       6794     6775      -19     
+ Partials     1452     1447       -5

Copy link
Contributor

@dani-docker dani-docker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@anshulpundir
My only concern is that we might end up leaking tasks that matches the condition here
https://github.com/docker/swarmkit/blob/master/manager/orchestrator/taskreaper/task_reaper.go#L257-L266

One would assume that EventUpdateTask gets fired when the task state changes, I am not sure if this is the case, but the below commit comment makes it sounds like it does not, hence why the author removed the reset map logic.
Correction:
EventUpdateTask ends up in cleanup not in dirty map, so we need to handle this use case
2d575ef#diff-55a0275b4dead2bb67850f19207c7e8f

If this is the case, I suggest to delete all tasks from the dirty list except the ones that apply to the above corner case/condition

@anshulpundir
Copy link
Contributor Author

anshulpundir commented Jun 25, 2018

My only concern is that we might end up leaking tasks that matches the condition here
https://github.com/docker/swarmkit/blob/master/manager/orchestrator/taskreaper/task_reaper.go#L257-L266

Assuming that my 'leak' you mean 'never be able to clean up', I don't think that we'll leak those tasks. They will be cleaned up whenever a new task gets created for that slot.

Having said that, I also think its OK to keep track of those tasks and not remove them from the dirty set, but it leads to a slightly inelegant implementation.

@dani-docker

Copy link
Contributor

@talex5 talex5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think EventUpdateTask needs updating to mark tasks as dirty.

// When tick() finishes, the tasks in the slot were either cleaned up,
// or it was skipped because it didn't meet the criteria for cleaning.
// Either way, we can discard the dirty set because future events on that slot
// will cause the task to be readded to the dirty set at that point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything that does it. The EventUpdateTask handler doesn't do anything with completed tasks unless they are also flagged for immediate removal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Historical tasks are only cleaned up on create events. I was going for as little changes as possible. At the same time, I don't see significant benefits from changing behavior to clean up during EventUpdateTask for tasks that have stopped running.

// TODO(anshul) The only edge case is when there are more than one running tasks present
// for a given slot. In that case, the slot may remain uncleaned until there is a task
// restart for that slot.
tr.dirty = nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is nil even a valid value for tr.dirty? Won't this just make it segfault on the next EventCreateTask?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change it to tr.dirty = make(map[orchestrator.SlotTuple]struct{})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep!

//
// TODO(anshul) The only edge case is when there are more than one running tasks present
// for a given slot. In that case, the slot may remain uncleaned until there is a task
// restart for that slot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that a problem? We don't have to do anything until one of them terminates, and that will be handled (at least, if the previous comment was correct).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because tasks are only cleaned up on create events.

@@ -272,10 +280,6 @@ func (tr *TaskReaper) tick() {
break
}
}

if runningTasks <= 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could remove runningTasks completely, if we're not going to use it for anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Running tasks can't be removed until they stop running.

@dani-docker
Copy link
Contributor

Assuming that my 'leak' you mean 'never be able to clean up', I don't think that we'll leak those tasks. They will be cleaned up whenever a new task gets created for that slot.

Right, the task will eventually be cleaned up when a create task event comes in; but unlike other tasks that skips this condition and gets processed, here we might end up (temporary) with a number of tasks (+1) larger then taskHistory.

Having said that, I also think its OK to keep track of those tasks and not remove them from the dirty set, but it leads to a slightly inelegant implementation.

If we might end up reporting the wrong number of task history (> configured taskHistory), I think it's worth to keep track of those tasks

// the tasks in that slot have stopped running.
if runningTasks <= 1 {
delete(tr.dirty, dirty)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the indentation level changes here... this statement now executes for every task in historicTasks. Is this intended behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oversight on my part. Fixed!

@anshulpundir anshulpundir force-pushed the reaper branch 5 times, most recently from f09cf69 to 8da0f79 Compare June 25, 2018 22:58

// Setup the task reaper dirty set.
setupTaskReaperDirty(taskReaper)
// CVall tick directly and verify dirty set was cleaned up.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: CVall

…the end of tick()

Signed-off-by: Anshul Pundir <anshul.pundir@docker.com>
@dperny
Copy link
Collaborator

dperny commented Jun 25, 2018

i force-pushed a fix for the typo, i'm gonna merge as soon as CI verifies i didn't accidentally the code.

@dperny dperny merged commit cb78da3 into moby:master Jun 26, 2018
@anshulpundir anshulpundir deleted the reaper branch June 26, 2018 18:12
thaJeztah pushed a commit to thaJeztah/docker that referenced this pull request Jul 5, 2018
This includes the following behavior-modifying PRs:

- moby/swarmkit#2673
- moby/swarmkit#2669
- moby/swarmkit#2675
- moby/swarmkit#2664

Signed-off-by: Ying Li <ying.li@docker.com>
(cherry picked from commit b322705)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Jul 6, 2018
This includes the following behavior-modifying PRs:

- moby/swarmkit#2673
- moby/swarmkit#2669
- moby/swarmkit#2675
- moby/swarmkit#2664

Signed-off-by: Ying Li <ying.li@docker.com>
Upstream-commit: b322705750cddef903a1097e65a46a6a86610d0a
Component: engine
docker-jenkins pushed a commit to docker/docker-ce that referenced this pull request Jul 9, 2018
This includes the following behavior-modifying PRs:

- moby/swarmkit#2673
- moby/swarmkit#2669
- moby/swarmkit#2675
- moby/swarmkit#2664

Signed-off-by: Ying Li <ying.li@docker.com>
(cherry picked from commit b322705750cddef903a1097e65a46a6a86610d0a)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Upstream-commit: ea4127cc0e3cb076900cedd581b965e8988e6e59
Component: engine
asherkariv pushed a commit to ms-iot/moby that referenced this pull request Jul 24, 2018
This includes the following behavior-modifying PRs:

- moby/swarmkit#2673
- moby/swarmkit#2669
- moby/swarmkit#2675
- moby/swarmkit#2664

Signed-off-by: Ying Li <ying.li@docker.com>
asherkariv added a commit to ms-iot/moby that referenced this pull request Jul 25, 2018
…top (#3)

* MSFT:17850093:fix docker to use registry policy rather than product sku to decide if argon is allowed

* Make 'process isolation policy' 3 options: default, allow, deny

* Fix fd leak on attach

With a full attach, each attach was leaking 4 goroutines.
This updates attach to use errgroup instead of the hodge-podge of
waitgroups and channels.

In addition, the detach event was never being sent.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>

* Close readclosers returned by DecompressStream

Signed-off-by: Joe Ferguson <joe@infosiftr.com>

* Refactor and cleanup the intermediate container creation

This PR is trying to refactor the `probeAndCreate` and cleanup
related codes based on the refactoring.

Signed-off-by: Dennis Chen <dennis.chen@arm.com>

* Add support for `init` on services

It's already supported by `swarmkit`, and act the same as
`HostConfig.Init` on container creation.

Signed-off-by: Vincent Demeester <vincent@sbr.pm>

* Added network package to integration/internal to refactor integration tests calls to client.NetworkCreate

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* refactored integration tests under integration/network/macvlan to use network.Create

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* use unique names for resources in create service integration tests

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Add image metrics for push and pull

Signed-off-by: Daniel Nephin <dnephin@gmail.com>

* When id is empty for overlay2/overlay, do not remove the directories.

Signed-off-by: fanjiyun <fan.jiyun@zte.com.cn>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Use stdlib TLS dialer

Since go1.8, the stdlib TLS net.Conn implementation implements the
`CloseWrite()` interface.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix panic on daemon restart with running plugin

Scenario:

Daemon is ungracefully shutdown and leaves plugins running (no
live-restore).
Daemon comes back up.
The next time a container tries to use that plugin it will cause a
daemon panic because the plugin client is not set.

This fixes that by ensuring that the plugin does get shutdown.
Note, I do not think there would be any harm in just re-attaching to the
running plugin instead of shutting it down, however historically we shut
down plugins and containers when live-restore is not enabled.

[kir@: consolidate code to deleteTaskAndContainer, a few minor nits]

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* Allow vim be case insensitive for D in dockerfile

Signed-off-by: Kunal Tyagi <tyagi.kunal@live.com>

* Clean up in TestNegotiateAPIVersionEmpty

Signed-off-by: John Stephens <johnstep@docker.com>

* Fix race condition between exec start and resize

Signed-off-by: David Wang <00107082@163.com>

* Fix link to Docker Toolbox

Signed-off-by: Francesco Mari <mari.francesco@gmail.com>

* vendor: update containerd to 63522d9

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* bump libnetwork to 19279f0492417475b6bfbd0aa529f73e8f178fb5

includes;

- moby/libnetwork#2178 Fix possible race on ingress programming
- moby/libnetwork#2180 Fix spurious deadlock in overlay driver

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* refactor delete network integration tests to use network package

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Dockerfile*: bump Go to 1.10.3

Signed-off-by: Cristian Staretu <unclejack@users.noreply.github.com>

* builder: snapshotter and exporter

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* daemon: access to distribution internals

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* layer: relax graphdriver ID format

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: experimental buildkit base

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: add graceful cancellation endpoint

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: add cache-from support to buildkit

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* vendor: add buildkit dependency

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: adapter update after vendor update

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: produce duplicate cache keys on pull

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: export build cache records

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: add usage to snapshotter adapter

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: expand prune to buildkit

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: fix compiling with buildkit on windows and integration tests

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: fixes after rebase

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: pass DOCKER_BUILDKIT to enable buildkit in tests

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: Add TODOBuildkit test requirement, specifically for TestBuildCancellationKillsSleep

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: patch incomplete download handling

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: add support for building from tarball

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: have TestBuildDockerignoringBadExclusion pass with buildkit

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: protect early progress writes

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: add support for separate upload-request

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: enable gateway through syntax directive

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: support for images without layers

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* integration-cli: fix health test

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: lint fixes

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: move tagging to exporter

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: fix cancellation context issue

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: notify output buffering on body close

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* api: update godoc

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: more experimental/windows validation

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: correct output buffering order

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* integration-cli: fix error message for non-buildkit

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* vendor: update runc for helper packages

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* Add support for schema 1 pull

Signed-off-by: Derek McGowan <derek@mcgstyle.net>

* builder: override history dates from ref metadata

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* Replace gotestyourself by gotest.tools

github.com/gotestyourself/gotestyourself moved to gotest.tools with
version 2.0.0. Moving to that one, bumping it to v2.1.0.

Signed-off-by: Vincent Demeester <vincent@sbr.pm>

* builder: updates for newer containerd

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: update ID of trace messages

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* Upgrade imdario/mergo to v0.3.5

Mainly to get inline with `docker/cli` version of that dependency

Signed-off-by: Vincent Demeester <vincent@sbr.pm>

* refactored network integration tests under integration/network/service_test.go to use network package

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Update tests to use gotest.tools 👼

Signed-off-by: Vincent Demeester <vincent@sbr.pm>

* Fix link anchors in CONTRIBUTING.md

This is a follow up of moby#35168.

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* create service integration tests use network package

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Whitelist syscalls linked to CAP_SYS_NICE in default seccomp profile

* Update profile to match docker documentation at
  https://docs.docker.com/engine/security/seccomp/

Signed-off-by: Nicolas V Castet <nvcastet@us.ibm.com>

* refactor ipvlan network integration tests to use network.Create

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Don't fail to start daemon if builder source is not available

Signed-off-by: John Howard <jhoward@microsoft.com>

* Fix compilation on 32 bit systems

* Update runc commit to ad0f525
* Update buildkit to dbf67a6

Signed-off-by: Eli Uriegas <eli.uriegas@docker.com>

* migrate TestAPINetworkCreateDelete from integration-cli/ to integration/

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* lcow: Allow the client to add or remove capabilities

Signed-off-by: John Starks <jostarks@microsoft.com>

* Updated path to be consistent w/ current Windows build process

Signed-off-by: Benjamin Baker <Benjamin.baker@utexas.edu>

* lcow: Allow the client to add device cgroup rules

Signed-off-by: John Starks <jostarks@microsoft.com>

* fix build on OpenBSD by defining Self()

Signed-off-by: Fabian Raetz <fabian.raetz@gmail.com>

* Update to containerd v1.1.1-rc.1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Update overlay2 to use naive diff for changes

The archive changes function is not implemented correctly
to handle opaque directories.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>

* Fix TestDaemonNoSpaceLeftOnDeviceError

This test is testing if any "no space left on device" errors
that occur during `docker pull` will not be masked by other
errors. To test for this, a new loopback-device was created,
and used as `--data-dir` ("/var/lib/docker").

However, `/var/lib/docker` is used for storing various
other things, including a `cache.db` database, used by
BuildKit, which is created during startup of the daemon.
Creation of that file failed (due to `--data-dir` path
being on a mount with limited size), which caused daemon
start to fail before the test was able to run.

This patch changes the size-limited mount to be used for
the storage-driver directory only, so that the test is
not affected by other parts of the code attempting to
write files in it.

To have a predictable path; the daemon used in this test
is configured to use the `vfs` storage-driver.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix flaky test TestServiceGet

Signed-off-by: Catalin Pirvu <pirvu.catalin94@gmail.com>

* LCOW: Auto-select OS

Signed-off-by: John Howard <jhoward@microsoft.com>

Addresses moby#35089 (comment).
This change enables the daemon to automatically select an image under LCOW
that can be used if the API doesn't specify an explicit platform.

For example:

FROM supertest2014/nyan
ADD Dockerfile /

And docker build . will download the linux image (not a multi-manifest image)

And similarly docker pull ubuntu will match linux/amd64

* add integration test guidelines

Signed-off-by: Anda Xu <anda.xu@docker.com>

* refactor network inspect integration tests to use network package

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* refactored remaining macvlan integration tests to use network package for creating networks

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* refactored integration/service/network integration tests to use network package

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Just satisfying my OCD - fixed comment spacing and removed a hidden character

Signed-off-by: Martin Muzatko <martin@happy-css.com>

* Fix a small spacing issue

As a follow up to moby#37331 (review)

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Initial support for OCI multi-platform image

Add the OCI spec compatible image support in client side.

Signed-off-by: Dennis Chen <dennis.chen@arm.com>

* Update comments about `InitRouter`

This is a follow-up of commit 408c7ad (PR: moby#32453)

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* vendor: update buildkit to cce2080ddb

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: buildkit rebase update

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* api: fix platform type

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* add unique names to integration/network/service_test.go

Signed-off-by: Lotus Fenn <fenn.lotus@gmail.com>

* vendor: update containerd to 08f7ee98

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* LCOW: lazycontext: Use correct lstat, fix archive check

Signed-off-by: John Howard <jhoward@microsoft.com>

* Update containerd to v1.1.1-rc.2

Signed-off-by: Derek McGowan <derek@mcgstyle.net>

* Move network conversions out of API router

This stuff doesn't belong here and is causing imports of libnetwork into
the router, which is not what we want.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>

* distribution: fix passing platform struct to puller

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* builder: update platform support to puller

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* system: add back lcow validation function

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* distribution: remove custom matcher code

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* Register OCI image media types

OCI types are backwards compatible with Docker manifest
types, however the media types must be registered.

Signed-off-by: Derek McGowan <derek@mcgstyle.net>

* Update Microsoft/go-winio to 0.4.8

Fixes named pipe support for hyper-v isolated containers

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* LCOW: Prefer Windows over Linux in a manifest list

When a manifest list contains both Linux and Windows images, always
prefer Windows when the platform OS is unspecified. Also, filter out any
Windows images with a higher build than the host, since they cannot run.

Signed-off-by: John Stephens <johnstep@docker.com>

* Re-add support for a custom .bashrc file in build env

Signed-off-by: Harald Albers <github@albersweb.de>

* Update moby to use scalable-lb libnetwork APIs

This patch is required for the updated version of libnetwork and entails
two minor changes.

First, it uses the new libnetwork.NetworkDeleteOptionRemoveLB option to
the network.Delete() method to automatically remove the load balancing
endpoint for ingress networks.   This allows removal of the
deleteLoadBalancerSandbox() function whose functionality is now within
libnetwork.

The second change is to allocate a load balancer endpoint IP address for
all overlay networks rather than just "ingress" and windows overlay
networks.  Swarmkit is already performing this allocation, but moby was
not making use of these IP addresses for Linux overlay networks (except
ingress).  The current version of libnetwork makes use of these IP
addresses by creating a load balancing sandbox and endpoint similar to
ingress's  for all overlay network and putting all load balancing state
for a given node in that sandbox only.  This reduces the amount of linux
kernel state required per node.

In the prior scheme, libnetwork would program each container's network
namespace with every piece of load balancing state for every other
container that shared *any* network with the first container.  This
meant that the amount of kernel state on a given node scaled with the
square of the number of services in the cluster and with the square of
the number of containers per service.  With the new scheme, kernel state
at each node scales linearly with the number of services and the number
of containers per service.  This also reduces the number of system calls
required to add or remove tasks and containers.  Previously the number
of system calls required grew linearly with the number of other
tasks that shared a network with the container.  Now the number of
system calls grows linearly only with the number of networks that the
task/container is attached to.  This results in a significant
performance improvement when adding and removing services to a cluster
that already heavily loaded.

The primary disadvantage to this scheme is that it requires the
allocation of an additional IP address per node per subnet for every
node in the cluster that has a task on the given subnet.  However, as
mentioned, swarmkit is already allocating these IP addresses for every
node and they are going unused.  Future swarmkit modifications should be
examined to only allocate said IP addresses when nodes actually require
them.

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* Adds a few more names to the name generator.

Signed-off-by: Debayan De <debayande@users.noreply.github.com>

* bump libnetwork to b0186632

Bump libnetwork to b0186632522c68f4e1222c4f6d7dbe518882024f.   This
includes the following changes:
 * Dockerize protocol buffer generation and update (78d9390a..e12dd44c)
 * Use new plugin interfaces provided by plugin pkg (be94e134)
 * Improve linux load-balancing scalability (5111c24e..366b9110)

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* bump libnetwork to 430c00a

Bump libnetwork to 430c00a6a6b3dfdd774f21e1abd4ad6b0216c629.  This
includes the following moby-affecting changes:

 * Update vendoring for go-sockaddr (8df9f31a)
 * Fix inconsistent subnet allocation by preventing allocation of
   overlapping subnets (8579c5d2)
 * Handle IPv6 literals correctly in port bindings (474fcaf4)
 * Update vendoring for miekg/dns (8f307ac8)
 * Avoid subnet reallocation until required (9756ff7ed)
 * Bump libnetwork build to use go version 1.10.2 (603d2c1a)
 * Unwrap error type returned by PluginGetter (aacec8e1)
 * Update vendored components to match moby (d768021dd)
 * Add retry field to cluster-peers probe (dbbd06a7)
 * Fix net driver response loss on createEndpoint (1ab6e506)
   (fixes docker/for-linux#348)

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* Pass endpoint to the CloudWatch Logs logging driver

Signed-off-by: haikuoliu <haikuo@amazon.com>

* Update tests w/ new libnetwork contraints

The TestDockerNetworkIPAMMultipleNetworks test allocates several
networks simultaneously with overlapping IP addresses.  Libnetwork now
forbids this.  Adjust the test case to use distinct IP ranges for the
networks it creates.

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* Fix bindmount autocreate race

When using the mounts API, bind mounts are not supposed to be
automatically created.

Before this patch there is a race condition between valiating that a
bind path exists and then actually setting up the bind mount where the
bind path may exist during validation but was removed during mountpooint
setup.

This adds a field to the mountpoint struct to ensure that binds created
over the mounts API are not accidentally created.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>

* Update documents of `dispatchAdd`

`ADD` does not support git.

Ref: moby#14704 (comment)

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* Update documents of `Detect`

By 0296797, `progressReader`
and `remoteURL` were removed from arguments. So developers who
use `Detect` not need to care about when `ProgressReaderFunc`
is used.

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* builder: return image ID in API when using buildkit

Signed-off-by: Tibor Vass <tibor@docker.com>

* api: Change Platform field back to string (temporary workaround)

This partially reverts moby#37350

Although specs.Platform is desirable in the API, there is more work
to be done on helper functions, namely containerd's platforms.Parse
that assumes the default platform of the Go runtime.

That prevents a client to use the recommended Parse function to
retrieve a specs.Platform object.

With this change, no parsing is expected from the client.

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: do not send duplicate status for completed jobs

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* vendor: update buildkit to 9acf51e491

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* add vim-plug setting

this should work ( tried on my machine)

Signed-off-by: Ian Chen <ianre657@gmail.com>

* update fsnotify to v1.4.7

Fixes a possible deadlock on closing the watcher on kqueue

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Migrate some ipcmode tests to integration

This fix migrates some ipcmode tests in integration-cli
to integration tests.

Signed-off-by: Yong Tang <yong.tang.github@outlook.com>

* Update cloudflare/cfssl to 1.3.2

Matching the version that is used in SwarmKit

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Replaced "--update-cache" argument with "--no-cache" in apk call to reduce alpine base image by 10-12% (avoid useless indexes in /var/cache/apk)
Signed-off-by: Mickaël Remars <github@remars.com>

* Add /proc/acpi to masked paths

The deafult OCI linux spec in oci/defaults{_linux}.go in Docker/Moby
from 1.11 to current upstream master does not block /proc/acpi pathnames
allowing attackers to modify host's hardware like enabling/disabling
bluetooth or turning up/down keyboard brightness. SELinux prevents all
of this if enabled.

Signed-off-by: Antonio Murdaca <runcom@redhat.com>

* Removed the "-i -t" arguments from the smoke test calling printf (these flags seem not really needed, and break jenkins builds with error "the input device is not a TTY")
Signed-off-by: Mickaël Remars <github@remars.com>

* Bump swarmkit to include task reaper fixes and more metrics.

This includes the following behavior-modifying PRs:

- moby/swarmkit#2673
- moby/swarmkit#2669
- moby/swarmkit#2675
- moby/swarmkit#2664

Signed-off-by: Ying Li <ying.li@docker.com>

* Bump libnetwork to 3ac297bc

Bump libnetwork to 3ac297bc7fd0afec9051bbb47024c9bc1d75bf5b in order to
get fix 0c3d9f00 which addresses a flaw that the scalable load balancing
code revealed.  Attempting to print sandbox IDs where the sandbox name
was too short results in a goroutine panic.  This can occur with
sandboxes with names of 1 or 2 characters in the previous code. But due
to naming updates in the scalable load balancing code, it could now
occur for networks whose name was 3 characters and at least one of the
integration tests employed such networks (named 'foo', 'bar' and 'baz').

This update also brings in several changes as well:
 * 6c7c6017 - Fix error handling about bridgeSetup
 * 5ed38221 - Optimize networkDB queue
 * cfa9afdb - ndots: produce error on negative numbers
 * 5586e226 - improve error message for invalid ndots number
 * 449672e5 - Allows to set generic knobs on the Sandbox
 * 6b4c4af7 - do not ignore user-provided "ndots:0" option
 * 843a0e42 - Adjust corner case for reconnect logic

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* Get err type in removeNetworks() w/ errors.Cause()

Commit c0bc14e wrapped the return value of nw.Delete() with some extra
information.  However, this breaks the code in
containerAdaptor.removeNetworks() which ignores certain specific
libnetwork error return codes.  Said codes actually don't represent
errors, but just regular conditions to be expected in normal operation.
The removeNetworks() call checked for these errors by type assertions
which the errors.Wrap(err...) breaks.

This has a cascading effect, because controller.Remove() invokes
containerAdaptor.removeNetworks() and if the latter returns an error,
then Remove() fails to remove the container itself.  This is not
necessarily catastrophic since the container reaper apparently will
purge the container later, but it is clearly not the behavior we want.

Signed-off-by: Chris Telfer <ctelfer@docker.com>

* vendor: update containerd to b41633746

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* Ensure RUN instruction to run without Healthcheck

Before this commit Healthcheck run if HEALTHCHECK
instruction appears before RUN instruction.
By passing `withoutHealthcheck` to `copyRunConfig`,
always RUN instruction run without Healthcheck.

Fix: moby#37362

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* Fix typo on test.md

It said `TESTFLAGS='-test.run ^TestValidateIPAddress$' make test-unit`
runs `TestBuild` test, but actually runs `TestValidateIPAddress` test.

Signed-off-by: Donghwa Kim <shanytt@gmail.com>

* Pass log-level to containerd

dockerd allows the `--log-level` to be specified, but this log-level
was not forwarded to the containerd process.

This patch sets containerd's log-level to the same as dockerd if a
custom level is provided.

Now that `--log-level` is also passed to containerd, the default "info"
is removed, so that containerd's default (or the level configured in containerd.toml)
is still used if no log-level is set.

Before this change:

containerd would always be started without a log-level set (only the level that's configured in `containerd.toml`);

```
root      1014  2.5  2.1 496484 43468 pts/0    Sl+  12:23   0:00 dockerd
root      1023  1.2  1.1 681768 23832 ?        Ssl  12:23   0:00  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml
```

After this change:

when running `dockerd` without options (same as current);

```
root      1014  2.5  2.1 496484 43468 pts/0    Sl+  12:23   0:00 dockerd
root      1023  1.2  1.1 681768 23832 ?        Ssl  12:23   0:00  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml
```

when running `dockerd --debug`:

```
root       600  0.8  2.1 512876 43180 pts/0    Sl+  12:20   0:00 dockerd --debug
root       608  0.6  1.1 624428 23672 ?        Ssl  12:20   0:00  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml --log-level debug
```

when running `dockerd --log-level=panic`

```
root       747  0.6  2.1 496548 43996 pts/0    Sl+  12:21   0:00 dockerd --log-level=panic
root       755  0.7  1.1 550696 24100 ?        Ssl  12:21   0:00  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml --log-level panic
```

combining `--debug` and `--log-level` (`--debug` takes precedence):

```
root       880  2.7  2.1 634692 43336 pts/0    Sl+  12:23   0:00 dockerd --debug --log-level=panic
root       888  1.0  1.1 616232 23652 ?        Ssl  12:23   0:00  \_ docker-containerd --config /var/run/docker/containerd/containerd.toml --log-level debug
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Bump gometalinter to v2.0.6

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Compile fix

Go 1.11beta1 (rightfully) complains:

> 15:38:37 daemon/cluster/controllers/plugin/controller.go:183:
> Entry.Debugf format %#T has unrecognized flag #

This debug print was added by commit 72c3bcf.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* aufs: fix Wrapf args

Fix the following go-1.11beta1 build error:

> daemon/graphdriver/aufs/aufs.go:376: Wrapf format %s reads arg #1, but call has 0 args

While at it, change '%s' to %q.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* loggerutils: build fixes, improve errors

There are two build errors when using go-1.11beta1:

> daemon/logger/loggerutils/logfile.go:367: Warningf format %q arg f.Name is a func value, not called
> daemon/logger/loggerutils/logfile.go:564: Debug call has possible formatting directive %v

In the first place, the file name is actually not required as error
message already includes it.

While at it, fix a couple of other places for more correct messages, and
make sure to not add a file name if an error already has it.

Fixes: f69f09f
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix ineffassign linting

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix golint issues

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Bump libnetwork to d00ceed44cc447c77f25cdf5d59e83163bdcb4c9

The absence of the file /proc/sys/net/ipv6/conf/all/disable_ipv6
doesn't appear to affect functionality, at least at this time.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Bump containerd daemon to v1.1.1

Signed-off-by: Brian Goff <cpuguy83@gmail.com>

* builder: fix duplicate calls to mountable

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>

* daemon/*.go: fix some Wrap[f]/Warn[f] errors

In particular, these two:
> daemon/daemon_unix.go:1129: Wrapf format %v reads arg #1, but call has 0 args
> daemon/kill.go:111: Warn call has possible formatting directive %s

and a few more.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Return error if basename is expanded to blank

Fix: moby#37325

Signed-off-by: Yuichiro Kaneko <spiketeika@gmail.com>

* lcow: fix debug in startServiceVMIfNotRunning()

When go-1.11beta1 is used for building, the following error is
reported:

> 14:56:20 daemon\graphdriver\lcow\lcow.go:236: Debugf format %s reads
> arg #2, but call has 1 arg

While fixing this, let's also fix a few other things in this
very function (startServiceVMIfNotRunning):

1. Do not use fmt.Printf when not required.
2. Use `title` whenever possible.
3. Don't add `id` to messages as `title` already has it.
4. Remove duplicated colons.
5. Try to unify style of messages.
6. s/startservicevmifnotrunning/startServiceVMIfNotRunning/
...

In general, logging/debugging here is a mess and requires much more
love than I can give it at the moment. Areas for improvement:

1. Add a global var logger = logrus.WithField("storage-driver", "lcow")
and use it everywhere else in the code.
2. Use logger.WithField("id", id) whenever possible (same for "context"
and other similar fields).
3. Revise all the errors returned to be uniform.
4. Make use of errors.Wrap[f] whenever possible.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* Fix error string in docker CLI test

Signed-off-by: Sandeep Bansal <sabansal@microsoft.com>

* vendor: update continuity to 0377f7d767206

This is to include the Go 1.11 fix
(containerd/continuity#120).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* Remove stray uses of "golang.org/x/net/context"

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix API template to not use "golang.org/x/net/context"

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Refactor daemon.info to reduce cyclomatic complexity

Before this change;

    gocyclo daemon/info.go
    17 daemon (*Daemon).SystemInfo daemon/info.go:27:1
    2 daemon (*Daemon).SystemVersion daemon/info.go:150:1
    1 daemon (*Daemon).showPluginsInfo daemon/info.go:195:1

After this change;

    gocyclo daemon/info.go
    8 daemon (*Daemon).fillSecurityOptions daemon/info.go:150:1
    5 daemon operatingSystem daemon/info.go:201:1
    3 daemon (*Daemon).fillDriverInfo daemon/info.go:121:1
    2 daemon hostName daemon/info.go:172:1
    2 daemon memInfo daemon/info.go:192:1
    2 daemon kernelVersion daemon/info.go:182:1
    1 daemon (*Daemon).SystemVersion daemon/info.go:81:1
    1 daemon (*Daemon).SystemInfo daemon/info.go:27:1
    1 daemon (*Daemon).fillPluginsInfo daemon/info.go:138:1

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Unexport daemon.FillPlatformInfo

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix golint warning on generated "volume" types

Should fix

```
api/types/volume/volume_create.go
Line 10: warning: comment on exported type VolumeCreateBody should be of the form "VolumeCreateBody ..." (with optional leading article) (golint)

api/types/volume/volume_list.go
Line 12: warning: comment on exported type VolumeListOKBody should be of the form "VolumeListOKBody ..." (with optional leading article) (golint)
```

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Fix flakyness in TestDockerNetworkInternalMode

Instead of waiting for the DNS to fail, try to access
a specific external IP and verify that 100% of the pakcets
are being lost.

Signed-off-by: Flavio Crisciani <flavio.crisciani@docker.com>

* moved integration tests from docker_cli_config_create_test.go to integration/config

Signed-off-by: Arash Deshmeh <adeshmeh@ca.ibm.com>

* Bump containerd daemon to v1.1.2

Updates cri version to 1.0.4, to add `max-container-log-line-size`

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* pkg/tarsum: fix unit test for Go 1.11+

Since go-1.11beta1 archive/tar, tar headers with Typeflag == TypeRegA
(numeric 0) (which is the default unless explicitly initialized) are
modified to have Typeflag set to either tar.TypeReg (character value
'0', not numeric 0) or tar.TypeDir (character value '5') [1].
This results in different Typeflag value in the resulting header,
leading to a different Checksum, and causing the following test
case errors:

> 12:09:14 --- FAIL: TestTarSums (0.05s)
> 12:09:14 tarsum_test.go:393: expecting
> [tarsum+sha256:8bf12d7e67c51ee2e8306cba569398b1b9f419969521a12ffb9d8875e8836738],
> but got
> [tarsum+sha256:75258b2c5dcd9adfe24ce71eeca5fc5019c7e669912f15703ede92b1a60cb11f]
> ... (etc.)

All the other code explicitly sets the Typeflag field, but this test
case is not, causing the incompatibility with Go 1.11. Therefore,
the fix is to set TypeReg explicitly, and change the expected checksums
in test cases).

Alternatively, we can vendor archive/tar again (for the 100th time),
but given that the issue is limited to the particular test case it
does not make sense.

This fixes the test for all Go versions.

[1] https://go-review.googlesource.com/c/go/+/85656

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* vendor: buildkit to 98f1604134f945d48538ffca0e18662337b4a850

Signed-off-by: Tibor Vass <tibor@docker.com>

* builder: set buildkit's exported product variable via PRODUCT

This introduces a PRODUCT environment variable that is used to set a constant
at dockerversion.ProductName.

That is then used to set BuildKit's ExportedProduct variable in order to show
useful error messages to users when a certain version of the product doesn't
support a BuildKit feature.

Signed-off-by: Tibor Vass <tibor@docker.com>

* validate: please vet

Signed-off-by: Tibor Vass <tibor@docker.com>

* Fix flaky TestExternalGraphDriver/pull test

This test occassionally fails on s390x and Power;

    03:16:04 --- FAIL: TestExternalGraphDriver/pull (1.08s)
    03:16:04 external_test.go:402: assertion failed: error is not nil: Error: No such image: busybox:latest

Most likely these failures are caused due to Docker Hub updating
the busybox:latest image, but not all architectures yet being
available.

Instead of using `:latest`, pull an image by digest, so that
the test doesn't depend on Docker Hub having all architectures
available for `:latest`.

I selected the same digest as is currently used as "frozen image"
in the Dockerfile.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Use constant for task runtime value

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* Update swarmkit to 6826639

changes included:

- swarmkit moby#2706 address unassigned task leak when service is removed
- swarmkit moby#2676 Fix racy batching on the dispatcher
- swarmkit moby#2693 Fix linting issues revealed by Go 1.11

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>

* vendor: bump google/certificate-transparency-go to 1.0.20

This is to include the Windows + Go1.11 fix
(google/certificate-transparency-go#284).

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* Add --device support for Windows

Implements the --device forwarding for Windows daemons. This maps the physical
device into the container at runtime.

Ex:

docker run --device="class/<clsid>" <image> <cmd>

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>

* Add osusergo build tar for static binaries

Go 1.11 includes a fix to os/user to be working in a static binary
(fixing golang/go#23265). The fix requires
`osusergo` build tag to be set for static binaries, which is what
this commit adds (also for containerd).

[v2: sort build tags alphabetically]

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

* Adds LinuxMetadata support by default on Windows

1. Sets the LinuxMetadata flag by default on Windows LCOW v1
MappedDirectories.

Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>

* Dont submit

* Initial check-in for Docker Windows\ARM32

* MSFT:17850093:fix docker to use registry policy rather than product sku to decide if argon is allowed

* Make 'process isolation policy' 3 options: default, allow, deny

* Fixing process isolation policy enforcement

* Rebase merge conflict resdue

* Merge conflict error fix

* Tabs fix

* Typo fix in GetProcessIsolationPolicy()
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants