Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: Experimental: ContainerD runtime #38541

Merged
merged 14 commits into from Mar 20, 2019

Conversation

@jhowardmsft
Copy link
Contributor

commented Jan 12, 2019

Signed-off-by: John Howard jhoward@microsoft.com

Fixes #22874
Fixes #38719

See each of the commit messages for more detail on the changes.

TL;DR

Allows the use of containerd as an experimental runtime for containers on Windows (WCOW being both process and Hyper-V isolation, as well as LCOW).

Detail

  • Splits libcontainerd so that both remote and local can be loaded rather than conditional compile
  • Generate a consistent OCI spec that a remote (not-in-process) runtime can escape
  • Fix a couple of bugs
  • Adds ETW logging for Windows (requires bumping logrus to v1.3.0)
  • Fixes a leak if the init proc fails to start
  • Vendoring updates for Microsoft/hcsshim and Microsoft/go-winio

Note containerd and hcsshim for HCS v2 APIs do not yet support all the required
functionality needed for docker. gMSA and cloning/templating are two obvious omissions. The gaps will be resolved in time - this PR is a stepping stone in migrating Docker on Windows to containerd for those wishing to experiment, not a full switch-over.

How to use

You will need RS5 (goal is for RS1+ and although RS1 is currently enabled, there's some issues still being worked through, so it may be disabled prior to merge) and to start with something like the following:

Window 1:

  • containerd --log-level debug

Window 2:

  • $env:DOCKER_WINDOWS_CONTAINERD_RUNTIME=1
  • dockerd --experimental -D --containerd \\.\pipe\containerd-containerd

Required Binaries

You will need the following binary from github.com/containerd/containerd in your path. You will need to make sure this is from master currently, not a 1.2.x branch.

  • containerd.exe

(Do not use containerd-shim-runhcs-v1.exe from the containerd repo - if not already, it will be removed very soon. You should use the one from the https://github.com/Microsoft/hcsshim below)

You will need the following binaries from https://github.com/Microsoft/hcsshim in your path, again currently from 'master' due to the in-flux current status:

  • containerd-shim-runhcs-v1.exe
  • runhcs.exe (not required but potentially useful)

For LCOW, the following binaries are required:

  • C:\Program Files\Linux Containers\initrd.img
  • C:\Program Files\Linux Containers\kernel

This is no different to the current requirements. Linuxkit (https://github.com/linuxkit/lcow) is currently far behind https://github.com/Microsoft/opengcs master, so you may need to build your own initrd.img, as well as kernel. A 4.19 based kernel is preferred.

@jhowardmsft jhowardmsft requested a review from cpuguy83 as a code owner Jan 12, 2019

@jhowardmsft jhowardmsft removed the request for review from cpuguy83 Jan 14, 2019

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch from 9b6dd31 to d1e947e Jan 18, 2019

@jhowardmsft jhowardmsft requested a review from tonistiigi as a code owner Jan 18, 2019

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch from 2450a45 to 3df4681 Jan 18, 2019

@jhowardmsft jhowardmsft removed the request for review from tonistiigi Jan 18, 2019

@ddebroy

This comment has been minimized.

Copy link
Contributor

commented Jan 23, 2019

@andrey-ko PTAL

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Jan 23, 2019

@ddebroy @andrey-ko Waaaaaaay too early and much is in flux.

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch 4 times, most recently from 6faa18b to 462a806 Jan 24, 2019

@codecov

This comment has been minimized.

Copy link

commented Jan 25, 2019

Codecov Report

Merging #38541 into master will increase coverage by 0.54%.
The diff coverage is 26.54%.

@@            Coverage Diff             @@
##           master   #38541      +/-   ##
==========================================
+ Coverage   36.47%   37.01%   +0.54%     
==========================================
  Files         613      610       -3     
  Lines       45814    45321     -493     
==========================================
+ Hits        16709    16775      +66     
+ Misses      26823    26260     -563     
- Partials     2282     2286       +4

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch from 1932955 to c9529d6 Jan 26, 2019

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch 3 times, most recently from 7f97828 to 4936fee Jan 31, 2019

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch from 4936fee to a63f7b6 Feb 1, 2019

Show resolved Hide resolved api/swagger.yaml Outdated

@jhowardmsft jhowardmsft force-pushed the Microsoft:jjh/containerd branch from a63f7b6 to 2ebd6d7 Feb 1, 2019

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2019

@andrewhsu Cool. I value any help. It's a huge job to undertake. But please bear in mind, I can't accept anything that doesn't take Invoke-DockerCI into the equation - that's used internally on nightly Windows builds, and can't be broken. As well as by Windows devs in general.

I would suggest that an alternate starting point, while not directly related, would be looking into how to get integration (not integration-cli) running on Windows CI. Some groundwork has been done between myself and @vdemeester carrying what I started. Both in hack\windows.ps1 and Invoke-DockerCI. We really need that coverage as integration-cli is locked for additions. That allows us an in-road into getting LCOW and xenon coverage, all of which are desperately needed in containerd-land.

Obviously the binary .exe distribution issues (containerd, the runhcs shim, linux kernel, initrd.img are orthogonal, but equally need resolving for the end-to-end)

#ThisIsAHugePieceOfWork #dontunderestimatehowmuchwork #really-iamnotkiddinghere #Maybe1909ReleaseForOnlyContainerDInDockerOnWindowsWithNoInprocHCSInvocation

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2019

@crosbymichael @dmcgowan Guys - this was discussed again at the maintainers meeting today. We really, really want to make sure this hits the 1903 cut-off. Any chance you guys could review soon? (Obviously appreciate you're busy on other things.)

@thaJeztah
Copy link
Member

left a comment

LGTM

As a follow-up, could you have a look at;

Adding the containerd and runhcs versions to docker version (filled using fillPlatformVersion, which currently is a no-op on Windows:

moby/daemon/info_unix.go

Lines 119 to 128 in 9c83848

func (daemon *Daemon) fillPlatformVersion(v *types.Version) {
if rv, err := daemon.containerd.Version(context.Background()); err == nil {
v.Components = append(v.Components, types.ComponentVersion{
Name: "containerd",
Version: rv.Version,
Details: map[string]string{
"GitCommit": rv.Revision,
},
})
}

Add something to the docker info output to show that this feature is enabled (similar to how LCOW is shown?)

if d.state.operatingSystem == "windows" &&
len(runConfig.Entrypoint) > 0 &&
d.state.runConfig.ArgsEscaped != argsEscaped {
fmt.Fprintf(d.builder.Stderr, " ---> [Warning] Shell-form ENTRYPOINT and exec-form CMD may have unexpected results\n")

This comment has been minimized.

Copy link
@thaJeztah

thaJeztah Mar 15, 2019

Member

Do we have a similar warning in place already on the BuildKit side? (just so that we don't forget)

This comment has been minimized.

Copy link
@jhowardmsft

jhowardmsft Mar 15, 2019

Author Contributor

I’m not even sure buildkit works on Windows (never tried). It will need more than these warnings as it would need to mirror what this PR does and move to a command line.

@thaJeztah

This comment has been minimized.

Copy link
Member

commented Mar 15, 2019

I think this should be ready to merge; I can wait for SF to wake up for an "ack", but otherwise, ready-to-go (thanks so much for the work on this!!)

We'll also need some documentation; probably in https://github.com/docker/cli/tree/master/experimental

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 15, 2019

@thaJeztah. Yes I can do those follow ups. Might need a little more work as we don’t use runhcs in the unified shim model in containerd, but I’ll figure something out.

@thaJeztah

This comment has been minimized.

Copy link
Member

commented Mar 15, 2019

Ah, well, whatever information would be useful then (instead of runhcs) 😅

@andrewhsu

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

@andrey-ko when you get a chance, can you have a look a verifying this PR with windows binaries according to the john howard instructions? cc @ddebroy

@andrewhsu

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

@jhowardmsft in regards to getting PR jobs modified to test with containerd on windows in the future and getting nightly builds up with the new dependent binaries, we should proceed with that in parallel with this PR cc @dave-tucker

(could use msft help on this if you guys got canonical ways to build windows binaries from golang in an automated pipeline)

@StefanScherer

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

The runhcs.exe is required, I otherwise got this I couldn't run or kill containers.

PS C:\Program Files\Docker> docker run mcr.microsoft.com/windows/nanoserver:1809 ipconfig
...
C:\Program Files\Docker\docker.exe: Error response from daemon: exec: "runhcs.exe": executable file not found in %!P(MISSING)ATH%!(NOVERB): unknown.
@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

Interesting. It shouldn’t be needed. I’ll fix it as a followup

@thaJeztah

This comment has been minimized.

Copy link
Member

commented Mar 19, 2019

%!P(MISSING)ATH%!(NOVERB)

Looks like an incorrect format-string somewhere as well 🤔

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

@StefanScherer doesn't repro here. Are you POSITIVE you are using the right shim binary?

@StefanScherer

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

@jhowardmsft I tried a Dockerfile using double quotes.

# escape=`
ARG core=mcr.microsoft.com/windows/servercore:1809
ARG target=mcr.microsoft.com/windows/servercore:1809
FROM $core as download

SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]

ENV GPG_VERSION 2.3.4

RUN Invoke-WebRequest "https://files.gpg4win.org/gpg4win-vanilla-${env:GPG_VERSION}.exe" -OutFile "gpg4win.exe" -UseBasicParsing ; `
    Start-Process .\gpg4win.exe -ArgumentList '/S' -NoNewWindow -Wait

Single line RUN instructions seem to work 🎉 💯. What a relief.
But I have problems with this multi-line

RUN @( 
    "94AE36675C464D64BAFA68DD7434390BDBE9B9C5", 
    "FD3A5288F042B6850C66B31F09FE44734EB7990E", 
    "71DCFD284A79C3B38668286BC97EC7A07EDE3FC1", 
    "DD8F2338BAE7501E3DD5AC78C273792F7D83545D", 
    "C4F0DFFF4E8C1A8236409D08E73BC641CC11F4C8", 
    "B9AE9905FFD7803F25714661B63B535A4C206CA9", 
    "77984A986EBC2AA786BC0F66B01FBB92821C587A", 
    "8FCCA13FEF1D0C2E91008E09770F7A9A5AE15600", 
    "4ED778F539E3634C779C87C6D7062848A1AB005C", 
    "A48C2BEE680E841632CD4E44F07496B3EB3C1762", 
    "B9E2F5981AA6E0CD28160D9FF13993A75599653C" 
    ) | foreach { 
      gpg --keyserver ha.pool.sks-keyservers.net --recv-keys $_ ; 
    }

Sure I can keep single quotes here, but wanted to try out where people might use it.

Screen Shot 2019-03-19 at 12 33 30 PM

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

Is that a regression (the multi-line)?

@StefanScherer

This comment has been minimized.

Copy link
Contributor

commented Mar 19, 2019

@jhowardmsft oh the shim was my fault, I accidentally copied the wrong one :-(

FROM mcr.microsoft.com/windows/servercore:1809
ENV VERSION=1.2.3
RUN powershell -Command Write-Output "The version is $env:VERSION"
SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
RUN Write-Output "The version is $env:VERSION"

Building this Dockerfile gives me

PS C:\node\quoteetst> docker build --no-cache -t test .
Sending build context to Docker daemon  2.048kB
Step 1/5 : FROM mcr.microsoft.com/windows/servercore:1809
 ---> 1206cbf9524c
Step 2/5 : ENV VERSION=1.2.3
 ---> Running in 1bd568e4ee34
Removing intermediate container 1bd568e4ee34
 ---> 0a74f370c77f
Step 3/5 : RUN powershell -Command Write-Output "The version is $env:VERSION"
 ---> Running in d6a007ab1b1e
The
version
is
1.2.3
Removing intermediate container d6a007ab1b1e
 ---> 0342b5e97e70
Step 4/5 : SHELL ["powershell", "-Command", "$ErrorActionPreference = 'Stop'; $ProgressPreference = 'SilentlyContinue';"]
 ---> Running in 37f64078846b
Removing intermediate container 37f64078846b
 ---> 256f3d5579da
Step 5/5 : RUN Write-Output "The version is $env:VERSION"
 ---> Running in 0b5f8071c718
The
version
is
1.2.3
Removing intermediate container 0b5f8071c718
 ---> a27d43c226db
Successfully built a27d43c226db
Successfully tagged test:latest

The output is multi-line so there were no quotes in the Write-Output command. My download from the previous Dockerfile fortunately worked without the quotes.

@andrey-ko

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

I've access problems with container removal:

PS C:\Users\akolomentsev> docker run --rm -it mcr.microsoft.com/windows/nanoserver:1803
Microsoft Windows [Version 10.0.17134.648]
(c) 2018 Microsoft Corporation. All rights reserved.

C:\>exit
ERRO[0005] Error waiting for container: container a2d618fa593f2f4992590a92ac49e2317497b088ba4fb6a383cee40491de1cb7: driver "windowsfilter" failed to remove root filesystem: failed to detach VHD: invalid argument: rename C:\ProgramData\Docker\windowsfilter\a2d618fa593f2f4992590a92ac49e2317497b088ba4fb6a383cee40491de1cb7 C:\ProgramData\Docker\windowsfilter\a2d618fa593f2f4992590a92ac49e2317497b088ba4fb6a383cee40491de1cb7-removing: Access is denied.

and looks like it stays forever in "removal in progress state"....

PS C:\Users\akolomentsev> docker container ls -a
CONTAINER ID        IMAGE                                       COMMAND                    CREATED              STATUS                PORTS               NAMES
a2d618fa593f        mcr.microsoft.com/windows/nanoserver:1803   "c:\\windows\\system32…"   About a minute ago   Removal In Progress                       sleepy_beaver
PS C:\Users\akolomentsev>
@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

Removal issue known and understood. Needs a fix to the shim which is in flight

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

@StefanScherer I’m still trying to understand if this is a regression or not. Can you be more precise?

@StefanScherer

This comment has been minimized.

Copy link
Contributor

commented Mar 20, 2019

@jhowardmsft I think I remembered something wrong. I kept in my mind that I sometime should test this PR to check an enhancement how we can use quotes in RUN instructions. As I finally had some time today to set everything up I tried it out in the hope that we can use double quotes, but it seems I remembered this wrong. So no regression, it just doesn't work as it would be nice to work. So forget about this false alarm. We should track this in another issue.

All other things with containerd work for me on a Server 2019, even docker run --rm.

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

I’m going to merge this so we can move forward given there are three LGTMs and a slack-LGTM from Derek. @thaJeztah YOLO.

@jhowardmsft jhowardmsft merged commit a3eda72 into moby:master Mar 20, 2019

9 of 10 checks passed

codecov/patch 26.54% of diff hit (target 50%)
Details
codecov/project 37.01% (+0.54%) compared to f196671
Details
dco-signed All commits are signed
experimental Jenkins build Docker-PRs-experimental 44427 has succeeded
Details
janky Jenkins build Docker-PRs 53294 has succeeded
Details
powerpc Jenkins build Docker-PRs-powerpc 13638 has succeeded
Details
vendor Jenkins build Docker-PRs-vendor 5066 has succeeded
Details
windowsRS1 Jenkins build Docker-PRs-WoW-RS1 24386 has succeeded
Details
windowsRS5-process Jenkins build Docker-PRs-WoW-RS5-Process 1728 has succeeded
Details
z Jenkins build Docker-PRs-s390x 13526 has succeeded
Details

@jhowardmsft jhowardmsft deleted the Microsoft:jjh/containerd branch Mar 20, 2019

@jhowardmsft

This comment has been minimized.

Copy link
Contributor Author

commented Mar 20, 2019

( channeling @jessfraz 😱 😁)

@thaJeztah

This comment has been minimized.

Copy link
Member

commented Mar 20, 2019

😂🥳🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.