
implement direct containerd Executor. #1965

Merged: 1 commit, merged into moby:master on Jun 14, 2017

Conversation

ijc
Contributor

@ijc ijc commented Feb 17, 2017

This is a very early implementation of an executor which allows standalone swarmd to talk directly to containerd via gRPC. Posting this now to get feedback on the direction, as well as on priorities w.r.t. the missing bits (see below).

It is activated by passing --containerd-addr=/run/containerd/containerd.sock to swarmd. The contents of both swarmkit/bin and containerd/bin must be in $PATH.

Note that currently various things expect paths under the swarmkit state dir to be absolute paths, so --state-dir=$(pwd)/swarmkitstate or similar is a good idea (since the default is the non-absolute ./swarmkitstate).

Example:

sudo env PATH=`pwd`/bin:../containerd/bin:$PATH ./bin/swarmd --state-dir=`pwd`/swarmkitstate --containerd-addr=/run/containerd/containerd.sock

then:

sudo ./bin/swarmctl service create --image library/redis:latest --name redis

This uses the existing ContainerSpec, and chunks of this new code are pretty similar to the existing dockerapi executor (I have also left in some commented-out code which I think will eventually be similar once the underlying scaffolding is implemented). There are likely to be opportunities for refactoring or code sharing. Alternatively (or additionally) we should perhaps consider a new spec type specifically for containerd (vs docker engine).

There are many and varied holes and missing pieces of implementation (some of which I have papered over in order to get a basic executor working in the expectation of eventually rewriting to something proper, others I have punted on completely). Including (but not limited to and in no particular order):

  • networking is completely unimplemented.
  • secrets are completely unimplemented.
  • volumes are completely unimplemented.
  • images are mostly unimplemented; the current implementation is a quick hack which shells out to @stevvooe's containerd dist tool for pull and rootfs construction (the former is laundered via a dist-pull helper script). AIUI it will eventually (soon?) be possible to ask containerd to take care of this via gRPC. For now the content store is placed in «swarmkitstate»/content-store and image manifests are cached in a manifests subdirectory of that. Now uses the containerd-provided content store interfaces to build the rootfs. Pull is still via a script for now.
  • container logging (or stdio of any sort) is unimplemented.
  • Lifecycle tasks other than Create()+Start() (e.g. Shutdown(), Terminate() & Remove()) are unimplemented (the containerd API doesn't have the same granularity?). Wait() has a simple implementation but is hampered by the lack of events in containerd at the moment (I don't seem to get any from ctr events; haven't actually tried the gRPC API). Done
  • Healthcheck is unimplemented.

Lastly, this code cribs bits of functionality from a variety of places (e.g. "github.com/docker/containerd/cmd/ctr/utils".prepareStdio() & getGRPCConnection, "github.com/docker/docker/oci".DefaultSpec(), types from github.com/docker/docker/image etc) which mostly ought to be refactored rather than just cut and pasted.

This has been written/tested against docker/containerd#1dc5d652ac1adf8f0a92ee8eff7af07b129d4e21. Bundles are placed as subdirectories under «swarmkitstate»/bundles. I'm aware of @crosbymichael's work in docker/containerd#526 and will look into porting over next. Now based on docker/containerd#61400c57ae1d599626b9a5f28b98ac9074b8f1a9 docker/containerd#a7ef3e5313e9731274f60e14d396dab4729b562f

@@ -0,0 +1,107 @@
package containerd
Contributor Author

With this we now have github.com/docker/swarmkit/agent/exec/container and github.com/docker/swarmkit/agent/exec/containerd which differ in a single character suffix. Perhaps we should consider renaming the former to github.com/docker/swarmkit/agent/exec/dockerengine (or .../dockerapi or whatever suits)?

WRT my comment in the PR text "perhaps be considering a new spec type specifically for containerd" maybe it is also worth considering renaming the ContainerSpec gRPC type too?

Collaborator

Perhaps we should consider renaming the former to github.com/docker/swarmkit/agent/exec/dockerengine (or .../dockerapi or whatever suites)?

I agree. I find the current naming confusing. I think either dockerengine or dockerapi is better. I guess I have a slight preference for dockerapi.

Contributor

Let's call the swarmkit/agent/exec/container one dockerapi.

Would you mind doing that in a separate PR?

Contributor Author

I wouldn't mind at all...

We should consider whether ContainerSpec should become something more dockerapi-ish too.

Contributor Author

See #1969

@ijc
Contributor Author

ijc commented Feb 17, 2017

My thinking on the priority order of the missing bits is (in decreasing order of priority):

  1. Remaining lifecycle ops (esp Shutdown).
  2. Image handling. Well underway in containerd, can wait to consume that.
  3. Networking.
  4. Volume handling.
  5. Unordered bucket containing everything else.

Not strictly a missing bit but I'd insert:

  1. Port over docker/containerd#526 changes.

@codecov-io

codecov-io commented Feb 17, 2017

Codecov Report

Merging #1965 into master will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1965      +/-   ##
==========================================
- Coverage    60.4%   60.38%   -0.03%     
==========================================
  Files         124      124              
  Lines       20149    20149              
==========================================
- Hits        12171    12166       -5     
- Misses       6620     6624       +4     
- Partials     1358     1359       +1

return err
}

// No idea if this is correct...
Collaborator

Looks right. Seems kind of unfortunate for swarmkit to know anything about manifests though - this seems like something that belongs at a different layer of the stack. Is the plan to move this into containerd once the relevant APIs exist?

Contributor

This should all go away. I'm working on this as we speak.

Contributor Author

Yep, I hope to ditch all this code in favour of something @stevvooe produces soon!

},
))

conn, err := grpc.Dial(fmt.Sprintf("unix://%s", bindSocket), dialOpts...)
Collaborator

nit: "unix://" + bindSocket? No need for Sprintf.

@stevvooe
Contributor

images are mostly unimplemented, current implementation is a quick hack which shells out to @stevvooe's containerd dist tool for pull and rootfs constructions (the former is laundered via a dist-pull helper script). AIUI it will eventually (soon?) be possible to ask containerd to take care of this via gRPC. For now the content store is placed in «swarmkitstate»/content-store and image manifests are cached in a manifests subdirectory of that.

I am working on this right now. The content store will move over to gRPC and be contained in containerd. I am still toying with the right operations and level of granularity required, but I want it to be straightforward.

;;
esac

exit 0
Contributor

You might be able to hack up a rootfs by running each layer through dist apply.

vendor.conf Outdated
@@ -23,6 +23,13 @@ github.com/docker/libnetwork 3ab699ea36573d98f481d233c30c742ade737565
github.com/opencontainers/runc 8e8d01d38d7b4fb0a35bf89b72bc3e18c98882d7
github.com/opencontainers/go-digest a6d0ee40d4207ea02364bd3b9e8e77b9159ba1eb

github.com/docker/containerd 1dc5d652ac1adf8f0a92ee8eff7af07b129d4e21
github.com/docker/libtrust 9cbd2a1374f46905c68a4eb3694a130610adc62a
github.com/gorilla/context v1.1
Contributor

What are these for?

Collaborator

They seem to be brought in as a side effect of importing distribution packages to parse manifests. I'd prefer if we could move manifest-related functionality out of swarmkit.

Contributor Author

github.com/docker/containerd was needed for the gRPC stuff. I think @aaronlehmann is correct that most of the rest came in due to the need for manifest stuff from distribution. @stevvooe is that something which the content store functionality you are adding will address or shall I look for alternatives?

As an aside, figuring out the container config for a v1 schema (not the manifest itself, but the bit with Cwd, Entrypoint, Cmd etc. in it) was basically guesswork and reverse engineering; if anyone has an actual reference (to docs, or even better usable library code) I'd be glad to have it...


var _ exec.Executor = &executor{}

// Lifted from github.com/docker/containerd/cmd/ctr/utils.go
Contributor

I wrote this last night, as well. Maybe, we need to provide some easy client making tools there.

Contributor Author

Yes, that would be great IMHO.

}
}

if err := r.adapter.create(ctx); err != nil {
Contributor

This adapter pattern is proving useful. I wonder if we can pull out an abstraction for implementing executors that support ContainerSpec.

Contributor Author

IMHO it'd be interesting/useful to try, assuming we want to continue down the route of mashing ContainerSpec into containerd as opposed to adding ContainerdSpec (gah, naming).


bundle := filepath.Join(bundleDir, task.ID)

log.G(ctx).Debugf("newContainerAdapter: ID=%s SVC=%s", task.ID, task.ServiceID)
Contributor

You can use log.G(ctx).WithFields to achieve these keyvalue pairs.

Contributor Author

I think I can just kill this debug and do the WithFields thing elsewhere when there is actual logging to do.

return err
}

applyCmd := osexec.CommandContext(ctx, "dist", "apply", rootfs)
Contributor

Oh, there we go!

@stevvooe
Contributor

👍

@ijc
Contributor Author

ijc commented Mar 1, 2017

Force push:

  • Rebased to swarmkit master.
  • Update to containerd master.
  • Use containerd provided archive library to perform layer application (thanks @stevvooe).
  • Content store is now managed by containerd, e.g. for rootfs construction. Fetch is still laundered via a script, pending support from the content service.
  • Simplified inspect using new ContainerService.Info gRPC call.
  • Bug fixes and cleanups from feedback and my own review.
  • List of vendorings has grown :(

@aaronlehmann
Collaborator

I'm wondering if this should end up out-of-tree in some kind of swarmd repo. It feels weird to me for swarmkit to be vendoring containerd and a bunch of its dependencies. It's fine for this proof-of-concept PR, but I'm starting to wonder what final form this should take.

@stevvooe @aluzzardi WDYT?

@ijc
Contributor Author

ijc commented Mar 2, 2017

@aaronlehmann yeah, it's something I've been wondering too.

Do you mean to move the whole of swarmd (and presumably by extension, swarmctl) out into a new repo (leaving behind the core functionality) or just the containerd related executor?

@dongluochen
Contributor

networking is completely unimplemented.
secrets are completely unimplemented.
volumes are completely unimplemented.

I want to understand the architecture better. For example, how should networking be implemented?

@ijc
Contributor Author

ijc commented Mar 3, 2017

@dongluochen the three things you quote (networking, secrets, volumes) are for the most part open questions right now and I don't have a concrete answer for any of them. Any thoughts/advice/feedback anyone has would be useful.

For networking I've looked a bit at CNI (via https://github.com/ehazlett/circuit/) and CNM (via https://github.com/docker/libnetwork/), but no real conclusion.

For volumes I haven't really looked yet (I had a quick play with creating local volumes in the containerDir within the swarmkitstate dir, but that isn't really going to cut it). As it stands, swarmd/swarmctl don't currently have a concept of volumes or their management. I think I need to look into what docker's swarm mode thinks about volume use by services and consider how that could apply to the swarmd+containerd use case.

For secrets I haven't looked in any detail and have no answer at all.

@dongluochen
Contributor

For networking I've looked a bit at CNI (via https://github.com/ehazlett/circuit/) and CNM (via https://github.com/docker/libnetwork/), but no real conclusion.

Thanks @ijc25! @mavenugo any idea how networking should be supported with direct containerd integration?

@ijc
Contributor Author

ijc commented Mar 6, 2017

Force push:

  • Rebased to swarmkit master.
  • Update to containerd master.
  • Implement .Shutdown and .Remove methods. .Terminate remains unimplemented.
  • Implement .Wait.
  • Support for create's --tmpfs and --bind options.
  • Mount an anon-volume on Volumes specified in container images (directory is part of swarmkit state). No support for non-anon volumes (i.e. create --volume) yet.
  • Bug fixes and cleanups.

@ijc
Contributor Author

ijc commented Mar 14, 2017

Force push:

  • Handle pull from alternate repo and by digest
  • Stub out executor.Describe with some real (but still incomplete) stuff
  • Update to containerd a160a6a0682a46808dd117fe087510d65cd1e041
  • dist-pull script continues to shrink with new containerd API functionality

@stevvooe
Contributor

@ijc Check out containerd/containerd#904 to see if that will help here. The intent of that PR is to make things less hacky. Should we merge this and have another PR that uses the new client package?

@ijc
Contributor Author

ijc commented May 26, 2017

@stevvooe Yes, I expect containerd/containerd#904 to help a great deal. I'm OK to merge this and port over to that in a future PR if the maintainers are.

The two outstanding review comments here are:

  1. #1965 (review), on what to do with the event stream if containerd dies (looking into this now, I think I have an answer).
  2. #1965 (review), on the subject of getting all events.

We've already deferred #1965 (review) (healthcheck) with a TODO.

@ijc
Contributor Author

ijc commented May 31, 2017

Force pushed an update:

  • rebase onto latest master
  • handle reconnecting to containerd if it goes away and returns
  • drop the Engine field from the node description since #2192 ("Only filter on plugins if running with docker engine") was merged
  • wire up generic resources
  • update to containerd 7fc91b05917e, involved porting over the introduction of the container metadata service

I think I need to spend some time tidying up the usage of the snapshotters and the new metadata service; it has grown a bit organically rather than using them as designed, I think. IMHO these needn't block merging.

@stevvooe said he had a nearly complete event filtering package, which I'll hopefully be able to use to address #1965 (review). Maybe that could also be done in a followup.

Lastly, "wire up generic resources" was just wiring the supplied genericResources through to my Describe method, exactly as is done in the dockerapi executor; it doesn't look like I needed to do anything more, but #2090 was pretty daunting to read through.

return err
}

log.G(ctx).Infof("Wait")
Collaborator

Might be a bit overly verbose?

Contributor Author

Yes, it should either be more informative (i.e. say what it is waiting for) or be removed. I think I'll err on the side of removing it.

Contributor Author

FTR it was more informative, since ctx carries various things including the task id and includes them in the log messages. I've removed it anyway though.

}
case <-closed:
	// restart!
	eventq, closed, err = r.adapter.events(ctx)
Collaborator

Is this the only function that calls r.adapter.events?

Since it unconditionally reopens the Events stream here, I wonder if the retry logic in r.adapter.events is actually necessary or even desirable.

I must have missed this before.

Contributor Author

Well spotted, I'd forgotten all about this code when I added the retry to the r.adapter.events code.

What's odd is that when I simulated an event stream failure (by restarting containerd) the error did bubble up to failing the task rather than restarting here. I'll need to investigate why before I kill the retry code.

Contributor Author

My theory is that the first call to c.taskClient.Events in r.adapter.events didn't have the grpc.FailFast(false), so it would go through this outer retry and then immediately fail because containerd hadn't actually restarted yet.

I think adding the FailFast unconditionally would be ok and allow me to remove the retry loop in r.adapter.events. I'll give that a shot.

Contributor Author

It seemed to work. I took the opportunity to add a check that we hadn't missed an exit event while the stream was away too.

log.G(ctx).Debug("Mounts are:")
for i, m := range spec.Mounts {
	log.G(ctx).Debugf(" %2d: %+v", i, m)
}
Collaborator

Non-blocking: I prefer to emit this as a single log line. Otherwise it might be difficult to parse if it's interleaved with other log lines from concurrent threads.

Contributor Author

This set of messages is pretty verbose (with very long lines) and not all that useful now that I have it working, so I think I'll just nuke it.

@aaronlehmann
Collaborator

LGTM

If this is merged in its current state, please open an issue about event filtering. I'd like to make sure that gets addressed.

Calls down to containerd via gRPC to create and start containers.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@ijc
Contributor Author

ijc commented Jun 6, 2017

I rebased and squashed the fixups again.

I have filed #2219 for the event filtering issue. I don't seem to be able to assign, please someone who can feel free to assign it to me.

@stevvooe
Contributor

stevvooe commented Jun 7, 2017

LGTM

#2219 is blocked on me and @ehazlett.

ijc pushed a commit to ijc/moby that referenced this pull request Jun 8, 2017
and update some dependent packages.

We would like to keep moby/moby and swarmkit somewhat in sync here and
moby/swarmkit#2229 proposes a similar bump to
swarmkit, needed due to moby/swarmkit#1965 which
pulls in containerd which uses some newer features of the grpc package.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
@aaronlehmann aaronlehmann merged commit cb0b30d into moby:master Jun 14, 2017
@crosbymichael
Contributor

How do integration tests work with swarm? Is there a way to run the tests with this integration outside of moby?

@aaronlehmann
Collaborator

We have some swarmkit-level integration tests in the integration directory. Currently, these work by mocking out the executor instead of running an actual swarm daemon. At a higher level, there are the moby integration-cli tests, and the docker-e2e tests (not open source, AFAIK). Both of the latter rely on a docker daemon.

@ijc ijc deleted the containerd branch June 15, 2017 09:16
@ijc
Contributor Author

ijc commented Jun 15, 2017

I was just thinking about automated testing too (OK, I'm a bit late on that one, I confess!). I'll create a new issue.

andrewhsu pushed a commit to docker-archive/docker-ce that referenced this pull request Jun 24, 2017
Upstream-commit: 379557a
Component: engine
silvin-lubecki pushed a commit to silvin-lubecki/engine-extract that referenced this pull request Mar 16, 2020