Skip to content
This repository has been archived by the owner on Sep 4, 2021. It is now read-only.

host: Replace libvirt-lxc backend with libcontainer #3030

Merged
merged 26 commits into from Jul 10, 2016

Conversation

lmars
Copy link
Contributor

@lmars lmars commented Jul 1, 2016

Opening as an early PR to see what happens in CI.

Closes #38
Closes #3021
Closes #2553
Closes #2367
Closes #2725
Closes #2803

@lmars lmars force-pushed the host-libcontainer-runtime branch from e7630d9 to 555e7d5 Compare July 3, 2016 00:46
@lmars
Copy link
Contributor Author

lmars commented Jul 3, 2016

There are still a couple of test failures, but I think these are CI flakes present on master, so this is now ready for review.

An overview of how containers work with libcontainer:

  • flynn-host exec's flynn-init with libcontainer-init as a command line argument, and the _LIBCONTAINER_INITPIPE env var pointing at a pipe file descriptor opened by flynn-host
  • flynn-init receives binary netlink data followed by a JSON configuration payload from the pipe which it uses to create namespaces, setup networking etc. (this is handled transparently by libcontainer)
  • flynn-init exec's /.containerinit inside the container (which is just a bind mount of flynn-init from the host), and then containerinit proceeds as it does currently

Some notes:

  • we continue to use cgroup partitions as we did with libvirt, but these are now named /flynn/{system,user,background} (rather than libvirt's /machine/{system,user,background}.partition)
  • starting a container no longer leaves a supervisor process in the root mount namespace, so we no longer need nsumount and I have removed it
  • I have manually verified that mounts are no longer being leaked (so pinkerton cleanup "device or resource busy"  #38 is fixed)
  • networking setup is now handled by libcontainer and has been removed from containerinit
  • since containers are now started as direct children of flynn-host, we could stop passing FDs over unix sockets, but that would break restarting flynn-host or updating it (the containers will get re-parented to init, and we will lose the FD references).

/cc @titanous @josephglanville

if err := factory.StartInitialization(); err != nil {
log.Fatal(err)
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to put this in init instead of main?

@titanous
Copy link
Contributor

titanous commented Jul 3, 2016

This is awesome! Have you tried removing the vendored libvirt client code?

@titanous
Copy link
Contributor

titanous commented Jul 3, 2016

Looks like we'll probably need some minor docs updates.

@@ -2,6 +2,5 @@ include_rules
: |> sed 's/{{TUF-ROOT-KEYS}}/@(TUF_ROOT_KEYS)/g' cli/root_keys.go.tmpl > %o |> cli/root_keys.go
: cli/root_keys.go |> !cgo |> bin/flynn-host
: bin/flynn-host |> gzip -9 --keep bin/flynn-host |> bin/flynn-host.gz
: |> !go ./flynn-init |> bin/flynn-init
: |> !cgo ./flynn-init |> bin/flynn-init
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break containerinit if the rootfs doesn't have libc. Perhaps we should put the libcontainer part in flynn-host instead?

@lmars lmars force-pushed the host-libcontainer-runtime branch from 555e7d5 to ef8b039 Compare July 3, 2016 20:50
@lmars
Copy link
Contributor Author

lmars commented Jul 3, 2016

@titanous comments addressed (in terms of docs, I just updated this line in the architecture docs).

// container, taken from:
// https://github.com/opencontainers/runc/blob/v1.0.0-rc1/libcontainer/SPEC.md#security
var defaultCapabilities = []string{
"CAP_NET_RAW",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list does not appear to match the link.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what way do they differ? I just took the ones with Enabled: 1 set.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the second column. That makes sense.

@lmars
Copy link
Contributor Author

lmars commented Jul 6, 2016

Updates are not currently working because all containers are exiting when the old flynn-host process dies.

I initially thought flynn-host was killing them on shutdown but this also happens when it dies abruptly with SIGKILL, so it must be the kernel which kills the containers.

The likely suspect is the ParentDeathSignal parameter, but so far I have not been able to configure things so that the containers do not exit (I have tried modifying that config variable, removing all prctl calls, making a prctl call to reset it in containerinit, but none of those approaches prevent the containers from dying).

I am investigating further.

@lmars lmars force-pushed the host-libcontainer-runtime branch 3 times, most recently from da3f1b1 to df57576 Compare July 7, 2016 01:27
@lmars lmars mentioned this pull request Jul 7, 2016
@titanous titanous mentioned this pull request Jul 7, 2016
@lmars lmars force-pushed the host-libcontainer-runtime branch from df57576 to c7d703d Compare July 8, 2016 11:59
@lmars
Copy link
Contributor Author

lmars commented Jul 8, 2016

The containers were previously exiting when the parent flynn-host daemon exited because they were getting a TTY hangup signal due to the fact that flynn-host opened the console, which subsequently closed when it died.

I have changed things so that only TTY jobs get a console (f5d6b84) which means those jobs will not survive an update to flynn-host, but those jobs are usually interactive jobs (e.g. flynn run bash) so I don't think we need to worry about them.

I have also fixed job resurrection so that it only happens for jobs which aren't actually running (c7d703d), and although that could be changed on master I am bundling it in here as I hope we can merge this.

The last few test runs have had some failures but they are all tests which fail intermittently on master, and are mostly controller deployment related (something which I will be refactoring soon).

@titanous @josephglanville this is ready for a review.

Source: lt.FSRef{Usage: "65535"}, // 64MiB
Target: lt.FSRef{Dir: "/dev/shm"},
},
ifaceName, err := netutils.GenerateIfaceName("veth", 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be moved into the else block below?

lmars added 6 commits July 9, 2016 12:55
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
lmars added 18 commits July 9, 2016 12:55
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
The cgroup-lite package is responsible for mounting the cgroup
controllers at /sys/fs/cgroup/{controller}, which flynn-host expects to
be mounted.

Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
To make it easy to identify jobs in the output of `ps`.

Signed-off-by: Lewis Marshall <lewis@lmars.net>
Flannel for example needs this to setup vxlan devices.

Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
Signed-off-by: Lewis Marshall <lewis@lmars.net>
@lmars lmars force-pushed the host-libcontainer-runtime branch from c7d703d to b82b909 Compare July 9, 2016 18:29
@lmars
Copy link
Contributor Author

lmars commented Jul 9, 2016

@titanous so I went on a small TTY adventure today and eventually realised the issue was that not assigning a console to the container meant that when containerinit opened a PTY to hand to the job, it ended up having that PTY as its controlling terminal, so then exec'ing the job with Setctty for that same terminal would fail with EPERM because terminals can only be controlling for a single session.

Long story short, updating github.com/kr/pty to include this change fixed that, and means there is now no need to create consoles in flynn-host (218e56e), so therefore no TTY hang ups when flynn-host exits.

Ready for another review 😄.

@titanous
Copy link
Contributor

titanous commented Jul 9, 2016

LGTM!

@josephglanville
Copy link
Contributor

LGTM. 🎉

@lmars lmars merged commit 17d3592 into master Jul 10, 2016
@lmars lmars deleted the host-libcontainer-runtime branch July 10, 2016 23:24
@lmars
Copy link
Contributor Author

lmars commented Jul 10, 2016

woot!

@temujin9
Copy link
Contributor

Woot indeed. Can I get a head's up when this releases?

@titanous
Copy link
Contributor

The latest nightly has it.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants