This repository has been archived by the owner on Dec 21, 2018. It is now read-only.
josephholsten edited this page Oct 11, 2014 · 3 revisions

Docker, Chef, & Immutable Infrastructure

Thursday, Metropolitan, 16:00

Convener

Boyd Hemphill

Participants

So, so many people.

Summary of Discussions

"If you've never read Adrian Cockroft's no-ops blog post, you can get the genesis of immutable infrastructure. With that said I'd like to roll out the Docker grenade."

Tom: "Before we get too far, who's unfamiliar with Docker?" (at least one)

Tom: "Docker is - you may be familiar with the concept of linux containers. Docker is a product that was built on top of linux containers. To oversimplify, it's a virtual machine that only has a linux kernel and it shares that kernel with the host OS. It's just one or more processes that you define. So where you download a vagrant box right now and it's 4.6GB and it breaks everything about your infrastructure, a docker container is intended to be a LOT smaller, more lightweight, and instead of running a full OS in there, you run one or more processes that you want inside of the container. The FS only contains the binaries & libraries necessary for that process to run, no CUPS server that you probably will never use.

Docker just built a social aspect around that, an easy way to take a container, snapshot it, move it around, and share it. That's your super-quick intro to Docker."

Heph: "Who's using it in prod?" (some hands, a dozen.)

Heph: "Outside prod?" (many more.)

Heph: "Why?"

A: Stubborn higher-ups.

A: "All my stuff is built around owning the whole machine. But I don't think we need to get into the idea of what's the difference between a container and a VM. They're lightweight process & network isolation. How do we - What was the question Joseph posed? He wanted to use Chef or Docker but not both."

Joseph: "The reason I'm here is that this is a Chef summit, and this is a session about Docker. That makes no sense to me. I'd either like to use Docker or Chef, and not really mix the things."

Q: Do people use Chef to make docker images to distribute? (many)

Q: If config management is ongoing changes to a system but now you want to pivot to a system where you now don't create changes, you make new systems, then when does CM come in? Are there people managing running Docker images more than just building Docker images?

Heph: "Yes. There are 3 ways:

  1. Lightweight VM. You could run Chef in there with supervisord in PID 1. Run chef in your container. Stupid, I wouldn't recommend it, but you could.
  2. Another way is to have Chef run on the host running the Docker images and mount a config directory into the container.
  3. And the last would be to not do this at all. Use 12 factor. You don't need config files, pull this out into environment variables or a lookup server."
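As a rough illustration of the third option, here is a minimal Python sketch of a process that takes its settings from environment variables rather than a config file. The variable names (`APP_PORT`, `DATABASE_URL`, `APP_DEBUG`) and the defaults are hypothetical and were not mentioned in the session.

```python
import os


def load_config(environ=os.environ):
    """Build app settings from environment variables instead of a config file.

    All variable names and defaults here are hypothetical examples.
    """
    return {
        "listen_port": int(environ.get("APP_PORT", "8080")),
        "database_url": environ.get("DATABASE_URL", "sqlite:///tmp/app.db"),
        "debug": environ.get("APP_DEBUG", "false").lower() == "true",
    }


# The same image can run in dev and prod; only the environment differs.
print(load_config({"APP_PORT": "9000", "APP_DEBUG": "true"}))
```

With this style, the container needs no mounted or templated files; whatever launches it supplies the values.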

Tom: "What about before the container launches? A container is a running instance of an image. How is that image built?"

Joseph: "You're not asking a sensible question. By running a previous image with processes on it. How do you create an image before you run processes on it?"

Tom: "So let me finish the question. Abstract: If you create a running container, you run commands on the container to change the state, then you snapshot the container. That snapshot is an image."

A: "Good example is ChefDK. There's a Docker ChefDK image."

Tom: "So the command process that you run, could that not be a chef-client run that would execute Chef code?"

A: "Mechanically it could be, that would work fine, but it'd be overcomplicated. For building a container you don't need idempotency. It's kind of a waste in a container because you run it once."

Q: Docker has a build tool. So what advantages does using chef-client to build a system to build an image give you over a Dockerfile? Which I will grant is a glorified shell script. But are we using chef-client to build the image simply because we have experience building Chef? Might be valid but I want to hear.

Tom: "Build a Dockerfile with your chef run list. Then you converge on your last build - so the deltas are only what changed in your run_list. If you change a Dockerfile, everything after the step that changed has to rebuild. But Chef container being immutable means you can just change the one in the middle without changing everything else."

A: "But you'll run into layer limits."

Tom: But you can idempotently converge on a previous layer up the stack.
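The point about Dockerfile rebuilds can be modeled with a small Python sketch. It is a simplified model, not Docker's actual cache implementation: it assumes each layer's cache key depends only on the instruction text and the parent layer, whereas real builds also consider the contents of added files. The function names and the example steps are hypothetical.

```python
import hashlib


def layer_ids(steps):
    """Return one cache key per build step; each key depends on all earlier steps."""
    ids = []
    parent = ""
    for step in steps:
        # Chain the parent key into the hash so a change invalidates every later key.
        parent = hashlib.sha256((parent + "\n" + step).encode()).hexdigest()[:12]
        ids.append(parent)
    return ids


def rebuilt_steps(old_steps, new_steps):
    """List the indexes of new_steps whose cached layer can no longer be reused."""
    old_ids = layer_ids(old_steps)
    new_ids = layer_ids(new_steps)
    return [i for i, key in enumerate(new_ids)
            if i >= len(old_ids) or key != old_ids[i]]


old = ["FROM ubuntu:14.04", "RUN apt-get update",
       "RUN apt-get install -y apache2", "ADD app.tar.gz /srv/app/"]
new = ["FROM ubuntu:14.04", "RUN apt-get update",
       "RUN apt-get install -y nginx", "ADD app.tar.gz /srv/app/"]
print(rebuilt_steps(old, new))  # [2, 3]
```

Only the third step changed, but the fourth is rebuilt as well because its parent is different. That is the cost Tom contrasts with converging on top of the last build.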

Heph: Still, why? Why use it in a docker container if all you're doing is you want to run a recipe in the container, why not just put the logic in that recipe into the Dockerfile and if recipe says make dir, add files, execute thing? You can do all that in a Dockerfile and do it without the overhead of running chef.

Tom: Then why not just run bash everywhere?

Heph: Chef is for the convergent management of a running system. A Docker container is not a long-running system. It's an executable, a package, a .exe.

Sean O: First there's a whole lot of people who don't take advantage of the idempotent properties of CM. You'll see lots of people who use Chef to build a machine and never run it again. Just do that with VMs, AMIs, no docker. And that's cool. With Chef, think about it as policy. If you have high-resolution policy around something - Apache - in that Chef policy you're ensuring that package is in, directories have permissions, service is off, no CUPS, say all that stuff in Chef and apply it to a container and know the state in a better way than just running that glorified shell script that is a Dockerfile. Even if you never run it again you have a better sense of the container contents.

"The other thing is that - Chef resources & CM are autonomous test and repair agents cooperating in a system where they can signal each other. But zoom out a layer, think about services at that atomic layer - that's what people do with Docker. But you still need to test & repair those containers. Let's call HTTP service: The implementation of the service is unknown from the view of your recipe. It could be pure ruby, embedded chef, wall of assembler, docker container, doesn't matter. Using Docker to implement providers is a really cool use case."

"And even if you don't mean to mutate your infrastructure, you do. If you look at recent commits to Docker they were like well, you shouldn't modify a container buuuuuut we'll make an exception for resolv.conf, /etc/hosts, things with network topology. So now you done mutated, even if it's only 4 bytes. You gonna tear down and build up again? People are making exceptions."

A: "The immutable statement is a misnomer. Just as a matter of course, don't rewrite it. Rather than every process can modify every other."

Adam E: "One of the things I use Chef for is - let's say I don't know how (some standard package) works. I just want to get going, I need 3 or 4 on a given box. I go write something from community cookbooks and I'm off. You benefit from knowledge people invested for re-use. So my question is with Dockerfiles, can I do the same thing and compose them together?"

Heph: "Not quite, but what you can do is - there is a huge community (good and bad) of existing Docker containers on the Hub - like the Supermarket - and people have - you can run a container that is couch, one that is apache, one that is uwsgi server all on the same host, just like you'd add them to a run_list. They're just running as discrete units rather than applying to the same host."

Q: "I went through all this, worked at Twitter. Now we have clients working on Dev distribution - harmonization. They've got different laptops of different versions of everything. The idea of redeploying the whole thing instead of layers is - inetd.conf? It was the same argument over again. We went with VMs. They literally said, how do I manage a bunch of hosts with Docker? How do we get from dev and make sure that those are identical to prod without copying all my host OS's and docker containers? We've talked about layering up, squashing, layering, squashing. So they don't have to manage versions of the container. I still see a role for Chef because you still have to manage configuration bits even on the Host OS."

A: "Are you talking base containers?"

Q: "Not just base containers. Versions of executables, base containers, configurations - Chef templates are super useful if I need apache config. Not gonna run some Bash, I'd rather have data bags with the versions."

A: "Are you talking about clients who are using a Docker registry?"

Q: "They have no registry yet."

A: "If they had a registry they'd have a repo for storing the layers of those containers, which would solve the copying-around-a-huge-container-problem, and the versioning problem. You can update in place and so on."

Q: "That manages config files within the container?"

A: "Yes."

Heph: "Right now I'm running Chef to manage the execution of other Docker containers on the host. The containers themselves don't have an init. My process is PID 1. When I have a container that requires a static file, I write that via Chef on the host into a config directory that I mount within the container. So the container is really treated as an immutable object."
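A minimal Python sketch of that pattern follows: the host-side tool owns a config directory and the container only sees it as a read-only mount. The helper name, paths, image name, and environment variable are hypothetical; the `docker run` flags used (`-d`, `--name`, `-v`, `-e`) are standard ones. The sketch only builds the command line and does not invoke Docker.

```python
import shlex


def docker_run_argv(name, image, host_config_dir, container_config_dir, env=None):
    """Build a `docker run` command that mounts a host-managed config directory."""
    argv = ["docker", "run", "-d", "--name", name,
            # Read-only bind mount: config is written on the host, never in the container.
            "-v", f"{host_config_dir}:{container_config_dir}:ro"]
    for key in sorted(env or {}):
        argv += ["-e", f"{key}={env[key]}"]
    argv.append(image)
    return argv


argv = docker_run_argv("web", "example/web:1.0", "/etc/myapp", "/config",
                       env={"APP_ENV": "prod"})
print(shlex.join(argv))
```

In the setup described above, Chef would render the files under the host directory and then make sure a container started this way is running.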

Q: "We're saying the same thing."

Heph: "Exactly right. The 20GB VMDKs is no longer a problem. The difference in the filesystem layer is the size of the code change I put in."

Q: "Thanks both of you! That helped me understand."

Boyd: "Wrote a blog post today about moving from a 900MB container down to an 18MB container. That you can certainly pass around. Not just a matter of squashing, it's a matter of thinking about - the paradigm is really - it's about getting a service into a container and making it as small and portable as possible. I disagree in that we do need to distinguish between a container and a VM. You don't log in, you don't change it, it's not even cattle, it's cattle feed."

A: "You're not going to get an 18M container with chef-client running in there."

Q: "So a lot of us agree about using Chef for placement of running containers on the host? Anyone disagreeing with that?"

A: "That's what we're doing now. 5 years from now there will be better tools for that but it doesn't exist yet. Go back to Chef which we know and trust. Trying to do the 12-factor approach so no config files."

Q: "Cool, just making sure we're on the same page."

Tom: "I guess, question: Raise your hand if you're using Docker again. Now how many only run one process in your container? I mean, only PID 1 isn't a supervisor, no monitoring, raise your hand if that's it." (majority of hands came up)

Tom: "Third or 4th most popular on the hub is phusion/baseimage, which exists to have runit. My point is some people want docker containers as this thing. So there's two ways: Complete process isolation, that is totally cool. But also there are people who want to use them a different way."

Heph: "People are comfortable with this model. They're much more comfortable with running Chef to provision a Docker container because they know that this logic has worked for them in the past. Don't have to rewrite it as a Dockerfile. Just run chef-zero on the box or whatever. Still perfectly valid approach. It makes distribution and CD in that model a bit slower and bulkier. Depends how you wanna do it but there are multiple good ways."

Tom: "So if the base rarely changes and the base is just nothing but the Chef installation - when you're tossing around those images that base would rarely get re-downloaded so there is the initial hit of heavy bandwidth, but after that you're not getting super-hit-hard."

Heph: "Did you mention something called knife-container earlier?"

Tom: "Yep, I made it."

Heph: "NEAT."

Tom: "So this is how it works: Assuming that you want a proper PID 1 to describe a process supervisor like an init. But you may want to like manage a running container. nsenter exists for a reason, right? Minute you have that it's no longer guaranteed immutable."

Heph: "Welllllll"

(Bikesheds)

Tom: "So what it does is it uses a run_list to build a Docker image, does a few weird things - Instead of each layer being an idempotent component, since Chef has idempotency in it each layer becomes a distribution mechanism for changes. Creates a new layer for the deltas on each converge. That's what you distribute to get your new version of the container. Layer 1 is version 1, layer 2 is version 2, only contains the changes. Go back in time by going back to the previous layer. I take responsibility for idempotency out of Docker and bring it to Chef. Use Docker to make it easy to move things around."
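The "each converge produces a layer of deltas" idea can be sketched in Python as below. This is a toy model under stated assumptions, not how knife-container or Docker store layers: a filesystem state is represented as a plain dict of path to content, and the function names are hypothetical.

```python
def delta(previous, current):
    """Return only what changed between two converged filesystem states."""
    changed = {path: content for path, content in current.items()
               if previous.get(path) != content}
    removed = sorted(path for path in previous if path not in current)
    return {"changed": changed, "removed": removed}


def apply_layers(layers):
    """Rebuild a filesystem state by stacking delta layers in order."""
    state = {}
    for layer in layers:
        state.update(layer["changed"])
        for path in layer["removed"]:
            state.pop(path, None)
    return state


v1 = {"/etc/app.conf": "port=80", "/srv/app/VERSION": "1"}
v2 = {"/etc/app.conf": "port=80", "/srv/app/VERSION": "2"}

layers = [delta({}, v1), delta(v1, v2)]
print(layers[1])                 # only the VERSION file is in the second layer
print(apply_layers(layers[:1]))  # "going back in time" to version 1
```

The second layer carries only the changed file, which is what would be shipped to move from version 1 to version 2.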

Tom: "So chef-init is a PID 1 gem that will start runsvdir and if you want it to, run a chef-client and converge. If you want multiple processes it comes with a process supervisor to start processes on boot. So when you want a container to launch, service launches, runit does the service management."

Heph: Where are containers built?

Tom: "I guess it doesn't matter where you run it. Initial design is run on laptop. knife-container docker build would build, then push to your registry. Layers are different converges. Could squash by converging on a different layer. Just orphan the other layers. Use Chef to configure the image, and if you want to you could run Chef when the container launches to manage last-mile, register with third-party system using keys you don't want to build in, figure out environment by querying Chef Server, etc."

Chris Wing: "Just to be clear, with knife-container you assume the container has a client in it. But that wouldn't be necessary, you could rm -rf /etc/chef after a converge and it'd be fine."

Tom: "There's a lot I think would be cool to do! Yes, the image has the client in it right now. In the future I'd like to be able to do remote execution, send stuff to the thing without it having Ruby in it. rm /opt/chef when you finalize would work, that'd be cool. I'd love to see bindmount in /opt/chef so the omnibus isn't part of the image. The project is like 4 months old and I haven't done it yet."

Sean O: "I went to linux conf in Chicago a bit ago. That was the only thing they talked about the whole time was containers. But the convo was mostly service providers - everyone was concerned about density. How much crap can we cram onto our hardware? That's what early adopters are trying to do. If you look at like why macroscopic config management strategies become popular it's because it makes it cheaper to get your work done. Golden Images were popular because it got cheaper than doing it by hand. Chef was cheaper than shipping golden images. Now it's cheaper to ship around 18MB Docker containers. But as we're cramming more & more docker containers onto machines, that itself constitutes complexity that you're gonna need to manage with - what? Probably Chef."

Tom: "The reason it's called knife-container and not knife-docker is because docker's not going to be the only player forever. So if we have a single way with Chef right now, then you can apply it to other things."

Boyd: "Is there any value in the idea that Chef abstracts away all the things about installing stuff? I just say put this there, and chef rolls it without me thinking about the details. Is there value there for building containers that way instead of with Dockerfile?"

A: "First thing I did with the dockerfile was type in apt install package - and I forgot the -y on the commandline, so it never completed. It's nice to abstract that stuff."

Adam E: "Another example, putting powershell on any of 7 different versions of windows, finding the right .net to install, all the other crap. I forgot how to do all that anymore, it's in a cookbook. I don't have to think."

A: "I think the idea of building containers with chef because of the abstraction, there's value there. However, a lot of cookbooks - when I build a CB and I say it supports ubuntu, it means running in a VM. Sometimes cookbooks do things that won't work in containers. Try to build a docker container with the aufs cookbook: it fails, because you can't load kernel modules in containers. So assuming you can take any CB and turn it into an image is naive. There might be alternate cookbooks or recipes that target different systems."

CW: "I don't trust abstractions but our thinking right now is that we like the testing and CI around Chef, and there's value in those workflows vs building images with arbitrary scripts that we have to re-implement."

Joseph: "As much as I'm loving this, is everyone else getting value out of this? Are we answering the question you came here for?"

Miah: "Who here runs immutable systems with Docker or EC2?" Many hands.

Miah: "There's not a lot of value in Chef at that point."

Heph: "If anyone has used CoreOS and fleet and loves it, I don't believe you exist. Chef works as an orchestration layer. Does anyone here use the docker cookbook?"

2 hands.

"Holy shit. You guys need to try it."

Heph: "The reason I ask is there's a big ugly problem, which is that it doesn't respect restarting a container with different envars or attributes. As long as the container with this name is running then it's good. Can't change it."

Heph: "How do you guys deal with spawning containers, running them on a host, and orchestrating them with Chef?"

A: "Ruby blocks. We run docker run -d whatever whatever, have a data bag to read in, check the commandline, restart if there's a change."
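The "check the commandline, restart if there's a change" logic might look roughly like the Python sketch below. It assumes the tool keeps a record of the spec each container was started with; the spec fields (`image`, `env`, `command`) and function names are hypothetical, and the returned actions simply name the `docker stop`, `docker rm`, and `docker run` subcommands without executing them.

```python
def needs_restart(running, desired):
    """Compare the spec a container was started with against the desired spec."""
    keys = ("image", "env", "command")
    return any(running.get(k) != desired.get(k) for k in keys)


def plan(running, desired):
    """Return the docker subcommands needed to reach the desired state."""
    if running is None:
        return ["run"]                # nothing with this name exists yet
    if needs_restart(running, desired):
        return ["stop", "rm", "run"]  # replace the container rather than mutate it
    return []                         # already converged, do nothing


running = {"image": "example/web:1.0", "env": {"APP_ENV": "staging"}, "command": None}
desired = {"image": "example/web:1.0", "env": {"APP_ENV": "prod"}, "command": None}
print(plan(running, desired))  # ['stop', 'rm', 'run']
```

Checking only the container name would return an empty plan here, which is the problem described above with changed environment variables being ignored.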

Heph: "I'm really surprised a dozen people said they've used Docker with prod but nobody's answering."

CW: "Who here uses PAAS engines?" A: "Who didn't write one?"

A: "I do run Deis."

Q: "Anyone running Mesos?" (no hands)

"Oh that's sad."

Boyd: "So back to your question Heph, we're doing the same thing. We stop them and then delete the container. That's why I want it so small, start up again from image. Happens in ms, and that's for free. So that's a hint for quick + dirty. I just wrote my own because I couldn't grok Docker cb."

A: "I'll tell you why we're not in prod yet. We have an older org with a big support staff, they can't log into machines if they're on Docker. They need old-time linux server infrastructure. Pretty soon we'll have a PHP front-end running in Docker, but we have a Java app. It breaks beyond 127 layers. so there are cases where it's not usable yet. I mean, Docker just hit 1.0. It's pretty new. It takes time to line up process, get acceptance by management & staff and so forth."

Tom: "Who's using docker to do CB testing? Kitchen-docker?" (Many hands)

Tom: "That's completely different than what we talked about. What questions do people have about using Docker to do things without the intention of using Docker to use in production?"

A: "So the use case that I have for using Docker to test my containers is that my Jenkins host is a VPS. So I run digitalocean and you can't run vagrant inside. So either I spin up droplets / containers / instances and pay a cost or I run docker which CAN be run inside a VM. Allows full CB testing on a single host. But the images that come out of that are never intended for production. Kitchen docker deletes on pass. It's a ChefDK use case. Those are really interesting test cases but get into why Docker is cool as opposed to the Chef side of it."

A: "So one thing when testing cookbooks with Docker, anything that requires services is going to break pretty quick. The other cool use case I've been trying to figure out is running my unit tests in containers. I have Chefspec with 40 resources I want to test. Instead of running 1 at a time I'd rather spawn 40 containers, run 1 test and return. Your test run would be so much faster!"
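The fan-out idea can be sketched in Python with a thread pool. This is only an outline under assumptions: `fake_run` stands in for whatever would launch one container per spec file (for example, shelling out to `docker run --rm` with a test image) and return its exit code, and the file names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(spec_files, run_one, max_workers=8):
    """Run every spec file through run_one concurrently; map file -> exit code."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_one, spec_files))  # map() preserves input order
    return dict(zip(spec_files, results))


def failed(results):
    """Return the spec files whose run exited non-zero."""
    return sorted(name for name, code in results.items() if code != 0)


def fake_run(spec_file):
    # Hypothetical stand-in for starting a container that runs a single spec.
    return 1 if "broken" in spec_file else 0


specs = [f"spec/resource_{i}_spec.rb" for i in range(39)] + ["spec/broken_spec.rb"]
results = run_all(specs, fake_run)
print(failed(results))  # ['spec/broken_spec.rb']
```

Whether this is actually faster depends on container start-up cost relative to the length of each test, which the session did not measure.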

Q: "We have a client that's trying to move to prod with Docker. They've made conflicting statements. We only want to run one process inside the container, and we also want to uniquely id each container so we know what ran that app at that time. But we also want to test on laptops, and distribute to prod. So we're struggling with competing requirements. And no ruby interpreter on the machine. How do we manage that move to prod?"

A: "There's definitely some square peg stuff there. It is important to note that this idea of one process vs. another process - it's their version of Roles Are Evil. There are people who will argue passionately on both sides. You might find that multi-process is fine to meet your requirements. But when you drink the 12 factor kool-aid, you'd have many containers all talking to each other. Maybe you orchestrate those with Chef."

Joseph: "Last question - are there things you would not use Docker for? Where Docker is inappropriate?"

Q: "I'd like to see knife-container be more compatible with the Berks style workflow."

Tom: "knife-container is going to evolve pokemon-style into the ChefDK model. You'll be able to describe how you want your image to look using the chef-metal machine resource."

A: "Operationalization tools: cAdvisor from Google exports data about every container on the host to an API. There's also a thing that will do the same for Docker logs, I forget the name. It's by Jeff Lindsay. It'll watch the log streams - anything going to stdout and will throw it to logstash, syslog, etc. Those are super-useful. And the other quick thing within the Chef world, I've been thinking about how to get chef & etcd / chef & consul talking to use chef to start spitting out etcd k/v pairs so that containers can configure themselves with confd around those."
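One way to picture "spitting out k/v pairs" is flattening nested attributes into slash-separated keys, the path style that etcd and confd use. The Python sketch below only builds the pairs and does not talk to etcd or consul; the attribute names and the `/myapp` prefix are hypothetical.

```python
def flatten(attributes, prefix=""):
    """Flatten nested attributes into slash-separated key paths with string values."""
    pairs = {}
    for key, value in attributes.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            pairs.update(flatten(value, path))  # recurse into nested sections
        else:
            pairs[path] = str(value)            # store leaf values as strings
    return pairs


attrs = {"apache": {"port": 80, "workers": 4}, "env": "prod"}
for key, value in sorted(flatten(attrs, "/myapp").items()):
    print(key, "=", value)
```

A container-side tool such as confd could then render its config from those keys, so the image itself stays unchanged.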

What will we do now? What needs to happen next?

  • Further discussion!