Document best practices for how to manage releases in production #1786

Open
bfirsh opened this Issue Jul 29, 2015 · 21 comments

bfirsh commented Jul 29, 2015

We currently have some lightweight documentation about how to use Compose in production, but this could do with improvements:

  • How to manage releases, particularly when deploying through the Docker Hub (e.g. build images in development, then in production use the image from the Hub; see the sketch after this list)
  • A step-by-step guide to deploying a Compose app from dev through to production
  • Some examples of how to deploy apps using init scripts
  • Some examples of how to deploy apps using Swarm
  • How to deploy apps on a single server (particularly useful for internal tools, etc)
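
For the first bullet, a rough sketch of what such a dev-to-Hub release flow could look like (the image name, tag, and override file are hypothetical, not an official recommendation):

# Development: build and run against the local source tree
# (docker-compose.dev.yml is a hypothetical override that adds build: directives).
docker-compose -f docker-compose.yml -f docker-compose.dev.yml build web
docker-compose -f docker-compose.yml -f docker-compose.dev.yml up -d

# Release: tag the tested image and push it to the Docker Hub
# (myorg/web is a placeholder image name).
docker tag myorg/web:latest myorg/web:1.2.0
docker push myorg/web:1.2.0

# Production: the main compose file references image: myorg/web:1.2.0,
# so deploying is just pulling and recreating the service.
docker-compose pull web
docker-compose up -d web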

Resources

funkyfuture commented Aug 4, 2015

init scripts

and service configs for systemd, since that is what will take over. I thought of a command to generate these configs.
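
For example, a generated unit might look something like the sketch below (the project path, unit name, and docker-compose location are assumptions, not an existing Compose feature):

# Write a (hypothetical) systemd unit for a compose project living in /srv/myapp.
sudo tee /etc/systemd/system/myapp.service > /dev/null <<'EOF'
[Unit]
Description=myapp via docker-compose
Requires=docker.service
After=docker.service

[Service]
WorkingDirectory=/srv/myapp
ExecStart=/usr/local/bin/docker-compose up
ExecStop=/usr/local/bin/docker-compose stop
Restart=always

[Install]
WantedBy=multi-user.target
EOF

sudo systemctl daemon-reload
sudo systemctl enable myapp.service
sudo systemctl start myapp.service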

@amylindburg added this to the 1.5.0 milestone Aug 12, 2015

dnephin commented Aug 31, 2015

There's some previous discussion and requests in #93, #1264, #1730, #1035. I've closed those tickets in favour of this one, but linking to them for reference.

neg3ntropy commented Feb 4, 2016

I would like some recommendations on how to deploy config files through docker/compose/swarm.
We have a setup (that was recommended by consultants) that makes images with config files and declares volumes to export them. This looks good in principle, but it does not work as expected in a number of cases.

dnephin commented Feb 4, 2016

Putting the configs into a "config" image that just exposes a volume seems like a reasonable way to do it. It'd be great to hear more about the cases where it doesn't work, either here or in a new issue.
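
A minimal sketch of that pattern (the image, container, and directory names below are placeholders):

# Build a config-only image whose sole job is to expose /etc/myapp as a volume.
cat > Dockerfile.config <<'EOF'
FROM busybox
COPY config/ /etc/myapp/
VOLUME /etc/myapp
CMD ["true"]
EOF
docker build -f Dockerfile.config -t myorg/myapp-config .

# Create the config container (it never needs to run) and mount its volume
# into the application container with --volumes-from.
docker create --name myapp_config myorg/myapp-config
docker run -d --name myapp_web --volumes-from myapp_config myorg/web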

neg3ntropy commented Feb 5, 2016

@dnephin If you go down the road of a single configuration container and use --volumes-from, you sacrifice some security (every container sees all configs), but it looks easy to set up and nice: it's immutable and does not use any host fs path.

Once you operate outside localhost and do not recreate everything at each run, you start learning the subtleties of --volumes-from and Compose's recreation policies: the config image is updated and its container restarted, but client containers still mount their current volumes unless they are recreated as well for independent reasons. This took a while to notice and left us with a workaround of deleting the old config container whenever the config changes.

Another solution would seem to be avoiding immutability and changing data inside the same volumes, running a cp or similar. At this point it would just be easier to pull the configs from a git repo and skip the image build altogether... which was the stateful solution I originally had in mind. If you want no fixed host path, you need a config data-only container and a config copier.

I am not 100% happy with any of the solutions. Either I am not seeing a better way, or maybe some feature is still missing (like some crazy multiple image inheritance, or some smarter detection of dependencies when using --volumes-from that I can't figure out).
What this needs, essentially, is a way to add a build step, an environment layer, without really building a new image.
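
For reference, the delete-and-recreate workaround mentioned above boils down to something like this (service names are placeholders, and it assumes the client services use volumes_from: on the config service):

# When the configuration changes: rebuild the config image, throw away the
# old config container, and force the client services to be recreated so
# they pick up the new volume.
docker-compose build config
docker-compose rm -f -v config
docker-compose up -d --force-recreate config web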

prcorcoran commented Feb 28, 2016

I have a suggested solution for supporting the zero-downtime deployment a lot of us want.

Why not simply add a new option to docker-compose.yml like "zero_downtime:" that would work as follows:

web:
  image: sbgc            # (rails)
  restart: always
  links:
    - postgres
    - proxy
    - cache
  zero_downtime: 50      # delay 50 milliseconds before stopping the old container; default would be 0

I run separate containers for nginx, web(rails), postgres and cache(memcached). However, it's the application code in the web container that changes and the only one I need zero downtime on.

$ docker-compose up -d web

During "up" processing that creates the new "web" container, if the zero_downtime option is specified, start up the new container first exactly like scale web=2 would. Then stop and remove sbgc_web_1 like it currently does. Then rename sbgc_web_2 to sbgc_web_1. If a delay was specified (as in the 50 milliseconds example above) it would delay 50 milliseconds to give the new container time to come up before stopping the old one.

If there were 10 web containers already running it would start from the end and work backwards.

This is how I do zero downtime deploys today. Clunky but works: [updated]
$ docker-compose scale web=2 (start new container running as sbgc_web_2)
$ docker stop sbgc_web_1 (stop old container)
$ docker rm sbgc_web_1 (remove old container)

Update: we need a way to rename the sbgc_web_2 container to sbgc_web_1. Thought we could just use 'docker rename sbgc_web_2 sbgc_web_1' which works but then running 'docker-compose scale web=2' will produce sbgc_web_3 instead of sbgc_web_2 as expected.
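
In other words, the missing piece is roughly this (it fails for the reason explained in a later comment: Compose records the container number in a label, not in the name):

# Renaming works at the Docker level...
docker rename sbgc_web_2 sbgc_web_1
# ...but a later `docker-compose scale web=2` still creates sbgc_web_3,
# because Compose numbers containers from a label rather than from the name.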

davibe commented Mar 9, 2016

What happens to links if you do that? I guess you need a load balancer container linked to the ones you launch and remove, and you can't restart it (?)

prcorcoran commented Mar 11, 2016

The links between containers are fine in the scenario above. Adding a load balancer in front would work, but it seems like overkill if we just need to replace a running web container with a new version. I can accomplish that manually by scaling up and stopping the old container, but it leaves the new container numbered at 2. If the internals of docker-compose were changed to accommodate starting the new one first, stopping the old one, and renumbering the new one, I think this would be a pretty good solution.

davibe commented Mar 12, 2016

In a real use case you want to wait for the second (newer) service to be ready before considering it healthy. This may include connecting to DBs, performing work; it's very application-specific. Then you want to wait for connection draining on the older copy before closing it. Again, connection draining and timeouts are application-specific too. It could be a bit overkill to add support for all of that to docker-compose.
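
As a rough illustration, the "wait until ready" part often ends up as an application-specific polling loop like the one below (the health URL, port, and timeout are assumptions):

# Poll a (hypothetical) health endpoint on the new container before
# switching traffic to it; give up after roughly 60 seconds.
NEW_IP=203.0.113.10   # assumed to come from `docker inspect` on the new container
for i in $(seq 1 60); do
  if curl -fsS "http://${NEW_IP}:9000/healthz" > /dev/null; then
    echo "new container is ready"
    break
  fi
  sleep 1
done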

prcorcoran commented Mar 12, 2016

Right, the 2nd container would need time to start up, which could take a while depending on the application. That is why I proposed adding a delay:
zero_downtime: 50 (delay 50 milliseconds before stopping the old container; default would be 0)
As far as stopping the original goes, it wouldn't be any different than what docker-compose stop does currently.

Basically my proposal is just to start the new container first, give it time to come up if needed, and then stop and remove the old container. This can be accomplished manually with the docker command line today. The only remaining piece would be to rename the new container. That is also possible to do manually today, except that docker-compose doesn't change the internal number of the container.

vincetse commented Mar 29, 2016

Hey folks, I was facing the need for a zero-downtime deployment for a web service today and tried the scaling approach, which didn't work well for me, before I realized I could do it by extending my app into 2 identical services (named service_a and service_b in my sample repo) and restarting them one at a time. Hope some of you will find this pattern useful.

https://github.com/vincetse/docker-compose-zero-downtime-deployment
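
Roughly, the restart-one-at-a-time flow with two identical services looks like this (sketched from the repo's description; service names follow its service_a/service_b, health checks omitted):

# Recreate the two identical services one at a time so that at least one
# copy keeps serving traffic behind the proxy / load balancer.
docker-compose up -d --no-deps --force-recreate service_a
# ...wait until service_a is healthy again before touching service_b...
docker-compose up -d --no-deps --force-recreate service_b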

davibe commented Mar 29, 2016

It does not work for me. I have added an issue on your repo.

iantanwx commented Apr 25, 2016

I just came across this ticket while deciding on whether or not to use compose with flocker and docker swarm, or whether to use ECS for scaling/deployment jobs, using the docker cli only for certain ad-hoc cluster management tasks.

I've decided to go with compose to keep things native. I'm not fond of the AWS API, and I think most developers, like me, would rather not mess about with ridiculously nested JSON objects and so on.

I then came across DevOps Toolkit by Viktor Farcic, and he uses a pretty elegant solution to implement blue-green deployments with compose and Jenkins (if you guys use Jenkins). It's pretty effective having tested it in staging. Otherwise it would seem @vincetse has a pretty good solution that doesn't involve much complexity.

sebglon commented May 17, 2016

A very good implementation of rolling upgrades already exists in Rancher:
http://docs.rancher.com/rancher/latest/en/rancher-compose/upgrading/

zh99998 commented Jul 21, 2016

Now that Docker Swarm mode is native, with built-in load balancing (so no need for haproxy/nginx) and native health check arguments, is there a more streamlined solution?
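
For comparison, a hedged sketch of what that looks like with Swarm mode's built-in rolling updates (service and image names are placeholders):

# Create a replicated service that updates one task at a time.
docker service create --name web --replicas 2 \
  --update-parallelism 1 --update-delay 10s \
  myorg/web:1.2.0

# Roll out a new image; Swarm replaces tasks one by one, waiting 10s between them.
docker service update --image myorg/web:1.3.0 web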

oelmekki commented Apr 16, 2017

Update: we need a way to rename the sbgc_web_2 container to sbgc_web_1. Thought we could just use 'docker rename sbgc_web_2 sbgc_web_1' which works but then running 'docker-compose scale web=2' will produce sbgc_web_3 instead of sbgc_web_2 as expected.

If anyone wonders why, that's because of a label that docker-compose adds to the container:

           "Labels": {
                // ...
                "com.docker.compose.container-number": "3",
                // ...
            }

Sadly, it's not yet possible to update labels on running containers.

(also, to save people a bit of time: trying to break docker-compose by using its labels: section to force the value of that label does not work :P )
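
For reference, the label can be read back from a running container with docker inspect:

# Print the compose container number recorded on a container.
docker inspect -f '{{index .Config.Labels "com.docker.compose.container-number"}}' <container id>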

oelmekki commented Apr 17, 2017

Ok, I managed to automate zero downtime deploy, thanks @prcorcoran for the guidelines.

I'll describe here in more detail how to perform it when using nginx.

  1. scale your service up, using e.g. docker-compose scale web=2
  2. wait for the new container to be ready, either with a blind timeout or by pinging a specific URL on the container
  3. update the nginx upstream list for your domain to list only the new container's IP
  4. perform an nginx configuration reload (e.g. sudo nginx -s reload; do not do a restart, or it will close active connections)
  5. wait for the old container to be done with its running requests (a timeout is fine)
  6. stop the previous container using docker, not docker-compose
  7. remove the previous container using docker, not docker-compose (or don't, if you want to be able to roll back)
  8. scale the service down, using e.g. docker-compose scale web=1

useful commands

To find container ids after scaling up, I use:

docker-compose ps -q <service>

This can be used to find the new container's IP and to stop and remove the old container.

The order is not guaranteed, so the containers have to be inspected to know which one is the oldest.

To find container creation date:

docker inspect -f '{{.Created}}' <container id>

To find new container IP:

docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container id>
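
Putting the steps and commands above together, a rough end-to-end sketch for a single-container web service (the nginx upstream file path, application port, and sleep durations are assumptions):

#!/bin/bash
# Sketch of the rotation described above; not a drop-in script.
set -e

SERVICE=web
APP_PORT=9000                                         # assumed application port
UPSTREAM_FILE=/etc/nginx/conf.d/upstream_web.conf     # assumed upstream file

# 1. Scale up so the old and new containers run side by side.
docker-compose scale ${SERVICE}=2

# 2. Sort the two containers by creation date to tell old from new.
ids=$(docker-compose ps -q ${SERVICE})
old_id=$(docker inspect -f '{{.Created}} {{.Id}}' ${ids} | sort | head -n1 | awk '{print $2}')
new_id=$(docker inspect -f '{{.Created}} {{.Id}}' ${ids} | sort | tail -n1 | awk '{print $2}')

# 3. Wait for the new container (blind timeout here), then point nginx at it.
sleep 10
new_ip=$(docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ${new_id})
echo "upstream app { server ${new_ip}:${APP_PORT}; }" | sudo tee ${UPSTREAM_FILE} > /dev/null
sudo nginx -s reload

# 4. Let the old container drain, then stop and remove it with plain docker.
sleep 10
docker stop ${old_id}
docker rm ${old_id}
# (docker-compose scale web=1 is now effectively a no-op, since only one
# container is left running.)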

a few more considerations

As mentioned in previous comments, the number in the container name will keep incrementing. It will be e.g. app_web_1, then app_web_2, then app_web_3, etc. I didn't find that to be a problem (if there's ever a hard limit on this number, a cold restart of the app resets it). I also didn't have to rename containers manually to keep the newest container up; we just have to manually stop the old container.

You can't specify port mapping in your docker-compose file, because then you can't have two containers running at the same time (they would try to bind to the same port). Instead, you need to specify the port in nginx upstream configuration, which means you have to decide about it outside of docker-compose configuration.

The described method works when you only want a single container per service. That being said, it shouldn't be too hard to look at how many containers are running, scale to double that number, then stop/rm that number of old containers.

Obviously, the more services you have to rotate, the more complicated it gets.

jonesnc commented Aug 22, 2017

@oelmekki The scale command has been deprecated. The recommended way to scale is now:

docker-compose up --scale web=2

jonesnc commented Aug 22, 2017

@oelmekki also, if the web container has port bindings to the host, won't running scale create a port conflict?

Bind for 0.0.0.0:9010 failed: port is already allocated is the message I get for a container that has the following ports:

ports:
  - 9000:9000
  - 9010:9010

If you have a setup that utilizes nginx, for instance, this probably won't be an issue since the service you're scaling is not the service that has port bindings to the host.

oelmekki commented Aug 23, 2017

@jonesnc

also, if the web container has port bindings to the host, won't running scale create a port conflict?

That's why I explicitly mention not to do it :)

In my previous comment:

You can't specify port mapping in your docker-compose file, because then you can't have two containers running at the same time (they would try to bind to the same port). Instead, you need to specify the port in nginx upstream configuration, which means you have to decide about it outside of docker-compose configuration.

--

You can't bind those ports on the host, but you can bind them on the containers, which each have their own IP. So the job is to find the IP of the new container and replace the old container's IP with it in the nginx upstream configuration. If you don't mind reading golang code, you can see an implementation example here.

jonesnc commented Aug 23, 2017

@oelmekki oops! That part of your post didn't register in my brain, I guess.
