Use in Docker #34

Closed
jygastaud opened this issue Aug 3, 2018 · 13 comments

Comments

@jygastaud

Hi,

Nice project. Will definitely try it soon.

Have you already tried to integrate or use it with Docker, as a replacement for native file sharing?

I'm thinking of something like what docker-sync does with unison and rsync.

@xenoscopic
Member

It's definitely something that's on the radar and something that I've thought about extensively, especially for cases where you're using a remote Docker daemon instance. I hadn't seen docker-sync in particular before, but I'll check it out.

I wanted to get the initial version of Mutagen out with SSH support and see what people's use cases looked like before committing to a design for integrating other platforms. I'm hoping that people who are using Docker more frequently than myself can sort of prototype (with the SSH support) what they'd want Mutagen's Docker support to look like.

There are a few technical issues with supporting Docker that'll require some thought, e.g. how to capture all of the Docker environment variables that influence behavior into the Mutagen session data, but I have some vague thoughts about how to solve these.
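As a rough illustration of what "capturing" the Docker environment could mean, the sketch below prints the Docker-related variables that are currently set, so that they could be stored alongside a session. The helper name `capture_docker_env` is hypothetical, and the choice of `DOCKER_HOST`, `DOCKER_TLS_VERIFY`, and `DOCKER_CERT_PATH` is an assumption about which variables matter, not a description of what Mutagen records.

```shell
# Hypothetical helper: print each Docker-related variable that is set,
# one NAME=value per line. The variable list is an assumption.
capture_docker_env() {
    for name in DOCKER_HOST DOCKER_TLS_VERIFY DOCKER_CERT_PATH; do
        # ${VAR+x} expands to "x" only when VAR is set (even if empty).
        eval "is_set=\${$name+x}"
        if [ -n "$is_set" ]; then
            eval "value=\${$name}"
            printf '%s=%s\n' "$name" "$value"
        fi
    done
}
```

The output could be saved when a session is created and re-exported before each later `docker` invocation, so that the session keeps talking to the same daemon even if the shell environment changes.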

With Docker there's also the question of whether or not we might be able to get better performance using docker cp and docker exec than with scp and ssh. This would definitely be interesting to explore.

If you're a regular Docker user and get a chance to play with Mutagen against a test container, I'd be interested in your feedback on performance and what you might want Docker integration to look like.

@Toilal

Toilal commented Sep 5, 2018

As you know, Docker on Windows suffers from crap performance when volumes are involved, because they are implemented as some kind of network share between the host (Windows) and the Docker daemon (Linux VM).

For many projects, I have used WinNFSD to set up a network folder between the host and the Docker daemon, which is still better than the default sharing system provided by Docker for Windows or any other tooling. But it's still slower than native Docker running right on Linux.

I have switched to Mutagen on a new project, as it now works really well since most issues have been fixed in the latest versions (thx @havoc-io). The application is now running at native speed, by which I mean roughly a 3x reduction in response time on a PHP Symfony API + Angular 6 webapp. Sync between host and VM on file changes is almost instant, less than 1 second in most cases.

If you are looking for a Vagrant-based Docker environment for development on Windows, you may want to have a look at this project: https://github.com/GFI-Informatique/docker-devbox. It's based on WinNFSD for Vagrant shares, but you can ignore those shares and set up a Mutagen session instead.

@xenoscopic
Member

I'll have a look at some of these projects and write up a proposal for what Docker integration might look like. It sounds like the issue isn't really technical (i.e. you can already connect to a Docker container via SSH, so you can already use Mutagen), but more of a question about automation and integration.

I don't really use Docker myself, at least not for development work, so I'm not sure how people would want things to look. My nominal thought is that you should be able to do something like:

mutagen create my/local/path docker://CONTAINER_ID/path

Or is there additional integration that would be useful?

@Toilal

Toilal commented Sep 5, 2018

I'm sorry @havoc-io, but this proposal doesn't make much sense. You can't connect to a Docker container through SSH, because a Docker container has no SSH daemon running. Basically, a Docker container is a single process (e.g. Apache) that is isolated from the real OS.

If you don't use Docker for Windows or Docker for macOS, I think you misunderstood the initial request from @jygastaud. It seems he is looking for an alternative to docker-sync.

To make it short, when you are running Docker on Linux, everything is OK. You can mount volumes that act as shares between the host and the container, and these volumes will be almost as performant as a native hard drive. No problem at all with "Docker for Linux".

Problems arise when using "Docker for Windows" or "Docker for macOS". In this case, the Docker daemon is still running in a lightweight Linux VM (Hyper-V for Windows), which is hidden by the "Docker for Windows" packaging, but you have to keep in mind that it still runs on a Linux VM.

In this case, when you mount a volume from the host into the container, it goes through an additional network share layer between the Windows host and the Linux VM running the Docker daemon, and then the volume can be mounted from the network drive inside the Docker VM to the container.

The problem is that this additional network share layer has a big impact on the performance of the application running inside the container, and it also messes up file permissions.

One solution is to use WinNFSD, an NFS server implemented for Windows, which gives better performance than the default Windows share (SMB) provided by "Docker for Windows", but it's still much slower than "Docker for Linux".

But another solution is to use Mutagen, as this results in native performance from the container's point of view.

Keep in mind the real subject is not to sync files right inside the container, but to sync files inside the VM running the Docker daemon.

The definitive way to handle this would be to add Mutagen support to the docker-sync project as a replacement for unison. But that seems out of scope for Mutagen itself. I'll open an issue on the docker-sync project.

See EugenMayer/docker-sync#603

@xenoscopic
Member

> I'm sorry @havoc-io, but this proposal doesn't make much sense. You can't connect to a Docker container through SSH, because a Docker container has no SSH daemon running. Basically, a Docker container is a single process (e.g. Apache) that is isolated from the real OS.

Well, that's not really true. A Docker container is an entire Linux userspace in an isolated filesystem attached to an isolated section of a host kernel, but it's a full Linux environment. It's certainly possible to start an SSH server inside of a Docker container, whether it's OpenSSH or Dropbear, by setting it up in the Dockerfile or by installing it later through docker exec. Whether or not people do is obviously another question. I think it depends a lot on the individual user's case. Do most people just use docker attach if they need shell access?

Anyway, this is my point... I'm not sure there's a standard workflow for most people. Whatever Mutagen does here is going to need to be something relatively low-level that people can integrate into their higher-level workflows.

Regarding integration into docker-sync...

It seems to me that it makes more sense to synchronize files directly into the Docker container, instead of synchronizing them to the host VM's filesystem and then mounting that inside the container. Synchronizing directly into the container is going to get you the best performance while keeping the same design on Linux (where there is no intermediate VM), macOS and Windows (where there is an intermediate VM), and cases where your Docker containers may be hosted on another machine (where you might not have access to the filesystem, only the Docker daemon).

I can certainly understand that most people aren't going to want to start an SSH server inside the Docker container to support synchronization. That's why I would propose adding support for using docker cp and docker exec to push a Mutagen binary right inside the container and run it from inside the container (with the synchronization happening over docker exec's standard input/output, just like it happens over SSH's standard input/output).

I'm admittedly not an expert on Docker or docker-sync, but my impression is that docker-sync takes the approach that it does because there's no way for it to synchronize directly into the container. I'm imagining that you could use Mutagen to do something like:

mutagen create some/local/path docker://<container-id>/path/inside/container/filesystem

Then it would use docker cp to push a Mutagen agent binary into the container and docker exec to communicate and synchronize with it over standard input/output.

The nice thing about this design is that it's the same behavior on Linux, Windows, and macOS, and you don't need to worry about whether or not there's an intermediate VM or if the Docker daemon is remote.
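As a rough illustration of the proposed flow (not Mutagen's actual implementation), the two steps could look like the shell sketch below. The agent path, the install location `/tmp/sync-agent`, and the helper names `run` and `sync_into_container` are all hypothetical. By default the commands are only printed; setting `DRY_RUN=0` would actually invoke `docker`.

```shell
# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical sketch of "copy an agent in, then talk to it over stdio".
sync_into_container() {
    container=$1             # container name or ID
    agent=$2                 # local path to an agent binary (hypothetical)
    remote=/tmp/sync-agent   # install location inside the container (assumed)

    # Step 1: copy the agent binary into the container's filesystem.
    run docker cp "$agent" "$container:$remote"

    # Step 2: start the agent with stdin kept open (-i), so the caller can
    # exchange data with it over standard input/output, as with SSH.
    run docker exec -i "$container" "$remote"
}
```

Because both steps go through the Docker CLI, the same two commands would apply whether the daemon is local, inside a VM, or remote.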

@Toilal

Toilal commented Sep 5, 2018

Your definition of Docker is much better than mine :).

I have to go now, and I'll take time to give a longer answer later, but both solutions may exist (docker-sync integration & docker container sync agent).

@xenoscopic
Member

xenoscopic commented Sep 20, 2018

I just wanted to give a quick update on this.

I have Docker support working in the docker branch and I will release a beta tomorrow.

Under the hood, it uses docker cp and docker exec, and it synchronizes directly into the container, so it doesn't matter if you're using Docker on Linux or macOS or Windows, and it doesn't matter if you're using Docker natively, through a VM, or on a remote host - it works exactly the same in all cases.

In addition to Linux containers, it also supports Windows containers.

I am going to label the support as "experimental" for now, because I would like some feedback on the interface and usage. I plan to release a beta tomorrow so that you can play with it if you like.

Essentially it works exactly the same as Mutagen for local and SSH synchronization, it just has a different URL format:

docker://[user@]<container>/<path>

So you can do something like:

mutagen create local/path docker://my-container-name-or-id/path/in/container

to set up synchronization between a local path and a Docker container.

Of course, you can use any pair of URLs that you want, so you can do something like:

mutagen create user@host:~/ssh/path docker://<container>/~/path/in/home

to synchronize a remote SSH location to a Docker container (if you wanted to do that for some reason).
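To make the URL format concrete, here is a small parsing sketch in POSIX shell. It is only an illustration, not Mutagen's parser: the function name `parse_docker_url` is hypothetical, and the rule that a path starting with `~` is kept as home-relative while anything else is treated as an absolute path is an assumption inferred from the two examples above.

```shell
# Hypothetical parser for docker://[user@]<container>/<path>.
# Prints user=, container=, and path= lines; returns 1 on malformed input.
parse_docker_url() {
    url=$1
    case $url in
        docker://*) rest=${url#docker://} ;;
        *) echo "not a docker:// URL: $url" >&2; return 1 ;;
    esac
    case $rest in
        */*) ;;
        *) echo "missing path in: $url" >&2; return 1 ;;
    esac

    target=${rest%%/*}   # "[user@]container": everything before the first slash
    path=${rest#*/}      # everything after the first slash

    case $target in
        *@*) user=${target%%@*}; container=${target#*@} ;;
        *)   user=; container=$target ;;
    esac
    if [ -z "$container" ]; then
        echo "missing container in: $url" >&2
        return 1
    fi

    # Assumption: "~..." stays home-relative, anything else is absolute.
    case $path in
        "~"*) ;;
        *) path=/$path ;;
    esac

    printf 'user=%s\ncontainer=%s\npath=%s\n' "$user" "$container" "$path"
}
```

For example, `parse_docker_url docker://my-container-name-or-id/path/in/container` reports an empty user, the container `my-container-name-or-id`, and the path `/path/in/container`.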

I'll write up full documentation today before I put out the beta release.

If you have a chance to play with it next week, that would be great. I would be interested to know what could be improved to integrate it with your Docker workflows.

@divoxx

divoxx commented Sep 20, 2018

This is awesome news.

In terms of workflow and usability, are there any plans to integrate this better with orchestration tools or maybe docker-sync?

The main issue I see is the need to manually install the agent directly into a running container every time the container is created or started. In practice, most people will use docker-compose or some other orchestration tool to manage the lifecycle of the containers, etc.

Having to manually execute that step whenever it's needed can be a problem when you have a large number of containers or you have to rebuild your containers a lot (like during development).

EDIT: Or maybe as an alternative, have a way to start a container that exclusively runs the agent and maps a folder to a specific named volume. That container would not need to be restarted/rebuilt constantly, and other containers could then mount the named volume that is managed by Mutagen. This is similar to how docker-sync works, but Mutagen could support this as a standalone mode.
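The named-volume idea can be approximated by hand with the existing URL format. The sketch below only prints the commands involved rather than running them; the function name `sidecar_commands`, the `-sync` container suffix, the `alpine` image, the mount points, and `my-app-image` are all hypothetical placeholders, and this is not a built-in Mutagen mode.

```shell
# Dry-run sketch: print the commands for a long-lived "sync" container that
# owns a named volume, plus an application container mounting that volume.
sidecar_commands() {
    volume=$1                # named volume shared between containers
    local_path=$2            # host directory to synchronize
    sidecar=${volume}-sync   # hypothetical name for the sync container

    cat <<EOF
docker volume create $volume
docker run -d --name $sidecar -v $volume:/data alpine tail -f /dev/null
mutagen create $local_path docker://$sidecar/data
docker run -d -v $volume:/app my-app-image
EOF
}
```

The sync container just idles (`tail -f /dev/null`), so it survives rebuilds of the application containers, which only need to mount the same volume.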

@jygastaud
Author

Great!!

I've pinged some members of my team to try that integration.

In addition to @divoxx's comment, maybe you could explore the creation of a Docker plugin.

@xenoscopic
Member

xenoscopic commented Sep 20, 2018

@divoxx There was some discussion started at EugenMayer/docker-sync#603 about integration as a docker-sync strategy. That's probably outside the scope of Mutagen, but I'll ping that issue once I get the beta out tomorrow. Certainly I'm willing to help make that a reality in any way that I can.

The idea with the coming release is to provide the synchronization primitives necessary for people to experiment with integrating into arbitrarily complex Docker setups. For some people, who just have a single container that they want to play with, Mutagen's direct support might be sufficient, but other people will almost certainly want to integrate the functionality into orchestration systems, scripts, etc.

What I'm hoping to determine is how Mutagen's Docker support can help with this. What sort of additional flags are needed? Are there other plumbing commands that would be useful? Are there higher-level constructs (e.g. COMPOSE_* environment variables) that Mutagen should be aware of? That sort of thing.

With Docker there are essentially endless possible setups, so the best thing for Mutagen to do is provide the low-level functionality for people to build up more complex behaviors. I'll probably open a separate "Docker feedback" issue where people can report their findings, any barriers they're running into with regard to scripting and automation, etc.

@xenoscopic
Member

Actually it looks like I'll have to cut a beta release tomorrow instead, because the documentation updates took a little bit longer than I expected. I'll ping as soon as it's built and pushed out.

@divoxx

divoxx commented Sep 21, 2018

That makes sense, @havoc-io. Thanks for the explanation, and for the great work. I'm excited about the possibilities.

@xenoscopic
Member

xenoscopic commented Sep 22, 2018

I've just put out v0.7.0-beta1 which includes Docker support. Documentation on this feature is available here, and I've updated the README and site to reflect Docker support. If installing via Homebrew, you can get this release with brew install --devel mutagen (or, if you haven't added the tap yet, brew install --devel havoc-io/mutagen/mutagen).

Please note: This release is currently marked as a pre-release, and Docker support in particular is considered experimental. There are no known issues, but it's a big release in terms of internal changes, so please play with it in test environments for the time being. If all looks good after the next week or so, I'll take off the beta1 tag.

Since this adds the initial implementation of Docker support, I'm going to close this issue, but I've opened #41 for providing feedback on Docker support to help evolve and extend it. Real world feedback is the only way to know what else needs to be added, so please share your ideas/thoughts/opinions, positive or negative, no matter how small!
