run: introduce -ro for read-only containers #2710

Closed

stapelberg

Instead of having a read-writable top layer, containers started with -ro
will have a read-only top layer.

The motivation is to avoid mistakes when “dockerizing” software: regular
software is written under the assumption that it can write data to any
writable path on the machine. In order to be able to throw away docker
containers at will and be reasonably sure that no data is lost, one can
use -ro and use volumes for all data that should be writable. That way,
when you forget to use a volume (or the software in question starts
using a new path that you don’t have a volume for), you’ll get errors
instead of silent data loss and bad surprises.

The following changes were necessary to make this work:

• Create /lxc_putold in the docker init layer, otherwise lxc-start
will try to create it and fail (due to the read-only filesystem)

• Mount a tmpfs over /dev and create /dev/{shm,pts}, because lxc-start
wants to re-create the files in there. Perhaps lxc-start could be changed in
the future to not require that if the files are already there.
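
The two changes listed above can be sketched as an ordered plan of commands. This is only an illustration, not the PR's actual implementation: the function name `readonly_rootfs_plan` and the rootfs path are hypothetical, and the sketch assumes the directory and the tmpfs are set up before the top layer becomes read-only. It only builds and prints the commands; it does not execute them.

```python
from pathlib import PurePosixPath


def readonly_rootfs_plan(rootfs: str) -> list[list[str]]:
    """Return the commands (as argv lists) that prepare a read-only rootfs.

    Hypothetical helper: it mirrors the two bullet points of the PR
    description and nothing more.
    """
    root = PurePosixPath(rootfs)
    dev = root / "dev"
    return [
        # Pre-create /lxc_putold so lxc-start does not have to create it.
        ["mkdir", "-p", str(root / "lxc_putold")],
        # Give lxc-start a writable /dev by mounting a tmpfs over it.
        ["mount", "-t", "tmpfs", "tmpfs", str(dev)],
        # Create the directories the PR description names under /dev.
        ["mkdir", "-p", str(dev / "shm"), str(dev / "pts")],
    ]


# Assumed example path, for illustration only.
for argv in readonly_rootfs_plan("/var/lib/example/rootfs"):
    print(" ".join(argv))
```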

@crosbymichael
Contributor

@stapelberg

Thanks for the PR, but I'm not sure we want to support a feature like this right now. There are a lot of exceptions that need to be made, like mounting /dev in tmpfs as you are doing in this PR for lxc.

In the future, if we change execution backends (lxc, libvirt, openvz, etc.), we do not fully know what may need to be supported for the different backends. We just don't want to make changes right now when we don't fully understand their consequences for docker development going forward.

@stapelberg
Author

@crosbymichael, Thanks for your reply.

I’d argue that read-only systems are possible, no matter which execution backend Docker starts to support. I don’t see what would be different enough between lxc, libvirt, openvz and others that would make this impossible.

I am also somewhat disappointed that you seem to think this feature is not a big deal. IMO, this is the only reliable way to get software dockerized. May I ask you please to reconsider merging this PR, perhaps after discussing it with other docker devs?

@crosbymichael
Contributor

@stapelberg

I brought up execution backends because you mention in this PR that we make the filesystem ro except for mounting /dev in tmpfs so that lxc can write a few files. How do we know that libvirt will not need to write files somewhere else, adding another dir that needs to be mounted in tmpfs? That was the point I was trying to make, and we are not fully done discussing the design changes for libvirt.

I did not say that this is not a big deal. Knowing the current development of docker, I do not think that this is the right time to merge a feature like this while we are still working through internals, especially libvirt changes.

Also, why do you think this is the only reliable way to dockerize software?

@stapelberg
Author

@crosbymichael, I determined that we need to mount /dev as a tmpfs by trying to run a container :). If we need additional paths, we can add them as the need arises, but I don’t think there should be any.

I think this is the only reliable way to dockerize software because silent data loss is very likely to go unnoticed if you don’t have that. As an example, let’s assume I dockerize askbot, a FOSS Stack Overflow clone. I realize that I need to make its PostgreSQL database persistent, but maybe I don’t realize it also stores user uploads as files and perhaps even more data. Without a read-only container, I will never see error messages, because the writes actually succeed. Then, only after my server crashes and burns and I deploy the same container on another machine, I realize that all my user data is gone.
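
The failure mode described in this comment can be illustrated with a toy model. The sketch below does not touch Docker or a real filesystem: `ReadOnlyRootFS` is a hypothetical class that refuses every write outside a declared volume, and both paths are assumptions chosen to match the askbot example (they are not taken from askbot itself).

```python
import errno
from pathlib import PurePosixPath


class ReadOnlyRootFS:
    """Toy model: a path is writable only if it lies under a declared volume."""

    def __init__(self, volumes):
        self.volumes = [PurePosixPath(v) for v in volumes]
        self.files = {}

    def _in_volume(self, path):
        p = PurePosixPath(path)
        return any(p == v or v in p.parents for v in self.volumes)

    def write(self, path, data):
        if not self._in_volume(path):
            # Fail the way a read-only filesystem would: with EROFS.
            raise OSError(errno.EROFS, "Read-only file system", path)
        self.files[path] = data


# Only the database directory was declared as a volume (assumed path).
fs = ReadOnlyRootFS(volumes=["/var/lib/postgresql"])
fs.write("/var/lib/postgresql/data/base", b"rows")

try:
    # The forgotten upload directory (assumed path) is rejected immediately.
    fs.write("/srv/askbot/upfiles/avatar.png", b"png")
except OSError as exc:
    print("write refused:", exc.filename)
```

With a writable top layer the second write would succeed and the data would disappear with the container; in the model above it fails at the moment of the write, which is the behavior the PR is asking for.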

@miko

miko commented Feb 6, 2014

I think you can run your process as nobody (or a user without write access) for now (as it works in standard Unix systems). And you can do "docker diff" to check if something was written to the container.

@betawaffle

You can always run docker diff to see if your container has changed any files you weren't expecting.
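
A minimal sketch of the `docker diff` check suggested above, assuming the output has been captured as text with one change per line in the form `A /path`, `C /path` or `D /path`. The function name `unexpected_changes`, the default ignore list and the sample output are made up for illustration.

```python
def unexpected_changes(diff_output, ignore=("/dev", "/tmp")):
    """Return (kind, path) pairs that are not under any ignored prefix."""
    found = []
    for line in diff_output.splitlines():
        line = line.strip()
        if not line:
            continue
        kind, _, path = line.partition(" ")
        if kind not in ("A", "C", "D"):
            continue  # skip anything that is not a change line
        if any(path == p or path.startswith(p.rstrip("/") + "/") for p in ignore):
            continue  # writes here are expected
        found.append((kind, path))
    return found


# Hypothetical captured output, for illustration only.
sample = """\
C /dev
A /dev/kmsg
C /srv
A /srv/uploads/avatar.png
"""

for kind, path in unexpected_changes(sample):
    print(kind, path)
```

Unlike a read-only top layer, a check like this only reports writes after they have happened, so it has to be run deliberately.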
