
Better documentation for storage drivers #18298

Closed
xeor opened this issue Nov 29, 2015 · 26 comments
Labels
area/docs, area/storage, exp/beginner, kind/enhancement

Comments

@xeor

xeor commented Nov 29, 2015

I'm sorry if this is a tiresome topic, but I can't find a good answer.

I have started looking into which storage driver to use for my Docker machines in production.
Overlayfs looked like the one to use, so I tried it out, and got a ton of errors with socket files. Then I started getting weird errors when one layer deleted files from another layer at build time. Switching to ext4 instead of xfs as the backing filesystem for the overlay driver solved that.
It's confusing to have to dig through all these old issues to figure out that these are mostly known issues.
It's also confusing to go from devicemapper (loop) > devicemapper (thin) > overlayfs (with xfs) > overlayfs (with ext4).

The one that worked best for me (least buggy) was actually devicemapper, both loop and thin.

What is the "at the time" best option? What are the pros and cons?
Maybe this should be part of the documentation and updated from time to time. A matrix-style overview.
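
For reference, the active driver and its backing filesystem can be confirmed with docker info; a minimal check, with the output shape illustrative of a Docker 1.9-era daemon on ext4:

$ docker info | grep -E 'Storage Driver|Backing Filesystem'
Storage Driver: overlay
 Backing Filesystem: extfs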

@GordonTheTurtle

Hi!

Please read this important information about creating issues.

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

This is an automated, informational response.

Thank you.

For more information about reporting issues, see https://github.com/docker/docker/blob/master/CONTRIBUTING.md#reporting-other-issues


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:
uname -a:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

@thaJeztah
Member

We recently added a whole section dedicated to storage drivers in our documentation: https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/. Is that what you're looking for?

@xeor
Author

xeor commented Nov 29, 2015

More something like https://docs.docker.com/engine/userguide/storagedriver/selectadriver/
The article is good, but its recommendation is not that helpful. It says to use something that you have experience with, can get support for, are comfortable with, and so on. But the fact is that not many people have dealt with layered filesystems before Docker. And many people are using Docker on their own personal systems, to see if they can bring it into their professional work some day.

Taking my own experience;

  • devicemapper (loop): Used as the default; worked like a charm most of the time. Got a lot of warnings that I was using the wrong one.
  • devicemapper (thin): Got rid of the warning, but soon discovered everyone was using overlayfs.
  • overlayfs (xfs): Used xfs, as that tends to be the default fs for my distro (CentOS 7). Sockets had to be mounted in so they were not sitting on overlayfs. Got other weird issues when a build deleted something from an upper layer.
  • overlayfs (ext4): Got rid of the "delete above layer" problem. Still no sockets. And discovered yum is having problems, which I can "kinda" solve with an rpm --rebuilddb before yum (see the sketch after this comment). But I'm also reading a lot of complaints that overlayfs does not follow POSIX as closely as it should.

So, why does everyone push for overlayfs if it's so buggy?

An overview in a table would be really useful. This little story shows that this is a minefield.

What about a page with the backing-filesystem combinations on the X axis, and a little info about things like known issues, use if, don't use if, production ready, considered stable from version, recommended settings, and so on?

I know that this issue is a complex one, and one filesystem clearly doesn't fit all. But trying and failing, hitting known bugs along the way as I did while trying to follow the recommendations, is not a good thing either.
It would be really helpful if someone with a lot of experience in all of this would write down some real recommendations, which we can keep updated.
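
A sketch of the rpm workaround mentioned in the list above; the command is hypothetical and the package name is only an example:

# workaround sketch: rebuild the rpm database before the yum transaction
# (at build time this would typically sit in a single Dockerfile RUN line;
# httpd is just an example package, not from this thread)
rpm --rebuilddb && yum install -y httpd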

@thaJeztah
Member

Yes, even with the additional information in the new section, which storage driver to pick remains a tricky question to answer, and depends on many variables: the distro you're running on, kernel support, and use case, to name a few.

We're certainly open to improvements (and welcome contributions in that area).

For RHEL/Fedora/CentOS based installations, devicemapper is currently the most used option; it's the most actively maintained driver (actively maintained by Red Hat). Using loopback with devicemapper is highly discouraged by the devicemapper maintainers, especially for production use.

w.r.t. overlay: this driver may become the default in future; one of the reasons is that it's included in newer kernels by default, so it has widespread support. Unfortunately it still has some quirks (incompatibility with RPM being one of them, excessive inode usage another), which make it currently unsuitable as the "default" (see #12354). Depending on your use case, you may not be affected by those quirks, and overlay could be an option to consider. Also, Red Hat is currently working on changes to RPM to make it work with overlay.

We considered including a compatibility matrix in the documentation, but decided to not include this at this moment, because it may be difficult to maintain and keep up to date (see https://github.com/docker/docker/pull/16766/files?short_path=991890e#diff-991890e619874cd6bb0277584bb7f7a4), but again, open to suggestions here.
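
For readers who want the non-loopback setup referenced above, a rough sketch of a direct-lvm thin pool as documented around Docker 1.9; the volume group name "docker" and the pool names are assumptions:

# assumes a spare block device has already been added to an LVM volume group named "docker"
lvcreate --wipesignatures y -n thinpool docker -l 95%VG
lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta
# point the daemon at the pool instead of the default loopback files
docker daemon --storage-driver=devicemapper --storage-opt dm.thinpooldev=/dev/mapper/docker-thinpool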

@xeor
Author

xeor commented Nov 30, 2015

I think that if I had seen https://www.docker.com/compatibility-maintenance earlier, I would probably have stopped at dm with thin.

The compatibility matrix was a good start imo. As suggested in pull #16766, I don't know if people setting up Docker would even see this message. At least for many of the people I have talked to, I think they want to stay as close to "distro default" as possible.

What about having a link to some more online documentation in the "Usage of loopback devices is strongly discouraged for production use. Either use --storage-opt dm.thinpooldev or use --storage-opt dm.no_warn_on_loop_devices=true to suppress this warning." message I am getting on a default installation on CentOS 7?
When I first saw that, I quickly started the journey to find the "best" filesystem I could get for my containers, ending up with overlay as the best choice in general.
A link to https://www.docker.com/compatibility-maintenance would at least give the user some information on what is "supported" at the moment.
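
Whether a daemon is actually running on loopback shows up in docker info; a quick check, with illustrative output for a default devicemapper installation:

$ docker info | grep 'loop file'
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata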

@thaJeztah
Member

What about having a link to some more online documentation

That could be helpful, yes. I don't think we currently output URLs as part of log messages, and if we start doing so, we should make sure we have a permalink (some URLs have changed over time, and although we try to put redirects in place, we may have missed some).

@moxiegirl @SvenDowideit Any thoughts on this?

A link to https://www.docker.com/compatibility-maintenance would at least give the user some information on what it's "supported" at the moment.

Oh, that's an interesting one. The page you're referring to there is for the "commercially supported" (CS) release of Docker. The open source Docker engine may actually "support" far more combinations (technically), but commercial support may not cover those.

Perhaps we should add some links to commercially supported versions of Docker, to make users more aware of their existence, not sure.

I'll defer that one to @moxiegirl 😄

@moxiegirl
Contributor

@xeor I'm very sorry our documentation couldn't meet your needs. Thank you for taking the time to provide such great feedback. It is very helpful. This article was written by a guest blogger in the Docker space -- the doc team provided editorial guidance. @nigelpoulton is very knowledgeable and researched anything he wasn't already familiar with. We will work with him to continue to improve what is there.

That said, documentation, unfortunately, cannot always address every user or every use case. For that reason, when seeking production experience, you are really well served by supplementing your documentation reading with questions to the Docker community. Their real-world experience will always be broader, as it grows every day. You are likely to find a use-case match there faster than in the docs.
The Google group and the IRC channels are good resources.

Adding links in log messages is very hard to maintain and manage, though I do agree with you that they can be helpful. I'll add it to our requirements, but I can't promise anything in the near future.

The https://www.docker.com/compatibility-maintenance page refers to our Commercially Supported Engine. There is documentation for that, but since it is used exclusively with the Trusted Registry, the docs are in that section. It doesn't include the compatibility matrix; we can add that, and I will work with our team to make it more prominent.

@xeor
Author

xeor commented Nov 30, 2015

Thanks for the reply @moxiegirl. I usually turn to the community when looking for answers to questions like this. However, this time I read a bunch of blog posts, "friends don't let friends use devicemapper"-type posts, and so on. Many such articles are out there, and it's hard to find a "correct" one that is not outdated.
Isn't that something that should be in the official docs? Updated from time to time.

What about a simple "go.docker.com/ab23"-type perm-linking / URL-shortener / redirector?
I feel like some of the errors/info messages I have been stumbling on when using Docker need much more text, and maybe even pictures, to explain. That would be much better than people copy/pasting the one-liner error they are getting and ending up at the blog post of whoever was first to "find" a solution.

@moxiegirl
Contributor

@xeor Yes, we do want to include more information from bloggers. As time and resources allow, we will do more. I'll make sure to review some blogs from external sources when we update the material.

Regarding the links, the technology of shortening is not the issue. Error and information messages are produced by the development team for each product. So, while I can make suggestions about their construction, the ultimate decision is up to the development team. They decide on standards and guidelines around code.

@thaJeztah is an encyclopedia of our PR and coding standards. I'm sure he will bring message guidelines up with the team, as will I. We really do appreciate your suggestion and are not going to forget it, even if we can't implement it as quickly as we'd like.

@nigelpoulton
Contributor

@xeor

I'll have a think about making the "Which storage driver should you choose?" section a bit better.

Thoughts so far are....

  1. Clarify and beef up the paragraph that refers to the CS engine. Include a link to the CS supported configs page. Mention that these configs are the most stable (that's why they're supported by the CS engine). Also mention that people can use CS engine configs as a generic guide to what is most stable and production-ready. Other stuff works... but the CS engine goes with device-mapper and AUFS for a reason!
  2. Add some clarification to the paragraph that talks about using "something that you are comfortable and have experience with". I fully take your point that layered filesystems etc. are new to many people. With that in mind I may word things to suggest that if you're new to this... you might want to follow the path being set by the CS engine - e.g. recommend device-mapper in an lvm-direct config. Also... device-mapper is known by most users of RHEL and CentOS. For those who use Ubuntu, AUFS may still be the most stable and simplest option.
  3. Add a para that talks about OverlayFS being seen by many as the potential future, but still young compared to device-mapper etc.

I'm thinking out loud, but these ideas feel like they'd offer more guidance without being constrictive.

Any further thoughts?

@xeor
Author

xeor commented Dec 1, 2015

When talking about OverlayFS being young, maybe there could also be a mention of what kinds of bugs might show up in overlayfs. Being "young" but also "the future" in the same paragraph tells me that a lot of people are using it and that it doesn't contain that many bugs.
Maybe we could mention some of the bugs, like the rpm and socket problems?

I don't know about AUFS, but setting up device-mapper with lvm-thin does require some extra work.
Some of the questions people might have (talking from my own experience) are:

  • How big should the meta-volume be?
  • Why is /var/lib/docker still filling up? Do I need an extra mountpoint for /var/lib/docker in addition to the lvm-thin data volumes? (see the sketch after this comment)

Referring to the CS is a good idea imo. I think most of the people playing with Docker want to play with "Docker" itself, not with storage problems :)
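
On the sizing and fill-up questions above: with an lvm-thin setup, the pool's data and metadata usage can be watched with lvs. A sketch assuming a volume group named "docker"; the numbers are illustrative:

$ sudo lvs docker/thinpool -o lv_name,data_percent,metadata_percent
  LV       Data%  Meta%
  thinpool 12.31  2.05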

@nigelpoulton
Contributor

OK, so first things first... I'll whip up a rough redo of the "Which storage driver should you choose?" section and post it here for thoughts.

@thaJeztah
Member

Thanks for looking into this @nigelpoulton!

@thaJeztah added the kind/enhancement and area/docs labels on Dec 1, 2015
@moxiegirl
Contributor

Thanks @nigelpoulton :-D and @xeor --- you have a great point about being more specific about the actual limitations.

@xeor
Author

xeor commented Dec 1, 2015

I'm glad you agree :) Sometimes it has felt like this is something everyone knows everything about.

Thanks again you all!

@nigelpoulton
Contributor

@xeor

OK, so the following is massively rough (I'm at a conference at the moment, so short on time) and is only some thoughts on updating the recommendation section.

@moxiegirl I've gone heavy towards using the CS Engine as the gold standard for stability and reliability - a great idea that makes perfect sense (to me at least) and also gives airtime to the existence of the CS Engine. Anyway... here goes -

As you might expect, the answer to this question is “it depends”. However, while there are some clear cases where one particular storage driver outperforms another for certain workloads, you should factor all of the following into your decision:

  • Stability: For the most stable Docker experience, you should use a storage driver certified for use with the Commercially Supported Docker Engine (CS Engine). CS Engine storage drivers are selected based on stability and reliability. Storage drivers not supported by the CS Engine may work fine, but could be less stable and less reliable.
  • Experience and expertise: Choose a storage driver that you and your team/organization are comfortable with. Consider how much experience you have with a particular driver. There is no substitute for experience, and it is rarely a good idea to try something brand new in production. That's what labs and laptops are for!
    However, if you do not have experience with layered filesystems, you may want to choose a storage driver that is supported by the CS Engine. These drivers are selected for stability and reliability, and will give you the smoothest Docker experience, avoiding some of the bugs and nuances associated with some of the newer and less mature drivers.
  • Future-proofing: Many people consider OverlayFS the future of the Docker storage driver. However, it is less mature, and potentially less stable, than the drivers certified for use with the CS Engine. For this reason, you should use the OverlayFS driver with caution and expect to encounter more bugs and nuances than if you were using drivers supported by the CS Engine.

Whichever driver you choose, make sure it has strong community support and momentum. This is important because storage driver development in the Docker project relies on the community as much as on Docker staff to thrive.

LMK your thoughts.

@moxiegirl
Contributor

@nigelpoulton I'll see about adding this in the next couple of days. I'll leave it open in case someone else wants to pick it up.

@xeor
Author

xeor commented Dec 2, 2015

Would it also be a good idea to write down the types of bugs people have been reporting when using overlay? Something like: "Some of the bugs people have reported when using OverlayFS are not being able to store socket files, problems with yum, and problems deleting files."

Also, for the stability option: might it be an idea to write down a couple of quick tips for the different best practices? Like the ones I mentioned earlier:

  • How big should the meta-volume be?
  • Why is /var/lib/docker still filling up? Do I need to have an extra mountpoint for /var/lib/docker in addition to the lvm-thin data volumes?

There are probably similar questions for AUFS?

A simple: what would be the ideal storage setup if you want to try out Docker on a single node, small scale but in production?

A link to the CS page would also be nice. :)

@moxiegirl
Contributor

@xeor Writing down the bugs: we use GitHub issues for that kind of tracking. Users can search for specific labels around storage drivers. Maintaining a list of specific issues is better left to these automated systems than to manual lists in the docs. The most we would want to put in the docs is a pre-canned filter for folks.

Using CS requires a purchase...so that has to be made clear.
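
As an illustration of such a pre-canned filter, a label query in the issue search (repository path as of this thread) lists the open storage issues:

https://github.com/docker/docker/issues?q=is%3Aopen+label%3Aarea%2Fstorage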

@nigelpoulton
Contributor

@moxiegirl I'll add some more to this in the morning so don't add it yet.

@xeor I'll take a look at how feasible it is to add some best practices to the pages specific to each driver, so we can keep the section we're updating here non-driver-specific. Things like why /var/lib/docker still fills up etc. belong in the driver-specific pages IMO.

@moxiegirl
Contributor

@nigelpoulton The article is very popular! With great success comes greater scrutiny though... and here we have some additional comments that came in through email:

--------email ----

I've read your documents under docker.git/docs/userguide/storagedriver/.
It is good and I believe you must be a good engineer.

Here I'd like to make some suggestions about adding some notes about aufs, mainly comparing it with overlayfs. I know I should post this mail somewhere wider, such as the docker mailing list, but I don't know where it is. I am developing aufs but I am not using docker. If you want to forward this mail to someone else, you don't have to hesitate.

I've read docs/userguide/storagedriver/overlayfs-driver.md, particularly this section.


OverlayFS and Docker Performance

As a general rule, the overlay driver should be fast. Almost certainly faster than aufs and devicemapper. In certain circumstances it may also be faster than btrfs. That said, there are a few things to be aware of relative to the performance of Docker using the overlay storage driver.

I'd point out a few things which may have some impact on users. All these facts about overlayfs were known before it was merged into mainline, and some of them are described in its documentation.

  • whiteouts in overlayfs consume inodes
    Every whiteout special file in overlayfs is not hard-linked, which means each whiteout consumes an inode of its own.
    In aufs, all whiteouts are hard-linked and share the same inode.
  • concurrent copy-up in overlayfs is slow
    Internally, the copy-up operation in overlayfs is 'serialized', which means it doesn't run concurrently. This is due to a lock mechanism in the 'workdir'. Additionally, the lock duration is long.
    In aufs, as long as the parent dir differs, all copy-up operations run concurrently.
  • copy-up in overlayfs may produce larger files
    When the file being copied up is a sparse file (the file has a size, but doesn't consume disk blocks), the copied-up file in overlayfs becomes a non-sparse file and consumes the disk blocks.
    Aufs supports sparse-file copy-up, and the result keeps the holes (the non-allocated disk blocks) as they are.
  • file identity in overlayfs is not kept (name-based vs. inode-based behaviour)
    Generally, file identity in a system is managed by a pair of device-id and inode-number rather than by pathname. For example, fileA on the lower readonly layer has its identity as the pair {dev=0x0801, inum=100}. When fileA is modified, it is copied up to the upper writable layer, and its identity becomes {0x0802, 200}. Now, what identity should be returned from the union filesystem? Overlayfs shows an interesting behaviour. Before copy-up, it handles fileA as {0x0801, 100}. After copy-up, this changes to {0x0802, 200}. In other words, overlayfs doesn't present its own virtual file identity, by its very fundamental design.
    This results in various interesting behaviours. For example:
    • Information from stat(2) and fstat(2) differs between before and after copy-up.
    • By issuing fd1=open("fileA", O_RDONLY) and then fd2=open("fileA", O_WRONLY), users cannot read the latest contents of fileA via fd1. Note that open(2) for writing causes the internal copy-up.
    • etc.
    All these behaviours are caused by the silently changing file identity.
    Aufs provides its own virtual file identity. For example (in the case above), the pair {0x0803, 300} is kept before and after copy-up, and users can always retrieve the latest information/contents of fileA.

Hmm, I might have written too negatively about overlayfs. I am afraid that I am biased towards aufs. :-)
But I agree that overlayfs is smaller than aufs, and generally faster. Of course, that is due to the lack of many POSIXly correct behaviours. The choice is totally up to users. If he/she (or his/her application) doesn't need the POSIXly correct behaviour, then overlayfs is a good choice. When he/she meets trouble related to such behaviour, then aufs is the best candidate for him/her.

There are some configurations and mount options in aufs to improve performance. For general use in containers, here is my recommendation.

(configs)
CONFIG_AUFS_FS=m
CONFIG_AUFS_BRANCH_MAX_127=y
# CONFIG_AUFS_BRANCH_MAX_511 is not set
# CONFIG_AUFS_BRANCH_MAX_1023 is not set
# CONFIG_AUFS_BRANCH_MAX_32767 is not set
CONFIG_AUFS_SBILIST=y
# CONFIG_AUFS_HNOTIFY is not set
# CONFIG_AUFS_EXPORT is not set
# CONFIG_AUFS_XATTR is not set
# CONFIG_AUFS_FHSM is not set
CONFIG_AUFS_RDU=y
# CONFIG_AUFS_SHWH is not set
# CONFIG_AUFS_BR_RAMFS is not set
# CONFIG_AUFS_BR_FUSE is not set
# CONFIG_AUFS_BR_HFSPLUS is not set
CONFIG_AUFS_BDEV_LOOP=y
# CONFIG_AUFS_DEBUG is not set

(aufs mount options)

  • 1st level: noatime, dirperm1, udba=none
  • 2nd level: notrunc_xino, notrunc_xib
  • 3rd level: noplink, noxino

All these configs and options drop features from aufs and make its behaviour closer to overlayfs. As the level advances (1st, 2nd, ...), more features are dropped, and the chance of showing POSIXly incorrect behaviour grows.
Hmm, it might be better to apply the 1st level options as the default in docker.

J. R. Okajima
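
The file-identity behaviour described in the mail can be observed directly with stat; a minimal sketch for kernels of that era, using hypothetical scratch directories (needs root):

mkdir -p /tmp/lower /tmp/upper /tmp/work /tmp/merged
echo hello > /tmp/lower/fileA
mount -t overlay overlay -o lowerdir=/tmp/lower,upperdir=/tmp/upper,workdir=/tmp/work /tmp/merged
stat -c 'dev=%d ino=%i' /tmp/merged/fileA   # identity of the lower file
echo change >> /tmp/merged/fileA            # writing triggers a copy-up
stat -c 'dev=%d ino=%i' /tmp/merged/fileA   # the inode number changes after copy-up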

@thaJeztah
Member

That's a cool response!

@sfjro

sfjro commented Dec 6, 2015

For the 1st level aufs mount options, I'd suggest xino=/xino too.
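
Put together, a hypothetical aufs mount using the mail's 1st-level options plus this xino suggestion might look like the following (branch paths are illustrative; needs root):

mount -t aufs -o br=/tmp/rw=rw:/tmp/ro=ro,noatime,dirperm1,udba=none,xino=/xino none /tmp/aufs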

@nigelpoulton
Contributor

Right so I also love the response and detail - wowzers!

But I do worry about getting too much into the weeds in technical docs like these... it can be a rabbit hole and we can end up creating a beast.

Don't get me wrong.... the feedback and detail of the suggestions are great. But the docs then become harder to consume, harder to keep up to date, harder to vet and verify etc. And before we know it, we've got a 60,000 word doc that reads like stereo instructions translated through three different languages and requires a team of writers and engineers to make a single change.

Just my opinions.... and I do love the passion and technical knowledge conveyed in the response.... I just worry that it's not the direction the docs should go.

@sfjro

sfjro commented Dec 10, 2015

Nigel:

Right so I also love the response and detail - wowzers!
:::
Just my opinions.... and I do love the passion and technical knowledge conveyed in the response.... I just worry that it's not the direction the docs should go.

Yes, I understand that.
How deeply to describe the technical details depends on which people the documents are for. It is good to distinguish the design document from the users manual. The performance matter may sit between them, especially when we describe the technical background.

Also, the configuration I recommend may not be an option for users, since the configuration is part of building the kernel module, and I guess most docker users won't compile the module themselves.

But I'd suggest considering the 1st level recommended aufs mount options (noatime or relatime, dirperm1, udba=none, xino=/xino) as the default in docker. It would be a benefit for users.

J. R. Okajima

@moxiegirl self-assigned this on Dec 16, 2015
@LK4D4
Contributor

LK4D4 commented Sep 16, 2016

It has been a long time since this thread was active. Some docs were added.
If they still don't meet expectations, feel free to open a new issue with particular concerns.
