Volumes should not be defined in base images #404

Open
huggla opened this issue Jan 29, 2018 · 16 comments
Labels
Request Request for image modification or feature

Comments

@huggla

huggla commented Jan 29, 2018

Base images should avoid setting VOLUME since it is currently impossible to unset in child images:
moby/moby#3465
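
One way to see the inheritance is to inspect the image metadata, since a VOLUME declared in a base image shows up in every image built on top of it. This is a minimal sketch, assuming the Docker CLI is available; the helper names `volume_paths` and `declared_volumes` are made up for illustration, and the small JSON parser only handles the simple object that `.Config.Volumes` normally contains.

```shell
#!/bin/sh
# Hypothetical helper: extract the volume paths from the JSON object that
# `docker image inspect` prints for .Config.Volumes,
# e.g. {"/var/lib/postgresql/data":{}}. Prints nothing for "null".
volume_paths() {
    grep -o '"[^"]*"' | tr -d '"'
}

# Hypothetical helper: list the VOLUME paths recorded in an image's config,
# including any inherited from its base image.
declared_volumes() {
    docker image inspect --format '{{json .Config.Volumes}}' "$1" | volume_paths
}

# Usage (requires a running Docker daemon):
#   declared_volumes postgres
```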

@tianon
Member

tianon commented Feb 13, 2018

Setting PGDATA is a trivial way to adjust which directory PostgreSQL saves data to (which is also noted in the image description). See also #375 for another discussion of this same topic.
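
As a rough sketch of what setting PGDATA can look like: the helper below starts PostgreSQL with PGDATA pointing at a named volume mounted outside the image's declared volume path. The function name, the `pgdata` volume name, the `/pgdata` path and the password value are placeholders chosen for illustration, not something taken from the image documentation. Setting `DOCKER=echo` prints the arguments instead of running anything.

```shell
#!/bin/sh
# Hypothetical helper; DOCKER can be set to "echo" for a dry run that only
# prints the arguments that would be passed to the Docker CLI.
run_postgres_with_pgdata() {
    # PGDATA tells the image where PostgreSQL should keep its data;
    # "pgdata" and "/pgdata" are illustrative names, not required values.
    "${DOCKER:-docker}" run -d \
        -e POSTGRES_PASSWORD=example \
        -e PGDATA=/pgdata \
        -v pgdata:/pgdata \
        postgres
}

# Usage:
#   DOCKER=echo run_postgres_with_pgdata   # dry run
#   run_postgres_with_pgdata               # needs a running Docker daemon
```

This moves the data, but it does not change what the image declares with VOLUME.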

@huggla
Author

huggla commented Feb 13, 2018

Yes, but a pointless volume is still created.

@jannemann

jannemann commented Feb 14, 2018

@huggla is right. This may be fine if you use docker run and have only a few volumes. But if you use docker-compose, or even swarm, then unaccounted-for volumes accumulate on your Docker host; they are attached to the container and thus cannot be removed. Even worse, these volumes are not named: they have a random ID.

To illustrate, I run an application deployed with docker-compose:

$ docker volume ls
DRIVER    VOLUME NAME
local     f91eefad9a2e564e27d6fd204e94990b39206d641cb0bfaca1cb3dd36cee2b9f
local     portus_certificates
local     portus_postgres
local     portus_registry
local     portus_static

There are two volumes for the postgres container, as you can verify with docker inspect:

$ docker inspect --format="{{.Mounts}}" portus_db_1
[{volume portus_postgres /var/lib/docker/volumes/portus_postgres/_data /var/lib/postgres/data local rw true } {volume f91eefad9a2e564e27d6fd204e94990b39206d641cb0bfaca1cb3dd36cee2b9f /var/lib/docker/volumes/f91eefad9a2e564e27d6fd204e94990b39206d641cb0bfaca1cb3dd36cee2b9f/_data /var/lib/postgresql/data local true }]
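
To pick out the unnamed volume without reading the raw mount list by eye, something like the following could help. It is a sketch only: the helper names are invented, and treating any 64-character hex name as an anonymous volume is a heuristic based on how the ID above looks, not a guarantee. The `.Type`, `.Name` and `.Destination` fields are the mount fields reported by `docker inspect`.

```shell
#!/bin/sh
# Hypothetical helper: read "type name destination" lines on stdin and keep
# only volume mounts whose name looks like a 64-character hex ID.
anonymous_mounts() {
    awk '$1 == "volume" && $2 ~ /^[0-9a-f]+$/ && length($2) == 64 { print $2, $3 }'
}

# Hypothetical helper: list the anonymous volumes mounted into a container.
container_anonymous_mounts() {
    docker inspect \
        --format '{{range .Mounts}}{{println .Type .Name .Destination}}{{end}}' \
        "$1" | anonymous_mounts
}

# Usage (requires a running Docker daemon):
#   container_anonymous_mounts portus_db_1
```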

wglambert added the Request label on Apr 25, 2018
@cantino

cantino commented Apr 28, 2018

I ran into the same problem and spent a few hours trying to understand why random volumes were being created in docker-compose even though I'd set one for /var/lib/postgresql/data myself. I think the docs should be clearer about this.

@hKaspy

hKaspy commented Oct 5, 2018

I can add another view why not to use the VOLUME:

We use automated tests with Postgres Image pre-filled with data during build time. This way the image starts a lot faster which saves computing time. Now imagine running these tests on every commit and pull request.

You make hundreds of empty volumes with that process. Currently we use our own Dockerfile, copy-pasted from official repo, only with the VOLUME line commented out.
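
For throwaway test containers, one way to keep the leftovers down without forking the Dockerfile is to have Docker remove the anonymous volumes together with the container. The sketch below wraps the two CLI options that do this, `docker run --rm` and `docker rm -v`; the function names are made up, and `DOCKER=echo` turns each call into a dry run. It only cleans up afterwards; the volume is still created for every container.

```shell
#!/bin/sh
# DOCKER can be set to "echo" for a dry run that only prints the arguments.

# Hypothetical helper: run a throwaway container. With --rm, Docker removes
# the container and its anonymous volumes once the container exits.
run_throwaway() {
    "${DOCKER:-docker}" run --rm "$@"
}

# Hypothetical helper: remove an already-stopped container together with the
# anonymous volumes associated with it.
remove_with_volumes() {
    "${DOCKER:-docker}" rm -v "$@"
}

# Usage (image and container names are placeholders):
#   run_throwaway -d -e POSTGRES_PASSWORD=example my-prefilled-postgres
#   remove_with_volumes my-old-test-container
```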

@ms4720

ms4720 commented Oct 26, 2018

This is also causing me issues on Kubernetes. The behavior you rely on is not allowed in Kubernetes for production; see https://kubernetes.io/docs/concepts/storage/persistent-volumes/:

HostPath (Single node testing only – local storage is not supported in any way and WILL NOT WORK in a multi-node cluster)

@ta32

ta32 commented Dec 5, 2018

> I can add another view why not to use the VOLUME:
>
> We use automated tests with Postgres Image pre-filled with data during build time. This way the image starts a lot faster which saves computing time. Now imagine running these tests on every commit and pull request.
>
> You make hundreds of empty volumes with that process. Currently we use our own Dockerfile, copy-pasted from official repo, only with the VOLUME line commented out.

2018-12-05 23:46:19 (39.5 MB/s) - '/usr/local/bin/gosu.asc' saved [543/543]

+ mktemp -d
+ export GNUPGHOME=/tmp/tmp.Ii0f14Usol
+ gpg --keyserver ha.pool.sks-keyservers.net --recv-keys B42F6819007F00F88E364FD4036A9C25BF357DD4
gpg: keybox '/tmp/tmp.Ii0f14Usol/pubring.kbx' created
gpg: keyserver receive failed: Cannot assign requested address

How did you get it to build? I always run into the same issue.

@ms4720

ms4720 commented Dec 6, 2018

@ta32

gpg: keyserver receive failed: Cannot assign requested address

usbarmory/usbarmory-debian-base_image#9

@workmaster2n

I'm not trying to argue for or against the VOLUME in the Dockerfile, but could someone explain the benefits of the VOLUME or intended use case? I'm just curious to learn best practices around Docker.

@mindreader

Without the volume call, if you are using it for testing purposes it will write data into the container and that data will be lost upon container deletion. But even with the volume, every time you create a container it just spawns a new anonymous volume, so you get the exact same behavior, but you leave volumes all over the place.

And the workaround is horrendous: manually forking every version of postgres and changing one line.

@ms4720

ms4720 commented Jan 8, 2019

It would be great if the project maintained both the current behavior and a no-volume branch, while deprecating the current behavior over a few versions.

@ms4720

ms4720 commented Jan 8, 2019

@workmaster2n I don't think this is best practice really, it is just quicker to get something working when you don't know what you are doing. Best practice is to know your tools reasonably well.

@tianon
Member

tianon commented Jan 25, 2019

@workmaster2n see docker-library/official-images#2437 (comment) for a decent summary of when we (the Official Images maintainers) recommend that image maintainers include a VOLUME (and when not to)

@paddy-hack

So this VOLUME is what's hiding the data/ directory in the bind mount that I put on /var/lib/postgresql/ 😮

Say /srv/data/postgresql/data/ contains a perfectly valid PostgreSQL database with gobs of data. Now,

docker run --rm -it \
    -v /srv/data/postgresql:/var/lib/postgresql \
    postgres psql -U postgres

and try to find a sliver of data. No such luck 😰

I actually had my data in /srv/data/postgresql/11/ and used

docker run --rm -it \
    -v /srv/data/postgresql:/var/lib/postgresql \
    -e PGDATA=/srv/data/postgresql/11 \
    postgres psql -U postgres

and that worked fine.
I figured I could drop setting PGDATA by moving 11/ to data/ and was surprised I could no longer find any of the data. Using -v /srv/data/postgresql/data:/var/lib/postgresql/data fixes things though.

Anyway, I think I'll stick with using $PG_MAJOR/ style directories as that makes upgrading across major versions a bit easier (see #37).

@yosifkit
Member

yosifkit commented Feb 5, 2021

You can still have $PG_MAJOR style directories on your host without having to set PGDATA.

docker run --rm -it \
    -v /srv/data/postgresql/11/:/var/lib/postgresql/data/ \
    postgres:11

@paddy-hack

Thanks for the suggestion.
I do like to have access to other places below /var/data/postgresql/ though, e.g. backups/, so I can scribble there instead of in the PGDATA directory. I guess I could achieve the same by adding another volume. Anyway, as usual, there is more than one solution and everyone gets to use whatever suits them 😸
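
The "another volume" variant might look roughly like this. It is only a sketch: the function name, the `HOST_ROOT` variable and the `/backups` path inside the container are assumptions made for illustration, and `DOCKER=echo` prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: run postgres:<major> with the per-version data
# directory mounted directly on the data path, plus a second bind mount for
# backups. HOST_ROOT and the /backups container path are assumptions.
run_pg_major() {
    major=$1
    host_root=${HOST_ROOT:-/srv/data/postgresql}
    "${DOCKER:-docker}" run --rm -it \
        -v "$host_root/$major:/var/lib/postgresql/data" \
        -v "$host_root/backups:/backups" \
        "postgres:$major"
}

# Usage:
#   DOCKER=echo run_pg_major 11   # dry run
#   run_pg_major 11               # needs a running Docker daemon
```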
