flatten images - merge multiple layers into a single one #332

Closed
unclejack opened this Issue Apr 4, 2013 · 231 comments

Contributor

unclejack commented Apr 4, 2013

There's no way to flatten images right now. When performing a build in multiple steps, a few images are generated and a large number of layers is produced. When these are pushed to the registry, a lot of data and a large number of layers have to be downloaded.

There are some cases where one starts with a base image (or another image), changes some large files in one step, changes them again in the next and deletes them in the end. This means those files would be stored in 2 separate layers and deleted by whiteout files in the final image.

These intermediary layers aren't necessarily useful to others or to the final deployment system.

Image flattening should work like this:

  • the history of the build steps needs to be preserved
  • the flattening can be done up to a target image (for example, up to a base image)
  • the flattening should also be allowed to be done completely (as if exporting the image)
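Those semantics can be sketched in a few lines of Python; this is a hypothetical model (layers as path-to-contents maps, with None standing in for a whiteout), not Docker's actual implementation:

```python
# Hypothetical model: an image is a list of layers (oldest first); each layer
# maps path -> file contents, with None marking a whiteout (a deletion).

def resolve(layers):
    """Compute the effective filesystem of a layer stack."""
    fs = {}
    for layer in layers:
        for path, contents in layer.items():
            if contents is None:
                fs.pop(path, None)   # whiteout hides the file
            else:
                fs[path] = contents
    return fs

def flatten(layers, keep=0):
    """Squash every layer above the first `keep` layers into one.

    keep=0 flattens completely (like an export); keep=1 flattens
    up to a base image, preserving the base for sharing.
    """
    base, rest = layers[:keep], layers[keep:]
    merged = {}
    for layer in rest:
        merged.update(layer)         # later layers win, whiteouts included
    # Drop whiteouts that only cancelled files introduced within `rest`;
    # keep those that must still hide files from the preserved base.
    base_fs = resolve(base)
    squashed = {p: c for p, c in merged.items()
                if c is not None or p in base_fs}
    return base + [squashed]

base = {"/bin/sh": "shell"}
big = {"/tmp/blob": "400MB of data"}        # large file added in one step...
gone = {"/tmp/blob": None, "/app": "bin"}   # ...then deleted in a later step

layers = [base, big, gone]
# Flattening must not change the effective filesystem:
assert resolve(flatten(layers, keep=1)) == resolve(layers)
# The flattened stack no longer carries /tmp/blob at all:
assert "/tmp/blob" not in flatten(layers, keep=1)[1]
```

Flattening up to the base keeps that layer shareable with other images, while the deleted blob disappears entirely from what has to be pushed or pulled.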

justone referenced this issue in Builder #472 (merged), May 1, 2013

Contributor

unclejack commented Jun 7, 2013

@shykes How would you like this to work? Could you provide an example of how this should work, please?

It looks like AUFS has a limit of around 39-41 layers. We really should have image flattening so that commit->run->commit can be used after deployment as well.

bortels commented Aug 1, 2013

Ping.

My dockerfiles grow as I find neat stuff like this https://gist.github.com/jpetazzo/6127116

and IIRC, each RUN line makes a new level of AUFS, no?

I'm basically ignorant about many things, happy to admit it - if a "docker flatten" isn't coming down the pipe soon, does anyone have a reference for how to do it by hand? Or a reason it can't be done?

(I guess I could work around it by moving all of the RUN lines into a single shell script, so it's not vital; but I can't do that with someone else's image. Hmm. Is there a way to "decompile" an image, recreating the Dockerfile used for it - assuming it was built entirely from a Dockerfile, of course?)

Contributor

dqminh commented Aug 5, 2013

I encountered this recently too when building images. Will something like http://aufs.sourceforge.net/aufs2/shwh/README.txt help here ?

Collaborator

vieux commented Aug 5, 2013

I made a small tool to flatten images: https://gist.github.com/vieux/6156567

You have to use full IDs; to flatten dhrp/sshd: sudo python flatten.py 2bbfe079a94259b229ae66962d9d06b97fcdce7a5449775ef738bb619ff8ce73

Contributor

mhennings commented Aug 11, 2013

+1

I see the need too.
If possible I would like a command that allows both: flattening everything and squashing selected layers.

If a container is flattened, we should think about what happens when it is pushed. The registry/index could remove unneeded/duplicated layers if enough information is sent during the push,
like "replaces Xxxxxxxxx, yyyyyyy, zzzzzzz"

Contributor

jpetazzo commented Aug 13, 2013

FWIW, "aubrsync" (in the aufs-tools package) might be useful for that, since it aims at synchronizing and merging AUFS branches.

Collaborator

shykes commented Aug 21, 2013

From my answer in a different thread:

Currently the only way to "squash" the image is to create a container from it, export that container into a raw tarball, and re-import that as an image. Unfortunately that will cause all image metadata to be lost, including its history but also ports, env, default command, maintainer info etc. So it's really not great.

There are 2 things we can do to improve the situation:

  1. A short-term solution is to implement a "lossless export" function, which would allow exporting an image to a tarball with all its metadata preserved, so that it can be re-imported on the other side without loss. This would preserve everything except history, because an image config does not currently carry all of its history. We could try to plan this for 0.7 which is scheduled for mid-September. That is, if our 0.7 release manager @vieux decides we have time to fit it in the release :)

  2. A 2nd step would be to add support for history as well. This is a little more work because we need to start storing an image's full history in each image, instead of spreading it out across all the aufs layers. This is planned for 0.8.
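The metadata-loss problem that motivates step 1 can be illustrated with a toy model (field names loosely modeled on an image config; this is an illustration, not Docker's on-disk format):

```python
# Hypothetical sketch of why a raw export/import round-trip is lossy:
# an image is a filesystem plus a config, but `docker export` keeps
# only the filesystem.

image = {
    "rootfs": {"/app": "binary"},
    "config": {                      # metadata a plain tarball cannot carry
        "Env": ["PATH=/usr/bin"],
        "Cmd": ["/app"],
        "ExposedPorts": ["8080/tcp"],
        "Maintainer": "someone@example.com",
    },
}

def raw_export(img):
    """Like `docker export`: a plain tarball of the filesystem."""
    return dict(img["rootfs"])

def raw_import(tarball):
    """Like `docker import`: a single-layer image with an empty config."""
    return {"rootfs": tarball, "config": {}}

def lossless_export(img):
    """The proposed fix: bundle filesystem *and* config together."""
    return {"rootfs": dict(img["rootfs"]), "config": dict(img["config"])}

squashed = raw_import(raw_export(image))
assert squashed["rootfs"] == image["rootfs"]   # the files survive
assert squashed["config"] == {}                # ports, env, cmd... all gone

assert lossless_export(image)["config"] == image["config"]
```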

ykumar6 commented Aug 21, 2013

Hey guys, here's an idea we're prototyping. Let's say an image consists of 4 layers:

L1<-L2<-L3<-L4

When we start a container off L4, we make changes in L5. Once the changes are complete, we commit back to get a new image

L1<-L2<-L3<-L4<-L5

At this point, we do a post-commit merge step where we start a new container, L4A, from L3. We copy L5 & L4 into L4A and create a new image like this:

L1<-L2<-L3<-L4A

This way, we preserve the immutable nature of the image but can compress layers when necessary to create new images.

Contributor

dqminh commented Aug 22, 2013

@shykes @ykumar6 i did some experiments on exporting the image and trying to preserve metadata last night here https://github.com/dqminh/docker-flatten . Would love to know if the approach is reasonable.

It tries to compress all of the image's layers into a tarball, generate a Dockerfile with as much metadata as possible, and create a new image from that.

Contributor

jpetazzo commented Sep 10, 2013

Question: do we really want to flatten existing images, or to reduce the number of layers created by a Dockerfile?

If we want to flatten existing images, it could be the job of an external tool, which would download layers, merge them, upload a new image.

If we want to reduce the number of layers, we could have some syntactic sugar in Dockerfiles, meaning "don't commit between these steps, because I want to reduce the number of layers, or because the first steps create lots of intermediary files that I clean up later and don't want to include in my layers".

Contributor

unclejack commented Sep 10, 2013

@jpetazzo Removing commits done between two steps of a Dockerfile would be useful, but we might still want to be able to flatten images. There are some use cases which require "-privileged" to be provided during a run and that's not possible with a Dockerfile, so you have to script a Dockerfile run, some docker run -privileged steps and then commit.
We might also want to craft custom images which have one layer and one single parent layer (a common image such as ubuntu, centos, etc).

dkulchenko commented Sep 10, 2013

@jpetazzo I would say both, as they address separate issues.

Flattening existing images allows you to work around the AUFS branch limit (you can only stack so many images), in the case where you're building on someone else's image, and someone else builds on yours, and your stack ends up hitting the limit pretty quick.

The syntactic sugar in the Dockerfile would allow building docker images that necessitate large toolchains to build and produce a comparatively small result (which I would argue is the more pressing of the two issues). Without it, a 2GB toolchain building a 10MB image will result in a 2058MB image.
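The arithmetic behind that claim, as a sketch: an image's push/pull size is the sum of its layer sizes, so deleting files in a later layer never shrinks it; only squashing does. (All numbers below are made up for illustration.)

```python
# Sketch: layered image size vs. squashed size, in MB.

layers = [
    ("base image",         120),
    ("install toolchain", 2048),
    ("build /src -> app",   10),
    ("remove toolchain",     0),  # whiteouts are ~0 MB, but hide 2048 MB
]

# What gets shipped is every layer, deleted data included:
layered_size = sum(size for _, size in layers)

# A squashed image keeps only the base and the surviving binaries:
squashed_size = 120 + 10

assert layered_size == 2178
assert squashed_size == 130
```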

bortels commented Sep 11, 2013

I second the syntactic sugar - but I'd flip-flop it, in that I do a bunch of stuff (package building), and I really only want to commit the last step.

Maybe simply having an explicit "COMMIT imagename" in the Dockerfile? And an implicit one right at the end? (I actually think a commit at the end is sufficient - I'm not sure what use I'd have for an intermediate image where I wouldn't just build it with a separate Dockerfile...)

I'll admit the AUFS limit was floating around in the back of my brain, but being able to flatten an arbitrary dockerfile is perfectly adequate for me there. (Doing so AND keeping history would be even nicer).

jeffutter commented Sep 11, 2013

I am somewhat fond of @bortels's idea. I can see use cases where you would want the intermediate steps when building the Dockerfile (in case something fails, like apt-get due to networking); you would want to be able to resume at that step. However, it would be nice to say "when this is done" or "when you get to point A", squash the previous layers.

tomgruner commented Sep 25, 2013

An idea and script by Maciej Pasternacki:
http://3ofcoins.net/2013/09/22/flat-docker-images/

Docker looks really exciting, but the limit of 42 layers could cause some issues if an image needs to be updated over a few years. Flattening every now and then doesn't sound so bad, though.

a7rk6s commented Sep 26, 2013

When I started using Docker I soon wished for a "graft" command for image maintenance. Something like this:

$ docker graft d093370af24f 715eaaea0588
67deb2aef0e0

$ docker graft d093370af24f none
e4e168807d31

$ docker graft -t repo:8080/ubuntu12 d093370af24f 715eaaea0588
67deb2aef0e0

In other words it would basically change the parent of an image, or make it into a parent-less base image, and then return the new ID (possibly tagging/naming it). Would it be really slow because it'd have to bring both images into existence and compare them?

I like the "COMMIT" idea too. Or better, a "make a flattened image" flag when building, since this really is more of a build option.

(Confession: I love Docker but the concept of the Dockerfile never clicked with me. Why add extra syntax just to run some shell commands? Why commit intermediate steps? So I've been making containers 100% with shell scripts. It's nice because it forces me to create build/setup scripts for my code, which is useful outside of Docker).

Contributor

jpetazzo commented Sep 26, 2013

Re "why commit intermediate steps": I find it very convenient when I have longer Dockerfiles; when I modify one line, it only re-executes from that line, thanks to the caching system. That saves me time, bandwidth, and disk space, since the first steps are usually the big "apt-get install" ones. Of course, I could do the apt-get install and other big steps in a separate Dockerfile, then commit that, then start another Dockerfile "FROM" the previous image; but the Dockerfile caching system makes the whole thing way easier. At least, to me :-)


Collaborator

shykes commented Sep 26, 2013

Re the "graft" command and the "COMMIT" idea:

This problem will go ahead on its own once each image carries its full history (currently history is encoded in the chain of aufs layers, which avoids duplication of data, but means you can't get rid of one without getting rid of the other, hence the problem we're discussing).

Once that's in place, whether you commit at each build step or only at the end will be entirely up to you (the person running the build), depending on the granularity you want. More granularity = more opportunities to re-use past build steps and save bandwidth and disk space on upgrades. Less granularity = you can remove build dependencies from the final image, export to a single tarball without losing context, etc. I doubt we'll add any syntax to the Dockerfile to control that.

Re "the concept of the Dockerfile never clicked with me": that's a common misunderstanding. Dockerfiles are not a replacement for shell scripts. They provide context for running shell scripts (or any other kind of script) from a known starting point (hence the FROM keyword) and a known source code repository (hence the ADD keyword).

Collaborator

shykes commented Sep 26, 2013

s/the problem will go ahead/the problem will go away/

a7rk6s commented Sep 26, 2013

They provide context

Makes sense. Though, the Dockerfiles I've seen in the wild have been all over the place (as are the ones I've created, since I'm still trying to find the best way to lay things out so it's easy to develop / maintain / repurpose chunks to make different images).

once each image carries its full history

Out of curiosity, will it be possible to do, e.g., "apt-get clean" after the image has been built, and end up with less disk space used?

mattwallington commented Dec 3, 2013

I am assuming this didn't make it into 0.7 as previously mentioned. Any plans for the next release?

vmadman commented Dec 28, 2013

Am I understanding this correctly? An image can only have a maximum of ~40 RUN/ADD statements in its entire lifetime, including inheritance?

justincampbell commented Feb 2, 2016

@cgrandsjo That would cause the cache to not be used by default.

cgrandsjo commented Feb 2, 2016

@justincampbell: Sorry, could you elaborate on your answer? Omitting the ADDLAYER command actually means that ADDLAYER is added to the end of the file "silently", and the next time you build with the Dockerfile the cache will be used because an additional layer was created.

Update:
Actually, I just realized that it depends on how you modify the Dockerfile whether the cache will be used or not. Maybe the default behaviour should stay as it is right now and, for those who know what they are doing, there should be a docker build option to "squash" intermediate layers if that is desired.

mishunika commented Feb 17, 2016

Yay, now I have images around 30 GB in size, when their actual size should be no more than 10 GB!

So yeah, I just stepped into the same difficulty, and I was thinking that the Dockerfile is really lacking some kind of COMMIT action (inspired by DB transactions) to decide where a layer should end.
Then I found this issue and read the other commit-related ideas, which are in fact the same. I think the user should be able to specify explicitly what the layers should contain and when they should start/end.

Furthermore, in my opinion, the caching issue is not a big one. Caching can work as it does now, but delimited by the commit/addlayer boundaries: if something has changed in such a block, then no cache is used at all for that block. And the commit thing can be optional, with the default behavior maintained as it is now.

foxx commented Feb 17, 2016

@mishunika See my previous answer, and also @TomasTomecek's, for a workaround. Don't bother trying to push this proposal with Docker; it ain't going to happen any time soon (see the previous comments from core devs).

campbel commented Feb 19, 2016

@TomasTomecek is https://github.com/goldmann/docker-scripts#squashing a suitable tool for reducing image size?

For instance given a docker file:

FROM baseimage

RUN apt-get install buildtools
ADD / /src

RUN buildtools build /src

RUN apt-get remove buildtools
RUN rm -rf /src

After building and squashing, would the resulting image lose the size of the src and buildtools?

Contributor

goldmann commented Feb 19, 2016

@campbel That's correct. This tool will remove unnecessary files. I haven't tested it with ADDing the root filesystem (/) to the image (generally a very bad idea), but I understand that this is just an example.

Please note that Docker 1.10 support is still in the works (see the v2 branch). Feel free to open any issues.

yoshiwaan commented Mar 18, 2016

I think the AND and ADDLAYER options mentioned above are useful for certain situations (such as controlling what to and what not to cache), but if you are chaining builds from images you control and later builds are removing things from the upstream builds then they don't help with the size problem.

Something as simple as a --squash option to docker build, which looks through the layers and removes whiteout files along with the files they hide in lower layers (correct me if I'm wrong, but that's my understanding of how it works), would be extremely useful.

It's the same as when you use git really, sometimes you want to rebase, sometimes you want full commit history and sometimes you just want to squash all that noise out of there.
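Mechanically, such a --squash pass might look roughly like this; a hypothetical sketch over in-memory file listings, using the AUFS `.wh.` whiteout naming convention (real layers are tar archives, and the details vary by storage driver):

```python
import posixpath

WH = ".wh."   # AUFS whiteout prefix: ".wh.foo" marks "foo" as deleted

def squash(layer_entries):
    """Merge layer file listings (oldest first) into one flat listing,
    dropping whiteout markers and the files they hide."""
    deleted, result = set(), {}
    # Walk the newest layer first so later deletes/overwrites win.
    for layer in reversed(layer_entries):
        for path, data in layer.items():
            d, name = posixpath.split(path)
            if name.startswith(WH):
                deleted.add(posixpath.join(d, name[len(WH):]))
            elif path not in result and path not in deleted:
                result[path] = data
    return result

layers = [
    {"etc/motd": b"hello", "tmp/big.tar": b"x" * 400},    # oldest
    {"tmp/.wh.big.tar": b"", "usr/bin/app": b"\x7fELF"},  # newest
]
flat = squash(layers)
assert "tmp/big.tar" not in flat       # hidden by the whiteout
assert "tmp/.wh.big.tar" not in flat   # the marker itself is dropped
assert flat["etc/motd"] == b"hello"
```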

sivang commented May 26, 2016

So, is this going to be a feature in docker or already solved in the stable release somehow?

Contributor

cpuguy83 commented May 26, 2016

@sivang Maybe, now that the image format has been changed: #22641.

Please don't spam the PR, though.

foxx commented May 26, 2016

@sivang I'll be surprised if you see this feature in a release before 2017. If you need a quick fix, read the previous suggestions, or check out the far superior option, rkt.

Member

thaJeztah commented May 26, 2016

Thanks for the commercial break, @foxx

sivang commented May 26, 2016

Well, I used jwilder's docker-squash; it seemed to have done the flattening job, but loading the image back doesn't show it in the docker images list...

JonathonReinhart commented Jul 1, 2016

Yet another disappointment from the Docker team; not because some feature doesn't exist, but because of a dismissive attitude by the maintainers.

@tiborvass said (#332 (comment)):

The problem with this issue is that it provides a solution to a problem that yet has to be defined
...
We're closing this issue. Would love to continue the debate on more focused issues.

Perhaps he didn't read the original issue (opened over three years ago), which very clearly stated:

There are some cases where one starts with a base image (or another image), changes some large files in one step, changes them again in the next and deletes them in the end. This means those files would be stored in 2 separate layers and deleted by whiteout files in the final image.

These intermediary layers aren't necessarily useful to others or to the final deployment system.

I don't understand what is "yet to be defined" or not focused about that, but in case you need something concrete:

FROM debian

# This line produces an intermediate layer 400 MB in size
ADD local_400MB_tarball_im_about_to_install.tar /tmp

# This line installs some software, and removes the tarball.
# Lets say it produces a layer with 20 MB of binaries
RUN cd /tmp && \
    tar xf local_400MB_tarball_im_about_to_install.tar && \
    cd foo && make install && \
    cd /tmp && rm local_400MB_tarball_im_about_to_install.tar

The end result is that this image is sizeof(debian) + 420MB in size, when 400 MB of it were removed.

Perhaps if issues were addressed instead of dismissed, this project wouldn't have nearly as many issues in its history as it does commits.

Contributor

cpuguy83 commented Jul 1, 2016

@JonathonReinhart The problem is that this issue discusses a particular solution rather than the underlying problems.
In reality, squashing is a stop-gap for a problem that is an implementation detail of the current storage subsystem... i.e., we don't need squashing if/when the storage subsystem is replaced with a better solution.

Thank you for your kind and thoughtful comments.

JonathonReinhart commented Jul 1, 2016

@cpuguy83 Sarcasm isn't necessary when someone is expressing frustration.

Regardless of whether or not this is the right solution, people will find this issue when looking for a solution to a very common problem. When you see that the issue is closed, you'll immediately wonder "Why was this closed? Was it fixed?", and when you see that it was closed with a message essentially stating "Sorry, too vague, try again", that is a good way to frustrate and alienate users.

I am a big supporter of Docker, and advocate its use to many different types of projects. I think it would greatly help the project if issues like this were handled better. Specifically, I think that when @tiborvass closed this issue, it should have been locked (so the "resolution" of the issue didn't get buried in the middle of the page) and should have included a reference to other issue(s) where the problem(s) could be discussed in the "more focused" fashion he was advocating.

foxx commented Jul 1, 2016

to a particular problem that is an implementation detail of the current storage subsystem

@cpuguy83 The entire implementation of Docker is fundamentally flawed, and this issue is just one of many such issues. So unless you are planning on rewriting the entire Docker platform from scratch, then flattening images is the best you're going to get.

The problem with this issue is that it provides a solution to a problem that yet has to be defined

@tiborvass I think it's pretty clear what the problem is, don't you?

@monokrome

This comment has been minimized.

monokrome commented Jul 1, 2016

Problem: We have n layers when the results of actions performed in order to create each one only need to be in 1 layer.
Solution: ?!?!?!?!?

@ohjames

This comment has been minimized.

ohjames commented Jul 14, 2016

when you see that it was closed with a message essentially stating "Sorry, too vague, try again", that is a good way to frustrate and alienate users.

I see hundreds of people defining a very, very clear problem... hundreds of users all in unanimous agreement that Docker handles layers in a way that doesn't make sense to them. Yet the people on the inside actually developing it are the only ones who feel the problem hasn't been "defined".

Even if I did agree that the problem wasn't clearly defined (and I definitely don't), the way the core developers have responded to the community basically shows contempt. As for the solutions: how docker-squash manages to be so slow and delay our build for so long on such a tiny set of layers, I don't know... can't wait for rkt.

@zerthimon

This comment has been minimized.

zerthimon commented Jul 14, 2016

@ohjames +1
I feel the same way. I have asked for a few features before, and they were all rejected for the following reasons:

  1. It will hurt portability
  2. It will hurt security

When will this project realize that users don't like being FORCED to accept portability and security at the price of productivity?
How about adding the features users ask for, so the USER HAS THE CHOICE and DECIDES FOR HIMSELF whether to use them, even if that hurts portability and security?

Can't wait for someone to fork this project and make it friendlier to its users.

Contributor

justincormack commented Jul 14, 2016

There is an open PR for flattening: #22641

Member

vdemeester commented Jul 14, 2016

It took me a while to decide to answer something here, but I feel I need to pin-point some stuff.

First, as @justincormack said, there is a PR for flattening (#22641), so maybe we could reopen this issue, as we are trying, perhaps, to have it built in.

How about adding the features users ask for, so the USER HAS THE CHOICE and DECIDES FOR HIMSELF whether to use them, even if that hurts portability and security?

I'm gonna quote Nathan Leclaire here (from The Dockerfile is not the source of truth for your image).

The Dockerfile is a tool for creating images, but it is not the only weapon in your arsenal.

Dockerfiles and docker build are only one way to build Docker images. It is the default, built-in one, but you definitely have the choice to build your images with other tooling (and there is some: packer, rocker, dockramp, s2i… to list only a few). One focus of the Dockerfile is portability, and thus that is one of the main concerns when discussing features for the Dockerfile and docker.

If you don't care about portability, or if the Dockerfile's possibilities are too limited for your use cases, then, repeating myself, you are free to use other tooling to build images. Docker does not force you to use Dockerfiles to build images — it's just the default, built-in way to do it.

On the subject of Dockerfiles and image building, I highly recommend that people watch Gareth Rushgrove's talk from DockerCon16: The Dockerfile Explosion and the need for higher-level tools.

rvs commented Jul 14, 2016

@justincormack Justin, what I would really love to see is very similar to #22641, but with the full power of git rebase (especially git rebase --interactive). Do you think that is feasible?

docbill commented Jul 15, 2016

It is important to distinguish what is needed per this request from what is desired. What is desired is to refactor docker images into a git or git-like repository view, so all the cool features one does with git would be possible in docker. For example, I have a docker image for plex. It is basically a Fedora base with a download of the plex build on top. I have the container set to autobuild on the docker hub, so every time the Fedora base changes, it rebuilds, even though most of the time the plex download does not change. What annoys me about that one is that when I pull the update, the layer adding the plex download is treated as brand new, even though it is byte-for-byte identical to the original delta. With a git-like repository that could be handled, making a docker pull a much more efficient operation.

That is the desired...

However, the ask is simply to have a standard way to flatten an image. Let's say that as part of my build I downloaded the plex source, installed the developer dependencies, compiled it, and then deleted everything except the actual build. All the tools for the build would still be layers in the image, even though they would be inaccessible from my final image. It is a huge waste... If the container could be flattened to remove unneeded layers, it would be a much more efficient use of space and bandwidth.

That is the required...

Don't say you can't do the required because the desired is too much work... Just hit the low-hanging fruit first, and everyone will be much happier.


jdmarshall commented Jul 19, 2016

@vdemeester

The Dockerfile is not the source of truth for your image

I think some of the people asking for features like this one are more comfortable with the truth of that statement than some of the people with "Docker member" after their names. There are, for instance, a number of security-related issues that have been closed won't-fix with the reasoning that docker build should be repeatable.

Contributor

cpuguy83 commented Nov 2, 2016

For those interested, we just merged --squash on docker build.
This will squash the final result of the build down to its parent image (i.e. the FROM).
#22641
