Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scratch should be a special keyword in from scratch, not an image #4242

Closed
ewindisch opened this issue Feb 19, 2014 · 31 comments · Fixed by #8827
Closed

scratch should be a special keyword in from scratch, not an image #4242

ewindisch opened this issue Feb 19, 2014 · 31 comments · Fixed by #8827

Comments

@ewindisch
Copy link
Contributor

We should consider treating 'scratch' as a reserved name and prevent images from being tagged as 'scratch'.

I can see arguments for allowing 'scratch' to be treated as any other image and keeping it the way it is, allowing images to be tagged to this name. However, there seems to be some potential for mild abuse or error by allowing 'scratch' to be tagged.

@tianon
Copy link
Member

tianon commented Feb 19, 2014

+1 - even more nice would be if it were actually a special case to the point that FROM scratch \n ADD ....tar / creates a single layer instead of two.

@SvenDowideit
Copy link
Contributor

mmm, and while we're at it, add some documentation mentioning scratch?

@SvenDowideit
Copy link
Contributor

If we're going to make scratch special - do we really need docker pull scratch talk to the server? or can we make an empty tar and import it on the fly?

or will it be important that every scratch is identical?

@tianon
Copy link
Member

tianon commented Feb 20, 2014

It's currently extremely important that every scratch is identical, since it's the base layer of all the turtles and it doesn't make sense to have more than one copy of that minimal empty tar, but it'd be much better if it weren't a minimal empty tar at all and was instead a special case altogether.

@unclejack
Copy link
Contributor

scratch should be a special keyword to not rely on any image, as tianon has also stated above.

I'll change the title.

@cpuguy83
Copy link
Member

I see #5262 closed
Was there further discussion on this? Can this be closed?

@duglin
Copy link
Contributor

duglin commented Sep 30, 2014

sure looks like this can be closed

@tianon
Copy link
Member

tianon commented Sep 30, 2014

There was some further discussion over there, yes:

@shykes:

I don't think it's a good idea to make "from scratch" a special case. It makes more sense to make the opening FROM optional. Isn't that simpler and more elegant in every way?

@tianon:

So once we have a useful way to manage/name two builds in one Dockerfile, how would one build two separate FROM scratch images if the FROM just becomes optional? I really like how explicit FROM scratch is, since it conveys in an unambiguous way that I definitely want an empty canvas as my base.

@unclejack:

The other approach which requires changing the code of other instructions was discussed and it was considered to be the wrong approach. This PR was a minimal implementation of this feature which didn't change much.

Looking at that thread, it's clear that the PR should be closed, but I don't think it's nearly as clear-cut that the whole discussion should be closed.

@unclejack
Copy link
Contributor

I think this issue needs to stay open. I also think that we need to make sure scratch is the same thing everywhere.

@errordeveloper
Copy link
Contributor

It makes more sense to make the opening FROM optional.

👍

@jlhawn
Copy link
Contributor

jlhawn commented Oct 29, 2014

@erikh How do you propose treating FROM scratch? This currently references an empty file system layer on the official registry. Future changes to the image storage system will optimize away empty layers like this one and others generated by configuration specifier build commands like CMD, ENV, VOLUME, etc, which don't actually change the file system.

I've got a current implementation which allows containers to be created with no base image, i.e., an image ID of "" and no longer requires Dockerfiles to begin with FROM. It does not make FROM implicit, but only makes it optional, allowing you to start the build with no image at all, which is different from using the image ID 511136ea... which is what we would like to avoid.

Would you instead prefer that scratch be special cased and treated like image ID ""?

@jlhawn
Copy link
Contributor

jlhawn commented Oct 29, 2014

Perhaps a compromise for these 2 camps is the introduction of a new Dockerfile command and behavior. We could add the requirement that the first command be either FROM <arg> or BASE. BASE explicitly says that you are building a base image, and acknowledges that the user understands this. "Base" may not be the best word to use, but you get the idea.

@jlhawn
Copy link
Contributor

jlhawn commented Oct 29, 2014

And this wouldn't break backwards compatibility because of course you could still use FROM scratch if you wanted to.

@erikh
Copy link
Contributor

erikh commented Oct 29, 2014

I have no objections at this point. If this makes more sense to do it without FROM, that’s fine. It just seems like a subtle, breaking change, but I feel I may be incorrect about that.

I do think we should get @shykes’s opinion before merge if he has one. I’ll review the code tomorrow.

On Oct 28, 2014, at 5:27 PM, Josh Hawn notifications@github.com wrote:

And this wouldn't break backwards compatibility because of course you could still use FROM scratch if you wanted to.


Reply to this email directly or view it on GitHub #4242 (comment).

@thaJeztah
Copy link
Member

I'm having beginning users in mind; making the FROM optional makes it hard to distinguish if someone intentionally wanted to start from scratch or forgot to add it.

From that perspective I like an explicit approach.

Technically, I agree; it's pretty useless to pull an empty image from the repository (and possibly unwanted for people wearing tinfoil hats). Additionally; having no parent image in docker inspect is more explicit than having to know the uuid of "scratch"

But again, from a user perspective, I'd prefer an explicit FROM <something>, perhaps FROM BLANK, FROM SPACE or FROM THIN-AIR 😄

@tiborvass
Copy link
Contributor

From the point of view of rethinking the future of Dockerfiles, it is very convenient right now that all current Dockerfiles start with FROM. It's a way we could distinguish from future Dockerfiles without having to resort to a VERSION instruction. Not saying this is the way we would go, but it is definitely valuable to keep this option possible.

So I would prefer having FROM scratch a special case at least for now, and I'm open for revisiting this later on. The algorithm could be either: "always handle FROM scratch as a special case", or "look for a scratch image and if there are none, then special case", or have FROM - be the new FROM scratch and be handled as a special case, which would also require to forbid image names that start with - (it's a sane thing to do).

I do wonder how any of the solutions would impact fragmentation of FROM scratches, how would the new algorithm handle old scratch images ?

@erikh @jlhawn @tianon @unclejack

@unclejack
Copy link
Contributor

@tiborvass FROM scratch would be very easy to standardize by making it do nothing or by making it always use the same ID everywhere ("produce" the image with the existing ID locally instead of pulling it whenever it's needed). This is what #5262 was doing.

@jlhawn
Copy link
Contributor

jlhawn commented Oct 29, 2014

@unclejack Our goal is to not use that image ID (511136ea...) or any image ID. Going forward, the distribution team's goal is to optimize away any empty file system layers like this.

@unclejack
Copy link
Contributor

@jlhawn We can make it a no-op in that case, I don't see how this would cause troubles. The Dockerfile would still work on all existing Docker versions and pulling existing images would still work.

@jlhawn
Copy link
Contributor

jlhawn commented Oct 29, 2014

I like the no-op approach, and I think I'm really close to an implementation that everyone can agree on. I feel kind of bad for these guys though: http://scratch.mit.edu/ I guess scratch will never be one of our official language pack base images :(

@tiborvass
Copy link
Contributor

@jlhawn @unclejack That's why I offered the "look for image named scratch, if none, then no-op" approach. WDYT?

@unclejack
Copy link
Contributor

@tiborvass Making FROM scratch always be a no-op would be the best approach. Keeping this as simple as possible would be a very good idea. The scratch language could be scratch-lang or something like that.

@tiborvass
Copy link
Contributor

Fair enough :) I guess we should design review this with @shykes and then we should be all set.

@shykes
Copy link
Contributor

shykes commented Nov 12, 2014

Let's do the following:

  • 1: Make FROM optional (as I already decided earlier)
  • 2: As a convenience for reverse-compatibility, make FROM scratch a no-op, which reduces the number of moving parts. Note: this should not be documented or encouraged, and should trigger a deprecation notice as soon as 1) is implemented.

@duglin
Copy link
Contributor

duglin commented Nov 12, 2014

if we do this, and since FROM is required right now, can we treat the absence of FROM as the trigger to indicate that we're using the, yet to be written, v2 parser? Then we can use the parser that will be more like the command line parser (deals with quotes, spaces, 's etc... properly).

@shykes
Copy link
Contributor

shykes commented Nov 12, 2014

@duglin this feels pretty orthogonal from a future v2 parser. We can do all this without any change to the syntax, and without waiting for a much more ambitious change to land.

@duglin
Copy link
Contributor

duglin commented Nov 12, 2014

I wasn't suggesting that we wait, I was just wondering about what possible triggers we could use to know when to use the new parser vs the old one. Thought this might be a good one - along with possible others that we being discussed.

@tiborvass
Copy link
Contributor

@shykes I agree with @duglin on not making it optional right away. I'm +1 for no-op.

The only reason it is not as orthogonal from a future v2 parser, is because it is super convenient for a v2 parser to assume that v1 Dockerfiles have a mandatory FROM. And we can always make the decision of making it optional later. The blocker for distribution is that layer ID for scratch. Making FROM scratch no-op solves it. Making it optional is just a little extra, and it would be throwing away a potentially useful card from our hands. That's all :)

@thaJeztah
Copy link
Member

Fwiw, I'm +1 with Tibors comment.

@erikh
Copy link
Contributor

erikh commented Nov 12, 2014

I think @duglin/@tibor’s idea is kind of hacky, but I totally understand the point of this and agree 100%. Making changes to the builder — big ones — is basically impossible with the lack of versioning atm.

+1 for a no-op but required FROM

@tiborvass
Copy link
Contributor

@crosbymichael @jlhawn Even if FROM scratch were optional for aesthetics, we need a special codepath for it for reverse compatibility. Solomon explained our decision here: #8827 (comment)

There were some misunderstandings about back and forths. I looked into git log and issues and PRs: it seems that scratch was never a special case. The documentation at some point explained how to create a scratch image from /dev/null, and that it was available on the hub. The only PR that attempted to change this was, unclejack's #5262 that, ironically, was closed in favor of a solution that would make from scratch optional.

However, in the last design review session, we have decided to keep from scratch mandatory for now. The newer PR that solves that is #8827 which differs from #5262 in its implementation by not automatically adding an empty layer, but not having a layer at all, which is indeed preferable.

So conclusion: we should merge #8827 with FROM scratch being a mandatory no-op.

jlhawn pushed a commit to jlhawn/docker that referenced this issue Dec 18, 2014
There has been a lot of discussion (issues 4242 and 5262) about making
`FROM scratch` either a special case or making `FROM` optional, implying
starting from an empty file system.

This patch makes the build command `FROM scratch` special cased from now on
and if used does not pull/set the the initial layer of the build to the ancient
image ID (511136ea..) but instead marks the build as having no base image. The
next command in the dockerfile will create an image with a parent image ID of "".
This means every image ever can now use one fewer layer!

This also makes the image name `scratch` a reserved name by the TagStore. You
will not be able to tag an image with this name from now on. If any users
currently have an image tagged as `scratch`, they will still be able to use that
image, but will not be able to tag a new image with that name.

Goodbye '511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158',
it was nice knowing you.

Fixes moby#4242

Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
ColinHuang pushed a commit to fcwu/docker that referenced this issue Jan 5, 2015
There has been a lot of discussion (issues 4242 and 5262) about making
`FROM scratch` either a special case or making `FROM` optional, implying
starting from an empty file system.

This patch makes the build command `FROM scratch` special cased from now on
and if used does not pull/set the the initial layer of the build to the ancient
image ID (511136ea..) but instead marks the build as having no base image. The
next command in the dockerfile will create an image with a parent image ID of "".
This means every image ever can now use one fewer layer!

This also makes the image name `scratch` a reserved name by the TagStore. You
will not be able to tag an image with this name from now on. If any users
currently have an image tagged as `scratch`, they will still be able to use that
image, but will not be able to tag a new image with that name.

Goodbye '511136ea3c5a64f264b78b5433614aec563103b4d4702f3ba7d4d2698e22c158',
it was nice knowing you.

Fixes moby#4242

Docker-DCO-1.1-Signed-off-by: Josh Hawn <josh.hawn@docker.com> (github: jlhawn)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet