New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker image prune/rm removes in-use image tags #36295

Open
githubsaturn opened this Issue Feb 13, 2018 · 21 comments

Comments

Projects
None yet
5 participants
@githubsaturn
Copy link

githubsaturn commented Feb 13, 2018


BUG REPORT INFORMATION

Describe the results you received:
Running docker image prune --all --force removes IN-USE images causing locally created images to get lost.

Describe the results you expected:
If a service is running via locally-create-image:1, running docker image prune --all --force SHOULD NOT remove locally-create-image:1

Steps to reproduce the issue:

  1. docker swarm init # simply create a swarm
  2. mkdir /test-bug
  3. cd /test-bug
  4. echo "FROM nginx:1.13.8" > Dockerfile # this is an example, it can be anything
  5. docker build -t myimage:1 .
  6. docker service create --name my-service myimage:1
  7. wait and make sure that your container is running. Check using docker ps -a, you will see image is listed as myimage:1
  8. docker image prune --all --force # this is supposed to remove all "unused" images
  9. docker service update my-service --force # this can be an update of environmental variables or anything else that causes a container restart.
  10. You will see that the service does not get restarted as there is no myimage:1 image!

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:	17.12.0-ce
 API version:	1.35
 Go version:	go1.9.2
 Git commit:	c97c6d6
 Built:	Wed Dec 27 20:11:19 2017
 OS/Arch:	linux/amd64

Server:
 Engine:
  Version:	17.12.0-ce
  API version:	1.35 (minimum version 1.12)
  Go version:	go1.9.2
  Git commit:	c97c6d6
  Built:	Wed Dec 27 20:09:53 2017
  OS/Arch:	linux/amd64
  Experimental:	false

Output of docker info:

Containers: 7
 Running: 4
 Paused: 0
 Stopped: 3
Images: 66
Server Version: 17.12.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 113
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: active
 NodeID: hl08z3yjk34msargicyuklvzy
 Is Manager: true
 ClusterID: k883g1werpi8dj207nr09kj3h
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
  Force Rotate: 0
 Autolock Managers: false
 Root Rotation In Progress: false
 Node Address: 174.7.187.204
 Manager Addresses:
  174.7.187.204:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 89623f28b87a6004d4b785663257362d1658a729
runc version: b2567b37d7b75eb4cf325b77297b140ea686ce8f
init version: 949e6fa
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.4.0-112-generic
Operating System: Ubuntu 16.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.682GiB
Name: REMOVED
ID: REMOVED
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: dockersaturn
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

WARNING: No swap limit support
@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

The issue seems to be that some images are untagged by prune, even when they are used.

I can reproduce the issue if there are multiple repo tags for a single image id. Because of this:

if !force && isSingleReference(repoRefs) {
if container := daemon.getContainerUsingImage(imgID); container != nil {
// If we removed the repository reference then
// this image would remain "dangling" and since
// we really want to avoid that the client must
// explicitly force its removal.
err := errors.Errorf("conflict: unable to remove repository reference %q (must force) - container %s is using its referenced image %s", imageRef, stringid.TruncateID(container.ID), stringid.TruncateID(imgID.String()))
return nil, errdefs.Conflict(err)
}
}
parsedRef, err := reference.ParseNormalizedNamed(imageRef)
if err != nil {
return nil, err
}
parsedRef, err = daemon.removeImageRef(parsedRef)
if err != nil {
return nil, err
}

This logic seems a bit broken. Why does it matter how many references there are? Shouldn't it simply check if any container is using the specific reference that was requested to be deleted? @tonistiigi

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

This was modified in 1f7a9b1 so cc @dmcgowan as well.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 21, 2018

@dnephin I think the logic in that function means that if an image has multiple references and one reference is deleted it just untags it, in any case, doesn't matter if there are containers or not.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

Yes, but that condition is a filter for when to check if a container uses it. If it evaluates to false it proceeds to untag the image.

Why would it remove a tag that is used, just because there is more than one tag? Doesn't that sound broken?

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

Apparently this very bizarre behaviour is even documented:

// If the given imageRef is a repository reference then that repository
// reference will be removed. However, if there exists any containers which
// were created using the same image reference then the repository reference
// cannot be removed unless either there are other repository references to the
// same image or force is true. Following removal of the repository reference,

I'm not sure why, but apparently this is the expected behaviour.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 21, 2018

@dnephin I guess it so that docker run busybox && docker tag busybox busybox2 && docker rmi busybox2 doesn't error

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

Sure, but docker rmi busybox should not untag busybox, however it does today. That's why I think the behaviour is wrong.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 21, 2018

A container isn't connected to a reference but the image itself by ID as references can always change during the lifetime of the container (I know there is a text field in config but it should be unused apart from container inspect view).

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 21, 2018

I understand that, I don't think that changes my expectation for the behaviour. Also this issue is about services, which do use the reference when updated.

I would argue this is issue is a bug with ImageDelete, not specific to prune.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 21, 2018

Also this issue is about services, which do use the reference when updated.

With the current code/api/ux, a container or rmi has no concept of a cluster.

I would argue this is issue is a bug with ImageDelete, not specific to prune.

I think this was the desired behavior for ImageDelete. Is it the best solution or not is of course arguable. The base issue seems to be that someone thought all cases can be covered without the need for untag and that assumption doesn't seem to hold. But the bug, in this case, seems to be that prune uses the same codepath that rmi although it expects a different behavior than what is documented as the intended behavior for rmi.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 22, 2018

the bug, in this case, seems to be that prune uses the same codepath that rmi although it expects a different behavior

I disagree. image prune and image rm should absolutely use the same code path and have the exact same behaviour. There is no good reason for them to behave differently in some very subtle way. That is a terrible UX.

That leaves us with two options:

  1. ImageDelete is wrong, but no one has really cared until now because it was only subtly wrong.
  2. the current behaviour of docker prune is correct.

I personally think it is the first case.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 22, 2018

image prune and image rm should absolutely use the same code path and have the exact same behaviour.

rmi is a combination of untag and/or delete image. This is not subtle but very intentional decision from the first 0.* releases to have an effect that references are same as images. You can't just skip the reference check in prune(that are different as no reference is explicit) and expect rmi to error when an image wasn't deleted because it will just untag then. Even if we fix this container ID vs image ID check (#36346) and this starts to work again it is almost accidental that the rmi code works here as most of it will be impossible to reach from prune. The bug in ImageDelete is that it should be called ImageDeleteOrUntag as it can return successfully without any image being deleted (as is clearly stated in the code comments and rmi docs).

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 23, 2018

rmi is a combination of untag and/or delete image. This is not subtle but very intentional decision

I agree, this is absolutely correct.

The problem is not that it untags images, the problem is what it untagged is not intuitive. When it was just the container API this was not a bit deal, but with the service API and service update it is more of an issue (which is why this github issue was opened). My reasoning for why this is not just a problem with prune is because the same problem exists with image rm:

$ docker image ls
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
myimage             1                   d3117424aaee        2 weeks ago         27.1MB
redis               alpine              d3117424aaee        2 weeks ago         27.1MB
$ docker service create myimage:1
...
lbeozsxxzluvrtkqvdqgumxho
$ docker container ls
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS               NAMES
50a94c08cf19        myimage:1           "docker-entrypoint..."   10 seconds ago      Up 9 seconds        6379/tcp            serene_volhard.1.t434m24w64enows81ed5034r3
$ docker image rm myimage:1
Untagged: myimage:1
$ docker service update --force lbeozsxxzluvrtkqvdqgumxho
overall progress: 0 out of 1 tasks
1/1: No such image: myimage:1
service update paused: update paused due to failure or early termination of task 
$

Exact same problem with image rm, the service is left in a broken state even though no --force was used on rm. This is exactly the same issue.

The problem is, as I mentioned above, that a reference that is "in use" is allowed to be removed. It's not really relevant the container is actually using an image ID because the image reference needs to exist for a user to re run the docker run command, or to run service update command. So from the perspective of a user, it is still in use. (maybe we need to only consider the tag if a container uses the same image ID as the tag points at, I'm not sure if that extra check is necessary).

ImageDelete should only allow you to untag if nothing is "using" that reference. The number of references that share the same image ID should be irrelevant.

@dnephin dnephin changed the title docker image prune --all removes in-use image docker image prune/rm removes in-use image tags Feb 23, 2018

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 23, 2018

Exact same problem with image rm, the service is left in a broken state even though no --force was used on rm. This is exactly the same issue.

I don't think this is the same issue. The issue you are describing is about different semantics between services and containers. A container is tied to an image, while service is tied to a reference that should point to a highly available external storage(with a fallback case in a worker if it fails). This fallback only works if a worker node has been manually set up in a specific way to answer a possible cluster request, while the node itself has no knowledge of being in a cluster. You are conveniently mixing cluster-manager and local-node commands in that example but in reality, we need to consider these commands running on different machines. To correctly solve this containers vs services problem we need a local cluster-aware storage that keeps track of all service images without the need for an external registry.

Prune is wrong in this case regardless if you want to change the deletion rules(that were designed before there were services). You don't need to use a service for prune to untag your image that is being used by a container.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 23, 2018

I don't think this is the same issue.

Did you see my example? It is identical.

You don't need to use a service for prune to untag your image that is being used by a container.

Yes, both prune and image rm are broken for both containers and services, it's just more obviously wrong with services.

You are conveniently mixing cluster-manager and local-node commands in that example but in reality, we need to consider these commands running on different machines.

It's just as valid to run them on a single machine. It needs to work for both.

I think you're making this problem way more complicated than it needs to be. It's simple. A user issues a request to delete unused images. It removes an image that the user considers in-use, and that our CLI/API shows as in-use. That sounds like a bug to me. Either the the CLI/API shouldn't show the image tag, or the image delete is wrong.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 23, 2018

A user issues a request to delete unused images. It removes an image that the user considers in-use, and that our CLI/API shows as in-use. That sounds like a bug to me

Exactly. Prune is broken. It is broken because of a typo that compared image IDs to container IDs. Let's fix that please.

If you want to change the semantics of rmi that have been there for a long time and are documented and do have clear use-cases, please create another issue/proposal to discuss that.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 23, 2018

It is broken because of a typo that compared image IDs to container IDs. Let's fix that please.

Keeping a map of image IDs was never the right solution. prune is fixed, now the only issue is with image remove. image delete and image prune must behave the same way. We don't need a separate issue.

do have clear use-cases,

What is the use case? Your example of keeping at least one tag is still supported by my proposed behaviour.

@dmcgowan

This comment has been minimized.

Copy link
Member

dmcgowan commented Feb 23, 2018

image delete and image prune must behave the same way

Why? They are very different functionality, one the user is requesting to remove something specific, the second is the daemon deciding what might be eligible for removal.

We shouldn't break the image delete behavior to fix a bug in prune. We have been fairly consistent in not breaking this sort of behavior because it tends to have more unexpected consequences for users than the original behavior. Honestly, it is confusing either way so we just need to make it clear in documentation. I would rather avoid changing this behavior now as it will have an impact on any future changes involving referencing/tagging.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 23, 2018

prune should just be an automated version of image rm. That should be the only difference. Otherwise it's unnecessary confusion for the user.

In the case of image rm the engine still needs to figure out if it's eligible for removal as well. It's even documented here:

// Hard Conflict:
// - a pull or build using the image.
// - any descendant image.
// - any running container using the image.
//
// Soft Conflict:
// - any stopped container using the image.
// - any repository tag or digest references to the image.

We shouldn't break the image delete behavior to fix a bug in prune

We aren't. We're fixing a bug in image rm:

$ docker pull redis:alpine
...
$ docker tag redis:alpine myimage:1
$ docker run -d myimage:1
552ae8f1134235f9c7d41ec220c9978cb51fd387d4a2d2cd4215b108950ee7c1
$ docker container ls --format '{{.Image}}'
myimage:1
$ docker image rm myimage:1
Untagged: myimage:1
$ docker image rm redis:alpine
Error response from daemon: conflict: unable to remove repository reference "redis:alpine" (must force) - container 20c7d341f7f5 is using its referenced image d3117424aaee
$ docker container ls --format '{{.Image}}'
d3117424aaee

I don't know how you can say this isn't broken. It's fine to untag a reference that IS being used and referenced by a container, but it's not ok to remove a completely unreferenced tag?

If we have to keep a tag around, we should keep the tag that's actually being used.

@tonistiigi

This comment has been minimized.

Copy link
Member

tonistiigi commented Feb 24, 2018

What is the use case?

for example:

  • tagging image with myimage-backup without removing the container and freeing up the reference to be used by the new container with an option to rollback to the old image
  • clearing up junk images that are duplicates of used ones
  • untagging myrepo:latest when you know that it is not up-to-date anymore. so that when new containers are created they would point to/pull a new image

prune should just be an automated version of image rm. That should be the only difference. Otherwise it's unnecessary confusion for the user.

I explained the differences 3 comments above. Prune is about removing unused images, rmi is not only about removing images but also about untagging a specific reference that user has picked out.

Those untagging rules are running before determining conflicts and documented just 10 lines before the conflicts documentation.

I don't know how you can say this isn't broken.

It isn't broken because this is what it was designed to do and how it is documented. That design idea/compromise was that container is not related to a reference but an actual image. Changing a reference after creating a container has no effect to the container after that, it is only a pointer for finding an image on creation time. rmi isn't only way to untag an image, you can do the same thing with tag and pull. Your example is much easier to understand than these cases. User explicitly asks to remove or untag the image. It is successfully untagged. It should not be a surprise at all that it can't be used in a completely new docker run. Container list shows a link to parent image so all the tools using that link still work(as well as restarting the container). The ref doesn't disappear from the container list because the reference was deleted but because the list code knows that references are mutable and always checks if whatever string user typed when starting a container would still be valid as a reference to that image. Inspect always shows image ID, and inspecting images shows all current refs as equal.

In case you have better ideas feel free to make proposals but finding a documented behavior that has weird edge cases does not make it a bug. rmi behaves like documented, prune does not behave like documented.

@dnephin

This comment has been minimized.

Copy link
Member

dnephin commented Feb 27, 2018

tagging image with myimage-backup without removing the container and freeing up the reference to be used by the new container with an option to rollback to the old image

This is still supported by my proposed behaviour because you can always build or image tag over an existing tag. You don't even need to delete the original tag. Why would you need to "free it" ?

clearing up junk images that are duplicates of used ones

Should still be supported as long as no container is using them. If a container is using them they aren't junk, right? Or if you really want to you can always -f.

untagging myrepo:latest when you know that it is not up-to-date anymore. so that when new containers are created they would point to/pull a new image

This is the only use case that isn't directly supported. However a more straighforward way of doing this would be to docker pull myrepo:latest. Why just rmi the thing instead of updating it directly?

It isn't broken because this is what it was designed to do and how it is documented

That's fair. Broken is the wrong word for it. I think it's unexpected behaviour that needs to be fixed, so you're correct that I should open a new issue for this. I have opened #36435.

I still maintain that image prune and image rm should behave the same way (with the exception of one being automated). Not because image rm should act like prune, but because consistent and expected behaviour are important qualities of an API. So I consider this issue (#36295) as a wont-fix, as it will be addressed by #36435.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment