Permanent "Removal In Progress" for old containers, using zfs storage driver #40132

Open
satmandu opened this issue Oct 24, 2019 · 60 comments

@satmandu

Description
Docker containers slated for removal are piling up, and when I attempt manual removal I get an error of the form driver "zfs" failed to remove root filesystem: exit status 1:

docker ps -a shows many containers waiting for removal.
docker ps -a | grep Removal | cut -f1 -d' ' | xargs -rt docker rm fails.

docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS                      PORTS               NAMES
213d1407a332        a6dfff132da0        "/bin/bash -c 'set -…"   15 hours ago        Exited (100) 15 hours ago                       optimistic_wescoff
9bcf46b5980b        8be4a38f8e02        "/bin/bash -c 'sudo …"   15 hours ago        Exited (127) 15 hours ago                       vibrant_dubinsky
4f02809ed44c        satmandu-kodi       "env /usr/local/bin/…"   24 hours ago        Up 24 hours                                     x11docker_X105_satmandu-kodi_49665454097
075b60b5e967        satmandu-kodi       "env /usr/local/bin/…"   25 hours ago        Removal In Progress                             x11docker_X104_satmandu-kodi_45363708509
ab4a3ef0d96b        satmandu-kodi       "env /usr/local/bin/…"   25 hours ago        Removal In Progress                             x11docker_X103_satmandu-kodi_45207542493
c04bb7e38d7b        satmandu-kodi       "env /usr/local/bin/…"   25 hours ago        Removal In Progress                             x11docker_X102_satmandu-kodi_44789374800
e1ba0ec0df24        88d4a3de6210        "env /usr/local/bin/…"   27 hours ago        Removal In Progress                             x11docker_X101_satmandu-kodi_39512307217
24e304747faf        88d4a3de6210        "env /usr/local/bin/…"   39 hours ago        Removal In Progress                             x11docker_X108_satmandu-kodi_93803639104
36f66dc51047        88d4a3de6210        "env /usr/local/bin/…"   39 hours ago        Removal In Progress                             x11docker_X107_satmandu-kodi_93435839038
716cde2499fb        88d4a3de6210        "env /usr/local/bin/…"   40 hours ago        Removal In Progress                             x11docker_X106_satmandu-kodi_93245427242
b183933fc921        88d4a3de6210        "env /usr/local/bin/…"   40 hours ago        Removal In Progress                             x11docker_X105_satmandu-kodi_91863983068
bdd1c55c4743        88d4a3de6210        "env /usr/local/bin/…"   44 hours ago        Removal In Progress                             x11docker_X104_satmandu-kodi_78364574321
7f430b154666        88d4a3de6210        "env /usr/local/bin/…"   44 hours ago        Removal In Progress                             x11docker_X103_satmandu-kodi_76876480629
7968ec4562f0        88d4a3de6210        "env /usr/local/bin/…"   2 days ago          Removal In Progress                             x11docker_X102_satmandu-kodi_53781254445
2a15a476c879        88d4a3de6210        "env /usr/local/bin/…"   2 days ago          Removal In Progress                             x11docker_X101_satmandu-kodi_53106107069
623b3d291c51        88d4a3de6210        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X104_satmandu-kodi_77806763668
5fc0eac5b0aa        88d4a3de6210        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X103_satmandu-kodi_77339657697
96192687e16e        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X102_satmandu-kodi_76598924552
ea1d929c65da        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X101_satmandu-kodi_66169732770
674981d43f29        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X107_satmandu-kodi_07962283478
98f04a33b6ae        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X106_satmandu-kodi_06757784407
1ae55ba67232        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X105_satmandu-kodi_05434929876
f80356a0b85c        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X104_satmandu-kodi_03975305877
a4fe94d26646        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X103_satmandu-kodi_03797671848
1b1b1049fec9        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X102_satmandu-kodi_03517303389
f7f9f57fa885        13d9bc4cd61a        "env /usr/local/bin/…"   3 days ago          Removal In Progress                             x11docker_X101_satmandu-kodi_92396666201
05fc62552ee0        13d9bc4cd61a        "env /usr/local/bin/…"   4 days ago          Removal In Progress                             x11docker_X103_satmandu-kodi_20839137997
1927abb8609e        13d9bc4cd61a        "env /usr/local/bin/…"   4 days ago          Removal In Progress                             x11docker_X102_satmandu-kodi_20492730103
59b036236f39        13d9bc4cd61a        "env /usr/local/bin/…"   4 days ago          Removal In Progress                             x11docker_X101_satmandu-kodi_17807034858
docker ps -a | grep Removal | cut -f1 -d' ' | xargs -rt docker rm
docker rm 075b60b5e967 ab4a3ef0d96b c04bb7e38d7b e1ba0ec0df24 24e304747faf 36f66dc51047 716cde2499fb b183933fc921 bdd1c55c4743 7f430b154666 7968ec4562f0 2a15a476c879 623b3d291c51 5fc0eac5b0aa 96192687e16e ea1d929c65da 674981d43f29 98f04a33b6ae 1ae55ba67232 f80356a0b85c a4fe94d26646 1b1b1049fec9 f7f9f57fa885 05fc62552ee0 1927abb8609e 59b036236f39 
Error response from daemon: container 075b60b5e9674a509f277b661d8c3e50b83ede32ad6b0bdfad15e99c5f375706: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/4d69eccca2293f5dc5572df069227e830cf8db17cf681d83293e73a5418bbafd" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/4d69eccca2293f5dc5572df069227e830cf8db17cf681d83293e73a5418bbafd': dataset does not exist
Error response from daemon: container ab4a3ef0d96b8eb07a4901e1a9243c32d78f5c20b0af343c9e09eae4f111f9f4: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/23e8e60f02709c79d54a42c70b8784055151bd2e917127dd71f75e8043e45e9e" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/23e8e60f02709c79d54a42c70b8784055151bd2e917127dd71f75e8043e45e9e': dataset does not exist
Error response from daemon: container c04bb7e38d7b03ffe91e107b64755aeaee31c362233ee587be647379f0000007: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/385e06af0f3b86981d49993a0a485ed4ed038cb22f02a5e793a737e572bbf6ab" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/385e06af0f3b86981d49993a0a485ed4ed038cb22f02a5e793a737e572bbf6ab': dataset does not exist
Error response from daemon: container e1ba0ec0df244bae8628e29dc48521d0651e51acf8fd80da5dae522c0568e3b5: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/604a079a496974d8b904f158c4a5ff5f960c3c533afb0937bcb58024b8abc00d" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/604a079a496974d8b904f158c4a5ff5f960c3c533afb0937bcb58024b8abc00d': dataset does not exist
Error response from daemon: container 24e304747faf0c6cedebbc933bc35a3c82c4c54e3714e21474c26288f7abe838: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/e3514e420287680bc8bdaa5ace636603620dbbebef1105e1bc826f26d20e973f" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/e3514e420287680bc8bdaa5ace636603620dbbebef1105e1bc826f26d20e973f': dataset does not exist
Error response from daemon: container 36f66dc510474e16e3b3b791c56d923c08dcae9234429788edd036c6e445a94c: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/fd985e0b541a123995d1c0a55aef44e077ad36448bb706c4f11ec980b0652021" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/fd985e0b541a123995d1c0a55aef44e077ad36448bb706c4f11ec980b0652021': dataset does not exist
Error response from daemon: container 716cde2499fbb116e8ab015b4875ad2051a7505e8d1fa41d3dc4ebb158959439: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/dc7dce1f312e7a9159cdec76ea0e49d4071821b108fe71d73761fd425eed26cb" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/dc7dce1f312e7a9159cdec76ea0e49d4071821b108fe71d73761fd425eed26cb': dataset does not exist
Error response from daemon: container b183933fc921224f29aa5ae6de4b3cff247f276bce20f0b272ccb9e0cc70773e: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/6c95479a2138d7ad61ee78613a6470b8707c13379f044b3057477b23cf67874d" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/6c95479a2138d7ad61ee78613a6470b8707c13379f044b3057477b23cf67874d': dataset does not exist
Error response from daemon: container bdd1c55c47439f6814baf8b75c9ae85a9d2aa18aead88368d066de8c5046859a: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/9d4e984dbd20a7140241bac2d68a2fa8b8330cab66e416109f13095f3af71f8e" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/9d4e984dbd20a7140241bac2d68a2fa8b8330cab66e416109f13095f3af71f8e': dataset does not exist
Error response from daemon: container 7f430b154666635bfe17448564a83ac95a21d807ba6abc82dbd01c63e1d29a88: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/b2b9d4140aa527691e694a0c0a4f59a73785426c1bdd22b3dd6e4db4b96aaeb5" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/b2b9d4140aa527691e694a0c0a4f59a73785426c1bdd22b3dd6e4db4b96aaeb5': dataset does not exist
Error response from daemon: container 7968ec4562f079ca4cbda7cfb0319dd38990d2a2f7776097b209d0513a08c365: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/7ad549461a2e41b48c219b40479d4706be4f6b656250117ac45f42ffbda4fd6a" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/7ad549461a2e41b48c219b40479d4706be4f6b656250117ac45f42ffbda4fd6a': dataset does not exist
Error response from daemon: container 2a15a476c879b2736dfa3e1dc8f7e76840d4498f2687eb605f9c5b65b4089c09: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/5a0174ac1c8f5f20ca9c70989f396406b3fe49e12490fe67d11ab8eb31f29041" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/5a0174ac1c8f5f20ca9c70989f396406b3fe49e12490fe67d11ab8eb31f29041': dataset does not exist
Error response from daemon: container 623b3d291c518905a765ad4b661d8c6a59fccc6b39f0c7a61ed561db891c2754: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/0e544be1bf97d3657701d5526381f90c75baae3da290b8759cd9bb62e5cb99da" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/0e544be1bf97d3657701d5526381f90c75baae3da290b8759cd9bb62e5cb99da': dataset does not exist
Error response from daemon: container 5fc0eac5b0aabe9b3ae05ceee9bd28261ac0dcdec823fef12b734667ad6b2886: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/813474299a2d5e2b1eb8122cefebf2e28aa1f997adf1f70ea7f652856eee2032" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/813474299a2d5e2b1eb8122cefebf2e28aa1f997adf1f70ea7f652856eee2032': dataset does not exist
Error response from daemon: container 96192687e16e6fc3a9ed9f11c664f12eba0c389ba592e7417c156d8a6ed49de2: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/16e9454788a3b1b25efb04f0d0f2136fe9bd2258de44b91afd75f63a625931b9" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/16e9454788a3b1b25efb04f0d0f2136fe9bd2258de44b91afd75f63a625931b9': dataset does not exist
Error response from daemon: container ea1d929c65da55f576cccc5dcf1e50573db91805e09015ad75dc67747f559be4: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/fad3d48c99dff1590e942945f4b245c51f9851f03ab7e7bf648ea0f8795cdb42" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/fad3d48c99dff1590e942945f4b245c51f9851f03ab7e7bf648ea0f8795cdb42': dataset does not exist
Error response from daemon: container 674981d43f29941d7de9af6df44393e3e3907d658347f18a5575b9e21cbc5c97: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/3e99056f6047c63abcad6daa52b8beb7158eaa581edd49a46b27fcd82bd95c55" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/3e99056f6047c63abcad6daa52b8beb7158eaa581edd49a46b27fcd82bd95c55': dataset does not exist
Error response from daemon: container 98f04a33b6aedaa3d84934a0753b26dee1852c3cc7ebed1120a8963cb998f679: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/4a9065eda1c08584c3703306b3c3f856d7e7aa4dc5e7a8cf87b99b83e389f765" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/4a9065eda1c08584c3703306b3c3f856d7e7aa4dc5e7a8cf87b99b83e389f765': dataset does not exist
Error response from daemon: container 1ae55ba6723234507c9d7390c8951c53f8900f491510719a2a6e13259c6061b4: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/3c4b20b36ce8fed6da82407c7cecd7beb553e8fa8170798f7860c805a1ab9cef" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/3c4b20b36ce8fed6da82407c7cecd7beb553e8fa8170798f7860c805a1ab9cef': dataset does not exist
Error response from daemon: container f80356a0b85cc60a6df4f29069432d08a8951a3f05d04e5f4241141ab6569872: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/bf8fb8de598c2706e53fbf0bf8714e19326b7e394a2b8f008591b0198a4d319b" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/bf8fb8de598c2706e53fbf0bf8714e19326b7e394a2b8f008591b0198a4d319b': dataset does not exist
Error response from daemon: container a4fe94d26646532afbede89cdd308f1d6a5dc3ca7c1bb3519bd6c46469716c3c: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/022b26369352fe37e15d275f003a81b4b987a9006eb5e96fef4419a6d48a7773" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/022b26369352fe37e15d275f003a81b4b987a9006eb5e96fef4419a6d48a7773': dataset does not exist
Error response from daemon: container 1b1b1049fec97002ee16f672b91129fa5161e55dac76f71e15f9a732963d8661: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/09a562844aaf9253a2a3da76dd52addf74172ad500bf89a72ce4be4e181994f7" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/09a562844aaf9253a2a3da76dd52addf74172ad500bf89a72ce4be4e181994f7': dataset does not exist
Error response from daemon: container f7f9f57fa885757aa08774a2190ac16937529b4209c1b24a08f51cefec5437fb: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/aec39aa28c3174d5e7ae7484c61bb65bfea9b7ee3745aaa6a1d9153cbef9ba62" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/aec39aa28c3174d5e7ae7484c61bb65bfea9b7ee3745aaa6a1d9153cbef9ba62': dataset does not exist
Error response from daemon: container 05fc62552ee0ab01b95f3292bcdeefcf6529c5e69aee38a1ffa7226e9b7288ee: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/3cce7aed941a1412408a50f7e603ce651a8fa089b940a685fe2ffdc3db71f224" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/3cce7aed941a1412408a50f7e603ce651a8fa089b940a685fe2ffdc3db71f224': dataset does not exist
Error response from daemon: container 1927abb8609ec0144bfac4d87928a4e5f31b1d119b25a6519b0ac33e51d70db0: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/4762d5901cccaa59860f558f341240de5261f276b6d983738f74f683056e047d" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/4762d5901cccaa59860f558f341240de5261f276b6d983738f74f683056e047d': dataset does not exist
Error response from daemon: container 59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13': dataset does not exist

Steps to reproduce the issue:

This is happening for many of the containers I create, and it persists across reboots.

I'm running a current Ubuntu 19.10 system with a ZFS rpool (though I've also seen this recently on a non-ZFS root system with a ZFS volume for /var/lib/docker).

Output of docker version:

(paste your output here)

Output of docker info:

docker info
Client:
 Debug Mode: false

Server:
 Containers: 29
  Running: 1
  Paused: 0
  Stopped: 28
 Images: 29
 Server Version: 19.03.3
 Storage Driver: zfs
  Zpool: rpool
  Zpool Health: ONLINE
  Parent Dataset: rpool/ROOT/ubuntu_abcdef/var/lib
  Space Used By Parent: 2037235712
  Space Available: 232996900864
  Parent Quota: no
  Compression: lz4
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc version: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 init version: fec3683
 Security Options:
  apparmor
  seccomp
   Profile: default
 Kernel Version: 5.3.0-19-generic
 Operating System: Ubuntu 19.10
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 46.96GiB
 Name: cheekon
 ID: Q5DS:AESB:UPDQ:ZCAM:6TTS:UUG2:DDCV:ZXK2:6LAM:LLSI:4K56:G7PP
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

uname -a
Linux cheekon 5.3.0-19-generic #20-Ubuntu SMP Fri Oct 18 09:04:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
cat /etc/issue
Ubuntu 19.10
 sudo cat /etc/docker/daemon.json 
{
  "storage-driver": "zfs"
}
{
  "ipv6": true
}
{
  "iptables": false
}
cat /etc/apt/sources.list | grep docker
deb [arch=amd64] https://download.docker.com/linux/ubuntu disco stable
# deb-src [arch=amd64] https://download.docker.com/linux/ubuntu eoan stable
@satmandu
Author

When I drill down into one of these containers:

docker inspect 59b036236f39
[
    {
        "Id": "59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428",
        "Created": "2019-10-19T20:43:30.926700775Z",
        "Path": "env",
        "Args": [
            "/usr/local/bin/tini",
            "--",
            "/bin/sh",
            "-",
            "/x11docker/containerrc"
        ],
        "State": {
            "Status": "dead",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": true,
            "Pid": 0,
            "ExitCode": 143,
            "Error": "container 59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428: driver \"zfs\" failed to remove root filesystem: exit status 1: \"/usr/sbin/zfs fs destroy -r rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13\" => cannot open 'rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13': dataset does not exist\n",
            "StartedAt": "2019-10-19T20:43:31.613953469Z",
            "FinishedAt": "2019-10-19T21:01:23.553401596Z"
        },
        "Image": "sha256:13d9bc4cd61af51157266b10235136e7fadfa5ba81ab24af5becf6edfcaf92b4",
        "ResolvConfPath": "/var/lib/docker/containers/59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428/hostname",
        "HostsPath": "/var/lib/docker/containers/59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428/hosts",
        "LogPath": "/var/lib/docker/containers/59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428/59b036236f3985af85c13f7b9c6e95e123ad7fad635cbb74acb38030b0dfe428-json.log",
        "Name": "/x11docker_X101_satmandu-kodi_17807034858",
        "RestartCount": 0,
        "Driver": "zfs",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "docker-default",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/tmp/.X11-unix/X101:/X101:rw",
                "/usr/bin/docker-init:/usr/local/bin/tini:ro",
                "/root/.cache/x11docker/satmandu-kodi-17807034858/share:/x11docker:rw",
                "/home/kodi:/home/kodi:rw"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": true,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": [
                "ALL"
            ],
            "Capabilities": null,
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": [
                "29",
                "20",
                "1002",
                "44",
                "109",
                "29"
            ],
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "no-new-privileges",
                "label=type:container_runtime_t"
            ],
            "Tmpfs": {
                "/run": "",
                "/run/lock": ""
            },
            "UTSMode": "",
            "UsernsMode": "host",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [
                {
                    "PathOnHost": "/dev/ttyACM0",
                    "PathInContainer": "/dev/ttyACM0",
                    "CgroupPermissions": "rw"
                },
                {
                    "PathOnHost": "/dev/dri",
                    "PathInContainer": "/dev/dri",
                    "CgroupPermissions": "rw"
                },
                {
                    "PathOnHost": "/dev/vga_arbiter",
                    "PathInContainer": "/dev/vga_arbiter",
                    "CgroupPermissions": "rw"
                },
                {
                    "PathOnHost": "/dev/snd",
                    "PathInContainer": "/dev/snd",
                    "CgroupPermissions": "rw"
                }
            ],
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": false,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "Dataset": "rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13",
                "Mountpoint": "/var/lib/docker/zfs/graph/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13"
            },
            "Name": "zfs"
        },
        "Mounts": [
        ],
        "Config": {
            "Hostname": "59b036236f39",
            "Domainname": "",
            "User": "1002:1002",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": true,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "USER=kodi",
                "container=docker",
                "XAUTHORITY=/x11docker/Xauthority.client",
                "DISPLAY=:101",
                "ALSA_CARD=default",
                "XDG_RUNTIME_DIR=/tmp/XDG_RUNTIME_DIR",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
            ],
            "Cmd": [
                "/usr/local/bin/tini",
                "--",
                "/bin/sh",
                "-",
                "/x11docker/containerrc"
            ],
            "Image": "satmandu-kodi",
            "Volumes": null,
            "WorkingDir": "/tmp",
            "Entrypoint": [
                "env"
            ],
            "OnBuild": null,
            "Labels": {}
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "c3e605f021180af1442d1d53bc845138e6a4e3501f683802a168a50872ba0d1d",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/c3e605f02118",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "0f4f314dda1173cf4801448e687bb3eb557ae63c2d5e2ee87760e411ca6006b5",
                    "EndpointID": "",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "",
                    "DriverOpts": null
                }
            }
        }
    }
]

I see this dataset mentioned: rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13

And yet when I check to see which datasets exist with similar names I see this:

zfs get creation | grep 58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13
rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13-init            creation  Sat Oct 19 16:43 2019  -
rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13-init@927215024  creation  Sat Oct 19 16:43 2019  -

So clearly some datasets are being left behind, just not the one that Docker is trying to delete.

@satmandu
Author

Maybe there is a connection to docker/for-linux#124 (comment) ?

@dotwaffle

dotwaffle commented Oct 28, 2019

I get into this state using zfs and swarm too, but only when rebooting. It doesn't matter if the containers are running or stopped (i.e. node drained), they always appear "dead" after a reboot, and can't be deleted through the cli. Restarting the docker systemd service does not cause the condition for me.

All I can think of is that perhaps there is a race condition between systemd bringing docker down, and zfs having committed the filesystem destroys and returning. I haven't had time to test this theory yet.

At the moment, I'm draining the node, waiting for things to finish, then doing a system prune before doing the reboot.
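A rough sketch of that pre-reboot sequence (the node name mynode is a placeholder; use whatever docker node ls reports for your host):

docker node update --availability drain mynode
# wait for tasks to drain and for any pending removals to finish
docker system prune -f
reboot
# once the node is back up:
docker node update --availability active mynode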

@satmandu
Author

docker system prune -a doesn't appear to remove the dangling containers for me.

@dotwaffle

dotwaffle commented Oct 30, 2019 via email

@jeanparpaillon

I have the same issue. The workaround I've found so far consists of:

  • create the missing datasets with zfs create ...
  • then remove the container (a minimal sketch follows below)

I'm looking forward to a real solution.
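For example, using the first container and dataset from the error output earlier in this issue (sketch only; substitute your own names from the "dataset does not exist" message):

sudo zfs create rpool/ROOT/ubuntu_abcdef/var/lib/4d69eccca2293f5dc5572df069227e830cf8db17cf681d83293e73a5418bbafd
docker rm 075b60b5e967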

@dotwaffle

I got tired of the issue, so I chose to create a zvol, format it with ext4, mount /var/lib/docker there, and switch from the zfs driver to overlay2. Things not only persist across reboots and no longer occasionally get stuck in "Removal In Progress", but operations are also a lot faster -- no waiting 3-5 seconds on every stage of a docker build invocation, etc.

I don't think the zfs driver is quite ready for prime-time yet, but the documentation does say as much!

@CTGControls

CTGControls commented Dec 14, 2019

I am running into this problem as well. If you need more info or someone to test, I am available for either.

This is the error I got:
Error response from daemon: container 21be9d2c1852e5955bca442c27d9f930ff199f8491c82611c6a545db31906c92: driver "zfs" failed to remove root filesystem: exit status 1: "/sbin/zfs fs destroy -r zfspoola/docker/122185d523e2662465ade50473fb9a8b523d5667e5d7d4eb3f4a6645be2d0f65" => cannot open 'zfspoola/docker/122185d523e2662465ade50473fb9a8b523d5667e5d7d4eb3f4a6645be2d0f65': dataset does not exist

To be able to remove the dead containers, I had to create two ZFS filesystems:

zfs create zfspoola/docker/122185d523e2662465ade50473fb9a8b523d5667e5d7d4eb3f4a6645be2d0f65

and

zfs create zfspoola/docker/122185d523e2662465ade50473fb9a8b523d5667e5d7d4eb3f4a6645be2d0f65-init

@satmandu
Author

satmandu commented Dec 31, 2019

This is a partial, ugly, unsafe solution which appears to be helping me.

docker ps -a | grep Removal | cut -f1 -d' ' | xargs -rt docker rm  2>&1 >/dev/null | grep "dataset does not exist" |  awk '{print $(NF-4)}' | sed "s/'//g" | cut -f1 -d':' |  xargs -L1 sh -c 'for arg do sudo zfs destroy -R "$arg"; sudo zfs destroy -R "$arg"-init ; sudo zfs create "$arg" ; sudo zfs create "$arg"-init ; ...; done' _ ; docker ps -a | grep Removal | cut -f1 -d' ' | xargs -rt docker rm 2>&1 >/dev/null

@nicedevil007

@satmandu thank you for this! Works great.
ZFS creates a lot of legacy child datasets for the Docker environment, and I suspect many of them are no longer needed, like the ones belonging to these "Removal In Progress" containers. Does anybody know a way to find the datasets that are no longer needed?
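A rough sketch of one way to look for them (untested; PARENT is the "Parent Dataset" value from docker info, shown earlier in this issue, so adjust it for your pool; note that image-layer datasets will also show up in the output, so treat it as a list to investigate rather than something to destroy):

#!/usr/bin/env bash
# Compare the datasets ZFS knows about with the ones Docker still references.
PARENT="rpool/ROOT/ubuntu_abcdef/var/lib"   # adjust for your pool
zfs list -H -o name -r "$PARENT" | grep -v "^${PARENT}$" | sed 's/-init$//' | sort -u > /tmp/zfs_datasets
docker ps -aq | xargs -r docker inspect --format '{{.GraphDriver.Data.Dataset}}' | sort -u > /tmp/docker_datasets
# Datasets present in ZFS but not referenced by any container:
comm -23 /tmp/zfs_datasets /tmp/docker_datasets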

@spencerhughes

@satmandu Also, thank you so much, this is working for me as well! Gotta admit though, that's a scary command to just run.

@jsenecal

jsenecal commented Feb 3, 2020

Ran into the same issue. I created a script similar to the one @satmandu created, but it somehow works better for me. You need to have jq installed.

#!/bin/bash

stuck=$(docker ps -a | grep Removal | cut -f1 -d' ')
echo $stuck
for container in $stuck; do
	zfs_path=$(docker inspect $container | jq -c '.[] | select(.State | .Status == "dead")|.GraphDriver.Data.Dataset')
	zfs_path=$(echo $zfs_path|tr -d '"')
	sudo zfs destroy -R $zfs_path
	sudo zfs destroy -R $zfs_path-init
	sudo zfs create $zfs_path
	sudo zfs create $zfs_path-init
	docker rm $container
done

@satmandu
Author

satmandu commented Feb 3, 2020

Ran into the same issue. I created a script similar to the one @satmandu created, but it somehow works better for me.

Nice @jsenecal! This looks much cleaner!

@satmandu
Author

satmandu commented Feb 3, 2020

shellcheck suggests these changes. I imagine you want this as clean as possible, since you could really destroy data. Anyway, I just set up a cron job to run this once a day now.

#!/bin/bash
# Install jq thus:
# sudo apt-get install -y jq
# 
stuck=$(docker ps -a | grep Removal | cut -f1 -d' ')
echo "$stuck"
for container in $stuck; do
	zfs_path=$(docker inspect "$container" | jq -c '.[] | select(.State | .Status == "dead")|.GraphDriver.Data.Dataset')
	zfs_path=$(echo "$zfs_path"|tr -d '"')
	sudo zfs destroy -R "$zfs_path"
	sudo zfs destroy -R "$zfs_path"-init
       	sudo zfs create "$zfs_path"
       	sudo zfs create "$zfs_path"-init
	docker rm "$container"
done

@oramirite

oramirite commented Jun 14, 2020

Oh my gosh - I find this workaround a little hilarious, but also VERY welcome. Thanks y'all.

I'm experimenting with a new infra setup centered around ZFS+Docker, and yeah, I'm having this issue as well. It seems to occur for me on reboots pretty much exclusively - on a fresh system install, I can spin up a bunch of stacks and then remove them, and all containers will get destroyed properly. But if I reboot with all of those stacks running, I'll get a ton of zombie containers left over when it boots back up. This seems to produce additional issues with new containers coming up sometimes, I assume because Swarm gets hung up trying to remove them.

@nepeat

nepeat commented Jul 24, 2020

I can add another data point of this happening on every reboot. It makes running Ceph with cephadm a huge pain, because I have to run the script before restarting Ceph.

Ubuntu 20.04,
zfs-0.8.3-1ubuntu12.2
zfs-kmod-0.8.3-1ubuntu12.1

@satmandu
Author

The Root Cause is being discussed here: #41055

@nepeat

nepeat commented Jul 24, 2020

The Root Cause is being discussed here: #41055

That issue seems unrelated to this issue.

That one seems to be about Docker creating many ZFS volumes, which causes issues for external tools. I cannot seem to find anything there about Docker failing to start or leaving dangling containers awaiting removal.

@satmandu
Author

The Root Cause is being discussed here: #41055

That issue seems unrelated to this issue.

That one seems to be about Docker creating many ZFS volumes, which causes issues for external tools. I cannot seem to find anything there about Docker failing to start or leaving dangling containers awaiting removal.

That issue is literally about Docker creating ZFS datasets and not deleting them, which is exactly the dangling-container situation I was describing when I created this bug report.

@xavier83ar

I'm having the same issue. I recently installed Ubuntu on ZFS, and it seems Docker picked the zfs storage driver by default. @dotwaffle, could you explain how you did it (mount an ext4 volume and move Docker to overlayfs)? Or if you have a link explaining the steps, that would be great. Thanks.

@satmandu
Author

satmandu commented Aug 26, 2020

@xavier83ar Once openzfs/zfs#9414 is merged into ZFS, the problem should hopefully go away. It won't make it into OpenZFS before 2.1 though, probably sometime in 2021.

I worked around the problem for now by just adding an xfs volume in fstab:

UUID="91bffdfd-184f-4338-bc7f-3041fcfefe7e"     /var/lib/docker xfs     defaults        0  1

(Of course you need to format a partition with xfs using mkfs.xfs for use there.)
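For example (sketch only; /dev/sdb1 is a placeholder for whatever partition you dedicate to Docker, and the UUID printed by blkid is what goes into the fstab line above):

sudo systemctl stop docker
sudo mkfs.xfs /dev/sdb1
sudo blkid /dev/sdb1          # copy the UUID into /etc/fstab
sudo mount /var/lib/docker
sudo systemctl start docker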

Then make sure that /etc/docker/daemon.json has a section like this:

{
  "storage-driver": "overlay2"
}

@sgiacomel

sgiacomel commented Jan 8, 2021

I have been trying to deal with this myself, and this was my workaround:

#!/usr/bin/env bash
for container_id in $(docker container ls -qa --filter status=removing); do 
	zpool_object=$(docker container inspect --format='{{.GraphDriver.Data.Dataset}}' ${container_id})
	zfs create "${zpool_object}"
	zfs create "${zpool_object}-init"
	docker container rm $container_id
done

No need for jq, grep or cut, just use the right filter and format for docker commands.

@almereyda

Merging the approaches from @satmandu and @sgiacomel, we get:

#!/usr/bin/env bash
for container_id in $(docker container ls -qa --filter status=removing); do 
        zpool_object=$(docker container inspect --format='{{.GraphDriver.Data.Dataset}}' ${container_id})
        zfs destroy -R "${zpool_object}"
        zfs destroy -R "${zpool_object}-init"
        zfs create "${zpool_object}"
        zfs create "${zpool_object}-init"
        docker container rm $container_id
done

@ingokeck

I have the same problem. Creating/destroying the datasets manually does not really help in my case, because the bug already happens when an intermediate container from a Dockerfile build is deleted. Unfortunately, that is something I rely on quite a bit. So for now I think it would be better to warn people against running Docker on ZFS (which, by the way, is now featured in the newest Ubuntu installer, which is also the reason I have it).

@ingokeck

I got tired of the issue, so I chose to create a zvol, format it with ext4, mount /var/lib/docker there, and switch from the zfs driver to overlay2. Things not only persist across reboots and no longer occasionally get stuck in "Removal In Progress", but operations are also a lot faster -- no waiting 3-5 seconds on every stage of a docker build invocation, etc.

I can confirm that this works nicely.

@cpuguy83
Member

cpuguy83 commented Aug 11, 2021

We recently merged (or are in the process of merging) a change that will use overlay2 as the default driver over zfs or btrfs for new setups.
Overlay didn't use to work on top of these filesystems but now seems to be ok.

@eZtaR1

eZtaR1 commented Sep 18, 2022

Merging the approaches from @satmandu and @sgiacomel, we get:

#!/usr/bin/env bash
for container_id in $(docker container ls -qa --filter status=removing); do 
        zpool_object=$(docker container inspect --format='{{.GraphDriver.Data.Dataset}}' ${container_id})
        zfs destroy -R "${zpool_object}"
        zfs destroy -R "${zpool_object}-init"
        zfs create "${zpool_object}"
        zfs create "${zpool_object}-init"
        docker container rm $container_id
done

Running this allowed me to remove the container, but whenever I try to re-create it, I get the error below:

Wondering if anyone has run into similar issues? (I don't have enough space on my device to create a zvol and migrate my Docker setup to overlay2.)

Creating app ... error

ERROR: for app  Cannot create container for service wekan: exit status 2: "/usr/sbin/zfs fs snapshot rpool/ROOT/ubuntu_xxxxx/var/lib/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx@000000000" => cannot open 'rpool/ROOT/ubuntu_xxxxx/var/lib/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx': dataset does not exist
usage:
        snapshot [-r] [-o property=value] ... <filesystem|volume>@<snap> ...

For the property list, run: zfs set|get

For the delegated permission list, run: zfs allow|unallow


ERROR: for wekan  Cannot create container for service wekan: exit status 2: "/usr/sbin/zfs fs snapshot rpool/ROOT/ubuntu_xxxxx/var/lib/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx@000000000" => cannot open 'rpool/ROOT/ubuntu_xxxxx/var/lib/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx': dataset does not exist
usage:
        snapshot [-r] [-o property=value] ... <filesystem|volume>@<snap> ...

For the property list, run: zfs set|get

For the delegated permission list, run: zfs allow|unallow

ERROR: Encountered errors while bringing up the project.

I am sadly plagued with the same errors; the only thing I've found to help is

service docker stop
rm -rf /var/lib/docker
service docker start

which nukes all containers :/

@ingokeck

Maybe it's best to avoid Docker on ZFS for now. The best workaround in my opinion, if you need to use ZFS, is to create a separate ext4 volume for Docker as described above.

@jgoerzen

The simplest workaround for the "dataset does not exist" is to just zfs create the dataset it wishes exists, then re-run the docker rm, and it will be destroyed and all will be happy.

@eZtaR1

eZtaR1 commented Sep 19, 2022

Maybe it's best to avoid Docker on ZFS for now. The best workaround in my opinion, if you need to use ZFS, is to create a separate ext4 volume for Docker as described above.

Thank you, that's what I've decided to do :)

@satmandu
Author

overlayfs is NOT yet ready atop ZFS. See:

openzfs/zfs#9414
openzfs/zfs#12209
openzfs/zfs#8774

Looks like we're getting closer!

openzfs/zfs#12209 just got completed!

@russkel

russkel commented Dec 19, 2022

This is still an issue for me. I notice that referenced openzfs PRs don't seem to have been merged.

@satmandu
Author

As per openzfs/zfs#9414 (comment) overlayfs support has been merged!

I'm not sure if that will require a post 2.1.x zfs release though...

@satmandu
Author

Looks like openzfs/zfs@dbf6108 isn't in the 2.1.x tree. We can ask to have that backported for overlayfs support in 2.1.8.

@satmandu
Author

openzfs/zfs#14070 (comment) states that overlayfs support will not be backported to 2.1.x, but will be added to a subsequent release. ☹️

@russkel

russkel commented Dec 19, 2022

Sad faces all round. Thanks @satmandu

@woutersamaey

ZFS 2.2 is not out yet. Is that in the near or far future? Does anyone know?

@satmandu
Author

Maybe bring up that question on the zfs mailing lists?

@robd003

robd003 commented Dec 21, 2022

Still seeing this problem on Ubuntu 22.04 with Docker 20.10.22

@satmandu
Author

Still seeing this problem on Ubuntu 22.04 with Docker 20.10.22

This problem won't (likely) go away until we have ZFS 2.2 available. That's not the case yet.

@valkolaci

This issue has been open for 3+ years. This is just ridiculous...

@oramirite

This issue has been open for 3+ years. This is just ridiculous...

Almost as ridiculous as someone who comes into an issue and decides to make a useless angry post about it just because they still see a green Open symbol.

I'll just offer an alternative approach for right now: Docker on ZFS works great, aside from this annoying but aesthetic issue. It doesn't affect performance or stability, so I just deal with these dead containers every once in a while. I just run the workaround bash script to clean them up when it gets cluttered.

@ThomDietrich

@oramirite I did move away from "Docker on ZFS" somewhere between today and three years ago. The issue did affect performance and eventually rendered my deployment server unresponsive, sometimes only solvable via a hard reset of the machine. So yes, sorry to say, this is not a minor inconvenience bug and 3+ years to resolve it justifies some hard feelings.

@sgiacomel

This issue has been open for 3+ years. This is just ridiculous...

Almost as ridiculous as someone who comes into an issue and decides to make a useless angry post about it just because they still see a green Open symbol.

I'll just offer an alternative approach for right now: Docker on ZFS works great, aside from this annoying but aesthetic issue. It doesn't affect performance or stability, so I just deal with these dead containers every once in a while. I just run the workaround bash script to clean them up when it gets cluttered.

I use a cronjob to deal with this
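For example, an entry along these lines in root's crontab (the script path is a placeholder for wherever you keep one of the cleanup scripts from this thread):

# run the cleanup script once a night at 03:30
30 3 * * * /usr/local/sbin/docker-zfs-cleanup.sh >> /var/log/docker-zfs-cleanup.log 2>&1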

@oramirite

@ThomDietrich That's good to know, thanks. What sort of performance issues were you experiencing? You're sure they were tied to this issue? I'll have to keep an eye out for that (Watch, now I'll get called out for not reading something earlier in the thread too, lol)

I didn't mean to sound like the guy that handwaves away legitimate issues, my bad there. We of course all want to improve the software. Y'all are preaching to the choir, is the only issue.

@ThomDietrich

Happy to contribute where I can. Thanks!
I can't quite remember the complete story but I remember that my system suffered because of a higher rate of created and destroyed docker containers PLUS zfs-auto-snapshot exponentially worsening the effects 🥴
Obviously this sounds like a bad combination and I can't quite recall what I did to resolve it. Obviously I disabled zfs-auto-snapshot for the majority of volumes. I can tell you that I never dared to have a destructive workaround as a cronjob and therefore the issues reappeared regularly.
I could have been a better sysadmin at the time, however, with no permanent solution in sight and no actual benefits of the zfs storage driver, I changed my configuration to a closer-to-default non-zfs setup.

@satmandu
Author

Note that this issue should be resolved once ZFS 2.2 comes out, as then you should be able to just use the overlay driver for Docker with ZFS...

@jbruggem

jbruggem commented Jan 24, 2023

For anybody interested in another type of workaround: you can create an ext4 filesystem within your ZFS pool.

I was inspired by this article.

I created a zvol with an ext4 filesystem in it:

zfs create -s -V 50GB zpool/docker
mkfs.ext4 /dev/zvol/zpool/docker

Then I stopped the docker daemon and cleaned /var/lib/docker.

Then I mounted the filesystem. In /etc/fstab:

/dev/zvol/zpool/docker /var/lib/docker ext4 defaults,_netdev 0 0

Then:

sudo mount /var/lib/docker

Also, in my docker config, I changed the storage driver:

"storage-driver": "overlay2"

Finally, I restarted docker.

@satmandu
Author

Feel free to try a build of zfs/master with the overlay2 storage driver on ubuntu 22.10, since zfs-master now supports overlay.

I'm doing so right now using my own compiled build of zfs-master (which will eventually become 2.2) here: https://launchpad.net/~satadru-umich/+archive/ubuntu/zfs-experimental
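If you want to try it, adding a Launchpad PPA normally looks something like this (sketch only; double-check the exact PPA name on the Launchpad page above before adding it):

sudo add-apt-repository ppa:satadru-umich/zfs-experimental
sudo apt update
sudo apt full-upgrade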

@gitterspec

gitterspec commented Feb 18, 2023

The simplest workaround for the "dataset does not exist" is to just zfs create the dataset it wishes exists, then re-run the docker rm, and it will be destroyed and all will be happy.

Sorry, missed this till now. Thanks for the tip. Unfortunately, when I zfs create the dataset and docker rm again, it says "filesystem has dependent clones" and lists ~100 datasets. Then it goes back to the same state, with the "dataset does not exist" error.

I guess it's able to remove the dataset but not its clones? Is there any way to have it discard the reliance on the old dataset name when re-creating the container?
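If it helps with debugging, ZFS can show which clones are holding a snapshot busy (sketch only; the dataset and snapshot names below are examples taken from earlier in this issue, so substitute your own):

# list snapshots of the stuck dataset
zfs list -t snapshot -o name -r rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13-init
# show the clones that depend on a given snapshot
zfs get -H -o value clones rpool/ROOT/ubuntu_abcdef/var/lib/58a909851ed24b012f325be58586ca3f7421979e508c823eef18d97373cfab13-init@927215024

Note that zfs destroy -R, as used in the scripts above, destroys those dependent clones as well, which can take image layers with it, so check the list before running it.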

@ThomDietrich

@satmandu thanks for sharing your solution! I am currently in the process of setting up a new system.

Just to be clear: I intend to install the zfs package from your ppa on Ubuntu 22.04 LTS, then configure docker with "storage-driver": "overlay2" and it should work. Would you recommend that approach? Is the solution better than "storage-driver": "zfs"? Is it safe to use your zfs package now and mainstream later on? Am I okay to use 22.04 or is 22.10 mandatory for your package to work?

Cheers!

@satmandu
Author

@ThomDietrich I would hold off for a couple of days or weeks until a patch for the issue at openzfs/zfs#13608 is tested, as it might affect docker volumes.

Having said that, I've only tested my ppa with 22.10 and 23.04. The ppa is not built for 22.04.

I use the overlay storage driver on my own system, with kernel 6.2.0, and it has worked stably for me with multiple volumes, primarily mirrored.

I'm not sure when OpenZFS 2.2 will get officially released though. It might get released in time for inclusion in Ubuntu 23.10, but as I'm not a maintainer of that software, I have no way of knowing.

@Joly0

Joly0 commented Feb 9, 2024

Hey guys, I am currently on zfs-2.1.14 and I can't update due to OS limitations. I have multiple containers stuck in the "Removal In Progress" status, and the various scripts here do not work.

For example, here is the output of one of the scripts I tried:

#!/usr/bin/env bash
for container_id in $(docker container ls -qa --filter status=removing); do 
        zpool_object=$(docker container inspect --format='{{.GraphDriver.Data.Dataset}}' ${container_id})
        zfs destroy -R "${zpool_object}"
        zfs destroy -R "${zpool_object}-init"
        zfs create "${zpool_object}"
        zfs create "${zpool_object}-init"
        docker container rm $container_id
done
cannot unmount '/var/lib/docker/zfs/graph/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8': unmount failed
cannot unmount '/var/lib/docker/zfs/graph/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8': unmount failed
cannot destroy snapshot cache/docker/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8-init@838700103: snapshot is cloned
cannot create 'cache/docker/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8': dataset already exists
cannot create 'cache/docker/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8-init': dataset already exists
Error response from daemon: container 93c1ad67988b85900227d6a42b82afd020facb203e5e7d378d597883554ad703: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8" => cannot unmount '/var/lib/docker/zfs/graph/99c5825b669059791facb748b7b5032473c5c52179141bbce22b456d609603a8': unmount failed
cannot open 'cache/docker/8c1efcd5dcc2edfd0d1bcfee07d4c8500584f41245c3375ea3fea1b14c381bcd': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/8c1efcd5dcc2edfd0d1bcfee07d4c8500584f41245c3375ea3fea1b14c381bcd-init': unmount failed
cannot create 'cache/docker/8c1efcd5dcc2edfd0d1bcfee07d4c8500584f41245c3375ea3fea1b14c381bcd-init': dataset already exists
Error response from daemon: container 2551748b7add171155fe6f3da9acba58d68b823a6b5975e3e337732743d6078b: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/8c1efcd5dcc2edfd0d1bcfee07d4c8500584f41245c3375ea3fea1b14c381bcd-init" => cannot unmount '/var/lib/docker/zfs/graph/8c1efcd5dcc2edfd0d1bcfee07d4c8500584f41245c3375ea3fea1b14c381bcd-init': unmount failed
cannot open 'cache/docker/c27e4bd7c26ac7e64ebbc7589bcf4229c5617eb66dba66b54e22a1c2ce7c591d': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/c27e4bd7c26ac7e64ebbc7589bcf4229c5617eb66dba66b54e22a1c2ce7c591d-init': unmount failed
cannot create 'cache/docker/c27e4bd7c26ac7e64ebbc7589bcf4229c5617eb66dba66b54e22a1c2ce7c591d-init': dataset already exists
Error response from daemon: container 6767905f2a1e5883a19722a185dffec24689b9e60a918ef217a9a7f219bbc679: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/c27e4bd7c26ac7e64ebbc7589bcf4229c5617eb66dba66b54e22a1c2ce7c591d-init" => cannot unmount '/var/lib/docker/zfs/graph/c27e4bd7c26ac7e64ebbc7589bcf4229c5617eb66dba66b54e22a1c2ce7c591d-init': unmount failed
cannot open 'cache/docker/16e161adfbf3c076ce86a8d71dec1dc853f74503df879d3f8f5471425f2c29aa': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/16e161adfbf3c076ce86a8d71dec1dc853f74503df879d3f8f5471425f2c29aa-init': unmount failed
cannot create 'cache/docker/16e161adfbf3c076ce86a8d71dec1dc853f74503df879d3f8f5471425f2c29aa-init': dataset already exists
Error response from daemon: container 4fa5287b2e5cb415959e2a081a6aeae454e9ef43983580ed04a4ad2367bb4bd0: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/16e161adfbf3c076ce86a8d71dec1dc853f74503df879d3f8f5471425f2c29aa-init" => cannot unmount '/var/lib/docker/zfs/graph/16e161adfbf3c076ce86a8d71dec1dc853f74503df879d3f8f5471425f2c29aa-init': unmount failed
cannot open 'cache/docker/3516996e5550c81fe8d15a99bb75c6afaad8f6dc46673a72da733e50ee379afb': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/3516996e5550c81fe8d15a99bb75c6afaad8f6dc46673a72da733e50ee379afb-init': unmount failed
cannot create 'cache/docker/3516996e5550c81fe8d15a99bb75c6afaad8f6dc46673a72da733e50ee379afb-init': dataset already exists
Error response from daemon: container 53263416879d4eb5c443206b8fa2a92a97a53bc311cf630ee286ef71541b5583: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/3516996e5550c81fe8d15a99bb75c6afaad8f6dc46673a72da733e50ee379afb-init" => cannot unmount '/var/lib/docker/zfs/graph/3516996e5550c81fe8d15a99bb75c6afaad8f6dc46673a72da733e50ee379afb-init': unmount failed
cannot open 'cache/docker/a902877ef48f8ea3d2daa8684cb6af4ceb7e120678b4b6a06f58ec81e5819bcc': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/a902877ef48f8ea3d2daa8684cb6af4ceb7e120678b4b6a06f58ec81e5819bcc-init': unmount failed
cannot create 'cache/docker/a902877ef48f8ea3d2daa8684cb6af4ceb7e120678b4b6a06f58ec81e5819bcc-init': dataset already exists
Error response from daemon: container 5a4232cf3f7598ad5da57e4f814012c51a440f93a2b14f33d35ba6252d9f6354: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/a902877ef48f8ea3d2daa8684cb6af4ceb7e120678b4b6a06f58ec81e5819bcc-init" => cannot unmount '/var/lib/docker/zfs/graph/a902877ef48f8ea3d2daa8684cb6af4ceb7e120678b4b6a06f58ec81e5819bcc-init': unmount failed
cannot open 'cache/docker/8ab666cb49d2b7b2cf4a2f4eff3a5223920dedb97692ff7b5ac01dbbdce9468a': dataset does not exist
cannot unmount '/var/lib/docker/zfs/graph/8ab666cb49d2b7b2cf4a2f4eff3a5223920dedb97692ff7b5ac01dbbdce9468a-init': unmount failed
cannot create 'cache/docker/8ab666cb49d2b7b2cf4a2f4eff3a5223920dedb97692ff7b5ac01dbbdce9468a-init': dataset already exists
Error response from daemon: container a8cab4939026c5f5b094dc6e6f511f5a9aa53c594dee2722199461c6fadc5b59: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/8ab666cb49d2b7b2cf4a2f4eff3a5223920dedb97692ff7b5ac01dbbdce9468a-init" => cannot unmount '/var/lib/docker/zfs/graph/8ab666cb49d2b7b2cf4a2f4eff3a5223920dedb97692ff7b5ac01dbbdce9468a-init': unmount failed
cannot unmount '/var/lib/docker/zfs/graph/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497': unmount failed
cannot unmount '/var/lib/docker/zfs/graph/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497': unmount failed
cannot destroy snapshot cache/docker/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497-init@853552064: snapshot is cloned
cannot create 'cache/docker/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497': dataset already exists
cannot create 'cache/docker/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497-init': dataset already exists
Error response from daemon: container d8bb3569bd079caa8b27abaeae7735559dae6d21b6112ca7845d3379ac0ad6e1: driver "zfs" failed to remove root filesystem: exit status 1: "/usr/sbin/zfs fs destroy -r cache/docker/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497" => cannot unmount '/var/lib/docker/zfs/graph/40c72a15eb455ba2e53949c96d94afea55e52584b0972bee88dbea6b84e32497': unmount failed

@kevdogg

kevdogg commented Mar 1, 2024

I have the same issue. The workaround I've found so far consists of:

* create the missing datasets with `zfs create ...`

* then remove the container

I'm looking forward to a real solution.

Hey, I know it's been years since you posted this, but I just wanted to let you know this is a pretty good "hack" workaround: manually recreate the missing datasets. What happened to me was that the ZFS datasets were created through Docker, but another container-pruning program pruned the datasets and removed them. Thanks for the advice (5 years later).

@Lockszmith-GH

Starting to feel like https://m.xkcd.com/2881/ here

@jeanparpaillon

Starting to feel like https://m.xkcd.com/2881/ here

Hi, bug mates !
