Deadlock on image delete and container attach #22732

Closed
tonistiigi opened this issue May 13, 2016 · 6 comments


tonistiigi commented May 13, 2016

Trace from @bukzor in #22124 https://gist.github.com/394589b78225d9be727a3cd7a9381156

There seem to be different orders of taking container.Lock() and memoryStore.Lock(). One way it shows up in this trace is contention between the goroutines attaching container stdio and removing images. It's not certain that this is the only way the deadlock can appear (even in this trace). A minimal sketch of the ordering problem is shown after the goroutine dumps below.

goroutine 434715 [semacquire, 9318 minutes]:
sync.runtime_Semacquire(0xc8203370f4)
    /usr/local/go/src/runtime/sema.go:43 +0x26
sync.(*RWMutex).RLock(0xc8203370e8)
    /usr/local/go/src/sync/rwmutex.go:36 +0x58
--->>> takes memstore.Lock() github.com/docker/docker/container.(*memoryStore).Get(0xc8203370e0, 0xc823bbb69e, 0x40, 0x7f912377bfe0)
    /go/src/github.com/docker/docker/container/memory_store.go:28 +0x35
github.com/docker/docker/daemon.(*Daemon).AttachStreams(0xc8204bc180, 0xc823bbb69e, 0x40, 0x7f9122f42438, 0xc826795920, 0x7f9122f424e0, 0xc822138e78, 0x7f9122f424e0, 0xc8220bf0a8, 0x0, ...)
    /go/src/github.com/docker/docker/daemon/monitor.go:95 +0x82
github.com/docker/docker/libcontainerd.(*container).start(0xc823682480, 0x0, 0x0)
    /go/src/github.com/docker/docker/libcontainerd/container_linux.go:96 +0x517
github.com/docker/docker/libcontainerd.(*client).Create(0xc8205843c0, 0xc8258291c0, 0x40, 0xc8203d3f80, 0x9, 0x17b15e0, 0x5, 0x17adaf0, 0x5, 0x0, ...)
    /go/src/github.com/docker/docker/libcontainerd/client_linux.go:179 +0x734
--->>> holds container.Lock() github.com/docker/docker/daemon.(*Daemon).containerStart(0xc8204bc180, 0xc827c228c0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/start.go:134 +0x658
github.com/docker/docker/daemon.(*Daemon).ContainerStart(0xc8204bc180, 0xc825e16a97, 0x40, 0xc822efca80, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/start.go:75 +0x57c
github.com/docker/docker/api/server/router/container.(*containerRouter).postContainersStart(0xc8208b0120, 0x7f912377c250, 0xc82620cb10, 0x7f9124fba000, 0xc824d1c000, 0xc823cbd500, 0xc82620c990, 0x0, 0x0)


goroutine 434738 [semacquire, 9318 minutes]:
sync.runtime_Semacquire(0xc822c8e8c4)
    /usr/local/go/src/runtime/sema.go:43 +0x26
sync.(*Mutex).Lock(0xc822c8e8c0)
    /usr/local/go/src/sync/mutex.go:82 +0x1c4
--->>> takes container.Lock() github.com/docker/docker/container.(*State).IsRunning(0xc822c8e8c0, 0x12a5260)
    /go/src/github.com/docker/docker/container/state.go:159 +0x28
github.com/docker/docker/daemon.(*Daemon).checkImageDeleteConflict.func1(0xc8272d5a40, 0xc820891200)
    /go/src/github.com/docker/docker/daemon/image_delete.go:329 +0x3e
--->>> holds memStore.Lock() github.com/docker/docker/container.(*memoryStore).First(0xc8203370e0, 0xc821a4f940, 0x0)
    /go/src/github.com/docker/docker/container/memory_store.go:66 +0xf2
github.com/docker/docker/daemon.(*Daemon).checkImageDeleteConflict(0xc8204bc180, 0xc82146f180, 0x47, 0xf, 0xc8204ec000)
    /go/src/github.com/docker/docker/daemon/image_delete.go:331 +0x1c8
github.com/docker/docker/daemon.(*Daemon).imageDeleteHelper(0xc8204bc180, 0xc82146f180, 0x47, 0xc82818d410, 0x10100, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/image_delete.go:264 +0x79
github.com/docker/docker/daemon.(*Daemon).ImageDelete(0xc8204bc180, 0xc825e2214e, 0x2c, 0x100, 0x0, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/image_delete.go:163 +0xd7c
github.com/docker/docker/api/server/router/image.(*imageRouter).deleteImages(0xc8208b06c0, 0x7f912377c250, 0xc824a55560, 0x7f9124fba000, 0xc8215bfa20, 0xc820183ce0, 0xc824a55470, 0x0, 0x0)


goroutine 80506 [semacquire, 9318 minutes]:
sync.runtime_Semacquire(0xc82021ee0c)
    /usr/local/go/src/runtime/sema.go:43 +0x26
sync.(*Mutex).Lock(0xc82021ee08)
    /usr/local/go/src/sync/mutex.go:82 +0x1c4
sync.(*RWMutex).Lock(0xc82021ee08)
    /usr/local/go/src/sync/rwmutex.go:82 +0x30
--->>> waits for the memoryStore RWLock so the other readers can't get the lock github.com/docker/docker/container.(*memoryStore).Add(0xc82021ee00, 0xc821686b00, 0x40, 0xc8227df340)
    /go/src/github.com/docker/docker/container/memory_store.go:21 +0x35
github.com/docker/docker/daemon.(*Daemon).Register(0xc820470780, 0xc8227df340, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/daemon.go:233 +0x2af
github.com/docker/docker/daemon.(*Daemon).create(0xc820470780, 0x0, 0x0, 0xc8258fb440, 0xc8214d8380, 0xc824be54e8, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/docker/docker/daemon/create.go:92 +0x38f
github.com/docker/docker/daemon.(*Daemon).ContainerCreate(0xc820470780, 0x0, 0x0, 0xc8258fb440, 0xc8214d8380, 0xc824be54e8, 0x0, 0x0, 0x0, 0x0, ...)
    /go/src/github.com/docker/docker/daemon/create.go:43 +0x496

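To make the ordering concrete, here's a minimal standalone Go sketch of the three-way wait visible in the trace (illustrative types and names only, not the actual daemon code): the attach/start path takes the container lock and then needs the store's read lock, the image-delete path holds the store's read lock inside First and then needs the container lock, and the Register path is queued for the store's write lock, which makes sync.RWMutex block new readers.

```go
package main

import "sync"

// Illustrative stand-ins for container.Container and the daemon's
// memoryStore; not the real types.
type container struct {
	sync.Mutex
	id string
}

type memoryStore struct {
	mu sync.RWMutex
	s  map[string]*container
}

// First as it behaved before the fix: the caller's filter runs while the
// store's read lock is still held, so the lock order is store, then container.
func (m *memoryStore) First(filter func(*container) bool) *container {
	m.mu.RLock()
	defer m.mu.RUnlock()
	for _, c := range m.s {
		if filter(c) {
			return c
		}
	}
	return nil
}

func main() {
	c := &container{id: "c1"}
	store := &memoryStore{s: map[string]*container{c.id: c}}
	var wg sync.WaitGroup
	wg.Add(3)

	// G1, like AttachStreams via containerStart: holds the container lock,
	// then asks the store for the container (store read lock).
	go func() {
		defer wg.Done()
		c.Lock()
		defer c.Unlock()
		store.mu.RLock() // blocks once a writer (G3) is queued on the store
		store.mu.RUnlock()
	}()

	// G2, like checkImageDeleteConflict: holds the store read lock inside
	// First, then takes the container lock in the filter (State.IsRunning).
	go func() {
		defer wg.Done()
		store.First(func(ctr *container) bool {
			ctr.Lock() // blocks while G1 holds the container lock
			defer ctr.Unlock()
			return false
		})
	}()

	// G3, like Daemon.Register: waits for the store's write lock
	// (memoryStore.Add), which prevents G1's read lock from being granted.
	go func() {
		defer wg.Done()
		store.mu.Lock()
		store.mu.Unlock()
	}()

	wg.Wait() // under the unlucky interleaving described above, this never returns
}
```
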
tonistiigi added the kind/bug label May 13, 2016
@tonistiigi
Member Author

I'm thinking the best (and safest) way to fix this would be to make a copy in https://github.com/docker/docker/blob/a213a446d7828c54e9ce104b38e1adfd4827871e/container/memory_store.go#L66 before running the filters/apply, and to release the lock before the actual filters get called. cc @LK4D4
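
A rough sketch of that approach, reusing the illustrative types from the earlier sketch (my reading of the proposal, not the actual patch): First snapshots the containers under the read lock, releases it, and only then runs the filter, so the callback is free to take container locks.

```go
// Sketch of the proposed change, not the merged code: copy under the read
// lock, release it, then run the caller's filter on the snapshot.
func (m *memoryStore) First(filter func(*container) bool) *container {
	m.mu.RLock()
	containers := make([]*container, 0, len(m.s))
	for _, c := range m.s {
		containers = append(containers, c)
	}
	m.mu.RUnlock()

	for _, c := range containers {
		if filter(c) {
			return c
		}
	}
	return nil
}
```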

thaJeztah added this to the 1.12.0 milestone May 13, 2016

bukzor commented May 16, 2016

Let me know if there's anything more I can do to help with this.


bukzor commented May 17, 2016

What are the odds that this will make it into the 1.12.0 release?
This is relevant to our own project planning; this bug affects us heavily.


bukzor commented May 20, 2016

Bump; I expect the 1.12 freeze is quickly approaching.

tonistiigi self-assigned this May 20, 2016
tonistiigi added the priority/P1 label May 20, 2016
@tonistiigi
Member Author

I'll make sure this lands before v1.12.


LK4D4 commented May 23, 2016

Yeah, I think working with copies is the best idea.

tonistiigi added a commit to tonistiigi/docker that referenced this issue May 23, 2016
Rework memoryStore so that filters and apply run
on a cloned list of containers after the lock has
been released. This avoids possible deadlocks when
these filter/apply callbacks take locks for a
container.

Fixes moby#22732

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
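
The commit message above also covers the apply side; here is a minimal sketch of the same clone-then-call pattern for ApplyAll, again using the illustrative types from the earlier sketch rather than the merged code.

```go
// Sketch only: clone the container list under the read lock and invoke the
// callback after the lock is released, so callbacks that lock containers
// can't deadlock against writers queued on the store.
func (m *memoryStore) ApplyAll(apply func(*container)) {
	m.mu.RLock()
	containers := make([]*container, 0, len(m.s))
	for _, c := range m.s {
		containers = append(containers, c)
	}
	m.mu.RUnlock()

	for _, c := range containers {
		apply(c)
	}
}
```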
thaJeztah modified the milestones: 1.11.2, 1.12.0 May 23, 2016
mlaventure pushed a commit to mlaventure/docker that referenced this issue May 27, 2016
Rework memoryStore so that filters and apply run
on a cloned list of containers after the lock has
been released. This avoids possible deadlocks when
these filter/apply callbacks take locks for a
container.

Fixes moby#22732

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
(cherry picked from commit bd2b3d3)