metrics/cgroups: fix deadlock issue in Add during Collect #6788

Merged
merged 1 commit into from Apr 11, 2022

Conversation

fuweid
Member

@fuweid fuweid commented Apr 7, 2022

Collector.Collect is the callback of the ns field's Collect, which is
invoked periodically while holding ns's internal lock. Collector.Add also
takes ns.Lock while holding Collector.Lock, so the two paths acquire the
locks in opposite order and can easily deadlock.

Goroutine X:

ns.Collect
  ns.Lock
    Collector.Collect
      Collector.RLock

Goroutine Y:

Collector.Add
  Collector.Lock
    ns.Lock
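
The two orderings above can be reproduced in isolation. The following is a minimal sketch, not containerd's code: `nsMu` and `collectorMu` are hypothetical stand-ins for ns.Lock and Collector.Lock, and it assumes Go 1.18 or newer for `TryLock`/`TryRLock`. It takes each path's first lock and then probes whether the second acquisition would block, so it shows the cycle without actually hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// lockOrderInverted holds each path's first lock and reports whether each
// path's second acquisition would block. Both mutexes are hypothetical
// stand-ins for the locks named in the description.
func lockOrderInverted() (xBlocked, yBlocked bool) {
	var nsMu sync.Mutex          // stands in for ns.Lock
	var collectorMu sync.RWMutex // stands in for Collector.Lock / RLock

	nsMu.Lock()        // goroutine X: ns.Collect has taken ns.Lock
	collectorMu.Lock() // goroutine Y: Collector.Add has taken Collector.Lock

	// X now needs Collector.RLock, but Y holds the write lock.
	if collectorMu.TryRLock() {
		collectorMu.RUnlock()
	} else {
		xBlocked = true
	}

	// Y now needs ns.Lock, but X holds it.
	if nsMu.TryLock() {
		nsMu.Unlock()
	} else {
		yBlocked = true
	}

	collectorMu.Unlock()
	nsMu.Unlock()
	return xBlocked, yBlocked
}

func main() {
	x, y := lockOrderInverted()
	fmt.Printf("X blocked on Collector.RLock: %v, Y blocked on ns.Lock: %v\n", x, y)
}
```

With blocking `Lock`/`RLock` calls in place of the probes, neither goroutine could ever release the lock the other one is waiting for.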

Add should take ns.Lock without holding Collector.Lock.

Fix: #6772

Signed-off-by: Wei Fu <fuweid89@gmail.com>

@fuweid fuweid force-pushed the fix-issue-6772 branch 3 times, most recently from 87c0dd6 to 72258cd Compare April 7, 2022 16:26
@fuweid fuweid requested a review from AkihiroSuda April 7, 2022 16:30
@theopenlab-ci

theopenlab-ci bot commented Apr 7, 2022

Build succeeded.

@fuweid fuweid added this to New in Code Review via automation Apr 7, 2022
@fuweid fuweid moved this from New to Ready For Review in Code Review Apr 8, 2022
@fuweid fuweid requested a review from dmcgowan April 8, 2022 00:01
@fuweid fuweid added the cherry-pick/1.6.x (Change to be cherry picked to release/1.6 branch) and priority/P1 labels Apr 8, 2022
@thaJeztah
Member

This doesn't affect v1.5? Or same issue there?

@fuweid
Member Author

fuweid commented Apr 8, 2022

> This doesn't affect v1.5? Or same issue there?

It was introduced by #5744 and first released in 1.6 😃

@thaJeztah
Member

Thanks! (just checking if I didn't have to 🍒⛏ for 1.5 as well ☺️)

Member

@mikebrow mikebrow left a comment

LGTM

@theopenlab-ci

theopenlab-ci bot commented Apr 10, 2022

Build succeeded.

Member

@mikebrow mikebrow left a comment

LGTM

@mikebrow mikebrow merged commit 449eb08 into containerd:main Apr 11, 2022
Code Review automation moved this from Ready For Review to Done Apr 11, 2022
@fuweid fuweid deleted the fix-issue-6772 branch April 11, 2022 01:37
@alam0rt

alam0rt commented Apr 11, 2022

Thanks heaps for the quick response to this

@fuweid fuweid added the cherry-picked/1.6.x label (PR commits are cherry-picked into release/1.6 branch) and removed the cherry-pick/1.6.x label (Change to be cherry picked to release/1.6 branch) Apr 11, 2022
@uthark
Contributor

uthark commented Apr 14, 2022

When do you plan to release 1.6.3 with the fix?

Labels
cherry-picked/1.6.x (PR commits are cherry-picked into release/1.6 branch), priority/P1
Projects
Development

Successfully merging this pull request may close these issues.

containers stuck terminating / creating - many runc init processes
6 participants