runc create hung when apply systemd cgroup config when use godbus #321

Open
zvier opened this issue Apr 14, 2022 · 1 comment
Comments

zvier commented Apr 14, 2022

runc create hangs while applying the systemd cgroup config through godbus. We found that the cause is
the CallWithContext method of Object in object.go, which blocks reading from the <-o.createCall().Done channel.

dbus/object.go

Line 38 in fc37d31

func (o *Object) CallWithContext(ctx context.Context, method string, flags Flags, args ...interface{}) *Call {

If Conn's send method never writes anything to the above channel, Object.CallWithContext waits forever. As a result, runc create hangs and the k8s node reports PLEG not ready.

dbus/conn.go

Line 541 in fc37d31

if msg.Type == TypeMethodCall && msg.Flags&FlagNoReplyExpected == 0 {

Maybe it would be better to do ch <- call regardless of that if condition, even if the msg is not expected.

guelfey (Member) commented Apr 15, 2022

I'm not sure what exactly you are suggesting. Maybe you can point out the code in runc where the problem occurs, or do you have a reproducible setup? But generally, from your description the behavior sounds as expected from the library point of view:

CallWithContext intentionally blocks until a response is received. The channel write that unblocks the call doesn't happen directly in the send call; instead the call is tracked internally (see e.g.

dbus/conn.go

Line 905 in fc37d31

func (tracker *callTracker) track(sn uint32, call *Call) {
) and a different worker Goroutine writes to the channel later when a response is received. If you don't want to block forever, you can set a context deadline or e.g. use Go.

It could of course be the case that the response doesn't correctly reach the call channel due to some bug, but there's not enough info here to investigate that further. The other possibility is that the method reply gets dropped in the dbus daemon (maybe due to some policy configuration) or systemd doesn't even send one for some reason. You could investigate first using e.g. dbus-monitor whether this is the case.
