New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
virt-launcher, Fix Guest Agent updates causing an event handling deadlock #4253
Conversation
…lock When guest-agent is installed, it will send update events, which will send events to a channel of virt-launcher. This channel has two meanings only: 1. Catch the first ADD event. 2. Catch the delete event. Since this channel is sampled for checking of delete event only after the qemu PID is gone, this channel can get full. In such case the event handler (eventCallback) will hang forever. Guest agent wont be restarted and events wont be handled. Since we just need ADD events and DELETE events we dont need to send events to this channel upon guest agent updates. Signed-off-by: Or Shoval <oshoval@redhat.com>
There is no need to send the event twice to virt-handler Signed-off-by: Or Shoval <oshoval@redhat.com>
The log also is not right, it should be logged only in case the osInfo indeed changed, and it doesnt add important info beside that we are in this section, and we have prints before and after it anyway. Signed-off-by: Or Shoval <oshoval@redhat.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch.
maybe we should make a wrapper function around the events <- event
logic that ensures we only send the event to that channel if it is an ADD
event or a MODIFY
event with a domain with DeletionTimestamp != nil.
This is exactly what i did at first and removed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check if we need to make this thread safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify the closure since the caller knows the kind of event.
Since the channel is meant only for the first add and delete events, its better to protected it, because only the first events are needed, and the delete event will be consumed only after the qemu PID is gone. So this channel should not receive unexpected events, because it will cause a deadlock. Signed-off-by: Or Shoval <oshoval@redhat.com>
/test pull-kubevirt-e2e-k8s-cnao-1.17 |
@oshoval: The specified target(s) for
Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm Although the function updateEVents smell a little since it has mixed responsabilities, without proper naming. |
imho its in order to keep things simpler |
looks good. Thanks. |
just in order to have the variables static (so they wont die between runs, because we need them to be persistent), and not use global for this purpose, so a closure gives this behavior https://stackoverflow.com/questions/30558071/static-local-variable-in-go |
Thanks for explaining:) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidvossel The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Thanks |
When guest-agent is installed, it will send update events,
which will send events to a channel of virt-launcher.
This channel has two purposes only:
Since this channel is sampled for checking of delete event
only after the qemu PID is gone, this channel can get full, in case wrong events are sent to it.
In such case the event handler (eventCallback) will hang forever
(because it tries to send to a full channel).
Guest agent wont be restarted and events wont be handled.
Since we just need ADD events and DELETE events
we dont need to send events to this channel upon
guest agent updates.
Moreover there was a double send to the virt-handler
which isnt needed as well.
Add mechanism to avoid sending unnecessary events to the channel.
Found as part of working on
https://bugzilla.redhat.com/show_bug.cgi?id=1874812
Signed-off-by: Or Shoval oshoval@redhat.com