Waiting event of vsock running before use it when use_vsock enabled #1918
|
Nice find @BetaXOi - pulling in some folks to review... |
|
Hello everyone, this PR is a simple method to fix #1917; it's just a demo. |
|
thanks @BetaXOi I'll mark this as WIP and DNM |
|
@jodh-intel @markdryan I found a new way to fix it without checking the QEMU version, but it needs to modify
Hi @BetaXOi - thanks for raising, but the govmm change needs to be raised as a PR for https://github.com/intel/govmm.
|
@stefanha This fell off my radar for a bit. Can you take a look at this PR? |
Report the vsock running event so that the upper application can control the boot sequence. See kata-containers/runtime#1918 Signed-off-by: Ning Bo <ning.bo9@zte.com.cn>
The upper hypervisor manager application may need to wait for some QMP event to control the boot sequence, but the event we want may not exist in some older versions, so we need to query the full QMP ABI and check whether the event is supported. Related: kata-containers/runtime#1918 Signed-off-by: Ning Bo <ning.bo9@zte.com.cn>
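The schema check described in that commit message can be sketched roughly as follows. This is a minimal illustration, not the govmm implementation: it assumes the caller already holds the raw `return` array of a `query-qmp-schema` reply, and it relies only on the `name` and `meta-type` fields of each schema entry. The `schemaEntry` type, the `schemaHasEvent` helper, and the sample JSON are hypothetical; `VSOCK_RUNNING` is the event name used later in this PR and depends on the QEMU patch discussed further down.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// schemaEntry keeps only the two fields of a query-qmp-schema entry that
// this check needs; every other field is ignored. (Hypothetical type.)
type schemaEntry struct {
	Name     string `json:"name"`
	MetaType string `json:"meta-type"`
}

// schemaHasEvent reports whether the raw "return" array of a
// query-qmp-schema reply lists an event with the given name.
func schemaHasEvent(rawSchema []byte, event string) (bool, error) {
	var entries []schemaEntry
	if err := json.Unmarshal(rawSchema, &entries); err != nil {
		return false, fmt.Errorf("parsing QMP schema: %w", err)
	}
	for _, e := range entries {
		if e.MetaType == "event" && e.Name == event {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Hand-written sample for illustration, not captured from a real QEMU.
	sample := []byte(`[
		{"name": "query-status", "meta-type": "command"},
		{"name": "SHUTDOWN", "meta-type": "event"},
		{"name": "VSOCK_RUNNING", "meta-type": "event"}
	]`)
	supported, err := schemaHasEvent(sample, "VSOCK_RUNNING")
	fmt.Println(supported, err)
}
```

If the event is missing from the schema, the caller can skip the wait entirely instead of blocking until a timeout on a QEMU that will never send it.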
|
I have updated the commit with revendor |
|
Thanks @BetaXOi. Travis is failing and if you look at the log: https://github.com/kata-containers/community/blob/master/CONTRIBUTING.md#patch-format gives a full explanation and examples but in your case, I'd change your commit message to the following to keep Travis happy: |
virtcontainers/qemu.go
Outdated
    q.Logger().Info("waiting event of vsock running")
    for {
        ev := <-eventCh
@ningbo9 @BetaXOi Is there a possibility of a race here, i.e. could you miss the vsock "running" event while you are querying for the schema? If so, I would suggest listening on the channel in a separate goroutine that is started as early as possible.
In the current PR, query-qmp-schema is almost the first QMP command (qmp_capabilities must be executed before any other QMP command) and is called immediately after LaunchQemu, and eventCh holds the event until we read it.
I don't see how to start this any earlier.
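The reviewer's suggestion of an early listener can be sketched like this. It is only an illustration under stated assumptions: the `event` type and the `watchFor` helper are hypothetical stand-ins, not govmm types, and the simulated sender replaces a real QMP connection. The point is that the goroutine starts consuming before any slow setup work, so the event is recorded even if it arrives while the caller is busy.

```go
package main

import (
	"fmt"
	"time"
)

// event is a hypothetical stand-in for a QMP event; only the name matters here.
type event struct {
	Name string
}

// watchFor starts a goroutine that consumes events from ch right away and
// closes the returned channel once an event named want has been seen.
// It keeps draining ch afterwards so the sender never blocks.
func watchFor(ch <-chan event, want string) <-chan struct{} {
	seen := make(chan struct{})
	go func() {
		found := false
		for ev := range ch {
			if !found && ev.Name == want {
				found = true
				close(seen)
			}
		}
	}()
	return seen
}

func main() {
	eventCh := make(chan event)                // unbuffered on purpose
	seen := watchFor(eventCh, "VSOCK_RUNNING") // start listening first

	// Simulated hypervisor: emits events while the caller is busy elsewhere.
	go func() {
		eventCh <- event{Name: "RESUME"}
		eventCh <- event{Name: "VSOCK_RUNNING"}
		close(eventCh)
	}()

	// Stands in for slow setup work such as a schema query.
	time.Sleep(50 * time.Millisecond)

	select {
	case <-seen:
		fmt.Println("vsock running event observed")
	case <-time.After(2 * time.Second):
		fmt.Println("timed out waiting for vsock running event")
	}
}
```

Whether this is needed depends on how the event channel is fed: if events are retained until read, as the reply above says, the early goroutine mainly protects against a sender that would otherwise block or drop events.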
…abled If `kata-runtime` connects to the vsock device in the guest before the device is ready, the 'connect' blocks and times out after 2 seconds. This makes boot 2 seconds slower with vsock than with serial. So we should wait for the 'VSOCK_RUNNING' QMP event before connecting to avoid this case. Fixes: kata-containers#1917 Signed-off-by: Ning Bo <ning.bo9@zte.com.cn>
|
@devimc I have updated the commit as suggested. The PR is ready to be merged; could you remove the 'wip' label and trigger the CI? |
|
/test |
    EventCh: eventCh,
    Logger: newQMPLogger(),
    // response of qmp.ExecuteQMPCapabilities bigger than default 64k in bufio.NewScanner()
    MaxCapacity: 512 * 1024,
Can this be a constant?
In most cases we don't need to modify this value; the default is enough. In special cases it has to be adjusted to the actual situation, and we can't find a single value that suits every situation.
I mean, can 512 * 1024 be defined as a constant and used here?
For example:
    MaxCapacity: QMPCapabilitiesMaxCap,
I see, the benefit of doing this is that it is convenient to do unit testing, right?
So, do I need to add an additional test case?
Yes, defining a constant is useful for unit testing. I'm not sure it's possible to add unit tests for this function, but if you can, please go for it.
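To illustrate why the limit matters and how a named constant helps testing, here is a small standalone sketch. It does not use govmm: `qmpMaxCapacity`, `scanFirstLine`, and `longLine` are hypothetical names, and a plain `bufio.Scanner` stands in for whatever reads the QMP socket. The only library facts relied on are that `bufio.MaxScanTokenSize` is 64 KiB and that `Scanner.Buffer` raises the maximum token size.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// qmpMaxCapacity is a hypothetical name for the enlarged scanner limit
// discussed above (the diff uses the literal 512 * 1024).
const qmpMaxCapacity = 512 * 1024

// longLine builds a single newline-terminated line of n bytes.
func longLine(n int) string {
	return strings.Repeat("x", n) + "\n"
}

// scanFirstLine returns the length of the first line of input, using a
// scanner whose maximum token size is maxCapacity.
func scanFirstLine(input string, maxCapacity int) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(input))
	scanner.Buffer(make([]byte, 0, 4096), maxCapacity)
	if scanner.Scan() {
		return len(scanner.Bytes()), nil
	}
	return 0, scanner.Err()
}

func main() {
	// A 100 KiB line is larger than the default 64 KiB token limit.
	line := longLine(100 * 1024)

	_, err := scanFirstLine(line, bufio.MaxScanTokenSize)
	fmt.Println("default limit:", err)

	n, err := scanFirstLine(line, qmpMaxCapacity)
	fmt.Println("raised limit:", n, err)
}
```

With the limit in a constant, a unit test can build an input just above the default size and assert that it scans cleanly, without repeating the magic number.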
            break Loop
        }
    case <-time.After(time.Duration(timeout)*time.Second - time.Since(timeStart)):
        q.Logger().Info("waiting event of vsock running")
I think this log is not needed
I will remove it in the next commit.
Or you can move it before the `for` loop.
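For reference, a bounded wait of this kind can also be written with a single timer instead of recomputing the remaining time on every iteration. This is a sketch of one alternative, not the code in the PR: the `event` type, `errTimeout`, and `waitForEvent` are hypothetical names, and any "waiting" log message would go once before the loop, as suggested above.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// event is a hypothetical stand-in for a QMP event.
type event struct {
	Name string
}

var errTimeout = errors.New("timed out waiting for event")

// waitForEvent blocks until an event named want arrives on ch, ch is closed,
// or timeout elapses. One timer covers the whole wait, so unrelated events
// do not extend the deadline.
func waitForEvent(ch <-chan event, want string, timeout time.Duration) error {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case ev, ok := <-ch:
			if !ok {
				return errors.New("event channel closed")
			}
			if ev.Name == want {
				return nil
			}
		case <-timer.C:
			return errTimeout
		}
	}
}

func main() {
	ch := make(chan event, 2)
	ch <- event{Name: "RESUME"}
	ch <- event{Name: "VSOCK_RUNNING"}

	// The event is already queued, so this returns without waiting.
	fmt.Println(waitForEvent(ch, "VSOCK_RUNNING", time.Second))

	// Nothing else is queued, so this runs into the timeout.
	fmt.Println(waitForEvent(ch, "VSOCK_RUNNING", 50*time.Millisecond))
}
```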
|
Hi @BetaXOi, if I understand correctly, this patch depends on a qemu patch to send the qmp event when vsock is running.
Also, can you address the above comment? This seems to have been marked as resolved. |
Yes, if qemu does not support reporting the
No, the patch has not been merged yet |
|
@amshinde Thanks for letting me know. I'll discuss this upstream in QEMU. |
|
@BetaXOi Do you want to respond to my qemu-devel mailing list reply so we can make progress on this issue? https://lists.nongnu.org/archive/html/qemu-devel/2019-08/msg01672.html |
|
@raravena80 I haven't seen a response from @BetaXOi: |
|
@BetaXOi ping. Your weekly Kata herder. |
|
After some discussion on the mailing list, it may be possible to solve the problem with a reverse connection. I'll close this PR first and submit a new one. FYI: https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg04233.html |
Fixes: #1917
kata-runtime needs to wait for QEMU to report the event that vsock is running. See BetaXOi/qemu@9b536b6
Signed-off-by: Ning Bo <n.b@live.com>