block: Leverage multiqueue for virtio-block #4503
Conversation
Similar to network, we can use multiple queues for virtio-block devices. This can help improve storage performance.

This commit changes the number of queues for block devices to the number of vCPUs for cloud-hypervisor and QEMU. Today a VM starts with one vCPU by default, so by default a single queue will still be used. This change helps improve performance when the number of cold-plugged vCPUs is raised above one in the config file. It may also help when we use the sandboxing feature with k8s, which passes the sum of the required resources down to Kata.

Fixes kata-containers#4502

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
```diff
-	if err = q.qmpMonitorCh.qmp.ExecutePCIDeviceAdd(q.qmpMonitorCh.ctx, drive.ID, devID, driver, addr, bridge.ID, romFile, 0, true, defaultDisableModern); err != nil {
+	queues := int(q.config.NumVCPUs)
+
+	if err = q.qmpMonitorCh.qmp.ExecutePCIDeviceAdd(q.qmpMonitorCh.ctx, drive.ID, devID, driver, addr, bridge.ID, romFile, queues, true, defaultDisableModern); err != nil {
```
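The QEMU hot-plug path above derives the virtio-block queue count directly from the configured vCPUs. A minimal sketch of that derivation (the helper name `blockDeviceQueues` is illustrative, not from the PR):

```go
package main

import "fmt"

// blockDeviceQueues returns the number of virtio-block queues to request.
// The PR sets this to the number of cold-plugged vCPUs; the guard below is
// an added safety fallback to keep at least one queue if the vCPU count
// were ever unset.
func blockDeviceQueues(numVCPUs uint32) int {
	if numVCPUs == 0 {
		return 1
	}
	return int(numVCPUs)
}

func main() {
	// A 4-vCPU guest gets 4 block queues under this scheme.
	fmt.Println(blockDeviceQueues(4))
}
```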
What about virtio-scsi? Can we also enable multi-queue for virtio-scsi?
@fengwang666 Yes, I am planning to add this to scsi as well, maybe in a follow-up PR. I do want to do some performance testing before that, like figuring out if we need to cap the number of queues. Beyond a point, increasing the queues may not give any benefits.
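A capped variant, as discussed here, might look like the sketch below (the cap value and helper name are assumptions for illustration; no cap was decided in this PR):

```go
package main

import "fmt"

// maxBlockQueues is a hypothetical upper bound. Beyond some point, extra
// queues add per-queue memory overhead without improving throughput, which
// is what the planned performance testing would determine.
const maxBlockQueues = 8

// cappedQueues clamps the queue count to [1, maxBlockQueues].
func cappedQueues(numVCPUs uint32) int {
	q := int(numVCPUs)
	if q < 1 {
		return 1
	}
	if q > maxBlockQueues {
		return maxBlockQueues
	}
	return q
}

func main() {
	// A 16-vCPU guest would be capped at 8 queues in this sketch.
	fmt.Println(cappedQueues(16))
}
```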
/test
LGTM
```diff
@@ -753,6 +753,11 @@ func (clh *cloudHypervisor) hotplugAddBlockDevice(drive *config.BlockDrive) erro
 	clhDisk.Readonly = &drive.ReadOnly
 	clhDisk.VhostUser = func(b bool) *bool { return &b }(false)
 
+	queues := int32(clh.config.NumVCPUs)
+	queueSize := int32(1024)
```
@amshinde Can you add a comment describing why 1024 is used as the queue size here?
Yep I agree with @liubin it'd be good to understand why you picked 1024. Having a deeper queue might bring some benefits or drawbacks depending on the use case.
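One way to address this review thread would be to name the value and document the tradeoff next to it (the constant name and rationale comment below are illustrative suggestions, not code from the PR):

```go
package main

import "fmt"

// defaultBlockQueueSize is the virtio ring depth per block queue. A deeper
// ring (e.g. 1024) lets more requests stay in flight, which favors
// high-throughput workloads, at the cost of extra guest memory per queue;
// the right value for latency-sensitive small I/O is the open question
// raised by the reviewers.
const defaultBlockQueueSize int32 = 1024

func main() {
	fmt.Printf("virtio-block queue size: %d\n", defaultBlockQueueSize)
}
```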
Nice work @amshinde! Is this the only missing piece for multi-queue support throughout the block I/O stack, or do we need to change more things inside the guest to get more performance out of it (e.g., enable blk-mq in the guest virtio-block driver)?
@bergwolf This is the kernel config required
LGTM