-
Notifications
You must be signed in to change notification settings - Fork 161
vmm_tests: default to nvme keepalive, and adjust the many-devices test to better accomodate private pool #2401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR enables NVMe Keepalive by default in OpenHCL servicing operations to provide more test coverage in CI. The main changes include updating the default value of enable_nvme_keepalive from false to true in OpenHclServicingFlags, adjusting test resource configurations to support keepalive functionality, and improving NVMe error message formatting.
- Changed
OpenHclServicingFlags::enable_nvme_keepaliveto default totrueinstead of requiring explicit opt-in - Enhanced the "many NVMe devices" test with increased memory, VPs, and private pool allocation to support keepalive testing
- Improved NVMe error formatting to provide more detailed status code type and status code information
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
petri/src/vm/mod.rs |
Implements custom Default trait for OpenHclServicingFlags with enable_nvme_keepalive: true as the new default value |
vmm_tests/vmm_tests/tests/tests/multiarch/openhcl_servicing.rs |
Updates servicing tests to use the new default flags, simplifying test code by removing explicit enable_nvme_keepalive: true settings |
vmm_tests/vmm_tests/tests/tests/x86_64/openhcl_linux_direct.rs |
Renames test to many_nvme_devices_servicing_heavy and configures it with 8GB memory, 4 VPs, and 64MB private pool to support keepalive functionality |
vm/devices/storage/disk_nvme/nvme_driver/src/queue_pair.rs |
Enhances NVMe error display to show detailed status code types (SCT) and status codes (SC) for better debugging, covering all defined status code types |
openhcl/underhill_core/src/dispatch/mod.rs |
Adds vf_keepalive_flag to the tracing span for better observability during NVMe manager save operations |
…that decides what is default and supported (or not)
| /// Get the default servicing flags (based on what this backend supports) | ||
| fn default_servicing_flags() -> OpenHclServicingFlags; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tjones60 does this look like the right way to get some per-backend defaults?
| OpenHclServicingFlags { | ||
| enable_nvme_keepalive: true, | ||
| override_version_checks: false, | ||
| stop_timeout_hint_secs: None, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| (firmware.quirks().hyperv, VmmQuirks::default()) | ||
| } | ||
|
|
||
| fn default_servicing_flags() -> OpenHclServicingFlags { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much work is needed for this support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I remember adding this check, but I don't quite know the delta. I will certainly be digging into this soon-ish.
This PR enables NVMe Keepalive across servicing operations by default (the per-servicing flag
is the final control of whether the actual servicing uses the feature). This gives slightly
more coverage in our CI.
In addition, ensure that the "many devices" test has enough resources to allocate from private
pool, so that keepalive will be used.