ch_tests_tool: run all perf tests in one go #4419
Merged
total_idle_time should be reset whenever there is log growth. There is no point in adding to total idle time if progress was eventually made. In long-running tests, it is expected that there is no log growth for a while. The existing logic would continue adding to total_time and eventually time out even if the tests are not stuck. Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
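The reset behaviour described in this commit message can be illustrated with a minimal sketch. This is not the code from ch_tests_tool.py: the function name `is_stuck` and the parameters `log_sizes`, `poll_interval`, and `idle_timeout` are hypothetical, and the log is modelled as a sequence of sizes sampled once per polling interval so that the example runs without sleeping.

```python
from typing import Iterable


def is_stuck(log_sizes: Iterable[int], poll_interval: int, idle_timeout: int) -> bool:
    """Return True if the log stays flat for at least idle_timeout seconds in a row.

    All names here are hypothetical; only the idea of resetting
    total_idle_time on log growth comes from the commit message.
    """
    total_idle_time = 0
    last_size = -1
    for size in log_sizes:
        if size > last_size:
            # The log grew, so progress was made: discard accumulated idle time.
            total_idle_time = 0
            last_size = size
        else:
            # No growth since the previous sample: count this interval as idle.
            total_idle_time += poll_interval
            if total_idle_time >= idle_timeout:
                return True
    return False
```

With a 60-second poll and a 180-second idle limit, a run that pauses twice for 120 seconds each is not reported as stuck, because the counter restarts when the log grows in between. Without the reset, the same run would accumulate 240 idle seconds and time out.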
This is already checked by LISA's tool installation logic and we would never reach here if dev_cli.sh didn't exist. Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
Pull request overview
This PR refactors the Cloud Hypervisor performance metrics test flow to run all metrics subtests in a single invocation, then derive per-subtest pass/fail and metrics from a generated JSON report, removing legacy per-test env/config handling.
Changes:
- Run metrics tests once with `--continue-on-failure` and collect per-subtest results from a JSON report.
- Consolidate one-time host setup into `_ensure_host_setup()` and remove per-test policies/env var logic.
- Remove MS-CLH variables for per-test block-size tuning that are no longer used.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| lisa/microsoft/testsuites/cloud_hypervisor/ch_tests_tool.py | Reworks metrics execution to one run + JSON report parsing; adjusts host setup and diagnostics behavior. |
| lisa/microsoft/testsuites/cloud_hypervisor/ch_tests.py | Removes now-unused variables for metrics block-size configuration passed into the tool. |
log_path parameter is unused in several places. Remove it. Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
e3b3163 to 80ec8fe
80ec8fe to ca8df13
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
lisa/microsoft/testsuites/cloud_hypervisor/ch_tests_tool.py:539
`run_metrics_tests` no longer treats the overall command exit status as authoritative. If the metrics runner exits non-zero (e.g., one or more subtests failed) but the JSON report still contains entries, the function can end up passing because `failed_testcases` stays empty and there is no `result.exit_code` / `result.is_timeout` assertion. Please fail the test when `result.is_timeout` is true or `result.exit_code != 0`, and/or cross-check the report contents against the expected testcase list so missing/failed subtests can’t be silently ignored.
for testcase, entry in per_test_results.items():
    status, metrics, trace = self._classify_test_result(testcase, entry)
    if status == TestStatus.FAILED:
        failed_testcases.append(testcase)
    self._send_metrics_test_result(
        test_result, testcase, status, metrics, trace
    )
if len(failed_testcases) > 0:
    diagnostic_info = self._extract_diagnostic_info(
        log_path, test_name, result
    )
    self._log.error(
        f"Test cases failed: {failed_testcases} "
        f"Diagnostics: {diagnostic_info}"
    )
self._check_test_panic_from_logs(
    test_result=test_result,
    content=result.stdout,
    stage="metrics tests",
    test_name="run_metrics_tests",
)
self._save_kernel_logs(log_path)
# Check for kernel panic after all tests complete
if self.node.features.is_supported(SerialConsole):
    self.node.features[SerialConsole].check_panic(
        saved_path=log_path, force_run=True, test_result=test_result
    )
assert_that(
    failed_testcases, f"Failed Testcases: {failed_testcases}"
).is_empty()
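One way to address the suppressed comment above is to gather every reason the run should fail before asserting. The following is a sketch only, not the change that was merged: `collect_run_problems` and `expected_testcases` are hypothetical names, while `exit_code`, `is_timeout`, `per_test_results`, and `failed_testcases` mirror the names used in the comment and snippet.

```python
from typing import Dict, Iterable, List


def collect_run_problems(
    exit_code: int,
    is_timeout: bool,
    expected_testcases: Iterable[str],
    per_test_results: Dict[str, dict],
    failed_testcases: List[str],
) -> List[str]:
    """Return human-readable reasons the run should be marked failed.

    Hypothetical helper: an empty list means the run looks clean.
    """
    problems: List[str] = []
    if is_timeout:
        problems.append("metrics run timed out")
    if exit_code != 0:
        problems.append(f"runner exit code {exit_code}")
    # Subtests that were expected but never appeared in the JSON report.
    missing = sorted(set(expected_testcases) - set(per_test_results))
    if missing:
        problems.append(f"missing from report: {missing}")
    if failed_testcases:
        problems.append(f"failed testcases: {failed_testcases}")
    return problems
```

A caller could then assert that the returned list is empty, so that a non-zero exit status or a missing subtest fails the test even when `failed_testcases` itself is empty.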
ca8df13 to 6788aeb
pupacha reviewed May 4, 2026
pupacha previously approved these changes May 4, 2026
The current implementation lists all the perf tests and runs them one by one, i.e., one dev_cli.sh invocation per test. This process is too slow: it takes around 5.5 hours for all the tests to run. Instead, run all the tests in one go and use the --continue-on-failure and --report-file parameters. --continue-on-failure ensures that we continue running the tests even after a test failure. --report-file produces a JSON file with all the test results. Signed-off-by: Anirudh Rayabharam <anrayabh@microsoft.com>
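The single-invocation flow described in this commit message can be sketched roughly as follows. Only the two flag names come from the commit message. Everything else is an assumption made for illustration: that `--report-file` takes a path as its next argument, the helper names `with_report_flags` and `failed_tests`, and above all the report layout (a top-level `results` list whose entries carry `name` and `status` fields), which is hypothetical and may not match the real report.

```python
import json
from typing import List


def with_report_flags(base_cmd: List[str], report_file: str) -> List[str]:
    """Append the two flags named in the commit message to a base command.

    ASSUMPTION: --report-file is followed by the path of the JSON report.
    """
    return base_cmd + ["--continue-on-failure", "--report-file", report_file]


def failed_tests(report_text: str) -> List[str]:
    """List names of tests that did not pass.

    ASSUMPTION: the report looks like
    {"results": [{"name": "...", "status": "PASSED"}, ...]};
    this layout is hypothetical, not taken from the tool.
    """
    report = json.loads(report_text)
    return [
        entry["name"]
        for entry in report.get("results", [])
        if entry.get("status") != "PASSED"
    ]
```

The point of the design is that one run yields one report, and per-test pass/fail is derived from that report afterwards instead of from one process exit status per test.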
6788aeb to a798f6f
pupacha approved these changes May 4, 2026
This pull request significantly refactors the Cloud Hypervisor performance metrics test logic to simplify test execution, improve result collection, and remove legacy per-test configuration. The main focus is on running all metrics subtests in a single invocation, collecting results from a generated JSON report, and eliminating no-longer-needed per-test environment variables and logic.
Related Issue
Type of Change
Checklist
Test Validation
Key Test Cases:
Impacted LISA Features:
Tested Azure Marketplace Images:
Test Results