
gha: Fix the failure of gha metrics for StratoVirt #8657

Merged

Conversation

WenyuanLau
Contributor

Update the metrics test baseline for StratoVirt and re-enable the tests.

Fixes: #8656

Related: #8496

@katacontainersbot katacontainersbot added the size/large Task of significant size label Dec 13, 2023
@WenyuanLau WenyuanLau added no-backport-needed ok-to-test and removed size/large Task of significant size labels Dec 13, 2023
@WenyuanLau
Contributor Author

/test

@GabyCT
Contributor

GabyCT commented Dec 13, 2023

@WenyuanLau how did you get the limits for each test? I would prefer that you enable StratoVirt but not run the tests yet, then enable them one by one until you find out how stable they are. The reason is that on the baremetal machine I noticed that some tests did not run properly with StratoVirt. For example, some of the ctr tests showed an UNKNOWN status while the container was running, and in some of the k8s tests the pods did not run properly. This caused the entire baremetal run to fail when other tests ran with qemu or clh, because the environment was not clean enough, and to fix it we needed to reboot the machine.

@GabyCT
Contributor

GabyCT commented Dec 13, 2023

If you enable StratoVirt but do not run all of the tests at once, it will be safer and easier to verify which test has issues, and it will avoid random failures with the other hypervisors.

@WenyuanLau
Contributor Author

WenyuanLau commented Dec 13, 2023

@GabyCT I recorded the data from 5 runs where StratoVirt ran in GHA but failed, and I got the fio limit by comparing the performance of StratoVirt against the other VMMs on the same baremetal machine. I also ran the metrics on the baremetal machine many times and got stable data.

How can I reproduce the improper status you mentioned? :)
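As an illustration of the approach described above (recording several runs and turning them into a limit), here is a minimal sketch. It is not the project's actual tooling: the function names, the `margin_pct` parameter, and the midpoint-plus-percentage shape of the limit are assumptions made for this example, and the sample values are made up.

```python
from statistics import mean


def derive_baseline(samples, margin_pct=5.0):
    """Return (midval, minpercent, maxpercent) for a list of recorded results.

    Hypothetical helper: the midpoint is the mean of the samples, and the
    allowed percentages cover the observed spread plus an extra margin.
    """
    if not samples:
        raise ValueError("need at least one sample")
    midval = mean(samples)
    if midval <= 0:
        raise ValueError("midpoint must be positive to express percentages")
    # Percent distance from the midpoint to the lowest and highest sample.
    minpercent = (midval - min(samples)) / midval * 100 + margin_pct
    maxpercent = (max(samples) - midval) / midval * 100 + margin_pct
    return midval, minpercent, maxpercent


def within_baseline(value, midval, minpercent, maxpercent):
    """Check whether a new result falls inside the derived limits."""
    low = midval * (1 - minpercent / 100)
    high = midval * (1 + maxpercent / 100)
    return low <= value <= high


# Five made-up results standing in for five recorded runs.
runs = [100, 102, 98, 101, 99]
midval, minpercent, maxpercent = derive_baseline(runs)
print(midval, minpercent, maxpercent)
print(within_baseline(105, midval, minpercent, maxpercent))
```

Five samples give only a rough picture of the spread, so the extra margin is a guard against limits that are too tight for a noisy runner.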

@GabyCT
Contributor

GabyCT commented Dec 13, 2023

> @GabyCT I recorded the data from 5 runs where StratoVirt ran in GHA but failed, and I got the fio limit by comparing the performance of StratoVirt against the other VMMs on the same baremetal machine. I also ran the metrics on the baremetal machine many times and got stable data.
>
> How can I reproduce the improper status you mentioned? :)

You will need to run all the metrics the same way the GHA runner does and see whether they fail.

Update the Speed & Density metric tests baseline for StratoVirt
and re-enable them, and skip other metric tests temporarily.

Fixes: kata-containers#8656

Signed-off-by: Liu Wenyuan <liuwenyuan9@huawei.com>
@WenyuanLau
Contributor Author

I've done some tests. How about we enable the Speed & Density metric tests and skip the other tests for StratoVirt first? Once things are stable, we can enable the other tests step by step. @GabyCT

My plan:

  1. Speed & Density (launchtimes, memory_usage, memory_usage_inside_container)
  2. iperf
  3. latency
  4. blogbench
  5. fio
  6. tensorflow
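The staged plan above can be sketched as a small lookup that returns which tests are enabled after a given number of steps. This is only an illustration: `ROLLOUT_ORDER`, `enabled_tests`, and the group labels are hypothetical names for this example and do not come from the repository's scripts.

```python
# Hypothetical encoding of the plan: each entry is (step label, tests in that step).
ROLLOUT_ORDER = [
    ("speed_density", ["launchtimes", "memory_usage", "memory_usage_inside_container"]),
    ("iperf", ["iperf"]),
    ("latency", ["latency"]),
    ("blogbench", ["blogbench"]),
    ("fio", ["fio"]),
    ("tensorflow", ["tensorflow"]),
]


def enabled_tests(stage):
    """Return the tests enabled once the first `stage` steps are complete."""
    if not 0 <= stage <= len(ROLLOUT_ORDER):
        raise ValueError(f"stage must be between 0 and {len(ROLLOUT_ORDER)}")
    return [test for _, tests in ROLLOUT_ORDER[:stage] for test in tests]


# Stage 1 corresponds to enabling only the Speed & Density tests.
print(enabled_tests(1))
```

Keeping the order in one list means each later step is a one-number change, which makes it easy to see which newly enabled test introduced a failure.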

@WenyuanLau WenyuanLau force-pushed the 8656/Fix_StratoVirt_on_gha_metrics branch from 4829f73 to 61fe20c Compare December 15, 2023 12:22
@katacontainersbot katacontainersbot added the size/large Task of significant size label Dec 15, 2023
@fidencio
Member

/test

Member

@fidencio fidencio left a comment


lgtm, thanks @WenyuanLau!

@GabyCT GabyCT merged commit 69be050 into kata-containers:main Jan 11, 2024
176 of 243 checks passed
Labels
ok-to-test size/large Task of significant size
Development

Successfully merging this pull request may close these issues.

gha: Fix the StratoVirt failure on gha metrics
4 participants