Fix test_qos_sai teardown for dualtor #13363

Merged

Conversation

vivekverma-arista
Contributor

Description of PR

Summary: Fix qos/test_qos_sai.py teardown failure for dualtor.
Fixes #130

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012
  • 202205
  • 202305
  • 202311
  • 202405

Approach

What is the motivation for this PR?

qos/test_qos_sai.py fails at teardown

failed on setup with "Failed: All critical services should be fully started!"

Regression introduced by #10651 for dualtor.

How did you do it?

The config_reload call in the dut_disable_ipv6 fixture issues the config reload command and then waits until all critical processes are up. On dualtor this wait times out because the mux container does not come up: it is disabled by another fixture in the same file, stopServices. The two fixtures have no dependency on each other, so they can execute in any order. If the teardown of dut_disable_ipv6 runs before the teardown of stopServices, this failure is seen.

This change ensures that the teardown of stopServices happens before that of dut_disable_ipv6, so that mux is no longer disabled at the time of config_reload.
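For illustration only, the ordering argument can be modeled without pytest or a testbed. pytest tears fixtures down in the reverse order of their setup, and a fixture that requests another fixture is set up after it and therefore torn down before it. The sketch below is not the code from this PR: it uses contextlib.ExitStack as a stand-in for that last-in, first-out behavior, and the function bodies, the `run` helper, and the event strings are all hypothetical.

```python
from contextlib import ExitStack, contextmanager

events = []  # records the order of setup and teardown steps


@contextmanager
def dut_disable_ipv6():
    # Hypothetical stand-in for the real fixture.
    events.append("setup dut_disable_ipv6")
    yield
    # In the real fixture, teardown runs config_reload, which waits for
    # all critical services (mux included on dualtor) to be up.
    events.append("teardown dut_disable_ipv6")


@contextmanager
def stop_services():
    # Hypothetical stand-in for stopServices, which disables mux.
    events.append("setup stopServices (mux disabled)")
    yield
    events.append("teardown stopServices (mux re-enabled)")


def run(fixtures):
    """Enter fixtures in the given order; ExitStack exits them in reverse."""
    events.clear()
    with ExitStack() as stack:
        for fixture in fixtures:
            stack.enter_context(fixture())
        events.append("test body")
    return list(events)


# Setting up dut_disable_ipv6 first guarantees it is torn down last,
# after mux has been re-enabled by the stopServices teardown.
order = run([dut_disable_ipv6, stop_services])
for step in order:
    print(step)
```

With the opposite setup order, `run([stop_services, dut_disable_ipv6])`, the dut_disable_ipv6 teardown comes first, which corresponds to the failing case described above.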

How did you verify/test it?

Ran qos/test_qos_sai.py on Arista-7260CX3 platform with dualtor topology with 202305 and 202311 images.

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@StormLiangMS
Collaborator

hi @XuChen-MSFT @lolyu this fix makes sense to me, could you help to take a look?

Contributor

@lolyu lolyu left a comment

LGTM, good catch, thanks!

@StormLiangMS StormLiangMS merged commit 20c8cdf into sonic-net:master Jun 21, 2024
14 checks passed
mssonicbld pushed a commit to mssonicbld/sonic-mgmt that referenced this pull request Jun 21, 2024
@mssonicbld
Collaborator

Cherry-pick PR to 202305: #13401

@vivekverma-arista vivekverma-arista deleted the fix-qos-teardown-dualtor branch June 21, 2024 13:23
mssonicbld pushed a commit that referenced this pull request Jun 21, 2024
XuChen-MSFT added a commit to XuChen-MSFT/sonic-mgmt that referenced this pull request Jun 24, 2024
@XuChen-MSFT
Contributor

@vivekverma-arista @StormLiangMS
This PR causes a qos sai test error when checking critical processes.

reverted in PR #13436


5 participants