[Mellanox] Support PSU power threshold checking #6288

Merged

Conversation

stephenxs
Contributor

Description of PR

Summary:

Add a test case that checks detection of PSU power exceeding its threshold.

Signed-off-by: Stephen Sun <stephens@nvidia.com>

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Back port request

  • 201911
  • 202012

Approach

What is the motivation for this PR?

  1. Implement the regression test case for the PSU power threshold exceeding check.
    This can only be done on a per-vendor/platform basis because the way to trigger a PSU power threshold violation varies among vendors and platforms.
  2. Support parsing the WARNING state in the output of show platform psustatus.
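To illustrate the second point, here is a minimal sketch of how a WARNING state could be picked out of the table printed by show platform psustatus. It is not the parser added in this PR. The sample output, its column set, and the function names `parse_psu_status` and `psus_in_warning` are assumptions made for the example; the only detail taken from the PR is that the status column may carry the value WARNING.

```python
import re

# Illustrative output only: the real command prints more columns, and the
# exact layout here is an assumption made for this sketch.
SAMPLE = """\
PSU    Power (W)    Status    LED
-----  -----------  --------  -----
PSU 1  38.25        OK        green
PSU 2  420.00       WARNING   amber
"""


def parse_psu_status(output):
    """Parse a fixed-width table: header line, dashed separator, data rows."""
    lines = [line for line in output.splitlines() if line.strip()]
    header, separator, rows = lines[0], lines[1], lines[2:]
    # Each run of dashes in the separator marks one column's character span.
    spans = [m.span() for m in re.finditer(r"-+", separator)]
    keys = [header[start:end].strip().lower() for start, end in spans]
    parsed = []
    for row in rows:
        values = [row[start:end].strip() for start, end in spans]
        parsed.append(dict(zip(keys, values)))
    return parsed


def psus_in_warning(output):
    """Return the names of PSUs whose status column reads WARNING."""
    return [row["psu"] for row in parse_psu_status(output)
            if row["status"] == "WARNING"]
```

Calling `psus_in_warning(SAMPLE)` returns the names of the PSUs reported in WARNING state, which a test can compare against the PSU whose power it has pushed over the threshold.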

How did you do it?

We implement the test as a mock test because:

  • The conditions needed to push PSU power over its threshold (100% throughput utilization and high power-consumption xSFP modules) cannot be met on any regression testbed.
  • PSU power threshold checking is not supported on all testbeds. To run the test on a testbed that does not physically support it, we also need to mock it.
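The mock approach described above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `MockPsuFiles` helper, the file names `psuN_power` and `psuN_power_threshold`, the numeric values, and the single-threshold comparison in `psu_state` are all hypothetical. The idea is only that the test writes fake sensor values, checks that the reported state changes, and then restores a normal value.

```python
import os
import shutil
import tempfile


class MockPsuFiles(object):
    """Hypothetical helper that fakes per-PSU sensor files in a directory."""

    def __init__(self, root):
        self.root = root

    def _path(self, psu, name):
        return os.path.join(self.root, "psu{}_{}".format(psu, name))

    def write(self, psu, name, value):
        with open(self._path(psu, name), "w") as f:
            f.write(str(value))

    def read(self, psu, name):
        with open(self._path(psu, name)) as f:
            return int(f.read().strip())


def psu_state(files, psu):
    """Assumed rule for this sketch: WARNING when power exceeds the threshold."""
    power = files.read(psu, "power")
    threshold = files.read(psu, "power_threshold")
    return "WARNING" if power > threshold else "OK"


def run_mock_check():
    """Drive one PSU below, above, and back below its mocked threshold."""
    root = tempfile.mkdtemp()
    try:
        files = MockPsuFiles(root)
        files.write(1, "power_threshold", 400)  # arbitrary units
        files.write(1, "power", 380)
        before = psu_state(files, 1)
        files.write(1, "power", 420)            # mock an overload
        during = psu_state(files, 1)
        files.write(1, "power", 380)            # restore a normal reading
        after = psu_state(files, 1)
        return before, during, after
    finally:
        shutil.rmtree(root)
```

Because every value comes from files the test controls, the same check can run on a testbed whose hardware cannot reach the overload condition or does not expose a threshold at all.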

How did you verify/test it?

Tested manually and by running the regression test.

Any platform specific information?

Mellanox platforms only.

Supported testbed topology if it's a new test case?

Documentation

Signed-off-by: Stephen Sun <stephens@nvidia.com>
- update ambient sensors and check db
- update power and check db

- allure report
- CLI option

@stephenxs stephenxs changed the title [Mellanox] PSU power threshold exceeding check [Mellanox] PSU power threshold checking Sep 5, 2022
@stephenxs stephenxs changed the title [Mellanox] PSU power threshold checking [Mellanox] Support PSU power threshold checking Sep 5, 2022
@lgtm-com

lgtm-com bot commented Sep 5, 2022

This pull request introduces 1 alert when merging 36ed54a into 2516c9c - view on LGTM.com

new alerts:

  • 1 for Unused import

Signed-off-by: Stephen Sun <stephens@nvidia.com>
@stephenxs
Contributor Author

This pull request introduces 1 alert when merging 36ed54a into 2516c9c - view on LGTM.com

new alerts:

  • 1 for Unused import

Fixed.

@stephenxs
Contributor Author

The t0 run failed because the log analyzer matched the error messages below. Retried:

2022-09-06T01:22:15.3725826Z E               Failed: Processes "['analyze_logs--<MultiAsicSonicHost vlab-02>']" failed with exit code "1"
2022-09-06T01:22:15.3726759Z E               Exception:
2022-09-06T01:22:15.3727353Z E               expected_match: 0
2022-09-06T01:22:15.3728000Z E               expected_missing_match: 0
2022-09-06T01:22:15.3728609Z E               match: 3
2022-09-06T01:22:15.3729117Z E               
2022-09-06T01:22:15.3729625Z E               Match Messages:
2022-09-06T01:22:15.3731112Z E               Sep  6 01:20:05.211115 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list
2022-09-06T01:22:15.3732057Z E               
2022-09-06T01:22:15.3733423Z E               Sep  6 01:20:05.218191 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list
2022-09-06T01:22:15.3734332Z E               
2022-09-06T01:22:15.3737627Z E               Sep  6 01:20:05.218578 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list
2022-09-06T01:22:15.3738598Z E               
2022-09-06T01:22:15.3739067Z E               Traceback:
2022-09-06T01:22:15.3739695Z E               Traceback (most recent call last):
2022-09-06T01:22:15.3740705Z E                 File "/var/src/s/tests/common/helpers/parallel.py", line 31, in run
2022-09-06T01:22:15.3741457Z E                   Process.run(self)
2022-09-06T01:22:15.3742151Z E                 File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
2022-09-06T01:22:15.3742969Z E                   self._target(*self._args, **self._kwargs)
2022-09-06T01:22:15.3743772Z E                 File "/var/src/s/tests/common/helpers/parallel.py", line 226, in wrapper
2022-09-06T01:22:15.3744588Z E                   target(*args, **kwargs)
2022-09-06T01:22:15.3745459Z E                 File "/var/src/s/tests/common/plugins/loganalyzer/__init__.py", line 39, in analyze_logs
2022-09-06T01:22:15.3746427Z E                   dut_analyzer.analyze(markers[node.hostname])
2022-09-06T01:22:15.3747384Z E                 File "/var/src/s/tests/common/plugins/loganalyzer/loganalyzer.py", line 369, in analyze
2022-09-06T01:22:15.3748310Z E                   self._verify_log(analyzer_summary)
2022-09-06T01:22:15.3749634Z E                 File "/var/src/s/tests/common/plugins/loganalyzer/loganalyzer.py", line 133, in _verify_log
2022-09-06T01:22:15.3750557Z E                   raise LogAnalyzerError(result_str)
2022-09-06T01:22:15.3751291Z E               LogAnalyzerError: expected_match: 0
2022-09-06T01:22:15.3752155Z E               expected_missing_match: 0
2022-09-06T01:22:15.3752763Z E               match: 3
2022-09-06T01:22:15.3753274Z E               
2022-09-06T01:22:15.3753813Z E               Match Messages:
2022-09-06T01:22:15.3755401Z E               Sep  6 01:20:05.211115 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list
2022-09-06T01:22:15.3756367Z E               
2022-09-06T01:22:15.3757758Z E               Sep  6 01:20:05.218191 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list
2022-09-06T01:22:15.3758681Z E               
2022-09-06T01:22:15.3760339Z E               Sep  6 01:20:05.218578 vlab-02 ERR macsec#wpa_supplicant[176]: KaY: The key server is not in my live peers list

@Pterosaur
Contributor

/azp run Azure.sonic-mgmt

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@wangxin wangxin merged commit 9933dcd into sonic-net:master Sep 8, 2022
Azarack pushed a commit to Azarack/sonic-mgmt that referenced this pull request Oct 17, 2022
allen-xf pushed a commit to allen-xf/sonic-mgmt that referenced this pull request Oct 28, 2022