[action] [PR:16579] fix t1-isolated-d28u1 link mapping #39
Merged
### Description of PR

Summary:
In the d28u1 topology, the current link was not selected correctly for the t1 hwsku:

```
Ethernet96   121,122,123,124  400G  9100  N/A  Ethernet13/1  routed  up    up    OSFP 8X Pluggable Transceiver  off
Ethernet100  125,126,127,128  400G  9100  N/A  Ethernet13/5  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet112  97               100G  9100  rs   Ethernet15/1  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet113  98               100G  9100  rs   Ethernet15/2  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet114  99               100G  9100  rs   Ethernet15/3  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet115  100              100G  9100  rs   Ethernet15/4  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet116  101              100G  9100  rs   Ethernet15/5  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet117  102              100G  9100  rs   Ethernet15/6  routed  down  down  OSFP 8X Pluggable Transceiver  off
Ethernet118  103              100G  9100  rs   Ethernet15/7  routed  up    up    OSFP 8X Pluggable Transceiver  off
```

After the fix:

```
admin@xxxx:~$ show interface status | grep up
Ethernet0    17               100G  9100  rs   Ethernet1/1   routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet16   9                100G  9100  rs   Ethernet3/1   routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet32   57               100G  9100  rs   Ethernet5/1   routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet48   33               100G  9100  rs   Ethernet7/1   routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet64   89               100G  9100  rs   Ethernet9/1   routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet80   65               100G  9100  rs   Ethernet11/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet96   121,122,123,124  400G  9100  N/A  Ethernet13/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet112  97               100G  9100  rs   Ethernet15/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet144  129              100G  9100  rs   Ethernet19/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet160  185              100G  9100  rs   Ethernet21/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet176  161              100G  9100  rs   Ethernet23/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet192  217              100G  9100  rs   Ethernet25/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet208  193              100G  9100  rs   Ethernet27/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet224  249              100G  9100  rs   Ethernet29/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet240  225              100G  9100  rs   Ethernet31/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet256  273              100G  9100  rs   Ethernet33/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet272  265              100G  9100  rs   Ethernet35/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet288  313              100G  9100  rs   Ethernet37/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet304  289              100G  9100  rs   Ethernet39/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet320  345              100G  9100  rs   Ethernet41/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet336  321              100G  9100  rs   Ethernet43/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet368  353              100G  9100  rs   Ethernet47/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet400  385              100G  9100  rs   Ethernet51/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet416  441              100G  9100  rs   Ethernet53/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet432  417              100G  9100  rs   Ethernet55/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet448  473              100G  9100  rs   Ethernet57/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet464  449              100G  9100  rs   Ethernet59/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet480  505              100G  9100  rs   Ethernet61/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
Ethernet496  481              100G  9100  rs   Ethernet63/1  routed  up  up  OSFP 8X Pluggable Transceiver  off
admin@xxx:~$
```

### Type of change

- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
  - [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request

- [ ] 202012
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [x] 202412

### Approach

#### What is the motivation for this PR?

Update the topology with the correct link.

#### How did you do it?

Update the check for the current port.

#### How did you verify/test it?

On a physical testbed.

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation
Original PR: sonic-net/sonic-mgmt#16579

/azp run
github-actions Bot pushed a commit that referenced this pull request on May 19, 2026
… stale (#24691)

### Description of PR

Summary:
When `sudo monit validate` is run just before `sudo monit status`, the status output may still carry the **old** "data collected" timestamp because monit hasn't finished its internal refresh cycle yet. This causes the memory-utilization plugin to read stale baseline data before or after a test run.

This PR adds a **freshness-retry** mechanism:

1. `record_monit_baseline_from_validate_output(validate_output)`: parses the System-block "data collected" timestamp from the `sudo monit validate` stdout and saves it as a baseline. Called in both `pytest_runtest_setup` and `pytest_runtest_teardown` (in `__init__.py`) right after `sudo monit validate`.
2. `read_monit_status_with_freshness_retry(cmd)`: executes `sudo monit status`, compares the System-block "data collected" timestamp against the saved baseline, and if they still match (stale), sleeps `MONIT_STATUS_FRESHNESS_WAIT_SECONDS` (60 s) and retries, up to `MONIT_STATUS_FRESHNESS_MAX_RETRIES` (3) times. Used only for the `monit` command entry.

Both constants are module-level tunables so they can be overridden in tests.

### Type of change

- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
  - [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request

- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

### Approach

#### What is the motivation for this PR?

Intermittent false-positive memory alarms were caused by the monit daemon not having refreshed its internal data by the time `sudo monit status` was issued right after `sudo monit validate`. The stale status output contained pre-test memory readings which then incorrectly appeared as the "before test" baseline, making normal memory usage look like an increase.

#### How did you do it?
- Added `record_monit_baseline_from_validate_output()` to capture the System-block "data collected" timestamp immediately after `sudo monit validate`.
- Added `read_monit_status_with_freshness_retry()` to compare the current monit status timestamp against the saved baseline; if still stale, sleep and retry (up to 3 times, 60 s each).
- Hooked both functions into `pytest_runtest_setup` and `pytest_runtest_teardown` in `__init__.py`.
- Only the `monit` command entry uses the freshness-retry path; all other memory commands (`top`, `free`, `docker stats`, FRR) are unchanged.

#### How did you verify/test it?

- Manually verified on a VS testbed that `_parse_monit_memory_data_collected_timestamp` correctly extracts the System-block timestamp while ignoring Filesystem/Process/Program block timestamps.
- Unit-tested the retry logic by mocking `execute_command` to return stale output for the first N calls and fresh output on the final call.

#### Any platform specific information?

The retry wait time (60 s) matches the monit default poll cycle; it can be lowered if the target device uses a shorter cycle.

#### Supported testbed topology if it's a new test case?

N/A: this is a framework fix for the memory-utilization plugin, not a new test case.

### Documentation

No documentation update required; this is an internal framework fix.
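
For readers who want a concrete picture of the freshness-retry idea described above, the following is a minimal sketch, not the code from the referenced commit. The function names `parse_system_data_collected` and `read_status_with_freshness_retry`, the `run_status` callable, and the assumed layout of the `monit status` text (block headers at column 0, indented `data collected` fields) are all illustrative assumptions; only the two constant names and their values come from the description.

```python
import time

# Names and values quoted in the description; the real module may differ.
MONIT_STATUS_FRESHNESS_WAIT_SECONDS = 60
MONIT_STATUS_FRESHNESS_MAX_RETRIES = 3


def parse_system_data_collected(output):
    """Return the System block's 'data collected' value, or None.

    Assumption: each block starts with an unindented header such as
    "System 'sonic'" and its fields are indented below it.
    """
    in_system_block = False
    for line in output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if not line[0].isspace():
            # Unindented line: a new block header (Filesystem, Process, ...).
            in_system_block = stripped.startswith("System ")
            continue
        if in_system_block and stripped.startswith("data collected"):
            return stripped[len("data collected"):].strip()
    return None


def read_status_with_freshness_retry(run_status, baseline, sleep=time.sleep):
    """Call run_status() until its System timestamp differs from baseline.

    run_status is a hypothetical callable returning the status text.
    After the retry budget is spent, the last output is returned as-is.
    """
    output = run_status()
    for _ in range(MONIT_STATUS_FRESHNESS_MAX_RETRIES):
        if baseline is None or parse_system_data_collected(output) != baseline:
            break  # no baseline to compare against, or the data is fresh
        sleep(MONIT_STATUS_FRESHNESS_WAIT_SECONDS)
        output = run_status()
    return output
```

Passing `sleep` as a parameter is a design choice for this sketch only: it lets a unit test substitute a recorder for `time.sleep`, in the same spirit as the mocked `execute_command` test mentioned above.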
### Verification

Elastic test jobs for `generic_config_updater` (branch: `dev/xuliping/20260512_internal-202511_monit-freshness-retry`, image: `internal-202511`):

| Testbed | Job Link |
|---------|----------|
| testbed-bjw2-can-t0-7260-9 | https://elastictest.org/scheduler/testplan/6a03157feb4c0d0f5d30bd70 |
| testbed-bjw2-can-t0-7260-1 | https://elastictest.org/scheduler/testplan/6a031580a907302e5e8240cb |
| testbed-bjw3-can-t0-7060-7 | https://elastictest.org/scheduler/testplan/6a0315c99f3385605e3ddb9b |
| testbed-bjw3-can-t0-7060-6 | https://elastictest.org/scheduler/testplan/6a0315c9ea3a02a739d03786 |

12/05/2026 17:35:56 memory_utilization.read_monit_status_wit L0126 INFO | [MemoryUtilization] status data refreshed on retry 1/3 (System block ts: Tue, 12 May 2026 17:35:34)

Signed-off-by: xuliping <xuliping@microsoft.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Liping Xu <108326363+lipxu@users.noreply.github.com>