HLD for Shutdown and Startup Fabric module #1694

Open. mlok-nokia wants to merge 3 commits into master from shutdown_startup_fabric_hld


@mlok-nokia mlok-nokia commented May 8, 2024

This is the HLD for shutdown and startup of the Fabric module.

@mlok-nokia mlok-nokia force-pushed the shutdown_startup_fabric_hld branch from 88063c1 to 04ba416 Compare May 10, 2024 13:57
Signed-off-by: mlok <marty.lok@nokia.com>
@kenneth-arista

@jfeng-arista for awareness

2. Modify module_db_update() to call get_module_admin_status() to check the configured admin status of the module. If module_cfg_status is not set to down, populate the CHASSIS_FABRIC_ASIC_TABLE. Otherwise, skip the module even if the SFM module is present. This mechanism prevents the event from being triggered in swss.sh when admin_status is set to down.
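The gating described in this step could be sketched roughly as follows. This is only an illustration, not the chassisd implementation: `FakeModule`, the plain-dict config and table, and the field names `asic_pci_address` and `name` are hypothetical stand-ins, and the default of "up" when no config entry exists is an assumption.

```python
# Minimal sketch of admin-status gating. All types and field names here are
# hypothetical stand-ins; the real daemon talks to Redis tables instead.
ADMIN_DOWN = "down"


class FakeModule:
    """Hypothetical stand-in for a platform fabric module object."""

    def __init__(self, name, present, asics):
        self.name = name
        self.present = present
        self.asics = asics  # list of (asic_id, pci_address) tuples


def get_module_admin_status(config, module_name):
    # Assumption: a module with no config entry is treated as "up".
    return config.get(module_name, {}).get("admin_status", "up")


def module_db_update(modules, config, fabric_asic_table):
    """Populate the fabric ASIC table only for present, non-shutdown modules."""
    for module in modules:
        if not module.present:
            continue
        if get_module_admin_status(config, module.name) == ADMIN_DOWN:
            # Module is physically present but administratively down:
            # leave its ASICs out so no downstream event is triggered.
            continue
        for asic_id, pci_address in module.asics:
            fabric_asic_table[f"asic{asic_id}"] = {
                "asic_pci_address": pci_address,
                "name": module.name,
            }
    return fabric_asic_table


modules = [
    FakeModule("FABRIC-CARD0", True, [(0, "0000:01:00.0")]),
    FakeModule("FABRIC-CARD1", True, [(2, "0000:02:00.0")]),
]
config = {"FABRIC-CARD1": {"admin_status": "down"}}
table = module_db_update(modules, config, {})
print(sorted(table))  # only the ASIC of the module that is not shut down
```

In this sketch FABRIC-CARD1 is present but configured down, so only the ASIC of FABRIC-CARD0 is written to the table.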

# 3 Test Considerations
Unit tests are also added to simulate Fabric shutdown and startup.


We may need to consider a sonic-mgmt test to cover this. The tests should validate the effect on thermal and PCI devices.

@mlok-nokia

@abdosi @judyjoseph I have updated the document with an investigation of the impact on PCIed and the thermal sensors. Based on the current implementation, there is NO impact. Please review it.


# 3 Impact and Test Considerations
## 3.1 Impact on PCIed and Thermal Sensors
For PCIed, based on the investigation, the current design of the Fabric module shutdown has NO impact. PCIed currently checks the basic PCI components. For a Fabric slot that is shut down, if the platform supports PCI on the Fabric card, the platform code should check whether that particular card is powered on before adding it to the PCIe check. That is how it is handled in the Arista vendor code.
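The power check described above might look roughly like the sketch below. It is not taken from any vendor code: `build_pcie_check_list`, the dict layout for fabric cards, and the PCI addresses are all hypothetical and serve only to illustrate the idea.

```python
# Minimal sketch: include a fabric card's PCI devices in the PCIe check
# only when that card is powered on. Names and data layout are hypothetical.
def build_pcie_check_list(base_devices, fabric_cards):
    """Return the PCI devices to check: base devices plus powered-on cards."""
    devices = list(base_devices)  # copy so the caller's list is untouched
    for card in fabric_cards:
        if card["powered_on"]:
            devices.extend(card["pci_devices"])
    return devices


base_devices = ["0000:00:1f.0"]
fabric_cards = [
    {"name": "FABRIC-CARD0", "powered_on": True, "pci_devices": ["0000:01:00.0"]},
    {"name": "FABRIC-CARD1", "powered_on": False, "pci_devices": ["0000:02:00.0"]},
]
check_list = build_pcie_check_list(base_devices, fabric_cards)
print(check_list)  # devices on the powered-off card are left out
```

With this approach a card that has been shut down contributes no entries, so its missing devices are not reported as failures.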


When the Fabric is shut down through the CLI, the vendor code needs to modify the PCI device list and remove the device from the list. We would also need support for updating this list dynamically. Currently, it is loaded during boot when the pcied daemon starts. If the list is updated, the daemon needs to be restarted in the current implementation.

judyjoseph pushed a commit to sonic-net/sonic-utilities that referenced this pull request May 29, 2024
…ule(SFM) by using "config chassis modules shutdown/startup" commands (#3283)

sudo config chassis modules shutdown/startup <module name>

The HLD for Shutdown and Startup of the Fabric Module is below:
sonic-net/SONiC#1694
arfeigin pushed a commit to arfeigin/sonic-utilities that referenced this pull request Jun 16, 2024
…ule(SFM) by using "config chassis modules shutdown/startup" commands (sonic-net#3283)

sudo config chassis modules shutdown/startup <module name>

The HLD for Shutdown and Startup of the Fabric Module is below:
sonic-net/SONiC#1694