26 changes: 26 additions & 0 deletions docs/known-limitations/known-limitations.md
@@ -7,6 +7,7 @@ working hard to address:
* [Configuration not allowed when port is member of PortChannel](#configuration-not-allowed-when-port-is-member-of-portchannel)
* [External peering over a connection originating from an MCLAG switch can fail](#external-peering-over-a-connection-originating-from-an-mclag-switch-can-fail)
* [Mesh limitations on TH5-based devices](#mesh-limitations-on-th5-based-devices)
* [Breakout and CMIS transceiver initialization issues on DS5000](#breakout-and-cmis-transceiver-initialization-issues-on-ds5000)

### Deleting a VPC and creating a new one right away can cause the agent to fail

@@ -90,3 +91,28 @@ root cause and possible workarounds.
None. We recommend avoiding mesh topologies on TH5-based devices for the
time being, with the exception of 2-node topologies without gateway, where
the above issues would not apply.

### Breakout and CMIS transceiver initialization issues on DS5000

On Celestica DS5000 devices, certain transceivers using the Common Management Interface Specification (CMIS) fail to initialize properly under specific conditions.

CMIS is an open standard for managing high-speed pluggable transceivers, providing a uniform way for the network operating system to interact with and monitor them.

#### Diagnosing the issue

If you break out a port (for example, changing it from 1x800G to 2x400G or 8x100G) while no transceiver is present and then insert a transceiver afterward, initialization may fail, and the transceiver may be missing or reported as failed in SONiC.

This occurs because, in this scenario, SONiC does not always correctly reinitialize the hardware abstraction for the port after breakout and transceiver insertion, which especially affects CMIS modules.

#### Resolution

- The Hedgehog Fabric agent now automatically patches `/usr/share/sonic/platform/pddf/pddf-device.json` as needed after NOS installation (the patch is indicated by `-hh1` in the description). No user action is required to apply this workaround.
- A full switch reboot is still required after agent deployment for the patch to take effect.
- The `REBOOTREQ` column for the agent object in `kubectl` or `k9s` will indicate if a reboot is needed.
- If you encounter existing transceiver failures (such as after an upgrade), a full power cycle of the switch (sometimes referred to as a cold boot) may be required in addition to the reboot.
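
To confirm that the patch described above is present on a switch, one option is to look for the `-hh1` marker in the patched file. The following is a minimal sketch, not an official tool: it assumes the file is valid JSON and, because the exact field that carries the description is not stated here, it searches every string value in the document. The function names `iter_strings` and `has_patch_marker` are hypothetical.

```python
import json
from typing import Any, Iterator

# Path and marker are taken from the resolution notes above.
PDDF_PATH = "/usr/share/sonic/platform/pddf/pddf-device.json"
PATCH_MARKER = "-hh1"


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def has_patch_marker(document: Any, marker: str = PATCH_MARKER) -> bool:
    """Return True if any string value in the document contains the marker.

    Assumption: the marker is searched in all string values because the
    name of the description field is not specified in this document.
    """
    return any(marker in value for value in iter_strings(document))


if __name__ == "__main__":
    # Run on the switch itself, where the PDDF file exists.
    with open(PDDF_PATH, encoding="utf-8") as handle:
        print(has_patch_marker(json.load(handle)))
```

A positive result only shows that the file was patched; the reboot requirement described above still applies.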

#### Additional guidance

- Where possible, insert transceivers before breaking out ports to avoid the issue altogether.
- After upgrades or configuration changes, always reboot the switch if `REBOOTREQ` indicates that a reboot is needed.
- If problems persist, perform a full power cycle as a last resort.
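
When many switches are involved, checking `REBOOTREQ` by eye can be error-prone. The sketch below parses whitespace-aligned table text (such as saved `kubectl` output for agent objects) and lists the rows whose `REBOOTREQ` cell is set. It rests on several assumptions: the table has `NAME` and `REBOOTREQ` columns, cells are aligned under their headers, and an empty cell, `false`, or `<none>` means no reboot is needed. The function names and the sample values are hypothetical.

```python
from typing import Dict, List


def parse_table(text: str) -> List[Dict[str, str]]:
    """Parse a whitespace-aligned table into a list of row dictionaries.

    Column boundaries come from the header line, so empty cells are kept.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0]
    names = header.split()
    # Record the start offset of each column name in the header.
    starts: List[int] = []
    position = 0
    for name in names:
        index = header.index(name, position)
        starts.append(index)
        position = index + len(name)
    rows: List[Dict[str, str]] = []
    for line in lines[1:]:
        row: Dict[str, str] = {}
        for i, name in enumerate(names):
            end = starts[i + 1] if i + 1 < len(starts) else len(line)
            row[name] = line[starts[i]:end].strip()
        rows.append(row)
    return rows


def switches_needing_reboot(text: str) -> List[str]:
    """Return NAME values of rows whose REBOOTREQ cell looks set.

    Assumption: "", "false", and "<none>" mean no reboot is required.
    """
    not_required = {"", "false", "<none>"}
    return [
        row.get("NAME", "")
        for row in parse_table(text)
        if row.get("REBOOTREQ", "").lower() not in not_required
    ]


if __name__ == "__main__":
    # Hypothetical sample; real output has more columns and other values.
    sample = "\n".join([
        "NAME      ROLE    REBOOTREQ",
        "leaf-01   leaf    true",
        "leaf-02   leaf",
    ])
    print(switches_needing_reboot(sample))
```

Parsing by header offsets rather than splitting on whitespace keeps an empty `REBOOTREQ` cell from shifting the columns.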