[202305] Handling exceptions in CMIS SM to prevent xcvrd crash #484
202305 cherry-pick for #483
Description
Currently, the CmisManagerTask thread crashes upon encountering an exception, which causes the entire XCVRD process to restart. These crashes are most often seen when reading a transceiver's EEPROM fails.
Crash snippet
Motivation and Context
To avoid XCVRD restarts triggered by a CmisManagerTask thread crash, this PR moves the CMIS SM to the CMIS_STATE_FAILED state for any port that generates an exception. As a result, if module EEPROM access fails for one or more ports, the corresponding ports transition to CMIS_STATE_FAILED instead of the thread crashing.
How Has This Been Tested?
An exception was manually generated while the CMIS SM was in CMIS_STATE_INSERTED, and it was confirmed that XCVRD did not crash. CMIS initialization also succeeded on the same port once the exception no longer occurred.
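The per-port handling described under Motivation and Context, together with the manual test above, can be illustrated with a minimal standalone sketch. This is not the xcvrd code: `process_port`, `run_iteration`, `flaky_read`, the `CMIS_STATE_READY` state, and the string values of the states are all hypothetical names chosen for illustration. Only the idea (catch the exception per port and move that port to CMIS_STATE_FAILED) comes from this PR.

```python
import logging

logger = logging.getLogger("cmis_sketch")

# State names follow the PR text; the string values are placeholders.
CMIS_STATE_INSERTED = "INSERTED"
CMIS_STATE_READY = "READY"  # hypothetical end state, used only in this sketch
CMIS_STATE_FAILED = "FAILED"


def process_port(port, port_state, read_eeprom):
    """Advance one port's state machine by a single step (hypothetical helper)."""
    if port_state[port] != CMIS_STATE_INSERTED:
        return
    read_eeprom(port)  # may raise, e.g. when EEPROM access fails
    port_state[port] = CMIS_STATE_READY


def run_iteration(port_state, read_eeprom):
    """Process every port; an exception on one port must not stop the loop."""
    for port in list(port_state):
        try:
            process_port(port, port_state, read_eeprom)
        except Exception:
            # Contain the failure to this port instead of letting it
            # propagate and terminate the whole worker thread.
            logger.exception("CMIS: exception on %s, moving to failed state", port)
            port_state[port] = CMIS_STATE_FAILED


def flaky_read(bad_ports):
    """Build a fake EEPROM reader that raises for ports listed in bad_ports."""
    def read_eeprom(port):
        if port in bad_ports:
            raise IOError(f"EEPROM read failed on {port}")
        return b"\x00"
    return read_eeprom


if __name__ == "__main__":
    bad = {"Ethernet0"}
    state = {"Ethernet0": CMIS_STATE_INSERTED, "Ethernet4": CMIS_STATE_INSERTED}
    run_iteration(state, flaky_read(bad))
    print(state)
```

In this sketch, a port that raised ends up in the failed state while the other port still completes. Clearing the injected fault and putting the port back into CMIS_STATE_INSERTED (a stand-in for whatever restarts the state machine in the real daemon) lets the next iteration succeed, which mirrors the second half of the test described above.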
Additional Information (Optional)
MSFT ADO - 27441561