What would you like to be added:
Enhance subctl diagnose to do more analysis than it currently does. It currently focuses on finding out if something has gone wrong, not why it has gone wrong. Some enhancements that can be done are:
For OVN-CI, make sure legacy ports etc. are not present.
Make sure OVN flows, router policies etc. are using correct IPs as per endpoints.
Check that the programmed iptables rules use the correct IPs.
For Globalnet, make sure exported services are using same IPs as GlobalIngressIPs allocated to them.
Check how frequently messages are logged. Overly frequent logging can cause log overflow in long-running setups, losing crucial information. This should help catch any overzealous logging.
Check if pod logs are about to roll over, so the user can back them up for future troubleshooting. Note: This should probably be an alert.
Check that multicluster objects match in content across source, broker, and destination.
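As a rough illustration of the log-frequency check in the list above, here is a minimal Python sketch. It is not part of subctl; the names `normalize` and `noisy_messages` and the `max_repeats` threshold are hypothetical. It assumes the log lines were already collected for a fixed time window (for example with `kubectl logs --since=10m`), so a plain repeat count stands in for a rate.

```python
import re
from collections import Counter

# Hypothetical helpers for illustration only; nothing here is a subctl API.
# Matches runs of digits, including dotted ones such as IPs or versions.
_NUMERIC = re.compile(r"\d+(?:\.\d+)*")


def normalize(message: str) -> str:
    """Replace numeric tokens (IPs, counters, timestamps) with '#'
    so that repeats of the same log statement group together."""
    return _NUMERIC.sub("#", message).strip()


def noisy_messages(lines, max_repeats=3):
    """Return {normalized message: count} for every message that occurs
    more than max_repeats times in the given lines."""
    counts = Counter(normalize(line) for line in lines if line.strip())
    return {msg: n for msg, n in counts.items() if n > max_repeats}


if __name__ == "__main__":
    sample = [f"Syncing endpoint 10.0.0.1 attempt {i}" for i in range(1, 6)]
    sample.append("Gateway ready")
    for msg, n in noisy_messages(sample).items():
        print(f"{n}x {msg}")
```

Normalizing before counting is a design choice: without it, messages that differ only in an IP or a retry counter would each be counted once and the noisy statement would go unnoticed.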
Why is this needed:
Currently subctl diagnose only does basic diagnosis: it checks deployment and pod states, runs the firewall tests, etc. But a lot of troubleshooting still requires the dev team to gather logs and analyze them. Some of the analysis done manually can easily be automated. The aim is to minimize the effort and time the dev team has to spend on troubleshooting.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further
activity occurs. Thank you for your contributions.