Commit
bugtool: Collect XFRM error counters twice
This commit changes the bugtool report to collect the XFRM error counters (i.e., /proc/net/xfrm_stat) twice instead of only once. We will collect at the beginning and end of the bugtool collection. In that way, there will be around 5-6 seconds between the two collections and we may see if any counter is currently increasing.

    $ diff cilium-bugtool-cilium-7d54p-20231025-115151/cmd/cat*--proc-net-xfrm_stat.md
    5c5
    < XfrmInStateProtoError 4
    ---
    > XfrmInStateProtoError 6

In this example, we can easily see that the XfrmInStateProtoError is increasing. That suggests a key rotation issue is currently ongoing (cf. IPsec troubleshooting docs).

I tried other approaches to collect over a longer timespan. That may allow us to see slower increases. They all end up being more complex or messier (we'd need to collect at beginning and end of the sysdump instead). In the end, considering this is already a fallback plan for when customers don't collect Prometheus metrics, I think the current, simpler approach is good enough.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
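As an illustration of how the two snapshots can be compared beyond a plain diff, here is a minimal Python sketch. It is not part of this commit or of the bugtool. The helper names (parse_xfrm_stat, increasing_counters) are hypothetical, and it assumes each counter line has the form "name value", as in the diff example above.

```python
def parse_xfrm_stat(text):
    """Parse xfrm_stat-style text into {counter_name: value}.

    Assumes one "name value" pair per line; anything else is skipped.
    """
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdigit():
            continue  # skip blank lines, headers, or unexpected content
        counters[fields[0]] = int(fields[1])
    return counters


def increasing_counters(first, second):
    """Return {name: (before, after)} for counters that grew between snapshots."""
    before = parse_xfrm_stat(first)
    after = parse_xfrm_stat(second)
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and after[name] > before[name]
    }


if __name__ == "__main__":
    # Illustrative snapshots; the values mirror the diff example above.
    start = "XfrmInError 0\nXfrmInStateProtoError 4\n"
    end = "XfrmInError 0\nXfrmInStateProtoError 6\n"
    for name, (old, new) in increasing_counters(start, end).items():
        print(f"{name}: {old} -> {new}")
```

Only counters that strictly increase are reported, which matches the goal of spotting errors that are still occurring during the roughly 5-6 second collection window.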