bugtool: Collect XFRM error counters twice
This commit changes the bugtool report to collect the XFRM error
counters (i.e., /proc/net/xfrm_stat) twice instead of only once. We will
collect at the beginning and end of the bugtool collection. In that way,
there will be around 5-6 seconds between the two collections and we may
see if any counter is currently increasing.

    $ diff cilium-bugtool-cilium-7d54p-20231025-115151/cmd/cat*--proc-net-xfrm_stat.md
    5c5
    < XfrmInStateProtoError   	4
    ---
    > XfrmInStateProtoError   	6

In this example, we can easily see that the XfrmInStateProtoError is
increasing. That suggests a key rotation issue is currently ongoing (cf.
IPsec troubleshooting docs).
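
The comparison shown above can also be done programmatically. The following is a minimal sketch, not part of this commit: it assumes each snapshot consists of `Name Value` lines as in /proc/net/xfrm_stat, and the helper names `parseXfrmStat` and `increasedCounters` are hypothetical. Lines that do not have that shape (such as any header lines in the bugtool report files) are skipped.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseXfrmStat (hypothetical helper) parses "Name Value" lines from a
// /proc/net/xfrm_stat snapshot into a map. Lines that do not consist of
// exactly two fields with an unsigned integer value are skipped.
func parseXfrmStat(snapshot string) map[string]uint64 {
	counters := make(map[string]uint64)
	scanner := bufio.NewScanner(strings.NewReader(snapshot))
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) != 2 {
			continue
		}
		value, err := strconv.ParseUint(fields[1], 10, 64)
		if err != nil {
			continue
		}
		counters[fields[0]] = value
	}
	return counters
}

// increasedCounters (hypothetical helper) returns the sorted names of
// counters present in both snapshots whose value grew between them.
func increasedCounters(before, after map[string]uint64) []string {
	var names []string
	for name, newValue := range after {
		if oldValue, ok := before[name]; ok && newValue > oldValue {
			names = append(names, name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Sample data modeled on the diff in the commit message.
	first := "XfrmInError\t0\nXfrmInStateProtoError\t4\n"
	second := "XfrmInError\t0\nXfrmInStateProtoError\t6\n"

	before, after := parseXfrmStat(first), parseXfrmStat(second)
	for _, name := range increasedCounters(before, after) {
		fmt.Printf("%s: %d -> %d\n", name, before[name], after[name])
	}
}
```

With the sample data, only XfrmInStateProtoError is reported, matching the manual diff.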

I tried other approaches that collect over a longer timespan, which
could reveal slower increases. They all ended up being more complex or
messier (we'd need to collect at the beginning and end of the sysdump
instead). In the end, considering this is already a fallback plan for
when customers don't collect Prometheus metrics, I think the current,
simpler approach is good enough.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
pchaigno committed Oct 25, 2023
1 parent 3f499b4 commit 96e76f3
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion bugtool/cmd/configuration.go
@@ -81,6 +81,10 @@ func defaultCommands(confDir string, cmdDir string, k8sPods []string) []string {
 	var commands []string
 	// Not expecting all of the commands to be available
 	commands = []string{
+		// We want to collect this twice: at the very beginning and at the
+		// very end of the bugtool collection, to see if the counters are
+		// increasing.
+		"cat /proc/net/xfrm_stat",
 		// Host and misc
 		"ps auxfw",
 		"hostname",
@@ -214,6 +218,11 @@ func defaultCommands(confDir string, cmdDir string, k8sPods []string) []string {
 		commands = append(commands, tcCommands...)
 	}
 
+	// We want to collect this twice: at the very beginning and at the
+	// very end of the bugtool collection, to see if the counters are
+	// increasing.
+	commands = append(commands, "cat -u /proc/net/xfrm_stat")
+
 	return k8sCommands(commands, k8sPods)
 }
 
@@ -269,7 +278,6 @@ func tcInterfaceCommands() ([]string, error) {
 
 func catCommands() []string {
 	files := []string{
-		"/proc/net/xfrm_stat",
 		"/proc/sys/net/core/bpf_jit_enable",
 		"/proc/kallsyms",
 		"/etc/resolv.conf",