I'm testing node-problem-detector and, despite deliberately restarting the kubelet service on one of the nodes, it never reports a FrequentKubeletRestart problem.
I'm deploying node-problem-detector with the Helm chart available at https://github.com/deliveryhero/helm-charts/tree/master/stable/node-problem-detector.
Below is my values.yaml file:
---
settings:
  log_monitors:
    - /config/kernel-monitor.json
    - /config/docker-monitor.json
    - /config/abrt-adaptor.json
    - /config/kernel-monitor-filelog.json
  custom_plugin_monitors:
    - /config/kernel-monitor-counter.json
    - /config/docker-monitor-counter.json
    - /config/systemd-monitor-counter.json
    - /config/health-checker-kubelet.json
    - /custom-config/FrequentKubeletRestartCustom.json
  custom_monitor_definitions:
    FrequentKubeletRestartCustom.json: |
      {
        "plugin": "custom",
        "pluginConfig": {
          "invoke_interval": "5m",
          "timeout": "1m",
          "max_output_length": 80,
          "concurrency": 1
        },
        "source": "systemd-monitor",
        "metricsReporting": true,
        "conditions": [
          {
            "type": "FrequentKubeletRestartCustom",
            "reason": "FrequentKubeletRestartCustom",
            "message": "kubelet is functioning properly"
          }
        ],
        "rules": [
          {
            "type": "permanent",
            "condition": "FrequentKubeletRestartCustom",
            "reason": "FrequentKubeletRestartCustom",
            "path": "/home/kubernetes/bin/log-counter",
            "args": [
              "--journald-source=systemd",
              "--log-path=/var/log/journal",
              "--lookback=20m",
              "--delay=5m",
              "--count=5",
              "--pattern=Started"
            ],
            "timeout": "1m"
          }
        ]
      }
metrics:
  # metrics.enabled -- Expose metrics in Prometheus format with default configuration.
  enabled: true
  serviceMonitor:
    enabled: true
    additionalLabels:
      release: monitoring
resources:
  requests:
    cpu: 20m
    memory: 20Mi
hostNetwork: true
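One detail worth sanity-checking when a log-counter rule stays silent: if I understand log-counter correctly, the "--journald-source" flag is compared against the journal entry's SYSLOG_IDENTIFIER field, and unit-lifecycle messages such as "Started Kubernetes Kubelet Server." are normally emitted by systemd itself. The sketch below uses made-up journal lines in JSON export format (on a real node, feed it from "journalctl -o json -u kubelet" instead) to show how to confirm which identifier the restart message actually carries:

```shell
# Made-up journal entries in `journalctl -o json` export format
# (hypothetical sample data, not taken from the cluster above).
cat <<'EOF' > /tmp/journal-sample.json
{"SYSLOG_IDENTIFIER":"systemd","MESSAGE":"Started Kubernetes Kubelet Server."}
{"SYSLOG_IDENTIFIER":"kubelet","MESSAGE":"Flag --v has been set"}
EOF
# Pull the identifier of the "Started ..." line; this value is what the
# rule's --journald-source flag has to match.
grep '"MESSAGE":"Started' /tmp/journal-sample.json \
  | grep -o '"SYSLOG_IDENTIFIER":"[^"]*"'
```

In this sample the identifier is "systemd", which agrees with the "--journald-source=systemd" flag in the rule above; running the same pipeline against the real journal confirms whether that holds on the node.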
It's important to mention that the default systemd-monitor-counter.json configuration didn't work, so I created a custom configuration, FrequentKubeletRestartCustom.json, as shown above, but I didn't get any results either. I also tried modifying the "--pattern" flag in several ways without success.
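As a quick local check on candidate "--pattern" values (assuming the flag is treated as a regular expression matched against each journal message, as in node-problem-detector's other monitors), grep -E can approximate which lines would be counted before redeploying the DaemonSet:

```shell
# Hypothetical message as it appears in the journal when kubelet restarts.
msg='Started Kubernetes Kubelet Server.'
# Count how many of the candidate patterns would match it; the patterns
# here are illustrative guesses, not values taken from any official config.
hits=0
for pattern in 'Started' 'Started Kubernetes Kubelet Server' 'kubelet exited'; do
  if echo "$msg" | grep -qE "$pattern"; then
    hits=$((hits + 1))
    echo "matched: $pattern"
  fi
done
echo "total matches: $hits"   # -> total matches: 2
```

This only tests the matching step, of course; it says nothing about whether the monitor can read the journal in the first place.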
Below is some information about the test performed on one of the nodes:
- I accessed the node via SSH and restarted the kubelet repeatedly.
- I watched the log with "journalctl -u kubelet -f | grep "Started"" and noticed that the message "Started Kubernetes Kubelet Server." always appeared when I restarted the kubelet.
- I ran "kubectl describe nodes" to check whether the FrequentKubeletRestartCustom or FrequentKubeletRestart condition changed from False to True, but the status always remained False, as if the kubelet had not been restarted.
- The logs also didn't show any information at the time I restarted the kubelet.
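One environmental detail the rule depends on (an assumption about the node, not something the chart controls): "--log-path=/var/log/journal" only works when journald storage is persistent. On distros with volatile journald storage, the journal lives under /run/log/journal instead, and a counter pointed at an empty /var/log/journal would never see the restarts. A sketch of the check, simulated here with temporary directories (on the node itself, test /var/log/journal and /run/log/journal directly):

```shell
# Simulated node layout (made-up paths): only the "volatile" location exists.
mkdir -p /tmp/demo-node/run/log/journal
# The same loop, pointed at the real paths, shows where journald writes.
for d in /tmp/demo-node/var/log/journal /tmp/demo-node/run/log/journal; do
  if [ -d "$d" ]; then
    echo "journal directory present: $d"
  fi
done
```

In the simulation only /tmp/demo-node/run/log/journal is reported; if the real node behaves the same way, the rule's "--log-path" would need to point at /run/log/journal (and that path would need to be mounted into the node-problem-detector pod).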
