Description
Deploy Node Problem Detector (NPD) to detect node hardware issues early.
Requirements
- NPD DaemonSet deployment
- Custom problem definitions
- Alert rules for node problems
- Automatic node cordoning
- Incident creation integration
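As a rough illustration of the automatic cordoning requirement, the sketch below picks which nodes to cordon from their reported conditions. It is a minimal sketch, not the intended implementation: `Node`, `nodes_to_cordon`, `CORDON_CONDITIONS`, and the `max_cordoned_fraction` safety cap are all hypothetical names, and the cap itself is an assumption this ticket does not specify. The two condition types listed are ones NPD's stock monitors report; the real set would follow whatever config.yaml defines.

```python
from dataclasses import dataclass, field

# Assumption: condition types considered serious enough to cordon a node.
# Adjust to match the custom problem definitions in config.yaml.
CORDON_CONDITIONS = frozenset({"KernelDeadlock", "ReadonlyFilesystem"})


@dataclass
class Node:
    """Hypothetical, simplified view of a Kubernetes node."""
    name: str
    # Condition type -> status string ("True", "False", or "Unknown").
    conditions: dict = field(default_factory=dict)
    unschedulable: bool = False


def nodes_to_cordon(nodes, max_cordoned_fraction=0.2):
    """Return names of nodes to cordon, in name order.

    The result is capped so that already-cordoned nodes plus newly
    picked ones never exceed max_cordoned_fraction of the cluster.
    """
    budget = int(len(nodes) * max_cordoned_fraction)
    budget -= sum(1 for n in nodes if n.unschedulable)

    picked = []
    for node in sorted(nodes, key=lambda n: n.name):
        if budget <= 0:
            break
        if node.unschedulable:
            continue  # already cordoned, nothing to do
        if any(node.conditions.get(c) == "True" for c in CORDON_CONDITIONS):
            picked.append(node.name)
            budget -= 1
    return picked


if __name__ == "__main__":
    cluster = [
        Node("node-a", {"KernelDeadlock": "True"}),
        Node("node-b", {"KernelDeadlock": "False"}),
        Node("node-c", {"ReadonlyFilesystem": "True"}),
        Node("node-d", {"KernelDeadlock": "True"}),
    ]
    print(nodes_to_cordon(cluster, max_cordoned_fraction=0.5))
```

The cap is there so that a faulty problem definition cannot drain scheduling capacity across the whole cluster at once. Each selected name would then be cordoned through the usual mechanism, for example `kubectl cordon <node>`.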
Files to Touch
- infrastructure/k8s/node-problem-detector/daemonset.yaml
- infrastructure/k8s/node-problem-detector/config.yaml
- infrastructure/monitoring/node-alerts.yml
Complexity: 250 points