Alerts firing: ControllerManager, Scheduler and TargetDown #1530
Comments
If it helps, it looks like some services have no endpoints:
|
I had a similar issue, but I've used If you have kubernetes core components as pods in the
|
I am also facing the same issue, but in my case I used Azure acs-engine to launch the cluster. I keep getting the Scheduler and Controller alerts. I can see the pods are running, but there is no corresponding service for them. |
@sandromello The problem is that I don't have the Pods |
This is a known issue with GKE: see prometheus-operator/prometheus-operator#355 and prometheus-operator/prometheus-operator#845. I ended up just deleting the two alerts. |
This also seems to be the case for https://github.com/rancher/rke deployments (at least it is happening on my dev cluster) |
@domcar one way to avoid this issue is to have a flag that controls whether some dependencies from kube-prometheus are deployed. Look at the alertmanager example to see how it's possible to skip the installation of a dependency. PRs are always welcome :) |
I don't have any endpoints for the kube controller manager and scheduler, so how can I monitor them using Prometheus and the Prometheus Operator? Alerts are being triggered from Alertmanager.
@ScottBrenner what's the best way to delete an alert using helm? Is it possible to cherry-pick out the alerts, or would I need to recreate them all (minus the non-working alerts for GKE)? |
…kube state exporters optional When running prometheus operator on hosted kuberenetes like GCE, few of the exporters are optional, so adding ability to conditional installations. Fixes #1001, prometheus-operator#355, prometheus-operator#845
…kube state exporters optional (#1525)
* Helm: Improving readme instructions for testing helm chart locally
  Adding note about where to run commands from and also breaking up large bash commands into multiple lines for simple copy paste.
* kube-prometheus: Making kubelets, kubescheduler, kube controller and kube state exporters optional
  When running prometheus operator on hosted kuberenetes like GCE, few of the exporters are optional, so adding ability to conditional installations.
  Fixes #1001, #355, #845
* Update Chart.yaml
* Update Chart.yaml
@bonovoxly Was using |
@domcar @ScottBrenner I ran into the same issue, but in my case I used binary packages to install the cluster. Can you give me some advice on how to fix the issue? |
I ran into this issue with a cluster deployed through kops in AWS. The solution that worked for me was sitting in an old version of the repo: I had to deploy the services listed here to Edit: N.B. that I think you can also generate the requisite files by adding
|
Same issue with AWS EKS |
I am facing the same issue. I don't have the kube-scheduler or controller-manager Pods. @domcar how did you fix the issue? P.S. I used helm for installation. Cloud provider: AWS |
I noticed that the label selector the service was using was not matching any pods. After adding the label k8s-app=kube-controller-manager to the controller manager and k8s-app=kube-scheduler to the scheduler, the alerts cleared up, as the service could now find the pods. |
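To illustrate the selector matching described above, here is a minimal sketch of the kind of headless Service involved. The Service names, the namespace, and the ports (10251 and 10252, the insecure metrics ports used by Kubernetes releases of that era) are assumptions for illustration and are not taken from this thread; only the `k8s-app` label values come from the comment above.

```yaml
# Sketch only: names, namespace and ports are assumed, not copied from the chart.
apiVersion: v1
kind: Service
metadata:
  name: example-exporter-kube-scheduler   # hypothetical name
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
spec:
  clusterIP: None            # headless: only used for endpoint discovery
  selector:
    k8s-app: kube-scheduler  # pods must carry exactly this label to become endpoints
  ports:
    - name: http-metrics
      port: 10251            # assumed scheduler metrics port
      targetPort: 10251
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: example-exporter-kube-controller-manager   # hypothetical name
  namespace: kube-system
  labels:
    k8s-app: kube-controller-manager
spec:
  clusterIP: None
  selector:
    k8s-app: kube-controller-manager
  ports:
    - name: http-metrics
      port: 10252            # assumed controller manager metrics port
      targetPort: 10252
      protocol: TCP
```

A Service like this only gets endpoints when pods in the same namespace carry the matching label. If the control-plane pods are labelled differently, or are not visible at all (as on managed offerings), the endpoints list stays empty and the target is reported as down.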
@chris530 I had to do something very similar to the service selectors; basically null out the |
@chris530 how were you able to add these labels to the controller manager and the kube scheduler? I don't even have the pods and services associated with either kube-scheduler or kube-controller-manager. My Kubernetes cluster is installed with RKE. |
Hello, I am currently using prometheus-stack version 20.0.1 Thanks for your help. |
Hi, I've been dealing with these false positives on GKE. After investigating a little, I realized that GKE exposes neither the Kubernetes Scheduler nor the Controller Manager to end users. Since we have no visibility into these services, there is no need to deploy the Scheduler scraper, the Controller Manager scraper, or their respective alerts. The easiest way of dealing with these false-positive alerts is to disable the scraping and alerts for the services managed by GKE in the Values file of the Helm chart.
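A minimal sketch of what such a Values file could look like, assuming the key names used by the kube-prometheus-stack chart's `values.yaml` (they may differ between chart versions, so it is worth checking them with `helm show values` for the version in use):

```yaml
# values.yaml (sketch; key names are assumed from the kube-prometheus-stack chart)
kubeControllerManager:
  enabled: false      # skip the controller manager Service/ServiceMonitor
kubeScheduler:
  enabled: false      # skip the scheduler Service/ServiceMonitor
defaultRules:
  rules:
    kubeScheduler: false   # assumed rule-group toggle; drops the scheduler alerts
```

With the scrape targets gone, the corresponding "down" alerts have nothing to evaluate against, which is why disabling both the component and its rule group is the usual combination.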
This is probably the case for other cloud providers, although I'm not sure about it. Cheers, |
Hi @ferpizza, Now I no longer receive alerts for KubeScheduler and KubeControllerManager. However, a new KubeProxyDown alert now appears. Cheers |
Hello @woody3549, I haven't found official documentation that distinguishes the k8s components exposed to end users from the ones kept private for Google's management. You can make an assumption based on whether a given component is key to GKE's own operation.
When I wrote my first comment I was on version 18.1.1 of the Kube Prometheus Stack helm chart, and that version did not include the Since then I have updated to version 27.1.0, which includes the We can solve this, and the two prior alerts, by adding the following lines to our Values file.
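As a hedged sketch of the kube-proxy part of such a Values file: the keys below are assumed from the kube-prometheus-stack chart's `values.yaml` and should be verified against the chart version in use. The same `<component>.enabled: false` pattern applies to `kubeScheduler` and `kubeControllerManager`.

```yaml
# values.yaml (sketch; key names are assumptions, verify against your chart version)
kubeProxy:
  enabled: false      # do not create the kube-proxy Service/ServiceMonitor
defaultRules:
  rules:
    kubeProxy: false  # assumed rule-group toggle; drops the KubeProxyDown alert
```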
|
Hello, Ok thanks. This makes sense and is very helpful. Regards |
What did you do?
I installed prometheus-operator and kube-prometheus using helm:
What did you expect to see?
Everything green in Alert Manager
What did you see instead? Under which circumstances?
Some Alerts are firing:
Environment
GKE
Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"8", GitVersion:"v1.8.1", GitCommit:"f38e43b221d08850172a9a4ea785a86a3ffa3b3a", GitTreeState:"clean", BuildDate:"2017-10-11T23:27:35Z", GoVersion:"go1.8.3", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"8+", GitVersion:"v1.8.5-gke.0", GitCommit:"2c2a807131fa8708abc92f3513fe167126c8cce5", GitTreeState:"clean", BuildDate:"2017-12-19T20:05:45Z", GoVersion:"go1.8.3b4", Compiler:"gc", Platform:"linux/amd64"}
Kubernetes cluster kind:
I used terraform to create the cluster on GKE
Prometheus Operator Logs:
No errors or warnings
I guess these targets are somehow not being scraped. Can you help me figure out how to solve this issue, please? Thanks