vmagent debug kube-state-metrics 0/0 target #5389
Comments
Hi @k0nstantinv! Can you share the …
@dmitryk-dk kube-state-metrics is deployed separately from vm-stack; for now it's a part of the kube-prometheus-stack chart, so vm-stack's values file contains nothing about kube-state-metrics except the …
Hi @k0nstantinv, we want to reproduce your issue, but we can't deploy kube-state-metrics without any … At the moment we have our test setup where we see the target … How we did it: …
@dmitryk-dk thanks! I understand your installation approach. As I mentioned above, kube-state-metrics itself is a part of kube-prometheus-stack, which was deployed earlier. I want to understand how to debug zero targets in vmagent, so it makes no difference how KSM was actually deployed. I have the KSM ServiceMonitor, I have the KSM VMServiceScrape, I have the job definition in the vmagent config, but this one particular target shows only 0/0. There are no error logs or anything, so my goal is to learn how to make it work.
@dmitryk-dk some additional info: the cluster is really huge.
My active targets tab shows thousands of (0/0) targets, like …, and I'm really stuck debugging the reason, with no luck. kube-state-metrics was just an example to start from. Here is the values file:
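(The values file itself didn't survive in this thread. For orientation, a minimal sketch of what the relevant vmagent section of a victoria-metrics-k8s-stack values file could look like; field names follow the chart's vmagent block, and the concrete values are assumptions, not taken from this issue.)

```yaml
# Hypothetical values.yaml fragment for the victoria-metrics-k8s-stack chart.
vmagent:
  enabled: true
  spec:
    scrapeInterval: 30s
    # Select scrape objects from all namespaces (assumed; adjust selectors as needed).
    selectAllByDefault: true
    remoteWrite:
      - url: http://vminsert-vm-stack:8480/insert/0/prometheus/api/v1/write
```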
Hope it helps.
Hi @k0nstantinv! First of all, you should use … Can you share an example of a service which the service scrape should scrape? You should check the labels of the service and of the service scrape selector: those labels should be equal, and the namespaces should match. If you can share this information, it could help to find the problem.
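To illustrate that matching rule, here is a hypothetical Service / VMServiceScrape pair (names, labels and port are made up) where the selector labels and the namespace line up, so the target would be discovered:

```yaml
# Hypothetical Service exposing kube-state-metrics; the label below is what the scrape object selects on.
apiVersion: v1
kind: Service
metadata:
  name: kube-state-metrics
  namespace: monitoring
  labels:
    app.kubernetes.io/name: kube-state-metrics
spec:
  selector:
    app.kubernetes.io/name: kube-state-metrics
  ports:
    - name: http
      port: 8080
      targetPort: 8080
---
# The VMServiceScrape selector must equal the Service labels, and both objects
# must live in a namespace the vmagent is allowed to watch.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
  name: kube-state-metrics
  namespace: monitoring
spec:
  endpoints:
    - port: http
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
```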
I tried vmagent 1.95.1 and the result is the same.
Sure, here is kube-state-metrics: …
The labels are the same as the ones in the VMServiceScrape above. Everything, both vm-stack and prometheus-stack, is deployed in the namespace monitoring.
Can you check whether, in the describe of the service, you can find (match) any target? Because the labels are identical and the namespaces are the same.
@dmitryk-dk sure
Seems like there are some more complicated issues with my setup, maybe some bottleneck in vmagent or a kubernetes_sd misconfiguration... no idea.
@k0nstantinv could you try to downgrade vmagent to version 1.93.5, to check whether vmagent itself is the bottleneck?
Tried, with no luck.
Still 0/0 targets, with no errors.
I would like to set up an environment with precisely the same configuration. Can I ask you to share the deployment config?
Here is the VMAgent: …
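(The actual VMAgent manifest isn't preserved here. Below is a minimal hypothetical sketch of a sharded VMAgent CR; the shardCount value and selectors are assumptions, and sharding is what becomes relevant later in the thread.)

```yaml
# Hypothetical VMAgent CR; with shardCount > 1 the operator creates several vmagent
# instances, and each of them scrapes only a subset of the discovered targets.
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMAgent
metadata:
  name: vm-stack
  namespace: monitoring
spec:
  shardCount: 4
  scrapeInterval: 30s
  selectAllByDefault: true
  remoteWrite:
    - url: http://vminsert-vm-stack:8480/insert/0/prometheus/api/v1/write
```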
Hi @k0nstantinv! In the previous message, I asked about the Deployment manifest itself. Could you please share it?
Here is the Deployment of the first vmagent shard: …
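(The Deployment manifest didn't survive in the thread either. The part that matters for the rest of the discussion is the sharding flags the operator passes to each vmagent shard; a hypothetical fragment follows. The flag names are from the vmagent docs linked later in the thread, while the concrete values and config path are assumptions.)

```yaml
# Hypothetical container spec of the first vmagent shard.
# Each shard scrapes only its own subset of targets, so the remaining targets
# are reported as dropped on that shard (which shows up as 0/0).
containers:
  - name: vmagent
    image: victoriametrics/vmagent:v1.94.0
    args:
      - -promscrape.config=/etc/vmagent/config_out/vmagent.env.yaml  # path is an assumption
      - -promscrape.cluster.membersCount=4
      - -promscrape.cluster.memberNum=0   # shard index; differs per replica
      - -remoteWrite.url=http://vminsert-vm-stack:8480/insert/0/prometheus/api/v1/write
```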
@dmitryk-dk I understand my VMAgent definition has the option …
If …
@valyala Thanks a lot! I know …
Hi @k0nstantinv! I will try to reproduce it today on my local setup. If I find that the scrape targets are available, then we will need to check the code for what can cause the issue when there are a lot of targets present.
Hi @k0nstantinv! Can you share the deployment for the …
@dmitryk-dk sure
…
…discovery page Previously the /service-discovery page didn't show targets dropped because of sharding (https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). Show also the reason why every target is dropped at the /service-discovery page. This should improve debugging why particular targets are dropped. While at it, do not remove dropped targets from the list at the /service-discovery page until the total number of targets exceeds the limit passed to -promscrape.maxDroppedTargets. Previously the list was cleaned up every 10 minutes from entries which weren't updated for the last minute, which could complicate debugging of dropped targets. Updates #5389
Oh, I didn't pay attention that you pass … I think it would be better from a debuggability PoV to show all the dropped targets at …
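(For reference, the -promscrape.maxDroppedTargets limit mentioned in the commit message above can be raised so that more dropped targets stay visible at /service-discovery. A sketch using the operator's extraArgs field; the limit value here is arbitrary and not taken from this issue.)

```yaml
# Hypothetical VMAgent spec fragment: keep more dropped targets visible
# on the /service-discovery page when debugging a sharded setup.
spec:
  extraArgs:
    promscrape.maxDroppedTargets: "10000"
```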
Hi @k0nstantinv! I have tested your configurations; it works as expected: …
@valyala @dmitryk-dk Thanks for the details! It is clear to me now. I'll try it ASAP.
… instances, which scrape the given dropped target at /service-discovery page The /service-discovery page contains the list of all the discovered targets after the commit 487f638 on all the vmagent instances in cluster mode (https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets). This commit improves debuggability of targets in cluster mode by providing the list of -promscrape.cluster.memberNum values for every target dropped because of sharding at the /service-discovery page, i.e. showing which other vmagent instances in the cluster scrape the given target. Updates #5389 Updates #4018
FYI, the next release of …
Thank you for your attention! It seems like everything works as described here. Glad to know our report helped to improve something in the project.
FYI, …
FYI, the regression fix has also been included in the v1.97.2 LTS release.
Is your question request related to a specific component?
vmagent
Describe the question in detail
I'm in the process of moving from Prometheus to VictoriaMetrics in a huge AWS cluster.
I use kube-prometheus-stack and vm-stack at the same time during the migration.
VictoriaMetrics is deployed via the victoria-metrics-k8s-stack chart. All the ServiceMonitors were converted to VMServiceScrapes.
We've discovered that lots of vmagent targets show 0/0 and we have no idea why. There are no related error logs across the vmagents or anywhere else.
Please give some advice on how to debug a target, for example the kube-state-metrics target.
Also, vmagent's discovered targets endpoint does not contain any entries about the kube-state-metrics target.
Ready to post any logs or attachments.
Environment info:
aws eks
vmcluster - v1.94.0
vmagent - v1.94.0 (also tried v1.95.1)
VMServiceScrape kube-state-metrics: …
Job definition from vmagent: …
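(The generated job definition isn't preserved in this report. For a VMServiceScrape like the one above, the operator renders a Prometheus-compatible job roughly along these lines; this is a sketch, and the job name pattern and relabeling rules are simplified assumptions about what the operator emits.)

```yaml
# Hypothetical fragment of the vmagent scrape config generated for kube-state-metrics.
scrape_configs:
  - job_name: serviceScrape/monitoring/kube-state-metrics/0
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names:
            - monitoring
    relabel_configs:
      # Keep only endpoints whose Service carries the selected label.
      - action: keep
        source_labels: [__meta_kubernetes_service_label_app_kubernetes_io_name]
        regex: kube-state-metrics
      # Keep only the port named in the VMServiceScrape endpoint.
      - action: keep
        source_labels: [__meta_kubernetes_endpoint_port_name]
        regex: http
```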
Thanks for your attention anyway; I appreciate any help.
Troubleshooting docs