
help request: prometheus collection indicator interface timeout #11274

Closed

smileby opened this issue May 22, 2024 · 8 comments

Comments

@smileby (Contributor) commented May 22, 2024

Description

When I used Prometheus to collect APISIX monitoring data, I found that the /apisix/prometheus/metrics interface occasionally takes a long time to respond, causing the Grafana monitoring data to be unstable. What is the reason?

How should we optimize or fix the high latency of this APISIX metrics interface?

Apart from the following log line appearing occasionally, there are no other abnormalities.

2024/05/20 21:15:27 [warn] 31025#0: *11889570993 [lua] conf_server.lua:181: report_failure(): report failure, endpoint: xxx.xxx.xxx.xxx:2579 count: 1 while connecting to upstream, client: unix:, server: , request: "POST /v3/lease/grant HTTP/1.1", upstream: "http://xxx.xxx.xxx.xxx:2579/v3/lease/grant", host: "127.0.0.1"

Environment

  • APISIX version (run apisix version): 3.2.0
  • Operating system (run uname -a): Linux localhost.localdomain 3.10.0-327.el7.x86_64 SMP Thu Oct 29 17:29:29 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
  • OpenResty / Nginx version (run openresty -V or nginx -V): openresty/1.21.4.1
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info): 3.5.4
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):
@hanqingwu (Contributor)

What is your APISIX pod resource request & limit config? Is it enough?

@smileby (Contributor, Author) commented May 28, 2024

I used the default configuration: the shared_dict used by prometheus-metrics is 10MB. On our monitoring I saw that the peak usage of prometheus-metrics reached 30MB+, and the lowest is 21MB. I don't know whether this is caused by insufficient space; my APISIX monitoring shows a lot of volatility.

[screenshot: prometheus-metrics shared_dict usage over time]
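For reference, a minimal conf/config.yaml sketch for enlarging that dict, assuming an APISIX 3.x layout where the plugin's shared memory is declared under nginx_config.http.lua_shared_dict (32m is just an example; pick a value above the observed peak):

```yaml
# conf/config.yaml -- merged over config-default.yaml at startup
nginx_config:
  http:
    lua_shared_dict:
      # default is 10m; raise it above the observed ~30m peak
      prometheus-metrics: 32m
```

Changing nginx_config alters the generated nginx.conf, so APISIX needs a reload/restart for this to take effect.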

@smileby (Contributor, Author) commented May 28, 2024

When the number of connections is 2000, this problem does not exist. Can it be solved by increasing the capacity of shared_dict?

[screenshot: connection count graph]

@smileby (Contributor, Author) commented May 28, 2024

The picture below shows my problem: at this point, the /apisix/prometheus/metrics interface is very slow.

[screenshot: monitoring fluctuation while /apisix/prometheus/metrics is slow]

@hanqingwu (Contributor)

Can you try adding some logging to check which step costs the most time?
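One low-effort way to do that from configuration alone, as a sketch: put nginx's standard $request_time variable into the access log format, so every request to /apisix/prometheus/metrics records its total duration. This assumes nginx_config.http.access_log_format is available in this APISIX version and that the exporter's server block inherits the http-level access_log:

```yaml
# conf/config.yaml -- $request_time (seconds, ms resolution) makes slow
# metric scrapes visible in the access log
nginx_config:
  http:
    access_log: logs/access.log
    access_log_format: "$remote_addr [$time_local] \"$request\" $status $request_time"
```

Then grep the access log for /apisix/prometheus/metrics and look at the last field to see how long each scrape actually took.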

@smileby (Contributor, Author) commented May 28, 2024

When I used Prometheus to collect APISIX monitoring data, I found that the /apisix/prometheus/metrics interface occasionally takes a long time to respond, causing the Grafana monitoring data to be unstable. What is the reason?

My guess is that the amount of data is too large, making the transfer time-consuming. The question I am more concerned about is why the monitoring fluctuates. Is the shared_dict eviction policy being triggered, causing data loss? Do you have any better suggestions for the fluctuation, or should I try expanding the shared_dict?

@hanqingwu (Contributor)

Yes, you can try to expand the shared_dict, or check whether metrics are lost because of a timeout.
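On the timeout theory: if collection takes longer than the scraper's timeout, Prometheus marks the whole scrape as failed and that interval's samples are simply absent, which shows up as gaps and fluctuation in Grafana. A sketch of the relevant prometheus.yml knobs (the job name and target are hypothetical; 9091 is APISIX's default export port):

```yaml
# prometheus.yml (scraper side) -- hypothetical job, adjust target to your setup
scrape_configs:
  - job_name: "apisix"
    metrics_path: /apisix/prometheus/metrics
    scrape_interval: 15s
    # if the endpoint takes longer than this, the scrape fails and
    # this interval's samples are dropped
    scrape_timeout: 10s
    static_configs:
      - targets: ["127.0.0.1:9091"]
```

Correlating the standard up metric for this job with the latency spikes would confirm or rule out the timeout explanation (up drops to 0 on failed scrapes).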

@smileby (Contributor, Author) commented May 28, 2024

Yes, you can try to expand the shared_dict, or check whether metrics are lost because of a timeout.

Thank you for your reply.

What I'd like to understand is why, when APISIX's default prometheus shared_dict configuration is 10MB, the collected monitoring data grows to 2-3 times that default.

smileby closed this as completed May 31, 2024