Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

prometheus metrics /metrics in kong does not work #7673

Closed
albertforweb opened this issue Jun 3, 2021 · 10 comments
Closed

prometheus metrics /metrics in kong does not work #7673

albertforweb opened this issue Jun 3, 2021 · 10 comments

Comments

@albertforweb
Copy link

the prometheus metrics /metrics in kong does not work with default kong configuration

similar like this : Kong/kong-plugin-prometheus#102

What api is tested?
curl http://localhost:8001/metrics

What versions are tested ?
2.1.0
2.1.4
2.2.0
2.4.1

What environment is used?
Ubuntu "20.04.2 LTS (Focal Fossa)"
postgresql-12

How to reproduce ?
1 follow steps here https://docs.konghq.com/install/ubuntu/?_ga=2.14922879.811534633.1622575399-740554415.1600964794 to install , config and start kong
2 run command "curl http://localhost:8001/metrics"

What is the problem?
constantly run "curl http://localhost:8001/metrics" every 1 seconds, sometimes you get list of metrics, sometimes you get below error:

2021/06/03 09:47:20 [error] 205436#0: *33469 lua coroutine: runtime error: ...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:620: attempt to index field '_counter' (a nil value)
stack traceback:
coroutine 0:
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua: in function 'metric_data'
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:666: in function 'collect'
...local/share/lua/5.1/kong/plugins/prometheus/exporter.lua:239: in function 'collect'
/usr/local/share/lua/5.1/kong/plugins/prometheus/api.lua:6: in function 'fn'
/usr/local/share/lua/5.1/kong/api/api_helpers.lua:258: in function </usr/local/share/lua/5.1/kong/api/api_helpers.lua:241>
coroutine 1:
[C]: in function 'resume'
/usr/local/share/lua/5.1/lapis/application.lua:397: in function 'handler'
/usr/local/share/lua/5.1/lapis/application.lua:130: in function 'resolve'
/usr/local/share/lua/5.1/lapis/application.lua:167: in function </usr/local/share/lua/5.1/lapis/application.lua:165>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/lapis/application.lua:173: in function 'dispatch'
/usr/local/share/lua/5.1/lapis/nginx.lua:230: in function 'serve'
/usr/local/share/lua/5.1/kong/init.lua:1219: in function 'admin_content'
content_by_lua(nginx-kong.conf:239):2: in main chunk, client: 127.0.0.1, server: kong_admin, request: "GET /metrics HTTP/1.1", host: "localhost:8001"
2021/06/03 09:47:20 [error] 205436#0: *33469 [lua] api_helpers.lua:429: handle_error(): /usr/local/share/lua/5.1/lapis/application.lua:401: ...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:620: attempt to index field '_counter' (a nil value)
stack traceback:
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua: in function 'metric_data'
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:666: in function 'collect'
...local/share/lua/5.1/kong/plugins/prometheus/exporter.lua:239: in function 'collect'
/usr/local/share/lua/5.1/kong/plugins/prometheus/api.lua:6: in function 'fn'
/usr/local/share/lua/5.1/kong/api/api_helpers.lua:258: in function </usr/local/share/lua/5.1/kong/api/api_helpers.lua:241>

stack traceback:
[C]: in function 'error'
/usr/local/share/lua/5.1/lapis/application.lua:401: in function 'handler'
/usr/local/share/lua/5.1/lapis/application.lua:130: in function 'resolve'
/usr/local/share/lua/5.1/lapis/application.lua:167: in function </usr/local/share/lua/5.1/lapis/application.lua:165>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/lapis/application.lua:173: in function 'dispatch'
/usr/local/share/lua/5.1/lapis/nginx.lua:230: in function 'serve'
/usr/local/share/lua/5.1/kong/init.lua:1219: in function 'admin_content'
content_by_lua(nginx-kong.conf:239):2: in main chunk, client: 127.0.0.1, server: kong_admin, request: "GET /metrics HTTP/1.1", host: "localhost:8001"

@hbagdi
Copy link
Member

hbagdi commented Jun 3, 2021

@fffonion Any clue whats going wrong here?

@fffonion
Copy link
Contributor

fffonion commented Jun 4, 2021

It could be caused by an aborted init_worker phase, which may resulted from buggy code elsewhere or the enviroment. @albertforweb please check for any error logs before the stack trace and use keyword "init_worker".

@albertforweb
Copy link
Author

i remembered the init_worker from two workers has no errors, all started normally.

not sure what really caused the issue on that day. two days later I reinstalled the kong, could not reproduce it anymore. but will try some time later.

thanks.

@curious2k
Copy link

curious2k commented Jul 9, 2021

I am running into the same issue and I checked the logs before the stack trace for keyword init_worker, there are no errors.

When I start Kong and try to hit the /metrics API, i run into the same error:
2021/07/09 07:50:22 [error] 27#0: *562 [lua] prometheus.lua:629: metric_data(): Prometheus:metric_data _counter is nil before call sync() : nil, client: 10.0.26.234, server: , request: "GET /metrics HTTP/1.1", host: "10.0.90.146:8001"
2021/07/09 07:50:22 [error] 27#0: *562 lua coroutine: runtime error: ...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:632: attempt to index field '_counter' (a nil value)
stack traceback:
coroutine 0:
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua: in function 'metric_data'
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:678: in function 'collect'
...local/share/lua/5.1/kong/plugins/prometheus/exporter.lua:239: in function 'collect'
/usr/local/share/lua/5.1/kong/plugins/prometheus/api.lua:6: in function 'fn'
/usr/local/share/lua/5.1/kong/api/api_helpers.lua:258: in function </usr/local/share/lua/5.1/kong/api/api_helpers.lua:241>

Interesting point to note here here is, if I exec in kong pod I am able to get the metrics with kong:8001/metrics but localhost:8001/metrics or <pod_ip>/metrics, both are giving Unexpected error.

Any help will be appreciated.

@hbagdi
Copy link
Member

hbagdi commented Jul 9, 2021

cc @Tieske @fffonion

@curious2k
Copy link

@Tieske @fffonion

To add on the above issue, I added few loggers in prometheus plugin to root cause the issue and observed that the PrometheusHandler.init_worker() is not invoked everytime. Whenever it is not invoked, I ran into the above error because the counter is not initialised.

@fffonion
Copy link
Contributor

@curious2k Could you share the log around PrometheusHandler.init_worker() is not invoked? Is it in a Kong start, kong restart or reload?

@albertforweb
Copy link
Author

albertforweb commented Jul 13, 2021

environment variables used:

KONG_PG_DATABASE=kong
KONG_ADMIN_LISTEN=0.0.0.0:8001
KONG_PROXY_ACCESS_LOG=logs/access.log
KONG_LOG_LEVEL=notice
KONG_PROXY_ERROR_LOG=logs/error.log
KONG_DATABASE=postgres
KONG_PG_HOST=postgresql
KONG_PLUGINS=bundled, jwt-auth
KONG_PROXY_LISTEN=0.0.0.0:80
KONG_ADMIN_ACCESS_LOG=logs/admin_access.log
KONG_ADMIN_ERROR_LOG=logs/error.log
KONG_PG_PORT=5432
KONG_PG_PASSWORD=password123
KONG_VERSION=2.1.4
KONG_DEB=kong_2.1.4_xenial_amd64.deb

not sure why the PrometheusHandler.init_worker() is not called:

Starting kong
nginx: [warn] duplicate MIME type "text/html" in /etc/kong/custom_config.conf:14
nginx: [warn] duplicate MIME type "text/html" in /etc/kong/nginx.conf:159
nginx: [error] [kong] exporter.lua:21 prometheus.exporter init is called
nginx: [error] [lua] prometheus.lua:471: init(): Prometheus init is called
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:55 +0000","request":"GET / HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"95","body_bytes_sent":"8939","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.190","route":"","request_id":"4bff543af5b4bb74a9d362bab168f6c3","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:55 +0000","request":"GET /tags HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"99","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.007","route":"","request_id":"1c25cb52a1aa14ce88833702a26ea2c5","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /consumers?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"149","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.211","route":"","request_id":"fad303c86b92b86d0868dd4af144ec2d","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /plugins?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"147","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.014","route":"","request_id":"2ba43ad70707b561c6d52da677e4b9c4","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /certificates?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"152","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.013","route":"","request_id":"3fa87d442d6218f29a1da51e3447631e","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /upstreams?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"149","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.022","route":"","request_id":"652e0cc1adf59f70a598152f0b2d4c61","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /services?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"148","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.056","route":"","request_id":"97305aba5ca5f43594bf4734878bb1a9","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /routes?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"146","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.026","route":"","request_id":"d832f41eb7842422586dd879458211c7","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /jwts?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"109","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.024","route":"","request_id":"e438886077467d158f942444298c49dd","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /ca_certificates?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"155","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.026","route":"","request_id":"dbdefaed6a9c8baa3e2548de1baba068","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /snis?size=1000&tags=managed-by-ingress-controller HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"144","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.028","route":"","request_id":"ffd0c20d06d764184dcfecb7be805c4b","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /acls?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"109","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.031","route":"","request_id":"c0de45cafa94fa31b394a0de7c54e23f","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /basic-auths?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"116","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.031","route":"","request_id":"948285cb9cc264c8bb2bc1077be7e024","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /hmac-auths?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"115","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.034","route":"","request_id":"26417e9d8fbb5d81d0ee9d1e6237ccd4","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /key-auths?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"114","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.036","route":"","request_id":"daad4730d1a74b905f83079001585015","client_id":"","tenant_id":""}
{"level":"info","remote_addr":"127.0.0.1","time_local":"13/Jul/2021:17:09:59 +0000","request":"GET /oauth2?size=1000 HTTP/1.1","status": "200","content-type": "application/json; charset=utf-8","http_user_agent":"Go-http-client/1.1","request_length":"111","body_bytes_sent":"23","upstream_addr":"","upstream_service":"","upstream_status":"","upstream_connect_time":"","upstream_response_time":"","request_time":"0.037","route":"","request_id":"4d715a6b485b9e09b778c95558c3c3ac","client_id":"","tenant_id":""}
2021/07/13 17:13:42 [error] 29#0: *984 [kong] exporter.lua:162 prometheus.exporter collect is called, client: 192.168.158.72, server: , request: "GET /metrics HTTP/1.1", host: "192.168.157.121:8001"
2021/07/13 17:13:42 [error] 29#0: *984 [lua] prometheus.lua:678: collect(): Prometheus:collect is called, client: 192.168.158.72, server: , request: "GET /metrics HTTP/1.1", host: "192.168.157.121:8001"
2021/07/13 17:13:42 [error] 29#0: *984 [lua] prometheus.lua:630: metric_data(): Prometheus:metric_data _counter is nil before call sync() : nil, client: 192.168.158.72, server: , request: "GET /metrics HTTP/1.1", host: "192.168.157.121:8001"
2021/07/13 17:13:42 [error] 29#0: *984 lua coroutine: runtime error: ...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:633: attempt to index field '_counter' (a nil value)
stack traceback:
coroutine 0:
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua: in function 'metric_data'
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:680: in function 'collect'
...local/share/lua/5.1/kong/plugins/prometheus/exporter.lua:242: in function 'collect'
/usr/local/share/lua/5.1/kong/plugins/prometheus/api.lua:6: in function 'fn'
/usr/local/share/lua/5.1/kong/api/api_helpers.lua:258: in function </usr/local/share/lua/5.1/kong/api/api_helpers.lua:241>
coroutine 1:
[C]: in function 'resume'
/usr/local/share/lua/5.1/lapis/application.lua:397: in function 'handler'
/usr/local/share/lua/5.1/lapis/application.lua:130: in function 'resolve'
/usr/local/share/lua/5.1/lapis/application.lua:167: in function </usr/local/share/lua/5.1/lapis/application.lua:165>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/lapis/application.lua:173: in function 'dispatch'
/usr/local/share/lua/5.1/lapis/nginx.lua:230: in function 'serve'
/usr/local/share/lua/5.1/kong/init.lua:1210: in function 'admin_content'
content_by_lua(nginx.conf:328):2: in main chunk, client: 192.168.158.72, server: , request: "GET /metrics HTTP/1.1", host: "192.168.157.121:8001"
2021/07/13 17:13:42 [error] 29#0: *984 [lua] api_helpers.lua:429: handle_error(): /usr/local/share/lua/5.1/lapis/application.lua:401: ...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:633: attempt to index field '_counter' (a nil value)
stack traceback:
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua: in function 'metric_data'
...cal/share/lua/5.1/kong/plugins/prometheus/prometheus.lua:680: in function 'collect'
...local/share/lua/5.1/kong/plugins/prometheus/exporter.lua:242: in function 'collect'
/usr/local/share/lua/5.1/kong/plugins/prometheus/api.lua:6: in function 'fn'
/usr/local/share/lua/5.1/kong/api/api_helpers.lua:258: in function </usr/local/share/lua/5.1/kong/api/api_helpers.lua:241>

@albertforweb
Copy link
Author

FYI.

as a workaround, i have to modify the prometheus exporter, by detecting whether init_worker is called or not in the collect() method. if prometheus:init_worker was not called then call it before the other logic.

this worked,

but still have no clue why Prometheus:init_worker was not called by kong.

@gszr gszr transferred this issue from Kong/kong-plugin-prometheus Aug 7, 2021
@fffonion
Copy link
Contributor

Closing for inactivity, please re-open if the necessary ; )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants