[Go Functions] - Failed to collect metrics #9177

Closed
flowchartsman opened this issue Jan 11, 2021 · 1 comment · Fixed by #9318
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@flowchartsman
Contributor

Seeing errors in my log for Go functions refusing a metrics connection.

15:59:47.285 [pulsar-web-43-1] INFO  org.eclipse.jetty.server.RequestLog - 10.13.2.61 - - [10/Jan/2021:15:59:47 +0000] "GET /metrics HTTP/1.1" 302 0 "-" "Prometheus/2.15.2" 0
15:59:47.289 [prometheus-stats-44-1] WARN  org.apache.pulsar.functions.worker.FunctionsStatsGenerator - Failed to collect metrics for function instance tenantname/namespacename/samplertest:0
java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method) ~[?:1.8.0_275]
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) ~[?:1.8.0_275]
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) ~[?:1.8.0_275]
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) ~[?:1.8.0_275]
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) ~[?:1.8.0_275]
        at java.net.Socket.connect(Socket.java:607) ~[?:1.8.0_275]
        at java.net.Socket.connect(Socket.java:556) ~[?:1.8.0_275]
        at sun.net.NetworkClient.doConnect(NetworkClient.java:180) ~[?:1.8.0_275]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:463) ~[?:1.8.0_275]
        at sun.net.www.http.HttpClient.openServer(HttpClient.java:558) ~[?:1.8.0_275]
        at sun.net.www.http.HttpClient.<init>(HttpClient.java:242) ~[?:1.8.0_275]
        at sun.net.www.http.HttpClient.New(HttpClient.java:339) ~[?:1.8.0_275]
        at sun.net.www.http.HttpClient.New(HttpClient.java:357) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1226) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1162) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1056) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:990) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1570) ~[?:1.8.0_275]
        at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498) ~[?:1.8.0_275]
        at org.apache.pulsar.functions.runtime.RuntimeUtils.getPrometheusMetrics(RuntimeUtils.java:427) ~[org.apache.pulsar-pulsar-functions-runtime-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.pulsar.functions.runtime.process.ProcessRuntime.getPrometheusMetrics(ProcessRuntime.java:320) ~[org.apache.pulsar-pulsar-functions-runtime-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.pulsar.functions.worker.FunctionsStatsGenerator.generate(FunctionsStatsGenerator.java:71) ~[org.apache.pulsar-pulsar-functions-worker-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsGenerator.generate(PrometheusMetricsGenerator.java:96) ~[org.apache.pulsar-pulsar-broker-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.pulsar.broker.stats.prometheus.PrometheusMetricsServlet.lambda$doGet$0(PrometheusMetricsServlet.java:66) ~[org.apache.pulsar-pulsar-broker-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [org.apache.pulsar-managed-ledger-2.8.0-SNAPSHOT.jar:2.8.0-SNAPSHOT]
        at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.12.0.jar:4.12.0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_275]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_275]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_275]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [?:1.8.0_275]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_275]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_275]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275]
15:59:47.291 [prometheus-stats-44-1] INFO  org.eclipse.jetty.server.RequestLog - 10.13.2.61 - - [10/Jan/2021:15:59:47 +0000] "GET /metrics/ HTTP/1.1" 200 63238 "http://10.13.2.236:8080/metrics" "Prometheus/2.15.2" 5

The function appears to start a listener on startup

021/01/10 15:57:16.848 log.go:46: [info] Serving InstanceCommunication on port 33915

However, its counts DO eventually go up, so perhaps it's intermittent?

Running off of 2.8.0-SNAPSHOT
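
For anyone trying to reproduce this outside the worker, a minimal sketch of a probe that performs the same kind of HTTP GET and reports whether the connection was refused. The `/metrics` path, the `127.0.0.1` host, and the `probeMetrics` helper are assumptions made for illustration; the port is copied from the log line above and should be replaced with whatever metrics port the worker is actually scraping. The `syscall.ECONNREFUSED` check assumes a Unix-like system.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"net/http"
	"strconv"
	"syscall"
	"time"
)

// probeMetrics (hypothetical helper) sends GET http://host:port/metrics.
// It returns the HTTP status code, whether the failure was specifically
// "connection refused", and the underlying error, if any.
func probeMetrics(host string, port int, timeout time.Duration) (int, bool, error) {
	client := &http.Client{Timeout: timeout}
	url := "http://" + net.JoinHostPort(host, strconv.Itoa(port)) + "/metrics"
	resp, err := client.Get(url)
	if err != nil {
		// errors.Is walks the wrapped error chain down to the syscall errno.
		return 0, errors.Is(err, syscall.ECONNREFUSED), err
	}
	defer resp.Body.Close()
	return resp.StatusCode, false, nil
}

func main() {
	// 33915 is the InstanceCommunication port from the log line above; it is
	// only a placeholder here and is not necessarily the metrics port.
	status, refused, err := probeMetrics("127.0.0.1", 33915, 2*time.Second)
	switch {
	case refused:
		fmt.Println("connection refused: nothing is listening on that port")
	case err != nil:
		fmt.Println("probe failed:", err)
	default:
		fmt.Println("endpoint answered with HTTP status", status)
	}
}
```

A refused connection means nothing is listening on the probed port at all, which is a different failure from a listener that answers with an error status.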

@freeznet
Contributor

I looked through the Go functions metrics implementation, and it seems the Go functions are using the wrong port to expose metrics. I will draft a PR to fix this issue.

codelipenghui pushed a commit that referenced this issue Jan 30, 2021
Fixes #9177

### Motivation

Go functions gained a metrics collector in #6105, but `metricsPort` is never passed to the Go function, and the Prometheus HTTP server is never initialized or started. As a result, the function worker keeps trying to reach the metrics port to collect data, which floods the logs with errors.

### Modifications

- expose `metricsPort` to go function
- add prometheus http server to go function
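
To illustrate the two modifications, here is a rough sketch of a process that receives a metrics port and serves a `/metrics` endpoint on it. This is not the code from the PR: it uses only the Go standard library and writes the Prometheus text format by hand, whereas a real implementation would more likely use the Prometheus client library. The names `startMetricsServer`, `scrape`, and `demo_processed_total`, and the choice to bind to `127.0.0.1`, are assumptions made for the example.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"sync/atomic"
)

// processed is a hypothetical counter standing in for a real function metric.
var processed int64

// metricsHandler writes the counter in the Prometheus text exposition format.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprintf(w, "# TYPE demo_processed_total counter\n")
	fmt.Fprintf(w, "demo_processed_total %d\n", atomic.LoadInt64(&processed))
}

// startMetricsServer binds to metricsPort and serves /metrics in the
// background. Passing 0 lets the OS pick a free port; the returned listener
// reports the address that was actually bound.
func startMetricsServer(metricsPort int) (net.Listener, error) {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", metricsPort))
	if err != nil {
		return nil, err
	}
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", metricsHandler)
	// Serve returns once the listener is closed; the error is ignored here.
	go http.Serve(ln, mux)
	return ln, nil
}

// scrape fetches /metrics from addr, the way a collector would.
func scrape(addr string) (string, error) {
	resp, err := http.Get("http://" + addr + "/metrics")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	ln, err := startMetricsServer(0)
	if err != nil {
		fmt.Println("could not start metrics server:", err)
		return
	}
	defer ln.Close()

	atomic.AddInt64(&processed, 3)
	body, err := scrape(ln.Addr().String())
	if err != nil {
		fmt.Println("scrape failed:", err)
		return
	}
	fmt.Print(body)
}
```

The point of the sketch is that the listener has to exist on the port the collector is told about; if the port is never handed to the process, or no server is started on it, every scrape ends in a refused connection.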

### Verifying this change

- [x] Make sure that the change passes the CI checks.
codelipenghui pushed a commit that referenced this issue Feb 4, 2021
Fixes #9177

Go functions gained a metrics collector in #6105, but `metricsPort` is never passed to the Go function, and the Prometheus HTTP server is never initialized or started. As a result, the function worker keeps trying to reach the metrics port to collect data, which floods the logs with errors.

- expose `metricsPort` to go function
- add prometheus http server to go function

- [x] Make sure that the change passes the CI checks.

(cherry picked from commit 211a125)
zymap pushed a commit that referenced this issue Feb 22, 2021
Master Issue: #9177

### Motivation

As discussed in #9318, both @zymap and @wolfstudy suggested adding `metricsPort` as a field of `InstanceConfig`.

### Modifications

- add metricsPort to InstanceConfig
- add hasValidMetricsPort to InstanceConfig to check if metrics port is valid
- applied changes to k8s runtime & process runtime
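
The idea of carrying the port on the config and checking it before use could look roughly like the sketch below. It is written in Go for illustration only; the struct is a hypothetical, trimmed-down stand-in and not Pulsar's actual `InstanceConfig`, and the accepted range (1 to 65535) is an assumption about what "valid" means.

```go
package main

import "fmt"

// instanceConfig is a hypothetical, trimmed-down stand-in for InstanceConfig.
type instanceConfig struct {
	instanceID  int
	metricsPort int
}

// hasValidMetricsPort reports whether metricsPort is a usable TCP port.
// The 1..65535 range is an assumption made for this sketch.
func (c instanceConfig) hasValidMetricsPort() bool {
	return c.metricsPort > 0 && c.metricsPort < 65536
}

func main() {
	configs := []instanceConfig{
		{instanceID: 0, metricsPort: 0},     // unset
		{instanceID: 1, metricsPort: 9094},  // plausible port
		{instanceID: 2, metricsPort: 70000}, // out of range
	}
	for _, c := range configs {
		fmt.Printf("instance %d: metricsPort=%d valid=%t\n",
			c.instanceID, c.metricsPort, c.hasValidMetricsPort())
	}
}
```

A runtime can then skip metrics collection for an instance whose port is unset or out of range, instead of attempting a connection that is bound to fail.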
ivankelly pushed a commit to ivankelly/pulsar that referenced this issue Aug 10, 2021
Master Issue: apache#9177

As discussed in apache#9318, both @zymap and @wolfstudy suggested adding `metricsPort` as a field of `InstanceConfig`.

- add metricsPort to InstanceConfig
- add hasValidMetricsPort to InstanceConfig to check if metrics port is valid
- applied changes to k8s runtime & process runtime