

Addon: expose /metrics endpoints for Prometheus #49

Open · wants to merge 27 commits into base: master

Conversation

solsson (Contributor) commented Jul 28, 2017

Good fit with https://github.com/Yolean/kubernetes-monitoring.

TODO recommend a Grafana dashboard json

@solsson solsson added the addon label Jul 28, 2017
solsson added a commit that referenced this pull request Jul 28, 2017
#49
but maybe with tests instead of talk
@solsson solsson force-pushed the addon-metrics branch 3 times, most recently from 7240ded to 5221e4d on July 31, 2017 07:30
solsson (Contributor, Author) commented Jul 31, 2017

Based on the observation that the test pod in https://github.com/Yolean/kubernetes-kafka/blob/addon-metrics/test/jmx-selftest.yml takes 40-100 MB of memory (in GKE, according to kubectl top), I've tried to fit all metrics containers within a 100 MB resource limit. The problem is that the JVM has to be restricted to stay within such a limit, or it will have spikes that cause pod restarts. I think that the current 64 MB for the app and 32 MB for "metaspace" avoids such restarts while keeping scrapes almost as performant as without resource limits.
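For reference, roughly what that looks like as a container spec (a minimal sketch based on the limits discussed here and the command shown in the next comment; the exact image tag, ports and manifests in the branch may differ):

- name: metrics
  image: solsson/kafka-prometheus-jmx-exporter
  command:
  - java
  - -Xmx64M                    # app heap, kept well below the pod limit
  - -XX:MaxMetaspaceSize=32m   # "metaspace" cap
  - -jar
  - jmx_prometheus_httpserver.jar
  - "5556"
  - example_configs/kafka-prometheus-monitoring.yml
  ports:
  - containerPort: 5556
  resources:
    limits:
      memory: 100Mi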

solsson (Contributor, Author) commented Jul 31, 2017

Poor results in GKE, getting pod restarts at least once every five minutes:

    Command:
      java
      -Xmx64M
      -XX:MaxMetaspaceSize=32m
      -jar
      jmx_prometheus_httpserver.jar
      5556
      example_configs/kafka-prometheus-monitoring.yml
    State:		Running
      Started:		Mon, 31 Jul 2017 21:36:37 +0200
    Last State:		Terminated
      Reason:		OOMKilled
      Exit Code:	137

solsson added a commit that referenced this pull request Aug 5, 2017
at least for now, as it allows exec into the pods to investigate.
We've been having frequent restarts that are not due to OOMKilled (i.e. not #49).
Now failed probes will lead to unready pods, which we can monitor for using #60.
solsson referenced this pull request Aug 5, 2017
which might not matter because we no longer have a loadbalancing service.

These probes won't catch all failure modes,
but if they fail we're pretty sure the container is malfunctioning.

I found some sources recommending ./bin/kafka-topics.sh for probes,
but to me it looks risky to introduce a dependency on some other service for such things.
One such source is helm/charts#144

The zookeeper probe is from
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-probes/
An issue is that zookeeper's logs are quite verbose for every probe.
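One common pattern for this kind of check is an exec probe against zookeeper's ruok four-letter-word command; a minimal sketch (the exact command, port and timings here are assumptions, not necessarily what the commit uses):

readinessProbe:
  exec:
    command:
    - /bin/sh
    - -c
    - '[ "imok" = "$(echo ruok | nc -w 2 127.0.0.1 2181)" ]'   # expect "imok" back from zookeeper
  initialDelaySeconds: 10
  timeoutSeconds: 5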
yacut commented Sep 25, 2017

@solsson I had the same problem with kafka metrics.

  1. The metrics response is too big for a small server (especially if you have many topics/partitions and export all java.lang metrics). That was the reason for the java.lang.OutOfMemoryError: GC Overhead Limit Exceeded error. I have reduced the metrics for Kafka to:
lowercaseOutputName: true
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5555/jmxrmi
rules:
  - pattern : kafka.server<type=ReplicaFetcherManager, name=MaxLag, clientId=(.+)><>Value
  - pattern : kafka.server<type=BrokerTopicMetrics, name=(.+), topic=(.+)><>OneMinuteRate
  - pattern : kafka.server<type=KafkaRequestHandlerPool, name=RequestHandlerAvgIdlePercent><>OneMinuteRate
  - pattern : kafka.server<type=Produce><>queue-size
  - pattern : kafka.server<type=ReplicaManager, name=(.+)><>(Value|OneMinuteRate)
  - pattern : kafka.server<type=controller-channel-metrics, broker-id=(.+)><>(.*)
  - pattern : kafka.server<type=socket-server-metrics, networkProcessor=(.+)><>(.*)
  - pattern : kafka.server<type=Fetch><>queue-size
  - pattern : kafka.server<type=SessionExpireListener, name=(.+)><>OneMinuteRate
  - pattern : java.lang<type=OperatingSystem><>SystemCpuLoad
  - pattern : java.lang<type=Memory><HeapMemoryUsage>used
  - pattern : java.lang<type=OperatingSystem><>FreePhysicalMemorySize
  2. The jmx exporter is very slow, no matter how much memory or CPU the server has. So the solution is to increase the timeouts for Prometheus and Kubernetes.

prometheus global config:

global:
  scrape_interval: 30s
  scrape_timeout: 30s
  evaluation_interval: 30s

k8s container liveness probe:

        livenessProbe:
            httpGet:
              path: /metrics
              port: 8080
            initialDelaySeconds: 60
            periodSeconds: 60
            timeoutSeconds: 60
            successThreshold: 1
            failureThreshold: 3

Enjoy :)

solsson (Contributor, Author) commented Sep 26, 2017

@yacut Thanks for the feedback. I noticed the default export just included everything, so I'll update the exporter config for brokers to your suggestion.

I'm surprised about the performance issue though. In the tests I ran, 3 seconds were sufficient, according to https://github.com/Yolean/kubernetes-kafka/blob/addon-metrics/test/metrics.yml#L80.

yacut commented Sep 26, 2017

@solsson I'm surprised too.

There are some issues about it:

I'm not an expert, but I guess the bigger the Kafka cluster (brokers/topics/partitions/message rate), the slower the responses. With our cluster size the responses take ~15-35 seconds 😟

I also saw that the jmx exporter responds very quickly if I stop the broker so that it no longer takes part in cluster replication but is still running for a bit.

solsson (Contributor, Author) commented Sep 26, 2017

Thanks for the background. This looks like a weakness in jmx_exporter. Before we dig deep here, it could be worth investigating whether there are other ways to get Prometheus-compliant metrics out of Kafka.

yacut commented Sep 26, 2017

There are not many exporters for Kafka: https://prometheus.io/docs/instrumenting/exporters/

For me the important metrics are:

  • broker life cycle (up/down/replica), which only the jmx exporter can do
  • message rate per topic, which also only the jmx exporter can do
  • consumer group lag, which two exporters can do
  • free disk space in the volume, which no exporter covers; only possible with the node exporter as of k8s v1.8 ;(
  • maybe also Java CPU and heap, again the jmx exporter

If you find another exporter, it would be great, but at the moment we have no choice...

yacut commented Oct 6, 2017

@solsson Performance improved from ~35-40 seconds to ~5-8 seconds per request by adding the settings ssl and whitelistObjectNames:

lowercaseOutputName: true
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5555/jmxrmi
ssl: false
whitelistObjectNames: ["kafka.server:*","java.lang:*"]
rules:
  - pattern : kafka.server<type=ReplicaFetcherManager, name=MaxLag, clientId=(.+)><>Value
  - pattern : kafka.server<type=BrokerTopicMetrics, name=(.+), topic=(.+)><>OneMinuteRate
  - pattern : kafka.server<type=KafkaRequestHandlerPool, name=RequestHandlerAvgIdlePercent><>OneMinuteRate
  - pattern : kafka.server<type=Produce><>queue-size
  - pattern : kafka.server<type=ReplicaManager, name=(.+)><>(Value|OneMinuteRate)
  - pattern : kafka.server<type=controller-channel-metrics, broker-id=(.+)><>(.*)
  - pattern : kafka.server<type=socket-server-metrics, networkProcessor=(.+)><>(.*)
  - pattern : kafka.server<type=Fetch><>queue-size
  - pattern : kafka.server<type=SessionExpireListener, name=(.+)><>OneMinuteRate
  - pattern : java.lang<type=OperatingSystem><>SystemCpuLoad
  - pattern : java.lang<type=Memory><HeapMemoryUsage>used
  - pattern : java.lang<type=OperatingSystem><>FreePhysicalMemorySize

Prometheus scrape settings are back to normal:

global:
  scrape_interval: 15s
  scrape_timeout: 15s

solsson added a commit that referenced this pull request Oct 6, 2017
through ssl=false and whitelist.

Thanks to @yacut, see #49
solsson (Contributor, Author) commented Oct 6, 2017

@yacut great find. Does the branch metrics-improve-scrape-times correspond to your config? I get speedy scrapes with it, and it contains the metrics I've looked for, except jmx_scrape_duration_seconds.

Have you had a look at the scrape config for zookeeper? I failed completely to extract meaningful metrics in #61.

k8s container liveness probe:

I assume this is for the metrics container, but I don't understand port 8080. Do you think it's worth the extra jmx runs to have this kind of liveness probe, given performance is an issue already?

yacut commented Oct 8, 2017

@solsson Basically yes, but I don't think the (.+) pattern is good for jmx exporter performance. I use it only where necessary, e.g. for the topic labels:

  - pattern : kafka.server<type=ReplicaManager, name=(PartitionCount|UnderReplicatedPartitions)><>Value
  - pattern : kafka.server<type=BrokerTopicMetrics, name=(BytesInPerSec|BytesOutPerSec|MessagesInPerSec), topic=(.+)><>OneMinuteRate

I believe the k8s container liveness probe is important, because if the jmx exporter can't respond anymore then it's useless. A one-minute liveness probe period should not be a problem if you use the whitelist config and only the metrics that are important to you.

In my humble opinion, the following metrics are important for zookeeper:

  • Alive connections: shows the number of brokers that have joined the cluster
  • Packets Sent Rate: shows the zookeeper liveness rate and who is the leader right now
  • Quorum Size: shows the zookeeper quorum config and the member id

More info here: https://zookeeper.apache.org/doc/r3.1.2/zookeeperJMX.html
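If it helps as a starting point, here is a rough, untested sketch of an exporter config limited to zookeeper's MBean domain; the attribute names follow the metrics listed above and are assumptions, and the patterns just copy the style of the Kafka config:

lowercaseOutputName: true
ssl: false
whitelistObjectNames: ["org.apache.ZooKeeperService:*"]
rules:
  # attribute names are assumptions taken from the zookeeperJMX page linked above
  - pattern : org.apache.ZooKeeperService<name0=(.+)><>(NumAliveConnections|PacketsSent|PacketsReceived|OutstandingRequests|QuorumSize)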

@solsson solsson changed the base branch from kafka-011 to master October 22, 2017 18:22
solsson added a commit that referenced this pull request Oct 22, 2017
solsson (Contributor, Author) commented Oct 22, 2017

I believe the k8s container liveness probe is important, because if the jmx exporter can't respond anymore then it's useless.

Suggested a liveness probe in e4fadac
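Roughly along these lines (a sketch inferred from the event log further down, which shows an HTTP probe against port 5556; the path and timings are assumptions and e4fadac may differ):

livenessProbe:
  httpGet:
    path: /liveness   # path as seen in the later probe-failure events
    port: 5556
  initialDelaySeconds: 60   # timings are guesses
  periodSeconds: 60
  timeoutSeconds: 30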

solsson (Contributor, Author) commented Nov 1, 2017

Confluent's release post for 1.0.0 mentions changes to metrics. Most of it, according to the release notes, is in Connect. For Kafka I found https://issues.apache.org/jira/browse/KAFKA-5341.

solsson (Contributor, Author) commented Nov 3, 2017

This is a great addition, but at +100M-150M memory per pod (+800M at the default scale) I'm a bit hesitant to merge. Will test more in #84.

solsson (Contributor, Author) commented Nov 4, 2017

I had always started from jmx-exporter's sample yaml for kafka, but it's much more enlightening to do as in metrics-experiment -- export everything.

To inspect the result I'm using:

metrics_save() {
  pod=$1
  # forward the exporter port from the pod to localhost
  kubectl -n kafka port-forward $pod 5556:5556 &
  sleep 1
  # time the scrape and save the response for inspection
  time curl -o "tmp-metrics-$pod-$(date +%FT%H%M%S).txt" -f -s http://localhost:5556/metrics
  # stop the port-forward (the most recent background job)
  kill %%
}
metrics_save kafka-0
metrics_save pzoo-0

Sample full kafka /metrics at https://gist.github.com/solsson/efb929260fd663a9e15e0ac8557c5028, zoo at https://gist.github.com/solsson/15e2bdce7c23b2d1c7aea0ef895900cb

solsson (Contributor, Author) commented Nov 7, 2017

I've been testing kafka on a cluster with quite busy nodes, and I'm having more problems with the metrics containers than with Kafka itself. I'm currently exporting more metrics than the committed conf, but with ssl=false.

  Normal   Pulled                 10m (x2 over 13m)   kubelet, gke-eu-west-3-b1-default-pool-b345de87-whj6  Container image "solsson/kafka-prometheus-jmx-exporter@sha256:40a6ab24ccac0ed5acb8c02dccfbb1f5924fd97f46c0450e0245686c24138b53" already present on machine
  Normal   Created                10m (x2 over 13m)   kubelet, gke-eu-west-3-b1-default-pool-b345de87-whj6  Created container
  Normal   Killing                10m                 kubelet, gke-eu-west-3-b1-default-pool-b345de87-whj6  Killing container with id docker://metrics:Container failed liveness probe.. Container will be killed and recreated.
  Normal   Started                10m (x2 over 13m)   kubelet, gke-eu-west-3-b1-default-pool-b345de87-whj6  Started container
  Warning  Unhealthy              3m (x10 over 12m)   kubelet, gke-eu-west-3-b1-default-pool-b345de87-whj6  Liveness probe failed: Get http://10.0.8.195:5556/liveness: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

I've also raised the memory limit to 200M. I think we must find a liveness probe that doesn't cost an additional round of JMX probing.

Or drop the liveness probes, and have the monitoring system alert on stale metrics.
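The latter could be as simple as alerting on failed scrapes; a sketch in the Prometheus 2.x rule-file format (the job label, alert name and threshold are assumptions):

groups:
- name: kafka-metrics
  rules:
  - alert: MetricsScrapeFailing
    expr: up{job="metrics"} == 0   # job label is an assumption
    for: 10m
    annotations:
      summary: 'jmx exporter on {{ $labels.instance }} has not been scraped successfully for 10 minutes'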

solsson added a commit that referenced this pull request Nov 9, 2017
Already included in #49, but here we don't add any exporter container to the pod.

Can be utilized by kafka-manager (#83) - just tick the JMX box when adding a cluster -
to see bytes in/out rates.
This was referenced Nov 10, 2017
@solsson solsson mentioned this pull request Dec 22, 2017
solsson added a commit that referenced this pull request Jan 19, 2018