Conversation
Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA. It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
Force-pushed from 7733844 to 0a91751
Also changes the readiness probe to match the new one upstream
Force-pushed from 0a91751 to 3ad70c0
Can you bump the version number to
Not a problem, can totally do that. That said, it also seems like the standard in other charts is for metrics to be disabled by default, so I'll get that flipped too.
After looking through some of the metrics this is providing, I'm not 100% sure these are necessarily the correct metrics to include in the chart (i.e., I think there are metrics that should be included but are not). Based on that, I think we should put this on hold for right now. I'd hate to get something merged upstream only for it to not actually solve the problem it aimed to.
So I agree that metrics would be super nice, and I've been working on it as well. A few things to bring up: (1) the whitelist/pattern is removing a lot of useful stats; I think it's too limiting. Also, I'd gladly contribute, but I'm not quite ready for a PR; my code's working but still needs some love.
@bradenwright Regarding the other points:
@nrmitchi cool, that's where I started too. There ended up being a number of metrics I thought were overkill, so I was going to blacklist some stats but make it configurable. (3) I'm more than happy to contribute that part; I'm hoping to have it all done by Monday, if not earlier. I'm more than happy to collaborate however you want on this; I think it will work out well that you did it with Prometheus and I did it with Prometheus Operator.
Code-wise too, if you or anyone wanted to compare, I can push my code running as a sidecar instead of a java agent for comparison (if wanted); again, I'll have a clean PR by Monday, maybe earlier.
Updated to include both the jmx exporter and the kafka-exporter referenced at https://prometheus.io/docs/instrumenting/exporters/. Considering adding a configurable Burrow (https://github.com/linkedin/Burrow) deployment as well. Also planning to swap out the custom image currently in use for the base jmx_exporter (https://github.com/prometheus/jmx_exporter).
@bradenwright I would be interested in seeing your sidecar approach, especially if it works with prometheus-operator.
@nrmitchi What is the idea behind the change in the readiness probe? I found this pull request after running into an issue with the readiness probe when enabling JMX on the original chart. I added the following to the values file:
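A minimal sketch of what enabling JMX in a values override typically looks like; the key names below are assumptions for illustration, not necessarily this chart's exact schema:

```yaml
# Illustrative values override; key names are assumptions, not the chart's exact schema
metrics:
  jmx:
    enabled: true
    port: 5555   # JMX port the exporter connects to
```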
I got the following issue when running the readiness probe:
And I fixed it by changing the readiness probe in the chart to this:
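A probe of roughly this shape, assuming the standard broker listener port, would sidestep a JMX port conflict; this is a sketch, not necessarily the commenter's exact change:

```yaml
# Illustrative readiness probe; port and timings are assumptions
readinessProbe:
  tcpSocket:
    port: 9092          # assumed Kafka broker listener port
  initialDelaySeconds: 30
  timeoutSeconds: 5
```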
I don't know kafka well enough to know which one would be the most meaningful.
@NicolasTr a member of my team did a lot of the initial work in getting the jmx exporter up and running, but my understanding is that the readiness probe change was taken from the upstream (non-helm) kubernetes-kafka project, and the pre-existing one didn't work due to a port conflict. I'm honestly not entirely sure it is the best readinessProbe, but again I'm trusting the change upstream (https://github.com/Yolean/kubernetes-kafka/blob/master/kafka/50kafka.yml#L63).
I still need to do a little cleanup, and I'm currently tweaking jmx-exporter-configmap.yaml to produce stats that are more appropriate for Prometheus, but it should be fine to discuss the things above.
@nrmitchi the readiness probe is a good one; we're going to merge another PR from @t-d-d that already has this change included in it. If possible, it would be nice to hone this PR down to just the changes around JMX and metrics export.

My initial thoughts are that I prefer the sidecar method to modifying the core container to export metrics. Sidecars are a fairly common paradigm in the community generally, and are typically recommended for this kind of metrics-export operation from what I've seen. Full disclosure: @bradenwright and I are coworkers at SpotHero.

The final thought I have: I think the zookeeper metrics are great, but my initial reaction is that they likely belong with zookeeper. Although many users will deploy a zookeeper cluster with a kafka cluster, that isn't universally true. I'd prefer to encourage people to think about and package metrics with the component to which they belong. What do you think @nrmitchi? How about others on this PR, @bradenwright @t-d-d, do you have any opinions given your time using Kafka?
A couple of questions/comments:
```yaml
port: 5555

# Rules to apply to the Kafka JMX Exporter
kafkaConfig:
```
do you think we could drop this key and tab everything over?
I don't think so, due to the way the key is used; i.e., the blob is injected as `{{ toYaml .Values.metrics.jmx.kafkaConfig | indent 4 }}`. If we were to drop the key and unindent, we'd end up including all of the other `.metrics.jmx` values in the `jmx-kafka-prometheus.yml` configuration. I'm not sure if that would cause the jmx exporter to error on startup, but it seems unclean either way.
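To make the injection concrete, here is a minimal sketch of how a dedicated `kafkaConfig` subtree keeps sibling `.metrics.jmx` values out of the rendered exporter config; the file layout and surrounding keys are illustrative, not the chart's exact files:

```yaml
# values.yaml (illustrative)
metrics:
  jmx:
    enabled: false
    port: 5555
    kafkaConfig:                 # only this subtree is rendered into the exporter config
      jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5555/jmxrmi
      ssl: false

# templates/jmx-exporter-configmap.yaml (illustrative)
# data:
#   jmx-kafka-prometheus.yml: |
# {{ toYaml .Values.metrics.jmx.kafkaConfig | indent 4 }}
```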
ah yeah, of course 👍
```yaml
jmxUrl: service:jmx:rmi:///jndi/rmi://127.0.0.1:5555/jmxrmi
ssl: false
whitelistObjectNames: ["kafka.server:*", "kafka.controller:*", "java.lang:*"]
# rules:
```
Perhaps these should just be set as defaults (i.e. not commented out), what do you think?
My reasoning for leaving them commented out was that including them excludes a lot of metrics. I'm personally of the opinion that the default should include everything, allowing a user to exclude stuff if necessary, rather than hide metrics by default and require a user to figure out why they are missing and how to expose them. Articles and guides about monitoring typically reference specific metrics, and it can be confusing to follow them if the metrics that should be there happen to be hidden by default.
I left them in and commented mostly as a reference in case someone did want to write their own rules, but if the choice is between uncommenting, or removing entirely, I would err on the side of "just remove them"
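For reference, jmx_exporter `rules` entries are pattern/name rewrites of MBean attributes; a single illustrative rule follows (the pattern here is an example, not one of the chart's commented-out defaults):

```yaml
# Illustrative jmx_exporter rule; the pattern is an example only
rules:
  - pattern: kafka.server<type=(.+), name=(.+)><>Value
    name: kafka_server_$1_$2
    type: GAUGE
```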
Seems reasonable to me 👍
/ok-to-test
@nrmitchi if we clean up the jmx port definition I think we're good to go. Let's use the structure you've defined, remove the one at the top level, and update the docs to remove that.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: benjigoldberg, nrmitchi. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
@nrmitchi thanks much for your contribution and working through all the questions/requests with us!
@nrmitchi I am trying to install the kafka chart after this merge. Now I get this message: `Error: render error in "kafka/templates/statefulset.yaml": template: kafka/templates/statefulset.yaml:174:28: executing "kafka/templates/statefulset.yaml" at <{{template "kafka.co...>: template "kafka.configmap" not defined`. I hard-coded the configmap name to "kafka-metrics" to unblock myself. Please test it.
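For anyone hitting the same error: it means a `{{ template "kafka.configmap" }}` call has no matching named template. As a hypothetical illustration of the mechanism (the name and body below are assumptions, not the fix that was actually merged), a named template is declared with `define`, typically in `_helpers.tpl`:

```yaml
# templates/_helpers.tpl (illustrative; template name and body are assumptions)
{{- define "kafka.configmap" -}}
{{ .Release.Name }}-metrics
{{- end -}}
```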
@nrmitchi I run a prometheus/grafana cluster using the charts here: https://github.com/camilb/prometheus-kubernetes
@aruneli the conflict seems to be with this PR, which was merged between when this branch was created and when it was merged: ff9f02d#diff-e4513710c9e51c6f539000cb48ad85f0. Putting something up to fix it.
@nrmitchi Thanks. Could you also please check my other comment on how to get a running prometheus to scrape data from the kafka exporter?
@aruneli It would be similar to how you have configured prometheus to scrape other pods. That seems like a better question for https://github.com/camilb/prometheus-kubernetes
How you scrape the metrics would be entirely up to you; they are exposed, but I'm not sure how that particular chart determines its targets. This chart now adds annotations to the statefulset if you enable the jmx metrics (and to the kafka-exporter if you enable that), so it should work out of the box if your prometheus config uses service discovery.
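The annotations in question follow the common Prometheus discovery convention; an illustrative shape is below (the port value is an assumption, check the chart's values for the real one):

```yaml
# Illustrative pod annotations for Prometheus service discovery
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "5556"   # assumed jmx-exporter HTTP port
```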
@nrmitchi thanks for putting this together, I'll make sure to open a new PR to add support for Prometheus Operator.
@bradenwright Thanks
In case anyone is following here, I opened a PR for the Prometheus Operator changes: #5120
* Helm-ify the JMX exporter that was added to the base project; also changes the readiness probe to match the new one upstream
* Increment version in Chart.yaml
* Bump Chart.yaml version and make metrics off by default
* Template out jmx rules, allowing people to whitelist if they want
* Pull a bit more into the config
* lint
* Fix value from copy/paste error
* Add metric export option using the recommended kafka metric exporter as well
* Remove zookeeper metrics
* Update kafka-exporter default
* Add back additionalPorts that I accidentally nixed in merge
* Parameterize jmx port and add new values to readme
* Fix readme formatting
* Remove old jmxPort from values.yaml
Signed-off-by: voron <av@arilot.com>
What this PR does / why we need it:
Currently this chart does not expose jmx metrics to prometheus. This PR is basically a helm-ification of the changes at: Yolean/kubernetes-kafka@5a2b8c7, which adds an exporter.
@faraazkhan @h0tbird @benjigoldberg