#3761 Provide metrics to monitor certificates expiration #9861
Sorry, I was at KubeCon, so I only got to the PR now. I left some comments.
Sure, no worries at all :) Hope you had a good time! Hopefully I will join next year; I was at Kafka Summit this year.
I am struggling to understand what the latest graph represents.
@ppatierno The cert was renewed after 5 minutes, and the graph was just to show that the renewal is reflected in the metric. This graph is not included in any dashboard.
Hi, this is definitely useful. Thanks.
I tried the changes on Grafana 7.3.7, including running multiple Kafka clusters and deleting a cluster.
Issue: Expiry time set to epoch start
To reproduce the issue: create a Kafka cluster with metrics, wait for the new metric to appear in the operator dashboard, delete the whole cluster, and finally recreate the cluster with the same name.
Certificate content:
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            5f:25:69:2e:67:74:21:35:4e:e8:16:7a:1b:ad:43:01:7f:29:ba:7f
        Signature Algorithm: sha512WithRSAEncryption
        Issuer: O=io.strimzi, CN=cluster-ca v0
        Validity
            Not Before: Mar 25 17:51:58 2024 GMT
            Not After : Mar 25 17:51:58 2025 GMT
        Subject: O=io.strimzi, CN=my-cluster-kafka
Raw metric:
# HELP strimzi_certificate_expiration_ms Time in milliseconds when the certificate expires
# TYPE strimzi_certificate_expiration_ms gauge
strimzi_certificate_expiration_ms{cluster_name="my-cluster",kind="Kafka",namespace="test",selector="",} 0.0
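For reference, the HELP text above describes the gauge as the time in milliseconds when the certificate expires, so for the certificate shown it should presumably correspond to the Not After date rather than 0. A minimal Java sketch of that conversion, assuming the value is meant to be epoch milliseconds in UTC; `CertExpirySketch`, `notAfterToEpochMillis` and `looksUnset` are hypothetical names used for illustration only and are not Strimzi code:

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of Strimzi: illustrates the value the gauge
// would be expected to hold for a given "Not After" date.
final class CertExpirySketch {
    // Matches the openssl text form, e.g. "Mar 25 17:51:58 2025 GMT".
    // Assumption: the date always carries a literal GMT suffix.
    private static final DateTimeFormatter OPENSSL_DATE =
            DateTimeFormatter.ofPattern("MMM d HH:mm:ss yyyy 'GMT'", Locale.ENGLISH);

    // Converts the "Not After" text to epoch milliseconds (UTC).
    static long notAfterToEpochMillis(String notAfter) {
        return LocalDateTime.parse(notAfter, OPENSSL_DATE)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    // A gauge value of 0 maps to 1970-01-01T00:00:00Z, which is not a
    // plausible expiry for a freshly issued certificate.
    static boolean looksUnset(double gaugeValue) {
        return gaugeValue <= 0.0;
    }

    public static void main(String[] args) {
        long expected = notAfterToEpochMillis("Mar 25 17:51:58 2025 GMT");
        System.out.println("expected gauge value: " + expected);
        System.out.println("reported 0.0 looks unset: " + looksUnset(0.0));
    }
}
```

A check along the lines of `looksUnset` could back a test assertion that flags a series stuck at 0 after a cluster is recreated.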
Grafana dashboard: (screenshot not preserved)
Hello @fvaleri, thanks for your comments. I will look at them tomorrow. In essence, the problem is that ...
The screenshot lists only a Cluster CA. Is the Clients CA planned for a follow-up PR?
@steffen-karlsson I replied to the main open points. If you want to have a Zoom call or something to go through it together, feel free to ping me on Slack. Sometimes it's easier than through the GitHub messages.
Well, we will squash it when merging it, so I'm not sure it matters that much. But that is likely one of the things confusing the DCO signoff.
Well, wouldn't it be better to squash them now and provide a single meaningful commit message, instead of using GitHub's squash and merge, which will produce a very long commit message that even includes the other commit messages unrelated to this PR?
I don't know what happened. The commits are actually all mine apart from a few. I had to amend roughly the last half of mine because they did not include the sign-off message, hence the high number. While I was amending the messages, Git added the other commits, and I don't know why. I have tried to fix it but cannot seem to do so. The branch is fully up to date with the main branch, and the number of changed files makes sense for this PR. If you, @ppatierno or @scholzj, don't have any ideas either, the only solution I see is to create a new branch, diff all the changes, and create a new PR. I used the suggestions from the DCO action.
I guess Git happened ... I think that all of us had this kind of issue in the past.
I think creating a new PR from another branch might be an option. I think that would be completely fine. Alternatively, if you simply squash all of the commits and fix the commit message, it might work as well (but make sure to have a backup copy before you do that :-o)
I am fine with whichever solution works best for you: 1. a new PR, or 2. squashing and leaving one single commit message that makes sense for the overall PR.
…board as well as alert example. Signed-off-by: Steffen Karlsson <steffen.karlsson@maersk.com>
I squashed it all into one commit, @ppatierno. DCO is green and the tests will be soon :) The only thing missing is the last comment from @scholzj regarding ...
Sorry, two more things ... one is my fault and one isn't ...
…rator/assembly/KafkaAssemblyOperator.java Co-authored-by: Jakub Scholz <www@scholzj.com> Signed-off-by: Steffen Wirenfeldt Karlsson <steffen.karlsson@maersk.com>
Signed-off-by: maciej-tatarski <maciej.tatarski@maersk.com>
LGTM. Thanks.
/azp run regression
Azure Pipelines successfully started running 1 pipeline(s).
You are right, I had removed the aggregation and excluded some labels with a transformation. I have reverted that now: we don't really know which labels people will have, so it is better to just aggregate by labels that are always there. @fvaleri please check now, thanks!
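To illustrate the idea of aggregating only by labels that are always present, here is a small Java sketch that emulates such a grouping over in-memory samples. It is not the dashboard query itself. The label set (`cluster_name`, `kind`, `namespace`) is an assumption taken from the raw metric sample earlier in this thread, the choice of `max` as the aggregation is also an assumption, and `LabelAggregationSketch` is a hypothetical name:

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration, not Strimzi or Grafana code.
final class LabelAggregationSketch {
    // Assumed "always present" labels, taken from the raw metric sample.
    static final List<String> GROUP_BY = List.of("cluster_name", "kind", "namespace");

    // Keeps the maximum value per group and ignores every label that is
    // not listed in GROUP_BY (for example, environment-specific labels).
    static Map<List<String>, Double> maxBy(List<Map<String, String>> labels,
                                           List<Double> values) {
        Map<List<String>, Double> result = new LinkedHashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            Map<String, String> sample = labels.get(i);
            List<String> key = GROUP_BY.stream()
                    .map(label -> sample.getOrDefault(label, ""))
                    .collect(Collectors.toList());
            result.merge(key, values.get(i), Double::max);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> labels = List.of(
                Map.of("cluster_name", "my-cluster", "kind", "Kafka",
                       "namespace", "test", "pod", "operator-a"),
                Map.of("cluster_name", "my-cluster", "kind", "Kafka",
                       "namespace", "test", "pod", "operator-b"));
        System.out.println(maxBy(labels, List.of(1.0, 2.0)));
    }
}
```

Because extra labels such as the illustrative `pod` label are dropped before grouping, the result does not depend on which additional labels a given Prometheus setup attaches.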
LGTM. Thanks @maciej-tatarski and @steffen-karlsson.
LGTM. Thanks for the PR!
I was waiting for the commits storm to finish :-)
I didn't run it and trust Fede and Jakub on that, but I made another pass over the code and it makes sense to me.
Description
Implement a metric emitter for certificate expiration and a dashboard to monitor it.