-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-15129;[3/N] Remove metrics in AbstractFetcherManager when fetcher manager instance shutdown #13929
Conversation
…er manager instance shutdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with the other two PRs I have a preference for variable names to start with a lowercase unless there is a reason (i.e. a convention) for them to be with uppercase, but beyond that, this seems reasonable to me.
And please help to review this @divijvaidya |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for these code changes @hudeqi. I have seen cases of OOM caused by metric leaks in production and the changes you are making go a long way towards ensure safe code.
core/src/main/scala/kafka/server/ReplicaAlterLogDirsManager.scala
Outdated
Show resolved
Hide resolved
core/src/main/scala/kafka/server/ReplicaAlterLogDirsManager.scala
Outdated
Show resolved
Hide resolved
Are there any hidden problems in this PR? @divijvaidya |
I will get to all your PRs on metrics close by end of this week @hudeqi. Right now I am focusing on releasing 3.5.1. |
@divijvaidya hi, reminding~ |
I still need to get 3.5.1 out and focus on KIP-405 PRs (targeting 3.6 release) before I get some more time @hudeqi! |
Hi, do you have time to review this PR now? I want to combine all the currently opened PRs about metric leaks into one PR. I don’t know if this will help or burden your review? @divijvaidya |
…tric_replicaManager
@nizhikov Hi, can you help review the PR of this series? Looks like @divijvaidya doesn't have time. |
Hello. I will review this PR in the nearest time. |
@@ -31,6 +32,9 @@ abstract class AbstractFetcherManager[T <: AbstractFetcherThread](val name: Stri | |||
extends Logging { | |||
private val metricsGroup = new KafkaMetricsGroup(this.getClass) | |||
|
|||
// Visible for test | |||
private[server] var metricNamesToTags: Map[String, java.util.Map[String, String]] = Map.empty |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use mutable map like it's done for fetcherThreadMap
?
Or even Set<MetricName>
, see comment below
}, tags) | ||
private[server] def newMetrics(): Unit = { | ||
val tags = Map("clientId" -> clientId).asJava | ||
metricsGroup.newGauge(MaxLagMetricName, () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use
Set<MetricName> metrics = ...;
MetricName name = metricGroup.metricName(MaxLagMetricName, tags);
KafkaYammerMetrics.defaultRegistry().newGauge(name, () => { ... });
metrics.add(name);
And store explicit MetricName
instead of implicit name and tags.
metrics.forEach(KafkaYammerMetrics.defaultRegistry()::removeMetric);
`` `
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have metrigGroup#meter
and other methods that takes MetricName as parameter.
It's look convinient to create one for Gauge
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder, do we really need metricNamesToTags
set. It's just 4 metric so may be it will be more clear to create explicit MetricName
for each and use them while closing fetcher manager.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @nizhikov , thank you for taking the time to review. The significance of metricNamesToTags
is that it can be used in the unit test to check whether all the metrics of the corresponding module are removed when closed, to prevent someone from adding new metrics but forgetting remove, with this collection, you only need to add the new metric to this collection.
Removing the leaking metrics is a long and potentially fixed process, as can be seen from the original reviews reached with @divijvaidya.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems you can use MetricName
to check metrics registered and removed during test.
Hello. @hudeqi , thanks for the PR. |
This pr is used to remove the metrics in
AbstractFetcherManager
whenReplicaFetcherManager/ReplicaAlterLogDirsManager
instance shutdown.This pr has passed the corresponding unit test, and it is part of KAFKA-15129.