
KAFKA-5238: BrokerTopicMetrics can be recreated after topic is deleted #4204

Closed
wants to merge 3 commits

Conversation

edoardocomar
Contributor

Avoid a DelayedFetch recreating the metrics when a topic has been deleted.

Developed with @mimaison.

Added a unit test borrowed from @ijuma's JIRA.
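For context, here is a self-contained sketch of the idea behind the guard; Meter, topicStats and liveTopics are simplified stand-ins chosen for illustration, not the actual BrokerTopicStats/ReplicaManager types:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Minimal model of the on-demand metrics problem and the guard this PR applies.
object DeletedTopicMetricsGuard {

  final class Meter {
    private val count = new AtomicLong()
    def mark(): Unit = count.incrementAndGet()
    def value: Long = count.get()
  }

  // Per-topic meters are created lazily, so marking a meter for a topic that
  // no longer exists would silently recreate it -- that is the leak.
  private val topicStats = new ConcurrentHashMap[String, Meter]()
  val allTopicsStats = new Meter

  // Stand-in for the broker's view of which topics it still hosts partitions for.
  private val liveTopics = ConcurrentHashMap.newKeySet[String]()

  def onPartitionAssigned(topic: String): Unit = {
    liveTopics.add(topic)
    topicStats.computeIfAbsent(topic, _ => new Meter)
  }

  def onTopicDeleted(topic: String): Unit = {
    liveTopics.remove(topic)
    topicStats.remove(topic)
  }

  // The guarded update: the aggregate meter is always marked, the per-topic
  // meter only while the topic is still known, so a DelayedFetch that completes
  // after deletion cannot resurrect the deleted topic's metrics.
  def recordFetchRequest(topic: String): Unit = {
    allTopicsStats.mark()
    if (liveTopics.contains(topic))
      Option(topicStats.get(topic)).foreach(_.mark())
  }
}
```

The window that remains is the gap between the contains check and the per-topic mark, which is the race discussed further down in this conversation.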

@asfgit

asfgit commented Nov 10, 2017

SUCCESS
8068 tests run, 5 skipped, 0 failed.
--none--

2 similar comments

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, any comments?

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, could anyone review, please?

@edoardocomar
Contributor Author

The reason we believe this fix is important is the leak caused by deleted topics' metrics: over a long enough time they build up, and we were flooding our Graphite backend with obsolete metrics.

@rajinisivaram
Contributor

@edoardocomar Sorry about the delay. The reason I was hesitant to commit this is that I wasn't sure whether there were any race conditions in the fix. For example, could a topic be deleted after the check for UNKNOWN_TOPIC_OR_PARTITION in handleFetchRequest, causing the metrics to be recreated? I haven't looked at the threads in detail, so I am not sure either way, but I would need to spend more time to be certain.

@edoardocomar
Contributor Author

Hi @rajinisivaram, yes, there could still be a race; however, the window for it is minimal, as the race is most likely to happen while there is a DelayedFetch in the purgatory. In fact, it was very easy to reproduce the issue.
After this fix we verified on our systems that the likelihood of the leak is minimal (i.e. we could not reproduce it any longer!)

@rajinisivaram
Contributor

@edoardocomar Yes, I agree that the fix makes things better than what we have currently, since it reduces the timing window. But it would be good to explore whether we can fix it properly without leaving any timing windows. I will take a look later today.

brokerTopicStats.topicStats(tp.topic).totalFetchRequestRate.mark()
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
} else {
info(s"Delayed fetch for deleted partition $tp")
Contributor

We should not do this. readFromLocalLog should not encode information about its callers; it should be the other way around.

Contributor Author

Thanks, the log message is bad!

@rajinisivaram
Contributor

@edoardocomar I had a look through the code and I think it should be possible to close the timing window altogether by creating metrics within one of the existing locks (locking should be required only when metrics for the topic are not available). I am not sure if it is worth the additional complexity. Perhaps @ijuma could comment, since he created this JIRA.

@ijuma
Contributor

ijuma commented Nov 22, 2017

I haven't looked at the complexity, but generally we try to fix the root cause and not just make it less likely to happen.

try {
if (allPartitions.contains(tp)) {
brokerTopicStats.topicStats(tp.topic).totalFetchRequestRate.mark()
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
Contributor

It seems that we can always call this one. It's only the topic variant that should not be called if the topic has been deleted.

Contributor Author

agreed

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, any further observations? I agree that fixing the root cause is preferable; however, here the simplicity of reducing the time window from large to minimal appears to have paid off in practice (we're running with this patch).

@rajinisivaram
Contributor

I am ok with committing this and creating another JIRA to close the timing window later. @ijuma What do you think?

@edoardocomar
Contributor Author

@ijuma ?

@edoardocomar
Contributor Author

@ijuma @rajinisivaram ?

@edoardocomar
Contributor Author

retest this please

1 similar comment
@mimaison
Member

retest this please

@edoardocomar
Contributor Author

@ijuma ? @rajinisivaram @junrao ?

@mimaison
Member

Opened https://issues.apache.org/jira/browse/KAFKA-6495 for fully closing the timing window

@edoardocomar force-pushed the KAFKA-5238 branch 2 times, most recently from c2f9f20 to 2477776 on March 1, 2018 14:48
@edoardocomar
Contributor Author

Rebased on trunk to fix conflicts, so it can be merged cleanly again... any volunteers? @rajinisivaram

@rajinisivaram
Contributor

@ijuma Shall we merge this now since it is a simple fix and reduces the timing window? There is a separate JIRA (KAFKA-6495) to close the window properly later.

@edoardocomar
Contributor Author

Hi @ijuma, would you be ok to merge?

@edoardocomar
Contributor Author

Hi @rajinisivaram, thanks for finding the time to chat about this PR. What do we need to do to get it merged?

@ijuma, even if this is not a theoretical fix to the issue, it is a practical one: we have been running with this patch in production for months and have never experienced the leak again.

Please, if you do not want it merged, just say so and we will stop pursuing it; just do not leave us in limbo forever 😄

thanks

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram we have rebased this old PR.

Please note that we have been running this patch on our production systems for over a year, and we no longer suffer from the leaks.
Admittedly, the scenario where this is an issue is when users fetch from short-lived topics that they create and destroy very often, while the brokers are not restarted often.

Please reconsider this patch, thanks.

@edoardocomar
Contributor Author

bump ... hi @ijuma @rajinisivaram ...

@edoardocomar
Contributor Author

bump ... trying others @vahidhashemian @dguy ?

@edoardocomar
Contributor Author

rebased and retested

@edoardocomar
Contributor Author

@hachikuji, would you like to take a look, as you're working on #8586?

val adjustedMaxBytes = math.min(fetchInfo.maxBytes, limitBytes)
try {
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
if (allPartitions.contains(tp)) {
Contributor

Hmm... does this solve the issue or just make it less likely? Does anything protect against a call to stopReplica between this check and the metric update below?

It does seem this is related to #8586. Since we currently update the metric after completion of the DelayedFetch, this case actually seems likely to be hit in practice: we would just need a fetch in the purgatory when the call to stopReplica is received. However, after #8586, I guess it becomes less likely? Still, it would be nice to think through a complete fix.

Contributor

Yeah, this is the reason this PR got stuck a bit. @rajinisivaram had said:

I had a look through the code and I think it should be possible to close the timing window altogether by creating metrics within one of the existing locks (locking should be required only when metrics for the topic is not available). I am not sure if it is worth the additional complexity.

Contributor

@hachikuji commented May 1, 2020

I think the problem here is that the metric is created on demand. We need to tie it to partition lifecycles a little more closely. The thought I had is basically just to create the topic metric whenever we receive a LeaderAndIsr request so that creation/deletion are protected by replicaStateChangeLock. We can then just ignore updates to the metric if it doesn't exist rather than letting it be recreated.

Contributor Author

This PR makes the issue less likely to happen. It's not bulletproof, but we've been using this patch in production for years and it has worked great for us. Heuristic and pragmatic...
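For illustration, a rough sketch of the lifecycle-tied approach @hachikuji outlines above; the names (stateChangeLock, onBecomeLeaderOrFollower, onStopReplicaWithDelete) are assumptions made for this sketch, not the real ReplicaManager API:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Metrics are created and removed only under a single state-change lock, and
// hot-path updates never create anything -- they are simply dropped once the
// metric is gone, so a late DelayedFetch cannot resurrect it.
object LifecycleTiedMetrics {

  final class Meter {
    private val count = new AtomicLong()
    def mark(): Unit = count.incrementAndGet()
  }

  private val stateChangeLock = new Object
  private val topicStats = new ConcurrentHashMap[String, Meter]()

  // Called when the broker receives a LeaderAndIsr request covering the topic.
  def onBecomeLeaderOrFollower(topic: String): Unit = stateChangeLock.synchronized {
    topicStats.computeIfAbsent(topic, _ => new Meter)
  }

  // Called when the broker is told to stop the replica with delete = true.
  def onStopReplicaWithDelete(topic: String): Unit = stateChangeLock.synchronized {
    topicStats.remove(topic)
  }

  // Hot path: no creation here; the update is silently ignored after deletion.
  def maybeMarkFetch(topic: String): Unit =
    Option(topicStats.get(topic)).foreach(_.mark())
}
```

The design trade-off is exactly the one discussed in this thread: creation and deletion become serialized on the state-change lock, at the cost of some extra complexity compared with the simple existence check in this PR.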

@benhannel

benhannel commented Jun 15, 2020

Perfect is the enemy of good.

edoardocomar and others added 3 commits June 18, 2020 14:36
Avoid a DelayedFetch recreating the metrics when a topic has been
deleted

Always tick global fetch request metric

Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
@edoardocomar
Contributor Author

After @hachikuji's fixes in #8586, the metrics are no longer ticked at the end of a DelayedFetch, so the time window for topic deletion is almost non-existent, and the only guard code needed is left in KafkaApis, as shown by the unit test added by this PR.
This PR is now tiny and it would be nice to have it merged :-)
