
KAFKA-5238: BrokerTopicMetrics can be recreated after topic is deleted #4204

Closed
wants to merge 3 commits

Conversation

edoardocomar
Contributor

Avoid a DelayedFetch recreating the metrics when a topic has been deleted.

Developed with @mimaison.

Added a unit test borrowed from @ijuma's JIRA.
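For context, here is a self-contained sketch of the idea behind the guard; Meter, topicStats and liveTopics are simplified stand-ins chosen for illustration, not the actual BrokerTopicStats/ReplicaManager types:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Minimal model of the on-demand metrics problem and the guard this PR applies.
object DeletedTopicMetricsGuard {

  final class Meter {
    private val count = new AtomicLong()
    def mark(): Unit = count.incrementAndGet()
    def value: Long = count.get()
  }

  // Per-topic meters are created lazily, so marking a meter for a topic that
  // no longer exists would silently recreate it -- that is the leak.
  private val topicStats = new ConcurrentHashMap[String, Meter]()
  val allTopicsStats = new Meter

  // Stand-in for the broker's view of which topics it still hosts partitions for.
  private val liveTopics = ConcurrentHashMap.newKeySet[String]()

  def onPartitionAssigned(topic: String): Unit = {
    liveTopics.add(topic)
    topicStats.computeIfAbsent(topic, _ => new Meter)
  }

  def onTopicDeleted(topic: String): Unit = {
    liveTopics.remove(topic)
    topicStats.remove(topic)
  }

  // The guarded update: the aggregate meter is always marked, the per-topic
  // meter only while the topic is still known, so a DelayedFetch that completes
  // after deletion cannot resurrect the deleted topic's metrics.
  def recordFetchRequest(topic: String): Unit = {
    allTopicsStats.mark()
    if (liveTopics.contains(topic))
      Option(topicStats.get(topic)).foreach(_.mark())
  }
}
```

The window that remains is the gap between the contains check and the per-topic mark, which is the race discussed further down in this conversation.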

@asfgit

asfgit commented Nov 10, 2017

SUCCESS
8068 tests run, 5 skipped, 0 failed.
--none--

2 similar comments

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, any comments?

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, could anyone review, please?

@edoardocomar
Contributor Author

The reason we believe this fix is important is the leak caused by deleted topics' metrics: over a long enough time they build up, and we were flooding our Graphite backend with obsolete metrics.

@rajinisivaram
Contributor

@edoardocomar Sorry about the delay. The reason I was hesitant to commit this is that I wasn't sure whether there were any race conditions in the fix. For example, could a topic be deleted after the check for UNKNOWN_TOPIC_OR_PARTITION in handleFetchRequest, causing the metrics to be recreated? I haven't looked at the threads in detail, so I am not sure either way, but I would need to spend more time to be certain.

@edoardocomar
Contributor Author

Hi @rajinisivaram, yes, there could still be a race; however, the window for it is minimal, as the race is most likely to happen while there is a DelayedFetch in the purgatory. In fact, it was very easy to reproduce the issue.
After this fix we verified on our systems that the likelihood of the leak is minimal (i.e. we could not reproduce it any longer!)

@rajinisivaram
Contributor

@edoardocomar Yes, I agree that the fix makes things better than what we have currently, since it reduces the timing window. But it would be good to explore whether we can fix it properly without leaving any timing windows. I will take a look later today.

brokerTopicStats.topicStats(tp.topic).totalFetchRequestRate.mark()
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
} else {
info(s"Delayed fetch for deleted partition $tp")
Contributor

We should not do this. readFromLocalLog should not encode information about its callers; it should be the other way around.

Contributor Author

Thanks, the log message is bad!

@rajinisivaram
Contributor

@edoardocomar I had a look through the code and I think it should be possible to close the timing window altogether by creating metrics within one of the existing locks (locking should be required only when metrics for the topic are not available). I am not sure if it is worth the additional complexity. Perhaps @ijuma could comment, since he created this JIRA.

@ijuma
Contributor

ijuma commented Nov 22, 2017

I haven't looked at the complexity, but generally we try to fix the root cause and not just make it less likely to happen.

try {
if (allPartitions.contains(tp)) {
brokerTopicStats.topicStats(tp.topic).totalFetchRequestRate.mark()
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
Contributor

It seems that we can always call this one. It's only the topic variant that should not be called if the topic has been deleted.

Contributor Author

agreed

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram, any further observations? I agree that fixing the root cause is preferable; however, here the simplicity of reducing the time window from large to minimal appears to have paid off in practice (we're running with this patch).

@rajinisivaram
Contributor

I am ok with committing this and creating another JIRA to close the timing window later. @ijuma What do you think?

@edoardocomar
Contributor Author

@ijuma ?

@edoardocomar
Contributor Author

@ijuma @rajinisivaram ?

@edoardocomar
Contributor Author

retest this please

1 similar comment
@mimaison
Member

retest this please

@edoardocomar
Contributor Author

@ijuma ? @rajinisivaram @junrao ?

@mimaison
Member

Opened https://issues.apache.org/jira/browse/KAFKA-6495 for fully closing the timing window

@edoardocomar force-pushed the KAFKA-5238 branch 2 times, most recently from c2f9f20 to 2477776 on March 1, 2018 14:48
@edoardocomar
Contributor Author

Rebased on trunk to fix conflicts, so it can be merged cleanly again... any volunteers? @rajinisivaram

@rajinisivaram
Contributor

@ijuma Shall we merge this now since it is a simple fix and reduces the timing window? There is a separate JIRA (KAFKA-6495) to close the window properly later.

@edoardocomar
Contributor Author

Hi @ijuma, would you be ok to merge?

@edoardocomar
Contributor Author

Hi @rajinisivaram, thanks for finding the time to chat about this PR. What do we need to do to get it merged?

@ijuma, even if this is not a theoretical fix to the issue, it is a practical one: we have been running with this patch in production for months and have never experienced the leak again.

Please, if you do not want it merged, just say so and we will stop pursuing it; just do not leave us in limbo forever 😄

thanks

@edoardocomar
Contributor Author

Hi @ijuma @rajinisivaram we have rebased this old PR.

Please note that we have been running this patch on our production systems for over a year, and we no longer suffer from the leaks.
Admittedly, the scenario where this is an issue is when users fetch from short-lived topics that they create and destroy very often, while the brokers are not restarted often.

Please reconsider this patch, thanks.

@edoardocomar
Contributor Author

bump ... hi @ijuma @rajinisivaram ...

@edoardocomar
Contributor Author

bump ... trying others @vahidhashemian @dguy ?

@edoardocomar
Contributor Author

rebased and retested

@edoardocomar
Contributor Author

@hachikuji, would you like to take a look, as you're working on #8586?

val adjustedMaxBytes = math.min(fetchInfo.maxBytes, limitBytes)
try {
brokerTopicStats.allTopicsStats.totalFetchRequestRate.mark()
if (allPartitions.contains(tp)) {
Contributor

Hmm... does this solve the issue or just make it less likely? Does anything protect against a call to stopReplica between this check and the metric update below?

It does seem this is related to #8586. Since we currently update the metric after completion of the DelayedFetch, this case actually seems likely to be hit in practice: we would just need a fetch in the purgatory when the call to stopReplica is received. However, after #8586, I guess it becomes less likely? Still, it would be nice to think through a complete fix.

Contributor

Yeah, this is the reason this PR got stuck a bit. @rajinisivaram had said:

I had a look through the code and I think it should be possible to close the timing window altogether by creating metrics within one of the existing locks (locking should be required only when metrics for the topic is not available). I am not sure if it is worth the additional complexity.

Contributor

@hachikuji commented May 1, 2020

I think the problem here is that the metric is created on demand. We need to tie it to partition lifecycles a little more closely. The thought I had is basically just to create the topic metric whenever we receive a LeaderAndIsr request so that creation/deletion are protected by replicaStateChangeLock. We can then just ignore updates to the metric if it doesn't exist rather than letting it be recreated.

Contributor Author

This PR makes the issue less likely to happen. It's not bulletproof, but we've been using this patch in production for years and it has worked great for us. Heuristic and pragmatic...
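For illustration, a rough sketch of the lifecycle-tied approach @hachikuji outlines above; the names (stateChangeLock, onBecomeLeaderOrFollower, onStopReplicaWithDelete) are assumptions made for this sketch, not the real ReplicaManager API:

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicLong

// Metrics are created and removed only under a single state-change lock, and
// hot-path updates never create anything -- they are simply dropped once the
// metric is gone, so a late DelayedFetch cannot resurrect it.
object LifecycleTiedMetrics {

  final class Meter {
    private val count = new AtomicLong()
    def mark(): Unit = count.incrementAndGet()
  }

  private val stateChangeLock = new Object
  private val topicStats = new ConcurrentHashMap[String, Meter]()

  // Called when the broker receives a LeaderAndIsr request covering the topic.
  def onBecomeLeaderOrFollower(topic: String): Unit = stateChangeLock.synchronized {
    topicStats.computeIfAbsent(topic, _ => new Meter)
  }

  // Called when the broker is told to stop the replica with delete = true.
  def onStopReplicaWithDelete(topic: String): Unit = stateChangeLock.synchronized {
    topicStats.remove(topic)
  }

  // Hot path: no creation here; the update is silently ignored after deletion.
  def maybeMarkFetch(topic: String): Unit =
    Option(topicStats.get(topic)).foreach(_.mark())
}
```

The design trade-off is exactly the one discussed in this thread: creation and deletion become serialized on the state-change lock, at the cost of some extra complexity compared with the simple existence check in this PR.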

@benhannel

benhannel commented Jun 15, 2020

Perfect is the enemy of good.

edoardocomar and others added 3 commits June 18, 2020 14:36
Avoid a DelayedFetch recreating the metrics when a topic has been
deleted

Always tick global fetch request metric

Co-authored-by: Edoardo Comar <ecomar@uk.ibm.com>
Co-authored-by: Mickael Maison <mickael.maison@gmail.com>
@edoardocomar
Contributor Author

After @hachikuji's fixes in #8586, the metrics are no longer ticked at the end of a DelayedFetch, so the time window for topic deletion is almost non-existent, and the only guard code needed is left in KafkaApis, as shown by the unit test added by this PR.
This PR is now tiny and it would be nice to have it merged :-)
