KAFKA-9939; Fix overcounting delayed fetches in request rate metrics #8586
The failure in the test case is a problem with the isolation of metrics between test cases. I will submit a fix shortly.
 * This test also verifies counts of fetch requests recorded by the ReplicaManager
 */
@Test
def testReadFromLog(): Unit = {
I decided to get rid of this. I believe it is already covered by test cases in ReplicaManagerTest. See for example, testFetchBeyondHighWatermarkReturnEmptyResponse.
That test name is a bit misleading. Should we name it testFetchBeyondHighwatermarkForConsumerAndFollower or something?
Yeah, that's fair. How about just testFetchBeyondHighwatermark?
That's fine. Is it worth adding a brief comment stating what it's testing?
Borderline overkill I guess, but I added a few comments explaining the test behavior.
Hi @hachikuji, as you're working on this, would you mind taking a look at #4204? The test we added there still fails with this change of yours, without the workarounds we had suggested.
cc @mimaison
ijuma left a comment
Nice catch, LGTM. A question and a minor comment below.
mockTimer.advanceClock(11)
assertNotNull(purgatoryFetchResult.get)
assertEquals(Errors.NONE, purgatoryFetchResult.get.error)
assertMetricCount(2)
This would return 3 without the fix?
@edoardocomar Thanks for the comment. I agree it is related. Left a comment on #4204.
The 32 test failures on jdk11 were due to threads being left behind after a failure in
Fetches which hit purgatory are currently counted twice in fetch request rate metrics. This patch moves the metric update into fetchMessages so that they are only counted once.
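
To make the double counting concrete, here is a minimal toy model of the behavior described above. It is a sketch under stated assumptions, not Kafka's actual code: ToyBroker, readFromLog, expireDelayed, and the countInRead flag are hypothetical names invented for illustration. The only detail taken from this PR is that the metric update moves into fetchMessages. The model assumes a delayed fetch re-reads the log when it completes, which is why a counter placed in the read path fires twice.

```scala
// Toy model only: these classes are hypothetical and are NOT Kafka's ReplicaManager.
final class ToyBroker(countInRead: Boolean) {
  var fetchRequestCount = 0
  private var pending: List[() => Unit] = Nil

  // Reads from the log. Called once when the fetch arrives and again
  // when a delayed fetch completes.
  private def readFromLog(available: Int): Int = {
    if (countInRead) fetchRequestCount += 1 // old placement: runs on every read
    available
  }

  // Entry point for a single fetch request.
  def fetchMessages(available: Int, minBytes: Int): Unit = {
    if (!countInRead) fetchRequestCount += 1 // new placement: once per request
    val bytes = readFromLog(available)
    if (bytes < minBytes)
      pending ::= (() => { readFromLog(available); () }) // park in "purgatory"
  }

  // Simulates the timeout firing: every parked fetch re-reads the log.
  def expireDelayed(): Unit = {
    val toRun = pending
    pending = Nil
    toRun.foreach(_.apply())
  }
}

object DoubleCountDemo {
  // Sends one fetch that cannot be satisfied immediately, lets it expire,
  // and returns how many fetch requests the metric recorded.
  def countFor(countInRead: Boolean): Int = {
    val broker = new ToyBroker(countInRead)
    broker.fetchMessages(available = 0, minBytes = 1)
    broker.expireDelayed()
    broker.fetchRequestCount
  }

  def main(args: Array[String]): Unit = {
    println(s"metric in read path: ${countFor(countInRead = true)}")
    println(s"metric in fetchMessages: ${countFor(countInRead = false)}")
  }
}
```

In this model a single delayed fetch is recorded as 2 requests when the counter sits in the read path, and as 1 when it sits in the request entry point.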