KAFKA-3655; awaitFlushCompletion() in RecordAccumulator should always decrement flushesInProgress count #1315

zhuchen1018 · 2016-05-04T00:32:44Z

No description provided.


zhuchen1018 · 2016-05-04T00:33:25Z

@ijuma Do you have time to take a look? Thanks!

ijuma · 2016-05-04T00:51:51Z

Thanks for the PR. The change looks good. The usual question follows: can we include a test please?

zhuchen1018 · 2016-05-04T02:52:07Z

@ijuma It is possible to write a test case for this, but I don't feel its value is worth the extra code in this particular case. Adding unit tests for bug fixes is useful to make sure that we will not have the same bug again in the future, and it is generally meant for detecting bugs that are not evident from simply looking at the code.

However, the bug fixed in this patch is really straightforward and shouldn't happen again if anyone changes this method in the future. A unit test for this bug should validate that flushesInProgress is decremented in the event of InterruptedException, which is essentially testing the usage of finally rather than testing any Kafka-specific logic. If we should add tests for something as straightforward as this, we should probably add tests for a bunch of other methods that use finally to enforce something, which doesn't seem feasible. What do you think?
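
For readers following along, here is a minimal sketch of the pattern being discussed. It assumes the bug came down to a decrement that was skipped when the wait was interrupted. `FlushCounterSketch`, both `await...` method names, and the `CountDownLatch` standing in for pending batches are hypothetical illustrations, not Kafka's actual RecordAccumulator code.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the flush bookkeeping discussed in this PR.
// This is NOT Kafka's RecordAccumulator; names other than beginFlush,
// flushInProgress and flushesInProgress are made up for illustration.
class FlushCounterSketch {
    private final AtomicInteger flushesInProgress = new AtomicInteger(0);

    void beginFlush() {
        flushesInProgress.getAndIncrement();
    }

    boolean flushInProgress() {
        return flushesInProgress.get() > 0;
    }

    // Leaky variant: if await() throws InterruptedException,
    // the decrement below is never reached.
    void awaitWithoutFinally(CountDownLatch pending) throws InterruptedException {
        pending.await();
        flushesInProgress.decrementAndGet();
    }

    // Variant with finally: the decrement runs on both the normal
    // and the interrupted path.
    void awaitWithFinally(CountDownLatch pending) throws InterruptedException {
        try {
            pending.await();
        } finally {
            flushesInProgress.decrementAndGet();
        }
    }
}
```

With the leaky variant, an interrupted wait leaves `flushInProgress()` returning true even though no thread is flushing any more; with the `finally` variant the count returns to zero.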

ijuma · 2016-05-04T09:10:56Z

@zhuchen1018, The fact that the bug exists in the first place implies that a test is worth it. Why do you think we can't regress on this? The code could be refactored and anything can happen at that point. Unless the test is slow or particularly complex, I don't see the harm in adding it.

onurkaraman · 2016-05-04T18:25:07Z

btw the corresponding ticket is KAFKA-3655, not KAFKA-3651 as the PR title suggests.

zhuchen1018 · 2016-05-04T18:49:50Z

@ijuma Thanks for the explanation. I agree a test can be useful here. I am going to write a test for this particular patch. But I would like to discuss it a bit more to understand the best practice for adding tests in Kafka, which hopefully can encourage more contributions in the open source community.

I think the Kafka open source community should probably have a uniform standard for what should be tested. The standard should be enforced consistently, at least for new patches. A bug is a good signal of something that was missed by the original developers and reviewers; once a bug is detected, we should start to require a higher standard of tests from the original developers so that similar issues can be covered in the future. Does this sound reasonable?

For this particular patch, I suppose that we can add a test like this: call beginFlush, call awaitFlushCompletion, interrupt the thread, and verify that flushInProgress() == false. Say it is useful to test it; should we similarly test that appendsInProgress doesn't change before and after RecordAccumulator.append()? Should all future developers who use finally to guarantee that a statement is executed always verify this by writing tests for the exceptions that may be thrown?
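
The four steps proposed above (beginFlush, awaitFlushCompletion, interrupt, verify) could look roughly like the following. This is only a sketch against a hypothetical stand-in class, `AccumSketch`, and it interrupts a separate waiter thread rather than the test thread; the test actually added to RecordAccumulatorTest may be structured differently.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class InterruptedFlushCheck {

    // Hypothetical, minimal stand-in for the accumulator's flush bookkeeping.
    static final class AccumSketch {
        private final AtomicInteger flushesInProgress = new AtomicInteger(0);
        // Never counted down, so awaitFlushCompletion() blocks until interrupted.
        private final CountDownLatch pending = new CountDownLatch(1);

        void beginFlush() { flushesInProgress.getAndIncrement(); }

        boolean flushInProgress() { return flushesInProgress.get() > 0; }

        void awaitFlushCompletion() throws InterruptedException {
            try {
                pending.await();
            } finally {
                flushesInProgress.decrementAndGet();
            }
        }
    }

    // Returns true when the waiter saw InterruptedException and
    // no flush is reported as in progress afterwards.
    static boolean interruptedFlushIsCleanedUp() {
        final AccumSketch accum = new AccumSketch();
        final CountDownLatch started = new CountDownLatch(1);
        final AtomicBoolean sawInterrupt = new AtomicBoolean(false);

        accum.beginFlush();                       // step 1
        if (!accum.flushInProgress()) return false;

        Thread waiter = new Thread(() -> {
            started.countDown();
            try {
                accum.awaitFlushCompletion();     // step 2
            } catch (InterruptedException e) {
                sawInterrupt.set(true);
            }
        });
        waiter.setDaemon(true);
        waiter.start();
        try {
            if (!started.await(5, TimeUnit.SECONDS)) return false;
            // Step 3: the interrupt lands before or during await();
            // in both cases await() throws InterruptedException.
            waiter.interrupt();
            waiter.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // Step 4: the counter must be back to zero.
        return sawInterrupt.get() && !accum.flushInProgress();
    }

    public static void main(String[] args) {
        System.out.println("cleaned up after interrupt: " + interruptedFlushIsCleanedUp());
    }
}
```

Running the wait on its own thread keeps the check independent of any fixed delay, at the cost of a little more setup than interrupting the current thread.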

ijuma · 2016-05-04T19:07:42Z

@zhuchen1018 I agree that it would be good to be consistent around this. Your description regarding tests for bugs sounds reasonable to me. We should also do better on requiring good test coverage for new work (it's a process and it will take some time before we have a clear process for this).

For this PR, I'm happy if you add the first test you described.

As you said in another comment, we could write tests for a lot of methods if we want to ensure that our finally works correctly. It's also a bit subjective what is worth testing and what isn't. However, a simple rule is that if we have a bug for something, we should include a test with the fix unless there is a good reason not to (test is brittle, takes too long, too complex, etc.). I think this is a fair rule and it doesn't overburden contributors.

zhuchen1018 · 2016-05-04T20:27:56Z

@ijuma Sure, I certainly agree that "add a test for a bug" is a good and common principle, because the fact that there is a bug indicates it was probably hard to detect. But I feel that if we can't require the developers who originally wrote this code to add tests for finally, it probably indicates that the benefit of these tests is not worth the developer's time; and if it is not worth the time of the developer who wrote the feature, then we shouldn't require contributors who fix the bug to write the test. It would be a fair rule to hold developers of the feature to no lower a standard than those who help fix their code, right?

It would be good to take the cost of time into account and be consistent about a standard for tests, since there will be a growing number of these bug fixes moving forward. The argument I make here is about long-term development rather than an attempt to avoid adding a test for a specific bug :)

ijuma · 2016-05-04T23:09:18Z

@zhuchen1018 I think we're going in circles a bit. I'll leave some thoughts and if you want to discuss the topic more generally, then please start a mailing list thread. So here it goes:

I think focusing on a language construct (finally) is perhaps the wrong way to think about it. Let's describe the issue as an interruptible method. It is a good idea to have a test that causes the method to be interrupted because this is a common source of bugs (examples are both the issue in this PR and in ByteBuffer.allocate where there were actually multiple bugs and PRs that you were involved with).
When someone contributes a new feature, they contribute a bunch of tests and the reviewer tries to ensure that the test coverage is good. This is a bit of a subjective thing at the moment as we don't yet use coverage tools (and they have limits as well). Things will be missed, that's just the nature of it.
When someone contributes a bug fix, there is evidence that a test would have been useful so we take that into account.
When someone files a PR, it's worth being aware that reviewer time is also limited. Your PR is more likely to be merged faster if it fits the project guidelines.
I hope that instead of a growing number of bug fixes, we write code with fewer bugs. We will see!

OK, that's all I have to say for now. :)

zhuchen1018 · 2016-05-06T17:55:58Z

@ijuma Thanks again for the review and explanation. I have added a unit test. Can you take a look?

ijuma · 2016-05-06T19:35:09Z

clients/src/test/java/org/apache/kafka/clients/producer/internals/RecordAccumulatorTest.java

+
+        accum.beginFlush();
+        assertTrue(accum.flushInProgress());
+        delayedInterrupt(Thread.currentThread(), 2000L);


Is 1 second enough?

@ijuma Are you talking about the maxBlockTimeMs? It should not matter since buffer should be allocated immediately for accum.append(...), right?

I was talking about the delayed interrupt time. Ideally, the shorter it is, the faster the test terminates. So I thought we could use 1s instead of 2s.

I see. 1 sec should be enough. Fixed now.
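
The body of `delayedInterrupt` is not shown in this thread. A helper with that signature could plausibly look like the sketch below, which also illustrates why the delay sets a lower bound on how long the test runs. `DelayedInterruptSketch` and `blockUntilInterrupted` are hypothetical names introduced for illustration only.

```java
import java.util.concurrent.CountDownLatch;

class DelayedInterruptSketch {

    // Assumed shape of the helper: interrupt `target` after roughly `delayMs`.
    static Thread delayedInterrupt(final Thread target, final long delayMs) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(delayMs);
                target.interrupt();
            } catch (InterruptedException e) {
                // The helper itself was interrupted; leave the target alone.
            }
        });
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Blocks the current thread until the delayed interrupt arrives and
    // returns the elapsed time in milliseconds.
    static long blockUntilInterrupted(long delayMs) {
        long start = System.nanoTime();
        delayedInterrupt(Thread.currentThread(), delayMs);
        try {
            new CountDownLatch(1).await(); // never counted down
        } catch (InterruptedException e) {
            // Expected: this is the delayed interrupt.
        }
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        System.out.println("blocked for ~" + blockUntilInterrupted(1000L) + " ms");
    }
}
```

Because the blocked thread cannot resume before the interrupt fires, a shorter delay makes the test finish sooner, which is the point of preferring 1 second over 2 seconds here.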

ijuma · 2016-05-06T19:37:02Z

Thanks @zhuchen1018, a couple of minor comments. Looks good otherwise. Will merge as soon as they are addressed.

ijuma · 2016-05-06T20:44:44Z

LGTM

… decrement flushesInProgress count Author: Chen Zhu <amandazhu19620701@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk> Closes #1315 from zhuchen1018/KAFKA-3655 (cherry picked from commit 717eea8) Signed-off-by: Ismael Juma <ismael@juma.me.uk>

… decrement flushesInProgress count Author: Chen Zhu <amandazhu19620701@gmail.com> Reviewers: Ismael Juma <ismael@juma.me.uk> Closes apache#1315 from zhuchen1018/KAFKA-3655


KAFKA-3651; awaitFlushCompletion() in RecordAccumulator should always…

e7533f5

… decrement flushesInProgress count

zhuchen1018 changed the title ~~KAFKA-3651; awaitFlushCompletion() in RecordAccumulator should always decrement flushesInProgress count~~ KAFKA-3655; awaitFlushCompletion() in RecordAccumulator should always decrement flushesInProgress count May 4, 2016

Add unit test

0c52f3c

ijuma reviewed May 6, 2016

zhuchen1018 added 2 commits May 6, 2016 12:47

Address review

5108b3e

Address review

c2cec8e

asfgit closed this in 717eea8 May 6, 2016

efeg added a commit to efeg/kafka that referenced this pull request May 29, 2024

Add relevant interfaces for detecting maintenance events along with n…

a637943

…oop implementation. (apache#1315)
