Skip to content

MINOR: fix checkpoint write failure warning log#6008

Merged
guozhangwang merged 4 commits intoapache:trunkfrom
vvcephei:fix-checkpoint-warning
Dec 9, 2018
Merged

MINOR: fix checkpoint write failure warning log#6008
guozhangwang merged 4 commits intoapache:trunkfrom
vvcephei:fix-checkpoint-warning

Conversation

@vvcephei
Copy link
Copy Markdown
Contributor

@vvcephei vvcephei commented Dec 6, 2018

We saw a log statement in which the cause of the failure to write a checkpoint was not properly logged.
This change logs the exception properly and also verifies the log message.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Dec 6, 2018

@guozhangwang @bbejeck , do you mind taking a look at this when you get the chance?

@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Dec 7, 2018

checkstyle failure... I'll look into it tomorrow.

Copy link
Copy Markdown
Member

@bbejeck bbejeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @vvcephei! Just a couple of minor comments otherwise looks good.

checkpoint.write(this.checkpointableOffsets);
} catch (final IOException e) {
log.warn("Failed to write offset checkpoint file to {}: {}", checkpoint, e);
if (log.isWarnEnabled()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we don't need log. isWarnEnabled() if we use
log.warn("Failed to write offset checkpoint file to [{}]", checkpoint, e); I pulled down the branch and tested this out.

The issue with the current log statement in trunk is an extra pair of {} for the exception parameter but that is handled automatically.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah. Thanks!

final LogCaptureAppender.Event lastEvent = messages.get(messages.size() - 1);

assertThat(lastEvent.getLevel(), is("WARN"));
assertThat(lastEvent.getMessage(), startsWith("process-state-manager-test Failed to write offset checkpoint file to [/tmp/"));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails on the mac on this line as the path is not /tmp but starts with /var/folders... by changing the assertion to startsWith("process-state-manager-test Failed to write offset checkpoint file to [ then the test passes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes perfect sense. I should have predicted that the "temporary file" directory is different on every OS.

final String[] throwableStrRep = event.getThrowableStrRep();
final Optional<String> throwableString;
if (throwableStrRep == null) {
throwableString = Optional.empty();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we should set a default value instead? If this is empty the test will fail from NoSuchElementException. I know you are aware of this, I'm just thinking about in the future about others using this class and having to test for isPresent() vs. get()

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly what I was going for. The type (Optional) documents that you have to check isPresent before get. It's just as true if we make the field nullable, but then it's not well documented, and therefore more vulnerable to NPEs.

In my test, I deliberately don't check isPresent because I expect it to be present. If it's not, we'll get a NoSuchElementException, which contains the exact same information as if we had an extra isPresent check that then throws an AssertionException(Expected a throwable).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you think it's better to add the isPresent assertion, I can do it; I just tend not to do such checks in tests, for the reason above.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, your arguments make sense, I'd leave it as is.

@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Dec 7, 2018

Thanks for the review, @bbejeck !

Copy link
Copy Markdown
Member

@bbejeck bbejeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this @vvcephei! LGTM

@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Dec 8, 2018

Unrelated java 8 failure:

kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup

Error Message
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.
Stacktrace
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
	at kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306)
	at kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89)

Retest this, please.

@vvcephei
Copy link
Copy Markdown
Contributor Author

vvcephei commented Dec 8, 2018

Unrelated java 11 failure:

kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup

Error Message
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.
Stacktrace
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.CoordinatorNotAvailableException: The coordinator is not available.
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:262)
	at kafka.admin.ConsumerGroupCommand$ConsumerGroupService.resetOffsets(ConsumerGroupCommand.scala:306)
	at kafka.admin.ResetConsumerGroupOffsetTest.testResetOffsetsNotExistingGroup(ResetConsumerGroupOffsetTest.scala:89)

Retest this, please.

Copy link
Copy Markdown
Contributor

@guozhangwang guozhangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @vvcephei !

@guozhangwang guozhangwang merged commit b492296 into apache:trunk Dec 9, 2018
@vvcephei vvcephei deleted the fix-checkpoint-warning branch December 11, 2018 15:02
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
We saw a log statement in which the cause of the failure to write a checkpoint was not properly logged.
This change logs the exception properly and also verifies the log message.

Reviewers: Bill Bejeck <bill@confluent.io>, Guozhang Wang <wangguoz@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants