GH-1259: Handle Failed Record Recovery
Resolves #1259

Previously if the recoverer in a `SeekToCurrentErrorHandler` or
`DefaultAfterRollbackProcessor` failed to recover a record, the
record could be lost; the `FailedRecordTracker` simply logged
the exception.

Change the `SeekUtils` to detect a failure in the recoverer (actually
any failure when determining if the failed record should be recovered)
and include the failed record in the seeks.

In this way the recovery will be attempted once more on each delivery
attempt.

**cherry-pick to 2.2.x**
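The `SeekUtils` change described above can be illustrated with a stdlib-only sketch. The class and method names below are illustrative stand-ins, not the actual spring-kafka API: the point is only that when the "skipper" (which invokes the recoverer) throws, the failed record is treated as not skipped, so it is included in the seeks and re-delivered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

// Illustrative sketch of the seek logic (not the real SeekUtils): only the
// first record is offered to the skipper; if the skipper/recoverer throws,
// the record is kept in the seeks so recovery is retried on redelivery.
public class SeekSketch {

	static List<String> seekTargets(List<String> records,
			BiPredicate<String, Exception> skipper, Exception cause) {
		List<String> toSeek = new ArrayList<>();
		boolean first = true;
		for (String record : records) {
			boolean skipped = false;
			if (first) {
				try {
					skipped = skipper.test(record, cause);
				}
				catch (Exception ex) {
					// Recoverer failed: do NOT skip; the record stays in
					// the seeks and recovery is attempted again next time.
					skipped = false;
				}
				first = false;
			}
			if (!skipped) {
				toSeek.add(record);
			}
		}
		return toSeek;
	}

	public static void main(String[] args) {
		List<String> records = List.of("rec0", "rec1");
		// Recoverer throws: the failed record must remain in the seeks.
		List<String> seeks = seekTargets(records,
				(r, e) -> { throw new RuntimeException("recoverer failed"); },
				new RuntimeException("listener failed"));
		if (!seeks.equals(List.of("rec0", "rec1"))) {
			throw new AssertionError(seeks.toString());
		}
		// Recoverer succeeds: the recovered record is skipped from the seeks.
		seeks = seekTargets(records, (r, e) -> true,
				new RuntimeException("listener failed"));
		if (!seeks.equals(List.of("rec1"))) {
			throw new AssertionError(seeks.toString());
		}
		System.out.println("ok");
	}
}
```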
garyrussell authored and artembilan committed Oct 3, 2019
1 parent 0feba30 commit 9abc873
Showing 4 changed files with 28 additions and 15 deletions.
@@ -76,7 +76,7 @@ class FailedRecordTracker {

boolean skip(ConsumerRecord<?, ?> record, Exception exception) {
if (this.noRetries) {
-			recover(record, exception);
+			this.recoverer.accept(record, exception);
return true;
}
Map<TopicPartition, FailedRecord> map = this.failures.get();
@@ -101,7 +101,7 @@ boolean skip(ConsumerRecord<?, ?> record, Exception exception) {
return false;
}
else {
-			recover(record, exception);
+			this.recoverer.accept(record, exception);
map.remove(topicPartition);
if (map.isEmpty()) {
this.failures.remove();
@@ -110,15 +110,6 @@ boolean skip(ConsumerRecord<?, ?> record, Exception exception) {
}
}

-	private void recover(ConsumerRecord<?, ?> record, Exception exception) {
-		try {
-			this.recoverer.accept(record, exception);
-		}
-		catch (Exception ex) {
-			this.logger.error(ex, "Recoverer threw exception");
-		}
-	}

void clearThreadState() {
this.failures.remove();
}
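The `FailedRecordTracker` change above removes the private `recover(...)` wrapper that caught and logged recoverer exceptions, so a failure in the recoverer now propagates out of `skip(...)` to the caller. A stdlib-only sketch (the class here is an illustrative stand-in, not the real `FailedRecordTracker`):

```java
import java.util.function.BiConsumer;

// Illustrative stand-in for FailedRecordTracker: after this commit, skip()
// calls the recoverer directly, so an exception thrown by the recoverer
// propagates to the caller instead of being logged and swallowed.
public class TrackerSketch {

	private final BiConsumer<String, Exception> recoverer;

	TrackerSketch(BiConsumer<String, Exception> recoverer) {
		this.recoverer = recoverer;
	}

	boolean skip(String record, Exception exception) {
		// Before: a try/catch here logged and discarded recoverer failures.
		this.recoverer.accept(record, exception); // may now throw
		return true;
	}

	public static void main(String[] args) {
		TrackerSketch tracker = new TrackerSketch((rec, ex) -> {
			throw new RuntimeException("recoverer failed");
		});
		boolean threw = false;
		try {
			tracker.skip("rec0", new RuntimeException("listener failed"));
		}
		catch (RuntimeException ex) {
			threw = true; // the caller (SeekUtils) can now react and re-seek
		}
		if (!threw) {
			throw new AssertionError("expected the recoverer failure to propagate");
		}
		System.out.println("ok");
	}
}
```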
@@ -71,7 +71,14 @@ public static boolean doSeeks(List<ConsumerRecord<?, ?>> records, Consumer<?, ?>
AtomicBoolean skipped = new AtomicBoolean();
records.forEach(record -> {
if (recoverable && first.get()) {
-				skipped.set(skipper.test(record, exception));
+				try {
+					boolean test = skipper.test(record, exception);
+					skipped.set(test);
+				}
+				catch (Exception ex) {
+					logger.error(ex, "Failed to determine if this record should be recovered, including in seeks");
+					skipped.set(false);
+				}
if (skipped.get()) {
logger.debug(() -> "Skipping seek of: " + record);
}
@@ -536,10 +536,14 @@ public void testMaxFailures() throws Exception {
container.setBeanName("testMaxFailures");
final CountDownLatch recoverLatch = new CountDownLatch(1);
final KafkaTemplate<Object, Object> dlTemplate = spy(new KafkaTemplate<>(pf));
+		AtomicBoolean recovererShouldFail = new AtomicBoolean(true);
DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(dlTemplate) {

@Override
public void accept(ConsumerRecord<?, ?> record, Exception exception) {
+				if (recovererShouldFail.getAndSet(false)) {
+					throw new RuntimeException("test recoverer failure");
+				}
super.accept(record, exception);
recoverLatch.countDown();
}
@@ -590,8 +594,8 @@ public void accept(ConsumerRecord<?, ?> record, Exception exception) {
assertThat(headers.get("baz")).isEqualTo("qux".getBytes());
pf.destroy();
assertThat(stopLatch.await(10, TimeUnit.SECONDS)).isTrue();
-		verify(afterRollbackProcessor, times(3)).isProcessInTransaction();
-		verify(afterRollbackProcessor, times(3)).process(any(), any(), any(), anyBoolean());
+		verify(afterRollbackProcessor, times(4)).isProcessInTransaction();
+		verify(afterRollbackProcessor, times(4)).process(any(), any(), any(), anyBoolean());
verify(afterRollbackProcessor).clearThreadState();
verify(dlTemplate).send(any(ProducerRecord.class));
verify(dlTemplate).sendOffsetsToTransaction(
@@ -632,8 +636,11 @@ public void testRollbackProcessorCrash() throws Exception {
KafkaMessageListenerContainer<Integer, String> container =
new KafkaMessageListenerContainer<>(cf, containerProps);
container.setBeanName("testRollbackNoRetries");
+		AtomicBoolean recovererShouldFail = new AtomicBoolean(true);
BiConsumer<ConsumerRecord<?, ?>, Exception> recoverer = (rec, ex) -> {
-			throw new RuntimeException("arbp fail");
+			if (recovererShouldFail.getAndSet(false)) {
+				throw new RuntimeException("arbp fail");
+			}
};
DefaultAfterRollbackProcessor<Object, Object> afterRollbackProcessor =
spy(new DefaultAfterRollbackProcessor<>(recoverer, new FixedBackOff(0L, 0L)));
8 changes: 8 additions & 0 deletions src/reference/asciidoc/kafka.adoc
@@ -1813,6 +1813,8 @@ public SeekToCurrentErrorHandler eh() {

However, see the note at the beginning of this section; you can avoid using the `RetryTemplate` altogether.

+IMPORTANT: If the recoverer fails (throws an exception), the record will be included in the seeks and recovery will be attempted again during the next delivery.

[[events]]
===== Listener Consumer Lifecycle Events

@@ -3588,6 +3590,8 @@ Generally, you should configure the `BackOff` to never return `STOP`.
However, since this error handler has no mechanism to "recover" after retries are exhausted, if the `BackOffExecution` returns `STOP`, the previous interval will be used for all subsequent delays.
Again, the maximum delay must be less than the `max.poll.interval.ms` consumer property.

+IMPORTANT: If the recoverer fails (throws an exception), the record will be included in the seeks and recovery will be attempted again during the next delivery.

===== Container Stopping Error Handlers

The `ContainerStoppingErrorHandler` (used with record listeners) stops the container if the listener throws an exception.
@@ -3638,6 +3642,8 @@ Starting with version 2.2.5, the `DefaultAfterRollbackProcessor` can be invoked
Then, if you are using the `DeadLetterPublishingRecoverer` to publish a failed record, the processor will send the recovered record's offset in the original topic/partition to the transaction.
To enable this feature, set the `commitRecovered` and `kafkaTemplate` properties on the `DefaultAfterRollbackProcessor`.

+IMPORTANT: If the recoverer fails (throws an exception), the record will be included in the seeks and recovery will be attempted again during the next delivery.

[[dead-letters]]
===== Publishing Dead-letter Records

@@ -3704,6 +3710,8 @@ public DeadLetterPublishingRecoverer publisher(KafkaTemplate<?, ?> stringTemplat
The publisher uses the map keys to locate a template that is suitable for the `value()` about to be published.
A `LinkedHashMap` is recommended so that the keys are examined in order.

+IMPORTANT: If the recoverer fails (throws an exception), the record will be included in the seeks and recovery will be attempted again during the next delivery.

Starting with version 2.3, the recoverer can also be used with Kafka Streams - see <<streams-deser-recovery>> for more information.

[[kerberos]]
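The documentation notes added above all make the same point: a throwing recoverer is no longer fatal to the record. A configuration sketch of how a user might wire this up (this is an assumption-laden sketch, not part of the commit: it presumes spring-kafka 2.3.x on the classpath, and the bean name, template wiring, and back-off values are illustrative placeholders):

```java
// Sketch only: assumes spring-kafka 2.3.x; names and values are placeholders.
@Bean
public SeekToCurrentErrorHandler errorHandler(KafkaTemplate<Object, Object> template) {
	DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(template);
	// If recoverer.accept(...) throws (e.g. the dead-letter publish fails),
	// the record stays in the seeks and recovery is retried on redelivery.
	return new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 2L));
}
```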
