
k/group: recover leader epoch on leader change #17260

Merged

Conversation

nvartolomei
Contributor

@nvartolomei commented Mar 22, 2024

From my understanding, this can cause a problem only when write caching is enabled.

It could also apply to ACKS=1 in an edge case, but I haven't thought that through. We make data available only after it has been replicated to a majority, so it is very unlikely for a truncation to happen after that point and trigger KIP-320.

This was discovered while testing the write caching feature. After a leadership change or node restart we would reply with the default field value -2147483648, which breaks the KIP-320 logic.

check_leader_epoch in redpanda treats negative epoch values as "not set" and, I believe, franz-go behaves the same.

As a result, KIP-320 fencing is not applied and the client ends up with an OFFSET_OUT_OF_RANGE error.
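
For context, the sketch below is a minimal, illustrative C++ approximation (not Redpanda's actual code; `normalize_epoch` and `should_fence_fetch` are made-up names) of how a check_leader_epoch-style rule that treats negative epochs as "not set" ends up skipping KIP-320 fencing whenever the broker replies with the int32 minimum default:

```cpp
#include <cstdint>
#include <iostream>
#include <optional>

// Illustrative sentinel: the int32 minimum (-2147483648) that the broker
// was returning for the committed leader epoch after a leadership change.
constexpr int32_t unset_epoch_sentinel = INT32_MIN;

// Hypothetical helper mirroring a check_leader_epoch-style rule:
// negative epochs are treated as "not set".
std::optional<int32_t> normalize_epoch(int32_t epoch) {
    if (epoch < 0) {
        return std::nullopt; // "not set": epoch checks are skipped
    }
    return epoch;
}

// Simplified fencing decision (a real implementation distinguishes
// fenced vs. unknown epochs). With no usable epoch the broker cannot
// detect a stale fetch position, so a truncated log surfaces to the
// client as OFFSET_OUT_OF_RANGE instead of a KIP-320 truncation event.
bool should_fence_fetch(int32_t client_epoch, int32_t current_epoch) {
    auto epoch = normalize_epoch(client_epoch);
    if (!epoch) {
        return false;
    }
    return *epoch != current_epoch;
}

int main() {
    std::cout << std::boolalpha
              // Committed at epoch 2, but recovery handed back the
              // sentinel, so fencing never fires:
              << should_fence_fetch(unset_epoch_sentinel, 3) << '\n' // false
              << should_fence_fetch(2, 3) << '\n';                   // true
}
```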

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x

Release Notes

  • none

@vbotbuildovich
Collaborator

new failures in https://buildkite.com/redpanda/redpanda/builds/46616#018e668c-9526-49f5-97e6-de141da9ca44:

"rptest.tests.partition_move_interruption_test.PartitionMoveInterruption.test_cancelling_partition_move.replication_factor=3.unclean_abort=True.recovery=restart_recovery.compacted=False"

@nvartolomei
Contributor Author

/dt
skip-redpanda-build
dt-repeat=10
tests/rptest/tests/partition_move_interruption_test.py::PartitionMoveInterruption.test_cancelling_partition_move

@nvartolomei
Contributor Author

Failing test with the verifiable consumer, without write caching... https://ci-artifacts.dev.vectorized.cloud/redpanda/46616/018e668c-9526-49f5-97e6-de141da9ca44/vbuild/ducktape/results/final/report.html

[2024-03-22 14:47:42,280] INFO [Consumer clientId=consumer-test_group-1, groupId=test_group] Fetch position FetchPosition{offset=6230, offsetEpoch=Optional[2], currentLeader=LeaderAndEpoch{leader=Optional[docker-rp-2:9092 (id: 4 rack: null)], epoch=absent}} is out of range for partition topic-peirabobwf-11, resetting offset (org.apache.kafka.clients.consumer.internals.Fetcher)
[2024-03-22 14:47:42,805] INFO [Consumer clientId=consumer-test_group-1, groupId=test_group] Resetting offset for partition topic-peirabobwf-11 to position FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[docker-rp-2:9092 (id: 4 rack: null)], epoch=absent}}. (org.apache.kafka.clients.consumer.internals.SubscriptionState)
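
The log above shows the Java consumer falling back to an offset reset because its fetch position has no usable leader epoch. As a rough sketch of the idea behind this change (the `committed_offset` struct and `recover_offsets` helper are invented for illustration and are not Redpanda's actual structures), group recovery should carry the committed leader epoch through to the rebuilt state instead of letting it collapse to the unset default:

```cpp
#include <cstdint>
#include <iostream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Illustrative committed-offset record; real group metadata carries more
// fields (metadata string, commit timestamp, ...).
struct committed_offset {
    int64_t offset;
    int32_t leader_epoch;
};

// Hypothetical recovery step: rebuild in-memory group state from replayed
// offset-commit records, carrying the committed leader epoch through
// instead of letting it fall back to the -2147483648 "not set" default.
std::unordered_map<std::string, committed_offset> recover_offsets(
  const std::vector<std::pair<std::string, committed_offset>>& replayed) {
    std::unordered_map<std::string, committed_offset> state;
    for (const auto& [partition, record] : replayed) {
        state[partition] = record; // keep record.leader_epoch as committed
    }
    return state;
}

int main() {
    // A commit for the partition in the log above: offset 6230, epoch 2.
    std::vector<std::pair<std::string, committed_offset>> replayed{
      {"topic-peirabobwf-11", committed_offset{6230, 2}}};

    auto state = recover_offsets(replayed);
    const auto& c = state["topic-peirabobwf-11"];
    std::cout << "offset=" << c.offset
              << " leader_epoch=" << c.leader_epoch << '\n';
    // Expected: offset=6230 leader_epoch=2, not the unset default.
}
```

With the epoch preserved, a KIP-320-aware client can validate its position via OffsetsForLeaderEpoch after a leader change instead of resetting according to its auto.offset.reset policy.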

Member

@dotnwat left a comment


Nice. When you request reviewers, I think we'll want to get a few groups: bharath/michal and probably someone from the enterprise team.

@savex
Contributor

savex commented Mar 25, 2024

/ci-repeat 10
skip-redpanda-build
dt-repeat=2
tests/rptest/tests/partition_move_interruption_test.py::PartitionMoveInterruption.test_cancelling_partition_move

nvartolomei added a commit to nvartolomei/redpanda that referenced this pull request Mar 25, 2024
VerifiableConsumer is KIP-320 compliant now, at least to the degree the
Java Kafka client is.

This will facilitate testing of the redpanda implementation of KIP-320
in group recovery (redpanda-data#17260)
and data loss handling when write caching is enabled.
@nvartolomei force-pushed the nv/kafka-group-recover-leader-epoch branch from 8e82687 to 0ed5c6b on March 25, 2024 17:22
VerifiableConsumer is KIP-320 compliant now, at least to the degree the
Java Kafka client is.

This will facilitate testing of the redpanda implementation of KIP-320
in group recovery (redpanda-data#17260)
and data loss handling when write caching is enabled.
This adds a test for consumer group commits. The test also exposes a bug
in which the leader epoch is reset after a cluster restart; this is
fixed in a subsequent commit.
This was discovered while testing the write caching feature. After a
leadership change or node restart we would reply with the default field
value `-2147483648`, which breaks the KIP-320 logic.

`check_leader_epoch` in redpanda treats negative epoch values as "not
set" and, I believe, franz-go behaves the same.

As a result, KIP-320 fencing is not applied and the client ends up
with an `OFFSET_OUT_OF_RANGE` error.
Contributor

@graphcareful left a comment


LGTM, nice!

@nvartolomei merged commit 42bafee into redpanda-data:dev on Mar 26, 2024
21 checks passed
@vbotbuildovich
Collaborator

/backport v23.3.x

@vbotbuildovich
Collaborator

/backport v23.2.x

vbotbuildovich pushed a commit to vbotbuildovich/redpanda that referenced this pull request Mar 26, 2024
VerifiableConsumer is KIP-320 compliant now, at least to the degree the
Java Kafka client is.

This will facilitate testing of the redpanda implementation of KIP-320
in group recovery (redpanda-data#17260)
and data loss handling when write caching is enabled.

(cherry picked from commit 9316309)
vbotbuildovich pushed a commit to vbotbuildovich/redpanda that referenced this pull request Mar 26, 2024
VerifiableConsumer is KIP-320 compliant now, at least to the degree the
Java Kafka client is.

This will facilitate testing of the redpanda implementation of KIP-320
in group recovery (redpanda-data#17260)
and data loss handling when write caching is enabled.

(cherry picked from commit 9316309)