Skip to content

KAFKA-20549: Implemented PRODUCE RPC handler for DLQ state manager. [5/N]#22368

Merged
AndrewJSchofield merged 11 commits into
apache:trunkfrom
smjn:KAFKA-20549-6
May 27, 2026
Merged

KAFKA-20549: Implemented PRODUCE RPC handler for DLQ state manager. [5/N]#22368
AndrewJSchofield merged 11 commits into
apache:trunkfrom
smjn:KAFKA-20549-6

Conversation

@smjn
Copy link
Copy Markdown
Collaborator

@smjn smjn commented May 25, 2026

  • Produce RPC impl.
  • Batching and coalescing added.
  • Unit tests for ShareGroupDLQStateManager.

Reviewers: Andrew Schofield aschofield@confluent.io

@github-actions github-actions Bot added the triage PRs from the community label May 25, 2026
@github-actions github-actions Bot added core Kafka Broker KIP-932 Queues for Kafka labels May 25, 2026
@smjn smjn requested a review from AndrewJSchofield May 26, 2026 07:22
Copy link
Copy Markdown
Member

@AndrewJSchofield AndrewJSchofield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. Some initial comments.

Object isEnabled = props.get(TopicConfig.ERRORS_DEADLETTERQUEUE_GROUP_ENABLE_CONFIG);
if (isEnabled instanceof Boolean) {
return (boolean) isEnabled;
if (isEnabled instanceof String) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised that this is necessary. This config has a boolean type already, so I wonder whether that's being lost somehow on its way to the metadata cache. For other similar configs, such as TopicConfig.REMOTE_LOG_STORAGE_ENABLE_CONFIG, the code just does Config.getBoolean.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will encapsulate in LogConfig


@Test
public void testIsDlqEnabledOnTopicReturnsTrue() {
public void testIsDlqEnabledOnTopicReturnsFalseValue() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: why FalseValue instead of False?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This method could call the overload beneath will fully specified arguments.

}

private void addRequestToNodeMap(Node node, ProduceRequestHandler handler) {
if (!handler.isBatchable()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could do with some explanation I feel.

}

if (tpData.numPartitions().isEmpty()) {
throw new ConfigException(String.format("DLQ topic partition count not be found for share group %s with DLQ topic %s.", param.groupId(), dlqTopic.get()));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Missing "could".


private Header[] headers(long offset) {
List<Header> headers = new ArrayList<>();
headers.add(new RecordHeader("__dlq.errors.topic", recordTopic().getBytes(StandardCharsets.UTF_8)));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would define some static strings for the header keys.

ProduceResponse produceResponse = ((ProduceResponse) response.responseBody());
ProduceResponseData.TopicProduceResponseCollection produceResponseCollection = produceResponse.data().responses();
if (produceResponseCollection.isEmpty()) {
LOG.error("Received empty produce response for {} to dlq topic node {}.", this, dlqPartitionLeaderNode());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Let's be consistent with use of . in the log messages. I would tend not to include in log messages, but I would definitely go for consistency in this source file.

@smjn
Copy link
Copy Markdown
Collaborator Author

smjn commented May 26, 2026

@AndrewJSchofield Thanks for the review, inc comments.

@smjn smjn requested a review from AndrewJSchofield May 26, 2026 20:15
@github-actions github-actions Bot removed the triage PRs from the community label May 27, 2026
Copy link
Copy Markdown
Member

@AndrewJSchofield AndrewJSchofield left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good initial implementation of DLQ produce.

@AndrewJSchofield AndrewJSchofield merged commit d1e8279 into apache:trunk May 27, 2026
22 checks passed
AndrewJSchofield pushed a commit that referenced this pull request May 28, 2026
* Add ShareGroupDLQManager instance creation code in BrokerServer and
pass along the instance to SharePartitionManager to be handed over to
SharePartition.

NOTE: Merge after #22368 Reviewers:
Apoorv Mittal <apoorvmittal10@gmail.com>

Reviewers: Apoorv Mittal <apoorvmittal10@gmail.com>, Andrew Schofield
 <aschofield@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci-approved core Kafka Broker KIP-932 Queues for Kafka

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants