
KAFKA-20505: Moving future completion out of locks for Share Partition #22126

Merged: apoorvmittal10 merged 1 commit into apache:trunk from apoorvmittal10:KAFKA-20505, Apr 23, 2026

Conversation

@apoorvmittal10 (Contributor) commented Apr 23, 2026

As explained in KAFKA-20505, a deadlock can occur when a future is
completed for a request whose follow-up actions try to acquire the lock
on purgatory (checkAndComplete / triggering waiting requests). Since the
Share Partition lock might not yet have been released at that point, a
deadlock can happen. This PR moves such future completions out of the lock.
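For illustration, the pattern being changed looks roughly like the sketch below (hypothetical names, not the actual SharePartition code): state is mutated under the write lock, but the futures to complete are collected and invoked only after the lock is released, so their dependent actions can safely acquire other locks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CompleteOutsideLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Deadlock-prone: complete() runs dependent actions synchronously on
    // this thread, and those actions (e.g. a purgatory checkAndComplete)
    // may try to acquire another lock while we still hold this one.
    void completeInsideLock(CompletableFuture<Void> future) {
        lock.writeLock().lock();
        try {
            // ... state updates ...
            future.complete(null);  // dependents run while lock is held
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Safer: mutate state under the lock, collect the futures, and
    // complete them only after the lock is released.
    void completeOutsideLock(List<CompletableFuture<Void>> pending) {
        List<CompletableFuture<Void>> toComplete = new ArrayList<>();
        lock.writeLock().lock();
        try {
            // ... state updates ...
            toComplete.addAll(pending);
            pending.clear();
        } finally {
            lock.writeLock().unlock();
        }
        toComplete.forEach(f -> f.complete(null));  // no lock held here
    }
}
```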

I have also reviewed the other future completions and it doesn't seem we
need any further changes.

I have tested using the franz-go Kafka test and could not reproduce the
issue in 160 consecutive runs. Previously the issue was reproducible
within 20-50 consecutive runs.

```
=== Run 160 ===
=== RUN   TestShareGroupETL
=== PAUSE TestShareGroupETL
=== CONT  TestShareGroupETL
[09:59:38.788 1][INFO] producing to a new topic for the first time,
fetching metadata to learn its partitions; topic:
f2c35a17978f41d684a36ab8297dd873138cb3c982f095a7d0e74d7a8589411e
...
...
[09:59:43.94 3][INFO] immediate metadata update triggered; why: forced
load because we are producing to a topic for the first time
[09:59:43.947 3][INFO] done waiting for metadata for new topic; topic:
6ce53f956e91c6b106843ff7aa728451f290cbd9064cefc8dc73ee2bcadef7b9
    share_test.go:507: level 1 phase 2: adding consumers after 122923
consumed
[09:59:44.225 3][INFO] flushing
[09:59:44.225 4][INFO] immediate metadata update triggered; why:
querying metadata for consumer initialization
...
...
[09:59:44.226 5][INFO] beginning to manage the share group lifecycle;
group: 0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35
[09:59:44.226 10][INFO] beginning to manage the share group lifecycle;
group: 0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35
[09:59:44.227 3][INFO] leaving share group; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
member_id: LP4lpqzQjAm-QxQdCRkSXA==
[09:59:44.227 7][INFO] assigning share partitions; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
assignments: map[]
[09:59:44.227 4][INFO] assigning share partitions; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
assignments: map[]
...
...
[09:59:49.232 7][INFO] assigning share partitions; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
assignments:
map[f2c35a17978f41d684a36ab8297dd873138cb3c982f095a7d0e74d7a8589411e:[1
2]]
[09:59:49.232 6][INFO] assigning share partitions; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
assignments:
map[f2c35a17978f41d684a36ab8297dd873138cb3c982f095a7d0e74d7a8589411e:[1]]
...
...
[09:59:49.466 18][INFO] immediate metadata update triggered; why:
querying metadata for consumer initialization
[09:59:49.466 13][INFO] immediate metadata update triggered; why:
querying metadata for consumer initialization
[09:59:49.466 15][INFO] immediate metadata update triggered; why:
querying metadata for consumer initialization
[09:59:49.467 16][INFO] beginning to manage the share group lifecycle;
group: d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6
[09:59:49.467 14][INFO] beginning to manage the share group lifecycle;
group: d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6
[09:59:49.467 11][INFO] beginning to manage the share group lifecycle;
group: d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6
...
...
[09:59:49.467 15][INFO] beginning to manage the share group lifecycle;
group: d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6
[09:59:49.468 13][INFO] assigning share partitions; group:
0045957d65fdd11a7bd0dfbc9293c3fd935da4bb359df6fa56aca8dcc119ff35,
assignments: map[]
[09:59:49.469 17][INFO] assigning share partitions; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
assignments: map[]
...
...
[09:59:54.472 18][INFO] assigning share partitions; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
assignments:
map[5da63960cb4fe97d887d575a73dfbddc51a2eb8071d119b3a5ba5a2b0d87bc7e:[1]]
[09:59:54.485 14][INFO] assigning share partitions; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
assignments:
map[6ce53f956e91c6b106843ff7aa728451f290cbd9064cefc8dc73ee2bcadef7b9:[0]]
...
...
[09:59:54.494 12][INFO] metadata update triggered; why: reload trigger
due to produce topic still not known
[09:59:54.495 12][INFO] producer id initialization success; id: 3524,
epoch: 0
[09:59:54.5 13][INFO] producing to a new topic for the first time,
fetching metadata to learn its partitions; topic:
6ce53f956e91c6b106843ff7aa728451f290cbd9064cefc8dc73ee2bcadef7b9
[09:59:54.5 13][INFO] immediate metadata update triggered; why: forced
load because we are producing to a topic for the first time
...
...
[09:59:54.525 11][INFO] leaving share group; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
member_id: _4Jk9faoDdUlGMlSR9zKmg==
    share_test.go:605: level 2 rebalance 1: killing l2-c1 after 169339
consumed
[09:59:55.101 14][INFO] flushing
[09:59:55.101 19][INFO] immediate metadata update triggered; why:
querying metadata for consumer initialization
[09:59:55.102 19][INFO] beginning to manage the share group lifecycle;
group: d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6
[09:59:55.103 14][INFO] leaving share group; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
member_id: wJibgxG934tiAuqCPloF_w==
[09:59:55.107 19][INFO] assigning share partitions; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
assignments: map[]
    share_test.go:619: level 2 rebalance 2: killing l2-c3 after 375726
consumed
[09:59:55.401 18][INFO] flushing
...
...
[10:00:00.915 20][INFO] leaving share group; group:
d90b64695fc163edc7c1052c412cd0ca4d4dd1696badab2b5a9b5a1e81e000c6,
member_id: HjzG4-5QwncfYfr8pSmEUQ==
    share_test.go:377: level 1: 499900 unique keys, 500624 total
accepts, 500624 produced, 724 duplicates, 35614 redelivered, max dc 3,
consumed 532987
    share_test.go:377: level 2: 499900 unique keys, 501513 total
accepts, 501513 produced, 1613 duplicates, 20272 redelivered, max dc 2,
consumed 518049
    share_test.go:704: level 1: 100 purely rejected, 35614 redelivered
    share_test.go:60: deleting topic
f2c35a17978f41d684a36ab8297dd873138cb3c982f095a7d0e74d7a8589411e
    share_test.go:61: deleting topic
f7e388a2de7ef0814328f9186e8c4b73b1f2437490e1b98730af9fb17ee74175
    share_test.go:62: deleting topic
5da63960cb4fe97d887d575a73dfbddc51a2eb8071d119b3a5ba5a2b0d87bc7e
    share_test.go:63: deleting topic
6ce53f956e91c6b106843ff7aa728451f290cbd9064cefc8dc73ee2bcadef7b9
    share_test.go:64: deleting topic
7e74eb054cbb02e0de5da8a8018115dc01094496222039f323841770b11b8a12
    share_test.go:65: deleting topic
4b8c44d4071cd22272ae9ac694342faa3404bd10b479fe88874bdef4a8a4276d
--- PASS: TestShareGroupETL (22.73s)
PASS
ok      github.com/twmb/franz-go/pkg/kgo        22.926s
```

Reviewers: Andrew Schofield <aschofield@confluent.io>, Chia-Ping Tsai
<chia7712@gmail.com>, PoAn Yang <payang@apache.org>

@github-actions bot added the triage (PRs from the community) label Apr 23, 2026
@github-actions bot added the core (Kafka Broker), KIP-932 (Queues for Kafka), and small (Small PRs) labels Apr 23, 2026
@apoorvmittal10 added the ci-approved label and removed the small (Small PRs) and triage (PRs from the community) labels Apr 23, 2026
@AndrewJSchofield (Member) left a comment
Thanks for the PR. Looks good to me.

@apoorvmittal10 (Contributor, Author)

@FrankYang0529 As you are the release manager for 4.2.1, I need to open a cherry-pick for the 4.2.1 release as this issue is a blocker. Once this is merged in trunk, I will open a PR for 4.2 and also merge it into 4.3.

@FrankYang0529 (Member)

> @FrankYang0529 As you are the release manager for 4.2.1, I need to open a cherry-pick for the 4.2.1 release as this issue is a blocker. Once this is merged in trunk, I will open a PR for 4.2 and also merge it into 4.3.

Thank you. I will prepare another RC after this is backported to the 4.2 branch.

@chia7712 (Member) left a comment
LGTM. I've left a suggestion that could be handled in a follow-up.

```java
lock.writeLock().lock();
try {
    if (throwable != null) {
```
@chia7712 (Member) commented:

Personally, I prefer to return a chained CompletableFuture rather than manually completing an explicitly created one, as calling complete() under a lock is highly error-prone. For example, rollbackOrProcessStateUpdates could be significantly streamlined with this chaining approach.

```java
    CompletableFuture<Void> rollbackOrProcessStateUpdates(
        Throwable throwable,
        List<PersisterBatch> persisterBatches
    ) {
        if (throwable != null) {
            lock.writeLock().lock();
            try {
              ...
                });
            } finally {
                lock.writeLock().unlock();
            }
            return CompletableFuture.failedFuture(throwable);
        }

        if (persisterBatches.isEmpty()) {
            return CompletableFuture.completedFuture(null);
        }
        ...
        return writeShareGroupState(persisterBatches.stream().map(PersisterBatch::stateBatch).toList())
        .whenComplete((result, exception) -> {
            ...
```

Another benefit is that it naturally propagates any exceptions thrown inside the whenComplete block.
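A minimal self-contained sketch of that benefit (hypothetical method names, not the SharePartition code): an exception thrown inside the whenComplete action completes the chained stage exceptionally, so callers observe it, whereas a manually created future is simply never completed.

```java
import java.util.concurrent.CompletableFuture;

public class ChainingDemo {
    static void mightThrow() { throw new RuntimeException("boom"); }

    // Manual style: if mightThrow() throws, `result` is never completed
    // and callers wait forever.
    static CompletableFuture<Void> manual(CompletableFuture<Void> write) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        write.whenComplete((r, e) -> {
            mightThrow();
            result.complete(null);
        });
        return result;
    }

    // Chained style: the same exception completes the returned stage
    // exceptionally, so downstream stages see it.
    static CompletableFuture<Void> chained(CompletableFuture<Void> write) {
        return write.whenComplete((r, e) -> mightThrow());
    }

    public static void main(String[] args) {
        chained(CompletableFuture.completedFuture(null))
            .whenComplete((r, e) -> System.out.println("chained sees: " + e));
        manual(CompletableFuture.completedFuture(null))
            .whenComplete((r, e) -> System.out.println("never printed"));
    }
}
```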

@apoorvmittal10 (Contributor, Author)

@chia7712 Valid suggestion. I wanted to keep the changes minimal for the blocker. Shall I do the chaining only in trunk, so we keep the scope of testing limited for 4.2.1 and 4.3.0?

@apoorvmittal10 (Contributor, Author)

I have created the following ticket and assigned it to myself. It will be beneficial and can be well tested in trunk prior to the 4.4.0 release, as it is a code improvement: https://issues.apache.org/jira/browse/KAFKA-20521. Please let me know if you think it's fine to handle this separately.

@apoorvmittal10 (Contributor, Author)

Ahh, you already mentioned that this can be a follow-up, so I shall merge this commit and cherry-pick it, then open the follow-up.

@apoorvmittal10 merged commit e1cfea3 into apache:trunk Apr 23, 2026 (33 of 34 checks passed)
apoorvmittal10 added a commit to apoorvmittal10/kafka that referenced this pull request Apr 23, 2026
KAFKA-20505: Moving future completion out of locks for Share Partition (apache#22126)

Labels: ci-approved, core (Kafka Broker), KIP-932 (Queues for Kafka)