Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-15449: Verify transactional offset commits (KIP-890 part 1) #14370

Merged
merged 9 commits into from
Oct 2, 2023

Conversation

jolshan
Copy link
Contributor

@jolshan jolshan commented Sep 11, 2023

Previous commits left out TxnOffsetCommits which go through the group coordinator (not directly from the producer).

I've wired up the methods to include the transactional id and state partition to do the verification.

I've also updated UnifiedLog to verify on client and coordinator requests that are transactional.
I've not updated any sequence check logic since the sequence is always 0 on group coordinator initiated writes.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@jolshan jolshan requested a review from dajac September 11, 2023 23:06
Copy link
Contributor

@artemlivshits artemlivshits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

core/src/main/scala/kafka/log/UnifiedLog.scala Outdated Show resolved Hide resolved
Copy link
Contributor

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan Thanks for the PR. I left a few comments for consideration.

Comment on lines 455 to 457
case Errors.INVALID_PRODUCER_ID_MAPPING
| Errors.INVALID_TXN_STATE =>
Errors.UNKNOWN_MEMBER_ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain the rational behind using UNKNOWN_MEMBER_ID here? We should also add a comment about it because this is not obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I resolved the PR comment, but I had a discussion with myself here: #14370 (comment)

Similar to the produce request, we wanted to have a non-fatal error here. (I suppose there is an argument for the invalid PID mapping being fatal)
What do you think? There are only 2 abortable errors returned by this request.

                } else if (error == Errors.UNKNOWN_MEMBER_ID
                        || error == Errors.ILLEGAL_GENERATION) {
                    abortableError(new CommitFailedException("Transaction offset Commit failed " +
                        "due to consumer group metadata mismatch: " + error.exception().getMessage()));

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For posterity, we decided to not go with the abortable error and stick with the fatal one since it makes it clearer what happened. We also can't guarantee clients treat unknown member as non-fatal.

Copy link
Contributor

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I left a small nit. I don't need to re-review it.

@jolshan jolshan merged commit b6c7855 into apache:trunk Oct 2, 2023
1 check failed
k-wall pushed a commit to k-wall/kafka that referenced this pull request Nov 21, 2023
…ache#14370)

Previous commits left out TxnOffsetCommits which go through the group coordinator (not directly from the producer).

I've wired up the methods to include the transactional id and state partition to do the verification.

I've also updated UnifiedLog to verify on client and coordinator requests that are transactional.
I've not updated any sequence check logic since the sequence is always 0 on group coordinator initiated writes.

Added returned errors to Response files. Both InvalidPidMapping and InvalidTxnState will be returned and be fatal for the transactional OffsetCommit requests.

Reviewers:  David Jacot <david.jacot@gmail.com>,  Artem Livshits <alivshits@confluent.io>
AnatolyPopov pushed a commit to aiven/kafka that referenced this pull request Feb 16, 2024
…ache#14370)

Previous commits left out TxnOffsetCommits which go through the group coordinator (not directly from the producer).

I've wired up the methods to include the transactional id and state partition to do the verification.

I've also updated UnifiedLog to verify on client and coordinator requests that are transactional.
I've not updated any sequence check logic since the sequence is always 0 on group coordinator initiated writes.

Added returned errors to Response files. Both InvalidPidMapping and InvalidTxnState will be returned and be fatal for the transactional OffsetCommit requests.

Reviewers:  David Jacot <david.jacot@gmail.com>,  Artem Livshits <alivshits@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants