KAFKA-17011: SupportedFeatures.MinVersion incorrectly blocks v0 (3.8) #16420
Ran the kraft upgrade tests and they passed with this change 👍
I ran some group coordinator tests as well, and those are also passing.
@jolshan thanks for this quick fix. Two trivial comments are left. PTAL
Features<SupportedVersionRange> backwardsCompatibleFeatures = Features.supportedFeatures(latestSupportedFeatures.features().entrySet()
    .stream().filter(entry -> {
        SupportedVersionRange supportedVersionRange = entry.getValue();
        return supportedVersionRange.min() != 0;
Not sure whether I misread your description:
"so for now we will change 0 to 1 in the response in order to be backwards compatible."
The code looks like it drops features whose min version is 0 instead of changing the min version from 0 to 1.
Also, does this PR mean Admin#describeFeatures can't see the feature "group.version" from a broker running 3.8.0?
Hey, sorry, that description was from the old implementation; the new approach is to omit the feature, as you mentioned.
I believe describeFeatures will not be able to show the version for this release.
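For readers following the thread, the omit-on-zero behaviour can be sketched in isolation. This is only an illustration under stated assumptions: `VersionRange` and `backwardsCompatible` are hypothetical stand-ins, not Kafka's `SupportedVersionRange` or `Features` classes, and the version numbers are made up.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FeatureFilterSketch {
    // Hypothetical stand-in for a supported version range; NOT Kafka's class.
    static final class VersionRange {
        final short min;
        final short max;

        VersionRange(short min, short max) {
            this.min = min;
            this.max = max;
        }
    }

    // Omits every feature whose minimum supported version is 0, so that an
    // older reader of the response never encounters a 0 value.
    static Map<String, VersionRange> backwardsCompatible(Map<String, VersionRange> features) {
        return features.entrySet().stream()
            .filter(entry -> entry.getValue().min != 0)
            .collect(Collectors.toMap(
                Map.Entry::getKey,
                Map.Entry::getValue,
                (a, b) -> a,
                LinkedHashMap::new));
    }

    public static void main(String[] args) {
        Map<String, VersionRange> features = new LinkedHashMap<>();
        // Illustrative ranges only; the real values depend on the release.
        features.put("metadata.version", new VersionRange((short) 1, (short) 20));
        features.put("group.version", new VersionRange((short) 0, (short) 1));

        // group.version is dropped entirely, not rewritten to start at 1.
        System.out.println(backwardsCompatible(features).keySet());
    }
}
```

The feature is dropped rather than rewritten, which is why a client reading this response would not see "group.version" at all.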
Perhaps it's useful to add a comment on why we are filtering version 0 features.
BTW, there is a description in KIP-584
Not sure why #15671 changed the start version from 1 to 0: https://github.com/apache/kafka/pull/15671/files#r1576879728
@chia7712 It was incorrect to set it at 1 because we cannot assume level 0 is supported. However, level 0 (the feature being disabled) is valid.
@jolshan I feel
Thanks @chia7712, I will take a look.
Cleaned up the test failures. The new ones look unrelated. 👍
LGTM
@jolshan thanks for the explanation. Also, I have seen the discussion on the mailing list (https://lists.apache.org/thread/7wx3j788ztf1vl7xm9ryt8cwl3tk9lw3). Both are good resources for learning the root cause of this issue :)
@jolshan sorry that I have another question: it seems the hotfix for 3.8 is to avoid propagating the
Hmmm. I believe the handling is ok with 0. The issue is not with 0 itself. (We can handle controllers that don't have the same range of supported versions correctly.) The issue was with SupportedVersionRange, which only seems to be used by NodeApiVersions. @cmccabe correct me if I'm wrong.
@chia7712 the problem is specific to ApiVersionResponse. The deserializer for that RPC response doesn't allow 0 values for min supported version or max supported version. Broker registration doesn't have this problem.
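The failure mode described above can be modelled with a small sketch. It assumes, as stated in the comment, that the older reader rejects any range whose min or max is below 1; `LegacyVersionRange` and `accepts` are hypothetical names for illustration, not code copied from Kafka.

```java
public class LegacyRangeSketch {
    // Hypothetical model of an older reader's range type. The validation rule
    // is an assumption based on the discussion, not Kafka's actual source.
    static final class LegacyVersionRange {
        final short min;
        final short max;

        LegacyVersionRange(short min, short max) {
            if (min < 1 || max < 1 || max < min) {
                throw new IllegalArgumentException(
                    "Expected min >= 1, max >= 1 and max >= min, but got min=" + min + ", max=" + max);
            }
            this.min = min;
            this.max = max;
        }
    }

    // Returns true if the legacy reader would accept the range, false if
    // constructing it throws.
    static boolean accepts(int min, int max) {
        try {
            new LegacyVersionRange((short) min, (short) max);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts(1, 1)); // a range starting at 1 is accepted
        System.out.println(accepts(0, 1)); // a range starting at 0 is rejected
    }
}
```

Under that assumption, any feature advertised with a minimum of 0 would make the whole response unreadable for the older side, which is the motivation for omitting such features rather than sending them.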
I will merge in the next hour or so unless there are further concerns.
@jolshan : Thanks for the PR. Just a minor comment. It will also be useful to include the exception caused by this issue in the jira or PR description.
Thanks @junrao! Done
@jolshan @jsancio thanks for your response. My concern was the "min version" of broker registration is used to create
That is used in migration, so not sure whether it will be a problem. For example, the quorum controller is running with defective
@chia7712 The new code of SupportedVersionRange does support version 0. The only issue is when a new broker sends a request to an older one. Is it correct that in order to encounter this issue, we would need to have a migration ongoing at the same time as a rolling version upgrade? Is this typical for a KRaft migration? If there are tests that can confirm migration is safe, I'm also happy to run them and double check.
It is not possible to do an More importantly,
That may not be a common case, but the following error can be produced when migrating from ZK to KRaft.
Steps:
Producing the error needs to enable
I see that the supported range is 0-1 in that case. Is it really the case that you need to set that flag?
Just to clarify: this is not a real case I have encountered. I'm just over-engineering today 😄
Unstable feature versions should never be enabled outside JUnit tests (I have a KIP which formalized this), exactly to avoid this kind of confusion.
LGTM
Sorry that I should not use
@chia7712 @jolshan : Yes, after some offline discussion I now understand what you were trying to say. I guess the bad case is:
Just as with
Reverting the group.version feature flag seems to be the best option and we can live with going back to the static config for the preview in 3.8. I wonder if we should also revert it in trunk in order to not have it in 3.9. Then, we can bring it back for 4.0 when we GA the feature.
I think that it would be nice to have a public config to run the broker in experimental mode in order to test non-production features in development clusters. Otherwise, I suppose that we can achieve this by manually upgrading the MV to an unreleased version with
Discussed offline with a few folks and reverting group.version seems to be the way to go. I will prepare the PR as soon as possible. cc @jlprat
Thanks @dajac!
I will close this PR.
For 3.8 we can't implement a new ApiVersions bump, so for now we will omit the feature in the response in order to be backwards compatible.
Without this change, when interacting with an old version of Kafka, we see the following error: