-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for KRaft to KRaft upgrades #9393
Add support for KRaft to KRaft upgrades #9393
Conversation
/azp run kraft-regression |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run upgrade |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run feature-gates-regression |
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Jakub Scholz <www@scholzj.com>
0b42bb2
to
a52a83b
Compare
/azp run upgrade |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run kraft-regression |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run regression |
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, only a couple of small suggestions
...erator/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftMetadataManager.java
Show resolved
Hide resolved
...c/main/java/io/strimzi/operator/cluster/operator/assembly/ZooKeeperVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
private Future<List<Pod>> getPods() { | ||
Labels selectorLabels = Labels.forStrimziKind(Kafka.RESOURCE_KIND) | ||
.withStrimziCluster(reconciliation.name()) | ||
.withStrimziName(KafkaResources.kafkaStatefulSetName(reconciliation.name())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess at some point we should refactor this method to be KafkaResources.kafkaStrimziPodSetName
since statefulsets will eventually be removed from Strimzi entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a fair point. I might open some issue for it. But it is a bit more complicated than that:
- It is still used for some migration and upgrade purposes as StatefulSet name
- It is actually not a StrimziPodSet name as with node pools it is named after the node pools
- It is a label used to mark all the Kafka pods regardless of the pool they belong to.
So it is a bit of a chaos :-(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
////////// | ||
|
||
@Test | ||
public void testMetadataVersionAtNoChange() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could potentially be converted to a parametrized test to make it easier to parse the different permutations that we are testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know. I guess possibly it could. But I'm not sure if it would really help with debugging and readability. To be honest, I do not like parametrized tests too much.
...va/io/strimzi/operator/cluster/operator/assembly/KafkaUpgradeDowngradeWithKRaftMockTest.java
Outdated
Show resolved
Hide resolved
...va/io/strimzi/operator/cluster/operator/assembly/KafkaUpgradeDowngradeWithKRaftMockTest.java
Outdated
Show resolved
Hide resolved
...va/io/strimzi/operator/cluster/operator/assembly/KafkaUpgradeDowngradeWithKRaftMockTest.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jakub Scholz <www@scholzj.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Just two really minor things.
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Jakub Scholz <www@scholzj.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this feature. I left some comments.
I also manually tried to do a safe downgrade that I know is possible (3.5-IV1 -> 3.5-IV0). It worked fine, but logs sometimes use the featureLevel (i.e. 10), and sometimes use release-subVersion (i.e. 3.5-IV1), which can be confusing for users. I think we should stick with the release-subVersion.
2023-11-28 16:34:58 INFO AbstractOperator:511 - Reconciliation #46(watch) Kafka(test/my-cluster): Kafka my-cluster in namespace test was MODIFIED
2023-11-28 16:34:58 INFO AbstractOperator:265 - Reconciliation #46(watch) Kafka(test/my-cluster): Kafka my-cluster will be checked for creation or modification
2023-11-28 16:34:59 INFO KRaftMetadataManager:168 - Reconciliation #46(watch) Kafka(test/my-cluster): Updating metadata version from 10 to 9
2023-11-28 16:34:59 INFO KRaftMetadataManager:118 - Reconciliation #46(watch) Kafka(test/my-cluster): Successfully updated metadata version to 3.5-IV0
2023-11-28 16:34:59 INFO CrdOperator:122 - Reconciliation #46(watch) Kafka(test/my-cluster): Status of Kafka my-cluster in namespace test has been updated
2023-11-28 16:34:59 INFO AbstractOperator:511 - Reconciliation #47(watch) Kafka(test/my-cluster): Kafka my-cluster in namespace test was MODIFIED
2023-11-28 16:34:59 INFO AbstractOperator:265 - Reconciliation #47(watch) Kafka(test/my-cluster): Kafka my-cluster will be checked for creation or modification
2023-11-28 16:34:59 INFO AbstractOperator:537 - Reconciliation #46(watch) Kafka(test/my-cluster): reconciled
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
...r/src/main/java/io/strimzi/operator/cluster/operator/assembly/KRaftVersionChangeCreator.java
Outdated
Show resolved
Hide resolved
@@ -10,6 +10,7 @@ | |||
If needed, `UnidirectionalTopicOperator` can be disabled in the feature gates configuration in the Cluster Operator. | |||
* Improved Kafka Connect metrics and dashboard example files | |||
* Allow specifying and managing KRaft metadata version | |||
* Add support for KRaft to KRaft upgrades (Apache Kafka upgrades for the KRaft based clusters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add the min supported Strimzi version in this note (the first release using kafkaVersion pod annotation)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean with min supported Strimzi version
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We handle the case where pods are there but without kafka-version annotation, which results in KafkaUpgradeException. So I think adding a note to avoid that experience would be good (for users that actually read release notes and documentation).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm still not sure what is it you are asking for.
Signed-off-by: Jakub Scholz <www@scholzj.com>
@fvaleri You are right, this could be confusing. I opened a separate PR to fix it as it is not really related to this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks.
Type of change
Description
This PR delivers the second part of the KRaft-to-KRaft upgrades according to the [SP#61|https://github.com/strimzi/proposals/blob/main/061-kraft-upgrades-and-downgrades.md]:
KafkaVersionChange
classVersionChangeCreator
toZooKeeperVersionChangeCreator
. It keeps being used for ZooKeeper based clusters without any major changes. That minimizes possible impact on the existing Zoo-based clusters.KRaftVersionChangeCreator
that helps to create theKafkaVersionChange
instance for KRaft clusters following the proposal and a code similar to what we use for ZooKeeperVersionChangeCreator
that is shared betweenKRaftVersionChangeCreator
andZooKeeperVersionChangeCreator
to make the handling of them inKafkaAssemblyOperator
easier.ResourceUtils
for our unit tests in order to makeit possible to re-use the default Kafka Admin client mock and add additional mocking to it.Documentation will be updated in a separate PR.
There is a space to improve the
KafkaVersionChange
class and how it is used inKafkaReconciler
. But I decided to not do it in this PR to make it easier to review. And might get back to refactor it later in a follow-up PR.This PR also contains unit tests and was tested manually as well. The current testing possibilities are limited:
KafkaRoller
class. Controller-only nodes were tested manually by deleting the pods.But these limitations should not prevent reviews / and merging this.
Checklist