docs(kraft): adds a procedure for zookeeper to kraft migration #9633

Merged
merged 16 commits into from
Mar 2, 2024
Changes from 6 commits
@@ -8,6 +8,9 @@
[role="_abstract"]
If you are using a ZooKeeper-based Kafka cluster, an upgrade requires an update to the Kafka version and the inter-broker protocol version.

If you want to switch a Kafka cluster from using ZooKeeper for metadata management to operating in KRaft mode, you must perform the migration steps separately from the upgrade.
For information on migrating to a KRaft-based cluster, see xref:proc-deploy-migrate-kraft-str[].

include::../../modules/upgrading/ref-upgrade-kafka-versions.adoc[leveloffset=+1]
include::../../modules/upgrading/con-upgrade-older-clients.adoc[leveloffset=+1]

2 changes: 2 additions & 0 deletions documentation/deploying/deploying.adoc
@@ -24,6 +24,8 @@ include::modules/deploying/proc-deploy-cluster-operator-helm-chart.adoc[leveloff
include::modules/operators/ref-operator-cluster-feature-gates.adoc[leveloffset=+1]
//feature gate release lifecycle
include::modules/operators/ref-operator-cluster-feature-gate-releases.adoc[leveloffset=+2]
//migrating to KRaft
include::modules/deploying/proc-deploy-migrate-kraft.adoc[leveloffset=+1]
//configuration of components
include::assemblies/configuring/assembly-config.adoc[leveloffset=+1]
//creating topics
256 changes: 256 additions & 0 deletions documentation/modules/deploying/proc-deploy-migrate-kraft.adoc
@@ -0,0 +1,256 @@
// Module included in the following assemblies:
//
// deploying/deploying.adoc

[id='proc-deploy-migrate-kraft-{context}']
= Migrating to KRaft mode

[role="_abstract"]
If you are using ZooKeeper for metadata management in your Kafka cluster, you can migrate to using Kafka in KRaft mode.
KRaft mode replaces ZooKeeper for distributed coordination, offering enhanced reliability, scalability, and throughput.

During the migration, you install a quorum of controller nodes as a node pool, which replaces ZooKeeper for management of your cluster.
You enable KRaft migration in the cluster configuration by applying the `strimzi.io/kraft: migration` annotation.
After the migration is complete, you switch the brokers to using KRaft and the controllers out of migration mode using the `strimzi.io/kraft: enabled` annotation.

Before starting the migration, verify that your environment can support Kafka in KRaft mode, as there are a number of xref:ref-operator-use-kraft-feature-gate-str[limitations].
Also note the following:

* Migration is only supported on dedicated controller nodes, not on nodes with dual roles as brokers and controllers.
* During the migration process, ZooKeeper and controller nodes run in parallel for a period, so the cluster needs sufficient compute resources for both.

.Prerequisites

* You must be using Strimzi 0.40 or newer with Kafka 3.7.0 or newer. If you are using an earlier version of Strimzi or Apache Kafka, upgrade before migrating to KRaft mode.
* Verify that the ZooKeeper-based deployment is operating without the following, as they are not supported in KRaft mode:
** The Topic Operator running in bidirectional mode. It should either be in unidirectional mode or disabled.
** JBOD storage. While the `jbod` storage type can be used, the JBOD array must contain only one disk.
* The Cluster Operator that manages the Kafka cluster is running.
* The Kafka cluster deployment uses Kafka node pools.
+
If your ZooKeeper-based cluster is already using node pools, it is ready to migrate.
If not, you can xref:proc-migrating-clusters-node-pools-str[migrate the cluster to use node pools].
To migrate, brokers must be contained in a `KafkaNodePool` resource configuration that is assigned a `broker` role and has the name `kafka`.
Support for node pools is enabled in the `Kafka` resource configuration using the `strimzi.io/node-pools: enabled` annotation.

In this procedure, the Kafka cluster name is `my-cluster`, which is located in `my-project`.
The name of the controller node pool is `controller`.
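
The node pool prerequisites can be checked from the command line before starting the procedure. The following is a minimal sketch, assuming `kubectl` access to the `my-project` namespace. The function names `is_broker_only` and `check_node_pool_prerequisites` are hypothetical helpers for illustration and are not part of Strimzi.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helpers (illustrative names, not part of Strimzi).

# Succeeds only if the space-separated role list is exactly "broker".
is_broker_only() {
  [ "$1" = "broker" ]
}

# Checks that node pools are enabled on the Kafka resource and that
# the node pool named "kafka" has only the broker role.
check_node_pool_prerequisites() {
  local cluster="$1" namespace="$2" roles node_pools
  node_pools=$(kubectl get kafka "$cluster" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.strimzi\.io/node-pools}')
  roles=$(kubectl get kafkanodepool kafka -n "$namespace" \
    -o jsonpath='{.spec.roles[*]}')
  [ "$node_pools" = "enabled" ] && is_broker_only "$roles"
}
```

For example, `check_node_pool_prerequisites my-cluster my-project` returns a zero exit status only when both conditions hold.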

.Procedure

. For the Kafka cluster, create a node pool with a `controller` role.
+
Add a quorum of controller nodes to the node pool.
+
.Example configuration for a controller node pool
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaNodePoolApiVersion}
kind: KafkaNodePool
metadata:
  name: controller
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - controller
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 20Gi
        deleteClaim: false
  resources:
    requests:
      memory: 64Gi
      cpu: "8"
    limits:
      memory: 64Gi
      cpu: "12"
----
+
NOTE: For the migration, you cannot use a node pool of nodes that share the broker and controller roles.

. Apply the new `KafkaNodePool` resource to create the controllers.
+
Errors related to using controllers in a ZooKeeper-based environment are expected.

. Enable KRaft migration in the `Kafka` resource by setting the `strimzi.io/kraft` annotation to `migration`:
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="migration" --overwrite
----
+
.Enabling KRaft migration
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: migration
# ...
----
+
Applying the annotation to the `Kafka` resource configuration starts the migration.

. Check that the controllers have started and the brokers have rolled:
+
[source,shell]
----
kubectl get pods -n my-project
----
+
.Output shows nodes in broker and controller node pools
[source,shell]
----
NAME                     READY  STATUS   RESTARTS
my-cluster-kafka-0       1/1    Running  0
my-cluster-kafka-1       1/1    Running  0
my-cluster-kafka-2       1/1    Running  0
my-cluster-controller-3  1/1    Running  0
my-cluster-controller-4  1/1    Running  0
my-cluster-controller-5  1/1    Running  0
# ...
----
+
Here, the brokers have the name `my-cluster-kafka` as they were contained in a node pool named `kafka` when migrating the cluster to use node pools.

. Check the status of the migration:
+
[source,shell]
----
kubectl get kafka my-cluster -n my-project -w
----
+
.Updates to the metadata state
[source,shell]
----
NAME        ...  METADATA STATE
my-cluster  ...  ZooKeeper
my-cluster  ...  KRaftMigration
my-cluster  ...  KRaftDualWriting
my-cluster  ...  KRaftPostMigration
----
+
`METADATA STATE` shows the mechanism used to manage Kafka metadata and coordinate operations.
At the start of the migration this is `ZooKeeper`.
+
--
* `ZooKeeper` is the initial state when metadata is only stored in ZooKeeper.
* `KRaftMigration` is the state when the migration is in progress and brokers are rolled to register with the controllers.
The migration can take some time at this point depending on the number of topics and partitions in the cluster.
* `KRaftDualWriting` is the state when the Kafka cluster is working as a KRaft cluster,
but controllers are managing metadata in Kafka and ZooKeeper.
Brokers are rolled a second time to remove the ZooKeeper configuration and migration annotation from the `Kafka` resource.
* `KRaftPostMigration` is the state when KRaft mode is enabled for brokers, which no longer use ZooKeeper.
Controllers are still connected to ZooKeeper.
You can xref:proc-deploy-migrate-kraft-rollback-{context}[roll back from this point].
--
+
The migration status is also represented in the `status.kafkaMetadataState` property of the `Kafka` resource.

. Enable KRaft in the `Kafka` resource configuration by setting the `strimzi.io/kraft` annotation to `enabled`:
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="enabled" --overwrite
----
+
.Enabling KRaft mode
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: enabled
# ...
----
+
WARNING: Rollback cannot be performed after enabling KRaft.

. Check the status of the move to full KRaft mode:
+
[source,shell]
----
kubectl get kafka my-cluster -n my-project -w
----
+
.Updates to the metadata state
[source,shell]
----
NAME        ...  METADATA STATE
my-cluster  ...  ZooKeeper
my-cluster  ...  KRaftMigration
my-cluster  ...  KRaftDualWriting
my-cluster  ...  KRaftPostMigration
my-cluster  ...  KRaft
----
+
--
* `KRaft` is the final state (after the controllers have rolled) when the KRaft migration is finalized.
--
+
All ZooKeeper-related resources have been automatically deleted.
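
The status checks in this procedure can also be scripted rather than watched manually. The following is a minimal sketch that polls the `status.kafkaMetadataState` property until it reaches a target state. The function name `wait_for_metadata_state`, the default of 120 attempts, and the 5-second polling interval are assumptions chosen for illustration, not values defined by Strimzi.

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustrative; not part of Strimzi).
# Polls the Kafka resource until status.kafkaMetadataState equals the target.
# Arguments: cluster name, namespace, target state, max attempts (default 120).
wait_for_metadata_state() {
  local cluster="$1" namespace="$2" target="$3" attempts="${4:-120}"
  local i=0 state=""
  while [ "$i" -lt "$attempts" ]; do
    state=$(kubectl get kafka "$cluster" -n "$namespace" \
      -o jsonpath='{.status.kafkaMetadataState}')
    if [ "$state" = "$target" ]; then
      echo "reached $target"
      return 0
    fi
    i=$((i + 1))
    sleep 5   # polling interval is an arbitrary choice
  done
  echo "timed out waiting for $target (last state: $state)" >&2
  return 1
}
```

For example, `wait_for_metadata_state my-cluster my-project KRaftPostMigration` could be run before applying the `strimzi.io/kraft: enabled` annotation, and `wait_for_metadata_state my-cluster my-project KRaft` afterwards.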

[id='proc-deploy-migrate-kraft-rollback-{context}']
.Performing a rollback on the migration

You can roll back the migration until it is finalized, that is, before you enable KRaft in the `Kafka` resource and the state moves to `KRaft`. To perform a rollback, do the following:

. Apply the `strimzi.io/kraft: rollback` annotation to the `Kafka` resource to roll back the brokers.
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="rollback" --overwrite
----
+
.Rolling back KRaft migration
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: rollback
# ...
----
+
The brokers are rolled back so that they can be connected to ZooKeeper again and the state returns to `KRaftDualWriting`.

. Delete the controller node pool:
+
[source,shell]
----
kubectl delete KafkaNodePool controller -n my-project
----

. Apply the `strimzi.io/kraft: disabled` annotation to the `Kafka` resource to return the metadata state to `ZooKeeper`.
+
[source,shell]
----
kubectl annotate kafka my-cluster strimzi.io/kraft="disabled" --overwrite
----
+
.Switching back to using ZooKeeper
[source,yaml,subs="+attributes"]
----
apiVersion: {KafkaApiVersion}
kind: Kafka
metadata:
  name: my-cluster
  namespace: my-project
  annotations:
    strimzi.io/kraft: disabled
# ...
----
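
The rollback steps above can be sketched as a single script. This is an illustration under stated assumptions, not a supported tool: the function name `rollback_migration` and the 600-second timeout are hypothetical, and `kubectl wait --for=jsonpath` requires kubectl 1.23 or later. The guard at the start reflects the rule that rollback is not possible once the state is `KRaft`.

```shell
#!/usr/bin/env bash
# Hypothetical rollback helper (illustrative; not part of Strimzi).
# Arguments: cluster name, namespace, controller node pool name.
rollback_migration() {
  local cluster="$1" namespace="$2" pool="$3" state
  state=$(kubectl get kafka "$cluster" -n "$namespace" \
    -o jsonpath='{.status.kafkaMetadataState}')

  # Rollback is not possible once the migration is finalized.
  if [ "$state" = "KRaft" ]; then
    echo "state is KRaft: rollback is no longer possible" >&2
    return 1
  fi

  # 1. Roll the brokers back so that they connect to ZooKeeper again.
  kubectl annotate kafka "$cluster" -n "$namespace" \
    strimzi.io/kraft="rollback" --overwrite || return 1

  # Wait for the state to return to KRaftDualWriting (timeout is an assumption).
  kubectl wait kafka/"$cluster" -n "$namespace" \
    --for=jsonpath='{.status.kafkaMetadataState}'=KRaftDualWriting \
    --timeout=600s || return 1

  # 2. Delete the controller node pool.
  kubectl delete kafkanodepool "$pool" -n "$namespace" || return 1

  # 3. Return the metadata state to ZooKeeper.
  kubectl annotate kafka "$cluster" -n "$namespace" \
    strimzi.io/kraft="disabled" --overwrite
}
```

For example, `rollback_migration my-cluster my-project controller` runs the three steps in order and stops at the first failure.
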
22 changes: 12 additions & 10 deletions documentation/modules/managing/con-custom-resources-status.adoc
@@ -98,11 +98,12 @@ status:
     - lastTransitionTime: '2023-01-20T17:56:29.396588Z'
       status: 'True'
       type: Ready # <3>
-  kafkaVersion: {DefaultKafkaVersion} # <4>
-  kafkaNodePools: # <5>
+  kafkaMetadataState: KRaft # <4>
+  kafkaVersion: {DefaultKafkaVersion} # <5>
+  kafkaNodePools: # <6>
     - name: broker
     - name: controller
-  listeners: # <6>
+  listeners: # <7>
     - addresses:
         - host: my-cluster-kafka-bootstrap.prm-project.svc
           port: 9092
@@ -140,16 +141,17 @@ status:

-----END CERTIFICATE-----
name: external4
-  observedGeneration: 3 # <7>
-  operatorLastSuccessfulVersion: {ProductVersion} # <8>
+  observedGeneration: 3 # <8>
+  operatorLastSuccessfulVersion: {ProductVersion} # <9>
----
<1> The Kafka cluster ID.
<2> Status `conditions` describe the current state of the Kafka cluster.
<3> The `Ready` condition indicates that the Cluster Operator considers the Kafka cluster able to handle traffic.
-<4> The version of Kafka being used by the Kafka cluster.
-<5> The node pools belonging to the Kafka cluster.
-<6> The `listeners` describe Kafka bootstrap addresses by type.
-<7> The `observedGeneration` value indicates the last reconciliation of the `Kafka` custom resource by the Cluster Operator.
-<8> The version of the operator that successfully completed the last reconciliation.
+<4> Kafka metadata state that shows the mechanism used (KRaft or ZooKeeper) to manage Kafka metadata and coordinate operations.
+<5> The version of Kafka being used by the Kafka cluster.
+<6> The node pools belonging to the Kafka cluster.
+<7> The `listeners` describe Kafka bootstrap addresses by type.
+<8> The `observedGeneration` value indicates the last reconciliation of the `Kafka` custom resource by the Cluster Operator.
+<9> The version of the operator that successfully completed the last reconciliation.

NOTE: The Kafka bootstrap addresses listed in the status do not signify that those endpoints or the Kafka cluster is in a `Ready` state.
@@ -57,7 +57,7 @@ Stable feature gates have reached a beta level of maturity, and are generally en
Stable feature gates are production-ready, but they can still be disabled.

[id='ref-operator-use-kraft-feature-gate-{context}']
-== UseKRaft feature gate
+=== UseKRaft feature gate

The `UseKRaft` feature gate has a default state of _enabled_.
