## Walkthrough

Adds a new end-to-end documentation guide for Kafka scale switching in the ACP Log Storage plugin (targets ACP 4.0.x/4.1.x). Describes moduleinfo edits to adjust the Kafka scale.

## Changes
## Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Admin
    participant K8s as "K8s Cluster"
    participant ACP as "ACP Log Storage"
    participant Kafka as "Kafka Cluster"
    participant ZK as "ZooKeeper"
    rect rgb(248,250,252)
    note over Admin,ACP: Scale switching via moduleinfo edits
    Admin->>K8s: Locate moduleinfo resource
    Admin->>K8s: Edit moduleinfo.kafka.k8sNodes (add/remove node names)
    K8s-->>ACP: Apply resource changes
    ACP->>Kafka: Reconcile / rolling update
    ACP->>ZK: Coordinate state as needed
    alt Components healthy
        Kafka-->>ACP: Report healthy
        ACP-->>Admin: Update complete
    else Unhealthy
        Admin->>K8s: Optionally restart Kafka/ZK pods
        Kafka-->>Admin: Health restored
    end
    end
```
```mermaid
sequenceDiagram
    autonumber
    participant Admin
    participant Pod as "Kafka Pod"
    participant Kafka as "Brokers"
    participant ZK as "ZooKeeper"
    rect rgb(250,255,250)
    note over Admin,Pod: Optional partition reassignment (3 -> 3+n)
    Admin->>Pod: Exec into container
    Admin->>Pod: Read kafka_server_jaas.conf (SASL creds)
    Admin->>Pod: Create TLS client.cfg
    Admin->>Pod: List topics → produce topic-generate.json
    Admin->>Pod: Retrieve broker IDs from /kafka*/meta.properties
    Admin->>Pod: kafka-reassign-partitions.sh --generate (old/new JSON)
    Admin->>Pod: kafka-reassign-partitions.sh --execute
    loop Monitor reassignment
        Admin->>Pod: kafka-reassign-partitions.sh --verify
        Pod-->>Admin: Progress / completion
    end
    note over Admin,Pod: Optional pod restarts if collection disrupted
    end
```
Actionable comments posted: 5
🧹 Nitpick comments (2)
docs/en/solutions/Kafka_Scale_Switching.md (2)
`86-91`: **Strengthen readiness checks beyond pod phase.** Pod Running != service ready. Add rollout and broker health checks.
Apply this diff:

```shell
kubectl get pods -n cpaas-system | grep kafka
cpaas-kafka-0   1/1   Running   0   6m54s
cpaas-kafka-1   1/1   Running   0   7m49s
cpaas-kafka-2   1/1   Running   0   7m49s
+kubectl rollout status statefulset/cpaas-kafka -n cpaas-system
+# Optional: check for under-replicated partitions
+/opt/kafka/bin/kafka-topics.sh --bootstrap-server cpaas-kafka:9092 --command-config ./client.cfg --describe --under-replicated-partitions
```

---

`193-193`: **Remove stray non-breaking space before closing array bracket.**

Prevents copy/paste JSON errors. Apply this diff:

```diff
-  ],
+  ],
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
docs/en/solutions/Kafka_Scale_Switching.md (1 hunks)
🧰 Additional context used
🪛 Gitleaks (8.27.2)
docs/en/solutions/Kafka_Scale_Switching.md
129-129: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
135-135: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
147-147: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
148-148: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
150-150: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
159-159: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (3)
docs/en/solutions/Kafka_Scale_Switching.md (3)
`8-22`: **Overall doc is useful and clearly structured.** Good background, scope, and stepwise guidance.
158-160: Fix SASL mechanism/JAAS class mismatch
sasl.mechanism=PLAIN is paired with ScramLoginModule—either switch to SCRAM-SHA-512 when using ScramLoginModule or use org.apache.kafka.common.security.plain.PlainLoginModule for PLAIN. Confirm which mechanism ACP Kafka 4.0.x/4.1.x is configured to use.
71-75: Clarify k8sNodes identifiers: IPs vs. node names
The example lists IP addresses but the comment refers to “node names.” Confirm whether the operator accepts IPs or requires Kubernetes node names, then update the example (e.g. use `<node-name-a>`, etc.) accordingly.
Actionable comments posted: 2
♻️ Duplicate comments (5)
docs/en/solutions/Kafka_Scale_Switching.md (5)
`93-105`: **Do rolling restarts; don’t delete all ZK/Kafka pods at once.** Mass deletion risks longer downtime and leader churn. Use rollout with status gates.
````diff
-Note: If Kafka pods are Running but the Kafka cluster is not functioning, manually restart both Kafka and ZooKeeper instances once after scaling completes.
+Note: If Kafka pods are Running but the Kafka cluster is not functioning, perform rolling restarts.
@@
-```shell
-kubectl delete pods -n cpaas-system -l 'service_name in (cpaas-zookeeper, cpaas-kafka)'
-...
-```
+```shell
+# Restart ZooKeeper first
+kubectl rollout restart statefulset/cpaas-zookeeper -n cpaas-system
+kubectl rollout status statefulset/cpaas-zookeeper -n cpaas-system
+# Then Kafka
+kubectl rollout restart statefulset/cpaas-kafka -n cpaas-system
+kubectl rollout status statefulset/cpaas-kafka -n cpaas-system
+```
````
`121-137`: **Remove hardcoded credentials; use placeholders and pull from Secrets.** Real-looking usernames/passwords in docs are a leakage risk and will trip scanners. Show placeholders and instruct sourcing from a Kubernetes Secret.
```diff
-(2.1)Retrieve the Kafka username and password:
+(2.1)Retrieve Kafka credentials (prefer from Kubernetes Secret, not from live files).
+> IMPORTANT: Do not paste live credentials in docs. Use placeholders and load from a Secret.
@@
-bash-5.1$ cat /opt/kafka/conf/kafka_server_jaas.conf
+## Example: show structure only (values are placeholders)
+bash-5.1$ cat /opt/kafka/conf/kafka_server_jaas.conf
@@
-KafkaServer {
-  org.apache.kafka.common.security.plain.PlainLoginModule required
-  username="dBodNvjT"
-  password="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38"
-  user_dBodNvjT="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38";
-};
-Client {
-  org.apache.zookeeper.server.auth.DigestLoginModule required
-  username="dBodNvjT"
-  password="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38";
-};
+KafkaServer {
+  org.apache.kafka.common.security.<ScramLoginModule|plain.PlainLoginModule> required
+  username="<KAFKA_USER>"
+  password="<KAFKA_PASSWORD>";
+};
+Client {
+  org.apache.zookeeper.server.auth.DigestLoginModule required
+  username="<ZK_USER>"
+  password="<ZK_PASSWORD>";
+};
```
`142-160`: **Fix TLS hostname verification, remove broker-only property, and align SASL mechanism with JAAS.** The client config disables hostname verification, sets a broker-side key, and PLAIN vs SCRAM are mismatched.
```diff
-bash-5.1$ cat <<EOF> client.cfg
+bash-5.1$ cat <<EOF> client.cfg
 security.protocol=sasl_ssl
-
-ssl.endpoint.identification.algorithm=
+ssl.endpoint.identification.algorithm=https
 ssl.keystore.location=/opt/kafka/config/certs/keystore.jks
-ssl.keystore.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
-ssl.key.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
+ssl.keystore.password=<KEYSTORE_PASSWORD>
+ssl.key.password=<KEY_PASSWORD>
 ssl.truststore.location=/opt/kafka/config/certs/truststore.jks
-ssl.truststore.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
-ssl.client.auth=required
+ssl.truststore.password=<TRUSTSTORE_PASSWORD>
 ssl.enabled.protocols=TLSv1.2
 ssl.cipher.suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
 ssl.keystore.type=JKS
 ssl.truststore.type=JKS
 ssl.secure.random.implementation=SHA1PRNG
-
-sasl.mechanism=PLAIN
-sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="VhwgyZj7" password="STTTzrX4YmkTeCgc5zLycZGXjMafpouU";
+
+# Choose one mechanism and keep JAAS consistent:
+# For SCRAM-SHA-512:
+sasl.mechanism=SCRAM-SHA-512
+sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<KAFKA_USER>" password="<KAFKA_PASSWORD>";
+# Or for PLAIN:
+# sasl.mechanism=PLAIN
+# sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<KAFKA_USER>" password="<KAFKA_PASSWORD>";
 EOF
```
`201-211`: **Normalize admonition and add throttling guidance.** Use a standard admonition and advise throttling to limit IO/traffic impact.
```diff
-!!!NOTE: The --broker-list parameter needs to be replaced with values extracted from each instance's /kafka0/meta.properties file
+!!! note
+The --broker-list parameter must be replaced with broker IDs from each instance's data dir (e.g., /kafka0/meta.properties).
+Tip: Use throttling during reassignment to reduce impact on production traffic.
```
`215-217`: **Throttle reassignment traffic.** Add --throttle to avoid saturating disks/network.
```diff
-bash-5.1$ /opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server cpaas-kafka:9092 --reassignment-json-file new-assign.json --execute --command-config ./client.cfg
+bash-5.1$ /opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server cpaas-kafka:9092 --reassignment-json-file new-assign.json --execute --throttle 10485760 --command-config ./client.cfg
```

If your Kafka version lacks --throttle, set leader.replication.throttled.rate and follower.replication.throttled.rate as dynamic broker configs.
🧹 Nitpick comments (2)
docs/en/solutions/Kafka_Scale_Switching.md (2)
`86-91`: **Use label selectors instead of grep for pod checks.** More robust and scriptable.
```diff
-kubectl get pods -n cpaas-system | grep kafka
+kubectl get pods -n cpaas-system -l service_name=cpaas-kafka
```
`107-111`: **Also review internal topics and ISR settings when adding brokers.** When scaling, verify replication factor and min.insync.replicas for internal topics (e.g., __consumer_offsets, __transaction_state) align with your new broker count. Plan reassignment if needed and adjust topic/broker configs accordingly.
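The arithmetic behind this check can be sketched in plain shell (the numbers below are illustrative placeholders, not values from the reviewed guide):

```shell
# Hypothetical sanity check: are the intended replication settings
# still satisfiable by the broker count after scaling?
BROKER_COUNT=5          # brokers after scaling (illustrative)
REPLICATION_FACTOR=3    # e.g., for __consumer_offsets
MIN_INSYNC_REPLICAS=2

if [ "$REPLICATION_FACTOR" -gt "$BROKER_COUNT" ]; then
  VERDICT="ERROR"       # not enough brokers to place all replicas
elif [ "$MIN_INSYNC_REPLICAS" -ge "$REPLICATION_FACTOR" ]; then
  VERDICT="WARNING"     # no headroom: one broker restart can block producers
else
  VERDICT="OK"
fi
echo "$VERDICT: rf=$REPLICATION_FACTOR min.isr=$MIN_INSYNC_REPLICAS brokers=$BROKER_COUNT"
```

With the illustrative values above, the check passes; shrinking the cluster below the replication factor would flip it to an error.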
```yaml
exporterPort: 9308
install: true
k8sNodes: # Add the scaled node names to the specified array.
  - 192.168.179.86
```
Do not use real IP addresses.
Actionable comments posted: 0
♻️ Duplicate comments (6)
docs/en/solutions/Kafka_Scale_Switching.md (6)
`29-33`: **Patch only the needed field; don’t hand-edit full objects with dynamic metadata.** Editing the entire ModuleInfo (with resourceVersion/uid/timestamps) is fragile. Provide a focused patch for spec.config.components.kafka.k8sNodes.
```diff
-kubectl edit moduleinfo <moduleinfo_name>
+kubectl patch moduleinfo <moduleinfo_name> --type merge -p \
+$'{"spec":{"config":{"components":{"kafka":{"k8sNodes":["<node-name-1>","<node-name-2>","<node-name-3>"]}}}}}'
```

Also replace the long YAML sample with the one-line patch to avoid copying stale metadata.
Also applies to: 35-82
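As a hedged sketch, the merge-patch payload can also be assembled from a node list in POSIX shell (node names are placeholders, and the `kubectl patch` usage at the end is the same invocation suggested above):

```shell
# Build the merge-patch JSON for spec.config.components.kafka.k8sNodes
# from a space-separated node list (placeholder names, not real nodes).
NODES='<node-name-1> <node-name-2> <node-name-3>'
PATCH='{"spec":{"config":{"components":{"kafka":{"k8sNodes":['
first=1
for n in $NODES; do
  [ "$first" -eq 0 ] && PATCH="$PATCH,"
  PATCH="$PATCH\"$n\""
  first=0
done
PATCH="$PATCH]}}}}}"
echo "$PATCH"
# Then: kubectl patch moduleinfo <moduleinfo_name> --type merge -p "$PATCH"
```

This keeps the node list in one variable so the same script works whether you are adding or removing nodes.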
`201-211`: **Fix admonition formatting and add throttling guidance.**

```diff
-!!!NOTE: The --broker-list parameter needs to be replaced with values extracted from each instance's /kafka0/meta.properties file
+!!! note
+The --broker-list parameter must use broker IDs from each instance’s data dir (e.g., /kafka0/meta.properties).
+Tip: Use reassignment throttling to reduce IO/traffic impact during migration.
```
`215-217`: **Throttle reassignment traffic; add --throttle.**

```diff
-/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server cpaas-kafka:9092 --reassignment-json-file new-assign.json --execute --command-config ./client.cfg
+/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server cpaas-kafka:9092 --reassignment-json-file new-assign.json --execute --throttle 10485760 --command-config ./client.cfg
```

If your Kafka version lacks --throttle, set broker dynamic configs leader.replication.throttled.rate and follower.replication.throttled.rate.
`93-105`: **Avoid deleting all ZooKeeper/Kafka pods; use rolling restarts.** Mass deletion increases downtime and risk. Use rollout restart and wait for each StatefulSet.

````diff
-Note: If Kafka pods are Running but the Kafka cluster is not functioning, manually restart both Kafka and ZooKeeper instances once after scaling completes.
+Note: If Kafka pods are Running but the Kafka cluster is not functioning, perform rolling restarts.
-```shell
-kubectl delete pods -n cpaas-system -l 'service_name in (cpaas-zookeeper, cpaas-kafka)'
-...
-```
+```shell
+# ZK: rolling
+kubectl rollout restart statefulset/cpaas-zookeeper -n cpaas-system
+kubectl rollout status statefulset/cpaas-zookeeper -n cpaas-system
+# Kafka: rolling
+kubectl rollout restart statefulset/cpaas-kafka -n cpaas-system
+kubectl rollout status statefulset/cpaas-kafka -n cpaas-system
+```
````
`124-137`: **Remove hardcoded credentials; add placeholders and a “no secrets” warning. Also fix TLS/JAAS/SASL inconsistencies.** Real-looking usernames/passwords are present; this will trigger secret scanners. Replace with placeholders, enable hostname verification, remove the broker-only client property, and align the SASL mechanism with JAAS.
```diff
 bash-5.1$ cat /opt/kafka/conf/kafka_server_jaas.conf
-
-KafkaServer {
-  org.apache.kafka.common.security.plain.PlainLoginModule required
-  username="dBodNvjT"
-  password="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38"
-  user_dBodNvjT="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38";
-};
-Client {
-  org.apache.zookeeper.server.auth.DigestLoginModule required
-  username="dBodNvjT"
-  password="oC6viKVmKtzYpOKH7f8DWOmC15wWxg38";
-};
+KafkaServer {
+  org.apache.kafka.common.security.scram.ScramLoginModule required
+  username="<KAFKA_USER>"
+  password="<KAFKA_PASSWORD>";
+};
+Client {
+  org.apache.zookeeper.server.auth.DigestLoginModule required
+  username="<ZK_USER>"
+  password="<ZK_PASSWORD>";
+};
@@
-bash-5.1$ cat <<EOF> client.cfg
+> IMPORTANT: Do not paste real secrets; source from a Kubernetes Secret when possible.
+> Example: kubectl get secret <kafka-client-secret> -n cpaas-system -o jsonpath='{.data.username}' | base64 -d
+bash-5.1$ cat <<EOF> client.cfg
 security.protocol=sasl_ssl
-
-ssl.endpoint.identification.algorithm=
+ssl.endpoint.identification.algorithm=https
 ssl.keystore.location=/opt/kafka/config/certs/keystore.jks
-ssl.keystore.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
-ssl.key.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
+ssl.keystore.password=<KEYSTORE_PASSWORD>
+ssl.key.password=<KEY_PASSWORD>
 ssl.truststore.location=/opt/kafka/config/certs/truststore.jks
-ssl.truststore.password=STTTzrX4YmkTeCgc5zLycZGXjMafpouU
-ssl.client.auth=required
+ssl.truststore.password=<TRUSTSTORE_PASSWORD>
 ssl.enabled.protocols=TLSv1.2
 ssl.cipher.suites=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
 ssl.keystore.type=JKS
 ssl.truststore.type=JKS
 ssl.secure.random.implementation=SHA1PRNG
-sasl.mechanism=PLAIN
-sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="VhwgyZj7" password="STTTzrX4YmkTeCgc5zLycZGXjMafpouU";
+sasl.mechanism=SCRAM-SHA-512
+sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<KAFKA_USER>" password="<KAFKA_PASSWORD>";
 EOF
```

Optional: show env-var driven heredoc to avoid multiple edits.
```shell
export KAFKA_USER=<...> KAFKA_PASSWORD=<...> KEYSTORE_PASSWORD=<...> KEY_PASSWORD=<...> TRUSTSTORE_PASSWORD=<...>
cat <<'EOF' > client.cfg
# use $KAFKA_USER etc. in the JAAS line
EOF
```

Also applies to: 142-160
`71-74`: **Do not use IPs here; use node names and sanitize examples.** The comment says “add node names” but the sample shows real-looking IPs. Replace with placeholders and hostnames, per prior guidance.
```diff
-  k8sNodes: # Add the scaled node names to the specified array.
-  - 1.1.1.1
-  - 2.2.2.2
-  - 3.3.3.3
+  k8sNodes: # Add the scaled Kubernetes node names.
+  - <node-name-1>
+  - <node-name-2>
+  - <node-name-3>
```
🧹 Nitpick comments (5)
docs/en/solutions/Kafka_Scale_Switching.md (5)
`25-27`: **Prefer label selector/JSONPath over grep pipeline.** Use server-side filtering to avoid brittle grep chains and false matches.

```diff
-kubectl get moduleinfo | grep logcenter | grep <cluster_name>
+kubectl get moduleinfo -l cpaas.io/module-name=logcenter \
+  -o jsonpath='{range .items[?(@.metadata.labels.cpaas\.io/cluster-name=="<cluster_name>")]}{.metadata.name}{"\n"}{end}'
```
`86-91`: **Update verification output to reflect scaled size.** If you scaled beyond 3 brokers, show the new pods (e.g., cpaas-kafka-3, cpaas-kafka-4) and/or use rollout status for clarity.

```diff
-kubectl get pods -n cpaas-system | grep kafka
+kubectl rollout status statefulset/cpaas-kafka -n cpaas-system
+kubectl get pods -n cpaas-system -l statefulset.kubernetes.io/pod-name -o name | sort | grep 'cpaas-kafka'
```
`169-197`: **Topic selection may be environment-specific; generate the list programmatically.** To avoid drift, derive the topic list with a pattern or allow an override.
Example:
```shell
/opt/kafka/bin/kafka-topics.sh --bootstrap-server cpaas-kafka:9092 --list --command-config ./client.cfg | \
  grep -E '^(ALAUDA_|cpaas-devops-)' > topics.txt
```

Then construct topic-generate.json from topics.txt in a loop.
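The loop mentioned above can be sketched as follows (the topic names are illustrative stand-ins; in practice topics.txt comes from the kafka-topics.sh listing, and the JSON shape matches the --topics-to-move-json-file format used by kafka-reassign-partitions.sh):

```shell
# Stub topics.txt for illustration; real runs generate it from the broker.
cat > topics.txt <<'EOT'
ALAUDA_LOG_TOPIC
cpaas-devops-pipeline
EOT

# Assemble topic-generate.json from the topic list.
{
  printf '{"topics": ['
  first=1
  while IFS= read -r t; do
    [ -z "$t" ] && continue
    [ "$first" -eq 0 ] && printf ', '
    printf '{"topic": "%s"}' "$t"
    first=0
  done < topics.txt
  printf '], "version": 1}\n'
} > topic-generate.json

cat topic-generate.json
```

Regenerating the file this way keeps the reassignment input in sync with whatever topics actually exist in the environment.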
`199-205`: **Automate broker ID discovery instead of hardcoding.** Derive broker IDs from meta.properties to avoid mistakes.

```shell
# From within a Kafka pod:
BROKERS=$(
  for d in /kafka*/meta.properties; do
    grep -m1 '^broker.id=' "$d" | cut -d= -f2
  done | paste -sd, -
)
echo "$BROKERS"
```

```diff
-/opt/kafka/bin/kafka-reassign-partitions.sh ... --broker-list "1100,1101,1102,1103,1104" --generate ...
+/opt/kafka/bin/kafka-reassign-partitions.sh ... --broker-list "$BROKERS" --generate ...
```

Also keep old-assign.json for rollback (already noted).
`12-16`: **Clarify supported versions explicitly.** State the exact ACP plugin versions this was tested on and link to the verification ticket for traceability.
Example: “Tested on ACP 4.0.6 and 4.1.2; verification: AIT-60471.”
* Create Kafka_Scale_Switching.md
* Update Kafka_Scale_Switching.md
* Update Kafka_Scale_Switching.md
* Update Kafka_Scale_Switching.md
Add a solution to switch Kafka scale.
Test Verification Link: https://jira.alauda.cn/browse/AIT-60471