Create ES_Scale_Switching.md #42

Merged
tyzhou111 merged 4 commits into alauda:main from adminwcs:ES-250828
Sep 5, 2025

Conversation

Contributor

@adminwcs adminwcs commented Aug 28, 2025

Add a solution to switch ES scale;
Test Verification Link: https://jira.alauda.cn/browse/AIT-60471

Summary by CodeRabbit

  • Documentation
    • Added a comprehensive guide for Elasticsearch scale switching in ACP (v4.0.x/4.1.x). Covers LocalVolume and StorageClass workflows, transition paths (single-node ↔ small-scale, small-scale ↔ large-scale), prerequisites and limitations (UI-based scaling not supported post-4.0), required manual edits, resource recommendations, YAML examples, step-by-step procedures, and verification steps via the Platform Management UI.

Contributor

coderabbitai Bot commented Aug 28, 2025

Walkthrough

Adds a new documentation page describing manual Elasticsearch scale switching in Alauda Container Platform (ACP) for versions 4.0.x and 4.1.x, covering LocalVolume and StorageClass, both single→small and small→large transitions, required ModuleInfo YAML edits, login to the global master node, and verification steps in the UI.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Documentation: Elasticsearch Scale Switching Guide**<br>`docs/en/solutions/Elasticsearch_Scale_Switching.md` | New guide documenting manual Elasticsearch scale switching in ACP (4.0.x, 4.1.x). Covers LocalVolume and StorageClass, two transition directions (single→small, small→large), required ModuleInfo YAML edits and example fields (`k8sNodes`, `masterK8sNodes`, `masterReplicas`, `nodeReplicas`, `resources`, `type`), notes that UI-based scale switching is unsupported post-4.0, login step to global master node, and verification via Platform Management > Marketplace > Clusters Plugins > ACP Log Collector. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

Suggested reviewers

  • tyzhou111

Poem

I hop through clusters, nose in the log,
Tweaking YAML under moonlit fog.
From single burrow to many-storey hive,
Replicas hum and nodes come alive.
Carrots of config — let the indices thrive. 🥕🐰


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/en/solutions/ES_Scale_Switching.md (1)

318-321: Tighten wording and add CLI verification

Fix grammar, and include quick checks for ModuleInfo and ES cluster health.

-After modification, check whether the changes have taken effect in the **Platform Management** **> Marketplace > Clusters Plugins > ACP Log Collector** section.
+After the change, verify in Platform Management > Marketplace > Cluster Plugins > ACP Log Collector.
+Additionally, verify via CLI:
+1) kubectl get moduleinfo -A | grep -E 'logcenter.*<cluster-name>'
+2) curl -s http://<es_svc>:9200/_cluster/health?pretty
+   # expect "status":"green" and node counts matching the new scale
🧹 Nitpick comments (5)
docs/en/solutions/ES_Scale_Switching.md (5)

8-8: Use a human-readable title

Consider “Elasticsearch scale switching” (H1) instead of “ES_Scale_Switching” for clarity and consistency with other docs.

-# ES_Scale_Switching
+# Elasticsearch scale switching

16-16: Version string formatting

Add a space after the comma, and clarify if patch versions outside these ranges are unsupported.

-Applicable Versions: 4.0.x,4.1.x
+Applicable versions: 4.0.x, 4.1.x

228-242: Define “minfo” on first use; ensure consistent terminology

“minfo” is shorthand for ModuleInfo. Define once (“minfo (ModuleInfo)”) and use consistently (“plugin UI” vs “plugin interface”).

-        # Change the scale type from `single` to `normal` (**must be modified in minfo**)
+        # Change the scale type from `single` to `normal` (**must be modified in ModuleInfo, a.k.a. "minfo"**)

305-313: Requests/limits for large-scale

Same recommendation as LocalVolume: consider matching requests to limits for predictable performance.

-          requests:
-            cpu: "1"
-            memory: 2Gi
+          requests:
+            cpu: "8"
+            memory: 16Gi

12-13: Add safety guardrails (backups, maintenance window, shard allocation)

Before scale switches, advise: take snapshots, schedule a maintenance window, temporarily disable shard allocation during node topology changes, and re-enable afterward.

 ## Background
@@
-Manual operations are required to modify the scale if needed.
+Manual operations are required to modify the scale if needed.
+
+Important precautions:
+- Take a recent Elasticsearch snapshot/backup.
+- Perform during a maintenance window; expect data rebalancing.
+- Optionally set: PUT _cluster/settings {"transient":{"cluster.routing.allocation.enable":"none"}} before disruptive changes; re-enable with "all" after nodes stabilize.

Also applies to: 20-22
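The shard-allocation toggle recommended above can be wrapped in a small helper. The `_cluster/settings` endpoint and the `cluster.routing.allocation.enable` setting are the standard Elasticsearch cluster-settings API; the host URL is a placeholder, and this sketch only builds the request rather than sending it:

```python
import json
from urllib import request

ES = "http://<es-host>:9200"  # placeholder; point at your ES service endpoint

def allocation_payload(mode: str) -> bytes:
    """mode is 'none' (pause shard movement) or 'all' (restore the default)."""
    assert mode in ("none", "all")
    body = {"transient": {"cluster.routing.allocation.enable": mode}}
    return json.dumps(body).encode()

def toggle_allocation(mode: str) -> request.Request:
    """Prepare a PUT to _cluster/settings with the given allocation mode."""
    return request.Request(
        f"{ES}/_cluster/settings",
        data=allocation_payload(mode),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Call `toggle_allocation("none")` before the topology change and `toggle_allocation("all")` once nodes stabilize, executing each request with `urllib.request.urlopen` (or an equivalent HTTP client) against a reachable cluster.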

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 13d4166 and 59f853b.

📒 Files selected for processing (1)
  • docs/en/solutions/ES_Scale_Switching.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/solutions/ES_Scale_Switching.md

[grammar] ~320-~320: There might be a mistake here.
Context: ...s Plugins > ACP Log Collector** section.

(QB_NEW_EN)

🔇 Additional comments (5)
docs/en/solutions/ES_Scale_Switching.md (5)

65-93: Small-scale example likely lacks HA; confirm nodeReplicas

You list three k8sNodes but set nodeReplicas: 1. If “small” is intended to be HA, set nodeReplicas ≥ 3 or explicitly state that 1 is acceptable and non-HA.

-        nodeReplicas: 1
+        nodeReplicas: 3   # if HA small-scale is required

169-242: StorageClass example: verify use of hostPath

For StorageClass-backed deployments, hostpath is typically unused. Confirm whether this field is ignored or must be removed to avoid confusion.

-        hostpath: /cpaas/data/elasticsearch
+        # hostpath is not used with StorageClass (verify); leave unset if not applicable
+        # hostpath: /cpaas/data/elasticsearch

292-304: Good: SC small→large shows consistent masters and data replicas

This section correctly sets masterReplicas: 3 and nodeReplicas: 3 for HA. Mirrors expectations for “big” scale.


80-82: Clarify storage size units

masterStorageSize: 5 and nodeStorageSize: 200 lack units. State whether these are Gi or another unit to avoid misconfiguration.

-        masterStorageSize: 5
+        masterStorageSize: 5   # Gi (confirm)
@@
-        nodeStorageSize: 200
+        nodeStorageSize: 200   # Gi (confirm)

Also applies to: 153-156, 230-231, 304-305


20-20: Clarify ambiguous control-plane terminology
Replace “global master node” with the exact product term (e.g. “Global Management cluster control-plane node”); if ModuleInfo is namespaced, include its namespace in subsequent commands.

Comment on lines +26 to +29
```shell
kubectl get moduleinfo -A | grep logcenter | grep <cluster-name> # Check the logcenter of the target cluster
kubectl edit moduleinfo <moduleinfo_name> # Edit the moduleinfo YAML and modify the following sections
```
Contributor


🛠️ Refactor suggestion

Prefer patch over full-object edit; avoid accidental metadata edits

kubectl edit exposes immutable metadata (uid, resourceVersion, etc.) and increases risk. Recommend strategic-merge or JSON patch targeting only the ES fields.

-kubectl edit moduleinfo <moduleinfo_name>                                            # Edit the moduleinfo YAML and modify the following sections
+# Safer alternative: patch only desired fields
+kubectl -n <namespace> patch moduleinfo <moduleinfo_name> --type merge -p '{
+  "spec": { "config": { "components": { "elasticsearch": { "type": "normal" }}}}
+}'
🤖 Prompt for AI Agents
In docs/en/solutions/ES_Scale_Switching.md around lines 26 to 29, the doc
recommends using "kubectl edit" which can unintentionally modify immutable
metadata; update the guidance to prefer using "kubectl patch" with a
strategic-merge or JSON patch that targets only the Elasticsearch-related fields
(namespace/name and the specific spec fields to change) and show a safe example
command pattern and notes to validate the patch with kubectl get/apply --dry-run
or kubectl rollout status; ensure the text warns against full-object edits and
instructs to patch only the ES-specific keys to avoid accidental metadata
changes.
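The patch-over-edit guidance above can be sketched as a small helper that assembles the merge-patch body and the `kubectl` invocation, validating with a server-side dry run first. The field path `spec.config.components.elasticsearch.type` comes from the examples in this PR; the namespace and object name are placeholders:

```python
import json

def es_scale_patch(scale_type: str) -> str:
    """Build a merge-patch body that touches only the ES scale field."""
    body = {"spec": {"config": {"components": {"elasticsearch": {"type": scale_type}}}}}
    return json.dumps(body)

def patch_command(namespace: str, name: str, scale_type: str) -> str:
    """Assemble the kubectl command; --dry-run=server validates before applying."""
    return (
        f"kubectl -n {namespace} patch moduleinfo {name} "
        f"--type merge --dry-run=server -p '{es_scale_patch(scale_type)}'"
    )

print(patch_command("<namespace>", "<moduleinfo-name>", "normal"))
```

Run the printed command once without `--dry-run=server` removed to validate, then drop the flag to apply; because the patch carries only the `spec` subtree, immutable metadata such as `uid` and `resourceVersion` is never touched.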

Comment thread docs/en/solutions/ES_Scale_Switching.md Outdated
Comment on lines +35 to +53
```yaml
  annotations:
    cpaas.io/display-name: logcenter
    cpaas.io/module-name: '{"en": "Alauda Container Platform Log Storage for Elasticsearch",
      "zh": "Alauda Container Platform Log Storage for Elasticsearch"}'
  creationTimestamp: "2025-08-27T07:00:36Z"
  finalizers:
  - moduleinfo
  generation: 3
  labels:
    cpaas.io/cluster-name: global
    cpaas.io/module-name: logcenter
    cpaas.io/module-type: plugin
    cpaas.io/product: Platform-Center
    create-by: cluster-transformer
    manage-delete-by: cluster-transformer
    manage-update-by: cluster-transformer
  name: global-e671599464a5b1717732c5ba36079795
  resourceVersion: "4202333"
  uid: 8c7161a3-0573-4375-9b3c-0badb40864d9
```
Contributor


⚠️ Potential issue

Do not copy/edit metadata from examples

creationTimestamp, uid, resourceVersion, and name are example values. Editing these breaks updates. Add a warning and replace with placeholders.

-  creationTimestamp: "2025-08-27T07:00:36Z"
+  # Do not modify auto-managed fields below; shown for context only
+  creationTimestamp: "<auto>"
 ...
-  resourceVersion: "4202333"
-  uid: 8c7161a3-0573-4375-9b3c-0badb40864d9
+  resourceVersion: "<auto>"
+  uid: "<auto>"

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In docs/en/solutions/ES_Scale_Switching.md around lines 35 to 53, the example
Kubernetes metadata fields (creationTimestamp, uid, resourceVersion, name) are
real/example values that must not be copied or edited; update the document to
add a clear WARNING near the top of the example stating "Do not copy/edit
example metadata" and replace those fields with neutral placeholders (e.g.
"<creationTimestamp>", "<uid>", "<resourceVersion>", "<name>") or remove them
entirely from the snippet so readers don’t paste concrete values into manifests;
ensure the warning explains that these fields are example-only and will break
updates if used.

Comment on lines +82 to +90
```yaml
        # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the small-scale default of 2c4G; optional in minfo; can be updated later via the plugin UI)
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: 200m
            memory: 256Mi
        tcpPort: 9300
```
Contributor


🛠️ Refactor suggestion

Align requests with limits for Elasticsearch stability

requests at 200m/256Mi can cause OOM and throttling. For critical stateful services, prefer Guaranteed QoS (requests == limits) or raise requests closer to limits.

-          requests:
-            cpu: 200m
-            memory: 256Mi
+          requests:
+            cpu: "2"
+            memory: 4Gi
🤖 Prompt for AI Agents
In docs/en/solutions/ES_Scale_Switching.md around lines 82 to 90, the pod
resource requests are far lower than limits (cpu 200m/memory 256Mi vs limits
2/4Gi) which risks OOMs and throttling; update the requests to match the limits
for Elasticsearch data nodes to achieve Guaranteed QoS (i.e., set requests cpu
to 2 and memory to 4Gi) or at minimum raise requests substantially closer to
limits and document the recommendation that stateful ES nodes should use
requests == limits for stability.
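The Guaranteed-QoS reasoning above can be illustrated with a simplified classifier. This is a sketch of the Kubernetes QoS rules (the real algorithm evaluates every container in the pod and requires both cpu and memory to be set), used here only to show why requests == limits matters for ES data nodes:

```python
def qos_class(requests: dict, limits: dict) -> str:
    """Simplified Kubernetes QoS classification for a single container."""
    if requests and limits and requests == limits:
        return "Guaranteed"   # never throttled below its request; last to be evicted
    if requests or limits:
        return "Burstable"    # can be throttled/evicted under node pressure
    return "BestEffort"       # first eviction candidate

small_scale = qos_class(
    {"cpu": "200m", "memory": "256Mi"},  # requests as documented
    {"cpu": "2", "memory": "4Gi"},       # limits as documented
)
aligned = qos_class(
    {"cpu": "2", "memory": "4Gi"},
    {"cpu": "2", "memory": "4Gi"},
)
print(small_scale, aligned)
```

With the documented values the pod lands in Burstable; raising requests to match limits moves it to Guaranteed, which is the stability argument the review comment is making.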

Comment on lines +136 to +167
```yaml
        # Update the spec.config.components.elasticsearch.k8sNodes field to add ES nodes and the spec.config.components.elasticsearch.masterK8sNodes field to add master nodes (optional in minfo; can be updated later via the plugin UI after scaling)
        k8sNodes:
        - 1.1.1.1
        - 2.2.2.2
        - 3.3.3.3
        masterK8sNodes:
        - 4.4.4.4
        - 5.5.5.5
        - 6.6.6.6
        masterReplicas: 0
        masterResources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: 200m
            memory: 256Mi
        masterStorageSize: 5
        nodeReplicas: 1
        nodeStorageSize: 200
        # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the large-scale default of 8c16G; optional in minfo; can be updated later via the plugin UI)
        resources:
          limits:
            cpu: "8"
            memory: 16Gi
          requests:
            cpu: "1"
            memory: 2Gi
        tcpPort: 9300
        # Change the scale type from `normal` to `big` (**must be modified in minfo**)
        type: big
```
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Inconsistent “large-scale” config: masterReplicas is 0 while masters are listed

Large scale with dedicated masters requires masterReplicas ≥ 3. Also consider increasing data node replicas for HA. Fix below.

         masterK8sNodes: 
           - 4.4.4.4
           - 5.5.5.5
           - 6.6.6.6 
-        masterReplicas: 0
+        masterReplicas: 3
 ...
-        nodeReplicas: 1
+        nodeReplicas: 3
🤖 Prompt for AI Agents
In docs/en/solutions/ES_Scale_Switching.md around lines 136 to 167, the
"large-scale" example is inconsistent: dedicated masters are listed but
masterReplicas is set to 0; update masterReplicas to at least 3 (preferably 3)
to match the presence of masterK8sNodes, ensure the number of masterK8sNodes
aligns with masterReplicas, and set nodeReplicas to a higher value (e.g., 3) for
data-node HA; keep masterResources and resources as-is unless increasing
capacity is required.

Comment on lines +156 to +164
```yaml
        # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the large-scale default of 8c16G; optional in minfo; can be updated later via the plugin UI)
        resources:
          limits:
            cpu: "8"
            memory: 16Gi
          requests:
            cpu: "1"
            memory: 2Gi
        tcpPort: 9300
```
Contributor


🛠️ Refactor suggestion

Match requests to “large-scale” limits

To avoid preemption and GC pressure, align requests with the documented large defaults.

-          requests:
-            cpu: "1"
-            memory: 2Gi
+          requests:
+            cpu: "8"
+            memory: 16Gi
🤖 Prompt for AI Agents
In docs/en/solutions/ES_Scale_Switching.md around lines 156 to 164, the requests
for ES data nodes are set too low relative to the "large-scale" limits; update
the resources.requests to match the documented large-scale limits (set cpu to
"8" and memory to 16Gi) so requests align with limits to avoid preemption and GC
pressure, and note that this change can be applied in minfo or updated later via
the plugin UI.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/en/solutions/Elasticsearch_Scale_Switching.md (1)

318-321: Fix navigation grammar and add concrete verification commands.

Tighten the UI path and include a kubectl check.

-After modification, check whether the changes have taken effect in the **Platform Management** **> Marketplace > Clusters Plugins > ACP Log Collector** section.
+After modification, verify in **Platform Management > Marketplace > Cluster Plugins > ACP Log Collector**.
+
+You can also confirm on the CR:
+```shell
+kubectl get moduleinfo <moduleinfo_name> -o jsonpath='{.spec.config.components.elasticsearch.type}{"\n"}'
+kubectl get moduleinfo <moduleinfo_name> -o jsonpath='{.status.phase}{"\n"}'
+```
🧹 Nitpick comments (5)
docs/en/solutions/Elasticsearch_Scale_Switching.md (5)

12-12: Clarify version scope precisely.

Say "Starting with 4.0.0" or ">= 4.0.0" instead of "After version 4.0" to avoid ambiguity.


251-316: Add scheduling safeguards (masters/data separation, PDBs, topology spread).

  • Enforce anti-affinity between masters and between data nodes.
  • Add PodDisruptionBudgets for masters and data.
  • Use topologySpreadConstraints/zone awareness to avoid skew.

If desired, I can provide example snippets keyed to your operator’s values.


31-93: Trim YAML to only the fields users must edit to avoid accidental metadata changes.

Long, full-object examples increase risk of editing immutable fields (uid, resourceVersion). Show only the spec subtree:

spec:
  config:
    components:
      elasticsearch:
        type: normal
        k8sNodes:
        - 192.0.2.10
        - 192.0.2.11
        - 192.0.2.12
        masterK8sNodes: []
        masterReplicas: 0
        nodeReplicas: 3
        resources:
          limits:
            cpu: "2"
            memory: 4Gi

157-163: Normalize resource quantity formatting.

Use consistent quoting and CPU units across examples (e.g., quote all cpu integers, keep millicores as strings), to reduce user errors.

Also applies to: 307-313, 84-89, 233-239


314-315: Standardize terminology: "ModuleInfo" vs "minfo/moduleinfo".

Use "ModuleInfo" consistently and define the abbreviation once if needed.

Also applies to: 228-232, 91-92, 165-166

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 59f853b and 1e8d5ef.

📒 Files selected for processing (1)
  • docs/en/solutions/Elasticsearch_Scale_Switching.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/solutions/Elasticsearch_Scale_Switching.md

[grammar] ~320-~320: There might be a mistake here.
Context: ...s Plugins > ACP Log Collector** section.

(QB_NEW_EN)

🔇 Additional comments (1)
docs/en/solutions/Elasticsearch_Scale_Switching.md (1)

178-242: StorageClass example: remove hostPath and align wording. Verify against schema.

For StorageClass-backed deployments, hostpath is typically irrelevant and may confuse users. Also standardize "ModuleInfo" wording.

         install: true
-        hostpath: /cpaas/data/elasticsearch
         httpPort: 9200
@@
-        # Change the scale type from `single` to `normal` (**must be modified in moduleinfo**)
+        # Change the scale type from `single` to `normal` (**must be modified in ModuleInfo**)

Please confirm with the CRD/schema whether hostpath is ignored for StorageClass; if so, keep it out of StorageClass examples.

@alauda alauda deleted a comment from coderabbitai Bot Aug 29, 2025
@alauda alauda deleted a comment from coderabbitai Bot Aug 29, 2025
@alauda alauda deleted a comment from coderabbitai Bot Aug 29, 2025

## Background

After version 4.0, the product prohibits switching between different scales and only supports node expansion within the same scale. Manual operations are required to modify the scale if needed.
Collaborator


Briefly introduce the different scales of Elasticsearch in the ACP product to provide readers with background information.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (1)
docs/en/solutions/Elasticsearch_Scale_Switching.md (1)

12-12: Add concrete definitions for each ES "scale" (map to type values and node counts).

Readers need a quick mapping (single → type: single, 1 node; small → type: normal, ≥3 data; large → type: big, ≥3 data + 3 masters). This also addresses prior feedback to briefly introduce the scales.

Apply after Line 12:

+Quick reference:
+- single-node: `type: single` (1 data node; no dedicated masters)
+- small-scale: `type: normal` (≥3 data nodes; no dedicated masters)
+- large-scale: `type: big` (≥3 data nodes + 3 dedicated master nodes)
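The tier mapping proposed above, together with the two-step transition rule discussed elsewhere in this review (single → small, then small → large), can be encoded in a few lines. The tier names and `type` values come from this PR's docs; the adjacency check is an illustrative sketch, not product code:

```python
# Scale tiers as described in the quick reference above (values from the PR docs)
SCALE_TIERS = {
    "single": {"type": "single", "min_data_nodes": 1, "dedicated_masters": 0},
    "small":  {"type": "normal", "min_data_nodes": 3, "dedicated_masters": 0},
    "large":  {"type": "big",    "min_data_nodes": 3, "dedicated_masters": 3},
}

def valid_transition(src: str, dst: str) -> bool:
    """Only adjacent tier switches are supported (single<->small, small<->large)."""
    order = ["single", "small", "large"]
    return abs(order.index(src) - order.index(dst)) == 1
```

Under this model, `valid_transition("single", "large")` is false, which is exactly why the docs must spell out the two sequential operations.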
🧹 Nitpick comments (7)
docs/en/solutions/Elasticsearch_Scale_Switching.md (7)

228-242: Clarify size units and validation for storage fields.

masterStorageSize and nodeStorageSize lack units. If the operator expects quantities (e.g., Gi), document it to avoid misconfiguration.

Apply a brief note below the fields:

         nodeStorageSize: 200
+        # NOTE: Use the units required by the operator (e.g., 200Gi). Verify with your platform version.

If integers are indeed required, state that explicitly instead.


169-242: StorageClass example: consider documenting storageClassName if applicable.

If the operator supports/needs a StorageClass selector, add storageClassName to the example and a note to choose an appropriate class.

Apply in the YAML block if supported:

         install: true
+        # Optional: set the StorageClass for ES PVCs
+        storageClassName: <your-storage-class>

318-321: Polish the verification step and add concrete checks.

Minor grammar/label fix and add quick API checks for scale and master count.

Apply to Line 320 and append validation:

-After modification, check whether the changes have taken effect in the **Platform Management** **> Marketplace > Clusters Plugins > ACP Log Collector** section.
+After modification, verify in Platform Management > Marketplace > Cluster Plugins > ACP Log Collector.
+
+Additional checks:
+```shell
+# Expect green and the correct number of nodes
+kubectl -n <es-namespace> port-forward svc/<es-svc> 9200:9200 &
+curl -s localhost:9200/_cluster/health | jq '.status,.number_of_nodes,.number_of_data_nodes'
+curl -s localhost:9200/_cat/master
+```

16-16: Scope the version statement.

Explicitly state “tested on 4.0.x/4.1.x” to avoid implying support beyond that.

Apply change:

-Applicable Versions: 4.0.x, 4.1.x
+Applicable versions: tested on 4.0.x and 4.1.x.

98-100: Post-edit reconcile note.

After kubectl edit moduleinfo, mention watching the reconcile until ready.

Apply beneath the command block:

+kubectl get moduleinfo <moduleinfo_name> -w                                          # Watch until status/conditions report Ready/Running

Also applies to: 247-249


24-29: Parameterize placeholders consistently.

Use a consistent placeholder style and call it out once (e.g., `<cluster-name>`, `<moduleinfo-name>`).

Apply:

-kubectl edit moduleinfo <moduleinfo_name>
+kubectl edit moduleinfo <moduleinfo-name>

And add a short note after the command block:

+Note: Replace <cluster-name>, <moduleinfo-name>, <es-namespace>, <es-svc>, and IP addresses with your actual values.

Also applies to: 171-176


24-93: Optional: add a rollback note.

Briefly document how to revert if the reconcile fails (restore previous ModuleInfo or from snapshot).

Add at the end of each section:

+Rollback:
+- Revert the edited fields in ModuleInfo to the previous values and wait for reconcile, or
+- Restore from the pre-change snapshot/PV snapshot.

Also applies to: 169-316

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 1e8d5ef and a1507ec.

📒 Files selected for processing (1)
  • docs/en/solutions/Elasticsearch_Scale_Switching.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/solutions/Elasticsearch_Scale_Switching.md

[grammar] ~320-~320: There might be a mistake here.
Context: ...s Plugins > ACP Log Collector** section.

(QB_NEW_EN)

@alauda alauda deleted a comment from coderabbitai Bot Sep 5, 2025
@coderabbitai coderabbitai Bot left a comment
Actionable comments posted: 5

♻️ Duplicate comments (1)
docs/en/solutions/Elasticsearch_Scale_Switching.md (1)

10-19: Background added — good coverage of scale tiers

The overview of single/small/large tiers addresses earlier feedback and sets context well.

🧹 Nitpick comments (4)
docs/en/solutions/Elasticsearch_Scale_Switching.md (4)

28-29: State the required two-step path explicitly

Spell out that single → large must be performed as two sequential operations (single → small, then small → large) to avoid confusion.

```diff
-Log in to the **global master node**. Direct switching from a single-node to a large-scale configuration is technically unsupported.
+Log in to the **global master node**.
+Note: Direct switching from a single-node to a large-scale configuration is not supported. Perform the change in two steps:
+1) single → small
+2) small → large
```

73-73: Use consistent term “ModuleInfo” (avoid “minfo/moduleinfo” variants in prose)

Consistency improves clarity and searchability in docs.

```diff
-        # Update ... (optional in minfo; can be updated later via the plugin UI after scaling)
+        # Update ... (optional in ModuleInfo; can be updated later via the plugin UI after scaling)
@@
-        # Adjust ... (recommended ...; optional in minfo; can be updated later via the plugin UI)
+        # Adjust ... (recommended ...; optional in ModuleInfo; can be updated later via the plugin UI)
@@
-        # Change the scale type from `single` to `normal` (**must be modified in minfo**)
+        # Change the scale type from `single` to `normal` (**must be modified in ModuleInfo**)
@@
-        # Update ... (optional in minfo; can be updated later via the plugin UI after scaling)
+        # Update ... (optional in ModuleInfo; can be updated later via the plugin UI after scaling)
@@
-        # Adjust ... (recommended ...; optional in minfo; can be updated later via the plugin UI)
+        # Adjust ... (recommended ...; optional in ModuleInfo; can be updated later via the plugin UI)
@@
-        # Change the scale type from `normal` to `big` (**must be modified in minfo**)
+        # Change the scale type from `normal` to `big` (**must be modified in ModuleInfo**)
@@
-        # Change the scale type from `single` to `normal` (**must be modified in moduleinfo**)
+        # Change the scale type from `single` to `normal` (**must be modified in ModuleInfo**)
@@
-        # Change the scale type from `normal` to `big` (**must be modified in minfo**)
+        # Change the scale type from `normal` to `big` (**must be modified in ModuleInfo**)
```

Also applies to: 90-91, 99-101, 144-145, 165-166, 173-174, 249-250, 322-323


35-37: Add a quick backup step before editing ModuleInfo

Safer workflow; helps rollback if needed.

```diff
 kubectl get moduleinfo -A | grep logcenter | grep <cluster-name>                     # Check the logcenter of the target cluster
+kubectl get moduleinfo <moduleinfo_name> -o yaml > backup-<moduleinfo_name>.yaml     # Backup current spec (cluster-scoped)
 kubectl edit moduleinfo <moduleinfo_name>                                            # Edit the moduleinfo YAML and modify the following sections
```

Also applies to: 106-108, 182-184, 255-257


326-329: Tighten verification section; add concrete health checks and fix grammar

Also adjust “Clusters Plugins” → “Cluster Plugins” (singular).

```diff
-After modification, check whether the changes have taken effect in the **Platform Management** **> Marketplace > Clusters Plugins > ACP Log Collector** section.
+After modification:
+- In the UI, confirm changes under **Platform Management > Marketplace > Cluster Plugins > ACP Log Collector**.
+- Check pods are Ready and match the new replica counts.
+- Verify ES health is green:
+  kubectl -n <ns> exec -it <es-pod> -- curl -s http://localhost:9200/_cluster/health?pretty
```
📜 Review details



📥 Commits

Reviewing files that changed from the base of the PR and between a1507ec and e07c6fc.

📒 Files selected for processing (1)
  • docs/en/solutions/Elasticsearch_Scale_Switching.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/en/solutions/Elasticsearch_Scale_Switching.md

[grammar] ~328-~328: There might be a mistake here.
Context: ...s Plugins > ACP Log Collector** section.

(QB_NEW_EN)

Comment on lines +26 to +30
```md
## Procedure

Log in to the **global master node**. Direct switching from a single-node to a large-scale configuration is technically unsupported.

## Modification Steps for LocalVolume Storage Type
```
🛠️ Refactor suggestion

Call out operational impact and prerequisites

Readers need to plan maintenance windows and validate capacity before switching scales.

```diff
 ## Procedure
@@
-Log in to the **global master node**.
+Log in to the **global master node**.
+
+Warning:
+- Expect temporary ingestion disruption and ES shard reallocation during scale changes.
+- Ensure target nodes meet CPU/memory/disk requirements for the desired tier.
+- Verify sufficient PV capacity and network connectivity between data and master nodes.
+- Perform the operation during a maintenance window and monitor cluster health.
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
```diff
 ## Procedure

-Log in to the **global master node**. Direct switching from a single-node to a large-scale configuration is technically unsupported.
+Log in to the **global master node**.
+
+Warning:
+- Expect temporary ingestion disruption and ES shard reallocation during scale changes.
+- Ensure target nodes meet CPU/memory/disk requirements for the desired tier.
+- Verify sufficient PV capacity and network connectivity between data and master nodes.
+- Perform the operation during a maintenance window and monitor cluster health.
+
+Direct switching from a single-node to a large-scale configuration is technically unsupported.

 ## Modification Steps for LocalVolume Storage Type
```
🤖 Prompt for AI Agents
In docs/en/solutions/Elasticsearch_Scale_Switching.md around lines 26 to 30, the
procedure section lacks a clear callout about operational impact and
prerequisites; update this section to add a concise warning and checklist that
instructs operators to schedule a maintenance window, communicate
downtime/impact, validate cluster and node capacity (CPU, memory, disk, IOPS),
ensure backups/snapshots are current, confirm storage and networking capacity
for the target scale, and verify rollback steps before attempting any scale
switch.

```yaml
  manage-update-by: cluster-transformer
  name: global-e671599464a5b1717732c5ba36079795
  resourceVersion: "4202333"
  uid: <Standed UUID>
```
⚠️ Potential issue

Fix typo “Standed UUID”

Use “Standard UUID” or simply “`<UUID>`”.

```diff
-  uid: <Standed UUID>
+  uid: <UUID>
```

Also applies to: 132-132, 214-214, 287-287

🤖 Prompt for AI Agents
In docs/en/solutions/Elasticsearch_Scale_Switching.md around lines 61, 132, 214,
and 287, the placeholder string "Standed UUID" is a typo; replace it with
"Standard UUID" or simply "<UUID>" consistently at those locations. Update each
occurrence to use the corrected text, ensure casing and punctuation match
surrounding content, and run a quick search to confirm no other instances
remain.
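The "run a quick search to confirm no other instances remain" step can be automated. A hedged sketch — the docs path and filename are stand-ins, and the sweep is demonstrated on a throwaway temp file rather than the real tree:

```python
import tempfile
from pathlib import Path

def fix_placeholder(text: str) -> str:
    # Normalize the typo'd placeholder to plain <UUID>.
    return text.replace("<Standed UUID>", "<UUID>")

# Demonstrate on a temp file standing in for the doc page.
with tempfile.TemporaryDirectory() as d:
    page = Path(d) / "Elasticsearch_Scale_Switching.md"
    page.write_text("uid: <Standed UUID>\n")
    page.write_text(fix_placeholder(page.read_text()))
    cleaned = page.read_text()

print(cleaned.strip())  # → uid: <UUID>
```

Against the real repo this would be a `Path("docs").rglob("*.md")` loop applying the same replacement.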

Comment on lines +88 to +90
```yaml
        nodeReplicas: 1
        nodeStorageSize: 200
        # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the small-scale default of 2c4G; optional in minfo; can be updated later via the plugin UI)
```
⚠️ Potential issue

Small-scale example shows nodeReplicas: 1 (conflicts with “min 3 data nodes”)

For small-scale, data nodes must be ≥3. The sample should reflect that to prevent under-provisioning.

```diff
-        nodeReplicas: 1
+        nodeReplicas: 3
```
📝 Committable suggestion


Suggested change
```diff
-        nodeReplicas: 1
+        nodeReplicas: 3
         nodeStorageSize: 200
         # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the small-scale default of 2c4G; optional in minfo; can be updated later via the plugin UI)
```
🤖 Prompt for AI Agents
In docs/en/solutions/Elasticsearch_Scale_Switching.md around lines 88 to 90, the
small-scale example shows nodeReplicas: 1 which conflicts with the stated
minimum of 3 data nodes; update the example to use nodeReplicas: 3 (and keep
nodeStorageSize: 200) and adjust the inline comment to state that small-scale
requires at least 3 data nodes (e.g., "data nodes must be ≥3; set nodeReplicas:
3 for small-scale") so the sample matches the documented minimum and avoids
under-provisioning.
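The tier minimums this review keeps restating can be encoded once as a lint check. A sketch under stated assumptions: the tier names `single`/`normal`/`big` come from the doc's ModuleInfo examples, and the minimums (small scale ≥3 data nodes, large scale additionally ≥3 dedicated masters) come from this review — actual platform-side enforcement is not confirmed here:

```python
# Minimum replica counts per scale tier, as stated in the review.
MINIMUMS = {
    "single": {"nodeReplicas": 1, "masterReplicas": 0},
    "normal": {"nodeReplicas": 3, "masterReplicas": 0},
    "big":    {"nodeReplicas": 3, "masterReplicas": 3},
}

def lint_scale(cfg: dict) -> list:
    """Return a list of violations for a ModuleInfo-style elasticsearch config."""
    mins = MINIMUMS.get(cfg.get("type", ""))
    if mins is None:
        return [f"unknown scale type: {cfg.get('type')!r}"]
    errors = []
    for field, minimum in mins.items():
        if cfg.get(field, 0) < minimum:
            errors.append(f"{field}={cfg.get(field, 0)} below the {cfg['type']} minimum of {minimum}")
    return errors

# The doc's small-scale example as written (nodeReplicas: 1) fails the check:
print(lint_scale({"type": "normal", "nodeReplicas": 1, "masterReplicas": 0}))
# The corrected example passes:
print(lint_scale({"type": "normal", "nodeReplicas": 3, "masterReplicas": 0}))  # → []
```

A doc author could run the YAML examples through this before publishing to keep samples and stated minimums in sync.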

Comment on lines +149 to +164
```yaml
        masterK8sNodes:
        - 4.4.4.4
        - 5.5.5.5
        - 6.6.6.6
        masterReplicas: 0
        masterResources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: 200m
            memory: 256Mi
        masterStorageSize: 5
        nodeReplicas: 1
        nodeStorageSize: 200
        # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the large-scale default of 8c16G; optional in minfo; can be updated later via the plugin UI)
```
⚠️ Potential issue

Large-scale example sets masterReplicas: 0 and nodeReplicas: 1

Large-scale requires ≥3 dedicated masters and ≥3 data nodes. The example should match the requirement.

```diff
-        masterReplicas: 0
+        masterReplicas: 3
@@
-        nodeReplicas: 1
+        nodeReplicas: 3
```
📝 Committable suggestion


Suggested change
```diff
         masterK8sNodes:
         - 4.4.4.4
         - 5.5.5.5
         - 6.6.6.6
-        masterReplicas: 0
+        masterReplicas: 3
         masterResources:
           limits:
             cpu: "2"
             memory: 4Gi
           requests:
             cpu: 200m
             memory: 256Mi
         masterStorageSize: 5
-        nodeReplicas: 1
+        nodeReplicas: 3
         nodeStorageSize: 200
         # Adjust the spec.config.components.elasticsearch.resources.limits field to modify resource limits for ES data nodes (recommended to set at least the large-scale default of 8c16G; optional in minfo; can be updated later via the plugin UI)
```
🤖 Prompt for AI Agents
In docs/en/solutions/Elasticsearch_Scale_Switching.md around lines 149 to 164,
the large-scale YAML example incorrectly sets masterReplicas: 0 and
nodeReplicas: 1 while the text requires ≥3 dedicated masters and ≥3 data nodes;
update the example to match the requirement by setting masterReplicas: 3 (and
ensure masterK8sNodes lists at least three host entries) and nodeReplicas: 3 (or
higher as appropriate), and adjust any related comments or resource sizes to
reflect large-scale defaults.

Comment on lines +220 to +224
```yaml
        address: ""
        basicAuthSecretName: ""
        hostpath: /cpaas/data/elasticsearch
        httpPort: 9200
        install: true
```
🛠️ Refactor suggestion

StorageClass examples should not use hostPath; include storageClassName instead

Using hostPath with StorageClass is misleading and can cause misconfiguration. Replace with storageClassName.

```diff
-        hostpath: /cpaas/data/elasticsearch
+        storageClassName: <your-storage-class>
```

Also applies to: 295-297

🤖 Prompt for AI Agents
In docs/en/solutions/Elasticsearch_Scale_Switching.md around lines 220 to 224
(and also apply the same change at lines 295-297), the example uses hostpath:
/cpaas/data/elasticsearch which is misleading for StorageClass-based volumes;
remove the hostpath entry and replace it with a storageClassName:
<your-storage-class> line (and optionally a persistentVolumeClaim or size entry
if required by the deployment), ensure the example uses the correct
storageClassName value and remove any hostPath references so the manifest
reflects proper dynamic provisioning.
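The hostPath-vs-StorageClass mix-up this comment flags is also easy to lint for. A hedged sketch over a ModuleInfo-style elasticsearch config fragment — the field names `hostpath` and `storageClassName` mirror the snippets in this doc, and `sc-topolvm` is a hypothetical StorageClass name:

```python
def lint_storage(es_cfg: dict) -> list:
    """Flag contradictory or missing storage settings in an elasticsearch config fragment."""
    errors = []
    if es_cfg.get("hostpath") and es_cfg.get("storageClassName"):
        errors.append("both hostpath and storageClassName set; pick one storage mode")
    if not es_cfg.get("hostpath") and not es_cfg.get("storageClassName"):
        errors.append("no storage configured: set hostpath (LocalVolume) or storageClassName")
    return errors

mixed = {"hostpath": "/cpaas/data/elasticsearch", "storageClassName": "sc-topolvm"}
sc_only = {"storageClassName": "sc-topolvm"}
print(lint_storage(mixed))    # one error: both modes set
print(lint_storage(sc_only))  # → []
```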

@tyzhou111 tyzhou111 merged commit 40add40 into alauda:main Sep 5, 2025
2 checks passed
changluyi pushed a commit to changluyi/knowledge that referenced this pull request Apr 23, 2026
* Create ES_Scale_Switching.md

* Update and rename ES_Scale_Switching.md to Elasticsearch_Scale_Switching.md

* Update Elasticsearch_Scale_Switching.md

* Update Elasticsearch_Scale_Switching.md