[rabbitmq-cluster-operator] split brain after installation of 3 node cluster #26344

Closed
simakji opened this issue May 22, 2024 · 3 comments
Labels: rabbitmq-cluster-operator, solved, stale (15 days without activity), tech-issues (the user has a technical issue about an application)

simakji commented May 22, 2024

Name and Version

bitnami/rabbitmq-cluster-operator 4.2.10

What architecture are you using?

amd64

What steps will reproduce the bug?

I am trying to install a 3-node RabbitMQ cluster via rabbitmq-cluster-operator 4.2.10 on a Kubernetes cluster in Azure:

  1. Install operator: helm upgrade -i rabbitmq-operator rabbitmq-cluster-operator -n "rmq" --version "4.2.10" --repo https://charts.bitnami.com/bitnami --wait
  2. Install cluster: kubectl apply -f k8s/definition.yaml
apiVersion: rabbitmq.com/v1beta1
kind: RabbitmqCluster
metadata:
  name: rabbitmqcluster-sample
  namespace: rmq
spec:
  replicas: 3
  persistence:
    storageClassName: azurefile
    storage: 20Gi

What do you see instead?

I expected a single 3-node RabbitMQ cluster to form, but instead I got two separate clusters: one containing only node 0 and another containing nodes 1 and 2. I repeated the installation several times with the same result. With debug logging enabled, the cluster formed correctly. With chart version 3.20.1, the cluster also formed correctly (without debug logging).

Additional information

rmq-0 cluster status:

$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbitmqcluster-sample-server-0.rabbitmqcluster-sample-nodes.rmq ...
Basics

Cluster name: rabbitmqcluster-sample
Total CPU cores available cluster-wide: 4

Disk Nodes

rabbit@rabbitmqcluster-sample-server-0.rabbitmqcluster-sample-nodes.rmq

Running Nodes

rabbit@rabbitmqcluster-sample-server-0.rabbitmqcluster-sample-nodes.rmq

rmq-1 cluster status:

$ rabbitmqctl cluster_status
Cluster status of node rabbit@rabbitmqcluster-sample-server-1.rabbitmqcluster-sample-nodes.rmq ...
Basics

Cluster name: rabbitmqcluster-sample
Total CPU cores available cluster-wide: 8

Disk Nodes

rabbit@rabbitmqcluster-sample-server-1.rabbitmqcluster-sample-nodes.rmq
rabbit@rabbitmqcluster-sample-server-2.rabbitmqcluster-sample-nodes.rmq

Running Nodes

rabbit@rabbitmqcluster-sample-server-1.rabbitmqcluster-sample-nodes.rmq
rabbit@rabbitmqcluster-sample-server-2.rabbitmqcluster-sample-nodes.rmq
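The two outputs above disagree on which nodes are running, which is how the split shows up. A minimal sketch for checking this across all pods in one go, assuming `kubectl` access to the `rmq` namespace and pods named after the node names shown above (`rabbitmqcluster-sample-server-0` to `-2`). The helper names `running_nodes` and `check_membership` are made up for this illustration and are not part of the operator or of `rabbitmqctl`:

```shell
# Sketch only: compare the "Running Nodes" section reported by each pod.
# Assumed (not confirmed by the issue): kubectl access to namespace "rmq",
# pods named rabbitmqcluster-sample-server-0..2.

# Read `rabbitmqctl cluster_status` text on stdin and print the node names
# listed under "Running Nodes", sorted, one per line.
running_nodes() {
  awk '
    /Running Nodes/          { in_section = 1; next }  # section heading
    in_section && /^rabbit@/ { print; next }           # a node in the section
    in_section && NF         { in_section = 0 }        # next heading ends it
  ' | sort
}

# Ask every pod for its view and fail if any pod disagrees with server-0.
check_membership() {
  local ns="rmq" cluster="rabbitmqcluster-sample" first="" current i
  for i in 0 1 2; do
    current="$(kubectl exec -n "$ns" "${cluster}-server-${i}" -- \
      rabbitmqctl cluster_status | running_nodes)"
    echo "server-${i} sees: $(echo "$current" | tr '\n' ' ')"
    if [ "$i" -eq 0 ]; then
      first="$current"
    elif [ "$current" != "$first" ]; then
      echo "membership differs between server-0 and server-${i}" >&2
      return 1
    fi
  done
  echo "all nodes agree on cluster membership"
}
```

Sourcing the file and calling `check_membership` would return non-zero for the state reported here, since server-0 lists only itself while server-1 and server-2 list each other.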

simakji added the tech-issues label on May 22, 2024
github-actions bot added the triage label on May 22, 2024
github-actions bot removed the triage label on May 23, 2024
github-actions bot assigned migruiz4 and unassigned carrodher on May 23, 2024

migruiz4 (Member) commented

Hi @simakji,

I'm sorry, but I haven't been able to reproduce this issue. In several fresh installations of RabbitMQ with 3 nodes, the cluster was never split:

Disk Nodes

rabbit@rabbitmqcluster-sample-server-0.rabbitmqcluster-sample-nodes.default
rabbit@rabbitmqcluster-sample-server-1.rabbitmqcluster-sample-nodes.default
rabbit@rabbitmqcluster-sample-server-2.rabbitmqcluster-sample-nodes.default

Could you please provide more details about your case? Maybe there were some residual PVCs from previous deployments? Did you find anything in the container logs?
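
The two checks asked for here could be gathered roughly as follows. This is a sketch under assumptions: the namespace `rmq` and cluster name `rabbitmqcluster-sample` come from the report, the PVC names are assumed to contain `<cluster>-server-`, and the function names are hypothetical helpers, not operator tooling:

```shell
# Sketch only: look for leftover volumes and scan the startup logs.
# Assumed: namespace "rmq", cluster "rabbitmqcluster-sample", and PVC names
# that contain "<cluster>-server-".

# Keep only the stdin lines that mention a PVC of the given cluster.
pvcs_for_cluster() {
  grep -F -- "$1-server-" || true
}

# List the cluster's PVCs with their creation timestamps. A timestamp older
# than the current install suggests a volume from a previous deployment.
list_cluster_pvcs() {
  kubectl get pvc -n rmq --no-headers \
    -o custom-columns=NAME:.metadata.name,CREATED:.metadata.creationTimestamp \
    | pvcs_for_cluster rabbitmqcluster-sample
}

# Print the log lines of pod N that mention peer discovery or clustering.
clustering_logs() {
  kubectl logs -n rmq "rabbitmqcluster-sample-server-$1" \
    | grep -iE 'peer discovery|cluster' || true
}
```

For example, `list_cluster_pvcs` followed by `clustering_logs 0` would show whether node 0 started on an older volume and what it logged while trying to find its peers.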

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions bot added the stale label on Jun 15, 2024

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

bitnami-bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 21, 2024