ceph: Allow an even number of mons #8636

Merged: 2 commits on Sep 3, 2021
7 changes: 6 additions & 1 deletion Documentation/ceph-cluster-crd.md
@@ -247,7 +247,12 @@ A specific will contain a specific release of Ceph as well as security fixes fro

### Mon Settings

* `count`: Set the number of mons to be started. The number must be odd and between `1` and `9`. If not specified the default is set to `3`.
* `count`: Set the number of mons to be started. The number must be between `1` and `9`. The recommended value is most commonly `3`.
For highest availability, an odd number of mons should be specified.
For higher durability in case of mon loss, an even number can be specified, although availability may be lower.
To maintain quorum a majority of mons must be up. For example, if there are three mons, two must be up.
If there are four mons, three must be up. If there are two mons, both must be up.
If quorum is lost, see the [disaster recovery guide](ceph-disaster-recovery.md#restoring-mon-quorum) to restore quorum from a single mon.
* `allowMultiplePerNode`: Whether to allow the placement of multiple mons on a single node. Default is `false` for production. Should only be set to `true` in test environments.
* `volumeClaimTemplate`: A `PersistentVolumeSpec` used by Rook to create PVCs
for monitor storage. This field is optional, and when not provided, HostPath
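
The quorum arithmetic spelled out in the updated `count` documentation above reduces to a simple majority rule. A minimal Go sketch of that rule, using an illustrative helper `quorumSize` that is not part of this PR or the Rook codebase:

```go
package main

import "fmt"

// quorumSize returns how many mons must be up to keep quorum:
// a strict majority of the configured mon count.
func quorumSize(monCount int) int {
	return monCount/2 + 1
}

func main() {
	// Matches the documentation: 3 mons -> 2 must be up,
	// 4 mons -> 3 must be up, 2 mons -> both must be up.
	for _, count := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("%d mon(s): %d must be up for quorum\n", count, quorumSize(count))
	}
}
```

An even count adds another copy of the mon data (durability) without raising the number of mons that may be down, which is why the documentation still recommends an odd count for availability.
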
4 changes: 2 additions & 2 deletions Documentation/ceph-mon-health.md
@@ -34,9 +34,9 @@ quorum and perform operations in the cluster. If the majority of mons are not ru
Most commonly a cluster will have three mons. This would mean that one mon could go down and allow the cluster to remain healthy.
You would still have 2/3 mons running to give you consensus in the cluster for any operation.

You will always want an odd number of mons. Fifty percent of mons will not be sufficient to maintain quorum. If you had two mons and one
For highest availability, an odd number of mons is required. Fifty percent of mons will not be sufficient to maintain quorum. If you had two mons and one
of them went down, you would have only 1/2 of the mons available. Since that is not a majority, the cluster would have to wait until the second mon is up again.
Therefore, Rook prohibits an even number of mons.
Rook allows an even number of mons for higher durability. If quorum is lost, see the [disaster recovery guide](ceph-disaster-recovery.md#restoring-mon-quorum) to restore quorum from a single mon.

The number of mons to create in a cluster depends on your tolerance for losing a node. If you have 1 mon, zero nodes can be lost
to maintain quorum. With 3 mons one node can be lost, and with 5 mons two nodes can be lost. Because the Rook operator will automatically
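
The node-loss tolerance described at the end of this hunk follows from the same majority rule. A short illustrative sketch, again with a hypothetical helper (`toleratedFailures`) that does not exist in the Rook codebase:

```go
package main

import "fmt"

// toleratedFailures returns how many mons can be down while the
// remaining mons still form a strict majority (quorum is preserved).
func toleratedFailures(monCount int) int {
	return (monCount - 1) / 2
}

func main() {
	// Matches the documentation: 1 mon tolerates zero losses,
	// 3 mons tolerate one, 5 mons tolerate two. Even counts (2, 4)
	// tolerate no more losses than the next lower odd count.
	for _, count := range []int{1, 2, 3, 4, 5} {
		fmt.Printf("%d mon(s): up to %d may be lost without losing quorum\n",
			count, toleratedFailures(count))
	}
}
```
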
7 changes: 4 additions & 3 deletions cluster/charts/rook-ceph-cluster/values.yaml
@@ -70,7 +70,8 @@ cephClusterSpec:
waitTimeoutForHealthyOSDInMinutes: 10

mon:
# Set the number of mons to be started. Must be an odd number, and is generally recommended to be 3.
# Set the number of mons to be started. Generally recommended to be 3.
# For highest availability, an odd number of mons should be specified.
count: 3
# The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
# Mons should only be allowed on the same node for test environments where data loss is acceptable.
@@ -328,14 +329,14 @@ cephBlockPools:
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# mapOptions: lock_on_read,queue_depth=1024

# (optional) unmapOptions is a comma-separated list of unmap options.
# For krbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd/#kernel-rbd-krbd-options
# For nbd options refer
# https://docs.ceph.com/docs/master/man/8/rbd-nbd/#options
# unmapOptions: force

# RBD image format. Defaults to "2".
imageFormat: "2"
# RBD image features. Available for imageFormat: "2". CSI RBD currently supports only `layering` feature.
1 change: 1 addition & 0 deletions cluster/charts/rook-ceph/templates/resources.yaml
@@ -812,6 +812,7 @@ spec:
type: boolean
count:
description: Count is the number of Ceph monitors
maximum: 9
minimum: 0
type: integer
stretchCluster:
3 changes: 2 additions & 1 deletion cluster/examples/kubernetes/ceph/cluster-on-pvc.yaml
@@ -14,7 +14,8 @@ metadata:
spec:
dataDirHostPath: /var/lib/rook
mon:
# Set the number of mons to be started. Must be an odd number, and is generally recommended to be 3.
# Set the number of mons to be started. Generally recommended to be 3.
# For highest availability, an odd number of mons should be specified.
count: 3
# The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
# Mons should only be allowed on the same node for test environments where data loss is acceptable.
3 changes: 2 additions & 1 deletion cluster/examples/kubernetes/ceph/cluster.yaml
@@ -44,7 +44,8 @@ spec:
# The default wait timeout is 10 minutes.
waitTimeoutForHealthyOSDInMinutes: 10
mon:
# Set the number of mons to be started. Must be an odd number, and is generally recommended to be 3.
# Set the number of mons to be started. Generally recommended to be 3.
# For highest availability, an odd number of mons should be specified.
count: 3
# The mons should be on unique nodes. For production, at least 3 nodes are recommended for this reason.
# Mons should only be allowed on the same node for test environments where data loss is acceptable.
1 change: 1 addition & 0 deletions cluster/examples/kubernetes/ceph/crds.yaml
@@ -812,6 +812,7 @@ spec:
type: boolean
count:
description: Count is the number of Ceph monitors
maximum: 9
minimum: 0
type: integer
stretchCluster:
4 changes: 0 additions & 4 deletions pkg/apis/ceph.rook.io/v1/cluster.go
@@ -55,10 +55,6 @@ func (c *CephCluster) ValidateDelete() error {
}

func validateUpdatedCephCluster(updatedCephCluster *CephCluster, found *CephCluster) error {
if updatedCephCluster.Spec.Mon.Count > 0 && updatedCephCluster.Spec.Mon.Count%2 == 0 {
return errors.Errorf("mon count %d cannot be even, must be odd to support a healthy quorum", updatedCephCluster.Spec.Mon.Count)
}

if updatedCephCluster.Spec.DataDirHostPath != found.Spec.DataDirHostPath {
return errors.Errorf("invalid update: DataDirHostPath change from %q to %q is not allowed", found.Spec.DataDirHostPath, updatedCephCluster.Spec.DataDirHostPath)
}
2 changes: 1 addition & 1 deletion pkg/apis/ceph.rook.io/v1/cluster_test.go
@@ -35,7 +35,7 @@ func Test_validateUpdatedCephCluster(t *testing.T) {
}{
{"everything is ok", args{&CephCluster{}, &CephCluster{}}, false},
{"good mon count", args{&CephCluster{Spec: ClusterSpec{Mon: MonSpec{Count: 1}}}, &CephCluster{}}, false},
{"even mon count", args{&CephCluster{Spec: ClusterSpec{Mon: MonSpec{Count: 2}}}, &CephCluster{}}, true},
{"even mon count", args{&CephCluster{Spec: ClusterSpec{Mon: MonSpec{Count: 2}}}, &CephCluster{}}, false},
{"good mon count", args{&CephCluster{Spec: ClusterSpec{Mon: MonSpec{Count: 3}}}, &CephCluster{}}, false},
{"changed DataDirHostPath", args{&CephCluster{Spec: ClusterSpec{DataDirHostPath: "foo"}}, &CephCluster{Spec: ClusterSpec{DataDirHostPath: "bar"}}}, true},
{"changed HostNetwork", args{&CephCluster{Spec: ClusterSpec{Network: NetworkSpec{HostNetwork: false}}}, &CephCluster{Spec: ClusterSpec{Network: NetworkSpec{HostNetwork: true}}}}, true},
1 change: 1 addition & 0 deletions pkg/apis/ceph.rook.io/v1/types.go
@@ -467,6 +467,7 @@ const (
type MonSpec struct {
// Count is the number of Ceph monitors
// +kubebuilder:validation:Minimum=0
// +kubebuilder:validation:Maximum=9
// +optional
Count int `json:"count,omitempty"`
// AllowMultiplePerNode determines if we can run multiple monitors on the same node (not recommended)
3 changes: 0 additions & 3 deletions pkg/operator/ceph/cluster/cluster.go
@@ -351,9 +351,6 @@ func preClusterStartValidation(cluster *cluster) error {
logger.Warningf("mon count should be at least 1, will use default value of %d", mon.DefaultMonCount)
cluster.Spec.Mon.Count = mon.DefaultMonCount
}
if cluster.Spec.Mon.Count%2 == 0 {
return errors.Errorf("mon count %d cannot be even, must be odd to support a healthy quorum", cluster.Spec.Mon.Count)
}
if !cluster.Spec.Mon.AllowMultiplePerNode {
// Check that there are enough nodes to have a chance of starting the requested number of mons
nodes, err := cluster.context.Clientset.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
2 changes: 1 addition & 1 deletion pkg/operator/ceph/cluster/cluster_test.go
@@ -35,7 +35,7 @@ func TestPreClusterStartValidation(t *testing.T) {
wantErr bool
}{
{"no settings", args{&cluster{Spec: &cephv1.ClusterSpec{}, context: &clusterd.Context{Clientset: testop.New(t, 3)}}}, false},
{"even mons", args{&cluster{context: &clusterd.Context{Clientset: testop.New(t, 3)}, Spec: &cephv1.ClusterSpec{Mon: cephv1.MonSpec{Count: 2}}}}, true},
{"even mons", args{&cluster{context: &clusterd.Context{Clientset: testop.New(t, 3)}, Spec: &cephv1.ClusterSpec{Mon: cephv1.MonSpec{Count: 2}}}}, false},
{"missing stretch zones", args{&cluster{context: &clusterd.Context{Clientset: testop.New(t, 3)}, Spec: &cephv1.ClusterSpec{Mon: cephv1.MonSpec{StretchCluster: &cephv1.StretchClusterSpec{Zones: []cephv1.StretchClusterZoneSpec{
{Name: "a"},
}}}}}}, true},
4 changes: 0 additions & 4 deletions pkg/operator/ceph/cluster/mon/health.go
@@ -597,10 +597,6 @@ func (c *Cluster) addOrRemoveExternalMonitor(status cephclient.MonStatusResponse
logger.Debugf("ClusterInfo is now Empty, refilling it from status.MonMap.Mons")

monCount := len(status.MonMap.Mons)
if monCount%2 == 0 {
logger.Warningf("external cluster mon count is even (%d), should be uneven, continuing.", monCount)
}

if monCount == 1 {
logger.Warning("external cluster mon count is 1, consider adding new monitors.")
}