ceph: ability to abort orchestration #6693

Merged
leseb merged 4 commits into rook:master from abort-reconcile-simple on Nov 25, 2020

Conversation

leseb
Member

@leseb leseb commented Nov 23, 2020

Description of your changes:

We can now prioritize orchestrations on certain events. Today only two
events will cancel ongoing orchestrations (if any):

  • request for cluster deletion
  • request for cluster upgrade

If either of these is caught by the watcher, we cancel the ongoing
orchestration.
For that we implemented a simple approach based on checkpoints: we check
for a cancellation request at certain points in the orchestration, mainly
before each of the mon/mgr/OSD orchestration loops.

This solution is not perfect, but we are waiting for controller-runtime
to release its 0.7 version, which will embed context support. With that
we will be able to cancel reconciles more precisely and quickly.
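
A minimal sketch of the checkpoint idea (hypothetical names; it assumes the github.com/tevino/abool package, which provides the abool.AtomicBool type referenced in this PR, and is not the exact Rook code): the watcher sets an atomic flag when a deletion or upgrade event arrives, and the orchestrator consults it before each major stage.

```go
package main

import (
	"errors"
	"fmt"

	"github.com/tevino/abool"
)

// requestCancel is set by the watcher when a cluster deletion or upgrade is detected.
var requestCancel = abool.New()

// checkCancelled is the checkpoint consulted before each mon/mgr/OSD orchestration loop.
func checkCancelled() error {
	if requestCancel.IsSet() {
		return errors.New("CANCELLING CURRENT ORCHESTRATION")
	}
	return nil
}

// orchestrate runs the stages in order, aborting between them if cancellation was requested.
func orchestrate() error {
	for _, stage := range []string{"mons", "mgr", "osds"} {
		if err := checkCancelled(); err != nil {
			return fmt.Errorf("failed to create cluster: %w", err)
		}
		fmt.Println("running", stage, "orchestration")
	}
	return nil
}

func main() {
	// Simulate an upgrade event arriving before the next checkpoint.
	requestCancel.Set()
	if err := orchestrate(); err != nil {
		fmt.Println("failed to reconcile:", err)
	}
}
```

The error bubbles up through the reconcile, which is why the interrupted run in the log below ends with the cancellation error before a fresh reconcile starts with the new image.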

Operator log example:

2020-11-25 12:59:12.986264 I | ceph-cluster-controller: done reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:33.776947 I | ceph-cluster-controller: CR has changed for "rook-ceph". diff=  v1.ClusterSpec{
        CephVersion: v1.CephVersionSpec{
                Image:            "ceph/ceph:v15.2.5",
-               AllowUnsupported: true,
+               AllowUnsupported: false,
        },
        DriveGroups: nil,
        Storage:     {UseAllNodes: true, Selection: {UseAllDevices: &true}},
        ... // 20 identical fields
  }
2020-11-25 13:07:33.777039 I | ceph-cluster-controller: reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:33.785088 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:33.788626 I | ceph-cluster-controller: detecting the ceph image version for image ceph/ceph:v15.2.5...
2020-11-25 13:07:35.280789 I | ceph-cluster-controller: detected ceph image version: "15.2.5-0 octopus"
2020-11-25 13:07:35.280806 I | ceph-cluster-controller: validating ceph version from provided image
2020-11-25 13:07:35.285888 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:35.287828 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:35.288082 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:35.621625 I | ceph-cluster-controller: cluster "rook-ceph": version "15.2.5-0 octopus" detected for image "ceph/ceph:v15.2.5"
2020-11-25 13:07:35.642688 I | op-mon: start running mons
2020-11-25 13:07:35.646323 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:35.654070 I | op-mon: saved mon endpoints to config map map[csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.107.242.49:6789","10.109.71.30:6789","10.98.93.224:6789"]}] data:a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789 mapping:{"node":{"a":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"b":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"c":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"}}} maxMonId:2]
2020-11-25 13:07:35.868253 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:35.868573 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:37.074353 I | op-mon: targeting the mon count 3
2020-11-25 13:07:38.153435 I | op-mon: checking for basic quorum with existing mons
2020-11-25 13:07:38.178029 I | op-mon: mon "a" endpoint is [v2:10.107.242.49:3300,v1:10.107.242.49:6789]
2020-11-25 13:07:38.670191 I | op-mon: mon "b" endpoint is [v2:10.109.71.30:3300,v1:10.109.71.30:6789]
2020-11-25 13:07:39.477820 I | op-mon: mon "c" endpoint is [v2:10.98.93.224:3300,v1:10.98.93.224:6789]
2020-11-25 13:07:39.874094 I | op-mon: saved mon endpoints to config map map[csi-cluster-config-json:[{"clusterID":"rook-ceph","monitors":["10.107.242.49:6789","10.109.71.30:6789","10.98.93.224:6789"]}] data:a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789 mapping:{"node":{"a":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"b":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"},"c":{"Name":"minikube","Hostname":"minikube","Address":"192.168.39.3"}}} maxMonId:2]
2020-11-25 13:07:40.467999 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:40.469733 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:41.071710 I | cephclient: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-11-25 13:07:41.078903 I | cephclient: generated admin config in /var/lib/rook/rook-ceph
2020-11-25 13:07:41.125233 I | op-mon: deployment for mon rook-ceph-mon-a already exists. updating if needed
2020-11-25 13:07:41.327778 I | op-k8sutil: updating deployment "rook-ceph-mon-a" after verifying it is safe to stop
2020-11-25 13:07:41.327895 I | op-mon: checking if we can stop the deployment rook-ceph-mon-a
2020-11-25 13:07:44.045644 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-a"
2020-11-25 13:07:44.045706 I | op-mon: checking if we can continue the deployment rook-ceph-mon-a
2020-11-25 13:07:44.045740 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:44.109159 I | op-mon: mons running: [a b c]
2020-11-25 13:07:44.474596 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:44.478565 I | op-mon: deployment for mon rook-ceph-mon-b already exists. updating if needed
2020-11-25 13:07:44.493374 I | op-k8sutil: updating deployment "rook-ceph-mon-b" after verifying it is safe to stop
2020-11-25 13:07:44.493403 I | op-mon: checking if we can stop the deployment rook-ceph-mon-b
2020-11-25 13:07:47.135524 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-b"
2020-11-25 13:07:47.135542 I | op-mon: checking if we can continue the deployment rook-ceph-mon-b
2020-11-25 13:07:47.135551 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:47.148820 I | op-mon: mons running: [a b c]
2020-11-25 13:07:47.445946 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:47.448991 I | op-mon: deployment for mon rook-ceph-mon-c already exists. updating if needed
2020-11-25 13:07:47.462041 I | op-k8sutil: updating deployment "rook-ceph-mon-c" after verifying it is safe to stop
2020-11-25 13:07:47.462060 I | op-mon: checking if we can stop the deployment rook-ceph-mon-c
2020-11-25 13:07:48.853118 I | ceph-cluster-controller: CR has changed for "rook-ceph". diff=  v1.ClusterSpec{
        CephVersion: v1.CephVersionSpec{
-               Image:            "ceph/ceph:v15.2.5",
+               Image:            "ceph/ceph:v15.2.6",
                AllowUnsupported: false,
        },
        DriveGroups: nil,
        Storage:     {UseAllNodes: true, Selection: {UseAllDevices: &true}},
        ... // 20 identical fields
  }
2020-11-25 13:07:48.853140 I | ceph-cluster-controller: upgrade requested, cancelling any ongoing orchestration
2020-11-25 13:07:50.119584 I | op-k8sutil: finished waiting for updated deployment "rook-ceph-mon-c"
2020-11-25 13:07:50.119606 I | op-mon: checking if we can continue the deployment rook-ceph-mon-c
2020-11-25 13:07:50.119619 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:50.130860 I | op-mon: mons running: [a b c]
2020-11-25 13:07:50.431341 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:50.431361 I | op-mon: mons created: 3
2020-11-25 13:07:50.734156 I | op-mon: waiting for mon quorum with [a b c]
2020-11-25 13:07:50.745763 I | op-mon: mons running: [a b c]
2020-11-25 13:07:51.045108 I | op-mon: Monitors in quorum: [a b c]
2020-11-25 13:07:51.054497 E | ceph-cluster-controller: failed to reconcile. failed to reconcile cluster "rook-ceph": failed to configure local ceph cluster: failed to create cluster: CANCELLING CURRENT ORCHESTATION
2020-11-25 13:07:52.055208 I | ceph-cluster-controller: reconciling ceph cluster in namespace "rook-ceph"
2020-11-25 13:07:52.070690 I | op-mon: parsing mon endpoints: a=10.107.242.49:6789,b=10.109.71.30:6789,c=10.98.93.224:6789
2020-11-25 13:07:52.088979 I | ceph-cluster-controller: detecting the ceph image version for image ceph/ceph:v15.2.6...
2020-11-25 13:07:53.904811 I | ceph-cluster-controller: detected ceph image version: "15.2.6-0 octopus"
2020-11-25 13:07:53.904862 I | ceph-cluster-controller: validating ceph version from provided image

Closes: #6587

Which issue is resolved by this Pull Request:
Resolves #6587

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@leseb leseb requested a review from BlaineEXE November 23, 2020 18:02
@leseb
Member Author

leseb commented Nov 23, 2020

More testing to do on my end, but good for an initial review.

@mergify mergify bot added the ceph (main ceph tag) label Nov 23, 2020
@leseb leseb mentioned this pull request Nov 24, 2020
Member

@BlaineEXE BlaineEXE left a comment


In the commit message. Also removing one used variable. Did you mean "unused" variable?

Edit: I was hoping this would be attached to the commit I was looking at but I guess not. This is the first commit message.

Member

@BlaineEXE BlaineEXE left a comment


With adding a dedicated predicate for the CephCluster object, the thing I have concerns about is what happens if any of the other cases (CephObjectStore, CephBlockPool, CephFilesystem, CephNFS, CephRBDMirror, etc.) have CRDs created before the CephCluster is created. Assuming I don't find anything wrong, I will likely not request changes and trust that you'll test this before merging.

Member

@BlaineEXE BlaineEXE left a comment


Two small nits.

@@ -66,4 +67,7 @@ type Context struct {

// The local devices detected on the node
Devices []*sys.LocalDisk

// OrchestrationCanceller manages the orchestration and its possible cancellation
OrchestrationCanceller *abool.AtomicBool
Member


As a nit, the methods on this are Set, Unset, and IsSet, and I don't read "(un)set canceller" as having the right semantic meaning. I think renaming this to "RequestCancelOrchestration" or "OrchestrationCancellationRequest" or something similar makes the method use more grammatically specific to what's going on in the code.

Member Author


Will go with RequestCancelOrchestration, thanks for the suggestion.
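
For illustration only, a minimal hypothetical stand-in for the operator Context (not the PR's actual struct) showing how the agreed name makes both call sites read as a request rather than setting/unsetting a "canceller":

```go
package main

import (
	"fmt"

	"github.com/tevino/abool"
)

// Context is a hypothetical, trimmed-down stand-in for the operator context
// shown in the diff above, using the field name agreed on in this thread.
type Context struct {
	// RequestCancelOrchestration is set when the ongoing orchestration should stop.
	RequestCancelOrchestration *abool.AtomicBool
}

func main() {
	ctx := &Context{RequestCancelOrchestration: abool.New()}

	// Watcher side: request cancellation of the orchestration.
	ctx.RequestCancelOrchestration.Set()

	// Orchestrator side: was cancellation requested?
	if ctx.RequestCancelOrchestration.IsSet() {
		fmt.Println("cancellation requested, aborting")
		// Clear the request so the next reconcile starts clean (abool spells this UnSet).
		ctx.RequestCancelOrchestration.UnSet()
	}
}
```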


// If spec Images are different we cancel any on-going orchestration
if objNew.Spec.CephVersion.Image != objOld.Spec.CephVersion.Image {
logger.Info("upgrade requested, cancelling any on-going orchestration")
Member


nit: "on-going" -> "ongoing"
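
To make the event side concrete, here is a hedged sketch of an update predicate that requests cancellation when the Ceph image changes. Function and package names are hypothetical; it assumes controller-runtime's predicate.Funcs and event.UpdateEvent plus Rook's cephv1 types, and it is not the PR's exact predicate:

```go
package cluster // hypothetical package name, for illustration

import (
	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
	"github.com/tevino/abool"
	"sigs.k8s.io/controller-runtime/pkg/event"
	"sigs.k8s.io/controller-runtime/pkg/predicate"
)

// watchUpdatePredicate returns a predicate whose UpdateFunc requests cancellation
// of any ongoing orchestration when an upgrade (image change) is detected, then
// lets the event queue a new reconcile.
func watchUpdatePredicate(requestCancelOrchestration *abool.AtomicBool) predicate.Funcs {
	return predicate.Funcs{
		UpdateFunc: func(e event.UpdateEvent) bool {
			objOld, okOld := e.ObjectOld.(*cephv1.CephCluster)
			objNew, okNew := e.ObjectNew.(*cephv1.CephCluster)
			if !okOld || !okNew {
				return false
			}
			// If spec images differ, an upgrade was requested: cancel any
			// ongoing orchestration before reconciling with the new spec.
			if objNew.Spec.CephVersion.Image != objOld.Spec.CephVersion.Image {
				requestCancelOrchestration.Set()
				return true
			}
			return false
		},
	}
}
```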

Since we moved to the controller-runtime, events are processed one by
one and so are reconciles. This means we won't have multiple
orchestrations happening at the same time. Thus removing this code.

Also removing one unused variable.

Signed-off-by: Sébastien Han <seb@redhat.com>
Since the predicate for the CephCluster object will soon move into its
own predicate, we need to export isDoNotReconcile so that it can be
consumed by the "cluster" package.

Signed-off-by: Sébastien Han <seb@redhat.com>
Since we want to pass a context to it, let's extract the logic into its
own predicate.

Signed-off-by: Sébastien Han <seb@redhat.com>
@leseb
Member Author

leseb commented Nov 24, 2020

With adding a dedicated predicate for the CephCluster object, the thing I have concerns about is what happens if any of the other cases (CephObjectStore, CephBlockPool, CephFilesystem, CephNFS, CephRBDMirror, etc.) have CRDs created before the CephCluster is created. Assuming I don't find anything wrong, I will likely not request changes and trust that you'll test this before merging.

Conceptually, it makes more sense to have a dedicated predicate for each controller. I had previously homogenized the code to end up with a single one, but because this cancellation approach is only for the CephCluster I didn't want to update all the other controllers' definitions for something we might never use.
Extracting the predicate does not change the behavior of the other controllers; they will behave the same as before. If a child CR is created before the CephCluster one, the child controller will respond normally and wait for the CephCluster to appear so it can reconcile.

Does that clarify?
Thanks!

@BlaineEXE
Member

With adding a dedicated predicate for the CephCluster object, the thing I have concerns about is what happens if any of the other cases (CephObjectStore, CephBlockPool, CephFilesystem, CephNFS, CephRBDMirror, etc.) have CRDs created before the CephCluster is created. Assuming I don't find anything wrong, I will likely not request changes and trust that you'll test this before merging.

Conceptually, it makes more sense to have a dedicated predicate for each controller. I had previously homogenized the code to end up with a single one, but because this cancellation approach is only for the CephCluster I didn't want to update all the other controllers' definitions for something we might never use.
Extracting the predicate does not change the behavior of the other controllers; they will behave the same as before. If a child CR is created before the CephCluster one, the child controller will respond normally and wait for the CephCluster to appear so it can reconcile.

Does that clarify?
Thanks!

Yes! That seems obvious now that you have said it. I still haven't had coffee yet today. 😬 ☕

@leseb leseb force-pushed the abort-reconcile-simple branch 5 times, most recently from f26218d to 7a76ed9, on November 25, 2020 at 14:23
@leseb leseb merged commit 78f22fc into rook:master Nov 25, 2020
@leseb leseb deleted the abort-reconcile-simple branch November 25, 2020 15:23
mergify bot added a commit that referenced this pull request Nov 25, 2020
ceph: ability to abort orchestration (bp #6693)
mergify bot added a commit that referenced this pull request Nov 25, 2020
ceph: ability to abort orchestration (bp #6693)
Labels: ceph (main ceph tag)
Projects: none yet
Development: successfully merging this pull request may close these issues:

  • Abort a CephCluster reconcile if the CR is updated while a reconcile is in progress

2 participants