Replace OSD to use new backend store #12507
Conversation
This pull request has merge conflicts that must be resolved before it can be merged. @sp98 please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork
Force-pushed: 94b404d → 2b5255d; 980dc8a → 5ce3b9a; 6d45eb4 → 00ca167; f314fc0 → 57e9922; 1fa42fa → ef2a467
cmd/rook/ceph/osd.go
Outdated
// destroy the OSD using the OSD ID
var replaceOSD *osd.OSDReplaceInfo
if replaceOSDID != -1 {
	osdInfo, err := osddaemon.GetOSDInfoById(context, &clusterInfo, replaceOSDID)
Can this be moved inside the `destroyOSD()` method? To keep everything about the destroy in a single place.
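A minimal sketch of the suggested refactor, with an assumed signature (not the actual PR code):

```go
// Sketch: move the OSD lookup into destroyOSD so everything about the
// destroy lives in one place. The signature here is an assumption.
func destroyOSD(context *clusterd.Context, clusterInfo *client.ClusterInfo, osdID int) error {
	osdInfo, err := osddaemon.GetOSDInfoById(context, clusterInfo, osdID)
	if err != nil {
		return errors.Wrapf(err, "failed to get info for OSD %d", osdID)
	}
	// ... use osdInfo to destroy the OSD and clean up its disk ...
	_ = osdInfo
	return nil
}
```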
@@ -860,6 +880,7 @@ func GetCephVolumeLVMOSDs(context *clusterd.Context, clusterInfo *client.Cluster
		lvPath = lv
	}

// TODO: Don't read osd store type from env variable |
OK, then we can update the comment here with the tracker. Thanks!
osdStore, err := m.getOSDStoreStatus()
if err != nil {
	logger.Errorf("failed to get osd store type count. %v", osdStore)
If this failed, will `osdStore` still be valid?
logger.Errorf("failed to get osd store type count. %v", osdStore) | |
logger.Errorf("failed to get osd store status. %v", err) |
Updated the log message.
It fails only when listing the OSD deployments fails. In that case we just return and don't update the OSD store status in the spec, keeping the status prior to the failure intact.
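Combining the suggestion with that behavior, the handling would look roughly like this (a sketch, not the final PR code):

```go
osdStore, err := m.getOSDStoreStatus()
if err != nil {
	// osdStore is nil on failure, so log the error itself and return
	// without touching the OSD store status in the spec.
	logger.Errorf("failed to get osd store status. %v", err)
	return
}
```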
func (m *OSDHealthMonitor) getOSDStoreStatus() (*cephv1.OSDStatus, error) {
	label := fmt.Sprintf("%s=%s", k8sutil.AppAttr, AppName)
	osdDeployments, err := k8sutil.GetDeployments(m.clusterInfo.Context, m.context.Clientset, m.clusterInfo.Namespace, label)
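A hypothetical completion of this helper, assuming each OSD deployment exposes its backend store via an "osd-store" label (the label key and error wrapping are guesses, not the PR's actual code):

```go
func (m *OSDHealthMonitor) getOSDStoreStatus() (*cephv1.OSDStatus, error) {
	label := fmt.Sprintf("%s=%s", k8sutil.AppAttr, AppName)
	osdDeployments, err := k8sutil.GetDeployments(m.clusterInfo.Context, m.context.Clientset, m.clusterInfo.Namespace, label)
	if err != nil {
		return nil, errors.Wrap(err, "failed to list osd deployments")
	}
	storeType := map[string]int{}
	for i := range osdDeployments.Items {
		// "osd-store" is an assumed label key carrying the store type.
		if t, ok := osdDeployments.Items[i].Labels["osd-store"]; ok {
			storeType[t]++
		}
	}
	return &cephv1.OSDStatus{StoreType: storeType}, nil
}
```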
The OSD deployments will rarely change, so it seems expensive to query these deployments again every minute for the OSD status. I wonder if we can skip this query most of the time. For example, perhaps only query if the desired storeType is different from any existing OSD store types.
Or perhaps we just query this and update the status at the end of a normal OSD reconcile. Updating during the OSD status check just seems too frequent.
> Or perhaps we just query this and update the status at the end of a normal OSD reconcile. Updating during the OSD status check just seems too frequent.

With this approach, we would be updating `osdStatus` at two different places (once for the deviceClass list and again for the OSD store). That might be more confusing to debug later on. IMO, updating the entire OSD status in one place might be a better approach.
> The OSD deployments will rarely change, so it seems expensive to query these deployments again every minute for the OSD status. I wonder if we can skip this query most of the time. For example, perhaps only query if the desired storeType is different from any existing OSD store types.

I prefer this approach. We just need to pass the desired store type from the spec to the OSD health checker. If the desired store type is `bluestore-rdr` but `status.storage.osd.storeType['bluestore'] != 0` (implying that migration is not complete), only then do we go ahead and get the OSD list.
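A sketch of that guard, assuming the `StoreType` status field is a count per store type as discussed above:

```go
// Sketch: skip the expensive deployment query unless a migration to the
// desired store type may still be in progress.
func shouldQueryOSDStore(desiredStore string, status *cephv1.OSDStatus) bool {
	if desiredStore != "bluestore-rdr" {
		return false
	}
	// Any OSDs still reporting the old store mean migration is incomplete.
	return status.StoreType["bluestore"] != 0
}
```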
Agreed, that should work perfectly to reduce the queries and keep the OSD status update in a single place.
The approach I suggested will not work.
I'll take this up in a follow-up PR next week.
pkg/operator/ceph/cluster/osd/osd.go
Outdated
if pgClean {
	return true, nil
}
logger.Warningf("pgs are not healthy. PG status: %q", pgHealthMsg)
Suggested change:
- logger.Warningf("pgs are not healthy. PG status: %q", pgHealthMsg)
+ logger.Infof("waiting for PGs to be healthy after replacing an OSD, status: %q", pgHealthMsg)
Can we also just print this every minute or two instead of every 10s?
I'll take this up in a follow-up PR next week.
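One way that follow-up could look: keep polling every 10 seconds but log at most once per minute. A sketch, assuming a hypothetical `checkPGsClean` helper:

```go
// Sketch: poll PG health every 10s, but log only once a minute.
func waitForHealthyPGs(checkPGsClean func() (bool, string, error)) error {
	var lastLog time.Time
	for {
		pgClean, pgHealthMsg, err := checkPGsClean()
		if err != nil {
			return err
		}
		if pgClean {
			return nil
		}
		if time.Since(lastLog) >= time.Minute {
			logger.Infof("waiting for PGs to be healthy after replacing an OSD, status: %q", pgHealthMsg)
			lastLog = time.Now()
		}
		time.Sleep(10 * time.Second)
	}
}
```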
}

func (o *OSDReplaceConfig) string() (string, error) {
	configInBytes, err := json.Marshal(o)
?
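For context, a plausible completion of this helper (the error wrapping is a guess):

```go
func (o *OSDReplaceConfig) string() (string, error) {
	configInBytes, err := json.Marshal(o)
	if err != nil {
		return "", errors.Wrap(err, "failed to marshal osd replace config")
	}
	return string(configInBytes), nil
}
```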
If the OSD store is updated in the Ceph cluster, then delete OSDs one by one, clean up the disks, and provision a new OSD on the same disk.
Signed-off-by: sp98 <sapillai@redhat.com>
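As a rough sketch of the flow in that commit message, with every helper name an assumption rather than the PR's actual functions:

```go
// Sketch: replace one OSD at a time so PGs can recover before the next
// OSD is taken down.
func migrateOSDsToNewStore(osdIDs []int) error {
	for _, id := range osdIDs {
		if err := waitForCleanPGs(); err != nil { // wait for a healthy cluster
			return err
		}
		if err := destroyOSD(id); err != nil { // purge the OSD from Ceph
			return err
		}
		if err := wipeDisk(id); err != nil { // clean the disk for reuse
			return err
		}
		if err := provisionOSD(id); err != nil { // new OSD on the same disk
			return err
		}
	}
	return nil
}
```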
Going ahead and merging with these follow-up items understood:
- Additional changes needed for encrypted OSDs to be cleaned up during replacement
- OSD status update to be more efficient so it doesn't query OSDs at every interval
- Reduce logging frequency while waiting for PGs
Also, a couple of follow-up items as feedback from my testing:
4. Let's not wipe the entire disk, just the first part. There's no need to wipe the entire disk since we're just going to backfill to the disk again. Let's just do a quick wipe (perhaps the first ~1MB) to get it clean enough to create a new OSD (see the sketch after this list).
5. In the OSD prepare job, add a clear log statement that we're replacing the OSD right before we wipe the disk. When looking at the log, it just wasn't obvious why the disk was being wiped.
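For item 4, a self-contained sketch of such a quick wipe (zeroing only the first 1 MiB; an illustration, not the PR's implementation — the only import needed is `os`):

```go
// quickWipe zeroes the first 1 MiB of the device, enough to clear the
// partition table and bluestore label so a new OSD can be created,
// without wiping data that will be backfilled anyway.
func quickWipe(device string) error {
	f, err := os.OpenFile(device, os.O_WRONLY, 0)
	if err != nil {
		return err
	}
	defer f.Close()

	zeros := make([]byte, 1<<20) // 1 MiB of zeros
	if _, err := f.Write(zeros); err != nil {
		return err
	}
	return f.Sync()
}
```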
@travisn I propose we move the entire `updateCephStorageStatus` from the OSDHealthChecker to the main reconcile that creates/updates the OSD. We can add this method at the end of the OSD reconcile.
That method only updates the device classes and device types, right? Since those will only change infrequently and only after a reconcile, it makes sense to update that status at the end of a reconcile.
Replace OSD to use new backend store (backport #12507)
This follows up rook#12507:
- fixes replacing of encrypted OSDs
- updates OSD status at the end of reconcile
Signed-off-by: sp98 <sapillai@redhat.com>
Description of your changes:
This PR is to enable migration of OSDs to a new backend store. Currently the OSDs use `bluestore`. In the future, if a new OSD store is introduced, existing Rook OSDs will need to migrate to use the new store.
Design PR: #12381
Pending:
Note: This PR does not cover the following scenarios:
Which issue is resolved by this Pull Request:
Resolves #
Checklist:
`skip-ci` on the PR.