
Replace OSD to use new backend store #12507

Merged · 1 commit merged on Aug 17, 2023

Conversation

sp98
Contributor

@sp98 sp98 commented Jul 13, 2023

Description of your changes:
This PR enables migration of OSDs to a new backend store. Currently the OSDs use bluestore; if a new OSD store is introduced in the future, existing Rook OSDs will need to migrate to the new store (a rough sketch of the flow is shown after the notes below).

Design PR: #12381

Pending:

  • Test with encryption
  • Test with KMS

Note: This PR does not cover the following scenarios:

  • Migrating existing OSDs on PVC with metadata devices.
  • Existing node-based OSDs (multiple OSDs are created at once using ceph-volume batch, which adds additional complications).
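
For context, here is a minimal Go sketch of the one-at-a-time flow described above (destroy an OSD, re-provision it on the same disk with the new store, wait for backfill). All types and helper names are hypothetical stand-ins, not the PR's actual code:

```go
package main

import (
	"fmt"
	"time"
)

// osdInfo is a hypothetical stand-in for the operator's OSD metadata.
type osdInfo struct {
	ID    int
	Store string // current backend store, e.g. "bluestore"
	Disk  string
}

// The helpers below are illustrative stubs, not Rook functions.
func destroyOSD(o osdInfo)   { fmt.Println("destroy OSD and wipe", o.Disk) }
func provisionOSD(o osdInfo) { fmt.Println("provision new OSD on", o.Disk) }
func waitForPGsClean()       { time.Sleep(10 * time.Millisecond) }

func main() {
	desiredStore := "bluestore-rdr" // the new backend store requested in the spec
	osds := []osdInfo{
		{ID: 0, Store: "bluestore", Disk: "/dev/sdb"},
		{ID: 1, Store: "bluestore-rdr", Disk: "/dev/sdc"},
	}

	// Replace one OSD at a time so only a single OSD's data is backfilled
	// at any point in time.
	for _, o := range osds {
		if o.Store == desiredStore {
			continue // already on the new store
		}
		fmt.Printf("replacing OSD %d on %s (%s -> %s)\n", o.ID, o.Disk, o.Store, desiredStore)
		destroyOSD(o)     // purge the OSD and clean the disk
		provisionOSD(o)   // create a new OSD with the desired store on the same disk
		waitForPGsClean() // wait for backfill before touching the next OSD
	}
}
```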

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@mergify

mergify bot commented Jul 13, 2023

This pull request has merge conflicts that must be resolved before it can be merged. @sp98 please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

@sp98 sp98 force-pushed the osd-migration branch 12 times, most recently from 94b404d to 2b5255d Compare July 17, 2023 10:22
@sp98 sp98 marked this pull request as ready for review July 17, 2023 10:22
@sp98 sp98 changed the title Osd migration Replace OSD to use new backend store Jul 17, 2023
@sp98 sp98 changed the title Replace OSD to use new backend store [WIP] Replace OSD to use new backend store Jul 17, 2023
@sp98 sp98 force-pushed the osd-migration branch 2 times, most recently from 980dc8a to 5ce3b9a Compare July 17, 2023 12:18
@sp98 sp98 changed the title [WIP] Replace OSD to use new backend store [WIP] [Not Ready for Review] Replace OSD to use new backend store Jul 17, 2023
@sp98 sp98 force-pushed the osd-migration branch 3 times, most recently from 6d45eb4 to 00ca167 Compare July 18, 2023 08:24
@sp98 sp98 changed the title [WIP] [Not Ready for Review] Replace OSD to use new backend store Replace OSD to use new backend store Jul 18, 2023
@sp98 sp98 force-pushed the osd-migration branch 5 times, most recently from f314fc0 to 57e9922 Compare July 18, 2023 14:43
@sp98 sp98 requested a review from travisn July 18, 2023 15:39
@sp98 sp98 requested a review from travisn August 11, 2023 11:55
@sp98 sp98 force-pushed the osd-migration branch 4 times, most recently from 1fa42fa to ef2a467 Compare August 13, 2023 06:29
// destroy the OSD using the OSD ID
var replaceOSD *osd.OSDReplaceInfo
if replaceOSDID != -1 {
	osdInfo, err := osddaemon.GetOSDInfoById(context, &clusterInfo, replaceOSDID)
Member

Can this be moved inside the destroyOSD() method? To keep everything about the destroy in a single place.
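
For illustration, a rough Go sketch of the shape such a refactor could take, with the lookup folded into destroyOSD(); all types and signatures below are simplified stand-ins, not the PR's real code:

```go
package main

import (
	"errors"
	"fmt"
)

// osdInfo and getOSDInfoByID stand in for osd.OSDReplaceInfo and
// osddaemon.GetOSDInfoById; they are not the real signatures.
type osdInfo struct{ ID int }

func getOSDInfoByID(id int) (osdInfo, error) {
	if id < 0 {
		return osdInfo{}, errors.New("invalid OSD id")
	}
	return osdInfo{ID: id}, nil
}

// destroyOSD owns both the lookup and the destroy, so everything about
// the destroy lives in a single place.
func destroyOSD(replaceOSDID int) error {
	info, err := getOSDInfoByID(replaceOSDID)
	if err != nil {
		return fmt.Errorf("failed to get info for OSD %d: %w", replaceOSDID, err)
	}
	fmt.Printf("destroying OSD %d\n", info.ID)
	return nil
}

func main() {
	if err := destroyOSD(3); err != nil {
		fmt.Println(err)
	}
}
```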

@@ -860,6 +880,7 @@ func GetCephVolumeLVMOSDs(context *clusterd.Context, clusterInfo *client.Cluster
lvPath = lv
}

// TODO: Don't read osd store type from env variable
Member

Ok, then we can update the comment here with the tracker, thanks

pkg/operator/ceph/cluster/osd/health.go (outdated, resolved)

osdStore, err := m.getOSDStoreStatus()
if err != nil {
logger.Errorf("failed to get osd store type count. %v", osdStore)
Member

If this failed, will osdStore still be valid?

Suggested change
logger.Errorf("failed to get osd store type count. %v", osdStore)
logger.Errorf("failed to get osd store status. %v", err)

Contributor Author
@sp98 sp98 Aug 16, 2023

Updated the log message.

It fails only when listing the OSD deployments fails. In that case we would just return without updating the OSD store status in the spec, keeping the status prior to the failure intact.


func (m *OSDHealthMonitor) getOSDStoreStatus() (*cephv1.OSDStatus, error) {
	label := fmt.Sprintf("%s=%s", k8sutil.AppAttr, AppName)
	osdDeployments, err := k8sutil.GetDeployments(m.clusterInfo.Context, m.context.Clientset, m.clusterInfo.Namespace, label)
Member

The OSD deployments will rarely change, so it seems expensive to query these deployments again every minute for the OSD status. I wonder if we can skip this query most of the time. For example, perhaps only query if the desired storeType is different from any existing OSD store types.

Member

Or perhaps we just query this and update the status at the end of a normal OSD reconcile. Doing it during the OSD status update just seems too frequent.

Contributor Author
@sp98 sp98 Aug 16, 2023

Or perhaps we just query this and update the status at the end of a normal OSD reconcile. Doing it during the OSD status update just seems too frequent.

With this approach, we would be updating osdStatus in two different places (once for the deviceClass list and again for the OSD store). That might be more confusing to debug later on. IMO, updating the entire OSD status in one place might be a better approach.

Contributor Author

The OSD deployments will rarely change, so it seems expensive to query these deployments again every minute for the OSD status. I wonder if we can skip this query most of the time. For example, perhaps only query if the desired storeType is different from any existing OSD store types.

I prefer this approach. We just need to pass the desired store type spec to the OSD health checker. Only if the desired store type is bluestore-rdr but `status.storage.osd.storeType['bluestore'] != 0` (implying that the migration is not complete) do we go ahead and get the OSD list.
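
For illustration, a minimal sketch of that check, assuming the status tracks a per-store OSD count; the names below are illustrative, not the actual identifiers:

```go
package main

import "fmt"

// needsStoreStatusQuery reports whether the OSD deployments need to be
// re-listed: only when some OSDs still run a store other than the desired
// one, i.e. the migration is not complete.
func needsStoreStatusQuery(desiredStore string, storeCounts map[string]int) bool {
	for store, count := range storeCounts {
		if store != desiredStore && count > 0 {
			return true
		}
	}
	return false
}

func main() {
	status := map[string]int{"bluestore": 2, "bluestore-rdr": 1}
	fmt.Println(needsStoreStatusQuery("bluestore-rdr", status)) // true: migration not complete
}
```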

Member

Agreed, that should work perfectly to reduce the queries, and keep the osd status update in a single place.

Contributor Author

The approach I suggested will not work.

Contributor Author

I'll take this up in a follow-up PR next week.

if pgClean {
return true, nil
}
logger.Warningf("pgs are not healthy. PG status: %q", pgHealthMsg)
Member

Suggested change
logger.Warningf("pgs are not healthy. PG status: %q", pgHealthMsg)
logger.Infof("waiting for PGs to be healthy after replacing an OSD, status: %q", pgHealthMsg)

Member

Can we also just print this every minute or two instead of every 10s?

Contributor Author

I'll take this up in a follow-up PR next week.
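
For reference, a minimal sketch of such throttled logging, assuming a 10s poll interval and a one-minute log interval (both values and names are illustrative):

```go
package main

import (
	"fmt"
	"time"
)

// waitForPGsClean polls every 10s but logs only once per minute, so the
// "waiting for PGs" message is not repeated on every poll.
func waitForPGsClean(pgClean func() bool) {
	var lastLog time.Time
	for !pgClean() {
		if time.Since(lastLog) >= time.Minute {
			fmt.Println("waiting for PGs to be healthy after replacing an OSD")
			lastLog = time.Now()
		}
		time.Sleep(10 * time.Second)
	}
}

func main() {
	deadline := time.Now().Add(25 * time.Second)
	waitForPGsClean(func() bool { return time.Now().After(deadline) })
}
```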

}

func (o *OSDReplaceConfig) string() (string, error) {
	configInBytes, err := json.Marshal(o)
Member

?

pkg/operator/ceph/cluster/osd/replace.go (resolved)
@mergify

mergify bot commented Aug 16, 2023

This pull request has merge conflicts that must be resolved before it can be merged. @sp98 please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

If the OSD store is updated in the Ceph cluster, then
delete OSDs one by one, clean up the disks, and provision a new OSD on
the same disk

Signed-off-by: sp98 <sapillai@redhat.com>
Member
@travisn travisn left a comment

Going ahead and merging with these follow-up items understood:

  1. Additional changes needed for encrypted OSDs to be cleaned up during replacement
  2. OSD status update to be more efficient so it doesn't query OSDs at every interval
  3. Reduce logging frequency while waiting for PGs

Also, a couple of follow-up items as feedback from my testing:

  4. Let's not wipe the entire disk, just the first part. There's no need to wipe the entire disk since we're just going to backfill to the disk again. Let's just do a quick wipe (perhaps the first ~1MB) to get it clean enough to create a new OSD (see the sketch after this list).
  5. In the OSD prepare job, add a clear log statement that we're replacing the OSD right before we wipe the disk. When looking at the log, it just wasn't obvious why the disk was being wiped.
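
For reference, a minimal sketch of what such a quick wipe could look like, assuming zeroing only the first 1 MiB of the device is sufficient; the device path is a placeholder and this is not the PR's implementation:

```go
package main

import (
	"fmt"
	"os"
)

// quickWipe zeroes only the first ~1MB of the device instead of the whole
// disk, which is intended to be enough to clear old OSD metadata before
// re-provisioning. Run against real disks only with extreme care.
func quickWipe(device string) error {
	f, err := os.OpenFile(device, os.O_WRONLY, 0)
	if err != nil {
		return fmt.Errorf("failed to open %s: %w", device, err)
	}
	defer f.Close()

	zeros := make([]byte, 1<<20) // 1 MiB of zeros
	if _, err := f.WriteAt(zeros, 0); err != nil {
		return fmt.Errorf("failed to wipe %s: %w", device, err)
	}
	return f.Sync()
}

func main() {
	if err := quickWipe("/dev/sdX"); err != nil { // placeholder device path
		fmt.Println(err)
	}
}
```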

@travisn travisn merged commit 0e3117b into rook:master Aug 17, 2023
46 of 49 checks passed
@sp98
Contributor Author

sp98 commented Aug 21, 2023

  1. OSD status update to be more efficient so it doesn't query OSDs at every interval

@travisn I propose we move the entire updateCephStorageStatus from OSDHealthChecker to the main reconcile that creates/updates the OSD. We can call this method at the end of the OSD reconcile.

@travisn
Member

travisn commented Aug 21, 2023

  1. OSD status update to be more efficient so it doesn't query OSDs at every interval

@travisn I propose we move the entire updateCephStorageStatus from OSDHealthChecker to the main reconcile that creates/updates the OSD. We can call this method at the end of the OSD reconcile.

That method only updates the device classes and device types, right? Since those will only change infrequently and only after a reconcile, it makes sense to update that status at the end of a reconcile.
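
For illustration, a minimal sketch of that ordering, with the status update as the last step of the reconcile; all names below are stand-ins for the operator's actual methods:

```go
package main

import "fmt"

type reconciler struct{}

func (r *reconciler) createOrUpdateOSDs() error { return nil }

// updateCephStorageStatus stands in for the method discussed above.
func (r *reconciler) updateCephStorageStatus() error {
	fmt.Println("updating device classes and OSD store types in the CephCluster status")
	return nil
}

func (r *reconciler) reconcileOSDs() error {
	if err := r.createOrUpdateOSDs(); err != nil {
		return err
	}
	// Status is refreshed once per reconcile rather than on every
	// health-check tick.
	return r.updateCephStorageStatus()
}

func main() {
	_ = (&reconciler{}).reconcileOSDs()
}
```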

travisn added a commit that referenced this pull request Aug 21, 2023
Replace OSD to use new backend store (backport #12507)
sp98 added a commit to sp98/rook that referenced this pull request Aug 22, 2023 (the same follow-up commit was re-pushed and referenced again several times through Sep 1, 2023)
This follows up rook#12507
- fixes replacing of encrypted OSDs.
- Updates OSD status at the end of reconcile

Signed-off-by: sp98 <sapillai@redhat.com>
mergify bot pushed a commit that referenced this pull request Sep 5, 2023
This follows up #12507
- fixes replacing of encrypted OSDs.
- Updates OSD status at the end of reconcile

Signed-off-by: sp98 <sapillai@redhat.com>
(cherry picked from commit 11b8d10)