Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] osd: reweight osd while resizing #14435

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

parth-gr
Copy link
Member

@parth-gr parth-gr commented Jul 8, 2024

osd got resized by cryptsetup bluestore cmd
but it should also be reweight to balance the pgs properly

closes: #14430

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@parth-gr parth-gr changed the title osd: reweight osd while resizing [wip] osd: reweight osd while resizing Jul 8, 2024
@travisn
Copy link
Member

travisn commented Jul 8, 2024

I'm thinking that instead of modifying the init container, we should check this during the osd reconcile. For example:

  1. The OSD reconcile starts
  2. Query for the crush weights of all OSDs
  3. As each OSD is reconciled, if the weight does not equal the size of the PVC (in TiB), then execute the command:
ceph osd crush set osd.<id> <expected_weight> <crush_location_properties>

This way, we just check it during reconcile, and no need to modify the init containers to check it every time OSDs start.

This will follow a similar pattern to #14056, which should also check the OSD device class during reconcile. The pattern for updating these OSD properties will be very similar.

@parth-gr
Copy link
Member Author

@travisn would this be the right place

err = setOSDProperties(c, osdProps, osd)
, for checking the ceph osd df and osd PVC size

@parth-gr
Copy link
Member Author

But the actuall update happen here

failures := updateMultipleDeploymentsAndWaitFunc(c.cluster.clusterInfo.Context, c.cluster.context.Clientset, updatedDeployments, listFunc)

So we should do after this?

@parth-gr
Copy link
Member Author

osdResize := `
DM_name=%s
cryptsetup --verbose status "$DM_name"
ceph osd crush reweight osd.0 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this supposed to only be osd.0?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was prototyping, we are going a different direction now.

@BlaineEXE BlaineEXE marked this pull request as draft July 10, 2024 15:25
Copy link
Member

@travisn travisn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or it should be here https://github.com/rook/rook/blob/master/pkg/operator/ceph/cluster/osd/osd.go#L274

As discussed, let's do it here since the osd reconcile is all done at this point and we can check all the OSDs for the needed weights and device classes.

osdResize := `
DM_name=%s
cryptsetup --verbose status "$DM_name"
ceph osd crush reweight osd.0 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was prototyping, we are going a different direction now.

@parth-gr parth-gr marked this pull request as ready for review August 1, 2024 15:11
osd got resized by cryptsetup bluestore cmd
but it should also be reweight to balance the pgs properly

closes: rook#14430

Signed-off-by: parth-gr <partharora1010@gmail.com>
@@ -2833,6 +2833,9 @@ type StorageScopeSpec struct {
// +nullable
// +optional
StorageClassDeviceSets []StorageClassDeviceSet `json:"storageClassDeviceSets,omitempty"`
// resize the osd crush weight once size of the osd pvc is increased
// +optional
ResizeOsdCrushWeight bool `json:"resizeOsdCrushWeight,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about moving this down next to the AllowDeviceClassUpdate setting? Also, suggest this name for consistency.

Suggested change
ResizeOsdCrushWeight bool `json:"resizeOsdCrushWeight,omitempty"`
AllowOsdWeightUpdate bool `json:"resizeOsdCrushWeight,omitempty"`

Copy link
Member Author

@parth-gr parth-gr Aug 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about AllowOsdCrushWeightUpdate?

}
desiredOSD, ok := desiredOSDs[actualOSD.ID]
if !ok {
return nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to continue the loop

Suggested change
return nil
continue

return nil
}
if actualOSD.KB != actualOSD.CrushWeight {
args := []string{"osd", "crush", "reweight", fmt.Sprintf("osd.%d", actualOSD.ID), fmt.Sprintf("%q", actualOSD.KB.String())}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed, let's check on the units

@@ -2833,6 +2833,9 @@ type StorageScopeSpec struct {
// +nullable
// +optional
StorageClassDeviceSets []StorageClassDeviceSet `json:"storageClassDeviceSets,omitempty"`
// resize the osd crush weight once size of the osd pvc is increased
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc comments get rendered to rook.io now, so let's try to keep things capitalized and following good doc comment hygiene.

And since this setting affects cluster rebalancing/stability, let's make sure we note that for users. I think this change might be helpful and still concise enough for users.

Suggested change
// resize the osd crush weight once size of the osd pvc is increased
// ResizeOsdCrushWeight allows Rook to resize the OSD CRUSH weight when the OSD PVC is increased. This allows cluster data to be rebalanced to make most effective use of new OSD space; however, data rebalancing can cause temporary cluster slowdown.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this make more sense

args := []string{"osd", "crush", "reweight", fmt.Sprintf("osd.%d", actualOSD.ID), fmt.Sprintf("%q", actualOSD.KB.String())}
buf, err := cephclient.NewCephCommand(c.context, c.clusterInfo, args).Run()
if err != nil {
return errors.Wrap(err, "failed to reweight. "+string(buf))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

errors.Wrap uses a colon to separate error messages. Let's also capture the OSD ID in the error message to help debug.

Suggested change
return errors.Wrap(err, "failed to reweight. "+string(buf))
return errors.Wrapf(err, "failed to reweight osd %d: %s", actualOSD.ID, string(buf))

Comment on lines +335 to +339
logger.Error(err)
}
if err := c.updateDeviceClassIfChanged(actualOSD.ID, desiredOSD.DeviceClass, actualOSD.DeviceClass); err != nil {
// Log the error and allow other updates to continue
logger.Error(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because these errors are only logged, let's also be sure to include the namespace of the Ceph cluster that corresponds to the error by appending that. Otherwise, we might wind up with scenarios where we can't tell which cluster is giving the error.

logger.Error(fmt.Sprintf("failed to XYZ on cluster in namespace %q: %s", namespace, err))

if !c.spec.Storage.ResizeOsdCrushWeight {
return nil
}
if actualOSD.KB != actualOSD.CrushWeight {
Copy link
Member

@BlaineEXE BlaineEXE Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't actualOSD.KB be translated into a TiB value? Or is it TB? We should make sure we know which should be used.

For that matter, we should check whether the KB value is truly KB or whether it's KiB.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OSD resize should update the osd weight
3 participants