
osd: add deviceType label to an OSD #11159

Merged 1 commit into rook:master on Nov 3, 2022

Conversation

parth-gr
Member

Closes: #11059
Signed-off-by: parth-gr paarora@redhat.com

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow the guidelines in the developer guide.
  • Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@parth-gr parth-gr marked this pull request as draft October 17, 2022 14:57
@parth-gr parth-gr force-pushed the rotational-disk branch 2 times, most recently from 64d1094 to 919423c, on October 18, 2022 12:52
@parth-gr parth-gr marked this pull request as ready for review October 18, 2022 12:52
@parth-gr parth-gr marked this pull request as draft October 18, 2022 13:38
@parth-gr parth-gr force-pushed the rotational-disk branch 5 times, most recently from 03356cd to edc0b5c, on October 18, 2022 15:28
@parth-gr
Member Author

Testing results:

```yaml
metadata:
  creationTimestamp: "2022-10-18T15:20:05Z"
  generateName: rook-ceph-osd-0-6c8df8569c-
  labels:
    app: rook-ceph-osd
    app.kubernetes.io/component: cephclusters.ceph.rook.io
    app.kubernetes.io/created-by: rook-ceph-operator
    app.kubernetes.io/instance: "0"
    app.kubernetes.io/managed-by: rook-ceph-operator
    app.kubernetes.io/name: ceph-osd
    app.kubernetes.io/part-of: my-cluster
    ceph-osd-id: "0"
    ceph-version: 17.2.5-0
    ceph_daemon_id: "0"
    ceph_daemon_type: osd
    device-type: rotational
```

@parth-gr parth-gr marked this pull request as ready for review October 18, 2022 15:30
@parth-gr parth-gr requested a review from travisn October 18, 2022 15:30
```diff
@@ -52,6 +52,7 @@ func (c *Cluster) getOSDLabels(osd OSDInfo, failureDomainValue string, portable
 	labels[OsdIdLabelKey] = stringID
 	labels[FailureDomainKey] = failureDomainValue
 	labels[portableKey] = strconv.FormatBool(portable)
+	labels["device-type"] = osd.DeviceType
```
Contributor

Suggested change:

```go
labels["device-type"] = osd.DeviceType
```

Can we avoid hard-coding this string and declare it as a constant, similar to the other labels above?

Member Author

okay
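The reviewer's suggestion can be sketched as follows. This is illustrative only: the constant and function names are hypothetical stand-ins for the operator's real label helpers (where `OsdIdLabelKey` and `FailureDomainKey` are declared), not Rook's actual code.

```go
package main

import (
	"fmt"
	"strconv"
)

// Hypothetical label-key constants, mirroring how OsdIdLabelKey and
// FailureDomainKey are declared elsewhere in the operator. Declaring the
// key once avoids the hard-coded string at the call site.
const (
	osdIDLabelKey      = "ceph-osd-id"
	deviceTypeLabelKey = "device-type"
)

// getOSDLabels is a minimal stand-in for the operator's label builder;
// the real function takes more inputs (failure domain, portability, ...).
func getOSDLabels(osdID int, deviceType string) map[string]string {
	return map[string]string{
		osdIDLabelKey:      strconv.Itoa(osdID),
		deviceTypeLabelKey: deviceType,
	}
}

func main() {
	fmt.Println(getOSDLabels(0, "rotational")[deviceTypeLabelKey]) // rotational
}
```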

```diff
@@ -67,6 +67,10 @@ jobs:
           # print existing client auth
           kubectl -n rook-ceph exec $toolbox -- ceph auth ls

+      - name: test OSD label
+        run: |
+          kubectl get pod -l ceph-osd-id=0 -nrook-ceph -o yaml
```
Member

Seems like we should either remove this, or add a check for the label. I think unit tests will be sufficient though, not sure we need the CI test.
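If the CI step were kept, it could assert the label instead of just printing the pod YAML. A sketch of what such a check might look like (the jsonpath key assumes the label name used at this point in the PR):

```yaml
- name: test OSD label
  run: |
    label=$(kubectl -n rook-ceph get pod -l ceph-osd-id=0 \
      -o jsonpath='{.items[0].metadata.labels.device-type}')
    if [ -z "$label" ]; then
      echo "device-type label not set on the OSD pod"
      exit 1
    fi
```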

Member Author

Just for testing whether the OSD has the label or not.

Member

So you'll revert this before merging?

```diff
@@ -229,6 +229,14 @@ func Provision(context *clusterd.Context, agent *OsdAgent, crushLocation, topolo
 		return errors.Wrap(err, "failed to configure devices")
 	}

+	for i := range deviceOSDs {
+		if deviceOSDs[i].DeviceClass == "hdd" {
```
Member

A few thoughts/questions about the device class:

  • Is this DeviceClass always set for an OSD? I assume it's returned by ceph-volume and it would be set.
  • The user can override the device class with a setting in the cluster CR, so it might still be rotational even though the device class is not hdd
  • If we're using the device class property, we should just name the OSD property as deviceClass instead of calling it rotational, since we don't really know if it's rotational.

Member

@umangachapagain Would the device class property be sufficient for your scenario? As long as the user doesn't override the device class and it's a rotational device, we would expect it to be set to hdd. And if that is sufficient, we already have an env var for ROOK_OSD_DEVICE_CLASS on the OSDs. Perhaps that would already work for your scenario? If not, we could add this same device class as a label. Or if the device class is not sufficient, we need to detect the rotational property differently.

Member

If users can override it, I'd want to avoid using it. How common is it for users to override it?

Member Author

It depends entirely on the user and whether they want to override it, so it differs from user to user based on their environment.
https://github.com/rook/rook/blame/master/Documentation/CRDs/Cluster/ceph-cluster-crd.md#L391

Here is the place in the Cluster CR where the user can specify deviceClass:

# Individual nodes and their config can be specified as well, but 'useAllNodes' above must be set to false. Then, only the named

So we need a DeviceType label on the OSDs, which would only be set if the DeviceType is governed by the prepare jobs?

Member

From the documentation:

By default, if a device's class has not already been set, OSDs will automatically set a device's class to either `hdd`, `ssd`, or `nvme`  based on the hardware properties exposed by the Linux kernel.

I think this info as a label would be good enough even if the device's class has already been set by the user.
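The default assignment the documentation describes can be sketched as follows. This is illustrative only, not Ceph's actual code path; `deviceClassFor` is a hypothetical helper, and the `override` parameter stands in for a user-set deviceClass in the cluster CR.

```go
package main

import "fmt"

// deviceClassFor sketches Ceph's documented default: if the class has not
// been set explicitly, an OSD derives it from hardware properties the
// Linux kernel exposes (rotational flag, NVMe transport).
func deviceClassFor(rotational bool, nvme bool, override string) string {
	if override != "" {
		// A user-specified deviceClass in the cluster CR wins, so the
		// class may not reflect whether the device is actually rotational.
		return override
	}
	if rotational {
		return "hdd"
	}
	if nvme {
		return "nvme"
	}
	return "ssd"
}

func main() {
	fmt.Println(deviceClassFor(true, false, ""))    // hdd
	fmt.Println(deviceClassFor(false, true, ""))    // nvme
	fmt.Println(deviceClassFor(true, false, "ssd")) // ssd (user override)
}
```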

Member Author (Oct 19, 2022)

Okay

```diff
@@ -96,6 +96,7 @@ type OSDInfo struct {
 	UUID string `json:"uuid"`
 	DevicePartUUID string `json:"device-part-uuid"`
 	DeviceClass string `json:"device-class"`
+	DeviceType string `json:"device-type"`
```
Member

I was going to suggest to change this to DeviceClass, but I see that property already exists. So maybe we just need to add the label with that deviceClass property that already exists.

Suggested change:

```diff
-	DeviceType string `json:"device-type"`
```

Member Author (Oct 19, 2022)

Yes, right, but it should be the device type.

```diff
@@ -32,6 +32,7 @@ const (
 	osdWalSizeEnvVarName = "ROOK_OSD_WAL_SIZE"
 	osdsPerDeviceEnvVarName = "ROOK_OSDS_PER_DEVICE"
 	osdDeviceClassEnvVarName = "ROOK_OSD_DEVICE_CLASS"
+	osdDeviceTypeEnvVarName = "ROOK_OSD_DEVICE_TYPE"
```
Member

Instead of an env var, let's add a label. But I see that we already have an env var for the device class so I wonder if that will be sufficient and we don't need a separate label.

Member Author

I guess adding a label that directly says whether it's rotational or not, derived from the device class info, would be fine.
@umangachapagain, thoughts?


```diff
@@ -52,6 +52,11 @@ func (c *Cluster) getOSDLabels(osd OSDInfo, failureDomainValue string, portable
 	labels[OsdIdLabelKey] = stringID
 	labels[FailureDomainKey] = failureDomainValue
 	labels[portableKey] = strconv.FormatBool(portable)
+	if osd.DeviceClass == "hdd" {
```
Member

If we're basing this on the device class, I really think we should just directly add the device-class label. Then whoever looks at the label can interpret it themselves as rotational or not.

Suggested change:

```diff
-	if osd.DeviceClass == "hdd" {
+	labels["device-class"] = osd.DeviceClass
```

Member Author

Done

Closes: rook#11059

Signed-off-by: parth-gr <paarora@redhat.com>
@parth-gr
Member Author

@umangachapagain please review it once.
Does it solve the actual issue?

@parth-gr
Member Author

@umangachapagain @travisn ^^

@sp98
Contributor

sp98 commented Oct 31, 2022

> Testing results: […]

Is the result still valid? It looks like the PR was updated to use deviceClass.

@parth-gr
Member Author

> Is the result still valid? It looks like the PR was updated to use deviceClass.

Yeah, these were the old testing results.

@travisn travisn merged commit d39258e into rook:master Nov 3, 2022
travisn added a commit that referenced this pull request Nov 3, 2022
osd: add deviceType label to an OSD (backport #11159)