
ceph: support OSD on PVC backed by LV #4219

Merged
travisn merged 1 commit into rook:master from cybozu-go:support-lv-backed-pv on Dec 5, 2019

Conversation

satoru-takeuchi (Contributor) commented Oct 30, 2019

Description of your changes:

"OSD on PVC" doesn't work for PV backed by LV. Fixing this problem
by the following changes.

  • Rook accepts the LVM disk type.
  • If an LV-backed device is passed, Rook/Ceph invokes
    "ceph-volume lvm prepare" with "--data vg/lv"
    instead of "--data /path/to/device".
  • If an LV-backed device is passed, Rook/Ceph suppresses
    activation/deactivation of the VG that owns this LV.
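
A rough sketch of the resulting argument selection, assuming the same package and imports as the surrounding OSD provisioning code (the wrapper function name is hypothetical; getLVNameFromDevicePath and the lvBackedPV flag follow the diff discussed later in this review):

// Hypothetical wrapper illustrating the change; only the "--data vg/lv" vs.
// "--data /path/to/device" distinction comes from this PR.
func dataArgForDevice(context *clusterd.Context, deviceName string, lvBackedPV bool) (string, error) {
	deviceArg := deviceName // e.g. "/dev/xvdf" for a raw block PV
	if lvBackedPV {
		// For an LV-backed PV, ceph-volume expects "vg/lv" rather than a device path.
		lv, err := getLVNameFromDevicePath(context, deviceName)
		if err != nil {
			return "", fmt.Errorf("failed to get lv name from device path %q. %+v", deviceName, err)
		}
		deviceArg = lv
	}
	// Used as: ceph-volume lvm prepare --data <deviceArg>
	return deviceArg, nil
}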

Which issue is resolved by this Pull Request:
Resolves #4185

Checklist:

  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.
  • Comments have been added or updated based on the standards set in CONTRIBUTING.md
  • Add the flag for skipping the CI if this PR does not require a build. See here for more details.
func IsLV(devicePath string, executor exec.Executor) (bool, error) {
	devProps, err := GetDevicePropertiesFromPath(devicePath, executor)
	if err != nil {
		return false, fmt.Errorf("failed to get device properties for %s: %+v", devicePath, err)

leseb (Member) commented Oct 30, 2019:

use %q instead of %s

	}
	diskType, ok := devProps["TYPE"]
	if !ok {
		return false, fmt.Errorf("TYPE property is not found for %s", devicePath)

leseb (Member) commented Oct 30, 2019:

ditto
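
For context, the whole check assembled from the fragments above might read as follows, assuming the same package and imports; the final return line is an assumption (sys.LVMType presumably maps to lsblk's "lvm" TYPE), and leseb's %q suggestion is applied:

// Assembled from the review fragments above; only the final return is assumed.
func IsLV(devicePath string, executor exec.Executor) (bool, error) {
	devProps, err := GetDevicePropertiesFromPath(devicePath, executor)
	if err != nil {
		return false, fmt.Errorf("failed to get device properties for %q: %+v", devicePath, err)
	}
	diskType, ok := devProps["TYPE"]
	if !ok {
		return false, fmt.Errorf("TYPE property is not found for %q", devicePath)
	}
	// Assumption: sys.LVMType is the lsblk TYPE string for LVM logical volumes.
	return diskType == sys.LVMType, nil
}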

	}
	out, err := context.Executor.ExecuteCommandWithOutput(true, "", "dmsetup", "splitname", devInfo, "--noheadings")
	if err != nil {
		return "", fmt.Errorf("failed dmsetup splitname %s. err: %v", devInfo, err)

leseb (Member) commented Oct 30, 2019:

use fmt.Errorf("failed dmsetup splitname %q. %+v", devInfo, err)

	devInfo, err := context.Executor.ExecuteCommandWithOutput(true, "",
		"dmsetup", "info", "-c", "--noheadings", "-o", "name", devicePath)
	if err != nil {
		return "", fmt.Errorf("failed dmsetup info. output: %s, err: %v", devInfo, err)

leseb (Member) commented Oct 30, 2019:

use: fmt.Errorf("failed dmsetup info. output: %q. %+v", devInfo, err)

	}
	split := strings.Split(out, ":")
	if len(split) < 2 {
		return "", fmt.Errorf("dmsetup splitname returned unexpected result for %s. output: %s", devInfo, out)

leseb (Member) commented Oct 30, 2019:

use: fmt.Errorf("dmsetup splitname returned unexpected result for %q. output: %q", devInfo, out)

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 6b48979 to 312151c Oct 30, 2019

satoru-takeuchi (Author) commented Oct 30, 2019:

@leseb Thank you for your comment. Fixed.

leseb (Member) left a comment

Please rebase. Also add some documentation and a release note.

	// pass 'vg/lv' to ceph-volume
	deviceArg, err = getLVFromDevicePath(context, device.Config.Name)
	if err != nil {
		return "", fmt.Errorf("failed to get device path from lv. %+v", err)

leseb (Member) commented Oct 30, 2019:

Print out which LV.

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 312151c to 8ed5e6e Nov 1, 2019

satoru-takeuchi (Author) commented Nov 1, 2019:

@leseb Thanks, fixed.

	// pass 'vg/lv' to ceph-volume
	deviceArg, err = getLVNameFromDevicePath(context, device.Config.Name)
	if err != nil {
		return "", fmt.Errorf("failed to get lv name from device path %s. %+v", device.Config.Name, err)

leseb (Member) commented Nov 4, 2019:

Use %q instead of %s.

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 8ed5e6e to ef8b3df Nov 5, 2019

satoru-takeuchi (Author) commented Nov 5, 2019:

@leseb Oops, sorry. Fixed.

leseb approved these changes Nov 5, 2019
leseb added this to the 1.2 milestone Nov 5, 2019
leseb added this to In progress in v1.2 via automation Nov 5, 2019
leseb requested review from travisn and BlaineEXE Nov 5, 2019

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch 5 times, most recently from 15cd73d to 176060b Nov 7, 2019

satoru-takeuchi (Author) commented Nov 7, 2019:

Resolved conflicts caused by recent commits to master.

v1.2 automation moved this from In progress to Review in progress Nov 8, 2019
leseb (Member) left a comment

Can you update the doc as well? A mention somewhere in https://rook.io/docs/rook/v1.1/ceph-cluster-crd.html#storage-selection-settings would be nice. Thanks.

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 176060b to e3b0ee3 Nov 12, 2019

satoru-takeuchi (Author) commented Nov 12, 2019:

Added a description of the supported block devices for OSD on PVC.

satoru-takeuchi requested a review from leseb Nov 12, 2019
satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch 2 times, most recently from 619ed16 to 6d8f66c Nov 13, 2019

	var err error
	if len(devices.Entries) == 0 {
		logger.Infof("no new devices to configure. returning devices already configured with ceph-volume.")
		osds, err = getCephVolumeOSDs(context, a.cluster.Name, a.cluster.FSID, lv, false)
		osds, err = getCephVolumeOSDs(context, a.cluster.Name, a.cluster.FSID, lv, false, false)

leseb (Member) commented Nov 13, 2019:
Suggested change
osds, err = getCephVolumeOSDs(context, a.cluster.Name, a.cluster.FSID, lv, false, false)
osds, err = getCephVolumeOSDs(context, a.cluster.Name, a.cluster.FSID, lv, false, lvBackedPV)

?

satoru-takeuchi (Author) commented Nov 28, 2019:

fixed

@@ -80,7 +80,7 @@ func PopulateDeviceInfo(d string, executor exec.Executor) (*sys.LocalDisk, error
 }
 
 diskType, ok := diskProps["TYPE"]
-if !ok || (diskType != sys.SSDType && diskType != sys.CryptType && diskType != sys.DiskType && diskType != sys.PartType && diskType != sys.LinearType) {
+if !ok || (diskType != sys.SSDType && diskType != sys.CryptType && diskType != sys.DiskType && diskType != sys.PartType && diskType != sys.LinearType && diskType != sys.LVMType) {

travisn (Member) commented Nov 13, 2019:

If the LVM disk type is allowed here, it will also impact the non-PVC scenario. Is there any reason not to support this for the non-PVC scenario? This feature has been requested by others, as also seen in #2047. Could you update this PR to support the non-PVC scenario as well? Thanks!

satoru-takeuchi (Author) commented Nov 13, 2019:

Not at all. I didn't handle the non-PVC scenario since I considered them to be separate features.

I'll update this PR. Please wait for a while.

travisn (Member) commented Nov 13, 2019:

thanks for looking into it!

travisn (Member) commented Nov 26, 2019:

I'm ok if we need to address the non-PVC scenario with a separate PR, but if it's not much different it would be great to support it at the same time. Were you able to investigate yet? If not, let's just focus on finishing this for v1.2 feature complete next week. Thanks!

@@ -236,7 +236,7 @@ The following are the settings for Storage Class Device Sets which can be config
 * `portable`: If `true`, the OSDs will be allowed to move between nodes during failover. This requires a storage class that supports portability (e.g. `aws-ebs`, but not the local storage provisioner). If `false`, the OSDs will be assigned to a node permanently. Rook will configure Ceph's CRUSH map to support the portability.
 * `volumeClaimTemplates`: A list of PVC templates to use for provisioning the underlying storage devices.
 * `resources.requests.storage`: The desired capacity for the underlying storage devices.
-* `storageClassName`: The StorageClass to provision PVCs from. Default would be to use the cluster-default StorageClass.
+* `storageClassName`: The StorageClass to provision PVCs from. Default would be to use the cluster-default StorageClass. This StorageClass should provide raw block device, partition, encrypted device, or logical volume.

travisn (Member) commented Nov 26, 2019:

How about if you also add an example of how to use an LV in this scenario? Or does it just depend on the storage class having those options? How does the OSD provisioning code know which type it needs to provision?

travisn (Member) commented Nov 26, 2019:

This PR looks like it just enables LVs. This comment also indicates partitions and encrypted devices. How were they enabled?

satoru-takeuchi (Author) commented Nov 28, 2019:

> Or does it just depend on the storage class having those options?

Yes, it does.

> How does the OSD provisioning code know which type it needs to provision?

It's not for the OSD provisioning code, but for users who specify the storage class. I added the list of device types to let users know which types of PV (block device) are accepted by this feature.

> This PR looks like it just enables LVs. This comment also indicates partitions and encrypted devices. How were they enabled?

It doesn't refer to what I added support for (LV), but to the supported device types described in the following line.

if !ok || (diskType != sys.SSDType && diskType != sys.CryptType && diskType != sys.DiskType && diskType != sys.PartType && diskType != sys.LinearType && diskType != sys.LVMType) {

If this line is not necessary, I'll remove it or use a better wording.

travisn (Member) commented Dec 3, 2019:

This PR only adds support for LVs. Should the statement be the following?

This StorageClass should provide a raw block device or logical volume. Other types are not supported.

We're not adding support for partitions or encrypted devices, right?

satoru-takeuchi (Author) commented Dec 4, 2019:

You're right, will fix.

@@ -16,6 +16,7 @@
 - The on-host log directory for OSDs was set incorrectly to `<dataDirHostPath>/<namespace>/log`;
   fix this to be `<dataDirHostPath>/log/<namespace>`, the same as other daemons.
 - Use the mon configuration database for directory-based OSDs, and do not generate a config
+- Support PersistentVolume backed by LVM Logical Volume for "OSD on PVC".

travisn (Member) commented Nov 26, 2019:

Are you also adding support for partitions and encrypted devices? These are mentioned in your doc change above.

satoru-takeuchi (Author) commented Nov 28, 2019:

As I mentioned above, partitions and encrypted devices were already supported before my PR.


satoru-takeuchi (Author) commented Nov 27, 2019:

@travisn

> I'm ok if we need to address the non-PVC scenario with a separate PR, but if it's not much different it would be great to support it at the same time. Were you able to investigate yet? If not, let's just focus on finishing this for v1.2 feature complete next week. Thanks!

OK, I'll focus on reflecting your comments for LV-backed PV and complete this feature in v1.2.
I'll update the PR tomorrow. In addition, I'll send another PR that handles OSD on non-PV LV later.

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 6d8f66c to 7304e96 Nov 28, 2019

satoru-takeuchi (Author) commented Nov 28, 2019:

@travisn @leseb I updated the PR, PTAL.

@@ -50,16 +50,18 @@ const (
 cephVolumeCmd = "ceph-volume"
 cephVolumeMinDBSize = 1024 // 1GB
 lvmFilter = `filter = [ "a|^/mnt/.*|", "r|.*|" ]`
+lvmFilterPVBacked = `filter = [ "a|^/mnt/.*|", "a|^/dev/.*|", "r|.*|" ]`

leseb (Member) commented Nov 28, 2019:

In the PVC case, we must skip /dev/loop devices, so please change "a|^/dev/.*|" to accept everything except loop devices. Thanks.

satoru-takeuchi (Author) commented Dec 4, 2019:

Thanks, changed as you said.
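
For reference, a possible shape of the adjusted constant (illustrative only, not necessarily the exact merged value): LVM filter rules are evaluated in order and the first match wins, so the loop-device reject rule must come before the broad /dev accept rule.

// Illustrative sketch: reject loop devices before accepting the rest of /dev.
lvmFilterPVBacked = `filter = [ "a|^/mnt/.*|", "r|^/dev/loop.*|", "a|^/dev/.*|", "r|.*|" ]`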

travisn (Member) left a comment

A merge conflict needs to be resolved before it is merged.


satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 7304e96 to 0acc196 Dec 4, 2019

travisn (Member) commented Dec 4, 2019:

@satoru-takeuchi There is one more merge conflict again. I think we'll be good to merge after that.

satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 0acc196 to 27ac2a7 Dec 5, 2019

travisn approved these changes Dec 5, 2019
leseb approved these changes Dec 5, 2019
@@ -84,7 +84,7 @@ func StartOSD(context *clusterd.Context, osdType, osdID, osdUUID, lvPath string,
 logger.Errorf("failed to start osd or shutting down. %+v", err)
 }
 
-if pvcBackedOSD {
+if pvcBackedOSD && !lvBackedPV {

leseb (Member) commented Dec 5, 2019:

Out of curiosity why don't you need this?

satoru-takeuchi (Author) commented Dec 6, 2019:

Oops, sorry. I overlooked this comment.

After running ceph-volume, OSD-on-PVC manages (activates/deactivates) the VG since it is dedicated to the OSD. However, this shouldn't be done in the OSD-on-LV-backed-PVC case, since the VG is not only for the OSD but also for many PVCs; the CSI driver is in charge of managing that VG. For the same reason, activating the VG (the code block from line #60) is also skipped.

Does this answer make sense?

leseb (Member) commented Dec 6, 2019:

Makes sense! Thanks for the reply.
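
A minimal sketch of the shutdown-path guard described above; the helper name is hypothetical, and only the pvcBackedOSD && !lvBackedPV condition comes from the diff:

// Sketch only: releaseVG is a hypothetical stand-in for the VG deactivation
// Rook performs for a dedicated, ceph-volume-created VG.
if pvcBackedOSD && !lvBackedPV {
	if err := releaseVG(context, lvPath); err != nil {
		logger.Errorf("failed to deactivate the OSD's VG %q. %+v", lvPath, err)
	}
}
// For an LV-backed PV, the VG is shared with other PVCs and managed by the
// CSI driver, so Rook leaves activation and deactivation to it.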

"OSD on PVC" doesn't work for PV backed by LV. Fixing this problem
by the following changes.

- Rook accepts LVM disk type.
- If a LV-backed device is passed, Rook/Ceph invokes
  "ceph-volume lvm prepare" with "--data vg/lv"
  instead of "--data /path/to/device".
- If a LV-backed device is passed, Rook/Ceph suppresses
  activation/deactivation of VG that owns this LV.

Fixes: #4185
Signed-off-by: dulltz <isrgnoe@gmail.com>
Co-authored-by: Satoru Takeuchi <satoru.takeuchi@gmail.com>
satoru-takeuchi force-pushed the cybozu-go:support-lv-backed-pv branch from 27ac2a7 to dfe45ac Dec 5, 2019
BlaineEXE (Member) left a comment

I think the repeated failures suggest that the integration tests are likely not cleaning up the environment properly when they are done. I don't think we should merge this unless tests are passing normally (i.e., with only occasional random failures).

BlaineEXE (Member) left a comment

I recently learned the NFS test failures are a known issue. Approving and running again to skip integration tests.

travisn merged commit 6c54378 into rook:master Dec 5, 2019
4 checks passed:
DCO
Summary: 1 potential rule
commitlint found 0 problems, 0 warnings
continuous-integration/jenkins/pr-head: This commit looks good
v1.2 automation moved this from Review in progress to Done Dec 5, 2019
leseb added a commit to leseb/rook that referenced this pull request Dec 9, 2019
Since rook#4219, lv devices are now
presented to ceph-volume when preparing the device.
So on this initial run, this won't fail because the dm hasn't been
created yet but if an orchestration is re-triggered, the prepare pod
ensures idempotency with the ceph-volume batch command.
Unfortunately, c-v seems to have a bug where it doesn't read the dm to
detect whether or not they are ceph members already.
Ceph bug: https://tracker.ceph.com/issues/43209

Signed-off-by: Sébastien Han <seb@redhat.com>
leseb added a commit to leseb/rook that referenced this pull request Dec 9, 2019

leseb mentioned this pull request Dec 9, 2019

leseb added a commit to leseb/rook that referenced this pull request Dec 9, 2019

satoru-takeuchi mentioned this pull request Dec 10, 2019

rajatsingh25aug pushed a commit to rajatsingh25aug/rook that referenced this pull request Dec 17, 2019

zoetrope added a commit to cybozu-go/rook that referenced this pull request Dec 26, 2019
Projects: v1.2 (Done)
5 participants