docs: add dev guide explaining lvmo components [skip ci] #101

leelavg · 2022-02-02T08:06:38Z

Signed-off-by: Leela Venkaiah G <lgangava@redhat.com>

openshift-ci · 2022-02-02T08:06:40Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: leelavg
To complete the pull request process, please assign nbalacha after the PR has been reviewed.
You can assign the PR to them by writing /assign @nbalacha in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

travisn

Great to see the design writeup, thanks!

travisn · 2022-02-02T20:33:11Z

doc/dev-guide/reconciler.md

+- *csiDriver*: Reconciles topolvm CSI Driver
+- *topolvmController*: Reconciles topolvm controller plugin
+- *lvmVG*: Reconciles volume groups from LVMCluster CR
+- *openshiftSccs*: Manages SCCs when the operator is run in Openshift


The operator creates the SCCs? Security best practice is for SCCs to be defined statically rather than managed by an operator. For example, Rook defines its SCC in its upstream manifest here. Downstream has no way to define SCCs in OLM, so the workaround was to have the operator do it. Upstream, however, should really define the SCC in a standalone manifest so the operator doesn't need high privileges to create SCCs.

I may be wrong here, but is Rook deployed standalone downstream? Isn't it set up by meta operators, which also take care of the SCCs?

If Rook upstream indeed isn't setting SCCs from code, it still has to deploy the RBAC required to access the SCC API irrespective of upstream/downstream, doesn't it?

> I may be wrong here, but is Rook deployed standalone downstream? Isn't it set up by meta operators, which also take care of the SCCs?

Downstream Rook is only deployed as part of ODF and the OCS operator creates the necessary SCC since Rook does not create its own.

> If Rook upstream indeed isn't setting SCCs from code, it still has to deploy the RBAC required to access the SCC API irrespective of upstream/downstream, doesn't it?

The Rook operator does not ever create SCCs or RBAC. It's all defined in its helm chart (for helm users), or the common.yaml for other upstream users. So the admin has full control to grant the operator only exactly the permissions needed.

How should this be done for downstream outside of the operator? Bundles do not allow us to include SCCs so we decided to do this in the operator in order to make deployment easier.

Downstream does require the operator to do it. I'm just suggesting that for upstream there should be an option for the operator not to do it, so the SCC can be created separately and avoid giving the operator the extra privileges to create SCCs.

We can look into that.

travisn · 2022-02-02T20:36:16Z

doc/dev-guide/lvmo-units.md

+
+- *topolvmStorageClass* resource units creates and manages all the storage
+  classes corresponding to the deviceClasses in the LVMCluster
+- Storage Class name is generated with a prefix "topolvm-" added to name of the


Can a storage class name be specified without the topolvm- prefix being added? Admins generally like to control the names of their storage classes.

If the admin doesn't want to use a pre-created storage class, they are free to create a new storage class and not use the pre-created SC at all.

We aren't setting any SC as the default, just creating SCs corresponding to the device classes.

What about adding a property on the device class, something like `storageClassName`, that allows them to override the generated storage class name? It's just a suggestion for a future PR.
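A minimal sketch of that suggestion, written in Python purely for illustration and not taken from the operator's code. It assumes the generated name is the "topolvm-" prefix followed by the device class name. The `DeviceClass` type, its `storage_class_name` field, and `resolve_storage_class_name` are hypothetical names; no such override exists in the LVMCluster CR under review.

```python
from dataclasses import dataclass
from typing import Optional

# Prefix quoted in the doc under review.
PREFIX = "topolvm-"


@dataclass
class DeviceClass:
    """Hypothetical stand-in for a deviceClass entry in the LVMCluster CR."""
    name: str
    # Hypothetical override field; not part of the CR in this PR.
    storage_class_name: Optional[str] = None


def resolve_storage_class_name(dc: DeviceClass) -> str:
    """Return the override if set, otherwise the prefixed default."""
    if dc.storage_class_name:
        return dc.storage_class_name
    return PREFIX + dc.name
```

With a design like this, the one-click default is unchanged for admins who leave the field unset, and only those who set it get a custom name.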

The intention here is to provide a 1-click install so there is a working storage setup the admin can use from day 1, similar to how OCS does this. They can create their own SC later if they choose.

Upstream users appreciate customizing from the first install. The beauty of CRs is that they allow customization from the start. Admins don't generally want to have extra, unused, or uncustomizable storage classes. A 1-click install is great for the common case, but requiring it doesn't work well for advanced users.

A good point, and it can be taken up in later improvements. Our first priority was to have a working operator, and one of the requirements was a one-click installation.

Yes certainly, common cases first and this is just a suggestion for the future.

travisn · 2022-02-02T20:44:18Z

doc/dev-guide/lvmo-units.md

+  and creates the required volume groups on the individual nodes based on the
+  specified deviceSelector and nodeSelector.
+- The corresponding CRs forms the basis of `vgManager` unit to create volume
+  groups and create lvmd config file


How is the lvmd config file generated on each node? Does the vgManager have a hostPath mounted? Could you add more details about that?

The doc did explain hostPath in the initial version of this PR.

Maybe I misunderstood that comment and removed that info from the vg-manager doc.

@nbalacha do you want to see the mention of hostPath?

Yes, I think having the hostPath information here would be useful.

travisn · 2022-02-02T20:46:03Z

doc/dev-guide/reconciler.md

+- *vgManager*: Responsible for creation of Volume Groups
+- *topolvmStorageClass*: Manages storage class life cycle based on
+  devicesClasses in LVMCluster CR
+- The LVMO creates an LVMVolumeGroup CR for each deviceClass in the


This is the point I still don't understand... Why doesn't the admin create the LVMVolumeGroup CR directly? I don't see any description of what we're solving by embedding them in the cluster CR.

Same as in docs: add usage guide for lvmo [skip ci] #100 (comment).

It's mostly about multiple controllers having to update the status on the CR, which will result in parallel CR update requests.

We want the operator to have the ability to generate the LVMVolumeGroup CRs based on the LVMCluster CR. This allows us better control and validation as discussed earlier.

> It's mostly about multiple controllers having to update the status on the CR, which will result in parallel CR update requests.

The problem of updating the status seems the same either way. I would agree that the operator should own updating the status on the LVMVolumeGroup CRs, but that doesn't mean the admin can't create those CRs.

> We want the operator to have the ability to generate the LVMVolumeGroup CRs based on the LVMCluster CR. This allows us better control and validation as discussed earlier.

What control and validation is needed? These are the details I'm missing. It seems any meaningful validation must happen on the nodes by the vgmanager, not in the operator. The operator doesn't have insight into the storage actually available.

It could be something as simple as making sure there are not 2 entries trying to use up all the disks on all the nodes. It allows the operator to perform a first level validation before the vgmanager controller starts to create the VGs on the nodes. There could be other validations that we may perform in the future as the operator grows. IMO it is better to prevent issues than to try to fix things especially when dealing with local storage.
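As an illustration only, the kind of first-level check described above could look roughly like this Python sketch. It is not the operator's validation code: `DeviceClassSpec`, its fields, and `find_conflicts` are hypothetical, and the model assumes that an entry with no explicit device list claims every disk on the nodes it selects.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class DeviceClassSpec:
    """Hypothetical, simplified view of one deviceClass entry."""
    name: str
    nodes: Optional[Set[str]] = None      # None: applies to every node
    devices: Optional[List[str]] = None   # None: claims every available disk


def nodes_overlap(a: DeviceClassSpec, b: DeviceClassSpec) -> bool:
    """True if the two entries can land on at least one common node."""
    if a.nodes is None or b.nodes is None:
        return True
    return bool(a.nodes & b.nodes)


def find_conflicts(specs: List[DeviceClassSpec]) -> List[Tuple[str, str]]:
    """Return pairs of entries that both claim all disks on a shared node."""
    greedy = [s for s in specs if s.devices is None]
    conflicts = []
    for i, a in enumerate(greedy):
        for b in greedy[i + 1:]:
            if nodes_overlap(a, b):
                conflicts.append((a.name, b.name))
    return conflicts
```

A check like this needs only the CR contents, which is why it can run in the operator before any node is touched. It says nothing about which disks actually exist, which remains a node-level concern.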

The vgmanager will process the LVMVolumeGroup CRs serially, right? This means that if there is a conflict, the first one processed will win and the later ones may not have any devices available to add to its VG. This doesn't seem like an error condition, it just seems like the reality that if the admin wants multiple volume groups they will need to define the volume groups wisely. Why should conflicts be an error condition rather than a first-one-wins?

There are multiple nodes, and if multiple CRs are created at the same time, can we guarantee that all the nodes will process them in the same order?
If different nodes process the CRs in a different order, we cannot know which VGs can exist on which nodes. As the StorageClasses will have been created by then, PVs can potentially be created, and cleaning up then becomes difficult. Hence I want to have a level of control over this.
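The concern can be made concrete with a small Python sketch of a first-one-wins policy. This is a toy model, not the vgmanager implementation: `first_one_wins`, the device paths, and the VG names are all made up for illustration, and the sketch takes no position on whether nodes would actually see the CRs in different orders.

```python
from typing import Dict, List, Sequence, Set, Tuple


def first_one_wins(
    devices: Set[str],
    requests: Sequence[Tuple[str, Set[str]]],
) -> Dict[str, List[str]]:
    """Give each VG the devices it wants that are still free, in request order."""
    free = set(devices)
    layout: Dict[str, List[str]] = {}
    for vg, wanted in requests:
        claimed = sorted(wanted & free)
        free -= set(claimed)
        layout[vg] = claimed
    return layout


# Two nodes with identical disks, processing the same two requests
# in opposite orders (hypothetical data).
disks = {"/dev/sdb", "/dev/sdc"}
vg1 = ("vg1", {"/dev/sdb", "/dev/sdc"})
vg2 = ("vg2", {"/dev/sdc"})

node_a = first_one_wins(disks, [vg1, vg2])
node_b = first_one_wins(disks, [vg2, vg1])
```

Under these assumptions, `node_a` ends up with both disks in `vg1` and none in `vg2`, while `node_b` has one disk in each VG, so the resulting layout depends on processing order.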

The API server will notify controllers about the CRs in the order of creation, so I expect the ordering will be consistent.

As discussed on the call, I'm ok to keep this behavior for now until we give this more time to bake. In 4.10 it's anyway simplified with a single VG supported. I also don't see upgrades being affected negatively if we do decide later to remove the VGs from the LVMCluster CR. The LVMVolumeGroup CRs would have already been created by the operator, and the operator could just ignore the VGs in the LVMCluster after that.

I can see that if we have compelling validation and error-checking scenarios for clusters with multiple VGs, we could keep this design. But fundamentally I still recommend not including the VGs in the LVMCluster CR for these reasons:

- Bundling multiple resources into a single CR and forcing it to be a singleton is not a common K8s pattern I would expect on CRDs. A volume group is a logical unit, and therefore each VG makes sense as a single CR instance.
- Validation must be done on the nodes anyway by the vgmanager where the devices can be fully validated. The operator just can't fully validate available devices.
- Instead of assuming multiple VGs would cause conflicts as an error condition, allow the first VG to consume the device. The later VG will simply not find the available device. The later VG won't be in error, it just might not have devices available to fulfill storage requests.

As discussed on the call, I do not see any advantage to having the admin create the LVMVolumeGroup CRs. The actual interface exposed to the admin is the deviceClass - that is what we are internally mapping to lvm volume groups. We could decide to change this completely in the future.
The additional level of validation makes setup easier, especially when targeting hundreds of nodes. The node-level validation done by the vgmanager is a different level of validation.

> A volume group is a logical unit, and therefore each VG makes sense as a single CR instance.

Yes, but that is not something the user needs to know. As far as they are concerned, they get dynamic local storage, and exposing the lower layers to them is not really required. The LVMCluster CR will be fine-tuned in the future to provide this experience.

> Instead of assuming multiple VGs would cause conflicts as an error condition, allow the first VG to consume the device. The later VG will simply not find the available device. The later VG won't be in error, it just might not have devices available to fulfill storage requests.

If there are failures in reconciliation for one VG or the reconcile requests are, for some reason, sent in a different order (network issues etc), this can lead to different VGs on different nodes. Since cleaning up is not easy, it makes more sense to prevent such conflicts as early as possible.

As discussed, this is the approach we are following now. Any further decisions will be based on user experience and feedback and will be taken up later.

openshift-ci · 2022-04-13T09:55:13Z

@leelavg: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | `9f22f5d` | link | true | `/test images` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

leelavg mentioned this pull request Feb 2, 2022

docs: update readme, docs layout and add user, dev docs [skip ci] #87

Closed

travisn reviewed Feb 2, 2022

leelavg mentioned this pull request Feb 3, 2022

Resources documentation #102

Closed

nbalacha reviewed Feb 16, 2022

docs: add dev guide explaining lvmo components [skip ci]

9f22f5d

Signed-off-by: Leela Venkaiah G <lgangava@redhat.com>

nbalacha merged commit 7a67fe5 into openshift:main Apr 29, 2022
