
"Config" in the CRD name is redundant IMO. How about a name without "Config"? #8

Closed
Madhu-1 opened this issue May 29, 2024 · 12 comments

Madhu-1 commented May 29, 2024

I think I commented on this previously, but can't find it now or if there was a response... "Config" in the CRD name is redundant IMO. How about a name without "Config"? For example, CephCSIPeerMapping, CephCSIPoolMapping, CephCSIClusterMapping, or even just CephCSIMapping
kind: CephCSIPeerMapping

Originally posted by @travisn in #1 (comment)


travisn commented May 29, 2024

@Madhu-1 @nb-ohad @BlaineEXE To follow up from the design doc that was merged, I'd like to continue discussing the naming and structure of the CRDs.

CephCSIOperatorConfig --> CephCSIDriverDefaults

99% of the settings in this appear to be for the driver defaults. Why not name it as such? The only setting besides the driver defaults appears to be the operator log level, which could be split off elsewhere. Then it is obvious to the user that they add defaults to this CR, while they can override the defaults in CephCSIDriver.
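
For illustration, a defaults CR under this proposal might look something like the following. This is only a sketch; the kind, apiVersion, and field names are assumptions for discussion, not a settled API:

apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIDriverDefaults       # hypothetical kind under this proposal
metadata:
  name: defaults
spec:
  logLevel: 5                     # assumed field: default log verbosity for all drivers
  enableMetadata: true            # assumed field: an example driver-wide default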

CephCSICephCluster and CephCSIConfig --> CephCSICluster and CephCSIConnection

This CRD needs to be split into more than one CRD. One CRD should hold the connection info, and another should hold the settings for the drivers that manage that CephCluster. This way, Rook can generate the connection info and it won't interfere with the settings that the user wants to apply to the cluster. This is why I think we should include different consumer scenarios (e.g. Rook) in the CephCSI operator design doc, so we can recognize what may be generated vs what should be set by users.

The settings in this CRD also seem to have a very similar purpose to the settings in CephCSIConfig, since in both CRDs they are for a single CephCluster. What is the difference between them from the user's perspective? Maybe there is some implementation difference under the covers about which secrets or configmaps are generated, but the user doesn't care about that.

Seems like we need the connection info in a separate CRD from the user settings for a cluster, so we might end up with CRDs such as the following (sketched below):

  • CephCSIConnection: Mon endpoints, which could be generated in some scenarios (e.g. by Rook)
  • CephCSICluster: All user settings for a CephCluster that apply to the driver. This includes all the settings currently in the CephCSICephCluster (except the mon endpoints) plus the settings in the CephCSIConfig CRD
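
For illustration, a hypothetical pair of CRs under this split might look like the following. The kinds and field names are assumptions for discussion, not a proposed schema:

apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIConnection           # connection info, possibly generated (e.g. by Rook)
metadata:
  name: my-cluster
spec:
  monitors:
    - 10.0.0.1:6789
    - 10.0.0.2:6789
---
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSICluster              # user settings for the drivers against that CephCluster
metadata:
  name: my-cluster
spec:
  connectionRef: my-cluster       # assumed linkage to the connection CR
  readAffinity:                   # example user setting; placement is illustrative
    enabled: true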

My initial suggestion was going to be CephCSICephConnection and CephCSICephCluster, but then "Ceph" is very redundant. Perhaps it is more precise if the latter "Ceph" is removed from the name, but at the same time perhaps those redundant names are better to make it more obvious that the settings correspond to a CephCluster. I could go either way.

CephCSIConfigMapping --> CephCSIClusterMapping

Based on the suggestion to refactor the previous CRDs, this CRD could be named CephCSIClusterMapping or CephCSICephClusterMapping


BlaineEXE commented May 29, 2024

This is again a place where I feel like well-described user stories will be critical for having a productive conversation about this work. I think it will help us all to have the different scenarios described: who the actors are and what they can do. With that information, we can better envision and vet whether the designs are meeting the goals and workflows we have in mind, or whether there are gaps.

@Madhu-1 and @nb-ohad are the ones who have already worked on this design, and I get a sense that they know what the user stories are and who the actors are, but those of us on the outside need these things to be able to understand the design in the context of its intent. Otherwise, we are very much in the dark, and I don't think we can provide quality feedback.

Scenario 1: Upstream Rook

In this scenario, the administrator deploys ceph-csi-operator via default manifests that are present in rook/rook. The manifests set up ceph-csi-operator in single-namespace mode. The admin is actor 1.

Rook is actor 2. Rook configures X, Y, and Z resources automatically for the admin.

If the admin wants to override configs, they create X new CR to apply overrides.

Scenario 2: Consumer mode

In this scenario, an admin deploys ceph-csi-operator via OLM. ceph-csi-operator is set up in X mode. The admin is actor 1. ceph-csi-operator is a client for an external Ceph cluster. The admin has created a client operator as well to help automate things. The client operator is actor 2.

The client operator gets connection details for the Ceph cluster via means that are outside ceph-csi-operator's scope. The operator configures X, Y, and Z ceph-csi-operator CRs to connect to the external controller.

If the admin wants to override configs, they take X and Y, or Z action.

Scenario 3: A user creates ceph-csi-operator and manages it themselves

This is probably the simplest case to describe. The user configures X, Y, and Z resources to connect to their Ceph cluster. The user does not have to consider overrides or defaults, because they are the only actor in the system.

Scenario 4: whatever other modes are necessary

etc. etc. etc.

I particularly want to see a scenario description in which multi-namespace is involved.

I think I can extrapolate multi-namespace to all-namespaces by assuming that the user has a fully separate k8s cluster just for storage, to keep things segregated for security.


travisn commented Jun 7, 2024

Based on comments and discussion with @Madhu-1 in rook/rook#14260, here are some follow-ups on CRDs and naming to consider...

CephCSICephCluster CRD

  • Consider removing cephConfig from this CRD. It's currently only done upstream for a corner case where cephx auth needed to be disabled. If it's really such a corner case, maybe it shouldn't be in a CRD and should continue to exist only as a configmap.
  • kernelMountOptions and fuseMountOptions should be moved into this CRD

CephCSIConfig CRD

  • Naming: The purpose of this CRD is to define a name that will be used as the clusterID in a storage class (see the sketch below); it's not just a generic config option. How about a name of CephCSIClusterID or CephCSIStorageClassID?
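
For context, this is roughly how that name surfaces in a storage class today. The values are illustrative, while clusterID and pool are existing ceph-csi storage class parameters:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rbd-storage
provisioner: rbd.csi.ceph.com
parameters:
  clusterID: my-csi-config        # matches the name defined by the CephCSIConfig CR
  pool: replicapool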

CephCSIOperatorConfig CRD

  • Naming: CephCSIDriverDefaults still seems much more specific and descriptive. If other operator settings are included, that name still seems fine. The common use case is to define the defaults in one place, so it is appropriate to name it for the common case. Users would likely rarely need to override a log level or other operator settings.

CephCSIConfigMapping

  • Naming: The purpose is to map pools from one cluster to another between mirrored clusters. Previously, it had been described as a mapping that is based on the CephCSIConfig CRD name, but I don't see that relationship now. The CephCSIConfig is for creating a clusterID in a storage class, but this CRD is for pool mappings, which is completely different. What about a name such as CephCSIMirroredPoolMapping?


nb-ohad commented Jun 9, 2024

CephCSIConfig CRD

The purpose of this CRD is not to define a name, but to decouple the configuration needed to consume Ceph-based storage from the storage class itself. This was done for a couple of reasons:

  1. The storage class params are immutable and cannot be updated
  2. To avoid the repetition of identical configuration on multiple storage classes

The linkage between a storage class and this configuration is indeed by name/ID, but this link does not define the essence of the information residing in this CRD. This is why we chose CephCSIConfig as the name, and I don't think the proposed name is a better alternative.

CephCSIOperatorConfig

Changing the name of this CRD to CephCSIDriverDefaults does not make sense, mainly because the driver defaults are just a single aspect of the information included in this CR. This CR is intended to offer global/singleton information for the use/configuration of the operator and its reconciliation process.
It is true that at this point, the driver defaults substruct is the majority of the content described by this CR. Still, it is most unlikely that this part will see rapid change when compared to other future operator configuration fields. As an example, we are already discussing additions outside of the defaults section to provide image sets to the operator, to override the default images for CSI components (to generalize and cover one of the use cases of CSI orchestration that is available today in ocs-client-operator).

CephCSIConfigMapping

This CR enables you to create an "alias" for a config within the context of a single block pool. It is incorrect to assume a purpose based on multi-cluster environments for this CR. There are other use cases that we considered that are internal to a single cluster. Take, for example, the need to rename a config for a specific storage class (we anticipate a need for this in the future, especially around cluster migrations).
With that in mind, I am still open to renaming as long as the new name is not specifically tied to "mirroring".


travisn commented Jun 10, 2024

CephCSIConfig CRD

The purpose of this CRD is not to define a name, but to decouple the configuration needed to consume Ceph-based storage from the storage class itself. This was done for a couple of reasons:

  1. The storage class params are immutable and cannot be updated
  2. To avoid the repetition of identical configuration on multiple storage classes

The linkage between a storage class and this configuration is indeed by name/ID, but this link does not define the essence of the information residing in this CRD. This is why we chose CephCSIConfig as the name, and I don't think the proposed name is a better alternative.

@Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class.

Agreed the name still doesn't make a lot of sense. IMO the naming would be more clear if this were split into two CRDs, then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.
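
For illustration, that split might look something like the following. The kinds and fields are hypothetical, sketched only from the settings mentioned above:

apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIRadosNamespace
metadata:
  name: tenant-a-rbd              # the name referenced as clusterID in an RBD storage class
spec:
  cephClusterRef: my-cluster      # assumed reference to the cluster-level CR
  radosNamespace: tenant-a
---
apiVersion: csi.ceph.io/v1alpha1
kind: CephCSISubvolumeGroup
metadata:
  name: tenant-a-cephfs
spec:
  cephClusterRef: my-cluster
  subvolumeGroup: tenant-a-group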

CephCSIOperatorConfig

Changing the name of this CRD to CephCSIDriverDefaults does not make sense, mainly because the driver defaults are just a single aspect of the information included in this CR. This CR is intended to offer global/singleton information for the use/configuration of the operator and its reconciliation process. It is true that at this point, the driver defaults substruct is the majority of the content described by this CR. Still, it is most unlikely that this part will see rapid change when compared to other future operator configuration fields. As an example, we are already discussing additions outside of the defaults section to provide image sets to the operator, to override the default images for CSI components (to generalize and cover one of the use cases of CSI orchestration that is available today in ocs-client-operator).

What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

CephCSIConfigMapping

This CR enables you to create an "alias" for a config within the context of a single block pool. It is incorrect to assume a purpose based on multi-cluster environments for this CR. There are other use cases that we considered that are internal to a single cluster. Take, for example, the need to rename a config for a specific storage class (we anticipate a need for this in the future, especially around cluster migrations). With that in mind, I am still open to renaming as long as the new name is not specifically tied to "mirroring".

What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.


nb-ohad commented Jun 13, 2024

@Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class.

Agreed the name still doesn't make a lot of sense. IMO the naming would be more clear if this were split into two CRDs, then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.

I am not sure we should do that. The fact that the only config we have today is these two pieces of information does not mean it will stay that way for long. We should design our API based on the different aspects of CSI and not completely based on the current information available. In this specific case, we identified the need for a CR that represents a central configuration that will be attached to a storage class, to allow CSI to understand how that storage class is related to the Ceph cluster. This CR includes any fields that Ceph CSI needs to utilize Ceph correctly. In an ideal world that information would have been embedded in the storage class params, but unfortunately, due to the limitations I described in a previous comment, it was decided to move it into the CSI config map entry, which we are formalizing with this CR.

What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

The concern of this CR is not default configuration; it is the operator configuration, similar to an operator config map (which is a given when you deal with OLM operators). The fact that we consider default driver configuration as part of the operator configuration is just one aspect of this CR. This CR mainly describes the way the operator should behave.

What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.

Right now the only concern inside the CR is pool mapping, but if you look carefully, the mapping is not the top-level key. This was done because we identified that in the future there are going to be other aspects of mapping between CSI configurations. So pool mapping is just one aspect of this API, and for that reason I will argue that CephCSIPoolMapping is not a good name.
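
For illustration, the shape described above might look something like this. The field names are assumptions; the point is only that pool mapping sits nested under a more general mapping structure:

apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIConfigMapping
metadata:
  name: map-config-a-to-b
spec:
  mappings:                       # the mapping is nested, not the top-level key
    - localConfigName: config-a   # assumed linkage by CephCSIConfig name
      remoteConfigName: config-b
      blockPoolMapping:           # pool mapping as just one aspect of a mapping
        - local: pool-1
          remote: pool-2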


travisn commented Jun 13, 2024

@Madhu-1 had mentioned that the mount options would be moved to the CephCSICephCluster CRD, so it seems that all that is left in this CRD is to associate the rados namespace or the subvolume group with some name, which can then be added to the storage class.
Agreed the name still doesn't make a lot of sense. IMO the naming would be more clear if this were split into two CRDs, then they could be named CephCSIRadosNamespace and CephCSISubvolumeGroup.

I am not sure we should do that. The fact that the only config we have today is these two pieces of information does not mean it will stay that way for long. We should design our API based on the different aspects of CSI and not completely based on the current information available. In this specific case, we identified the need for a CR that represents a central configuration that will be attached to a storage class, to allow CSI to understand how that storage class is related to the Ceph cluster. This CR includes any fields that Ceph CSI needs to utilize Ceph correctly. In an ideal world that information would have been embedded in the storage class params, but unfortunately, due to the limitations I described in a previous comment, it was decided to move it into the CSI config map entry, which we are formalizing with this CR.

What about the CRD name CephResource? Its purpose is to associate some Ceph resource with CSI, so it seems flexible and descriptive enough.

What about CephCSIDefaultConfig? The new settings you described also seem to fit in this name.

The concern of this CR is not default configuration; it is the operator configuration, similar to an operator config map (which is a given when you deal with OLM operators). The fact that we consider default driver configuration as part of the operator configuration is just one aspect of this CR. This CR mainly describes the way the operator should behave.

Operator is such a generic name; I disagree with using that name in any case. What other alternatives do you suggest? Any idea I suggest is rejected. What I am against is a name so general it is meaningless.

What about CephCSIPoolMapping? This is generic enough that the pools are being mapped for some purpose, whether for mirroring or within the same cluster.

Right now the only concern inside the CR is pool mapping, but if you look carefully, the mapping is not the top-level key. This was done because we identified that in the future there are going to be other aspects of mapping between CSI configurations. So pool mapping is just one aspect of this API, and for that reason I will argue that CephCSIPoolMapping is not a good name.

If this is intended to be based on the naming of the CephCSIConfig CRD (which above is suggested to be named CephResource), what about CephResourceMapping?


nb-ohad commented Jun 16, 2024

What about the CRD name CephResource? Its purpose is to associate some Ceph resources with CSI, so it seems flexible and descriptive enough.

CephResource implies the content is specific to a single Ceph resource, which is not the case here. What about CephSettings or CephConfiguration, where "Configuration" is a noun, as in one of many configurations?

Operator is such a generic name; I disagree with using that name in any case. What other alternatives do you suggest? Any idea I suggest is rejected. What I am against is a name so general it is meaningless.

I agree that operator is a very generic name. What I think we should do is follow the convention set by the Operator Framework, where they automatically generate an OperatorConfig config map (which is of course untyped data). This will bring us back to the OperatorConfig name for the CRD (with a fully qualified name of operatorconfigs.csi.ceph.io).


travisn commented Jul 2, 2024

What about the CRD name CephResource? Its purpose is to associate some Ceph resources with CSI, so it seems flexible and descriptive enough.

CephResource implies the content is specific to a single Ceph resource, which is not the case here. What about CephSettings or CephConfiguration, where "Configuration" is a noun, as in one of many configurations?

Are you saying CephResource shouldn't be used because multiple resources could be configured with a single CR? The difference between singular and plural seems unimportant. CRs can be queried with either their singular or plural names, interchangeably. So CephResource and CephResources would be the same CRD.

Operator is such a generic name; I disagree with using that name in any case. What other alternatives do you suggest? Any idea I suggest is rejected. What I am against is a name so general it is meaningless.

I agree that operator is a very generic name. What I think we should do is follow the convention set by the Operator Framework, where they automatically generate an OperatorConfig config map (which is of course untyped data). This will bring us back to the OperatorConfig name for the CRD (with a fully qualified name of operatorconfigs.csi.ceph.io).

Ok, I could go with operatorconfigs.csi.ceph.io. What still seems awkward to me is that defaults can be specified in this CRD that can be overridden by a different CRD. I wonder if there is a different convention for the defaults that would be more natural. For example, perhaps create an instance of CephCSIDriver with a special name such as ceph-csi-defaults. This way a single CRD owns the schema for the driver settings.
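
For illustration, that convention might look like this. It is only a sketch of the idea, not an agreed design, and the fields shown are assumptions:

apiVersion: csi.ceph.io/v1alpha1
kind: CephCSIDriver
metadata:
  name: ceph-csi-defaults         # reserved name the operator would treat as the defaults
spec:
  logLevel: 5                     # assumed field; any driver setting could be defaulted here
                                  # and overridden in a normally named CephCSIDriver CR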


travisn commented Jul 17, 2024

Given all the separate naming discussions in this issue, discussion in #10, and slack discussions, here is my summary of the naming proposals and open questions. The listed names are based on the current design doc names:

  • CephCSIOperatorConfig:
    • Rename to operatorconfigs.csi.ceph.io
    • [Travis] It would be more intuitive to me to have “defaults” in the name (e.g. operatorDefaults.csi.ceph.io), but I can live with operatorConfig
  • CephCSIDriver:
    • Rename to drivers.csi.ceph.io
  • CephCSICephCluster
    • Rename to: cephconnections.csi.ceph.io
  • CephCSIConfig renaming:
    • [Travis] cephresource.csi.ceph.io: Seems like a good fit for configuring different resources including subvolume groups, rados namespaces, and potentially other types of resources in the future.
    • [Ohad] cephsettings.csi.ceph.io or cephconfiguration.csi.ceph.io
  • CephCSIConfigMapping renaming:
    • Append mapping to whatever name we decide on for the previous CRD (e.g. cephresourcemappings.csi.ceph.io)


nb-ohad commented Jul 22, 2024

Summarizing agreements on names from our recent offline discussion (a rough sketch of the agreed names in use follows the list):

  • drivers.csi.ceph.io - unchanged
  • operatorconfigs.csi.ceph.io - unchanged
  • cephclusters.csi.ceph.io - renamed to cephconnections.csi.ceph.io
  • configs.csi.ceph.io - renamed to clientprofiles.csi.ceph.io
  • configmappings.csi.ceph.io - renamed to clientprofilemappings.csi.ceph.io
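
For illustration, a rough sketch of how the agreed names might fit together. The fields are assumptions based on this discussion; see the repo's CRDs for the actual schemas:

apiVersion: csi.ceph.io/v1alpha1
kind: CephConnection              # formerly CephCSICephCluster / cephclusters
metadata:
  name: my-cluster
spec:
  monitors:
    - 10.0.0.1:6789
---
apiVersion: csi.ceph.io/v1alpha1
kind: ClientProfile               # formerly CephCSIConfig / configs
metadata:
  name: my-profile                # referenced as the clusterID in storage classes
spec:
  cephConnectionRef:
    name: my-cluster              # assumed link from profile to connection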

In addition, we agreed these name changes will be applied to the repo only after the merging of the following PRs:

@Madhu-1 @travisn @BlaineEXE
If the above is missing anything, please reply


Madhu-1 commented Aug 1, 2024

This is done in #62

Madhu-1 closed this as completed Aug 1, 2024
iamniting pushed a commit to iamniting/ceph-csi-operator that referenced this issue Aug 8, 2024
Syncing latest changes from upstream main for ceph-csi-operator