
ceph: added support for multus for csi #5740

Merged · 1 commit · Aug 20, 2020

Conversation

@rohan47 (Member) commented Jun 30, 2020

Description of your changes:
CSI pods now utilize multus networking and connect to the public
network specified in the CephCluster CR.

Signed-off-by: rohan47 rohgupta@redhat.com
Which issue is resolved by this Pull Request:
Resolves #5356
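
For context, a minimal sketch of the CephCluster network section this change reads, expressed with rook's Go API types (the selector value "public-net" is a placeholder NetworkAttachmentDefinition name, not something from this PR):

```go
package main

import (
	"fmt"

	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
)

func main() {
	// Sketch: a CephCluster network spec selecting multus, with the
	// "public" selector pointing at a placeholder NetworkAttachmentDefinition.
	net := cephv1.NetworkSpec{
		Provider: "multus",
		Selectors: map[string]string{
			"public": "public-net",
		},
	}
	fmt.Println(net.IsMultus()) // true: the CSI pods should get the multus annotation
}
```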

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

[test ceph]

@@ -378,6 +379,10 @@ func startDrivers(namespace string, clientset kubernetes.Interface, ver *version
// apply resource request and limit to rbdplugin containers
applyResourcesToContainers(clientset, rbdPluginResource, &rbdPlugin.Spec.Template.Spec)
k8sutil.SetOwnerRef(&rbdPlugin.ObjectMeta, ownerRef)
err = applyCephClusterNetworkConfig(&rbdPlugin.Spec.Template.ObjectMeta, rookclientset)
@rohan47 (Member, Author):

The annotations are applied to the CSI daemonset pods, but the k8s.v1.cni.cncf.io/networks-status annotation is missing, so I couldn't verify whether the pods are using multus.

Reviewer (Member):

Is the annotation getting overwritten?

@rohan47 (Member, Author):

No. There is simply no k8s.v1.cni.cncf.io/networks-status annotation on the pod, and I am not able to get into the containers to check whether the multus interfaces were added.

@rohan47 (Member, Author):

The annotations are applied normally.

Reviewer (Member):

Well, how can the annotation be applied normally if it is not visible later on? Something is off.

@rohan47 (Member, Author) commented Jul 1, 2020

The annotation is applied:

annotations:
    k8s.v1.cni.cncf.io/networks: public
    openshift.io/scc: rook-ceph-csi

but there should also be a k8s.v1.cni.cncf.io/networks-status annotation, something like the following:

Annotations:  k8s.v1.cni.cncf.io/networks: public
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.128.2.31"
                    ],
                    "dns": {},
                    "default-route": [
                        "10.128.2.1"
                    ]
                },{
                    "name": "public",
                    "interface": "net1",
                    "ips": [
                        "192.168.1.209"
                    ],
                    "mac": "4e:5d:a8:b7:36:2b",
                    "dns": {}
                }]
              openshift.io/scc: rook-ceph-csi


Resolved review threads: pkg/operator/ceph/csi/spec.go, pkg/operator/k8sutil/network.go.
@Madhu-1 (Member) commented Jul 2, 2020

I have 2 questions related to multus support:

  1. Multus is configured per Ceph cluster, but the CSI driver is deployed once per Rook operator, and one operator can manage multiple Ceph clusters. What happens if different Ceph clusters use different multus networks?
  2. The Ceph CSI daemonset pods run with host networking. If we apply multus to the daemonset pods, how is that going to behave? Will the daemonset pods use multus instead of the host network?

cc @travisn
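
To illustrate the first question: multus attaches networks via the k8s.v1.cni.cncf.io/networks pod annotation, and a shared CSI pod template has only one slot for it. A minimal sketch of the overwrite concern, with setNetworkAnnotation as a hypothetical stand-in for whatever helper copies a CephCluster's public network selector onto the template:

```go
package main

import "fmt"

// multusKey is the standard multus network-selection annotation.
const multusKey = "k8s.v1.cni.cncf.io/networks"

// setNetworkAnnotation is a hypothetical stand-in for the helper that applies
// a CephCluster's public network selector to the CSI pod template.
func setNetworkAnnotation(annotations map[string]string, selector string) {
	annotations[multusKey] = selector // plain assignment: last writer wins
}

func main() {
	csiAnnotations := map[string]string{}
	setNetworkAnnotation(csiAnnotations, "cluster-a-public") // first CephCluster
	setNetworkAnnotation(csiAnnotations, "cluster-b-public") // second CephCluster
	fmt.Println(csiAnnotations[multusKey]) // "cluster-b-public": cluster A's value was replaced
}
```

This is the replacement behavior rohan47 confirms in the reply below.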

@rohan47 (Member, Author) commented Jul 2, 2020

> 1. Multus is configured per Ceph cluster, but the CSI driver is deployed once per Rook operator, and one operator can manage multiple Ceph clusters. What happens if different Ceph clusters use different multus networks?

After thinking about this, it looks like the annotations will get replaced. Will fix this.

> 2. The Ceph CSI daemonset pods run with host networking. If we apply multus to the daemonset pods, how is that going to behave? Will the daemonset pods use multus instead of the host network?

This is the part where I was stuck. I am trying to use multus instead of the host network and will update once I have the results.

cc @travisn

@mergify (bot) commented Jul 7, 2020

This pull request has merge conflicts that must be resolved before it can be merged. @rohan47 please rebase it. https://rook.io/docs/rook/master/development-flow.html#updating-your-fork

@rohan47 (Member, Author) commented Jul 8, 2020

@Madhu-1 the CSI daemonset doesn't work on the multus network. So here are my questions:

  1. What was the reason for using the host network for the daemonsets?
  2. What would be the drawback of using the host network for the daemonset pods and multus for the rest of the CSI pods?
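
For background on why these two settings are in tension: a pod either joins the host network namespace or gets CNI-managed (and therefore multus-attached) interfaces, not both. A minimal sketch of the two pod-spec shapes, using upstream k8s.io/api types (the annotation value is a placeholder):

```go
package sketch

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// hostNetworkPod mirrors what the CSI daemonset uses today: the pod joins the
// node's network namespace, and the kubelet does not invoke CNI for it, so
// multus never sees the pod.
func hostNetworkPod() corev1.PodTemplateSpec {
	return corev1.PodTemplateSpec{
		Spec: corev1.PodSpec{HostNetwork: true},
	}
}

// multusPod is the alternative shape: regular pod networking plus a
// multus-attached interface. "public" is a placeholder
// NetworkAttachmentDefinition name.
func multusPod() corev1.PodTemplateSpec {
	return corev1.PodTemplateSpec{
		ObjectMeta: metav1.ObjectMeta{
			Annotations: map[string]string{"k8s.v1.cni.cncf.io/networks": "public"},
		},
		Spec: corev1.PodSpec{HostNetwork: false},
	}
}
```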

@leseb (Member) left a comment

With this PR we can remove the "experimental" flag from the doc https://rook.io/docs/rook/v1.3/ceph-cluster-crd.html and add a mention in the release note too.

@rohan47 added the do-not-merge label Jul 10, 2020
@travisn (Member) commented Jul 10, 2020

> With this PR we can remove the "experimental" flag from the doc https://rook.io/docs/rook/v1.3/ceph-cluster-crd.html and add a mention in the release note too.

Before we remove the experimental flag, how about we get third-party validation, as suggested in the separate email thread today?

@rohan47 rohan47 requested a review from leseb July 20, 2020 09:48
@rohan47 rohan47 force-pushed the csionmultus branch 2 times, most recently from 4923bdf to af1d258 Compare July 20, 2020 14:40
Resolved review threads: pkg/operator/k8sutil/network.go.
@rohan47 (Member, Author) commented Jul 21, 2020

This doc has the connection information of the mons and OSDs and the logs of the failed OSD.

@mergify (bot) commented Jul 24, 2020

This pull request has merge conflicts that must be resolved before it can be merged. @rohan47 please rebase it. https://rook.io/docs/rook/master/development-flow.html#updating-your-fork

@rohan47 rohan47 force-pushed the csionmultus branch 2 times, most recently from 9c395a0 to 226abe7 Compare August 5, 2020 11:31
@rohan47 (Member, Author) commented Aug 5, 2020

> With this PR we can remove the "experimental" flag from the doc https://rook.io/docs/rook/v1.3/ceph-cluster-crd.html and add a mention in the release note too.

> Before we remove the experimental flag, how about we get third-party validation, as suggested in the separate email thread today?

We are keeping the experimental flag, right?

@leseb (Member) commented Aug 5, 2020

> We are keeping the experimental flag, right?

Yes, and let's link the ceph-csi issue too.

@rohan47 (Member, Author) commented Aug 5, 2020

> Yes, and let's link the ceph-csi issue too.

Okay

@mergify (bot) added the ceph label Aug 13, 2020
@leseb (Member) commented Aug 20, 2020

Any updates @rohan47? We should move forward with that.

@rohan47 rohan47 requested a review from leseb August 20, 2020 09:54
Resolved review threads: pkg/operator/ceph/csi/spec.go.
CSI pods now utilize multus networking and connect to public
network specified in the CephCluster CR.

Closes: rook#5356
Signed-off-by: rohan47 <rohgupta@redhat.com>
}
}

return true, nil
Reviewer (Member):

Even if the Ceph cluster is not using multus, it's returning true? Don't we need to return false if the Ceph cluster network is not multus?

Reviewer (Member):

Good point, we should only return true if multus got configured. I'm sending a PR to fix this.
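
A hedged sketch of the fix direction described here, using rook's NetworkSpec.IsMultus helper; the actual follow-up PR may differ:

```go
package csi

import (
	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
)

// multusApplied sketches the corrected return value: true only when at least
// one listed CephCluster actually uses multus, instead of unconditionally
// returning true as in the snippet above.
func multusApplied(clusters []cephv1.CephCluster) bool {
	for _, c := range clusters {
		if c.Spec.Network.IsMultus() {
			return true
		}
	}
	return false
}
```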


@@ -498,6 +522,23 @@ func deleteCSIDriverResources(
return succeeded
}

func applyCephClusterNetworkConfig(objectMeta *metav1.ObjectMeta, rookclientset rookclient.Interface) (bool, error) {
cephClusters, err := rookclientset.CephV1().CephClusters(objectMeta.Namespace).List(metav1.ListOptions{})
Reviewer (Member):

Question: won't this fail if the Ceph cluster and the CSI driver are in different namespaces?

@rohan47 (Member, Author):

Yes. The namespace we have here is actually the Rook Ceph operator's namespace, right?

Reviewer (Member):

Yes, it's the Rook Ceph operator namespace.
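
To make the namespace coupling concrete, a minimal sketch: the lister is scoped to objectMeta.Namespace (the operator namespace, per the exchange above), so a CephCluster in another namespace would not be found. metav1.NamespaceAll is the standard client-go constant for an unscoped list, shown only as a contrast, not as the merged behavior; the import paths follow rook's generated clientset layout:

```go
package csi

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	cephv1 "github.com/rook/rook/pkg/apis/ceph.rook.io/v1"
	rookclient "github.com/rook/rook/pkg/client/clientset/versioned"
)

// listCephClusters sketches the call from the diff: scoped to one namespace.
func listCephClusters(c rookclient.Interface, namespace string) (*cephv1.CephClusterList, error) {
	// As written in the PR, namespace is the operator namespace, so clusters
	// deployed elsewhere are invisible to this code path.
	return c.CephV1().CephClusters(namespace).List(metav1.ListOptions{})
}

// listAllCephClusters is the contrast: metav1.NamespaceAll ("") lists across
// every namespace the clientset can see.
func listAllCephClusters(c rookclient.Interface) (*cephv1.CephClusterList, error) {
	return c.CephV1().CephClusters(metav1.NamespaceAll).List(metav1.ListOptions{})
}
```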

mergify bot added a commit that referenced this pull request Aug 27, 2020
ceph: added support for multus for csi (bp #5740)
Labels: ceph (main ceph tag)

Successfully merging this pull request may close these issues: Multus and CSI

4 participants