
CSI Manila driver #536

Merged
merged 2 commits into kubernetes:master on Jun 3, 2019

Conversation

@gman0 (Contributor) commented Mar 17, 2019

What this PR does / why we need it:
Adds the CSI Manila driver; it currently supports only CephFS and NFS. Tested with CephFS and NFS.

Which issue this PR fixes: fixes #394

Design considerations / a brief description of what does what

"proxy" architecture

This CSI driver acts as a provisioner of Manila shares; all node-related operations (attachments, mounts) are carried out by another CSI driver dedicated to that particular filesystem. This is achieved by forwarding CSI RPCs to the other CSI driver.

For example, creating and mounting a CephFS Manila share would go like this:

CO requests volume provisioning with CreateVolume:

                        +--------------------+
                        |       MASTER       |
+------+  CreateVolume  |  +--------------+  |
|  CO  +------------------>+  csi-manila  |  |
+------+                |  +--------------+  |
                        +--------------------+

1. Authenticate with Manila v2 client
2. Create a CephFS share
3. Create a cephx access rule

And when CO publishes the volume onto a node with NodePublishVolume:

                             +--------------------------------------+
                             |                 NODE                 |
+------+  NodePublishVolume  |  +--------------+                    |
|  CO  +----------------------->+  csi-manila  |                    |
+------+                     |  +------+-------+                    |
                             |         |                            |
                             |         | FORWARD NodePublishVolume  |
                             |         V                            |
                             |  +------+-------+                    |
                             |  |  csi-cephfs  |                    |
                             |  +--------------+                    |
                             +--------------------------------------+

1. Authenticate with Manila v2 client
2. Retrieve the CephFS share
3. Retrieve the cephx access rule
4. Connect to csi-cephfs socket
5. Call csi-cephfs's NodePublishVolume, return its response

So the intention here is to deal only with the Manila side of things and leave all filesystem specifics to another CSI driver that's specialized for that particular filesystem.

If a connection to the proxied CSI driver cannot be established, csi-manila returns an error and lets the CO retry.
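
For illustration, here is a minimal sketch of the forwarding step described above (not the actual code from this PR); the endpoint value, e.g. a `unix://` socket path of the proxied driver, is a hypothetical example supplied via configuration:

```go
package csimanila

import (
	"context"
	"time"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// forwardNodePublish forwards NodePublishVolume to the proxied,
// filesystem-specific CSI driver (e.g. csi-cephfs) listening on a local
// UNIX socket. If the proxied driver cannot be reached, it returns
// codes.Unavailable so that the CO will retry.
func forwardNodePublish(ctx context.Context, endpoint string, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
	dialCtx, cancel := context.WithTimeout(ctx, 10*time.Second)
	defer cancel()

	// e.g. endpoint = "unix:///var/lib/kubelet/plugins/csi-cephfs/csi.sock" (hypothetical path)
	conn, err := grpc.DialContext(dialCtx, endpoint, grpc.WithInsecure(), grpc.WithBlock())
	if err != nil {
		return nil, status.Errorf(codes.Unavailable, "cannot reach the proxied CSI driver at %s: %v", endpoint, err)
	}
	defer conn.Close()

	// Hand the request over verbatim and return whatever the proxied driver answers.
	return csi.NewNodeClient(conn).NodePublishVolume(ctx, req)
}
```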

Downsides:

From the documentation

A single instance of the driver may serve only a single Manila share protocol. To support multiple share protocols, multiple deployments of the driver need to be made. In order to avoid deployment collisions, each instance of the driver should be named differently, e.g. csi-manila-cephfs, csi-manila-nfs.

The initial idea was to encompass all Manila share protocols within a single instance of csi-manila. Due to limitations of CSI this cannot be achieved, so there will have to be multiple instances of csi-manila running in order to support multiple share protocols, one for each. A couple of RPCs in the Node Service don't carry enough context to decide which proxied driver a particular RPC should be forwarded to (e.g. NodeGetCapabilities would be ambiguous, since we can't know which proxied driver should answer); therefore, with the current design, there's only a single proxied driver per csi-manila instance. Such a "multiplexer" design is out of scope for CSI anyway, IMHO.

Concurrency

CreateVolume is so far the most time-expensive operation in the driver because it waits till the share is in the Available state; for CephFS shares this wait is even more pronounced because the driver also waits till the Ceph key for the share is available. This means that the CO may retry the CreateVolume RPC for the same volume many times before the volume is actually done creating.
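
To make that cost concrete, here is a minimal sketch of the kind of wait loop CreateVolume has to run (not code from this PR; the shareGetter interface is a hypothetical stand-in for the Manila v2 client):

```go
package csimanila

import (
	"context"
	"fmt"
	"time"
)

// shareGetter is a hypothetical abstraction over the Manila v2 client.
type shareGetter interface {
	GetShareStatus(ctx context.Context, shareID string) (string, error)
}

// waitForShareAvailable polls Manila until the share leaves the creating
// state. This is the long-running part of CreateVolume that provokes
// CO-side retries and makes per-volume locking necessary.
func waitForShareAvailable(ctx context.Context, c shareGetter, shareID string) error {
	ticker := time.NewTicker(2 * time.Second)
	defer ticker.Stop()

	for {
		st, err := c.GetShareStatus(ctx, shareID)
		if err != nil {
			return err
		}
		switch st {
		case "available":
			return nil
		case "error":
			return fmt.Errorf("share %s entered error state", shareID)
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}
```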

In Kubernetes, this means there will be multiple CreateVolume RPCs for the same volume in parallel - which, in our case, doesn't help the situation at all and would only create more unnecessary load on Manila itself. To guard against situations where there's a long running operation, some form of locking needs to be employed.

Using a simple mutex per volume ID would result in this:

  R req1             req2             req3
T -----------------+----------------+----------------+
01| LOCK           |                |                |
02| op1            |                |                |
03| op2            | LOCK...wait    |                |
04| op3            | wait...        |                |
05| UNLOCK         | wait...        | LOCK...wait    |
06| RETURN success + LOCK           | wait...        |
07|                  op1            | wait...        |
08|                  op2            | wait...        |
09|                  op3            | wait...        |
21|                  UNLOCK         | wait...        |
22|                  RETURN success + LOCK           |
23|                                   op1            |
24|                                   op2            |
25|                                   op3            |
26|                                   UNLOCK         |
27|                                   RETURN SUCCESS +

(I really hope these diagrams make at least a bit of sense.) On the Y axis there's T for time, and on the X axis there's R for a request. The first request req1 locks the mutex and launches a long-running operation op (taking up 3 time units). The call times out, so the CO decides to retry with req2, which needs to wait till req1 finishes so that it may acquire the lock. This pattern may repeat until a call finally completes without timing out. There are two problems here: (a) it may take a long time till the CO is satisfied with the answer, and (b) the long-running operation op is re-run in every one of those retries.
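
For reference, the naive scheme in the diagram corresponds to a keyed mutex roughly like the sketch below (not the PR's code); every retry re-runs the whole operation once it acquires the per-volume lock:

```go
package csimanila

import "sync"

// volumeLocks serializes operations per volume ID - the "simple mutex per
// volume ID" scheme from the diagram above.
type volumeLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex
}

func newVolumeLocks() *volumeLocks {
	return &volumeLocks{locks: make(map[string]*sync.Mutex)}
}

// lock returns the per-volume mutex, creating it on first use, and locks it.
func (v *volumeLocks) lock(volumeID string) *sync.Mutex {
	v.mu.Lock()
	l, ok := v.locks[volumeID]
	if !ok {
		l = &sync.Mutex{}
		v.locks[volumeID] = l
	}
	v.mu.Unlock()

	l.Lock()
	return l
}

// Usage in CreateVolume (sketch):
//
//	l := locks.lock(req.GetName())
//	defer l.Unlock()
//	// ...the long-running operation runs again on every retry...
```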

pkg/csi/manila/responsebroker from this PR tries to tackle problem (b) and hopes to mitigate (a):

  R req1                                             req2                          req3
T -------------------------------------------------+-----------------------------+-----------------------------+
01| LOCK(acq-or-create)                            |                             |                             |
02| op1                                            |                             |                             |
03| op2                                            | LOCK(acq-or-create)...wait  |                             |
04| op3                                            | wait...                     |                             |
05| UNLOCK, write success, wait till all Rs finish | wait...                     | LOCK(acq-or-create)...wait  |
06| wait...                                        | LOCK, read response:success | wait...                     |
07| wait...                                        | UNLOCK                      | wait...                     |
08| wait...                                        | RETURN SUCCESS              + LOCK, read response:success |
09| RETURN SUCCESS                                 +                               UNLOCK                      |
10|                                                                                RETURN SUCCESS              +

The same scenario as above: req1 locks the mutex, runs the long-running operation op, and times out. req2 waits till it can acquire the lock, but instead of running op again it reads the response from req1; if that response is successful, it immediately returns success. The responsebroker basically caches the result of an operation and propagates it to multiple readers.
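
A minimal illustration of that idea (deliberately simplified; the actual API lives in pkg/csi/manila/responsebroker and differs from this sketch):

```go
package csimanila

import "sync"

// entry caches the outcome of a long-running operation for one volume ID
// so that CO retries can read it instead of re-running the operation.
type entry struct {
	mu   sync.Mutex
	done bool
	err  error
}

// broker hands out one entry per volume ID.
type broker struct {
	mu      sync.Mutex
	entries map[string]*entry
}

func newBroker() *broker {
	return &broker{entries: make(map[string]*entry)}
}

func (b *broker) get(volumeID string) *entry {
	b.mu.Lock()
	defer b.mu.Unlock()
	e, ok := b.entries[volumeID]
	if !ok {
		e = &entry{}
		b.entries[volumeID] = e
	}
	return e
}

// run executes op at most once per successful outcome: the first caller
// holds the entry's lock and runs op; retries that were blocked on the
// lock just read the cached result. A failed op is re-run by the next retry.
func (b *broker) run(volumeID string, op func() error) error {
	e := b.get(volumeID)
	e.mu.Lock()
	defer e.mu.Unlock()

	if !e.done {
		e.err = op()
		e.done = e.err == nil
	}
	return e.err
}
```

A real implementation would also evict the entry once all pending retries have read the response, which is what the "wait till all Rs finish" step in the diagram above stands for.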

This mechanism could also be used to implement a form of journaling: if an error occurs in one call, future retries would pick up the operation at the exact spot where the previous call failed - again, saving some load on the backend. Maybe in another PR.

To be done in this PR

  • If an operator wishes to use a custom certificate for authentication, they may use the os-certAuthority parameter in Secrets to inject the contents of a PEM cert file. This is the way manila-provisioner does it. The problem here is the size limit: the CSI spec allows up to 128 bytes in string fields, and a certificate can obviously be larger than that.
  • The List* and GetCapacity CSI RPCs need OpenStack credentials, but these RPCs don't carry any Secrets fields. The CSI Cinder driver solves this by supplying the driver with a file containing the credentials; it doesn't rely on Secrets supplied by the CO.

Release note:

added CSI Manila driver

/cc @tsmetana @dims @adisky

@k8s-ci-robot (Contributor) commented Mar 17, 2019

Hi @gman0. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@theopenlab-ci commented Mar 17, 2019

Build failed.

@theopenlab-ci commented Mar 17, 2019

Build failed.

@theopenlab-ci commented Mar 17, 2019

Build succeeded.

@theopenlab-ci commented Mar 17, 2019

Build succeeded.

@theopenlab-ci commented Mar 17, 2019 (comment minimized)

@theopenlab-ci commented Mar 17, 2019

Build succeeded.

@theopenlab-ci commented Mar 17, 2019 (comment minimized)

@theopenlab-ci commented Mar 17, 2019

Build failed.

  • cloud-provider-openstack-acceptance-test-standalone-cinder : RETRY_LIMIT in 31m 13s
@theopenlab-ci commented Mar 17, 2019 (comment minimized)

@theopenlab-ci commented Mar 17, 2019 (comment minimized)

@theopenlab-ci commented Mar 17, 2019 (comment minimized)

@theopenlab-ci commented Mar 17, 2019

Build succeeded.

@adisky (Contributor) commented Mar 18, 2019

/ok-to-test

@tsmetana (Contributor) commented Mar 18, 2019

Hi. Overall I'd say the design looks OK. I would prefer to have some more tests to accompany the code. Otherwise it looks like a good start.

@gouthampacha left a comment

Hi, thank you for working on this. Still going through the code and understanding the implementation. Please take a look at the comment regarding default access for NFS.

| Parameter | Required | Description |
| --- | --- | --- |
| `type` | _yes_ | Manila [share type](https://wiki.openstack.org/wiki/Manila/Concepts#share_type) |
| `shareNetworkID` | _no_ | Manila [share network ID](https://wiki.openstack.org/wiki/Manila/Concepts#share_network) |
| `cephfs-mounter` | _no_ | Relevant for CephFS Manila shares. Specifies which mounting method to use with the CSI CephFS driver. Available options are `kernel` and `fuse`, defaults to `fuse`. See [docs](https://github.com/ceph/ceph-csi/blob/csi-v1.0/docs/deploy-cephfs.md#configuration) for further information. |
| `nfs-shareClient` | _no_ | Relevant for NFS Manila shares. Specifies what address has access to the NFS share. Defaults to `0.0.0.0`, i.e. anyone. |

@gouthampacha commented Mar 18, 2019

All NFS drivers in Manila understand 0.0.0.0/0 as the CIDR notation for NFS access by ANY client.

(Review thread on pkg/csi/manila/options/shareoptions.go; resolved, outdated)
@tombarron (Contributor) left a comment

Just noting that I have been working through the proposed change and overall it seems a valuable contribution and very high-quality work. I will continue reviewing, but getting CSI support for Manila available and exercising this plugin more generally will be a constructive move for the community, IMO.

@tsmetana (Contributor) left a comment

This is actually quite a large piece of code to review, and apart from the comments that have been made, I think it's a great starting point for the Manila CSI driver.

@tombarron referenced this pull request Apr 3, 2019: add csi-manila #394 (Closed)

@gman0 (Contributor, Author) commented May 20, 2019

@adisky ok no probs, thanks for the heads up!

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-acceptance-test-e2e-conformance

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.12

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.13

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.14

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-test-flexvolume-cinder

@mrhillsman (Member) commented May 23, 2019

/test cloud-provider-openstack-acceptance-test-flexvolume-cinder

@theopenlab-ci commented May 23, 2019

Build failed.

@theopenlab-ci commented May 23, 2019 (comment minimized)

@theopenlab-ci commented May 23, 2019 (comment minimized)

@theopenlab-ci commented May 23, 2019 (comment minimized)

@theopenlab-ci commented May 23, 2019 (comment minimized)

@gman0 (Contributor, Author) commented May 26, 2019

/test cloud-provider-openstack-acceptance-test-e2e-conformance

@theopenlab-ci commented May 26, 2019

Build succeeded.

@gman0 (Contributor, Author) commented May 27, 2019

Tests are finally passing! @dims @hogepodge @adisky PTAL

@adisky (Contributor) commented May 27, 2019

@gman0 Thanks!!!
/lgtm

@adisky (Contributor) commented May 31, 2019

/priority important-longterm

@tsmetana (Contributor) commented May 31, 2019

@dims or @rootfs Could you please take a look and approve the merge eventually? Thanks.

@dims (Member) commented May 31, 2019

/approve

@dims approved these changes May 31, 2019

@gman0 (Contributor, Author) commented May 31, 2019

@adisky @dims Thanks guys! It seems like k8s-ci-robot wasn't triggered though..?

@adisky (Contributor) commented Jun 3, 2019

@dims it has not picked up the "approved" label; you need to do it again

@dims added the approved label Jun 3, 2019

@k8s-ci-robot (Contributor) commented Jun 3, 2019

[APPROVALNOTIFIER] This PR is APPROVED

Approval requirements bypassed by manually added approval.

This pull-request has been approved by:

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit 9900325 into kubernetes:master Jun 3, 2019

14 of 16 checks passed

  • pull-cloud-provider-openstack-check: Job triggered.
  • pull-cloud-provider-openstack-test: Job triggered.
  • cla/linuxfoundation: gman0 authorized
  • openlab/cloud-provider-openstack-acceptance-test-csi-cinder: success
  • openlab/cloud-provider-openstack-acceptance-test-e2e-conformance: success
  • openlab/cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.12: success
  • openlab/cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.13: success
  • openlab/cloud-provider-openstack-acceptance-test-e2e-conformance-stable-branch-v1.14: success
  • openlab/cloud-provider-openstack-acceptance-test-flexvolume-cinder: success
  • openlab/cloud-provider-openstack-acceptance-test-k8s-cinder: success
  • openlab/cloud-provider-openstack-acceptance-test-keystone-authentication-authorization: success
  • openlab/cloud-provider-openstack-acceptance-test-lb-octavia: success
  • openlab/cloud-provider-openstack-acceptance-test-standalone-cinder: success
  • openlab/cloud-provider-openstack-format: success
  • openlab/cloud-provider-openstack-unittest: success
  • tide: In merge pool.