[cinder-csi-plugin] Cinder CSI plugin should not provision ROX or RWX volumes #1367

Closed
Fedosin opened this issue Jan 12, 2021 · 29 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@Fedosin
Contributor

Fedosin commented Jan 12, 2021

/kind bug

What happened:
When a PVC is created with the ROX or RWX access mode, the volume is provisioned successfully, but attaching it fails once pods using it are scheduled on different nodes.

What you expected to happen:
The plugin should report that the access mode is unsupported, as the in-tree Cinder plugin does:
Warning ProvisioningFailed 7s (x3 over 14s) persistentvolume-controller Failed to provision volume with StorageClass "standard": invalid AccessModes [ReadWriteMany]: only AccessModes [ReadWriteOnce] are supported
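
As a rough illustration of the expected behavior, the check could look something like the sketch below. It is written in Python for brevity, the function and constant names are hypothetical (they are not taken from either plugin), and the message wording simply mirrors the in-tree event quoted above.

```python
# Hypothetical helper; names are illustrative and not taken from the plugin.
SUPPORTED_ACCESS_MODES = ("ReadWriteOnce",)


def validate_access_modes(requested, supported=SUPPORTED_ACCESS_MODES):
    """Return None if every requested mode is supported, else an error string."""
    if all(mode in supported for mode in requested):
        return None
    # Wording copied from the in-tree provisioning event quoted above.
    return "invalid AccessModes [{}]: only AccessModes [{}] are supported".format(
        " ".join(requested), " ".join(supported)
    )


if __name__ == "__main__":
    print(validate_access_modes(["ReadWriteMany"]))
```

The point of the sketch is only that the rejection happens at provisioning time, before any volume is created, rather than at attach time.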

How to reproduce it:

  1. Install an OSP cluster with the Cinder CSI driver.

  2. Create a PVC with the ROX or RWX access mode, and create a pod to consume it.

  3. Check the pod and PVC statuses.

  4. Create another pod that is assigned to a different node.

  5. The volumes are provisioned successfully:
    $ oc get pvc -n wduan
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    mypvc02 Bound pvc-ae09e478-2b8e-4350-a83a-b789ff991d7d 1Gi RWX standard-csi 17m

    $ oc get pvc -n wduan-01
    NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
    mypvc03 Bound pvc-9404f88c-d437-493b-ba5e-3a2685cdc1c9 1Gi ROX standard-csi 14m

  6. Attaching fails when the second pod is scheduled on another node:
    Warning FailedAttachVolume 12s attachdetach-controller AttachVolume.Attach failed for volume "pvc-ae09e478-2b8e-4350-a83a-b789ff991d7d" : rpc error: code = Internal desc = [ControllerPublishVolume] Attach Volume failed with error failed to attach ccc11e8b-cfff-40ac-92da-d7baed902b12 volume to fa6d2e8c-eb8f-4a57-a54a-5063f2b8bec7 compute: Bad request with: [POST https://rhos-d.infra.prod.upshift.rdu2.redhat.com:13774/v2.1/servers/fa6d2e8c-eb8f-4a57-a54a-5063f2b8bec7/os-volume_attachments], error message: {"badRequest": {"message": "Invalid input received: Invalid volume: Volume ccc11e8b-cfff-40ac-92da-d7baed902b12 status must be available or downloading (HTTP 400) (Request-ID: req-75c01a31-c877-4582-930e-35c5d8b64eaf)", "code": 400}}
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Jan 12, 2021
@ramineni
Contributor

ramineni commented Jan 13, 2021

@Fedosin which version of the plugin are you using? I believe this error shouldn't occur with the latest code.

@jichenjc
Contributor

@Fedosin I see that another node and another pod are involved. To confirm: the failure is not caused by having multiple nodes or pods as such.
Rather, ROX is not supported by Cinder CSI, so you want the request rejected at creation time because ROX is not supposed to be supported?

@Fedosin
Contributor Author

Fedosin commented Jan 15, 2021

This is our code https://github.com/openshift/cloud-provider-openstack
We are slightly behind master (the rebase happened on Nov 26), but I don't see what could have fixed the issue since then.

@jichenjc
Contributor

@ramineni I think @Fedosin means that he is using OpenShift master and encountered this issue.
@Fedosin please confirm. If that's true, then this is something we need to fix.

@ramineni
Contributor

ramineni commented Jan 18, 2021

@Fedosin Did you specify a multiattach volume type in the storage class, and does the volume created in Cinder have the property multiattach=True?
https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/features.md#multi-attach-volumes

@mandre
Contributor

mandre commented Jan 18, 2021

Hi @ramineni, the issue occurs when asking for ROX or RWX access mode but the backend doesn't support multiattach. In this case, cinder CSI should not provision the volume but return an error because it won't be able to honor the required access mode.

See the following example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: mydeploy03
spec:
  replicas: 1
  selector:
    matchLabels:
      app: hello-cinder
  template:
    metadata:
      labels:
        app: hello-cinder
    spec:
      containers:
      - name: hello-openshift
        image: docker.io/aosqe/storage@sha256:a05b96d373be86f46e76817487027a7f5b8b5f87c0ac18a246b018df11529b40
        ports:
        - containerPort: 80
        volumeMounts:
        - name: local
          mountPath: /mnt/local
      volumes:
      - name: local
        persistentVolumeClaim:
          claimName: mydep-pvc03

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mydep-pvc03
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: standard-csi

The PV shows RWX access mode:

psi ❯ oc get pv -A                                                                                                                                                        
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM                 STORAGECLASS   REASON   AGE                          
pvc-00cc5aec-a183-45fd-b222-9dcc219d26cb   1Gi        RWX            Delete           Bound    default/mydep-pvc03   standard-csi            37s                          

However, the volume doesn't support multi-attach:

psi ❯ openstack volume show dfab6c07-55ee-4804-a1f5-fdb61d647686
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field                        | Value                                                                                                                                                                                                                                                                                                                               |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| attachments                  | [{'id': 'dfab6c07-55ee-4804-a1f5-fdb61d647686', 'attachment_id': 'b1836bf2-8df4-4447-8889-9de3012603b5', 'volume_id': 'dfab6c07-55ee-4804-a1f5-fdb61d647686', 'server_id': 'a04d19e9-19d7-43aa-bc06-7ccf372534c5', 'host_name': 'compute-ci-d-036.localdomain', 'device': '/dev/vdb', 'attached_at': '2021-01-18T08:17:44.000000'}] |
| availability_zone            | nova                                                                                                                                                                                                                                                                                                                                |
| bootable                     | false                                                                                                                                                                                                                                                                                                                               |
| consistencygroup_id          | None                                                                                                                                                                                                                                                                                                                                |
| created_at                   | 2021-01-18T08:17:41.000000                                                                                                                                                                                                                                                                                                          |
| description                  | Created by OpenStack Cinder CSI driver                                                                                                                                                                                                                                                                                              |
| encrypted                    | False                                                                                                                                                                                                                                                                                                                               |
| id                           | dfab6c07-55ee-4804-a1f5-fdb61d647686                                                                                                                                                                                                                                                                                                |
| multiattach                  | False                                                                                                                                                                                                                                                                                                                               |
| name                         | pvc-00cc5aec-a183-45fd-b222-9dcc219d26cb                                                                                                                                                                                                                                                                                            |
| os-vol-tenant-attr:tenant_id | c73b7097d07c46f78eb4b4dcfbac5ca8                                                                                                                                                                                                                                                                                                    |
| properties                   | cinder.csi.openstack.org/cluster='kubernetes'                                                                                                                                                                                                                                                                                       |
| replication_status           | None                                                                                                                                                                                                                                                                                                                                |
| size                         | 1                                                                                                                                                                                                                                                                                                                                   |
| snapshot_id                  | None                                                                                                                                                                                                                                                                                                                                |
| source_volid                 | None                                                                                                                                                                                                                                                                                                                                |
| status                       | in-use                                                                                                                                                                                                                                                                                                                              |
| type                         | tripleo                                                                                                                                                                                                                                                                                                                             |
| updated_at                   | 2021-01-18T08:17:45.000000                                                                                                                                                                                                                                                                                                          |
| user_id                      | c58a5aa7bf2df7c49420a43898f9b1df39bff9ee0b7dc240a2aa975c910750a5                                                                                                                                                                                                                                                                    |
+------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Do you think it's enough to check the volume capabilities in CreateVolume(), as @Fedosin did in #1368?

@ramineni
Contributor

@mandre Thanks for the explanation. But PR #1368 blocks creation regardless of whether the backend supports multiattach. If the backend supports it and a user wants to create a volume with a multiattach volume type, they should be able to.
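
To make the distinction concrete, here is a minimal sketch of a creation-time check that rejects multi-node access modes only when the volume type is not multiattach. It is Python pseudocode for the idea, not the plugin's Go code: the enum mirrors the access mode names in the CSI spec, while `check_create_volume`, `UnsupportedAccessMode`, and the `type_is_multiattach` flag are hypothetical names introduced for illustration.

```python
from enum import Enum


class AccessMode(Enum):
    """Mode names and numbers as defined by the CSI spec's VolumeCapability.AccessMode."""
    SINGLE_NODE_WRITER = 1
    SINGLE_NODE_READER_ONLY = 2
    MULTI_NODE_READER_ONLY = 3
    MULTI_NODE_SINGLE_WRITER = 4
    MULTI_NODE_MULTI_WRITER = 5


# Modes that need the same volume attached to several nodes at once.
MULTI_NODE_MODES = frozenset({
    AccessMode.MULTI_NODE_READER_ONLY,
    AccessMode.MULTI_NODE_SINGLE_WRITER,
    AccessMode.MULTI_NODE_MULTI_WRITER,
})


class UnsupportedAccessMode(ValueError):
    """Hypothetical error type; a real driver would return a gRPC status instead."""


def check_create_volume(requested_modes, type_is_multiattach):
    """Raise if a multi-node mode is requested for a non-multiattach volume type."""
    for mode in requested_modes:
        if mode in MULTI_NODE_MODES and not type_is_multiattach:
            raise UnsupportedAccessMode(
                f"access mode {mode.name} requires a multiattach volume type"
            )
```

With a check shaped like this, an RWX request against a multiattach type still succeeds, while the same request against a plain type fails at provisioning instead of at attach time. How the driver would learn the value of `type_is_multiattach` is left open here.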

@mandre
Contributor

mandre commented Jan 18, 2021

There seem to be multiple problems here:

Is there a way for the cinder driver to discover what the backend really supports?

Just to make sure I understand correctly: #1368 fixes the second issue, but is not acceptable in its current form because, due to the first issue, this would prevent the Multi-Attach feature ?

@ramineni
Contributor

ramineni commented Jan 19, 2021

Just to make sure I understand correctly: #1368 fixes the second issue, but is not acceptable in its current form because, due to the first issue, this would prevent the Multi-Attach feature ?

@mandre right.

@mandre
Contributor

mandre commented Jan 19, 2021

Is there a way to discover what capabilities the backend supports?

@ramineni
Contributor

ramineni commented Jan 20, 2021

@mandre, right now we only check the volume.multiattach flag to determine whether a volume is capable of multiattach. That requires the admin to configure the correct volume type when creating the volume. You could also look for a Cinder API that exposes this.

@mdbooth
Contributor

mdbooth commented Apr 12, 2021

'multiattach' is an extra_spec on the volume type. We should be able to check that.

@mdbooth
Contributor

mdbooth commented Apr 12, 2021

Incidentally, although the published API docs don't say so, there's at least one code comment suggesting that volume.multiattach is deprecated:

https://opendev.org/openstack/cinder/src/commit/355681cd53fbb2f5fd2c4cecb55a83c5e0c27c2c/cinder/volume/flows/api/create_volume.py#L434-L436

I don't know how likely it is to go away in practice, but it seems prudent to use the extra_spec on the type rather than the field on the volume.
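
A minimal sketch of what reading the extra_spec might look like, in Python for brevity. It assumes the caller is actually allowed to read the volume type's extra specs and that they arrive as a plain string-to-string mapping. The `"<is> True"` value is the convention used when a multiattach volume type is created; the exact-match comparison and the function name `type_supports_multiattach` are assumptions for illustration.

```python
def type_supports_multiattach(extra_specs):
    """Return True if a volume type's extra specs mark it as multiattach.

    `extra_specs` is assumed to be a dict of strings, as in the `extra_specs`
    field of a Cinder volume type.
    """
    # Assumption: an exact match on "<is> True" is what counts as enabled.
    return extra_specs.get("multiattach", "") == "<is> True"


if __name__ == "__main__":
    print(type_supports_multiattach({"multiattach": "<is> True"}))
    print(type_supports_multiattach({"volume_backend_name": "tripleo_iscsi"}))
```

The backend name in the usage lines is made up; only the `multiattach` key matters to the check.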

@tombarron
Contributor

tombarron commented Apr 12, 2021

Multi-attach is not by itself sufficient for safe RWX when using block storage like Cinder. You also need to set "volumeMode: Block" in your PVC. The volumeMode option was introduced as a beta feature in k8s 1.13 [1]. Omitting this setting, using a k8s version earlier than 1.13, or setting it to "Filesystem" gets you the traditional behavior, where k8s expects either to find a filesystem on the block volume or to create an xfs or ext4 filesystem on it before making it available to applications.

These are node local file systems and it is not safe to mount them on multiple nodes in the cluster, even read only (read only mounts still write file system metadata).

So Cinder and other block-based PVs with RWX are really for applications, like some databases, that work with raw block devices, read and write to block offsets rather than to POSIX filesystem paths, and do their own coordination/write arbitration when there are multiple writers rather than relying on a filesystem for this function. Another use is with KubeVirt, for instance, where raw block volumes with ISO images are used for boot and it is useful to be able to multi-attach them to facilitate live migration.

When you do multiple attaches of a Cinder block volume in OpenStack neither Cinder nor Nova builds a file system on the block volume so the traditional k8s behavior of imposing a node local filesystem is not an issue. The owner of Nova guest VMs with multiple attaches to the same volume can make this mistake, but OpenStack itself will not put them in this predicament.

[1] https://kubernetes.io/blog/2019/03/07/raw-block-volume-support-to-beta/
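
The rule described above can be sketched as a small predicate over a PVC spec. This is an illustrative Python sketch, not plugin code: `shared_access_is_safe` and `type_is_multiattach` are hypothetical names, and the PVC spec is assumed to be a plain dict shaped like the YAML `spec` section. The only Kubernetes fact relied on is that `volumeMode` defaults to `Filesystem` when it is omitted.

```python
SHARED_MODES = ("ReadWriteMany", "ReadOnlyMany")


def shared_access_is_safe(pvc_spec, type_is_multiattach):
    """Return True if the PVC can be honored safely on a block backend.

    Single-node claims are always fine. Shared claims (RWX/ROX) need both a
    multiattach volume type and volumeMode Block, so that no node-local
    filesystem is created on the shared device.
    """
    modes = pvc_spec.get("accessModes", [])
    if not any(mode in SHARED_MODES for mode in modes):
        return True
    # Kubernetes treats a missing volumeMode as "Filesystem".
    volume_mode = pvc_spec.get("volumeMode", "Filesystem")
    return type_is_multiattach and volume_mode == "Block"


if __name__ == "__main__":
    rwx_fs = {"accessModes": ["ReadWriteMany"]}
    rwx_block = {"accessModes": ["ReadWriteMany"], "volumeMode": "Block"}
    print(shared_access_is_safe(rwx_fs, True))
    print(shared_access_is_safe(rwx_block, True))
```

Note that read-only shared claims are treated the same as RWX here, following the point above that read-only mounts still write filesystem metadata.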

@mdbooth
Contributor

mdbooth commented Apr 13, 2021

For completeness, I wrote up what I believe should be the logic here: #1368 (comment)

@tombarron
Contributor

I commented in #1368 that to my knowledge Cinder provides no way for a regular user (member with project scoped keystone role) to discover the multiattach extra spec in volume types. IMO that is a deficiency that Cinder should address (PTG topic? capabilities discovery has come up before in Cinder but to my knowledge was never resolved). (FWIW, in Manila we distinguish between traditional, private extra-specs which have back end details that are not the business of regular users and public extra-specs, which can be used by regular users for capability discovery)

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 13, 2021
@tombarron
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 14, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 12, 2021
@gouthampacha
Contributor

/remove-lifecycle-stale

@ramineni
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 22, 2021
@tombarron
Contributor

https://review.opendev.org/c/openstack/cinder/+/806260 implements user visible extra specs for Cinder, including whether a volume type has multi-attach support.

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2022
@ramineni
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2022
@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 20, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 20, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

