
Allow modifying IOPS and Throughput after volume creation #1338

Closed
tareksha opened this issue Aug 10, 2022 · 10 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@tareksha

tareksha commented Aug 10, 2022

Is your feature request related to a problem? Please describe.
Allow modifying IOPS and Throughput of individual gp3 volumes after they are created.

Describe the solution you'd like in detail
I would like to specify different IOPS and Throughput for gp3 volumes provisioned via PersistentVolumeClaims. More importantly, I would like the ability to modify them after the volume is created. Currently the only modification possible is expansion, which means we miss out on a very valuable feature of gp3 volumes.

Something like PVC/PV-level annotations should be fine.
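
For context, gp3 IOPS and throughput can already be set at creation time through StorageClass parameters, roughly like the sketch below (values are illustrative), but there is no way to change them on an existing volume:

```yaml
# gp3 StorageClass with explicit IOPS/throughput (illustrative values).
# These parameters only take effect when the volume is provisioned;
# today there is no Kubernetes-native way to modify them afterwards.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-custom
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  iops: "4000"
  throughput: "250"
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
```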

Describe alternatives you've considered
There is no kubernetes-native way to do this right now.

Additional context
In our clusters we need to switch to gp3 volumes and update their specs periodically according to real world usage.
Right now we are stuck with gp2 volumes, where the only way to increase IOPS is to increase volume size. There is no way to increase baseline throughput beyond the 250 MB/sec maximum of gp2.

@rdpsin
Contributor

rdpsin commented Aug 10, 2022

Unfortunately, this is not possible.

  1. PVC/PV annotations are not passed to the CSI driver. Issues have been opened about this (see Adding PVC's annotation to CreateVolumeRequest.Parameters, kubernetes-csi/external-provisioner#86), but there has been no progress on them.

  2. Even if the CSI driver had access to PVC/PV annotations, there is no way to trigger a 'ModifyVolume' call when PVC/PV annotations are updated.

What we would need is a generic 'ModifyVolume' call (see container-storage-interface/spec#491, for example), but it is currently not possible.

For migrating gp2 to gp3 volumes, please take a look at this: https://aws.amazon.com/blogs/containers/migrating-amazon-eks-clusters-from-gp2-to-gp3-ebs-volumes/

@DerekTBrown

@rdpsin I am wondering if this feature could be implemented using a custom resource instead:

  1. The CSI controller would implement a custom resource called EBSVolumeParameters which would contain the mutable AWS-specific parameters of EBS volumes (IOPS, Throughput, Storage Type).

  2. Each EBS StorageClass would point to an EBSVolumeParameters object, which would define the parameters for that StorageClass.

  3. As an initial implementation, the EBSVolumeParameters resource could be modified, but changes would only apply to newly created volumes. This would enable customers to migrate IOPS/Throughput/Storage Classes by (1) manually modifying existing EBS volumes, and then (2) modifying the EBSVolumeParameters resource as applied to future volumes.

  4. In a longer term implementation, the CSI controller could handle EBSVolumeParameters modifications and change the underlying EBS volumes accordingly.

I feel this is a fairly important problem for CSI; at the moment, IOPS/Throughput/Storage Class changes require taking downtime, which isn't something our services can tolerate.
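
A minimal sketch of what such a resource could look like; everything here is hypothetical (the API group, kind, fields, and StorageClass parameter key below do not exist in the driver today):

```yaml
# Hypothetical CRD instance holding the mutable EBS-specific settings.
apiVersion: ebs.csi.aws.com/v1alpha1
kind: EBSVolumeParameters
metadata:
  name: high-throughput
spec:
  volumeType: gp3     # mutable volume type
  iops: 6000          # mutable provisioned IOPS
  throughput: 500     # mutable throughput (MiB/s)
---
# The StorageClass would reference the resource above instead of
# hard-coding iops/throughput in its parameters (key is hypothetical).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-managed
provisioner: ebs.csi.aws.com
parameters:
  ebs.csi.aws.com/volumeParameters: high-throughput
```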

@rdpsin
Contributor

rdpsin commented Oct 3, 2022

Yes, we've thought about using CRDs this way. I think there are a couple of challenges:

  1. This would be EBS specific; will require creating new sidecars.
  2. More egregiously, the sidecar will not have any context about which PVC object you want to modify. For example, both PVC A and PVC B would use the same StorageClass referring to EBSVolumeParameters C. Now suppose you want to update the volume properties. You modify C, and the controller/sidecar watching the CRD gets called into action. It doesn't know which PVC it needs to update. It can probably use the K8s API to figure out which PVCs are related to it, but it still doesn't know whether to update A or B or both or some other PVC that also uses the particular CRD.

@DerekTBrown

This would be EBS specific; will require creating new sidecars.

I agree this is less-than-ideal.

However, a bespoke implementation might be a good way to demonstrate a broader need to the storage SIG.

It can probably use the K8s API to figure out which PVCs are related to it, but it still doesn't know whether to update A or B or both or some other PVC that also uses the particular CRD.

I am not sure I understand this limitation: it seems like the controller/sidecar could perform a "join" from EBSVolumeParameters -> StorageClass -> PVC to figure out what it needs to update?

@luanguimaraesla

What about having two different CRDs? An EBSVolumeParametersClass would be immutable and linked to the StorageClass, working like a template with default IOPS/Throughput parameters. When a user creates a new PVC/PV, the CSI driver could use it to deploy a separate EBSVolumeParameters CR linked to that PV.
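
Roughly like this (all names and fields are hypothetical, not part of the driver):

```yaml
# Hypothetical immutable template referenced by the StorageClass,
# providing default IOPS/throughput for newly provisioned volumes.
apiVersion: ebs.csi.aws.com/v1alpha1
kind: EBSVolumeParametersClass
metadata:
  name: gp3-defaults
spec:
  volumeType: gp3
  iops: 3000
  throughput: 125
---
# Hypothetical per-volume resource the driver would stamp out from the
# template at provisioning time; editing it would tune that one volume.
apiVersion: ebs.csi.aws.com/v1alpha1
kind: EBSVolumeParameters
metadata:
  name: pvc-0fe9a1-parameters
spec:
  persistentVolumeName: pvc-0fe9a1   # hypothetical back-reference to the PV
  iops: 6000
  throughput: 500
```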

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Jan 25, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added the lifecycle/rotten label and removed the lifecycle/stale label Feb 24, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Mar 26, 2023
@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tareksha
Author

Hi @DerekTBrown, @rayandas, is this possible via the new VolumeAttributesClass API in k8s 1.29?
https://kubernetes.io/blog/2023/12/15/kubernetes-1-29-volume-attributes-class
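
If it is, I would expect it to look roughly like the sketch below; the exact parameter keys the EBS driver accepts for modification are my assumption, not confirmed:

```yaml
# VolumeAttributesClass is alpha (storage.k8s.io/v1alpha1) in Kubernetes 1.29
# and requires the VolumeAttributesClass feature gate; the iops/throughput
# parameter keys are assumed here, check the driver documentation.
apiVersion: storage.k8s.io/v1alpha1
kind: VolumeAttributesClass
metadata:
  name: gp3-fast
driverName: ebs.csi.aws.com
parameters:
  iops: "6000"
  throughput: "500"
---
# An existing PVC would then be pointed at the class to request the change.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: gp3-custom          # hypothetical existing StorageClass
  volumeAttributesClassName: gp3-fast   # new field that requests the modification
  resources:
    requests:
      storage: 100Gi
```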
