Support striped LVM for higher GP3 throughput #2440

Open
@jirislav

Description

Is your feature request related to a problem? Please describe.

The GP3 volume type is cost-effective, but relatively slow. If we provision multiple GP3 volumes, attach them to a single node, and bootstrap them with striped LVM, we gain increased throughput and IOPS at the same GB-month price.

Describe the solution you'd like in detail

Let's say I would create this Storage Class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: multi-ebs-gp3-xfs
provisioner: ebs.csi.aws.com
allowVolumeExpansion: true  # I suggest leaving this unsupported in the initial LVM version
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
mountOptions:
  - noatime
  - nodiratime
parameters:
  type: gp3
  csi.storage.k8s.io/fstype: xfs
  encrypted: "true"
  throughput: "1000" # per volume
  iops: "16000" # per volume
  # New parameter:
  lvm_striped_volumes: "10"

Then, when a PVC requests, say, 1000 GB, the CSI driver would provision 10 GP3 volumes (100 GB each), bringing theoretical IOPS to 160,000 and throughput to 10,000 MiB/s.
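The aggregate limits simply multiply out from the per-volume StorageClass parameters above (a quick sanity check, nothing driver-specific):

```shell
# Aggregate limits for 10 striped GP3 volumes, using the per-volume
# figures from the StorageClass above (16000 IOPS, 1000 MiB/s each).
STRIPED_VOLUMES=10
PER_VOLUME_IOPS=16000
PER_VOLUME_THROUGHPUT_MIB=1000

TOTAL_IOPS=$(( STRIPED_VOLUMES * PER_VOLUME_IOPS ))
TOTAL_THROUGHPUT_MIB=$(( STRIPED_VOLUMES * PER_VOLUME_THROUGHPUT_MIB ))

echo "theoretical IOPS: $TOTAL_IOPS"
echo "theoretical throughput: $TOTAL_THROUGHPUT_MIB MiB/s"
```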

Under the hood, the driver would attach the volumes to the node, create a striped LVM logical volume on top of them, format it with XFS, and mount it into the Pod. If the Pod is later scheduled on another node, the driver would ensure the same 10 GP3 volumes are attached in the same order to the new node, and the process would be repeated.
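The node-side bootstrap could look roughly like this. This is a sketch only: the device paths, the VG/LV names, and the 64 KiB stripe size are illustrative assumptions, and a real driver would discover the attached NVMe devices itself rather than hard-code them.

```shell
set -eu

# Illustrative sketch of the node-side bootstrap for lvm_striped_volumes=10.
# Device paths, VG/LV names, and stripe size are assumptions, not driver API.
STRIPES=10
TOTAL_GIB=1000
PER_VOL_GIB=$(( TOTAL_GIB / STRIPES ))   # each GP3 volume ends up 100 GiB
echo "provision $STRIPES GP3 volumes of ${PER_VOL_GIB} GiB each"

# Guarded: the LVM commands need root and the actual attached EBS devices.
if [ "${APPLY:-0}" = "1" ]; then
  DEVICES="/dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 \
           /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 /dev/nvme10n1"
  pvcreate $DEVICES                          # initialize the physical volumes
  vgcreate vg_pvc $DEVICES                   # one VG spanning all 10 EBS volumes
  lvcreate --type striped -i "$STRIPES" -I 64 \
           -l 100%FREE -n lv_pvc vg_pvc      # striped LV, 64 KiB stripe size
  mkfs.xfs /dev/vg_pvc/lv_pvc               # format with XFS per the StorageClass
fi
```

Keeping a fixed stripe count and order is what makes the volume reassemblable: `vgchange -ay` on a new node activates the same VG as long as all 10 PVs are attached.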

Once we have a Persistent Volume created in k8s, I would expect the volume IDs, as well as their stripe order, to be stored within the annotations of that PV so that it can be mounted on any new node as needed.
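For example, the driver could record the ordered stripe set on the PV like this. The annotation keys and volume IDs below are hypothetical, not existing `ebs.csi.aws.com` API:

```shell
# Hypothetical annotation keys and volume IDs -- nothing here is existing
# ebs.csi.aws.com API. The stripe order matters, so it is stored explicitly
# as an ordered, comma-separated list of EBS volume IDs.
PV_NAME="pvc-example"
STRIPE_SET="vol-aaa,vol-bbb,vol-ccc"
STRIPE_COUNT=$(echo "$STRIPE_SET" | tr ',' '\n' | wc -l)

# Guarded: needs a live cluster and kubectl.
if [ "${APPLY:-0}" = "1" ]; then
  kubectl annotate pv "$PV_NAME" \
    "ebs.csi.aws.com/lvm-stripe-order=$STRIPE_SET" \
    "ebs.csi.aws.com/lvm-stripe-count=$STRIPE_COUNT"
fi
echo "stripe count: $STRIPE_COUNT"
```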

Describe alternatives you've considered

There is an existing CSI driver with LVM support, but it only handles locally attached disks. While it would be possible to provision EC2 nodes and attach EBS volumes to them with an init script, there is no way to migrate those EBS volumes to another node.

Additional context

Other LVM layouts such as mirrored or linear are not needed, IMO. EBS is already replicated, so mirroring adds no value; a linear volume simply appends one volume to another, which doesn't improve I/O performance compared to a striped volume.
