No LustreFS support in newer variants (k8s-1.28, ecs-2) #3459

Closed
yeazelm opened this issue Sep 14, 2023 · 11 comments
Labels
area/core Issues core to the OS (variant independent) type/bug Something isn't working

Comments

@yeazelm
Contributor

yeazelm commented Sep 14, 2023

The 6.1 kernel currently doesn't support LustreFS, but the 5.10 and 5.15 kernels do. There seems to be ongoing upstream work to bring this support back into the 6.1 kernel. We should add it once it is ready for 6.1.

@yeazelm yeazelm added type/bug Something isn't working status/needs-triage Pending triage or re-evaluation area/core Issues core to the OS (variant independent) and removed status/needs-triage Pending triage or re-evaluation labels Sep 14, 2023
@bcressey
Contributor

This affects all the new variants using 6.1:

  • aws-ecs-2 / aws-ecs-2-nvidia
  • aws-k8s-1.28 / aws-k8s-1.28-nvidia
  • vmware-k8s-1.28
  • metal-k8s-1.28

Other variants are not affected.

@bcressey bcressey changed the title Add LustreFS support back for 6.1 kernels No LustreFS support in newer variants (k8s-1.28, ecs-2) Sep 14, 2023
@bcressey bcressey pinned this issue Sep 14, 2023
@autarchprinceps

What options does one have if one needs continued FSx for Lustre support? Not upgrading to EKS 1.28? You can't downgrade, and you won't know this is an issue until you upgrade. We just ran into this in one of our clusters.

@stmcginnis
Contributor

That's a good question @autarchprinceps. Assuming you are not relying on any Kubernetes 1.28 functionality, you could deploy new worker nodes using a Bottlerocket aws-k8s-1.27 AMI. That is a supported Kubernetes version-skew configuration, and it would let you keep using Lustre even after the EKS cluster has been upgraded to 1.28.

@kamirendawkins

Another workaround, which we chose in order to avoid version skew, is to use Karpenter to force pods that require FSx onto AL2 (kernel 5.10) until this is resolved and we can move those services back to Bottlerocket.
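A minimal sketch of that Karpenter approach, assuming the v1beta1 APIs and placeholder names (al2-lustre, workload/fsx-lustre, my-cluster) that are not from this thread:

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: al2-lustre
spec:
  amiFamily: AL2                         # AL2 nodes run the 5.10 kernel, which ships the Lustre client
  role: KarpenterNodeRole-my-cluster     # placeholder node IAM role
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: my-cluster
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: fsx-lustre-al2
spec:
  template:
    metadata:
      labels:
        workload/fsx-lustre: "true"      # pods that need FSx select this label
    spec:
      nodeClassRef:
        name: al2-lustre
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
      taints:
        - key: workload/fsx-lustre       # keep everything else off these nodes
          value: "true"
          effect: NoSchedule

Pods that mount FSx volumes would then add a matching nodeSelector and toleration so Karpenter provisions AL2 capacity for them, while the rest of the workloads stay on Bottlerocket.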

@bcressey
Contributor

This should be fixed by the kernel 6.1 update in #3853, but needs to be verified. (cc: @larvacea)

@larvacea
Member

A (very) quick test pre-release showed that 6.1 had the LustreFS configuration and patch from upstream, and that the pieces appeared to be able to speak to a Lustre file system. If you feel confident in upstream testing, this may be sufficient. If not, let me know and I can add Lustre testing to our loop.

@bcressey
Contributor

I'd like to see positive proof that the Lustre CSI driver works on Bottlerocket with the 6.1 kernel before we resolve this and add a release note about it.

@natehudson

We are running Kubernetes 1.29 now and had to switch back to Amazon Linux 2 for our FSx (Lustre) EKS workloads. We prefer Bottlerocket, so I tested the latest Bottlerocket OS 1.19.3 (aws-k8s-1.29) release, and I'm still seeing problems with the CSI driver and Lustre. Our containers are failing to launch due to mounting errors:

Warning  FailedMount  106s (x9 over 3m54s)  kubelet            MountVolume.SetUp failed for volume "pv-shared-fsx-east-sandbox" : rpc error: code = Internal desc = Could not mount "fs-our-id-here.fsx.us-east-1.amazonaws.com@tcp:/our-mount-point" at "/var/lib/kubelet/pods/1d65cbb9-e60e-4433-a221-d1615639bd61/volumes/kubernetes.io~csi/pv-shared-fsx-east-sandbox/mount": mount failed: exit status 5
Mounting command: mount
Mounting arguments: -t lustre -o flock fs-our-id-here.fsx.us-east-1.amazonaws.com@tcp:/our-mount-point /var/lib/kubelet/pods/1d65cbb9-e60e-4433-a221-d1615639bd61/volumes/kubernetes.io~csi/pv-shared-fsx-east-sandbox/mount
Output: mount.lustre: mount fs-our-id-here.fsx.us-east-1.amazonaws.com@tcp:/our-mount-point at /var/lib/kubelet/pods/1d65cbb9-e60e-4433-a221-d1615639bd61/volumes/kubernetes.io~csi/pv-shared-fsx-east-sandbox/mount failed: Input/output error
Is the MGS running?

Searching around, the "Is the MGS running?" message seems to suggest a security group problem with port 988, but these Bottlerocket nodes are in the same subnets and security groups as the working Amazon Linux 2 nodes running the same fsx-csi-driver. They are all sharing the same PVC and PV, backed by the same FSx file system.

Any ideas or further debugging I can help with?

@larvacea
Member

larvacea commented Apr 26, 2024

I tested the 1.20 CSI drivers on version 1.19.4 of our k8s-1.28 variant. I can (and probably should) go back and try this with 1.29, but the Bottlerocket AMIs for 1.28 and 1.29 use the same kernel. I know "it works for me" is not helpful; it obviously does not work for @natehudson.
I have seen the same error, though, so I can share what I learned about FSx for Lustre, networks, and security groups, in the hope that it helps here.

  • I found https://docs.aws.amazon.com/eks/latest/userguide/fsx-csi.html helpful. I used the configuration there almost unchanged for my testing.
  • The FSx documentation does not tell you this, but the VPC setup visible (and editable) in the console, including security groups and route tables, applies to the control node only. Changing this setup after you create the FSx file system will not affect the data nodes, and you will get exactly this "Is the MGS running?" message. If the file system will live outside the EKS cluster's VPC, it is imperative that the subnet and security group setup is in place before you provision the file system.
  • Given all of that fun, and assuming that this does not cause security or other problems for you, I recommend provisioning your storage in the EKS cluster security group, using one of the EKS cluster's internal subnets. If your FSx filesystem is inside the cluster, you do not need to open ports in the SG and you do not need to add routes to communicate between the EKS cluster and its filesystem.

For my testing, I followed the getting-started example in the EKS FSx documentation. I can share the two YAML files I used, in the hope that they will help with your debugging:

The storage_class.yaml file:

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fsx-sc
provisioner: fsx.csi.aws.com
parameters:
  subnetId: subnet-0123456780abcdef
  securityGroupIds: sg-0123456780abcdef
  deploymentType: PERSISTENT_1
  automaticBackupRetentionDays: "1"
  dailyAutomaticBackupStartTime: "00:00"
  copyTagsToBackups: "true"
  perUnitStorageThroughput: "200"
  dataCompressionType: "NONE"
  weeklyMaintenanceStartTime: "7:09:00"
  fileSystemTypeVersion: "2.12"
mountOptions:
  - flock

Here subnetId is any one of the private subnets (172.x) for your EKS cluster, and securityGroupIds is the cluster security group. The claim.yaml file contains:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fsx-claim
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: fsx-sc
  resources:
    requests:
      storage: 1200Gi

You should be able to edit parameters to suit your application, of course. I would certainly suggest selecting deploymentType based on your own requirements. Deployment type imposes constraints on the value of storage in the persistent volume claim (and again, you know best what you need).

Using this configuration (and following the step-by-step process in the documentation), I can verify that the sample container was able to mount and write to the Lustre file system. I also tested a non-CSI application running on a cluster without the CSI driver, connecting to a separately-provisioned Lustre filesystem. It was able to mount and write to the file system. For this non-CSI test, I did provision the FSx filesystem separately, and I did add the required firewall rules to both the FSx SG and the cluster SG (and the first time I tried, I did not do this correctly, and I got the error message you report, much to my dismay and confusion).
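For reference, a minimal pod along the lines of the sample container in the EKS FSx guide, writing through the fsx-claim PVC above (the image, pod name, and paths here are just placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: fsx-app
spec:
  containers:
    - name: app
      image: public.ecr.aws/amazonlinux/amazonlinux:2
      command: ["/bin/sh", "-c", "echo hello from lustre > /data/out.txt && sleep 3600"]
      volumeMounts:
        - name: persistent-storage
          mountPath: /data              # the Lustre file system is mounted here by the CSI driver
  volumes:
    - name: persistent-storage
      persistentVolumeClaim:
        claimName: fsx-claim            # the PVC defined in claim.yaml above

If kubectl exec into the pod shows out.txt on the volume, the CSI driver, security groups, and Lustre client are all working together.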

@larvacea
Member

Lustre support is available in all variants in releases starting with v1.19.3. As I mentioned earlier, it works for both native Lustre mounts and the CSI driver for FSx for Lustre.

@natehudson

Thanks for all your help @larvacea !

I tested again yesterday with the latest Bottlerocket and fsx-csi-driver releases and was still getting the "Is the MGS running?" errors.

In my research, I stumbled across the compatibility matrix at https://docs.aws.amazon.com/fsx/latest/LustreGuide/lustre-client-matrix.html and realized that our older FSx file systems are Lustre version 2.10, which is incompatible with the 6.1 kernel. This explains why the same setup works on our Amazon Linux 2 EKS nodes: they are still on a 5.x kernel.

I tested a new StorageClass with fileSystemTypeVersion: "2.12" and was able to create and mount the volume with no problem on the Bottlerocket OS 1.20.0 (aws-k8s-1.29) EKS nodes. :-)
