# EKS CSI FSX Lustre Setup

Amazon FSx for Lustre is a high-performance file system optimized for compute-intensive workloads such as deep learning. When linked to an S3 bucket, FSx for Lustre exposes the bucket's objects through a POSIX-compliant file system that multiple readers and writers can access simultaneously.

The Amazon FSx for Lustre Container Storage Interface (CSI) driver provides a CSI interface that allows Amazon EKS clusters to manage the lifecycle of Amazon FSx for Lustre file systems.

https://docs.aws.amazon.com/eks/latest/userguide/fsx-csi.html

In [None]:
import boto3
import json
from botocore.exceptions import ClientError

iam = boto3.client('iam')
sts = boto3.client('sts')
cfn = boto3.client('cloudformation')
eks = boto3.client('eks')

region = boto3.Session().region_name
cluster_name = 'demo'

# Install the FSx CSI Driver for Kubernetes

## Create IAM Policy

Create an IAM policy that allows the driver to make calls to AWS APIs on your behalf. The service account that uses this policy is created in the next section.

In [None]:
!pygmentize fsx/fsx-csi-driver.json

In [None]:
# !aws iam create-policy \
#     --policy-name Amazon_FSx_Lustre_CSI_Driver \
#     --policy-document file://fsx/fsx-csi-driver.json

In [None]:
with open('fsx/fsx-csi-driver.json') as json_file:
    data = json.load(json_file)
    policy = json.dumps(data)

try:
    response = iam.create_policy(
        PolicyName='Amazon_FSx_Lustre_CSI_Driver',
        PolicyDocument=policy
    )
    print("[OK] Policy created.")

except ClientError as e:
    if e.response['Error']['Code'] == 'EntityAlreadyExists':
        print("[OK] Policy already exists.")
    else:
        print("Error: %s" % e)

In [None]:
account_id = sts.get_caller_identity()['Account']
csi_policy_arn = 'arn:aws:iam::{}:policy/Amazon_FSx_Lustre_CSI_Driver'.format(account_id)
print(csi_policy_arn)
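
The ARN above hard-codes the `aws` partition. As a minimal sketch, assuming the notebook might also be run in another partition (for example `aws-cn` or `aws-us-gov`), the helper below builds the same string with the partition as a parameter. `build_policy_arn` is a hypothetical name, not a boto3 function, and the account ID in the example is a placeholder.

```python
def build_policy_arn(account_id, policy_name, partition="aws"):
    """Return the ARN of a customer-managed IAM policy.

    Hypothetical helper. IAM is a global service, so the region
    field of the ARN is left empty.
    """
    return "arn:{}:iam::{}:policy/{}".format(partition, account_id, policy_name)


# Placeholder account ID, not a real account.
print(build_policy_arn("123456789012", "Amazon_FSx_Lustre_CSI_Driver"))
```

In the notebook, `build_policy_arn(account_id, 'Amazon_FSx_Lustre_CSI_Driver')` would produce the same value as `csi_policy_arn`.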

## Create Kubernetes IAM Service Account

Create a Kubernetes service account for the driver and attach the policy to it. Use the policy ARN printed in the previous step.

In [None]:
!eksctl create iamserviceaccount \
     --region $region \
     --name fsx-csi-controller-sa \
     --namespace kube-system \
     --cluster $cluster_name \
     --attach-policy-arn $csi_policy_arn \
     --approve

In [None]:
cf_stack_name = 'eksctl-{}-addon-iamserviceaccount-kube-system-fsx-csi-controller-sa'.format(cluster_name)
print(cf_stack_name)

In [None]:
response = cfn.list_stack_resources(
    StackName=cf_stack_name
)
print(response)

In [None]:
iam_role_name = response['StackResourceSummaries'][0]['PhysicalResourceId']
print(iam_role_name)
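
Indexing `[0]` works as long as the IAM role is the first resource in the stack. A slightly more defensive sketch, assuming the response has the shape returned by `list_stack_resources`, selects the resource by its `ResourceType` instead. `find_iam_role_name` is a hypothetical helper and the sample response is hand-written for illustration.

```python
def find_iam_role_name(stack_resources):
    """Return the PhysicalResourceId of the first AWS::IAM::Role resource
    in a list_stack_resources response, or None if there is no role."""
    for summary in stack_resources.get("StackResourceSummaries", []):
        if summary.get("ResourceType") == "AWS::IAM::Role":
            return summary.get("PhysicalResourceId")
    return None


# Hand-written sample shaped like a list_stack_resources response.
sample_response = {
    "StackResourceSummaries": [
        {
            "LogicalResourceId": "Other",
            "ResourceType": "AWS::IAM::Policy",
            "PhysicalResourceId": "example-policy",
        },
        {
            "LogicalResourceId": "Role1",
            "ResourceType": "AWS::IAM::Role",
            "PhysicalResourceId": "example-role-name",
        },
    ]
}
print(find_iam_role_name(sample_response))
```

In the notebook, `find_iam_role_name(response)` could replace the positional lookup.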

In [None]:
iam_role_arn = iam.get_role(RoleName=iam_role_name)['Role']['Arn']
print(iam_role_arn)

# Deploy CSI Driver

In [None]:
!kubectl apply -k "github.com/kubernetes-sigs/aws-fsx-csi-driver/deploy/kubernetes/overlays/stable/?ref=master"


Annotate the driver's service account with the ARN of the IAM role created above so that the driver's pods can assume that role.

In [None]:
!kubectl annotate serviceaccount -n kube-system fsx-csi-controller-sa \
 eks.amazonaws.com/role-arn=$iam_role_arn --overwrite=true
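
To confirm that the annotation was applied, one option is to read the service account back as JSON and inspect its annotations. The sketch below assumes the JSON comes from `kubectl get serviceaccount fsx-csi-controller-sa -n kube-system -o json`; the helper name `role_arn_from_service_account` and the sample document are hypothetical.

```python
import json

ROLE_ARN_ANNOTATION = "eks.amazonaws.com/role-arn"


def role_arn_from_service_account(sa_json):
    """Extract the role ARN annotation from a ServiceAccount JSON document.

    Returns None if the annotation is missing.
    """
    manifest = json.loads(sa_json)
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return annotations.get(ROLE_ARN_ANNOTATION)


# Hand-written sample; the role ARN is a placeholder.
sample_sa = json.dumps({
    "apiVersion": "v1",
    "kind": "ServiceAccount",
    "metadata": {
        "name": "fsx-csi-controller-sa",
        "namespace": "kube-system",
        "annotations": {
            ROLE_ARN_ANNOTATION: "arn:aws:iam::123456789012:role/example-role-name"
        },
    },
})
print(role_arn_from_service_account(sample_sa))
```

Comparing the returned value with `iam_role_arn` shows whether the service account points at the intended role.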

# Check S3 Bucket For FSX

In [None]:
bucket = 's3://fsx-antje'

In [None]:
#!aws s3 mb $bucket

In [None]:
!aws s3 ls $bucket

In [None]:
!aws s3 ls $bucket --recursive

# Download Storage Class Manifest

In [None]:
!curl -o storageclass.yaml https://raw.githubusercontent.com/kubernetes-sigs/aws-fsx-csi-driver/master/examples/kubernetes/dynamic_provisioning_s3/specs/storageclass.yaml
    

## Get VPC ID and Subnet ID

In [None]:
%%bash

source ~/.bash_profile

#### Get VPC ID
export VPC_ID=$(aws ec2 describe-vpcs --filters "Name=tag:Name,Values=eksctl-${AWS_CLUSTER_NAME}-cluster/VPC" --query "Vpcs[0].VpcId" --output text)
echo "export VPC_ID=${VPC_ID}" | tee -a ~/.bash_profile

#### Get Subnet ID
export SUBNET_ID=$(aws ec2 describe-subnets --filters "Name=vpc-id,Values=${VPC_ID}" --query "Subnets[0].SubnetId" --output text)
echo "export SUBNET_ID=${SUBNET_ID}" | tee -a ~/.bash_profile

## Create Security Group

In [None]:
%%bash

source ~/.bash_profile

export SEC_GROUP_ID=$(aws ec2 create-security-group --group-name eks-fsx-security-group --vpc-id ${VPC_ID} --description "FSx for Lustre Security Group" --query "GroupId" --output text)
echo "export SEC_GROUP_ID=${SEC_GROUP_ID}" | tee -a ~/.bash_profile

## Add an ingress rule that opens up port 988 from the 192.168.0.0/16 CIDR range

In [None]:
%%bash

source ~/.bash_profile

aws ec2 authorize-security-group-ingress --group-id ${SEC_GROUP_ID} --protocol tcp --port 988 --cidr 192.168.0.0/16
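
The same rule can be expressed in Python for use with boto3. The sketch below only builds the `IpPermissions` argument that the EC2 client's `authorize_security_group_ingress` call accepts and validates the CIDR first. `lustre_ingress_permissions` and `LUSTRE_PORT` are hypothetical names, and the sketch assumes an IPv4 range.

```python
import ipaddress

LUSTRE_PORT = 988  # the port opened by the CLI command above


def lustre_ingress_permissions(cidr):
    """Build the IpPermissions list for a TCP ingress rule on port 988.

    Raises ValueError if cidr is not a valid IPv4 network.
    """
    network = ipaddress.IPv4Network(cidr)
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": LUSTRE_PORT,
            "ToPort": LUSTRE_PORT,
            "IpRanges": [{"CidrIp": str(network)}],
        }
    ]


print(lustre_ingress_permissions("192.168.0.0/16"))
```

With an EC2 client, the result would be passed as `ec2.authorize_security_group_ingress(GroupId=sec_group_id, IpPermissions=...)`, where `sec_group_id` is assumed to hold the ID created in the previous step.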

## Attach Security Group to Nodes

## Fill in the placeholder values in the `storageclass.yaml` file

In [None]:
!pygmentize fsx/storageclass.yaml

In [None]:
# %%bash

# source ~/.bash_profile

# # Populate SUBNET_ID, SECURITY_GROUP_ID, S3_BUCKET

# cd

# sed "s@SUBNET_ID@$SUBNET_ID@" fsx/fsx-s3-sc.yaml.template > fsx/fsx-s3-sc.yaml

# sed -i.bak -e "s@SECURITY_GROUP_ID@$SEC_GROUP_ID@" fsx/fsx-s3-sc.yaml

# sed -i.bak -e "s@S3_BUCKET@$S3_BUCKET@" fsx/fsx-s3-sc.yaml
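
The substitution can also be done in Python, which avoids the `.bak` files that `sed -i.bak` leaves behind. This is a minimal sketch: the placeholder names match the `sed` commands above, but the template text, the helper name `render_storage_class`, and the example IDs are assumptions made for illustration.

```python
def render_storage_class(template, subnet_id, security_group_id, s3_bucket):
    """Replace the SUBNET_ID, SECURITY_GROUP_ID and S3_BUCKET placeholders.

    Raises ValueError if any replacement value is empty, so that a
    missing environment variable does not produce a broken manifest.
    """
    replacements = {
        "SUBNET_ID": subnet_id,
        "SECURITY_GROUP_ID": security_group_id,
        "S3_BUCKET": s3_bucket,
    }
    rendered = template
    for placeholder, value in replacements.items():
        if not value:
            raise ValueError("no value supplied for {}".format(placeholder))
        rendered = rendered.replace(placeholder, value)
    return rendered


# Hypothetical template; the real file in fsx/ may differ.
example_template = """kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: fsx-sc
provisioner: fsx.csi.aws.com
parameters:
  subnetId: SUBNET_ID
  securityGroupIds: SECURITY_GROUP_ID
  s3ImportPath: S3_BUCKET
"""

print(render_storage_class(
    example_template,
    subnet_id="subnet-0123456789abcdef0",       # placeholder ID
    security_group_id="sg-0123456789abcdef0",   # placeholder ID
    s3_bucket="s3://example-bucket",            # placeholder bucket
))
```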

# Create FSX Storage Class

In [None]:
!kubectl delete -f fsx/storageclass.yaml --ignore-not-found

In [None]:
!kubectl create -f fsx/storageclass.yaml

In [None]:
!kubectl get sc

# Create Claim

In [None]:
!curl -o claim.yaml https://raw.githubusercontent.com/kubernetes-sigs/aws-fsx-csi-driver/master/examples/kubernetes/dynamic_provisioning_s3/specs/claim.yaml

In [None]:
!pygmentize fsx/claim.yaml

In [None]:
!kubectl delete -f fsx/claim.yaml --ignore-not-found

In [None]:
!kubectl apply -f fsx/claim.yaml

In [None]:
!kubectl get pvc fsx-claim

In [None]:
!kubectl describe pvc fsx-claim

## _Wait for status == Bound_
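
Rather than re-running the cells above by hand, the wait can be scripted as a polling loop. The sketch below is one way to do it, under the assumption that the claim's phase is available through a callable; `wait_for_phase` is a hypothetical helper, and the demonstration uses a fake phase sequence instead of a real cluster.

```python
import time


def wait_for_phase(get_phase, wanted="Bound", timeout_s=600, interval_s=10,
                   sleep=time.sleep):
    """Poll get_phase() until it returns `wanted` or the timeout expires.

    Returns True if the wanted phase was observed, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_phase() == wanted:
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Demonstration with a fake phase sequence and no real sleeping.
# Against a cluster, get_phase could instead wrap:
#   subprocess.run(["kubectl", "get", "pvc", "fsx-claim",
#                   "-o", "jsonpath={.status.phase}"],
#                  capture_output=True, text=True).stdout.strip()
fake_phases = iter(["Pending", "Pending", "Bound"])
print(wait_for_phase(lambda: next(fake_phases), sleep=lambda seconds: None))
```

Provisioning a file system takes several minutes, so a generous timeout is a reasonable default; the values used here are arbitrary.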

## Update FSX to `autoImportPolicy: NEW_CHANGED`

In [None]:
fsx = boto3.client('fsx')

In [None]:
response = fsx.describe_file_systems()
fsx_id = response['FileSystems'][0]['FileSystemId']
print(fsx_id)
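
Taking `FileSystems[0]` assumes the account has exactly one file system in the region. As a hedged alternative, the sketch below filters a `describe_file_systems` response by `FileSystemType` and `Lifecycle` and fails loudly if the match is ambiguous. `pick_lustre_file_system` is a hypothetical helper and the sample response is hand-written.

```python
def pick_lustre_file_system(response, lifecycle="AVAILABLE"):
    """Return the ID of the single Lustre file system in the given lifecycle state.

    Raises LookupError if zero or more than one file system matches.
    """
    matches = [
        fs["FileSystemId"]
        for fs in response.get("FileSystems", [])
        if fs.get("FileSystemType") == "LUSTRE" and fs.get("Lifecycle") == lifecycle
    ]
    if len(matches) != 1:
        raise LookupError(
            "expected exactly one matching file system, found {}".format(len(matches))
        )
    return matches[0]


# Hand-written sample shaped like a describe_file_systems response.
sample_file_systems = {
    "FileSystems": [
        {"FileSystemId": "fs-00000000000000001", "FileSystemType": "WINDOWS",
         "Lifecycle": "AVAILABLE"},
        {"FileSystemId": "fs-00000000000000002", "FileSystemType": "LUSTRE",
         "Lifecycle": "AVAILABLE"},
    ]
}
print(pick_lustre_file_system(sample_file_systems))
```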

In [None]:
response = fsx.update_file_system(
    FileSystemId=fsx_id,
    LustreConfiguration={
        'AutoImportPolicy': 'NEW_CHANGED'
    }
)
print(response)

## kubectl version

In [None]:
!kubectl version