Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(aws-fsx): L2 construct for FSx for Lustre #6653

Merged
merged 10 commits into from
May 4, 2020

Conversation

horsmand
Copy link
Contributor

@horsmand horsmand commented Mar 10, 2020

Description

I wrote the L2 construct for FSx for Lustre. This can be found in lustre-file-system.ts. I didn't implement FSx for Windows, but tried to split out anything that would be shareable between the two implementations into file-system.ts.

I verified this through 100% unit test coverage, along with some integration tests. I've included one integration test in the code called integ.lustre-file-system.ts that does basic setup. I also wrote a more in-depth integ test that I haven't included here but used as example code in the Mounting section of the README. I did the same for the import instructions as well, there's no included integ test but I did write one to verify it and include parts of it in the README.

Commit Message

feat(aws-fsx): L2 construct for FSx for Lustre (#6653)

Created the L2 construct for FSx for Lustre with the ability to create or import a file system and give access to it through Connections.

End Commit Message


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 52fbf17
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 930e5e4
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 361f9f2
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: b627f3f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

};

const fileSystem = new FsxLustreFileSystem(stack, 'FsxLustreFileSystem', {
lustreConfiguration,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in terms of example I would prefer inlining the config here rather than put them in separate variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


const fileSystem = new FsxLustreFileSystem(stack, 'FsxLustreFileSystem', {
lustreConfiguration,
storageCapacity,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A unitless number scares me. Can we make this storageCapacityGiB, storageCapacityGB, storageCapacityTiB, or whatever the case may be?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've inlined this along with the lustreConfiguration so there is no variable and line is storageCapacity: 1200. Would you still rather have a variable named storageCapacityGiB?

Did you mean rename the attribute itself to storageCapacityGiB? I could do that but it would be inconsistent with how CloudFormation names it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I meant rename the property. I'm okay taking the hit being not exactly the same. The prefix is the same (so people typing will autocomplete to the prop they expect) and I feel the additional clarity is worth it.

Not all AWS services are consistent with one another, but we can at least be consistent within the CDK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

packages/@aws-cdk/aws-fsx/README.md Show resolved Hide resolved
const vpc = new Vpc(stack, 'VPC');

// Create a security group to share across the file system and ec2 instance and open the FSx ports.
const securityGroup = new SecurityGroup(stack, 'FsxLustreSecurityGroup', {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You seem to be going against your own advice of the previous section here. Quote:

FSx has a fixed default port, so you don't need to specify the port.

Also, generally you wouldn't share an SG across instance and filesystem. As soon as you attach your instance to a load balancer, you would implicitly open port 8000 (or whatever) on the filesystem as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example is rather convoluted due to the way the EC2 instance needs to look the file system mount name up for itself.

The problem I was trying to avoid here was having the EC2 instance come up before the file system is up and the mount name is retrievable from the AWS CLI. To do that, I added the dependency on the file system from the instance. If I let both resources create their own security groups and tried to add this dependency, it would be circular, as the file system already attempts to add the instance's security group to its Connections.

Removing this dependency would simplify the example a fair bit, but there's always the chance the instance runs its UserData before the file system is initialized and the mount fails.

Copy link
Contributor

@rix0rrr rix0rrr Mar 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. The security groups are accidentally included in the dependency set when you do addDependency huh?

Let's replace it with:

// Depend on the filesystem resource, not any of the security groups and security group rules
instance.addDependency(fileSystem.node.defaultChild);

Which I agree is not great, but that's more of a symptom of the dependency system. We can address that in due time, but that's no reason to make this example make people do things we actually don't want them to do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This worked.

inst.node.addDependency(fs);

const mountPath = '/mnt/fsx';
const dnsName = `${fs.fileSystemId}.fsx.${stack.region}.amazonaws.com`;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could have been a property on the FsxFileSystem class, would come in very useful. Don't forget to use Aws.URL_SUFFIX for the amazonaws.com part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added this property.

packages/@aws-cdk/aws-fsx/lib/lustre-file-system.ts Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-fsx/lib/lustre-file-system.ts Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-fsx/lib/lustre-file-system.ts Outdated Show resolved Hide resolved
packages/@aws-cdk/aws-fsx/lib/lustre-file-system.ts Outdated Show resolved Hide resolved
@mergify mergify bot dismissed rix0rrr’s stale review March 17, 2020 20:28

Pull request has been modified.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: ab44cff
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@rix0rrr
Copy link
Contributor

rix0rrr commented Mar 18, 2020

I think you might have missed some comments of mine as they were hidden by GitHub by default.


const fileSystem = new FsxLustreFileSystem(stack, 'FsxLustreFileSystem', {
lustreConfiguration: { deploymentType: FsxLustreDeploymentType.SCRATCH_2 },
storageCapacity: 1200,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not quite familiar with the use case. Just to clarify, did we choose 1200 because this is the smallest offering?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there was no other reason for using 1200.

Comment on lines 96 to 97
securityGroup.addIngressRule(Peer.anyIpv4(), Port.tcp(988), 'Default FSx port ingress');
securityGroup.addEgressRule(Peer.anyIpv4(), Port.tcp(988), 'Default FSx port egress');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not sufficient. Please add port 1021-1023 to it as well. See updated guidelines in https://docs.aws.amazon.com/fsx/latest/LustreGuide/limit-access-security-groups.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

inst.userData.addCommands(
'set -eux',
'yum update -y',
'amazon-linux-extras install -y lustre2.10',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lustre version will change over time. Maybe worth thinking how you would want to version control you cdk to meet with the lustre version you need.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is example code in a README, not managed on behalf of the user.

Looks like the current iteration of this is punting the version management to the user.

Is there a way the latest version could be automatically retrieved, which we can recommend or make available to users instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried looking into it a little more but couldn't find any documentation about why 2.10 is being used rather than a more recent version. There's also no way to tie it to the latest version, unless amazon-linux-extras ins't used. This will at least always pick up the most recent release of 2.10.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got the following information about why AL2 is distributing Lustre 2.10:

FSx for Lustre currently serves only Lustre 2.10 and we only recommend Lustre 2.10 clients for our customers. The Lustre 2.10 kernel modules are baked directly into the AL2 kernel, which is why we currently only vend Lustre 2.10 in the amazon-linux-extras repo (only the Lustre 2.10 userspace tools are included in the Lustre2.10 package in the repo).
I think FSx for Lustre may be unique here. Using a Lustre 2.12 client, for example, may give you trouble when using FSx for Lustre backed by a 2.10 server.

readonly perUnitStorageThroughput?: number;

/**
* The preferred time to perform weekly maintenance, in the UTC time zone, 24 hour clock. For example: '2:20:30'.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first digit is the day of the week, say 2:20:30 means Tuesday at 20:30

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 9578da1
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: be8c0f0
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 925c59d
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: a0189b6
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 1dccc8a
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 619f340
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: b0fcbb0
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

Copy link
Contributor

@rix0rrr rix0rrr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions about the code in the example (I think it's doing a couple things it doesn't need to, or at least SHOULDN'T need to).

Other than that, as good as ready to ship!

(P.S: You don't have to keep merging from master by hand, we have a bot that will do that automatically when we approve)

subnetType: SubnetType.PUBLIC,
}
});
inst.connections.securityGroups.forEach((securityGroup => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what does this do? This seems very wrong.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to figure out a way to get all the ports allowed for a set of them that wasn't consecutive. The default port on Connections only accepts a single port or a range. FSx requires 988, 1021, 1022, 1023. To get around this I've changed it to open ports 988-1023. This isn't ideal, but I don't see a better solution.


// Add a dependency on the file system to the instance so that we can be sure the file system is instantiated before querying
// for its mount name.
if (fs.node.defaultChild) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary though? CloudFormation should be taking care of this automatically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't automatically create the dependency, but it does seem to create the EC2 instance after the file system is created. I'll remove this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use of a { Ref } or { Fn::GetAtt } will automatically create a dependency. You only need to use DependsOn if the dependency is implicit, because creating the resource has some side effect (such as Policys usually do).

/**
* Class for scheduling a weekly manitenance time.
*/
export class LustreMaintenanceTime {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@mergify mergify bot dismissed rix0rrr’s stale review April 22, 2020 14:58

Pull request has been modified.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: e6a2fb7
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: b45e364
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 0a906ee
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: c4588b2
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@horsmand
Copy link
Contributor Author

FSx has exposed the Lustre file system's mount name through CloudFormation, so I was able to add this as field to the LustreFileSystem Construct and then update my mounting example in the README to use this instead of having to update the AWS CLI version on the EC2 instance and then parse the mount name out of a CLI call.

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: d524edc
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented May 4, 2020

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject6AEA49D1-qxepHUsryhcu
  • Commit ID: 38b4719
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented May 4, 2020

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit 7363912 into aws:master May 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants