-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Managed Node Group launch template disk size fix #1229
Managed Node Group launch template disk size fix #1229
Conversation
… support setting disk size when a MNG uses a custom launch template
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for the PR!
Could you please add a test for this? Let me know if you need help with that; https://github.com/pulumi/pulumi-eks/blob/master/examples/examples_nodejs_test.go#L415 is a good example of how to create an additional test.
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
Hey @flostadler, I had a crack at making a test (I haven't run it yet), does this look like what you'd be looking for? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for your contribution! Mostly some small nits.
Once you've added the new tests to the test matrix I'll start an integration test run :)
With(integration.ProgramTestOptions{ | ||
Dir: path.Join(getCwd(t), "tests", "managed-ng-disk-size"), | ||
ExtraRuntimeValidation: func(t *testing.T, info integration.RuntimeValidationStackInfo) { | ||
utils.RunEKSSmokeTest(t, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was the issue before that the nodes where booted with the wrong disk size or that the creation of the managed node group failed?
If it's the former we'd need to add additional verification to ensure the disk size is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They booted with the wrong disk size, so yeah, would be good to check. Is there another test I can look at that does similar verification, so I can use that as a basis?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could create a similar func like this one here that's checking if the node is ready:
pulumi-eks/examples/utils/utils.go
Lines 371 to 387 in 4e11388
func IsNodeReady(t *testing.T, clientset *kubernetes.Clientset, node *corev1.Node) (bool, *corev1.Node) { | |
// Attempt to retrieve Node. | |
o, err := clientset.CoreV1().Nodes().Get(context.TODO(), node.Name, metav1.GetOptions{}) | |
if err != nil { | |
return false, nil | |
} | |
// Check the returned Node's conditions for readiness. | |
for _, condition := range o.Status.Conditions { | |
if condition.Type == corev1.NodeReady { | |
t.Logf("Checking if Node %q is Ready | Condition.Status: %q | Condition: %v\n", node.Name, condition.Status, condition) | |
return condition.Status == corev1.ConditionTrue, o | |
} | |
} | |
return false, o | |
} |
It could do something like:
var desiredSizeGB int64 = 50
nodeEphemeralStorage := o.Status.Capacity[corev1.ResourceEphemeralStorage]
return nodeEphemeralStorage.CmpInt64(desiredSizeGB * 1_000_000_000) >= 0
Alternatively we could check the EC2 instances directly using AWS APIs, but I feel like the kube only approach is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed a change to use the utils.ValidateNodes
function. It's my first time using Go and I don't have a way to actually run the code so I'm sure it's missing something but hopefully it's a start?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I'll kick off the tests :)
@@ -1836,9 +1839,19 @@ Content-Type: text/x-shellscript; charset="us-ascii" | |||
? { httpTokens: "required", httpPutResponseHopLimit: 2, httpEndpoint: "enabled" } | |||
: undefined; | |||
|
|||
const blockDeviceMappings = [ | |||
{ | |||
deviceName: "/dev/xvda", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mental note for myself: This works only for AL2 and AL2023 based AMIs, which is fine right now because we only support AL2. Will need to be taken into account when working on #1195
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
/run-acceptance-tests |
Please view the PR build - https://github.com/pulumi/pulumi-eks/actions/runs/10366894620 |
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
@JustASquid I had to merge in some changes from the master branch because they were required for the build to work. I'll re-run the tests now |
/run-acceptance-tests |
Please view the PR build - https://github.com/pulumi/pulumi-eks/actions/runs/10368370295 |
@JustASquid this seems to have broken managed node groups:
https://github.com/pulumi/pulumi-eks/actions/runs/10368370295/job/28702007770#step:21:519 I can have a look later today to see what exactly is going wrong here |
@JustASquid After debugging this, it seems that we need an additional check whether disksize is defined here: https://github.com/pulumi/pulumi-eks/pull/1229/files#diff-fa11b07a99c4eb8e85f987e265e143389230bafb7636b3b6bab10cf6158a120eR1847 Do you want to add this change or should I add it on top of this PR? |
…-size-fix' into feature/mng-launch-template-disk-size-fix
@flostadler I just pushed a fix. |
PR is now waiting for a maintainer to run the acceptance tests. This PR will only perform build and linting. |
/run-acceptance-tests |
Please view the PR build - https://github.com/pulumi/pulumi-eks/actions/runs/10472336069 |
@JustASquid The existing tests are happy now, but I found that newly added tests aren't being executed for community PRs. I ran the Thanks again for this contribution |
This fixes the handling of disk size changes when a custom launch template is created for the MNG. It is based on the Community PR #1229 by @JustASquid, I just added some test fixes on top. Fixes #1228 --------- Co-authored-by: Daniel West <daniel.west.279@gmail.com>
This fixes the handling of disk size changes when a custom launch template is created for the MNG. It is based on the Community PR #1229 by @JustASquid, I just added some test fixes on top. Fixes #1228 --------- Co-authored-by: Daniel West <daniel.west.279@gmail.com>
Proposed changes
Correctly handle disk size changes when a launch template is created by ManagedNodeGroup
Related issues (optional)
Resolves #1228