Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EBS Root Volume Termination #7865

Merged
merged 6 commits into from
Dec 6, 2019
Merged

Conversation

tioxy
Copy link
Contributor

@tioxy tioxy commented Nov 1, 2019

What type of PR is this?
/kind feature

What this PR does / why we need it:
Enables to change EBS DeleteOnTermination by specifiying separated flags for Root Volume and Additional Volumes inside InstanceGroupSpec.

Which issue(s) this PR fixes:
Fixes #6373

Special notes for your reviewer:

Further discussion about UX changed field names. Previous source code is here

We found better to split deleteOnTermination into different fields, as it enabled a deeper customization of InstanceGroups (originally was suggested only a single field in the structure).

Now we can finally close the 1.15 Milestone 😄
Also, a huge thanks to @joshbranham's work

Does this PR introduce a user-facing change?:

Add capacity to retain EBS volumes after instance termination from InstanceGroups

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

---
kind: InstanceGroup
spec:
  rootVolumeDeleteOnTermination: false

---
kind: InstanceGroup
spec:
  volumes:
  - device: /dev/xvdd
    deleteOnTermination: false

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 1, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @tioxy. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 1, 2019
@tioxy
Copy link
Contributor Author

tioxy commented Nov 1, 2019

/assign @geojaz

@rifelpet
Copy link
Member

rifelpet commented Nov 1, 2019

/ok-to-test

Can you add a test for this? Perhaps just adding the field in this manifest and updating the terraform and cloudformation files in the same directory to match.

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 1, 2019
@joshbranham
Copy link
Contributor

@tioxy have you tried building an instance group with and without this setting? I had some difficulty with an existing group and the kops update cluster diff showing the change to false, but maybe you handle that more elegantly than I had!

@tioxy
Copy link
Contributor Author

tioxy commented Nov 1, 2019

Hi @joshbranham 😀

One of the changes that probably affected your attempt was around awstasks/launchconfiguration.go with the method Find(). It will probably output mislead information if you do not change the actual Launch Configuration struct field.

    actual.RootVolumeTermination = b.Ebs.DeleteOnTermination

I made a step-by-step below running kops updates against different instance group volume settings as you suggested 👍

I cleaned the ouput as much as I could to make it more clear

These are the manifests that I used to create a cluster with different volume settings:

 kind: InstanceGroup
 metadata:
   name: master-us-east-1a
 spec:
   role: Master
   rootVolumeRetainOnTermination: true     # retain root
   volumes:
   - device: /dev/xvdd
     encrypted: true
     size: 20
     type: gp2
 kind: InstanceGroup
 metadata:
   name: nodes
 spec:
   role: Node
   volumes:
   - device: /dev/xvdd
     encrypted: true
     retainOnTermination: true     # retain xvdd
     size: 20
     type: gp2

Then I ran kops update cluster (1st output):

  LaunchConfiguration/master-us-east-1a.masters.kops.tioxy.com
  	BlockDeviceMappings 	[{"DeviceName":"/dev/xvdd","EbsDeleteOnTermination":true,"EbsEncrypted":true,"EbsVolumeIops":null,"EbsVolumeSize":20,"EbsVolumeType":"gp2","VirtualName":null}]
  	RootVolumeSize      	64
        // retain root
  	RootVolumeTermination	false
  	RootVolumeType      	gp2	

  LaunchConfiguration/nodes.kops.tioxy.com
        // retain xvdd
  	BlockDeviceMappings 	[{"DeviceName":"/dev/xvdd","EbsDeleteOnTermination":false,"EbsEncrypted":true,"EbsVolumeIops":null,"EbsVolumeSize":20,"EbsVolumeType":"gp2","VirtualName":null}]
  	RootVolumeSize      	128
  	RootVolumeTermination	true
  	RootVolumeType      	gp2      	

I edited the nodes InstanceGroup to add rootVolumeRetainOnTermination: true:

 kind: InstanceGroup
 metadata:
   name: nodes
 spec:
   role: Node
   rootVolumeRetainOnTermination: true     # retaining root now
   volumes:
   - device: /dev/xvdd
     encrypted: true
     retainOnTermination: true
     size: 20
     type: gp2

And this the result after kops update cluster (2nd output):

Will modify resources:
  LaunchConfiguration/nodes.kops.tioxy.com
        // retaining root now
  	RootVolumeTermination	 true -> false

Must specify --yes to apply changes

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 1, 2019
@tioxy
Copy link
Contributor Author

tioxy commented Nov 1, 2019

@rifelpet Even after kops delete cluster, the retained volumes will be available in the AWS account (which I think is the intended behavior).

Should we be worried about these dangling volumes during e2e tests (costs and unused resources)?

@tioxy
Copy link
Contributor Author

tioxy commented Nov 3, 2019

Fixed tests 👍
Needed to add new fields to in-legacy-v1alpha2.yaml as well

@tioxy
Copy link
Contributor Author

tioxy commented Nov 4, 2019

/test pull-kops-e2e-kubernetes-aws

@tioxy
Copy link
Contributor Author

tioxy commented Nov 17, 2019

/assign @rifelpet

@geojaz
Copy link
Member

geojaz commented Nov 25, 2019

@tioxy I think this looks good, and I'm trying to give it a test over here for you, but I noticed that I'm getting a failure on verify-goimports (which make ci runs). I'll take a look at the rest of it, but could you take a look at that one pls?

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 26, 2019
@tioxy tioxy force-pushed the ebs_termination branch 2 times, most recently from db532b8 to a7085a0 Compare November 26, 2019 00:37
@joshbranham
Copy link
Contributor

I would agree that using the following nomenclature makes the most sense: deleteOnTermination, with rootVolume in front etc as you have described, thanks for continuing to work on this as we deliberate :)

Support RootVolumeDeleteOnTermination and DeleteOnTermination fields
to enable a deeper customization
Change BDM behavior when finding Launch Configuration and generating
Cloudformation/Terraform
Implement separated logics for root volume and additional volumes
Copy link
Contributor

@joshbranham joshbranham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me, thanks @tioxy !

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 4, 2019
@rifelpet
Copy link
Member

rifelpet commented Dec 4, 2019

I agree it looks good! Can we mention the default value of true in the API field comments so that it shows up in the API documentation?

And I noticed this will only apply to LaunchConfigurations and not LaunchTemplates:

bm[aws.StringValue(img.RootDeviceName)] = &BlockDeviceMapping{
EbsDeleteOnTermination: aws.Bool(true),
EbsVolumeSize: t.RootVolumeSize,
EbsVolumeType: t.RootVolumeType,
EbsVolumeIops: t.RootVolumeIops,
}

I think we can address that in a followup PR but it might be worth mentioning in the API field comments that it currently only applies to LaunchConfigurations.

Add rootVolumeDeleteOnTermination and deleteOnTermination to test if
volumes are being retained properly in direct, terraform and
cloudformation
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 4, 2019
@tioxy
Copy link
Contributor Author

tioxy commented Dec 4, 2019

@rifelpet Pushed the default/launch configuration api docs with new codegen.
Added launch configuration docs in a separate line to be easily removed when Launch Template is implemented.

Thank you all for such great reviews 👍

@rifelpet
Copy link
Member

rifelpet commented Dec 4, 2019

One last concern came to mind: if volumes are not deleted by the ASG, does a kops delete cluster delete them? We should determine whether it does or not and clearly document that as well. I could see either behavior being desired, so it's important to make sure users are aware of whichever one it is.

Copy link
Member

@geojaz geojaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tioxy this looks great. thanks taking our feedback and running with it. also, you probably have some of the best github change comments I've seen! 🙌

I had some suggestions on the wording that i think make this more digestable for we humans. I didn't note every single change because it's basically all the same :)

k8s/crds/kops.k8s.io_instancegroups.yaml Outdated Show resolved Hide resolved
k8s/crds/kops.k8s.io_instancegroups.yaml Outdated Show resolved Hide resolved
pkg/apis/kops/instancegroup.go Outdated Show resolved Hide resolved
State default value for both deleteOnTermination and
rootVolumeDeleteOnTermination equals to true
Codegen for deleteOnTermination and rootVolumeDeleteOnTermination (crds
and apis)
@tioxy
Copy link
Contributor Author

tioxy commented Dec 6, 2019

@geojaz Updated API field docs with your suggestion, way more human readable and straightforward 😅

@rifelpet Volumes are retained after cluster deletion. Just added this to API field docs as well.

I will start to work in the Launch Template implementation in a few days 😉

Copy link
Member

@geojaz geojaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

nice work @tioxy !

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 6, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: geojaz, tioxy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 6, 2019
@k8s-ci-robot k8s-ci-robot merged commit b60fbff into kubernetes:master Dec 6, 2019
@tioxy tioxy deleted the ebs_termination branch December 6, 2019 13:01
@tioxy tioxy restored the ebs_termination branch December 6, 2019 13:01
k8s-ci-robot added a commit that referenced this pull request Jan 28, 2020
…origin-release-1.17

Automated cherry pick of #7865: feat(api): DeleteOnTermination fields for volumes
k8s-ci-robot added a commit that referenced this pull request Jan 29, 2020
…origin-release-1.16

Automated cherry pick of #7865: feat(api): DeleteOnTermination fields for volumes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants