cli: uninstall command #725

Merged
merged 13 commits into from
Sep 23, 2021

@ndhanushkodi ndhanushkodi commented Sep 15, 2021

Based off of a demo branch @sadjamz worked on here: https://github.com/hashicorp/consul-k8s-cli/tree/uninstall

Changes proposed in this PR:

  • consul-k8s uninstall uninstalls the Consul Helm installation with options to wipe all data

  • If you don't provide the release name/namespace, it will find the release whose metadata.name matches "consul", and then prompt to uninstall that release and any associated PVCs/Secrets.

  • You can provide the release name/namespace in a case where you have already helm uninstall'ed the consul helm installation, but still have PVCs, Secrets, etc to clean up.

  • Note that this PR is difficult to unit test; acceptance tests are coming as the next task after install/uninstall are complete. The Helm steps are difficult to mock out for reasons mentioned in the install PR:

How I've tested this PR:
Manual testing of the cases mentioned below, a few unit tests.

How I expect reviewers to test this PR:

  1. Code review

  2. Test it out:
    cd cli
    go build -o ./bin/consul-k8s
    ./bin/consul-k8s uninstall
    Please test out the command with the following flags and let me know if you have behaviour suggestions:
    a) no flags
    b) -name and -namespace set
    c) -auto-approve
    d) -wipe-all-data

Checklist:

  • Tests added
  • CHANGELOG entry added

    HashiCorp engineers only, community PRs should not add a changelog entry.
    Entries should use present tense (e.g. Add support for...)

@ndhanushkodi ndhanushkodi requested review from a team, ishustava, sadjamz and kschoche and removed request for a team September 15, 2021 04:18
cli/README.md Outdated
Comment on lines 100 to 121
-skip-wipe-data
Skip deleting all PVCs, Secrets, and Service Accounts associated with
Consul Helm installation without prompting for approval to delete. The
default is false.

-wipe-data
Delete all PVCs, Secrets, and Service Accounts associated with Consul
Helm installation without prompting for approval to delete. Only use
this when persisted data from previous installations is no longer
necessary. The default is false.
Contributor

If skip-wipe-data defaults to false (meaning wipe the data) and wipe-data defaults to false (meaning don't wipe the data), what is the actual default behaviour? I'm assuming we meant that skip-wipe-data defaults to true?

c.once.Do(c.init)

defer func() {
if err := c.Close(); err != nil {
Contributor

@kschoche kschoche Sep 15, 2021

Just out of curiosity, what does c.Close() do here? Would c.UI.Output() still be usable if c.Close() failed?

Contributor

👍 That's a good question. I'm curious about this as well. It looks like c.Close attempts to close the UI if the UI implements the io.Closer interface, so I'm wondering whether that means we won't be able to output to it anymore.

Contributor Author

I switched this to use the logger instead. In our case, the UI doesn't implement the closer interface, so Close() would be a no-op. But we could potentially choose to implement it in the future if we, say, "upgrade" our UI.
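For readers following this thread, here is a minimal sketch of the pattern being discussed: closing a UI only when it happens to implement io.Closer. The `UI` interface, `plainUI`, `closableUI`, and `closeUI` are hypothetical names made up for illustration and are not the types used in this PR; only `io.Closer` and the type assertion are standard Go.

```go
package main

import (
	"fmt"
	"io"
)

// UI is a hypothetical stand-in for the CLI's terminal UI interface.
type UI interface {
	Output(msg string)
}

// plainUI prints to stdout and does NOT implement io.Closer.
type plainUI struct{}

func (plainUI) Output(msg string) { fmt.Println(msg) }

// closableUI is a hypothetical "upgraded" UI that also implements io.Closer.
type closableUI struct{ closed bool }

func (u *closableUI) Output(msg string) { fmt.Println(msg) }

func (u *closableUI) Close() error {
	u.closed = true
	return nil
}

// closeUI calls Close only when ui implements io.Closer.
// It reports whether Close was actually invoked.
func closeUI(ui UI) (bool, error) {
	if closer, ok := ui.(io.Closer); ok {
		return true, closer.Close()
	}
	// Nothing to close: this is the no-op case described above.
	return false, nil
}

func main() {
	called, err := closeUI(plainUI{})
	fmt.Println("plain UI: Close called:", called, "err:", err)

	c := &closableUI{}
	called, err = closeUI(c)
	fmt.Println("closable UI: Close called:", called, "err:", err)
}
```

Under these assumptions, a UI without a Close method stays fully usable after the deferred call, which matches the "no-op" explanation above.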

}
}
if len(secretNames) > 0 {
c.UI.Output("Consul secrets deleted.", terminal.WithSuccessStyle())
Contributor

This is a super nit comment, but if there are 3 secrets, 2 get deleted, and the third exits the for loop because you can't delete it for some reason, then neither of these Outputs gets processed properly. I suppose it is okay since it "should work" in the happy path.

Contributor

Maybe the correct thing to do is to output the error via c.UI.Output() instead of returning it, and then continue to process the other secrets?

I guess this logic applies to the other similar functions, if you choose.

Contributor

Yeah +1 to Kyle's question. I left a similar question in the deletePVCs function too.

Contributor Author

I think it's OK because, in an error case, it will exit the function at the point of the error without printing either of these outputs, and just print the first error it ran into. Then, when a user re-runs uninstall after fixing their issue, it should succeed and delete the secrets properly.

Contributor

@kschoche kschoche left a comment

This is really great! I left some mostly minor comments, but assuming you get time to look at them, I think it's good to go!
I did not get a chance to test run the CLI, unfortunately; I will get to that soon :)

Contributor

@ishustava ishustava left a comment

This is impressive work!!! I'm so excited about this ❤️ This is a lot of work, and I really appreciate how easy this was to read.

I've only reviewed the command (and not tests) and had some comments that I wanted to clarify before proceeding with the rest of the review. I've also left some minor edits.

I've also run some manual tests and already discussed the issues I ran into with Nitya, but wanted to document them here for posterity:

  1. I think we need to change the helm list functionality to list any pending/uninstalled releases so that if your installation has failed, the uninstall will still find the release.
  2. We need to make sure that the Kubernetes client is initialized with the correct namespace. With the 0.33.0 release currently, uninstall fails because the tls-init-cleanup job doesn't have Namespace set in its template. We've fixed it now in the latest Helm release, but we should still make sure that the client is initialized correctly.
  3. It would be nice to default release name and namespace to consul for a case when I forgot to wipe-data in a previous uninstall. That way rerunning uninstall would still just work.

There is also another issue that we haven't discussed that I think would be nice to fix but doesn't have to be part of this PR. In my testing I kept running into errors saying that tls-init-cleanup job or service account etc already exist. It'd be nice if the uninstall command cleaned those up before calling helm delete.

}

// deleteServiceAccounts deletes service accounts that have foundReleaseName in their name.
func (c *Command) deleteServiceAccounts(foundReleaseName, foundReleaseNamespace string) error {
Contributor

I'm curious why we delete only service accounts and not roles, rolebindings, clusterroles, and clusterrolebindings?

Contributor Author

@ndhanushkodi ndhanushkodi Sep 22, 2021

Actually, in testing I am noticing that the consul-tls-init role, rolebinding, and service account stick around unless they are manually deleted when installing with -preset=secure. I went ahead and added logic to delete roles/rolebindings/service accounts. I plan to add logic in the future for jobs, clusterroles, and clusterrolebindings if we see cases where those aren't being deleted!
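As a rough illustration of the name-based cleanup described here (resources are selected when the release name appears in their name, per the deleteServiceAccounts comment above), the selection step might look like the sketch below. The `resource` struct and `matchRelease` helper are hypothetical and simplified; the real code would list objects through the Kubernetes client rather than from an in-memory slice.

```go
package main

import (
	"fmt"
	"strings"
)

// resource is a hypothetical, simplified view of a namespaced Kubernetes object.
type resource struct {
	Kind string
	Name string
}

// matchRelease returns the resources whose name contains releaseName.
func matchRelease(all []resource, releaseName string) []resource {
	var matched []resource
	for _, r := range all {
		if strings.Contains(r.Name, releaseName) {
			matched = append(matched, r)
		}
	}
	return matched
}

func main() {
	// Example inventory after an install with -preset=secure (names assumed).
	all := []resource{
		{Kind: "ServiceAccount", Name: "consul-tls-init"},
		{Kind: "Role", Name: "consul-tls-init"},
		{Kind: "RoleBinding", Name: "consul-tls-init"},
		{Kind: "ServiceAccount", Name: "default"},
	}
	for _, r := range matchRelease(all, "consul") {
		fmt.Printf("would delete %s/%s\n", r.Kind, r.Name)
	}
}
```

Substring matching is simple, but it would also catch unrelated objects that merely contain the release name, so a stricter rule (for example, a label selector) may be worth considering; that is a design observation, not something this PR states.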

@ndhanushkodi ndhanushkodi added area/acls Related to ACLs and removed area/acls Related to ACLs labels Sep 21, 2021
@ndhanushkodi ndhanushkodi added the area/acls Related to ACLs label Sep 21, 2021
@ndhanushkodi ndhanushkodi force-pushed the cli-uninstall branch 3 times, most recently from a5a0c22 to eb31fec Compare September 22, 2021 07:10
Contributor

@ishustava ishustava left a comment

Looks good ❤️ Thank you so much for considering the feedback so thoughtfully 🙏

I've added some minor comments, but they are not blocking.

rolebindings, err := c.kubernetes.RbacV1().RoleBindings("default").List(context.TODO(), metav1.ListOptions{})
require.NoError(t, err)
require.Len(t, rolebindings.Items, 0)
}
Contributor

I think we can also add tests for the findExistingInstallation function.

Contributor Author

This one is kind of tricky for similar reasons to

// Note that this function is tricky to test because mocking out the action.Configuration struct requires a
.

I think I can open a separate PR this week to refactor the Helm logic out of uninstall.findExistingInstallation and install.checkForPreviousInstallations, and unit test the part that's unrelated to the Helm logic. The reason to pull it into a separate PR is that I'll be focusing on testing in the acceptance tests PR, so we can add this additional testing there after a quick refactor.

@ndhanushkodi ndhanushkodi merged commit 9a974c1 into main Sep 23, 2021
@ndhanushkodi ndhanushkodi deleted the cli-uninstall branch September 23, 2021 17:55
lawliet89 pushed a commit to lawliet89/consul-k8s that referenced this pull request Oct 6, 2021
The port-forward sometimes fails randomly
Labels
area/acls Related to ACLs