Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Cannot re-run helm uninstallation if the first one failed and cannot fetch logs of failed uninstallation pod #4711

Closed
PhanLe1010 opened this issue Oct 13, 2022 · 10 comments
Assignees
Labels
area/install-uninstall-upgrade Install, Uninstall or Upgrade related kind/bug require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos
Milestone

Comments

@PhanLe1010
Copy link
Contributor

Describe the bug

Cannot re-run helm uninstallation if the first one failed and cannot fetch logs of failed uninstallation pod

To Reproduce

Steps to reproduce the behavior:

  1. Clone the chart repo and remove the --force flag here as a way to artificially fail the uninstalltion job
  2. Install the Helm chart
  3. Create some volumes and attach them
  4. Do helm uninstall
  5. The uninstallation job will fail because there are attached volumes and --force flag is not set
  6. Now detach the volumes
  7. Re-run helm uninstalltion again. This command always fail since the previous uninstallation job already exist
  8. Also we cannot check the log of the uninstallation pod because it is already cleanup so we are kind of lost not knowing what was the failed reason.

Expected behavior

Users can re-run helm uninstallation and can see logs of failed uninstallation pod

Environment

  • Longhorn version: <= 1.3.2
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl): Helm/Rancher UI
@longhorn-io-github-bot
Copy link

longhorn-io-github-bot commented Oct 13, 2022

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at: [BUG] Cannot re-run helm uninstallation if the first one failed and cannot fetch logs of failed uninstallation pod #4711 (comment)

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at: Non for getting the logs. But we can delete the failed job and retry helm uninstall

  • Does the PR include the explanation for the fix or the feature?

  • Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at: Fix bug: cannot re-run helm uninstall if the first one failed and cannot fetch logs of failed uninstalltion job #4712
    The PR for the chart change is at:

  • Have the backend code been merged (Manager, Engine, Instance Manager, BackupStore etc) (including backport-needed/*)?
    The PR is at Fix bug: cannot re-run helm uninstall if the first one failed and cannot fetch logs of failed uninstalltion job #4712

  • Which areas/issues this PR might have potential impacts on?
    Area Uninstallation/Upgrade
    Issues

  • If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

  • If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@PhanLe1010
Copy link
Contributor Author

Test steps:

  1. Clone the chart repo and remove the --force flag here as a way to artificially fail the uninstallation job
  2. Install the Helm chart
  3. Create some volumes and attach them
  4. Do helm uninstall
  5. The uninstallation job will fail because there are attached volumes and --force flag is not set
  6. Verify that we can check the log of the failed uninstallation pod inside longhorn-system namespace
  7. Now detach the volumes
  8. Re-run helm uninstall again. Very that the uninstallation succeeded

@roger-ryao
Copy link

Verified on master-head 20221027

  • longhorn master-head (777bc5b)
  • longhorn-manager master-head (76468fa)

The test steps

Test steps:

  1. Clone the chart repo git clone https://github.com/longhorn/longhorn.git
  2. remove the --force flag here as a way to artificially fail the uninstallation job
    ( longhorn/chart/templates/uninstall-job.yaml )
  3. Install the Helm chart
cd longhorn/chart
helm install longhorn . --namespace longhorn-system --create-namespace
  1. Create one volume and attach to node-1
  2. Do helm uninstall helm uninstall longhorn -n longhorn-system
  3. The uninstallation job will fail because there are attached volumes and --force flag is not set
    Screenshot_20221027_105232
  4. Verify that we can check the log of the failed uninstallation pod inside longhorn-system namespace
    kubectl -n longhorn-system logs longhorn-uninstall-z7hf5
    Screenshot_20221027_105538
  5. Now detach the volumes
  6. set deleting-confirmation-flag to true
    kubectl -n longhorn-system edit settings.longhorn.io deleting-confirmation-flag
  7. Re-run helm uninstall again. Very that the uninstallation succeeded
    helm uninstall longhorn -n longhorn-system

Result Passed

@roger-ryao
Copy link

roger-ryao commented Oct 27, 2022

Hi @PhanLe1010 :
After I verified this issue, we need to confirm one thing in our design.

The test steps
1. Clone the chart repo git clone https://github.com/longhorn/longhorn.git
2. Install the Helm chart

cd longhorn/chart
helm install longhorn . --namespace longhorn-system --create-namespace
  1. Do helm uninstall helm uninstall longhorn -n longhorn-system
  2. The uninstallation job will fail
  3. set deleting-confirmation-flag to true
    kubectl -n longhorn-system edit settings.longhorn.io deleting-confirmation-flag
  4. Re-run helm uninstall again. Very that the uninstallation succeeded
    helm uninstall longhorn -n longhorn-system

Result

  1. I didn't create any volume on the Longhorn, the uninstallation job also failed
  2. Before uninstalling Longhorn, we must set deleting-confirmation-flag to true

Is it the expected behavior on the master-head?

cc @longhorn/qa

@PhanLe1010
Copy link
Contributor Author

I didn't create any volume on the Longhorn, the uninstallation job also failed
Before uninstalling Longhorn, we must set deleting-confirmation-flag to true

Yes, it is expected that Before uninstalling Longhorn, we must set deleting-confirmation-flag to true Thank you for re-verifying

@yangchiu
Copy link
Member

@PhanLe1010 Should https://github.com/longhorn/longhorn/blob/master/uninstall/uninstall.yaml be modified like #4712 too? Otherwise if using kubectl to uninstall longhorn, it would be unable to fetch logs of failed uninstallation pod too?

@PhanLe1010
Copy link
Contributor Author

Great catch. I will update it

@PhanLe1010
Copy link
Contributor Author

Pushed PR #4810

@innobead innobead reopened this Nov 2, 2022
@innobead innobead added area/install-uninstall-upgrade Install, Uninstall or Upgrade related require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos labels Nov 2, 2022
@innobead
Copy link
Member

innobead commented Nov 2, 2022

@yangchiu Please help verify #4711 (comment), then close if it works correctly. Thanks.

@yangchiu
Copy link
Member

yangchiu commented Nov 2, 2022

Verified passed on master-head (longhorn cc043c4) by the following test steps:

(1) install longhorn master-head
(2) try to uninstall longhorn by kubectl create -f https://raw.githubusercontent.com/longhorn/longhorn/master/uninstall/uninstall.yaml
(3) view logs by kubectl logs job/longhorn-uninstall -n longhorn-system, the expected error message can be seen, indicates the Deleting Confirmation Flag flag has to be set to true.
(4) set Deleting Confirmation Flag flag to true, and delete/create uninstallation job again, longhorn can be uninstalled without problems.

@yangchiu yangchiu closed this as completed Nov 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/install-uninstall-upgrade Install, Uninstall or Upgrade related kind/bug require/chart Require updating (chart) manifests in longhorn, longhorn-manager, charts repos
Projects
None yet
Development

No branches or pull requests

5 participants