[BUG] can't upgrade engine if a volume was created in Longhorn v1.0 and the volume.spec.dataLocality is "" #4412

Closed
derekbit opened this issue Aug 12, 2022 · 2 comments
Assignees
Labels
area/webook Kubernetes validation and mutating webhooks backport/1.3.2 kind/bug reproduce/always 100% reproducible severity/3 Function working but has a major issue w/ workaround
Milestone

Comments

@derekbit
Member

derekbit commented Aug 12, 2022

Describe the bug

The engine can't be upgraded if a volume was created in Longhorn v1.0 and its volume.spec.dataLocality is "".

Longhorn v1.0 does not have the dataLocality parameter. The upgrade path (v1.0->v1.1 and v1.1->v1.2) does not handle the value, so it always remains an empty string. As a result, users might hit this issue when upgrading to v1.3.x if the volume was created in v1.0.
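The missing conversion described above can be sketched as a small normalization step. This is only an illustration, not the actual longhorn-manager code: the function name `normalize_data_locality` and the dict-based volume representation are hypothetical, and the only behavior taken from this issue is that an empty `dataLocality` should become `disabled`.

```python
from copy import deepcopy

# Target value taken from the workaround in this issue.
DATA_LOCALITY_DISABLED = "disabled"


def normalize_data_locality(volume: dict) -> dict:
    """Return a copy of the volume whose empty or missing
    spec.dataLocality is replaced with "disabled".

    Hypothetical helper: the real fix lives in longhorn-manager
    and may be structured differently.
    """
    fixed = deepcopy(volume)
    spec = fixed.setdefault("spec", {})
    # Volumes created in v1.0 carry "" here because the field did not exist yet.
    if not spec.get("dataLocality"):
        spec["dataLocality"] = DATA_LOCALITY_DISABLED
    return fixed


# A volume object as it might look after being carried over from v1.0.
legacy_volume = {"metadata": {"name": "vol-from-v1-0"}, "spec": {"dataLocality": ""}}
print(normalize_data_locality(legacy_volume)["spec"]["dataLocality"])
```

Values that are already set are left untouched, so running such a step on every volume during an upgrade would be idempotent.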

To Reproduce

Steps to reproduce the behavior:

  1. Deploy Longhorn v1.0
  2. Create a volume
  3. Upgrade Longhorn to v1.2.x and then upgrade the volume's engine
  4. Upgrade Longhorn to v1.3.x. You might then hit the following error:
Error: UPGRADE FAILED: cannot patch "engineimages.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "engineimages.longhorn.io" is invalid: spec.conversion.strategy: Invalid value: "Webhook": must be None if spec.preserveUnknownFields is true && cannot patch "nodes.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "nodes.longhorn.io" is invalid: spec.conversion.strategy: Invalid value: "Webhook": must be None if spec.preserveUnknownFields is true && cannot patch "volumes.longhorn.io" with kind CustomResourceDefinition: CustomResourceDefinition.apiextensions.k8s.io "volumes.longhorn.io" is invalid: spec.conversion.strategy: Invalid value: "Webhook": must be None if spec.preserveUnknownFields is true

Workaround: #4340 (comment)

  5. After upgrading to v1.3.x, upgrade the volume's engine. The issue occurs during the engine upgrade.

Expected behavior

The engine can be upgraded successfully.

Log or Support bundle

If applicable, add the Longhorn managers' log or support bundle when the issue happens.
You can generate a Support Bundle using the link at the footer of the Longhorn UI.

Environment

  • Longhorn version:
  • Installation method (e.g. Rancher Catalog App/Helm/Kubectl):
  • Kubernetes distro (e.g. RKE/K3s/EKS/OpenShift) and version:
    • Number of management node in the cluster:
    • Number of worker node in the cluster:
  • Node config
    • OS type and version:
    • CPU per node:
    • Memory per node:
    • Disk type(e.g. SSD/NVMe):
    • Network bandwidth between the nodes:
  • Underlying Infrastructure (e.g. on AWS/GCE, EKS/GKE, VMWare/KVM, Baremetal):
  • Number of Longhorn volumes in the cluster:

Additional context

#4387

Workaround

Manually change volume.spec.dataLocality from "" to disabled.
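One possible way to apply this workaround is a JSON merge patch against the Volume custom resource. The sketch below only builds the `kubectl patch` command line rather than running it. The namespace `longhorn-system`, the volume name `pvc-example`, and the helper name `build_patch_command` are assumptions for illustration, and whether the patch is accepted on a given Longhorn version has not been verified here.

```python
import json
import shlex


def build_patch_command(volume_name: str, namespace: str = "longhorn-system") -> str:
    """Build a kubectl command that sets spec.dataLocality to "disabled".

    Hypothetical helper: the namespace default is an assumption and
    should match where Longhorn is installed in the cluster.
    """
    patch = {"spec": {"dataLocality": "disabled"}}
    args = [
        "kubectl", "-n", namespace,
        "patch", "volumes.longhorn.io", volume_name,
        "--type", "merge",          # JSON merge patch, supported for custom resources
        "-p", json.dumps(patch),
    ]
    # shlex.join quotes the JSON payload so the line can be pasted into a shell.
    return shlex.join(args)


print(build_patch_command("pvc-example"))
```

Checking the current value first, for example with `kubectl -n longhorn-system get volumes.longhorn.io pvc-example -o yaml`, helps confirm that the field is really empty before patching.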

@derekbit derekbit self-assigned this Aug 12, 2022
@longhorn-io-github-bot

longhorn-io-github-bot commented Aug 12, 2022

Pre Ready-For-Testing Checklist

  • Where is the reproduce steps/test steps documented?
    The reproduce steps/test steps are at: [BUG] can't upgrade engine if a volume was created in Longhorn v1.0 and the volume.spec.dataLocality is "" #4412 (comment)

  • Is there a workaround for the issue? If so, where is it documented?
    The workaround is at: [BUG] can't upgrade engine if a volume was created in Longhorn v1.0 and the volume.spec.dataLocality is "" #4412 (comment)

  • Does the PR include the explanation for the fix or the feature? [BUG] can't upgrade engine if a volume was created in Longhorn v1.0 and the volume.spec.dataLocality is "" #4412 (comment)

  • [ ] Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
    The PR for the YAML change is at:
    The PR for the chart change is at:

  • Has the backend code been merged (Manager, Engine, Instance Manager, BackupStore, etc.) (including backport-needed/*)?
    The PR is at Convert volume.spec.dataLocality from empty string to disabled longhorn-manager#1475

  • Which areas/issues this PR might have potential impacts on?
    Area: volume, conversion webhook, engine upgrade
    Issues

  • [ ] If labeled: require/LEP Has the Longhorn Enhancement Proposal PR submitted?
    The LEP PR is at

  • [ ] If labeled: area/ui Has the UI issue filed or ready to be merged (including backport-needed/*)?
    The UI issue/PR is at

  • ~~[ ] If labeled: require/doc Has the necessary document PR submitted or merged (including backport-needed/*)?
    The documentation issue/PR is at

  • [ ] If labeled: require/automation-e2e Has the end-to-end test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue (including backport-needed/*)
    The automation skeleton PR is at
    The automation test case PR is at
    The issue of automation test case implementation is at (please create by the template)

  • [ ] If labeled: require/automation-engine Has the engine integration test been merged (including backport-needed/*)?
    The engine automation PR is at

  • [ ] If labeled: require/manual-test-plan Has the manual test plan been documented?
    The updated manual test plan is at

  • [ ] If the fix introduces the code for backward compatibility Has a separate issue been filed with the label release/obsolete-compatibility?
    The compatibility issue is filed at

@chriscchien
Contributor

Verified on longhorn manager master-head d7d200
Result pass

Test steps

  1. The issue can be reproduced on v1.3.1.
  2. The volume's engine image can be upgraded to master-head when the volume was created in v1.0.0 and Longhorn was upgraded along the path v1.0.0 -> v1.2.x (with engine upgrade) -> master-head.
