
[BUG] The harvester_config of terraform-provider-rancher2 in 1.25.0 and 3.0.0 are incompatible #3997

Closed
futuretea opened this issue May 30, 2023 · 3 comments

@futuretea
Contributor

Describe the bug

The harvester_config schema of terraform-provider-rancher2 is incompatible between 1.25.0 and 3.0.0

To Reproduce

  1. Create a guest cluster using terraform rancher2 provider 1.25.0
  2. Upgrade rancher2 provider from 1.25.0 to 3.0.0
  3. terraform plan
❯ terraform plan
╷
│ Error: Missing required argument
│
│   on main.tf line 37, in resource "rancher2_machine_config_v2" "foo-harvester-v2":
│   37:   harvester_config {
│
│ The argument "disk_info" is required, but no definition was found.
╵
╷
│ Error: Missing required argument
│
│   on main.tf line 37, in resource "rancher2_machine_config_v2" "foo-harvester-v2":
│   37:   harvester_config {
│
│ The argument "network_info" is required, but no definition was found.

If the user wants to apply, they have to migrate harvester_config to the new format; after modifying to the new format and applying, the Harvester guest cluster will be re-provisioned by Rancher because the machine_config fields change.
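
For illustration, a minimal sketch of harvester_config migrated to the new format (placeholders and values mirror the snippets later in this issue; unrelated fields omitted):

resource "rancher2_machine_config_v2" "foo-harvester-v2" {
  generate_name = "foo-harvester-v2"
  harvester_config {
    vm_namespace = "default"
    cpu_count = "2"
    memory_size = "4"
    # disk_size, disk_bus and image_name are replaced by disk_info;
    # network_model and network_name are replaced by network_info
    disk_info = <<EOF
{"disks":[{"imageName":"<IMAGE_NAMESPACE_UID>","size":40,"bootOrder":1}]}
EOF
    network_info = <<EOF
{"interfaces": [{"networkName": "<VMNETWORK_NAMESPACE_UID>"}]}
EOF
  }
}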

Actual Result

The user needs to change the field format of harvester_config

Expected Result

The new fields disk_info and network_info are not forcibly required.
Users can still use the old fields (see the sketch below).
terraform apply should not cause the cluster to be re-provisioned.
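
For comparison, a minimal sketch of the old-format harvester_config (taken from the full snippet in footnote 2) that, per the expectation above, should continue to plan cleanly after the fix:

resource "rancher2_machine_config_v2" "foo-harvester-v2" {
  generate_name = "foo-harvester-v2"
  harvester_config {
    vm_namespace = "default"
    cpu_count = "2"
    memory_size = "4"
    disk_size = "40"
    disk_bus = "virtio"
    network_model = "virtio"
    image_name = "<IMAGE_NAMESPACE_UID>"
    network_name = "<VMNETWORK_NAMESPACE_UID>"
  }
}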

Screenshots

Additional context

@futuretea futuretea added kind/bug Issues that are defects reported by users or that we know have reached a real release reproduce/needed Reminder to add a reproduce label and to remove this one severity/needed Reminder to add a severity label and to remove this one area/terraform-provider-rancher2 Terraform Provider for Rancher v2 and Harvester node driver reproduce/always Reproducible 100% of the time severity/1 Function broken (a critical incident with very high impact) and removed reproduce/needed Reminder to add a reproduce label and to remove this one severity/needed Reminder to add a severity label and to remove this one labels May 30, 2023
@futuretea futuretea added this to the v1.2.0 milestone May 30, 2023
@futuretea futuretea added the not-require/test-plan Skip to create a e2e automation test issue label May 30, 2023
@futuretea futuretea self-assigned this May 30, 2023
@harvesterhci-io-github-bot commented May 30, 2023

Pre Ready-For-Testing Checklist

* [ ] If labeled: require/HEP Has the Harvester Enhancement Proposal PR been submitted?
The HEP PR is at:

* [ ] Is there a workaround for the issue? If so, where is it documented?
The workaround is at:

* [ ] Has the backend code been merged (harvester, harvester-installer, etc.), including backport-needed/*?
The PR is at:

* [ ] Does the PR include deployment change (YAML/Chart)? If so, where are the PRs for both YAML file and Chart?
The PR for the YAML change is at:
The PR for the chart change is at:

* [ ] If labeled: area/ui Has the UI issue been filed or is it ready to be merged?
The UI issue/PR is at:

* [ ] If labeled: require/doc, require/knowledge-base Has the necessary document PR submitted or merged?
The documentation/KB PR is at:

* [ ] If NOT labeled: not-require/test-plan Has the e2e test plan been merged? Have QAs agreed on the automation test case? If only test case skeleton w/o implementation, have you created an implementation issue?
- The automation skeleton PR is at:
- The automation test case PR is at:

* [ ] If the fix introduces code for backward compatibility, has a separate issue been filed with the label release/obsolete-compatibility?
The compatibility issue is filed at:

@rebeccazzzz rebeccazzzz added the priority/0 Must be fixed in this release label May 31, 2023
@irishgordo

Perhaps some of the elements from the Test Plan in 1009 could help in setting this up?

@lanfon72
Member

Verified this bug has been fixed.

Test Information

  • Environment: qemu/KVM 2 nodes
  • Harvester Version: v1.2-a5e74302-head
  • ui-source Option: Auto
  • Rancher Version:
    • v2.7-5334be362e291576c8f32b4432762399729832e6-head (sha256:b5af65652fbdc20e130947bb1759d5859c6f80af979393b9bd07199c16eac43d)
    • v2.6.13-rc1-linux-amd64 (unable to import Harvester)
    • v2.6.13-head (sha256:4253a0f9945918f61df3e6fa6a01098c5fd66e6c9dff8bbc7ea3f0f29b800d48) (see footnote 1)

Verify Steps

  1. Install Harvester with any number of nodes

  2. Create an image for VM creation

  3. Create a VM network

  4. Log in to Rancher and import Harvester into Rancher with the name harv

  5. Use terraform to provision an RKE2 cluster (see footnote 2)

    • Save the code snippet as rke2.tf
    • Update the relevant fields containing <> placeholders in the code snippet
    • terraform apply will fail the first time (see footnote 3); this is expected, and we will need to apply twice
  6. RKE2 cluster foo-harvester should be provisioned successfully

  7. Update terraform-provider-rancher2 to use version > v3.0.0

    • When a newer version is released, we can simply modify rke2.tf to update the version:
    terraform {
      required_providers {
        rancher2 = {
          source = "rancher/rancher2"
          version = "> 3.0.0"
        }
      }
    }
    • Or build from the PR fixing the bug (see footnote 4), then update the provider version:
    terraform {
      required_providers {
        rancher2 = {
          source = "terraform.local/local/rancher2"
          version = "0.0.0-dev"
        }
      }
    }
  8. Execute terraform init -upgrade to use the new provider. We might encounter an error at this point; applying again will resolve it.

  9. Execute terraform plan; there should be several warnings about deprecated fields: image_size, image_name, disk_bus, network_name, network_model

  10. Execute terraform apply -auto-approve; the RKE2 cluster should not be updated

  11. Update rke2.tf to use the new format, then apply it with terraform apply -auto-approve

    resource "rancher2_machine_config_v2" "foo-harvester-v2" {
      generate_name = "foo-harvester-v2"
      harvester_config {
        vm_namespace = "default"
        cpu_count = "2"
        memory_size = "4"
    #    disk_size = "40"
    #    disk_bus = "virtio"
    #    network_model = "virtio"
    #    image_name = "<IMAGE_NAMESPACE_UID>"
    #    network_name = "<VMNETWORK_NAMESPACE_UID>"
        disk_info = <<EOF
    {"disks":[{"imageName":"<IMAGE_NAMESPACE_UID>","size":40,"bootOrder":1}]}
    EOF
        network_info = <<EOF
    {"interfaces": [{"networkName": "<VMNETWORK_NAMESPACE_UID>"}]}
    EOF
      # ... truncated
      }
    }
  12. RKE2 cluster should be rebuilt successfully (old VMs will be replaced with newly initialized ones)

  13. Update rke2.tf to use both new and old fields, then run terraform apply (a sketch of such a conflicting config is shown after these steps)

  14. An error message should be displayed explaining that the fields conflict with each other
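
A minimal sketch for step 13, assuming the same placeholders as above: an old-style field is kept alongside its new-style replacement, which per step 14 the provider is expected to reject:

    resource "rancher2_machine_config_v2" "foo-harvester-v2" {
      generate_name = "foo-harvester-v2"
      harvester_config {
        vm_namespace = "default"
        cpu_count = "2"
        memory_size = "4"
        # old-style fields ...
        image_name = "<IMAGE_NAMESPACE_UID>"
        disk_size = "40"
        # ... combined with their new-style replacements
        disk_info = <<EOF
    {"disks":[{"imageName":"<IMAGE_NAMESPACE_UID>","size":40,"bootOrder":1}]}
    EOF
        network_info = <<EOF
    {"interfaces": [{"networkName": "<VMNETWORK_NAMESPACE_UID>"}]}
    EOF
      }
    }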

Footnotes

  1. v2.6-head is NOT the latest version for v2.6

  2. Code snippet for initial RKE2 cluster

    terraform {
      required_providers {
        rancher2 = {
          source = "rancher/rancher2"
          version = "1.25.0"
        }
      }
    }
    
    
    provider "rancher2" {
      api_url    = "<API_URL>"
      access_key = "<TOKEN>"
      secret_key = "<SECRET>"
      insecure = true
    }
    
    
    data "rancher2_cluster_v2" "foo-harvester" {
      name = "harv"
    }
    
    # Create a new Cloud Credential for an imported Harvester cluster
    resource "rancher2_cloud_credential" "foo-harvester" {
      name = "foo-harvester"
      harvester_credential_config {
        cluster_id = data.rancher2_cluster_v2.foo-harvester.cluster_v1_id
        cluster_type = "imported"
        kubeconfig_content = data.rancher2_cluster_v2.foo-harvester.kube_config
      }
    }
    
    # Create a new rancher2 machine config v2 using harvester node_driver
    resource "rancher2_machine_config_v2" "foo-harvester-v2" {
      generate_name = "foo-harvester-v2"
      harvester_config {
        vm_namespace = "default"
        cpu_count = "2"
        memory_size = "4"
        disk_size = "40"
        disk_bus = "virtio"
        network_model = "virtio"
        image_name = "<IMAGE_NAMESPACE_UID>"
        network_name = "<VMNETWORK_NAMESPACE_UID>"
        ssh_user = "<SSH_USER>"
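        # user_data is base64-encoded cloud-config that installs qemu-guest-agent and iptables and enables qemu-guest-agent.service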
        user_data = "I2Nsb3VkLWNvbmZpZwpwYWNrYWdlX3VwZGF0ZTogdHJ1ZQpwYWNrYWdlczoKICAtIHFlbXUtZ3Vlc3QtYWdlbnQKICAtIGlwdGFibGVzCnJ1bmNtZDoKICAtIC0gc3lzdGVtY3RsCiAgICAtIGVuYWJsZQogICAgLSAnLS1ub3cnCiAgICAtIHFlbXUtZ3Vlc3QtYWdlbnQuc2VydmljZQo="
      }
    }
    
    resource "rancher2_cluster_v2" "foo-harvester-v2" {
      name = "foo-harvester-v2"
      kubernetes_version = "v1.24.8+rke2r1"
      rke_config {
        machine_pools {
          name = "pool1"
          cloud_credential_secret_name = rancher2_cloud_credential.foo-harvester.id
          control_plane_role = true
          etcd_role = true
          worker_role = true
          quantity = 1
          machine_config {
            kind = rancher2_machine_config_v2.foo-harvester-v2.kind
            name = rancher2_machine_config_v2.foo-harvester-v2.name
          }
        }
        machine_selector_config {
          config = {
            cloud-provider-name = ""
          }
        }
        machine_global_config = <<EOF
    cni: "calico"
    disable-kube-proxy: false
    etcd-expose-metrics: false
    EOF
        upgrade_strategy {
          control_plane_concurrency = "10%"
          worker_concurrency = "10%"
        }
        etcd {
          snapshot_schedule_cron = "0 */5 * * *"
          snapshot_retention = 5
        }
        chart_values = ""
      }
    }
    
  3. terraform apply might need to be run twice

  4. Code snippet to build version 0.0.0-dev from the fix PR "Support old version HarvesterConfig" (rancher/terraform-provider-rancher2#1132)

    • prerequisite: golang version > 1.13; we can simply use gvm to install a newer version
    git clone https://github.com/rancher/terraform-provider-rancher2
    cd terraform-provider-rancher2
    make
    

    Then move the custom provider into Terraform's local plugin search path:

    PROVIDER="rancher2"
    VERSION="0.0.0-dev"
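    # detect the platform as <os>_<arch> (e.g. linux_amd64) to match Terraform's local plugin directory layout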
    OS_PLATFORM=$(uname -sp | tr '[:upper:] ' '[:lower:]_' | sed 's/x86_64/amd64/' | sed 's/i386/amd64/' | sed 's/arm/arm64/')
    PROVIDERS_DIR=$HOME/.terraform.d/plugins/terraform.local/local/${PROVIDER}
    PROVIDER_DIR=${PROVIDERS_DIR}/${VERSION}/${OS_PLATFORM}
    mkdir -p ${PROVIDER_DIR}
    cp bin/terraform-provider-${PROVIDER} ${PROVIDER_DIR}/terraform-provider-${PROVIDER}_v${VERSION}
    

    Be aware that the detected platform name might be wrong on your platform (e.g. linux_unknown); we can simply rename the folder:

    cd $PROVIDER_DIR/..
    # then use `uname -a` to find the correct platform name, and rename the `linux_unknown` folder
    
