Terraform crash during VM guest provisioning using Vsphere provider #846

Closed
ghost opened this issue Sep 18, 2019 · 18 comments
Labels
crash Impact: Crash

Comments

@ghost

ghost commented Sep 18, 2019

This issue was originally opened by @Shyamashree2005 as hashicorp/terraform#22836. It was migrated here as a result of the provider split. The original body of the issue is below.


Terraform Version

0.12.4

Terraform Configuration Files

provider "vsphere" {
  user                 = "${var.vsphere_user}"
  password             = "${var.vsphere_password}"
  vsphere_server       = "${var.VcenterName}"
  allow_unverified_ssl = true
}

data "vsphere_datacenter" "DcName" {
  name = "${var.DcName}"
}

data "vsphere_datastore" "Datastore" {
  name          = "${var.DataStore}"
  datacenter_id = "${data.vsphere_datacenter.DcName.id}"
}

data "vsphere_compute_cluster" "cluster" {
  name          = "${var.ClusterName}"
  datacenter_id = "${data.vsphere_datacenter.DcName.id}"
}

data "vsphere_virtual_machine" "template" {
  name          = "${var.TemplateName}"
  datacenter_id = "${data.vsphere_datacenter.DcName.id}"
}

data "vsphere_network" "net" {
  name          = "${var.NetworkName}"
  datacenter_id = "${data.vsphere_datacenter.DcName.id}"
}


resource "vsphere_virtual_machine" "vm" {
  name                 = "${var.VmName}"
  resource_pool_id     = "${data.vsphere_compute_cluster.cluster.resource_pool_id}"
  datastore_id     = "${data.vsphere_datastore.Datastore.id}"
  #datastore_cluster_id = "${data.vsphere_datastore_cluster.Datastore.id}"

  num_cpus = "${var.NoOfCpu}"
  memory   = "${var.MemInMb}"
  guest_id = "${data.vsphere_virtual_machine.template.guest_id}"

  network_interface {
    network_id   = "${data.vsphere_network.net.id}"
    adapter_type = "${data.vsphere_virtual_machine.template.network_interface_types[0]}"
  }

  disk {
    label            = "disk0"
    size             = "${var.Size}"
    thin_provisioned = "${data.vsphere_virtual_machine.template.disks.0.thin_provisioned}"
    eagerly_scrub    = "${data.vsphere_virtual_machine.template.disks.0.eagerly_scrub}"
  }


  clone {
    template_uuid = "${data.vsphere_virtual_machine.template.id}"

    customize {
      linux_options {
        host_name = "${var.VmName}"
        domain    = "${var.DomainName}"
      }

      timeout = 0
      network_interface {
        ipv4_address = "${var.Ip}"
        ipv4_netmask = "${var.Netmask}"
      }

      ipv4_gateway    = "${var.Gateway}"
      dns_server_list = "${var.DNS_List}"
    }
  }
}


Debug Output

<!--
Uploaded
-->

Crash Output

<!--
Uploaded
-->

Expected Behavior

<!--
The VM guest should have been provisioned and left up and running.
-->

Actual Behavior

<!--
The VM guest was provisioned but left in a shut-down state, and it was not assigned the configured IP address.
-->

Steps to Reproduce

<!--
Please list the full steps required to reproduce the issue, for example:
1. `terraform init`
2. `terraform plan`
3. `terraform apply`

-->

Additional Context

<!--
None
-->

References

<!--
None

crash.log
debug.log

-->

@Shyamashree2005

Hi,

It might not be entirely a vSphere provider issue. I have attached the debug and crash logs along with main.tf. Could you please look into the crash log and suggest the next step?

@koikonom
Contributor

Hi @Shyamashree2005! I tried reproducing the issue and I wasn't able to get the provider to crash. I have some questions that could point me in the right direction:

  • Would it be possible to share the output of "terraform plan" when you try to create the VM?
  • Can you share the values of the variables you use?

@koikonom koikonom added the waiting-response Status: Waiting on a Response label Sep 23, 2019
@Shyamashree2005

Hi @koikonom,

Good morning. Thanks for the response.

The "debug.log" is already available as an attachment, and it contains the output of terraform plan. I will upload the variable files separately.

@ghost ghost removed the waiting-response Status: Waiting on a Response label Sep 24, 2019
@Shyamashree2005

Hi,

Variable files attached.
terraform.tfvars.txt
variables.tf.txt

@koikonom
Contributor

Hi @Shyamashree2005, unfortunately I am still not able to reproduce the issue.

From the stack trace it seems like one of the responses we get when making a RetrieveProperties vSphere API call is throwing govmomi off, but it's not possible to see more than that.

The only alternative would be to run terraform apply with govmomi request logging enabled (by setting the env var VSPHERE_CLIENT_DEBUG to true) and send us the resulting files that get generated in ~/.govmomi. The problem with this approach is that the files will contain sensitive information and scrubbing them is a long and sometimes difficult process.

@koikonom
Contributor

I discussed this issue with a colleague and he gave me a very good idea.

The same code that triggers the issue is also present in govmomi's examples (https://github.com/vmware/govmomi/blob/master/examples/networks/main.go#L43).

Can you try building and running the networks test?

@koikonom
Contributor

Here is what I did:

First I cloned govmomi, then I set the following env vars:

export GOVMOMI_URL="https://hostname:port/sdk"
export GOVMOMI_USERNAME="vsphere_username"
export GOVMOMI_PASSWORD="vsphere_password"
# My lab has self-signed certs
export GOVMOMI_INSECURE=true

and then I went into the examples/networks directory and ran the test like this:

❯ go run main.go 
VM Network: Network:network-7
DVS0-DVUplinks-9: DistributedVirtualPortgroup:dvportgroup-11
DC0_DVPG0: DistributedVirtualPortgroup:dvportgroup-13

Please give it a go and let me know what you see.

@koikonom koikonom added the waiting-response Status: Waiting on a Response label Sep 25, 2019
@Shyamashree2005

Shyamashree2005 commented Sep 28, 2019

Hi Koikonom,

Thank you for spending time investigating the issue.

Do I need to install Go separately for this to work? Also, after I download govmomi, is there anything else I need to set up? I am asking because, without the terraform.tfvars file, how would it identify the network used in my deployment?

Also, I am a bit worried because I need to run this on a production vSphere setup. Is there any chance of it causing an inadvertent issue?

During the deployment, I can see that the VM is cloned but left in a powered-off state without any IP configured, which suggests it might be an issue with customization.

Thanks.

@ghost ghost removed the waiting-response Status: Waiting on a Response label Sep 28, 2019
@Shyamashree2005

Hi Koikonom,

If you could please respond, we can proceed with the next step.

Thanks.

@koikonom
Contributor

koikonom commented Oct 7, 2019

Hi @Shyamashree2005, yes, you would need to install Go and build govmomi before you can run the tests.

You would have to set environment variables for all the inputs required, like I did in the example above. govmomi cannot read terraform variables.
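
A small pre-flight check along these lines might help confirm the variables are in place before running the example. This is only a sketch: `check_govmomi_env` is a hypothetical helper name, the variable names are the ones exported in the earlier comment, and treating `GOVMOMI_INSECURE` as optional is an assumption.

```shell
# Sketch only: verify that the connection variables used earlier are set.
# check_govmomi_env is a made-up helper name, not part of govmomi.
check_govmomi_env() {
  missing=0
  for name in GOVMOMI_URL GOVMOMI_USERNAME GOVMOMI_PASSWORD; do
    # Indirect lookup of the variable named in $name (POSIX sh has no ${!name}).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when everything is set, 1 when at least one variable is missing.
  return "$missing"
}
```

From the examples/networks directory, `check_govmomi_env && go run main.go` would then run the example only when all three variables are set.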

@koikonom koikonom added the waiting-response Status: Waiting on a Response label Oct 7, 2019
@Shyamashree2005

Hi koikonom,

I am facing some issues with Go and govmomi. I will update you soon. Thanks again for the help.

@koikonom
Contributor

koikonom commented Nov 5, 2019

Hi @Shyamashree2005, we have just merged PR #840, which I believe will address this issue. It is planned for the next provider release, so keep an eye out for the new version.

@sofixa

sofixa commented Dec 10, 2019

@koikonom when will the new release come out? I tried compiling master and I no longer encounter the crashes from #914 (and I suppose it would be the same for #852, #822, #917, etc.).

@ghost ghost removed the waiting-response Status: Waiting on a Response label Dec 10, 2019
@aareet
Contributor

aareet commented Dec 10, 2019

Hi @sofixa, I apologize for the delay. We're fixing some issues with our test pipeline and will work to get the release out soon after those are fixed.

@sofixa

sofixa commented Dec 10, 2019

Great, thanks @aareet.

@sofixa

sofixa commented Dec 11, 2019

@aareet do you have an estimate of when that might be?

@sofixa

sofixa commented Jan 3, 2020

1.14 is out and it's working great.

@aareet
Contributor

aareet commented Jan 6, 2020

Glad to hear it, @sofixa 👍 Thank you for the feedback.

@aareet aareet closed this as completed Jan 6, 2020
@ghost ghost locked and limited conversation to collaborators Apr 18, 2020