
Packer 1.8.0: QEMU fails to remove CD-ROM installation image during reboot (worked fine with 1.7.10) #66

Closed · hc-github-team-packer opened this issue Mar 9, 2022 · 13 comments · Fixed by #67

Comments

@hc-github-team-packer

This issue was originally opened by @dreibh in hashicorp/packer#11651 and has been migrated to this repository. The original issue description is below.


Overview of the Issue

Packer 1.7.10 with QEMU works fine, installing e.g. Ubuntu 18.04 or 20.04. However, after update to 1.8.0 (from HashiCorp's APT repository for Ubuntu 20.04, i.e. from https://apt.releases.hashicorp.com), Packer fails to remove the CD-ROM installation image. On reboot, the VM boots into the CD-ROM installation image again, instead of booting from the newly installed HDD image.

A downgrade to 1.7.10 solved the issue, so it seems a bug was introduced in 1.8.0.

Reproduction Steps

Trying to run a QEMU installation for Ubuntu 20.04 with Packer 1.8.0. It always fails.

Packer version

1.8.0

Simplified Packer Template

```json
{
  "variables": {
    "version": ""
  },
  "builders": [
    {
      "vm_name": "{{user `vm_name`}}",
      "output_directory": "{{user `vm_name`}}",
      "type": "qemu",
      "accelerator": "kvm",
      "headless": false,
      "format": "qcow2",
      "firmware": "/usr/share/qemu/OVMF.fd",
      "disk_interface": "virtio",
      "disk_cache": "unsafe",
      "disk_discard": "unmap",
      "net_device": "virtio-net",
      "cpus": "{{user `vm_cores`}}",
      "memory": "{{user `vm_memory`}}",
      "disk_size": "{{user `vm_disk_root`}}",
      "qemuargs": [
        [
          "-vga",
          "qxl"
        ]
      ],
      "http_directory": "http",
      "iso_url": "https://releases.ubuntu.com/20.04/ubuntu-20.04.4-live-server-amd64.iso",
      "iso_checksum": "sha256:28ccdb56450e643bad03bb7bcf7507ce3d8d90e8bf09e38f6bd9ac298a98eaad",
      "ssh_username": "{{user `user_name`}}",
      "ssh_password": "{{user `user_password`}}",
      "ssh_pty": true,
      "ssh_timeout": "60m",
      "ssh_handshake_attempts": "1024",
      "shutdown_command": "echo '{{user `user_password`}}' | sudo -S shutdown -P now",
      "boot_wait": "1s",
      "boot_command": [
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<tab><wait>",
        "<esc><wait>",
        "linux /casper/hwe-vmlinuz --- autoinstall ds=\"nocloud-net;seedfrom=http://{{ .HTTPIP }}:{{ .HTTPPort }}/\"<enter><wait>",
        "initrd /casper/hwe-initrd<enter><wait>",
        "boot<enter>"
      ]
    }
  ],
  ...
}
```

Operating system and Environment details

```
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.4 LTS
Release:        20.04
Codename:       focal
```

@nywilken (Member) commented Mar 9, 2022

Hello again @dreibh. I'm following up on your original issue hashicorp/packer#11651, which was moved here. There was a change in the 1.0.2 Qemu plugin release for loading firmware, which may be the issue here. Can you please provide the debug log associated with the Packer 1.8.0 build so that we can better understand what is being executed?
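For reference, a minimal way to capture such a log, using Packer's standard logging environment variables (`template.json` here just stands in for your template file):

```shell
# PACKER_LOG=1 enables verbose logging; PACKER_LOG_PATH redirects it to a file
PACKER_LOG=1 PACKER_LOG_PATH=packer.log packer build template.json
```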

Does downgrading to version 1.0.1 of the Qemu builder plugin work with Packer 1.8.0?

In Packer 1.8.0 you can manually install version 1.0.1 of the Qemu builder, which is what Packer 1.7.10 uses, by running:

```shell
packer plugins install github.com/hashicorp/qemu v1.0.1
```

The `plugins` command, only available in 1.8.0+, will install the previous release of the Qemu builder, and Packer will use that for all qemu builds instead of the version that is bundled into Packer.

If you are running the build in a CI environment where running `packer plugins install` on all hosts is not possible, you can upgrade your JSON template to HCL2 with the `packer hcl2_upgrade template.json` command and then pin the version of the Qemu builder to v1.0.1 with the `required_plugins` block.
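For example (assuming the template file is named template.json; the converted template is written alongside it as template.json.pkr.hcl):

```shell
# Convert the legacy JSON template to an HCL2 template
packer hcl2_upgrade template.json
```

Below is an example of your template generated by the `hcl2_upgrade` command, with the required plugins block added: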

```hcl
packer {
  required_plugins {
    qemu = {
      version = "= 1.0.1"
      source  = "github.com/hashicorp/qemu"
    }
  }
}

variable "version" {
  type    = string
  default = ""
}

source "qemu" "autogenerated_1" {
  accelerator            = "kvm"
  boot_command           = ["<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<tab><wait>", "<esc><wait>", "linux /casper/hwe-vmlinuz --- autoinstall ds=\"nocloud-net;seedfrom=http://{{ .HTTPIP }}:{{ .HTTPPort }}/\"<enter><wait>", "initrd /casper/hwe-initrd<enter><wait>", "boot<enter>"]
  boot_wait              = "1s"
  cpus                   = "${var.vm_cores}"
  disk_cache             = "unsafe"
  disk_discard           = "unmap"
  disk_interface         = "virtio"
  disk_size              = "${var.vm_disk_root}"
  firmware               = "/usr/share/qemu/OVMF.fd"
  format                 = "qcow2"
  headless               = false
  http_directory         = "http"
  iso_checksum           = "sha256:28ccdb56450e643bad03bb7bcf7507ce3d8d90e8bf09e38f6bd9ac298a98eaad"
  iso_url                = "https://releases.ubuntu.com/20.04/ubuntu-20.04.4-live-server-amd64.iso"
  memory                 = "${var.vm_memory}"
  net_device             = "virtio-net"
  output_directory       = "${var.vm_name}"
  qemuargs               = [["-vga", "qxl"]]
  shutdown_command       = "echo '${var.user_password}' | sudo -S shutdown -P now"
  ssh_handshake_attempts = "1024"
  ssh_password           = "${var.user_password}"
  ssh_pty                = true
  ssh_timeout            = "60m"
  ssh_username           = "${var.user_name}"
  vm_name                = "${var.vm_name}"
}

build {
  sources = ["source.qemu.autogenerated_1"]
}
```
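On CI hosts, the plugin version pinned in `required_plugins` can then be installed non-interactively; a minimal sketch, assuming the HCL2 template sits in the current directory:

```shell
# packer init reads required_plugins and installs the pinned plugin versions
packer init .
```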

I look forward to hearing back if downgrading the qemu plugin temporarily solves the issue for you.

@NHAS commented Mar 16, 2022

Hi there, I've just been stung quite hard by this bug.
Is there a way to sticky this issue on the main Packer project as well? This seems like a rather critical issue for people running infrastructure builds like myself.

I'm testing out the workaround you've stated and I'll get back to you if it works.

@NHAS commented Mar 16, 2022

Unfortunately, simply running `packer plugins install github.com/hashicorp/qemu v1.0.1` has not worked.

@carenas (Contributor) commented Mar 16, 2022

> Unfortunately, simply running `packer plugins install github.com/hashicorp/qemu v1.0.1` has not worked.

You might need to ALSO add the `required_plugins` section in your HCL2 for the downgraded plugin to really be used, regardless of what was mentioned above (maybe another bug?).

@NHAS commented Mar 16, 2022

Indeed. I had to convert my JSON template to HCL2 (which broke it), then also pin it through the instructions above.

It now works.

nywilken pinned this issue Mar 17, 2022
carenas added a commit to carenas/packer-plugin-qemu that referenced this issue Mar 18, 2022

Since the merge of hashicorp#43, any script with a firmware setting will
use a `-drive if=pflash` option instead of `-bios` when calling QEMU,
which has been shown to be problematic.

Implement a new option, `pflash`, that can be set to true when that is
preferred (e.g. when non-volatile variables are needed), and use the
stable interface by default.

Fixes: hashicorp#66
nywilken pushed a commit that referenced this issue Apr 12, 2022
…g firmware (#67)

Since the merge of #43, any script with a firmware setting will use a
`-drive if=pflash` option instead of `-bios` when calling QEMU, which
has been shown to be problematic.

Implement a new option, `pflash`, that can be set to true when that is
preferred (e.g. when using a custom firmware with variables), but use
the old interface by default.

Fixes: #66
@dcd-arnold

This is not solved. I am trying to set up Ubuntu Jammy 22.04 Server on QEMU 6.2.0 with Packer 1.8.0 and am running into this issue.

I tried version 1.0.3 with the setting `use_pflash=false`, which yields the following QEMU command:

```
2022/04/28 17:58:19 packer-plugin-qemu_v1.0.3_x5.0_darwin_amd64 plugin:
2022/04/28 17:58:19 Executing /usr/local/bin/qemu-system-x86_64:
[]string{"-netdev", "user,id=user.0,hostfwd=tcp::4282-:22",
"-vnc", "127.0.0.1:96", "-cpu", "host",
"-machine", "type=q35,accel=hvf",
"-bios", "uefi/OVMF_CODE.fd",
"-drive", "file=output-bcimage/bikecenter,if=virtio,cache=writeback,discard=unmap,format=raw",
"-drive", "file=/Users/user/.cache/packer/b9441068de828d36573e1274dfe77f69aebda15a.iso,media=cdrom",
"-name", "bikecenter",
"-m", "4096M",
"-smp", "cpus=4,sockets=4",
"-device", "virtio-net,netdev=user.0",
"-boot", "once=d"}
```

However, even with the bios setting, the CD-ROM is not removed when the installer restarts.
Using version 1.0.1 (as suggested as a workaround) does not fix this either.
I tried to override the boot order like this:

```hcl
qemuargs = [
  # prefer hard disk (c) before cdrom (d)
  ["-boot", "order=cd"],
]
```

Also to no avail: I am unable to boot from disk after subiquity finishes installing and restarts the machine.

@carenas (Contributor) commented Apr 28, 2022

> However, even with the bios setting, the CD-ROM is not removed when the installer restarts.

Just a shot in the dark, but might it be a problem with your EFI that is preventing the CD-ROM from being ejected in this case?

I agree the symptoms are similar, but the fact that it also breaks with the Packer code that doesn't have this feature, as well as with the one this fixes, would seem to indicate the problem is elsewhere.

AFAIK (and I am no QEMU expert) the `-boot` parameter only applies to a legacy BIOS, so it wouldn't make a difference here. Also, QEMU 6.2 (or maybe 6.1) shipped a debug EFI by mistake, which was fixed in 7.0 at least, and I had a better experience with that newer EFI as well, so maybe that might help you too.

@dcd-arnold

Hi @carenas,

thank you for getting back to me. It was the EFI from Arch Linux. Following your idea, I tried the OVMF from Ubuntu Jammy and Fedora 36, all with the same result.

You are right, `"-boot", "order=cd"` is only good when using a BIOS (I was desperate).

QEMU 7.0 is released. However, it is held back by my package manager folks, and I am a little reluctant to compile QEMU myself because a source installation is a lot of work to maintain.

I am working around this issue by shutting down after the installer is done (adding `poweroff` as a post-install hook) and setting `communicator = "none"`. This ends the installer successfully and I get a disk image without any interaction, which I can pick up for provisioning in another Packer run; see the sketch below.
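A minimal sketch of what this could look like (assumptions: the autoinstall user-data powers the VM off via its `shutdown` parameter, and the source block otherwise matches the one shown earlier):

```hcl
source "qemu" "autogenerated_1" {
  # ... same settings as in the template above ...

  # With no communicator, Packer does not wait for SSH and instead waits for
  # the VM to power itself off. The autoinstall user-data is assumed to
  # contain `shutdown: poweroff`, so subiquity halts the machine when done.
  communicator = "none"
}
```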

@NHAS commented May 18, 2022

Yeah, this is still broken. I'm on Ubuntu 21.10 with Packer 1.8.0 and it still fails.

@carenas (Contributor) commented May 19, 2022

> I'm on Ubuntu 21.10 with Packer 1.8.0 and it still fails.

I believe you, but while the symptoms might be the same, the root cause MUST be different, so IMHO it would be better to create a new issue and put all the information you've gathered about it so far there instead of here.

lbajolet-hashicorp unpinned this issue Sep 19, 2022
@martux69

> I am working around this issue by shutting down after the installer is done (adding `poweroff` as a post-install hook) and setting `communicator = "none"`. This ends the installer successfully and I get a disk image without any interaction, which I can pick up for provisioning in another Packer run.

Hi @dcd-arnold,
I would like to solve this still-ongoing problem on Ubuntu Jammy the way you did, but so far without success. In the autoinstall section of the cloud-init file I set the `shutdown` parameter to `poweroff`, and in the Packer HCL file I set `shutdown_command` to empty and `communicator = "none"`. After cloud-init finishes, the VM is powered off, but Packer still fails and no image is generated. How do you set these parameters?

@dcd-arnold

Hi @martux69,

I moved away from Packer to plain QEMU (which worked for me) and then dropped that as well for now. I am sorry I cannot help with this. If you are interested, I can share my QEMU script with you.

Best,
dcd-arnold

@BirknerAlex commented Mar 13, 2023

We had the same issue when using the following config:

```hcl
efi_firmware_code = "/usr/share/OVMF/OVMF_CODE.fd"
efi_firmware_vars = "/usr/share/OVMF/OVMF_VARS.fd"
efi_boot          = true
```

I realized it's an Ubuntu-only issue; Windows and Fedora are doing fine. I tracked it down to an issue in Ubuntu itself:
https://bugs.launchpad.net/subiquity/+bug/1901397

So adding the workaround to the cloud-init config of the Ubuntu installer worked fine:

```yaml
  late-commands:
    - |
      if [ -d /sys/firmware/efi ]; then
        apt-get install -y efibootmgr
        efibootmgr -o $(efibootmgr | perl -n -e '/Boot(.+)\* ubuntu/ && print $1')
      fi
```

Now the Ubuntu installer reboots into the disk instead of booting the ISO file again. The Ubuntu installer itself fails to remove the CD after install, which is another bug in combination with UEFI in Ubuntu: https://bugs.launchpad.net/subiquity/+bug/1845543
