
Proxmox builder - Error getting SSH address 500 QEMU guest agent is not running #91

Closed
goffinf opened this issue Jul 7, 2020 · 11 comments


@goffinf commented Jul 7, 2020

hashicorp/packer#9115
This issue relates to much of the discussion in the one linked above; however, I currently have a very specific issue that I think is worth separating out.

Packer version:

goffinf@DESKTOP-LH5LG1V:~$ packer version
Packer v1.6.0

Builder: proxmox

Proxmox version:

pveversion --verbose
proxmox-ve: 6.2-1 (running kernel: 5.4.44-1-pve)
pve-manager: 6.2-6 (running version: 6.2-6/ee1d7754)
pve-kernel-5.4: 6.2-3
pve-kernel-helper: 6.2-3
pve-kernel-5.4.44-1-pve: 5.4.44-1
pve-kernel-5.4.34-1-pve: 5.4.34-2
...

I am trying to run a Packer build using the ubuntu-20.04-live-server-amd64.iso, but when the boot_command runs, the build just hangs waiting for an SSH connection to the launched instance. When the ssh_timeout is reached, the build fails and the VM is destroyed.

The Packer log reports that the SSH address cannot be obtained because the QEMU guest agent isn't running. Clearly, that agent is not installed on a fresh ISO. I also set the communicator to ssh, disabled qemu_agent, and configured an alternate ssh_port, so I'm not sure why this is happening.

...
  "builders": [
    {
      ...
      "communicator": "ssh",
...
      "qemu_agent": false,
      "ssh_handshake_attempts": "50",
      "ssh_username": "{{user `ssh_username`}}",
      "ssh_password": "{{user `ssh_password`}}",
      "ssh_port": 2222,
      "ssh_pty": true,
      "ssh_timeout": "{{user `ssh_timeout`}}",

packer log:
[screenshot: Packer log output]

build times out:
[screenshot: build failing after ssh_timeout is reached]

From the related thread:

The new Ubuntu Server installer starts an SSH server.
The credentials are installer:<random_pw>
Packer wrongly tries to connect to this SSH server, thinking the VM is ready for further provisioning steps - which it is NOT.

Thanks to @JulyIghor we found a workaround.
We simply change the port Packer expects the SSH server to run on to something else AND, during the cloud-init late_commands, we override the server's port accordingly. That way, once cloud-init finishes and reboots the VM, the SSH server will run on the new port - now Packer picks up on that and continues provisioning as we are used to.

As a last step during provisioning, we remove the conf file, essentially resetting the SSH server port back to the default 22.
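A minimal sketch of that workaround, assuming Ubuntu 20.04's default sshd_config includes /etc/ssh/sshd_config.d/*.conf (the drop-in name packer.conf is arbitrary):

#cloud-config
autoinstall:
  # ...
  late-commands:
    # move sshd to the port Packer's ssh_port expects (2222 here);
    # relies on sshd_config including /etc/ssh/sshd_config.d/*.conf
    - echo 'Port 2222' > /target/etc/ssh/sshd_config.d/packer.conf

A final shell provisioner would then rm /etc/ssh/sshd_config.d/packer.conf so the finished image reverts to port 22.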

I have tried using a different ssh_port (2222 in the example below), but that had no effect.

For completeness, the boot_command does result in the newly launched VM entering the correct autoinstall process (rather than the standard install dialogue), as evidenced by the following screenshots taken during a Packer debug build:

[screenshots: VM console entering the autoinstall flow during a Packer debug build]

I have copied the complete Packer build file and the user-data below.

Any guidance on how to ensure that the Packer build uses ssh rather than the qemu agent would be much appreciated (or anything else you think might be the culprit).

Kind Regards

Fraser.

host.json:

{
  "builders": [
    {
      "boot_command": [
        "<enter><enter><f6><esc><wait>",
        "autoinstall ds=nocloud-net;s=http://{{ .HTTPIP }}:{{ .HTTPPort }}/",
        "<enter>"
      ],
      "boot_wait": "{{user `boot_wait`}}",
      "communicator": "ssh",
      "disks": [
        {
          "disk_size": "{{user `home_volume_size`}}",
          "storage_pool": "local-lvm",
          "storage_pool_type": "lvm-thin",
          "type": "scsi",
          "format": "raw"
        }
      ],
      "http_directory": "{{user `http_directory`}}",
      "insecure_skip_tls_verify": true,
      "iso_checksum": "{{user `iso_checksum_type`}}:{{user `iso_checksum`}}",
      "iso_file": "{{user `iso_file`}}",
      "memory": 2048,
      "name": "ubuntu-20-04-base",
      "network_adapters": [
        {
          "bridge": "vmbr0",
          "model": "virtio"
        }
      ],
      "node": "{{user `proxmox_target_node`}}",
      "password": "{{user `proxmox_server_pwd`}}",
      "proxmox_url": "https://{{user `proxmox_server_hostname`}}:{{user `proxmox_server_port`}}/api2/json",
      "qemu_agent": false,
      "ssh_handshake_attempts": "50",
      "ssh_username": "{{user `ssh_username`}}",
      "ssh_password": "{{user `ssh_password`}}",
      "ssh_port": 2222,
      "ssh_pty": true,
      "ssh_timeout": "{{user `ssh_timeout`}}",
      "type": "proxmox",
      "unmount_iso": true,
      "username": "{{user `proxmox_server_user`}}"
    }
  ],
  "provisioners": [
    {
      "execute_command": "{{ .Vars }} sudo -E -S sh '{{ .Path }}'",
      "inline": [
        "ls /"
      ],
      "type": "shell"
    }
  ],
  "variables": {
    "boot_wait": "2s",
    "http_directory": "http",
    "iso_checksum": "caf3fd69c77c439f162e2ba6040e9c320c4ff0d69aad1340a514319a9264df9f",
    "iso_checksum_type": "sha256",
    "iso_file": "local:iso/ubuntu-20.04-live-server-amd64.iso",
    "proxmox_server_hostname": "proxmox-002",
    "proxmox_server_port": "8006",
    "proxmox_server_pwd": "xxxxxxx",
    "proxmox_server_user": "xxxxxxxx",
    "proxmox_target_node": "home",
    "ssh_handshake_attempts": "20",
    "ssh_password": "ubuntu",
    "ssh_username": "ubuntu",
    "ssh_timeout": "10m"
  }
}

user-data:

#cloud-config
autoinstall:
  identity:
    hostname: ubuntu-20-04-base
    password: '$6$wdAcoXrU039hKYPd$508Qvbe7ObUnxoj15DRCkzC3qO7edjH0VV7BPNRDYK4QR8ofJaEEF2heacn0QgD.f8pO8SNp83XNdWG6tocBM1'
    username: ubuntu
  keyboard:
    layout: gb
    variant: uk
  late-commands:
    - sed -i 's/^#*\(send dhcp-client-identifier\).*$/\1 = hardware;/' /target/etc/dhcp/dhclient.conf
    - 'sed -i "s/dhcp4: true/&\n      dhcp-identifier: mac/" /target/etc/netplan/00-installer-config.yaml'
  locale: en_GB.UTF-8
  network:
    network:
      version: 2
      ethernets:
        ens33:
          dhcp4: true
          dhcp-identifier: mac
  ssh:
    allow-pw: true
    authorized-keys:
    - "ssh-rsa AAAAB3NzaC1yc2...."
    install-server: true
  version: 1
@paginabianca (Contributor)

I don't know if you have already found a solution to the problem, but if not: Packer is not able to determine the VM's IP address.
Normally, Proxmox gets the IP addresses of VMs through the qemu-guest-agent installed on the guest.
Since you do not have the guest agent installed, Proxmox does not know the VM's IP address because it cannot communicate with the VM.
There are two ways to solve this problem:

  1. Set the ssh_host parameter
  2. Use a preseed.cfg and install the qemu-guest-agent with d-i

For the 1st solution, you tell Packer the VM's IP manually by setting the ssh_host parameter inside the template.
As stated in the docs, when no qemu-agent is installed on the system, ssh_host should be set.
Now you have to make sure that the VM gets assigned that exact IP address. You can do that by setting a static IP inside the VM, or, if you have a DHCP server, by setting mac_address in network_adapters and telling the DHCP server to assign the same IP to that MAC address every time.
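As a sketch (the IP and MAC below are placeholders; the DHCP server must be configured to always hand that IP to that MAC):

  "builders": [
    {
      ...
      "ssh_host": "192.168.1.50",
      "network_adapters": [
        {
          "bridge": "vmbr0",
          "model": "virtio",
          "mac_address": "DE:AD:BE:EF:00:01"
        }
      ],
      ...
    }
  ]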

As for the 2nd solution, I recommend reading about how to use preseed.cfg files to automate the installation process on Ubuntu/Debian here.
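For illustration, in a d-i preseed file that is a one-liner (this applies to the debian-installer flow, not to the newer live-server autoinstall discussed below):

d-i pkgsel/include string qemu-guest-agent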

@rgevaert

I agree with @paginabianca. To be honest, I find the second option the easiest. There are several repos here on GitHub that should get you a working setup.

@goffinf (Author) commented Jul 22, 2020

@paginabianca (Andreas) thanks for your comments. There are a couple of issues that I have been wrestling with that impact your suggested options.

When Proxmox launches the VMs, they obtain their IP via DHCP. I don't want a fixed IP because that has implications for other automated provisioning, so I don't think setting ssh_host is really an option. I did try exactly that at one point, but the IP I set didn't actually get assigned, so I will try your suggestion of setting the mac_address in network_adapters. I might come back to this one if option 2 continues to be problematic.

The VMs I am launching are Ubuntu 20.04, which has deprecated preseed in favour of autoinstall (IMO a cleaner approach). I have tried configuring cloud-config to install the qemu-guest-agent following some other examples, but this has been unsuccessful to date (see below).

TBH, I have tried so many variants it's hard to remember them all, but if you can see anything else that is missing or shouldn't be there, I can give it another try.

Right now I'm using Terraform to do all the provisioning at launch time using a very basic base image. That works fine; however, there are a few things that I would prefer to bake in at build time (the never-ending debate around build- vs. launch-time provisioning, aka bake vs. fry).

Kind Regards

Fraser

#cloud-config
autoinstall:
  early-commands:
    - systemctl stop ssh # otherwise packer tries to connect and exceeds max attempts
  keyboard:
    layout: gb
    variant: uk
  late-commands:
    - sed -i 's/^#*\(send dhcp-client-identifier\).*$/\1 = hardware;/' /target/etc/dhcp/dhclient.conf
    - 'sed -i "s/dhcp4: true/&\n      dhcp-identifier: mac/" /target/etc/netplan/00-installer-config.yaml'
  locale: en_GB
  network:
    network:
      version: 2
      ethernets:
        eth0:
          dhcp4: true
          dhcp-identifier: mac
  packages:
    - openssh-server
    - qemu-guest-agent
  ssh:
    allow-pw: true
    authorized-keys:
    - "ssh-rsa AAAAB3NzaC1..."
    install-server: true
  storage:
    layout:
      name: direct
  user-data:
    disable_root: false 
    package_upgrade: true
    timezone: Europe/London
    users:
      - name: goffinf
        passwd: $xxxx.
        groups: [adm, cdrom, dip, plugdev, lxd, sudo]
        lock-passwd: false
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        ssh_authorized_keys:
          - "ssh-rsa AAAAB3NzaC1....."
    write_files:
      - path: /usr/local/bin/hello-world.sh
        permissions: "0755"
        content: |
          #!/bin/bash

          FORENAME=${1:-goffinf};
          echo "Hello $FORENAME" >> /usr/local/bin/greeting;
    runcmd:
      - /usr/local/bin/hello-world.sh 'Fraser'
  version: 1

@BarisGece commented Nov 16, 2020

Hi @goffinf
Can you try

  • updating "qemu_agent": true in host.json
  • adding the following commands to user-data (cloud-config) so that they execute after the qemu-guest-agent package is installed:
    • sudo systemctl start qemu-guest-agent
    • sudo systemctl enable qemu-guest-agent

You may still get the same error on the first run. If you are sure that the qemu-guest-agent is installed in the VM and the service is started, you should see the IP field change after a while when you check it through Proxmox.
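A sketch of where those commands could live in the autoinstall user-data (placing them in the first-boot runcmd section is my assumption; they would then run after the package install completes):

#cloud-config
autoinstall:
  # ...
  packages:
    - qemu-guest-agent
  user-data:
    runcmd:
      # first boot of the installed system, after packages are in place
      - sudo systemctl enable qemu-guest-agent
      - sudo systemctl start qemu-guest-agent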


@oliviermichaelis commented Oct 22, 2021

For anyone who stumbles upon this exact issue, I've managed to get it to work with ubuntu-20.04.3-live-server.

I'll include the minimal reproduction code below, starting with the packer template ubuntu.json:

{
  "variables": {
    "proxmox_template_name": "ubuntu-20.04",
    "ubuntu_iso_file": "ubuntu-20.04.3-live-server-amd64.iso"
  },
  "builders": [{
    "type": "proxmox",
    "proxmox_url": "<fill_out_your_url>:8006/api2/json",
    "username": "{{ user `proxmox_username` }}",
    "password": "{{ user `proxmox_password` }}",
    "node": "proxmox",
    "network_adapters": [{
      "bridge": "vmbr0"
    }],
    "disks": [{
      "type": "scsi",
      "disk_size": "20G",
      "storage_pool": "local-lvm",
      "storage_pool_type": "lvm"
    }],
    "iso_file": "local:iso/{{ user `ubuntu_iso_file` }}",
    "unmount_iso": true,
    "boot_wait": "5s",
    "memory": 8192,
    "cores": 4,
    "scsi_controller": "virtio-scsi-single",
    "qemu_agent": true,
    "template_name": "{{ user `proxmox_template_name` }}",
    "http_directory": "http",
    "boot_command": [
      "<esc><wait><esc><wait><f6><wait><esc><wait>",
      "<bs><bs><bs><bs><bs>",
      "autoinstall ds=nocloud-net;s=http://{{ .HTTPIP }}:{{ .HTTPPort }}/ ",
      "-- <enter>"
    ],
    "ssh_username": "ubuntu",
    "ssh_password": "ubuntu",
    "ssh_timeout": "20m"
  }],
  "provisioners": [{
    "type": "shell",
    "inline": [
      "while [ ! -f /var/lib/cloud/instance/boot-finished ]; do echo 'Awaiting cloud-init...'; sleep 1; done"
    ]
  }]
}

Create secrets.json with:

{
  "proxmox_username": "packer@pve",
  "proxmox_password": "<your_password>"
}

Create the http directory with an empty meta-data file, as well as user-data:

#cloud-config
autoinstall:
  version: 1
  locale: en_US
  keyboard:
    layout: eu
  identity:
    hostname: ubuntu-server
    username: ubuntu
    password: "$6$FhcddHFVZ7ABA4Gi$9ME1/XEiFHYx8Qh01w6CPqZZE7EDSf2tOc9Ugs89beYrUMyCyCxXzyBovoRwjN/6ipRnxCKeG/3PmJb1zvMAp/"
  ssh:
    install-server: true
    allow-pw: true
  packages:
    - qemu-guest-agent
  late-commands:
    - curtin in-target --target=/target -- systemctl start qemu-guest-agent
    - curtin in-target --target=/target -- systemctl enable qemu-guest-agent

Run with PACKER_LOG=1 packer build -var-file=secrets.json ubuntu.json

The important part is the late-commands, which are executed within the target, where we've previously installed the qemu-guest-agent. Once the VM reboots into the target, qemu-guest-agent is started by systemd and Packer can get the IP.

I hope that helps someone :)

@ankurthegamedev

@oliviermichaelis IT'S WORKING, MAN! You made my day, thanks!

@m4dh4t commented Nov 12, 2021

I've got the same problem as described by @goffinf.
By removing the quiet part of the boot options (adding some <bs> before the autoinstall command) I was able to see exactly where it hung during the cloud-init process:
[screenshot: boot output showing where cloud-init hangs]

I tried serving the user-data and meta-data files via a Python HTTP server rather than the one created by Packer, and saw that no requests were made to the server, which to me indicates that the issue isn't really about the user-data file.
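(For reference, a stand-in server like the one I used can be started from inside the http directory; the port is arbitrary, and the boot command's s= URL must point at it:)

cd http && python3 -m http.server 8000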

I stumbled on this issue while doing some research, which might be interesting, but honestly I've tried so many things that I don't really know what to do anymore.

@m4dh4t commented Dec 6, 2021

For those interested, I found a workaround for provisioning the autoinstall files to Ubuntu 20.04 without using Packer's HTTP server.
The cloud-init documentation mentions how to create a valid autoinstall seed ISO using the following command: genisoimage -output seed.iso -volid cidata -joliet -rock user-data meta-data.
Once your ISO is built, you can simply provide it to your Packer builder using the additional_iso_files parameter.
I ended up with something like this, using iso_url so Packer loads the ISO from my config directory locally:

"additional_iso_files": [
    {
      "iso_url": "./config/seed.iso",
      "storage_pool": "local",
      "iso_checksum": "sha256:HASH",
      "unmount": true
    }
]

Of course, the downside is that each time you tweak your user-data file, you also need to rebuild the ISO and feed the new checksum to Packer, but nothing a little bash script can't do to keep this a convenient alternative.
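For example, a wrapper along these lines (the var-file name seed.auto.json and the seed_iso_checksum variable are my own choices; the template would have to reference it as {{user `seed_iso_checksum`}} in iso_checksum):

#!/bin/bash
# Rebuild the NoCloud seed ISO and hand its fresh checksum to packer.
set -euo pipefail
genisoimage -output config/seed.iso -volid cidata -joliet -rock user-data meta-data
HASH=$(sha256sum config/seed.iso | awk '{print $1}')
printf '{"seed_iso_checksum": "sha256:%s"}\n' "$HASH" > seed.auto.json
packer build -var-file=seed.auto.json ubuntu.json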

@akauper commented Jan 26, 2022

God, this one took me forever to solve... thank you @oliviermichaelis for the late-commands hint!
For anyone getting stuck AND using WSL2 on Windows: for some reason, Ubuntu in WSL won't serve the http directory.

@m4dh4t tipped me off to this. Spin up a full Ubuntu VM somewhere and run Packer from there. Works like a charm.

As to why WSL won't serve the http folder? No idea. Windows Firewall? Too happy it's working to try to figure it out.

edit: This thread probably holds the answer: hashicorp/packer#10168
http_interface needs to be set in the packer file.
The downside is that the packer file becomes specific to the host... long story short, just use a standalone Linux machine.
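For illustration, that would be one extra key in the builder block (the interface name eth0 is host-specific, which is exactly the downside mentioned):

  "builders": [{
    ...
    "http_interface": "eth0",
    ...
  }]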

@nywilken nywilken transferred this issue from hashicorp/packer May 27, 2022
@nywilken nywilken pinned this issue May 27, 2022
@lbajolet-hashicorp lbajolet-hashicorp unpinned this issue Jan 6, 2023
@AntonioBriPerez

(Quoting @m4dh4t's comment of Nov 12, 2021, above.)

Did you find a solution to this?

@freddo256

@AntonioBriPerez for me the problem was that the server couldn't reach the host Packer was running from (it was connected through a VPN). I created a bastion host in my Proxmox environment and ran Packer from there.
