
Hybrid cluster: available VM params do not match the selected node #17

Open
denis-itskovich opened this issue Dec 22, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@denis-itskovich

denis-itskovich commented Dec 22, 2023

I set up a hybrid 3-node cluster consisting of 2 nodes based on Intel/AMD CPUs and 1 node based on Arm64 (Rockchip, Orange Pi 5).

| WebUI hostname | CPU Architecture |
| --- | --- |
| pve1.local | Intel x64 |
| pve2.local | Intel x64 |
| pve3.local | Arm64 |

The problem is that the available VM params do not match the selected node; they are always taken from the node that served the WebUI page.

For example:

When I open the WebUI of the first node (pve1, https://pve1.local:8006, Intel x64) and try to create a VM on pve3 (Arm64), the available machine types are i440fx / q35 (instead of virt), the available CPU types include Intel-architecture types but none of the Arm64-specific ones, and the CD/DVD drive is created as IDE instead of SCSI. I'm still able to create a working VM by manually replacing the CD/DVD drive with SCSI, selecting machine type i440fx and CPU host.
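The same manual fix-up can be applied from the CLI on the target node. A hedged sketch, assuming a hypothetical VMID 100 and ISO volume name (the `qm` options below exist in stock Proxmox VE; the right machine type for the Arm64 port may differ, as the reporter above got a working VM with i440fx):

```shell
# Sketch: repair a VM that the WebUI created with x86 defaults.
# VMID 100 and the ISO volume name are placeholders, not from this thread.
qm set 100 --machine virt --cpu host           # "virt" is the usual Arm64 machine type
qm set 100 --delete ide2                       # drop the IDE CD/DVD the UI created
qm set 100 --scsi1 local:iso/installer-arm64.iso,media=cdrom   # re-add it as SCSI
```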

When I open the WebUI of the 3rd node (pve3, https://pve3.local:8006, Arm64) and try to create a VM on pve1 (Intel x64), the only available machine type is virt (which is good for Arm64 machines, but not what is expected for IA VMs), the CD/DVD drive is created as SCSI, and the available CPU types include only Arm64 CPUs, no IA ones.
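The mismatch is visible at the API level too: each node reports its own capabilities, and the WebUI apparently only consults the node serving the page. A sketch using pvesh, with the node names from this thread (the capabilities endpoints exist in recent Proxmox VE releases; behavior on the Arm64 port is an assumption):

```shell
# Each node answers for itself, so in a hybrid cluster these lists differ.
pvesh get /nodes/pve1/capabilities/qemu/machines   # x86_64 node: i440fx/q35 variants
pvesh get /nodes/pve3/capabilities/qemu/machines   # Arm64 node: virt (presumably)
pvesh get /nodes/pve3/capabilities/qemu/cpu        # CPU types known to the target node
```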

This seems like a general design problem of Proxmox, which was not designed to support hybrid clusters and assumes that all nodes have the same physical capabilities; the selectable options in the WebUI are not retrieved from the target node, but are simply provided by whichever node is currently serving the WebUI to the user.

I believe the right place to open this bug is the Proxmox bug tracker, but I'm almost sure they would just close it as an unsupported configuration, since Arm64 support is not official.

@jiangcuo
Owner

Thanks for the feedback.
Architectures like arm/riscv/loongarch use virt as the machine type. To make things easier for users of non-x86_64 architectures, some of the default hardware types have been removed in this port. This can cause anomalies in hybrid clusters.
In the Proxmox VE front end, the API of the serving host is used rather than the API of the target node, so using the x86_64 API to control machines of other architectures is problematic.
Apart from the VM functions, all other APIs are the same, so hybrid clusters require users to understand the differences between the architectures.
As non-x86_64 architectures mature, Proxmox may see a big business opportunity in them and add support.
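Until the front end queries the target node, one workaround is to drive the target node through its own API path instead of the WebUI form. A hedged sketch with hypothetical VMID, name, and storage values (the `POST /nodes/{node}/qemu` endpoint and these parameters exist in the stock Proxmox VE API):

```shell
# Create the VM against the target node's API path; the serving host's UI
# defaults never enter the picture. VMID 103, the name, and "local-lvm"
# are placeholders, not from this thread.
pvesh create /nodes/pve3/qemu --vmid 103 --name arm-test \
    --machine virt --cpu host --scsi0 local-lvm:8 --memory 2048
```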

@jiangcuo jiangcuo added the enhancement New feature or request label Jan 3, 2024
@abufrejoval

> It seems like general design problem of proxmox, which is not designed to support hybrid clusters and assumes that all nodes have the same physical abilities, so the selectable abilities in WebUI are not retrieved from the target node, but instead are just provided by the node who is currently serving WebUI to the user
>
> I believe that the right place to open this bug is in proxmox bug tracker, but I'm almost sure that they will just close it as non-supported configuration since Arm64 support is not official

From what I could gather, jiangcuo based his work on an effort launched at Proxmox some years ago: AFAIK there was an initiative at one point to do a Proxmox port for ARM.

That work was never finished, and part of the reason could have been the can of worms you opened and that I've stumbled across, too. I've been running all kinds of hypervisors on my PCs, starting with the very first VMware 1.0 in 1999, and just managing the broad hardware range on x86, with CPUs ranging from the i80386 to today's >100-core monster chips, is a challenge that KVM and Proxmox really do their best to cover properly today (and do better than others).

I believe I never got AMD<->Intel live migrations working on RHV/oVirt or Xen/XCP-ng, but it's just fine on Proxmox these days.

Just for fun I tried launching a stopped ARM VM on an x86 host yesterday, fully expecting some catastrophic failure. But there was probably only one CPU core twirling on some VM UEFI boot code; it never crashed, stopped, or even beeped.

But I've run ARM VMs just fine with disks from my x86 HCI cluster running Ceph, using an NFS boot ISO also served from x86 machines: that kind of stuff should naturally work, and I hope it will continue to do so.

Throwing ARM, RISC-V and perhaps soon the various Chinese and Russian domestic ISAs into one VM orchestrator won't be an easy extension of a product that simply wasn't imagined to deal with distinct ISAs.

For now I am happy to report that (most) anything that makes sense technically seems to work well. You can share all kinds of storage between architectures and start/stop or generally manage machines, except for the configuration part.

There you have to create or modify x86 VMs using the GUI from an x86 host, and it's the same with ARM. I consider that very acceptable, but in a production environment I'd strongly recommend against mixing machines with different ISAs in a single cluster: that very likely won't be supported by Proxmox for a long time, and I see them making it impossible before they make it possible, simply because that's easier and less error-prone.

The one thing I can't make work for now is live-migration between ARM hosts, for which I'll open an issue next...
