XenServer / XCP-ng hypervisor resource, Storage (secondary storage mount / system VM deployment) #13292

@BannerSmash

problem

On a fresh deployment with XCP-ng 8.3 hosts, the secondary storage VM and console proxy VM never deploy. They fail and retry in a loop with ascending VM IDs. The management log reports "Could not mount secondary storage <nfs>:/<export>", which is misleading: NFS connectivity, exports, and permissions are all fine (the share mounts manually from both the hosts and the management server without issue).

The real cause is that the host-side XAPI plugin /etc/xapi.d/plugins/cloud-plugin-storage fails to load on XCP-ng 8.3: it runs import lvhdutil at module top level, and lvhdutil no longer exists as an importable module on XCP-ng 8.3. Because the plugin crashes at import time, every storage operation that goes through it fails, which CloudStack surfaces as a mount failure and a "storage unreachable" condition.

The plugin deployed to XCP-ng 8.3 hosts is sourced from the xenserver84 directory:
.../xenserver/xcpserver83/../xenserver84/cloud-plugin-storage.

versions

CloudStack 4.22.0.0
Fully patched XCP-ng 8.3 servers

The steps to reproduce the bug

On a fresh deployment with fully patched XCP-ng 8.3 hosts, the system VMs cycle endlessly with ascending IDs. The management log shows, per attempt:

WARN  [c.c.h.x.r.XcpServer83Resource] callHostPlugin failed for cmd: mountNfsSecondaryStorage
      with args remoteDir: <nfs>:/<export>, localDir: /var/cloud_mount/<uuid>, nfsVersion: null,
      due to Task failed!
WARN  [c.c.h.x.r.Xenserver625StorageProcessor] ... Could not mount secondary storage <nfs>:/<export> ...
ERROR [o.a.c.e.o.VolumeOrchestrator] Unable to create volume [ROOT-xxx] ...
WARN  [c.c.v.ClusteredVirtualMachineManagerImpl] Resource [StoragePool:x] is unreachable ...
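
To confirm that the retry loop is driven by the same failing plugin call each time, the management log can be scanned for the WARN line quoted above. This is only a minimal sketch: the regular expression is derived from the log excerpt in this issue, and the function name count_failed_plugin_calls is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the WARN line quoted in this issue (an assumption:
# other CloudStack versions may word the message differently).
FAILED_CMD = re.compile(r"callHostPlugin failed for cmd: (\w+)")


def count_failed_plugin_calls(lines):
    """Count failed host plugin calls, grouped by plugin command name."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the lines of the management server log should show mountNfsSecondaryStorage dominating the failures if this is the issue being hit.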

The underlying XAPI task error reveals the actual failure (host-side Python traceback):

errorInfo: [XENAPI_PLUGIN_FAILURE, non-zero exit, , Traceback (most recent call last):
  File "/etc/xapi.d/plugins/cloud-plugin-storage", line 34, in <module>
    import lvhdutil
ModuleNotFoundError: No module named 'lvhdutil'
]

On the XCP-ng 8.3 host, the lvhdutil source module is absent. Only a stale orphan bytecode file remains in __pycache__, and Python 3 does not import a __pycache__ bytecode file when its source file is missing:

# find /opt/xensource/sm /usr/lib/python* -name 'lvhdutil*'
/opt/xensource/sm/__pycache__/lvhdutil.cpython-36.pyc

# ls /opt/xensource/sm/ | grep -i lvhd
(no source .py present)

Note that the other imports just above the import lvhdutil line (SR, VDI, SRCommand, util, lvutil, vhdutil) succeed, which confirms that /opt/xensource/sm is on the path and that lvhdutil specifically has been removed or refactored out in XCP-ng 8.3.
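
The same check can be scripted on a host. The sketch below asks Python's import system which of the plugin's imports it can locate once /opt/xensource/sm is on the path. The module list is taken from the imports named above; the function name missing_modules is made up for illustration. It only locates modules and does not execute them, so a module that exists but fails while importing would not be reported.

```python
import importlib.util
import sys

# Module names taken from the plugin imports listed in this issue.
PLUGIN_IMPORTS = ["SR", "VDI", "SRCommand", "util", "lvutil", "vhdutil", "lvhdutil"]


def missing_modules(names, extra_path=None):
    """Return the names that the import system cannot locate."""
    saved_path = list(sys.path)
    if extra_path:
        sys.path.insert(0, extra_path)
    try:
        missing = []
        for name in names:
            try:
                found = importlib.util.find_spec(name) is not None
            except (ImportError, ValueError):
                found = False
            if not found:
                missing.append(name)
        return missing
    finally:
        # Restore sys.path so the check has no side effects.
        sys.path[:] = saved_path


if __name__ == "__main__":
    print(missing_modules(PLUGIN_IMPORTS, "/opt/xensource/sm"))
```

On an affected host this would be expected to list lvhdutil only, matching the traceback.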

What to do about it?

Make the import non-fatal in
/usr/share/cloudstack-common/scripts/vm/hypervisor/xenserver/xenserver84/cloud-plugin-storage
on the management server, then push the patched plugin to the hosts (the management server only copies host plugins on host add, not on reconnect/restart, so a manual push or host re-add is required):

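import vhdutil
import shutil
# lvhdutil is missing on XCP-ng 8.3; fall back to None so the plugin still loads.
# Any code path that actually uses lvhdutil will still fail if it is reached.
try:
    import lvhdutil
except ImportError:
    lvhdutil = None
import errno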

After this, the NFS mount succeeds and the system VMs deploy normally.

OPEN QUESTION / SUGGESTED FIX

The guard above unblocks NFS and basic operation, but it does not restore any plugin code path that genuinely needs lvhdutil (relevant to LVM/iSCSI primary storage operations such as snapshots and template-from-volume). The proper fix is likely to import lvhdutil from its new location in XCP-ng 8.3's refactored sm stack (or call its replacement), rather than guarding it to None. Guidance on the correct module/path for 8.3 would be appreciated.
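
Until the correct module is known, one way to keep the guarded import from failing obscurely is to check for the module at the point of use. With lvhdutil set to None, a call such as lvhdutil.something would otherwise raise an AttributeError on NoneType. The sketch below is an assumption about how this could be handled, not the plugin's actual code; require_module and MissingDependencyError are hypothetical names.

```python
# Same guard as in the workaround: the plugin loads even without lvhdutil.
try:
    import lvhdutil
except ImportError:
    lvhdutil = None


class MissingDependencyError(RuntimeError):
    """Raised when an operation needs a module that failed to import."""


def require_module(module, name, operation):
    """Return module, or raise a descriptive error if it was guarded to None."""
    if module is None:
        raise MissingDependencyError(
            "%s requires the '%s' module, which is not available on this host"
            % (operation, name)
        )
    return module
```

A code path that needs the module would then start with something like lvhd = require_module(lvhdutil, "lvhdutil", "snapshot"), so the XAPI task error names the missing dependency instead of a NoneType attribute.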

RELATED

Possibly related to discussion #11346 (4.20.1 + XCP-ng 8.3 system VMs failing to deploy), though that case was resolved as an NFSv4 negotiation issue on the storage server. This issue has a distinct root cause (plugin import failure).
