
VMware to KVM Migration NOT Functional with Zone wide Storage #8632

Closed
Tbaugus44 opened this issue Feb 8, 2024 · 9 comments · Fixed by #8815

Comments

@Tbaugus44

ISSUE TYPE
  • Bug Report
COMPONENT NAME
UI, API
CLOUDSTACK VERSION
4.19.0
CONFIGURATION

Advanced Networking Zone

OS / ENVIRONMENT

Dual management and SQL servers, all running Ubuntu 22.04 on KVM. Storage traffic is on a separate NIC.

SUMMARY

We have connected to our external vSphere environment and tried to import a VM. The process starts and clones the VM on the vSphere side, then fails with the error "Error Index 0 out of bounds for length 0".

STEPS TO REPRODUCE
```
Tools > Import-Export Instances > Connect to your VMware external environment > Import Instance > set your name and choose the storage pool.
``` 
EXPECTED RESULTS
The VM is successfully converted and imported into CloudStack.
ACTUAL RESULTS
We get the error "Index 0 out of bounds for length 0". The VM is not converted or migrated.
@Tbaugus44
Author

@alexandremattioli @nvazquez
management-server.log
Attached is the management server log.

@Tbaugus44
Author

Once we added our primary storage pool at the cluster level instead of the zone level, we got this error:

The convert process failed for instance e3d560ec-58e8-4c2e-b04d-af0c6be3bf39 from Vmware to KVM on host ERL-KVM-HOST-01.aigcloud.io: The virt-v2v conversion of the instance e3d560ec-58e8-4c2e-b04d-af0c6be3bf39 failed. Please check the agent logs for the virt-v2v output

management-server1.log

@rohityadavcloud
Member

@sghazra21

I am also facing the same issue while importing. Any update on this, or did you manage to resolve it?

@spdinis

spdinis commented May 7, 2024

Hi, I have been doing a lot of testing around this, and it does indeed seem that if the target storage isn't cluster-wide you get the mentioned error; that needs to be addressed.

There are other issues that I should probably address separately, but just so you know: if your source VMware infrastructure uses distributed switches, you won't be able to migrate unless the conversion host runs a recent EL release, 9.2 or above.

This is because the libvirt VMX parser always expects network labels in the standard-switch format, so anything running libvirt 8.0.0 (Ubuntu 22.04, EL8 and others) won't migrate such a VM successfully.

This is documented here: https://bugzilla.redhat.com/show_bug.cgi?id=1988211 and was fixed in the RHBA-2023:2171 errata, i.e. from 9.2 onwards.
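For illustration, the kind of difference the parser trips over looks roughly like this; the keys and values below are made-up examples of typical VMX entries, not taken from this environment:

```
# standard vSwitch: plain network label, handled by libvirt 8.x
ethernet0.networkName = "VM Network"

# distributed vSwitch: label replaced by dvs.* references, needs libvirt >= 9.x
ethernet0.dvs.switchId = "50 2a 6b ..."
ethernet0.dvs.portgroupId = "dvportgroup-1234"
```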

Ubuntu 24.04 is documented to ship libvirt 9, so I guess it would work, but 4.19 doesn't officially support it.

The other issue I have found is that the migration, as documented and as a by-product of virt-v2v, relies on connectivity from the converter to vCenter and from vCenter to the ESXi management network. There is never a direct connection from the converter to the ESXi host itself, which means that in most cases the migration is painfully slow.

Also, the migration claimed to be online isn't really online, so don't even bother trying it; you will end up with potential issues. Just shut the VM off first.

Finally, the way I managed to make this snappy, and it works very well for us, is to present secondary storage or any other NFS volume to both the converter and the ESXi host. Out of an excess of caution I clone the original VM into the NFS mount after powering it off; that is typically quite fast.
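As a minimal sketch of that setup on the converter host (the NFS server, export and mount point below are placeholders, not from this environment):

```
# On the converter host: mount the NFS export that is also presented to the ESXi host
sudo mkdir -p /mnt/v2v-staging
sudo mount -t nfs nfs-server.example.com:/export/v2v /mnt/v2v-staging

# The powered-off source VM is then cloned into this export from the vSphere side,
# e.g. ending up as /mnt/v2v-staging/myvm-clone/myvm-clone.vmx
```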

Then, on the converter node (Rocky 9.3 in my case), I simply run virt-v2v -i vmx [NFS mount path to the cloned VM's .vmx file] with -o libvirt -of qcow2 -os [target virsh storage pool]. That output mode is useful when the converter is part of an existing cluster: you can point it at the primary storage you want simply by using the pool UUID you see in CloudStack. Alternatively you can use -o local -of qcow2 -os [target mount path on the server], which is useful if the converter isn't part of an existing cluster and isn't aware of the CloudStack pools.
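Put together, the two variants look roughly like this (the .vmx path, pool UUID and output path are placeholders):

```
# Converter is a KVM host in the cluster: write into the libvirt storage pool
# that backs the desired primary storage (use the UUID shown in CloudStack)
virt-v2v -i vmx /mnt/v2v-staging/myvm-clone/myvm-clone.vmx \
    -o libvirt -of qcow2 -os 0a1b2c3d-4e5f-6789-abcd-ef0123456789

# Converter is standalone and not aware of the CloudStack pools: write qcow2 files locally
virt-v2v -i vmx /mnt/v2v-staging/myvm-clone/myvm-clone.vmx \
    -o local -of qcow2 -os /var/lib/v2v-output
```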

The final step for us (it could also be the first) is to create a dummy VM in CloudStack with the exact specs you need, boot it up, shut it down, find where its disks live, and then on the primary storage simply mv the converted disk over the disk created for the dummy instance. Then power it up; you may just have to correct the network afterwards.
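For illustration, the disk swap on an NFS-backed primary storage would look roughly like this; the mount point, volume names and converted file name are hypothetical, so double-check the dummy instance's volume path in CloudStack before overwriting anything:

```
# On a host that mounts the primary storage pool
cd /mnt/<primary-storage-pool-uuid>

# Keep a backup of the dummy instance's ROOT volume, then drop the converted
# disk (virt-v2v -o local typically names it <guest>-sda) in its place
mv <dummy-root-volume-uuid> <dummy-root-volume-uuid>.bak
mv /var/lib/v2v-output/myvm-clone-sda <dummy-root-volume-uuid>
```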

After many, many attempts I found this method to work end to end in about 15 minutes on an average VM with a 30 to 40 GB disk, and it has worked 100% of the time so far with no hassles. The native CloudStack migration just doesn't work for us because of the data path it takes; it is especially penalizing in our environment. If you have firewalls between components, 1G management networks, or a vCenter serving different locations, avoid the native migration for now, or you will be banging your head quite a bit with mixed results.

I will likely open separate issues for some of these findings, since the distributed-switch issue in particular will leave a lot of people stuck, especially if they are an Ubuntu or EL8 shop.

Note that the reason I have put it all together here is that I was fighting multiple issues at the same time, and other people may hit the same problems without being able to isolate each one. It made me lose a lot of time myself, frustrated at not knowing what I was hitting on each attempt. Maybe this saves others a bit.

@weizhouapache
Member

weizhouapache commented May 7, 2024


@spdinis
thanks a lot for sharing

Just for your information, @sureshanaparti is working on an improvement in #8815.

@DaanHoogland
Contributor

@sureshanaparti , will #8815 solve this?

@sureshanaparti
Contributor

@sureshanaparti , will #8815 solve this?

@spdinis @weizhouapache @DaanHoogland #8815 improves the migration performance and provides the flexibility to choose the management server or a KVM host (with ovftool installed) to import the VMware VM files (OVF) into the conversion storage (secondary or primary NFS). virt-v2v then uses the imported OVF for the conversion instead of accessing and converting the VM from vCenter directly (the earlier behavior). Any limitations related to virt-v2v, libvirt, or other libraries/dependencies (unsupported guest OSes, network switches - vSwitch vs dvSwitch, disk controllers/types, network adapter types, etc.) still apply, and migration might fail in such cases.
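For reference, a rough manual equivalent of that flow (not the exact commands CloudStack runs with #8815) would be to export the VM to the conversion storage with ovftool and then point virt-v2v at the exported files; the vCenter locator, credentials and paths below are placeholders:

```
# Export the source VM from vCenter to the NFS conversion storage as OVF/VMDK
ovftool 'vi://vcuser@vcenter.example.com/DC1/vm/myvm' /mnt/conversion-storage/myvm/

# Convert the exported OVF/VMDK locally instead of pulling disks through vCenter
virt-v2v -i ova /mnt/conversion-storage/myvm/ -o local -of qcow2 -os /mnt/conversion-storage/converted/
```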

@sureshanaparti
Contributor

Fixed in #8815
