
flashcache + drbd + two node kvm cluster -> VM image corruption after live migration #54

Closed
MvdLande opened this Issue Feb 24, 2012 · 4 comments


Hello,

I'm testing a configuration with flashcache on a two-node KVM cluster. I have set up two-node clusters like this before, without flashcache, without any trouble. When I use flashcache, I experience VM image file corruption after virsh live migration: for some reason the data on the two KVM hosts is not identical. I have tested both write-through and write-back mode with the same result.

Both the flash cache and the backing disk store use disk partitions; could this be a problem?

Problem description:
When using flashcache + drbd in the configuration described below, I get a corrupted KVM virtual machine image file after virsh live migration. The corruption shows up after repeatedly live-migrating the VM between the two host servers. The guest OS is Windows 7 x86, and I'm using the latest virtio-win drivers. Without flashcache I get no corrupted VM image file (tested with 2000+ live migrations of the same VM).

Host OS = CentOS 6.2 x86_64
The migration is started with the following command:
# virsh migrate --live --verbose test qemu+ssh://vmhost2a.vdl-fittings.local/system
And back again after 5 minutes:
# virsh migrate --live --verbose test qemu+ssh://vmhost2b.vdl-fittings.local/system
(I used cron to automate the task so the VM is live migrated every 5 minutes)
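For reference, a minimal sketch of what such a cron-driven 5-minute ping-pong could look like; the actual crontab entries are not shown in the issue, so the schedule offsets below are an assumption (passwordless ssh between the hosts is also assumed):

# root's crontab on vmhost2b (hypothetical): push the VM to vmhost2a on every 10th minute
0,10,20,30,40,50 * * * * virsh migrate --live test qemu+ssh://vmhost2a.vdl-fittings.local/system
# root's crontab on vmhost2a (hypothetical): push it back five minutes later
5,15,25,35,45,55 * * * * virsh migrate --live test qemu+ssh://vmhost2b.vdl-fittings.local/system

Whichever host does not currently run the VM simply gets a failed migrate command, which is harmless.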

Symptom:
The VM image corruption was noticed because Windows began complaining about unreadable/corrupted system files, and after rebooting the VM, Windows disk check detected numerous errors. It looks like the cached data is not the same on both servers, even though drbd is configured on top of the flashcache device, so all drbd data should pass through the cache. Static files also became unreadable after a while (when I tried to open a folder with some files in it, the folder turned out to be corrupted).
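One way to confirm that the two replicas really have diverged is DRBD's online verify, which the configuration below already enables via verify-alg. A sketch of the standard 8.3 procedure (not part of the original report):

# drbdadm verify VMstore1
(wait for the verify run to finish, then check the kernel log for out-of-sync sectors)
# dmesg | grep -i "out of sync"
(a disconnect/reconnect cycle resynchronizes any blocks the verify marked out of sync)
# drbdadm disconnect VMstore1 && drbdadm connect VMstore1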

Configuration:
The two host servers use GFS2 as the clustered file system hosting the image files (this works fine without flashcache).

I have the following disk setup:
/dev/sda1 raid1 SSD array (200GB) (using an Adaptec 6805 controller)
/dev/sdc1 raid5 HD array (1.5TB) (using another Adaptec 6805 controller)
/dev/mapper/cachedev the flashcache device

As for flashcache, I have used two setups:

  1. flashcache in write-through mode:

     /sbin/flashcache_create -p thru -b 16k cachedev /dev/sda1 /dev/sdc1

  2. flashcache in write-back mode:

     /sbin/flashcache_create -p back -b 16k cachedev /dev/sda1 /dev/sdc1

Both experienced the same VM image corruption.
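To rule out the cache device itself, the cachedev mapping and its counters can be inspected through device-mapper; a quick sketch (the exact /proc layout varies between flashcache versions):

# dmsetup table cachedev     (shows the ssd/disk pairing and the cache mode)
# dmsetup status cachedev    (cache statistics; in write-back mode this includes dirty block counts)
# ls /proc/flashcache/       (recent flashcache releases also expose per-cache stats here)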

The disk /dev/drbd0 is mounted on /VM
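The GFS2 layer is only described in prose, so here is a rough sketch of how /dev/drbd0 would typically be formatted and mounted on both nodes; the cluster name ("vmcluster") and the journal count are assumptions, not values from the issue:

# mkfs.gfs2 -p lock_dlm -t vmcluster:VMstore1 -j 2 /dev/drbd0    (run once, on one node)
# mount -t gfs2 /dev/drbd0 /VM                                   (on both nodes; cman/dlm must already be running)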

My drbd setup is as follows (using drbd 8.3.12):

------------------------------------------------------------------------

# Just for testing I kept all configuration in one file
#include "drbd.d/global_common.conf";
#include "drbd.d/*.res";

global {
    minor-count 64;
    usage-count yes;
}

common {
    syncer {
        rate       110M;
        verify-alg sha1;
        csums-alg  sha1;
        al-extents 3733;
        cpu-mask   3;
    }
}

resource VMstore1 {
    protocol C;

    startup {
        wfc-timeout      1800;  # 30 min
        degr-wfc-timeout 120;   # 2 minutes
        wait-after-sb;
        become-primary-on both;
    }

    disk {
        no-disk-barrier;
        no-disk-flushes;
    }

    net {
        max-buffers    8000;
        max-epoch-size 8000;
        sndbuf-size    0;
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }

    syncer {
        cpu-mask 3;
    }

    on vmhost2a.vdl-fittings.local {
        device    /dev/drbd0;
        disk      /dev/mapper/cachedev;
        address   192.168.100.3:7788;
        meta-disk internal;
    }

    on vmhost2b.vdl-fittings.local {
        device    /dev/drbd0;
        disk      /dev/mapper/cachedev;
        address   192.168.100.4:7788;
        meta-disk internal;
    }
}

------------------------------------------------------------------------
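For completeness, the usual bring-up sequence for a dual-primary resource like VMstore1 above would look roughly as follows (a sketch; these steps are not part of the original report):

# drbdadm create-md VMstore1                               (both nodes, once)
# drbdadm up VMstore1                                      (both nodes)
# drbdadm -- --overwrite-data-of-peer primary VMstore1     (one node only, to start the initial sync)
# drbdadm primary VMstore1                                 (the other node, once the initial sync has finished)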

Cluster configuration: (no fence devices)

------------------------------------------------------------------------

(the cluster.conf contents appear to have been stripped, like the XML dump further below)

------------------------------------------------------------------------

Flashcache version

------------------------------------------------------------------------

[root@vmhost2a ~]# modinfo flashcache
filename: /lib/modules/2.6.32-220.4.2.el6.x86_64/weak-updates/flashcache/flashcache.ko
license: GPL
author: Mohan - based on code by Ming
description: device-mapper Facebook flash cache target
srcversion: E1A5D9AA620A2EACC9FA891
depends: dm-mod
vermagic: 2.6.32-220.el6.x86_64 SMP mod_unload modversions

------------------------------------------------------------------------

libvirt xmldump VM configuration

------------------------------------------------------------------------

(Most of the XML has been stripped by the issue renderer. Recoverable fragments: domain name "test", UUID a84df1cb-668f-cf5b-7433-91f99ac23971, memory 2097152, 4 vCPUs, type hvm, CPU model Penryn (Intel), <on_poweroff>destroy</on_poweroff>, <on_reboot>restart</on_reboot>, <on_crash>restart</on_crash>, emulator /usr/libexec/qemu-kvm.)

------------------------------------------------------------------------

Hmm, all XML data has been stripped...

Contributor

mohans commented Mar 3, 2012

We use flashcache extensively in production (for more than a year now) and have not run into any corruption issues. But we don't use drbd or kvm at all.

So I haven't tested this setup, but I seem to recall that you need to run DRBD 8.4+ to support multiple block devices per resource. Specifically read the first change log in the 8.4.1 section. http://git.drbd.org/?p=drbd-8.4.git;a=blob;f=ChangeLog;hb=HEAD Perhaps try upgrading drbd and follow some of the applicable information here https://plus.google.com/110443614427234590648/posts/56UbDMW1ktz to see if it fixes the issue.
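Before deciding on an upgrade it is worth double-checking which DRBD version is actually loaded on the hosts, for example:

# cat /proc/drbd                  (the first line reports the running module version, e.g. "version: 8.3.12")
# modinfo drbd | grep ^version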

I'm considering a similar flashcache+drbd+kvm setup myself (with storage exported via NFS), but don't yet have the hardware to test it on, so I would be interested to hear your findings.

> I'm considering a similar flashcache+drbd+kvm setup myself (with storage exported via NFS), but don't yet have the hardware to test it on.

It's good to know that I'm not the only one who would like to use flashcache with drbd.

> So I haven't tested this setup, but I seem to recall that you need to run DRBD 8.4+ to support multiple block devices per resource. Specifically read the first change log in the 8.4.1 section. http://git.drbd.org/?p=drbd-8.4.git;a=blob;f=ChangeLog;hb=HEAD Perhaps try upgrading drbd and follow some of the applicable information here https://plus.google.com/110443614427234590648/posts/56UbDMW1ktz to see if it fixes the issue.

Thanks for the information. It would be great if we got this setup to work!
I'll wait for drbd 8.4.2 before testing again. I have run drbd 8.4.1 without flashcache, but it was a bit slow, so I'm currently using drbd 8.3.12.
(I also turned the test cluster into a production cluster (without flashcache), so I have to build a new test cluster first.)

If you are using a RAID controller for the SSDs, don't use the Adaptec 6000 series (6805 etc.). They tend to lose an SSD array, and then you end up with a read-only filesystem and a crashed server. (And don't expect any help from Adaptec.) I'm replacing the Adaptec RAID controllers with HP P410 controllers; I have had no issues with those (but for RAID6 you have to buy an extra license).

> I would be interested to hear your findings.

I'll post a message when I start testing on drbd 8.4.2.

This issue was closed.
