Commits
Commits on Jul 30, 2015
-
COLO: Add some statistics for number of pages transferred
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO: Save part of dirty pages to slave during the wait time of checkpoint
We can send part of the dirty pages to the slave during the checkpoint wait time, where previously we just slept. In this way, we reduce the VM's pause time when doing a checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
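The idea can be sketched as follows (a toy model; the function and variable names are hypothetical, and the real QEMU code drives this from its migration thread and clocks):

```c
#include <stdbool.h>

/* Hypothetical model: 'dirty_pages' counts pages dirtied since the
 * last checkpoint; each send_one_page() call transfers one of them. */
static int dirty_pages = 1000;
static int sent_during_wait = 0;

static bool checkpoint_timer_expired(int elapsed_ms, int period_ms)
{
    return elapsed_ms >= period_ms;
}

static void send_one_page(void)
{
    dirty_pages--;
    sent_during_wait++;
}

/* Instead of sleeping for the whole inter-checkpoint interval, keep
 * pushing dirty pages until the timer fires; whatever is sent here no
 * longer has to be transferred inside the VM pause window. */
static int wait_and_presend(int period_ms)
{
    int elapsed_ms = 0;
    while (!checkpoint_timer_expired(elapsed_ms, period_ms)) {
        if (dirty_pages > 0) {
            send_one_page();
        }
        elapsed_ms++; /* stands in for real clock progress */
    }
    return sent_during_wait;
}
```

The pause-time win comes purely from shrinking the set of pages left to copy while the VM is stopped.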
-
COLO: Move the starting position of the RAM save/load process
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
arch_init: Change the return value of ram_save_complete
Let ram_save_complete return the number of pages that have been sent, just as ram_save_iterate does. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO: Separate ram and device save/load process
We separate the processes of saving/loading RAM and device state when doing a checkpoint, and add new helpers for saving/loading RAM and devices. With this change, we can transfer RAM directly from master to slave without using a QEMUSizedBuffer as an intermediary, which also reduces the amount of extra memory used during a checkpoint. Besides, we move colo_flush_ram_cache to the proper position after the above change. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
savevm: Split load vm state function qemu_loadvm_state
qemu_loadvm_state is too long; we can simplify it by splitting it into three helper functions. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO: Expose statistics information of checkpoint to user
You can get some checkpoint statistics by using the QMP command or the HMP command 'info migrate'. Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO: Add some statistics information for checkpoint
The statistics include: the total checkpoint count, the count of checkpoints triggered because the proxy found net packets inconsistent, the periodic checkpoint count, and also the VM's downtime during checkpoints. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
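A minimal sketch of what these counters could look like (field and function names here are illustrative assumptions, not the actual QEMU struct):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of the counters the commit describes. */
typedef struct {
    uint64_t checkpoint_count;      /* total checkpoints taken         */
    uint64_t proxy_triggered_count; /* due to inconsistent net packets */
    uint64_t periodic_count;        /* due to the periodic timer       */
    uint64_t total_downtime_ms;     /* cumulative VM pause time        */
} COLOStats;

static COLOStats stats;

/* Called once per completed checkpoint to update the counters. */
static void account_checkpoint(bool from_proxy, uint64_t downtime_ms)
{
    stats.checkpoint_count++;
    if (from_proxy) {
        stats.proxy_triggered_count++;
    } else {
        stats.periodic_count++;
    }
    stats.total_downtime_ms += downtime_ms;
}
```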
-
COLO: Add block replication into colo process
Make sure the master starts block replication only after the slave's block replication has started. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO: Implement shutdown checkpoint
For the Secondary VM, we forbid shutting it down directly when in COLO mode. For the Primary VM's shutdown, we should do some work to ensure consistent behavior between PVM and SVM. Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO NIC: Implement NIC checkpoint and failover
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO: Add colo-set-checkpoint-period command
With this command, we can control the checkpoint period used when there is no comparison of net packets. Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO: Improve checkpoint efficiency by doing additional periodic checkpoints
Besides the normal checkpoints triggered by the result of net packet comparison, we do additional checkpoints periodically. This reduces the number of dirty pages per checkpoint when we have not checkpointed for a long time (a special case where the net packets are always consistent). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
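Combined with the compare-triggered rule and the minimum-interval limit from the patch below, the trigger decision can be sketched like this (a hypothetical helper, not the actual QEMU code):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical trigger logic: a checkpoint is normally driven by a
 * packet-comparison mismatch, but if the packets stay consistent for
 * 'period_ms' we force one anyway so the dirty-page set per checkpoint
 * stays small. A minimum gap gives the VM a chance to run. */
static bool need_checkpoint(bool packets_inconsistent,
                            uint64_t now_ms, uint64_t last_ms,
                            uint64_t min_gap_ms, uint64_t period_ms)
{
    if (now_ms - last_ms < min_gap_ms) {
        return false;                     /* too soon: let the VM run  */
    }
    if (packets_inconsistent) {
        return true;                      /* normal, compare-triggered */
    }
    return now_ms - last_ms >= period_ms; /* periodic fallback         */
}
```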
-
COLO: Do checkpoint according to the result of packet comparison
Only do a checkpoint when the PVM's and SVM's output net packets are inconsistent. We also limit the minimum time between two consecutive checkpoints, to give the VM a chance to run. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO: Handle nfnetlink message from proxy module
The proxy module sends messages to qemu through nfnetlink. For now, the message only contains the result of packet comparison. We use a global variable 'packet_compare_different' to store the result; this variable must be accessed through the atomic helpers, such as 'atomic_set' and 'atomic_xchg'. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
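A sketch of the shared flag using C11 atomics (QEMU's own atomic_set/atomic_xchg macros play the same roles; the function names here are assumptions):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* The nfnetlink handler sets the flag; the checkpoint loop consumes
 * it with an exchange, so each reported mismatch triggers at most one
 * checkpoint even if both threads race. */
static atomic_bool packet_compare_different;

/* called from the netlink receive path */
static void proxy_report_mismatch(void)
{
    atomic_store(&packet_compare_different, true);
}

/* called from the checkpoint loop; true at most once per report */
static bool consume_mismatch(void)
{
    return atomic_exchange(&packet_compare_different, false);
}
```

Using an exchange rather than a plain load+store closes the window where a mismatch reported between the load and the store would be lost.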
-
COLO NIC: Some init work related with proxy module
Implement the communication protocol with the proxy module by using nfnetlink, which requires the libnfnetlink library. Tell the proxy module to do its initialization work and, moreover, ask the kernel to acknowledge the request. This is necessary for the first message because Netlink is not a reliable protocol. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO NIC: Implement colo nic init/destroy function
When in colo mode, call the colo nic init/destroy functions. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
colo-nic: Handle secondary VM's original net device configuration
For the secondary VM, we need to reconfigure its original net devices. Before entering COLO mode, we detach its original net devices (here, tap) from their default configuration (here, a bridge) and attach them to the forward bridge. When exiting COLO mode, we restore the original configuration. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO NIC: Implement colo nic device interface configure()
Implement the colo nic device interface configure(), and add a script to configure nic devices: ${QEMU_SCRIPT_DIR}/colo-proxy-script.sh Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
tap: Make launch_script() public
We also change the parameters of launch_script(). Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO NIC: Init/remove colo nic devices when adding/cleaning up tap devices
When entering COLO mode, we need to do some init work for all of the VM's nics. Here we use a list to record these nics; for now we only support the 'tap' netdev backend. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO: Add new command parameters 'forward_nic' and 'colo_script' for net
'forward_nic' should be assigned a network interface name, for example 'eth2'; it will be passed as a parameter to 'colo_script'. 'colo_script' should be assigned a script path. We parse these parameters in tap. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO failover: Don't do failover during loading VM's state
We should not do failover work while the main thread is loading the VM's state; otherwise it would destroy the consistency of the VM's memory and device state. Here we add a new failover status, 'RELAUNCH', which means we should relaunch the failover process. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
qmp event: Add event notification for COLO error
If errors happen during the VM's COLO FT stage, it's important to notify the users of this event. Together with 'colo_lost_heartbeat', users can intervene in COLO's failover work immediately. Even if users don't want to get involved in the failover verdict, it is still necessary to notify them that we have exited COLO mode. Cc: Markus Armbruster <armbru@redhat.com> Cc: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO failover: Implement COLO primary/secondary vm failover work
If errors happen, we give users (administrators) time to get involved in the failover verdict; they can decide which side should take over the work by using the 'colo_lost_heartbeat' command. Note: the default verdict is that the primary VM takes over while the secondary VM exits. So if users choose the secondary VM to take over, they must make sure the Primary VM is dead, or there will be a 'split-brain' problem. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
COLO failover: Introduce state to record failover process
When handling failover, we do different things depending on the stage of the failover process, so we introduce a global atomic variable to record the failover status. We add four failover statuses to indicate the different stages of the failover process. Use the helpers to get and set the value. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
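The shape of this could look as follows (the status names are illustrative assumptions; the commit names four statuses but not which ones):

```c
#include <stdatomic.h>

/* Illustrative status set, not the real names from the patch. */
typedef enum {
    FAILOVER_NONE,        /* no failover in flight        */
    FAILOVER_REQUESTED,   /* user ran colo_lost_heartbeat */
    FAILOVER_IN_PROGRESS, /* tearing down COLO state      */
    FAILOVER_COMPLETED,   /* one side has taken over      */
} FailoverStatus;

static _Atomic int failover_state = FAILOVER_NONE;

/* All access goes through these helpers, as the commit requires,
 * since the status is read and written from different threads. */
static void failover_set_state(FailoverStatus s)
{
    atomic_store(&failover_state, (int)s);
}

static FailoverStatus failover_get_state(void)
{
    return (FailoverStatus)atomic_load(&failover_state);
}
```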
-
COLO failover: Introduce a new command to trigger a failover
We let users use whatever heartbeat solution they want; if the heartbeat is lost, or they detect other errors, they can use the command 'colo_lost_heartbeat' to tell COLO to fail over, and COLO will act accordingly. For example, if the command is sent to the PVM, the Primary will exit COLO mode and take over; if sent to the Secondary, the Secondary will do the failover work and finally take over service. Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com>
-
COLO RAM: Flush cached RAM into SVM's memory
While the VM runs, the PVM/SVM may dirty some pages; we transfer the PVM's dirty pages to the SVM and store them into the SVM's RAM cache at the next checkpoint, so the content of the SVM's RAM cache is always the same as the PVM's memory after a checkpoint. Instead of flushing the entire RAM cache into the SVM's memory, we do this in a more efficient way: only flush pages dirtied by the PVM or SVM since the last checkpoint. In this way, we ensure the SVM's memory is the same as the PVM's. Besides, we must flush the RAM cache before loading the device state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
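The selective flush amounts to OR-ing the two dirty bitmaps and copying only the marked pages. A toy sketch (a 64-page address space with one bit per page; names and layout are illustrative):

```c
#include <stdint.h>

#define NPAGES 64 /* toy address space: one bit per page */

/* Pages dirtied by the PVM (arriving in the RAM cache) and pages the
 * SVM itself dirtied since the last checkpoint. */
static uint64_t pvm_dirty, svm_dirty;
static int pages_flushed;

static void flush_page(int page) /* copy one cache page into SVM RAM */
{
    (void)page;
    pages_flushed++;
}

/* Flush only pages dirtied by either side since the last checkpoint,
 * rather than copying the entire RAM cache. */
static int colo_flush_ram_cache(void)
{
    uint64_t todo = pvm_dirty | svm_dirty;
    pages_flushed = 0;
    for (int i = 0; i < NPAGES; i++) {
        if (todo & (1ULL << i)) {
            flush_page(i);
        }
    }
    pvm_dirty = svm_dirty = 0; /* start clean for the next round */
    return pages_flushed;
}
```

SVM-dirtied pages must be included because the SVM ran speculatively since the last checkpoint, and its writes have to be overwritten with the PVM's authoritative copy.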
-
arch_init: Start to trace dirty pages of SVM
We will use this dirty bitmap together with the VM's RAM-cache dirty bitmap to decide which pages in the cache should be flushed into the VM's RAM. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
-
COLO VMstate: Load VM state into qsb before restore it
We should not destroy the secondary's state until we have received the whole state from the primary, in case the primary fails in the middle of sending it; so here we cache the device state on the Secondary before restoring it. Besides, we should call qemu_system_reset() before loading the VM state, which ensures the data is intact. Note: if we drop qemu_system_reset(), odd errors appear. For example, qemu on the slave side crashes and reports:
KVM: entry failed, hardware error 0x7
EAX=00000000 EBX=0000e000 ECX=00009578 EDX=0000434f
ESI=0000fc10 EDI=0000434f EBP=00000000 ESP=00001fca
EIP=00009594 EFL=00010246 [---Z-P-] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0040 00000400 0000ffff 00009300
CS =f000 000f0000 0000ffff 00009b00
SS =434f 000434f0 0000ffff 00009300
DS =434f 000434f0 0000ffff 00009300
FS =0000 00000000 0000ffff 00009300
GS =0000 00000000 0000ffff 00009300
LDT=0000 00000000 0000ffff 00008200
TR =0000 00000000 0000ffff 00008b00
GDT=     0002dcc8 00000047
IDT=     00000000 0000ffff
CR0=00000010 CR2=ffffffff CR3=00000000 CR4=00000000
DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
DR6=00000000ffff0ff0 DR7=0000000000000400
EFER=0000000000000000
Code=c0 74 0f 66 b9 78 95 00 00 66 31 d2 66 31 c0 e9 47 e0 fb 90 <f3> 90 fa fc 66 c3 66 53 66 89 c3 66 e8 9d e8 ff ff 66 01 c3 66 89 d8 66 e8 40 e9 ff ff 66
ERROR: invalid runstate transition: 'internal-error' -> 'colo'
The reason is that some device state is skipped when saving device state to the slave if the corresponding data is at its initial value, such as 0. But the corresponding state on the slave may not be at its initial value, so after a checkpoint loop the device state on the two sides becomes inconsistent. This happens when the PVM reboots or the SVM runs ahead of the PVM during startup. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-
COLO RAM: Load PVM's dirty page into SVM's RAM cache temporarily
The RAM cache starts out identical to the SVM/PVM's memory. At a checkpoint, we cache the PVM's dirty RAM into the RAM cache on the slave (so that the RAM cache is always the same as the PVM's memory at every checkpoint); we flush the cached RAM to the SVM after we receive all of the PVM's vmstate (RAM/device). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
-
COLO: Save VM state to slave when doing a checkpoint
We should save the PVM's RAM/device state to the slave when needed. For the VM state, we cache it on the slave using a QEMUSizedBuffer. We need to know the size of the VM state, so on the master we use a qsb to store the VM state temporarily, and then migrate the data to the slave. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
-
QEMUSizedBuffer: Introduce two help functions for qsb
Introduce two new QEMUSizedBuffer APIs which will be used by COLO to buffer VM state. One is qsb_put_buffer(), which puts the content of a given QEMUSizedBuffer into a QEMUFile; this is used to send buffered VM state to the secondary. The other is qsb_fill_buffer(), which reads 'size' bytes of data from the file into the qsb; this is used to receive VM state from the socket into a buffer. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-
COLO: Add a new RunState RUN_STATE_COLO
The guest enters this state when paused to save/restore VM state under a COLO checkpoint. Cc: Eric Blake <eblake@redhat.com> Cc: Markus Armbruster <armbru@redhat.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-
COLO: Implement colo checkpoint protocol
We need a user-defined communication protocol to control the checkpoint process. A new checkpoint request is started by the Primary VM, and the interaction proceeds as below:
Checkpoint synchronizing points:
             Primary               Secondary
  NEW        @
                                   Suspend
  SUSPENDED  @
             Suspend&Save state
  SEND       @
             Send state            Receive state
  RECEIVED   @
             Flush network         Load state
  LOADED     @
             Resume                Resume
                                   Start Comparing
NOTE:
1) '@' marks who sends the message.
2) Every sync-point is synchronized by the two sides with only one handshake (single direction) for low latency. If stricter synchronization is required, an opposite-direction sync-point should be added.
3) Since sync-points are single-direction, the remote side may have gone far ahead by the time this side receives the sync-point.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
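The five sync-points above form a fixed cycle, so a receiver can sanity-check incoming messages with a sketch like this (a hypothetical helper; the real patch's message names and checks may differ):

```c
#include <stdbool.h>

/* The five synchronization messages, in the order the commit lists
 * them; one checkpoint is one pass through this sequence, each step
 * a single-direction handshake. */
typedef enum {
    MSG_NEW,
    MSG_SUSPENDED,
    MSG_SEND,
    MSG_RECEIVED,
    MSG_LOADED,
} ColoMessage;

/* Each message must be the successor of the previous one, wrapping
 * back to MSG_NEW when the next checkpoint begins. */
static bool colo_expected_next(ColoMessage cur, ColoMessage next)
{
    if (cur == MSG_LOADED) {
        return next == MSG_NEW;
    }
    return next == (ColoMessage)(cur + 1);
}
```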