@@ -35,7 +35,7 @@ To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY
feature on the virtio-net interface and assign the same MAC address to both
virtio-net and VF interfaces.

- Here is an example XML snippet that shows such configuration.
+ Here is an example libvirt XML snippet that shows such configuration:
::

<interface type='network'>
@@ -45,18 +45,32 @@ Here is an example XML snippet that shows such configuration.
<model type='virtio'/>
<driver name='vhost' queues='4'/>
<link state='down'/>
- <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
+ <teaming type='persistent'/>
+ <alias name='ua-backup0'/>
</interface>
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:00:12:53'/>
<source>
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
</source>
- <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+ <teaming type='transient' persistent='ua-backup0'/>
</interface>

+ In this configuration, the first device definition is for the virtio-net
+ interface, which acts as the 'persistent' device, indicating that this
+ interface will always be plugged in. This is specified by the 'teaming' tag
+ with its required attribute 'type' set to 'persistent'. The link state for the
+ virtio-net device is set to 'down' to ensure that the 'failover' netdev prefers
+ the VF passthrough device for normal communication. The virtio-net device will
+ be brought UP during live migration to allow uninterrupted communication.
+
+ The second device definition is for the VF passthrough interface. Here the
+ 'teaming' tag is given type 'transient', indicating that this device may
+ periodically be unplugged. A second attribute, 'persistent', points to the
+ alias name declared for the virtio-net device.
+
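+ The same pairing can be expressed when launching QEMU directly, without
+ libvirt, using the 'failover' property of the virtio-net device and the
+ 'failover_pair_id' property of the VF (a minimal sketch; the netdev id and
+ tap setup are illustrative):
+
+ ::
+
+   -netdev tap,id=hostnet0 \
+   -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:00:12:53,failover=on \
+   -device vfio-pci,host=0000:42:02.5,id=hostdev0,failover_pair_id=net0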

Booting a VM with the above configuration will result in the following 3
- netdevs created in the VM.
+ interfaces created in the VM:
::

4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
@@ -65,13 +79,36 @@ netdevs created in the VM.
valid_lft 42482sec preferred_lft 42482sec
inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
valid_lft forever preferred_lft forever
- 5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000
+ 5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff

- ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave
- 'standby' and 'primary' netdevs respectively.
+ Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby'
+ virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface.
+
+ One point to note here is that some user space network configuration daemons,
+ such as systemd-networkd and ifupdown, do not understand the 'net_failover'
+ device; on the first boot, the VM might end up with both the 'failover' device
+ and the VF acquiring IP addresses (either the same or different ones) from the
+ DHCP server, which results in a loss of connectivity to the VM. These daemons
+ may therefore need tweaks to make sure that an IP address is acquired only on
+ the 'failover' device.
+
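+ With systemd-networkd, for example, this can be arranged by requesting DHCP
+ only on the failover master and marking the two slaves as unmanaged (a minimal
+ sketch; the file names are illustrative and the interface names are taken from
+ the guest example above):
+
+ ::
+
+   # /etc/systemd/network/01-failover-master.network
+   [Match]
+   Name=ens10
+
+   [Network]
+   DHCP=yes
+
+   # /etc/systemd/network/02-failover-slaves.network
+   [Match]
+   Name=ens10nsby ens11
+
+   [Link]
+   Unmanaged=yes
+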
+ Below is the patch snippet used with the 'cloud-ifupdown-helper' script found
+ on Debian cloud images; it skips configuration generation for any interface
+ that is already enslaved to a master device:
+
+ ::
+
+ @@ -27,6 +27,8 @@ do_setup() {
+      local working="$cfgdir/.$INTERFACE"
+      local final="$cfgdir/$INTERFACE"
+
+ +    if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi
+ +
+      if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then
+          # interface is already known to ifupdown, no need to generate cfg
+          log "Skipping configuration generation for $INTERFACE"
+

Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
==================================================================
@@ -80,40 +117,68 @@ net_failover also enables hypervisor controlled live migration to be supported
with VMs that have direct attached SR-IOV VF devices by automatic failover to
the paravirtual datapath when the VF is unplugged.

- Here is a sample script that shows the steps to initiate live migration on
- the source hypervisor.
+ Here is a sample script that shows the steps to initiate live migration from
+ the source hypervisor. Note: it is assumed that the VM is connected to a
+ software bridge 'br0' which has a single VF attached to it, along with the
+ vnet device of the VM. This is not the VF that is passed through to the VM
+ (the one seen in the vf.xml file).
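+
+ A minimal sketch of that assumed bridge setup on the source hypervisor (the
+ names follow the script below; attaching the vnet device to 'br0' is handled
+ by libvirt when the VM is started):
+
+ ::
+
+   ip link add name br0 type bridge
+   ip link set br0 up
+   ip link set ens6v1 master br0
+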
::

- # cat vf_xml
+ # cat vf.xml
<interface type='hostdev' managed='yes'>
<mac address='52:54:00:00:12:53'/>
<source>
<address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
</source>
- <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
+ <teaming type='transient' persistent='ua-backup0'/>
</interface>

- # Source Hypervisor
+ # Source Hypervisor migrate.sh
#!/bin/bash

- DOMAIN=fedora27-tap01
- PF=enp66s0f0
- VF_NUM=5
- TAP_IF=tap01
- VF_XML=
+ DOMAIN=vm-01
+ PF=ens6np0
+ VF=ens6v1 # VF attached to the bridge.
+ VF_NUM=1
+ TAP_IF=vmtap01 # virtio-net interface in the VM.
+ VF_XML=vf.xml

MAC=52:54:00:00:12:53
ZERO_MAC=00:00:00:00:00:00

+ # Set the virtio-net interface up.
virsh domif-setlink $DOMAIN $TAP_IF up
- bridge fdb del $MAC dev $PF master
- virsh detach-device $DOMAIN $VF_XML
+
+ # Remove the VF that was passed through to the VM.
+ virsh detach-device --live --config $DOMAIN $VF_XML
+
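+ # Clear the administratively set MAC of the now-detached VF.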
ip link set $PF vf $VF_NUM mac $ZERO_MAC

- virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system
+ # Add FDB entries so that traffic keeps flowing to the VM via
+ # the VF -> br0 -> vnet interface path while the VF is detached.
+ bridge fdb add $MAC dev $VF
+ bridge fdb add $MAC dev $TAP_IF master
+
+ # Migrate the VM.
+ virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system
+
+ # Clean up the FDB entries after migration completes.
+ bridge fdb del $MAC dev $VF
+ bridge fdb del $MAC dev $TAP_IF master

- # Destination Hypervisor
+ On the destination hypervisor, a shared bridge 'br0' is created before
+ migration starts, and a VF from the destination PF is added to the bridge.
+ Similarly, an appropriate FDB entry is added.
+
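+ A minimal sketch of that preparation (the interface names match the reattach
+ script below and are illustrative; the vnet device exists once the destination
+ QEMU instance has been created):
+
+ ::
+
+   ip link set ens36v0 master br0
+   bridge fdb add 52:54:00:00:12:53 dev ens36v0
+   bridge fdb add 52:54:00:00:12:53 dev vmtap01 master
+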
+ The following script is executed on the destination hypervisor once migration
+ completes, and it reattaches the VF to the VM and brings down the virtio-net
+ interface.
+
+ ::
+
+ # reattach-vf.sh
#!/bin/bash

- virsh attach-device $DOMAIN $VF_XML
- virsh domif-setlink $DOMAIN $TAP_IF down
+ bridge fdb del 52:54:00:00:12:53 dev ens36v0
+ bridge fdb del 52:54:00:00:12:53 dev vmtap01 master
+ virsh attach-device --config --live vm-01 vf.xml
+ virsh domif-setlink vm-01 vmtap01 down
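+
+ Inside the guest, successful reattachment can be verified by checking that the
+ VF interface reappears as a slave of the failover master (a sketch; the
+ interface name is from the earlier guest example):
+
+ ::
+
+   ip link show master ens10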