From a9af06371535d5f111b06dfdfc11fe0406e4498f Mon Sep 17 00:00:00 2001 From: Nell Jerram Date: Fri, 10 Apr 2026 10:49:27 +0100 Subject: [PATCH 01/10] [CORE-12279] Live migration for OpenStack --- .../networking/openstack/live-migration.mdx | 358 ++++++++++++++++++ sidebars-calico.js | 1 + 2 files changed, 359 insertions(+) create mode 100644 calico/networking/openstack/live-migration.mdx diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx new file mode 100644 index 0000000000..c045a3e15c --- /dev/null +++ b/calico/networking/openstack/live-migration.mdx @@ -0,0 +1,358 @@ +--- +description: Configure live migration support for Calico on OpenStack, including BGP route priority propagation and Nova integration. +--- + +# Live migration + +## Big picture + +$[prodname] supports live migration of OpenStack VMs with minimal network +disruption. During a live migration, $[prodname]'s Felix agent programs routes +on the target node with elevated priority, ensuring that network traffic +converges to the new node as quickly as possible. For this to work optimally +across a multi-node deployment, your BGP configuration must propagate route +priority information between nodes. + +## Concepts + +### Route priority during live migration + +When Felix handles a live-migrating VM on the target (destination) node, it +programs the VM's route with a higher-priority metric (lower value = higher +priority). 
By default: + +- **Normal priority** is metric value 1024 +- **Elevated priority** is metric value 512 + +But these values can be changed, if needed, using the following +settings in FelixConfiguration or `/etc/calico/felix.cfg`: + +| Setting | Default | Description | +|-----------------------------|---------|-------------------------------------------------------| +| `IPv4NormalRoutePriority` | 1024 | Kernel route metric for normal VM IPv4 routes | +| `IPv4ElevatedRoutePriority` | 512 | Kernel route metric for live-migrating VM IPv4 routes | +| `IPv6NormalRoutePriority` | 1024 | Kernel route metric for normal VM IPv6 routes | +| `IPv6ElevatedRoutePriority` | 512 | Kernel route metric for live-migrating VM IPv6 routes | + +If you change these from the defaults, adjust the corresponding values in the +BIRD configuration examples below. + +During the overlap period when both the source and target nodes advertise +routes to the migrating VM's IP, remote nodes (and intermediate routers) must +prefer the route via the target node. The elevated-priority metric achieves +this, but only if your BGP configuration propagates priority information +correctly between nodes. The sections below provide examples of how to do this +with BIRD configuration, for both iBGP and eBGP cases. + +:::note + +The same principles apply to live migration for KubeVirt VMs in a Kubernetes +cluster, which is independently documented at [BGP routing for KubeVirt live +migration](../kubevirt/live-migration-bgp.mdx). The key differences between +the considerations for OpenStack and for KubeVirt are as follows. + +- With OpenStack $[prodname] itself does not generate the BGP configuration - + whereas with Kubernetes it does - so instead this is a customer + responsibility. + +- The "route aggregation" detail mentioned in the KubeVirt doc does not apply + to OpenStack. OpenStack IPAM does not use node-affinity, so VM routes are + always propagated as individual /32 (IPv4) or /128 (IPv6) routes. 
+ +::: + +## How to + +### Configure BIRD for route priority propagation (iBGP) + +In an iBGP deployment (all BGP peers share the same AS number), +route priority can be propagated using the BGP LOCAL_PREF attribute, +which is transitive within an AS. + +The required conversion between Linux kernel route metrics and BGP LOCAL_PREF is: + +``` +bgp_local_pref = 2147483647 - krt_metric +krt_metric = 2147483647 - bgp_local_pref +``` + +This inversion is needed because lower kernel metric values mean higher priority, +while higher LOCAL_PREF values mean higher priority. +The constant 2147483647 is 2^31 - 1. + +Add the following filters to your BIRD configuration (`/etc/bird/bird.conf`): + +``` +filter export_bgp { + if (!defined(krt_metric)) then { krt_metric = 1024; } + bgp_local_pref = 2147483647 - krt_metric; + if ( (ifname ~ "tap*") || (ifname ~ "cali*") || (ifname ~ "dummy1") ) then { + if net != 0.0.0.0/0 then accept; + } + reject; +} + +filter import_bgp { + if (defined(bgp_local_pref)&&(bgp_local_pref > 2147482623)) then + preference = 200; + accept; +} + +filter export_kernel { + if (defined(source) && (source = RTS_BGP) && !defined(krt_metric)) then { + krt_metric = 1024; + if (defined(bgp_local_pref)) then { + krt_metric = 2147483647 - bgp_local_pref; + } + if (krt_metric < 1024) then { + preference = 200; + } + } + accept; +} +``` + +These filters work as follows: + +- **`export_bgp`**: When exporting a route to BGP peers, + converts the kernel route metric (`krt_metric`) to `bgp_local_pref`. + Routes with no metric default to 1024 (normal priority). + +- **`import_bgp`**: When importing a route from a BGP peer, + checks whether the route has elevated priority + (`bgp_local_pref > 2147482623`, corresponding to `krt_metric < 1024`). + If so, sets BIRD's `preference` to 200 so that BIRD prefers this remote route + over a local route for the same destination. + This matters when a VM is migrating away from a node that has an active connection to that VM. 
+ +- **`export_kernel`**: When programming a BGP-learned route into the Linux kernel, + converts `bgp_local_pref` back to `krt_metric`. + Also sets `preference = 200` for elevated-priority routes, for the same reason as `import_bgp`. + This conversion is done here rather than in `import_bgp` + because setting `krt_metric` in a BGP import filter does not take effect. + +Update the kernel protocol block to use the `export_kernel` filter: + +``` +protocol kernel { + learn; + persist; + scan time 2; + import all; + graceful restart; + export filter export_kernel; + merge paths on; +} +``` + +Apply the filters to each BGP peer: + +``` +protocol bgp 'peer1' { + description "Route reflector 1"; + local as 65001; + neighbor 10.0.0.1 as 65001; + multihop; + import filter import_bgp; + graceful restart; + export filter export_bgp; + next hop self; + source address 10.0.0.10; +} +``` + +:::note + +For IPv6, create the same filters in your `bird6.conf`, +replacing `0.0.0.0/0` with `::/0` in the `export_bgp` filter. + +::: + +### Configure BIRD for route priority propagation (eBGP) + +In an eBGP deployment (nodes in different racks use different AS numbers), +BGP LOCAL_PREF is not transitive across AS boundaries. +Instead, use BGP communities to signal route priority. + +This approach tags each exported route with a community indicating its priority level: + +| Community | Meaning | Kernel metric | +|-----------|---------|---------------| +| `(65000, 100)` | Elevated priority | 512 | +| `(65000, 200)` | Normal priority | 1024 | + +:::note + +The community values `(65000, 100)` and `(65000, 200)` are examples. +Choose values appropriate for your network +and coordinate with your network team to ensure that intermediate routers +(ToR switches, spine routers) preserve and propagate these communities. 
+ +::: + +Add the following filters to your BIRD configuration: + +``` +filter export_bgp { + if (!defined(krt_metric)) then { krt_metric = 1024; } + if (krt_metric < 1024) then { + bgp_community.add((65000, 100)); + } else { + bgp_community.add((65000, 200)); + } + if ( (ifname ~ "tap*") || (ifname ~ "cali*") || (ifname ~ "dummy1") ) then { + if net != 0.0.0.0/0 then accept; + } + reject; +} + +filter import_bgp { + if (((65000, 100) ~ bgp_community)) then + preference = 200; + accept; +} + +filter export_kernel { + if (defined(source) && (source = RTS_BGP) && !defined(krt_metric)) then { + krt_metric = 1024; + if (((65000, 100) ~ bgp_community)) then { + krt_metric = 512; + } + if (krt_metric < 1024) then { + preference = 200; + } + } + accept; +} +``` + +These filters work as follows: + +- **`export_bgp`**: Tags routes with a community based on their kernel metric. + Elevated-priority routes (metric < 1024) get community `(65000, 100)`; + normal-priority routes get `(65000, 200)`. + +- **`import_bgp`**: Checks incoming routes for the elevated-priority community + and sets BIRD's `preference` to 200 if found. + +- **`export_kernel`**: When programming a BGP-learned route into the Linux kernel, + reads the community to determine the correct `krt_metric`. + Routes with the elevated-priority community get metric 512; + all others default to 1024. + +Update the kernel protocol block to use the `export_kernel` filter (same as for iBGP), +and apply the filters to each BGP peer: + +``` +protocol bgp 'tor1' { + description "Top-of-rack switch"; + local as 65001; + neighbor 10.0.0.1 as 65000; + import filter import_bgp; + graceful restart; + export filter export_bgp; + next hop self; + source address 10.0.0.10; +} +``` + +:::caution + +Your ToR switches and any intermediate routers must be configured to preserve and propagate +the BGP communities used for route priority signaling. +Consult your router documentation for details on community propagation policies. 
+ +::: + +:::note + +For IPv6, create the same filters in your `bird6.conf`, +replacing `0.0.0.0/0` with `::/0` in the `export_bgp` filter. + +::: + +### Configure Nova to wait for network readiness + +When Nova live-migrates a VM, there is a point when the VM becomes live on the target node. +If $[prodname] has not yet finished setting up the network policy +(iptables rules and ipsets) for the VM on that node, +the VM may be unable to send or receive network traffic for a period after the switchover, +potentially up to around 10 seconds. + +To prevent this, configure Nova to wait for the $[prodname] Neutron driver to signal +that networking is fully ready before completing the migration. + +Add the following to `/etc/nova/nova.conf` on all compute nodes: + +```ini +[DEFAULT] +live_migration_wait_for_vif_plug = True +``` + +Then restart the Nova compute service: + +```bash +systemctl restart nova-compute +``` + +With this setting, Nova pauses the compute side of the live migration +until the $[prodname] Neutron driver confirms that routes, iptables rules, and ipsets +are all in place for the VM on the target node. + +### Monitor live migration progress + +$[prodname] emits INFO-level log messages that you can use to track the detailed progress +and timing of live migration operations. +These messages appear in the following components. + +#### Neutron driver + +The $[prodname] Neutron driver (`networking_calico`) logs the following events: + +| Log message | Meaning | +|-------------|---------| +| `Live migration : pre-migrate port from to ` | Nova has initiated live migration; $[prodname] is preparing networking on the target node. | +| `Live migration : destination port active on , notifying Nova` | Networking is ready on the target node; $[prodname] is signaling Nova to proceed. | +| `Live migration : succeeded, port migrated from to ` | Migration is complete; source-node networking has been cleaned up. 
| + +Example: + +``` +2026-03-27 13:31:11.386 INFO networking_calico [...] Live migration b7ce174c-...: pre-migrate port 480eb297-... from compute2 to compute3 +2026-03-27 13:31:13.600 INFO networking_calico [...] Live migration b7ce174c-...: destination port 480eb297-... active on compute3, notifying Nova +2026-03-27 13:31:15.229 INFO networking_calico [...] Live migration b7ce174c-...: succeeded, port 480eb297-... migrated from compute2 to compute3 +``` + +#### Felix on the source node + +Felix logs when it detects the migration and assumes the SOURCE role for the endpoint: + +``` +LiveMigrationCalculator: LiveMigration created/updated ... source=...compute2... target=...compute3... +LiveMigrationCalculator: emitting role for WEP role=SOURCE ... +``` + +#### Felix on the target node + +Felix on the target node provides the most detailed view via state machine transitions: + +| Transition | Meaning | +|-----------|---------| +| Base → Target | Felix has set up networking (elevated-priority routes, policy) for the VM on the target node. | +| Target → Live | Felix detected a GARP (Gratuitous ARP) from the VM, confirming it is running on the target node. | +| Live → TimeWait | Migration is complete. The source endpoint has been removed; this node now has the sole route for the VM at normal priority. | +| TimeWait → Base | Cleanup is complete. The migration state machine has returned to its initial state. | + +Example: + +``` +13:31:11.393 [INFO] felix/live_migration.go: Live migration state transition from=Base input=Target to=Target +13:31:14.042 [INFO] felix/live_migration.go: Live migration state transition from=Target input=GARPDetected to=Live +13:31:15.229 [INFO] felix/live_migration.go: Live migration state transition from=Live input=NoRole to=TimeWait +13:31:24.607 [INFO] felix/live_migration.go: Live migration state transition from=TimeWait input=Deleted to=Base +``` + +The timestamps on these transitions let you measure how long each phase takes. 
+For example, the time between the Target and Live transitions (~2.6s in the above) +indicates how long it took for the VM to begin running on the target node +after networking was ready. diff --git a/sidebars-calico.js b/sidebars-calico.js index ee95fc212d..7c4e81a949 100644 --- a/sidebars-calico.js +++ b/sidebars-calico.js @@ -325,6 +325,7 @@ module.exports = { 'networking/openstack/service-ips', 'networking/openstack/host-routes', 'networking/openstack/multiple-regions', + 'networking/openstack/live-migration', 'networking/openstack/kuryr', 'networking/openstack/neutron-api', ], From 6a847e9c4c12bd34d19e8029e1a562744e8b1110 Mon Sep 17 00:00:00 2001 From: Nell Jerram Date: Fri, 10 Apr 2026 10:54:18 +0100 Subject: [PATCH 02/10] Temp: KubeVirt page doesn't exist yet --- calico/networking/openstack/live-migration.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx index c045a3e15c..78206c354d 100644 --- a/calico/networking/openstack/live-migration.mdx +++ b/calico/networking/openstack/live-migration.mdx @@ -48,7 +48,7 @@ with BIRD configuration, for both iBGP and eBGP cases. The same principles apply to live migration for KubeVirt VMs in a Kubernetes cluster, which is independently documented at [BGP routing for KubeVirt live -migration](../kubevirt/live-migration-bgp.mdx). The key differences between +migration](live-migration.mdx). The key differences between the considerations for OpenStack and for KubeVirt are as follows. 
- With OpenStack $[prodname] itself does not generate the BGP configuration - From 1dc6c4b5368774ed84a6f56dcb96a2d10ba0875e Mon Sep 17 00:00:00 2001 From: Nell Jerram Date: Fri, 10 Apr 2026 11:55:05 +0100 Subject: [PATCH 03/10] Refinements --- .../networking/openstack/live-migration.mdx | 298 ++++++++++-------- 1 file changed, 169 insertions(+), 129 deletions(-) diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx index 78206c354d..aff3e6bb57 100644 --- a/calico/networking/openstack/live-migration.mdx +++ b/calico/networking/openstack/live-migration.mdx @@ -1,5 +1,5 @@ --- -description: Configure live migration support for Calico on OpenStack, including BGP route priority propagation and Nova integration. +description: Configure live migration support. --- # Live migration @@ -61,44 +61,90 @@ the considerations for OpenStack and for KubeVirt are as follows. ::: -## How to +### BIRD attributes and filters -### Configure BIRD for route priority propagation (iBGP) +Any BGP implementation may be used with $[prodname] for OpenStack to propagate +VM routes between nodes - or even an alternative routing protocol - but BIRD is +a widely used choice, and so we provide example BIRD configurations below for +propagating route priorities to iBGP and eBGP peers. + +#### krt_metric + +In BIRD filters, the `krt_metric` attribute can be read, to see the metric +value with which a route was programmed by Felix into the local Linux kernel, +and set, to control the metric value which BIRD will use when programming an +imported route into the local kernel. -In an iBGP deployment (all BGP peers share the same AS number), -route priority can be propagated using the BGP LOCAL_PREF attribute, -which is transitive within an AS. +- Reading `krt_metric` makes sense when processing a route that was locally + programmed, and which BIRD is going to export to its BGP peers. This is most + naturally done in a BGP export filter. 
-The required conversion between Linux kernel route metrics and BGP LOCAL_PREF is: +- Setting `krt_metric` makes sense when processing a route received from a BGP + peer that is going to be programmed locally. This unfortunately does not + work in a BGP import filter - arguably the most intuitive location - and must + instead be coded in a kernel export filter, gated on the route source being + `RTS_BGP`. +#### Conversion to BGP protocol attributes + +The general approach is to convert from `krt_metric` to _some_ representation +of priority in the BGP wire protocol, when exporting a route, and then to +perform the inverse conversion - from the wire representation back to +`krt_metric` - when importing a route. + +For iBGP peers the best option on the wire is the BGP LOCAL_PREF attribute. +The `bgp_local_pref` attribute can be read and set, in BIRD filter code, to +read and control this. Higher LOCAL_PREF values are defined to mean higher +priority - the opposite of Linux priority/metric values - so we need +conversions between `krt_metric` and `bgp_local_pref` like: ``` -bgp_local_pref = 2147483647 - krt_metric -krt_metric = 2147483647 - bgp_local_pref +bgp_local_pref = 2^31-1 - krt_metric +krt_metric = 2^31-1 - bgp_local_pref ``` +$[prodname] restricts metric values to the range 1..2^31-2, so `bgp_local_pref` +values will also be in that range. + +For eBGP peers the best options on the wire are +1. using a BGP community value to indicate "high priority" +2. adding to the BGP AS path to _lower_ the priority of all routes that are + _not_ high priority. + +(1) is preferred because it only requires a BGP modification on the wire for +the specific high priority routes that are used during a live migration; +whereas (2) would require a BGP modification for routes in normal operation. 
+ +## How to + +- [Configure BIRD for route priority propagation (iBGP)](#configure-bird-for-route-priority-propagation-ibgp) +- [Configure BIRD for route priority propagation (eBGP)](#configure-bird-for-route-priority-propagation-ebgp) +- [Configure Nova option live_migration_wait_for_vif_plug = True](#configure-nova-option-live_migration_wait_for_vif_plug--true) +- [Monitor live migration progress](#monitor-live-migration-progress) + +### Configure BIRD for route priority propagation (iBGP) -This inversion is needed because lower kernel metric values mean higher priority, -while higher LOCAL_PREF values mean higher priority. -The constant 2147483647 is 2^31 - 1. +When propagating routes within a contiguous AS, route priority is best +represented using the BGP LOCAL_PREF attribute. -Add the following filters to your BIRD configuration (`/etc/bird/bird.conf`): +Add filter code like the following to your BIRD configuration +(`/etc/bird/bird.conf`): ``` filter export_bgp { + ... if (!defined(krt_metric)) then { krt_metric = 1024; } bgp_local_pref = 2147483647 - krt_metric; - if ( (ifname ~ "tap*") || (ifname ~ "cali*") || (ifname ~ "dummy1") ) then { - if net != 0.0.0.0/0 then accept; - } - reject; + ... } filter import_bgp { + ... if (defined(bgp_local_pref)&&(bgp_local_pref > 2147482623)) then preference = 200; - accept; + ... } filter export_kernel { + ... if (defined(source) && (source = RTS_BGP) && !defined(krt_metric)) then { krt_metric = 1024; if (defined(bgp_local_pref)) then { @@ -108,11 +154,11 @@ filter export_kernel { preference = 200; } } - accept; + ... } ``` -These filters work as follows: +This code works as follows: - **`export_bgp`**: When exporting a route to BGP peers, converts the kernel route metric (`krt_metric`) to `bgp_local_pref`. @@ -131,88 +177,67 @@ These filters work as follows: This conversion is done here rather than in `import_bgp` because setting `krt_metric` in a BGP import filter does not take effect. 
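Putting the pieces together, the following sketch models the three filters in plain Python (illustrative only - it is not BIRD filter syntax) and traces a route from the advertising node's `export_bgp` through the receiving node's `import_bgp` and `export_kernel`. The value 100 here stands in for BIRD's default preference for BGP-learned routes:

```python
# Minimal executable model of the three filters above - not BIRD syntax.
MAX_PREF = 2**31 - 1

def export_bgp(krt_metric=None):
    """Advertising node: convert the kernel metric to bgp_local_pref."""
    if krt_metric is None:
        krt_metric = 1024          # undefined metric => normal priority
    return MAX_PREF - krt_metric

def import_bgp(bgp_local_pref):
    """Receiving node: elevated routes get preference 200 (100 is BIRD's
    default preference for BGP routes)."""
    return 200 if bgp_local_pref > 2147482623 else 100

def export_kernel(bgp_local_pref):
    """Receiving node: convert bgp_local_pref back to a kernel metric."""
    return MAX_PREF - bgp_local_pref

# Elevated route for a live-migrating VM: metric 512 on the advertising
# node is recovered as metric 512 on the receiver, with preference 200 so
# that it wins over any competing route for the same destination.
lp = export_bgp(512)
assert (import_bgp(lp), export_kernel(lp)) == (200, 512)

# Normal route: metric 1024 round-trips, with the default preference.
lp = export_bgp(1024)
assert (import_bgp(lp), export_kernel(lp)) == (100, 1024)
```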
-Update the kernel protocol block to use the `export_kernel` filter: +Update the kernel protocol block to use the `export_kernel` filter (if not already present): ``` protocol kernel { - learn; - persist; - scan time 2; - import all; - graceful restart; + ... export filter export_kernel; - merge paths on; + ... } ``` -Apply the filters to each BGP peer: +Use the `export_bgp` and `import_bgp` filters in the definition of each iBGP +peer: ``` protocol bgp 'peer1' { - description "Route reflector 1"; - local as 65001; - neighbor 10.0.0.1 as 65001; - multihop; + ... import filter import_bgp; - graceful restart; export filter export_bgp; - next hop self; - source address 10.0.0.10; + ... } ``` -:::note - -For IPv6, create the same filters in your `bird6.conf`, -replacing `0.0.0.0/0` with `::/0` in the `export_bgp` filter. - -::: +For IPv6, make the same changes in your `bird6.conf`, replacing `0.0.0.0/0` +with `::/0` in the `export_bgp` filter. ### Configure BIRD for route priority propagation (eBGP) -In an eBGP deployment (nodes in different racks use different AS numbers), -BGP LOCAL_PREF is not transitive across AS boundaries. -Instead, use BGP communities to signal route priority. +When propagating routes to an eBGP peer, route priority is best represented +using a BGP community value. BGP community values do not have standardized +meanings, so the choice and interpretation of a value is a matter only for your +local network. For this example, we choose the value `(65000, 100)` to +indicate a higher priority route. -This approach tags each exported route with a community indicating its priority level: +- Routes _with_ that community value are considered to be higher priority, and + will be mapped to `krt_metric 512`. 
-| Community | Meaning | Kernel metric | -|-----------|---------|---------------| -| `(65000, 100)` | Elevated priority | 512 | -| `(65000, 200)` | Normal priority | 1024 | +- Routes _without_ that community value are considered to be normal priority, + and will be mapped to `krt_metric 1024`. -:::note - -The community values `(65000, 100)` and `(65000, 200)` are examples. -Choose values appropriate for your network -and coordinate with your network team to ensure that intermediate routers -(ToR switches, spine routers) preserve and propagate these communities. - -::: - -Add the following filters to your BIRD configuration: +Add filter code like the following to your BIRD configuration +(`/etc/bird/bird.conf`): ``` filter export_bgp { + ... if (!defined(krt_metric)) then { krt_metric = 1024; } if (krt_metric < 1024) then { bgp_community.add((65000, 100)); - } else { - bgp_community.add((65000, 200)); - } - if ( (ifname ~ "tap*") || (ifname ~ "cali*") || (ifname ~ "dummy1") ) then { - if net != 0.0.0.0/0 then accept; } - reject; + ... } filter import_bgp { + ... if (((65000, 100) ~ bgp_community)) then preference = 200; - accept; + ... } filter export_kernel { + ... if (defined(source) && (source = RTS_BGP) && !defined(krt_metric)) then { krt_metric = 1024; if (((65000, 100) ~ bgp_community)) then { @@ -222,15 +247,14 @@ filter export_kernel { preference = 200; } } - accept; + ... } ``` These filters work as follows: -- **`export_bgp`**: Tags routes with a community based on their kernel metric. - Elevated-priority routes (metric < 1024) get community `(65000, 100)`; - normal-priority routes get `(65000, 200)`. +- **`export_bgp`**: Tags higher priority routes with a community: + elevated-priority routes (metric < 1024) get community `(65000, 100)`. - **`import_bgp`**: Checks incoming routes for the elevated-priority community and sets BIRD's `preference` to 200 if found. 
@@ -240,70 +264,76 @@ These filters work as follows: Routes with the elevated-priority community get metric 512; all others default to 1024. -Update the kernel protocol block to use the `export_kernel` filter (same as for iBGP), -and apply the filters to each BGP peer: +Update the kernel protocol block to use the `export_kernel` filter (if not already present): ``` -protocol bgp 'tor1' { - description "Top-of-rack switch"; - local as 65001; - neighbor 10.0.0.1 as 65000; - import filter import_bgp; - graceful restart; - export filter export_bgp; - next hop self; - source address 10.0.0.10; +protocol kernel { + ... + export filter export_kernel; + ... } ``` -:::caution - -Your ToR switches and any intermediate routers must be configured to preserve and propagate -the BGP communities used for route priority signaling. -Consult your router documentation for details on community propagation policies. - -::: - -:::note +Use the `export_bgp` and `import_bgp` filters in the definition of each eBGP +peer: -For IPv6, create the same filters in your `bird6.conf`, -replacing `0.0.0.0/0` with `::/0` in the `export_bgp` filter. +``` +protocol bgp 'peer1' { + ... + import filter import_bgp; + export filter export_bgp; + ... +} +``` -::: +For IPv6, make the same changes in your `bird6.conf`, replacing `0.0.0.0/0` +with `::/0` in the `export_bgp` filter. -### Configure Nova to wait for network readiness +### Configure Nova option live_migration_wait_for_vif_plug = True -When Nova live-migrates a VM, there is a point when the VM becomes live on the target node. -If $[prodname] has not yet finished setting up the network policy -(iptables rules and ipsets) for the VM on that node, -the VM may be unable to send or receive network traffic for a period after the switchover, -potentially up to around 10 seconds. +The Nova option `live_migration_wait_for_vif_plug` means "defer the compute +side of live migration - i.e. 
copying a VM's state from the source to the +target node - until Neutron and the network driver indicate that networking is +ready for the VM on the target node". We recommend setting this option to +`True`. `True` is also the default value, so an explicit setting should not be +needed. However some previous $[prodname] versions required this setting to be +False, so if you have upgraded from a previous $[prodname] version, consider +reviewing your `nova.conf` and either delete the old `False` setting, or change +it to `True`. -To prevent this, configure Nova to wait for the $[prodname] Neutron driver to signal -that networking is fully ready before completing the migration. +The $[prodname] network driver indicates readiness once all of the interface +configuration, ipsets and iptables are in place for the VM on the target node. +In clusters with complex network policy, ipset and iptables programming can +take noticeable time; occasionally as much as tens of seconds. With +`live_migration_wait_for_vif_plug = True` the live migration timeline proceeds +as follows: -Add the following to `/etc/nova/nova.conf` on all compute nodes: +1. Live migration is requested for a VM. -```ini -[DEFAULT] -live_migration_wait_for_vif_plug = True -``` +2. $[prodname] prepares networking on the target node. VM is still live on the +source node, and traffic is flowing to/from the source node. -Then restart the Nova compute service: +3. $[prodname] and Neutron indicate that networking is ready. Nova begins the +compute side of live migration. -```bash -systemctl restart nova-compute -``` +4. Compute transfer is complete and the VM becomes live on the target node. +$[prodname] updates routing so that traffic now flows to/from the target node. -With this setting, Nova pauses the compute side of the live migration -until the $[prodname] Neutron driver confirms that routes, iptables rules, and ipsets -are all in place for the VM on the target node. 
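The timeline above assumes the `True` setting. If you do need to set the option explicitly - for example, to replace a stale `False` setting left over from a previous $[prodname] version - the relevant fragment in `/etc/nova/nova.conf` on each compute node is:

```ini
[DEFAULT]
live_migration_wait_for_vif_plug = True
```

followed by a restart of the `nova-compute` service on that node.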
+Whereas, with `live_migration_wait_for_vif_plug = False`, (2) and (3) run in +parallel and it is possible for the compute side (3) to complete before the +networking side (2). There can then be a situation where the VM is live on the +target node, but it is not yet possible for traffic to flow correctly to and +from the VM on that node. Hence why we recommend the `True` setting. ### Monitor live migration progress -$[prodname] emits INFO-level log messages that you can use to track the detailed progress -and timing of live migration operations. -These messages appear in the following components. +$[prodname] emits INFO-level log messages that you can use to track the +detailed progress and timing of live migration operations. These messages +appear in the following components. + +In all of these logs, `` uniquely identifies a given live migration +operation, and can be used to correlate the logs from the Neutron driver with +those from Felix on the source and target nodes. #### Neutron driver @@ -325,31 +355,41 @@ Example: #### Felix on the source node -Felix logs when it detects the migration and assumes the SOURCE role for the endpoint: +Felix logs when it detects the migration and assumes the SOURCE role for the +endpoint: ``` -LiveMigrationCalculator: LiveMigration created/updated ... source=...compute2... target=...compute3... -LiveMigrationCalculator: emitting role for WEP role=SOURCE ... +LiveMigrationCalculator: LiveMigration created/updated ... source=...compute2... target=...compute3... uid= +LiveMigrationCalculator: emitting role for WEP role=SOURCE uid= ... ``` #### Felix on the target node -Felix on the target node provides the most detailed view via state machine transitions: +Felix similarly logs when it detects the migration and assumes the TARGET role +for the endpoint: + +``` +LiveMigrationCalculator: LiveMigration created/updated ... source=...compute2... target=...compute3... 
uid= +LiveMigrationCalculator: emitting role for WEP role=TARGET uid= ... +``` + +In addition, Felix logs the state machine transitions involved in detailed live +migration handling on the target node: | Transition | Meaning | |-----------|---------| -| Base → Target | Felix has set up networking (elevated-priority routes, policy) for the VM on the target node. | -| Target → Live | Felix detected a GARP (Gratuitous ARP) from the VM, confirming it is running on the target node. | -| Live → TimeWait | Migration is complete. The source endpoint has been removed; this node now has the sole route for the VM at normal priority. | -| TimeWait → Base | Cleanup is complete. The migration state machine has returned to its initial state. | +| Base → Target | Felix starts setting up networking for the VM on the target node. | +| Target → Live | Felix has detected a GARP (Gratuitous ARP) from the VM, confirming it is now live on the target node, and starts advertising a high priority route to the VM on this node. | +| Live → TimeWait | OpenStack has indicated the migration is complete. High priority route advertisement continues, to allow time for the nearby network to see the deletion of the VM from the source node. | +| TimeWait → Base | Enough time has now passed. Route advertisement for the VM reverts to normal priority. | -Example: +For example: ``` -13:31:11.393 [INFO] felix/live_migration.go: Live migration state transition from=Base input=Target to=Target -13:31:14.042 [INFO] felix/live_migration.go: Live migration state transition from=Target input=GARPDetected to=Live -13:31:15.229 [INFO] felix/live_migration.go: Live migration state transition from=Live input=NoRole to=TimeWait -13:31:24.607 [INFO] felix/live_migration.go: Live migration state transition from=TimeWait input=Deleted to=Base +13:31:11.393 [INFO] felix/live_migration.go: Live migration state transition from=Base ... 
input=Target migrationUid= to=Target +13:31:14.042 [INFO] felix/live_migration.go: Live migration state transition from=Target ... input=GARPDetected migrationUid= to=Live +13:31:15.229 [INFO] felix/live_migration.go: Live migration state transition from=Live ... input=NoRole migrationUid= to=TimeWait +13:31:24.607 [INFO] felix/live_migration.go: Live migration state transition from=TimeWait ... input=Deleted migrationUid= to=Base ``` The timestamps on these transitions let you measure how long each phase takes. From 280e7b897af074350acd3d6a13121a49f8c9a9b3 Mon Sep 17 00:00:00 2001 From: Nell Jerram Date: Fri, 10 Apr 2026 14:00:02 +0100 Subject: [PATCH 04/10] Revert "Temp: KubeVirt page doesn't exist yet" This reverts commit 6a847e9c4c12bd34d19e8029e1a562744e8b1110. --- calico/networking/openstack/live-migration.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx index aff3e6bb57..937e2b7ffc 100644 --- a/calico/networking/openstack/live-migration.mdx +++ b/calico/networking/openstack/live-migration.mdx @@ -48,7 +48,7 @@ with BIRD configuration, for both iBGP and eBGP cases. The same principles apply to live migration for KubeVirt VMs in a Kubernetes cluster, which is independently documented at [BGP routing for KubeVirt live -migration](live-migration.mdx). The key differences between +migration](../kubevirt/live-migration-bgp.mdx). The key differences between the considerations for OpenStack and for KubeVirt are as follows. 
 - With OpenStack $[prodname] itself does not generate the BGP configuration -

From 7940963120d91903983e802fe965a9f971dc7f7d Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Fri, 10 Apr 2026 14:06:24 +0100
Subject: [PATCH 05/10] Fix reference to ../reference/resources/licensekey.mdx

---
 calico-enterprise/operations/license-options.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/calico-enterprise/operations/license-options.mdx b/calico-enterprise/operations/license-options.mdx
index 8482c1a2d5..da19909c70 100644
--- a/calico-enterprise/operations/license-options.mdx
+++ b/calico-enterprise/operations/license-options.mdx
@@ -42,6 +42,6 @@ To route these alerts, see [Configure Alertmanager](monitor/prometheus/alertmana
 
 ## Additional resources
 
-- [LicenseKey resource](../../reference/resources/licensekey.mdx)
+- [LicenseKey resource](../reference/resources/licensekey.mdx)
 - [Configure Alertmanager](monitor/prometheus/alertmanager.mdx)
 - [Configure Prometheus](monitor/prometheus/configure-prometheus.mdx)

From 213e1d917061fa703fc157394bbec83dc5f9f938 Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Fri, 10 Apr 2026 15:22:31 +0100
Subject: [PATCH 06/10] Remove outdated references to "0.0.0.0/0"

In Claude's first draft the filter code included "0.0.0.0/0", but then I
removed it because it wasn't part of what this doc is trying to explain.
---
 calico/networking/openstack/live-migration.mdx | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx
index 937e2b7ffc..13538767b6 100644
--- a/calico/networking/openstack/live-migration.mdx
+++ b/calico/networking/openstack/live-migration.mdx
@@ -199,8 +199,7 @@ protocol bgp 'peer1' {
 }
 ```
 
-For IPv6, make the same changes in your `bird6.conf`, replacing `0.0.0.0/0`
-with `::/0` in the `export_bgp` filter.
+For IPv6, make the same changes in your `bird6.conf`.
 ### Configure BIRD for route priority propagation (eBGP)
 
@@ -286,8 +285,7 @@ protocol bgp 'peer1' {
 }
 ```
 
-For IPv6, make the same changes in your `bird6.conf`, replacing `0.0.0.0/0`
-with `::/0` in the `export_bgp` filter.
+For IPv6, make the same changes in your `bird6.conf`.
 
 ### Configure Nova option live_migration_wait_for_vif_plug = True

From 526009f0dd7e8da0932e63f41dc350ce67876d0a Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Mon, 13 Apr 2026 17:27:49 +0100
Subject: [PATCH 07/10] Update calico/networking/openstack/live-migration.mdx

Co-authored-by: Christopher Tauchen
---
 calico/networking/openstack/live-migration.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx
index 13538767b6..f1b9c30111 100644
--- a/calico/networking/openstack/live-migration.mdx
+++ b/calico/networking/openstack/live-migration.mdx
@@ -25,7 +25,7 @@ priority). By default:
 - **Elevated priority** is metric value 512
 
 But these values can be changed, if needed, using the following
-settings in FelixConfiguration or `/etc/calico/felix.cfg`:
+settings in the `FelixConfiguration` resource or in `/etc/calico/felix.cfg`:
 
 | Setting | Default | Description |

From 9c4fdfb535f82ea12eb2bbd2fb7ab5f87e6755c3 Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Mon, 13 Apr 2026 17:29:27 +0100
Subject: [PATCH 08/10] Update calico/networking/openstack/live-migration.mdx

Co-authored-by: Christopher Tauchen
---
 calico/networking/openstack/live-migration.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx
index f1b9c30111..8be16a612a 100644
--- a/calico/networking/openstack/live-migration.mdx
+++ b/calico/networking/openstack/live-migration.mdx
@@ -68,7 +68,7 @@ VM routes between nodes -
 or even an alternative routing protocol - but BIRD is a widely used choice,
 and so we provide example BIRD configurations below for propagating route
 priorities to iBGP and eBGP peers.
 
-#### krt_metric
+#### `krt_metric`
 
 In BIRD filters, the `krt_metric` attribute can be read, to see the metric
 value with which a route was programmed by Felix into the local Linux kernel,

From f3f7ab4adada70a14d94151986f6c7e26b1324bb Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Mon, 13 Apr 2026 17:29:45 +0100
Subject: [PATCH 09/10] Update calico/networking/openstack/live-migration.mdx

Co-authored-by: Christopher Tauchen
---
 calico/networking/openstack/live-migration.mdx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx
index 8be16a612a..2161f00b33 100644
--- a/calico/networking/openstack/live-migration.mdx
+++ b/calico/networking/openstack/live-migration.mdx
@@ -287,7 +287,7 @@ protocol bgp 'peer1' {
 
 For IPv6, make the same changes in your `bird6.conf`.
 
-### Configure Nova option live_migration_wait_for_vif_plug = True
+### Configure Nova option `live_migration_wait_for_vif_plug = True`
 
 The Nova option `live_migration_wait_for_vif_plug` means "defer the compute
 side of live migration - i.e. copying a VM's state from the source to the

From a8969bd041f453a597167ac8d0d9d4ef79654efc Mon Sep 17 00:00:00 2001
From: Nell Jerram
Date: Mon, 13 Apr 2026 17:37:55 +0100
Subject: [PATCH 10/10] Review markups

---
 calico/networking/openstack/live-migration.mdx | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/calico/networking/openstack/live-migration.mdx b/calico/networking/openstack/live-migration.mdx
index 2161f00b33..fe95cc72c5 100644
--- a/calico/networking/openstack/live-migration.mdx
+++ b/calico/networking/openstack/live-migration.mdx
@@ -2,7 +2,7 @@
 description: Configure live migration support.
 ---
 
-# Live migration
+# Live migration for OpenStack VMs
 
 ## Big picture
 
@@ -51,9 +51,9 @@ cluster, which is independently documented at [BGP routing for KubeVirt live
 migration](../kubevirt/live-migration-bgp.mdx). The key differences between
 the considerations for OpenStack and for KubeVirt are as follows.
 
-- With OpenStack $[prodname] itself does not generate the BGP configuration -
-  whereas with Kubernetes it does - so instead this is a customer
-  responsibility.
+- With OpenStack $[prodname] itself does not generate and maintain the BGP
+  configuration, so instead this is a customer responsibility. Whereas with
+  Kubernetes $[prodname] does generate and maintain the BGP configuration.
 
 - The "route aggregation" detail mentioned in the KubeVirt doc does not apply
   to OpenStack. OpenStack IPAM does not use node-affinity, so VM routes are
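Reviewer's note (not part of the patch series): the route-preference behavior that these patches document can be sketched as a small model. This is a hypothetical Python illustration, not Felix or BIRD code; it assumes only what the doc states - Felix programs normal routes with metric 1024 and live-migrating routes with metric 512, each node's BGP export propagates that metric, and peers prefer the lower value - so during the overlap period the target node's route wins.

```python
# Hypothetical model of best-path selection during live migration.
# Metric values match the documented FelixConfiguration defaults:
# IPv4NormalRoutePriority = 1024, IPv4ElevatedRoutePriority = 512.

NORMAL_PRIORITY = 1024    # metric for normal VM routes
ELEVATED_PRIORITY = 512   # metric for live-migrating VM routes

def best_next_hop(advertisements):
    """Pick the advertising node with the lowest metric (highest priority)."""
    return min(advertisements, key=lambda adv: adv["metric"])["node"]

# During the overlap, both source and target advertise a /32 for the VM's IP;
# the target's elevated-priority (512) route is preferred.
overlap = [
    {"node": "source-node", "metric": NORMAL_PRIORITY},
    {"node": "target-node", "metric": ELEVATED_PRIORITY},
]
print(best_next_hop(overlap))  # target-node

# After the TimeWait -> Base transition the target reverts to normal priority
# and is the sole advertiser, so traffic stays on the target node.
print(best_next_hop([{"node": "target-node", "metric": NORMAL_PRIORITY}]))  # target-node
```

This is only a sketch of the selection logic; the real propagation of the metric between nodes is done by the BIRD configuration examples in the doc itself.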