
BUGFIX: Static routes flapping due to missing comparison logic #9907

Merged
internet-diglett merged 2 commits into main from levon/bugfix-static-route-flapping
Feb 24, 2026

@internet-diglett
Contributor

It was found that the switch sync background task was removing and adding the same route repeatedly. This is due to the static route being stored in the DB with no value set for the rib_priority field, which deserializes to None. We need to fill this value in with the default value before comparing it to the route that is active in mgd, as the mgd route will have the default value instead of None.
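The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the task's actual code: the `Route` struct, the `diff` and `normalize` helpers, and the default value of 1 (taken from the logs in this PR) are assumptions made for illustration.

```rust
use std::collections::HashSet;

// Assumed default, matching the `rib_priority: 1` seen in the logs.
const DEFAULT_RIB_PRIORITY_STATIC: u8 = 1;

// Hypothetical, simplified stand-in for the task's route type.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Route {
    prefix: &'static str,
    nexthop: &'static str,
    priority: Option<u8>,
}

/// Returns (routes to add, routes to delete) by set difference.
fn diff(existing: &HashSet<Route>, desired: &HashSet<Route>) -> (Vec<Route>, Vec<Route>) {
    let to_add = desired.difference(existing).cloned().collect();
    let to_del = existing.difference(desired).cloned().collect();
    (to_add, to_del)
}

/// Fills in the default priority so a DB route compares equal to the
/// route the switch reports.
fn normalize(route: Route) -> Route {
    Route {
        priority: Some(route.priority.unwrap_or(DEFAULT_RIB_PRIORITY_STATIC)),
        ..route
    }
}
```

Without `normalize`, a DB route with `priority: None` and a switch route with `priority: Some(1)` are distinct set members, so the same route lands in both the add list and the delete list on every pass. After normalizing the DB side, both differences are empty.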

In the future we plan to make this less cumbersome by putting stronger types around this data, but we need bootstore versioning before implementing such types.

Testing

Changes were tested using a local single-node deployment of omicron.

Before

21:33:34.429Z INFO 8d40ca16-5146-4f5b-b56f-4cdfce6ccff8 (ServerContext): retrieved existing routes
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:433
    rack_id = 8309decf-e513-4e46-806b-060f95608a61
    routes = {Switch0: {V4(SwitchStaticRouteV4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, vlan: None, priority: Some(1) })}}
21:33:34.429Z INFO 8d40ca16-5146-4f5b-b56f-4cdfce6ccff8 (ServerContext): retrieved desired routes
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:437
    rack_id = 8309decf-e513-4e46-806b-060f95608a61
    routes = {Switch0: {V4(SwitchStaticRouteV4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, vlan: None, priority: None })}}
21:33:34.429Z INFO 8d40ca16-5146-4f5b-b56f-4cdfce6ccff8 (ServerContext): calculated static routes to add
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:446
    rack_id = 8309decf-e513-4e46-806b-060f95608a61
    routes = {Switch0: AddStaticRouteRequest { v4: AddStaticRoute4Request { routes: StaticRoute4List { list: [StaticRoute4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, rib_priority: 1, vlan_id: None }] } }, v6: AddStaticRoute6Request { routes: StaticRoute6List { list: [] } } }}
21:33:34.429Z INFO 8d40ca16-5146-4f5b-b56f-4cdfce6ccff8 (ServerContext): calculated static routes to delete
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:452
    rack_id = 8309decf-e513-4e46-806b-060f95608a61
    routes = {Switch0: DeleteStaticRouteRequest { v4: DeleteStaticRoute4Request { routes: StaticRoute4List { list: [StaticRoute4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, rib_priority: 1, vlan_id: None }] } }, v6: DeleteStaticRoute6Request { routes: StaticRoute6List { list: [] } } }}

After

21:58:15.357Z INFO a9235b08-9002-440f-abce-7a00c7fdbfaa (ServerContext): retrieved existing routes
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:433
    rack_id = ef46c585-94dc-4c00-87c8-e21a8259b4f8
    routes = {Switch0: {V4(SwitchStaticRouteV4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, vlan: None, priority: Some(1) })}}
21:58:15.357Z INFO a9235b08-9002-440f-abce-7a00c7fdbfaa (ServerContext): retrieved desired routes
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:437
    rack_id = ef46c585-94dc-4c00-87c8-e21a8259b4f8
    routes = {Switch0: {V4(SwitchStaticRouteV4 { nexthop: 10.8.0.1, prefix: Prefix4 { value: 0.0.0.0, length: 0 }, vlan: None, priority: Some(1) })}}
21:58:15.357Z INFO a9235b08-9002-440f-abce-7a00c7fdbfaa (ServerContext): calculated static routes to add
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:446
    rack_id = ef46c585-94dc-4c00-87c8-e21a8259b4f8
    routes = {}
21:58:15.357Z INFO a9235b08-9002-440f-abce-7a00c7fdbfaa (ServerContext): calculated static routes to delete
    background_task = switch_port_config_manager
    file = nexus/src/app/background/tasks/sync_switch_configuration.rs:452
    rack_id = ef46c585-94dc-4c00-87c8-e21a8259b4f8
    routes = {}

// is happening.
let priority = match route.rib_priority {
    Some(v) => Some(v.0),
    None => Some(DEFAULT_RIB_PRIORITY_STATIC),
Contributor

This change looks correct to me, but I think it's a little fragile. We're mostly working in terms of this type:

#[derive(PartialEq, Eq, Hash, Debug)]
enum SwitchStaticRoute {
    V4(SwitchStaticRouteV4),
    V6(SwitchStaticRouteV6),
}

where the priority field inside SwitchStaticRouteV{4,6} is optional. That means we have to do conversions in multiple places:

  • static_routes_on_switch() has to wrap the non-optional priority we get from mgd in Some(_)
  • static_routes_in_db() has to replace any Nones with Some(DEFAULT) (new in this PR)
  • static_routes_to_del() and static_routes_to_add() both have to .unwrap_or(DEFAULT) the priority fields to convert back to the mgd types; but per the previous two points, the values it's operating on should always be Some(_)

It would be a slightly larger (but not that large, I think?) change, but I'd suggest instead we change SwitchStaticRouteV{4,6} to have non-optional priority fields. These are private types in this task so it wouldn't affect any APIs, and it would clean up 2/3 of the previous points:

  • static_routes_on_switch() would no longer need to wrap the values from mgd in Some(_)
  • static_routes_to_del() and static_routes_to_add() would no longer need .unwrap_or()s
  • static_routes_in_db() would still have to replace None with DEFAULT, but it's limited to just this one function
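As a rough sketch of what that suggestion could look like, the snippet below gives `SwitchStaticRouteV4` a non-optional `priority` and confines the default to the DB conversion. Everything except the names `SwitchStaticRouteV4` and `DEFAULT_RIB_PRIORITY_STATIC` is hypothetical: `DbRouteV4` and `MgdRouteV4` are simplified stand-ins for the DB and mgd types, the prefix is modeled as a tuple, and the default of 1 is assumed from the logs.

```rust
use std::net::Ipv4Addr;

// Assumed value for illustration; the real constant comes from the mgd types.
const DEFAULT_RIB_PRIORITY_STATIC: u8 = 1;

/// Task-private route type with a non-optional priority.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct SwitchStaticRouteV4 {
    nexthop: Ipv4Addr,
    prefix: (Ipv4Addr, u8), // stand-in for Prefix4 { value, length }
    vlan: Option<u16>,
    priority: u8,
}

/// Hypothetical stand-in for a route read from the DB.
struct DbRouteV4 {
    nexthop: Ipv4Addr,
    prefix: (Ipv4Addr, u8),
    vlan: Option<u16>,
    rib_priority: Option<u8>,
}

/// Hypothetical stand-in for the mgd route type.
#[derive(Clone, Debug, PartialEq)]
struct MgdRouteV4 {
    nexthop: Ipv4Addr,
    prefix: (Ipv4Addr, u8),
    rib_priority: u8,
    vlan_id: Option<u16>,
}

impl From<DbRouteV4> for SwitchStaticRouteV4 {
    fn from(r: DbRouteV4) -> Self {
        Self {
            nexthop: r.nexthop,
            prefix: r.prefix,
            vlan: r.vlan,
            // The only place a missing priority is replaced by the default.
            priority: r.rib_priority.unwrap_or(DEFAULT_RIB_PRIORITY_STATIC),
        }
    }
}

impl From<MgdRouteV4> for SwitchStaticRouteV4 {
    fn from(r: MgdRouteV4) -> Self {
        // No Some(_) wrapping: mgd already reports a concrete priority.
        Self { nexthop: r.nexthop, prefix: r.prefix, vlan: r.vlan_id, priority: r.rib_priority }
    }
}

impl From<SwitchStaticRouteV4> for MgdRouteV4 {
    fn from(r: SwitchStaticRouteV4) -> Self {
        // No unwrap_or: the priority is always present.
        Self { nexthop: r.nexthop, prefix: r.prefix, rib_priority: r.priority, vlan_id: r.vlan }
    }
}
```

With this shape, a DB route lacking a priority and the matching mgd route convert to equal `SwitchStaticRouteV4` values, and the type system rules out a `None` reaching the comparison.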

@internet-diglett internet-diglett enabled auto-merge (squash) February 24, 2026 22:37
@internet-diglett internet-diglett merged commit 6efa919 into main Feb 24, 2026
16 checks passed
@internet-diglett internet-diglett deleted the levon/bugfix-static-route-flapping branch February 24, 2026 23:35