[RSS] Move switch_port table population to a Nexus bg task #10370
Taking a look at this now
/// fresh populator and drives it directly so we can observe the
/// only-switch0-then-both transition.
#[nexus_test(server = crate::Server, extra_sled_agents = 1)]
async fn test_switch1_comes_up_late(cptestctx: &ControlPlaneTestContext) {
@@ -297,69 +296,6 @@ impl super::Nexus {
Some(IpNet::from(rack_network_config.rack_subnet).into());
self.datastore().update_rack_subnet(opctx, &rack).await?;
Looks like we might be also closing #3602 with this change? 🤔
I think that's mostly true, yeah. We did get rid of ExternalPortDiscovery::Static. I didn't want to add a delay to every test to wait for us to discover the ports, though, so we still inject qsfp0 for both switches on test startup:
Lines 398 to 418 in 5c70790
Do you think it's fine to close the issue despite that? Or should we change the test startup path here to wait until the bg task has talked to dendrite the way real RSS does (or at least see how badly that slows down the tests)?
I think it's okay to close it since we've taken it out of the production code path
Made one minor change to not stop on first error in 77605f3
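For readers unfamiliar with the "don't stop on first error" pattern, here is a minimal sketch of the general idea. It is not the code from 77605f3: `populate_one`, `populate_all`, and the switch names and error strings are hypothetical stand-ins. Each item is attempted independently, and failures are collected rather than returned early with `?`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one unit of work (e.g. populating the
/// ports of a single switch). Not the real Nexus API.
fn populate_one(switch: &str) -> Result<usize, String> {
    match switch {
        "switch0" => Ok(2),
        "switch1" => Err("dendrite unreachable".to_string()),
        other => Err(format!("unknown switch {other}")),
    }
}

/// Attempts every switch and records failures per switch, instead of
/// returning as soon as the first one fails.
fn populate_all(switches: &[&str]) -> (usize, BTreeMap<String, String>) {
    let mut inserted = 0;
    let mut errors = BTreeMap::new();
    for switch in switches {
        match populate_one(switch) {
            Ok(n) => inserted += n,
            Err(e) => {
                // Record the failure and keep going with the next switch.
                errors.insert(switch.to_string(), e);
            }
        }
    }
    (inserted, errors)
}

fn main() {
    // The failing switch comes first; the healthy one is still processed.
    let (inserted, errors) = populate_all(&["switch1", "switch0"]);
    println!("inserted {inserted} ports, {} error(s)", errors.len());
}
```

With an early return, a failure on the first switch would prevent the second from being attempted at all. Collecting errors lets a periodic task make partial progress and report every failure in one pass.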
Working on extending the ls-apis analysis to treat RSS separately from Sled Agent (#10318 (comment)), we realized that certain assumptions that hold for RSS (namely that the rack has a consistent version) are not upheld for early networking. Split early networking into its own crate, and make the main omicron-sled-agent crate depend on it. (Due to #10370, RSS no longer depends on the new early networking crate.) This means that after the work to treat RSS separately, any clients depended on by the new sled-agent-early-networking crate are attributed only to Sled Agent.
Please start reviewing with the module-level comments in nexus/src/app/background/tasks/populate_switch_ports.rs; they attempt to explain the history and motivation of this change. It's probably the most important thing to review - if any of that reasoning is wrong, the change I'm making in this PR probably is too!
This is (preemptively) removing one of the internal-to-sled-agent uses of the early_networking module, in the hopes that we can eventually remove it entirely. Also closes #3069.