feat(api): add Flat VPC virtualization type for zero-DPU hosts#1775
chet wants to merge 1 commit into
🌿 Preview your docs: https://nvidia-preview-pull-request-1775.docs.buildwithfern.com/infra-controller
// virtualization types. Their capabilities will determine
// if they are allowed or not.
vpc1.network_virtualization_type
    .ensure_can_peer_with(vpc2.network_virtualization_type)
So does this mean FNN-to-Flat peering is allowed? If so, I think we need to update the logic in crate::ethernet_virtualization::tenant_network() to allow for this, since it currently only allows identical virtualization types.
Yeah also good call! @bcavnvidia was asking me about this too -- like if this replaces VpcPeeringPolicy or otherwise.
My thought is the capabilities define what peering we CAN do, and that VpcPeeringPolicy will remain, allowing us to define at a site-level what we WILL do.
Maybe we don't need to, but it seems like nice layering.
And yeahhh, if we leverage VpcPeeringPolicy alongside the capabilities, we can enhance capabilities with a new exchanges_overlay_vni_for_peering parameter, and that allows us to generalize the tenant network flow (without needing any specific reference to ::Fnn) to something like:
let vpc_peer_ids: Vec<VpcId> = match policy {
    VpcPeeringPolicy::Exclusive => {
        let allowed = virtualization_type.capabilities().peers_with.to_vec();
        db::vpc_peering::get_vpc_peer_vnis(txn, vpc_id, allowed)
            .await?
            .into_iter()
            .map(|(id, _)| id)
            .collect()
    }
    VpcPeeringPolicy::Mixed => {
        db::vpc_peering::get_vpc_peer_ids(txn, vpc_id).await?
    }
    VpcPeeringPolicy::None => vec![],
};
vpc_peer_prefixes = get_prefixes_by_vpcs(txn, &vpc_peer_ids).await?;
if virtualization_type.exchanges_overlay_vni_for_peering() {
    let vni_peer_types: Vec<_> = ALL_VARIANTS.iter().copied()
        .filter(|t| t.exchanges_overlay_vni_for_peering())
        .collect();
    vpc_peer_vnis = db::vpc_peering::get_vpc_peer_vnis(
        txn, vpc_id, vni_peer_types,
    )
    .await?
    .iter()
    .map(|(_, vni)| *vni as u32)
    .collect();
}
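For reference, a minimal sketch of what the two names the snippet above leans on might look like. Everything here is an assumption for illustration: the variant names, the `ALL_VARIANTS` constant, the `vni_peer_types` helper, and in particular the guess that only FNN exchanges overlay VNIs today. The real version would presumably read the flag from the capability profile rather than from a `matches!`.

```rust
// Hypothetical stand-in for the real enum; variant names are illustrative.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VpcVirtualizationType {
    Etv,
    Fnn,
    Flat,
}

// Assumed constant listing every variant, so callers can filter by capability
// instead of naming a specific variant.
pub const ALL_VARIANTS: [VpcVirtualizationType; 3] = [
    VpcVirtualizationType::Etv,
    VpcVirtualizationType::Fnn,
    VpcVirtualizationType::Flat,
];

impl VpcVirtualizationType {
    // Assumption: only FNN exchanges overlay VNIs with its peers.
    pub fn exchanges_overlay_vni_for_peering(self) -> bool {
        matches!(self, Self::Fnn)
    }
}

// Collects the types whose peers need a VNI lookup (hypothetical helper).
pub fn vni_peer_types() -> Vec<VpcVirtualizationType> {
    ALL_VARIANTS
        .iter()
        .copied()
        .filter(|t| t.exchanges_overlay_vni_for_peering())
        .collect()
}
```

With this shape, adding a variant that also exchanges VNIs only changes the capability method; the tenant network flow stays untouched.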
    .ensure_supports_segment(&new_network_segment)
    .map_err(CarbideError::from)?;
virtualization_type.allocates_svi_for(&new_network_segment)
} else {
Should we fail if there's no VPC and new_network_segment.segment_type is HostInband? Your new code in api/src/handlers/instance/mod.rs will fail allocations onto HostInband segments if there's no vpc_id for the segment, so maybe it's better to validate that here?
Oh yeah good call out -- so, I think a HostInband segment should be able to exist without being bound to a VPC.
For example, maybe we want a HostInband segment for playing around with zero DPU provisioning, but maybe we'd never bind it to a VPC, and that should be ok; if we required a VPC, we'd have this weird inter-dependency thing where we'd need to create a VPC just for getting zero DPU hosts provisioned. This is kind of similar to Admin maybe?
All that to say, you're right in calling this out. I think the actual adjustment is to update the comments in api/src/handlers/instance/mod.rs to explain that better, and improve the error handling a bit to return an error specific to the segment not being bound to a VPC at allocation time.
I guess TLDR is it should be fine to have a HostInband segment not within a VPC, BUT, once it comes time to allocate an instance from the host into a VPC, the segment the host is in needs to be bound to a VPC?
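A rough sketch of that split, to make the intent concrete: segment creation accepts an unbound HostInband segment, while allocation returns a dedicated error when the segment has no VPC. The struct, the `u32` IDs, the error variant, and both function names are hypothetical and not the actual code in api/src/handlers/instance/mod.rs.

```rust
// Hypothetical, simplified types; the real segment model has more fields.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentType {
    Admin,
    HostInband,
}

#[derive(Debug, Clone)]
pub struct NetworkSegment {
    pub id: u32,
    pub segment_type: SegmentType,
    pub vpc_id: Option<u32>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AllocationError {
    // Specific error for a segment that is not bound to any VPC.
    SegmentNotBoundToVpc { segment_id: u32 },
}

// Creation time: a segment may exist without a VPC, so nothing is rejected
// here on the basis of a missing vpc_id.
pub fn validate_segment_create(_segment: &NetworkSegment) -> Result<(), AllocationError> {
    Ok(())
}

// Allocation time: the host's segment must be bound to a VPC by now.
pub fn vpc_for_allocation(segment: &NetworkSegment) -> Result<u32, AllocationError> {
    segment.vpc_id.ok_or(AllocationError::SegmentNotBoundToVpc {
        segment_id: segment.id,
    })
}
```

The upside of a dedicated variant is that the caller gets "segment not bound to a VPC" instead of a generic allocation failure.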
This closes NVIDIA#1522.

Adds `Flat` to `VpcVirtualizationType` for VPCs hosted on zero-DPU machines. ETV and FNN both presume a Carbide-managed DPU data plane, so using them for zero-DPU hosts meant allocating overlay machinery that nothing consumed. Flat just records the VPC and lets the network operator's switch fabric own reachability.

Per-variant policy lives in a new `VpcCapabilities` profile in `model::vpc::capability`: which host fabric interface the type attaches to (`Dpu` or `Nic`), which segment types it accepts, whether it supports IPv6 / routing profiles / stretched-L2 SVI, and which other types it peers with. Each variant maps to one profile constant; handlers consult capability methods that just read from the profile. Adding a future VPC type is a six-field profile plus one match arm, no handler edits.

Flat VPCs and `HostInband` segments are mutually bound -- a Flat VPC can only hold HostInband segments, and HostInband segments can only live in Flat VPCs. Tenants pick `FLAT` through the same VPC create flow as any other type.

Docs in a separate PR. Tests added!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
Description
This closes #1522.

Adds `Flat` to `VpcVirtualizationType` for VPCs hosted on zero-DPU machines. ETV and FNN both presume a Carbide-managed DPU data plane, so using them for zero-DPU hosts meant allocating overlay machinery that nothing consumed. Flat just records the VPC and lets the network operator's switch fabric own reachability.

Also, as I had mentioned to @Matthias247 and @bcavnvidia on the side:

Per-variant policy lives in a new `VpcCapabilities` profile in `model::vpc::capability`: which host fabric interface the type attaches to (`Dpu` or `Nic`), which segment types it accepts, whether it supports IPv6 / routing profiles / stretched-L2 SVI, and which other types it peers with. Each variant maps to one profile constant; handlers consult capability methods that just read from the profile. Adding a future VPC type is a six-field profile plus one match arm, no handler edits.

Flat VPCs and `HostInband` segments are mutually bound -- a Flat VPC can only hold HostInband segments, and HostInband segments can only live in Flat VPCs. Tenants pick `FLAT` through the same VPC create flow as any other type.

Docs in a separate PR. Tests added!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
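To illustrate the "six-field profile plus one match arm" shape described above, here is a minimal sketch. It is not the code in `model::vpc::capability`: the field names, the `Tenant` segment type, the error enum, the simplified method signatures (taking a segment type rather than a segment), and every boolean and `peers_with` value are placeholders. Only the `Dpu`/`Nic` split and the Flat/HostInband binding come from the description.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VpcVirtualizationType { Etv, Fnn, Flat }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HostFabricInterface { Dpu, Nic }

// `Tenant` is a hypothetical stand-in for the non-HostInband segment types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SegmentType { Tenant, HostInband }

#[derive(Debug, PartialEq, Eq)]
pub enum CapabilityError {
    UnsupportedSegment { vpc_type: VpcVirtualizationType, segment_type: SegmentType },
    CannotPeer { from: VpcVirtualizationType, to: VpcVirtualizationType },
}

// Six-field profile; one static per virtualization type.
#[derive(Debug)]
pub struct VpcCapabilities {
    pub host_fabric_interface: HostFabricInterface,
    pub segment_types: &'static [SegmentType],
    pub supports_ipv6: bool,
    pub supports_routing_profiles: bool,
    pub supports_stretched_l2_svi: bool,
    pub peers_with: &'static [VpcVirtualizationType],
}

// Placeholder values throughout; the real profiles may differ.
static ETV: VpcCapabilities = VpcCapabilities {
    host_fabric_interface: HostFabricInterface::Dpu,
    segment_types: &[SegmentType::Tenant],
    supports_ipv6: false,
    supports_routing_profiles: false,
    supports_stretched_l2_svi: true,
    peers_with: &[VpcVirtualizationType::Etv],
};

static FNN: VpcCapabilities = VpcCapabilities {
    host_fabric_interface: HostFabricInterface::Dpu,
    segment_types: &[SegmentType::Tenant],
    supports_ipv6: true,
    supports_routing_profiles: true,
    supports_stretched_l2_svi: false,
    peers_with: &[VpcVirtualizationType::Fnn],
};

static FLAT: VpcCapabilities = VpcCapabilities {
    host_fabric_interface: HostFabricInterface::Nic,
    segment_types: &[SegmentType::HostInband],
    supports_ipv6: false,
    supports_routing_profiles: false,
    supports_stretched_l2_svi: false,
    peers_with: &[],
};

impl VpcVirtualizationType {
    // The one match arm per variant; every other method reads the profile.
    pub fn capabilities(self) -> &'static VpcCapabilities {
        match self {
            Self::Etv => &ETV,
            Self::Fnn => &FNN,
            Self::Flat => &FLAT,
        }
    }

    pub fn ensure_supports_segment(self, segment_type: SegmentType) -> Result<(), CapabilityError> {
        if self.capabilities().segment_types.contains(&segment_type) {
            Ok(())
        } else {
            Err(CapabilityError::UnsupportedSegment { vpc_type: self, segment_type })
        }
    }

    pub fn ensure_can_peer_with(self, other: Self) -> Result<(), CapabilityError> {
        if self.capabilities().peers_with.contains(&other) {
            Ok(())
        } else {
            Err(CapabilityError::CannotPeer { from: self, to: other })
        }
    }
}
```

Handlers written against `capabilities()` never match on a variant themselves, which is what keeps a future type down to a new static and one extra arm.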
Type of Change
Related Issues (Optional)
Breaking Changes
Testing
Additional Notes