feat(api): plumb new InstanceNetworkConfig.auto field through the system #1576
Merged
bcavnvidia
approved these changes
May 11, 2026
Matthias247
approved these changes
May 11, 2026
Contributor
Matthias247
left a comment
let's at least add a check soon that `auto` isn't set for hosts with DPUs
Contributor
Author
@Matthias247 Yeah I'm actually adding the check right now -- I'll ping you back when it's up.
This is the first part of NVIDIA#1533 (there will be 1-2 subsequent PRs for integration), but I thought it was important to send a tracer of sorts to show where it fed through.

This adds a new `auto` field to the `InstanceNetworkConfig` proto and `api-model` structs with a corresponding `TryFrom` implementation. No callers use this field yet; just getting it dropped into place and defaulting it to `false`. All of the subsequent wiring will come in a follow-up PR.

This will be used for "configuring" instance interfaces on zero DPU hosts (or hosts whose DPUs are in NIC mode). Since there is no configuration needed (because they're already configured), they don't really need to set anything. However, it was decided there should be an explicit signal. That signal is going to be `auto`.

When `auto` is set, it means:
- `interfaces` must be empty, because the tenant is asking NICo to `auto`-configure interfaces.
- NICo will resolve the interfaces from the `HostInband` segments for the host.

`auto` could mean something in the future with regards to some type of pluggable SDN component, where we're saying, "let the plugin deal with configuring the interface(s)". It just happens that in this case it's NICo doing it for zero DPU hosts.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
7feb84d to
dcae057
Compare
Contributor
Author
@Matthias247 Done!
bcavnvidia
approved these changes
May 12, 2026
Matthias247
approved these changes
May 12, 2026
Contributor
You mean the

    if auto && machine.attached_dpu_machine_ids.len() > 0 { // or look at machine_capabilities
        return CarbideError::InvalidArgument(...).into();
    }

but follow-ups are fine
Contributor
Author
Oh yeah! Yeah that's going to be in the next PR where I start actually wiring things up. |
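For readers following along, the check discussed above could look roughly like the sketch below. This is only an illustration: `Machine`, `ConfigError`, and `check_auto_allowed` are hypothetical, trimmed-down stand-ins (the real code uses `CarbideError` and a much larger machine model), and it does not handle the "DPU in NIC mode" case, which the review comment suggests might be decided from `machine_capabilities` instead.

```rust
/// Hypothetical, trimmed-down stand-in for the real machine record.
#[derive(Debug, Default)]
pub struct Machine {
    /// IDs of the DPUs attached to this host; empty for zero DPU hosts.
    pub attached_dpu_machine_ids: Vec<String>,
}

/// Hypothetical error type standing in for `CarbideError`.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    InvalidArgument(String),
}

/// Rejects `auto: true` when the host has DPUs attached.
/// Assumption: DPUs in NIC mode are not modeled in this sketch.
pub fn check_auto_allowed(auto: bool, machine: &Machine) -> Result<(), ConfigError> {
    if auto && !machine.attached_dpu_machine_ids.is_empty() {
        return Err(ConfigError::InvalidArgument(
            "auto network config is not supported on hosts with DPUs".to_string(),
        ));
    }
    Ok(())
}
```

Returning `Err(...)` from a `Result`-returning function keeps the check usable with `?` at the call site; where exactly the real check lives is not shown in this thread.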
chet
added a commit
to chet/bare-metal-manager-core
that referenced
this pull request
May 14, 2026
…tion

This is the second part of NVIDIA#1533, wiring up the auto field that was plumbed through in NVIDIA#1576.

When a tenant sets `auto: true`, NICo resolves the instance's interfaces from the host's `HostInband` segments and stores the resolved config internally. On the wire, callers still see `{ auto: true, interfaces: [] }`, just like they originally sent. The resolved details still only surface in `instance.status.network.interfaces`. We do this via a new `InstanceNetworkConfig::into_external_view()` helper.

Additional tweaks include:
- `add_inband_interfaces_to_config` (and `with_inband_interfaces_from_machine`) are gated on `network_config.auto`, and no longer auto-fill silently.
- Instance allocate gates `auto`-ness on host class: zero DPU hosts are expected to be `auto`, DPU hosts are not.
- Instance update path re-resolves on `auto: true` against the host's current `HostInband` segments. No-op if nothing changed, but a real update if the operator added/removed segments since allocation.

Tests added/updated!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
chet
added a commit
to chet/bare-metal-manager-core
that referenced
this pull request
May 14, 2026
…tion

This is the second part of NVIDIA#1533, wiring up the auto field that was plumbed through in NVIDIA#1576.

When a tenant sets `auto: true`, NICo now resolves the instance's interfaces from the host's `HostInband` segments and stores the resolved config internally, instead of magically just doing things. A tenant can set `auto: true` without `interfaces`, or leave `auto: false` and set `interfaces`, but it cannot do both.

On the wire, `auto` callers will still see `{ auto: true, interfaces: [] }` as their "stored" config, just like they originally sent (via a new `::into_external_view()` helper), even though internally we are storing the resolved interfaces; the resolved details will surface in `instance.status.network.interfaces`, just like they do today for hosts with DPUs.

Additional tweaks include:
- `add_inband_interfaces_to_config` is gated on `network_config.auto`, and no longer auto-fills silently.
- Instance allocate gates `auto`-ness on host class: zero DPU hosts are expected to be `auto`, DPU hosts are not.
- Instance update path re-resolves on `auto: true` against the host's current `HostInband` segments. No-op if nothing changed, but a real update if the operator added/removed segments since allocation.
- `with_inband_interfaces_from_machine` was dead code so it's going away.

Tests added/updated!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
10 tasks
chet
added a commit
that referenced
this pull request
May 15, 2026
…tion (#1674)

This is the second part of #1533, wiring up the auto field that was plumbed through in #1576.

When a tenant sets `auto: true`, NICo now resolves the instance's interfaces from the host's `HostInband` segments and stores the resolved config internally, instead of magically just doing things. A tenant can set `auto: true` without `interfaces`, or leave `auto: false` and set `interfaces`, but it cannot do both.

On the wire, `auto` callers will still see `{ auto: true, interfaces: [] }` as their "stored" config, just like they originally sent (via a new `::into_external_view()` helper), even though internally we are storing the resolved interfaces; the resolved details will surface in `instance.status.network.interfaces`, just like they do today for hosts with DPUs.

Additional tweaks include:
- `add_inband_interfaces_to_config` is gated on `network_config.auto`, and no longer auto-fills silently.
- Instance allocate gates `auto`-ness on host class: zero DPU hosts are expected to be `auto`, DPU hosts are not.
- Instance update path re-resolves on `auto: true` against the host's current `HostInband` segments. No-op if nothing changed, but a real update if the operator added/removed segments since allocation.
- `with_inband_interfaces_from_machine` was dead code so it's going away.

I've got some `machine-a-tron` stuff I'll do after this PR.

Tests added/updated!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>

## Type of Change
- [x] **Add** - New feature or capability
- [x] **Change** - Changes in existing functionality
- [ ] **Fix** - Bug fixes
- [ ] **Remove** - Removed features or deprecated functionality
- [ ] **Internal** - Internal changes (refactoring, tests, docs, etc.)

## Breaking Changes
- [ ] This PR contains breaking changes

## Testing
- [x] Unit tests added/updated
- [x] Integration tests added/updated
- [ ] Manual testing performed
- [ ] No testing required (docs, internal refactor, etc.)

## Summary by CodeRabbit
* **New Features**
  * Added automatic network interface resolution for instances on zero-DPU hosts using `auto: true` configuration.
* **Bug Fixes**
  * Enforced immutability of automatic network configuration flag for existing instances.
  * Stricter validation: `auto` mode now only valid on zero-DPU hosts and rejects explicit interface specifications.
* **Documentation**
  * Clarified network configuration behavior and interface mappings for automatic and manual modes.
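To make the "stored internally, hidden on the wire" behavior a bit more concrete, here is a minimal sketch of what an `into_external_view()` helper might do. The struct layout, the `InterfaceConfig` type, and its `segment_id` field are assumptions made for illustration; only the helper name and the `{ auto: true, interfaces: [] }` external shape come from the commit message.

```rust
/// Hypothetical, simplified interface entry.
#[derive(Debug, Clone, PartialEq)]
pub struct InterfaceConfig {
    pub segment_id: String,
}

/// Hypothetical, simplified version of the config struct.
#[derive(Debug, Clone, PartialEq)]
pub struct InstanceNetworkConfig {
    pub auto: bool,
    /// For `auto` configs this holds the internally resolved interfaces.
    pub interfaces: Vec<InterfaceConfig>,
}

impl InstanceNetworkConfig {
    /// Returns the config as the caller originally sent it: for `auto`
    /// configs the internally resolved interfaces are stripped again.
    /// Explicit (non-`auto`) configs pass through unchanged.
    pub fn into_external_view(mut self) -> Self {
        if self.auto {
            self.interfaces.clear();
        }
        self
    }
}
```

Under this sketch, the resolved interfaces stay in the stored config (so the update path can compare them against freshly resolved `HostInband` segments), while callers reading the config back only ever see the empty list they sent.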
chet
added a commit
to chet/bare-metal-manager-rest
that referenced
this pull request
May 23, 2026
This closes NVIDIA#351, and is the REST-side counterpart to the Flat VPC and `auto` interface work that landed in Core (NVIDIA/infra-controller#1775, NVIDIA/infra-controller#1576, and NVIDIA/infra-controller#1674), which together let tenants put instances on zero-DPU hosts (or hosts with their DPU in NIC mode).

The discussion on NVIDIA#351 was originally about whether to model this as `vpcId: null` or a special "unmanaged VPC" type. What landed in Core is closer to the latter -- Flat VPCs are real VPCs with VNIs, NSGs, and peering, but NICo doesn't drive their data plane.

How it's done:
- Brought the two new fields into `nico_nico.proto`.
- `Flat` is a new value in the `network_virtualization_type` enum across the API, DB, OpenAPI, and proto layers.
- `auto` is a new boolean on instance create / update that flips the interface configuration model from "explicit list" to "NICo resolves from HostInband segments." Mutually exclusive with `interfaces`, persisted on the instance row so partial updates can re-issue the signal without the caller re-supplying it.
- Defense-in-depth check at the REST layer that `auto: true` requires a Flat VPC. Core enforces the same rule, but we fail-fast here too to avoid round-tripping the site for an obviously bad request.

Tests added!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
chet
added a commit
to chet/bare-metal-manager-rest
that referenced
this pull request
May 23, 2026
This closes NVIDIA#351, and is the REST-side counterpart to the Flat VPC and `auto` interface work that landed in Core (NVIDIA/infra-controller#1775, NVIDIA/infra-controller#1576, and NVIDIA/infra-controller#1674), which together let tenants put instances on zero-DPU hosts (or hosts with their DPU in NIC mode).

The discussion on NVIDIA#351 was originally about whether to model this as `vpcId: null` or a special "unmanaged VPC" type. What landed in Core is closer to the latter -- Flat VPCs are real VPCs with VNIs, NSGs, and peering, but NICo doesn't drive their data plane.

How it's done:
- Brought the two new fields into `nico_nico.proto`.
- `Flat` is a new value in the `network_virtualization_type` enum across the API, DB, OpenAPI, and proto layers.
- `auto` is a new boolean on instance create / update that flips the interface configuration model from "explicit list" to "NICo resolves from HostInband segments." Mutually exclusive with `interfaces`, persisted on the instance row so partial updates can re-issue the signal without the caller re-supplying it.
- Defense-in-depth check at the REST layer that `auto: true` requires a Flat VPC. Core enforces the same rule, but we fail-fast here too to avoid round-tripping the site for an obviously bad request.

This also adds some "VPC capability-like" mappings (similar to what we have in Core) to drive logic decisions at the REST API layer.

Tests added!

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
21 tasks
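The two request-level rules from the commit message above (mutual exclusivity with `interfaces`, and `auto: true` requiring a Flat VPC) can be sketched as below. This is written in Rust purely for illustration; the thread does not say what language the REST layer uses, and `InstanceCreateRequest`, `validate_request`, and the `Other` enum variant are hypothetical names, with only `Flat`, `auto`, and `interfaces` taken from the source.

```rust
/// Hypothetical mirror of the `network_virtualization_type` values.
/// Only `Flat` is named in the source; `Other` is a placeholder for the rest.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NetworkVirtualizationType {
    Flat,
    Other,
}

/// Hypothetical, trimmed-down instance create request.
#[derive(Debug, Default)]
pub struct InstanceCreateRequest {
    pub auto: bool,
    pub interfaces: Vec<String>,
}

/// Fail-fast validation before the request is forwarded to Core.
pub fn validate_request(
    req: &InstanceCreateRequest,
    vpc_type: NetworkVirtualizationType,
) -> Result<(), String> {
    // `auto` and an explicit interface list are mutually exclusive.
    if req.auto && !req.interfaces.is_empty() {
        return Err("`auto` and `interfaces` are mutually exclusive".to_string());
    }
    // Defense-in-depth: Core enforces this too, per the commit message.
    if req.auto && vpc_type != NetworkVirtualizationType::Flat {
        return Err("`auto: true` requires a Flat VPC".to_string());
    }
    Ok(())
}
```

Checking here duplicates Core's rule on purpose: an obviously bad request is rejected without a round trip to the site.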
Description
This is the first part of #1533 (there will be 1-2 subsequent PRs for integration), but I thought it was important to send a tracer of sorts to show where it fed through.
This adds a new `auto` field to the `InstanceNetworkConfig` proto and `api-model` structs with a corresponding `TryFrom` implementation. No callers use this field yet; just getting it dropped into place and defaulting it to `false`. This does include the simple exclusivity check, but all of the subsequent wiring will come in a follow-up PR.

This will be used for "configuring" instance interfaces on zero DPU hosts (or hosts whose DPUs are in NIC mode). Since there is no configuration needed (because they're already configured), they don't really need to set anything. However, it was decided there should be an explicit signal. That signal is going to be `auto`.

When `auto` is set, it means:
- `interfaces` must be empty, because the tenant is asking NICo to `auto`-configure interfaces.
- NICo will resolve the interfaces from the `HostInband` segments for the host.

`auto` could mean something in the future with regards to some type of pluggable SDN component, where we're saying, "let the plugin deal with configuring the interface(s)". It just happens that in this case it's NICo doing it for zero DPU hosts.

Tests added.

Signed-off-by: Chet Nichols III <chetn@nvidia.com>
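As a rough picture of the plumbing described above, the proto-to-model conversion with the simple exclusivity check might look like the following sketch. `ProtoInstanceNetworkConfig`, the `Vec<String>` interface representation, and `ConfigError` are hypothetical stand-ins for the generated proto type, the real interface structs, and `CarbideError`; only the `auto` and `interfaces` field names, the `false` default, and the use of `TryFrom` come from the description.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the generated proto message.
/// `Default` gives `auto == false`, matching the description.
#[derive(Debug, Default)]
pub struct ProtoInstanceNetworkConfig {
    pub auto: bool,
    pub interfaces: Vec<String>,
}

/// Hypothetical stand-in for the `api-model` struct.
#[derive(Debug, PartialEq)]
pub struct InstanceNetworkConfig {
    pub auto: bool,
    pub interfaces: Vec<String>,
}

/// Hypothetical error type standing in for the real one.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    InvalidArgument(String),
}

impl TryFrom<ProtoInstanceNetworkConfig> for InstanceNetworkConfig {
    type Error = ConfigError;

    fn try_from(proto: ProtoInstanceNetworkConfig) -> Result<Self, Self::Error> {
        // Exclusivity check: `auto` means NICo picks the interfaces,
        // so the caller must not also supply an explicit list.
        if proto.auto && !proto.interfaces.is_empty() {
            return Err(ConfigError::InvalidArgument(
                "`interfaces` must be empty when `auto` is set".to_string(),
            ));
        }
        Ok(Self {
            auto: proto.auto,
            interfaces: proto.interfaces,
        })
    }
}
```

Putting the check in the conversion means every caller that goes through `TryFrom` gets it for free, which fits the "tracer" intent of this PR; whether the real check sits exactly there is an assumption.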