
feat: rename carbide/forge → NICo in helm charts and deployment configs #1532

Merged: shayan1995 merged 1 commit into NVIDIA:main from shayan1995:feat/rename-helm on May 14, 2026

@shayan1995
Contributor

Renames the helm chart layout, kustomize bases, and helm-prereqs tooling from the legacy carbide/forge naming to NICo (NVIDIA Infrastructure Controller). The Rust binaries inside the published images still use the carbide names, so the chart and kustomize bases support both:

helm/charts/*:

  • carbide-* charts renamed to nico-* (api, bmc-proxy, dhcp, dns, dsx-exchange-consumer, hardware-health, pxe, ssh-console-rs).
  • ConfigMap data, Files.Get, and source files use the carbide-* filenames as canonical and expose nico-* aliases (carbide-api-config.toml, carbide-bmc-proxy.toml).
  • Deployment templates mount config, nico-roots, and firmware at both the new and legacy paths (/etc/nico|forge, /var/run/secrets/nico|forge-roots, and /opt/nico|carbide); the container command picks /opt/nico/<bin> if present, else /opt/carbide/<bin>.
  • nico-api: dual-name env vars (NICO_*/CARBIDE_*) for DATABASE_URL, COOKIEJAR_KEY, WEB_HOSTNAME, plus dual casbin-policy entries (forge/* + nico/*).
  • nico-{dhcp,dns,pxe}/values.yaml: add FORGE_* aliases for the ROOT_CA{,FILE}_PATH, CLIENT_CERT_PATH, CLIENT_KEY_PATH env vars.
  • nico-ssh-console-rs: dual nico_url/carbide_url + nico/forge ca paths in config.toml; bump progressDeadlineSeconds to 1200 to cover cold-cluster vault-pki + cert-manager signing.
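
The binary fallback described above (prefer /opt/nico/<bin>, otherwise /opt/carbide/<bin>) could look roughly like the sketch below. This is not the chart's actual container command: the `pick_bin` helper and the `NICO_ROOT`/`CARBIDE_ROOT` variables are hypothetical, the latter added only so the sketch can be tried outside a container, and "present" is read here as "exists and is executable".

```shell
#!/bin/sh
# Hypothetical helper (not taken from the chart): print the path of a binary,
# preferring the new prefix and falling back to the legacy one.
# NICO_ROOT and CARBIDE_ROOT are illustrative overrides for local testing.
pick_bin() {
    bin="$1"
    nico_root="${NICO_ROOT:-/opt/nico}"
    carbide_root="${CARBIDE_ROOT:-/opt/carbide}"
    if [ -x "$nico_root/$bin" ]; then
        # A nico-named image ships the binary under the new prefix.
        printf '%s\n' "$nico_root/$bin"
    else
        # Current images still ship it under the legacy carbide prefix.
        printf '%s\n' "$carbide_root/$bin"
    fi
}

# Possible use as a container entrypoint (the binary name is a placeholder):
#   exec "$(pick_bin some-binary)" "$@"
```

Resolving the path at container start, rather than at chart render time, is what would let the same rendered manifest run against either image variant.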

deploy/*: carbide-base/* renamed to nico-base/*; forge-system → nico-system. Kustomize variants point at the carbide-named binaries (/opt/carbide/*, /etc/forge/*, FORGE_*/CARBIDE_* env) since raw-manifest installs target the current carbide image.

helm-prereqs/*:

  • ncx-{core,rest,site-agent}.yaml renamed to nico-*.
  • vault-forge-issuer.yaml renamed to vault-nico-issuer.yaml.
  • nico-prereqs-dynamic.yaml.gotmpl rename.
  • setup.sh: delete the local-path StorageClass before re-applying (the provisioner field is immutable, which blocks upgrades from a prior forge install); before helmfile sync of MetalLB, delete leftover non-helm RBAC + validating webhook so helm can adopt the namespace cleanly; bump nico-core --timeout from 300s to 900s.
  • preflight.sh: rework metallb-config dry-run check to trigger only on non-zero kubectl exit and ignore the benign "(dry run)" lines so a successful dry-run no longer raises a false-positive.
  • nico-site-agent.yaml: also export CARBIDE_ADDRESS and CARBIDE_SEC_OPT alongside NICO_*.
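
The reworked preflight.sh check (fail only on a non-zero kubectl exit, and ignore the benign "(dry run)" lines) can be illustrated with the sketch below. It is not the script's actual code: the `dry_run_failed` helper is hypothetical, and the kubectl invocation and file name in the usage comment are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not taken from preflight.sh): decide whether a dry run
# should be reported as a failure.
#   $1 = exit status of the dry-run command
#   $2 = its combined stdout/stderr
# Returns 0 (and prints the relevant lines) only for a real failure.
dry_run_failed() {
    status="$1"
    output="$2"
    # A zero exit means the dry run succeeded, whatever text was printed.
    [ "$status" -ne 0 ] || return 1
    # On failure, drop the benign "(dry run)" confirmations and keep the rest.
    printf '%s\n' "$output" | grep -vF '(dry run)' || true
    return 0
}

# Possible use (command and file name are placeholders):
#   out=$(kubectl apply --dry-run=server -f metallb-config.yaml 2>&1); rc=$?
#   if dry_run_failed "$rc" "$out"; then
#       echo "metallb-config dry-run failed" >&2
#   fi
```

Keying on the exit status rather than on the output text is what removes the false positive: a successful dry run still prints "(dry run)" lines, so matching on output alone cannot tell success from failure.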

bluefield/charts/*: carbide-fmds, carbide-otelcol, carbide-dpu-otel-agent, carbide-dpu-agent, carbide-dhcp-server renamed to the nico-* equivalents.

Description

Type of Change

  • Add - New feature or capability
  • Change - Changes in existing functionality
  • Fix - Bug fixes
  • Remove - Removed features or deprecated functionality
  • Internal - Internal changes (refactoring, tests, docs, etc.)

Related Issues (Optional)

Breaking Changes

  • This PR contains breaking changes

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing performed
  • No testing required (docs, internal refactor, etc.)

Additional Notes

@shayan1995 shayan1995 requested review from a team as code owners May 8, 2026 22:51
@copy-pr-bot

copy-pr-bot Bot commented May 8, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@shayan1995 shayan1995 force-pushed the feat/rename-helm branch 3 times, most recently from 149f14b to 8d0217d Compare May 13, 2026 21:36
@shayan1995
Contributor Author

/ok to test 8d0217d

@shayan1995 shayan1995 requested a review from a team as a code owner May 13, 2026 21:53
@shayan1995
Contributor Author

/ok to test ebafaf7

@shayan1995
Contributor Author

/ok to test 069008a

@shayan1995 shayan1995 enabled auto-merge (squash) May 14, 2026 00:21
@shayan1995 shayan1995 force-pushed the feat/rename-helm branch 2 times, most recently from b372a1c to 506f590 Compare May 14, 2026 02:32
@shayan1995
Contributor Author

/ok to test 506f590

Renames the helm chart layout, kustomize bases, and helm-prereqs
tooling from the legacy carbide/forge naming to NICo (NVIDIA
Infrastructure Controller). The Rust binaries inside the published
images still use the carbide names, so the chart and kustomize bases
support both:

helm/charts/*:
- carbide-* charts renamed to nico-* (api, bmc-proxy, dhcp, dns,
  dsx-exchange-consumer, hardware-health, pxe, ssh-console-rs).
- ConfigMap data, Files.Get, and source files use the carbide-*
  filenames as canonical and expose nico-* aliases
  (carbide-api-config.toml, carbide-bmc-proxy.toml).
- Deployment templates mount config + nico-roots + firmware at both
  /etc/nico|forge, /var/run/secrets/nico|forge-roots, and
  /opt/nico|carbide paths; container command picks /opt/nico/<bin>
  if present, else /opt/carbide/<bin>.
- nico-api: dual-name env vars (NICO_*/CARBIDE_*) for DATABASE_URL,
  COOKIEJAR_KEY, WEB_HOSTNAME, plus dual casbin-policy entries
  (forge/* + nico/*).
- nico-{dhcp,dns,pxe}/values.yaml: add FORGE_* aliases for the
  ROOT_CA{,FILE}_PATH, CLIENT_CERT_PATH, CLIENT_KEY_PATH env vars.
- nico-ssh-console-rs: dual nico_url/carbide_url + nico/forge ca
  paths in config.toml; bump progressDeadlineSeconds to 1200 to
  cover cold-cluster vault-pki + cert-manager signing.

deploy/*: carbide-base/* renamed to nico-base/*; forge-system →
nico-system. Kustomize variants point at the carbide-named binaries
(/opt/carbide/*, /etc/forge/*, FORGE_*/CARBIDE_* env) since
raw-manifest installs target the current carbide image.

helm-prereqs/*:
- ncx-{core,rest,site-agent}.yaml renamed to nico-*.
- vault-forge-issuer.yaml renamed to vault-nico-issuer.yaml.
- nico-prereqs-dynamic.yaml.gotmpl rename.
- setup.sh: delete the local-path StorageClass before re-applying
  (provisioner field is immutable, blocks upgrade from prior forge
  install); before helmfile sync of MetalLB, delete leftover non-helm
  RBAC + validating webhook so helm can adopt the namespace cleanly;
  bump nico-core --timeout from 300s to 900s.
- preflight.sh: rework metallb-config dry-run check to trigger only
  on non-zero kubectl exit and ignore the benign "(dry run)" lines so
  a successful dry-run no longer raises a false-positive.
- nico-site-agent.yaml: also export CARBIDE_ADDRESS and
  CARBIDE_SEC_OPT alongside NICO_*.

bluefield/charts/*: carbide-fmds, carbide-otelcol,
carbide-dpu-otel-agent, carbide-dpu-agent, carbide-dhcp-server
renamed to the nico-* equivalents.

Signed-off-by: Shayan Namaghi <snamaghi@nvidia.com>
@shayan1995
Contributor Author

/ok to test 0102f4c

@shayan1995 shayan1995 merged commit ca1c97d into NVIDIA:main May 14, 2026
41 checks passed
ianderson-nvidia added a commit to ianderson-nvidia/bare-metal-manager-core that referenced this pull request May 18, 2026
The carbide/forge → NICo rename in PR NVIDIA#1532 updated the helm charts to
mount certs at /opt/nico, but the Rust agent code (and bundled otel
configs) still read certs from /opt/forge/forge_root.pem. The chart and
the binary were pointing at different directories, breaking certificate
discovery for nico-fmds, nico-dpu-agent, nico-dpu-otel-agent, and
nico-otelcol.

Adds helm-unittest coverage under each chart's tests/ directory
verifying both the default (forge) and cleared (nico) render paths.

Signed-off-by: Ian Anderson <ianderson@nvidia.com>
abvarshney-nv pushed a commit that referenced this pull request May 19, 2026
…1790)

The carbide/forge → NICo rename in PR #1532 updated the helm charts to
mount certs at /opt/nico, but the Rust agent code (and bundled otel
configs) still read certs from /opt/forge/forge_root.pem. The chart and
the binary were pointing at different directories, breaking certificate
discovery for nico-fmds, nico-dpu-agent, nico-dpu-otel-agent, and
nico-otelcol.

Adds helm-unittest coverage under each chart's tests/ directory
verifying both the default (forge) and cleared (nico) render paths.

## Description
<!-- Describe what this PR does -->

## Type of Change
<!-- Check one that best describes this PR -->
- [ ] **Add** - New feature or capability
- [ ] **Change** - Changes in existing functionality  
- [ ] **Fix** - Bug fixes
- [ ] **Remove** - Removed features or deprecated functionality
- [X] **Internal** - Internal changes (refactoring, tests, docs, etc.)

## Related Issues (Optional)
<!-- If applicable, provide GitHub Issue. -->

## Breaking Changes
- [ ] This PR contains breaking changes

<!-- If checked above, describe the breaking changes and migration steps
-->

## Testing
<!-- How was this tested? Check all that apply -->
- [X] Unit tests added/updated
- [ ] Integration tests added/updated  
- [ ] Manual testing performed
- [ ] No testing required (docs, internal refactor, etc.)

## Additional Notes
<!-- Any additional context, deployment notes, or reviewer guidance -->

Signed-off-by: Ian Anderson <ianderson@nvidia.com>
ianderson-nvidia added a commit to ianderson-nvidia/bare-metal-manager-core that referenced this pull request May 21, 2026
Every `nico-` string the charts emit is now an overridable value that
defaults to nico, so the umbrella chart ships pure nico yet can be
flipped to the legacy carbide/forge world.
This is needed because the carbide/forge -> nico rename (NVIDIA#1532) moved
the API's SPIFFE validation to nico while machine-cert issuance is still
on forge, so a deployment may need to present a carbide/forge identity
until that migration completes.

Resource naming:

  - `nameOverride` (per subchart, default = chart name `nico-<svc>`) now
    drives every emitted resource name, label, selector,
    serviceAccountName, ConfigMap/Certificate name.
    Chart directories, `Chart.yaml`  names, and the `nico-<svc>.X`
    helper identifiers stay nico.

Certificate identity:

  - The `certificateSpec` helper takes a `svcName` arg (the name
    helper) and `certificate.serviceName` defaults to it, so
    `nameOverride` alone also flips commonName/dnsNames/SPIFFE-URI; this
    is backwards compatible since the default still renders nico. Adds
    `certificate.spiffeServiceName` and `certificate.identityNamespace`
    to decouple the SPIFFE `/sa/` name and namespace from the k8s service.

Cross-references and authz:

  - Clients reach the API via `apiServiceName` (default `nico-api`) in
    nico-dns/pxe/dhcp/ssh-console-rs.
  - The API and bmc-proxy auth config files are rendered with `tpl`
    rather than shipped verbatim, so `spiffe_trust_domain` tracks
    `global.spiffe.trustDomain`, the SPIFFE base-path namespace comes
    from `auth.namespace`, the casbin principals from
    `auth.principals.*`, and the bmc-proxy principal from
    `auth.apiPrincipal`. Whole-file replacement via `configFiles.*` is
    unchanged (that branch is not `tpl`-rendered).

Binary-read names (env vars, the Kea `nico-api-url` param, config keys,
the `/opt/nico` and `/etc/nico` runtime paths, and the dual
`*-config.toml` ConfigMap data keys) are deliberately left intact so the
chart keeps working with either the nico or carbide image variant.

Adds helm-unittest coverage asserting both the nico default render and
the carbide/forge override for naming, certificate SANs, the
`apiServiceName` cross-refs, and the authz config.
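
As a rough illustration of the override path this commit describes, the sketch below writes a values file that flips two subcharts back to legacy names. Only `nameOverride` and `apiServiceName` come from the commit message; the `write_legacy_overrides` helper, the subchart keys as they would appear under the umbrella chart, the placement of `apiServiceName` at the top level of a subchart's values, and the legacy names themselves are assumptions.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): write a values override
# file to the path given as $1. Keys and legacy names are illustrative only.
write_legacy_overrides() {
    cat > "$1" <<'EOF'
# Assumed umbrella-chart layout: one top-level key per subchart.
nico-api:
  nameOverride: carbide-api
nico-dns:
  nameOverride: carbide-dns
  # Point the client at the renamed API service.
  apiServiceName: carbide-api
EOF
}

# Possible use (release name and chart path are placeholders):
#   write_legacy_overrides legacy-values.yaml
#   helm template my-release ./path/to/umbrella-chart -f legacy-values.yaml
```

Rendering with `helm template` and inspecting the emitted resource names and certificate SANs would be one way to check an override by hand, alongside the helm-unittest coverage the commit adds.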

4 participants