feat: add list/insert/remove RPCs for NVL domain health records #1832

@jayzhudev

Description

Is this a new feature, an enhancement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Medium

Please provide a clear description of problem this feature solves

NICo core does not currently receive, store, or present NVLink domain health alerts. External health services should be able to send NVL domain health alerts to NICo so that consumers such as admin-cli and web-ui can surface them for improved observability.

Feature Description

NICo core shall implement RPCs for inserting, removing, and listing NVLink domain health records (alerts), and shall store those records so that admin-cli, web-ui, and external API callers can consume them.

Describe your ideal solution

  • At the API layer, add Insert/List/RemoveNVLinkDomainHealthReport, following the implementation pattern of the existing health record RPCs in forge.proto, such as Insert/List/RemoveSwitchHealthReport.
  • Store the records in a new DB table and add support in admin-cli and web-ui for the list/remove paths.
  • Only "unhealthy" records should be accepted, and they are informational only: no auto-remediation is assumed or required.
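
To make the three points above concrete, here is a minimal sketch of the insert/list/remove semantics. It is not the forge.proto API or NICo's storage layer: the record fields, the table name `nvl_domain_health_reports`, the `(domain_id, source)` key, and all class and method names are hypothetical, and an in-memory SQLite database stands in for whatever database NICo actually uses.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class NvlDomainHealthReport:
    """Hypothetical record shape; real fields would be defined in forge.proto."""
    domain_id: str  # hypothetical NVLink domain identifier
    source: str     # hypothetical name of the external health service
    healthy: bool
    message: str


class NvlDomainHealthStore:
    """Illustrative store; SQLite is a stand-in for the real DB table."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE nvl_domain_health_reports ("
            " domain_id TEXT NOT NULL,"
            " source TEXT NOT NULL,"
            " message TEXT NOT NULL,"
            " PRIMARY KEY (domain_id, source))"
        )

    def insert_report(self, report: NvlDomainHealthReport) -> None:
        # Only unhealthy records are accepted; they are informational only.
        if report.healthy:
            raise ValueError("only unhealthy records are accepted")
        with self._db:  # commits on success, rolls back on error
            # Assumption: a new report from the same source replaces the old one.
            self._db.execute(
                "INSERT OR REPLACE INTO nvl_domain_health_reports VALUES (?, ?, ?)",
                (report.domain_id, report.source, report.message),
            )

    def list_reports(self, domain_id: Optional[str] = None) -> List[NvlDomainHealthReport]:
        query = "SELECT domain_id, source, message FROM nvl_domain_health_reports"
        params: tuple = ()
        if domain_id is not None:
            query += " WHERE domain_id = ?"
            params = (domain_id,)
        query += " ORDER BY domain_id, source"
        rows = self._db.execute(query, params).fetchall()
        return [NvlDomainHealthReport(d, s, False, m) for d, s, m in rows]

    def remove_report(self, domain_id: str, source: str) -> bool:
        # Returns True if a record was deleted, False if none matched.
        with self._db:
            cur = self._db.execute(
                "DELETE FROM nvl_domain_health_reports"
                " WHERE domain_id = ? AND source = ?",
                (domain_id, source),
            )
        return cur.rowcount > 0
```

Keying records on `(domain_id, source)` is one possible choice that lets several health services report on the same domain independently; the actual key should mirror whatever the existing switch health report table uses.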

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow NCX Infra Controller's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Assignees

Labels

  • admin tools: Issue related to admin tools (CLI/UI)
  • api: affects API surface area
  • feature: Feature (deprecated - use issue type, but it's needed for reporting now)
  • rack lifecycle: Issues that relate to managing the lifecycle of a full rack (compute, switches and powershelves)
No fields configured for Enhancement.

Projects

Status

In Progress

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
