Release v2.15.0 · NVIDIA/ais-k8s

Reconciliation and rollout improvements:
- Restore rebalance config before decommissioning targets during scale-down, as it may still be disabled from a prior rollout.
- Separated rollout (pod template updates) and scaling (replica count changes) into explicitly guarded operations that cannot overlap
- Replaced ClusterScaling CR state with predicates inferred from StatefulSet status fields; reconciliation decisions are no longer driven by CR state
- Target decommission now checks pod status when a node is absent from the cluster map, skipping pods that are NotFound, Unschedulable, or in CrashLoopBackOff and waiting for others to register.
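The decommission check in the last item can be read as a three-way decision. The following is a minimal sketch of that reading, not the operator's implementation: the `Action` enum, the `decommission_action` function, and the string labels for pod states are all hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Set


class Action(Enum):
    DECOMMISSION = "decommission"  # node is registered, decommission it normally
    SKIP = "skip"                  # pod is not expected to register, move on
    WAIT = "wait"                  # pod may still register, check again later


# Pod states the release notes list as skippable. The string labels are
# assumptions for this sketch, not values read from the operator source.
SKIPPABLE_STATES = {"NotFound", "Unschedulable", "CrashLoopBackOff"}


def decommission_action(node_id: str, cluster_map: Set[str], pod_state: str) -> Action:
    """Decide what to do with one target during scale-down."""
    if node_id in cluster_map:
        return Action.DECOMMISSION
    # Node is absent from the cluster map: fall back to the pod's status.
    if pod_state in SKIPPABLE_STATES:
        return Action.SKIP
    return Action.WAIT
```

For example, a target whose pod is in CrashLoopBackOff and never joined the cluster map is skipped, while a pod that is merely Pending causes the reconciler to wait.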
Admin client reconciliation:
- Skips externally-managed deployments (e.g. deployed via Helm) to avoid conflicts
- Uses K8s patch API to avoid update conflicts
- Fixed a bug where fields defaulted by K8s caused API calls on every reconcile
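The release notes do not say how the default-fields bug was fixed. As an illustration of the general problem, the sketch below shows why a plain equality check between the desired spec and the live object reports a difference on every reconcile once the API server has filled in defaults, and one common way around it: comparing only the fields the operator actually set. The `covers` helper and the sample specs are hypothetical.

```python
from typing import Any


def covers(live: Any, desired: Any) -> bool:
    """Return True if every field set in `desired` has the same value in `live`.

    Extra fields in `live` (for example, server-side defaults) are ignored.
    """
    if isinstance(desired, dict):
        return isinstance(live, dict) and all(
            key in live and covers(live[key], value)
            for key, value in desired.items()
        )
    if isinstance(desired, list):
        return (
            isinstance(live, list)
            and len(live) == len(desired)
            and all(covers(l, d) for l, d in zip(live, desired))
        )
    return live == desired


# What the operator asked for (hypothetical, heavily trimmed).
desired = {
    "replicas": 1,
    "template": {"containers": [{"name": "admin", "image": "example/admin:1"}]},
}

# What the API server returns after filling in defaults.
live = {
    "replicas": 1,
    "revisionHistoryLimit": 10,
    "template": {
        "containers": [
            {"name": "admin", "image": "example/admin:1", "imagePullPolicy": "IfNotPresent"}
        ],
        "restartPolicy": "Always",
    },
}

needs_update_naive = live != desired      # differs on every reconcile
needs_update = not covers(live, desired)  # no operator-set field has changed
```

With the naive check the reconciler would issue an update each time; with the subset check it only does so when a field it owns has drifted.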

AuthN support for operator-managed admin client when spec.auth.usernamePassword is configured
Restricted security context for init and logSidecar containers
Native support for arm64 hosts with multi-arch container image build targets
Added operator_state.md documenting the cluster lifecycle states
spec.proxySpec.pvcRetentionPolicy and spec.targetSpec.pvcRetentionPolicy for configuring persistent volume claim retention policies
Default init container resources set to 1 CPU / 1Gi memory (requests == limits) to support Guaranteed QoS
- Applied on spec sync to avoid forced rollout
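The init container defaults matter because Kubernetes assigns the Guaranteed QoS class only when every container in the pod, init containers included, has CPU and memory limits with matching requests. The sketch below is a simplified classifier that illustrates the rule. The `qos_class` function is hypothetical, and it compares quantities as plain strings, so it assumes requests and limits are written in the same form (it would not treat "1" and "1000m" as equal).

```python
from typing import Dict, List

# One container's resources: {"requests": {...}, "limits": {...}}
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Classify a pod from the resources of all its containers."""
    any_set = False
    guaranteed = True
    for res in containers:
        requests = res.get("requests", {})
        limits = res.get("limits", {})
        if requests or limits:
            any_set = True
        for name in ("cpu", "memory"):
            limit = limits.get(name)
            # Kubernetes defaults a missing request to the limit.
            request = requests.get(name, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Init container with the defaults from the release notes: 1 CPU / 1Gi.
init_container = {
    "requests": {"cpu": "1", "memory": "1Gi"},
    "limits": {"cpu": "1", "memory": "1Gi"},
}
# Hypothetical main container that already has requests == limits.
main_container = {
    "requests": {"cpu": "4", "memory": "8Gi"},
    "limits": {"cpu": "4", "memory": "8Gi"},
}
```

An init container with no resources set would pull an otherwise Guaranteed pod down to Burstable, which is the situation the new defaults avoid.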
Reconciliation for volumes and priority class name.
Sync primary container securityContext from spec
spec.proxySpec.probes and spec.targetSpec.probes for configuring health probe timing parameters (liveness, readiness, startup) per daemon role
