Skip to content

fix(driver-kubernetes): propagate log_level as OPENSHELL_LOG_LEVEL env var#1310

Merged
TaylorMutch merged 1 commit into
NVIDIA:mainfrom
mesutoezdil:fix/k8s-log-level-propagation
May 11, 2026
Merged

fix(driver-kubernetes): propagate log_level as OPENSHELL_LOG_LEVEL env var#1310
TaylorMutch merged 1 commit into
NVIDIA:mainfrom
mesutoezdil:fix/k8s-log-level-propagation

Conversation

@mesutoezdil
Copy link
Copy Markdown
Contributor

Summary

The Kubernetes driver was inserting logLevel and environment fields at the Sandbox CR spec level inside sandbox_to_k8s_spec. Neither field exists in the agents.x-k8s.io/v1alpha1 CRD schema. Kubernetes accepts them on unstructured writes and the controller drops them silently, so log_level never reached the sandbox pod.

Changes:

  • Remove the dead CR-level insertions of logLevel and environment
  • Build a merged environment map that includes OPENSHELL_LOG_LEVEL when spec.log_level is set, and pass it to sandbox_template_to_k8s so the value flows through build_env_list into the pod container spec
  • Add a unit test verifying OPENSHELL_LOG_LEVEL appears in the container env and that logLevel is absent from the CR spec

Related Issue

Fixes #1265

Testing

  • mise run pre-commit passes
  • Unit test added: log_level_propagates_as_env_var_to_sandbox_pod
  • All 29 unit tests pass

Checklist

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 11, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

…v var

The sandbox_to_k8s_spec function was inserting logLevel and environment
fields at the Sandbox CR spec level. Neither field exists in the
agents.x-k8s.io/v1alpha1 CRD schema, so Kubernetes silently dropped them
and the log_level value never reached the sandbox pod.

Remove the dead CR-level insertions. Build a merged environment map that
includes OPENSHELL_LOG_LEVEL when log_level is set, and pass it to
sandbox_template_to_k8s so the value flows through build_env_list into
the pod container spec.

Add a unit test verifying OPENSHELL_LOG_LEVEL appears in the container
env and that logLevel is absent from the CR spec.

Fixes NVIDIA#1265
@mesutoezdil mesutoezdil force-pushed the fix/k8s-log-level-propagation branch from ee6e3fe to 64fd96a Compare May 11, 2026 16:58
@TaylorMutch TaylorMutch self-assigned this May 11, 2026
Copy link
Copy Markdown
Collaborator

@TaylorMutch TaylorMutch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validated locally on a fresh k3d cluster via mise run helm:k3s:create + mise run helm:skaffold:run against this branch (commit 64fd96ab).

Findings

  • Unit test driver::tests::log_level_propagates_as_env_var_to_sandbox_pod passes.
  • End-to-end positive case: CreateSandbox with spec.log_level="debug" produces a Sandbox CR where the agent container's env array contains OPENSHELL_LOG_LEVEL=debug, and the legacy spec.logLevel / spec.environment fields at the CR root are no longer written. The downstream pod inherits the env var ({"name":"OPENSHELL_LOG_LEVEL","value":"debug"}).
  • End-to-end negative case: CreateSandbox with an empty log_level produces no OPENSHELL_LOG_LEVEL entry in the container env, leaving only the standard supervisor variables — confirming the filter(|s| !s.log_level.is_empty()) guard.
  • The new spec_pod_env helper is correctly threaded through both the template-present and the default-template branches of sandbox_to_k8s_spec, so behavior is consistent regardless of whether the caller supplied a template.

LGTM.

@TaylorMutch TaylorMutch merged commit b9b8bc3 into NVIDIA:main May 11, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug(k8s): Kubernetes driver log_level and environment fields set on Sandbox CR spec have no effect

2 participants