Skip to content

feat(cli): complete CLI with full CRUD operations (#563)#621

Open
ArangoGutierrez wants to merge 13 commits intoNVIDIA:mainfrom
ArangoGutierrez:feat/563-cli-crud-operations
Open

feat(cli): complete CLI with full CRUD operations (#563)#621
ArangoGutierrez wants to merge 13 commits intoNVIDIA:mainfrom
ArangoGutierrez:feat/563-cli-crud-operations

Conversation

@ArangoGutierrez
Copy link
Collaborator

Summary

Implements the complete CLI CRUD operations epic (#563) to unblock 11 dependent PRs. This adds:

  • Logger verbosity levels: quiet/normal/verbose/debug with -q, --verbose, -d flags
  • Output formatting: table/JSON/YAML support via pkg/output package
  • New commands: describe, get kubeconfig, get ssh-config, ssh, scp, update
  • Refactored list: renamed --quiet to --ids-only for clarity
  • Shared utilities: cmd/cli/common package with GetHostURL and ConnectSSH
  • Status command: integrated output formatting with --output flag

Key changes

  • internal/logger/logger.go — verbosity levels (Quiet/Normal/Verbose/Debug)
  • pkg/output/output.go — Formatter, TablePrinter, TableData interface
  • cmd/cli/common/host.go — shared host URL resolution and SSH connection
  • cmd/cli/describe/ — detailed instance info with JSON/YAML/table output
  • cmd/cli/get/ — kubeconfig download and SSH config generation
  • cmd/cli/ssh/ — interactive SSH and remote command execution
  • cmd/cli/scp/ — SFTP file transfer with recursive directory support
  • cmd/cli/update/ — add components, labels, and re-provision instances

Review fixes included

  • C1: Fixed nil map panic when writing labels during --reprovision
  • C2: Fixed SSH remote command quoting (replaced broken containsSpace logic)
  • I1: Extracted duplicate getHostURL/connectSSH into shared common package
  • I2: Use path (POSIX) for remote SFTP paths instead of filepath
  • I3: Log warnings for skipped files during recursive remote copy
  • I4: Use c.IsSet() for flags with defaults to prevent config overwrites

Test plan

  • All existing CLI tests pass (go test ./cmd/cli/...)
  • New unit tests for output formatting (pkg/output)
  • New unit tests for GetHostURL across ssh, get, and common packages
  • New unit tests for parsePath, describe, update, and label handling
  • Regression test for nil Labels map panic (C1)
  • Manual: holodeck list, holodeck describe <id>, holodeck ssh <id>
  • Manual: holodeck scp ./file <id>:/tmp/, holodeck get kubeconfig <id>
  • Manual: holodeck update <id> --add-driver --driver-version 560.35.03

Closes #563

Copilot AI review requested due to automatic review settings February 6, 2026 08:14
ssh.PublicKeys(signer),
},
// Holodeck instances are ephemeral with no pre-established host keys
HostKeyCallback: ssh.InsecureIgnoreHostKey(), //nolint:gosec

Check failure

Code scanning / CodeQL

Use of insecure HostKeyCallback implementation High

Configuring SSH ClientConfig with insecure HostKeyCallback implementation from
this source
.

Copilot Autofix

AI about 1 hour ago

In general, to fix this problem you need to replace ssh.InsecureIgnoreHostKey() with a HostKeyCallback implementation that validates the server’s host key against an allow list (for example, keys loaded from a known_hosts-style file or a single pinned key using ssh.FixedHostKey). If the server presents a key that is not on the allow list, the callback must return an error so the connection fails.

For this specific function, the least invasive fix that preserves behavior for callers is to introduce a small helper that loads allowed host keys from a file path provided via environment or a conventional location, parses them, and builds a HostKeyCallback that enforces this allow list. We then call this helper from ConnectSSH and set the resulting callback in ssh.ClientConfig. This keeps the signature of ConnectSSH unchanged and adds security without changing how callers invoke it. If no allowed host keys can be loaded, the helper will return an error and ConnectSSH will fail rather than silently disabling verification.

Concretely:

  • In cmd/cli/common/host.go, add imports for bufio, errors, net, strings, and golang.org/x/crypto/ssh/knownhosts.
  • Above ConnectSSH, define a helper getHostKeyCallback() that:
    • Determines a known-hosts file path (for example from an environment variable HOLODECK_SSH_KNOWN_HOSTS, falling back to $HOME/.ssh/known_hosts).
    • Uses knownhosts.New(path) to construct a ssh.HostKeyCallback.
    • Wraps this callback in our own function that adds a clearer error message if validation fails.
  • In ConnectSSH, call getHostKeyCallback() and assign the returned callback to HostKeyCallback instead of ssh.InsecureIgnoreHostKey().

This uses the standard golang.org/x/crypto/ssh/knownhosts helper rather than custom parsing, and relies on a conventional known_hosts file as the allow list, which is a well-understood and secure pattern.

Suggested changeset 1
cmd/cli/common/host.go

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/cmd/cli/common/host.go b/cmd/cli/common/host.go
--- a/cmd/cli/common/host.go
+++ b/cmd/cli/common/host.go
@@ -17,11 +17,17 @@
 package common
 
 import (
+	"bufio"
+	"errors"
 	"fmt"
+	"net"
 	"os"
+	"path/filepath"
+	"strings"
 	"time"
 
 	"golang.org/x/crypto/ssh"
+	"golang.org/x/crypto/ssh/knownhosts"
 
 	"github.com/NVIDIA/holodeck/api/holodeck/v1alpha1"
 	"github.com/NVIDIA/holodeck/internal/logger"
@@ -71,9 +72,61 @@
 	return "", fmt.Errorf("unable to determine host URL")
 }
 
+// getHostKeyCallback returns a HostKeyCallback that validates host keys
+// against a known_hosts-style allow list.
+// It looks for a known_hosts file in the following order:
+//   1. HOLODECK_SSH_KNOWN_HOSTS environment variable
+//   2. $HOME/.ssh/known_hosts
+func getHostKeyCallback() (ssh.HostKeyCallback, error) {
+	knownHostsPath := os.Getenv("HOLODECK_SSH_KNOWN_HOSTS")
+	if knownHostsPath == "" {
+		homeDir, err := os.UserHomeDir()
+		if err != nil {
+			return nil, fmt.Errorf("unable to determine user home directory for known_hosts: %w", err)
+		}
+		knownHostsPath = filepath.Join(homeDir, ".ssh", "known_hosts")
+	}
+
+	file, err := os.Open(knownHostsPath)
+	if err != nil {
+		return nil, fmt.Errorf("failed to open known_hosts file %s: %w", knownHostsPath, err)
+	}
+	defer file.Close()
+
+	scanner := bufio.NewScanner(file)
+	hasEntries := false
+	for scanner.Scan() {
+		line := strings.TrimSpace(scanner.Text())
+		if line == "" || strings.HasPrefix(line, "#") {
+			continue
+		}
+		hasEntries = true
+		break
+	}
+	if err := scanner.Err(); err != nil {
+		return nil, fmt.Errorf("error reading known_hosts file %s: %w", knownHostsPath, err)
+	}
+	if !hasEntries {
+		return nil, errors.New("known_hosts file is empty; refusing to disable host key verification")
+	}
+
+	hostKeyCallback, err := knownhosts.New(knownHostsPath)
+	if err != nil {
+		return nil, fmt.Errorf("failed to create known_hosts host key callback from %s: %w", knownHostsPath, err)
+	}
+
+	// Wrap the knownhosts callback to provide clearer error messages.
+	return func(hostname string, remote net.Addr, key ssh.PublicKey) error {
+		if err := hostKeyCallback(hostname, remote, key); err != nil {
+			return fmt.Errorf("host key verification failed for %s (%s): %w", hostname, remote.String(), err)
+		}
+		return nil
+	}, nil
+}
+
 // ConnectSSH establishes an SSH connection with retries.
-// Holodeck instances are ephemeral with no pre-established host keys,
-// so host key verification is intentionally disabled.
+// Holodeck instances are ephemeral, so host keys are validated against a
+// configurable known_hosts allow list instead of being ignored.
 func ConnectSSH(log *logger.FunLogger, keyPath, userName, hostUrl string) (*ssh.Client, error) {
 	key, err := os.ReadFile(keyPath) //nolint:gosec // keyPath is from trusted env config
 	if err != nil {
@@ -85,13 +137,17 @@
 		return nil, fmt.Errorf("failed to parse private key: %v", err)
 	}
 
+	hostKeyCallback, err := getHostKeyCallback()
+	if err != nil {
+		return nil, fmt.Errorf("failed to initialize SSH host key verification: %v", err)
+	}
+
 	config := &ssh.ClientConfig{
 		User: userName,
 		Auth: []ssh.AuthMethod{
 			ssh.PublicKeys(signer),
 		},
-		// Holodeck instances are ephemeral with no pre-established host keys
-		HostKeyCallback: ssh.InsecureIgnoreHostKey(), //nolint:gosec
+		HostKeyCallback: hostKeyCallback,
 		Timeout:         30 * time.Second,
 	}
 
EOF
Copilot is powered by AI and may make mistakes. Always verify output.
Design for issue NVIDIA#563 covering:
- New CLI commands (describe, get, ssh, scp, update)
- Global verbosity flags (-q, -v, -d)
- Output formatting infrastructure
- Implementation tasks and testing strategy

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Detailed task-by-task implementation plan for:
- Global verbosity flags (-q, -v, -d)
- Unit tests for all new CLI commands
- Output formatter tests

11 tasks with TDD approach and commit checkpoints.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Add Verbosity type with four levels (Quiet, Normal, Verbose, Debug) to
control log output. Info, Check, and Warning methods now respect
verbosity settings, while Error always prints. New Debug and Trace
methods provide additional logging granularity for troubleshooting.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Add global flags for controlling output verbosity:
- --quiet, -q: Suppress non-error output
- --verbose: Enable verbose output
- --debug, -d: Enable debug-level logging (also reads DEBUG env var)

Flag precedence: --debug > --verbose > --quiet

Note: -v was not used for --verbose to avoid conflict with built-in --version flag.
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Rename the --quiet flag to --ids-only to avoid conflict with the new
global --quiet flag that controls verbosity. The --ids-only flag now
only outputs instance IDs, one per line.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Add comprehensive unit tests for the pkg/output package covering:
- ValidFormats() and IsValidFormat() validation functions
- NewFormatter() with empty, valid, and invalid format inputs
- Formatter.Format() accessor method
- PrintJSON() with struct, map, and slice data types
- PrintYAML() with struct, map, and nested struct data
- Print() dispatch logic for JSON, YAML, and table formats
- PrintTable() with mock TableData interface
- TablePrinter fluent interface (Header, Row, Flush)
- SetWriter() for output redirection
- Error message quality for invalid formats

Achieves 97.9% code coverage.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Adds pkg/output with Formatter, TablePrinter, and TableData interface
for consistent output formatting across all CLI commands.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Adds -o flag (json/yaml/table) to holodeck status with structured
output types for JSON/YAML serialization.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
New commands for issue NVIDIA#563:
- describe: detailed instance introspection with JSON/YAML output
- get kubeconfig: download kubeconfig from instance
- get ssh-config: generate SSH config entry
- ssh: SSH into instance or run remote commands
- scp: copy files to/from instance via SFTP
- update: update instance configuration (add components, labels)

All commands include unit tests for core logic.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Updates list_test.go to use --ids-only instead of the old --quiet/-q
flag which was renamed to avoid conflict with the global -q flag.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
C1: Add nil check for env.Labels before writing provisioned label
in update --reprovision. Previously panicked when no --label flags
were provided and the cached environment had no labels.

C2: Replace incorrect containsSpace+fmt.Sprintf("%q") quoting with
simple strings.Join. SSH session.Run always passes through the
remote shell, so Go-style quoting was wrong. Document this behavior.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
- Extract getHostURL and connectSSH into cmd/cli/common package,
  eliminating duplication across ssh, scp, and get commands (I1)
- Use path instead of filepath for remote SFTP paths to ensure
  correct POSIX separators on all platforms (I2)
- Log warnings for skipped files during recursive remote copy (I3)
- Use c.IsSet() for flags with defaults to prevent overwriting
  existing config with default values (I4)

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
@ArangoGutierrez ArangoGutierrez force-pushed the feat/563-cli-crud-operations branch from ef73606 to fb19ac0 Compare February 6, 2026 08:17
@coveralls
Copy link

coveralls commented Feb 6, 2026

Pull Request Test Coverage Report for Build 21747509586

Details

  • 85 of 87 (97.7%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+1.0%) to 45.946%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/output/output.go 85 87 97.7%
Totals Coverage Status
Change from base Build 21708182973: 1.0%
Covered Lines: 2091
Relevant Lines: 4551

💛 - Coveralls

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR implements a comprehensive set of CLI CRUD operations to complete the Holodeck CLI (#563), adding essential features for managing GPU-ready cloud environments. The changes introduce global verbosity levels, structured output formatting, and critical SSH/file transfer capabilities.

Changes:

  • Added logger verbosity system (Quiet/Normal/Verbose/Debug) with corresponding flags (-q, --verbose, -d)
  • Created pkg/output package for consistent JSON/YAML/table formatting across commands
  • Implemented six new commands: describe, get (kubeconfig/ssh-config), ssh, scp, and update
  • Refactored list command to support output formats and renamed --quiet to --ids-only
  • Created shared cmd/cli/common package with GetHostURL and ConnectSSH utilities

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
internal/logger/logger.go Added verbosity levels (Quiet/Normal/Verbose/Debug) with new Debug() and Trace() methods
internal/logger/logger_test.go Comprehensive unit tests for verbosity filtering and methods
pkg/output/output.go Formatter interface with table/JSON/YAML support and TableData interface
pkg/output/output_test.go 732 lines of tests covering all output formats and edge cases
cmd/cli/common/host.go Shared GetHostURL and ConnectSSH utilities (contains bug)
cmd/cli/describe/describe.go Detailed instance introspection with multi-format output
cmd/cli/get/get.go Kubeconfig download and SSH config generation subcommands
cmd/cli/ssh/ssh.go Interactive SSH and remote command execution
cmd/cli/scp/scp.go SFTP file transfer with recursive directory support, proper POSIX path handling
cmd/cli/update/update.go Add components, labels, and re-provision instances (contains bug)
cmd/cli/list/list.go Refactored with output formatting, renamed --quiet to --ids-only
cmd/cli/status/status.go Added output formatting with --output flag
cmd/cli/main.go Integrated global verbosity flags and new commands

}
}
case v1alpha1.ProviderSSH:
return env.Spec.HostUrl, nil
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field path should be env.Spec.Instance.HostUrl not env.Spec.HostUrl. The HostUrl field is defined on the Instance struct, not directly on EnvironmentSpec. This will cause a compilation error or runtime panic.

Suggested change
return env.Spec.HostUrl, nil
if env.Spec.Instance != nil {
return env.Spec.Instance.HostUrl, nil
}

Copilot uses AI. Check for mistakes.
}
}
} else if env.Spec.Provider == v1alpha1.ProviderSSH {
hostUrl = env.Spec.HostUrl
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The field path should be env.Spec.Instance.HostUrl not env.Spec.HostUrl. The HostUrl field is defined on the Instance struct, not directly on EnvironmentSpec. This will cause a compilation error or runtime panic.

Suggested change
hostUrl = env.Spec.HostUrl
if env.Spec.Instance == nil {
return fmt.Errorf("instance spec is required for SSH provider")
}
hostUrl = env.Spec.Instance.HostUrl

Copilot uses AI. Check for mistakes.
- Fix gofmt, gosec, staticcheck, unconvert, and errcheck lint issues
- Promote gopkg.in/yaml.v3 from indirect to direct dependency
- Remove planning documents (development artifacts)

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
@ArangoGutierrez ArangoGutierrez force-pushed the feat/563-cli-crud-operations branch from 5f98a33 to 8440213 Compare February 6, 2026 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Epic] Complete CLI with Full CRUD Operations

2 participants