
feat(runtime): support multiple installation sources for container runtimes #637

Merged
ArangoGutierrez merged 2 commits into NVIDIA:main from ArangoGutierrez:feat/issue-567-runtime-sources on Feb 12, 2026

Conversation

@ArangoGutierrez
Collaborator

Summary

  • Extend ContainerRuntime to install containerd, Docker, and CRI-O from distribution packages (default), git source builds, or by tracking a moving branch
  • Follows the existing CTK/Kubernetes multi-source pattern with full backward compatibility

Context

Part of #567 (Phase 2 — Runtime Sources). Depends on #635 (provenance) for BuildComponentsStatus awareness, but can be merged independently.

New API Types

  • RuntimeSource enum: package, git, latest
  • RuntimePackageSpec — version pinning
  • RuntimeGitSpec — repo + ref
  • RuntimeLatestSpec — branch tracking + repo override
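
To make the shape of these types concrete, here is a minimal sketch of how they might fit together. It is not the code from `api/holodeck/v1alpha1/types.go`: the names `RuntimeSourceLatest`, `Track`, `Install`, `Name`, `Source`, `Latest`, and `Git.Ref` appear elsewhere in this PR, while the remaining field names, the constant names for `package` and `git`, the helper `effectiveSource`, and the example version string are assumptions made for illustration.

```go
package main

import "fmt"

// RuntimeSource selects where a container runtime is installed from.
type RuntimeSource string

const (
	RuntimeSourcePackage RuntimeSource = "package" // distribution packages (default)
	RuntimeSourceGit     RuntimeSource = "git"     // build from a git ref
	RuntimeSourceLatest  RuntimeSource = "latest"  // track a moving branch
)

// RuntimePackageSpec pins a package version (field name is an assumption).
type RuntimePackageSpec struct {
	Version string
}

// RuntimeGitSpec names a repository and a ref to build from.
type RuntimeGitSpec struct {
	Repo string
	Ref  string
}

// RuntimeLatestSpec tracks a branch, optionally from a non-default repository.
type RuntimeLatestSpec struct {
	Track string
	Repo  string // assumed name for the repo override
}

// ContainerRuntime is a trimmed, hypothetical view of the API type.
type ContainerRuntime struct {
	Install bool
	Name    string
	Version string // legacy field kept for backward compatibility
	Source  RuntimeSource
	Package *RuntimePackageSpec
	Git     *RuntimeGitSpec
	Latest  *RuntimeLatestSpec
}

// effectiveSource treats an unset Source as "package", the documented default.
func effectiveSource(cr ContainerRuntime) RuntimeSource {
	if cr.Source == "" {
		return RuntimeSourcePackage
	}
	return cr.Source
}

func main() {
	// A pre-existing config that only sets Version keeps working.
	legacy := ContainerRuntime{Install: true, Name: "containerd", Version: "1.7.20"}
	// A new-style config that builds from a git ref.
	fromGit := ContainerRuntime{
		Install: true,
		Name:    "containerd",
		Source:  RuntimeSourceGit,
		Git:     &RuntimeGitSpec{Ref: "main"},
	}
	fmt.Println(effectiveSource(legacy), effectiveSource(fromGit))
}
```

Defaulting an empty source to `package` is one way to keep configurations that only set the legacy `Version` field valid without changes.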

Templates (3 runtimes x 3 sources)

| Runtime | Package | Git | Latest |
|---|---|---|---|
| containerd | v1.x (apt/dnf), v2.x (binary) | Build from source + runc + CNI | Track branch |
| Docker | apt packages + cri-dockerd | Build moby + cri-dockerd | N/A |
| CRI-O | pkgs.k8s.io packages | Build from source + conmon + CNI | N/A |

Wiring

  • dependency.go: git ref resolution for all three runtimes
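
As a rough illustration of what this wiring involves, the sketch below fills in a default repository per runtime and then resolves a ref to a commit. The moby and cri-o URLs are the defaults shown in this PR's review thread; the containerd URL, the `refResolver` function type, and `resolveRuntimeRef` are assumptions. The real code uses `gitref.NewGitHubResolver`, whose signature is not shown here, so a stub stands in for it.

```go
package main

import "fmt"

// defaultRepos maps a runtime name to the repository used when none is given.
var defaultRepos = map[string]string{
	"containerd": "https://github.com/containerd/containerd.git", // assumption
	"docker":     "https://github.com/moby/moby.git",
	"crio":       "https://github.com/cri-o/cri-o.git",
}

// refResolver is a hypothetical stand-in for the real resolver: it turns a
// repository and ref (tag, branch, or SHA) into a commit identifier.
type refResolver func(repo, ref string) (string, error)

// resolveRuntimeRef applies the default repository if needed, then resolves ref.
func resolveRuntimeRef(resolve refResolver, runtime, repo, ref string) (string, string, error) {
	if repo == "" {
		d, ok := defaultRepos[runtime]
		if !ok {
			return "", "", fmt.Errorf("no default repository for runtime %q", runtime)
		}
		repo = d
	}
	if ref == "" {
		return "", "", fmt.Errorf("git ref is required for runtime %q", runtime)
	}
	commit, err := resolve(repo, ref)
	if err != nil {
		return "", "", fmt.Errorf("resolving %s@%s: %w", repo, ref, err)
	}
	return repo, commit, nil
}

func main() {
	// The stub returns a fixed value instead of calling any remote API.
	stub := func(repo, ref string) (string, error) { return "0123abc", nil }
	repo, commit, err := resolveRuntimeRef(stub, "docker", "", "master")
	fmt.Println(repo, commit, err)
}
```

Passing the resolver in as a function keeps the defaulting logic testable without network access.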

Test plan

  • 13 validation tests (all runtime source combinations)
  • Updated constructor tests for containerd, Docker, CRI-O
  • Git/latest source template tests
  • go build ./... passes
  • go vet ./... passes
  • All existing tests pass (68 Ginkgo specs + unit tests)
  • E2E: provision each runtime from git source

Copilot AI review requested due to automatic review settings February 11, 2026 17:17
coveralls commented Feb 11, 2026

Pull Request Test Coverage Report for Build 21950537158

Details

  • 143 of 194 (73.71%) changed or added relevant lines in 4 files are covered.
  • 3 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.5%) to 47.501%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| pkg/provisioner/templates/containerd.go | 59 | 64 | 92.19% |
| pkg/provisioner/templates/crio.go | 36 | 42 | 85.71% |
| pkg/provisioner/templates/docker.go | 36 | 43 | 83.72% |
| pkg/provisioner/dependency.go | 12 | 45 | 26.67% |

| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| pkg/provisioner/templates/crio.go | 1 | 84.44% |
| pkg/provisioner/templates/containerd.go | 2 | 85.71% |

| Totals | |
|---|---|
| Change from base Build 21948705726 | 0.5% |
| Covered Lines | 2500 |
| Relevant Lines | 5263 |

💛 - Coveralls

Copilot AI left a comment

Pull request overview

This PR extends container runtime provisioning to support multiple installation sources (package, git, latest) for containerd, Docker, and CRI-O. It is part of epic #567, Phase 2, and follows the multi-source pattern already used for the NVIDIA Container Toolkit and Kubernetes; per the PR description, #635 is the provenance change it builds on. The changes keep full backward compatibility with the existing Version field while introducing new structured source specifications.

Changes:

  • Added API types for runtime source selection: RuntimeSource enum and RuntimePackageSpec, RuntimeGitSpec, RuntimeLatestSpec configuration structs
  • Implemented git-based installation templates for all three runtimes with proper build dependencies, source compilation, and systemd service creation
  • Added "latest" branch tracking support for containerd (Docker and CRI-O intentionally excluded per design)
  • Refactored constructor functions to return errors and handle multiple source types, with git ref resolution integrated into the provisioner dependency graph
  • Added comprehensive validation logic and test coverage for all source combinations

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| api/holodeck/v1alpha1/types.go | Defines new RuntimeSource enum and spec types for multi-source runtime installation |
| api/holodeck/v1alpha1/validation.go | Adds ContainerRuntime.Validate() method to validate source configurations |
| api/holodeck/v1alpha1/validation_test.go | Comprehensive test coverage for all valid and invalid source combinations |
| pkg/provisioner/templates/containerd.go | Adds git and latest source templates, refactors NewContainerd to return errors |
| pkg/provisioner/templates/containerd_test.go | Updated tests for new sources including git and latest templates |
| pkg/provisioner/templates/docker.go | Adds git source template with moby build support, refactors NewDocker |
| pkg/provisioner/templates/docker_test.go | Updated tests covering package and git sources |
| pkg/provisioner/templates/crio.go | Adds git source template with CRI-O build support, refactors NewCriO |
| pkg/provisioner/templates/crio_test.go | Updated tests covering package and git sources |
| pkg/provisioner/dependency.go | Integrates git ref resolution for all three runtimes using gitref.NewGitHubResolver |

Comment on lines +670 to +690
    {
        name: "Latest source - default",
        cr: ContainerRuntime{
            Install: true,
            Name:    ContainerRuntimeContainerd,
            Source:  RuntimeSourceLatest,
        },
        wantErr: false,
    },
    {
        name: "Latest source - with config",
        cr: ContainerRuntime{
            Install: true,
            Name:    ContainerRuntimeContainerd,
            Source:  RuntimeSourceLatest,
            Latest: &RuntimeLatestSpec{
                Track: "release/1.7",
            },
        },
        wantErr: false,
    },
Copilot AI Feb 11, 2026

The validation tests for ContainerRuntime only test the "latest" source with containerd. There should be additional test cases that verify Docker and CRI-O reject the "latest" source (once the validation logic is updated to enforce this restriction). Add test cases like "Latest source - Docker (should fail)" and "Latest source - CRI-O (should fail)" with wantErr: true.


WORK_DIR=$(mktemp -d)
trap 'rm -rf "$WORK_DIR"' EXIT

Copilot AI Feb 11, 2026

The dockerGitTemplate doesn't validate that GIT_REPO is non-empty before using it in the git clone command (line 281). While NewDocker sets a default repository, defensive programming suggests validating critical variables in shell scripts. Add a check similar to containerd.go line 428: if [[ -z "${GIT_REPO}" ]]; then holodeck_log "ERROR" "$COMPONENT" "GIT_REPO is empty"; exit 1; fi before the git clone command.

Suggested change
    if [[ -z "${GIT_REPO}" ]]; then
        holodeck_log "ERROR" "$COMPONENT" "GIT_REPO is empty"
        exit 1
    fi


WORK_DIR=$(mktemp -d)
trap 'rm -rf "$WORK_DIR"' EXIT

Copilot AI Feb 11, 2026

The crioGitTemplate doesn't validate that GIT_REPO is non-empty before using it in the git clone command (line 168). While NewCriO sets a default repository, defensive programming suggests validating critical variables in shell scripts. Add a check similar to containerd.go line 428: if [[ -z "${GIT_REPO}" ]]; then holodeck_log "ERROR" "$COMPONENT" "GIT_REPO is empty"; exit 1; fi before the git clone command.

Suggested change
    if [[ -z "${GIT_REPO}" ]]; then
        holodeck_log "ERROR" "$COMPONENT" "GIT_REPO is empty"
        exit 1
    fi

Comment on lines +183 to +184
    // Latest source is valid with or without explicit config
    return nil
Copilot AI Feb 11, 2026

The validation accepts RuntimeSourceLatest for all container runtimes, but Docker and CRI-O don't actually implement support for the "latest" source. According to the PR description table, only containerd supports "latest" (Docker and CRI-O are marked "N/A"). The validation should reject "latest" source for Docker and CRI-O to prevent users from specifying an unsupported configuration. Consider adding runtime-specific validation that checks cr.Name and returns an error if source is "latest" but name is "docker" or "crio".

Suggested change
-   // Latest source is valid with or without explicit config
-   return nil
+   // Latest source is only supported for containerd
+   switch cr.Name {
+   case "containerd":
+       // Latest source is valid with or without explicit config for containerd
+       return nil
+   case "docker", "crio":
+       return fmt.Errorf("container runtime source 'latest' is not supported for %s", cr.Name)
+   default:
+       return fmt.Errorf("container runtime source 'latest' is only supported for containerd")
+   }
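
The restriction discussed in this thread, together with the rejection test cases requested in the earlier comment, can be sketched as a small self-contained program. This is a simplified stand-in, not the `ContainerRuntime.Validate()` method from the PR: the trimmed `ContainerRuntime` struct, the plain string fields, and the `validateSource` helper are assumptions, and checks on the git and package specs are omitted.

```go
package main

import "fmt"

// ContainerRuntime is a trimmed, hypothetical version of the API type.
type ContainerRuntime struct {
	Name   string
	Source string
}

// validateSource accepts package and git for every runtime, and latest only
// for containerd, matching the support table in the PR description.
func validateSource(cr ContainerRuntime) error {
	switch cr.Source {
	case "", "package", "git":
		return nil
	case "latest":
		if cr.Name == "containerd" {
			return nil
		}
		return fmt.Errorf("container runtime source 'latest' is not supported for %s", cr.Name)
	default:
		return fmt.Errorf("unknown container runtime source %q", cr.Source)
	}
}

func main() {
	// Table-driven cases in the style of the existing validation tests.
	cases := []struct {
		name    string
		cr      ContainerRuntime
		wantErr bool
	}{
		{"Latest source - containerd", ContainerRuntime{"containerd", "latest"}, false},
		{"Latest source - Docker (should fail)", ContainerRuntime{"docker", "latest"}, true},
		{"Latest source - CRI-O (should fail)", ContainerRuntime{"crio", "latest"}, true},
		{"Git source - Docker", ContainerRuntime{"docker", "git"}, false},
	}
	for _, tc := range cases {
		err := validateSource(tc.cr)
		status := "ok"
		if (err != nil) != tc.wantErr {
			status = "MISMATCH"
		}
		fmt.Printf("%-40s %s\n", tc.name, status)
	}
}
```

Rejecting the combination at validation time surfaces the problem before any provisioning template is rendered.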

    d.GitRef = cr.Git.Ref
    if d.GitRepo == "" {
        d.GitRepo = "https://github.com/moby/moby.git"
    }
Copilot AI Feb 11, 2026

NewDocker doesn't handle RuntimeSourceLatest case in the switch statement. If a user specifies source=latest for Docker (which is not supported per the PR description), the function will silently return with only Source set but no error. This could lead to confusing behavior. Add a case for "latest" that returns an error explaining that Docker doesn't support latest source tracking, or add a default case that handles unknown sources with an error.

Suggested change
-       }
+       }
+   case "latest":
+       return nil, fmt.Errorf("docker does not support latest source tracking; use source=package with version=latest instead")
+   default:
+       return nil, fmt.Errorf("unknown docker source: %s", d.Source)
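
For context, a constructor that returns an error on unsupported sources might look like the sketch below. It is not the PR's `NewDocker`: the `RuntimeConfig`, `GitSpec`, and `Docker` types and the name `newDockerSketch` are hypothetical, and the requirement that a git source carries a ref is an assumption. Only the default moby repository URL and the error messages come from this thread.

```go
package main

import "fmt"

// GitSpec and RuntimeConfig are hypothetical, trimmed inputs.
type GitSpec struct {
	Repo string
	Ref  string
}

type RuntimeConfig struct {
	Source  string
	Version string
	Git     *GitSpec
}

// Docker holds the values a template would be rendered with.
type Docker struct {
	Source  string
	Version string
	GitRepo string
	GitRef  string
}

// newDockerSketch returns an error instead of silently accepting a source
// that no template implements.
func newDockerSketch(cfg RuntimeConfig) (*Docker, error) {
	d := &Docker{Source: cfg.Source}
	if d.Source == "" {
		d.Source = "package" // package is the documented default
	}
	switch d.Source {
	case "package":
		d.Version = cfg.Version
	case "git":
		if cfg.Git == nil || cfg.Git.Ref == "" {
			// Assumption: a git build needs an explicit ref.
			return nil, fmt.Errorf("docker git source requires a ref")
		}
		d.GitRepo = cfg.Git.Repo
		d.GitRef = cfg.Git.Ref
		if d.GitRepo == "" {
			d.GitRepo = "https://github.com/moby/moby.git"
		}
	case "latest":
		return nil, fmt.Errorf("docker does not support latest source tracking")
	default:
		return nil, fmt.Errorf("unknown docker source: %s", d.Source)
	}
	return d, nil
}

func main() {
	d, err := newDockerSketch(RuntimeConfig{Source: "git", Git: &GitSpec{Ref: "master"}})
	fmt.Println(d.GitRepo, d.GitRef, err)

	_, err = newDockerSketch(RuntimeConfig{Source: "latest"})
	fmt.Println(err)
}
```

The `default` branch means a source value added to the API later fails loudly here until a matching template exists.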

    c.GitRef = cr.Git.Ref
    if c.GitRepo == "" {
        c.GitRepo = "https://github.com/cri-o/cri-o.git"
    }
Copilot AI Feb 11, 2026

NewCriO doesn't handle RuntimeSourceLatest case in the switch statement. If a user specifies source=latest for CRI-O (which is not supported per the PR description), the function will silently return with only Source set but no error. This could lead to confusing behavior. Add a case for "latest" that returns an error explaining that CRI-O doesn't support latest source tracking, or add a default case that handles unknown sources with an error.

Suggested change
-       }
+       }
+   case "latest":
+       return nil, fmt.Errorf("crio does not support latest source tracking")

    CRI_DOCKERD_VERSION="0.3.17"
    if [[ ! -f /usr/local/bin/cri-dockerd ]]; then
        CRI_DOCKERD_URL="https://github.com/Mirantis/cri-dockerd/releases/download/v${CRI_DOCKERD_VERSION}/cri-dockerd-${CRI_DOCKERD_VERSION}.${GO_ARCH}.tgz"
        curl -L "${CRI_DOCKERD_URL}" | sudo tar xzv -C /usr/local/bin --strip-components=1
Copilot AI Feb 11, 2026

The curl command that downloads cri-dockerd has no error handling. Without -f, curl exits successfully on an HTTP error and pipes the error page to tar, which then fails with a confusing message. Use curl's -f flag (other parts of the template already use -fsSL) and wrap the pipeline in a conditional that reports the failure, similar to the git clone operations above.

Suggested change
-   curl -L "${CRI_DOCKERD_URL}" | sudo tar xzv -C /usr/local/bin --strip-components=1
+   if ! curl -fsSL "${CRI_DOCKERD_URL}" | sudo tar xzv -C /usr/local/bin --strip-components=1; then
+       holodeck_error 12 "$COMPONENT" "Failed to download or extract cri-dockerd" \
+           "Check network connectivity and verify URL: ${CRI_DOCKERD_URL}"
+   fi

        CRI_DOCKERD_URL="https://github.com/Mirantis/cri-dockerd/releases/download/v${CRI_DOCKERD_VERSION}/cri-dockerd-${CRI_DOCKERD_VERSION}.${GO_ARCH}.tgz"
        curl -L "${CRI_DOCKERD_URL}" | sudo tar xzv -C /usr/local/bin --strip-components=1
    fi

Copilot AI Feb 11, 2026

The dockerGitTemplate installs the cri-dockerd binary but does not create the systemd unit files (cri-docker.service and cri-docker.socket) that dockerPackageTemplate creates (lines 164-210). Without them, cri-dockerd is never started after a git-based installation, so Kubernetes cannot use Docker as its container runtime. Add the unit-file creation logic from dockerPackageTemplate to dockerGitTemplate, after the cri-dockerd binary is installed.

Suggested change
# Create systemd units for cri-dockerd if they do not exist
if [[ ! -f /etc/systemd/system/cri-docker.service ]]; then
    sudo tee /etc/systemd/system/cri-docker.service >/dev/null <<'EOF'
[Unit]
Description=CRI interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target docker.service
Wants=network-online.target
[Service]
Type=notify
ExecStart=/usr/local/bin/cri-dockerd --container-runtime-endpoint fd://
ExecReload=/bin/kill -s HUP $MAINPID
KillMode=process
Restart=always
RestartSec=5s
[Install]
WantedBy=multi-user.target
EOF
fi
if [[ ! -f /etc/systemd/system/cri-docker.socket ]]; then
    sudo tee /etc/systemd/system/cri-docker.socket >/dev/null <<'EOF'
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=/var/run/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
fi
sudo systemctl daemon-reload
sudo systemctl enable --now cri-docker.service cri-docker.socket

ArangoGutierrez force-pushed the feat/issue-567-runtime-sources branch from 19cf847 to dc92075 on February 12, 2026 13:49
copy-pr-bot bot commented Feb 12, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

…ntimes

Extend ContainerRuntime to install containerd, Docker, and CRI-O from
distribution packages (default), git source builds, or by tracking a
moving branch. Follows the existing CTK/Kubernetes multi-source pattern
with full backward compatibility.

- containerd: package (v1.x/v2.x), git, latest
- Docker/moby: package, git (with cri-dockerd)
- CRI-O: package, git

Closes: NVIDIA#567 (Phase 2 — runtime sources)
Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
Resolves gocritic ifElseChain lint warnings in containerd and docker templates.

Signed-off-by: Carlos Eduardo Arango Gutierrez <eduardoa@nvidia.com>
ArangoGutierrez force-pushed the feat/issue-567-runtime-sources branch from dc92075 to b4e787a on February 12, 2026 14:26
ArangoGutierrez merged commit 81387d3 into NVIDIA:main on Feb 12, 2026
19 checks passed


2 participants