MCO-2208: MCO-2125: Migrate mco_units and mco_registry TCs from private-repo#6027
MCO-2208: MCO-2125: Migrate mco_units and mco_registry TCs from private-repo#6027ptalgulk01 wants to merge 1 commit into
Conversation
- Add TC 53960 (No failed units in bootstrap machine) to mco_units.go, with bootstrap SSH infrastructure (util/bootstrap/, util/ssh_client.go) - Migrate mco_registry.go with 9 TCs: 43405/50508, 42682, 42680, 52520, 57595, 61555, 61558, 66046, 68736 - Add supporting helpers: NewImageTagMirrorSet, GetDataFromConfigMap, GetImageRegistryCertificates, GetManagedMergedTrustedImageRegistryCertificates, skipTestIfClusterVersion, skipTestIfExtensionsAreUsed, getClusterVersion - Add MCDCrioReloadedRegexp constant - Add templates: repository-mirror.yaml, add-image-tag-mirror-set.yaml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@ptalgulk01: This pull request references MCO-2208 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis PR adds comprehensive testing infrastructure for Machine Config Operator behavior, including SSH connectivity to bootstrap instances, container registry configuration validation, and remote command execution across AWS and Azure platforms. ChangesMCO Testing Infrastructure and Registry Configuration Tests
Sequence DiagramsequenceDiagram
participant Test as Test Suite
participant OC as oc CLI
participant AWS as AWS API
participant SSH as SSH/Bootstrap
participant Node as Bootstrap Node
Test->>OC: GetBootstrap() → get cluster infrastructure name
OC-->>Test: infrastructure name
Test->>AWS: describe-instances(infraName-bootstrap)
AWS-->>Test: public/private IPs
Test->>SSH: Create SshClient with key + IPs
Test->>SSH: RunOutput(systemctl list-units --failed)
SSH->>Node: SSH dial + session setup
Node->>Node: Execute command
Node-->>SSH: Collect stdout/stderr
SSH-->>Test: Command output
Test->>Test: Assert "0 loaded units listed"
sequenceDiagram
participant Test as Registry Test
participant Cluster as Kubernetes Cluster
participant MCD as MCD Pod/Node
participant Node as Worker Node
participant Config as openshift-config
Test->>Cluster: Create ImageContentSourcePolicy / ImageTagMirrorSet
Cluster->>MCD: MCO reconciles resource
MCD->>Node: Write /etc/containers/registries.conf
MCD->>MCD: Log CRI-O reload/skip
Test->>MCD: Poll logs for MCDCrioReloadedRegexp
MCD-->>Test: Log entries
Test->>Node: Read /etc/containers/registries.conf (chroot/remote)
Node-->>Test: Config file content
Test->>Config: Fetch image-registry-ca configmap
Config-->>Test: Certificate data
Test->>Test: Assert mirroring + certs match expected state
Estimated Code Review Effort🎯 4 (Complex) | ⏱️ ~50 minutes 🚥 Pre-merge checks | ✅ 7 | ❌ 5❌ Failed checks (5 warnings)
✅ Passed checks (7 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ptalgulk01 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 12
🧹 Nitpick comments (1)
test/extended-priv/util/bootstrap/bootstrap.go (1)
56-56: ⚖️ Poor tradeoffConsider making SSH port configurable.
The SSH port is hardcoded to
22. While this is the standard SSH port, some environments may use non-standard ports for security reasons. Consider adding an environment variable or configuration option to override the default port.💡 Example approach for configurability
const ( EnvVarSSHCloudPrivPort = "SSH_CLOUD_PRIV_PORT" DefaultSSHPort = 22 ) func getSSHPort() int { if portStr, exists := os.LookupEnv(EnvVarSSHCloudPrivPort); exists { if port, err := strconv.Atoi(portStr); err == nil && port > 0 && port <= 65535 { return port } } return DefaultSSHPort }Then use
getSSHPort()instead of the hardcoded22at line 56.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended-priv/util/bootstrap/bootstrap.go` at line 56, Replace the hardcoded SSH port 22 with a configurable getter: add constants (e.g., EnvVarSSHCloudPrivPort, DefaultSSHPort) and implement a small helper function getSSHPort() that reads the environment variable, parses/validates it as an int (1-65535) and returns DefaultSSHPort on error or unset; then change the call return buildBootstrap(user, *bootstrapIPs, 22), nil to use getSSHPort() instead of 22 and ensure buildBootstrap continues to accept an int port.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended-priv/mco_registry.go`:
- Around line 16-25: This new Ginkgo suite (g.Describe for "MCO Registry") must
skip on MicroShift; add an explicit MicroShift guard either by inserting an
exutil.IsMicroShiftCluster() check with g.Skip("skip on MicroShift") (for
example in the g.JustBeforeEach or immediately inside the g.Describe body) or by
adding the [Skipped:MicroShift] label to the suite declaration; ensure the guard
uses the existing exutil import and apply it near the PreChecks/oc setup so
tests like PreChecks, g.It and the MCO registry scenario are not executed on
MicroShift clusters.
- Around line 246-257: The test assumes separate master and worker pools by
indexing workers[0] and masters[0]; guard this for single-node (SNO) clusters by
adding a pre-check that skips the test when topology is single-node—call the
existing helper (exutil.IsSingleNode()) and g.Skip(...) or use
skipOnSingleNodeTopology() at the start of the test before calling
NewMachineConfigPool/GetSortedNodes so the code paths that reference
MachineConfigPoolWorker, MachineConfigPoolMaster, workers[0], and masters[0]
never run on SNO.
- Around line 573-575: The cleanup assertion in the test is checking for the
wrong mirror string; update the Expect assertion that calls rf.GetTextContent()
(inside the ImageTagMirrorSet cleanup test) to verify the tag-mirror value
configured in this test ("mirror.example.com/redhat") is not present instead of
"example.io/digest-example/ubi-minimal" so the test fails when the tag-mirror
wasn't removed; search for the Expect(...).NotTo(o.ContainSubstring(...)) call
and replace the checked substring accordingly.
- Around line 686-690: The current check compares the raw `oc get` output string
to "No resources found", which is brittle; instead call the same command but
request machine-readable names (e.g., add "-o name" or "--no-headers -o name")
via oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy",
"-o", "name").Output() and treat an empty output as "none found"; update the
logic that uses the output variable and the
logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output) to only log
when the returned name list is non-empty so presence detection is robust to
formatting changes.
- Around line 184-185: The test currently invokes a shell wrapper via
exec.Command("bash", "-c", "cp "+secretFile+" "+newSecretFile) which is
unnecessary and unsafe; replace that with a direct exec.Command invocation (call
exec.Command("cp", secretFile, newSecretFile).Output()) or, even better, perform
the copy using Go stdlib (open secretFile, create newSecretFile and io.Copy) and
keep the existing error assertion o.Expect(copyErr).NotTo(o.HaveOccurred());
update the call sites around the secretFile and newSecretFile variables in the
test/extended-priv/mco_registry.go code accordingly.
In `@test/extended-priv/mco_units.go`:
- Around line 382-390: The SSE Eventually block that calls bs.SSH.RunOutput with
failedUnitsCommand should supply explicit timeout and polling interval
parameters to reduce flakiness; modify the o.Eventually invocation that wraps
the anonymous func returning failedUnits (and that ends with
.Should(o.ContainSubstring("0 loaded units listed"))) to pass a concrete timeout
and polling duration consistent with other tests (e.g., same durations used
elsewhere) so the SSH call is retried for the specified period with a fixed poll
interval instead of relying on defaults.
In `@test/extended-priv/mco.go`:
- Around line 156-163: The getClusterVersion function currently indexes
splitValues[0] and [1] without validating the output of
oc.AsAdmin().WithoutNamespace().Run(...).Output(); guard against empty or
malformed desired.version by trimming clusterBuild, checking that clusterBuild
is not empty and that strings.Split(clusterBuild, ".") yields at least two
elements, and return a clear error (rather than panicking) when the version
cannot be parsed; update getClusterVersion to perform these checks before
composing and returning the "major.minor" string.
- Around line 191-193: The current check compares raw jsonpath strings
wExtensions and mExtensions to the literal "[]" which is brittle; instead
normalize and reliably detect emptiness by trimming whitespace and handling
jsonpath's special outputs or by parsing the value as JSON into a slice. Update
the conditional around wExtensions and mExtensions to treat any of the following
as empty: empty string, trimmed "[]", "<no value>", or a
successfully-unmarshaled zero-length []interface{} (use encoding/json to
unmarshal and check len==0), and only call g.Skip when either side actually
contains non-empty extensions.
In `@test/extended-priv/util/bootstrap/aws.go`:
- Around line 54-58: The AWS CLI call using exec.Command(...,
"describe-instances", ..., "--query",
"Reservations[0].Instances[0].PublicIpAddress", "--output", "text") can return
the literal "null" (or empty string) which is ambiguous; change the call to
request JSON output (use "--output", "json") and parse the returned JSON (or, if
you keep text output, explicitly check the raw stdout for "null" or an empty
string) after the exec.Command call that uses instanceName, then return distinct
errors like "instance not found" when Reservations/Instances are missing and "no
public IP" when the instance exists but PublicIpAddress is null/empty; update
the code around the exec.Command(...) invocation to implement this check and
adjust the returned value/error accordingly.
- Around line 53-63: The aws CLI invocation in awsCLIGetInstancePublicIP (and
similarly awsCLIGetInstancePrivateIP) lacks a timeout; wrap the command with a
context.WithTimeout and use exec.CommandContext to ensure the call is cancelled
if it exceeds a reasonable duration (e.g., 30s), call cancel() in a defer,
capture and return the context/timeout error alongside the command output error,
and trim the output as before—update both functions to follow this pattern so
tests/CI cannot hang on an unresponsive AWS CLI.
In `@test/extended-priv/util/bootstrap/azure.go`:
- Line 19: The call to
oc.WithoutNamespace().AsAdmin().Run("get").Args(...).Output() ignores the
returned error and may leave infraName empty; change the assignment to capture
the error (e.g., infraName, err :=
oc.WithoutNamespace().AsAdmin().Run("get").Args(...).Output()), check err !=
nil, and handle it appropriately (return the error or fail the test / log and
exit) so that failures from the oc command are surfaced instead of producing an
incorrect bootstrapName; ensure subsequent uses of infraName (where
bootstrapName is formed) only proceed when the call succeeded.
In `@test/extended-priv/util/ssh_client.go`:
- Around line 51-57: Replace the inline HostKeyCallback that returns nil with
the idiomatic ssh.InsecureIgnoreHostKey() in the ssh.ClientConfig construction
(the block returning &ssh.ClientConfig{...}), and ensure you add a connection
timeout when dialing SSH (e.g., use a net.Dialer with Dialer.Timeout or a
context with deadline around ssh.Dial/net.Dial) so the client does not hang on
network issues; update the code paths that call ssh.Dial or create the net.Conn
to use the timeout-aware dialer.
---
Nitpick comments:
In `@test/extended-priv/util/bootstrap/bootstrap.go`:
- Line 56: Replace the hardcoded SSH port 22 with a configurable getter: add
constants (e.g., EnvVarSSHCloudPrivPort, DefaultSSHPort) and implement a small
helper function getSSHPort() that reads the environment variable,
parses/validates it as an int (1-65535) and returns DefaultSSHPort on error or
unset; then change the call return buildBootstrap(user, *bootstrapIPs, 22), nil
to use getSSHPort() instead of 22 and ensure buildBootstrap continues to accept
an int port.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 9832543a-f456-455e-a5a9-79e95f2352d4
📒 Files selected for processing (11)
test/extended-priv/const.gotest/extended-priv/mco.gotest/extended-priv/mco_registry.gotest/extended-priv/mco_units.gotest/extended-priv/testdata/files/add-image-tag-mirror-set.yamltest/extended-priv/testdata/files/repository-mirror.yamltest/extended-priv/util.gotest/extended-priv/util/bootstrap/aws.gotest/extended-priv/util/bootstrap/azure.gotest/extended-priv/util/bootstrap/bootstrap.gotest/extended-priv/util/ssh_client.go
| var _ = g.Describe("[sig-mco][Suite:openshift/machine-config-operator/longduration][Serial][Disruptive] MCO Registry", func() { | ||
| defer g.GinkgoRecover() | ||
|
|
||
| var oc = exutil.NewCLI("mco", exutil.KubeConfigPath()) | ||
|
|
||
| g.JustBeforeEach(func() { | ||
| PreChecks(oc) | ||
| }) | ||
|
|
||
| g.It("[PolarionID:43405][PolarionID:50508][OTP] node drain is not needed for mirror config change in container registry. Nodes not tainted. [Disruptive]", func() { |
There was a problem hiding this comment.
Add explicit MicroShift guard for this new suite.
This suite uses OpenShift APIs/features not available on MicroShift, but it lacks a protective skip (exutil.IsMicroShiftCluster() + g.Skip or [Skipped:MicroShift] labeling).
As per coding guidelines, "When new Ginkgo e2e tests are added ... check whether they use any APIs or features that are NOT available on MicroShift ... Do NOT flag ... if test body contains exutil.IsMicroShiftCluster() check with g.Skip()."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco_registry.go` around lines 16 - 25, This new Ginkgo
suite (g.Describe for "MCO Registry") must skip on MicroShift; add an explicit
MicroShift guard either by inserting an exutil.IsMicroShiftCluster() check with
g.Skip("skip on MicroShift") (for example in the g.JustBeforeEach or immediately
inside the g.Describe body) or by adding the [Skipped:MicroShift] label to the
suite declaration; ensure the guard uses the existing exutil import and apply it
near the PreChecks/oc setup so tests like PreChecks, g.It and the MCO registry
scenario are not executed on MicroShift clusters.
| _, copyErr := exec.Command("bash", "-c", "cp "+secretFile+" "+newSecretFile).Output() | ||
| o.Expect(copyErr).NotTo(o.HaveOccurred()) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, locate and examine the file at lines 184-185
find . -name "mco_registry.go" -path "*/extended-priv/*" | head -5Repository: openshift/machine-config-operator
Length of output: 114
🏁 Script executed:
# Once found, examine the context around lines 184-185
if [ -f "test/extended-priv/mco_registry.go" ]; then
sed -n '170,200p' "test/extended-priv/mco_registry.go"
fiRepository: openshift/machine-config-operator
Length of output: 1652
🏁 Script executed:
# Also check how secretFile and newSecretFile are defined in the function
if [ -f "test/extended-priv/mco_registry.go" ]; then
sed -n '1,185p' "test/extended-priv/mco_registry.go" | tail -50
fiRepository: openshift/machine-config-operator
Length of output: 2785
🏁 Script executed:
# Find the getPullSecret and generateTmpFile functions in the same file
grep -n "func getPullSecret\|func generateTmpFile" test/extended-priv/mco_registry.goRepository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Get the implementation of these helper functions
sed -n '1,200p' test/extended-priv/mco_registry.go | grep -A 15 "func getPullSecret\|func generateTmpFile"Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Search more comprehensively in the file for these function definitions
rg "func (getPullSecret|generateTmpFile)" test/extended-priv/mco_registry.go -A 10Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Use broader search
rg -n "getPullSecret|generateTmpFile" test/extended-priv/mco_registry.go | head -20Repository: openshift/machine-config-operator
Length of output: 238
🏁 Script executed:
# Search for function definitions in the entire file
cat test/extended-priv/mco_registry.go | grep -n "^func getPullSecret\|^func generateTmpFile"Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# They might be on different lines, search more broadly
rg "^func.*getPullSecret|^func.*generateTmpFile" test/extended-priv/mco_registry.goRepository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# Check if these functions are defined elsewhere in the test directory
find test -name "*.go" -exec rg -l "func getPullSecret|func generateTmpFile" {} \;Repository: openshift/machine-config-operator
Length of output: 104
🏁 Script executed:
# Get the entire file to search within it
wc -l test/extended-priv/mco_registry.goRepository: openshift/machine-config-operator
Length of output: 116
🏁 Script executed:
# Read the file and search for the functions
cat test/extended-priv/mco_registry.go | grep -n "func getPullSecret"Repository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
# The functions are in util.go, let's examine them
rg -n "func getPullSecret|func generateTmpFile" test/extended-priv/util.go -A 15Repository: openshift/machine-config-operator
Length of output: 1709
🏁 Script executed:
# Also search for these specific function names
grep -n "func getPullSecret\|func generateTmpFile" test/extended-priv/util.goRepository: openshift/machine-config-operator
Length of output: 201
🏁 Script executed:
# Let's get the full util.go file size first
wc -l test/extended-priv/util.goRepository: openshift/machine-config-operator
Length of output: 109
🏁 Script executed:
# Check what OutputToFile returns - look at the CLI implementation or understand the return type
grep -n "OutputToFile" test/extended-priv/util.goRepository: openshift/machine-config-operator
Length of output: 288
🏁 Script executed:
# Let's understand the actual values - check what e2e.TestContext.OutputDir typically contains
rg "TestContext.OutputDir" test/extended-priv/ -B 2 -A 2Repository: openshift/machine-config-operator
Length of output: 4522
🏁 Script executed:
# Check if there are any constraints on namespace names in the code
rg -n "namespace" test/extended-priv/mco_registry.go | head -10Repository: openshift/machine-config-operator
Length of output: 556
Use direct exec.Command instead of shell wrapper for file copy.
Using exec.Command("bash", "-c", ...) for simple file operations is unnecessary and introduces an anti-pattern. Even though the paths here are constrained, use direct command execution as the proper Go idiom.
- _, copyErr := exec.Command("bash", "-c", "cp "+secretFile+" "+newSecretFile).Output()
+ _, copyErr := exec.Command("cp", secretFile, newSecretFile).Output()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| _, copyErr := exec.Command("bash", "-c", "cp "+secretFile+" "+newSecretFile).Output() | |
| o.Expect(copyErr).NotTo(o.HaveOccurred()) | |
| _, copyErr := exec.Command("cp", secretFile, newSecretFile).Output() | |
| o.Expect(copyErr).NotTo(o.HaveOccurred()) |
🧰 Tools
🪛 OpenGrep (1.20.0)
[ERROR] 184-184: Dynamic command passed to exec.Command with a shell invocation. Pass arguments directly to exec.Command without a shell wrapper.
(coderabbit.command-injection.go-exec-command)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco_registry.go` around lines 184 - 185, The test
currently invokes a shell wrapper via exec.Command("bash", "-c", "cp
"+secretFile+" "+newSecretFile) which is unnecessary and unsafe; replace that
with a direct exec.Command invocation (call exec.Command("cp", secretFile,
newSecretFile).Output()) or, even better, perform the copy using Go stdlib (open
secretFile, create newSecretFile and io.Copy) and keep the existing error
assertion o.Expect(copyErr).NotTo(o.HaveOccurred()); update the call sites
around the secretFile and newSecretFile variables in the
test/extended-priv/mco_registry.go code accordingly.
| wmcp := NewMachineConfigPool(oc.AsAdmin(), MachineConfigPoolWorker) | ||
| mmcp := NewMachineConfigPool(oc.AsAdmin(), MachineConfigPoolMaster) | ||
|
|
||
| workers, wsErr := wmcp.GetSortedNodes() | ||
| o.Expect(wsErr).ShouldNot(o.HaveOccurred(), "Error getting the nodes in worker pool") | ||
|
|
||
| masters, msErr := mmcp.GetSortedNodes() | ||
| o.Expect(msErr).ShouldNot(o.HaveOccurred(), "Error getting the nodes in master pool") | ||
|
|
||
| firstUpdatedWorker := workers[0] | ||
| firstUpdatedMaster := masters[0] | ||
|
|
There was a problem hiding this comment.
Guard multi-pool/master-worker assumptions for SNO.
The test assumes separate master and worker pools/nodes (workers[0], masters[0]) without SNO protection, which can break in single-replica topology.
As per coding guidelines, "When new Ginkgo e2e tests are added ... Flag tests that make assumptions about multi-node or HA clusters ... Do NOT flag ... if test body contains exutil.IsSingleNode() check with g.Skip() ... or skipOnSingleNodeTopology()."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco_registry.go` around lines 246 - 257, The test assumes
separate master and worker pools by indexing workers[0] and masters[0]; guard
this for single-node (SNO) clusters by adding a pre-check that skips the test
when topology is single-node—call the existing helper (exutil.IsSingleNode())
and g.Skip(...) or use skipOnSingleNodeTopology() at the start of the test
before calling NewMachineConfigPool/GetSortedNodes so the code paths that
reference MachineConfigPoolWorker, MachineConfigPoolMaster, workers[0], and
masters[0] never run on SNO.
| o.Expect(rf.GetTextContent()).NotTo(o.ContainSubstring(`example.io/digest-example/ubi-minimal`), | ||
| "The configuration in file /etc/containers/registries.conf was not restored after deleting the ImageTagMirrorSet resource") | ||
| logger.Infof("OK!\n") |
There was a problem hiding this comment.
ITMS cleanup assertion checks the wrong mirror value.
This ITMS test verifies removal of a digest-mirror string (example.io/digest-example/ubi-minimal) instead of the tag-mirror value configured in this test (mirror.example.com/redhat), so it can pass even when cleanup failed.
Suggested fix
- o.Expect(rf.GetTextContent()).NotTo(o.ContainSubstring(`example.io/digest-example/ubi-minimal`),
+ o.Expect(rf.GetTextContent()).NotTo(o.ContainSubstring(`mirror.example.com/redhat`),
"The configuration in file /etc/containers/registries.conf was not restored after deleting the ImageTagMirrorSet resource")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| o.Expect(rf.GetTextContent()).NotTo(o.ContainSubstring(`example.io/digest-example/ubi-minimal`), | |
| "The configuration in file /etc/containers/registries.conf was not restored after deleting the ImageTagMirrorSet resource") | |
| logger.Infof("OK!\n") | |
| o.Expect(rf.GetTextContent()).NotTo(o.ContainSubstring(`mirror.example.com/redhat`), | |
| "The configuration in file /etc/containers/registries.conf was not restored after deleting the ImageTagMirrorSet resource") | |
| logger.Infof("OK!\n") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco_registry.go` around lines 573 - 575, The cleanup
assertion in the test is checking for the wrong mirror string; update the Expect
assertion that calls rf.GetTextContent() (inside the ImageTagMirrorSet cleanup
test) to verify the tag-mirror value configured in this test
("mirror.example.com/redhat") is not present instead of
"example.io/digest-example/ubi-minimal" so the test fails when the tag-mirror
wasn't removed; search for the Expect(...).NotTo(o.ContainSubstring(...)) call
and replace the checked substring accordingly.
| output, err := oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy").Output() | ||
| o.ExpectWithOffset(1, err).NotTo(o.HaveOccurred(), "Error checking if ImageContentSourcePolicy exist in the cluster or not") | ||
|
|
||
| if output != "No resources found" { | ||
| logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output) |
There was a problem hiding this comment.
ICSP presence detection is brittle on plain-text oc get output.
Comparing against exact "No resources found" is fragile (newlines/format changes). Querying names directly is safer.
Suggested fix
- output, err := oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy").Output()
+ output, err := oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy", "-o", "name").Output()
o.ExpectWithOffset(1, err).NotTo(o.HaveOccurred(), "Error checking if ImageContentSourcePolicy exist in the cluster or not")
- if output != "No resources found" {
+ if strings.TrimSpace(output) != "" {
logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output)
g.Skip("There are ImageContentSourcePolicies resources in the cluster. This test case is not compatible with ImageContentSourcePolicies resources.")
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| output, err := oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy").Output() | |
| o.ExpectWithOffset(1, err).NotTo(o.HaveOccurred(), "Error checking if ImageContentSourcePolicy exist in the cluster or not") | |
| if output != "No resources found" { | |
| logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output) | |
| output, err := oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy", "-o", "name").Output() | |
| o.ExpectWithOffset(1, err).NotTo(o.HaveOccurred(), "Error checking if ImageContentSourcePolicy exist in the cluster or not") | |
| if strings.TrimSpace(output) != "" { | |
| logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco_registry.go` around lines 686 - 690, The current check
compares the raw `oc get` output string to "No resources found", which is
brittle; instead call the same command but request machine-readable names (e.g.,
add "-o name" or "--no-headers -o name") via
oc.AsAdmin().WithoutNamespace().Run("get").Args("ImageContentSourcePolicy",
"-o", "name").Output() and treat an empty output as "none found"; update the
logic that uses the output variable and the
logger.Infof("ImageContentSourcePolicy in cluster:\n%s", output) to only log
when the returned name list is non-empty so presence detection is robust to
formatting changes.
| if wExtensions != "[]" || mExtensions != "[]" { | ||
| g.Skip("Current cluster is using extensions. This test cannot be execute in a cluster using extensions") | ||
| } |
There was a problem hiding this comment.
Extension detection is too strict for raw jsonpath output.
Comparing to exactly "[]" is brittle ("", whitespace, <no value> can appear) and may skip tests incorrectly.
Suggested fix
- if wExtensions != "[]" || mExtensions != "[]" {
+ wExt := strings.TrimSpace(wExtensions)
+ mExt := strings.TrimSpace(mExtensions)
+ hasWorkerExt := wExt != "" && wExt != "[]" && wExt != "<no value>"
+ hasMasterExt := mExt != "" && mExt != "[]" && mExt != "<no value>"
+ if hasWorkerExt || hasMasterExt {
g.Skip("Current cluster is using extensions. This test cannot be execute in a cluster using extensions")
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/mco.go` around lines 191 - 193, The current check compares
raw jsonpath strings wExtensions and mExtensions to the literal "[]" which is
brittle; instead normalize and reliably detect emptiness by trimming whitespace
and handling jsonpath's special outputs or by parsing the value as JSON into a
slice. Update the conditional around wExtensions and mExtensions to treat any of
the following as empty: empty string, trimmed "[]", "<no value>", or a
successfully-unmarshaled zero-length []interface{} (use encoding/json to
unmarshal and check len==0), and only call g.Skip when either side actually
contains non-empty extensions.
| func awsCLIGetInstancePublicIP(instanceName string) (string, error) { | ||
| out, err := exec.Command("aws", "ec2", "describe-instances", | ||
| "--filters", fmt.Sprintf("Name=tag:Name,Values=%s", instanceName), | ||
| "--query", "Reservations[0].Instances[0].PublicIpAddress", | ||
| "--output", "text", | ||
| ).Output() | ||
| if err != nil { | ||
| return "", err | ||
| } | ||
| return strings.TrimSpace(string(out)), nil | ||
| } |
There was a problem hiding this comment.
Add timeout and context to AWS CLI command execution.
The exec.Command calls have no timeout, which could cause tests to hang indefinitely if the AWS CLI is unresponsive or network connectivity is poor. This is especially problematic in CI environments.
⏱️ Proposed fix to add timeout context
+import (
+ "context"
+ "time"
+)
+
func awsCLIGetInstancePublicIP(instanceName string) (string, error) {
+ ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
+ defer cancel()
+
- out, err := exec.Command("aws", "ec2", "describe-instances",
+ out, err := exec.CommandContext(ctx, "aws", "ec2", "describe-instances",
"--filters", fmt.Sprintf("Name=tag:Name,Values=%s", instanceName),
"--query", "Reservations[0].Instances[0].PublicIpAddress",
"--output", "text",
).Output()Apply the same pattern to awsCLIGetInstancePrivateIP.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/util/bootstrap/aws.go` around lines 53 - 63, The aws CLI
invocation in awsCLIGetInstancePublicIP (and similarly
awsCLIGetInstancePrivateIP) lacks a timeout; wrap the command with a
context.WithTimeout and use exec.CommandContext to ensure the call is cancelled
if it exceeds a reasonable duration (e.g., 30s), call cancel() in a defer,
capture and return the context/timeout error alongside the command output error,
and trim the output as before—update both functions to follow this pattern so
tests/CI cannot hang on an unresponsive AWS CLI.
| out, err := exec.Command("aws", "ec2", "describe-instances", | ||
| "--filters", fmt.Sprintf("Name=tag:Name,Values=%s", instanceName), | ||
| "--query", "Reservations[0].Instances[0].PublicIpAddress", | ||
| "--output", "text", | ||
| ).Output() |
There was a problem hiding this comment.
AWS CLI query assumes instance exists at array position [0].
The JMESPath query Reservations[0].Instances[0].PublicIpAddress will return null or fail if no reservations or instances exist, rather than returning an error. When AWS CLI outputs null, it's converted to the string "null" in text format, which could be confused with a valid value.
Consider adding error handling to distinguish between "instance not found" (empty results) and "instance exists but has no public IP" (empty/None value).
🔍 Suggested improvement for robustness
func awsCLIGetInstancePublicIP(instanceName string) (string, error) {
out, err := exec.Command("aws", "ec2", "describe-instances",
"--filters", fmt.Sprintf("Name=tag:Name,Values=%s", instanceName),
"--query", "Reservations[0].Instances[0].PublicIpAddress",
"--output", "text",
).Output()
if err != nil {
return "", err
}
- return strings.TrimSpace(string(out)), nil
+ result := strings.TrimSpace(string(out))
+ if result == "null" {
+ return "", fmt.Errorf("no instance found with name %s", instanceName)
+ }
+ return result, nil
}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/util/bootstrap/aws.go` around lines 54 - 58, The AWS CLI
call using exec.Command(..., "describe-instances", ..., "--query",
"Reservations[0].Instances[0].PublicIpAddress", "--output", "text") can return
the literal "null" (or empty string) which is ambiguous; change the call to
request JSON output (use "--output", "json") and parse the returned JSON (or, if
you keep text output, explicitly check the raw stdout for "null" or an empty
string) after the exec.Command call that uses instanceName, then return distinct
errors like "instance not found" when Reservations/Instances are missing and "no
public IP" when the instance exists but PublicIpAddress is null/empty; update
the code around the exec.Command(...) invocation to implement this check and
adjust the returned value/error accordingly.
| // GetIPs returns the IPs of the bootstrap machine if this machine exists in Azure. | ||
| // Azure bootstrap discovery is not yet implemented; returns InstanceNotFound so the test skips. | ||
| func (a AzureBSInfoProvider) GetIPs(oc *util.CLI) (*Ips, error) { | ||
| infraName, _ := oc.WithoutNamespace().AsAdmin().Run("get").Args("infrastructure", "cluster", "-o=jsonpath={.status.infrastructureName}").Output() |
There was a problem hiding this comment.
Handle the error from the oc command.
The error returned by .Output() is silently ignored. If the oc get infrastructure command fails, infraName will be an empty string, causing bootstrapName to become "-bootstrap" instead of the correct value. This masks command failures and could lead to confusing error messages downstream.
🛡️ Proposed fix to handle the error
- infraName, _ := oc.WithoutNamespace().AsAdmin().Run("get").Args("infrastructure", "cluster", "-o=jsonpath={.status.infrastructureName}").Output()
+ infraName, err := oc.WithoutNamespace().AsAdmin().Run("get").Args("infrastructure", "cluster", "-o=jsonpath={.status.infrastructureName}").Output()
+ if err != nil {
+ logger.Errorf("Could not get infrastructure name for Azure bootstrap discovery. Error: %s", err)
+ return nil, err
+ }
bootstrapName := infraName + "-bootstrap"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| infraName, _ := oc.WithoutNamespace().AsAdmin().Run("get").Args("infrastructure", "cluster", "-o=jsonpath={.status.infrastructureName}").Output() | |
| infraName, err := oc.WithoutNamespace().AsAdmin().Run("get").Args("infrastructure", "cluster", "-o=jsonpath={.status.infrastructureName}").Output() | |
| if err != nil { | |
| logger.Errorf("Could not get infrastructure name for Azure bootstrap discovery. Error: %s", err) | |
| return nil, err | |
| } | |
| bootstrapName := infraName + "-bootstrap" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/util/bootstrap/azure.go` at line 19, The call to
oc.WithoutNamespace().AsAdmin().Run("get").Args(...).Output() ignores the
returned error and may leave infraName empty; change the assignment to capture
the error (e.g., infraName, err :=
oc.WithoutNamespace().AsAdmin().Run("get").Args(...).Output()), check err !=
nil, and handle it appropriately (return the error or fail the test / log and
exit) so that failures from the oc command are surfaced instead of producing an
incorrect bootstrapName; ensure subsequent uses of infraName (where
bootstrapName is formed) only proceed when the call succeeded.
| return &ssh.ClientConfig{ | ||
| User: sshClient.User, | ||
| Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)}, | ||
| HostKeyCallback: func(hostname string, remote net.Addr, key ssh.PublicKey) error { | ||
| return nil | ||
| }, | ||
| }, nil |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
find . -path "*/test/extended-priv/util/ssh_client.go" -type fRepository: openshift/machine-config-operator
Length of output: 117
🏁 Script executed:
cat -n ./test/extended-priv/util/ssh_client.goRepository: openshift/machine-config-operator
Length of output: 3020
🏁 Script executed:
rg "SshClient|RunOutput" --type go test/extended-priv/ -A 2 -B 2Repository: openshift/machine-config-operator
Length of output: 2388
🏁 Script executed:
rg "Timeout|context\." test/extended-priv/util/ssh_client.goRepository: openshift/machine-config-operator
Length of output: 59
🏁 Script executed:
rg "ssh\.ClientConfig|HostKeyCallback" --type go -B 2 -A 2Repository: openshift/machine-config-operator
Length of output: 8238
Use the idiomatic ssh.InsecureIgnoreHostKey() and add a timeout to prevent hanging on network issues.
The current code disables host key verification with an inline function returning nil, which is not the idiomatic approach—use ssh.InsecureIgnoreHostKey() instead. Additionally, the SSH client configuration lacks a timeout, which can cause tests to hang indefinitely if there are network issues. Reference implementations in the vendored Kubernetes SSH framework (vendor/k8s.io/kubernetes/test/e2e/framework/ssh/ssh.go) include both the proper API and a timeout.
Suggested fix
return &ssh.ClientConfig{
User: sshClient.User,
Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)},
- HostKeyCallback: func(hostname string, remote net.Addr, key ssh.PublicKey) error {
- return nil
- },
+ HostKeyCallback: ssh.InsecureIgnoreHostKey(),
+ Timeout: 150 * time.Second,
}, nil📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| return &ssh.ClientConfig{ | |
| User: sshClient.User, | |
| Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)}, | |
| HostKeyCallback: func(hostname string, remote net.Addr, key ssh.PublicKey) error { | |
| return nil | |
| }, | |
| }, nil | |
| return &ssh.ClientConfig{ | |
| User: sshClient.User, | |
| Auth: []ssh.AuthMethod{ssh.PublicKeys(signer)}, | |
| HostKeyCallback: ssh.InsecureIgnoreHostKey(), | |
| Timeout: 150 * time.Second, | |
| }, nil |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended-priv/util/ssh_client.go` around lines 51 - 57, Replace the
inline HostKeyCallback that returns nil with the idiomatic
ssh.InsecureIgnoreHostKey() in the ssh.ClientConfig construction (the block
returning &ssh.ClientConfig{...}), and ensure you add a connection timeout when
dialing SSH (e.g., use a net.Dialer with Dialer.Timeout or a context with
deadline around ssh.Dial/net.Dial) so the client does not hang on network
issues; update the code paths that call ssh.Dial or create the net.Conn to use
the timeout-aware dialer.
|
@ptalgulk01: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary by CodeRabbit
Tests