Skip to content

STOR-2793:Add testcases for snapshot of images#30891

Open
chao007 wants to merge 1 commit intoopenshift:mainfrom
chao007:imagesnapshot
Open

STOR-2793:Add testcases for snapshot of images#30891
chao007 wants to merge 1 commit intoopenshift:mainfrom
chao007:imagesnapshot

Conversation

@chao007
Copy link
Contributor

@chao007 chao007 commented Mar 17, 2026

Add 5 test cases for STOP-2793.

    PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
     images should create snapshot using images type and restore data successfully with RWO block
    [Suite:openshift/conformance/parallel] (92.364s)
    PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
     images should support multiple snapshots from same PVC using images type [Suite:openshift/conformance/parallel] (146.265s)
    PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
     images should not be marked as default VolumeSnapshotClass [Suite:openshift/conformance/parallel] (15.082s)
    PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
     images should create VolumeSnapshotClass with snapshot-type images parameter [Suite:openshift/conformance/parallel] (17.764s)
    PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
     images should create snapshot using images type and restore data successfully with RWO filesystem
    [Suite:openshift/conformance/parallel] (91.453s)

@openshift-ci-robot
Copy link

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 17, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 17, 2026

@chao007: This pull request references STOR-2793 which is a valid jira issue.

Details

In response to this:

Add 5 test cases for STOP-2793.

   PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
    images should create snapshot using images type and restore data successfully with RWO block
   [Suite:openshift/conformance/parallel] (92.364s)
   PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
    images should support multiple snapshots from same PVC using images type [Suite:openshift/conformance/parallel] (146.265s)
   PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
    images should not be marked as default VolumeSnapshotClass [Suite:openshift/conformance/parallel] (15.082s)
   PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
    images should create VolumeSnapshotClass with snapshot-type images parameter [Suite:openshift/conformance/parallel] (17.764s)
   PASSED: [sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type
    images should create snapshot using images type and restore data successfully with RWO filesystem
   [Suite:openshift/conformance/parallel] (91.453s)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link

coderabbitai bot commented Mar 17, 2026

Walkthrough

Adds a new GCP-specific e2e test file that validates GCE PD CSI "images" VolumeSnapshot behavior and restore flows for filesystem and block volumes, plus VolumeSnapshotClass checks, multi-snapshot scenarios, and helper functions for PVC/pod lifecycle and cleanup.

Changes

Cohort / File(s) Summary
GCP Snapshot Images Test Suite
test/extended/storage/gcp_snapshot_images.go
New test file (+~673 lines) implementing VolumeSnapshotClass verification, end-to-end snapshot and restore tests for filesystem and block PVCs, multi-snapshot/timepoint validation, writer/reader pods, PVC creation (including restore-from-snapshot), polling logic for readiness, and deferred cleanup. Areas to focus review: snapshot readiness/error polling, block-device dd/read validation, and multi-snapshot handle uniqueness checks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

CodeRabbit can use OpenGrep to find security vulnerabilities and bugs across 17+ programming languages.

OpenGrep is compatible with Semgrep configurations. Add an opengrep.yml or semgrep.yml configuration file to your project to enable OpenGrep analysis.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: chao007
Once this PR has been reviewed and has the lgtm label, please assign gnufied for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested review from bertinatto and jsafrane March 17, 2026 07:00
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
test/extended/storage/gcp_snapshot_images.go (1)

558-579: Reuse the shared VolumeSnapshot helper instead of copying it.

createGCPSnapshot duplicates the existing createSnapshot(...) apply-manifest flow in test/extended/storage/driver_configuration.go. Extending that shared helper to accept a class name keeps snapshot creation logic in one place and avoids drift.

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/storage/gcp_snapshot_images.go` around lines 558 - 579,
createGCPSnapshot duplicates the apply-manifest flow implemented by the shared
helper createSnapshot in test/extended/storage/driver_configuration.go; refactor
to reuse that helper by extending createSnapshot to accept an optional
volumeSnapshotClassName (or additional parameter for the snapshot class) and
update createGCPSnapshot to call createSnapshot with the namespace,
snapshotName, pvcName and the class name, then remove the duplicated manifest
construction and apply logic from createGCPSnapshot so all VolumeSnapshot
creation is centralized in createSnapshot.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/extended/storage/gcp_snapshot_images.go`:
- Around line 335-340: The test currently only compares
boundVolumeSnapshotContentName (handle1/handle2) which doesn't prove
point-in-time restores; update the test to perform actual restores from
snapshot1 and snapshot2 using the oc.Run restore/create-PVC logic (same helper
used elsewhere in this file) and mount or read the restored PVC contents, then
assert that the restore from snapshot1 yields testData1 and the restore from
snapshot2 yields testData2; reference snapshot1, snapshot2, testData1, testData2
and the existing oc.Run("get")/restore helpers to locate where to add the
restore-and-assert steps and clean up restored PVCs afterwards.
- Around line 115-118: The poll loop currently swallows errors from
isSnapshotReady; update each Eventually call that calls isSnapshotReady (e.g.,
the block using isSnapshotReady(oc, snapshotName)) to capture the returned error
and, when ready==false and err!=nil, call getSnapshotErrorMessage(...) (and
include the get error) to surface the real failure instead of timing out — i.e.,
change the lambda to return a boolean only when ready is true, but if err != nil
invoke the test failure with the getSnapshotErrorMessage output (and the
underlying error) so API/controller errors fail immediately; apply the same
change to the other Eventually sites mentioned (lines around 212-215, 304-307,
330-333) referencing the same isSnapshotReady/getSnapshotErrorMessage pattern.

---

Nitpick comments:
In `@test/extended/storage/gcp_snapshot_images.go`:
- Around line 558-579: createGCPSnapshot duplicates the apply-manifest flow
implemented by the shared helper createSnapshot in
test/extended/storage/driver_configuration.go; refactor to reuse that helper by
extending createSnapshot to accept an optional volumeSnapshotClassName (or
additional parameter for the snapshot class) and update createGCPSnapshot to
call createSnapshot with the namespace, snapshotName, pvcName and the class
name, then remove the duplicated manifest construction and apply logic from
createGCPSnapshot so all VolumeSnapshot creation is centralized in
createSnapshot.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5e101960-889d-42ff-a8ea-266489063225

📥 Commits

Reviewing files that changed from the base of the PR and between 085b8a0 and 96c84a6.

📒 Files selected for processing (1)
  • test/extended/storage/gcp_snapshot_images.go

Comment on lines +115 to +118
o.Eventually(func() bool {
ready, _ := isSnapshotReady(oc, snapshotName)
return ready
}, gcpSnapshotPollTimeout, gcpSnapshotPollInterval).Should(o.BeTrue(), "snapshot should be ready")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Don't suppress snapshot readiness errors in the poll loops.

These Eventually blocks discard isSnapshotReady errors, so API failures or controller-reported snapshot errors only surface as a generic 10-minute timeout. test/extended/storage/driver_configuration.go already has getSnapshotErrorMessage(...); use that and the get error to fail with the real reason.

Also applies to: 212-215, 304-307, 330-333

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/storage/gcp_snapshot_images.go` around lines 115 - 118, The
poll loop currently swallows errors from isSnapshotReady; update each Eventually
call that calls isSnapshotReady (e.g., the block using isSnapshotReady(oc,
snapshotName)) to capture the returned error and, when ready==false and
err!=nil, call getSnapshotErrorMessage(...) (and include the get error) to
surface the real failure instead of timing out — i.e., change the lambda to
return a boolean only when ready is true, but if err != nil invoke the test
failure with the getSnapshotErrorMessage output (and the underlying error) so
API/controller errors fail immediately; apply the same change to the other
Eventually sites mentioned (lines around 212-215, 304-307, 330-333) referencing
the same isSnapshotReady/getSnapshotErrorMessage pattern.

@openshift-ci-robot
Copy link

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2026

@chao007: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-metal-ipi-ovn-ipv6 96c84a6 link true /test e2e-metal-ipi-ovn-ipv6

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-trt
Copy link

openshift-trt bot commented Mar 17, 2026

Risk analysis has seen new tests most likely introduced by this PR.
Please ensure that new tests meet guidelines for naming and stability.

New Test Risks for sha: 96c84a6

Job Name New Test Risk
pull-ci-openshift-origin-main-e2e-gcp-ovn Medium - "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create VolumeSnapshotClass with snapshot-type images parameter [Suite:openshift/conformance/parallel]" is a new test, and was only seen in one job.
pull-ci-openshift-origin-main-e2e-gcp-ovn Medium - "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create snapshot using images type and restore data successfully with RWO block [Suite:openshift/conformance/parallel]" is a new test, and was only seen in one job.
pull-ci-openshift-origin-main-e2e-gcp-ovn Medium - "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create snapshot using images type and restore data successfully with RWO filesystem [Suite:openshift/conformance/parallel]" is a new test, and was only seen in one job.
pull-ci-openshift-origin-main-e2e-gcp-ovn Medium - "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should not be marked as default VolumeSnapshotClass [Suite:openshift/conformance/parallel]" is a new test, and was only seen in one job.
pull-ci-openshift-origin-main-e2e-gcp-ovn Medium - "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should support multiple snapshots from same PVC using images type [Suite:openshift/conformance/parallel]" is a new test, and was only seen in one job.

New tests seen in this PR at sha: 96c84a6

  • "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create VolumeSnapshotClass with snapshot-type images parameter [Suite:openshift/conformance/parallel]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
  • "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create snapshot using images type and restore data successfully with RWO block [Suite:openshift/conformance/parallel]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
  • "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should create snapshot using images type and restore data successfully with RWO filesystem [Suite:openshift/conformance/parallel]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
  • "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should not be marked as default VolumeSnapshotClass [Suite:openshift/conformance/parallel]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]
  • "[sig-storage][Driver:gcp-pd][FeatureGate:CSIDriverSharedResource] GCP PD CSI Driver VolumeSnapshotClass with snapshot-type images should support multiple snapshots from same PVC using images type [Suite:openshift/conformance/parallel]" [Total: 1, Pass: 1, Fail: 0, Flake: 0]

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
test/extended/storage/gcp_snapshot_images.go (1)

113-128: ⚠️ Potential issue | 🟠 Major

Fail snapshot polling on the first API/controller error.

These Eventually blocks still turn isSnapshotReady(...) failures into false and also discard the getSnapshotErrorMessage(...) lookup error, so a broken snapshot just burns the full 10-minute timeout and loses the real cause. Fail immediately when the readiness check errors, and include any snapshot error/get failure in that message.

Also applies to: 220-235, 324-337, 360-373

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/storage/gcp_snapshot_images.go` around lines 113 - 128, Do not
swallow snapshot API/controller errors inside the Eventually block: when
isSnapshotReady(oc, snapshotName) returns an error, immediately fail the test
with e2e.Failf and include that error; likewise, when !ready call
getSnapshotErrorMessage(oc, snapshotName) and if it returns a non-empty message
or an error, immediately fail with e2e.Failf including both the snapshot error
string and any lookup error. Update the anonymous func used by Eventually (the
block around isSnapshotReady/getSnapshotErrorMessage) to call e2e.Failf on those
error cases instead of returning false so failures surface immediately.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/extended/storage/gcp_snapshot_images.go`:
- Around line 97-103: The current polling loop that waits for pods like
writerPod to reach v1.PodSucceeded never exits early if a pod reaches
v1.PodFailed, causing long timeouts; add a shared helper (e.g.,
waitForPodSucceeded) that polls the pod phase via
oc.AdminKubeClient().CoreV1().Pods(...).Get(...), and if the phase is
v1.PodFailed immediately stop polling, fetch and emit the pod.Status.Reason and
recent logs, and return a failing assertion so tests surface failure
immediately; replace the repeated blocks that call
o.Eventually(...).Should(o.Equal(v1.PodSucceeded)) (locations referencing
writerPod and similar reader pods) with calls to this helper. Ensure the helper
accepts ctx, oc, pod name/object and the polling timeout/interval so it can be
reused in all the listed locations.

---

Duplicate comments:
In `@test/extended/storage/gcp_snapshot_images.go`:
- Around line 113-128: Do not swallow snapshot API/controller errors inside the
Eventually block: when isSnapshotReady(oc, snapshotName) returns an error,
immediately fail the test with e2e.Failf and include that error; likewise, when
!ready call getSnapshotErrorMessage(oc, snapshotName) and if it returns a
non-empty message or an error, immediately fail with e2e.Failf including both
the snapshot error string and any lookup error. Update the anonymous func used
by Eventually (the block around isSnapshotReady/getSnapshotErrorMessage) to call
e2e.Failf on those error cases instead of returning false so failures surface
immediately.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d0e1eabf-476e-43ab-ae9c-52de98ed66fc

📥 Commits

Reviewing files that changed from the base of the PR and between 96c84a6 and 005164e.

📒 Files selected for processing (1)
  • test/extended/storage/gcp_snapshot_images.go

Comment on lines +97 to +103
// Wait for Pod to complete
e2e.Logf("Waiting for writer Pod to complete")
o.Eventually(func() v1.PodPhase {
pod, err := oc.AdminKubeClient().CoreV1().Pods(oc.Namespace()).Get(ctx, writerPod.Name, metav1.GetOptions{})
o.Expect(err).NotTo(o.HaveOccurred())
return pod.Status.Phase
}, gcpSnapshotPollTimeout, gcpSnapshotPollInterval).Should(o.Equal(v1.PodSucceeded))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Stop polling once a pod reaches Failed.

All of these waits only succeed on v1.PodSucceeded, so any writer/reader pod that exits unsuccessfully will sit in Failed until gcpSnapshotPollTimeout expires. That turns real data-mismatch or attach/mount failures into an extra 10 minutes of noise instead of surfacing the pod reason/logs immediately. A shared waitForPodSucceeded helper would fix this in one place.

Also applies to: 159-165, 204-210, 266-272, 311-315, 347-351, 403-407, 430-434

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/extended/storage/gcp_snapshot_images.go` around lines 97 - 103, The
current polling loop that waits for pods like writerPod to reach v1.PodSucceeded
never exits early if a pod reaches v1.PodFailed, causing long timeouts; add a
shared helper (e.g., waitForPodSucceeded) that polls the pod phase via
oc.AdminKubeClient().CoreV1().Pods(...).Get(...), and if the phase is
v1.PodFailed immediately stop polling, fetch and emit the pod.Status.Reason and
recent logs, and return a failing assertion so tests surface failure
immediately; replace the repeated blocks that call
o.Eventually(...).Should(o.Equal(v1.PodSucceeded)) (locations referencing
writerPod and similar reader pods) with calls to this helper. Ensure the helper
accepts ctx, oc, pod name/object and the polling timeout/interval so it can be
reused in all the listed locations.

@openshift-ci-robot
Copy link

Scheduling required tests:
/test e2e-aws-csi
/test e2e-aws-ovn-fips
/test e2e-aws-ovn-microshift
/test e2e-aws-ovn-microshift-serial
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-gcp-csi
/test e2e-gcp-ovn
/test e2e-gcp-ovn-upgrade
/test e2e-metal-ipi-ovn-ipv6
/test e2e-vsphere-ovn
/test e2e-vsphere-ovn-upi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants