status.total counter is not correct for openshift/conformance suite #27350

Closed
mtulio opened this issue Aug 9, 2022 · 5 comments
Labels
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@mtulio
Contributor

mtulio commented Aug 9, 2022

The total field of status is not correct in the openshift/conformance suite (default, parallel).

The problem was found when running OPCT on the latest release. OPCT is built on top of the openshift-tests binary and consumes that counter to report execution progress to the user while the tool is running. More details are available here: https://issues.redhat.com/browse/SPLAT-696
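
For reference, the counter in question is the parenthesized (failed/index/total) prefix that openshift-tests prints on each started:/passed: line. A minimal sketch of how a consumer could read it from a sample line (plain bash; the variable names are illustrative and this is not the actual OPCT code):

# Hypothetical sketch: parse the (failed/index/total) status prefix from one line of output.
line='started: (0/1126/1127) "[sig-storage] PersistentVolumes-expansion loopback local block volume should support online expansion on node"'
re='^started: \(([0-9]+)/([0-9]+)/([0-9]+)\)'
if [[ $line =~ $re ]]; then
  echo "failed=${BASH_REMATCH[1]} index=${BASH_REMATCH[2]} total=${BASH_REMATCH[3]}"
fi
# prints: failed=0 index=1126 total=1127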

Version
$ oc version
Client Version: 4.10.10
Server Version: 4.11.0
Kubernetes Version: v1.24.0+9546431
Steps To Reproduce
  1. openshift-tests run openshift/conformance
  2. Wait for the 1127th test
  3. Check whether the total keeps increasing along with the index (the second field of the (failed/index/total) status prefix); see the sketch after this list
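
A quick way to check step 3 (a hypothetical helper, assuming the status prefix format shown under "Current Result" below): print a message every time the total field changes during a run.

# Hypothetical sketch: flag every change of the total field while the suite runs.
openshift-tests run openshift/conformance 2>&1 | \
  awk -F'[(/)]' '/^started: \(/ {
    if (prev != "" && $4 != prev)
      printf "total changed: %s -> %s at index %s\n", prev, $4, $3
    prev = $4
  }'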
Current Result

After the 1127th test, the total counter keeps increasing along with the index:

openshift-tests version: 4.11.0-202208020706.p0.gb860532.assembly.stream-b860532
Starting SimultaneousPodIPControllerI0809 16:31:15.790490    3733 shared_informer.go:255] Waiting for caches to sync for SimultaneousPodIPController
started: (0/1/1127) "[sig-scheduling][Early] The openshift-monitoring pods should be scheduled on different nodes [Suite:openshift/conformance/parallel]"

(...)

started: (0/1126/1127) "[sig-storage] PersistentVolumes-expansion  loopback local block volume should support online expansion on node [Suite:openshift/conformance/parallel] [Suite:k8s]"

passed: (38s) 2022-08-09T17:12:21 "[sig-storage] In-tree Volumes [Driver: nfs] [Testpattern: Dynamic PV (default fs)] provisioning should provision storage with mount options [Suite:openshift/conformance/parallel] [Suite:k8s]"

started: (0/1127/1127) "[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: tmpfs] [Testpattern: Generic Ephemeral-volume (block volmode) (late-binding)] ephemeral should support two pods which have the same volume definition [Suite:openshift/conformance/parallel] [Suite:k8s]"

passed: (6.6s) 2022-08-09T17:12:21 "[sig-storage] Downward API volume should provide container's memory request [NodeConformance] [Conformance] [Suite:openshift/conformance/parallel/minimal] [Suite:k8s]"

started: (0/1128/1128) "[sig-storage] In-tree Volumes [Driver: cinder] [Testpattern: Dynamic PV (immediate binding)] topology should fail to schedule a pod which has topologies that conflict with AllowedTopologies [Suite:openshift/conformance/parallel] [Suite:k8s]"

skip [k8s.io/kubernetes@v1.24.0/test/e2e/storage/framework/testsuite.go:116]: Driver local doesn't support GenericEphemeralVolume -- skipping
Ginkgo exit error 3: exit with code 3

skipped: (400ms) 2022-08-09T17:12:21 "[sig-storage] In-tree Volumes [Driver: local][LocalVolumeType: tmpfs] [Testpattern: Generic Ephemeral-volume (block volmode) (late-binding)] ephemeral should support two pods which have the same volume definition [Suite:openshift/conformance/parallel] [Suite:k8s]"

started: (0/1129/1129) "[sig-storage] In-tree Volumes [Driver: emptydir] [Testpattern: Dynamic PV (default fs)] capacity provides storage capacity information [Suite:openshift/conformance/parallel] [Suite:k8s]" 

After that, it keeps increasing until the last test (3475th):

started: (30/3474/3474) "[sig-arch][bz-etcd][Late] Alerts alert/etcdGRPCRequestsSlow should not be at or above pending [Suite:openshift/conformance/parallel]"

passed: (4.5s) 2022-08-09T18:26:40 "[sig-arch][bz-Unknown][Late] Alerts alert/KubePodNotReady should not be at or above info in all the other namespaces [Suite:openshift/conformance/parallel]"

started: (30/3475/3475) "[sig-arch][bz-Unknown][Late] Alerts alert/KubePodNotReady should not be at or above pending in ns/default [Suite:openshift/conformance/parallel]"
Expected Result
started: (0/1/3475)   (....)
Additional Information

Extracting openshift-tests from the same release the cluster is running, I got a different count:

$ ./.local/bin/openshift-install-linux-4.11.0 version
./.local/bin/openshift-install-linux-4.11.0 4.11.0
built from commit 37684309bcb598757c99d3ea9fbc0758343d64a5
release image quay.io/openshift-release-dev/ocp-release@sha256:300bce8246cf880e792e106607925de0a404484637627edf5f517375517d54a4
release architecture amd64

$ RELEASE_IMAGE=$(./.local/bin/openshift-install-linux-4.11.0 version | awk '/release image/ {print $3}')
$ TESTS_IMAGE=$(oc adm release info --image-for='tests' $RELEASE_IMAGE)

$ oc image extract $TESTS_IMAGE --file="/usr/bin/openshift-tests" -a ~/.openshift/pull-secret-latest.json

$ chmod u+x openshift-tests
$ ./openshift-tests run --dry-run openshift/conformance |wc -l
3487
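
Since --dry-run prints one test per line (which is what the wc -l count above relies on), a consumer could, as a hypothetical workaround, take the expected total from a dry run up front instead of relying on the in-flight counter:

# Hypothetical workaround sketch: compute the expected total before the real run.
EXPECTED_TOTAL=$(./openshift-tests run --dry-run openshift/conformance | wc -l)
echo "expected total: ${EXPECTED_TOTAL}"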

@elmiko
Contributor

elmiko commented Aug 16, 2022

We talked about this issue during the install flex sync meeting today. We don't think it is overly concerning, but it will be a problem for people who want to monitor the count as the run happens, since it will be difficult to determine when the tests will end.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 15, 2022
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 15, 2022
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this as completed Jan 15, 2023
@openshift-ci
Contributor

openshift-ci bot commented Jan 15, 2023

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
