
move remaining disruption tests to invariants #28144

Merged

Conversation

deads2k (Contributor) commented Aug 7, 2023

Dropped the new apiserver testing that Vadim is working on in a separate PR.

@openshift-ci openshift-ci bot requested review from bparees and csrwng August 7, 2023 15:46
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 7, 2023
deads2k commented Aug 7, 2023

Looks to be working. ci-cluster-network-liveness still exists in the backends and would produce failed intervals, but it doesn't have an associated test because it never fails.

deads2k commented Aug 7, 2023

/retest

@@ -153,6 +152,9 @@ a running cluster.
`),

RunE: func(cmd *cobra.Command, args []string) error {
	if true {
		return fmt.Errorf("this command got nerfed")
	}
Contributor:

I have to at least ask, you really want this in here?

deads2k (Contributor, Author):

What is this command trying to do and does it make sense this way with invariant tests? Would a rewrite when we need it again make more sense?

I'm up for removing the command if you're down with it.

Contributor:

As far as removing it goes, I would be interested in checking with @dgoodwin to see what his original intent was and whether it is still of use. If it's left disabled, a comment explaining why it was disabled would suffice.

Contributor:

This command lets developers work on disruption test code and run it against a given intervals file containing disruption we're interested in testing against, getting feedback in seconds rather than hours waiting for CI to run. I would prefer to keep it operational.

With your new testing, would it be possible to run through the JUnit generation parts of your interface but skip the setup/generate-intervals portions?

deads2k (Contributor, Author):

That capability was already lost at a previous point in the refactor. Is it used frequently enough to pre-emptively recreate it, or should it just be rebuilt the next time it's needed?

I'm really not clear on which part it's trying to test. Possibilities:

  1. the code doing sample collection
  2. the code recording the sample failures
  3. the code summarizing the disruption summary json file
  4. the code rendering a timeline
  5. the code looking up historical values
  6. the code creating junit reports.

Which part is this command trying to do?

Contributor:

It is supposed to run the "should remain available" disruption tests against the historical data on disk. It assumes it's already given an intervals file with the observed disruption. In your list, #5, plus running the tests and viewing the output to see what would fail and with what values.

Deep is going to need something similar very soon for alerts.

if backendName == externalservice.LivenessProbeBackend {
	aed := allowedExternalDisruption
	return &aed, "forgiving limit for disruption to an external service", nil
}
Contributor:

IIRC this was in place so we didn't fail due to our check for whether the cluster running the tests was having connection issues.

deads2k (Contributor, Author):

The test was always ok. I've left the backend check and it reports intervals and overall disruption. Do we also need the "never fail" test?

Contributor:

Nope, I see you are dropping the tests. When I saw the change I just wanted to be sure we weren't risking failures on these checks again.


neisw commented Aug 8, 2023

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 8, 2023

openshift-ci bot commented Aug 8, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, neisw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci bot commented Aug 8, 2023

@deads2k: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name: ci/prow/e2e-aws-ovn-single-node-serial
Commit: 3aeb2d7
Required: false
Rerun command: /test e2e-aws-ovn-single-node-serial


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-trt-bot

Job Failure Risk Analysis for sha: 3aeb2d7

pull-ci-openshift-origin-master-e2e-aws-ovn-serial — failure risk: High
  [sig-apps] Daemon set [Serial] should surge pods onto nodes when spec was updated and update strategy is RollingUpdate [Suite:openshift/conformance/serial] [Suite:k8s]
  This test has passed 100.00% of 53 runs on jobs ['periodic-ci-openshift-release-master-ci-4.14-e2e-aws-ovn-serial' 'periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-serial'] in the last 14 days.

pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial — failure risk: Low
  [sig-storage] PersistentVolumes-local Stress with local volumes [Serial] should be able to process many pods and reuse local volumes [Suite:openshift/conformance/serial] [Suite:k8s]
  This test has passed 70.59% of 34 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.14-e2e-aws-ovn-single-node-serial'] in the last 14 days.


neisw commented Aug 8, 2023

/override ci/prow/e2e-aws-ovn-serial

Failure for [sig-apps] Daemon set [Serial] should surge pods onto nodes when spec was updated and update strategy is RollingUpdate is not related


openshift-ci bot commented Aug 8, 2023

@neisw: Overrode contexts on behalf of neisw: ci/prow/e2e-aws-ovn-serial

In response to this:

/override ci/prow/e2e-aws-ovn-serial

Failure for [sig-apps] Daemon set [Serial] should surge pods onto nodes when spec was updated and update strategy is RollingUpdate is not related

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-merge-robot openshift-merge-robot merged commit e973bdc into openshift:master Aug 8, 2023
22 of 23 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

5 participants