add several skip cases for pathological events #26380

Merged
deads2k merged 4 commits into openshift:master from the ignore-events-01 branch on Aug 6, 2021

Conversation

deads2k
Contributor

@deads2k deads2k commented Aug 5, 2021

Builds on #26375; only the last commit is pertinent.

Started burning down the list of pathological events.

@openshift-ci
Contributor

openshift-ci bot commented Aug 5, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Aug 5, 2021
@deads2k deads2k force-pushed the ignore-events-01 branch 2 times, most recently from 6133b79 to 2172e00 Compare August 5, 2021 19:11
@deads2k
Contributor Author

deads2k commented Aug 5, 2021

/retest

@openshift-ci
Contributor

openshift-ci bot commented Aug 6, 2021

@deads2k: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-gcp-upgrade de8df4d link /test e2e-gcp-upgrade
ci/prow/e2e-agnostic-cmd de8df4d link /test e2e-agnostic-cmd
ci/prow/e2e-aws-fips de8df4d link /test e2e-aws-fips
ci/prow/e2e-aws-single-node de8df4d link /test e2e-aws-single-node
ci/prow/e2e-metal-ipi-ovn-ipv6 de8df4d link /test e2e-metal-ipi-ovn-ipv6


// isRepeatedEventOKFunc takes a monitorEvent as input and returns true if the repeated event is OK.
// this commonly happens for known bugs and for cases where events are repeated intentionally by tests.
// the string is the message to display for the failure.
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval) (bool, string)
Member

@wking wking Aug 6, 2021

nit: thoughts about:

// checkRepeatedEventFunc takes a monitorEvent as input and returns an error if the
// repeated event is not ok.  Returns nil for known bugs, events repeated intentionally
// by tests, and other acceptable events.
type checkRepeatedEventFunc func(monitorEvent monitorapi.EventInterval) error

So you don't have to explain a bool, string return combo?
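
To make the suggestion concrete, here is a minimal sketch of the error-returning shape. It is not the code from this PR: EventInterval is a hypothetical stand-in for monitorapi.EventInterval with only two fields, and allowConsoleNamespace is an illustrative checker that mirrors the namespace check quoted further down.

```go
package main

import (
	"fmt"
	"strings"
)

// EventInterval is a hypothetical stand-in for monitorapi.EventInterval,
// reduced to the two fields this sketch needs.
type EventInterval struct {
	Locator string
	Message string
}

// checkRepeatedEventFunc returns nil when the repeated event is acceptable
// and an error describing the failure otherwise.
type checkRepeatedEventFunc func(monitorEvent EventInterval) error

// allowConsoleNamespace is an illustrative checker (an assumption, not the
// PR's logic): it accepts any event whose locator mentions the console namespace.
func allowConsoleNamespace(monitorEvent EventInterval) error {
	if strings.Contains(monitorEvent.Locator, "ns/openshift-console") {
		return nil
	}
	return fmt.Errorf("event repeated too often: %s - %s",
		monitorEvent.Locator, monitorEvent.Message)
}

func main() {
	var check checkRepeatedEventFunc = allowConsoleNamespace

	events := []EventInterval{
		{Locator: "ns/openshift-console pod/console-abc", Message: "readiness probe failed"},
		{Locator: "ns/default pod/foo", Message: "back-off restarting container"},
	}
	for _, e := range events {
		if err := check(e); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", e.Locator)
	}
}
```

With this shape the failure message travels inside the error, so callers only need the usual `err != nil` test and the godoc has a single return value to describe.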

}

// isRepeatedEventOKFunc takes a monitorEvent as input and returns true if the repeated event is OK.
// this commonly happens for known bugs and for cases where events are repeated intentionally by tests.
// the string is the message to display for the failure.
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval) (bool, string)
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval, kubeClientConfig *rest.Config) bool
Member

here you're dropping the string response, but still talk about it in the godocs two lines up. I still think it would be more convenient as an error return. And the context.TODO() in isConsoleReadinessDuringInstallation suggests it should take a ctx context.Context argument too.

if !strings.Contains(monitorEvent.Locator, "ns/openshift-console") {
return false
}
if !strings.HasPrefix(monitorEvent.Locator, "pod/console-") {
Member

your example a few lines up in the godocs opens with ns/openshift-console, so I would expect Contains instead of HasPrefix here.

}
tokens := strings.Split(monitorEvent.Locator, " ")
tokens = strings.Split(tokens[1], "/")
podName := tokens[1]
Member

if we're going to split into tokens, can we just make a map[string]string instead of using the Locator Contains business above? It is unlikely that ns/openshift-console is a substring of something in the Locator that is not the namespace entry, but it feels weird to do some unstructured checking before unmarshalling when we could unmarshal and then do structured checking. And then here, when you just assume that the second token is the pod name, you could use tokens["pod"] to conveniently get that pod name without being as brittle.

Contributor Author

> if we're going to split into tokens, can we just make a map[string]string instead of using the Locator Contains business above? It is unlikely that ns/openshift-console is a substring of something in the Locator that is not the namespace entry, but it feels weird to do some unstructured checking before unmarshalling when we could unmarshal and then do structured checking. And then here, when you just assume that the second token is the pod name, you could use tokens["pod"] to conveniently get that pod name without being as brittle.

This is a good general idea given the event compression format. I think I'm going to try to move the other way and keep all the event content as a real event.
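
For reference, the structured approach discussed in this thread might look roughly like the sketch below. It assumes the locator is a space-separated list of key/value tokens such as "ns/openshift-console pod/console-abc"; parseLocator and isConsolePodStructured are hypothetical names, not functions from the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// parseLocator splits a locator into key/value pairs, assuming
// space-separated "key/value" tokens. Tokens without a "/" are skipped.
func parseLocator(locator string) map[string]string {
	parts := map[string]string{}
	for _, token := range strings.Fields(locator) {
		kv := strings.SplitN(token, "/", 2)
		if len(kv) != 2 {
			continue
		}
		parts[kv[0]] = kv[1]
	}
	return parts
}

// isConsolePodStructured checks the parsed fields instead of raw substrings,
// so it does not depend on token order.
func isConsolePodStructured(locator string) bool {
	parts := parseLocator(locator)
	return parts["ns"] == "openshift-console" &&
		strings.HasPrefix(parts["pod"], "console-")
}

func main() {
	for _, locator := range []string{
		"ns/openshift-console pod/console-abc",
		"pod/console-abc ns/openshift-console",
		"ns/openshift-console-operator pod/console-operator-abc",
	} {
		fmt.Printf("%-60s pod=%q match=%v\n",
			locator, parseLocator(locator)["pod"], isConsolePodStructured(locator))
	}
}
```

An exact comparison on the parsed namespace also avoids matching a namespace that merely starts with the same string, such as openshift-console-operator, which a Contains check on the raw locator would accept.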

}

// this block gets the actual event from the API. This is ugly, but necessary because we don't have real events.
// It may be interesting to track real events, but it would be expensive.
Member

would it be expensive to track the real event IDs, so you could ask for them by name here instead of pulling them out of a list response?


// Kubectl Port forwarding ***
// The same pod name is used many times for all these tests with a tight readiness check to make the tests fast.
// This results in hundreds of events while the pod isn't ready.
Member

nit: "the  pod" -> "the pod", removing a doubled space

@deads2k deads2k added the bugzilla/valid-bug and lgtm labels Aug 6, 2021
@deads2k
Contributor Author

deads2k commented Aug 6, 2021

this only touches a recently added test, bypassing BZ requirements.

Slack thread says this looks ok and can be refined later.

Need to stop the bleeding on master, so forcing this in.

@deads2k deads2k merged commit f0546d0 into openshift:master Aug 6, 2021
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
lgtm: Indicates that a PR is ready to be merged.

2 participants