New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add several skip cases for pathological events #26380
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
6133b79
to
2172e00
Compare
/retest |
2172e00
to
42fad0b
Compare
42fad0b
to
de8df4d
Compare
@deads2k: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
// isRepeatedEventOKFunc takes a monitorEvent as input and returns true if the repeated event is OK. | ||
// this commonly happens for known bugs and for cases where events are repeated intentionally by tests. | ||
// the string is the message to display for the failure. | ||
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval) (bool, string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: thoughts about:
// checkRepeatedEventFunc takes a monitorEvent as input and returns an error if the
// repeated event is not ok. Returns nil for known bugs, events repeated intentionally
// by tests, and other acceptable events.
type checkRepeatedEventFunc func(monitorEvent monitorapi.EventInterval) error
So you don't have to explain a bool, string
return combo?
} | ||
|
||
// isRepeatedEventOKFunc takes a monitorEvent as input and returns true if the repeated event is OK. | ||
// this commonly happens for known bugs and for cases where events are repeated intentionally by tests. | ||
// the string is the message to display for the failure. | ||
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval) (bool, string) | ||
type isRepeatedEventOKFunc func(monitorEvent monitorapi.EventInterval, kubeClientConfig *rest.Config) bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here you're dropping the string response, but still talk about it in the godocs two lines up. I still think it would be more convenient as an error
return. And the context.TODO()
in isConsoleReadinessDuringInstallation
suggests it should take a ctx context.Context
argument too.
if !strings.Contains(monitorEvent.Locator, "ns/openshift-console") { | ||
return false | ||
} | ||
if !strings.HasPrefix(monitorEvent.Locator, "pod/console-") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your example a few lines up in the godocs opens with ns/openshift-console
, so would expect Contains
instead of HasPrefix
here.
} | ||
tokens := strings.Split(monitorEvent.Locator, " ") | ||
tokens = strings.Split(tokens[1], "/") | ||
podName := tokens[1] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're going to split into tokens, can we just make a map[string][string]
instead of using the Locator
Contains
business above? It is unlikely that ns/openshift-console
is a substring of something in the Locator
that is not the namespace entry, but it feels weird to do some unstructured checking before unmarshalling when we could unmarshal and then do structured checking. And then here, when you just assume that the second token is the pod name, you could use tokens["pod"]
to conveniently get that pod name without being as brittle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're going to split into tokens, can we just make a
map[string][string]
instead of using theLocator
Contains
business above? It is unlikely thatns/openshift-console
is a substring of something in theLocator
that is not the namespace entry, but it feels weird to do some unstructured checking before unmarshalling when we could unmarshal and then do structured checking. And then here, when you just assume that the second token is the pod name, you could usetokens["pod"]
to conveniently get that pod name without being as brittle.
This is a good general idea given the event compression format. I think I'm going to try to move the other way and keep all the event content as a real event.
} | ||
|
||
// this block gets the actual event from the API. This is ugly, but necessary because we don't have real events. | ||
// It may be interesting to track real events, but it would be expensive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would it be expensive to track the real event IDs, so you could ask for them by name here instead of pulling them out of a list response?
|
||
// Kubectl Port forwarding *** | ||
// The same pod name is used many times for all these tests with a tight readiness check to make the tests fast. | ||
// This results in hundreds of events while the pod isn't ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the pod
-> the pod
, removing a doubled space
this only touches a recently added test, bypassing BZ requirements. Slack thread says this looks ok and can be refined later. Need to stop the bleeding on master, so forcing this in. |
builds on #26375, only the last commit is pertinent.
Started burning down the list of pathological events.