-
Notifications
You must be signed in to change notification settings - Fork 39.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Print progress updates to stdout and publish to URL #84524
Conversation
/sig testing
|
e80343c
to
2e12706
Compare
/approve |
/retest |
2e12706
to
8036986
Compare
/assign @andrewsykim |
/assign @timothysc |
/retest |
test/e2e/reporters/progress.go
Outdated
// NewProgressReporter returns a progress reporter which posts updates to the given URL. | ||
func NewProgressReporter(progressUpdateURL string) *ProgressReporter { | ||
c := &http.Client{ | ||
Timeout: time.Second * 10, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks great! But are you thinking about making Timeout
configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really anticipate something like that being particularly configurable. It is really just specifying a timeout so it doesn't hang forever. And regardless, the penalty for one slow request is that it gets dropped effectively which is OK IMO for one or two events.
Do you see a reason it should be a higher value? I wouldn't care about increasing it to a high, but reasonable, value.
test/e2e/reporters/progress.go
Outdated
|
||
// NewProgressReporter returns a progress reporter which posts updates to the given URL. | ||
func NewProgressReporter(progressUpdateURL string) *ProgressReporter { | ||
c := &http.Client{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, If the progressUpdateURL
is empty, we don't have to create a http client.
test/e2e/framework/test_context.go
Outdated
@@ -292,6 +295,8 @@ func RegisterCommonFlags(flags *flag.FlagSet) { | |||
|
|||
flags.BoolVar(&TestContext.ListImages, "list-images", false, "If true, will show list of images used for runnning tests.") | |||
flags.StringVar(&TestContext.KubectlPath, "kubectl-path", "kubectl", "The kubectl binary to use. For development, you might use 'cluster/kubectl.sh' here.") | |||
|
|||
flags.StringVar(&TestContext.ProgressUpdateURL, "progress-report-url", "", "The URL to POST progress updates to as the suite runs to assist in aiding integrations. If empty, no messages sent.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just a question.
Why don't we name ProgressReportURL
instead?
Most test context variables consist of the same words from the option names like the above.
That could be easy to search the code from the option name.
reporter.NumSkipped++ | ||
return | ||
default: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a question: This default
case is unexpected, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct; we are just parsing for success/failure/skip. There are other types but they aren't really applicable at this time for how we're using this (https://github.com/onsi/ginkgo/blob/000d317071340972710dc1da9ae3c2c016a7a560/types/types_test.go#L10-L16)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall LGTM and seems useful, should probably be discussed in SIG testing or Testing Commons meeting (again) before approving?
// sendUpdates serializes the current progress and prints it to stdout and also posts it to the configured endpoint if set. | ||
func (reporter *ProgressReporter) sendUpdates() { | ||
b := reporter.serialize() | ||
fmt.Println(string(b)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this log line intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes; one of the conditions/goals for doing this came from @timothysc during our testing commons meeting and it was to print out something useful to the user to get the same sort of feedback. Currently, when looking at the logs you can't make much sense of how the run as a whole is going or how close it is to finishing until it is done.
This solves that problem.
@andrewsykim Our next meeting for testing commons is on the 11th (this coming monday) correct? If that's the case then I'm OK waiting to merge until then. I brought this up months ago when the ticket was filed but it had come up at the Aug5th meeting (some time ago!) and feedback was positive and I was given the green light to do this as long as I added stdout printing in additino to the webhook. The only question implementation-wise I have is whether or not I should print the full list of failures to stdout as well. The way the Sonobuoy webhook is right now it expects the full status each time so we need to POST it that way. However, if you have a busted run with hundreds of failures it will definitely be sub-optimal to reprint the failed tests each time. The downside of changing the way its printed in each case is (1) it removes a benefit of this approach where you can grep for one of these progress lines and each one will be self-contained and really tell you the full state of affairs (2) it complicates the code a bit (3) may be less performant if we have to serialize a second time but the items are small most of the time so not a huge deal |
8036986
to
12f27ba
Compare
One common frustration of end users running the e2e suite is that they take a significant amount of time and it is difficult to gauge progress. Even if tailing the logs it can be difficult to see where one test starts and another ends or understand the if there have been failures in the past 1h of logs. This change adds a new custom reporter which prints summary information as tests complete. This includes the number of tests to run and how many have been passed/failed/skipped along with which tests have failed. A new flag can be set which pushes these values to an endpoint. This is intended for integration with Sonobuoy but any endpoint could consume and surface this data to the user so they can better understand the state of the current test run.
12f27ba
to
ed1d527
Compare
|
||
// Stream the progress to stdout and optionally a URL accepting progress updates. | ||
r = append(r, e2ereporters.NewProgressReporter(framework.TestContext.ProgressReportURL)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just skip appending the progress reporter if the URL is empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That means we also don't log to stdout but maybe that's okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe logging to stdout should be a separate ginko reporter that's enabled by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand the motivation to decouple the URL and stdout printing; all the logic to count/gather/serialize would be the same. Even if we reused this datatype and had one for stdout printing and one for posting to URLs, then the memory footprint and processing overhead would be double (one for each of the reporters).
What is the desired endgoal; you want the ability to disable this functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes -- it sounds like the new progress reporter is actually doing two things: reporting progress via stdout and reporting progress by sending a POST request to a URL. Based on how ginkgo deals with custom reports, it sounds like they should be decoupled. Reporting to stdout can be on by default but still optional.
I won't block on this though but something we should consider in the future.
I refreshed the image just to be sure we were testing the latest code; you can run:
to see it in action. Then wait a bit and start watching |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: andrewsykim, johnSchnake, neolit123 The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
// As new tests are completed it updates the values and prints them to stdout and optionally, sends the updates | ||
// to the configured URL. | ||
type ProgressReporter struct { | ||
LastMsg string `json:"msg"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a future PR, I think a LastReported
timestamp would be useful.
What type of PR is this?
/kind feature
Minor testing feature to clarify/surface progress of a test run.
What this PR does / why we need it:
One common frustration of end-users running the e2e suite is that
they take a significant amount of time and it is difficult to
gauge progress.
Even if tailing the logs it can be difficult to see where one
test starts and another ends or understand the if there have been
failures in the past 1h of logs.
This change adds a new custom reporter that prints summary information
as tests complete. This includes the number of tests to run and how
many have been passed/failed/skipped along with which tests have failed.
A new flag can be set which pushes these values to an endpoint. This is
intended for integration with Sonobuoy but any endpoint could consume and
surface this data to the user so they can better understand the state
of the current test run.
Which issue(s) this PR fixes:
Fixes #78535
Special notes for your reviewer:
I've published an image that utilizes this; see the
sonobuoy run
command below. It is based on the v1.16.0 conformance image but swaps out the e2e.test binary for my own from this PR.Does this PR introduce a user-facing change?:
Only for testing; does not impact k8s users themselves.
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Example output in logs:
Example command to utilize this with Sonobuoy (will have some more Sonobuoy improvements to make this a default and such, but wanted to show existing value):
And this leads to the value being readable from
sonobuoy status --json
: