Print progress updates to stdout and publish to URL #84524

johnSchnake · 2019-10-29T15:34:19Z

What type of PR is this?
/kind feature

Minor testing feature to clarify/surface progress of a test run.

What this PR does / why we need it:
One common frustration of end-users running the e2e suite is that
they take a significant amount of time and it is difficult to
gauge progress.

Even if tailing the logs it can be difficult to see where one
test starts and another ends or understand the if there have been
failures in the past 1h of logs.

This change adds a new custom reporter that prints summary information
as tests complete. This includes the number of tests to run and how
many have been passed/failed/skipped along with which tests have failed.

A new flag can be set which pushes these values to an endpoint. This is
intended for integration with Sonobuoy but any endpoint could consume and
surface this data to the user so they can better understand the state
of the current test run.

Which issue(s) this PR fixes:
Fixes #78535

Special notes for your reviewer:
I've published an image that utilizes this; see the sonobuoy run command below. It is based on the v1.16.0 conformance image but swaps out the e2e.test binary for my own from this PR.

Does this PR introduce a user-facing change?:
Only for testing; does not impact k8s users themselves.

A new flag "progress-report-url" has been added to the test context which allows progress information about the test run to be sent to a webhook. In addition, this information is printed to stdout to aid in users watching the logs.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
Example output in logs:

{"msg":"PASSED [sig-storage] Downward API volume should update annotations on modification [NodeConformance] [Conformance]","total":277,"completed":5,"numFailures":0,"numSkipped":98}

Example command to utilize this with Sonobuoy (will have some more Sonobuoy improvements to make this a default and such, but wanted to show existing value):

sonobuoy run --kube-conformance-image=schnake/conformance:progressv4 --plugin-env e2e.E2E_EXTRA_ARGS="--progress-report-url=http://localhost:8099/progress"

And this leads to the value being readable from sonobuoy status --json:

{"plugins":[{"plugin":"e2e","node":"global","status":"running","result-status":"","result-counts":null,"progress":{"name":"e2e","node":"global","timestamp":"2019-10-29T15:29:48.0908739Z","msg":"PASSED [sig-cli] Kubectl client Kubectl describe should check if kubectl describe prints relevant information for rc and pods  [Conformance]","total":277,"completed":24}}],"status":"running","tar-info":{"name":"","created":"0001-01-01T00:00:00Z","sha256":"","size":0}}

johnSchnake · 2019-10-29T15:41:17Z

/sig testing
/area e2e-test-framework
/priority important-longterm

Marking as import-longterm but I'm hoping to get this merged for the v1.17 release.
Technically a feature but is so small and is just a testing "feature" that a KEP does not seem necessary/appropriate.
Happy to take feedback about tweaks to the printed data but I'm hoping we don't bikeshed too much at this time. We can tweak this over time if need-be but the basics about how many tests pass/fail are invaluable.

@dims @andrewsykim

test/e2e/reporters/progress.go

neolit123 · 2019-10-29T18:14:00Z

/approve
i like the idea!

johnSchnake · 2019-10-29T18:45:51Z

/retest

johnSchnake · 2019-10-30T20:21:46Z

/assign @andrewsykim

johnSchnake · 2019-10-30T20:21:56Z

/assign @timothysc

johnSchnake · 2019-10-31T13:50:01Z

/retest

johnSchnake · 2019-10-31T16:42:24Z

@hh

SataQiu · 2019-11-04T08:09:41Z

test/e2e/reporters/progress.go

+// NewProgressReporter returns a progress reporter which posts updates to the given URL.
+func NewProgressReporter(progressUpdateURL string) *ProgressReporter {
+	c := &http.Client{
+		Timeout: time.Second * 10,


It looks great! But are you thinking about making Timeout configurable?

I don't really anticipate something like that being particularly configurable. It is really just specifying a timeout so it doesn't hang forever. And regardless, the penalty for one slow request is that it gets dropped effectively which is OK IMO for one or two events.

Do you see a reason it should be a higher value? I wouldn't care about increasing it to a high, but reasonable, value.

SataQiu · 2019-11-04T08:17:01Z

test/e2e/reporters/progress.go

+
+// NewProgressReporter returns a progress reporter which posts updates to the given URL.
+func NewProgressReporter(progressUpdateURL string) *ProgressReporter {
+	c := &http.Client{


IMO, If the progressUpdateURL is empty, we don't have to create a http client.

oomichi · 2019-11-05T21:47:02Z

test/e2e/framework/test_context.go

@@ -292,6 +295,8 @@ func RegisterCommonFlags(flags *flag.FlagSet) {

 	flags.BoolVar(&TestContext.ListImages, "list-images", false, "If true, will show list of images used for runnning tests.")
 	flags.StringVar(&TestContext.KubectlPath, "kubectl-path", "kubectl", "The kubectl binary to use. For development, you might use 'cluster/kubectl.sh' here.")
+
+	flags.StringVar(&TestContext.ProgressUpdateURL, "progress-report-url", "", "The URL to POST progress updates to as the suite runs to assist in aiding integrations. If empty, no messages sent.")


nit: just a question.
Why don't we name ProgressReportURL instead?
Most test context variables consist of the same words from the option names like the above.
That could be easy to search the code from the option name.

oomichi · 2019-11-05T21:57:34Z

test/e2e/reporters/progress.go

+		reporter.NumSkipped++
+		return
+	default:
+		return


just a question: This default case is unexpected, right?

Correct; we are just parsing for success/failure/skip. There are other types but they aren't really applicable at this time for how we're using this (https://github.com/onsi/ginkgo/blob/000d317071340972710dc1da9ae3c2c016a7a560/types/types_test.go#L10-L16)

test/e2e/reporters/progress.go

andrewsykim

Overall LGTM and seems useful, should probably be discussed in SIG testing or Testing Commons meeting (again) before approving?

test/e2e/reporters/progress.go

andrewsykim · 2019-11-06T18:26:34Z

test/e2e/reporters/progress.go

+// sendUpdates serializes the current progress and prints it to stdout and also posts it to the configured endpoint if set.
+func (reporter *ProgressReporter) sendUpdates() {
+	b := reporter.serialize()
+	fmt.Println(string(b))


Is this log line intentional?

Yes; one of the conditions/goals for doing this came from @timothysc during our testing commons meeting and it was to print out something useful to the user to get the same sort of feedback. Currently, when looking at the logs you can't make much sense of how the run as a whole is going or how close it is to finishing until it is done.

This solves that problem.

johnSchnake · 2019-11-06T19:56:00Z

@andrewsykim Our next meeting for testing commons is on the 11th (this coming monday) correct? If that's the case then I'm OK waiting to merge until then. I brought this up months ago when the ticket was filed but it had come up at the Aug5th meeting (some time ago!) and feedback was positive and I was given the green light to do this as long as I added stdout printing in additino to the webhook.

The only question implementation-wise I have is whether or not I should print the full list of failures to stdout as well. The way the Sonobuoy webhook is right now it expects the full status each time so we need to POST it that way. However, if you have a busted run with hundreds of failures it will definitely be sub-optimal to reprint the failed tests each time. The downside of changing the way its printed in each case is (1) it removes a benefit of this approach where you can grep for one of these progress lines and each one will be self-contained and really tell you the full state of affairs (2) it complicates the code a bit (3) may be less performant if we have to serialize a second time but the items are small most of the time so not a huge deal

test/e2e/reporters/progress.go

One common frustration of end users running the e2e suite is that they take a significant amount of time and it is difficult to gauge progress. Even if tailing the logs it can be difficult to see where one test starts and another ends or understand the if there have been failures in the past 1h of logs. This change adds a new custom reporter which prints summary information as tests complete. This includes the number of tests to run and how many have been passed/failed/skipped along with which tests have failed. A new flag can be set which pushes these values to an endpoint. This is intended for integration with Sonobuoy but any endpoint could consume and surface this data to the user so they can better understand the state of the current test run.

andrewsykim · 2019-11-11T18:08:21Z

test/e2e/e2e.go


+	// Stream the progress to stdout and optionally a URL accepting progress updates.
+	r = append(r, e2ereporters.NewProgressReporter(framework.TestContext.ProgressReportURL))


Can we just skip appending the progress reporter if the URL is empty?

That means we also don't log to stdout but maybe that's okay?

Or maybe logging to stdout should be a separate ginko reporter that's enabled by default?

I'm not sure I understand the motivation to decouple the URL and stdout printing; all the logic to count/gather/serialize would be the same. Even if we reused this datatype and had one for stdout printing and one for posting to URLs, then the memory footprint and processing overhead would be double (one for each of the reporters).

What is the desired endgoal; you want the ability to disable this functionality?

Yes -- it sounds like the new progress reporter is actually doing two things: reporting progress via stdout and reporting progress by sending a POST request to a URL. Based on how ginkgo deals with custom reports, it sounds like they should be decoupled. Reporting to stdout can be on by default but still optional.

I won't block on this though but something we should consider in the future.

johnSchnake · 2019-11-11T18:16:13Z

I refreshed the image just to be sure we were testing the latest code; you can run:

sonobuoy run --plugin e2e --kube-conformance-image=schnake/conformance:demo --plugin-env e2e.E2E_EXTRA_ARGS="--progress-report-url=http://localhost:8099/progress" --image-pull-policy=IfNotPresent

to see it in action. Then wait a bit and start watching sonobuoy status --json to see the progress.

andrewsykim · 2019-11-11T19:59:33Z

/approve
/lgtm

k8s-ci-robot · 2019-11-11T20:00:11Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, johnSchnake, neolit123

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~test/OWNERS~~ [andrewsykim]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

andrewsykim · 2019-11-11T20:03:45Z

test/e2e/reporters/progress.go

+// As new tests are completed it updates the values and prints them to stdout and optionally, sends the updates
+// to the configured URL.
+type ProgressReporter struct {
+	LastMsg string `json:"msg"`


For a future PR, I think a LastReported timestamp would be useful.

k8s-ci-robot requested review from oomichi and SataQiu October 29, 2019 15:35

k8s-ci-robot added area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 29, 2019

k8s-ci-robot added priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Oct 29, 2019

johnSchnake force-pushed the customReporter branch from e80343c to 2e12706 Compare October 29, 2019 15:42

johnSchnake commented Oct 29, 2019

View reviewed changes

test/e2e/reporters/progress.go Outdated Show resolved Hide resolved

johnSchnake commented Oct 29, 2019

View reviewed changes

test/e2e/reporters/progress.go Outdated Show resolved Hide resolved

johnSchnake force-pushed the customReporter branch from 2e12706 to 8036986 Compare October 30, 2019 20:21

k8s-ci-robot assigned andrewsykim Oct 30, 2019

k8s-ci-robot assigned timothysc Oct 30, 2019

SataQiu approved these changes Nov 4, 2019

View reviewed changes

oomichi reviewed Nov 5, 2019

View reviewed changes

andrewsykim reviewed Nov 6, 2019

View reviewed changes

test/e2e/reporters/progress.go Outdated Show resolved Hide resolved

andrewsykim reviewed Nov 6, 2019

View reviewed changes

johnSchnake force-pushed the customReporter branch from 8036986 to 12f27ba Compare November 6, 2019 20:06

andrewsykim reviewed Nov 6, 2019

View reviewed changes

test/e2e/reporters/progress.go Outdated Show resolved Hide resolved

johnSchnake force-pushed the customReporter branch from 12f27ba to ed1d527 Compare November 9, 2019 21:35

andrewsykim reviewed Nov 11, 2019

View reviewed changes

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 11, 2019

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 11, 2019

andrewsykim reviewed Nov 11, 2019

View reviewed changes

k8s-ci-robot merged commit 1ab6360 into kubernetes:master Nov 11, 2019

k8s-ci-robot added this to the v1.17 milestone Nov 11, 2019

johnSchnake mentioned this pull request Nov 12, 2019

Utilize progress updates with E2E plugin after v1.17+ vmware-tanzu/sonobuoy#992

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Print progress updates to stdout and publish to URL #84524

Print progress updates to stdout and publish to URL #84524

johnSchnake commented Oct 29, 2019

johnSchnake commented Oct 29, 2019

neolit123 commented Oct 29, 2019

johnSchnake commented Oct 29, 2019

johnSchnake commented Oct 30, 2019

johnSchnake commented Oct 30, 2019

johnSchnake commented Oct 31, 2019

johnSchnake commented Oct 31, 2019

SataQiu Nov 4, 2019

johnSchnake Nov 4, 2019

SataQiu Nov 4, 2019

oomichi Nov 5, 2019

oomichi Nov 5, 2019

johnSchnake Nov 6, 2019

andrewsykim left a comment

andrewsykim Nov 6, 2019

johnSchnake Nov 6, 2019

johnSchnake commented Nov 6, 2019

andrewsykim Nov 11, 2019

andrewsykim Nov 11, 2019

andrewsykim Nov 11, 2019 •

edited

Loading

johnSchnake Nov 11, 2019

andrewsykim Nov 11, 2019

johnSchnake commented Nov 11, 2019

andrewsykim commented Nov 11, 2019

k8s-ci-robot commented Nov 11, 2019

andrewsykim Nov 11, 2019


		// Stream the progress to stdout and optionally a URL accepting progress updates.
		r = append(r, e2ereporters.NewProgressReporter(framework.TestContext.ProgressReportURL))

Print progress updates to stdout and publish to URL #84524

Print progress updates to stdout and publish to URL #84524

Conversation

johnSchnake commented Oct 29, 2019

johnSchnake commented Oct 29, 2019

neolit123 commented Oct 29, 2019

johnSchnake commented Oct 29, 2019

johnSchnake commented Oct 30, 2019

johnSchnake commented Oct 30, 2019

johnSchnake commented Oct 31, 2019

johnSchnake commented Oct 31, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrewsykim left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnSchnake commented Nov 6, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

andrewsykim Nov 11, 2019 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

johnSchnake commented Nov 11, 2019

andrewsykim commented Nov 11, 2019

k8s-ci-robot commented Nov 11, 2019

Choose a reason for hiding this comment

andrewsykim Nov 11, 2019 •

edited

Loading