Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds an optional golang runner to the conformance test image #79284

Merged
merged 1 commit into from Jul 2, 2019

Conversation

@johnSchnake
Copy link
Contributor

commented Jun 21, 2019

What type of PR is this?
/kind feature

What this PR does / why we need it:
Adds a go app which runs the e2e tests with ginkgo.

  • Supports all the existing env vars of the bash script
  • Improved flow control to avoid and better report issues
    regarding the process PID
  • Adds flags for modifying where to find the test binary and
    ginkgo binary so that you can run it locally
  • Adds 3 flags for specifying extra args before the double-dash,
    extra args after the double-dash, and the seperator to use between
    values in those env vars. This allows setting arbitrary, complex
    values for use on the command such as flags which include spaces
    or other characters.

Which issue(s) this PR fixes:
Fixes #78699
Fixes #78599

Special notes for your reviewer:
This is not, at this time, replacing the bash runner so as to avoid breaking changes. I imagine this being merged and then letting it bake for a little while and we see if we run into any unforeseen problems.

To utilize this, you can try:

sonobuoy run --mode=quick --kube-conformance-image=schnake/conformance:goRunner --plugin-env e2e.E2E_USE_GO_RUNNER=true --plugin-env e2e.E2E_EXTRA_GINKGO_ARGS="--afterSuiteHook=echo (ginkgo-suite-name) suite tests have [(ginkgo-suite-passed)]" --wait

outfile=$(sonobuoy retrieve) && mkdir results && tar -xf $outfile -C results

head results/podlogs/heptio-sonobuoy/sonobuoy-e2e-job-*/logs/e2e.txt
tail results/podlogs/heptio-sonobuoy/sonobuoy-e2e-job-*/logs/e2e.txt

Notice a few things to see that the go runner is being used:

  • the command to be run (with paths, env, etc) is all printed at the start of the command. (The Env: [] indicates the command is just using the underlying os env. I should modify the image to print that as well so it is not confusing and has more debug use)
  • I passed an arbitrary flag --afterSuiteHook which took spaces, parentheses, and brackets without an issue. That is why you see e2e suite tests have [[PASS]] at the end of the suite.

Does this PR introduce a user-facing change?:
Not sure if this counts as user-facing or not, but I went ahead and added a note.

When using the conformance test image, a new environment variable E2E_USE_GO_RUNNER will cause the tests to be run with the new Golang-based test runner rather than the current bash wrapper.

/sig testing
/area testing

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 21, 2019

Ping @dims Let me know what you think.

I also thought about marking this as /area deflake but it doesn't remove a flaky test ,but flaky behavior in certain test runners. As well, wasn't sure about /area conformance since it is not about a specific conformance test.

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 21, 2019

/retest

@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch from 2652291 to e49a193 Jun 21, 2019
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 21, 2019

So I think the build is failing because I'm adding a new binary and not properly integrating it into the build. What do I need to do?

cp: cannot stat '/workspace/k8s.io/kubernetes/cluster/images/conformance/../../../_output/dockerized/bin/linux/amd64/go-runner': No such file or directory

Locally I build via the typical make flow. On a mac so I run:

build/run.sh make WHAT="cluster/images/conformance/go-runner"
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 22, 2019

Seems like I have to simultaneously update k8s test-infra https://github.com/kubernetes/test-infra/blob/master/experiment/kind-conformance-e2e.sh#L26

And maybe a few other places. I've not yet had changes that dependent changes in different k8s repos at one time. What is the protocol here?

Copy link
Member

left a comment

SGTM,
added minor comments.

i think long run it would have been better if we support a versioned configuration for these options that can be mounted in the container.
env. vars are arguably more difficult to maintain.

PS: this folder should be moved outside of /cluster.

@@ -36,6 +36,14 @@ saveResults() {
echo -n "${RESULTS_DIR}/e2e.tar.gz" > "${RESULTS_DIR}/done"
}

# Optional Golang runner alternative to the bash script.
# Entry provided via env var to simplify invocation.
if [[ -n ${E2E_USE_GO_RUNNER:-} ]]; then

This comment has been minimized.

Copy link
@neolit123

neolit123 Jun 22, 2019

Member

if we temp. enable this by default we can even test it with a presubmit e2e:
pull-kubernetes-conformance-image-test
?

This comment has been minimized.

Copy link
@johnSchnake

johnSchnake Jun 23, 2019

Author Contributor

I'm not exactly sure I understand the question; if I push a commit where it is the default (just for testing) it should run fine; the log and junit output is the same. The problem I'm having right now it seems though is build related; maybe you can help.

It seems like pull-kubernetes-conformance-image-test is calling into the test-infra repo which doesn't build go-runner properly so it messes things up I think. I must be missing some part of how to integrate this into the build. The error is below but the test infra code is here (https://github.com/kubernetes/test-infra/blob/master/experiment/kind-conformance-e2e.sh#L26)

W0622 14:50:22.002] cp: cannot stat '/go/src/k8s.io/kubernetes/cluster/images/conformance/../../../_output/dockerized/bin/linux/amd64/go-runner': No such file or directory
...
W0622 14:50:28.225] subprocess.CalledProcessError: Command '('bash', '-c', 'cd ./../../k8s.io/kubernetes && source ./../test-infra/experiment/kind-conformance-e2e.sh')' returned non-zero exit status 2
E0622 14:50:28.230] Command failed

So I've integrated into the build and as a test target, is there somewhere else I need to update? Or is there a different place just for the dockerized artifacts? Or is this a problem because of a disconnect between test infra and k/k?

This comment has been minimized.

Copy link
@neolit123

neolit123 Jun 23, 2019

Member

forgot that our script:
https://github.com/kubernetes/test-infra/blob/master/experiment/kind-conformance-e2e.sh#L40
loads an image from GCR and does not build a new image, thus the runner will be absent there.

so my idea will not work.

This comment has been minimized.

Copy link
@johnSchnake

johnSchnake Jun 23, 2019

Author Contributor

That's just a kind load which is going to load the image you have tagged locally, right? And right above that, it builds this image.

I think that's the problem I'm having now: It tries to build/load the image but that script is in the test-infra repo. @fejta said there isn't a way to update that script at the same time as this PR because they are in different repos; so I feel like I'm in a chicken/egg situation.

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2019

@neolit123

PS: this folder should be moved outside of /cluster.

I always wondered how it ended up there instead of in test/images. So you'd be good if I move it there too? Or should I put that in a separate PR?

i think long run it would have been better if we support a versioned configuration for these options that can be mounted in the container.
env. vars are arguably more difficult to maintain.

Do you have an example of where configs are used this way? It seemed to me to be the convention for docker images to just use env vars. But maybe since there are so many possible values a mountable config makes more sense? It is just harder to mount a separate config than setting 1-10 env vars.

@neolit123

This comment has been minimized.

Copy link
Member

commented Jun 23, 2019

I always wondered how it ended up there instead of in test/images. So you'd be good if I move it there too? Or should I put that in a separate PR?

if test/images is only for images that are used during individual e2e tests, then it might not be the right place for the conformance image. we also have "build/", where the pause image lives.

i think all images under test/images and build/ should be moved to /build/images. but this is for separate PRs and needs discussion.

Do you have an example of where configs are used this way? It seemed to me to be the convention for docker images to just use env vars. But maybe since there are so many possible values a mountable config makes more sense? It is just harder to mount a separate config than setting 1-10 env vars.

haproxy bind mounts it's config.
https://docs.docker.com/samples/library/haproxy/

in our case its doable with a volume that points to a ConfigMap and the ConfigMap object can be a part of conformance-e2e.yaml.

i do think it makes sense, but maybe it's something to consider for the future.

apiVersion: v1
kind: ConfigMap
metadata:
  name: runner-config
  namespace: conformance
data:
  config.yaml: |
    the-config-goes-here
---
apiVersion: v1
kind: Pod
metadata:
  name: e2e-conformance-test
  namespace: conformance
spec:
  containers:
  - name: conformance-container
    image: k8s.gcr.io/conformance-amd64:v1.14
    imagePullPolicy: IfNotPresent
    volumeMounts:
    - name: output-volume
      mountPath: /tmp/results
    - name: config-volume
      mountPath: /runner/config.yaml
      subPath: config.yaml
  volumes:
  - name: output-volume
    hostPath:
      path: /tmp/results
  - name: config-volume
    configMap:
      name: runner-config
  restartPolicy: Never
  serviceAccountName: conformance-serviceaccount
@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch 2 times, most recently from 41d678e to c6aa152 Jun 24, 2019
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

https://prow.k8s.io/view/gcs/kubernetes-jenkins/pr-logs/pull/79284/pull-kubernetes-bazel-test/1143260929190793217 is the most recent bazel build run; it passes. The github integration is messed up I guess and isn't getting updated after that timeout that is reported still.

@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch from c6aa152 to a8c082f Jun 25, 2019
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

Fixed the header on my makefile; now the only thing I know is an issue is how this PR in k/k is running scripts in k/test-infra which presume how to build this. I don't understand how we would change how to build something if those tests are in another repo and presume to know how to build. this.

@BenTheElder

This comment has been minimized.

Copy link
Member

commented Jun 25, 2019

/test pull-kubernetes-e2e-kind

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

/test pull-kubernetes-integration
/test pull-kubernetes-e2e-gce-100-performance

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

@neolit123 So non-blocking test pull-kubernetes-conformance-image-test won't pass since those scripts live in k/test-infra and wont know about this new go-runner component.

@BenTheElder agreed there isn't a good way to get that to work and just thought I could get this merged and then merge the changes to the scripts accordingly.

Are you OK with that? I can get the PR up in k/test-infra first so they can be reviewed/merge approximately the same time. I figure though that merging the k/k change without the script change would break other users.

I thought that instead, I could merge something into k/test-infra that would try to build the new component and ignore failures. That way it is effectively noop until this goes in, and then it would start working with the new component. The only downside to that is that it would require another PR afterwards to clean up that 'ignore build error' section. But that seems safer to me.

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jun 25, 2019

/test pull-kubernetes-node-e2e-containerd

@neolit123

This comment has been minimized.

Copy link
Member

commented Jun 25, 2019

@neolit123 So non-blocking test pull-kubernetes-conformance-image-test won't pass since those scripts live in k/test-infra and wont know about this new go-runner component.

i think a proper e2e WRT to the image changes should be building the image and testing that particular build.

@k8s-ci-robot k8s-ci-robot removed the lgtm label Jul 1, 2019
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

  • Clarified the name of the run method to be more verbose/clear configureAndRunWithEnv
  • updated a handful of error paths to add more context the the wrapped error
  • added a few more comments
@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch from ceddbc9 to a087585 Jul 1, 2019
@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels Jul 1, 2019
@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Jul 1, 2019

@johnSchnake: ¯\_(ツ)_/¯

In response to this:

/unshrug

Apparently, that is a command :) I didn't realize the bot would take a shrug action.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

So I'll need to debug it but I think the issue is just that git uploaded something that was a symlink and seemed to turn it into a regular file when committing it, hence the one unit test failure.

(sorry for the accidental closure; wrong button to click)

@johnSchnake johnSchnake closed this Jul 1, 2019
@johnSchnake johnSchnake reopened this Jul 1, 2019
@fejta

This comment has been minimized.

Copy link
Contributor

commented Jul 1, 2019

/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Jul 1, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, fejta, johnSchnake, spiffxp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

So it is checked in correctly as a symlink (see https://github.com/kubernetes/kubernetes/blob/a08758598960e442176b9a26ebf2ea1d9ddd10c6/cluster/images/conformance/go-runner/testdata/tartest/file3link) but the problem is that the client that is testing it seems to have core.symlink set to false so the file gets checked out as a regular file with the link test.

Maybe it was just a bad idea to try and unit test that portion; it would also have problems on windows I'm sure so I'll just remove the symlink portion of that test.

@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch from a087585 to fff7c78 Jul 1, 2019
Adds a go app which runs the e2e tests with ginkgo.

 - Supports all the existing env vars of the bash script
 - Improved flow control to avoid and better report issues
regarding the process PID
 - Adds flags for modifying where to find the test binary and
ginkgo binary so that you can run it locally
 - Adds 3 flags for specifying extra args before the double-dash,
extra args after the double-dash, and the seperator to use between
values in those env vars. This allows setting arbitrary, complex
values for use on the command such as flags which include spaces
or other characters.
@johnSchnake johnSchnake force-pushed the johnSchnake:conformanceGoRunner branch from fff7c78 to b3f5a08 Jul 1, 2019
@johnSchnake

This comment has been minimized.

Copy link
Contributor Author

commented Jul 1, 2019

Forgot to add the boilerplate header to a few files; repushed.

@spiffxp

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

/lgtm
/hold
it looks to me like @timothysc's concerns have been addressed but just to be sure, /hold cancel when satisfied

@timothysc

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

/hold cancel

thx @johnSchnake for the updates!

@k8s-ci-robot k8s-ci-robot merged commit e79dcc2 into kubernetes:master Jul 2, 2019
23 checks passed
23 checks passed
cla/linuxfoundation johnSchnake authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Job succeeded.
Details
pull-kubernetes-cross Job succeeded.
Details
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-iscsi Skipped.
pull-kubernetes-e2e-gce-iscsi-serial Skipped.
pull-kubernetes-e2e-gce-storage-slow Skipped.
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-node-e2e-containerd Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
@johnSchnake johnSchnake deleted the johnSchnake:conformanceGoRunner branch Jul 3, 2019
akraino-github pushed a commit to akraino-edge-stack/validation that referenced this pull request Oct 8, 2019
For the time being do not execute e2e tests which assume fixed domain
name [1] conflicting cluster's domain name. Configurability has
been implemented [2] but available only starting from k8s 1.16.

[1] vmware-tanzu/sonobuoy#733
[2] kubernetes/kubernetes#79284

JIRA: VAL-61

Signed-off-by: Juha Kosonen <juha.kosonen@nokia.com>
Change-Id: I9b13dc7a402b3304b2b22a33a6eb8e446d3abcc6
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.