Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make integration test work with go test. #64803

Merged
merged 1 commit into from
Sep 25, 2018

Conversation

misterikkit
Copy link

@misterikkit misterikkit commented Jun 6, 2018

What this PR does / why we need it:

The EtcdMain function in integration tests is designed to launch an etcd
instance only when running in a bazel test. For non-bazel, we rely on
hack/make-rules/test-integration.sh to bring up the etcd instance.

This patch fixes the following in EtcdMain:

  1. If etcd is not found in ${RUNFILES_DIR} then look in ${PATH}.
  2. Try to connect to the etcd started by make test-integraion; if it
    is up, then don't start etcd.
  3. Gracefully shut down etcd after tests.
  4. Don't use sync.Once.

The benefit of this change is that integration tests work with go test
as well as make test-integration without users needing to do anything
special. That makes it much easier to pass go testing flags to tests and
integrate with IDEs.

Special notes for your reviewer:
I made sure this does not break bazel test.

Release note:

NONE

/kind cleanup
/sig testing
@kubernetes/sig-testing-pr-reviews
/cc @kubernetes/sig-contributor-experience-pr-reviews

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. sig/testing Categorizes an issue or PR as relevant to SIG Testing. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 6, 2018

func setupETCD() {
etcdSetup.Do(func() {
if os.Getenv("RUNFILES_DIR") == "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RUNFILES_DIR is used when this is run under bazel. This should probably be documented better (at least have a comment), but there's unfortunately not really a better way to pass the path of this dependency, since this is implemented in a library that's used by a bunch of individual go tests. I'd rather not break this functionality.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I had no idea about RUNFILES_DIR. I'll fix this PR to not break that. Testing it with this command.

 bazel test //test/integration/scheduler:go_default_test --test_arg=-test.v --test_arg=-test.run=xxx --test_output=all

@ixdy
Copy link
Member

ixdy commented Jun 7, 2018

Also, kubernetes is moving in the direction of not supporting bare go build / go test - we have already removed a few generated files from the tree, and are likely to remove more soon.

@misterikkit
Copy link
Author

Also, kubernetes is moving in the direction of not supporting bare go build / go test - we have already removed a few generated files from the tree, and are likely to remove more soon.

I understand that this is not the supported path or the way of the future. However, it did seem like low-hanging fruit that would speed up my dev->test cycles. I was hoping that it would be an acceptable change.

I'm a bit sad that go build / go test may stop working, because make and bazel are slower and don't integrate as nicely with IDEs. This is clearly not the place for me to plead that case, however.

@timothysc timothysc removed their request for review June 7, 2018 13:47
@ixdy
Copy link
Member

ixdy commented Jun 7, 2018

if we don't break any existing functionality, I'm fine with this PR.

It's worth noting that better IDE support is something that's on the radar for the bazel-go integration, though it's not ready yet.

@misterikkit
Copy link
Author

if we don't break any existing functionality, I'm fine with this PR.

Thanks. I confirmed that my fixup patch restores bazel functionality. I tested with this command.

 bazel test //test/integration/scheduler:go_default_test --test_arg=-test.v --test_arg=-test.run=xxx --test_output=all

I will squash commits and update the commit message next.

@misterikkit
Copy link
Author

@ixdy are you the right person to review, now that I've fixed the bazel part?

fmt.Fprintf(os.Stderr, installEtcd)
return nil, fmt.Errorf("could not find etcd in PATH: %v", err)
}
// TODO: Pick an unused port.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this part worries me a little bit. since bazel may run multiple integration tests in parallel, we want to keep the etcd instances separate, hence previously it used a constant "random" port. without any means to ensure this port is unused, this seems like it will cause test flakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can put it back to the way it was, but IMO the previous behavior encodes assumptions about argv[0] that are not obvious. How do I know if argv[0] == 'make' or 'bazel' or 'go'?

I was thinking that the classic, "get a free port from the kernel and then release it" method would be more suitable.

Or I can put the old behavior back.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think argv[0] would be the name of the go test that was compiled, e.g. under bazel it might be something like bazel-bin/test/integration/auth/linux_amd64_stripped/go_default_test.

I'm not sure whether this does the right thing under just go test though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though yes, picking an actual free port would be best. the current method of using the test name doesn't help if we run 10 instances of the same test in parallel, e.g. to shake out flakes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the feedback!

@ixdy
Copy link
Member

ixdy commented Jun 12, 2018

/test pull-kubernetes-bazel-test-integration-canary

@ixdy
Copy link
Member

ixdy commented Jun 12, 2018

/retest

@AishSundar
Copy link
Contributor

/retest

@misterikkit
Copy link
Author

This PR still needs approval. Can anyone from @kubernetes/sig-testing-pr-reviews take a look?

@cblecker
Copy link
Member

/test pull-kubernetes-bazel-test-integration-canary

@ixdy
Copy link
Member

ixdy commented Aug 15, 2018

ack, this fell off my radar, sorry about that. I'll add it to my queue for another look.

@ixdy
Copy link
Member

ixdy commented Aug 15, 2018

/assign

@misterikkit
Copy link
Author

/retest

@misterikkit
Copy link
Author

I am getting the same error on pull-kubernetes-bazel-test-integration-canary from multiple runs, but I don't understand how it could be related to my change.

I0823 17:16:46.466] Call:  gsutil -q -h Content-Type:application/json cp /tmp/gsutil_jQeLMu gs://kubernetes-jenkins/pr-logs/pull/64803/pull-kubernetes-bazel-test-integration-canary/6/started.json
I0823 17:16:53.641] process 349 exited with code 0 after 0.1m
I0823 17:16:53.642] Call:  gsutil -q -h Content-Type:text/plain -h 'x-goog-meta-link: gs://kubernetes-jenkins/pr-logs/pull/64803/pull-kubernetes-bazel-test-integration-canary/6' cp /tmp/gsutil_9_AaPO gs://kubernetes-jenkins/pr-logs/directory/pull-kubernetes-bazel-test-integration-canary/6.txt
I0823 17:16:55.863] process 530 exited with code 0 after 0.0m
I0823 17:16:55.865] Call:  /workspace/./test-infra/jenkins/../scenarios/kubernetes_execute_bazel.py --test=//test/integration/... --test-args=--config=integration --timeout=60
W0823 17:16:55.889] usage: kubernetes_execute_bazel.py [-h] cmd [args [args ...]]
W0823 17:16:55.890] kubernetes_execute_bazel.py: error: too few arguments
E0823 17:16:55.892] Command failed
I0823 17:16:55.892] process 711 exited with code 2 after 0.0m
E0823 17:16:55.893] FAIL: pull-kubernetes-bazel-test-integration-canary
I0823 17:16:55.893] Call:  gcloud auth activate-service-account --key-file=/etc/service-account/service-account.json
W0823 17:16:56.384] Activated service account credentials for: [pr-kubekins@kubernetes-jenkins-pull.iam.gserviceaccount.com]
I0823 17:16:56.443] process 712 exited with code 0 after 0.0m
I0823 17:16:56.443] Call:  gcloud config get-value account
I0823 17:16:56.799] process 725 exited with code 0 after 0.0m
I0823 17:16:56.799] Will upload results to gs://kubernetes-jenkins/pr-logs using pr-kubekins@kubernetes-jenkins-pull.iam.gserviceaccount.com

@BenTheElder
Copy link
Member

that job is not required and probably broken

@ixdy
Copy link
Member

ixdy commented Sep 1, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Sep 1, 2018
if err != nil {
glog.Fatalf("Failed to run etcd: %v", err)
glog.Warningf("error during etcd cleanup: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried running this locally and I'm seeing this constantly warning here with

W0901 00:06:50.067389      15 etcd.go:118] error during etcd cleanup: remove /tmp/integration_test_etcd_data551762727: directory not empty

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

@cblecker
Copy link
Member

cblecker commented Sep 1, 2018

/skip

@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

1 similar comment
@fejta-bot
Copy link

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel comment for consistent failures.

The EtcdMain function in integration tests is designed to launch an etcd
instance only when running in a bazel test. For non-bazel, we rely on
hack/make-rules/test-integration.sh to bring up the etcd instance.

This patch fixes the following in EtcdMain:
1. If etcd is not found in ${RUNFILES_DIR} then look in ${PATH}.
2. Try to connect to the etcd started by `make test-integraion`; if it
   is up, then don't start etcd.
3. Gracefully shut down etcd after tests.
4. Get a port from the OS instead of deriving it from argv[0].
5. Don't use sync.Once.

The benefit of this change is that integration tests work with `go test`
as well as `make test-integration` without users needing to do anything
special. That makes it much easier to pass go testing flags to tests and
integrate with IDEs.
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 10, 2018
@dims
Copy link
Member

dims commented Sep 24, 2018

/lgtm

(re-apply lgtm from @ixdy)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 24, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dims, ixdy, misterikkit

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Sep 25, 2018

@misterikkit: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-kubernetes-bazel-test-integration-canary c974d0dc50cff38e77fbe9e803e9f530cf3c4516 link /test pull-kubernetes-bazel-test-integration-canary

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@ixdy
Copy link
Member

ixdy commented Sep 25, 2018

/retest

@k8s-ci-robot k8s-ci-robot merged commit 6b1af84 into kubernetes:master Sep 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants