Add a few behavioral e2e tests #387

Merged: 6 commits, Dec 6, 2019

Contributor

xueweiz commented Nov 20, 2019

This PR is part of #296.

This PR does a few things:

  1. Added gomega to vendor.
  2. Added a new Go program, problem-maker, which is used to simulate problems (DockerHung, ext4 error, OOM kill...) for e2e tests.
  3. Added a new rule to detect ext4 errors and warnings.
  4. Added e2e tests for reporting filesystem problems, and allowed the tests to run in parallel.
  5. Added e2e tests for reporting OOM kills and Docker hangs.

Tested via:
make clean && ZONE=us-central1-a PROJECT=xueweiz-experimental IMAGE_FAMILY=cos-73-lts IMAGE_PROJECT=cos-cloud SSH_USER=${USER} SSH_KEY=~/.ssh/id_rsa ARTIFACTS=/tmp/npd make e2e-test

@xueweiz xueweiz force-pushed the xueweiz:test-pr branch 4 times, most recently from 80c14b2 to af22959 Nov 20, 2019
@xueweiz xueweiz changed the title Add behavioral e2e tests for ext4 filesystem problems Add a few behavioral e2e tests Nov 26, 2019
.gitignore
test/e2e/problemmaker/problem_maker.go

// AddFlags adds log counter command line options to pflag.
func (o *options) AddFlags(fs *pflag.FlagSet) {
fs.Float32Var(&o.Rate, "rate", 1.0,

wangzhen127 Nov 26, 2019

Member

Should this be int?

xueweiz Dec 5, 2019

Author Contributor

I was planning to keep it as a float, so that we can allow rates such as "100 problems per minute". If we only allow int, then we could only have "60 problems per minute" or "120 problems per minute".
This is used to stress NPD and measure its CPU usage under heavy log processing. I'm not sure whether "1 per second" is a fine enough granularity.
I don't have a strong opinion though.

wangzhen127 Dec 6, 2019

Member

I am confused. The description of this flag is "Number of times the problem should be generated per second". Even if we use int here, you can still have 100 problems per minute, right?

xueweiz Dec 6, 2019

Author Contributor

"100 problems per minute" is actually "1.667 problems per second" :) which cannot be expressed using an int

wangzhen127 Dec 6, 2019

Member

Oh, I see what you mean now.

test/e2e/problemmaker/problem_maker.go Outdated
test/e2e/problemmaker/problem_maker.go Outdated
test/e2e/metriconly/metrics_test.go
test/e2e/metriconly/e2e_npd_test.go
@xueweiz xueweiz force-pushed the xueweiz:test-pr branch 2 times, most recently from abba864 to 58f989e Dec 5, 2019
Contributor Author

xueweiz commented Dec 5, 2019

Hi @wangzhen127, thanks for the review! I just fixed the problems you mentioned. Could you take another look? Thanks!

Contributor Author

xueweiz commented Dec 5, 2019

The pull-npd-e2e-node test failure is due to kubernetes/kubernetes#85933, not this PR. Fixing it.

Contributor Author

xueweiz commented Dec 5, 2019

kubernetes/kubernetes#85931 should have fixed the problem.
/retest

test/e2e/problemmaker/README.md Outdated
xueweiz added 5 commits Sep 19, 2019
Also added support for running e2e tests in parallel.
Also fixes two minor bugs:

1. Change default Boskos wait timeout to 2 minutes.
This is because the current test timeout is configured to 10 minutes.
Running each test case takes 1-2 minutes, and each node will run 1-2 test
cases. A 5-minute timeout on waiting for Boskos may cause a test timeout,
which we want to avoid.

2. Create artifact subdir with 0755 rather than 0644.
The execute bit must be set on directories so that they can be entered.
@xueweiz xueweiz force-pushed the xueweiz:test-pr branch from 58f989e to 7d28dde Dec 6, 2019
Contributor Author

xueweiz commented Dec 6, 2019

Hi Zhen, thanks for the input! I just fixed the README and replied to the comments above.

(Sorry, some of my replies do not seem to show up on this page. You might have to go to the "Files changed" page to see them.)

Member

wangzhen127 left a comment

/lgtm


@k8s-ci-robot k8s-ci-robot added the lgtm label Dec 6, 2019
Contributor

k8s-ci-robot commented Dec 6, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wangzhen127, xueweiz

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [wangzhen127,xueweiz]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 9d584df into kubernetes:master Dec 6, 2019
9 of 10 checks passed
  • tide: Not mergeable. Retesting: pull-npd-e2e-kubernetes-gce-ubuntu
  • cla/linuxfoundation: xueweiz authorized
  • continuous-integration/travis-ci/pr: The Travis CI build passed
  • pull-npd-build: Job succeeded.
  • pull-npd-e2e-kubernetes-gce-gci: Job succeeded.
  • pull-npd-e2e-kubernetes-gce-gci-custom-flags: Job succeeded.
  • pull-npd-e2e-kubernetes-gce-ubuntu: Job succeeded.
  • pull-npd-e2e-kubernetes-gce-ubuntu-custom-flags: Job succeeded.
  • pull-npd-e2e-node: Job succeeded.
  • pull-npd-test: Job succeeded.
Contributor Author

xueweiz commented Dec 6, 2019

Thanks a lot for the review Zhen!

3 participants