e2e node pod overhead #88558

Merged
merged 1 commit into from Mar 6, 2020

Conversation

egernst
Contributor

@egernst egernst commented Feb 26, 2020

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Adds e2e_node test for the PodOverhead feature, and helps ensure this feature behaves as expected.

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Feb 26, 2020
@egernst egernst changed the title from "E2e node pod overhead" to "e2e node pod overhead" Feb 26, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 26, 2020
@k8s-ci-robot k8s-ci-robot added area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/test sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 26, 2020
@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 26, 2020
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Feb 26, 2020
@egernst egernst force-pushed the e2e_node-PodOverhead branch 2 times, most recently from 5c6206d to b7b9a13 Compare February 26, 2020 05:45
@egernst
Contributor Author

egernst commented Feb 26, 2020

/test pull-kubernetes-e2e-kind-ipv6

@BenTheElder
Member

I have a bit less context on what a node_e2e test should ideally look like, so I would like SIG Node to review this.
/uncc
[will observe though!]

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. sig/network Categorizes an issue or PR as relevant to SIG Network. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 27, 2020
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Feb 27, 2020
Containers: []v1.Container{
	{
		Image: busyboxImage,
		Name:  "container" + string(uuid.NewUUID()),
Member

why randomize the container name?

Contributor Author

Do e2e_node tests run in parallel? I suppose that since it isn't a "real" cluster, we shouldn't expect a pod with this name to be running already?

I saw this was best practice in the scheduler e2e tests.

Member

The container name doesn't need to be unique, just the pod name.
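
The point made in this thread can be sketched as follows: pod names must be unique within a namespace, whereas a container name only has to be unique among the containers of its own pod. This is a minimal, standalone illustration and not the code from the PR. The `pod` and `container` structs are hypothetical stand-ins for the Kubernetes types, and `uniqueSuffix`, `newTestPod`, and the `overhead-pod-` prefix are made-up names.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// container and pod are hypothetical, trimmed-down stand-ins for the
// Kubernetes API types; they exist only to keep this sketch self-contained.
type container struct {
	Name  string
	Image string
}

type pod struct {
	Name       string
	Containers []container
}

// uniqueSuffix returns 4 random bytes encoded as 8 hex characters.
func uniqueSuffix() string {
	b := make([]byte, 4)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// newTestPod randomizes only the pod name. The container name is fixed,
// because it only needs to be unique within its own pod.
func newTestPod(image string) pod {
	return pod{
		Name:       "overhead-pod-" + uniqueSuffix(),
		Containers: []container{{Name: "container", Image: image}},
	}
}

func main() {
	a, b := newTestPod("busybox"), newTestPod("busybox")
	fmt.Println("pod names:", a.Name, b.Name)
	fmt.Println("container names:", a.Containers[0].Name, b.Containers[0].Name)
}
```

Two pods created this way get different pod names but can safely share the same container name.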

@egernst
Contributor Author

egernst commented Feb 28, 2020

Updated, PTAL @tallclair

@mrunalp
Contributor

mrunalp commented Feb 28, 2020

@derekwaynecarr ptal

Member

@tallclair tallclair left a comment

Just a few nits

This test will verify that the Pod cgroup created takes Overhead into
account.

Signed-off-by: Eric Ernst <eric@amperecomputing.com>
Member

@tallclair tallclair left a comment

/lgtm
/approve

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 28, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: egernst, tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 28, 2020
@k8s-ci-robot
Contributor

k8s-ci-robot commented Feb 28, 2020

@egernst: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kubernetes-e2e-gce-csi-serial | bd3be7d6134f87de0d05a4b1510a54b14237c2e4 | link | /test pull-kubernetes-e2e-gce-csi-serial |
| pull-kubernetes-e2e-gce-storage-slow | bd3be7d6134f87de0d05a4b1510a54b14237c2e4 | link | /test pull-kubernetes-e2e-gce-storage-slow |
| pull-kubernetes-e2e-gce-storage-snapshot | bd3be7d6134f87de0d05a4b1510a54b14237c2e4 | link | /test pull-kubernetes-e2e-gce-storage-snapshot |
| pull-kubernetes-e2e-gce-alpha-features | afe5c71807f2a08fee36d01ac9d6bd3ce8123a05 | link | /test pull-kubernetes-e2e-gce-alpha-features |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@egernst
Contributor Author

egernst commented Feb 28, 2020

/test pull-kubernetes-e2e-kind-ipv6

@tallclair
Member

/milestone v1.18

@k8s-ci-robot k8s-ci-robot added this to the v1.18 milestone Mar 5, 2020
@tallclair tallclair added kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 5, 2020
@egernst
Contributor Author

egernst commented Mar 5, 2020

/test tide ??

Member

@derekwaynecarr derekwaynecarr left a comment

I need to try this, I am confused by my own math.

"github.com/onsi/ginkgo"
)

// makePodToVerifyCgroups returns a pod that verifies the existence of the specified cgroups.
Member

nit: "makePodToVerifyCgroupSize"

	podUID  string
	handler string
)
ginkgo.By("Creating a RuntimeClass with Overhead definied", func() {
Member

nit: "defined"

})
ginkgo.By("Checking if the pod cgroup was created appropriately", func() {
	cgroupsToVerify := []string{"pod" + podUID}
	pod := makePodToVerifyCgroupSize(cgroupsToVerify, "30000", "251658240")
Member

just tracking my mental math...

200m+100m=300m cpu

3000 * 1024 / 1000 = 3072

is what i would have expected for cpu.

Contributor

this is checking cpu.cfs_quota_us not cpu.shares. however, this test should also check cpu.shares.

Contributor Author

That would be if you were checking shares, which isn't the relevant setting. Kubelet will set cpu.cfs_quota_us to enforce the CPU limit (which should correspond to the total limits+pod overhead) -- see the check at [1].

CPU quota is expressed relative to a period (set to 100,000 microseconds), so we are checking for a quota of 30,000, which effectively limits the pod to 300 milliCPU.

[1] - https://github.com/kubernetes/kubernetes/pull/88558/files#diff-c3ca917e18d307eabb48f941504b035eR53
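
The arithmetic discussed in this thread can be sketched as follows. This is a minimal illustration and not the kubelet's implementation. The function names are hypothetical, the 100,000 period and the 200m + 100m total are taken from the comments above, and the 1024-per-CPU factor is the one used in the shares calculation quoted earlier. The last line only shows that the second expected value in the test snippet, 251658240, is 240 MiB expressed in bytes.

```go
package main

import "fmt"

const (
	cfsPeriodUs  = 100000 // CFS period from the comment above, in microseconds
	milliPerCPU  = 1000   // 1 CPU = 1000 milliCPU
	sharesPerCPU = 1024   // cpu.shares value that corresponds to 1 CPU
)

// milliCPUToQuota converts a CPU limit in milliCPU to a cpu.cfs_quota_us
// value for the given period (hypothetical helper for illustration).
func milliCPUToQuota(milliCPU, periodUs int64) int64 {
	return milliCPU * periodUs / milliPerCPU
}

// milliCPUToShares converts milliCPU to a cpu.shares value
// (hypothetical helper for illustration).
func milliCPUToShares(milliCPU int64) int64 {
	return milliCPU * sharesPerCPU / milliPerCPU
}

// mibToBytes converts mebibytes to bytes.
func mibToBytes(mib int64) int64 {
	return mib << 20
}

func main() {
	total := int64(200 + 100) // the two figures summed in the review: 300m

	fmt.Println(milliCPUToQuota(total, cfsPeriodUs)) // 30000
	fmt.Println(milliCPUToShares(total))             // 307
	fmt.Println(mibToBytes(240))                     // 251658240
}
```

With these numbers, the quota check for 300 milliCPU expects 30000, while a shares check for the same total would expect 307.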

Contributor Author

@egernst egernst Mar 5, 2020

Shameless plug to gist @mcastelino put together, discussing shares v. quota: https://gist.github.com/mcastelino/b8ce9a70b00ee56036dadd70ded53e9f#what-happens

Member

Thanks. I saw CFS and mentally said shares and started doing math. Value for quota is right.

@k8s-ci-robot k8s-ci-robot merged commit e23e720 into kubernetes:master Mar 6, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/e2e-test-framework Issues or PRs related to refactoring the kubernetes e2e test framework area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

7 participants