Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More testcases for GOAWAY #94390

Merged
merged 1 commit into from Sep 4, 2020

Conversation

answer1991
Copy link
Contributor

@answer1991 answer1991 commented Sep 1, 2020

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

  1. Add a testcase(TestGOAWAYConcurrency) to check if GOAWAY affect concurrency requests
  2. Refactor GOAWAY testcases

Which issue(s) this PR fixes:

Fixes #91131

Special notes for your reviewer:

  1. Golang version go1.14.7 linux/amd64(has not fixed GOAWAY issue), testcase TestGOAWAYConcurrency failed with error like:
--- FAIL: TestGOAWAYConcurrency (5.19s)
    goaway_test.go:487: failed to request "/post", err: expect response status code: 200, but got: 500. response body: request declared a Content-Length of 5 but only wrote 0 bytes
    goaway_test.go:487: failed to request "/post", err: expect response status code: 200, but got: 500. response body: request declared a Content-Length of 5 but only wrote 0 bytes
    goaway_test.go:487: failed to request "/post", err: expect response status code: 200, but got: 500. response body: request declared a Content-Length of 5 but only wrote 0 bytes
    goaway_test.go:487: failed to request "/post", err: expect response status code: 200, but got: 500. response body: request declared a Content-Length of 5 but only wrote 0 bytes
  1. Golang version go1.15 linux/amd64, testcase TestGOAWAYConcurrency succeed.

Does this PR introduce a user-facing change?:

None

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

None

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Sep 1, 2020
@k8s-ci-robot
Copy link
Contributor

Hi @answer1991. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. area/apiserver sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Sep 1, 2020
@answer1991
Copy link
Contributor Author

/ok-to-test
/sig api-machinery

@k8s-ci-robot
Copy link
Contributor

@answer1991: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/ok-to-test
/sig api-machinery

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed release-note-none Denotes a PR that doesn't merit a release note. labels Sep 1, 2020
@eddiezane
Copy link
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 1, 2020
@fedebongio
Copy link
Contributor

/assign @deads2k

@answer1991
Copy link
Contributor Author

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@answer1991
Copy link
Contributor Author

As we expected, after upgrading to golang 1.15, GOAWAY feature will not affect concurrency requests. We now are ready to enable GOAWAY feature.

PTAL @lavalamp @deads2k @wojtek-t

GOAWAYResponse = compbasemetrics.NewCounter(
&compbasemetrics.CounterOpts{
Name: "apiserver_goaway_response",
Help: "Number of requests which server response with GOAWAY frame",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

response -> responded

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what time period? if you see the previous 2, they say in last second

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we are adding something new apiserver_goaway_response so this is not just adding test cases. am i reading this right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I used a wrong metrics type. I will fixed it later.

so this is not just adding test cases. am i reading this right? - yes, 2 commits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what time period? if you see the previous 2, they say in last second

The last two metrics are gauges, this is a counter, counters don't have time frames.

However I do think it'd be good to add this metric in a separate PR. I'd like to take the tests first.

The metric is about connections (right?), but the other metrics here are about requests. So it's confusing and I think we should talk about that more. Looks like the tests don't depend on this metric, so hopefully it won't be much trouble to send it in a separate PR.

responseBodySize = len(responseBody)

// requestPostBody is the request body which client must send to test GOAWAY server for POST method,
// otherwise, test GOAWAY server will response 400 HTTP status code.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit: "response" is a noun, "respond" is the verb)

@answer1991
Copy link
Contributor Author

Fixed the nit and removed the metrics commit.

@lavalamp
Copy link
Member

lavalamp commented Sep 3, 2020

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 3, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: answer1991, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 3, 2020
@answer1991
Copy link
Contributor Author

/test pull-kubernetes-e2e-kind

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Sep 4, 2020
@wojtek-t
Copy link
Member

wojtek-t commented Sep 4, 2020

Adding milestone - this is purely test change.

/milestone v1.20

@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Sep 4, 2020
@answer1991
Copy link
Contributor Author

/kind cleanup

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Sep 4, 2020
@k8s-ci-robot k8s-ci-robot merged commit 2373477 into kubernetes:master Sep 4, 2020

// TestGOAWAYConcurrency tests GOAWAY frame will not affect concurrency requests in a single http client instance.
// Known issues in history: https://github.com/kubernetes/kubernetes/issues/91131.
func TestGOAWAYConcurrency(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this test fails 100% of the time when run with go test -race k8s.io/apiserver/pkg/server/filters -run TestGOAWAYConcurrency, disabling in #94529 and opened #94532 to track fixing the test or the component

Copy link
Contributor Author

@answer1991 answer1991 Sep 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

caused by golang http/http2 pkg, I have already opened an issue #41234 at golang repo .

if err != nil {
t.Fatalf("unexpect connection err: %v", err)
}
localAddr = append(localAddr, conn.LocalAddr().String())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this added a data race on unguarded localAddr, fixed in #94530

Copy link
Contributor Author

@answer1991 answer1991 Sep 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I know, it's not the root cause of TestClientReceivedGOAWAY flake, because dial a new connection only when client send a new request, and in this case, requests are sent one by one. In other words, if this line cause data race, then the additional unexpected connection set up, which means the test case or the configuration has problem.

I'm still debug the root cause of TestClientReceivedGOAWAY flake, for now, I doubt that I used an incorrect http2 server/client configuration(http.Transport.MaxConnsPerHost=1 or SETTINGS_MAX_CONCURRENT_STREAMS limited), or the dial function ignored context.Context which will cause more than one goroutine dial the connection when the first canceled.

Another interesting thing is I got a data race result that read at if tc.expectConnections != len(localAddr)(main testing goroutine) and write at localAddr = append(localAddr, conn.LocalAddr().String()), I'm very confused why the client try to set-up a new connection when we finished all requests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GOAWAY may cause client error when client request server in high concurrency
9 participants