Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client-go: add unit test to verify order of calls with retry #108262

Merged
merged 1 commit into from
Feb 23, 2022

Conversation

tkashem
Copy link
Contributor

@tkashem tkashem commented Feb 21, 2022

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

This unit test verifies that Do, DoRaw, Stream, and Watch invoke the flowcontrol.RateLimiter, BackoffManager, and metrics.ResultMetric appropriately and in right order for retry.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. and removed do-not-merge/needs-kind Indicates a PR lacks a `kind/foo` label and requires one. labels Feb 21, 2022
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 21, 2022
@tkashem
Copy link
Contributor Author

tkashem commented Feb 22, 2022

/assign @wojtek-t

@wojtek-t when you have some time, please review it. This will pave the way for adding the new metric to count retry attempts.

@aojea
Copy link
Member

aojea commented Feb 22, 2022

@tkashem Ive added that metric in my pr
#106911

Can we follow up there to not duplicate work?

@tkashem
Copy link
Contributor Author

tkashem commented Feb 22, 2022

@aojea that's great, i will review the retry metric PR you are working on. In the meantime, this unit test PR can still merge, it asserts on the expected behavior with rate limiter, backoff and the existing metric call.

Can you review this PR please? :)

@aojea
Copy link
Member

aojea commented Feb 22, 2022

I think we should test more permutations,

  • success
  • sucess after retries
  • failure after exhausting the retries
  • don't retry on network errors

something like (it can be simplified.) 😄

diff --git a/staging/src/k8s.io/client-go/rest/request_test.go b/staging/src/k8s.io/client-go/rest/request_test.go
index 49689119223..8ac86355183 100644
--- a/staging/src/k8s.io/client-go/rest/request_test.go
+++ b/staging/src/k8s.io/client-go/rest/request_test.go
@@ -3000,8 +3000,10 @@ func (lb *withRateLimiterBackoffManagerAndMetrics) Do() {
 
 func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc func(ctx context.Context, r *Request)) {
        type expected struct {
-               attempts int
-               order    []string
+               attempts    int
+               order       []string
+               sleeps      []string
+               statusCodes []string
        }
 
        // we define the expected order of how the client invokes the
@@ -3010,47 +3012,6 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
        //  - A: original request fails with a retryable response: (500, 'Retry-After: 1')
        //  - B: retry 1: successful with a status code 200
        // so we have a total of 2 attempts
-       callOrderExpected := []string{
-               // before we send the request to the server:
-               // - we wait as dictated by the client rate lmiter
-               // - we wait, as dictated by the backoff manager
-               "RateLimiter.Wait",
-               "BackoffManager.CalculateBackoff",
-               "BackoffManager.Sleep",
-
-               // A: first attempt for which the server sends a retryable response
-               "Client.Do",
-
-               // we got a response object, status code: 500, Retry-Afer: 1
-               //  - call metrics method with appropriate status code
-               //  - update backoff parameters with the status code returned
-               //  - sleep for N seconds from 'Retry-After: N' response header
-               "RequestResult.Increment",
-               "BackoffManager.UpdateBackoff",
-               "BackoffManager.Sleep",
-               // sleep for delay dictated by backoff parameters
-               "BackoffManager.CalculateBackoff",
-               "BackoffManager.Sleep",
-               // wait as dictated by the client rate lmiter
-               "RateLimiter.Wait",
-
-               // B: 2nd attempt: retry, and this should return a status code=200
-               "Client.Do",
-
-               // it's a success, so do the following:
-               //  - call metrics and update backoff parameters
-               "RequestResult.Increment",
-               "BackoffManager.UpdateBackoff",
-       }
-       sleepExpected := []string{
-               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
-               (1 * time.Second).String(), // from 'Retry-After: 1' response header
-               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
-       }
-       statusCodesExpected := []string{
-               "500",
-               "200",
-       }
 
        tests := []struct {
                name          string
@@ -3060,7 +3021,76 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
                expectations map[string]expected
        }{
                {
-                       name:       "success after two retries",
+                       name:       "success",
+                       maxRetries: 2,
+                       serverReturns: []responseErr{
+                               {response: &http.Response{StatusCode: http.StatusOK}, err: nil},
+                       },
+                       expectations: map[string]expected{
+                               "Do": {
+                                       attempts: 1,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate limiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s", // initial backoff.Sleep before we send the request to the server for the first time
+                                       },
+                                       statusCodes: []string{
+                                               "200",
+                                       },
+                               },
+                               "Watch": {
+                                       attempts: 1,
+                                       // Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait, as dictated by the backoff manager
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                               },
+                               "Stream": {
+                                       attempts: 1,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate limiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                               },
+                       },
+               },
+               {
+                       name:       "success after 2 retries",
                        maxRetries: 2,
                        serverReturns: []responseErr{
                                {response: retryAfterResponse(), err: nil},
@@ -3069,19 +3099,358 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
                        expectations: map[string]expected{
                                "Do": {
                                        attempts: 2,
-                                       order:    callOrderExpected,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "200",
+                                       },
                                },
                                "Watch": {
                                        attempts: 2,
-                                       // Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
-                                       order: callOrderExpected[1:],
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait, as dictated by the backoff manager
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "200",
+                                       },
                                },
                                "Stream": {
                                        attempts: 2,
-                                       order:    callOrderExpected,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "200",
+                                       },
                                },
                        },
                },
+               {
+                       name:       "failure after 2 retries",
+                       maxRetries: 2,
+                       serverReturns: []responseErr{
+                               {response: retryAfterResponse(), err: nil},
+                               {response: retryAfterResponse(), err: nil},
+                               {response: retryAfterResponse(), err: nil},
+                       },
+
+                       expectations: map[string]expected{
+                               "Do": {
+                                       attempts: 3,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               "Client.Do",
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               "RateLimiter.Wait",
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (4 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "500",
+                                               "500",
+                                       },
+                               },
+                               "Watch": {
+                                       attempts: 3,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait, as dictated by the backoff manager
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "500",
+                                       },
+                               },
+                               "Stream": {
+                                       attempts: 3,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // we got a response object, status code: 500, Retry-Afer: 1
+                                               //  - call metrics method with appropriate status code
+                                               //  - update backoff parameters with the status code returned
+                                               //  - sleep for N seconds from 'Retry-After: N' response header
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // sleep for delay dictated by backoff parameters
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+                                               // wait as dictated by the client rate lmiter
+                                               "RateLimiter.Wait",
+
+                                               // B: 2nd attempt: retry, and this should return a status code=200
+                                               "Client.Do",
+
+                                               // it's a success, so do the following:
+                                               //  - call metrics and update backoff parameters
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s",                       // initial backoff.Sleep before we send the request to the server for the first time
+                                               (1 * time.Second).String(), // from 'Retry-After: 1' response header
+                                               (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+                                       },
+                                       statusCodes: []string{
+                                               "500",
+                                               "500",
+                                       },
+                               },
+                       },
+               },
+               {
+                       name:       "do not retry on network errors",
+                       maxRetries: 2,
+                       serverReturns: []responseErr{
+                               {response: nil, err: fmt.Errorf("network error")},
+                       },
+                       expectations: map[string]expected{
+                               "Do": {
+                                       attempts: 1,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // ???
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s", // initial backoff.Sleep before we send the request to the server for the first time
+                                       },
+                                       statusCodes: []string{"<error>"},
+                               },
+                               "Watch": {
+                                       attempts: 1,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait, as dictated by the backoff manager
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // ???
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s", // initial backoff.Sleep before we send the request to the server for the first time
+                                       },
+                                       statusCodes: []string{"<error>"}},
+                               "Stream": {
+                                       attempts: 1,
+                                       order: []string{
+                                               // before we send the request to the server:
+                                               // - we wait as dictated by the client rate lmiter
+                                               // - we wait, as dictated by the backoff manager
+                                               "RateLimiter.Wait",
+                                               "BackoffManager.CalculateBackoff",
+                                               "BackoffManager.Sleep",
+
+                                               // A: first attempt for which the server sends a retryable response
+                                               "Client.Do",
+
+                                               // ???
+                                               "RequestResult.Increment",
+                                               "BackoffManager.UpdateBackoff",
+                                       },
+                                       sleeps: []string{
+                                               "0s", // initial backoff.Sleep before we send the request to the server for the first time
+                                       },
+                                       statusCodes: []string{"<error>"}},
+                       },
+               },
        }
 
        for _, test := range tests {
@@ -3149,11 +3518,11 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
                        if !cmp.Equal(expected.order, interceptor.order) {
                                t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(expected.order, interceptor.order))
                        }
-                       if !cmp.Equal(sleepExpected, interceptor.sleeps) {
-                               t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(sleepExpected, interceptor.sleeps))
+                       if !cmp.Equal(expected.sleeps, interceptor.sleeps) {
+                               t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(expected.sleeps, interceptor.sleeps))
                        }
-                       if !cmp.Equal(statusCodesExpected, interceptor.statusCodes) {
-                               t.Errorf("%s: Expected status codes to match, diff: %s", key, cmp.Diff(statusCodesExpected, interceptor.statusCodes))
+                       if !cmp.Equal(expected.statusCodes, interceptor.statusCodes) {
+                               t.Errorf("%s: Expected status codes to match, diff: %s", key, cmp.Diff(expected.statusCodes, interceptor.statusCodes))
                        }
                })
        }

is the behavior of "failure after exhausting the retries" correct?

--- PASS: TestRequestDoRetryWithRateLimiterBackoffAndMetrics (0.00s)
    --- PASS: TestRequestDoRetryWithRateLimiterBackoffAndMetrics/success (0.00s)
    --- PASS: TestRequestDoRetryWithRateLimiterBackoffAndMetrics/success_after_2_retries (0.00s)
    --- PASS: TestRequestDoRetryWithRateLimiterBackoffAndMetrics/failure_after_2_retries (0.00s)
    --- PASS: TestRequestDoRetryWithRateLimiterBackoffAndMetrics/do_not_retry_on_network_errors (0.00s)
PAS

@tkashem
Copy link
Contributor Author

tkashem commented Feb 22, 2022

success
sucess after retries
failure after exhausting the retries
don't retry on network errors

@aojea we have dedicated tests for these above scenarios, this PR adds a new test only to validate on backoff, rate limiter and metric call invocation.

@jpbetz
Copy link
Contributor

jpbetz commented Feb 22, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 22, 2022
@aojea
Copy link
Member

aojea commented Feb 22, 2022

to validate on backoff, rate limiter and metric call invocation.

ah ok, I though it would be good to validate that combo on the different scenarios

If there is no need it LGTM then

@tkashem
Copy link
Contributor Author

tkashem commented Feb 23, 2022

/test pull-kubernetes-e2e-kind

@aojea
Copy link
Member

aojea commented Feb 23, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 23, 2022
Copy link
Member

@wojtek-t wojtek-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

/hold - for reacting on my comments

"BackoffManager.Sleep",
// sleep for delay dictated by backoff parameters
"BackoffManager.CalculateBackoff",
"BackoffManager.Sleep",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this PR, but it's a bit strange to me that we're sleeping twice between two subsequent calls.
I think we should try to simplify it a bit (in a subsequent PR).

@aojea @tkashem - thoughts?

Copy link
Member

@aojea aojea Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised too, but I think that one sleep is the "retry-after" sleep and the other the actual backoff

		(1 * time.Second).String(),
		// backoff.Sleep before retry 1 (B)
		(2 * time.Minute).String(),

Line 3045

Copy link
Member

@aojea aojea Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I've found it

klog.V(4).Infof("Got a Retry-After %s response for attempt %d to %v", retryAfter.Wait, retryAfter.Attempt, url)
if backoff != nil {
backoff.Sleep(retryAfter.Wait)
}
return nil

why do we have to sleep on the retryAfter only if backoff exists?
are not both Sleeps independent?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wojtek-t today, with a default client-go configuration of a noBackoff, it's always BackoffManager.Sleep(0). I think when #106272 merges we can remove the sleep from Retry-After response.

I am working on a refactor PR where I am trying to put the backoff, rate limiter, and metric calls in a common site so Do, Watch, and Stream can use re-use them. I have added a TODO to remove it once #106272 merges.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#106272 merges we can remove the sleep from Retry-After response.

I don't think we can just simply remove it - we should respect what kube-apiserver is returning to us (it's always returning 1 now, but we should address that too at some point).

So I think it's not abot removing - it's rather about unifying them and making something like:
BackOffManager.Sleep(max(backoffManager.CalculateBackoff(), retryAfter))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, this is more forward looking, i will bake that in my next PR

},
"Watch": {
attempts: 2,
// Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another thing that seems strange to me - if it's not throttling before the first request, why it is throttling before retries?

@tkashem - can you please maybe open an issue with those inonsistencies so that we can discuss (and fix assuming I'm not missing something subtle) those things there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I will open an issue with this, thanks!

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 23, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tkashem, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 23, 2022
@tkashem
Copy link
Contributor Author

tkashem commented Feb 23, 2022

@wojtek-t @aojea #108302

@wojtek-t
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 23, 2022
@k8s-ci-robot k8s-ci-robot merged commit 08d3285 into kubernetes:master Feb 23, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Feb 23, 2022
tkashem added a commit to tkashem/kubernetes that referenced this pull request Mar 29, 2022
…test"

This reverts commit 08d3285, reversing
changes made to 6d1c9a9.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants