client-go: add unit test to verify order of calls with retry #108262
Conversation
(force-pushed 396317f to 166c7d2)
(force-pushed 166c7d2 to 6f878de)
@aojea that's great, I will review the retry metric PR you are working on. In the meantime, this unit test PR can still merge; it asserts on the expected behavior with the rate limiter, backoff, and the existing metric call. Can you review this PR please? :)
I think we should test more permutations, something like (it can be simplified.) 😄

diff --git a/staging/src/k8s.io/client-go/rest/request_test.go b/staging/src/k8s.io/client-go/rest/request_test.go
index 49689119223..8ac86355183 100644
--- a/staging/src/k8s.io/client-go/rest/request_test.go
+++ b/staging/src/k8s.io/client-go/rest/request_test.go
@@ -3000,8 +3000,10 @@ func (lb *withRateLimiterBackoffManagerAndMetrics) Do() {
func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc func(ctx context.Context, r *Request)) {
type expected struct {
- attempts int
- order []string
+ attempts int
+ order []string
+ sleeps []string
+ statusCodes []string
}
// we define the expected order of how the client invokes the
@@ -3010,47 +3012,6 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
// - A: original request fails with a retryable response: (500, 'Retry-After: 1')
// - B: retry 1: successful with a status code 200
// so we have a total of 2 attempts
- callOrderExpected := []string{
- // before we send the request to the server:
- // - we wait as dictated by the client rate lmiter
- // - we wait, as dictated by the backoff manager
- "RateLimiter.Wait",
- "BackoffManager.CalculateBackoff",
- "BackoffManager.Sleep",
-
- // A: first attempt for which the server sends a retryable response
- "Client.Do",
-
- // we got a response object, status code: 500, Retry-Afer: 1
- // - call metrics method with appropriate status code
- // - update backoff parameters with the status code returned
- // - sleep for N seconds from 'Retry-After: N' response header
- "RequestResult.Increment",
- "BackoffManager.UpdateBackoff",
- "BackoffManager.Sleep",
- // sleep for delay dictated by backoff parameters
- "BackoffManager.CalculateBackoff",
- "BackoffManager.Sleep",
- // wait as dictated by the client rate lmiter
- "RateLimiter.Wait",
-
- // B: 2nd attempt: retry, and this should return a status code=200
- "Client.Do",
-
- // it's a success, so do the following:
- // - call metrics and update backoff parameters
- "RequestResult.Increment",
- "BackoffManager.UpdateBackoff",
- }
- sleepExpected := []string{
- "0s", // initial backoff.Sleep before we send the request to the server for the first time
- (1 * time.Second).String(), // from 'Retry-After: 1' response header
- (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
- }
- statusCodesExpected := []string{
- "500",
- "200",
- }
tests := []struct {
name string
@@ -3060,7 +3021,76 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
expectations map[string]expected
}{
{
- name: "success after two retries",
+ name: "success",
+ maxRetries: 2,
+ serverReturns: []responseErr{
+ {response: &http.Response{StatusCode: http.StatusOK}, err: nil},
+ },
+ expectations: map[string]expected{
+ "Do": {
+ attempts: 1,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first and only attempt, for which the server sends a success response
+ "Client.Do",
+
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ },
+ statusCodes: []string{
+ "200",
+ },
+ },
+ "Watch": {
+ attempts: 1,
+ // Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
+ order: []string{
+ // before we send the request to the server:
+ // - we wait, as dictated by the backoff manager
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // A: first and only attempt, for which the server sends a success response
+ "Client.Do",
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ },
+ "Stream": {
+ attempts: 1,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first and only attempt, for which the server sends a success response
+ "Client.Do",
+
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ },
+ },
+ },
+ {
+ name: "success after 2 retries",
maxRetries: 2,
serverReturns: []responseErr{
{response: retryAfterResponse(), err: nil},
@@ -3069,19 +3099,358 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
expectations: map[string]expected{
"Do": {
attempts: 2,
- order: callOrderExpected,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first attempt for which the server sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // B: 2nd attempt: retry, and this should return a status code=200
+ "Client.Do",
+
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ },
+ statusCodes: []string{
+ "500",
+ "200",
+ },
},
"Watch": {
attempts: 2,
- // Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
- order: callOrderExpected[1:],
+ order: []string{
+ // before we send the request to the server:
+ // - we wait, as dictated by the backoff manager
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first attempt for which the server sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // B: 2nd attempt: retry, and this should return a status code=200
+ "Client.Do",
+
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ },
+ statusCodes: []string{
+ "500",
+ "200",
+ },
},
"Stream": {
attempts: 2,
- order: callOrderExpected,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first attempt for which the server sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // B: 2nd attempt: retry, and this should return a status code=200
+ "Client.Do",
+
+ // it's a success, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ },
+ statusCodes: []string{
+ "500",
+ "200",
+ },
},
},
},
+ {
+ name: "failure after 2 retries",
+ maxRetries: 2,
+ serverReturns: []responseErr{
+ {response: retryAfterResponse(), err: nil},
+ {response: retryAfterResponse(), err: nil},
+ {response: retryAfterResponse(), err: nil},
+ },
+
+ expectations: map[string]expected{
+ "Do": {
+ attempts: 3,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ "Client.Do",
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ "RateLimiter.Wait",
+ // B: 2nd attempt, for which the server again sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // C: 3rd attempt: retry, and this also returns a status code=500
+ "Client.Do",
+
+ // retries are exhausted, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (4 * time.Minute).String(), // backoff.Sleep before retry 2 (C)
+ },
+ statusCodes: []string{
+ "500",
+ "500",
+ "500",
+ },
+ },
+ "Watch": {
+ attempts: 3,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait, as dictated by the backoff manager
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first attempt for which the server sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // B: 2nd attempt: retry, and this also returns a status code=500
+ "Client.Do",
+
+ // the response is retryable again, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ },
+ statusCodes: []string{
+ "500",
+ "500",
+ },
+ },
+ "Stream": {
+ attempts: 3,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first attempt for which the server sends a retryable response
+ "Client.Do",
+
+ // we got a response object, status code: 500, Retry-After: 1
+ // - call metrics method with appropriate status code
+ // - update backoff parameters with the status code returned
+ // - sleep for N seconds from 'Retry-After: N' response header
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ "BackoffManager.Sleep",
+ // sleep for delay dictated by backoff parameters
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+ // wait as dictated by the client rate limiter
+ "RateLimiter.Wait",
+
+ // B: 2nd attempt: retry, and this also returns a status code=500
+ "Client.Do",
+
+ // the response is retryable again, so do the following:
+ // - call metrics and update backoff parameters
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ (1 * time.Second).String(), // from 'Retry-After: 1' response header
+ (2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
+ },
+ statusCodes: []string{
+ "500",
+ "500",
+ },
+ },
+ },
+ },
+ {
+ name: "do not retry on network errors",
+ maxRetries: 2,
+ serverReturns: []responseErr{
+ {response: nil, err: fmt.Errorf("network error")},
+ },
+ expectations: map[string]expected{
+ "Do": {
+ attempts: 1,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first and only attempt, which fails with a network error
+ "Client.Do",
+
+ // ???
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ },
+ statusCodes: []string{"<error>"},
+ },
+ "Watch": {
+ attempts: 1,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait, as dictated by the backoff manager
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first and only attempt, which fails with a network error
+ "Client.Do",
+
+ // ???
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ },
+ statusCodes: []string{"<error>"}},
+ "Stream": {
+ attempts: 1,
+ order: []string{
+ // before we send the request to the server:
+ // - we wait as dictated by the client rate limiter
+ // - we wait, as dictated by the backoff manager
+ "RateLimiter.Wait",
+ "BackoffManager.CalculateBackoff",
+ "BackoffManager.Sleep",
+
+ // A: first and only attempt, which fails with a network error
+ "Client.Do",
+
+ // ???
+ "RequestResult.Increment",
+ "BackoffManager.UpdateBackoff",
+ },
+ sleeps: []string{
+ "0s", // initial backoff.Sleep before we send the request to the server for the first time
+ },
+ statusCodes: []string{"<error>"}},
+ },
+ },
}
for _, test := range tests {
@@ -3149,11 +3518,11 @@ func testRetryWithRateLimiterBackoffAndMetrics(t *testing.T, key string, doFunc
if !cmp.Equal(expected.order, interceptor.order) {
t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(expected.order, interceptor.order))
}
- if !cmp.Equal(sleepExpected, interceptor.sleeps) {
- t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(sleepExpected, interceptor.sleeps))
+ if !cmp.Equal(expected.sleeps, interceptor.sleeps) {
+ t.Errorf("%s: Expected order of calls to match, diff: %s", key, cmp.Diff(expected.sleeps, interceptor.sleeps))
}
- if !cmp.Equal(statusCodesExpected, interceptor.statusCodes) {
- t.Errorf("%s: Expected status codes to match, diff: %s", key, cmp.Diff(statusCodesExpected, interceptor.statusCodes))
+ if !cmp.Equal(expected.statusCodes, interceptor.statusCodes) {
+ t.Errorf("%s: Expected status codes to match, diff: %s", key, cmp.Diff(expected.statusCodes, interceptor.statusCodes))
}
})
}

is the behavior of "failure after exhausting the retries" correct?
@aojea we have dedicated tests for the scenarios above; this PR adds a new test only to validate the backoff, rate limiter, and metric call invocation.
/triage accepted
Ah ok, I thought it would be good to validate that combo on the different scenarios. If there is no need, it LGTM then.
(force-pushed 6f878de to f6a66bb)
/test pull-kubernetes-e2e-kind
/lgtm
/lgtm
/approve
/hold - for reacting to my comments
"BackoffManager.Sleep",
// sleep for delay dictated by backoff parameters
"BackoffManager.CalculateBackoff",
"BackoffManager.Sleep",
I was surprised too, but I think that one sleep is the "retry-after" sleep and the other the actual backoff:

(1 * time.Second).String(), // from 'Retry-After: 1' response header
(2 * time.Minute).String(), // backoff.Sleep before retry 1 (B)
Line 3045
ok, I've found it
kubernetes/staging/src/k8s.io/client-go/rest/with_retry.go
Lines 171 to 175 in 296bf4f
klog.V(4).Infof("Got a Retry-After %s response for attempt %d to %v", retryAfter.Wait, retryAfter.Attempt, url)
if backoff != nil {
	backoff.Sleep(retryAfter.Wait)
}
return nil
why do we have to sleep on the retryAfter only if backoff exists?
are not both Sleeps independent?
@wojtek-t today, with a default client-go configuration of a noBackoff, it's always BackoffManager.Sleep(0). I think when #106272 merges we can remove the sleep from the Retry-After response.

I am working on a refactor PR where I am trying to put the backoff, rate limiter, and metric calls in a common site so Do, Watch, and Stream can re-use them. I have added a TODO to remove it once #106272 merges.
> I think when #106272 merges we can remove the sleep from the Retry-After response.

I don't think we can just simply remove it - we should respect what kube-apiserver is returning to us (it's always returning 1 now, but we should address that too at some point).
So I think it's not about removing - it's rather about unifying them and making something like:
BackOffManager.Sleep(max(backoffManager.CalculateBackoff(), retryAfter))
Yes, this is more forward-looking; I will bake that into my next PR.
},
"Watch": {
attempts: 2,
// Watch does not do 'RateLimiter.Wait' before initially sending the request to the server
This is another thing that seems strange to me - if it's not throttling before the first request, why is it throttling before retries?
@tkashem - can you please maybe open an issue with those inconsistencies so that we can discuss (and fix, assuming I'm not missing something subtle) those things there?
yes, I will open an issue with this, thanks!
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tkashem, wojtek-t

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold cancel
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
This unit test verifies that Do, DoRaw, Stream, and Watch invoke the flowcontrol.RateLimiter, BackoffManager, and metrics.ResultMetric appropriately and in the right order for retry.

Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
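The verification approach the PR describes - wrapping the rate limiter, backoff manager, and metrics in fakes that record each call - can be sketched as below. This is a simplified, hypothetical stand-in for the test's interceptor; the type and method names are illustrative, not the actual test code.

```go
package main

import "fmt"

// callRecorder is a hypothetical stand-in for the interceptor used in the
// test: every fake dependency appends its method name to a shared slice so
// the test can assert on the exact order of invocations afterwards.
type callRecorder struct {
	order []string
}

func (r *callRecorder) record(name string) {
	r.order = append(r.order, name)
}

// fakeRateLimiter and fakeBackoff mimic the wrappers in the test, which
// record the call before (optionally) delegating to a real implementation.
type fakeRateLimiter struct{ rec *callRecorder }

func (f *fakeRateLimiter) Wait() { f.rec.record("RateLimiter.Wait") }

type fakeBackoff struct{ rec *callRecorder }

func (f *fakeBackoff) CalculateBackoff() { f.rec.record("BackoffManager.CalculateBackoff") }
func (f *fakeBackoff) Sleep()            { f.rec.record("BackoffManager.Sleep") }

func main() {
	rec := &callRecorder{}
	rl := &fakeRateLimiter{rec: rec}
	bo := &fakeBackoff{rec: rec}

	// simulate the pre-request sequence that the test asserts on
	rl.Wait()
	bo.CalculateBackoff()
	bo.Sleep()

	fmt.Println(rec.order) // [RateLimiter.Wait BackoffManager.CalculateBackoff BackoffManager.Sleep]
}
```

Comparing the recorded slice against an expected slice (e.g. with cmp.Diff, as the diff above does) then pins down the exact call order without any timing dependence.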