kubectl: fix timeout=32s for some rest APIs when --request-timeout=0 #103619

BoleynSu · 2021-07-09T18:02:47Z

According to kubectl options, for --request-timeout, a value
of zero means don't timeout requests. This PR fixes the wrong
behavior.

What type of PR is this?

/kind bug

What this PR does / why we need it:

Please check #103618

Which issue(s) this PR fixes:

Fixes #103618

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Fix kubectl timing out on slow connections.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

According to `kubectl options`, for `--request-timeout`, a value of zero means don't timeout requests. This PR fixes the wrong behavior.

k8s-ci-robot · 2021-07-09T18:02:55Z

Welcome @BoleynSu!

It looks like this is your first PR to kubernetes/kubernetes 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kubernetes has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

k8s-ci-robot · 2021-07-09T18:02:55Z

Hi @BoleynSu. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot · 2021-07-09T18:03:35Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: BoleynSu
To complete the pull request process, please assign lavalamp after the PR has been reviewed.
You can assign the PR to them by writing /assign @lavalamp in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

staging/src/k8s.io/client-go/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

BoleynSu · 2021-07-09T18:22:31Z

This patch is to correct the current behavior. However, as it has been there for 3 years, it is very likely that someone is already depending on the current behavior. If we do not want to change the current behavior, maybe we can change the usage message instead?

fedebongio · 2021-07-13T20:07:49Z

/assign @jpbetz
/cc @roycaihw
/triage accepted

eddiezane · 2021-07-19T20:08:25Z

staging/src/k8s.io/client-go/discovery/discovery_client.go

@@ -462,9 +459,6 @@ func withRetries(maxRetries int, f func() ([]*metav1.APIGroup, []*metav1.APIReso
 func setDiscoveryDefaults(config *restclient.Config) error {
 	config.APIPath = ""
 	config.GroupVersion = nil
-	if config.Timeout == 0 {


To preserve backwards compatibility, we shouldn't remove the default timeout.

What if we add -1 for no timeout instead?

Actually I think this is already the current behavior?

kubectl get nodes --request-timeout=-1s -v10
...
I0719 14:20:47.145520 41053 round_trippers.go:435] curl ... foo.com/api/v1/nodes?limit=500'

it looks like it's an issue specific to the kubectl documentation. Unless there is a mismatch between the client-go documentation and client-go behavior, we shouldn't change client-go.

I believe this is a doc-code mismatch. Other places have the correct behavior while only this one misbehaves.

kubectl options shows the following.

--request-timeout='0': The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don't timeout requests.

Note that I did not really check if other places work as documented. Only that when searching the codebase for this particular flag I did not find any other code using it wrongly.

it looks like it's an issue specific to the kubectl documentation. Unless there is a mismatch between the client-go documentation and client-go behavior, we shouldn't change client-go.

In staging/src/k8s.io/client-go/rest/config.go, we also have

// The maximum length of time to wait before giving up on a server request. A value of zero means no timeout.
Timeout time.Duration

+1 to @roycaihw's comment. If we want to respect the --request-timeout documentation in kubectl (which seems reasonable) we should do it by passing in a value to client-go to tell it to not timeout. We cannot delete (or otherwise change) the default in client-go (which is used by more clients than just kubectl), since that would be a breaking change to the other clients.

I agree with the comments about driving behavior from the kubectl side, rather than here

even in kubectl, I'm not sure removing timeout entirely makes sense... the server will still time out eventually, regardless of what the client does (and at around 30 seconds for short-lived requests, I think, at least for REST API requests)

BoleynSu · 2021-07-20T14:49:18Z

/sig api-machinery

BoleynSu

Friendly ping.

I just noticed that one of my comments has been in pending status for a few days, and I do not know how to make it public, so I updated my last comment instead. PTAL.

BoleynSu · 2021-07-27T04:51:00Z

@jpbetz Please check the last paragraph in the comment above yours.

eddiezane · 2021-07-28T00:30:41Z

I ran through this in a debugger and get what is happening now. @BoleynSu apologies for not understanding earlier.

To summarize:

When running kubectl get nodes --request-timeout=0s --cache-dir=/dev/null, the parsed and merged config does indeed have config.Timeout set to 0 here. The issue is that we are not able to differentiate between a user-supplied value of 0 and Go's zero value of 0, which is why using -1 worked in my testing.

The callstack looks like this:

k8s.io/client-go/discovery.setDiscoveryDefaults(discovery_client.go:462)
k8s.io/client-go/discovery.NewDiscoveryClientForConfig(discovery_client.go:487)
k8s.io/client-go/discovery/cached/disk.NewCachedDiscoveryClientForConfig(cached_discovery.go:281)
k8s.io/cli-runtime/pkg/genericclioptions.(*ConfigFlags).ToDiscoveryClient(config_flags.go:253)
k8s.io/kubectl/pkg/cmd/util.(*MatchVersionFlags).ToDiscoveryClient(kubectl_match_version.go:91)
k8s.io/cli-runtime/pkg/resource.NewBuilder.func1(builder.go:206)

As it stands now we aren't able to set discovery clients (created by NewDiscoveryClientForConfig) to use a timeout of 0.

We aren't able to remove the default or make it a pointer, but maybe we can pass metadata indicating that the timeout was set in the config (like @jpbetz suggested above), or change the expectation so that -1 is used for no timeout.

kubernetes/staging/src/k8s.io/client-go/rest/config.go

Lines 131 to 132 in d92b788

	// The maximum length of time to wait before giving up on a server request. A value of zero means no timeout.
	Timeout time.Duration

@liggitt @soltysh thoughts?

BoleynSu · 2021-07-29T01:08:21Z

IIUC, they are only for create/patch/... which change the data, not for retrieving data. @liggitt

liggitt · 2021-07-29T13:44:28Z

IIUC, they are only for create/patch/... which change the data, not for retrieving data. @liggitt

get requests still have a server-side upper bound timeout (defaulting to 60 seconds, I think)

BoleynSu · 2021-07-30T01:07:45Z

@liggitt I think we only set a timeout for watch requests in get.go. My request to the API server over a very slow connection should timeout whatever parameters I use for the kubectl cli if what you think is true. But it did not.

BoleynSu · 2021-08-08T14:08:05Z

Friendly ping.

k8s-triage-robot · 2021-11-08T17:04:38Z

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

BoleynSu · 2021-11-09T14:55:37Z

Friendly ping.

k8s-triage-robot · 2021-12-09T15:03:11Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

dims · 2022-01-05T16:31:26Z

@fedebongio this probably needs an additional assignee?

k8s-triage-robot · 2022-02-04T17:13:25Z

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen
Mark this issue or PR as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot · 2022-02-04T17:14:45Z

@k8s-triage-robot: Closed this PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kubectl: fix timeout=32s for some rest APIs when --request-timeout=0

df3cfd0

According to `kubectl options`, for `--request-timeout`, a value of zero means don't timeout requests. This PR fixes the wrong behavior.

k8s-ci-robot requested review from jpbetz and soltysh July 9, 2021 18:03

k8s-ci-robot assigned jpbetz Jul 13, 2021

k8s-ci-robot added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Jul 13, 2021

k8s-ci-robot requested a review from roycaihw July 13, 2021 20:07

k8s-ci-robot removed the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 13, 2021

eddiezane reviewed Jul 19, 2021

BoleynSu commented Jul 24, 2021

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 8, 2021

k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 9, 2021

k8s-ci-robot closed this Feb 4, 2022
