The CRI v1 implementation of validateServiceConnection does not check for unavailable #114956

Closed
afbjorklund opened this issue Jan 10, 2023 · 3 comments · Fixed by #115102
Labels
area/kubelet kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@afbjorklund

What happened?

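crictl reports a successful connection to the dockershim endpoint, although the subsequent Version request fails because the socket does not exist: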

DEBU[0000] Connected successfully using endpoint: unix:///var/run/dockershim.sock 
DEBU[0000] VersionRequest: &VersionRequest{Version:v1,} 
E1213 20:22:57.677536   53350 remote_runtime.go:145] "Version from runtime service failed" err="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory\""
DEBU[0000] VersionResponse: nil                         

What did you expect to happen?

There is no dockershim.sock, so validating the service connection should fail with an error like this:

ERRO[0000] validate service connection: CRI v1 runtime API is not available for endpoint "unix:///var/run/dockershim.sock": rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/run/dockershim.sock: connect: no such file or directory" 

How can we reproduce it (as minimally and precisely as possible)?

Run crictl v1.26.0 without setting up /etc/crictl.yaml and without passing -r.

Anything else we need to know?

This worked in crictl v1.25.0, which still had CRI v1alpha2.

The following patch fixed the issue and made the endpoint fallbacks work:

--- a/vendor/k8s.io/kubernetes/pkg/kubelet/cri/remote/remote_runtime.go
+++ b/vendor/k8s.io/kubernetes/pkg/kubelet/cri/remote/remote_runtime.go
@@ -122,6 +122,8 @@ func (r *remoteRuntimeService) validateServiceConnection(ctx context.Context, co
 
        } else if status.Code(err) == codes.Unimplemented {
                return fmt.Errorf("CRI v1 runtime API is not implemented for endpoint %q: %w", endpoint, err)
+       } else if status.Code(err) == codes.Unavailable {
+               return fmt.Errorf("CRI v1 runtime API is not available for endpoint %q: %w", endpoint, err)
        }
 
        return nil
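
To illustrate why the extra branch matters for the fallbacks, here is a minimal, self-contained sketch. It does not use the real gRPC or kubelet packages: `code`, `rpcError`, `codeOf`, `validate`, `firstUsable` and `fakeProbe` are hypothetical stand-ins, and the second endpoint in the list is an assumed example rather than something taken from this report. The `checkUnavailable` flag switches the added branch on and off.

```go
package main

import (
	"errors"
	"fmt"
)

// code is a hypothetical stand-in for a gRPC status code.
type code int

const (
	ok code = iota
	unknown
	unimplemented
	unavailable
)

// rpcError is a hypothetical error type that carries a status code.
type rpcError struct {
	code code
	msg  string
}

func (e *rpcError) Error() string { return e.msg }

// codeOf plays the role that status.Code(err) plays in the patch.
func codeOf(err error) code {
	if err == nil {
		return ok
	}
	var re *rpcError
	if errors.As(err, &re) {
		return re.code
	}
	return unknown
}

// validate mirrors the branch structure of the patched function.
// checkUnavailable toggles the branch that the patch adds.
func validate(endpoint string, versionErr error, checkUnavailable bool) error {
	c := codeOf(versionErr)
	if c == unimplemented {
		return fmt.Errorf("CRI v1 runtime API is not implemented for endpoint %q: %w", endpoint, versionErr)
	}
	if c == unavailable && checkUnavailable {
		return fmt.Errorf("CRI v1 runtime API is not available for endpoint %q: %w", endpoint, versionErr)
	}
	return nil
}

// firstUsable tries each endpoint in order and returns the first one
// whose validation succeeds.
func firstUsable(endpoints []string, probe func(string) error, checkUnavailable bool) (string, error) {
	var lastErr error
	for _, ep := range endpoints {
		if err := validate(ep, probe(ep), checkUnavailable); err != nil {
			lastErr = err
			continue
		}
		return ep, nil
	}
	if lastErr == nil {
		return "", errors.New("no endpoints configured")
	}
	return "", fmt.Errorf("no usable endpoint: %w", lastErr)
}

// fakeProbe simulates the Version call: the dockershim socket is missing,
// every other endpoint answers.
func fakeProbe(endpoint string) error {
	if endpoint == "unix:///var/run/dockershim.sock" {
		return &rpcError{code: unavailable, msg: "connect: no such file or directory"}
	}
	return nil
}

func main() {
	endpoints := []string{
		"unix:///var/run/dockershim.sock",
		"unix:///run/containerd/containerd.sock", // assumed example endpoint
	}
	before, _ := firstUsable(endpoints, fakeProbe, false)
	after, _ := firstUsable(endpoints, fakeProbe, true)
	fmt.Println("without the Unavailable check:", before)
	fmt.Println("with the Unavailable check:", after)
}
```

In this sketch, leaving the Unavailable case unchecked means the first endpoint is accepted even though its socket is missing, so the loop never reaches the next candidate. With the check enabled, validation fails for the missing socket and the fallback proceeds.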

Kubernetes version

v1.26.0

Cloud provider

Lima

OS version

PRETTY_NAME="Ubuntu 22.04.1 LTS"

Install tools

Container runtime (CRI) and version (if applicable)

v1.6.12

Related plugins (CNI, CSI, ...) and versions (if applicable)

@afbjorklund afbjorklund added the kind/bug Categorizes issue or PR as related to a bug. label Jan 10, 2023
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Jan 10, 2023
@k8s-ci-robot
Contributor

@afbjorklund: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jan 10, 2023
@afbjorklund
Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jan 10, 2023
@SergeyKanzhelev SergeyKanzhelev added this to Triage in SIG Node Bugs Jan 11, 2023
@saschagrunert
Member

@afbjorklund I agree that we can check for an unavailable connection, because the check only runs on kubelet startup.

@saschagrunert saschagrunert added area/kubelet triage/accepted Indicates an issue or PR is ready to be actively worked on. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Jan 16, 2023
@saschagrunert saschagrunert moved this from Triage to Triaged in SIG Node Bugs Jan 16, 2023
@saschagrunert saschagrunert self-assigned this Jan 16, 2023
saschagrunert added a commit to saschagrunert/kubernetes that referenced this issue Jan 17, 2023
We only have one CRI API (v1) to validate during the initial connection
of the kubelet with the container runtime. Therefore we can now verify
all kinds of gRPC-related issues.

Fixes: kubernetes#114956

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
@SergeyKanzhelev SergeyKanzhelev removed this from Triaged in SIG Node Bugs Jan 18, 2023
karan2704 pushed a commit to karan2704/kubernetes that referenced this issue Feb 6, 2023
We only have one CRI API (v1) to validate during the initial connection
of the kubelet with the container runtime. Therefore we can now verify
all kinds of gRPC-related issues.

Fixes: kubernetes#114956

Signed-off-by: Sascha Grunert <sgrunert@redhat.com>