
Debug why some /csi.v1.Node methods are not called #95796

Closed
YiannisGkoufas opened this issue Oct 22, 2020 · 6 comments
Labels
kind/bug: Categorizes issue or PR as related to a bug.
lifecycle/rotten: Denotes an issue or PR that has aged beyond stale and will be auto-closed.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/storage: Categorizes an issue or PR as relevant to SIG Storage.

Comments


YiannisGkoufas commented Oct 22, 2020

Hi! I was working on our CSI S3 implementation (https://github.com/IBM/dataset-lifecycle-framework/tree/master/src/csi-s3) and have successfully tested it in multiple Kubernetes 1.16+ environments without a problem.
However, in a recent deployment I have been running into issues where the attacher is not being called.

Specifically, in a working setting I was seeing logs like this:

I1022 14:47:03.515918       1 utils.go:97] GRPC call: /csi.v1.Controller/CreateVolume
I1022 14:47:03.516230       1 utils.go:98] GRPC request: {"capacity_range":{"required_bytes":5368709120},"name":"pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d","parameters":{"mounter":"goofys"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":5}}]}
I1022 14:47:03.520123       1 controllerserver.go:59] Got a request to create volume pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d
I1022 14:47:04.585273       1 controllerserver.go:88] create volume pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d
I1022 14:47:04.585541       1 utils.go:103] GRPC response: {"volume":{"capacity_bytes":10000000000000,"volume_context":{"mounter":"goofys"},"volume_id":"pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d"}}
I1022 14:47:23.282358       1 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I1022 14:47:23.282392       1 utils.go:98] GRPC request: {}
I1022 14:47:23.283316       1 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1022 14:47:23.297263       1 utils.go:97] GRPC call: /csi.v1.Node/NodeStageVolume
I1022 14:47:23.297292       1 utils.go:98] GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d/globalmount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"mounter":"goofys","storage.kubernetes.io/csiProvisionerIdentity":"1603377947105-8081-ch.ctrox.csi.s3-driver"},"volume_id":"pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d"}
I1022 14:47:23.298971       1 utils.go:103] GRPC response: {}
I1022 14:47:23.306173       1 utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
I1022 14:47:23.306211       1 utils.go:98] GRPC request: {}
I1022 14:47:23.306726       1 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1022 14:47:23.318664       1 utils.go:97] GRPC call: /csi.v1.Node/NodePublishVolume
I1022 14:47:23.318695       1 utils.go:98] GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d/globalmount","target_path":"/var/lib/kubelet/pods/289430f2-cff7-4382-bc15-d947385f5197/volumes/kubernetes.io~csi/pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d/mount","volume_capability":{"AccessType":{"Mount":{}},"access_mode":{"mode":5}},"volume_context":{"mounter":"goofys","storage.kubernetes.io/csiProvisionerIdentity":"1603377947105-8081-ch.ctrox.csi.s3-driver"},"volume_id":"pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d"}
I1022 14:47:23.320385       1 nodeserver.go:77] target /var/lib/kubelet/pods/289430f2-cff7-4382-bc15-d947385f5197/volumes/kubernetes.io~csi/pvc-5ae4381e-8cc0-48cd-864f-a741f91fdc8d/mount
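Before comparing the two runs further, it can help to confirm that the node plugin behind the healthy sequence above actually registered with kubelet. A minimal sketch, assuming default kubelet paths and the driver name visible in the logs (`ch.ctrox.csi.s3-driver`); the node name is illustrative:

```shell
#!/bin/sh
# Run on the worker node (or in a privileged debug pod with the host
# filesystem mounted). Paths below are kubelet defaults and may differ.

# node-driver-registrar creates a registration socket here:
ls -l /var/lib/kubelet/plugins_registry/

# The driver's own gRPC socket usually lives under plugins/:
ls -l /var/lib/kubelet/plugins/

# From anywhere with cluster access: the driver must be listed on the
# CSINode object of the node where the pod is scheduled.
NODE=worker-1   # illustrative node name
kubectl get csinode "$NODE" -o jsonpath='{.spec.drivers[*].name}' \
  | grep -q 'ch.ctrox.csi.s3-driver' && echo "driver registered"
```

If the driver never shows up on the CSINode object, kubelet has no socket to dial, which would explain why no /csi.v1.Node calls arrive at all.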

In the non-working case the logs stop at:

I1022 15:24:39.874091       1 utils.go:103] GRPC response: {"capabilities":[{"Type":{"Rpc":{"type":1}}}]}
I1022 15:26:44.082754       1 utils.go:97] GRPC call: /csi.v1.Controller/CreateVolume
I1022 15:26:44.082787       1 utils.go:98] GRPC request: {"capacity_range":{"required_bytes":5368709120},"name":"pvc-816b6ed1-c2b1-46a2-9089-c3b38a838699","parameters":{"mounter":"goofys"},"secrets":"***stripped***","volume_capabilities":[{"AccessType":{"Mount":{}},"access_mode":{"mode":5}}]}
I1022 15:26:44.085223       1 controllerserver.go:59] Got a request to create volume pvc-816b6ed1-c2b1-46a2-9089-c3b38a838699
I1022 15:26:45.137443       1 controllerserver.go:88] create volume pvc-816b6ed1-c2b1-46a2-9089-c3b38a838699
I1022 15:26:45.137731       1 utils.go:103] GRPC response: {"volume":{"capacity_bytes":10000000000000,"volume_context":{"mounter":"goofys"},"volume_id":"pvc-816b6ed1-c2b1-46a2-9089-c3b38a838699"}}

This means that the methods /csi.v1.Node/NodeGetCapabilities, /csi.v1.Node/NodeStageVolume, etc. are never called, and I am not sure how to debug this. I suspect it might have to do with the socket connection, but I don't know of a way to verify that.

Any help would be highly appreciated!

Environment:
Kubernetes 1.16

@YiannisGkoufas YiannisGkoufas added the kind/bug Categorizes issue or PR as related to a bug. label Oct 22, 2020
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Oct 22, 2020
@k8s-ci-robot
Contributor

@YiannisGkoufas: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@YiannisGkoufas
Author

/sig Storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 22, 2020
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 20, 2021
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 19, 2021
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Projects: None yet
Development: No branches or pull requests
3 participants