Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes MountVolume.NewMounter errors not displayed to users via describe events #42006

Merged
merged 1 commit into from Mar 24, 2017

Conversation

screeley44
Copy link
Contributor

@screeley44 screeley44 commented Feb 23, 2017

Fixes #42004

This fixes the problem of mount errors being eaten and not displayed to users again. Specifically erros caught in MountVolume.NewMounter (like missing endpoints, etc...)

Current behavior for any mount failure:

Events:
  FirstSeen    LastSeen    Count    From            SubObjectPath    Type        Reason        Message
  ---------    --------    -----    ----            -------------    --------    ------        -------
  12m        12m        1    default-scheduler            Normal        Scheduled    Successfully assigned glusterfs-bb-pod1 to 127.0.0.1
  10m        1m        5    kubelet, 127.0.0.1            Warning        FailedMount    Unable to mount volumes for pod "glusterfs-bb-pod1_default(67c9dfa7-f9f5-11e6-aee2-5254003a59cf)": timeout expired waiting for volumes to attach/mount for pod "default"/"glusterfs-bb-pod1". list of unattached/unmounted volumes=[glusterfsvol]
  10m        1m        5    kubelet, 127.0.0.1            Warning        FailedSync    Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"glusterfs-bb-pod1". list of unattached/unmounted volumes=[glusterfsvol]

New Behavior:

For example on glusterfs - deliberately didn't create endpoints, now correct message is displayed:

Events:
  FirstSeen	LastSeen	Count	From			SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----			-------------	--------	------		-------
  2m		2m		1	default-scheduler			Normal		Scheduled	Successfully assigned glusterfs-bb-pod1 to 127.0.0.1
  54s		54s		1	kubelet, 127.0.0.1			Warning		FailedMount	Unable to mount volumes for pod "glusterfs-bb-pod1_default(8edd2c25-fa09-11e6-92ae-5254003a59cf)": timeout expired waiting for volumes to attach/mount for pod "default"/"glusterfs-bb-pod1". With error timed out waiting for the condition. list of unattached/unmounted volumes=[glusterfsvol]
  54s		54s		1	kubelet, 127.0.0.1			Warning		FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "default"/"glusterfs-bb-pod1". With error timed out waiting for the condition. list of unattached/unmounted volumes=[glusterfsvol]
  2m		6s		814	kubelet, 127.0.0.1			Warning		FailedMount	MountVolume.NewMounter failed for volume "kubernetes.io/glusterfs/8edd2c25-fa09-11e6-92ae-5254003a59cf-glusterfsvol" (spec.Name: "glusterfsvol") pod "8edd2c25-fa09-11e6-92ae-5254003a59cf" (UID: "8edd2c25-fa09-11e6-92ae-5254003a59cf") with: endpoints "glusterfs-cluster" not found

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 23, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. release-note-label-needed labels Feb 23, 2017
@screeley44 screeley44 changed the title Fixes mount errors being eaten again and not displayed to users via describe events Fixes MountVolume.Setup errors not displayed to users via describe events Feb 23, 2017
@screeley44
Copy link
Contributor Author

@kubernetes/sig-storage @saad-ali cc

@screeley44 screeley44 changed the title Fixes MountVolume.Setup errors not displayed to users via describe events Fixes MountVolume.NewMounter errors not displayed to users via describe events Feb 23, 2017
@@ -364,9 +364,10 @@ func (vm *volumeManager) WaitForAttachAndMount(pod *v1.Pod) error {
}

return fmt.Errorf(
"timeout expired waiting for volumes to attach/mount for pod %q/%q. list of unattached/unmounted volumes=%v",
"timeout expired waiting for volumes to attach/mount for pod %q/%q. With error %v. list of unattached/unmounted volumes=%v",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/With/with ?

Another question is what will be the exact err here, is it also time expired? Is it duplicate with the beginning of this error message timeout expired waiting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gyliu513 - yeah, I don't think I need this part of the change, from what I can tell it is a duplicate "timeout waiting for condition" error - I will remove this.

Copy link
Contributor

@gyliu513 gyliu513 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm thanks @screeley44

@gnufied
Copy link
Member

gnufied commented Feb 27, 2017

/approve

@screeley44
Copy link
Contributor Author

@k8s-bot verify test this

@screeley44 screeley44 force-pushed the error-events3 branch 3 times, most recently from 50da219 to e48c663 Compare March 1, 2017 17:59
@screeley44
Copy link
Contributor Author

@k8s-bot test this: #41889

@screeley44
Copy link
Contributor Author

@k8s-bot cvm gce e2e test this: issue #41889

@screeley44
Copy link
Contributor Author

@k8s-bot gce etcd3 e2e test this: issue #41898

@screeley44
Copy link
Contributor Author

@k8s-bot node e2e test this

@screeley44
Copy link
Contributor Author

@k8s-bot cvm gce e2e test this: issue #42440

@screeley44
Copy link
Contributor Author

@k8s-bot gce etcd3 e2e test this: issue #41889

@screeley44
Copy link
Contributor Author

@saad-ali - can you approve and lgtm this simple fix ... thanks!

@screeley44
Copy link
Contributor Author

@jsafrane - can you take a look at this simple fix and approve if it looks ok to you

@gnufied
Copy link
Member

gnufied commented Mar 6, 2017

cc @jingxu97 can we merge this? it looks good to me.

@jingxu97
Copy link
Contributor

jingxu97 commented Mar 6, 2017

make sure to test this fix.
/lgtm

@k8s-ci-robot
Copy link
Contributor

@jingxu97: you can't LGTM a PR unless you are an assignee.

In response to this comment:

make sure to test this fix.
/lgtm

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jingxu97
Copy link
Contributor

jingxu97 commented Mar 6, 2017

/approve

@jsafrane jsafrane added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels Mar 7, 2017
@jsafrane
Copy link
Member

jsafrane commented Mar 7, 2017

/approve

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

The following people have approved this PR: gnufied, jingxu97, jsafrane, screeley44

Needs approval from an approver in each of these OWNERS Files:

We suggest the following people:
cc @saad-ali
You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 7, 2017
@jsafrane jsafrane added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 7, 2017
@jsafrane
Copy link
Member

jsafrane commented Mar 7, 2017

marked as lgtm on behalf of @jingxu97

@jsafrane
Copy link
Member

jsafrane commented Mar 7, 2017

@k8s-bot kops aws e2e test this

@screeley44
Copy link
Contributor Author

@jsafrane can this be merged or is queue closed until next release?

@gnufied
Copy link
Member

gnufied commented Mar 10, 2017

We need someone to mark this for 1.6 to be merged right now. I think this is small enough change that we can ship in 1.6. cc @jsafrane @childsb

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 42522, 42545, 42556, 42006, 42631)

@k8s-github-robot k8s-github-robot merged commit 803369b into kubernetes:master Mar 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

MountVolume errors being eaten and not showing in describe events for MountVolume.NewMounter
9 participants