
glusterfs log files may become very large if volumes mount failed #68050

Closed
houjun41544 opened this issue Aug 30, 2018 · 28 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. sig/storage Categorizes an issue or PR as relevant to SIG Storage.

Comments

@houjun41544
Contributor

houjun41544 commented Aug 30, 2018

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
When a pod starts and needs to mount a gluster volume, a log file is created to record the errors from the glusterfs mount. The log path is /var/lib/kubelet/plugins/kubernetes.io/glusterfs/[volName]/[podName]-glusterfs.log

We encountered a scenario where mounting the glusterfs volume kept failing because the glusterfs server was unreachable for a long time. Eventually the log file became quite large, as shown below.
[screenshot: oversized glusterfs log file]

When will these log files be cleaned up without manual deletion?

What you expected to happen:

I think we should consider removing the log file after every mount, whether the mount failed or succeeded.

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): v1.10
  • Cloud provider or hardware configuration:
  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. kind/bug Categorizes issue or PR as related to a bug. labels Aug 30, 2018
@houjun41544
Contributor Author

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 30, 2018
@houjun41544 houjun41544 changed the title from "glusterfs log files become very large since mount failed" to "glusterfs log files may become very large if volumes mount failed" Sep 4, 2018
@houjun41544
Contributor Author

@kubernetes/sig-storage-bugs, can you have a look at this?

@k8s-ci-robot
Contributor

@houjun41544: Reiterating the mentions to trigger a notification:
@kubernetes/sig-storage-bugs

In response to this:

@kubernetes/sig-storage-bugs, can you have a look at this?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jsafrane
Member

/assign @humblec

@houjun41544
Contributor Author

houjun41544 commented Sep 19, 2018

@humblec @jsafrane We tried to solve this problem by clearing the gluster log file after it has been read: #68814
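
For reference, a minimal sketch of that approach (illustrative only, not the actual change in #68814): once the log has been read and its tail folded into the mount error, the file is truncated so repeated mount retries cannot keep growing it. The helper name and example paths below are hypothetical.

	// Sketch only: clear the per-pod glusterfs mount log after it has
	// been read, so repeated failed mounts cannot grow it without bound.
	// clearGlusterLog and the example paths are hypothetical.
	package main

	import (
		"fmt"
		"os"
		"path/filepath"
	)

	func clearGlusterLog(pluginDir, volName, podName string) error {
		logPath := filepath.Join(pluginDir, volName, podName+"-glusterfs.log")
		// Truncate instead of removing, so the next mount attempt can
		// still append to the same file.
		if err := os.Truncate(logPath, 0); err != nil && !os.IsNotExist(err) {
			return fmt.Errorf("failed to clear glusterfs log %s: %v", logPath, err)
		}
		return nil
	}

	func main() {
		err := clearGlusterLog("/var/lib/kubelet/plugins/kubernetes.io/glusterfs", "myvol", "mypod")
		if err != nil {
			fmt.Println(err)
		}
	}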

@humblec
Contributor

humblec commented Sep 19, 2018

@houjun41544 One mechanism we normally suggest here is applying log rotation to this path. It should help archive and clear the logs accordingly.
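
For example, a rule along the following lines could be dropped into /etc/logrotate.d/ on the node. The glob, size threshold and rotation count are only illustrative assumptions, not an official recommendation; copytruncate is used because the mount helper may still hold the file open.

	# Illustrative logrotate rule for the per-pod glusterfs mount logs.
	# Adjust the glob and thresholds to your environment.
	/var/lib/kubelet/plugins/kubernetes.io/glusterfs/*/*-glusterfs.log {
	    size 50M
	    rotate 4
	    compress
	    missingok
	    notifempty
	    copytruncate
	}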

I think we should consider removing the log file after every mount, whether the mount failed or succeeded.

Clearing the log after each mount, whether it succeeds or fails, may not be a good idea, as admins may want to track it later.

@houjun41544
Contributor Author

@humblec If the log file is provided by the user, they can archive and clear the logs themselves. But if it is driver-specific, users may not even know the log file exists.

@warmchang
Contributor

Clearing the log after each mount, whether it succeeds or fails, may not be a good idea, as admins may want to track it later.

@humblec
The glusterfs log info is read and stored in the kubelet log, so the admins can track it from the kubelet's log file; the glusterfs log is therefore redundant and can be emptied.

	// Failed mount scenario.
	// Since glusterfs does not return error text
	// it all goes in a log file, we will read the log file
	logErr := readGlusterLog(log, b.pod.Name)
	if logErr != nil {
		return fmt.Errorf("mount failed: %v the following error information was pulled from the glusterfs log to help diagnose this issue: %v", errs, logErr)
	}

@houjun41544
Contributor Author

@humblec In addition, it seems that the log files are never removed, even after the pods and volumes have been deleted.
Why not place the log file in the pod's plugin directory rather than the plugin directory?

@humblec
Contributor

humblec commented Sep 20, 2018

@humblec
The glusterfs log info will be read and stored in the kubelet log, the admins can track it from the kubelet's log file, so the glusterfs log is redundant and can be emptied.

@warmchang Only the last 2 lines are exposed to the kubelet. Most of the time these 2 lines give a clue, but at times the full sequence of events needs to be examined to debug the issue, so the log file can help.
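
For context, a rough sketch of the kind of tail-reading involved; this is not the actual readGlusterLog implementation, just an illustration of why only the last couple of lines end up in the mount error.

	// Illustration only: return the last n lines of a glusterfs mount log.
	// The real readGlusterLog in the glusterfs volume plugin may differ.
	package main

	import (
		"fmt"
		"os"
		"strings"
	)

	func tailLines(path string, n int) ([]string, error) {
		// Note: this reads the whole file into memory; a real
		// implementation would seek near the end of a large log.
		data, err := os.ReadFile(path)
		if err != nil {
			return nil, err
		}
		lines := strings.Split(strings.TrimRight(string(data), "\n"), "\n")
		if len(lines) > n {
			lines = lines[len(lines)-n:]
		}
		return lines, nil
	}

	func main() {
		// Hypothetical path, following the pattern described earlier.
		lines, err := tailLines("/var/lib/kubelet/plugins/kubernetes.io/glusterfs/myvol/mypod-glusterfs.log", 2)
		if err != nil {
			fmt.Println("could not read glusterfs log:", err)
			return
		}
		fmt.Println(strings.Join(lines, "\n"))
	}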

@humblec
Contributor

humblec commented Sep 20, 2018

@humblec In addition, it seems that the log files are never removed, even after the pods and volumes have been deleted.

It seems to me that clearing/removing the log file when the pod is deleted would be better.
@jsafrane Any thoughts on this ?

@warmchang
Contributor

It seems to me that clearing/removing the log file when the pod is deleted would be better.

@humblec The result of this modification is the same. As you described, when the volume mount fails, the admin will check the log file to debug the issue. But unfortunately, the log no longer exists, because it was cleaned/deleted when the pod died.

@admoriarty

Would love to see this issue resolved; it just caused us to start losing a node intermittently due to a DiskPressure threshold being hit, because of months-old logs of this type, the largest being 7G!

@warmchang
Contributor

@admoriarty That's why @houjun41544 submitted this issue, and PR #68814 tries to solve it.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 15, 2019
@houjun41544
Contributor Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 15, 2019
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 15, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 15, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dmoessne

/reopen

@k8s-ci-robot
Contributor

@dmoessne: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dmoessne

So I think this is still an issue, and I do not understand why it was abandoned. Are there at least any recommendations on how to avoid it?

@houjun41544
Contributor Author

/reopen

@k8s-ci-robot
Contributor

@houjun41544: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot reopened this Jul 15, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@koep
Contributor

koep commented Jan 7, 2020

I agree with dmoessne that this issue should be addressed, at least in the form of documentation that recommends steps to mitigate the issue (e.g. logrotate). CC @humblec, any thoughts?
