LOG-3949: Vector not releasing deleted file handles #154

Merged
merged 1 commit into ViaQ:release-5.8 on Oct 1, 2023

@syedriko syedriko commented Sep 28, 2023

For the file and kubernetes_logs sources, this PR introduces a new configuration option, rotate_wait_ms, of type Duration. It determines how long Vector keeps trying to read from a log file that has been deleted (most likely due to log rotation, hence the name). Once that time has elapsed, Vector closes the deleted file's descriptor, allowing the OS to reclaim the storage space the file occupied. The default is practically infinite, which effectively turns the feature off out of the box.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
The PR also introduces a new counter metric, vector_file_deleted_given_up_total.

JIRA:
https://issues.redhat.com/browse/LOG-3949
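
The behavior described above can be illustrated with a minimal Python sketch. This is not Vector's Rust implementation, only a model of the idea under stated assumptions: the class `DeletedFileWatcher`, its `poll` method, the `demo` helper, and the `rotate_wait_s` parameter (seconds rather than milliseconds) are hypothetical names, and `given_up_total` merely stands in for the `vector_file_deleted_given_up_total` counter. It assumes a POSIX system, where an open file can be deleted and stays readable until its descriptor is closed.

```python
import os
import tempfile
import time


class DeletedFileWatcher:
    """Hypothetical model: keep reading a file, give up some time after deletion."""

    def __init__(self, path, rotate_wait_s):
        self.path = path
        self.rotate_wait_s = rotate_wait_s
        self.handle = open(path, "rb")
        self.deleted_at = None   # time at which the deletion was first noticed
        self.given_up_total = 0  # stand-in for vector_file_deleted_given_up_total

    def poll(self, now=None):
        """Read any new bytes; close the handle once the wait has expired."""
        if self.handle is None:
            return b""
        now = time.monotonic() if now is None else now
        data = self.handle.read()
        if os.path.exists(self.path):
            # Simplification: a real tailer would compare file identity
            # (for example inode numbers) instead of checking the path only.
            self.deleted_at = None
        elif self.deleted_at is None:
            self.deleted_at = now
        elif now - self.deleted_at >= self.rotate_wait_s:
            # Closing the descriptor lets the OS reclaim the file's space.
            self.handle.close()
            self.handle = None
            self.given_up_total += 1
        return data


def demo():
    """Walk through deletion and give-up with injected timestamps."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "wb") as f:
            f.write(b"line 1\n")
        watcher = DeletedFileWatcher(path, rotate_wait_s=5.0)
        first = watcher.poll(now=0.0)
        os.remove(path)        # simulate log rotation deleting the file
        watcher.poll(now=1.0)  # deletion noticed here
        watcher.poll(now=3.0)  # only 2 s since deletion: handle stays open
        open_after_3s = watcher.handle is not None
        watcher.poll(now=6.0)  # 5 s since deletion: handle is closed
        open_after_6s = watcher.handle is not None
        return first, open_after_3s, open_after_6s, watcher.given_up_total


if __name__ == "__main__":
    print(demo())
```

With a practically infinite wait, as in the PR's default, the final branch in `poll` is never reached and the handle stays open, which matches the description of the feature being off out of the box.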

openshift-ci-robot commented Sep 28, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait, of type Duration, defaulting to 5 seconds. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait

JIRA:
https://issues.redhat.com/browse/LOG-3949

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot commented Sep 28, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait, of type Duration, defaulting to 5 seconds. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
There is also a new metric being introduced - vector_file_deleted_given_up_total, which is a counter.

JIRA:
https://issues.redhat.com/browse/LOG-3949

openshift-ci-robot commented Sep 28, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait_ms, of type Duration, defaulting to 5,000 milliseconds. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
There is also a new metric being introduced - vector_file_deleted_given_up_total, which is a counter.

JIRA:
https://issues.redhat.com/browse/LOG-3949

openshift-ci-robot commented Sep 28, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait_ms, of type Duration, defaulting to 5000 milliseconds. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
There is also a new metric being introduced - vector_file_deleted_given_up_total, which is a counter.

JIRA:
https://issues.redhat.com/browse/LOG-3949

@syedriko force-pushed the syedriko-log-3949 branch 2 times, most recently from 3cfa7d7 to 63662e2, on September 28, 2023 03:53
@syedriko
Author

/test unit

@syedriko
Author

/assign @jcantrill

Member

@jcantrill jcantrill left a comment

/hold
/approve

src/sources/file.rs (outdated, resolved)
src/sources/kubernetes_logs/mod.rs (outdated, resolved)

openshift-ci bot commented Sep 29, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcantrill, syedriko

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@syedriko
Author

@jcantrill PTAL

openshift-ci-robot commented Sep 30, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait_ms, of type Duration, defaulting to practical infinity. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
There is also a new metric being introduced - vector_file_deleted_given_up_total, which is a counter.

JIRA:
https://issues.redhat.com/browse/LOG-3949

openshift-ci-robot commented Sep 30, 2023

@syedriko: This pull request references LOG-3949 which is a valid jira issue.

In response to this:

For the file and kubernetes_logs sources, introduced a new configuration variable, rotate_wait_ms, of type Duration, defaulting to practical infinity. Out of the box, this default effectively turns this feature off. It determines how long vector is going to keep trying to read from a log file that has been deleted (most likely due to log rotation, hence the name of the variable). Once that time span has expired, vector closes the file descriptor of the deleted file, thus allowing the OS to reclaim the storage space occupied by the file.
This behavior is similar to that of Fluentd's tail plugin: https://docs.fluentd.org/input/tail#rotate_wait
There is also a new metric being introduced - vector_file_deleted_given_up_total, which is a counter.

JIRA:
https://issues.redhat.com/browse/LOG-3949

@jcantrill
Member

/hold cancel
/lgtm

@openshift-merge-robot openshift-merge-robot merged commit 01b49d0 into ViaQ:release-5.8 Oct 1, 2023
7 checks passed