[pkg/stanza] Functionality to delete files once processed #16314

Closed
schmikei opened this issue Nov 15, 2022 · 3 comments · Fixed by #17839
Labels: enhancement, needs triage, pkg/stanza, receiver/filelog

Comments

@schmikei
Contributor

schmikei commented Nov 15, 2022

Component(s)

pkg/stanza, receiver/filelog

Is your feature request related to a problem? Please describe.

One use case that has come up is telling the receiver to remove a file from the host system once the collector has fully read and processed it.

Some users argue that this feature would greatly increase adoption of the filelogreceiver in environments that are sensitive to disk usage, or where log data should not remain on a host once it has been exported by the preferred exporter.

Specifically, one user story this aims to address is reading and processing hundreds of compressed log files from a designated directory, with each file deleted after its read is complete.

Describe the solution you'd like

Some way to specify in fileconsumer.Config that the consumer should delete a file once it is done with it.

By default this behavior should be disabled.

So something akin to:

type Config struct {
	Finder `mapstructure:",squash"`
	...
	// DeleteOnceFinished removes a file once it has been fully read. Disabled by default.
	DeleteOnceFinished bool `mapstructure:"delete_once_finished"`
}

Describe alternatives you've considered

One concern is that this is somewhat of an anti-pattern: it actively modifies the collector's host environment, which should usually be avoided.

Another alternative is to emulate what is available for K8s cron-jobs and add a parameter like ttl_seconds_after_finished, where a file that has had no new entries for a certain number of seconds can be deleted.

Would love more discussion on how we could solve this use case using open-telemetry.

Additional context

As of submitting this, I mostly want to gather thoughts on the right way to solve this problem. I am not proposing this as the final solution.

@schmikei schmikei added enhancement New feature or request needs triage New item requiring triage labels Nov 15, 2022
@djaglowski
Member

I think this functionality would be valuable for some use cases. As noted in today's SIG, there is at least one other log agent that supports this (Logstash).

The main concern here is security. The worst-case scenario is that a malicious actor is able to delete important files. I believe we can mitigate this to a large extent by putting this functionality behind a permanent feature gate. Users who wish to enable this functionality would then have to run the process with a specific flag. This means that a malicious actor who is only able to reconfigure the collector (but is not in control of the startup command) is unable to delete files.

@github-actions
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.
