
Silently ignore empty files in filestream#49196

Merged
rdner merged 7 commits into elastic:main from rdner:ignore-empty-files
Mar 3, 2026

Conversation

@rdner
Member

@rdner rdner commented Mar 2, 2026

Proposed commit message

Empty files are excluded from processing in filestream as early as possible.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works. Where relevant, I have used the stresstest.sh script to run them under stress conditions and race detector to verify their stability.
  • I have added an entry in ./changelog/fragments using the changelog tool.

Benchmarks

Environment: darwin/arm64, Apple M4 Pro, 12 threads

cd ./filebeat/input/filestream
go test -bench='^BenchmarkGetFiles$' -benchmem -run=^$ -count=5
| Metric | main | this PR | Delta | Significant |
| --- | --- | --- | --- | --- |
| Time (ms/op) | 1.853 | 1.851 | ~0% | No (p=0.548) |
| Memory (MiB/op) | 1.197 | 1.196 | ~0% | No (p=0.151) |
| Allocs/op | 9,107 | 9,107 | ~0% | No (p=0.159) |

So, no impact.

Related issues

Empty files are excluded from processing in filestream as early as possible.
@rdner rdner self-assigned this Mar 2, 2026
@rdner rdner added the enhancement, Filebeat, Team:Elastic-Agent-Data-Plane, and backport-active-all labels Mar 2, 2026
@botelastic botelastic bot added and then removed the needs_team label Mar 2, 2026
@github-actions
Contributor

github-actions bot commented Mar 2, 2026

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@rdner rdner marked this pull request as ready for review March 2, 2026 16:47
@rdner rdner requested a review from a team as a code owner March 2, 2026 16:47
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@rdner rdner requested review from AndersonQ and belimawr and removed request for andrzej-stencel March 2, 2026 16:47
@coderabbitai

coderabbitai bot commented Mar 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4c57a4 and 4a548ec.

📒 Files selected for processing (1)
  • filebeat/input/filestream/input_integration_test.go

📝 Walkthrough

Detects and silently excludes zero-byte files from filestream processing. Adds a new package-level error, errFileEmpty, and reorders getIngestTarget so that empty regular files and empty symlink targets are detected early, deferring initialization of file info until after these checks. GetFiles suppresses debug logging for errFileEmpty cases. Unit and integration tests were added or adjusted to cover empty regular files and symlinks to empty files, and to update truncation/offset expectations. A changelog fragment, changelog/fragments/1772466645-ignore-empty-files.yaml, was added.
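The early check described in the walkthrough can be illustrated roughly as follows. This is a minimal sketch, not the code from this PR: only the name errFileEmpty comes from the walkthrough, while checkIngestible and demoEmptyCheck are hypothetical helpers invented for illustration, and the real getIngestTarget does considerably more work.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// errFileEmpty mirrors the sentinel named in the walkthrough. Everything
// else in this sketch is hypothetical and not the Filebeat implementation.
var errFileEmpty = errors.New("file is empty")

// checkIngestible is a hypothetical stand-in for the early part of
// getIngestTarget: it rejects empty regular files and symlinks whose
// targets are empty before any further per-file work is done.
func checkIngestible(path string) (os.FileInfo, error) {
	info, err := os.Lstat(path) // Lstat does not follow symlinks
	if err != nil {
		return nil, fmt.Errorf("cannot lstat %q: %w", path, err)
	}
	if info.Mode()&os.ModeSymlink != 0 {
		// Stat follows the link, so Size() now describes the target.
		info, err = os.Stat(path)
		if err != nil {
			return nil, fmt.Errorf("cannot stat target of %q: %w", path, err)
		}
	}
	if info.Mode().IsRegular() && info.Size() == 0 {
		return nil, errFileEmpty
	}
	return info, nil
}

// demoEmptyCheck creates an empty file, a symlink to it, and a non-empty
// file in a temporary directory and returns the error reported for each.
func demoEmptyCheck() (emptyErr, linkErr, fullErr error) {
	dir, err := os.MkdirTemp("", "empty-sketch-")
	if err != nil {
		return err, err, err
	}
	defer os.RemoveAll(dir)

	empty := filepath.Join(dir, "empty.log")
	link := filepath.Join(dir, "link.log")
	full := filepath.Join(dir, "full.log")
	if err := os.WriteFile(empty, nil, 0o600); err != nil {
		return err, err, err
	}
	if err := os.WriteFile(full, []byte("line\n"), 0o600); err != nil {
		return err, err, err
	}
	if err := os.Symlink(empty, link); err != nil {
		return err, err, err
	}

	_, emptyErr = checkIngestible(empty)
	_, linkErr = checkIngestible(link)
	_, fullErr = checkIngestible(full)
	return emptyErr, linkErr, fullErr
}
```

Returning a sentinel rather than a formatted error lets callers tell "empty, skip quietly" apart from real failures without string matching.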

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | The PR successfully implements all key requirements from issue #48891: empty files are silently skipped, non-empty small files retain existing warnings, early exclusion prevents unnecessary processing. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the primary objective. Test adjustments (5-byte truncation) reflect the new behavior where zero-byte truncation equals delete/re-create, maintaining scope alignment. |

Contributor

Copilot AI left a comment

Pull request overview

This PR updates the filestream scanner/watcher logic to treat 0-byte files as a special case: they are excluded from ingestion decisions early and do not contribute to fingerprint “too small” warnings/log noise, aligning behavior with the empty-file expectations described in #48891.

Changes:

  • Add an errFileEmpty sentinel and exclude empty regular files (and symlinks to empty targets) during ingest-target resolution.
  • Suppress per-scan debug logging for the empty-file case in fileScanner.GetFiles().
  • Update/add tests to cover silent exclusion of empty files and symlinks; add changelog fragment.
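The logging change in the second bullet could look roughly like the sketch below. It assumes a scan loop that asks a resolver about each path; scan, resolveFunc, demoScan, and the debug message text are hypothetical, and only errFileEmpty is a name taken from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errFileEmpty is the sentinel named in this PR; the rest is illustrative.
var errFileEmpty = errors.New("file is empty")

// resolveFunc is a hypothetical stand-in for ingest-target resolution.
type resolveFunc func(path string) error

// scan returns the accepted paths and the debug lines that would have
// been logged. Empty files are skipped without producing a debug line.
func scan(paths []string, resolve resolveFunc) (accepted, debugLines []string) {
	for _, p := range paths {
		err := resolve(p)
		switch {
		case err == nil:
			accepted = append(accepted, p)
		case errors.Is(err, errFileEmpty):
			// Expected and potentially frequent: skip silently.
		default:
			// Hypothetical message text, not the actual Filebeat log line.
			debugLines = append(debugLines,
				fmt.Sprintf("skipping %q: %s", p, err))
		}
	}
	return accepted, debugLines
}

// demoScan runs scan over one ingestible file, one empty file, and one
// file that fails for an unrelated reason.
func demoScan() (accepted, debugLines []string) {
	resolve := func(path string) error {
		switch path {
		case "empty.log":
			return fmt.Errorf("resolving %s: %w", path, errFileEmpty)
		case "denied.log":
			return errors.New("permission denied")
		}
		return nil
	}
	return scan([]string{"app.log", "empty.log", "denied.log"}, resolve)
}
```

Using errors.Is rather than a direct comparison keeps the suppression working even if the sentinel is wrapped with extra context on the way up.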

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| filebeat/input/filestream/fswatch.go | Introduces errFileEmpty and skips empty files early; avoids debug logging for empty-file ingest-target failures. |
| filebeat/input/filestream/fswatch_test.go | Updates watcher test expectation (no debug for empty files) and adds scanner/getIngestTarget tests for empty file exclusion. |
| filebeat/input/filestream/fswatch_integration_test.go | Minor test cleanup (unused param) and relocates mustFingerprintIdentifier helper. |
| changelog/fragments/1772466645-ignore-empty-files.yaml | Adds changelog entry for the enhancement. |


rdner and others added 2 commits March 2, 2026 19:22
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@rdner
Member Author

rdner commented Mar 2, 2026

I still need to fix the TestFilestreamTruncated* tests (or fix the behavior, let's see). Putting back in draft for now.

@rdner rdner marked this pull request as draft March 2, 2026 20:38
@rdner
Member Author

rdner commented Mar 2, 2026

> I still need to fix the TestFilestreamTruncated* tests (or fix the behavior, let's see). Putting back in draft for now.

So, that's because now truncating a file to zero is handled the same way as delete/re-create. Which I cannot see any problem with (am I missing something?).

Truncating to anything but zero is still handled with the special behavior as before. I adjusted the existing tests to reflect that.
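To illustrate why truncating to zero now looks like delete/re-create, here is a simplified model, assuming a watcher that diffs consecutive scans by file size only. The types and functions (event, snapshot, diff, replay) are hypothetical; the real watcher tracks more than size.

```go
package main

import "sort"

// event is a hypothetical, simplified file-system event.
type event struct {
	Op   string // "create", "write", "truncate" or "delete"
	Path string
}

// snapshot keeps only non-empty files, mimicking a scanner that
// silently ignores zero-byte files.
func snapshot(sizes map[string]int64) map[string]int64 {
	out := make(map[string]int64, len(sizes))
	for path, size := range sizes {
		if size > 0 {
			out[path] = size
		}
	}
	return out
}

// diff compares two snapshots by size only (a simplification).
func diff(prev, curr map[string]int64) []event {
	var events []event
	for path, size := range curr {
		old, seen := prev[path]
		switch {
		case !seen:
			events = append(events, event{"create", path})
		case size < old:
			events = append(events, event{"truncate", path})
		case size > old:
			events = append(events, event{"write", path})
		}
	}
	for path := range prev {
		if _, ok := curr[path]; !ok {
			events = append(events, event{"delete", path})
		}
	}
	// Map iteration order is random; sort for a stable result.
	sort.Slice(events, func(i, j int) bool { return events[i].Path < events[j].Path })
	return events
}

// replay feeds a sequence of sizes for a single file through successive
// scans and returns the operations the watcher would emit.
func replay(sizes ...int64) []string {
	var ops []string
	prev := map[string]int64{}
	for _, size := range sizes {
		curr := snapshot(map[string]int64{"app.log": size})
		for _, e := range diff(prev, curr) {
			ops = append(ops, e.Op)
		}
		prev = curr
	}
	return ops
}
```

In this model, replay(100, 0, 40) yields create, delete, create, because the empty file drops out of the snapshot, whereas replay(100, 5) yields create, truncate.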

@rdner rdner marked this pull request as ready for review March 2, 2026 21:20
@rdner rdner requested a review from belimawr March 2, 2026 21:20
@AndersonQ
Member

> So, that's because now truncating a file to zero is handled the same way as delete/re-create. Which I cannot see any problem with (am I missing something?).

I don't think so, the truncated file would get a new fingerprint anyway.
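The fingerprint point can be sketched as follows, assuming the fingerprint is a SHA-256 hash over a fixed-length prefix of the file and that 1024 bytes is the prefix length; both are assumptions of this example rather than something stated in this thread. The names fingerprint, fingerprintLength, and demoFingerprints are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
)

// fingerprintLength is an assumed prefix length for this sketch.
const fingerprintLength = 1024

// fingerprint hashes the first fingerprintLength bytes of content. It
// reports false when the content is too short to be fingerprinted, which
// is always the case for an empty file.
func fingerprint(content []byte) (string, bool) {
	if len(content) < fingerprintLength {
		return "", false
	}
	sum := sha256.Sum256(content[:fingerprintLength])
	return hex.EncodeToString(sum[:]), true
}

// demoFingerprints models a file before truncation, right after
// truncation to zero, and after new data has been written.
func demoFingerprints() (before, after string, emptyOK bool) {
	oldContent := bytes.Repeat([]byte("old line\n"), 200) // 1800 bytes
	newContent := bytes.Repeat([]byte("new line\n"), 200) // 1800 bytes

	before, _ = fingerprint(oldContent)
	_, emptyOK = fingerprint(nil) // truncated to zero: nothing to hash
	after, _ = fingerprint(newContent)
	return before, after, emptyOK
}
```

Under these assumptions, a file that is truncated and then refilled with different data gets a different fingerprint, so it would be treated as a new file regardless of how the truncation itself is reported.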

@rdner rdner merged commit 404fb99 into elastic:main Mar 3, 2026
55 checks passed
@github-actions
Contributor

github-actions bot commented Mar 3, 2026

@Mergifyio backport 8.19 9.2 9.3

@mergify
Contributor

mergify bot commented Mar 3, 2026

backport 8.19 9.2 9.3

✅ Backports have been created

Details

Cherry-pick of 404fb99 has failed:

On branch mergify/bp/8.19/pr-49196
Your branch is up to date with 'origin/8.19'.

You are currently cherry-picking commit 404fb99fd.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   changelog/fragments/1772466645-ignore-empty-files.yaml
	modified:   filebeat/input/filestream/fswatch.go
	modified:   filebeat/input/filestream/input_integration_test.go

Unmerged paths:
  (use "git add/rm <file>..." as appropriate to mark resolution)
	deleted by us:   filebeat/input/filestream/fswatch_integration_test.go
	both modified:   filebeat/input/filestream/fswatch_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Cherry-pick of 404fb99 has failed:

On branch mergify/bp/9.2/pr-49196
Your branch is up to date with 'origin/9.2'.

You are currently cherry-picking commit 404fb99fd.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   changelog/fragments/1772466645-ignore-empty-files.yaml
	modified:   filebeat/input/filestream/fswatch.go
	modified:   filebeat/input/filestream/fswatch_integration_test.go
	modified:   filebeat/input/filestream/input_integration_test.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   filebeat/input/filestream/fswatch_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Cherry-pick of 404fb99 has failed:

On branch mergify/bp/9.3/pr-49196
Your branch is up to date with 'origin/9.3'.

You are currently cherry-picking commit 404fb99fd.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	new file:   changelog/fragments/1772466645-ignore-empty-files.yaml
	modified:   filebeat/input/filestream/fswatch.go
	modified:   filebeat/input/filestream/fswatch_integration_test.go
	modified:   filebeat/input/filestream/input_integration_test.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   filebeat/input/filestream/fswatch_test.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

mergify bot pushed a commit that referenced this pull request Mar 3, 2026
Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_integration_test.go
#	filebeat/input/filestream/fswatch_test.go
mergify bot pushed a commit that referenced this pull request Mar 3, 2026
Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_test.go
mergify bot pushed a commit that referenced this pull request Mar 3, 2026
Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_test.go
rdner added a commit that referenced this pull request Mar 3, 2026
)

* Silently ignore empty files in filestream (#49196)

Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_test.go

* Resolve conflicts

---------

Co-authored-by: Denis <denis.rechkunov@elastic.co>
rdner added a commit that referenced this pull request Mar 3, 2026
…9230)

* Silently ignore empty files in filestream (#49196)

Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_integration_test.go
#	filebeat/input/filestream/fswatch_test.go

* Resolve conflicts

---------

Co-authored-by: Denis <denis.rechkunov@elastic.co>
rdner added a commit that referenced this pull request Mar 3, 2026
)

* Silently ignore empty files in filestream (#49196)

Empty files are excluded from processing in filestream as early as possible.
Now the truncate to zero behavior equals to delete/recreate.

(cherry picked from commit 404fb99)

# Conflicts:
#	filebeat/input/filestream/fswatch_test.go

* Resolve conflicts

---------

Co-authored-by: Denis <denis.rechkunov@elastic.co>

Labels

backport-active-all, enhancement, Filebeat, Team:Elastic-Agent-Data-Plane

Projects

None yet

Development

Successfully merging this pull request may close these issues.

filestream: empty should be silently ignored instead of triggering fingerprint warnings

5 participants