[Filebeat] Implement a more stable file identifier #34419

Closed
7 tasks
belimawr opened this issue Jan 30, 2023 · 9 comments · Fixed by #35734
Assignees
Labels
bug Team:Elastic-Agent Label for the Agent team Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team

Comments

@belimawr
Contributor

belimawr commented Jan 30, 2023

Issue

Our current default implementation of file_identity uses a device ID that is not unique in some situations, for example when using LVM, according to the Red Hat Enterprise documentation. Another problem is its reliance on inodes: in some situations inodes are reused, which makes Filebeat skip some lines or whole files.

This issue is about exploring a more widely stable file_identity implementation for Filebeat and/or any Beat that requires it.

EDIT: as far as possible, the solution should also work in 7.17.

Current fix idea

Add a new fingerprint file identity option based on a computed hash from the initial bytes in the file.
The number of bytes to read should be configurable.

Definition of done

Target: 8.9 and 7.17.11

  • Implement the new file identity option that can be selected.
  • Experiment to find the "best" default value for the number of bytes to read.
  • This new file identifier must first fix the issue on Linux.
  • This new file identifier must also fix the issue on macOS and Windows.
  • Test to validate that this file identifier does not affect the other file identity options already in place.
  • Test to confirm this file identifier is used when selected.
  • Test to confirm we are not ingesting the same file when inodes changed.
@belimawr belimawr added bug Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team labels Jan 30, 2023
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@pierrehilbert
Collaborator

pierrehilbert commented Feb 7, 2023

@nimarezainia: do you know users facing this situation?

@nimarezainia
Contributor

nimarezainia commented Feb 8, 2023

> @nimarezainia: do you know users facing this situation?

~~I am not aware, no.~~
Update: the inode issues users have faced are related to this.

@jlind23
Collaborator

jlind23 commented Mar 20, 2023

@belimawr looks like this might have some overlap with #27278 right?

@belimawr
Contributor Author

> @belimawr looks like this might have some overlap with #27278 right?

Yes, there is some overlap. We could have a meta issue for all file identity related problems.

@nimarezainia
Contributor

nimarezainia commented Mar 21, 2023

@belimawr if there's any opportunity for the user to test a fix for the inode problem that would be great.

@belimawr
Contributor Author

I edited the issue to add information about the inode reuse problem as well, so the scope here is now broader.

@nimarezainia
Contributor

nimarezainia commented Mar 28, 2023

Hi, we need this fix for the inode rotation issue. Please prioritize for that use case.

@pierrehilbert
Collaborator

Next step is to define the best way to fix this issue.

@cmacknz cmacknz assigned rdner and unassigned belimawr May 3, 2023
@jlind23 jlind23 added the Team:Elastic-Agent Label for the Agent team label Jun 9, 2023
This was referenced Jul 15, 2023