grok should ignore tilde backup files when processing patterns_dir #33

Open
jordansissel opened this issue May 17, 2015 · 3 comments

Comments

@jordansissel
Contributor

(This issue was originally filed by @mrec at elastic/logstash#2271)


(This comes from the discussion of #2244)

When testing a config using grok and custom patterns, a user will often be editing pattern definition files in patterns_dir between run attempts. Many (most?) Linux-ey text editors create backup files, named as the original filename plus a ~ suffix, in the same location as the original; even though they aren't hidden these are often invisible by default in file browsers. When dealing with multiple pattern definition files, and especially when renaming them, it's possible to have a lot of these tilde files lying around after a while.

grok currently reads everything in patterns_dir, including any tilde backups. It quite reasonably doesn't define the order in which it reads them, and it doesn't warn if e.g. the definition of MYPATTERN in a stale patterns~ or previousfilename~ backup file overrides the definition of MYPATTERN in patterns. Hilarity ensues. Also hair-tearing, teeth-gnashing, bad language and various other undesirable outcomes.

I propose that grok should ignore any files in patterns_dir ending in a ~. There may be other things it'd be beneficial to blacklist too, but this seems like a good start.
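To make the proposal concrete, here is a minimal sketch of the filtering step. It is written in Python for illustration only (the grok plugin itself is Ruby), and `load_pattern_files` and `demo` are hypothetical names, not part of grok or Logstash.

```python
import os
import tempfile


def load_pattern_files(patterns_dir):
    """Return the names of regular files in patterns_dir, skipping '~' backups.

    Hypothetical helper for illustration; not the plugin's actual API.
    """
    selected = []
    # sorted() makes the read order deterministic, unlike a bare listing.
    for name in sorted(os.listdir(patterns_dir)):
        path = os.path.join(patterns_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories and other non-files
        if name.endswith("~"):
            continue  # editor backup such as "patterns~"
        selected.append(name)
    return selected


def demo():
    """Build a throwaway patterns_dir with stale backups and list what is kept."""
    with tempfile.TemporaryDirectory() as d:
        for name in ("patterns", "patterns~", "previousfilename~"):
            with open(os.path.join(d, name), "w") as f:
                f.write("MYPATTERN \\d+\n")
        return load_pattern_files(d)
```

Calling `demo()` in this sketch keeps only `patterns`; the two stale backups never get a chance to override `MYPATTERN`.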

@jordansissel
Contributor Author

> Many (most?) Linux-ey text editors create backup files, named as the original filename plus a ~ suffix

The last time I researched this, I found that vim, emacs, nano, and several other editors all use different backup file naming schemes.

I am not in favor of hardcoded blacklisting of file names. Something similar to how .gitignore works would be preferable because it would be user controllable.
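A rough sketch of what a user-controllable ignore list might look like, in Python rather than the plugin's Ruby. The `ignore_globs` parameter and `filter_pattern_files` are hypothetical, and shell-style globs are only an approximation of `.gitignore` syntax (no negation, no directory rules).

```python
from fnmatch import fnmatchcase


def filter_pattern_files(filenames, ignore_globs):
    """Drop every filename that matches at least one user-supplied glob.

    Hypothetical helper; fnmatchcase keeps matching case-sensitive on all
    platforms, so the result does not depend on the host OS.
    """
    return [
        name
        for name in filenames
        if not any(fnmatchcase(name, glob) for glob in ignore_globs)
    ]


# Example file names, including some typical editor leftovers.
names = ["patterns", "patterns~", ".patterns.swp", "#patterns#", "extra.bak"]

# The globs are supplied by the user, not hardcoded in the plugin.
kept = filter_pattern_files(names, ["*~", ".*.swp", "#*#", "*.bak"])
```

With an empty `ignore_globs` list the function returns every file, which would match today's behavior; the globs above leave only `patterns`.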

@magnusbaeck

Perhaps the fundamental problem here is that Logstash isn't transparent about what it's doing. Git's behavior is easily observable with e.g. `git status`, while with Logstash you have to increase the log level to get any kind of clue.
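As an illustration of the kind of transparency meant here, the sketch below reports when one file's definition of a pattern replaces another's. It is Python, not Logstash code; `find_overrides` is a hypothetical name, and the parsing assumes only the usual pattern-file layout of `NAME regex` per line with `#` comments.

```python
def find_overrides(files):
    """Report pattern names defined in more than one file.

    files: mapping of filename -> file contents, in the order they are read.
    Returns a list of (pattern_name, earlier_file, overriding_file) tuples.
    Hypothetical helper for illustration only.
    """
    seen = {}        # pattern name -> file that last defined it
    overrides = []
    for filename, text in files.items():
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            name = line.split(None, 1)[0]  # first whitespace-separated token
            if name in seen and seen[name] != filename:
                overrides.append((name, seen[name], filename))
            seen[name] = filename
    return overrides


files = {
    "patterns": "MYPATTERN \\d+\n",
    "patterns~": "# stale backup\nMYPATTERN \\w+\n",
}
report = find_overrides(files)
```

Here `report` flags that `patterns~` overrides `MYPATTERN` from `patterns`, which is exactly the situation that currently passes silently.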

Another difference compared to Git and other version control tools is that in those cases different people might want to ignore different files, so there is a clear value in having it configurable. Here, not as much.

I don't mind making this configurable, but the defaults should be sane so that casual users won't fall into this trap. Otherwise it's all just a waste of time.

@breml

breml commented Nov 25, 2015

I provided PR #63, which may be a solution for this issue, without breaking compatibility.
