When testing a config using grok and custom patterns, a user will often be editing pattern definition files in patterns_dir between run attempts. Many (most?) Linux-ey text editors create backup files, named as the original filename plus a ~ suffix, in the same location as the original; even though they aren't hidden these are often invisible by default in file browsers. When dealing with multiple pattern definition files, and especially when renaming them, it's possible to have a lot of these tilde files lying around after a while.
grok currently reads everything in patterns_dir, including any tilde backups. It quite reasonably doesn't define the order in which it reads them, and it doesn't warn if e.g. the definition of MYPATTERN in a stale patterns~ or previousfilename~ backup file overrides the definition of MYPATTERN in patterns. Hilarity ensues. Also hair-tearing, teeth-gnashing, bad language and various other undesirable outcomes.
I propose that grok should ignore any files in patterns_dir ending in a ~. There may be other things it'd be beneficial to blacklist too, but this seems like a good start.
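To make the proposal concrete, here is a minimal sketch of the filtering step. It is not the grok filter's actual loading code; `pattern_files` is a hypothetical helper name, and the only rule it applies is "skip names ending in `~`".

```ruby
# Hypothetical helper (not part of the grok filter): list the regular files
# in a patterns directory, skipping editor backups whose names end in "~".
def pattern_files(patterns_dir)
  Dir.children(patterns_dir)                        # entry names, without "." and ".."
     .reject { |name| name.end_with?("~") }         # drop tilde backups
     .map    { |name| File.join(patterns_dir, name) }
     .select { |path| File.file?(path) }            # ignore subdirectories
     .sort                                          # make the read order deterministic
end
```

Sorting is an extra assumption here: it doesn't fix the override problem by itself, but it would at least make the result reproducible between runs.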
> Many (most?) Linux-ey text editors create backup files, named as the original filename plus a ~ suffix
The last time I researched this, I found that vim, emacs, nano, and several other editors all use different backup file naming schemes.
I am not in favor of hardcoded blacklisting of file names. Something similar to how .gitignore works would be preferable because it would be user controllable.
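A rough sketch of what a user-controllable ignore list could look like, using shell-style globs in the spirit of .gitignore. The names `DEFAULT_IGNORE_GLOBS` and `ignored?` are hypothetical, and the default globs are only illustrative examples, not a researched list of editor backup schemes.

```ruby
# Hypothetical sketch: an ignore list of shell-style globs that a user could
# override. The defaults below are illustrative, not an existing grok setting.
DEFAULT_IGNORE_GLOBS = ["*~", "*.bak", "*.swp", "#*#"].freeze

# Returns true if the bare filename matches any of the ignore globs.
# FNM_DOTMATCH lets "*" match a leading dot, so ".patterns.swp" is caught too.
def ignored?(filename, ignore_globs = DEFAULT_IGNORE_GLOBS)
  ignore_globs.any? { |glob| File.fnmatch?(glob, filename, File::FNM_DOTMATCH) }
end
```

With the defaults, `ignored?("patterns~")` is true and `ignored?("patterns")` is false; passing a custom list such as `["*.old"]` replaces the defaults entirely.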
Perhaps the fundamental problem here is that Logstash isn't transparent about what it's doing. Git's behavior is easily observable with e.g. git status, while for Logstash you have to increase the log level to get any kind of clue.
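One way to make this more visible would be to warn whenever a later file redefines a pattern with a different expression. The sketch below assumes the usual one-definition-per-line `NAME expression` pattern file layout; `load_patterns` and its `warn_io` parameter are hypothetical and do not reflect how Logstash actually loads or logs patterns.

```ruby
# Hypothetical sketch: read "NAME expression" lines from several pattern files
# and report when a later file overrides an earlier definition.
def load_patterns(paths, warn_io = $stderr)
  patterns = {}   # pattern name => expression
  sources  = {}   # pattern name => file that last defined it

  paths.each do |path|
    File.foreach(path) do |line|
      line = line.strip
      next if line.empty? || line.start_with?("#")   # skip blanks and comments

      name, expression = line.split(/\s+/, 2)
      next if expression.nil?                        # skip lines with no expression

      if patterns.key?(name) && patterns[name] != expression
        warn_io.puts "pattern #{name} from #{sources[name]} overridden by #{path}"
      end
      patterns[name] = expression
      sources[name]  = path
    end
  end

  patterns
end
```

The last definition read still wins, as it does today; the only change is that the override is reported instead of happening silently.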
Another difference compared to Git and other version control tools is that in those cases different people might want to ignore different files, so there is a clear value in having it configurable. Here, not as much.
I don't mind making this configurable, but the defaults should be sane so that casual users won't fall into this trap. Otherwise it's all just a waste of time.
(This issue was originally filed by @mrec at elastic/logstash#2271)
(This comes from the discussion of #2244)