Adding DATE_8601 to the list of default Grok patterns #124

Open
wants to merge 2 commits into
base: main
@berglh berglh commented Jan 13, 2016

On line 65 of this pattern file is the following comment:

 # datestamp is YYYY/MM/DD-HH:MM:SS.UUUU (or something like it)

It turns out the format in this comment isn't actually covered. The YYYY MM DD order is only matched with hyphen delimiters, as in TIMESTAMP_ISO8601.

There are a lot of log files that contain YYYY/MM/DD, not just the official ISO8601 time stamp YYYY-?MM-?DD.
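To illustrate the gap, here is a minimal Python sketch. The regex is a simplified, hyphen-only stand-in for the date portion of TIMESTAMP_ISO8601, not the shipped Grok pattern, and `matches_hyphen_date` is a hypothetical helper name.

```python
import re

# Simplified stand-in for the date part of TIMESTAMP_ISO8601 (hyphens only).
# This is an approximation for illustration, not the real Grok definition.
HYPHEN_DATE = re.compile(
    r"\d{4}-(?:0?[1-9]|1[0-2])-(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
)

def matches_hyphen_date(text: str) -> bool:
    """Return True only if the whole string is a hyphen-delimited Y-M-D date."""
    return HYPHEN_DATE.fullmatch(text) is not None

print(matches_hyphen_date("2016-01-13"))  # True
print(matches_hyphen_date("2016/01/13"))  # False: slashes are not accepted
print(matches_hyphen_date("2016.01.13"))  # False: dots are not accepted
```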

I made the DATE_8601 pattern and appended it to the DATE pattern so that when DATESTAMP is used, it covers all three common formats. The YYYY MM DD order is used in China, Korea, Taiwan, Hungary, Iran, Japan, and Lithuania due to ISO 8601, and plenty of other countries use it as well. For completeness of the basic date patterns, I think it has a place here with all three common delimiters.
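As a rough sketch of how appending to DATE propagates to DATESTAMP, the Python below expands `%{NAME}` references recursively. The sub-patterns are simplified approximations (atomic groups replaced with non-capturing groups, TIME heavily reduced), the exact shape of DATE_8601 is an assumption because the diff is not shown here, and `expand` is a hypothetical helper rather than anything Grok provides.

```python
import re

# Simplified approximations of the Grok definitions (assumptions, see above).
PATTERNS = {
    "YEAR": r"(?:\d\d){1,2}",
    "MONTHNUM": r"(?:0?[1-9]|1[0-2])",
    "MONTHDAY": r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])",
    "TIME": r"\d{2}:\d{2}:\d{2}",  # heavily simplified
    "DATE_US": "%{MONTHNUM}[/-]%{MONTHDAY}[/-]%{YEAR}",
    "DATE_EU": "%{MONTHDAY}[./-]%{MONTHNUM}[./-]%{YEAR}",
    "DATE_8601": "%{YEAR}[./-]%{MONTHNUM}[./-]%{MONTHDAY}",  # assumed shape
    "DATE": "%{DATE_US}|%{DATE_EU}|%{DATE_8601}",
    "DATESTAMP": "%{DATE}[- ]%{TIME}",
}

def expand(name: str) -> str:
    """Recursively replace %{NAME} references with their regex definitions."""
    return re.sub(
        r"%\{(\w+)\}",
        # Wrap each expansion in a group so alternations stay self-contained.
        lambda m: "(?:" + expand(m.group(1)) + ")",
        PATTERNS[name],
    )

datestamp = re.compile(expand("DATESTAMP"))
print(bool(datestamp.fullmatch("2016/01/13-10:20:30")))  # True
print(bool(datestamp.fullmatch("01/13/2016 10:20:30")))  # True
```

Because DATESTAMP is defined in terms of DATE, any alternative added to DATE is picked up without touching DATESTAMP itself.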

I think DATE_8601 is perhaps the wrong name for this pattern. Slash or dot delimiters are not part of the official ISO 8601 format.

I suspect we should change this, but I don't know what a suitable suffix would be, as it's hard to aggregate the aforementioned countries into a region code like the ones used by the other date patterns in the file.

@elasticsearch-release

Jenkins standing by to test this. If you aren't a maintainer, you can ignore this comment. Someone with commit access, please review this and clear it for Jenkins to run; then say 'jenkins, test it'.

Woops, wrong way around

tedder commented May 18, 2020

This would address #67 and #112. I'm not sure it resolves them, though: wouldn't the DATE_* entries match before the third option is hit? The sane way to do it would be to add a "mandatory 4-digit year" entry to pick it off first:

YEAR4 (?>\d{4})
DATE_8601 %{YEAR4}[./-]%{MONTHNUM}[./-]%{MONTHDAY}
DATE %{DATE_8601}|%{DATE_US}|%{DATE_EU}
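One way to check the ordering question empirically is the small Python harness below. It uses Python's `re` as a stand-in for the Oniguruma engine behind Grok, so behaviour may differ in edge cases. The sub-patterns are simplified approximations with atomic groups replaced by non-capturing groups, and `DATE_YMD` and `which_alternative` are hypothetical names for the year-first entry and the helper.

```python
import re

# Simplified approximations of the Grok sub-patterns (assumptions).
MONTHNUM = r"(?:0?[1-9]|1[0-2])"
MONTHDAY = r"(?:0[1-9]|[12][0-9]|3[01]|[1-9])"
YEAR = r"(?:\d\d){1,2}"
YEAR4 = r"\d{4}"

DATE_US = rf"{MONTHNUM}[/-]{MONTHDAY}[/-]{YEAR}"
DATE_EU = rf"{MONTHDAY}[./-]{MONTHNUM}[./-]{YEAR}"
DATE_YMD = rf"{YEAR4}[./-]{MONTHNUM}[./-]{MONTHDAY}"  # hypothetical name

# Year-first alternative listed first, mirroring the suggested ordering.
DATE = re.compile(rf"(?P<ymd>{DATE_YMD})|(?P<us>{DATE_US})|(?P<eu>{DATE_EU})")

def which_alternative(text: str):
    """Return (alternative_name, matched_text) for the first match, or None."""
    m = DATE.search(text)
    return (m.lastgroup, m.group(0)) if m else None

print(which_alternative("2016/01/13"))  # ('ymd', '2016/01/13')
print(which_alternative("01/13/2016"))  # ('us', '01/13/2016')
print(which_alternative("13.01.2016"))  # ('eu', '13.01.2016')

# On its own and unanchored, the US pattern grabs a fragment of a Y/M/D date.
print(re.search(DATE_US, "2016/01/13").group(0))  # 6/01/13
```

The last line shows the kind of partial match that a mandatory four-digit year entry is meant to pre-empt when the patterns are not anchored.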
