New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added epoch date formats to configure parsing of unix dates #11453
Added epoch date formats to configure parsing of unix dates #11453
Conversation
|
||
|`epoch_second`|A formatter for the number of seconds since the epoch. | ||
|
||
|`epoch_second_millis`|A formatter for the number of milliseconds since |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this not be epoch_millis
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed
Look good, left a small naming comment. Also, worth checking how this fits into @rjernst 's mapping changes (I think he has a PR outstanding on that) |
} | ||
return Long.toString(dateTimeFormatter.parser().parseMillis(timestampAsString)); | ||
} catch (RuntimeException e1) { | ||
throw new TimestampParsingException(timestampAsString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this was there before, but can we not lose the original exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true, added
This is great! I left a couple comments. I also would prefer to just use |
d04ff1d
to
2d55d8c
Compare
updated the PR/replied to all of your comments. thx for the hints! |
LGTM |
@@ -133,6 +134,10 @@ public static FormatDateTimeFormatter forPattern(String input, Locale locale) { | |||
formatter = ISODateTimeFormat.yearMonth(); | |||
} else if ("yearMonthDay".equals(input) || "year_month_day".equals(input)) { | |||
formatter = ISODateTimeFormat.yearMonthDay(); | |||
} else if ("epoch_second".equals(input)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is confusing if we support camelCase in some of the options in this parser and not others (even if they are new). We should either support camelCase for all options or for none to be consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we plan to resolve this before 2.0? If so, I am fine leaving it as it is...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it will be resolved (rejecting camelCase) as its a huge change across 100s of files (because most parser use this style and not ParseField) and I don't see us doing that change quickly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it matters. We should not force making huge changes to the entire codebase in order to not add things which will just be deprecated and/or confusing to the user.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But I think this is confusing. If I have a format specified as yearMonthDay
that works then I would expect to be able to change it to epochSecond
and it would work. Supporting some values in camelCase for date formats and not other values is very confusing to a user. I'm all for removing camelCase but we should be consistent with it, especially when its different values for the same setting (in the case different values of format
for date fields).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be fine with removing all the camelCase options for all formats in this PR to make it consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already do not support camelCase for all settings, and I don't think there is any consistency even within the same query/field type/whatever.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be fine with removing all the camelCase options for all formats in this PR to make it consistent.
This is the kind of statement that stalls progress. Requiring huge changes just to make a small improvement should not be necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am only talking about the date formats here, not across the whole codebase (i can see the above statement might have been a bit ambiguous on that). All the multi-word date format values above support both a camelCase and an underscored version. That should be consistent, whether that means supporting both for now or only supporting the underscored version I don't have a strong opinion but its hardly a huge change to update the date format values to be consistent and its not a huge overhead to maintain an extra 2 camelCase options given that any change to that policy would require a change to all the other date formats too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized we aren't even talking about setting names, but the valid values for the format
setting. This argument to use ParseValue does not make sense. We don't support camelCase in eg the index
option. We should not do it here, it will just add more work for users if we allow them to start using a new value that will just go away in the future (and will require them to change the value to what they would have found in the first place if they had tried using camelCase and seen an error).
2d55d8c
to
9ddf7d8
Compare
This commit changes the date handling. First and foremost Elasticsearch does not try to convert every date to a unix timestamp first and then uses the configured date. This now allows for dates like `2015121212` to be parsed correctly. Instead it is now explicit by adding a `epoch_second` and `epoch_millis` date format. This also means, that the default date format now is `epoch_millis||dateOptionalTime` to remain backwards compatible. Closes elastic#5328 Relates elastic#10971
9ddf7d8
to
01e8eaf
Compare
|
||
private static final Pattern MILLI_SECOND_PRECISION_PATTERN = Pattern.compile("^\\d{1,13}$"); | ||
private static final Pattern SECOND_PRECISION_PATTERN = Pattern.compile("^\\d{1,10}$"); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might be busted by-design, JSON numerical values only have a float type, so "1.470417092e+09" (produced by Go's json encoder) will just result in IllegalArgumentException[Invalid format: "1.470417092e+09"];
. Any recommended way around this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, pass the value as a json string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah I suppose that might be ok, in this case it's user-defined input so that's pretty awkward but it beats breaking.
This PR adds the support the new date formats, namely
epoch_second
andepoch_second_millis
. By adding those, we can remove the internal logic to try and parse everything as a unix timestamp first and only then use our regular date format.This PR tries to remain backwards compatible by using
epoch_second_millis||dateOptionalTime
where before onlydateOptionalTime
was used in combination with the parsing.Some BWC occured, ie, a date like
10000
is now always a year without any other configuration instead of a unix timestamp.Also, in the current implementation, the
RangeQueryParser
allows to configure a timezone for queries usingfrom
/to
unix timestamps, even though the timezone is ignored in the query, as a timestamp is always UTC.Closes #5328
Relates #10971