Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added epoch date formats to configure parsing of unix dates #11453

Merged

Conversation

spinscale
Copy link
Contributor

This PR adds the support the new date formats, namely epoch_second and epoch_second_millis. By adding those, we can remove the internal logic to try and parse everything as a unix timestamp first and only then use our regular date format.

This PR tries to remain backwards compatible by using epoch_second_millis||dateOptionalTime where before only dateOptionalTime was used in combination with the parsing.

Some BWC occured, ie, a date like 10000 is now always a year without any other configuration instead of a unix timestamp.

Also, in the current implementation, the RangeQueryParser allows to configure a timezone for queries using from/to unix timestamps, even though the timezone is ignored in the query, as a timestamp is always UTC.

Closes #5328
Relates #10971

@spinscale spinscale added v2.0.0-beta1 :Search/Mapping Index mappings, including merging and defining field types review labels Jun 2, 2015

|`epoch_second`|A formatter for the number of seconds since the epoch.

|`epoch_second_millis`|A formatter for the number of milliseconds since
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this not be epoch_millis?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed

@colings86
Copy link
Contributor

Look good, left a small naming comment. Also, worth checking how this fits into @rjernst 's mapping changes (I think he has a PR outstanding on that)

}
return Long.toString(dateTimeFormatter.parser().parseMillis(timestampAsString));
} catch (RuntimeException e1) {
throw new TimestampParsingException(timestampAsString);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize this was there before, but can we not lose the original exception?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, added

@rjernst
Copy link
Member

rjernst commented Jun 3, 2015

This is great! I left a couple comments. I also would prefer to just use epoch_millis.

@spinscale spinscale force-pushed the 1506-add-epoch-to-named-date-formats branch from d04ff1d to 2d55d8c Compare June 3, 2015 09:41
@spinscale
Copy link
Contributor Author

updated the PR/replied to all of your comments. thx for the hints!

@rjernst
Copy link
Member

rjernst commented Jun 3, 2015

LGTM

@@ -133,6 +134,10 @@ public static FormatDateTimeFormatter forPattern(String input, Locale locale) {
formatter = ISODateTimeFormat.yearMonth();
} else if ("yearMonthDay".equals(input) || "year_month_day".equals(input)) {
formatter = ISODateTimeFormat.yearMonthDay();
} else if ("epoch_second".equals(input)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is confusing if we support camelCase in some of the options in this parser and not others (even if they are new). We should either support camelCase for all options or for none to be consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we plan to resolve this before 2.0? If so, I am fine leaving it as it is...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will be resolved (rejecting camelCase) as its a huge change across 100s of files (because most parser use this style and not ParseField) and I don't see us doing that change quickly

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it matters. We should not force making huge changes to the entire codebase in order to not add things which will just be deprecated and/or confusing to the user.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I think this is confusing. If I have a format specified as yearMonthDay that works then I would expect to be able to change it to epochSecond and it would work. Supporting some values in camelCase for date formats and not other values is very confusing to a user. I'm all for removing camelCase but we should be consistent with it, especially when its different values for the same setting (in the case different values of format for date fields).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also be fine with removing all the camelCase options for all formats in this PR to make it consistent.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already do not support camelCase for all settings, and I don't think there is any consistency even within the same query/field type/whatever.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also be fine with removing all the camelCase options for all formats in this PR to make it consistent.

This is the kind of statement that stalls progress. Requiring huge changes just to make a small improvement should not be necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am only talking about the date formats here, not across the whole codebase (i can see the above statement might have been a bit ambiguous on that). All the multi-word date format values above support both a camelCase and an underscored version. That should be consistent, whether that means supporting both for now or only supporting the underscored version I don't have a strong opinion but its hardly a huge change to update the date format values to be consistent and its not a huge overhead to maintain an extra 2 camelCase options given that any change to that policy would require a change to all the other date formats too

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized we aren't even talking about setting names, but the valid values for the format setting. This argument to use ParseValue does not make sense. We don't support camelCase in eg the index option. We should not do it here, it will just add more work for users if we allow them to start using a new value that will just go away in the future (and will require them to change the value to what they would have found in the first place if they had tried using camelCase and seen an error).

@spinscale spinscale force-pushed the 1506-add-epoch-to-named-date-formats branch from 2d55d8c to 9ddf7d8 Compare June 3, 2015 12:21
This commit changes the date handling. First and foremost Elasticsearch
does not try to convert every date to a unix timestamp first and then
uses the configured date. This now allows for dates like `2015121212` to
be parsed correctly.

Instead it is now explicit by adding a `epoch_second` and `epoch_millis`
date format. This also means, that the default date format now is
`epoch_millis||dateOptionalTime` to remain backwards compatible.

Closes elastic#5328
Relates elastic#10971
@spinscale spinscale force-pushed the 1506-add-epoch-to-named-date-formats branch from 9ddf7d8 to 01e8eaf Compare June 3, 2015 16:08
@spinscale spinscale merged commit 01e8eaf into elastic:master Jun 3, 2015
@kevinkluge kevinkluge removed the review label Jun 3, 2015
@clintongormley clintongormley added >feature :Dates and removed :Search/Mapping Index mappings, including merging and defining field types labels Jun 4, 2015
@clintongormley clintongormley changed the title Dates: Added epoch date formats to configure parsing of unix dates Added epoch date formats to configure parsing of unix dates Jun 6, 2015

private static final Pattern MILLI_SECOND_PRECISION_PATTERN = Pattern.compile("^\\d{1,13}$");
private static final Pattern SECOND_PRECISION_PATTERN = Pattern.compile("^\\d{1,10}$");

Copy link

@tj tj Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might be busted by-design, JSON numerical values only have a float type, so "1.470417092e+09" (produced by Go's json encoder) will just result in IllegalArgumentException[Invalid format: "1.470417092e+09"];. Any recommended way around this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, pass the value as a json string.

Copy link

@tj tj Aug 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah I suppose that might be ok, in this case it's user-defined input so that's pretty awkward but it beats breaking.

@clintongormley clintongormley added :Search/Mapping Index mappings, including merging and defining field types and removed :Dates labels Feb 13, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>feature :Search/Mapping Index mappings, including merging and defining field types v2.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

filtering on _timestamp field (with custom format) not working
6 participants