New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More strict parsing of ISO dates #6227
More strict parsing of ISO dates #6227
Conversation
talked with @clintongormley - makes sense to not replace those dates, but create new ones with a 'strict' prefix (like |
@clintongormley updated the PR, maybe you can take another look Changes:
If this is ok, I will need to update documentation, I will add the missing formats and their format (added this info in the test so far), and mention the |
hi @spinscale Almost there, but there are two default formats used for detection. one is now strict ,but the other is not: https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/index/mapper/object/RootObjectMapper.java#L45 This means that this illegal date is still mapped:
|
Updated the rootobjectmapper to have a more strict date. Talked with @clintongormley to possible have strict mapping for self formatted dates as well, like |
@clintongormley ok, so having a parsing format like |
LGTM |
@spinscale any news on getting this one merged? |
@spinscale giving this one back to you :) |
Also see #5328 |
I dug a bit around in joda time and still do not see any other pattern than to copy the whole class and use the fixed date builders. Another approach would have been to take supplied date string and compare it with the date being created by the date formatter - if it doesnt match, then the formatting has been different. However this flaw has the problem, that we need to parse the date twice and it becomes really complex, when we use the |
4c70305
to
e852304
Compare
.endObject().endObject().string(); | ||
|
||
String indexName = "test-" + randomAsciiOfLength(10).toLowerCase(Locale.ROOT); | ||
IndexService index = createIndex(indexName, settingsBuilder().put(IndexMetaData.SETTING_VERSION_CREATED, Version.V_1_5_2_ID).build()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it sucks we can do this, but it helps here :)
e852304
to
ba2054d
Compare
incoporporated your review comments @bleskes |
@@ -171,6 +244,38 @@ public static FormatDateTimeFormatter forPattern(String input, Locale locale) { | |||
return new FormatDateTimeFormatter(input, formatter.withZone(DateTimeZone.UTC), locale); | |||
} | |||
|
|||
public static FormatDateTimeFormatter getStrictStandardDateFormater() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: getStrictStandardDateFormat_t_er
I went through and left some comments. Looks nice. I think we also need to document this on the 2.0 migration doc. Also would love someone who is more at home with the mapping code to have a look /cc @rjernst |
@@ -548,4 +554,13 @@ protected void doXContentBody(XContentBuilder builder, boolean includeDefaults, | |||
} | |||
} | |||
} | |||
|
|||
private static FormatDateTimeFormatter getDateTimeFormatter(Settings indexSettings) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is cleaner to have inline? it is very short and not used anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed. removed.
I left a couple comments on the mapping portion. |
ba2054d
to
392d8bd
Compare
if (Version.indexCreated(context.indexSettings()).before(Version.V_2_0_0)) { | ||
boolean includesEpochFormatter = dateTimeFormatter.format().contains("epoch_"); | ||
if (fieldType().dateTimeFormatter().equals(Defaults.DATE_TIME_FORMATTER)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't this have the same problem? If the user upgrades to 2.0, and tries to set their older index to use the new defaults, they will be forced back to non strict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, you are right. A bug in the code didnt let my test fail accordingly... will rethink.
Thx for pointing out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added a new commit with a test fixing this. I just check if the the parser actually parsed a new format and only in case if not, the defaults are applied
392d8bd
to
d004f96
Compare
LGTM |
@bleskes are you good with the documentation changes? |
@@ -49,6 +49,13 @@ first millisecond of the rounding scope. The semantics work as follows: | |||
[[built-in]] | |||
=== Built In Formats | |||
|
|||
Most of the above dates have a `strict` companion dates, which means, that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
above -> bellow
left one minor comment. I think we also need an entry in migrate_2_0.asciidoc ? |
d004f96
to
f8faac5
Compare
fixed the typo and added a small paragraph to the upgrade asciidoc |
LGTM. Thx @spinscale |
f8faac5
to
72f82a0
Compare
If you are using the default date or the named identifiers of dates, the current implementation was allowed to read a year with only one digit. In order to make this more strict, this fixes a year to be at least 4 digits. Same applies for month, day, hour, minute, seconds. Also the new default is `strictDateOptionalTime` for indices created with Elasticsearch 2.0 or newer. In addition a couple of not exposed date formats have been exposed, as they have been mentioned in the documentation. Closes elastic#6158
72f82a0
to
b612cab
Compare
If you are using the default date or the named identifiers of dates,
the current implementation was allowed to read a year with only one
digit. In order to make this more strict, this fixes a year to be at
least 4 digits.
Closes #6158