Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL: Make parsing of date more lenient #52137

Merged
merged 6 commits into from Feb 10, 2020
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -130,9 +130,9 @@
import static java.util.Collections.emptyList;
import static java.util.Collections.singletonList;
import static org.elasticsearch.xpack.ql.type.DataTypeConverter.converterFor;
import static org.elasticsearch.xpack.sql.util.DateUtils.asDateOnly;
import static org.elasticsearch.xpack.sql.util.DateUtils.asTimeOnly;
import static org.elasticsearch.xpack.sql.util.DateUtils.ofEscapedLiteral;
import static org.elasticsearch.xpack.sql.util.DateUtils.dateOfEscapedLiteral;
import static org.elasticsearch.xpack.sql.util.DateUtils.dateTimeOfEscapedLiteral;

abstract class ExpressionBuilder extends IdentifierBuilder {

Expand Down Expand Up @@ -761,9 +761,9 @@ private SqlTypedParamValue param(TerminalNode node) {
public Literal visitDateEscapedLiteral(DateEscapedLiteralContext ctx) {
String string = string(ctx.string());
Source source = source(ctx);
// parse yyyy-MM-dd
// parse yyyy-MM-dd (time optional but is set to 00:00:00.000 because of the conversion to DATE
try {
return new Literal(source, asDateOnly(string), SqlDataTypes.DATE);
return new Literal(source, dateOfEscapedLiteral(string), SqlDataTypes.DATE);
} catch(DateTimeParseException ex) {
throw new ParsingException(source, "Invalid date received; {}", ex.getMessage());
}
Expand All @@ -789,7 +789,7 @@ public Literal visitTimestampEscapedLiteral(TimestampEscapedLiteralContext ctx)
Source source = source(ctx);
// parse yyyy-mm-dd hh:mm:ss(.f...)
try {
return new Literal(source, ofEscapedLiteral(string), DataTypes.DATETIME);
return new Literal(source, dateTimeOfEscapedLiteral(string), DataTypes.DATETIME);
} catch (DateTimeParseException ex) {
throw new ParsingException(source, "Invalid timestamp received; {}", ex.getMessage());
}
Expand Down
Expand Up @@ -42,6 +42,30 @@ public final class DateUtils {
.appendLiteral('T')
.append(ISO_LOCAL_TIME)
.toFormatter().withZone(UTC);
private static final DateTimeFormatter DATE_ESCAPED_LITERAL_FORMATTER_WHITESPACE = new DateTimeFormatterBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name of the formatters could be more descriptive? The fact that time can be added, but it's optional, imo should be reflected in the formatter's name. Also, not to abuse the length of the names, how about DATE_OPTIONAL_TIME_FORMATTER_WHITESPACE? (ie get rid of ESCAPED_LITERAL keep the FORMATTER and mention the whitespace/'T') Same idea could be applied to the rest of the formatters: `DATE_OPTIONAL_OPTIONAL_TIME_FORMATTER_T, ISO_DATE_OPTIONAL_TIME_FORMATTER_WHITESPACE, ISO_DATE_OPTIONAL_TIME_FORMATTER_T.....

.append(ISO_LOCAL_DATE)
.optionalStart()
.appendLiteral(' ')
.append(ISO_LOCAL_TIME)
.toFormatter().withZone(UTC);
private static final DateTimeFormatter DATE_ESCAPED_LITERAL_FORMATTER_T_LITERAL = new DateTimeFormatterBuilder()
.append(ISO_LOCAL_DATE)
.optionalStart()
.appendLiteral('T')
.append(ISO_LOCAL_TIME)
.toFormatter().withZone(UTC);
private static final DateTimeFormatter ISO_LOCAL_DATE_OPTIONAL_TIME_FORMATTER_WHITESPACE = new DateTimeFormatterBuilder()
.append(DATE_ESCAPED_LITERAL_FORMATTER_WHITESPACE)
.optionalStart()
.appendZoneOrOffsetId()
.optionalEnd()
.toFormatter().withZone(UTC);
private static final DateTimeFormatter ISO_LOCAL_DATE_OPTIONAL_TIME_FORMATTER_T_LITERAL = new DateTimeFormatterBuilder()
.append(DATE_ESCAPED_LITERAL_FORMATTER_T_LITERAL)
.optionalStart()
.appendZoneOrOffsetId()
.optionalEnd()
.toFormatter().withZone(UTC);

private static final DateFormatter UTC_DATE_TIME_FORMATTER = DateFormatter.forPattern("date_optional_time").withZone(UTC);
private static final int DEFAULT_PRECISION_FOR_CURRENT_FUNCTIONS = 3;
Expand Down Expand Up @@ -91,7 +115,17 @@ public static ZonedDateTime asDateTime(long millis, ZoneId id) {
* Parses the given string into a Date (SQL DATE type) using UTC as a default timezone.
*/
public static ZonedDateTime asDateOnly(String dateFormat) {
return LocalDate.parse(dateFormat, ISO_LOCAL_DATE).atStartOfDay(UTC);
int separatorIdx = dateFormat.indexOf('-');
if (separatorIdx == 0) { // negative year
separatorIdx = dateFormat.indexOf('-', separatorIdx + 1);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since separatorIdx == 0 then the indexOf call should be indexOf('-', 1) right?

}
separatorIdx = dateFormat.indexOf('-', separatorIdx + 1) + 3;
// Avoid index out of bounds - it will lead to DateTimeParseException anyways
if (separatorIdx >= dateFormat.length() || dateFormat.charAt(separatorIdx) == 'T') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which situation is the > valid here? (I refer to separatorIdx >= dateFormat.length()) Isn't == enough to detect a date-only string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 lines above we do +3 to move directly after the day part in the string.
If the string passed is something like: 2020-12- then we have moved passed the string's length and dateFormat.charAt() will throw OutOfBounds exception.

I forgot to add a test for that, will do.

return LocalDate.parse(dateFormat, ISO_LOCAL_DATE_OPTIONAL_TIME_FORMATTER_T_LITERAL).atStartOfDay(UTC);
} else {
return LocalDate.parse(dateFormat, ISO_LOCAL_DATE_OPTIONAL_TIME_FORMATTER_WHITESPACE).atStartOfDay(UTC);
}
}

public static ZonedDateTime asDateOnly(ZonedDateTime zdt) {
Expand All @@ -109,7 +143,17 @@ public static ZonedDateTime asDateTime(String dateFormat) {
return DateFormatters.from(UTC_DATE_TIME_FORMATTER.parse(dateFormat)).withZoneSameInstant(UTC);
}

public static ZonedDateTime ofEscapedLiteral(String dateFormat) {
public static ZonedDateTime dateOfEscapedLiteral(String dateFormat) {
int separatorIdx = dateFormat.lastIndexOf('-') + 3;
// Avoid index out of bounds - it will lead to DateTimeParseException anyways
if (separatorIdx >= dateFormat.length() || dateFormat.charAt(separatorIdx) == 'T') {
return LocalDate.parse(dateFormat, DATE_ESCAPED_LITERAL_FORMATTER_T_LITERAL).atStartOfDay(UTC);
} else {
return LocalDate.parse(dateFormat, DATE_TIME_ESCAPED_LITERAL_FORMATTER_WHITESPACE).atStartOfDay(UTC);
}
}

public static ZonedDateTime dateTimeOfEscapedLiteral(String dateFormat) {
int separatorIdx = dateFormat.lastIndexOf('-') + 3;
// Avoid index out of bounds - it will lead to DateTimeParseException anyways
if (separatorIdx >= dateFormat.length() || dateFormat.charAt(separatorIdx) == 'T') {
Expand Down
Expand Up @@ -81,6 +81,13 @@ private String buildDate() {
return sb.toString();
}

private String buildTime() {
if (randomBoolean()) {
return (randomBoolean() ? "T" : " ") + "11:22" + buildSecsAndFractional();
}
return "";
}

private String buildSecsAndFractional() {
if (randomBoolean()) {
return ":55" + randomFrom("", ".1", ".12", ".123", ".1234", ".12345", ".123456",
Expand Down Expand Up @@ -212,7 +219,7 @@ public void testFunctionWithFunctionWithArgAndParams() {
}

public void testDateLiteral() {
Literal l = dateLiteral(buildDate());
Literal l = dateLiteral(buildDate() + buildTime());
assertThat(l.dataType(), is(DATE));
}

Expand Down
Expand Up @@ -172,13 +172,19 @@ public void testConversionToDate() {
Converter conversion = converterFor(KEYWORD, to);
assertNull(conversion.convert(null));

assertEquals(date(0L), conversion.convert("1970-01-01"));
assertEquals(date(1483228800000L), conversion.convert("2017-01-01"));
assertEquals(date(-1672531200000L), conversion.convert("1917-01-01"));
assertEquals(date(18000000L), conversion.convert("1970-01-01"));
assertEquals(date(1581292800000L), conversion.convert("2020-02-10T10:20"));
assertEquals(date(-125908819200000L), conversion.convert("-2020-02-10T10:20:30.123"));
assertEquals(date(1581292800000L), conversion.convert("2020-02-10T10:20:30.123456789"));

// double check back and forth conversion
assertEquals(date(1581292800000L), conversion.convert("2020-02-10 10:20"));
assertEquals(date(-125908819200000L), conversion.convert("-2020-02-10 10:20:30.123"));
assertEquals(date(1581292800000L), conversion.convert("2020-02-10 10:20:30.123456789"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sake of completeness, I'd suggest a test without the duration separator, but with a timezone (or a Z).


assertEquals(date(1581292800000L), conversion.convert("2020-02-10T10:20+05:00"));
assertEquals(date(-125908819200000L), conversion.convert("-2020-02-10T10:20:30.123-06:00"));
assertEquals(date(1581292800000L), conversion.convert("2020-02-10T10:20:30.123456789+03:00"));

// double check back and forth conversion
ZonedDateTime zdt = org.elasticsearch.common.time.DateUtils.nowWithMillisResolution();
Converter forward = converterFor(DATE, KEYWORD);
Converter back = converterFor(KEYWORD, DATE);
Expand Down Expand Up @@ -285,7 +291,6 @@ public void testConversionToDateTime() {
assertEquals(dateTime(18000000L), conversion.convert("1970-01-01T00:00:00-05:00"));

// double check back and forth conversion

ZonedDateTime dt = org.elasticsearch.common.time.DateUtils.nowWithMillisResolution();
Converter forward = converterFor(DATETIME, KEYWORD);
Converter back = converterFor(KEYWORD, DATETIME);
Expand Down Expand Up @@ -692,4 +697,4 @@ static ZonedDateTime date(long millisSinceEpoch) {
static OffsetTime time(long millisSinceEpoch) {
return DateUtils.asTimeOnly(millisSinceEpoch);
}
}
}