-
Notifications
You must be signed in to change notification settings - Fork 256
Add Interval Literal #190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Interval Literal #190
Conversation
|
Regarding the syntax, |
|
After revisiting the SQL standard and validation with Postgres, both |
|
I will do the review on Thursday at the latest. |
src/parser/bison_parser.y
Outdated
| | extract_expr | ||
| | cast_expr | ||
| | array_expr | ||
| | interval_expression |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest to introduce an 'inverval_literal' in ll. 1029 ('literal: [...]') instead. (interval_expression is used in the SQL standard to enforce type compatibility (see http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt p.168 ), which we do not do in the hyrise parser)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just about to change that. I guess Expr::isLiteral() should return true for interval literals, as well? And is there a reason not to include date literals here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, I added isType(kExprLiteralDate) || isType(kExprLiteralInterval) to Expr::isLiteral().
Arrg, the interval syntax is complex and differs depending on the specific database system: Further, note, it is |
This ambiguation works now. |
mweisgut
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to the above comments, it would be helpful if the description of this PR summarizes the different ways to define the intervals, i.e., 'n' day, '1 day', and '5 days'.
| !SELECT * FROM t WHERE a = DATE '2000-01-01' + INTERVAL '30' DAYS; | ||
| !SELECT * FROM t WHERE a = DATE '2000-01-01' + x DAYS; | ||
| !SELECT * FROM t WHERE a = DATE '2000-01-01' + INTERVAL 'x' DAY; | ||
| !SELECT * FROM t WHERE a = DATE '2000-01-01' + INTERVAL '3.3 DAYS'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if the two following statements should be correctly parsed. With this PR, they are successfully parsed. However, they do not make semantic sense, do they?
SELECT * FROM t WHERE a BETWEEN 3 DAYS AND 3 DAYS;
SELECT * FROM t WHERE a BETWEEN INTERVAL '3 DAYS' AND INTERVAL '3 DAYS';
Based on briefly checking the SQL specification, I think it should be parsable with the following rules, starting from a between predicate:
- between predicate
- row value predicand
- row value constructor predicand
- common value expression
- interval value expression
- ...
So the question here would be whether we should let parsing those statements fail or if further sanity checks in Hyrise should reject such statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea was to pass such checks to Hyrise. Also ensuring that intervals may only be input to '+' or '-' operations etc. would be out of the parser's scope for now. Regarding semantic sense, DB Fiddle suggests that Postgres has no problem with intervals in BETWEEN statements, e.g., SELECT INTERVAL '2 days' BETWEEN INTERVAL '1 day' AND INTERVAL '3 days'evaluates to 'true'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. @klauck and I also discussed that and we also think that such semantic checks should be part of Hyrise.
https://sqliteonline.com/ also suggests that Postgres returns true for SELECT INTERVAL '2 days' BETWEEN INTERVAL '1 day' AND INTERVAL '3 days'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@klauck we talked about introducing the data type INTERVAL [1 - 3]. However, I think it is out of this PR's scope.
[1] https://www.postgresql.org/docs/current/sql-createtable.html
[2] https://www.oracletutorial.com/oracle-basics/oracle-interval/
[3] https://www.ibm.com/docs/en/informix-servers/12.10?topic=types-sql-datetime-interval-data
src/sql/Expr.cpp
Outdated
|
|
||
| Expr* Expr::makeIntervalLiteral(int64_t duration, DatetimeField unit) { | ||
| Expr* e = new Expr(kExprLiteralInterval); | ||
| e->name = strdup("INTERVAL"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about the use of the name field here (and suggest removing the line (and also in makeExtract and makeCast)). I think the purpose of the name field was storing actual string values, which may differ among different Expression instances (e.g. different string literals) and not the name of the expression type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, all changes look fine for me. Thanks for your revision.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The three name fields are now removed, tests + print methods are adjusted accordingly.
This PR adds SQL date intervals, i.e., supports statements such the following:
This is achieved by a new type of literal/
Expr, with typekExpressionLiteralInterval. It is returned as second operand of the binary plus (or minus, respectivley) expression, where the start date ist the first one. Only checks if duration is a positive integer and the time unit is any of second(s), minute(s), hour(s), day(s), month(s), year(s). Any further sanity checks are in response of the consuming system.closes #184
closes #178