Query statements scan the database with related tables, and return the private computed result. This document describes the syntax for SQL queries accepted in DPSQL.
query_statement: select: SELECT < aggregation function > < FROM {tables, ...} > { JOIN < tables > on < bool_expression >} [ WHERE { bool_expression, ...} ] [ GROUP BY { table.field / expression, ... } ]
Notation rules
- Square brackets
[ ]
indicate optional expressions with 0 or 1 appearance. - round brackets
( )
indicate optional expressions with 0 or multiple appearances. - curly brackets
{ }
indicate required expressions with 1 or multiple appearances. - angle brackets
< >
indicate required expression with exact one appearance.
We have three supported aggregation functions in SQL syntax
-
count(*)
Count aggregation function typically can be used for get the number of record in a table.
Note that distinct count is not supported in this version.
Example:
Select count(*) from lineitem;
-
sum(expression)
Sum aggregation function typically can be used for get the sum of an expression involved some numerical columns in the schema. Example:
Select sum(l_quantity) from lineitem;
-
max(expression, index)
Max aggregation function typically can be used for get the k-th largest value of an expression involved some numerical columns in the schema.
Value of the index should be an integer, and greater or equal to 0. While 0 index is used for minimum value, natural number is used for the k-largest value. Example:
Select max(l_quantity, 10) from lineitem; \\ get 10-th largest value
Select max(l_quantity, 0) from lineitem; \\ get the minimum value
Select max(l_quantity, -10) from lineitem; \\ INVALID
After the From keyword, there should be a list of table names separated by comma ,
.
Example:
from lineitem, orders
Users can choose either explicit join keyword, or implicit join in the SQL syntax.
We only consider inner equal-join in this version for now.
Example:
join orders on o_orderkey = l_orderkey
Following the Where keyword, there should be a list of
boolean expressions concatenated by logical operation and
or or
.
Boolean expression can be used for two intentions:
- Join condition for implicit join .
- Filter for record.
Example:
WHERE lineitem.l_orderkey = orders.o_orderkey
AND l_shipdate > CAST('1994-01-01' AS date)
Following the Group By keyword, there should ba a list of table
columns or expressions that can be a grouping attribute seperated by comma ,
.
Example:
group by n_name, extract(year FROM l_shipdate)
- All the table column in the input query should not be ambiguous.
- All the grouping attribute in the group by clause will be considered as public.