Pinot Query Language
Similar to SQL
The Pinot Query Language (PQL) is very similar to standard SQL:
SELECT COUNT(*) FROM myTable
SELECT COUNT(*), MAX(foo), SUM(bar) FROM myTable
Grouping on Aggregation
SELECT MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM myTable GROUP BY bar, baz TOP 50
SELECT COUNT(*) FROM myTable WHERE foo = 'foo' AND bar BETWEEN 1 AND 20 OR (baz < 42 AND quux IN ('hello', 'goodbye') AND quuux NOT IN (42, 69))
SELECT * FROM myTable WHERE quux < 5 LIMIT 50
Ordering on Selection
SELECT foo, bar FROM myTable WHERE baz > 20 ORDER BY bar DESC LIMIT 100
Pagination on Selection
Note: results may be inconsistent between pages if the column used for ordering has the same value in multiple rows.
SELECT foo, bar FROM myTable WHERE baz > 20 ORDER BY bar DESC LIMIT 50, 100
Differences with SQL
JOIN is not supported
PQL does not support JOIN.
Use TOP instead of LIMIT for truncation
LIMIT n has no effect in grouping queries; use TOP n instead. If TOP n is not specified, PQL uses TOP 10 as the default truncation setting.
No need to select the columns to group with
The following two queries are both supported in PQL. In the second query, the non-aggregation columns in the select list are ignored.
SELECT MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable GROUP BY bar, baz TOP 50
SELECT bar, baz, MIN(foo), MAX(foo), SUM(foo), AVG(foo) FROM mytable GROUP BY bar, baz TOP 50
Always order by the aggregated value
Results are always ordered by the aggregated value itself.
Results equivalent to grouping on each aggregation
The results for the query:
SELECT MIN(foo), MAX(foo) FROM myTable GROUP BY bar TOP 50
will be the same as the combined results of the following queries:
SELECT MIN(foo) FROM myTable GROUP BY bar TOP 50
SELECT MAX(foo) FROM myTable GROUP BY bar TOP 50
The results for the same group are not put together in one row: each aggregation returns its own list of groups.
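As a rough illustration of this behavior, the Python sketch below models a grouping query with several aggregations as one independent top-n list per aggregation. It is not Pinot code: the function name top_n_per_aggregation, the sample rows, and the in-memory grouping are assumptions made for illustration. Only the default of 10, the ordering by aggregated value, and the separate result list per aggregation are taken from the text above.

```python
from collections import defaultdict

def top_n_per_aggregation(rows, group_col, value_col, aggregations, n=10):
    """Hypothetical model: return one independent top-n list per aggregation."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_col]].append(row[value_col])
    results = {}
    for name, func in aggregations.items():
        aggregated = [(key, func(values)) for key, values in groups.items()]
        # Each list is ordered by its own aggregated value, largest first.
        aggregated.sort(key=lambda pair: pair[1], reverse=True)
        results[name] = aggregated[:n]
    return results

# Sample data (made up): column bar is the group key, foo is aggregated.
rows = [
    {"bar": "a", "foo": 1}, {"bar": "a", "foo": 9},
    {"bar": "b", "foo": 4}, {"bar": "b", "foo": 5},
    {"bar": "c", "foo": 7}, {"bar": "c", "foo": 8},
]
result = top_n_per_aggregation(rows, "bar", "foo", {"MIN": min, "MAX": max}, n=2)
print(result["MIN"])  # groups c and b
print(result["MAX"])  # groups a and c
```

With n=2, the MIN list and the MAX list contain different groups, which is why the results for one group cannot simply be read side by side.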
The select statement is as follows:
SELECT <outputColumn> (, outputColumn)* FROM <tableName> (WHERE ... | GROUP BY ... | ORDER BY ... | TOP ... | LIMIT ...)*
outputColumn can be * to project all columns, column names (such as baz), or aggregation functions (such as MAX(foo)).
Supported aggregations on single-value columns
Supported aggregations on multi-value columns
Supported predicates are comparisons with a constant using the standard SQL operators (such as =, <, >, <> and !=), range comparisons using BETWEEN (foo BETWEEN 42 AND 69), set membership (foo IN (1, 2, 4, 8)) and exclusion (foo NOT IN (1, 2, 4, 8)). For BETWEEN, the range is inclusive.
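To make the inclusive BETWEEN range concrete, here is a small Python sketch of the range, membership, and exclusion predicates. The helper names between, in_set, and not_in_set are hypothetical. They model the semantics described above and are not part of Pinot.

```python
def between(value, low, high):
    # BETWEEN is inclusive on both ends.
    return low <= value <= high

def in_set(value, *candidates):
    # Set membership: foo IN (1, 2, 4, 8)
    return value in candidates

def not_in_set(value, *candidates):
    # Exclusion: foo NOT IN (1, 2, 4, 8)
    return value not in candidates

foo_values = [41, 42, 50, 69, 70]
# Models: foo BETWEEN 42 AND 69
matching = [v for v in foo_values if between(v, 42, 69)]
print(matching)
```

Both boundary values, 42 and 69, pass the range check, while 41 and 70 do not.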
GROUP BY clause groups aggregation results by a list of columns.
ORDER BY clause orders selection results by a list of columns. PQL supports descending (DESC) and ascending (ASC) ordering.
TOP n clause causes the n largest group results to be returned. If not specified, the top 10 groups are returned.
LIMIT n clause causes the selection results to contain at most n results.
LIMIT a, b clause paginates the selection results, starting from the a-th result and returning at most b results.
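The pagination form can be pictured as a sort followed by a slice. The sketch below is a minimal Python model, not Pinot's implementation. The function select_page and the sample rows are hypothetical, and it assumes that the first number is a zero-based offset (the number of rows to skip), which the text above does not state explicitly.

```python
def select_page(rows, order_col, offset, size, descending=True):
    """Hypothetical model of ORDER BY <order_col> DESC LIMIT offset, size."""
    ordered = sorted(rows, key=lambda row: row[order_col], reverse=descending)
    # Assumption: offset counts the rows to skip, starting from zero.
    return ordered[offset:offset + size]

# Sample data (made up): ten rows with distinct values in column bar.
rows = [{"foo": i, "bar": i * 10} for i in range(10)]
page = select_page(rows, "bar", offset=2, size=3)
last_page = select_page(rows, "bar", offset=8, size=5)
print([row["foo"] for row in page])
print(len(last_page))  # fewer than 5 rows remain
```

The sample rows have distinct values in the ordering column, so the pages are stable. With duplicate values, a real query may return tied rows in a different order on each call, which is the inconsistency the pagination note warns about.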
Transform Function in Aggregation and Grouping
In aggregation and grouping, each column can be transformed from one or multiple columns.
For example, the following query calculates the maximum value of column foo divided by column bar, grouping on the column time converted from time unit MILLISECONDS to SECONDS:
SELECT MAX(DIV(foo, bar)) FROM myTable GROUP BY TIMECONVERT(time, 'MILLISECONDS', 'SECONDS')
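To show what such a query computes, the following Python sketch evaluates the same logic over a few in-memory rows. It is only a model under stated assumptions: the function name and sample rows are hypothetical, DIV is treated as floating-point division, and the millisecond-to-second conversion is assumed to truncate (equivalent to integer division for non-negative timestamps).

```python
def max_ratio_per_second(rows):
    """Hypothetical model of MAX(DIV(foo, bar)) grouped by time in seconds."""
    best = {}
    for row in rows:
        # Assumption: MILLISECONDS -> SECONDS truncates; timestamps are >= 0.
        second = row["time"] // 1000
        # Assumption: DIV is floating-point division.
        ratio = row["foo"] / row["bar"]
        if second not in best or ratio > best[second]:
            best[second] = ratio
    return best

# Sample data (made up): time is in milliseconds.
rows = [
    {"time": 1000, "foo": 10, "bar": 2},
    {"time": 1999, "foo": 9, "bar": 3},
    {"time": 2500, "foo": 8, "bar": 4},
]
result = max_ratio_per_second(rows)
print(result)
```

The first two rows fall into the same one-second group, so only the larger of their two ratios is kept.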
Supported transform functions
ADD: sum of at least two values
SUB: difference between two values
MULT: product of at least two values
DIV: quotient of two values
TIMECONVERT: takes 3 arguments, converts the value into another time unit. E.g. TIMECONVERT(time, 'MILLISECONDS', 'SECONDS')
DATETIMECONVERT: takes 4 arguments, converts the value into another date time format, and buckets time based on the given time granularity. E.g. DATETIMECONVERT(date, '1:MILLISECONDS:EPOCH', '1:SECONDS:EPOCH', '15:MINUTES')
VALUEIN: takes at least 2 arguments, where the first argument is a multi-valued column and the remaining arguments are constant values. The transform function keeps only the values of the multi-valued column that match the given constants. The VALUEIN transform function is especially useful when the same multi-valued column is used for both filtering and grouping. E.g. VALUEIN(mvColumn, 3, 5, 15)
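The two less obvious transforms, DATETIMECONVERT and VALUEIN, can be sketched in Python as follows. This is an illustrative model, not Pinot's implementation. The function names and sample data are hypothetical, and it assumes that bucketing rounds each timestamp down to the start of its bucket and that the output is expressed in seconds since epoch.

```python
from collections import Counter

def bucket_millis_to_seconds(millis, bucket_minutes=15):
    """Hypothetical model: epoch millis -> epoch seconds, rounded down to a bucket."""
    bucket_ms = bucket_minutes * 60 * 1000
    # Assumption: values are rounded down to the start of their bucket.
    return (millis // bucket_ms) * bucket_ms // 1000

def value_in(mv_values, *allowed):
    """Hypothetical model of VALUEIN: keep only the listed constant values."""
    return [v for v in mv_values if v in allowed]

# 1,000,000 ms lies in the second 15-minute bucket, which starts at 900 s.
bucketed = bucket_millis_to_seconds(1_000_000)

# Sample data (made up): each inner list is one row's multi-valued column.
mv_rows = [[1, 3, 5], [3, 20], [15, 42]]
# Grouping on the filtered values counts only 3, 5 and 15.
counts = Counter(v for mv in mv_rows for v in value_in(mv, 3, 5, 15))
print(bucketed, dict(counts))
```

In this model, grouping on the raw multi-valued column would also produce groups for 1, 20, and 42, because those values share a row with a matching value. Applying value_in before grouping leaves only the requested values.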