
Sample Queries

hbutani edited this page Jul 30, 2015 · 11 revisions

These queries run against the denormalized (flattened) TPCH sales star schema described in the Simple Example.

Basic Aggregation

SQL

select l_returnflag, l_linestatus, count(*),
       sum(l_extendedprice) as s, max(ps_supplycost) as m, avg(ps_availqty) as a, count(distinct o_orderkey)
from orderLineItemPartSupplier
group by l_returnflag, l_linestatus
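
To see what this query computes without a Spark or Druid setup, here is a minimal sketch that runs the same statement against an in-memory SQLite table. The table holds only the columns the query touches, the four rows are made-up illustration data (not TPCH output), and the trailing order by is added here only to make the result order deterministic. It illustrates the SQL semantics only and says nothing about how the Druid-backed plan executes.

```python
import sqlite3

# Toy stand-in for the flattened table: only the columns the query uses.
# The rows below are invented for illustration, not real TPCH data.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table orderLineItemPartSupplier (
        o_orderkey integer, l_returnflag text, l_linestatus text,
        l_extendedprice real, ps_supplycost real, ps_availqty integer)
""")
rows = [
    (1, "N", "O", 100.0, 5.0, 10),
    (1, "N", "O", 200.0, 7.0, 20),
    (2, "N", "O", 50.0, 3.0, 30),
    (3, "R", "F", 80.0, 9.0, 40),
]
conn.executemany(
    "insert into orderLineItemPartSupplier values (?, ?, ?, ?, ?, ?)", rows
)

# Same aggregation as the sample query; "order by" is an addition
# so that the printed rows come back in a stable order.
query = """
    select l_returnflag, l_linestatus, count(*),
           sum(l_extendedprice) as s, max(ps_supplycost) as m,
           avg(ps_availqty) as a, count(distinct o_orderkey)
    from orderLineItemPartSupplier
    group by l_returnflag, l_linestatus
    order by l_returnflag, l_linestatus
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this toy data the N/O group covers three line items from two distinct orders, so count(*) is 3 while count(distinct o_orderkey) is 2.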

Logical Plan

Aggregate [l_returnflag#69,l_linestatus#70], [l_returnflag#69,l_linestatus#70,COUNT(1) AS c2#109L,SUM(l_extendedprice#66) AS s#106,MAX(ps_supplycost#81) AS m#107,AVG(CAST(ps_availqty#80, LongType)) AS a#108,COUNT(DISTINCT o_orderkey#53) AS c6#110L]
 Project [l_extendedprice#66,o_orderkey#53,ps_supplycost#81,l_returnflag#69,l_linestatus#70,ps_availqty#80]
  Relation[o_orderkey#53,o_custkey#54,o_orderstatus#55,...

Physical Plan

Project [l_returnflag#69,l_linestatus#70,alias-1#112L AS c2#109L,alias-2#111 AS s#106,alias-3#115 AS m#107,alias-4#113 AS a#108,alias-7#114L AS c6#110L]
 PhysicalRDD [alias-2#111,alias-3#115,alias-7#114L,alias-4#113,l_returnflag#69,l_linestatus#70,alias-1#112L], DruidRDD[2] at RDD at DruidRDD.scala:16
