In [1]:
#;.pykx.disableJupyter()

In [2]:
# https://code.kx.com/pykx/3.0/examples/jupyter-integration.html#q-first-mode
import pykx as kx
kx.util.jupyter_qfirst_enable()

PyKX now running in 'jupyter_qfirst' mode. All cells by default will be run as q code. 
Include '%%py' at the beginning of each cell to run as python code. 


##### Initialization

In [3]:
/insert in init.q 
\l buildtaq.q
\l ./db/taq

Creating local partitioned database in ./db for use within the current section.
Finished database creation.


# Queries - qSQL 
##### Learning Objectives

To understand:
* How to construct a qSQL query
* The four different qSQL queries - `select`,`exec`,`update` and `delete`
* Building queries with constraints
* Building queries with aggregations
* Building queries with grouping
* Updating existing data
* Deleting existing data
* Using `fby` 

# Introduction 

The most common method of table querying and manipulation is qSQL, an SQL-like syntax built into the q language.

There are four fundamental actions qSQL allows us to use with a table:
* [`select`](https://code.kx.com/q/ref/select/) - choose data from a table
* [`exec`](https://code.kx.com/q/ref/exec/) - return data from a table, in a non-table format
* [`update`](https://code.kx.com/q/ref/update/) - perform some modification on a table
* [`delete`](https://code.kx.com/q/ref/delete/) - remove data from a table 


## Data

The tables used throughout this notebook comprise some [partitioned](https://code.kx.com/q4m3/14_Introduction_to_Kdb%2B/#14634-partitioned-tables) tables (<code>\`trade</code>, <code>\`quote</code> and <code>\`nbbo</code>) and some [flat](https://www.tutorialspoint.com/kdbplus/q_tables_on_disk.htm) tables (<code>\`daily</code>, <code>\`depth</code> and <code>\`mas</code>), all stored locally within this Queries module in the folder `db/taq`.

In [4]:
tables[]

`daily`depth`mas`nbbo`quote`td`trade


In [5]:
tables[]! count each value each tables[]          //A quick shortcut to see each table and the associated table counts 

daily| 330
depth| 1000
mas  | 15
nbbo | 9807714
quote| 16336312
td   | 330
trade| 3268145


Let's look at the schema of the `trade` and `daily` tables:

In [8]:
meta trade
meta daily
cols trade

c    | t f a
-----| -----
date | d    
sym  | s   p
time | t    
price| f    
size | j    
stop | b    
cond | c    
ex   | c    
c    | t f a
-----| -----
date | d    
sym  | s    
open | f    
high | f    
low  | f    
close| f    
price| f    
size | j    
`date`sym`time`price`size`stop`cond`ex


#  Choosing data from a table - `select` 

The qSQL `select` statement can be used to return data from a table, select particular columns, aggregate and/or filter data where necessary.

## Syntax

The `select` template has the following form:

    select <return columns> by <grouping columns> from <table> where <filter conditions>

The most basic qSQL `select` statement is the below:

In [12]:
select from daily       //returns all the records in the daily table
daily~select from daily //this is the same as calling the table as a variable 

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07 532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07 530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170      531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07 539534
1b
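
To see all four parts of the template working together, here is a minimal sketch on a small in-memory table. The table `t` and the column names `avgPrice` and `total` are hypothetical and are not part of the taq database.

```q
// hypothetical in-memory table, used only for illustration
t:([] sym:`a`b`a`b`a; price:10 20 30 40 50f; size:1 2 3 4 5)

// return columns: avgPrice, total | grouping: sym | table: t | filter: size>1
select avgPrice:avg price, total:sum size by sym from t where size>1
// a -> avgPrice 40, total 8
// b -> avgPrice 30, total 6
```

The filter is applied first, so the row with `size` 1 never reaches the aggregations.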


## Virtual column `i`

In addition to existing and computed columns, every table has a virtual column `i` that holds each record's index (row number). We call it virtual because it does not appear in the `meta` of the table, yet it can be used like any other column. On a partitioned table such as `trade`, `i` is the index within each date partition rather than across the whole table.

In [13]:
select i from trade      

x 
--
0 
1 
2 
3 
4 
5 
6 
7 
8 
9 
10
11
12
13
14
15
16
17
18
19
..
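
The virtual column is most useful inside a `where` clause, or for recovering the original position of filtered rows. A small sketch, assuming a hypothetical in-memory table `t`:

```q
// hypothetical in-memory table, used only for illustration
t:([] sym:`a`b`c`d; price:10 20 30 40f)

select from t where i in 0 2          // rows 0 and 2 (sym a and c)
select i, sym from t where price>15   // i comes back as column x: 1 2 3
```

As in the output above, a bare `i` in the select clause is returned under the default column name `x`.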


## Queries with specified return - the `select` clause


We can use `select` to return a subset of the columns within a table, or to create new columns. 

In [14]:
select date, sym, open, size from daily //selecting a subset of columns 

date       sym  open  size  
----------------------------
2020.01.02 AAPL 83.88 536408
2020.01.02 AIG  26.97 532160
2020.01.02 AMD  33.01 530579
2020.01.02 DELL 12    531534
2020.01.02 DOW  19.99 539534
2020.01.02 GOOG 71.97 535376
2020.01.02 HPQ  35.98 524307
2020.01.02 IBM  42    531573
2020.01.02 INTC 50.98 538767
2020.01.02 MSFT 29    545950
2020.01.02 ORCL 35    533825
2020.01.02 PEP  22    525361
2020.01.02 PRU  59.04 543107
2020.01.02 SBUX 62.88 530383
2020.01.02 TXN  17.99 529999
2020.01.03 AAPL 86.14 554761
2020.01.03 AIG  29    559732
2020.01.03 AMD  33.9  552365
2020.01.03 DELL 12.07 543730
2020.01.03 DOW  20.45 552440
..


We can use assignment within our statement to rename the resultant columns too: 

In [15]:
//we can pick and choose which to rename
select dt: date, stock:sym, open, sz: size from daily 

dt         stock open  sz    
-----------------------------
2020.01.02 AAPL  83.88 536408
2020.01.02 AIG   26.97 532160
2020.01.02 AMD   33.01 530579
2020.01.02 DELL  12    531534
2020.01.02 DOW   19.99 539534
2020.01.02 GOOG  71.97 535376
2020.01.02 HPQ   35.98 524307
2020.01.02 IBM   42    531573
2020.01.02 INTC  50.98 538767
2020.01.02 MSFT  29    545950
2020.01.02 ORCL  35    533825
2020.01.02 PEP   22    525361
2020.01.02 PRU   59.04 543107
2020.01.02 SBUX  62.88 530383
2020.01.02 TXN   17.99 529999
2020.01.03 AAPL  86.14 554761
2020.01.03 AIG   29    559732
2020.01.03 AMD   33.9  552365
2020.01.03 DELL  12.07 543730
2020.01.03 DOW   20.45 552440
..


We can also create new columns on the fly, e.g. a new column called `mid`, which is the midpoint of our `high` and `low` prices:

In [16]:
select date, sym, high, low, mid: 0.5*high+low from daily

date       sym  high  low   mid   
----------------------------------
2020.01.02 AAPL 87.45 78.69 83.07 
2020.01.02 AIG  29.85 26.36 28.105
2020.01.02 AMD  34.92 31.3  33.11 
2020.01.02 DELL 13.56 11.52 12.54 
2020.01.02 DOW  21.17 19.49 20.33 
2020.01.02 GOOG 73.97 67.89 70.93 
2020.01.02 HPQ  36.91 31.88 34.395
2020.01.02 IBM  44.65 40.89 42.77 
2020.01.02 INTC 51.32 45.97 48.645
2020.01.02 MSFT 32.27 28.32 30.295
2020.01.02 ORCL 35.94 32.6  34.27 
2020.01.02 PEP  23.05 21.73 22.39 
2020.01.02 PRU  60.44 56.33 58.385
2020.01.02 SBUX 67.69 60.88 64.285
2020.01.02 TXN  19.08 17.13 18.105
2020.01.03 AAPL 93.72 83.81 88.765
2020.01.03 AIG  31.61 27.77 29.69 
2020.01.03 AMD  39.34 33.13 36.235
2020.01.03 DELL 12.89 11.59 12.24 
2020.01.03 DOW  21.87 19.96 20.915
..
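
Note that `0.5*high+low` gives the midpoint without brackets because q evaluates expressions from right to left, with no operator precedence. The sketch below uses two standalone variables (taken from the first AAPL row above) to show the difference:

```q
high:87.45; low:78.69

0.5*high+low      // 83.07   - evaluated as 0.5*(high+low)
(0.5*high)+low    // 122.415 - brackets force the multiplication first
```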


<img src="../qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:2px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> The newly created column can't be referenced later within the same query as the column does not actually exist until the final result table is returned.</i></p>

In [None]:
//example - this will error with 'mid as kdb+/q doesn't know what this is yet
select date, sym, high, low, mid: 0.5*high+low, mid+high from daily
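
One way around this, sketched here on a hypothetical two-row table `d`, is to create the column in an inner query and reference it from an outer one:

```q
// hypothetical in-memory table, used only for illustration
d:([] high:10 20f; low:6 10f)

// the inner update creates mid; the outer select can then reference it
select high, low, mid, midPlusHigh:mid+high from (update mid:0.5*high+low from d)
// mid is 8 15, midPlusHigh is 18 35
```

Repeating the expression (`0.5*high+low`) in the second column would work too; nesting just avoids writing it twice.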

Creating a column doesn't mean that it permanently exists in the table. From the below query, we can see that our new column `mid` doesn't remain in our `daily` table. 

In [17]:
daily 
meta daily

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07 532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07 530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170      531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07 539534
2020.01.02 GOOG 71.97 73.97 67.89 71.01 3.800257e+07 535376
2020.01.02 HPQ  35.98 36.91 31.88 34.6  1.804498e+07 524307
2020.01.02 IBM  42    44.65 40.89 41.44 2.267041e+07 531573
2020.01.02 INTC 50.98 51.32 45.97 49.46 2.638187e+07 538767
2020.01.02 MSFT 29    32.27 28.32 28.74 1.650542e+07 545950
2020.01.02 ORCL 35    35.94 32.6  34.83 1.823452e+07 533825
2020.01.02 PEP  22    23.05 21.73 22.47 1.176689e+07 525361
2020.01.02 PRU  59.04 60.44 56.33 59.61 3.170256e+07 543107
2020.01.02 SBUX 62.88 67.69 60.88 63.01 3.407796e+07 530383
2020.01.02 TXN  17.99 19.08 17.13 18.07 

If we want to keep the result, we can assign it to a new variable:

In [18]:
daily2:select date, sym, high, low, mid: 0.5*high+low from daily
daily2

date       sym  high  low   mid   
----------------------------------
2020.01.02 AAPL 87.45 78.69 83.07 
2020.01.02 AIG  29.85 26.36 28.105
2020.01.02 AMD  34.92 31.3  33.11 
2020.01.02 DELL 13.56 11.52 12.54 
2020.01.02 DOW  21.17 19.49 20.33 
2020.01.02 GOOG 73.97 67.89 70.93 
2020.01.02 HPQ  36.91 31.88 34.395
2020.01.02 IBM  44.65 40.89 42.77 
2020.01.02 INTC 51.32 45.97 48.645
2020.01.02 MSFT 32.27 28.32 30.295
2020.01.02 ORCL 35.94 32.6  34.27 
2020.01.02 PEP  23.05 21.73 22.39 
2020.01.02 PRU  60.44 56.33 58.385
2020.01.02 SBUX 67.69 60.88 64.285
2020.01.02 TXN  19.08 17.13 18.105
2020.01.03 AAPL 93.72 83.81 88.765
2020.01.03 AIG  31.61 27.77 29.69 
2020.01.03 AMD  39.34 33.13 36.235
2020.01.03 DELL 12.89 11.59 12.24 
2020.01.03 DOW  21.87 19.96 20.915
..


##### Exercise

Extract the `sym`, `close` and `size` columns from our `daily` table. 

In [None]:
select sym, close, size from daily

In [19]:
//your answer here 
select sym, close, size from daily

sym  close size  
-----------------
AAPL 86.22 536408
AIG  29.01 532160
AMD  33.94 530579
DELL 12.07 531534
DOW  20.45 539534
GOOG 71.01 535376
HPQ  34.6  524307
IBM  41.44 531573
INTC 49.46 538767
MSFT 28.74 545950
ORCL 34.83 533825
PEP  22.47 525361
PRU  59.61 543107
SBUX 63.01 530383
TXN  18.07 529999
AAPL 87.95 554761
AIG  30.14 559732
AMD  34.37 552365
DELL 12.01 543730
DOW  20.47 552440
..


##### Exercise 
Extract the same columns, but this time add a new boolean column called `Asym` which is true when the sym starts with an `"A"` and false otherwise. Assign this output to a new table `aDaily`.

In [None]:
aDaily:select sym, close, size, Asym:sym like "A*" from daily //we can evaluate any q expressions we like here!
aDaily

In [21]:
//your answer here 
aDaily:select sym, close, size, Asym:sym like "A*" from daily

### Querying with aggregations 
The columns of a table are lists, and we can perform aggregations and other functions or analytics using them like we can any list. 

In [22]:
select sum size,sum price from trade

size      price       
----------------------
178085211 1.336482e+08


##### Exercise 

Return the maximum price and average trade size from the `trade` table.

In [None]:
select max price, avg size from trade

In [24]:
//your answer here 
select max price, avg size from trade

price size    
--------------
93.94 54.49122


## Queries with constraints - the `where` clause

The `where` clause in qSQL allows us to specify conditions and filter our data accordingly. 

Suppose we want to select only the records associated with Apple. We can add this as a condition using the `where` clause:

In [25]:
select from daily where sym=`AAPL

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 4.452568e+07 536408
2020.01.03 AAPL 86.14 93.72 83.81 87.95 4.93008e+07  554761
2020.01.06 AAPL 88.01 90.8  83.48 85.78 4.673882e+07 535121
2020.01.07 AAPL 85.83 87.36 81.05 85.12 4.353936e+07 518017
2020.01.08 AAPL 85.07 88.32 79.7  84.95 4.499806e+07 534561
2020.01.09 AAPL 84.97 89.93 84.12 86.15 4.491606e+07 515401
2020.01.10 AAPL 86.1  87.79 79.27 85.19 4.23042e+07  509741
2020.01.13 AAPL 85.14 87.34 81.84 84.93 4.545826e+07 538500
2020.01.14 AAPL 84.99 89.29 81.58 82.6  4.513405e+07 528053
2020.01.15 AAPL 82.63 86.29 79.1  83.24 4.353784e+07 527158
2020.01.16 AAPL 83.15 87.93 80.98 84.08 4.579812e+07 542165
2020.01.17 AAPL 84.05 84.56 77.99 82.89 4.434635e+07 544940
2020.01.20 AAPL 82.95 89.58 81.32 86.78 4.532787e+07 528665
2020.01.21 AAPL 86.81 90.67 83.75 84.03 4.581064e+07 523532
2020.01.22 AAPL 84.17 87.12 81.27 82.7  

The `where` clause can contain any number of constraints, separated by commas:

In [26]:
select from trade where sym =`AAPL, size > 70, date = 2020.01.02  //looking at our bigger trade table now 

date       sym  time         price size stop cond ex
----------------------------------------------------
2020.01.02 AAPL 09:30:00.025 83.87 74   0    J    N 
2020.01.02 AAPL 09:30:00.031 83.87 81   0    K    N 
2020.01.02 AAPL 09:30:00.594 83.85 72   0    A    N 
2020.01.02 AAPL 09:30:01.039 83.67 91   0    9    N 
2020.01.02 AAPL 09:30:01.138 83.66 99   0    G    N 
2020.01.02 AAPL 09:30:01.385 83.43 76   0    O    N 
2020.01.02 AAPL 09:30:01.493 83.37 80   0    B    N 
2020.01.02 AAPL 09:30:02.003 83.17 90   0    C    N 
2020.01.02 AAPL 09:30:02.228 83.2  92   0    R    N 
2020.01.02 AAPL 09:30:04.188 83.07 80   0    N    N 
2020.01.02 AAPL 09:30:04.725 82.83 89   0    T    N 
2020.01.02 AAPL 09:30:04.849 82.83 86   0    W    N 
2020.01.02 AAPL 09:30:06.635 82.61 88   0    J    N 
2020.01.02 AAPL 09:30:07.377 82.68 86   0    C    N 
2020.01.02 AAPL 09:30:07.844 82.63 94   0    G    N 
2020.01.02 AAPL 09:30:08.253 82.61 74   0    9    N 
2020.01.02 AAPL 09:30:09.069 82.8  88   0    Z

In [27]:
\t:10 select from trade where sym =`AAPL, size > 70, date = 2020.01.02 // This will take significantly more time
\t:10 select from trade where date = 2020.01.02, sym =`AAPL, size > 70 // This query is more efficient

101
6


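Constraints are applied from left to right, and each one is evaluated only on the rows that passed the constraint before it. On a partitioned table, put the partition column (`date` here) first so that only the relevant partitions are read, as the timings above show. For the same reason, always use `,` instead of `and` in the where clause: `and` evaluates both of its conditions against every row before combining them.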

In [None]:
//performance comparison using and instead of ,
\t:10 select from trade where date = 2020.01.02, sym =`AAPL, size > 70         //the "right" way
\t:10 select from trade where (date = 2020.01.02) and  sym =`AAPL, size > 70   //the "wrong" way 
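
The difference between `,` and `and` is not only about speed. Because each comma-separated constraint only sees the rows that survived the previous one, an aggregate inside a constraint can give a different answer. A minimal sketch, assuming a hypothetical in-memory table `t`:

```q
// hypothetical in-memory table, used only for illustration
t:([] sym:`a`b`a`b; price:10 50 30 40f)

select from t where sym=`a, price=max price       // 1 row: a 30 (max taken over the a rows only)
select from t where price=max price, sym=`a       // 0 rows: the overall max (50) belongs to b
select from t where (sym=`a) and price=max price  // 0 rows: both conditions see every row
```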

##### Exercise
Find all trades (using the `trade` table) associated with Dell (<code>\`DELL</code>) where the price is greater than 12.

In [None]:
select from trade where sym=`DELL,price > 12

In [28]:
//your answer here
select from trade where sym = `DELL, price>12 

date       sym  time         price size stop cond ex
----------------------------------------------------
2020.01.02 DELL 09:30:01.250 12.01 53   0    K    N 
2020.01.02 DELL 09:30:01.261 12.01 43   0    8    N 
2020.01.02 DELL 09:30:01.564 12.01 91   0    9    N 
2020.01.02 DELL 09:30:01.636 12.01 80   0    G    N 
2020.01.02 DELL 09:30:01.857 12.01 66   0    N    N 
2020.01.02 DELL 09:30:01.886 12.01 77   0    T    N 
2020.01.02 DELL 09:31:19.113 12.01 74   0         N 
2020.01.02 DELL 09:31:19.280 12.01 83   0    L    N 
2020.01.02 DELL 09:31:19.580 12.01 39   0    W    N 
2020.01.02 DELL 09:31:19.599 12.01 17   0    T    N 
2020.01.02 DELL 09:31:20.906 12.01 58   1    R    N 
2020.01.02 DELL 09:31:21.635 12.01 20   0    C    N 
2020.01.02 DELL 09:31:21.968 12.01 90   0    E    N 
2020.01.02 DELL 09:31:22.214 12.01 85   0    A    N 
2020.01.02 DELL 09:31:22.428 12.02 55   0    J    N 
2020.01.02 DELL 09:31:22.532 12.02 92   0    T    N 
2020.01.02 DELL 09:31:24.300 12.01 70   0     

##### Exercise

Write a select query using our `trade` table to find the volume-weighted average price (vwap) for the Google (<code>\`GOOG</code>) stock.

Suggested reading: [wavg](https://code.kx.com/q/ref/avg/#wavg)

In [None]:
select vwap:size wavg price from trade where sym=`GOOG

In [32]:
// Enter your qSQL code here
select vwap:size wavg price from trade where sym=`GOOG

vwap    
--------
69.07248
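
For reference, `wavg` computes the sum of weight times value, divided by the sum of the weights. The sketch below checks this on two short, made-up lists rather than on the `trade` table:

```q
size:10 20 30
price:1 2 3f

size wavg price              // 2.333333
(sum size*price)%sum size    // 2.333333 - the same calculation written out
```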


## Queries with grouping - the `by` clause

The easiest way to obtain data summarized by grouping similar values together is to use the `by` clause.

In [34]:
//select size by sym from daily 
select max size by sym from daily   //performing an aggregation on the list

sym | size  
----| ------
AAPL| 582043
AIG | 584179
AMD | 602276
DELL| 593427
DOW | 600835
GOOG| 587621
HPQ | 591491
IBM | 584613
INTC| 588676
MSFT| 593356
ORCL| 588420
PEP | 586556
PRU | 585720
SBUX| 596792
TXN | 600642


We see that the returned tables are keyed - this is often helpful for quick retrieval.

In [38]:
(select max size by sym from daily)`IBM    //getting the max size for IBM
flip select max size from daily where sym = `IBM 

size| 584613
size| 584613


We can also apply our own functions to these lists, e.g. to return the closing prices of the last 5 days:

In [39]:
last5:{-5 sublist raze x}
select last5DaysClose:last5 close by sym from daily

sym | last5DaysClose               
----| -----------------------------
AAPL| 82.63 84.32 85.67 87.88 90.95
AIG | 31.14 31.87 31.48 31.66 32.76
AMD | 40.21 43.05 43.09 45.68 43.35
DELL| 12.75 12.49 12.63 13.17 13.29
DOW | 20.17 19.69 19.64 20.25 20.36
GOOG| 68.69 68.25 69.36 69.17 71.61
HPQ | 40.72 41.15 41.66 41.59 43.73
IBM | 40.53 40.16 40.42 41.37 42.14
INTC| 53.59 52.76 52.46 54.06 55.77
MSFT| 32.15 33.07 34.15 33.75 34.23
ORCL| 41.26 42.59 42.81 42.58 42.82
PEP | 25.01 25.9  26.01 26.52 26.38
PRU | 57.68 58.5  59.03 60.37 60.31
SBUX| 57.09 57.2  58.59 57.82 56.58
TXN | 18.55 18.5  18.73 18.81 18.88


<img src="../qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:10px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> A neat overload of the <code>by</code> clause is if we don't specify any columns to be returned, we can get the last record in the table, broken down by our grouping!</i></p>

In [40]:
select by sym from daily   //very convenient for quick inspections!

sym | date       open  high  low   close price        size  
----| ------------------------------------------------------
AAPL| 2020.01.31 87.75 93.57 79.88 90.95 4.716256e+07 546683
AIG | 2020.01.31 31.65 33.55 31.11 32.76 1.828947e+07 568297
AMD | 2020.01.31 45.71 47.52 42.03 43.35 2.43365e+07  549224
DELL| 2020.01.31 13.17 13.3  12.26 13.29 6883176      541087
DOW | 2020.01.31 20.26 21.69 19.21 20.36 1.127992e+07 551702
GOOG| 2020.01.31 69.11 75.53 68.34 71.61 3.902459e+07 545816
HPQ | 2020.01.31 41.58 45.76 38.99 43.73 2.28627e+07  545763
IBM | 2020.01.31 41.41 44.84 40.71 42.14 2.325386e+07 548306
INTC| 2020.01.31 54.06 57.48 51.34 55.77 3.014401e+07 547481
MSFT| 2020.01.31 33.72 35.67 32.71 34.23 1.870844e+07 546665
ORCL| 2020.01.31 42.59 46.79 40.99 42.82 2.39701e+07  549283
PEP | 2020.01.31 26.49 27.32 24.62 26.38 1.415232e+07 552175
PRU | 2020.01.31 60.4  63.39 58.99 60.31 3.281376e+07 537421
SBUX| 2020.01.31 57.89 61.24 56.26 56.58 3.219742e+07 546509
TXN | 2020.01.31 18.82 1

##### Exercise 
Write a select statement that returns from our `trade` table the maximum and minimum prices and total number of trades (`numTrades`) broken down by `sym`.

In [46]:
select max price, min price, numTrades:count size  by sym from trade //we can count any column in our table not just i

sym | price price1 numTrades
----| ----------------------
AAPL| 93.94 77.33  217876   
AIG | 35.51 26.36  218281   
AMD | 47.52 29.83  218164   
DELL| 13.74 10.92  218326   
DOW | 22.92 17.17  218064   
GOOG| 79.74 60.46  217240   
HPQ | 45.76 31.88  217743   
IBM | 45.85 34.42  217763   
INTC| 57.48 45.5   217937   
MSFT| 36.57 26.27  217769   
ORCL| 46.79 32.6   217624   
PEP | 28.57 20.4   217831   
PRU | 66.5  51.01  217595   
SBUX| 68.16 51.26  217551   
TXN | 21.6  16.84  218381   


In [45]:
//your answer here 
//trade
select mini:min price, maxi:max price, numTrades:count sym by sym from trade

sym | mini  maxi  numTrades
----| ---------------------
AAPL| 77.33 93.94 217876   
AIG | 26.36 35.51 218281   
AMD | 29.83 47.52 218164   
DELL| 10.92 13.74 218326   
DOW | 17.17 22.92 218064   
GOOG| 60.46 79.74 217240   
HPQ | 31.88 45.76 217743   
IBM | 34.42 45.85 217763   
INTC| 45.5  57.48 217937   
MSFT| 26.27 36.57 217769   
ORCL| 32.6  46.79 217624   
PEP | 20.4  28.57 217831   
PRU | 51.01 66.5  217595   
SBUX| 51.26 68.16 217551   
TXN | 16.84 21.6  218381   


##### Exercise 
Write a select statement to recreate our `daily` table from our `trade` table. 

This has the open, high, low, close prices, a price column calculated as size x price, and size as the total traded volume for each sym on every date. Assign this value to `daily2` and verify it matches the `daily` table. 

*(Just this once we'll allow not using a where clause on a partitioned table!)*

In [None]:
//let's look first at what we're trying to reproduce
meta daily
daily

In [None]:
//so we need to recreate this - it's broken down by sym and date so they'll be our by clause 
daily2:select open:first price, high: max price, low: min price, close: last price,  //OHLC prices
            price:sum price*size, size:sum size      //total price as a cost (price*size) and total traded volume
//next our grouping clause - break down by date, then sym
        by date,sym                                  
        from trade 
//does this look the same? 
daily2

In [None]:
daily2: 0!daily2            //removing our key since daily isn't keyed
daily2~daily

In [None]:
//your answer here


### Temporal arithmetic

One of the most common uses of the `by` clause within qSQL is to return aggregations over a specified period of time.


In [48]:
trade
select trds:count i, vwap:size wavg price by sym, 15 xbar time.minute from trade where date = last date 

date       sym  time         price size stop cond ex
----------------------------------------------------
2020.01.02 AAPL 09:30:00.021 83.88 17   0    G    N 
2020.01.02 AAPL 09:30:00.025 83.87 74   0    J    N 
2020.01.02 AAPL 09:30:00.028 83.84 57   0    N    N 
2020.01.02 AAPL 09:30:00.031 83.87 81   0    K    N 
2020.01.02 AAPL 09:30:00.041 83.87 52   0    G    N 
2020.01.02 AAPL 09:30:00.147 83.83 20   0    Z    N 
2020.01.02 AAPL 09:30:00.216 83.98 67   0    8    N 
2020.01.02 AAPL 09:30:00.413 83.97 47   0    P    N 
2020.01.02 AAPL 09:30:00.439 83.95 70   0    8    N 
2020.01.02 AAPL 09:30:00.441 83.9  62   0    A    N 
2020.01.02 AAPL 09:30:00.536 83.94 18   0    G    N 
2020.01.02 AAPL 09:30:00.575 83.89 32   0    G    N 
2020.01.02 AAPL 09:30:00.594 83.85 72   0    A    N 
2020.01.02 AAPL 09:30:00.646 83.76 10   0    O    N 
2020.01.02 AAPL 09:30:00.796 83.77 25   0    P    N 
2020.01.02 AAPL 09:30:00.873 83.69 62   0    9    N 
2020.01.02 AAPL 09:30:01.039 83.67 91   0    9
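
The bucketing in the query above is done by `xbar`, which rounds each value down to the nearest multiple of its left argument. A quick sketch on made-up lists shows the effect on plain numbers and on minutes:

```q
5 xbar 0 3 5 9 10 14               // 0 0 5 5 10 10
15 xbar 09:31 09:44 09:45 10:02    // 09:30 09:30 09:45 10:00
```

Grouping by the rounded value therefore puts every trade into its 15-minute bucket.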

##### Exercise
* Show the total volume every 1.5 minutes from our trade table on the 2nd of Jan 2020
* Further break this down by sym

(Hint: the [`xbar`](https://code.kx.com/q/ref/xbar/) documentation has a domain and range mapping table at the end to help understand which types work together)

In [None]:
select sum size by `time$0D00:01:30.000 xbar `timespan$time from trade where date = 2020.01.02

In [None]:
select sum size by `time$0D00:01:30.000 xbar `timespan$time, sym  from trade where date = 2020.01.02

In [58]:
//your answer here 
select sum size by `time$0D00:01:30.000 xbar `timespan$time, sym  from trade where date = 2020.01.02

time         sym | size 
-----------------| -----
09:30:00.000 AAPL| 9648 
09:30:00.000 AIG | 11395
09:30:00.000 AMD | 12951
09:30:00.000 DELL| 12186
09:30:00.000 DOW | 10963
09:30:00.000 GOOG| 11056
09:30:00.000 HPQ | 11651
09:30:00.000 IBM | 10748
09:30:00.000 INTC| 13173
09:30:00.000 MSFT| 12721
09:30:00.000 ORCL| 13629
09:30:00.000 PEP | 12740
09:30:00.000 PRU | 12695
09:30:00.000 SBUX| 12192
09:30:00.000 TXN | 11060
09:31:30.000 AAPL| 6110 
09:31:30.000 AIG | 4378 
09:31:30.000 AMD | 5613 
09:31:30.000 DELL| 6991 
09:31:30.000 DOW | 5719 
..


##### Exercise
Use `xbar` to generate a count of the number of trades (`trade where date = last date`) in intervals of trade size (interval size 10). 

(*This is commonly used to generate a histogram of trade size distribution*) 

In [None]:
select count i by 10 xbar size from trade where date = last date 

In [59]:
//your answer here
select count i by 10 xbar size from trade where date = last date 

size| x    
----| -----
10  | 17068
20  | 16688
30  | 16595
40  | 16834
50  | 16717
60  | 16592
70  | 16973
80  | 16567
90  | 16926


# Extracting data from tables - `exec`

The qSQL `exec` statement can also be used to query tables. All `exec` statements are written with the same `by`, `from`, and `where` clauses as `select` statements. However, instead of returning only tables, `exec` statements can return a list, a dictionary, or a table, depending on the specific query. They are used primarily to extract data from the table format, or to restructure our data (see Practical Guidance for pivoting using `exec`).

If we only specify one column to be returned from our `exec` statement this is returned as a list: 

In [60]:
exec size from daily 

536408 532160 530579 531534 539534 535376 524307 531573 538767 545950 533825 ..


If we specify more than one column, a dictionary is returned, with one entry per column:

In [61]:
exec size, price from daily    //this is nice because the dictionary values are lists 

size | 536408       532160       530579       531534  539534       535376    ..
price| 4.452568e+07 1.515896e+07 1.744796e+07 6657170 1.101615e+07 3.800257e+..
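
A quick way to confirm what each form returns is to check the type of the result. The sketch below uses a hypothetical in-memory table `t`:

```q
// hypothetical in-memory table, used only for illustration
t:([] size:1 2 3; price:10 20 30f)

type select size from t        // 98h - a table
type exec size from t          // 7h  - a list of longs
type exec size, price from t   // 99h - a dictionary
```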


If we add a grouping clause we get our values broken down by that grouping:

In [62]:
// returns a dictionary mapping each sym to the list of its daily price values
exec price by sym from daily

AAPL| 4.452568e+07 4.93008e+07  4.673882e+07 4.353936e+07 4.499806e+07 4.4916..
AIG | 1.515896e+07 1.664099e+07 1.640576e+07 1.573368e+07 1.638048e+07 1.5814..
AMD | 1.744796e+07 2.020541e+07 1.776361e+07 1.720057e+07 1.773455e+07 1.6864..
DELL| 6657170      6673491      6426205      6113305      6454175      600460..
DOW | 1.101615e+07 1.146551e+07 1.138643e+07 1.060542e+07 1.111021e+07 1.0625..
GOOG| 3.800257e+07 3.944743e+07 3.765593e+07 3.654637e+07 3.796523e+07 3.7075..
HPQ | 1.804498e+07 1.957508e+07 1.839489e+07 1.945706e+07 1.938343e+07 1.9473..
IBM | 2.267041e+07 2.358848e+07 2.339335e+07 2.115495e+07 2.164019e+07 2.2247..
INTC| 2.638187e+07 2.758548e+07 2.748836e+07 2.516563e+07 2.584221e+07 2.4953..
MSFT| 1.650542e+07 1.519559e+07 1.538936e+07 1.537383e+07 1.602633e+07 1.5736..
ORCL| 1.823452e+07 1.909067e+07 2.007516e+07 1.863508e+07 1.950575e+07 1.9636..
PEP | 1.176689e+07 1.236042e+07 1.181257e+07 1.15121e+07  1.233191e+07 1.2415..
PRU | 3.170256e+07 3.393572e+07 2.984919

If we add more columns to be returned at this stage, we actually end up returning a dictionary where the keys are the broken down groupings and the value is a table with each column we selected as a column: 

In [63]:
exec 3 sublist price, 3 sublist size by sym from daily //sublisting for visibility

    | price                                  size                
----| -----------------------------------------------------------
AAPL| 4.452568e+07 4.93008e+07  4.673882e+07 536408 554761 535121
AIG | 1.515896e+07 1.664099e+07 1.640576e+07 532160 559732 544834
AMD | 1.744796e+07 2.020541e+07 1.776361e+07 530579 552365 542053
DELL| 6657170      6673491      6426205      531534 543730 546929
DOW | 1.101615e+07 1.146551e+07 1.138643e+07 539534 552440 537432
GOOG| 3.800257e+07 3.944743e+07 3.765593e+07 535376 551302 526374
HPQ | 1.804498e+07 1.957508e+07 1.839489e+07 524307 563705 528128
IBM | 2.267041e+07 2.358848e+07 2.339335e+07 531573 550613 528404
INTC| 2.638187e+07 2.758548e+07 2.748836e+07 538767 553318 542887
MSFT| 1.650542e+07 1.519559e+07 1.538936e+07 545950 544680 525033
ORCL| 1.823452e+07 1.909067e+07 2.007516e+07 533825 542955 536128
PEP | 1.176689e+07 1.236042e+07 1.181257e+07 525361 555637 536426
PRU | 3.170256e+07 3.393572e+07 2.984919e+07 543107 561256 527113
SBUX| 3.40

This is because what we are returning is a series of dictionaries for each of our groupings! 



In [64]:
exec sym from select sym from trade  //pulling the selection into memory, and then using exec 
//exec sym from trade                  //can't do this on disk - there is really a sym list for each date

`sym$`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`AAPL`A..


##### Exercise

Using the `daily` table, return the first `open` and last `close` prices for all symbols ending with "L".

Output the result as a dictionary, and also specifically as a keyed table. 

In [69]:
exec first open, last close by sym from daily where sym like "*L"  //not a keyed table, a dictionary
type exec first open, last close by sym from daily where sym like "*L"  //not a keyed table, a dictionary
//type 0! exec first open, last close by sym from daily where sym like "*L"  //can't unkey this 


    | open  close
----| -----------
AAPL| 83.88 90.95
DELL| 12    13.29
ORCL| 35    42.82
99h


In [None]:
exec first open, last close by sym:sym from daily where sym like "*L" //other column names fine too 
type exec first open, last close by sym:sym from daily where sym like "*L" //keyed table - also a dictionary 
type 0! exec first open, last close by sym:sym from daily where sym like "*L" //can unkey this because it's a table

In [70]:
//your answer here 
5#daily

exec first open, last close by sym from daily where sym like "*L"

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07 532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07 530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170      531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07 539534
    | open  close
----| -----------
AAPL| 83.88 90.95
DELL| 12    13.29
ORCL| 35    42.82


# Updating/modifying table data - `update`

The qSQL `update` statement can be used to modify existing rows or add new columns to a table. All `update` statements are written with the same `by`, `from`, and `where` clauses as `select` and `exec` statements.

Suppose we wanted to change our price to be negative for all `AAPL` stocks - we can do that using update. 

In [71]:
5 sublist daily                                          //table before modification (sublisting for visibility)
5 sublist update neg[price] from daily where sym =`AAPL  //table after we make the price negative for AAPL

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07 532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07 530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170      531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07 539534
date       sym  open  high  low   close price         size  
------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07  532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07  530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170       531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07  539534


If we wanted to persist this change, we can pass the table by reference: 

In [72]:
update neg[price] from `daily where sym =`AAPL //we are returned the table reference as output when persisting
5 sublist daily                                //confirming our change is present

daily
date       sym  open  high  low   close price         size  
------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07  532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07  530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170       531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07  539534
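
The two forms can be compared side by side on a small table. This is a sketch only; the table `t` is hypothetical:

```q
// hypothetical in-memory table, used only for illustration
t:([] sym:`a`b`a; price:1 2 3f)

update price:neg price from t where sym=`a     // returns a modified copy
t                                              // price is still 1 2 3
update price:neg price from `t where sym=`a    // returns `t and modifies t in place
t                                              // price is now -1 2 -3
```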


We can also use `update` to create new columns, and to do so on a grouped basis. For example, we can add a new column to our `daily` table showing the maximum size for each symbol:

In [73]:
show daily3:update maxTradeSize: max size by sym from daily
5 sublist select from daily3 where sym = `AAPL   //updated for all syms with their specific size max
5 sublist select from daily3 where sym = `DELL   //updated for all syms with their specific size max 

date       sym  open  high  low   close price         size   maxTradeSize
-------------------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408 582043      
2020.01.03 AAPL 86.14 93.72 83.81 87.95 -4.93008e+07  554761 582043      
2020.01.06 AAPL 88.01 90.8  83.48 85.78 -4.673882e+07 535121 582043      
2020.01.07 AAPL 85.83 87.36 81.05 85.12 -4.353936e+07 518017 582043      
2020.01.08 AAPL 85.07 88.32 79.7  84.95 -4.499806e+07 534561 582043      
date       sym  open  high  low   close price   size   maxTradeSize
-------------------------------------------------------------------
2020.01.02 DELL 12    13.56 11.52 12.07 6657170 531534 593427      
2020.01.03 DELL 12.07 12.89 11.59 12.01 6673491 543730 593427      
2020.01.06 DELL 12.01 12.39 11.07 11.66 6426205 546929 593427      
2020.01.07 DELL 11.67 12.42 11.41 11.89 6113305 512893 593427      
2020.01.08 DELL 11.92 13.06 11.54 11.58 6454175 521809 593427      
date  
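
To see how a grouped `update` differs from a grouped `select`, here is a small sketch on a hypothetical toy table `t` (the table and the column name `mx` are made up for illustration). The `update` keeps every row and repeats the group's aggregate on each one, whereas the `select` collapses to one row per group.

```q
t:([] sym:`a`a`b; size:1 3 2)        // hypothetical toy table

update mx:max size by sym from t     // keeps all 3 rows; mx is 3 3 2
select mx:max size by sym from t     // keyed result, one row per sym: a -> 3, b -> 2
```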

##### Exercise

Update the `daily` table to have a new column `mid` which is the midpoint of the high and low prices. Do this without modifying our original table.

In [None]:
update mid:0.5*high+low from daily

In [74]:
//your answer here 
update mid:0.5*high+low from daily

date       sym  open  high  low   close price         size   mid   
-------------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408 83.07 
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07  532160 28.105
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07  530579 33.11 
2020.01.02 DELL 12    13.56 11.52 12.07 6657170       531534 12.54 
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07  539534 20.33 
2020.01.02 GOOG 71.97 73.97 67.89 71.01 3.800257e+07  535376 70.93 
2020.01.02 HPQ  35.98 36.91 31.88 34.6  1.804498e+07  524307 34.395
2020.01.02 IBM  42    44.65 40.89 41.44 2.267041e+07  531573 42.77 
2020.01.02 INTC 50.98 51.32 45.97 49.46 2.638187e+07  538767 48.645
2020.01.02 MSFT 29    32.27 28.32 28.74 1.650542e+07  545950 30.295
2020.01.02 ORCL 35    35.94 32.6  34.83 1.823452e+07  533825 34.27 
2020.01.02 PEP  22    23.05 21.73 22.47 1.176689e+07  525361 22.39 
2020.01.02 PRU  59.04 60.44 56.33 59.61 3.170256

##### Exercise
Persist a change to our `daily` table so that the `price` of every `DOW` row is halved.

In [None]:
update price*0.5 from `daily where sym =`DOW 
select from daily where sym =`DOW

In [80]:
//your answer here 
5#daily
update price:price%2 from `daily where sym = `DOW

date       sym  open  high  low   close price         size  
------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07  532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07  530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170       531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 1.101615e+07  539534
daily


# Remove data from table - `delete`
The qSQL `delete` can be used to remove whole rows or whole columns from a table. A `delete` statement either names columns (to delete those columns) or uses a `where` clause (to delete the matching rows). It cannot do both, because deleting part of a column or part of a row is not supported.

In [81]:
5 sublist delete from daily where date=2020.01.02  //Table is passed by value, we are deleting rows
5 sublist daily                                    //change not persisted

date       sym  open  high  low   close price        size  
-----------------------------------------------------------
2020.01.03 AAPL 86.14 93.72 83.81 87.95 -4.93008e+07 554761
2020.01.03 AIG  29    31.61 27.77 30.14 1.664099e+07 559732
2020.01.03 AMD  33.9  39.34 33.13 34.37 2.020541e+07 552365
2020.01.03 DELL 12.07 12.89 11.59 12.01 6673491      543730
2020.01.03 DOW  20.45 21.87 19.96 20.47 5732756      552440
date       sym  open  high  low   close price         size  
------------------------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 -4.452568e+07 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 1.515896e+07  532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 1.744796e+07  530579
2020.01.02 DELL 12    13.56 11.52 12.07 6657170       531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 5508073       539534


In [82]:
delete price from `daily  //we are deleting the whole price column from our daily table, and persisting
5 sublist daily

daily
date       sym  open  high  low   close size  
----------------------------------------------
2020.01.02 AAPL 83.88 87.45 78.69 86.22 536408
2020.01.02 AIG  26.97 29.85 26.36 29.01 532160
2020.01.02 AMD  33.01 34.92 31.3  33.94 530579
2020.01.02 DELL 12    13.56 11.52 12.07 531534
2020.01.02 DOW  19.99 21.17 19.49 20.45 539534


If we try to combine the two and delete *part* of a row or column we will get an error: 

In [84]:
//delete sym from daily where date = 2020.01.02
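
If we want to see the error without halting the cell, one option is to evaluate the statement as a string under protected evaluation. This is a sketch on a hypothetical toy table `t`, which needs to be a global for `value` to find it. The handler simply prefixes whatever error message it receives, and the exact error text is deliberately not shown here.

```q
t:([] sym:`a`b; px:1 2f)                                    // hypothetical toy table
@[value; "delete px from t where sym=`a"; {"error: ",x}]    // handler receives the error message as a string
```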

##### Exercise

Delete all occurrences of `AAPL` from our `daily` table by passing the table by reference.

In [None]:
delete from `daily where sym =`AAPL
daily

In [86]:
//your answer here 
delete from `daily where sym = `AAPL

daily


# Using `fby` to avoid nested queries


The [`fby`](https://code.kx.com/q/ref/fby/) keyword, sometimes referred to as "filter by", allows us to avoid the multiple aggregation and joining steps that would usually be required in another language.

The form of fby is `(aggregation;data) fby group` where: 
* aggregation refers to a function which takes a list and returns a single atom 
* data refers to the column to which you want to apply this function 
* group refers to a column by which you want to group, or a table of multiple columns on which you want to group 
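
Outside a query, `fby` works on plain lists, which makes its behaviour easy to inspect. The sketch below first applies it to two literal lists, then uses it in a `where` clause on a hypothetical toy table `t` (made up for illustration), grouping first on one column and then on a table of two columns.

```q
(sum;1 2 3 4) fby `a`b`a`b                              // 4 6 4 6: each item replaced by its group's sum

t:([] sym:`a`a`b`b; ex:`N`N`N`O; size:5 9 4 7)          // hypothetical toy table
select from t where size=(max;size) fby sym             // largest size per sym: the rows with size 9 and 7
select from t where size=(max;size) fby ([] sym; ex)    // per sym and ex: the rows with size 9, 4 and 7
```

The result of `fby` always has the same length as the data list, which is what allows it to be compared item by item against a column.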

Returning to our example of finding all trades whose size is less than the average trade size on the exchange where they traded, we can express this as follows: 

In [87]:
select from trade where date = last date, size < (avg;size) fby ex

date       sym  time         price size stop cond ex
----------------------------------------------------
2020.01.31 AAPL 09:30:00.068 87.75 42   0    P    N 
2020.01.31 AAPL 09:30:00.084 87.73 32   0    K    N 
2020.01.31 AAPL 09:30:00.091 87.77 11   0    9    N 
2020.01.31 AAPL 09:30:00.149 87.81 27   0    W    N 
2020.01.31 AAPL 09:30:00.167 87.8  47   0    E    N 
2020.01.31 AAPL 09:30:00.273 87.94 45   0    B    N 
2020.01.31 AAPL 09:30:00.438 87.96 10   0    E    N 
2020.01.31 AAPL 09:30:00.603 87.85 23   0    B    N 
2020.01.31 AAPL 09:30:01.139 87.7  17   0    A    N 
2020.01.31 AAPL 09:30:01.400 87.68 13   0    J    N 
2020.01.31 AAPL 09:30:02.158 87.38 16   0    N    N 
2020.01.31 AAPL 09:30:03.024 87.46 46   0    W    N 
2020.01.31 AAPL 09:30:03.076 87.39 51   0    J    N 
2020.01.31 AAPL 09:30:04.268 87.19 12   0    C    N 
2020.01.31 AAPL 09:30:04.341 87.22 40   0    R    N 
2020.01.31 AAPL 09:30:04.769 87.43 26   0    J    N 
2020.01.31 AAPL 09:30:04.889 87.38 39   0    J

Compare the statement above with how the same result would be obtained using ordinary qSQL statements: we would first get the average size for each exchange, then join this data to our original table, and finally perform a new selection: 

In [88]:
//first, get the average by exchange
show resby:select exAvg:avg size by ex from trade where date = last date

ex| exAvg   
--| --------
N | 54.44547
O | 54.47614


In [89]:
//next, combine that average value with your original table using lj
show interim:(select from trade where date = last date) lj resby

date       sym  time         price size stop cond ex exAvg   
-------------------------------------------------------------
2020.01.31 AAPL 09:30:00.067 87.75 65   0    T    N  54.44547
2020.01.31 AAPL 09:30:00.068 87.75 42   0    P    N  54.44547
2020.01.31 AAPL 09:30:00.084 87.73 32   0    K    N  54.44547
2020.01.31 AAPL 09:30:00.091 87.77 11   0    9    N  54.44547
2020.01.31 AAPL 09:30:00.146 87.78 77   0    A    N  54.44547
2020.01.31 AAPL 09:30:00.149 87.81 27   0    W    N  54.44547
2020.01.31 AAPL 09:30:00.167 87.8  47   0    E    N  54.44547
2020.01.31 AAPL 09:30:00.273 87.94 45   0    B    N  54.44547
2020.01.31 AAPL 09:30:00.438 87.96 10   0    E    N  54.44547
2020.01.31 AAPL 09:30:00.603 87.85 23   0    B    N  54.44547
2020.01.31 AAPL 09:30:00.625 87.78 80   0    G    N  54.44547
2020.01.31 AAPL 09:30:00.634 87.82 80   0         N  54.44547
2020.01.31 AAPL 09:30:00.886 87.69 65   0    E    N  54.44547
2020.01.31 AAPL 09:30:00.920 87.72 99   0    C    N  54.44547
2020.01.

In [90]:
//finally, return the results from our original table that are less than the exchange average
select from interim where size < exAvg

date       sym  time         price size stop cond ex exAvg   
-------------------------------------------------------------
2020.01.31 AAPL 09:30:00.068 87.75 42   0    P    N  54.44547
2020.01.31 AAPL 09:30:00.084 87.73 32   0    K    N  54.44547
2020.01.31 AAPL 09:30:00.091 87.77 11   0    9    N  54.44547
2020.01.31 AAPL 09:30:00.149 87.81 27   0    W    N  54.44547
2020.01.31 AAPL 09:30:00.167 87.8  47   0    E    N  54.44547
2020.01.31 AAPL 09:30:00.273 87.94 45   0    B    N  54.44547
2020.01.31 AAPL 09:30:00.438 87.96 10   0    E    N  54.44547
2020.01.31 AAPL 09:30:00.603 87.85 23   0    B    N  54.44547
2020.01.31 AAPL 09:30:01.139 87.7  17   0    A    N  54.44547
2020.01.31 AAPL 09:30:01.400 87.68 13   0    J    N  54.44547
2020.01.31 AAPL 09:30:02.158 87.38 16   0    N    N  54.44547
2020.01.31 AAPL 09:30:03.024 87.46 46   0    W    N  54.44547
2020.01.31 AAPL 09:30:03.076 87.39 51   0    J    N  54.44547
2020.01.31 AAPL 09:30:04.268 87.19 12   0    C    N  54.44547
2020.01.

This illustrates how much simpler `fby` is than the three separate statements above. 
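
As a quick check that the two approaches agree, the following sketch runs both on a hypothetical toy table `t` and compares the results with match (`~`). The names `viaFby`, `avgs` and `viaJoin` are made up for illustration.

```q
t:([] ex:`N`N`O`O; size:10 30 20 60)                 // hypothetical toy table

viaFby:select from t where size<(avg;size) fby ex    // single statement

avgs:select exAvg:avg size by ex from t              // keyed table: N -> 20, O -> 40
viaJoin:delete exAvg from (select from (t lj avgs) where size<exAvg)

viaFby~viaJoin                                       // 1b: both keep the rows with size 10 and 20
```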

`fby` is not restricted to the `where` clause; we can use it in any part of our statement: 

In [91]:
select sym, size, ex, lessThanEx: size < (avg;size) fby ex from trade where date = last date

sym  size ex lessThanEx
-----------------------
AAPL 65   N  0         
AAPL 42   N  1         
AAPL 32   N  1         
AAPL 11   N  1         
AAPL 77   N  0         
AAPL 27   N  1         
AAPL 47   N  1         
AAPL 45   N  1         
AAPL 10   N  1         
AAPL 23   N  1         
AAPL 80   N  0         
AAPL 80   N  0         
AAPL 65   N  0         
AAPL 99   N  0         
AAPL 17   N  1         
AAPL 76   N  0         
AAPL 57   N  0         
AAPL 65   N  0         
AAPL 99   N  0         
AAPL 13   N  1         
..


In [92]:
update  filterSize:(avg;size) fby ex,
        lessThanEx: size < (avg;size) fby ex from  //as an update to the table instead
        (select from trade where date = last date) //partitioned, so we first select, then update

date       sym  time         price size stop cond ex filterSize lessThanEx
--------------------------------------------------------------------------
2020.01.31 AAPL 09:30:00.067 87.75 65   0    T    N  54.44547   0         
2020.01.31 AAPL 09:30:00.068 87.75 42   0    P    N  54.44547   1         
2020.01.31 AAPL 09:30:00.084 87.73 32   0    K    N  54.44547   1         
2020.01.31 AAPL 09:30:00.091 87.77 11   0    9    N  54.44547   1         
2020.01.31 AAPL 09:30:00.146 87.78 77   0    A    N  54.44547   0         
2020.01.31 AAPL 09:30:00.149 87.81 27   0    W    N  54.44547   1         
2020.01.31 AAPL 09:30:00.167 87.8  47   0    E    N  54.44547   1         
2020.01.31 AAPL 09:30:00.273 87.94 45   0    B    N  54.44547   1         
2020.01.31 AAPL 09:30:00.438 87.96 10   0    E    N  54.44547   1         
2020.01.31 AAPL 09:30:00.603 87.85 23   0    B    N  54.44547   1         
2020.01.31 AAPL 09:30:00.625 87.78 80   0    G    N  54.44547   0         
2020.01.31 AAPL 09:30:00.

##### Exercise
Write a statement using `fby` to find the largest volume in our `trade` table (`where date = last date`) for which the price is greater than the average price for that symbol.

In [98]:
select max size from trade where date = last date, price > (avg;price) fby sym

size
----
99  


In [100]:
// Enter your qSQL code here
select max size from trade where date = last date, price>(avg;price) fby sym

size
----
99  
