In [ ]:
#;.pykx.disableJupyter()

In [ ]:
# https://code.kx.com/pykx/3.0/examples/jupyter-integration.html#q-first-mode
import pykx as kx
kx.util.jupyter_qfirst_enable()

##### Initialization

In [None]:
example:([]a:1 2 3;b: `mini`example`table);
keyedExample:([b:`mini`keyed`example]a: 1 2 3);
complexTab:([]c:(1 2;3 4;5 6);d: `more`complex`table)

# Tables


**Learning Outcomes**

To understand: 
* How to investigate and identify a table
* The two types of table
* How to create a table
* Accessing tables
* How to insert data into a table

# Introduction

Unsurprisingly, [tables](https://code.kx.com/q4m3/8_Tables/) form the core of Kx technology, and much of what we've learned up to now has been building up to them. Tables are the **most important** data structures in kdb+ and are where the vast majority of all data in kdb+/q is stored.


## Table discovery 
Firstly let's look at what tables we have in our workspace:

In [None]:
tables[]               //typical usage - default to passing `.
tables[`.]~tables[]

We can see how many records our tables have using count: 

In [None]:
count example 
count complexTab

The system command `\a` also lists the tables in the current namespace:

In [None]:
\a

To find out the name of the columns of a table, we can apply the keyword [cols](https://code.kx.com/q/ref/cols):

In [None]:
cols example           //returns a list of symbols

If we want to return the entire table contents we can do so by calling the table by name.

In [None]:
example 

In [None]:
complexTab     //This table is more complex, the first column c is actually a list of lists!

## Table schema
Perhaps the most important information associated with any table is the table metadata - obtained by using the keyword [`meta`](https://code.kx.com/v2/ref/meta).

The `meta` command takes a table name as input, and will return a table of available columns and their types and other information. The following columns are produced:

- `c`: column name
- `t`: column [type](https://code.kx.com/v2/ref/#datatypes)
- `f`: [foreign keys](https://code.kx.com/q4m3/8_Tables/#85-foreign-keys-and-virtual-columns)
- `a`: [attributes](https://code.kx.com/v2/basics/syntax/#attributes) - any column modifiers applied for performance characteristics 


In [None]:
meta example 

The column `t` holds the type information, given as the char code of the type. If the column is itself made up of lists of a particular type, this char is capitalized - we can see this by looking at our complex table: 

In [None]:
meta complexTab

##### Exercise 

Inspect the table definition and types for the `keyedExample` table 

In [None]:
meta keyedExample

In [None]:
//your answer here

# Types of Tables

There are two types of tables in kdb+/q: 
* Unkeyed tables - simple tables which are just lists of records
* Keyed tables - these are really dictionaries (which present as tables) where we associate a table of keys with a table of records.

## How to identify whether a table is keyed or unkeyed
There are two differences that we can use to distinguish - the output in the console and the type. 

In [None]:
example 
keyedExample //can see vertical line

In [None]:
type example 
type keyedExample //this is the type of a dictionary

## Unkeyed tables


An unkeyed table is:

* A list of dictionaries - each dictionary is a record in our table
* A flipped (transposed) column dictionary 
* A collection of lists of equal length (called columns) - the values in our column dictionary
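
The first two views can be checked directly. The following is a minimal sketch using hypothetical names (`r1`, `r2`, `t1`, `t2`) that are not part of this notebook's workspace. It builds the same small table once from a list of conforming dictionaries and once from a flipped column dictionary, then compares the results.

```q
// hypothetical records - each is a dictionary with the same keys
r1:`a`b!(1;`x)
r2:`a`b!(2;`y)
t1:(r1;r2)                  // a list of conforming dictionaries
t2:flip `a`b!(1 2;`x`y)     // a flipped column dictionary
type t1                     // the type of an unkeyed table
t1~t2                       // do the two constructions match?
```

Both constructions should give the same unkeyed table, which is why either view can be used when reasoning about table operations.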

###  A table is a list of dictionaries

An unkeyed table can be directly queried/modified as if it were a list of dictionaries. 

In [None]:
example             //our table as defined
count example       //the number of records in our table

In [None]:
first example           //the first record in our table - it's a dictionary! 

##### Exercise

Return the final row from our `example` table and verify it is in fact a dictionary.

In [None]:
last example
type last example   //it's a dictionary! 

//for unkeyed tables we can use indexing too!
example 2

In [None]:
//your answer here

### A table is a flipped column dictionary

Since kdb+/q works in a vector-based fashion, each column in the table is treated as a large vector which is associated with that column name. 

In [None]:
show d:flip example     //d is a column dictionary i.e. values are lists of equal length
                           //and the dictionary keys are symbols  

This new variable `d` is just a standard dictionary;

In [None]:
d`b               //We can index keys like any dictionary
d[`b;0 1]         //Index at depth

In [None]:
d[`a;0 1]*:2;      //Amend items
d                  //Notice the change

In [None]:
flip d             //We can use flip to turn this back to a table

### A table is a collection of lists of equal length (called columns)

We have mentioned that kdb+/q treats each column within a table as a vector and to form a table we require each of the columns to be an equal length.  

In [None]:
example `a          //We can extract a column list with lookup notation similar to dictionaries - example[`a]
example`a`b         //We can extract multiple columns

##### Exercise

Create a new column `c:3.14 2.72 299792458` in our `example` table, using dictionary assignment syntax.

In [None]:
example[`c]: 3.14 2.72 299792458  //creating a new column and assigning 
example

In [None]:
//your answer here

## Keyed tables

A keyed table is a special form of a dictionary where the key and value are both tables. We can extract the keys and values using the inbuilt operators, [key](https://code.kx.com/q/ref/key/) and [value](https://code.kx.com/q/ref/value/), just like we did with dictionaries. 

In [None]:
keyedExample 

In [None]:
key keyedExample 

In [None]:
value keyedExample

We can actually make a keyed table (table to table dictionary) explicitly using the `!` operator with two unkeyed tables of equal length:

In [None]:
example!complexTab //result is a keyed table 

### Differences from unkeyed tables

With keyed tables, we will see that the operations we tried for unkeyed tables will return different values: 

In [None]:
keyedExample
first keyedExample //returns the first record (a dictionary) of our keyed table's "value" table
first value keyedExample
first[keyedExample]~first value keyedExample

We can also see that attempting to retrieve the columns by name as we did with unkeyed tables does not work either: 

In [None]:
keyedExample[`b]

When we attempt a dictionary lookup in our table, it is performed with reference to the dictionary key and like standard dictionaries, when the value is not found a null value is returned. 

In [None]:
key keyedExample      //what are the keys of our table
keyedExample[`mini]   //using a key value to return the associated "value table" values 
keyedExample[`large]  //using a key that does not exist

When we pass an associated key value our data is returned - further detail in later sections.

## Keying and unkeying a table 


It is possible to convert dynamically between an unkeyed table having a column of potential key values and the corresponding keyed table using the keyword [xkey](https://code.kx.com/q/ref/keys/).

In [None]:
`a xkey example    

We can use this with our already keyed table also:

In [None]:
keyedExample
newKeyed:`a xkey keyedExample   //changing our key to be the column `a instead 
newKeyed

In [None]:
newKeyed[3]    //Now the key has changed, we use our a values for retrieval

##### Exercise 
The below will throw an error - do you know why? 

In [None]:
newKeyed[`mini]            

The reason for the error is that our key is no longer the value for the `b` column as we used before. Since our new key is `a` and this column has a type of long, passing a symbol value for retrieval will throw a `'type` error. If in doubt, checking the `meta` for types is very helpful.

To convert a keyed table back to a regular unkeyed table, we can also use `xkey` with an empty general list as the left operand.

In [None]:
() xkey keyedExample

##### The Bang (`!`) operator 
There is another operator we can use to change the keys on a table - an overload of the `!` operator, referred to as enkey or [unkey](https://code.kx.com/q/ref/enkey/#unkey), which adds keys to or removes keys from a table.

In [None]:
0!keyedExample        //Here the 0 means use 0 of the columns to key the table
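
The same overload can also add keys. As a minimal sketch, assuming a hypothetical table `enkeyDemo` that is not part of this notebook's workspace, a positive integer on the left keys the table on that many leading columns, and the `keys` keyword lists the resulting key columns.

```q
// hypothetical table used only for this illustration
enkeyDemo:([]id:1 2 3;name:`x`y`z;val:10 20 30f)
kt1:1!enkeyDemo        // key on the first column (id)
kt2:2!enkeyDemo        // key on the first two columns (id and name)
keys kt1               // names of the key columns of kt1
keys kt2               // names of the key columns of kt2
enkeyDemo~0!kt1        // unkeying gives back the original table
```

Because the number counts leading columns, the columns to be keyed must already be at the front of the table; `xkey` is the more convenient choice when they are not.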

##### Exercise 

Create a table called `bigExample` which combines the `example` and `complexTab` tables to have the column order <code>`c`d`a`b`c</code>. This table should be unkeyed. 

Note: there are many different ways of doing this

In [None]:
show bigExample: 0!complexTab!example

In [None]:
//your answer here

# Creating tables
The syntax to create a table is as follows:

` ([kcolname1:klist1;...;kcolnameM:klistM] colname1:list1;...;colnameN:listN) `

where:
+ `kcolname1` ... `kcolnameM` are the names of the key columns
+ `klist1` ... `klistM` are the lists of key values associated with the respective key column
+ `colname1` ... `colnameN` are the column names 
+ `list1` ... `listN` are the lists of values associated with the respective column and all lists are of equal length.

## Unkeyed Tables
For now, let's say that an unkeyed table is a table that doesn't have a primary key associated with it.


### Creating a table with existing data  

In [None]:
show trade:([]sym:`JPM`IBM`BP;size:100 25 54;price:3.45 5.21 6.33)

If there is a mix of atomic and list values kdb+/q will handle this in the same way it handles operations like addition and will fill the column accordingly with the atomic value:  

In [None]:
show trade:([]sym:`JPM`IBM`BP;size:100 25 54;price:3.45 5.21 6.33;ex:`NYSE)

However, we can only do this when at least one column is a list - if every value is atomic, an error is thrown, as the table definition syntax requires a list in at least one column.

In [None]:
([]sym:`JPM;size:100;price:3.45) //error

In [None]:
([]sym:enlist[`JPM];size:enlist[100];price:enlist[3.45])  //if we make our atoms lists we can define a single row

### Creating an empty table
Below is a simple definition of an empty table with column names and empty lists. 

In [None]:
show emptyTrade:([]sym:();size:();price:())
// horizontal lines are the signature in the display of a table

In [None]:
meta emptyTrade    //we have no type information as we only used general lists

We can also specify the type that we want each list to be associated with by typing the lists when defining:

In [None]:
show emptyTradeTyped:([]sym:`$();size:`long$();price:`float$())
meta emptyTradeTyped

## Keyed Tables

When defining a table, placing column(s) in the square brackets indicates a key:

In [None]:
show trade:([]sym:`JPM`IBM`BP;size:100 25 54;price:3.45 5.21 6.33)   // unkeyed table

In [None]:
show tradeKeyed:([sym:`JPM`IBM`BP]size:100 25 54;price:3.45 5.21 6.33)   // table keyed on sym

##### Exercise 
Create a table `quote` that looks like this:

|time|sym|bid|ask|
|---|---|---|---|
|2020.04.16D09:30:00.0000|GE|100|25|
|2020.04.16D09:31:00.0000|GE|100|25|

Create another table `quoteKey` that is the same as the above except it is keyed on `sym`. 

Do so by: 
*  Explicit table definition 
*  Modification of the `quote` table



In [None]:
show quote:([]time:(2020.04.16D09:30:00.0000;2020.04.16D09:31:00.0000);sym:`GE`GE;bid:(100;100);ask:(25;25))

In [None]:
show quoteKey:([sym:`GE`GE]time:(2020.04.16D09:30:00.0000;2020.04.16D09:31:00.0000);bid:(100;100);ask:(25;25))

In [None]:
//modification of the quote table 
`sym xkey quote

In [None]:
//your answer here

# Accessing tables
We have outlined the differences between the two types of tables. Because their structures differ, the ways we access them differ too. 

Let's break it down between accessing specific columns and rows.

## Accessing columns


### Unkeyed tables
There are two notations that allow us to retrieve a column from an unkeyed table: 
* Back-tick notation (like dictionary retrieval)
* Dot notation


In [None]:
trade[`sym] //retrieving the sym column using backtick notation 
trade.sym

We can access columns using the dot notation; however, it's important to know that this applies only to the global table. The dot notation should be avoided, particularly within functions, unless you are absolutely clear on what you are doing. 

In [None]:
sum trade.size           //works fine

In [None]:
getSumSizes:{sum x.size}  //write a function that does the same
getSumSizes trade         //q does not like this - it looks for a global x!

However, if we use the back-tick notation we don't encounter the same issue: 

In [None]:
getSumSizes:{sum x`size};   //changed to use ` notation
getSumSizes trade           //this works

##### Exercise 

Extract the columns bid and ask from our `quote` table: 

In [None]:
quote
quote[`bid`ask]             //we can specify both together

In [None]:
//your answer here

### Keyed tables
As keyed tables are like dictionaries, we need to access them using the same methods as we used on dictionaries. 

In [None]:
tradeKey:([sym:`JPM`IBM`KX]size:10 20 30;price:1 2 3)
tradeKey

In [None]:
key tradeKey  //will return the sym column as a table
(key tradeKey)[`sym] //indexing into the table to return the values of the column sym

We can do the same operation on the value columns that are contained in the `tradeKey` table

In [None]:
(value tradeKey)[`size]
(value tradeKey)[`price]

<img src="../qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:12px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Keyed tables are accessed via their key, so we can't use the dot or back-tick notation like we did before with unkeyed tables. In the Queries section we will see that the <code>exec</code> command also allows us to access columns in a table.</i></p>



Another option would be to remove the key on the table and then access the column from the unkeyed table. Let's check out which method is more performant:

In [None]:
//unkey then access 
\ts:100000  (0! tradeKey)[`size]
//access via value table - the most space performant as we're subsetting the full table 
  //to just the value table in our keyed table
\ts:100000  (value tradeKey)[`size]
//access via exec - this is the most time performant 
\ts:100000 exec size from tradeKey    

## Accessing rows

### Unkeyed tables

As unkeyed tables are lists of dictionaries, we can use list indexing to retrieve a particular row. 

In [None]:
trade
trade 1    //retrieving the second row

Similar to lists again, we can also use [#](https://code.kx.com/q/ref/take/) to get a subset of the rows:

In [None]:
3#1 2 3 4 5 //selecting the first 3 entries of the list
2#trade    //selecting the first 2 entries of the trade table
-2#trade    //selecting the last 2 entries of the trade table

### Keyed table
Like all dictionaries, a keyed table is accessed by key:

In [None]:
tradeKey 
tradeKey`JPM       //retrieve the row at that key - here the key is like the index

If we wanted to retrieve more than one value we can't operate in the same way as a standard dictionary and just provide a list of keys: 

In [None]:
tradeKey[`JPM`BP]  //try to retrieve values for multiple keys

If we want to return many values we need to supply our inputs like a table:

In [None]:
tradeKey[([]sym:`BP`IBM)] //BP doesn't exist in table therefore returning nulls for that sym
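
As a hedged alternative, a list of enlisted keys can be supplied instead of a table of keys. The sketch below uses a hypothetical keyed table `lookupDemo` that is not part of this notebook's workspace. It also shows the take operator `#` with a table of keys, which is assumed here to return a keyed sub-table rather than only the value columns.

```q
// hypothetical keyed table used only for this illustration
lookupDemo:([sym:`A`B`C]size:10 20 30)
byList:lookupDemo[flip enlist `A`C]     // a list of enlisted keys
byTable:lookupDemo[([]sym:`A`C)]        // a table of keys, as above
byList~byTable                          // compare the two lookups
([]sym:`A`C)#lookupDemo                 // # keeps the key column in the result
```

The table-of-keys form tends to be easier to read, particularly when the table has more than one key column.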

##### Exercise 

Update our `tradeKey` table to change the values for IBM to have a size of 40 and a price of 10.

In [None]:
tradeKey[`IBM]: 40 10 //indexing into the row corresponding to the IBM key 
                        //reassigning the values to 40 and 10
tradeKey

In [None]:
//your answer here

## Accessing a particular cell 

Putting this all together, we can index at depth into our tables to access particular cells. 

In [None]:
//unkeyed table 
trade
//keyed table 
tradeKey

Let's suppose in both instances we want to return the size values for JPM and IBM: 

In [None]:
//unkeyed table - first index by row, then by column
trade[0 1;`size]
//OR - first index by column, then by row
trade[`size][0 1]  

//Keyed table - first index by key, then by column 
tradeKey[([]sym: `JPM`IBM);`size]

If we want to modify the size values for JPM and IBM to be 50 and 70 respectively we can do so via reassignment: 

In [None]:
//unkeyed 
trade[0 1;`size]: 50 70 
trade 

In [None]:
//keyed 
tradeKey[([]sym: `JPM`IBM);`size]: 50 70 
tradeKey

##### Exercise

Amend the second row of our `example` table to change our column value for `b` to `exercise` and for `a` to 15.

In [None]:
example 

//we can do this individually: 
example[1;`a]: 15
example[1;`b]: `exercise

//or we can do them both together 
example[1;`a`b]:(15;`exercise)   

//either method is fine!
example

In [None]:
//your answer here

# Inserting Data 
There are many ways to insert data into a table. The two simplest methods involve using the built-in functions:
* [insert](https://code.kx.com/q/ref/insert/) 
* [upsert](https://code.kx.com/q/ref/upsert/)

## Insert
The insert function is used to append data to a table. The syntax is:

    `tableName insert data

<img src="../qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:2px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i><code>insert</code> <b>requires</b> the target table to be passed by name so that the inserted data persists, otherwise it will error - hence the use of the <code>`</code> to amend the global table by reference. </i></p>

### Unkeyed tables
Let's take an empty table for this example and try to append some data to it using `insert`:

In [None]:
show trade:([]sym:`$();size:`long$();price:`float$()) // Creating an empty table
meta trade

In [None]:
`trade insert(`BP;100;12.44) 
//insert returns the index/indices that the data has been inserted into

In [None]:
trade insert (`BP;100;12.44) 
//we get a type error if we try to apply the insert without the global reference

##### Exercise

Insert a row into your `quote` table. 
* time:the timestamp right now
* sym:`JPM
* bid:120
* ask:60

In [None]:
quote 
`quote insert (.z.P;`JPM;120;60)
quote

In [None]:
//your answer here

We can also use dictionaries with insert:

In [None]:
`trade insert `price`sym`size!(12.45;`MSFT;400) //inserting one row using a dictionary
trade

### Keyed tables

We can use insert in the same way as we have done above with unkeyed tables.

In [None]:
show tradeKey:([sym:`$()]size:`long$();price:`float$()) //defining table
`tradeKey insert(`BP;100;4.33)                          //works fine
tradeKey                                                //our row has been added

In [None]:
`tradeKey insert `sym`price`size!(`KX;4.55;120)         //using a dictionary instead
tradeKey 

However, this will throw an error if a record of that key already exists in the table:

In [None]:
`tradeKey insert(`BP;200;5.33)                           //insert error

### Bulk inserts

In the example above, we have inserted one entry into the trade table. The beauty of `insert` is that we can also perform bulk inserts:

In [None]:
// inserting two rows using a list of lists
`trade insert(`IBM`AAPL;200 300;15.53 14.39)
trade 

A drawback when using dictionaries is we can't bulk insert in the same way as we saw with the lists: 

In [None]:
//inserting two rows using a dictionary
`trade insert `sym`size`price!((`MSFT;`JPM);(400;200);(12.45;11.5)) 

We can get around that with relative ease by valuing our dictionary: 

In [None]:
//keys in the dict need to match order of columns in the table
`trade insert value `sym`size`price!((`MSFT;`JPM);(400;200);(12.45;11.5)) 

Finally, we can also use `insert` with another table with a consistent schema to append the table data to our target table: 

In [None]:
//flipping our dictionary to a table
show tab:flip `sym`size`price!((`KX;`KX);(400;200);(12.45;11.5))
`trade insert tab 

In [None]:
trade

In [None]:
`size`price`sym#tab                //reordering our tab table 
`trade insert `size`price`sym#tab  //we can insert this too 
trade 

By passing tables we can get the benefit of a bulk `insert`, while also getting the benefit of not having to ensure correct ordering on our data columns. In the event of missing columns, null data is used:

In [None]:
`sym`size#tab 
`trade insert `sym`size#tab
trade

This also applies to keyed tables, so long as we don't use any keys that are already present in the table: 

In [None]:
`tradeKey insert flip `sym`size`price!(`FD`CITI;102 10;23.9 211.1)
tradeKey

##### Exercise

Given a `trade` table, use `insert` to add two new rows with the following values: 

```sym:`GE`MSFT,size:40 60,price:8.12 10.36```

In [None]:
`trade insert (`GE`MSFT;40 60;8.12 10.36) //using a list of lists
//or using a dictionary 
// `trade insert value `sym`size`price!((`GE;`MSFT);(40;60);(8.12;10.36))

In [None]:
//your answer here

## Upsert 

The `upsert` keyword is like `insert`, except that `upsert` works on a table passed either by value or by reference and so is more flexible. The syntax is the same as for `insert`:

<code>\`tableName upsert data</code> 

### Unkeyed tables
The single-row examples shown for `insert` work the same way with `upsert`.

In [None]:
show trade:([]sym:`$();size:`long$();price:`float$())
`trade upsert (`GOOG;230;15.42)            //upserting one row
`trade upsert `sym`size`price!(`JPM;300;20.31) //upserting a dictionary 
trade 

When using `upsert` we can both pass-by-name and pass-by-value:

In [None]:
trade upsert (`GS;340;30.4)  //we are returned the table with the new data 
trade                        //we can see the change isn't persisted however 

### Keyed tables
With a keyed table, `upsert` works in the same way; however, it will **overwrite the record at a key** that is already present rather than throwing an error. 
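
The contrast with `insert` can be sketched as follows. This assumes a hypothetical keyed table `upsertDemo` that is not part of this notebook's workspace, and uses protected evaluation so that the failing `insert` does not halt the cell.

```q
// hypothetical keyed table used only for this illustration
upsertDemo:([sym:`A`B]size:10 20)
// insert on an existing key fails; the trap turns the error into a string
attempt:@[{`upsertDemo insert x};(`A;99);{"insert failed: ",x}]
attempt
// upsert on the same key overwrites the existing record instead
`upsertDemo upsert (`A;99)
upsertDemo[`A;`size]
count upsertDemo            // still two rows - nothing was appended
```

Choosing between the two is then a question of intent: `insert` guards against accidentally overwriting a key, while `upsert` treats the incoming record as the latest value.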


In [None]:
tradeKey       //our table from earlier

In [None]:
//upserting by reference 
`tradeKey upsert(`BP;200;5.33)    //upsert returns the table name, not the indices
tradeKey                          //row at that key has been updated

A benefit of the pass-by-value effect of `upsert` is we can add new data to our table and assign to a new value while leaving our original table untouched:

In [None]:
show tradeKey2: tradeKey upsert(`GS;400;10.13)    //can add new key
tradeKey

##### Exercise

Using the `quoteKey` table that we created in previous exercises, upsert a row with the current time, sym <code>\`AAPL</code>, a bid of 60 and an ask of 200 to the table `quoteKey` and check to see if it has been updated. 

Try upserting it again to see what happens.

In [None]:
quoteKey

In [None]:
`quoteKey upsert (`AAPL;.z.p;60;200)
quoteKey

In [None]:
`quoteKey upsert (`AAPL;.z.p;60;200) //you will see that the time has changed
quoteKey

In [None]:
//your answer here

### Bulk upserts 

Bulk additions to tables using `upsert` require data to be arranged row-by-row. 

In [None]:
`trade upsert ((`KX;240;16.71);(`FD;250;17.34)) //upserting bulk
//`trade insert (`KX`FD;240 250;16.71 17.34)     reminder - this was the insert syntax 
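
When the data starts out column-wise, `flip` converts it to the row-wise shape used above. The following is a minimal sketch with hypothetical names (`bulkDemo`, `colData`) that are not part of this notebook's workspace.

```q
// hypothetical table and data used only for this illustration
bulkDemo:([]sym:`$();size:`long$())
colData:(`A`B;10 20)              // column-wise: one list per column
`bulkDemo insert colData          // insert takes the columns as they are
`bulkDemo upsert flip colData     // flip turns the columns into a list of rows
bulkDemo
count bulkDemo                    // two rows from each call
```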

In [None]:
trade //showing the new records inserted

As with `insert` we can also use tables as bulk values to be added to our table, along with lists and dictionaries: 

In [None]:
trade upsert trade    //doubling up our trade table, but not persisting the change

On the downside, this means that if we want to use a dictionary structure as a means of upserting data, we need to flip the values to arrange them by row, or alternatively, flip the dictionary into a table: 

In [None]:
trade upsert flip value `sym`size`price!(`JPM`MS;200 399; 10.2 39.5)  //passing a dictionary
trade upsert flip `sym`size`price!(`JPM`MS;200 399; 10.2 39.5)        //passing a table

Again, missing columns will be handled with nulls: 

In [None]:
//passing a subsetted table
trade upsert flip `sym`size#`sym`size`price!(`JPM`MS;200 399; 10.2 39.5)  

##### Exercise 
Given a table defined as follows: 

``tradeEx:([]sym:`JPM`IBM`BP;size:100 25 54;price:3.45 5.21 6.33)``

Use `upsert` to add two new rows with the following values: 

```sym:`JPM`FD,size:50 50,price:4.76 3.21```

In [None]:
show tradeEx:([]sym:`JPM`IBM`BP;size:100 25 54;price:3.45 5.21 6.33)  //defining our table

In [None]:
tradeEx upsert ([] sym:`JPM`FD;size:50 50;price:4.76 3.21)  //bulk table upsert 
tradeEx upsert ((`JPM;50;4.76);(`FD;50;3.21))               //list of rows for upsert 

In [None]:
//your answer here