In [ ]:
#;.pykx.disableJupyter()

In [ ]:
# https://code.kx.com/pykx/3.0/examples/jupyter-integration.html#q-first-mode
import pykx as kx
kx.util.jupyter_qfirst_enable()

 <img src="../qbies.png" style="width: 100px;padding-right:5px;padding-top:1px;padding-left:5px;" align="left"/>
 <p style="padding-left:125px;padding-top: 30px;"><font size="+2"><b> Practical Guidance - Joins</b></font></p>

## Keyed joins
As mentioned in the notebook, kdb+ has five keyed joins. The three that the notebook did not cover are described below:

+ [Plus join](https://code.kx.com/q/ref/pj/) - `pj`
+ [Union join](https://code.kx.com/q/ref/uj/) - `uj`  
+ [Equi join](https://code.kx.com/q/ref/ej/) - `ej`

### Plus Join `pj`

The [plus join](https://code.kx.com/q/ref/pj/) sums the matching numeric columns of two tables. The left table can be keyed or unkeyed; the right table must be keyed. The result contains every row of the left table, with the values of the matching right-table row added to the common columns. Rows of the left table with no match in the right table are left unchanged.

The syntax:

    pj[unkeyed or keyed;keyed]

As with `lj`, the match is determined by the key column(s) of the second (right-hand) table.

In [1]:
show stocks:([]sym:`IBM`AAPL`GOOG;amount:400 700 1200)
show newpurchases:([sym:`IBM`GOOG]amount:60 30)
pj[stocks;newpurchases]

sym  amount
-----------
IBM  400   
AAPL 700   
GOOG 1200  
sym | amount
----| ------
IBM | 60    
GOOG| 30    


sym  amount
-----------
IBM  460   
AAPL 700   
GOOG 1230  
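The example above only has columns common to both tables. As a minimal sketch of what happens when the keyed table carries an extra numeric column, the tables below are hypothetical (`purchases` and its `fees` column are not part of the original notebook). Per the KX reference for `pj`, new columns are zero for rows of the left table that have no match.

```q
/ hypothetical tables: fees exists only in the keyed right-hand table
stocks:([]sym:`IBM`AAPL`GOOG;amount:400 700 1200)
purchases:([sym:`IBM`GOOG]amount:60 30;fees:5 2)
/ amount is summed where sym matches; AAPL has no match,
/ so its amount is unchanged and its fees value is 0
show res:pj[stocks;purchases]
```

Here `amount` should come back as 460 700 1230 and `fees` as 5 0 2.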


### Union Join `uj`
Unlike `ij` and `lj`, [union join](https://code.kx.com/q/ref/uj/) appends the records of the second table as new rows. Columns that are missing from either table are added and filled with nulls, so the result has the combined schema of both tables.

The syntax: 

     LHSTable uj RHSTable

This is the most flexible join. When both tables are unkeyed, it builds a combined schema that can hold every record from both tables. A type conflict between common columns does not raise an error; the column simply becomes a mixed list, as the `size` column in the second result below shows.

In [2]:
//creating tables
show trade:([]time:09:00+10*til 5;sym:`JPM`GE`JPM`IBM`GE;price:30+5?3.;size:5?20) 
//created a key table keyed on sym
show reference:([sym:`JPM`IBM]companyName:`$("JP Morgan";"International Business Machines");sector:`Banking`IT) 

time  sym price    size
-----------------------
09:00 JPM 32.3551  12  
09:10 GE  31.60413 8   
09:20 JPM 32.13351 10  
09:30 IBM 31.23479 1   
09:40 GE  31.47955 9   
sym| companyName                     sector 
---| ---------------------------------------
JPM| JP Morgan                       Banking
IBM| International Business Machines IT     


In [3]:
trade uj 0! reference                   //common schema used and data for each present as separate rows
trade uj ([]size: 2 3f),'0! reference   //adding a size column to reference with type float

time  sym price    size companyName                     sector 
---------------------------------------------------------------
09:00 JPM 32.3551  12                                          
09:10 GE  31.60413 8                                           
09:20 JPM 32.13351 10                                          
09:30 IBM 31.23479 1                                           
09:40 GE  31.47955 9                                           
      JPM               JP Morgan                       Banking
      IBM               International Business Machines IT     


time  sym price    size companyName                     sector 
---------------------------------------------------------------
09:00 JPM 32.3551  12                                          
09:10 GE  31.60413 8                                           
09:20 JPM 32.13351 10                                          
09:30 IBM 31.23479 1                                           
09:40 GE  31.47955 9                                           
      JPM          2f   JP Morgan                       Banking
      IBM          3f   International Business Machines IT     
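The `2f` and `3f` entries in the `size` column above indicate that the column is no longer a simple integer vector. A small sketch, using two hypothetical tables `a` and `b` that are not part of the original notebook, makes this visible through `meta` and `type`:

```q
/ hypothetical tables with a type conflict on the common column size
a:([]sym:`JPM`GE;size:12 8)        / size is a long column
b:([]sym:`IBM`MSFT;size:2 3f)      / size is a float column
r:a uj b
show meta r                        / the type shown for size is blank
show type r`size                   / 0h, i.e. a general (mixed) list
```

A mixed column is slower to query and cannot be saved in a splayed table, so it is usually worth casting the columns to a common type before joining.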


If only one of the tables is keyed, an error is thrown. If both tables are keyed on the same column, the result combines the corresponding records from each table across the set of keys:

In [4]:
(`sym xkey trade) uj reference 
reference uj (`sym xkey trade)

sym| time  price    size companyName                     sector 
---| -----------------------------------------------------------
JPM| 09:00 32.3551  12   JP Morgan                       Banking
GE | 09:10 31.60413 8                                           
IBM| 09:30 31.23479 1    International Business Machines IT     


sym| companyName                     sector  time  price    size
---| -----------------------------------------------------------
JPM| JP Morgan                       Banking 09:00 32.3551  12  
IBM| International Business Machines IT      09:30 31.23479 1   
GE |                                         09:10 31.60413 8   


 ###### Example
One use of `uj` is to time-order data from two different tables to establish the sequence of updates across them. For example, trade and order tables could be joined as below:

In [5]:
trade // Using the trade table defined above
/creating an order table
show order:([]time:asc `minute$08:00+5?0D02:00:00;
             sym:5?`JPM`GE`IBM;
             orderID:(5 cut 25?.Q.n),'string 5?`5;
             price:30+5?20.)

time  sym orderID      price   
-------------------------------
08:46 IBM "92701ifnam" 41.57041
08:49 JPM "92188fohfn" 31.67777
09:21 IBM "17245kdjbl" 33.91981
09:27 GE  "42785eegnd" 37.51276
09:58 JPM "64133ncejf" 42.2749 


time  sym price    size
-----------------------
09:00 JPM 32.3551  12  
09:10 GE  31.60413 8   
09:20 JPM 32.13351 10  
09:30 IBM 31.23479 1   
09:40 GE  31.47955 9   


Joining the two tables with `uj` and sorting the result by time makes it easy to see the sequence in which trades and orders happened.

In [6]:
`time xasc uj[trade;order]

time  sym price    size orderID     
------------------------------------
08:46 IBM 41.57041      "92701ifnam"
08:49 JPM 31.67777      "92188fohfn"
09:00 JPM 32.3551  12   ""          
09:10 GE  31.60413 8    ""          
09:20 JPM 32.13351 10   ""          
09:21 IBM 33.91981      "17245kdjbl"
09:27 GE  37.51276      "42785eegnd"
09:30 IBM 31.23479 1    ""          
09:40 GE  31.47955 9    ""          
09:58 JPM 42.2749       "64133ncejf"


### Equi Join `ej`
[Equi join](https://code.kx.com/q/ref/ej/) joins two tables on the specified column(s). The result contains every row of the first table that has a match in the second table on those columns, extended with the remaining columns of the second table. Rows without a match are dropped.

The syntax:
    
    ej[matchingColumns;table1;table2]

In [7]:
show trade:([]sym:`XOM`GM`MET`GOOG`GM`XOM;price:200 150 60 151 152 199)
show sector:([]sym:`XOM`GM`IBM;sector:`energy`auto`tech)
ej[`sym;trade;sector]

sym  price
----------
XOM  200  
GM   150  
MET  60   
GOOG 151  
GM   152  
XOM  199  
sym sector
----------
XOM energy
GM  auto  
IBM tech  


sym price sector
----------------
XOM 200   energy
GM  150   auto  
GM  152   auto  
XOM 199   energy
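The first argument of `ej` can also be a list of columns. The sketch below matches on both `sym` and `ex`; the tables `trade2` and `fees` and the `ex` and `fee` columns are hypothetical and only serve to illustrate a two-column match.

```q
/ hypothetical tables sharing two columns, sym and ex
trade2:([]sym:`XOM`GM`XOM;ex:`N`N`L;price:200 150 199)
fees:([]sym:`XOM`GM`XOM;ex:`N`L`L;fee:0.1 0.2 0.3)
/ a row is kept only when both sym and ex match a row in fees
show r:ej[`sym`ex;trade2;fees]
```

The `GM` trade on exchange `N` has no counterpart in `fees` (which only lists `GM` on `L`), so it is dropped and the result has two rows with fees 0.1 and 0.3.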


## Bitemporal joins

### Window joins

A window join is a generalization of the as-of join: rather than retrieving only the last value, it aggregates values within specified time intervals.

The syntax:
    
    wj[windowS;columns;source table;(reference table;(function1;col1);(function2;col2)...)]
    
Where
 + `windowS` is a two-item list of (`startTimes`;`endTimes`), each of the same length as the source table
 + `columns` are the matching columns that are in both tables; the last one is the time column
 + `source table` is the table whose rows each get a window (the trades in the example below)
 + `reference table` is the table whose values are aggregated within each window (the quotes in the example below)
 + `colN` are the columns that the aggregations are applied to
 + `functionN` are the aggregation functions

In [8]:
show t:([]sym:3#`JPM;time:09:30:01 09:30:04 09:30:08;price:120 123 121)
show q:([]sym:10#`JPM;time:asc 09:30:00+10?8;ask:10?90+til 20;bid:10?90+til 20)

sym time     price
------------------
JPM 09:30:01 120  
JPM 09:30:04 123  
JPM 09:30:08 121  
sym time     ask bid
--------------------
JPM 09:30:01 104 98 
JPM 09:30:04 104 108
JPM 09:30:04 102 105
JPM 09:30:05 104 98 
JPM 09:30:05 99  103
JPM 09:30:06 92  96 
JPM 09:30:06 109 99 
JPM 09:30:06 102 106
JPM 09:30:07 99  90 
JPM 09:30:07 100 90 


 Let's construct a set of 3-second windows (from 2 seconds before to 1 second after each trade time), one for each of our trades:

In [9]:
t[`time]                         //our trade times
windows: -2 1+\: t[`time]       //using iterators to add -2 and +1 to each time
windows

09:30:01 09:30:04 09:30:08


09:29:59 09:30:02 09:30:06
09:30:02 09:30:05 09:30:09
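The each-left iterator `\:` used above pairs each item of `-2 1` with the whole list of times. As a small sketch, the same pair of lists can be built by hand; `times`, `w1` and `w2` are illustrative names only.

```q
times:09:30:01 09:30:04 09:30:08
w1:-2 1+\:times          / each-left: (-2+times;1+times)
w2:(times-2;times+1)     / the same start and end times, written out
show w1~w2               / 1b
```

Either form works as the first argument of `wj`; the each-left version is shorter when the offsets are held in a variable.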


 Using these windows, we can join `t` and `q`, calculating the highest ask and lowest bid within each window.

In [10]:
wj[windows;`sym`time;t;(q;(max;`ask);(min;`bid))]

sym time     price ask bid
--------------------------
JPM 09:30:01 120   104 98 
JPM 09:30:04 123   104 98 
JPM 09:30:08 121   102 90 
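Because `q` was generated randomly, the sketch below rebuilds the trades and quotes from the printed output above under the hypothetical names `trades` and `quotes`, so that the join can be rerun with fixed data. It also calls `wj1`, which according to the KX reference considers only quotes on or after entry to the window, whereas `wj` also includes the quote prevailing when the window opens.

```q
/ copies of the tables printed above, under hypothetical names
trades:([]sym:3#`JPM;time:09:30:01 09:30:04 09:30:08;price:120 123 121)
quotes:([]sym:10#`JPM;
  time:09:30:01 09:30:04 09:30:04 09:30:05 09:30:05 09:30:06 09:30:06 09:30:06 09:30:07 09:30:07;
  ask:104 104 102 104 99 92 109 102 99 100;
  bid:98 108 105 98 103 96 99 106 90 90)
w:-2 1+\:trades`time
/ wj: includes the quote prevailing at the start of each window
show r:wj[w;`sym`time;trades;(quotes;(max;`ask);(min;`bid))]
/ wj1: only quotes on or after the start of each window
show r1:wj1[w;`sym`time;trades;(quotes;(max;`ask);(min;`bid))]
```

With this data `wj` reproduces the result shown above. The two functions can disagree for the last trade, where three quotes share the window's start time of 09:30:06.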


<img src="../qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:2px;padding-left:5px;" align="left"/><p style='color:#273a6e'><i>This concept is discussed in more detail in the Kx Advanced course.</i></p>