In [1]:
#;.pykx.disableJupyter()

In [2]:
# https://code.kx.com/pykx/3.0/examples/jupyter-integration.html#q-first-mode
import pykx as kx
kx.util.jupyter_qfirst_enable()

PyKX now running in 'jupyter_qfirst' mode. All cells by default will be run as q code. 
Include '%%py' at the beginning of each cell to run as python code. 


In [3]:
system"cd ",.trn.nbdir:$["/"=first v;"";getenv[`HOME],"/"],v:first system "dirname '",getenv[`JPY_SESSION_NAME],"'"
\l scripts/loaddata.q

"Initializing variables"
"Loaded Weather CSV"
"Loaded Taxi Trips partitioned DB"
"Defining exercise results"
"Ready"


**Learning objectives**

To understand:
* How to call functions
* How to define user-defined functions
* Creating and applying projections
* Iterations 

## Functions

So far we have used built-in functions. Now we introduce user-defined functions.

### Calling functions

Functions are called with their arguments in square brackets `[]`. For example, we can call the built-in `max` function on a list like so:

In [4]:
max[10 11 12]

12


With a unary (single argument) function, we can omit the square brackets. These two lines are equivalent:

In [5]:
max[10 11 12]  // bracket notation
max 10 11 12   // prefix notation

12
12


### Defining functions

We can define our own [functions](https://code.kx.com/q/basics/function-notation/). 

Here is a binary (two-argument) function that calculates the speed in km per hour from distance traveled (miles) and duration (hours).

In [6]:
speed:{[miles;hours]
 mph:miles%hours;
 kph:1.609*mph;
 // return the speed in kph
 :kph;
 }

Here we refer to `miles` and `hours` as arguments of the `speed` function. `kph` is a local variable we define inside the function. We then [explicitly return](https://code.kx.com/q/basics/function-notation/#explicit-return) it as the function’s result using `:`.

 <img src="images/qbies.png" width="50px" align="left"/><p style='color:#273a6e'><i> A function definition is a list of expressions, separated by semicolons and embraced by curly brackets. Functions can be defined over multiple lines: each line, except for the first one, must start with at least one whitespace character (we recommend two). This includes the line with the closing curly bracket. The arguments listed in the [signature](https://code.kx.com/q/basics/function-notation/#signature) are embraced in square brackets and separated by semicolons. </i></p>

An example call to the `speed` function looks like this:

In [7]:
speed[15;0.5]

48.27


If there is no [explicit return](https://code.kx.com/q/basics/function-notation/#explicit-return) from a function, its result is the value of the last expression evaluated. So the code above can be rewritten as:

In [8]:
speed:{[miles;hours]
 1.609*miles%hours  // NOTE NO SEMICOLON
 }
speed[15;.5]        // result is unchanged

48.27
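
The "NOTE NO SEMICOLON" comment matters more than it may appear. As a small sketch (the names `withResult` and `noResult` are hypothetical and not part of the original notebook), a trailing semicolon makes the last expression empty, so the function returns the generic null instead of the computed value:

```q
// Hypothetical helper names, for illustration only
withResult:{[miles;hours] 1.609*miles%hours}   // last expression is the result
noResult:{[miles;hours] 1.609*miles%hours;}    // trailing semicolon: last expression is empty

withResult[15;.5]        // the speed in kph, as before
(::)~noResult[15;.5]     // 1b: the result is the generic null (::)
```

If a function unexpectedly appears to return nothing, a stray semicolon after the final expression is a likely cause.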


We can also call this function with a list of distances and a corresponding list of durations.

In [9]:
speed[15 30 20;.5 .9 1.0]

48.27 53.63333 32.18


###### Exercise 19
Create the following function: <br>
_func_ which is equivalent to the mathematical function: <br>\begin{equation}
res= - \frac{y(x+1)^2}{(2(x+1))-1}
\end{equation}
(What is the minimum number of brackets you have to use?) <br> 

In [10]:
// Use xexp (https://code.kx.com/q/ref/exp/) for power and % for division
{neg (y*(1+x) xexp 2) %-1+2*1+x}

{neg (y*(1+x) xexp 2) %-1+2*1+x}


In [11]:
res:{[x;y]
    up:y * (x + 1) xexp 2;
    down:(2*(x+1)) - 1;
    :neg (up % down);
    }

In [12]:
res:{[x;y]
    neg (y * (x + 1) xexp 2) % (2*(x+1)) - 1
    }

In [13]:
res[10;5]

-28.80952


In [14]:
ex19[10;5] //check correct output

-28.80952


### Explicit and Implicit parameters

In the `speed` function above, we named the parameters in the signature. These are called explicit parameters.

In [15]:
speed:{[miles;hours] //explicit parameters
 1.609*miles%hours 
 }

When a function has no more than three parameters, their names can be omitted, and `x`, `y`, and `z` used as implicit parameters. So our `speed` function can be written as:

In [16]:
speed:{1.609*x%y}

Here we calculate the speed for two distances for a single duration.

In [17]:
speed[15 30;.5] 

48.27 96.54


###### Exercise 20

Create a function that will find the area of a rectangle with length 7.93 and width 1.87 using implicit parameters.

In [18]:
rectangleArea2:{x*y}  //using implicit parameters
rectangleArea2[7.93;1.87] 

14.8291


In [19]:
area:{
    x * y
    }
area[7.93;1.87]

14.8291


In [20]:
ex20[] //check correct output

14.8291


### Call functions from qSQL

So far our arguments have been lists and atoms. We can also call functions in qSQL queries.


In [21]:
jan09:select from trips where date within 2009.01.01 2009.01.07
select spd:speed[distance;duration % 0D01:00],distance,duration,res:duration % 0D01:00 from jan09 where vendor = `VTS

spd      distance duration              res       
--------------------------------------------------
48.5918  1.51     0D00:03:00.000000000  0.05      
-17.3772 1.26     -0D00:07:00.000000000 -0.1166667
14.28792 0.74     0D00:05:00.000000000  0.08333333
16.8945  0.7      0D00:04:00.000000000  0.06666667
25.82445 1.07     0D00:04:00.000000000  0.06666667
24.4568  5.32     0D00:21:00.000000000  0.35      
28.4793  0.59     0D00:02:00.000000000  0.03333333
28.42567 2.65     0D00:09:00.000000000  0.15      
31.41263 4.23     0D00:13:00.000000000  0.2166667 
34.17516 3.54     0D00:10:00.000000000  0.1666667 
51.86343 9.67     0D00:18:00.000000000  0.3       
26.75537 1.94     0D00:07:00.000000000  0.1166667 
24.96875 5.69     0D00:22:00.000000000  0.3666667 
55.13507 10.28    0D00:18:00.000000000  0.3       
37.1679  3.08     0D00:08:00.000000000  0.1333333 
28.09314 2.91     0D00:10:00.000000000  0.1666667 
22.6869  1.41     0D00:06:00.000000000  0.1       
22.3651  1.39     0D00:06:00.00

This gives us the speed in km/h for each `VTS` trip in our `jan09` table. We use `duration % 0D01:00` to convert the nanosecond-precision duration stored in the trips table into a floating-point number of hours.

In [22]:
0D00:03:00.000000000 % 0D01:00

0.05
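
The same division works for other units. A brief sketch, assuming only built-in timespan arithmetic (the variable name `dur` is made up for this example): dividing one timespan by another yields a float ratio, so the divisor chooses the unit.

```q
dur:0D00:03:00.000000000   // a three-minute duration, as in the cell above
dur % 0D01:00              // duration in hours
dur % 0D00:01              // duration in minutes
dur % 0D00:00:01           // duration in seconds
```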


We can combine our functions, native kdb+/q functions, and qSQL with grouping and aggregation to get an average speed for each vendor in our `jan09` table.

In [23]:
select avgspeed:speed[sum distance;sum[duration]%0D01:00] by vendor from jan09

vendor| avgspeed
------| --------
CMT   | 24.43787
DDS   | 22.21592
VTS   | 22.33302


###### Exercise 21

- Create a function called `tipOverDistance` that divides explicit argument `x` by argument `y`.

In [24]:
tipOverDistance:{[x;y] x%y}  

In [25]:
tipOverDistance:{x % y}
tipOverDistance[3;2]

1.5


In [26]:
ex21_a[3;2] //check correct output

1.5


- Write a function `createTable` that selects from `jan09` the columns `vendor`, `distance`, and `tip`; and adds a new column from the result of `tipOverDistance` applied to columns `tip` and `distance`.  

In [27]:
createTable:{
  select tipPerDist:tipOverDistance[tip;distance], distance, tip, vendor from jan09 where distance > 0
 }

In [28]:
createTable:{select tipPerDist:tipOverDistance[tip;distance], distance, tip, vendor from jan09}
createTable[]

tipPerDist distance tip vendor
------------------------------
0          1.3      0   CMT   
0          0.9      0   CMT   
0          1        0   CMT   
0          0.8      0   CMT   
0          5.5      0   CMT   
0          0.9      0   CMT   
0          1        0   CMT   
0          2.1      0   CMT   
0          0.2      0   DDS   
0          3.7      0   CMT   
0          7.2      0   CMT   
0          9.1      0   CMT   
0          3.8      0   CMT   
0          7.1      0   CMT   
0          12.3     0   CMT   
0          1.6      0   CMT   
0          1.2      0   CMT   
0          2.1      0   CMT   
0          4.4      0   CMT   
0          5.4      0   DDS   
..


In [29]:
ex21_b[] //check correct output

tipPerDist distance tip vendor
------------------------------
0          1.3      0   CMT   
0          0.9      0   CMT   
0          1        0   CMT   
0          0.8      0   CMT   
0          5.5      0   CMT   
0          0.9      0   CMT   
0          1        0   CMT   
0          2.1      0   CMT   
0          0.2      0   DDS   
0          3.7      0   CMT   
0          7.2      0   CMT   
0          9.1      0   CMT   
0          3.8      0   CMT   
0          7.1      0   CMT   
0          12.3     0   CMT   
0          1.6      0   CMT   
0          1.2      0   CMT   
0          2.1      0   CMT   
0          4.4      0   CMT   
0          5.4      0   DDS   
..


- Find the average tip per mile per vendor from the result of `createTable`.

In [30]:
select avg tipPerDist, avg distance, avg tip by vendor from createTable[]

vendor| tipPerDist distance tip      
------| -----------------------------
CMT   | 0w         2.668375 0.3695627
DDS   | 0w         2.95473  0.3592817
VTS   | 0w         2.774267 0.4431133


In [32]:
ex21_c[] //check correct output

vendor| tipPerDist distance tip      
------| -----------------------------
CMT   | 0w         2.668375 0.3695627
DDS   | 0w         2.95473  0.3592817
VTS   | 0w         2.774267 0.4431133
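
The `0w` (positive infinity) averages above are what one would expect if at least one trip per vendor has a positive tip but a recorded distance of 0. This is an inference from the output, not something checked against the data. A minimal sketch of the arithmetic involved, using made-up values:

```q
1 % 0               // division by zero gives positive infinity, 0w
0 % 0               // zero divided by zero gives a float null, 0n
avg 0.5 0.25 0w     // a single infinity makes the whole average 0w
avg 0.5 0.25 0n     // nulls, by contrast, are ignored by avg
```

Filtering with `where distance > 0`, as in the first `createTable` definition above, would be one way to keep such rows out of the average.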


## Iterators

Most iteration is handled implicitly by q operators and keywords. Beyond that, we have iterators. An iterator is an operator that modifies how a function is applied.

Say we want to add `1 2` to `3 4 5`. Add works item by item when both lists have the same length:

In [33]:
1 2 3+3 4 5

4 6 8


With `1 2+3 4 5`, however, q signals a length error. The Add operator iterates implicitly, but expects its arguments [to be atoms or have matching lengths](https://code.kx.com/q/basics/conformable/). 

We clearly have something else in mind: adding each item of one list to every item of the other. Using iterators, we can modify how Add is applied to do exactly that. 

### Mapping iterators

In this instance, we can use:

+ [Each-right and Each-left](https://code.kx.com/q/ref/maps/#each-left-and-each-right)  

<img src="images/eachRighteachLeft.png" width="400" height="200">

In [34]:
1 2+\: 3 4 5 //each left
1 2+/: 3 4 5 //each right

4 5 6
5 6 7
4 5
5 6
6 7


Each Right and Each Left are both examples of **map iterators**, the simplest kdb+/q iterators. Other map iterators are:
* Each
* Each Prior
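
Each Prior is listed above but not demonstrated in this section. As a minimal sketch, using a made-up `prices` list that is not part of the notebook's data: Each Prior (`':`) applies a binary function between each item and its predecessor, so subtraction with Each Prior gives the differences between neighbours, which is what the built-in `deltas` keyword returns.

```q
prices:10 12 11 15     // hypothetical sample data
(-':)prices            // each item minus its predecessor: 10 2 -1 4
deltas prices          // built-in keyword giving the same result
```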

Below we have a list of strings, that is, a list of lists. We can use the keyword `count` to see how many items are in the list:

In [35]:
L:("the";"quick";"brown";"fox")
count L

4


On its own, `count` counts the items of the outer list. To apply a unary (single-argument) function to each item of the list instead, we use the `each` keyword:

In [36]:
count each L
type each L // check the type of each item

3 5 5 3
10 10 10 10h


`each` can only modify unary (single-argument) functions, as seen above. To modify a function of higher rank, we can use the Each iterator `'`, known as each-both when applied to a binary function. Let's look at the Take operator `#`, which gets a subset of the data:  

In [37]:
3#L                               // returns the first three items of the list
3#'L                              // returns the first three items of each item

"the"
"quick"
"brown"
"the"
"qui"
"bro"
"fox"


Can you predict what `3#''L` returns? Try it.

In [58]:
3#' 'L

("ttt";"hhh";"eee")
("qqq";"uuu";"iii";"ccc";"kkk")
("bbb";"rrr";"ooo";"www";"nnn")
("fff";"ooo";"xxx")


###### Exercise 23

Create two lists, `x: 10 30 20 40` and `y: 13 34 25 46`, and join them item by item, returning a list of pairs (type `0h`).

In [39]:
x: 10 30 20 40
y: 13 34 25 46
x,'y
type x,'y

10 13
30 34
20 25
40 46
0h


In [63]:
show x: 10 30 20 40
show y: 13 34 25 46
x,'y

10 13
30 34
20 25
40 46
10 30 20 40
13 34 25 46


In [64]:
ex23[] //check correct output

{x: 10 30 20 40;y: 13 34 25 46;x,'y}[::]


### Accumulating iterators

Where map iterators apply a function *across argument items*, the accumulators apply it repeatedly to the results of successive evaluations. The function is first applied to the entire (first) argument; then to the result of that; then to the result of that; and so on. 

There are two accumulators in q. They both apply a function the same way; but one returns the result of each iteration; the other only the result of the last iteration. The iterators are: 
+ Scan (\\) 
+ Over (/)

<img src="images/scanIteration.png" width="500" height="200">

In [42]:
N:1 4 7 10                         / numeric list
+/[N]                             / sum      (Over)   
+\[N]                             / sums     (Scan)   
*/[N]                             / product  (Over)
*\[N]                             / products (Scan)

22
1 5 12 22
280
1 4 28 280


What is the result of `-/[N]`? Try it.

In [65]:
-/[N]

-20


Below are more examples using an alternative syntax. An iterated function can be applied in either of two ways:

`function/[data]`

`(function/) data`

Both forms are valid; use whichever you prefer. 

In [44]:
(+/)1 2 3 4 / Add these numbers, fold '+' over the vector; fold is sometimes called reduce or inject
(*/)1 2 3 4 / Extends to all functions in the expected way
sum 1 2 3 4 / Another way to sum the values, using a built-in function
(+\)1 2 3 4 / Cumulative sums, using scan
sums 1 2 3 4 / Same, using the built-in function

10
24
10
1 3 6 10
1 3 6 10


###### Exercise 24

a. Create a new function, `add:{x+x}`, and apply it to each item of the list `3 6 8`.

In [45]:
//create a new function, add:{x+x}, and iterate this list of integers across it, 3 6 8
add:{x+x}
add each 3 6 8

6 12 16


In [70]:
add:{x+x}
add each 3 6 8

6 12 16


In [47]:
ex24_a[] //check correct output

6 12 16


b. Create a new function, `add2:{x+y}`, and apply it to the pair of lists `(3 6 8;4 7 9)` so that corresponding items of the two lists are added together.

In [48]:
//create a new function, add2:{x+y}, and iterate this list of integers across 
//it, (3 6 8;4 7 9) so that the first value of each list is added together
add2:{x+y}
'[add2][3 6 8;4 7 9]
add2'[3 6 8;4 7 9]

7 13 17
7 13 17


In [77]:
add2:{x+y}
(add2/)(3 6 8;4 7 9)

7 13 17


In [50]:
ex24_b[] //check correct output

7 13 17


c. Starting from the seed value 11, multiply successively by each value in the list `3 5 4 2`. Use both Scan and Over.

In [51]:
 //multiply recursively by value in this list, 3 5 4 2, starting with a seed value of 11
//using scan 
{x*y}\[11;3 5 4 2]
//using over 
{x*y}/[11;3 5 4 2]

33 165 660 1320
1320


In [87]:
{x * y}\[11; 3 5 4 2]
{x * y}/[11; 3 5 4 2]

33 165 660 1320
1320


In [53]:
ex24_c[] //check correct output

scan| 33 165 660 1320
over| 1320


### Bonus Exercise -  Fibonacci sequence

Let's work towards generating the first 10 numbers of the Fibonacci sequence

>The Fibonacci sequence is defined such that each number is the sum of the two preceding ones, starting from 0 and 1. For example, the first four numbers in the Fibonacci sequence are: 0 1 1 2

Tips:
1. Use `sum`, `/` (Over) and `#` (Take) to generate the Fibonacci numbers
2. You will also need to define and use a function
3. Start with 0 1
4. Use the [Do form of Over](https://code.kx.com/q/ref/accumulators/#do)
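
For reference, a minimal sketch of the Do form on a simpler function than the exercise needs; `double` is a hypothetical name used only here. The left argument is the number of iterations. Over returns only the final result, while Scan returns the initial value followed by the result of every iteration.

```q
double:{2*x}     // hypothetical helper, not part of the exercise
3 double/ 1      // apply double three times to 1, keep the final result: 8
3 double\ 1      // same iterations, keeping every step: 1 2 4 8
double/[3;1]     // bracket form of the first call
```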

In [54]:
exerFib:{{x, sum -2#x}/[x;0 1]}


x:0 1
sum -2#x
x,sum -2#x       / Beginning of Fibonacci sequence
{x,sum -2#x} 1 1 / Same, but as an (unnamed) function
fib:{x,sum -2#x} / Name it
fib/[10;1 1]     / Similar to +/ example earlier, apply the function repeatedly 10 times
10 fib/ 1 1      / Another way to invoke the function


1
0 1 1
1 1 2
1 1 2 3 5 8 13 21 34 55 89 144
1 1 2 3 5 8 13 21 34 55 89 144


In [55]:
// Enter your code here 

In [56]:
exerFib[20] //check correct output

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181 6765 10946
