# Atoms and Primitives

**Learning Objectives**

To understand: 
* Atoms in kdb+/q
* How to identify an atom
* Datatypes in kdb+/q
* Primitives 
* Operator Precedence
* How to create a Variable 
* Using Variables in Arithmetic

## Atoms 

An *atom* in kdb+/q is any single item of a particular [datatype](https://code.kx.com/q/basics/datatypes/). Atoms form the building blocks for more advanced data structures in kdb+/q.

In [1]:
2   //default type is long when the value has no decimal point
2.3 //default type is float when the value has a decimal point

2


2.3


There are a number of different datatypes in kdb+, and a large number of these are time-specific datatypes. 

In [2]:
09:30:00    //this is a time type
2020.01.01  //this is a date type 

09:30:00


2020.01.01


Within the kdb+/q language there are shortcuts for retrieving some of the [system information](https://code.kx.com/q/ref/dotz/) as atomic values e.g. the present time or date. 

We can run the code below to retrieve them. Let's do that now!

In [3]:
.z.t  //Current time 

09:47:01.611


In [4]:
.z.d  //Current date 

2020.11.16


Each of the above items is a single atomic value. 

##### Exercise
Using the information [here](https://code.kx.com/q/ref/dotz/), find and return below the current UTC system timestamp value.

In [None]:
//built-in operator that returns the UTC timestamp
.z.p 

In [6]:
//Enter your code below
.z.p

2020.11.16D09:49:25.746652000


If we want to create an atom of a datatype that isn't the default, we can specify [the type](https://code.kx.com/q/basics/datatypes/) with a trailing type indicator matching the character code for that type as follows: 

In [7]:
30e //creating a real - the "e" is the type extension 

30e


In [8]:
3h  //creating a short - the "h" is the type extension

3h


##### Exercise
Using the information [here](https://code.kx.com/q/basics/datatypes/), create an integer atom with a value of 8.

In [None]:
//specifying an integer type using an "i"
8i

In [9]:
//Enter your code below

8i

8i


## Vectors/Lists

An atom in kdb+/q refers to a singular value - a vector is a list of atoms. 

In [10]:
1 2 3   //a list of longs 

1 2 3


In [11]:
"a list of characters"  //a list of characters in kdb+/q is a string

"a list of characters"


<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:2px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Text data in kdb+/q is commonly stored as a List (see corresponding section) of characters, typically referred to as a "string".</i></p>

##### Exercise

Output the iconic "hello world!" 

In [None]:
//we can return this as a string
"hello world!"

In [12]:
//Enter your code below
"hello world!"

"hello world!"


Lists/Vectors are actually so important within kdb+/q they have their own separate module which we will come to next!

# Datatypes 

The kdb+/q language has a built-in command called [type](https://code.kx.com/q/ref/type/) that tells us the datatype of anything passed to it. 

In [13]:
type 60  //the default datatype is long 

-7h


This is the first inbuilt function (these are known as keywords, or primitive operations) in kdb+/q that we have used! Congratulations! 

<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:3px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Looking at the output of <code>-7h</code> from our <code>type</code> command, what can we say about this return value? </i></p>

We can see that the output of the `type` command is a short because of the trailing `h`! That the return value is a short has nothing to do with the data we passed to `type`; it is simply the format in which the useful information (`-7`) is returned. Going forward we can safely ignore the trailing `h`, as it carries no information about the input.

## How to use the `type` command

The type command tells us two pieces of information: 
* Whether the input is atomic or a vector
* The datatype of the input

Let's compare the following outputs and fill out the requested information to see if we can deduce how this command works.

In [14]:
type 60 //the input is an atomic long with a value of 60 

-7h


##### Exercise 

With reference to the [datatype](https://code.kx.com/q/basics/datatypes/) guide and using the `type` command, find the following: 
* What is the character reference value for a long? (2nd column, heading c) 
* What is the numeric reference value associated with a long? (1st column, heading n) 

Is the output positive or negative? 

Character reference value: j

Numeric reference value: 7

Output: Negative

_Fill out the below!_ 

Character reference value: j

Numeric reference value: 7

Output: negative

##### Exercise 

Repeat the exercise with an input of `1 2 3i`

In [None]:
type 1 2 3i  //vector integer input

Character reference value: i

Numeric reference value: 6

Output: Positive

In [15]:
type 1 2 3i //your code here 

6h


_Fill out the below!_ 

Character reference value: i

Numeric reference value: 6

Output: positive

##### Exercise 

Finally, let's repeat the exercise with an input of `2.0 3.4`

In [None]:
type 2.0 3.4  //vector float input

Character reference value: f

Numeric reference value: 9

Output: Positive

In [16]:
type 2.0 3.4 //your code here 

9h


_Fill out the below!_ 

Character reference value: f

Numeric reference value: 9

Output: positive

<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:3px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Hopefully this has helped to clarify the way in which the <code>type</code> works, but formally these rules are described below. </i></p>

1. If the output from `type` is *positive*, the input was a vector.
2. If the output from `type` is *negative*, the input was atomic. 
3. The numeric value returned corresponds to the numeric reference value associated with the type of the input, e.g. a short input will return a numeric value of 5.

Again, the trailing `h` returned from the `type` command itself is because the return is a short. This is because we only have a finite number of types, so using a bigger datatype to store the values in would be wasteful. 
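
As a minimal sketch, the rules above can be checked against a few inputs not used so far. The values are purely illustrative, and the expected results in the comments follow from the datatype table:

```q
type 5h          // -5h : negative, so atomic; 5 is the numeric reference value for a short
type 5 6 7h      //  5h : positive, so a vector of shorts
type "a"         // -10h: a single character is an atom
type "abc"       //  10h: a string is a vector of characters
type 2020.01.01  // -14h: date atom
0>type 5h        //  1b : a negative type number means the input is an atom
```

The last line shows one way the sign can be used in code: comparing the output of `type` with 0 tells us whether the input was atomic.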

##### Exercise 

What do you expect the return to be from the following: 

    type 3.14
   

In [None]:
type 3.14 // negative - because atomic, 9  - because default floating type is float

_Fill out the below!_

I expect the return value to be: 

In [None]:
//your code here

## Temporal datatypes

Now we understand the basics of atoms, vectors and datatypes, we can delve deeper into temporal datatypes. 

In [17]:
09:30:00.000 //time type is specified as hh:mm:ss.milliseconds
09:30:00     //second 
09:30        //minute

09:30:00.000


09:30:00


09:30


In [18]:
type 09:30:00.000 
type 09:30:00 
type 09:30

-19h


-18h


-17h


We can add these using the primitive [`+`](https://code.kx.com/q/ref/add/):

In [19]:
09:30:00.000 + 00:12     //kdb+/q understands arithmetic between these types
09:30:00 + 00:12         //highest level of granularity is preserved 

09:42:00.000


09:42:00


Dates in kdb+/q have the format yyyy.mm.dd: 

In [20]:
2020.01.01  //a date
.z.d        //current date

2020.01.01


2020.11.16


In [21]:
.z.d + 6    //adding 6 days to the specified date

2020.11.22
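
A short sketch of further date arithmetic, using fixed dates chosen for illustration so that the results are reproducible:

```q
2020.01.01 + 31          // 2020.02.01: adding a whole number moves the date forward by that many days
2020.03.01 - 1           // 2020.02.29: calendar rules such as leap years are handled for us
2020.11.16 - 2020.01.01  // 320i: subtracting two dates gives the number of days between them
```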


Time intervals can also be explicitly encoded in the timespan type, which has the format xDhh:mm:ss.nanoseconds, where x is a number of days.

In [22]:
2020.01.01 + 6D00:00:00.000   //the output is a timestamp! A timestamp is a combination of a time and date

2020.01.07D00:00:00.000000000
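
As a further sketch (the literal values are illustrative), timespans can be added to each other and to timestamps:

```q
type 0D00:30:00                    // -16h: timespan atom
0D01:30:00 + 0D00:45:00            // 0D02:15:00.000000000
2020.01.01D09:30:00 + 0D00:30:00   // 2020.01.01D10:00:00.000000000: timestamp + timespan is a timestamp
```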


##### Exercise 

Using the information [here](https://code.kx.com/q/ref/dotz/), create a timestamp by adding the current date and current time. 
   

In [None]:
.z.d + .z.t       //adding a date and time returns a timestamp as the default temporal datatype
type .z.d + .z.t  //verifying this is in fact a timestamp

In [25]:
type .z.d + .z.t //your code here

-12h


## Textual datatypes

Almost all languages have some way to store non-numeric text/categorical data - in kdb+/q there are two datatypes that can be used for this data. 

Textual data can be stored as either a `symbol` or a `character` string: 

In [26]:
`thisIsAsymbol 

`thisIsAsymbol


In [27]:
"this is a character"

"this is a character"


In [28]:
type `thisIsAsymbol         //symbols are created using a leading backtick
type "this is a character"  //strings are enclosed in quotes

-11h


10h


<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:2px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Looking at the output of <code>type</code> command, what major difference can we see between strings and symbols?</i></p>

We can see the `string` is actually a list of characters! This is important to remember when we get to vector operations, particularly because a list of strings (e.g. in a column of a table) will really be a list of lists.  

Even though the `symbol` has multiple letters, this is actually an atomic value. 
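
A minimal sketch of this difference, using the keyword `count` (not introduced above), which returns the number of items in its input:

```q
count "ABCD"        // 4 : a string is a list of four characters
count `ABCD         // 1 : a symbol is a single atom, however many letters it has
type ("AB";"CD")    // 0h: a list of strings is a general list, i.e. a list of lists
count ("AB";"CDE")  // 2 : two strings, regardless of their individual lengths
```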

##### Exercise 

Create the text data `ABCD` as a: 
* Symbol 
* String

In [None]:
//symbol - leading backtick 
`ABCD 
//string - enclosed in quotes
"ABCD"

In [29]:
 //your code here 
`ABCD
"ABCD"

`ABCD


"ABCD"


## Nulls and Infinite values 

Each datatype has an associated [null and infinite value](https://code.kx.com/q/basics/datatypes/), with the exception of types for which this doesn't make sense (e.g. an infinite Guid).

In [30]:
type 0N    //null long 
type 0w    //infinite float 
type 0Nf   //null float - in general a null is 0N + the character value of the type 

-7h


-9h


-9h


We can use the [`null`](https://code.kx.com/q/ref/null/) keyword to identify null values: 

In [31]:
null 0N 2 1 0N 3   //returns a boolean array corresponding to the list 

10010b
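
The same pattern extends to other types. A brief sketch, with the types chosen only for illustration:

```q
type 0Nh   // -5h : null short
type 0Nd   // -14h: null date
type 0n    // -9h : 0n is the usual way to write the float null
type 0Wi   // -6h : infinite int
null 0Nd   // 1b
null 0Wi   // 0b  : an infinity is not a null
```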


##### Exercise 

Return the null value for a minute datatype, and the infinite value for a real datatype.  

Verify these are null/not null respectively using the `null` command.

In [None]:
//null minute 
0Nu 
null 0Nu

In [None]:
//infinite real 
0We
null 0We 

In [33]:
//your code here 
0Nu
null 0Nu
0We
null 0We

0Nu


1b


0we


0b


# Primitives

## Basic Primitives

Primitives are inbuilt native functions in kdb+/q. The 4 basic arithmetic operators in q are:

    + (addition)
    - (subtraction)
    * (multiplication)
    % (division)

In general, the terms *primitives*, *operators* and *keywords* refer to the [inbuilt functions in kdb+/q](https://code.kx.com/q/ref/). 

In [40]:
//2 + 4.0  //arithmetic will conform 
6%3      //division with longs will return a float 
6 div 3 //division returns whole number long
7%2
7 div 2

2f


2


3.5


3


##### Exercise 

There are 140 calories in a single serving of mint M&Ms. Given that the guideline calorific intake per day is approx. 2000, let's use division to see how many bags we can have and stay under our allowance.  

Assume we can only eat whole bags.

(The keyword [`floor`](https://code.kx.com/q/ref/floor/) may be helpful here)

In [None]:
// using floor to get whole number of bags that we can eat
floor 2000%140

In [None]:
//the keyword div will perform integer division in this way - rounding down 
2000 % 140 
2000 div 140

In [42]:
floor 2000%140 //your answer here 
2000 div 140

14


14


In [43]:
-7%4
-7 div 4
floor -7%4
ceiling -7%4

-1.75


-2


-2


-1


There are more inbuilt functions like the `type` and `floor` commands we have already used, e.g. [`neg`](https://code.kx.com/q/ref/neg/) (to get the negative of a number), or set operations like [`in`](https://code.kx.com/q/ref/in/) or [`except`](https://code.kx.com/q/ref/except/). 

In [44]:
neg 5.4  //or explicitly -5.4 
-5.4 

-5.4


-5.4


In [45]:
1 in 1 2 3    //is (this item) in (this list)
1 2 in 1 2 3  //are (these items) in (this list) 

1b


11b
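
`except` is mentioned above but not demonstrated. A short sketch with illustrative inputs:

```q
1 2 3 4 except 2 4   // 1 3: the items of the left list that are not in the right list
"hello" except "l"   // "heo": works on strings too, since a string is a list of characters
neg 1 -2 3           // -1 2 -3: neg is applied to every item of a list
```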


##### Exercise 

The modulo function in kdb+/q is [mod](https://code.kx.com/q/ref/mod/). 

1. Use this to get the modulo 7 of the current date.
2. What day of the week is 0 when modulo 7 in kdb+/q? 
3. Write logic using `mod` and `in` to check if a given date (use the current date) is a weekend day

In [None]:
.z.d mod 7     //the current date

In [None]:
2020.07.11 mod 7 //this was a Saturday! In kdb+/q the week "starts" on a Saturday.

In [None]:
(.z.d mod 7) in 0 1 //weekends are 0 1 

In [49]:
//your answer here 
.z.d mod 7
2020.11.14 mod 7
(.z.d mod 7) in 0 1

2i


0i


0b


## Operator Precedence

<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:1px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> In kdb+/q the order of execution of a code command is right to left, also referred to as 'left of right' </i></p>

There are many reasons why kdb+/q might operate in this fashion, but one such reason is performance based. By performing operations from right to left, we avoid the need to consult precedence tables. 

If it didn't operate in this way, the whole statement would need to be read and then reconstructed according to the operator hierarchy before being evaluated (think of [BOMDAS](https://en.wiktionary.org/wiki/BOMDAS)!).

Right-to-left evaluation is crucial to keep in mind when reading existing q code. It allows for shorter code snippets, but requires the reader to be fully aware of the language's evaluation order. 

What do you expect the below cell to return?

In [50]:
3*4+2 

18
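
The result above can be traced step by step. The sketch below adds a few illustrative expressions, with the intermediate steps written out in the comments:

```q
3*4+2      // 18: 4+2 is evaluated first, then 3*6
(3*4)+2    // 14: parentheses force the multiplication to happen first
10-2-3     // 11: 2-3 is -1, then 10 minus -1
2*3+4*5    // 46: 4*5 is 20, 3+20 is 23, 2*23 is 46
```

Parentheses are the only way to change the order of evaluation; no operator binds more tightly than any other.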


##### Exercise 
What do you expect the below expressions to return?

*  3+10*4
*  3+(10*4)
*  (3+10)*4

In [None]:
// Reading left to right you might have expected 52, 43 and 52, but q evaluates right to left
3+10*4
3+(10*4)
(3+10)*4

In [51]:
//your answer here 
43
43
52

43


43


52


52


## Comparison Primitives

Comparison in kdb+/q has intuitive operators largely in line with mathematical notation:

    >   is greater than
    <   is less than
    >=  is greater than or equal to
    <=  is less than or equal to
    <>  not equal to
 
Comparison checks return booleans indicating if the comparison criterion is met (`1b`) or not (`0b`).

Below are some examples:

In [52]:
5>4
5<>4
5>5
5>=5

1b


1b


0b


1b
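
These operators are not limited to whole numbers. A brief sketch with illustrative values of other datatypes:

```q
09:30<10:00             // 1b: temporal values compare chronologically
2020.01.01>2020.11.16   // 0b
"a"<"b"                 // 1b: characters compare by their character code
2<2.5                   // 1b: a long can be compared with a float
```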


A special case of comparison is to check if two items are equal. In kdb+/q there are two ways to check equality: 

    =   equal values
    ~   exact match 
    
The `=` operator will check if the values are equal but isn't a *strict* check, in that it doesn't require the datatypes between the two objects to be the same. 


The `~` operator (referred to as [tilde or match](https://code.kx.com/q/ref/match)) checks to see if both the value and type are the same.

In [53]:
4=4.0 //check if representative value is equal 
4~4.0 //check value and type match

1b


0b


We will talk in later sections about lists, but as a quick intro these operations can also be applied to lists.

In [54]:
1 2 3=1 2.0 3   // when dealing with lists, = compares the lists item-wise
1 2 3~1 2.0 3   // even with lists, ~ always returns a single true or false

111b


0b
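
The same distinction matters for text. A minimal sketch with illustrative strings and symbols:

```q
"abc"="abd"    // 110b: = compares character by character
"abc"~"abd"    // 0b  : ~ compares the two strings as a whole
"abc"~"abcd"   // 0b  : ~ also accepts lists of different lengths
`abc=`abc      // 1b  : symbols are atoms, so = returns a single boolean
```

Note that `"abc"="abcd"` would signal a `length` error because `=` needs lists of the same length, which is one reason `~` is usually the safer choice for comparing strings.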


##### Exercise 

What do you expect the below expressions to return? 
* 3~6%2  
* 1 2 = 1 4
* 1 2 ~ 1 4 
* 3=6%2

In [None]:
3~6%2       //the output of the division is a float, this is not an exact type match with the long

In [None]:
1 2 = 1 4   //the equality is evaluated pairwise - the first pair  (1 = 1) matches, the second (2 = 4) does not

In [None]:
1 2 ~ 1 4   //each list is taken in its entirety and evaluated for an exact match with the other

In [None]:
3=6%2       //the value represented by the two different types match 

In [55]:
//your answer here
3~6%2
1 2 = 1 4
1 2 ~ 1 4
3=6%2

0b


10b


0b


1b


## Function notation

Thus far, we have applied our keywords using `prefix` notation (e.g. `neg 5.4`) or `infix` notation (e.g. `1 in 1 2 3`). Functions/keywords can also be called explicitly using `functional` notation as follows: 

In [56]:
neg 10 2 //prefix notation
neg[10 2] //functional notation

-10 -2


-10 -2


With functional notation the function parameters are passed `;` separated into the calling function: 

In [57]:
1 in 2 3 
in[1;2 3]

0b


0b
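
Functional notation is not restricted to keywords; the operators we met earlier can be called the same way. A short sketch, with each infix equivalent shown in the comment:

```q
+[2;3]      // 5 : same as 2+3
%[6;3]      // 2f: same as 6%3
div[7;2]    // 3 : same as 7 div 2
=[4;4.0]    // 1b: same as 4=4.0
```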


##### Exercise
Use functional notation and the keyword [mavg](https://code.kx.com/q/ref/avg/#mavg) to calculate the pairwise moving average on the list `3 2 4.1 5 2 1`

In [None]:
mavg[2;3 2 4.1 5 2 1]

In [59]:
//your code here 
mavg[2; 3 2 4.1 5 2 1]

3 2.5 3.05 4.55 3.5 1.5


# Variables & Assignment

## Variable Assignment

Variables are used to store information to be later referenced and manipulated in a computer program - storing intermediate results is useful for avoiding repeated work. 

To assign a value to a variable, we use the `:` operator to pair the name we have chosen for the variable with the value we wish to assign to it.

In [60]:
a:25

<img src="../qbies.png" style="width: 50px;padding-right:5px;padding-top:10px;padding-left:5px;" align="left"/>

<p style='color:#273a6e'><i> Note that in kdb+/q assignment doesn't use = as in many other languages. As we saw previously, = is used to check equality. To assign a value to a variable we use : which assigns the value on the right hand side (RHS) to the variable name on the left hand side (LHS).</i></p>

If we wish to examine the value of a variable, we simply type its name and hit enter.

In [61]:
a

25


We can use our new variable in arithmetic expressions:

In [62]:
b: 10 + a
b

35


We can continue to reassign `a` to other values.

Rather than separating the assignment and visualisation of the variable into separate cells, we can use the `show` keyword to immediately see the assigned value.

In [63]:
show a:12    // assign another long to a
show a:"G"   // assign a value of a completely different type to a (a char in this case)
show a:4.5   // assign a fractional number to a (called a float in q)

12
"G"
4.5


In [64]:
a
b      //note changing a hasn't changed b!

4.5


35


##### Exercise 
Let's create:
* A new variable `yesterday` and store in it yesterday's date. 
* A new variable `null_int` and store in it an integer null. 
* A new variable `avg_sleep` and assign it a value of 6.8 (this is the average hours of sleep Americans get each night!).

In [None]:
yesterday:.z.d-1  //in-built function .z.d that returns today's date
null_int:0Ni      //remember nulls in q have an associated type
avg_sleep:6.8     

In [66]:
//your answer here 
show yesterday: .z.d - 1
show null_int:0Ni
show avg_sleep:6.8

2020.11.15
0Ni
6.8


## Using Variables in Arithmetic Operations

A variable can be used in any expression. Some simple examples below:

In [67]:
pi:22%7
radius:5
area:pi*radius*radius //area of circle formula
area

78.57143


Unlike in some other languages, assignment in kdb+/q isn't required to be stated separately on its own individual line! Since execution occurs from right to left, we can make an assignment and then continue to operate with this data after assignment. 

We could therefore rewrite the above as follows: 

In [68]:
pi:22%7
area:pi*radius*radius:5  //notice assignment working twice here - we have still defined radius to be 5 here too
area

78.57143


##### Exercise 
Using our `avg_sleep` variable from before, let's figure out:
* How many hours of sleep does this correspond to in a year? (Multiply by 365.25, the number of days in a year.)
* How many sleep hours are missed in a year by not getting 8 hours sleep a night?
* How many days does that missed time correspond to? 

In [None]:
//calculating the hours of sleep and assigning it to a variable yearly
show yearly:6.8*365.25

In [None]:
//using the built-in function neg to negate the yearly value
//this is a very "q" way of doing it as we execute from right to left
show missed:neg[yearly]+365.25*8
//for readability 
(365.25*8)-yearly

In [None]:
//using integer division to round down
show days: missed div 24 

In [71]:
//your answer here 
show yearly: 365.25*avg_sleep
show missed:(365.25*8) - yearly
show days: missed div 24
div[365.25*8-avg_sleep;24]

2483.7
438.3
18f


18f


## Assignment and Projecting Functions 

In-built kdb+/q functions which take multiple inputs can be bound to a given input, creating what is called a projection. 

In [72]:
add2:2+      //we don't define the second input 
add2:+[2;]   //functional notation - we leave the second parameter undefined 
add2:+[2]    //all of these are equivalent
add2       

+[2]


In [73]:
add2 4 

6
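
The examples above fix the first input. As a sketch, the second input can be fixed instead by leaving the first slot empty in functional notation. The names `halve` and `toDays` are hypothetical, chosen only for this illustration:

```q
halve:%[;2]       // projection of % with the second input fixed to 2
halve 10          // 5f
toDays:div[;24]   // projection of the keyword div with the divisor fixed to 24
toDays 50         // 2
```

Which slot is left empty matters for non-commutative operations such as division: `%[2;]` applied to 10 would instead compute `2%10`.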


##### Exercise 

Create a new function `pairMovingSum` using the in-built keyword [`msum`](https://code.kx.com/q/ref/sum/#msum). 

Test this with the following input: `1 2 3 2 3 2 1` 

(Expected output - `1 3 5 5 5 5 3`)

In [None]:
pairMovingSum:msum[2]

In [None]:
pairMovingSum 1 2 3 2 3 2 1

In [76]:
//your answer here
pairMovingSum: msum[2;]
pairMovingSum 1 2 3 2 3 2 1

1 3 5 5 5 5 3
