In [None]:
system"cd ",getenv[`HOME],"/course-introductory-workshop"
.trn.nbdir:system"cd"
\l scripts/loaddata.q

**Learning objectives**

To understand:

* What are lists?
* Casting
* Obtaining random data
* List amendment
* Dictionaries
* Tables

# Lists

[code.kx - q4m - lists](https://code.kx.com/q4m3/3_Lists/)

So far we have seen how a _table_ is a natural fit for storing and analyzing huge amounts of data. Under the covers though, q exploits a more fundamental data structure to build the table: the _list_. A large part of the performance edge of kdb+/q comes from its ability to work directly with lists – every user should be familiar with them!

To get our hands on our first list, consider the `trips` table that we already met. If all we care about is the amount people paid for their journey, we can inspect the `fare` column. As you have already seen, we can `select` just the column of interest, with an optional Where phrase.

In [None]:
vtsfares:select fare from trips where date = 2009.01.01, vendor=`VTS
vtsfares

Since tables in kdb+/q are *column-oriented*, columns can be extracted simply by indexing into the table with the column name, returning the column as a contiguous vector, or *list*. 

In [None]:
fares: vtsfares`fare
fares

To check that we've got what we expect, we can use the `type` operator:

In [None]:
type fares

The number is positive and under 20: we have a *simple list*. 
In a simple list, the items all have the same type.

When a list has items of different types, it is referred to as a *general list*. 
For example, a pair representing the taxi company and the fare paid could look like:

In [None]:
general:(`VTS;23.45);
general

While simple lists always have strictly positive values returned by type, general lists always have type zero.

In [None]:
type general
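
To make the type numbers concrete, here is a minimal sketch on a few made-up literals (the values are illustrative, not taken from `trips`). It assumes kdb+ 3.0 or later, where integer literals default to long. Note that an atom reports the negative of the type number of the corresponding simple list.

```q
// Illustrative literals only; assumes integer literals default to long
type 42            // -7h: an atom has a negative type
type 1 2 3         // 7h: simple list of longs
type 1.5 2.5       // 9h: simple list of floats
type `VTS`CMT      // 11h: simple list of symbols
type (`VTS;23.45)  // 0h: general list
```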

Joining items of different types with the Join operator `,` produces a general (mixed) list.

In [None]:
general:2018.01.01,102,`hello,enlist "world"
general
type general

A list can be _empty_ - if there had been a typo in the select statement, and we inquired about a non-existent cab company, we would see:

In [None]:
svtfares:select fare from trips where month=2009.01m, vendor=`SVT
svtfares`fare

## Casting

From the result above you can see `$` used to [cast](https://code.kx.com/q/ref/cast/) an empty list. 

When working with data, it is often necessary to cast (change) the data from one type, e.g. a time like `09:30:00`, to another, e.g. a datetime like `2020.05.19T09:30:00`. We can use `$` to cast a value of a non-textual type to another type, naming the target type in any of three ways:

In [None]:
`float$1 2 //using its symbol name 
"f"$1 2  //using its character letter
9h$1 2   //using its short value
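
A few more casts may help build intuition. The sketch below uses made-up literals (a timestamp rather than a datetime, chosen for illustration). The last two lines go slightly beyond casting: with an upper-case type letter and a string on the right, `$` parses text instead (the Tok operator).

```q
// Illustrative literals only, not taken from the workshop data
`long$3.7                    // 4: floats are rounded, not truncated
`date$2020.05.19D09:30:00    // 2020.05.19: keep only the date part
`time$2020.05.19D09:30:00    // 09:30:00.000: keep only the time part
`minute$09:30:45             // 09:30
`boolean$0 1 2               // 011b: non-zero becomes true
"J"$"42"                     // 42: parse a string to a long (Tok)
"D"$"2020.05.19"             // 2020.05.19: parse a string to a date (Tok)
```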

We can create an empty list as a general (i.e. untyped) list, or create a typed empty list:

In [None]:
() //general list 
`long$() //list of type long
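
The difference between the two kinds of empty list shows up in `type`. A small sketch, using throwaway literals:

```q
type ()            // 0h: general empty list
type `long$()      // 7h: empty list of longs
type `float$()     // 9h: empty list of floats
type 0#1.5 2.5     // 9h: taking zero items keeps the type
count `float$()    // 0
```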

We can look at the results just by passing the variable name `fares`:

In [None]:
fares

Yikes! That's a lot of screenspace to waste. 
To save the electrons, we can just look at the first few elements using the `sublist` operator:

In [None]:
10 sublist fares

sublist is clever - to get the _last_ few elements, all you have to do is give a negative number:

In [None]:
-10 sublist fares

##### Exercise 13
- Use sublist to get the second 10 elements in the list

In [None]:
-10 sublist 20 sublist fares
//alternative way - 10 10 sublist fares

In [None]:
// Enter your code here 

In [None]:
ex13[] //check correct output

`sublist` has the nice property that the number of items it returns is _capped_ at the size of the list it operates on. In comparison, the [Take operator `#`](https://code.kx.com/q/ref/take/) returns exactly the number of items you specify, cycling through the list again if it runs out:

In [None]:
count 10000000 # fares
count 10000000 sublist fares
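
The same contrast is easier to see on a tiny list. In this sketch `x` is a hypothetical three-item list, not part of the workshop data:

```q
x:10 20 30        // hypothetical toy list
5 sublist x       // 10 20 30: capped at the length of x
5#x               // 10 20 30 10 20: Take cycles to deliver exactly 5 items
-2 sublist x      // 20 30: last two items
-2#x              // 20 30
1 2 sublist x     // 20 30: start at index 1, return 2 items
```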

Let’s put a bit of structure on our list: make a _sorted_ copy of it that we can play with. 
The [`asc` keyword](https://code.kx.com/q/ref/asc/) does this:

In [None]:
sortedFares:asc fares

`sortedFares` has the same `count` and `type` as fares, but now is _sorted_ in ascending order. If you looked only at the first elements of this list, you might conclude that cab journeys in NYC are great value!

In [None]:
10 sublist sortedFares
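
For reference, `asc` has a few close relatives. A small sketch on a hypothetical list `x` (illustrative values only):

```q
x:3.5 1.2 9.9 1.2   // hypothetical toy list
asc x               // `s#1.2 1.2 3.5 9.9: sorted, and flagged with the sorted attribute
desc x              // 9.9 3.5 1.2 1.2
iasc x              // 1 3 0 2: the indexes that would sort x
x iasc x            // 1.2 1.2 3.5 9.9: indexing with them sorts the list
```

The `` `s# `` prefix in the display marks the sorted attribute that `asc` applies to its result.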

##### Exercise 14
- Use sublist to obtain the 10 highest values from the `sortedFares` list

In [None]:
-10 sublist sortedFares

In [None]:
// Enter your code here 

In [None]:
ex14[] //check correct output

## Obtaining random data

That was an easy trap to fall into: we extracted an _unrepresentative sample_. 
To pick ten _random_ items from the list, we can use the [Roll `?` operator](https://code.kx.com/q/ref/deal/).

In [None]:
sampleFares:10?sortedFares;
sampleFares
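
Roll samples _with_ replacement, so the same item can be picked more than once. The sketch below shows a few other forms of `?` on throwaway arguments (the symbols are illustrative). Results are random, so no outputs are shown.

```q
// Illustrative arguments only; results differ on every run
5?10              // Roll: five longs drawn from 0..9, repeats possible
-5?10             // Deal: five distinct longs drawn from 0..9
3?1.0             // three floats between 0 and 1
3?`VTS`CMT`DDS    // three picks from a list, repeats possible
```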

Lists support _random access_. To pick out the 10th element of a list, we use:

In [None]:
fares[9]

The preceding yields an atom, as can be verified by using `type`. A similar approach works for lists of indexes:

In [None]:
fares[0 1 2 3 4 5 6 7 8 9]

<img src="images/qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:20px;padding-left:5px;" align="left"/><p style='color:#273a6e'><i> Normally, you would use the [`til` keyword](https://code.kx.com/q/ref/til/) to get the list of the first N integers, starting at zero. (As you have seen, q uses zero indexing.) </i></p>

In [None]:
til 10
fares[til 10]

##### Exercise 15
- Extract the 11th to the 20th elements from the fares list using the til keyword

In [None]:
fares[10 + til 10]

In [None]:
// Enter your code here

In [None]:
ex15[] //check correct output

##### Exercise 16
- Use indexing to find the middle value in the `sortedFares` list. 

In [None]:
sortedFares [`long$(count sortedFares)%2]

In [None]:
// Enter your code here

In [None]:
ex16[] //check correct output

In the case of a simple list, if the index is out of range (here, too high), a _null_ of the list’s type is returned.

In [None]:
sortedFares[count sortedFares]
sortedFares[-1+count sortedFares]  // index from 0 to N-1
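
The null returned depends on the type of the list. A minimal sketch with hypothetical toy lists:

```q
x:10 20 30        // hypothetical toy list of longs
x[3]              // 0N: one past the end
x[-1]             // 0N: negative indexes are out of range too
x[0 5 2]          // 10 0N 30: only the bad index yields a null
(1.5 2.5)[7]      // 0n: float null
(`VTS`CMT)[7]     // `: symbol null
last x            // 30: the usual way to get the final item
```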

Both lines in the code block below obtain the first value from the list: 

In [None]:
1 sublist sortedFares
first sortedFares

<img src="images/qbies.png" width="50px" style="width: 50px;padding-right:5px;padding-top:20px;padding-left:5px;" align="left"/><p style='color:#273a6e'><i> Notice the difference between what is returned by `1 sublist sortedFares` and `first sortedFares`. The former returns a one-item list and the latter an atom. You can see below how q displays one-item lists on the console. </i></p>

[`enlist`](https://code.kx.com/q/ref/enlist/) returns a list containing the argument passed to it.

Alternatively, join `()` to an atom to make a one-item list.

In [None]:
enlist 499
(),499
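
The two approaches agree on atoms but differ on lists, as this small sketch with throwaway values suggests:

```q
type 499              // -7h: atom
type enlist 499       // 7h: one-item list
type (),499           // 7h: one-item list
count enlist 1 2 3    // 1: enlist always adds a level of nesting
count (),1 2 3        // 3: joining () leaves a list unchanged
```

This is why `(),x` is a common way to make sure a value is a list without nesting it when it already is one.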

## List Amendment
A simple list can be indexed into using the `@` operator:

In [None]:
2* til 5
@[sampleFares;(2*til 5)]

The `@` operator can be applied with further arguments so that the list can be altered. Below we replace the items at positions `2*til 5` with `99f`.

In [None]:
// index into sampleFares
// using list of indexes (2*til 5)
// assign these values - :
// the value 99f
@[sampleFares;(2*til 5);:;99f]  

Below we use `+` instead of `:` – instead of replacing the items, we add `99f` to them.

In [None]:
@[sampleFares;(2*til 5);+;99f]

The above are not persistent changes: each call makes a copy of the `sampleFares` list with the selected items changed and displays the result, but the `sampleFares` list itself is unchanged. 

In [None]:
sampleFares  // original list not updated

To persist the change, assign the result to a name, or prefix the name of the list with a back-tick to amend the list in place:

In [None]:
test:@[fares;(2*til 4);:;0Nf]
test
@[`fares;(2*til 4);:;0Nf]
fares
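
The forms of amendment can be compared side by side on a small list. In this sketch `v` is a hypothetical name, not part of the workshop data. The last lines use indexed assignment, a shorter notation that also amends in place.

```q
v:1 2 3 4 5           // hypothetical toy list
@[v;0 2;:;0]          // 0 2 0 4 5: a modified copy
@[v;0 2;+;10 20]      // 11 2 23 4 5: one right-hand value per index
@[v;1;neg]            // 1 -2 3 4 5: unary function, no fourth argument
v                     // 1 2 3 4 5: unchanged so far
v[0 2]:0              // indexed assignment amends v in place
v[1]+:100             // add 100 to the item at index 1
v                     // 0 102 0 4 5
```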

Extend a list by appending to it using the [Join operator `,`](https://code.kx.com/q/ref/join/).

In [None]:
fares,:12.34
-10#fares    // inspect the end of the list to see the appended value

Perhaps some data has been lost, or otherwise corrupted. kdb+/q handles null values. Is this a problem for us?

In [None]:
any null fares

This is exactly equivalent to using `any[null[fares]]` – but perhaps a little cleaner? Your mileage may vary!

The [`null` keyword](https://code.kx.com/q/ref/null/) flags nulls, and `where` returns the indexes of the flagged items.

In [None]:
where null fares
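
On a small list the pieces are easy to follow. In this sketch `x` is a hypothetical float list with one null. The final line uses the Fill operator `^`, which is not covered above but is a common way to replace nulls.

```q
x:1.5 0n 3.5      // hypothetical toy list with one null
null x            // 010b: one flag per item
where null x      // ,1: index of the null
any null x        // 1b
avg x             // 2.5: nulls are ignored by avg
0f^x              // 1.5 0 3.5: Fill replaces nulls with the left argument
```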

##### Exercise 17

Amend the `fares` list, replacing the null values with the average fare.

In [None]:
@[fares;where null fares;:;avg fares]

In [None]:
// Enter your code here

In [None]:
ex17[] //check correct output

# Dictionaries 
[Dictionaries](https://code.kx.com/q/basics/dictsandtables/) are first-class objects in q. (They are known as *hashmaps* in some other languages.) 

Use the [Dict operator `!`](https://code.kx.com/q/ref/dict/) to make a dictionary from a list of keys and a list of values.

In [None]:
d:`a`b!0 1
d

We can access and update existing values by passing the key to the variable name:

In [None]:
d[`a]
d[`a]:2
d

We can also add keys to the existing dictionary:

In [None]:
d[`c]:3 // add a new key/value pair to d
d
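
A few more operations are worth knowing. The sketch below uses a hypothetical dictionary `prices` (made-up keys and values) so that `d` is left untouched.

```q
prices:`VTS`CMT!23.45 12.5   // hypothetical dictionary
key prices                   // `VTS`CMT
value prices                 // 23.45 12.5
prices`CMT`VTS               // 12.5 23.45: look up several keys at once
prices`DDS                   // 0n: a missing key returns a null
count prices                 // 2
```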

A dictionary can be joined to another dictionary. Below we have two examples:
1. Add values of two dictionaries
2. Join two dictionaries, prioritising values from the right-hand dictionary

In [None]:
d1:`a`b`c`d!5 6 7 8

In [None]:
d+d1 // add values for common keys
d,d1 // catenation - updates values for common keys, inserts new keys. Typical application is updating a snapshot with deltas.
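
The difference between the two is clearest when the dictionaries only partly overlap. A minimal sketch with hypothetical names `snap` and `delta`:

```q
snap:`a`b!1 2        // hypothetical snapshot
delta:`b`c!10 20     // hypothetical update
snap+delta           // `a`b`c!1 12 20: common key b is summed
snap,delta           // `a`b`c!1 10 20: common key b takes the right-hand value
```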

# Tables 
Tables are first-class objects in q. Any list of 'like dictionaries' (meaning multiple dictionaries with the same keys) is a table. They can also be constructed with table notation or from column dictionaries.

1. Creating a table from a list of like dictionaries

In [None]:
(`a`b!0 1;`a`b!2 3)

2. Creating a table with [table notation](https://code.kx.com/q/kb/faq/#table-notation)

In [None]:
([]a:0 2;b:1 3)

3. Creating a table from a [column dictionary](https://code.kx.com/q/kb/faq/#flip-a-column-dictionary). A table is a transpose (flip) of a conforming dictionary (key of symbols, value of list of equal length lists).

In [None]:
flip `a`b!(0 2;1 3) 
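
The three constructions produce identical tables, which we can confirm with the Match operator `~`. In this sketch `t1`, `t2` and `t3` are hypothetical names introduced for illustration:

```q
t1:(`a`b!0 1;`a`b!2 3)      // list of like dictionaries
t2:([]a:0 2;b:1 3)          // table notation
t3:flip `a`b!(0 2;1 3)      // flipped column dictionary
t1~t2                       // 1b
t2~t3                       // 1b
type t1                     // 98h: the type of a table
t1 0                        // `a`b!0 1: indexing by row returns a dictionary
t1`a                        // 0 2: indexing by column name returns a list
```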

We can also add tables together; with matching columns, the addition is applied item by item.

In [None]:
([]a:0 2;b:1 3)+([]a:4 5;b:6 7)

Tables can be keyed. Here are two of the ways to create a [keyed table](https://code.kx.com/q/kb/faq/#keyed-tables).

1. Specify key columns with the [`xkey` keyword](https://code.kx.com/q/ref/xkey/)

In [None]:
k:`a xkey ([]a:0 2;b:1 3)
k

2. Specify key columns in the table notation.

In [None]:
([a:0 2]b:1 3)
([a:0 2;b:1 3]c:4 5)

Working with a keyed table is similar to working with a dictionary. We obtain the keys and values with `key` and `value`:

In [None]:
key k
value k

A keyed table is a dictionary where both key and values are tables:

In [None]:
key[k]!value k

And as such, we can perform lookups on the keys to obtain values based on the keys:

In [None]:
k([]a:0 1 2)
([]a:0 1 2)#k
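
With a single key column, a lookup can also take the key value directly. A minimal sketch on a hypothetical keyed table `kt` (made-up column names and values):

```q
kt:([id:`x`y]v:10 20)   // hypothetical keyed table
kt`x                    // dictionary with v equal to 10
kt[`x;`v]               // 10: key, then column
kt[`z;`v]               // 0N: a missing key returns nulls
keys kt                 // ,`id: names of the key columns
0!kt                    // the same data as an unkeyed table
```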

##### Exercise 18

a.  Create a dictionary with keys, `a`, `b`, and `c`, and assign to each key a list of three random ints.

In [None]:
dict:`a`b`c!(3?10i;3?10i;3?10i)
dict

In [None]:
// Enter your code here 

In [None]:
exer18_a[] //check correct output

b. Add a new key, `d`, with double the values of key `a`.

In [None]:
dict[`d]:2*dict[`a]
dict

In [None]:
// Enter your code here

In [None]:
exer18_b[] //check correct output

c. Make a table from the dictionary

In [None]:
tab:flip dict
tab

In [None]:
// Enter your code here 

In [None]:
exer18_c[] //check correct output

d. Make a new table by joining the table to itself

In [None]:
tab2:tab,tab
tab2

In [None]:
// Enter your code here

In [None]:
exer18_d[] //check correct output

e. Make column `b` the key of this new table

In [None]:
tabKeyed:`b xkey tab2
tabKeyed

In [None]:
// Enter your code here 

In [None]:
exer18_e[] //check correct output

f. Compare the types of all the generated tables and dictionaries. What do you notice?

In [None]:
type each (dict;tab;tabKeyed)

In [None]:
// Enter your code here 

In [None]:
// Run this cell to compare results
exer18_f[]