# Lists and dictionaries

In this class we will look at working with collections of data. Lists and dictionaries are the main tools for this, and you will be introduced to both. We will also give an overview of lambda functions, which give you more flexibility when transforming data, and introduce generators, which are a way to produce data that you can iterate over.

After this class you will know how to:
    * work with lists
    * work with dictionaries
    * create lambda functions
    * create a generator

## Recap

Last week we looked at input and output for our programs, and also at importing functionality from files and modules.

### Import

Using the `import` keyword we can import functions, variables and classes from other files or modules. For example, the `math` module defines the `pi` variable.

In [None]:
import math

print(math.pi)

We also saw how we could import only specific variables and functions:

In [None]:
from math import pi
print(pi)

### Input/Output

We saw that the `print` function writes something to the terminal. This is not the same as returning a value from a function.

#### Command line arguments

Command line arguments that are passed along to a script are available through the `sys` module, which has an `argv` variable that is a list of `str`. Remember that the first element in this list is the name of the script.

```python
import sys

print(sys.argv[0]) # prints the name of the file
```

#### Reading and writing to files

Using the `open` function we can open a file for reading or writing. This function returns a file object that has functions for reading and writing. Remember that you need to pass `"w"` as the second argument to `open` if you want to write to the file.

```python
# read all the contents from a file
f = open("data.txt")
data = f.read()
f.close()
```

```python
# write a text to a file
f = open("output.txt", "w")
output = """Here is some
text that we want to 
write to a file
"""
f.write(output)
f.close()
```

**ALWAYS** remember to `close` the file when you are done.
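
One way to avoid forgetting `close` is Python's `with` statement, which closes the file automatically when the indented block ends. Here is a minimal sketch, assuming we are free to create and overwrite a file called `output.txt` (the same illustrative name as above):

```python
# write to a file; it is closed automatically when the block ends
with open("output.txt", "w") as f:
    f.write("Here is some text\n")

# read it back; again no explicit close is needed
with open("output.txt") as f:
    data = f.read()

print(data)
print(f.closed)  # True: the file was closed for us
```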



## 1. Lists

Lists are the basic data structure for storing multiple values using one variable.

In [None]:
li = [1,2,3,4,5,6] # create a new list

Lists are, however, not restricted to a certain type in Python. You can freely mix the types of the elements:

In [None]:
[[1,2,3], "Hello", 5.6, (True, False)]

### 1.1 Indexing

List indexing in Python is done using brackets `[i]`, where `0` is the first element.

In [None]:
li = [1,2,3,4,5] # create a new list

In [None]:
print(li[0])

You can also use negative indexing in Python to get the `i`th value starting from the end of the list. So at index `-1` you have the last element, `-2` the second from last, and so on...

In [None]:
print(li[-2])

It is also possible to get a subset of a list, called a `slice`, using a range of indices. This is done using the syntax *first*`:`*last*. This takes all the elements starting from and including *first*, up to and **excluding** *last*.

In [None]:
print(li[1:3]) # get elements li[1], li[2]

You can also omit one of the values. If you omit *first*, the range starts from `0`. If *last* is omitted, the range ends at the last element of the list (including it).

In [None]:
print(li[2:]) # get all the elements after the first two

In [None]:
print(li[:3]) # get the first three elements

Lastly, you can also specify a step within your range using the *first*`:`*last*`:`*step* syntax:

In [None]:
print(li[0:4:2]) # get every second element starting at the first and stopping at the fourth

In [None]:
print(li[::2]) # get every second element in the list
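
To collect the slicing rules above in one place, here is a small sketch with the expected output written next to each line. The list name `letters` is made up for this example. The last two lines show two behaviours not spelled out above: a negative step walks the list backwards, and a slice that runs past the end of the list is simply cut short instead of giving an error.

```python
letters = ["a", "b", "c", "d", "e"]

print(letters[1:3])    # ['b', 'c']
print(letters[-2:])    # ['d', 'e']
print(letters[::2])    # ['a', 'c', 'e']
print(letters[::-1])   # ['e', 'd', 'c', 'b', 'a']
print(letters[1:100])  # ['b', 'c', 'd', 'e']
```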

### 1.2 Re-assigning values

The indexing syntax can also be used to re-assign elements within a list:

In [None]:
li[2] = 8 # re-assign the third element to "8"
print(li)

In [None]:
li[:3] = [10, 11, 12] # re-assign the first three values to [10, 11, 12]
print(li)

### 1.3 Iterating over a list

We saw before how we can do this using a `for`-loop.



In [None]:
for e in li:
    print(e)

If you need to keep track of the index in each iteration of the loop, Python's `enumerate` function comes in handy:

In [None]:
for i, e in enumerate(li):
    print("li[", i, "] = ", e)

The builtin `len` function will give you the length of a list. This can be used in case you want to use the `range` function instead: 

In [None]:
for i in range(len(li)):
    print("li[", i, "] = ", li[i])

With the range function you can use a negative step to start at the end and iterate in reverse order instead.

In [None]:
for i in range(len(li)-1, -1, -1):
    print("li[", i, "] = ", li[i])

But an easier way would be to use Python's builtin `reversed` function.

In [None]:
for e in reversed(li):
    print(e)

### 1.4 Adding and removing elements

#### Inserting

Lists can be concatenated with each other. If we want to add more elements to a list, we can use the `+` operator, which creates a new list that is the concatenation of the two.

In [None]:
print(li + [7, 8, 9])

NOTE that we can only use this operator between two lists. If you just want to add one element, you need to wrap it in a new list:

In [None]:
print(li + [7])

The `+` operator takes two lists and returns a new one, without modifying the original ones. If you want your original variable to hold the updated list, you need to re-assign it, for example by using `+=` instead.
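
The difference between `+` and `+=` can be sketched as follows. The variable names `original` and `combined` are made up for this example:

```python
original = [1, 2, 3]

combined = original + [4]   # builds a new list, original is untouched
print(original)             # [1, 2, 3]
print(combined)             # [1, 2, 3, 4]

original += [4]             # now the original variable holds the longer list
print(original)             # [1, 2, 3, 4]
```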

List objects also have functions that modify the original list directly. For adding just one element, lists have the `append` function:

In [None]:
li.append(7)
print(li)

You can also `insert` an element at a specific index, pushing the elements after the given index one step back. `insert` takes as first parameter the index where you want to insert the new value, which comes in the second parameter.

In [None]:
li.insert(4, 8)
print(li)

#### Removing

Removing elements can be done by extracting a slice of the list as we saw earlier

In [None]:
print("remove two first elements,", li[2:])
print("remove the last three elements", li[:-3])
print("note that li is unmodified:", li)

Just as when using the `+` operator we need to remember to re-assign the variable if we want to modify the actual list.

You can use the `del` statement if you want to do this without re-assigning:

In [None]:
del li[4] # remove the element at index 4 (the 8 we inserted earlier)
print(li)

And the `del` statement can also be used with a range:

In [None]:
del li[-2:] # remove the last two elements
print(li)

You can achieve similar behavior using the list's `pop` function, which takes one optional parameter. Called without parameters, it removes the last element. Optionally you can specify the index of the element to remove.

In [None]:
li.pop() # remove last element
li.pop(0) # remove first element
print(li)
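
One difference from `del` is that `pop` also returns the element it removed, so you can keep working with it. A short sketch, using a made-up list called `queue`:

```python
queue = ["first", "second", "third", "fourth"]

last = queue.pop()     # removes and returns the last element
head = queue.pop(0)    # removes and returns the element at index 0

print(last)    # fourth
print(head)    # first
print(queue)   # ['second', 'third']
```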

To remove specific values from a list you can use the list's `remove` function which will delete the *first* occurrence of a value within a list

In [None]:
li = [1,2,3,4,5]
li.remove(3) # remove the value "3" from the list
print(li)

In [None]:
# create a new list of ones and test to remove the value "1"
li = [1,1,1,1] 
li.remove(1)
print(li)

### 1.5 Initialization

Writing out all the initial values of a list by hand can be cumbersome, and in some cases it is not even feasible.

Say that you want to create a list that has the values 0 to 999. One way of achieving this would be to create an empty list and use a `for`-loop and `range` where we append in each iteration:

```python
li = []
for i in range(1000):
    li.append(i)
```

Python has a nice way of doing this in one line, using the following syntax:

`[` *expression* `for` *element* `in` *sequence* `]`

This is called **list comprehension**.

In [None]:
li = [i for i in range(1000)] # create a list of values 0-999
print(li[:5], "...", li[-5:])
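
The *expression* part does not have to be the element itself. As a small sketch (the names `squares`, `words` and `long_words` are made up for this example), the first comprehension below transforms each element, and the second one also uses the optional `if` clause that a comprehension can carry to keep only some of the elements:

```python
# transform each element: the squares of 0-5
squares = [i ** 2 for i in range(6)]
print(squares)  # [0, 1, 4, 9, 16, 25]

# transform and select: upper-case only the words longer than 4 letters
words = ["list", "of", "some", "short", "words"]
long_words = [w.upper() for w in words if len(w) > 4]
print(long_words)  # ['SHORT', 'WORDS']
```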

### Exercises

#### Even and odd list

In this exercise I would like you to create a function that takes one input parameter and returns two lists. The input parameter `n` is an upper limit, and the function should return all even numbers from 0 up to (but not including) `n` in one list and all odd numbers in the other list.

In [2]:
# your code here
def even_odd(n):
    elems = [i for i in range(n)]
    return elems[::2], elems[1::2]

In [4]:
evens, odds = even_odd(10) # even: [0,2,4,6,8], odds: [1,3,5,7,9]
print(evens)
print(odds)

[0, 2, 4, 6, 8]
[1, 3, 5, 7, 9]


#### Split string

`str` objects are sequences of characters and support the same indexing and slicing as lists:

In [8]:
"here is a string"[0]

'h'

Create a function that splits a string in every other letter, returning two strings: one with all the letters at even indices and one with all the letters at odd indices.

In [11]:
# your code here
def split(s):
    return s[::2], s[1::2]

In [12]:
split("Hello, World") # should give ('Hlo ol', 'el,Wrd')

('Hlo ol', 'el,Wrd')

#### Common elements

In Python you can check if an element exists in a list using the `in` keyword: *element* `in` *list*. This expression returns a boolean value, `True` or `False`, depending on whether the value is in the list.

In [13]:
print(5 in [1,2,3,4,5])
print(10 in [1,2,3,4,5])

True
False


Now we want to write a function that takes two lists as parameters and returns a new list with all the common elements.

In [14]:
# your code here
def common(l1, l2):
    c = []
    for e in l1:
        if e in l2:
            c.append(e)
    return c

In [15]:
co = common([1,2,3,4,5], [4,5,6,7,8,9,1]) # should return [1,4,5]
print(co)

[1, 4, 5]


#### Sort a list [optional]

In this exercise we will try to sort a list. You can choose any sorting algorithm you like, but [*Insertion sort*](https://en.wikipedia.org/wiki/Insertion_sort) is quite a simple one to get started with if you have no preference. Here is the pseudocode from Wikipedia:
```
i ← 1
while i < length(A)
    j ← i
    while j > 0 and A[j-1] > A[j]
        swap A[j] and A[j-1]
        j ← j - 1
    end while
    i ← i + 1
end while
```

In [16]:
# your code here
def sort(li):
    for i in range(1, len(li)):
        j = i
        while j > 0 and li[j-1] > li[j]:
            li[j], li[j-1] = li[j-1], li[j]
            j -= 1

In [17]:
li = [90,23,1,5,3,9,43,21,87]
sort(li)
print(li)

[1, 3, 5, 9, 21, 23, 43, 87, 90]


## 2. Tuples

Tuples are very similar to lists, but with some key differences.

Syntax is very similar:

In [None]:
t = (1987, 10, 24) # note parentheses instead of square brackets

And you can index them similarly to lists

In [None]:
print(t[1])

One large difference is that tuples are not mutable, meaning you cannot re-assign the values within:

In [None]:
t[1] = 11 # this is going to give an error!

The main idea when choosing between tuples and lists is that:
 * a list is a sequence of data of a similar type, where each element is meant to be processed separately
 * a tuple is a logical group of data, possibly of different types, that is meant to be processed together
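
Because the values in a tuple belong together, a tuple is often unpacked into separate variables in a single assignment, just as we did with the return values in the exercises above. A minimal sketch, where the function `min_max` is a made-up helper for this example:

```python
date = (1987, 10, 24)
year, month, day = date          # unpack the tuple into three variables
print(year, month, day)          # 1987 10 24

def min_max(values):
    # returning two values separated by a comma returns them as one tuple
    return min(values), max(values)

lowest, highest = min_max([3, 1, 4, 1, 5])
print(lowest, highest)           # 1 5
```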

## 3. Dictionaries

Dictionaries are mappings from hashable values to arbitrary objects. They define a set of unique keys that each point to an object. Below is an example of a mapping (not Python) that has the keys 'a', 'b' and 'c', which point respectively to the values 1, 2 and 3.

```
{
    'a' => 1
    'b' => 2
    'c' => 3
}
```

### 3.1 Initialization 

In Python, dictionaries can be initialized using the `dict` constructor, with keyword parameters as start values:

In [19]:
d = dict(a=1, b=2, c=3)
print(d)

{'a': 1, 'b': 2, 'c': 3}


We can also do this using a statement within `{}`, where we specify the keys and their values 
```python
{
    key1: value1,
    key2: value2,
}
```

In [20]:
d = {
    'a': 1,
    'b': 2,
    'c': 3,
}
print(d)

{'a': 1, 'b': 2, 'c': 3}


Notice that in the first example the keys are always strings, because the keyword names become `str` keys (this is also the most common scenario). In the second version we wrote the keys out ourselves. This means we can create dictionaries with key types other than strings, and the keys in one dictionary do not all need to be of the same type:

In [21]:
# here is a quite wild dictionary that contains a little of everything
d = {
    'a': 1,
    39: [[1,2,3],[4,5,6],[7,8,9]],
    "Hello,": "World",
    False: 0,
}
print(d)

{'a': 1, 39: [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'Hello,': 'World', False: 0}


Dictionaries can also be created with a comprehension, using a syntax similar to list comprehension:

In [22]:
keys = ["a", "b", "c", "d"]
D = {key: i+1 for i, key in enumerate(keys)}
print(D)

{'a': 1, 'b': 2, 'c': 3, 'd': 4}


### 3.2 Indexing and mutating objects

The syntax is the same as with lists except that instead of strictly using integers for indexing you can use any type that you used for creating keys.

In [23]:
print(d['a'])
print(d[False])

1
0


To re-assign a value you just need to use one of the assignment operators, such as `=` or `+=`.

In [24]:
d["Hello,"] = "class"
print(d)

{'a': 1, 39: [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'Hello,': 'class', False: 0}


In [25]:
d[False] += 10
print(d)

{'a': 1, 39: [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'Hello,': 'class', False: 10}


If you want to add a new key and value pair it is the exact same syntax.

In [26]:
d['b'] = 2
print(d)

{'a': 1, 39: [[1, 2, 3], [4, 5, 6], [7, 8, 9]], 'Hello,': 'class', False: 10, 'b': 2}
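
Indexing with a key that is not in the dictionary raises a `KeyError`. As a hedged sketch of two related tools that are not covered above, dictionaries have a `get` function that returns a default value instead of raising an error, and the `del` statement we used on lists also removes a key from a dictionary. The dictionary `stock` is made up for this example:

```python
stock = {"apples": 3, "pears": 0}

print(stock.get("kiwis"))      # None: the key is missing, no error is raised
print(stock.get("kiwis", 0))   # 0: our own default value
print(stock.get("apples", 0))  # 3: the key exists, so the default is ignored

del stock["pears"]             # remove a key and its value
print(stock)                   # {'apples': 3}
```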


### 3.3 Iterating

There are many ways to iterate over dictionaries. You can use a `for`-loop directly, in which case we iterate over the keys, which we can then use to access the values.

In [27]:
for key in d:
    print(key)

a
39
Hello,
False
b


If you would like to get the key and the value directly in each step of the loop, you can use the `items` function of the dictionary:

In [28]:
for key, value in d.items():
    print(key, "=>", value)

a => 1
39 => [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Hello, => class
False => 10
b => 2


Dictionaries also have a `values` function that can be used to iterate over only the values within the dictionary:

In [29]:
for value in d.values():
    print(value)

1
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
class
10
2


### Exercises

#### Word counter

Just like with lists, you can use the `in` keyword on dictionaries, in this case to check whether a certain key is present in the dictionary:

In [30]:
'a' in d

True

In this exercise we want to create a word counter. You will write a function that takes a string as an input parameter and returns a dictionary that maps the words in the text to their number of occurrences.

**TIP**

`str` objects have a `split` function that you can use to create a list of substrings, where the original value has been split at each occurrence of the delimiter

In [31]:
"split,these,words".split(",") # split on "," characters 

['split', 'these', 'words']

In [34]:
# your code here
def occurrences(txt):
    occ = {}
    for word in txt.split(" "):
        if word in occ:
            occ[word] += 1
        else:
            occ[word] = 1
    return occ

In [35]:
text = """insertion sort iterates consuming one input element each repetition and growing a sorted output list at each iteration insertion sort removes one element from the input data finds the location it belongs within the sorted list and inserts it there It repeats until no input elements remain"""
counts = occurrences(text)
print(counts)

{'insertion': 2, 'sort': 2, 'iterates': 1, 'consuming': 1, 'one': 2, 'input': 3, 'element': 2, 'each': 2, 'repetition': 1, 'and': 2, 'growing': 1, 'a': 1, 'sorted': 2, 'output': 1, 'list': 2, 'at': 1, 'iteration': 1, 'removes': 1, 'from': 1, 'the': 3, 'data': 1, 'finds': 1, 'location': 1, 'it': 2, 'belongs': 1, 'within': 1, 'inserts': 1, 'there': 1, 'It': 1, 'repeats': 1, 'until': 1, 'no': 1, 'elements': 1, 'remain': 1}


#### Generate a dictionary

Create a function that takes a parameter `n` and returns a dictionary whose keys are all the values from 1 to `n`, each pointing to the key raised to the power of 3.

```
{
    1: 1^3,
    2: 2^3,
    ...
    n: n^3,
}
```

**Reminder:** the `**` operator performs exponentiation.

In [38]:
# your code here
def exp3(n):
    return {v: v**3 for v in range(1, n+1)}

In [39]:
di = exp3(20)
print(di[2]) # should print 8 (= 2^3)
print(di[4]) # should print 64

8
64


## 4. Generators [advanced]

We have looked at ways to store and work with larger amounts of static data. This section introduces some ideas on how to work with, and iterate over, more dynamic data.

There might be cases where you have an enormous amount of data to process that you simply cannot load into memory all at once. Instead you want to load chunks that you work with one at a time. One way of doing this is using generators.

Generators are basically functions that use a `yield` statement to hand back a new chunk. The function pauses after the `yield` and continues when the next chunk is requested. A basic example looks like this:

In [None]:
# create a generator that works like range but yields one value at a time
def gen_range(n):
    i = 0
    while i < n:
        yield i
        i += 1

In [None]:
g = gen_range(10)
print(type(g))

In [None]:
for v in g:
    print(v)
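
To see the pausing behaviour more directly, you can request the values one at a time with Python's builtin `next` function instead of a `for`-loop. The sketch below repeats the `gen_range` definition from above so that it can be run on its own:

```python
def gen_range(n):
    i = 0
    while i < n:
        yield i
        i += 1

g = gen_range(3)
print(next(g))  # 0: runs the function until the first yield
print(next(g))  # 1: continues from where it paused
print(list(g))  # [2]: collects whatever is left
print(list(g))  # []: the generator is now used up
```

Note that a generator can only be iterated over once. To start over you need to call `gen_range` again.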

Here is a slightly more advanced example where we group the values in chunks of 10 before yielding:

In [None]:
# create a generator that yields the values 0-99 as lists, in chunks of 10
def gen():
    chunk = []
    for i in range(100):
        chunk.append(i)
        if len(chunk) == 10: # when chunk has a length of 10 we yield it
            yield chunk
            chunk = [] # empty it for next time

In [None]:
for chunk in gen():
    print(chunk)
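
The example above works out evenly because 100 is divisible by 10. If the number of values is not a multiple of the chunk size, the last values are never yielded. Here is one possible sketch of a more general version, where the function name `chunked` and its parameters `values` and `size` are made up for this example:

```python
def chunked(values, size):
    chunk = []
    for v in values:
        chunk.append(v)
        if len(chunk) == size:  # the chunk is full, hand it over
            yield chunk
            chunk = []          # start a new, empty chunk
    if chunk:                   # yield the leftover values, if there are any
        yield chunk

for chunk in chunked(range(7), 3):
    print(chunk)
# [0, 1, 2]
# [3, 4, 5]
# [6]
```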

### Exercise [optional]

#### Generator for lines in file

Try writing a generator that reads the lines of a file and yields them in groups of 5.

## 5. Lambda functions

Lambda functions can be seen as one-line functions that take an input and return a new value. They are created using the `lambda` keyword: 

`lambda` *input*`:`*output*

In [None]:
la = lambda x: x + 1

In [None]:
print(type(la))

In [None]:
la(10)

Lambda functions can also take several inputs:

In [None]:
la = lambda x, y: x + y # create an add function as lambda 

In [None]:
la(10, 10)

### Exercise

In our first class we saw inner functions and how they are a nice way to create functions that are not shared with an outer scope. Here is the example of an inner function that we used:

In [40]:
def foo(x):
    def foo_(x):
        x += 5
        x **= 2
        x /= 10
        return x
    for i in range(4):
        x = foo_(x)
    return x

In [41]:
foo(5)

650.0390625

Please re-write this function using `lambda` functions

In [42]:
# your code here
def foo(x):
    foo_ = lambda x: ((x + 5) ** 2) / 10
    for i in range(4):
        x = foo_(x)
    return x

In [43]:
foo(5)

650.0390625

Note how we do not need a return statement in `lambda` functions. A lambda consists of a single expression, and the value of that expression is what is returned. They are very useful when you are calling a function that takes another function as input parameter.

#### map

One such function where defining a `lambda` function comes in handy is the `map` function. `map` takes two parameters:
 * a function
 * an iterable (e.g. list)

`map` iterates over the iterable and returns a `map` object (an iterator, not a list) where each value is the result of calling its first argument (the function) with the current element. Here is an example to illustrate.

In [None]:
li = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] # create a list 0 - 9

In [None]:
new_list = list(map(lambda x: x**2, li)) # note how we convert to a list
print(new_list)

#### filter

`filter` is another useful function. It returns a `filter` object (again an iterator, not a list) with all the elements from the original iterable for which the input function returns `True`.

In [None]:
new_list = list(filter(lambda x: x > 4, li))
print(new_list)
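
Both `map` and `filter` can also be expressed as list comprehensions, so which one you use is largely a matter of taste. The sketch below compares the two styles. The variable names are made up for this example, and the second comprehension uses the optional `if` clause that a comprehension can carry to select elements:

```python
numbers = list(range(10))

# map: apply a function to every element
squares_map = list(map(lambda x: x ** 2, numbers))
squares_comp = [x ** 2 for x in numbers]
print(squares_map == squares_comp)  # True

# filter: keep only the elements that pass a test
large_filter = list(filter(lambda x: x > 4, numbers))
large_comp = [x for x in numbers if x > 4]
print(large_filter == large_comp)   # True
print(large_comp)                   # [5, 6, 7, 8, 9]
```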