# Unit 2.2: Data Structures I

This notebook is based on Anna-Lena Lamprecht's CoTaPP repository (https://github.com/annalenalamprecht/CoTaPP). Some modifications were made.

In the last unit we discussed functions, modules and packages, and had a quick look at Python's standard library and the Python Package Index.

In this unit we will cover two very important data structures in Python (the list and the tuple) that will allow us to work with more powerful data items than just the individual numbers, strings and Booleans that we have used so far. 

Next unit we will cover more data structures (dictionaries and sets).

## Data Structures in Python 
We have already seen the four basic data types that Python provides: string, integer, float and boolean. Data structures go beyond these basic types: they are used to represent more complex kinds of data. There are four built-in data structures in Python: lists, tuples, dictionaries and sets. All of them can store more than just one value. In this unit we will study lists and tuples.

In lists, elements have a defined order. Lists are *mutable* data structures, meaning that it is possible to add, edit or delete elements. Lists allow for duplicates. Finally, lists are iterable, that is, a for-loop can automatically go through all their elements.

Tuples are almost exactly like lists, except that they are immutable, i.e., you can't change them after creation.

| Data Structure | Ordered | Mutable | Unique | Iterable |
|----------------|:-------:|:-------:|:------:|:--------:|
| List           |   Yes   |   Yes   |   No   |    Yes   |
| Tuple          |   Yes   |    No   |   No   |    Yes   |
| Dictionary     |    ?    |   ?     |   ?    |    ?     |
| Set            |    ?    |   ?     |   ?    |    ?     |

We will complete this table in the next unit.

### Lists

Here is an example of a list in Python, containing the first 12 [Fibonacci numbers](https://en.wikipedia.org/wiki/Fibonacci_number):

In [None]:
fibonacci_numbers = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144] 

That is, lists can simply be defined by comma-separated lists of values in square brackets:

```
<list_name> = [<value1>, <value2>, …, <valueN>]
```

Lists can also contain values of different types, for example:

In [None]:
person_details = ["Bob", "Smith", "22.05.1987", 1.6, 4094379] 

Individual elements of lists can be accessed by giving their position (index) in the list in square brackets directly behind the name of the variable:

```
<list_name>[<index>]
```

For historical technical reasons the first index of a list is not 1, but 0, and accordingly the last index is its length – 1. For example, we can print the 7th Fibonacci number (13) by:

In [None]:
print(fibonacci_numbers[6]) 

Or Bob's last name by:

In [None]:
print(person_details[1]) 

Conversely, a new value can be assigned to a list element, for example a new last name for Bob:

In [None]:
person_details[1] = "Tailor" 

Resulting in a changed list:

In [None]:
print(person_details) 

We can delete an element from the list, for example:

In [None]:
del person_details[4] 

Resulting in:

In [None]:
print(person_details) 

The operators ```in``` and ```not in``` can be used to check whether a particular value is contained in a list (or not):

In [None]:
if "Bob" in person_details:
    print("Bob is there.")
if "John" in person_details:
    print("John is there.")
if "Bob" not in person_details:
    print("Bob is not there.")
if "John" not in person_details:
    print("John is not there.")

Using the ```len``` function we can get the length of a list:

In [None]:
print(len(person_details))

Lists can also be used as iterable objects in for-loops, which are in fact a convenient way of going through the elements of a list. Here is an example:

In [None]:
numbers = [2, 4, 6, 8, 10]
for n in numbers:
    print(n)

Nesting for-loops into each other is also easy with lists. If we want, for example, to determine all matches to be played in a "Province League" between the provinces in Ireland (Gaelic football or Rugby or …) the following piece of code is sufficient:

In [None]:
teams = ["Connacht", "Ulster", "Munster", "Leinster"]
for home in teams:
    for guest in teams:
        if home != guest:
            print(f"{home} : {guest}") 

Lists in Python are *ordered*, meaning that their elements are stored and retrieved in a specific order. You might want to *sort* a list, i.e., rearrange its elements according to their natural ordering. For example, you might want to sort a list of strings in alphabetical order. The simplest way to do this in Python is with the ```sort()``` method for lists:

In [None]:
names = ["Hieke", "Alexander", "Sergey", "Anna-Lena", "Amir"]
print(names)
names.sort()
print(names)

Note here that the list is sorted by calling ```names.sort()```, and not ```sort(names)``` as you might have expected. The reason is that ```sort()``` is a method provided by the list class, hence it can be called on all instances of list, like ```names``` in our example. In contrast, a function like ```print()``` is not connected to a particular object, and is just called by itself.
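To make the difference between a method and a stand-alone function more concrete, here is a small sketch that contrasts ```names.sort()``` with Python's built-in function ```sorted()```. The variable names ```sorted_names``` and ```result``` are only chosen for this illustration.

```python
names = ["Hieke", "Alexander", "Sergey", "Anna-Lena", "Amir"]

# sorted() is a built-in function: it takes the list as an argument
# and returns a new, sorted list; the original list stays untouched
sorted_names = sorted(names)
print(sorted_names)
print(names)

# sort() is a list method: it reorders the list in place and returns None
result = names.sort()
print(result)
print(names)
```

Because ```sort()``` returns ```None```, writing ```names = names.sort()``` would lose the list, which is a common beginner mistake.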

There are a few other useful object methods available for lists:


  ```<list>.index(<value>)``` returns the index of the first occurrence of the value in the list
 
  ```<list>.append(<value>)``` appends the value to the list as new element
 
  ```<list>.remove(<value>)``` removes the (first) element with the value from the list
 

Here is a simple random example:

In [None]:
numbers = [4, 31, 34, 2, 3, 13, 53, 54, 2]
print(numbers)
print(f"Index(2): {numbers.index(2)}")
numbers.append(99)
print(numbers)
numbers.remove(2)
print(numbers)
print(f"Index(2): {numbers.index(2)}")
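One detail worth knowing about ```index()``` and ```remove()```: both raise a ```ValueError``` if the value does not occur in the list. The following sketch shows two possible ways of dealing with that, assuming a made-up list and the illustrative names ```value``` and ```position```. The ```try```/```except``` construct used in the second variant may not have been introduced in this course yet.

```python
numbers = [4, 31, 34, 2, 3, 13]
value = 99

# Variant 1: check with "in" before removing, so that remove()
# is only called when the value is actually present
if value in numbers:
    numbers.remove(value)
else:
    print(f"{value} is not in the list, nothing to remove.")

# Variant 2: call index() anyway and catch the ValueError
try:
    position = numbers.index(value)
except ValueError:
    position = None
print(position)
```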

Maybe you have wondered if lists can also contain other lists. Yes, there are also lists of lists. Here is an example, doing the opposite of sorting, namely creating a random running order of presentations:

In [None]:
import random

presentations = [["Bob", "World Heritage Sites in Montenegro"],
                 ["Elise", "The discography of Lecrae"],
                 ["Evelyne", "Amphibian species in the American state of Texas"],
                 ["Harry", "Notable individuals who have been affiliated with Pomona College"],
                 ["Jack", "27 local nature reserves in Cambridgeshire"],
                 ["Linda", "The 2018 Atlantic hurricane season"],
                 ["Michael", "The chief minister of Jharkhand"],
                 ["Paul", "The cartography of Jerusalem"]]

random.shuffle(presentations)

i = 0
print("Presentations on Tuesday, April 3:")
while i < len(presentations)/2:
    print(f"\t {presentations[i][0]}: {presentations[i][1]}")
    i += 1
print("Presentations on Thursday, April 5:")
while i < len(presentations):
    print(f"\t {presentations[i][0]}: {presentations[i][1]}")
    i += 1

We can also use slicing to split the above list in two. Slicing allows us to refer to a whole range of indexes instead of just a single one. A slicing expression has the following basic form, referring to all elements from the first index up to, but not including, the last index:

```
<list>[<first_index>:<last_index>]
```
    
That can for instance be used to replace the while loops in the example above by for-loops:

In [None]:
print("Presentations on Tuesday, April 3:")
for presentation in presentations[0:len(presentations)//2]:
    print(f"\t {presentation[0]}: {presentation[1]}")   
    
print("Presentations on Thursday, April 5:")
for presentation in presentations[len(presentations)//2:len(presentations)]:
    print(f"\t {presentation[0]}: {presentation[1]}")
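As a smaller sketch of how the two indexes of a slice behave, consider the following made-up list of letters (the names ```letters```, ```middle```, ```half```, ```first_half``` and ```second_half``` are only used for this illustration):

```python
letters = ["a", "b", "c", "d", "e", "f"]

# elements at indexes 1, 2 and 3 (index 4 is not included)
middle = letters[1:4]
print(middle)  # ['b', 'c', 'd']

# splitting a list into two halves at its midpoint
half = len(letters) // 2
first_half = letters[0:half]
second_half = letters[half:len(letters)]
print(first_half)   # ['a', 'b', 'c']
print(second_half)  # ['d', 'e', 'f']
```

Since the last index is excluded, the two halves do not overlap and together contain every element exactly once.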

When copying lists, whether completely or partially with the slicing operator, keep in mind that copying complex objects like lists behaves differently from copying simple values. If the usual assignment operator (`=`) is used, only the reference to the list is copied: both names refer to the same object, so every change made through one name is also visible through the other.

When using the slicing operator or a dedicated copy function such as `copy.copy()`, the (respective) elements of the list are copied into a new list object that is independent of the original. This is called a *shallow copy*. If the list contains a reference to another list or complex object, however, only that reference is copied, so the nested object is still shared. In this case a *deep copy* (`copy.deepcopy()`) is needed to copy the whole list completely. The following code illustrates the difference:

In [None]:
import copy

short_list = ["1", "2", "3"]
long_list = ["a", "b", "c", "d", "e", "f", "g", short_list]

# assignment
assigned_list = long_list
print(assigned_list)
del short_list[0]
del long_list[3]
print(assigned_list)

# (shallow) copy
copied_list = copy.copy(long_list)
print(copied_list)
del short_list[0]
del long_list[3]
print(copied_list)

# deep copy
deep_copied_list = copy.deepcopy(long_list)
print(deep_copied_list)
del short_list[0]
del long_list[3]
print(deep_copied_list)


The main reason for not doing deep copies of lists by default is that they are typically slower and in many cases not necessary.
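Whether two names refer to the same list object can be checked with the ```is``` operator. The following is a minimal sketch with made-up lists (```inner```, ```original```, ```alias```, ```shallow``` and ```also_shallow``` are illustrative names). It also shows that slicing with ```[:]``` and the list method ```copy()``` both create shallow copies.

```python
inner = [1, 2]
original = ["a", "b", inner]

alias = original                # assignment: a second name for the same list
shallow = original[:]           # slicing: new outer list, inner list is shared
also_shallow = original.copy()  # list method copy(): also a shallow copy

print(alias is original)        # True
print(shallow is original)      # False
print(shallow[2] is inner)      # True

# changes to the outer list only affect the alias,
# changes to the inner list are visible everywhere
original.append("c")
inner.append(3)
print(alias)    # ['a', 'b', [1, 2, 3], 'c']
print(shallow)  # ['a', 'b', [1, 2, 3]]
```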

Finally, two more notes on indexing in Python. So far we have seen forward indexing, from `0` to `len(list)-1`, which is also common in many other programming languages. Python additionally allows backward indexing, where the last element in the list has index `-1` and the first has index `-len(list)`. For example, consider the list of Fibonacci numbers from above again:

In [None]:
fibonacci_numbers = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]

The first 1 has index 0, or alternatively index –12. The 144 has index 11, or alternatively –1. The 5 can be indexed by 4 or –8:

In [None]:
print(f"{fibonacci_numbers[0]} == {fibonacci_numbers[-12]}")
print(f"{fibonacci_numbers[11]} == {fibonacci_numbers[-1]}")
print(f"{fibonacci_numbers[4]} == {fibonacci_numbers[-8]}")

And then there is the so-called unspecified index that can be used with slicing. If the first index of the slice is left unspecified, it refers to all elements in the list from the beginning to the second index - 1. If the second index is left unspecified, it refers to the first index and all remaining elements after it:

In [None]:
print(fibonacci_numbers[:6])
print(fibonacci_numbers[6:])
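Backward indexing and unspecified indexes can also be combined. The sketch below shows a few such combinations on the same list. The variable names ```last_three```, ```all_but_last``` and ```full_copy``` are only chosen for this example.

```python
fibonacci_numbers = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]

# from the third-last element to the end
last_three = fibonacci_numbers[-3:]
print(last_three)  # [55, 89, 144]

# from the beginning up to (but not including) the last element
all_but_last = fibonacci_numbers[:-1]
print(all_but_last)

# both indexes unspecified: a (shallow) copy of the whole list
full_copy = fibonacci_numbers[:]
print(full_copy == fibonacci_numbers)  # True
```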

### Tuples
Tuples are very similar to lists, except that they are immutable: they cannot be changed after creation. Thus, they only support operations that read from them, not operations that would change or delete their elements.
In practice, tuples are frequently obtained as results from library functions, but they can also be created directly by using round brackets:

```
<tuple_name> = (<value1>, <value2>, …, <valueN>)
```

For example:


In [None]:
sample = ("Thursday", "lunch", "pasta", 3.95)
print(sample)

All reading operations (such as indexing, slicing, iteration…) work in the same way as on lists, for example:

In [None]:
print(sample[0])
print(sample[len(sample)//2:])

for s in sample:
    print(s)

However, writing operations are not possible on tuples, that is, no changing of elements, no deletions, no appending, no sorting, etc.

In [None]:
s = (1, 2, 3)
s[0] = 10  # raises a TypeError: tuples do not support item assignment
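If the values of a tuple need to be changed after all, one possible approach is to copy them into a list, change the list, and convert it back if needed. The following sketch reuses the ```sample``` tuple from above. The names ```editable``` and ```updated``` and the new price are made up, and the ```try```/```except``` construct is only used here to keep the example running past the error.

```python
sample = ("Thursday", "lunch", "pasta", 3.95)

# direct assignment fails with a TypeError
try:
    sample[3] = 4.50
except TypeError as error:
    print(f"Not allowed: {error}")

# list() copies the values into a new, mutable list
editable = list(sample)
editable[3] = 4.50

# tuple() turns the list back into a (new) tuple
updated = tuple(editable)
print(sample)   # the original tuple is unchanged
print(updated)
```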

In practice, tuples are frequently used, for example, by web services or other APIs to return results. You cannot manipulate these directly, but you can of course access them and copy the contained values to other data structures. Because they are read-only, tuples can also be stored more compactly and are typically somewhat faster to create than lists, so when working with large collections of data that do not change, tuples might be preferred over lists. Furthermore, tuples are also used to make functions return more than one value. For example:

In [None]:
def integer_division(a,b):
    quotient = a//b
    remainder = a%b
    return quotient, remainder

print(integer_division(20,6))

Note that the return statement does not explicitly define the pair of numbers as a tuple, but any comma-separated list of return values as shown here will automatically be turned into a tuple.
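The returned tuple can be stored as a whole or unpacked directly into separate variables. Here is a short sketch, repeating the function from above so that the example is self-contained (the names ```result```, ```q``` and ```r``` are illustrative). For this particular computation Python also offers the built-in function ```divmod()```.

```python
def integer_division(a, b):
    quotient = a // b
    remainder = a % b
    return quotient, remainder

# the result is a single tuple object
result = integer_division(20, 6)
print(type(result))  # <class 'tuple'>

# tuple unpacking: one variable per element of the tuple
q, r = integer_division(20, 6)
print(q, r)  # 3 2

# the built-in divmod() returns the same pair
print(divmod(20, 6))  # (3, 2)
```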

## List comprehension

Many Python developers employ a technique called [*list comprehension*](https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions). With list comprehension, a Celsius-to-Fahrenheit conversion of a list of temperatures could be implemented in a single line, as follows:

In [None]:
temps = [34.5, 23.6, 78.7, 34.3, 99.9, 23.7, 42.6]
temps_f = [ (9/5) * x + 32 for x in temps ]
print(temps_f)

List comprehension can also be used to filter values. For example, to select only temperatures of 50 degrees or more:

In [None]:
temps_50plus = [ x for x in temps if x >= 50 ]
print(temps_50plus)

And you can do both operations at the same time. For example, convert temperatures to Fahrenheit only if they are 50 degrees or more:

In [None]:
[(9/5) * x + 32 for x in temps if x >= 50]

List comprehensions have the following basic form:

```[``` *expression* ```for``` *var* ```in``` *iterable* ```if``` *condition* ```]```
    
The ```if``` clause is optional and can be left out when no filtering is needed. The result is a list containing the values obtained by evaluating the expression for each element of the iterable that satisfies the condition.
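To see what a list comprehension does step by step, it can help to write it out as an ordinary for-loop. The following sketch does this for the combined conversion and filtering example. The names ```temps_f_loop``` and ```temps_f_comprehension``` are only used for this comparison.

```python
temps = [34.5, 23.6, 78.7, 34.3, 99.9, 23.7, 42.6]

# for-loop version: start with an empty list and append matching results
temps_f_loop = []
for x in temps:
    if x >= 50:
        temps_f_loop.append((9/5) * x + 32)

# equivalent list comprehension
temps_f_comprehension = [(9/5) * x + 32 for x in temps if x >= 50]

print(temps_f_loop == temps_f_comprehension)  # True
```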

## Exercises

### 1. Ackermann Function (★★★★☆)
The Ackermann function (named after the German mathematician Wilhelm Friedrich Ackermann) grows rapidly, even for small inputs. It exists in different variants; one of the common definitions is the following (for two nonnegative integers m and n):

![](img/ackermann.png)

Define and implement a (recursive) function ackermann(m,n) that computes the Ackermann function value for two nonnegative integers m and n. Then, write a test program that computes the results for calling the function with growing m and n of the same value, starting with m = n = 0 and incrementing by 1 in each iteration. The output should be something like: 
```
ackermann(0,0) = 1
ackermann(1,1) = 3
ackermann(2,2) = 7
[...]
```
What is the last value that your program computes before you get a `RecursionError`? (Hint: It might be that the outputs in the IPython console in Spyder are too verbose to see anything. You can alternatively run your program from the command line to see more.) What does this error mean?

### 2. String Reverse (★★★★☆)
Strings can be indexed like lists, that is, an expression like `<string>[<index>]` returns the character at the corresponding position in the string. The first character of the string has index 0, and the last is at position `len(<string>)-1`. A sub-sequence of a string can be obtained by specifying a range of indexes. For example, `<string>[1:len(<string>)]` will return a string containing all characters but the first of the original string.
    
Implement three different variants of a function for reversing a string:
1. reverse_recursive(string), solving the problem recursively
2. reverse_while(string), solving the problem using a while-loop
3. reverse_for(string), solving the problem using a for-loop
    
You can use the following code to test your functions. The last output should be "True".
```
# test program
string_to_reverse = "This is just a test."
print(reverse_recursive(string_to_reverse))
print(reverse_while(string_to_reverse))
print(reverse_for(string_to_reverse))
print(reverse_recursive(string_to_reverse) == reverse_while(string_to_reverse) == reverse_for(string_to_reverse))
```

### 3. Irish League (★★★☆☆)

Consider again the "Province League" example from the lecture:
```
teams = ["Connacht", "Ulster", "Munster", "Leinster"]
for home in teams:
    for guest in teams:
        if home != guest:
            print(home, ":", guest)
```

Add another list at the beginning:
```
dates = ["June 1", "June 2", "June 3", "June 4", "June 5", "June 6", \
         "June 7", "June 8", "June 9", "June 10", "June 11", "June 12"]
```
Then adapt the code so that it does not only print the pairings, but also the date on which the match shall take place (using the dates in the list in the order they appear there). The output should then be:
```
Connacht : Ulster (June 1)
Connacht : Munster (June 2)
Connacht : Leinster (June 3)
Ulster : Connacht (June 4)
Ulster : Munster (June 5)
Ulster : Leinster (June 6)
Munster : Connacht (June 7)
Munster : Ulster (June 8)
Munster : Leinster (June 9)
Leinster : Connacht (June 10)
Leinster : Ulster (June 11)
Leinster : Munster (June 12)
```

### 4. List of Fibonacci Numbers (★★★★☆)
Implement a function `fib(n)` that returns a list with the Fibonacci numbers at indexes 0 to n (that is, n+1 numbers in total). If `n==0`, it should directly return the list `[1]`, if `n==1`, it should return `[1,1]`, and if `n>1` it should use `[1,1]` as a start and compute Fibonacci numbers 2 to n by always adding the two predecessors in the list. 
You can use the following code to test your function:
```
print(fib(0))
print(fib(1))
print(fib(2))
print(fib(12))
```
The output should be:
```
[1]
[1, 1]
[1, 1, 2]
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
```

### 5. List versus tuple (★★☆☆☆)
Write a Python function called ```modify_elements``` that takes in either a ```list``` or a ```tuple``` as input. The function should modify the elements of the input sequence by doubling each element in-place.

Here's an example of the expected behavior:

    numbers_list = [1, 2, 3, 4, 5]
    modify_elements(numbers_list)
    print(numbers_list)  # Output: [2, 4, 6, 8, 10]

    numbers_tuple = (1, 2, 3, 4, 5)
    modify_elements(numbers_tuple)
    print(numbers_tuple)  # Output: (1, 2, 3, 4, 5)

In this example, the ```modify_elements``` function is called twice: once with a list and once with a tuple. For the list, the elements are modified in-place, and the resulting list is [2, 4, 6, 8, 10]. However, when the function is called with a tuple, it should not modify the elements because tuples are immutable. Therefore, the tuple remains unchanged as (1, 2, 3, 4, 5).

### 6. Extract Even Numbers (★☆☆☆☆)

Write a Python function called ```extract_even``` that takes in a list of numbers as input. The function should use list comprehension to create a new list containing only the even numbers from the input list.

Here's an example of the expected behavior:

    numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    result = extract_even(numbers)
    print(result)  # Output: [2, 4, 6, 8, 10]

In this example, the input list is ```[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]```. The ```extract_even``` function uses list comprehension to create a new list that contains only the even numbers from the input list. Therefore, the function should return ```[2, 4, 6, 8, 10]```.