# Python Containers

Python uses four basic container types: 
* [`tuple`](https://docs.python.org/3/library/stdtypes.html#tuple) (a fixed list), 
* [`list`](https://docs.python.org/3/library/stdtypes.html#list) (a mutable list), 
* [`dict`](https://docs.python.org/3/library/stdtypes.html#dict) (an associative array or mapping) 
* [`set`](https://docs.python.org/3/library/stdtypes.html#set) (an unordered collection with unique elements).

The essential common feature is that items stored in a container need not be of the same type. 

Note 1: *These containers are essential in Python and can be found almost everywhere.* 

Note 2: *Soon, you will not be able to imagine how anyone can program without `list`, `tuple` or `dict`.* 

Note 3: Although not a basic container, in engineering and scientific computing, the `numpy` array is also essential.
Wait for the afternoon/tomorrow to learn about it.

Getting and assigning items employ square brackets `[]`. `tuple` and `list` are integer indexed,
`dict` is a key-value mapping (a hash map). Similarly to the C language, numeric **indexing starts with 0**.


## A note to begin with: mutable versus immutable 

Data types can be *mutable* (changeable) or *immutable* (unchangeable). 
Immutable objects (in Python, basically anything is an object - we will see this many times again)
cannot change their values without losing their identities. 
In contrast, mutable objects can change their value 
(without losing their identities). Immutable types are numbers 
(`int`, `float`, `complex` etc.), strings (`str`) and `tuple`. 
Mutable types are e.g. `list`, `dict` or `set`.
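
The difference can be observed with the built-in `id` function, which returns the identity of an object. The following is a minimal sketch; the variable names are chosen only for illustration.

```python
# Immutable: "changing" an int binds the name to a different object.
number = 10
id_before = id(number)
number += 1
int_kept_identity = id(number) == id_before

# Mutable: a list is modified in place and keeps its identity.
items = [1, 2]
id_before = id(items)
items.append(3)
list_kept_identity = id(items) == id_before

print("int kept its identity: ", int_kept_identity)    # False
print("list kept its identity:", list_kept_identity)   # True
```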


## Tuple

Tuples are **immutable ordered sequences** that can store collections of heterogeneous data. 
Tuples are heavily used in core features of Python, such as iteration or multiple return values
(together with unpacking).

There are multiple ways to construct a tuple:

In [None]:
tuple1 = (1, 'a', 5)           # The basic syntax for creating a tuple (parentheses) 
tuple2 = 1, 'a'                # Parentheses are not required but beware! 
tuple3 = tuple(["a", "b"])     # Advanced: Create a tuple from another container 
tuple4 = tuple(range(0, 10))   # Advanced: Create a tuple from any iterable (here a range) 
tuple5 = ()                    # an empty tuple 
tuple6 = ("single", )          # A tuple with one element - mind the comma!
tuple7 = "single",             # One element again - no parentheses but the comma!
tuple8 = 0, "1", (0, 1, 2)     # A tuple can of course contain another tuple

# And what did we get?
print(f"tuple1 = {tuple1}")
print(f"tuple2 = {tuple2}")
print(f"tuple3 = {tuple3}")
print(f"tuple4 = {tuple4}")
print(f"tuple5 = {tuple5}")
print(f"tuple6 = {tuple6}")
print(f"tuple7 = {tuple7}")
print(f"tuple8 = {tuple8}")
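
The "beware" in the comments above concerns the comma: it is the comma, not the parentheses, that creates a tuple. A short sketch of the two most common surprises (the variable names are illustrative only):

```python
not_a_tuple = ("single")    # just a parenthesized string
a_tuple = ("single",)       # the trailing comma makes it a tuple
accidental = 1,             # a stray trailing comma silently creates a tuple

print(type(not_a_tuple).__name__)   # str
print(type(a_tuple).__name__)       # tuple
print(accidental)                   # (1,)
```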

Use square brackets to obtain tuple elements:

In [None]:
print(tuple4[0])         # The first one
print(tuple4[-1])        # The last one
print(tuple4[-2])        # The one before the last one

A tuple cannot be changed - it is *immutable*:

In [None]:
tuple1[0] = "b"          # Throws an exception

However, you can create a new tuple from existing ones:

In [None]:
print(tuple1 + tuple2)
print(2 * tuple6)
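
Returning to immutability: an attempted item assignment raises a `TypeError`, which can be caught like any other exception. Note also that immutability is shallow. The tuple itself cannot be changed, but a mutable object stored inside it still can. A small sketch with hypothetical variable names:

```python
point = (1, "a", 5)
try:
    point[0] = "b"
except TypeError as error:
    print("TypeError:", error)

# The tuple holds a reference to a list; the list itself stays mutable.
record = ("numbers", [1, 2])
record[1].append(3)
print(record)   # ('numbers', [1, 2, 3])
```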

The methods of tuple are

In [None]:
# you will understand this breakneck syntax soon ;)
", ".join(item for item in dir(tuple) if not item.startswith("_"))

### Unpacking

Tuples can be used to assign values to multiple variables simultaneously. For example,

In [None]:
(x, y, z) = (1, 2, 3)
print(y)

In this case, the parentheses are often omitted, so we can write

In [None]:
x, y, z = 1, 2, 3
print(y)

This is especially useful for functions that return multiple values:

In [None]:
def neighbors(x):
    """Returns the integers a and b less and greater than x, i.e. a < x < b"""
    from math import ceil, floor
    a = int(floor(x))
    b = int(ceil(x))
    # 1 must be added / subtracted if x is an integer
    if a == x:
        a -= 1
        b += 1
    return a, b

x = 3
# we see that the function returns a tuple
print(neighbors(x))

# now assign the result to two variables
a, b = neighbors(x)
print(f"{a} < {x} < {b}")

Any iterable object can appear on the right hand side (more on iterators later), for example, `list` (see below) or string.

In [None]:
# elements from a list
a, b, c = [1, 2, 3]
print(a, b, c)

# or a string
a, b, c = "123"
print(a, b, c)

A very useful functionality is *extended unpacking*.

In [None]:
# c is assigned the remaining elements in the form of a list
a, b, *c = (1, 2, 3, 4, 5, 6)
print(a)
print(b)
print(c)

The *asterisk* is of course essential; it may also appear in the middle.

In [None]:
a, *b, c = (1, 2, 3, 4, 5, 6)
print(a)
print(b)
print(c)

Python throws an exception if the number of elements is incorrect.

In [None]:
a, b, c = (1, 2, 3, 4, 5, 6)
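
The exception raised is a `ValueError`. If only the first few elements are of interest, extended unpacking into a throwaway name (by convention `_`) avoids the error. Unpacking also gives a compact way to swap two variables. The sketch below uses illustrative names only:

```python
values = (1, 2, 3, 4, 5, 6)

try:
    a, b, c = values
except ValueError as error:
    print("ValueError:", error)

# keep the first two elements, collect the rest into a throwaway list
first, second, *_ = values
print(first, second)   # 1 2

# swapping without a temporary variable
x, y = 1, 2
x, y = y, x
print(x, y)            # 2 1
```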

## List

A list is similar to a tuple in that it is an ordered sequence; 
however, lists are mutable, i.e., one can change their contents. 

The fundamental syntax to create a list is via brackets `[ ]` or the `list` function.

*Note: List is similar to `std::vector` in C++ (though not type-specific).*

In [None]:
list0 = []                 # an empty list
list1 = list()             # an empty list
list2 = ["a", "b", "c"]    # list is created using [...]
list3 = [0, 0.0, "0.0"]    # list can contain arbitrary types
list4 = list(("from", "a", "tuple"))       # list can be created from a tuple

print(list0)
print(list1)
print(list2)
print(list3)
print(list4)

A list has more methods than a tuple, which follows from the fact that it is mutable.

In [None]:
", ".join(item for item in dir(list) if not item.startswith("_"))

It is natural to add elements to the end of the list with `append` and to remove from the end with `pop`:

In [None]:
letters = ["a", "b", "c"]

letters.append("d")         # adding an element
print(letters)              # letters has changed!

letters.sort(reverse=True)  # sort the list - the list will be changed (mutated)
print(letters)

print(letters.pop())        # pops (removes and returns) the last element
print(letters)

Elements can also be removed using `remove` but this method must search for the element first.

In [None]:
letters.remove("d")         # removes the first occurrence of the value
print(letters)
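
Two details of `remove` are worth knowing: it removes only the first matching element, and it raises a `ValueError` if the value is not present. A brief sketch (the `colors` list is a made-up example):

```python
colors = ["red", "green", "red"]

colors.remove("red")          # only the first "red" is removed
print(colors)                 # ['green', 'red']

try:
    colors.remove("blue")     # not in the list
except ValueError as error:
    print("ValueError:", error)
```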

Nested lists can create "multidimensional" lists, though there is no automatic mechanism to keep the dimensions consistent
(*We will see the right way to do it later*).

In [None]:
list_2d = [[11, 12], [21, 22]]   # "multidimensional" list
print(list_2d[0][0])             # element [0,0]
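
To illustrate the missing consistency check, nothing prevents the rows of a nested list from having different lengths. The following sketch (with illustrative names) shows one possible way to test whether a nested list is rectangular:

```python
ragged = [[11, 12], [21, 22, 23]]     # rows of different lengths are allowed

row_lengths = [len(row) for row in ragged]
print(row_lengths)                    # [2, 3]

# the list is rectangular only if all rows have the same length
is_rectangular = len(set(row_lengths)) == 1
print(is_rectangular)                 # False
```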

All variables in Python are references (similar to pointers in languages like C). 
If the referenced object is *mutable*, e.g. a `list`, modifying its state
is thus reflected in any other references to the same object (e.g. other variables).
Bear this in mind in order not to *accidentally overwrite the contents of another variable!*

In [None]:
a = [1, 2, 3]
b = a            # b is identical to a (not a copy)
b.insert(0, 0)   # as list is mutable, a is modified as well
print(a)

Lists can be easily *copied* if we need to manipulate a list (e.g. in a function)
without modifying the original list.

In [None]:
import copy  # the copy module can be used to create copies

b = copy.copy(a)
print(f"a = {a}")

# after modifying b, a is not changed
b.pop()
print(f"b = {b}")
print(f"a = {a}")
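
One caveat: `copy.copy` creates a *shallow* copy, i.e. a new list whose elements are still references to the same objects. For nested containers, `copy.deepcopy` copies the inner objects as well. A minimal sketch with made-up variable names:

```python
import copy

nested = [[1, 2], [3, 4]]
shallow = copy.copy(nested)       # new outer list, shared inner lists
deep = copy.deepcopy(nested)      # inner lists are copied too

nested[0].append(99)              # modify an inner list of the original

print(shallow[0])   # [1, 2, 99] - the inner list is shared
print(deep[0])      # [1, 2]     - fully independent
```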

## Equality

Two operators can be used to check the equality of lists and other containers or objects in general: 
`is` and `==`. While `==` compares *by content* (or *value*), `is` compares the *identity* (as in having the *same address in memory*).
Two objects can have equal contents but not the same identity.

We can demonstrate this on list copies:

In [None]:
b = a                  # b is a reference to a - they have the same identity
print("a is b:", a is b)          # the is operator tests the identity (of objects)
print("a == b:", a == b)          # the contents must be the same of course

a_copy = copy.copy(a)            
print("a_copy is b:", a_copy is b)
print("a_copy == b:", a_copy == b)
# after modifying a_copy, the contents will not be the same
a_copy.append(5)
print("a_copy == b:", a_copy == b)


## Indexing aka *slicing*

Slicing is a very important concept. For variables of type `list` and `tuple`, slices can be used to select elements in a sophisticated manner; they can be used to change the list as well. 
`list` and `tuple` allow so-called simple slices; for details, see the [documentation](http://docs.python.org/3/reference/expressions.html#slicings). 
Extended slicing will be used later for Numpy arrays. The syntax of a simple slice is:

    [lower_bound] ":" [upper_bound] [":" [stride]]
   
The default value for the upper and lower bounds is `None`, the default stride value is 1. 
The result is a new object of the same type, consisting of elements with indices
from the `lower_bound` (inclusive) to the `upper_bound` (exclusive) possibly with a given step (stride). 

Let's see how it works in examples.

In [None]:
# create a simple list
numbers = [0, 1, 2, 3, 4, 5]

# all elements
print("numbers[:] =", numbers[:])

# the first two elements
print("numbers[:2] =", numbers[:2])

# the second up to the last but one
print("numbers[1:-1] =", numbers[1:-1])

# at most a million elements
print("numbers[:1000000] = ", numbers[:1000000])

# "invalid" ranges have zero elements
print("numbers[4:2] = ", numbers[4:2])

# with a negative stride, the bounds must be given in reverse order; otherwise the slice is empty
print("numbers[2:4:-1] = ", numbers[2:4:-1])

# the last three elements
print("numbers[-3:] =", numbers[-3:])

# every second element (those at even indices)
print("numbers[::2] = ", numbers[::2])

# elements in reversed order
print("numbers[4:2:-1] = ", numbers[4:2:-1])
print("numbers[::-1] = ", numbers[::-1])
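
The slice syntax is shorthand for a built-in `slice` object, which can be stored in a variable and reused. Slicing also works on other sequences and returns an object of the same type. A short sketch (the name `every_other` is illustrative):

```python
numbers = [0, 1, 2, 3, 4, 5]

every_other = slice(1, 5, 2)      # equivalent to writing [1:5:2]
print(numbers[every_other])       # [1, 3]
print(numbers[1:5:2])             # [1, 3]

as_tuple = tuple(numbers)
print(as_tuple[:2])               # (0, 1) - slicing a tuple gives a tuple
print("Python"[::-1])             # nohtyP - strings can be sliced as well
```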

With slices we can change the list (but not tuple): add, remove or modify elements:

In [None]:
numbers = [0, 1, 2, 3, 4, 5]

# modify a single element
numbers[0] = "zero"
print(numbers)

# changing more elements
numbers[1:4] = ["one", "two", "three"]
print(numbers)

# replace one element with several (the list grows)
numbers[2:3] = [1.25, 1.5, 1.75, 2] 
print(numbers)

# remove elements
numbers[-2:] = []
print(numbers)
# del can remove elements as well
del numbers[:1]
print(numbers)

We already know that `list` is *mutable* and that a copy can be created using the `copy` module. 
We can equally well use the `[:]` slice to create a copy:

In [None]:
a = ["a"]
b = a[:]
# are a and b identical?
print(a is b)

### Exercise

1. Create a list of at least three programming languages. 
2. Add these two languages to the list: `["Rust", "Perl"]`. 
3. Sort alphabetically and print the result.

## Searching in containers

Operators `in` and `not in` serve for checking whether a value is (is not) in a given list or tuple.
The related `index` method returns the index of the first occurrence of a value.

In [None]:
values = ["a", "A", "b", "ABC", 1, "2"]

# use in to test whether "b" and "B" are in our values
print('"b" in values =', "b" in values)
print('"B" in values =', "B" in values)
print('"B" not in values =', "B" not in values)

# let's demonstrate the index method
print('"b" is at index', values.index("b"))

In [None]:
# index raises an exception if the value is not in the list
values.index("B")
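
The exception is a `ValueError`. When a missing value is an expected situation, the lookup can be wrapped in a small helper. The function `find_index` below is a hypothetical helper written for this illustration, not part of Python:

```python
values = ["a", "A", "b", "ABC", 1, "2"]

def find_index(container, value, default=None):
    """Return the index of value in container, or default if it is missing."""
    try:
        return container.index(value)
    except ValueError:
        return default

print(find_index(values, "b"))   # 2
print(find_index(values, "B"))   # None
```

`None` is used as the default rather than `-1`, because `-1` is itself a valid index (the last element).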

### Exercise

1. Write a function that adds an element into a list if it's not already in the list.

## Dictionary

The `dict` type in Python is an associative array (mapping) where *keys* can be any
*hashable* objects. Typical keys are numbers, strings, tuples or instances of user defined classes.
Keys cannot be for example lists, sets or dicts as those are not hashable.
Values can be of any type, similarly to other containers.

A dict can be constructed using curly braces `{ }` with `key: value` pairs
or via the built-in `dict` function:

In [None]:
empty_dict = {}                 # an empty dict
empty_dict = dict()             # equivalent

sample_dict = {7: "seven", "numbers" : [1, 2, 3]}  # various key and value types
print(sample_dict)
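
The hashability requirement for keys can be demonstrated directly: a tuple works as a key, whereas a list is rejected with a `TypeError`. A minimal sketch with made-up names:

```python
# tuples (of hashable elements) are valid keys
grid = {(0, 0): "origin", (1, 0): "east"}
print(grid)

# lists are not hashable, so they cannot be used as keys
list_key_rejected = False
try:
    bad = {[0, 0]: "origin"}
except TypeError as error:
    list_key_rejected = True
    print("TypeError:", error)
```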

The value for a specific key is retrieved using `[ ]` (similar to list indexing):

In [None]:
print('sample_dict[7] =', sample_dict[7])
print('sample_dict["numbers"] =', sample_dict["numbers"])

In [None]:
# non-existing key raises an exception
sample_dict[0] 

The `get` method is handy when we need a default value for non-existing keys.

In [None]:
print(sample_dict.get(0))               # returns None for a non-existing key
print(sample_dict.get(0, "default"))    # returns a specific value for a non-existing key
print(sample_dict.get(7, "default"))    # returns the actual value for an existing key

`dict` methods:

In [None]:
", ".join(item for item in dir(dict) if not item.startswith("_"))

The `in` and `not in` operators check for *keys*, not for values.

In [None]:
sample_dict = {7: "seven", "numbers" : [1, 2, 3]}

print("7 in sample_dict:", 7 in sample_dict)
print("'seven' in sample_dict:", 'seven' in sample_dict)
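
To search among the values instead, use the `values` method. The `items` method yields key-value pairs, which combines well with unpacking. A short sketch:

```python
sample_dict = {7: "seven", "numbers": [1, 2, 3]}

# membership test on values rather than keys
print("'seven' in sample_dict.values():", "seven" in sample_dict.values())   # True

# iterate over key-value pairs (each pair is unpacked into two variables)
for key, value in sample_dict.items():
    print(key, "->", value)
```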

## Set

`set` is an unordered (i.e. non-indexable) collection of hashable objects, 
in which each object can exist only once.

Sets are created using curly braces `{ }` or the built-in `set` function.

In [None]:
print({"spades", "hearts", "clubs", "diamonds"})
print(set(("a", "a", "a", "b", "b")))              # duplicate elements are removed

These operators (or methods) are relevant for sets:

* `|` (`union`)
* `&` (`intersection`)
* `-` (`difference`)
* `^` (`symmetric_difference`)
* `<=` (`issubset`), `<` (proper subset)
* `>=` (`issuperset`), `>` (proper superset)
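
The operators listed above can be tried on two small sets of numbers. This is only an illustrative sketch; `sorted` is used for printing because sets have no defined order.

```python
evens = {0, 2, 4, 6, 8}
primes = {2, 3, 5, 7}

print(sorted(evens | primes))   # union: [0, 2, 3, 4, 5, 6, 7, 8]
print(sorted(evens & primes))   # intersection: [2]
print(sorted(evens - primes))   # difference: [0, 4, 6, 8]
print(sorted(evens ^ primes))   # symmetric difference: [0, 3, 4, 5, 6, 7, 8]

print({2} <= primes)            # True: {2} is a subset of primes
print(primes <= primes)         # True: every set is a subset of itself
print(primes < primes)          # False: but not a proper subset
print(primes >= {3, 5})         # True: primes is a superset of {3, 5}
```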

## Exercise

1. Use an apt container to associate 1 - 10 ratings with the programming languages you have listed. 
2. Print the languages that are a) common with b) distinct from these ones: 
Python, C, C++, Julia, Lisp.
3. Given you have a dict variable `language_ratings` of language: rating pairs, use the [`sorted`](https://docs.python.org/3/library/functions.html?highlight=sorted#sorted)
function to get the languages sorted by rating. Hint: use `language_ratings.get` as the sort key.

## Built-in functions for containers

Handy built-in functions exist in Python for working with containers.

`len` returns the number of elements:

In [None]:
o = [1, 1, 2, 2]
print(f"len({o}) =", len(o))
o = 1, 1, 2, 2
print(f"len({o}) =", len(o))
o = {1, 1, 2, 2}
print(f"len({o}) =", len(o))

`sum` returns the sum of all elements:

In [None]:
o = [1, 1, 2, 2]
print(f"sum({o}) =", sum(o))
o = 1, 1, 2, 2
print(f"sum({o}) =", sum(o))
o = {1, 1, 2, 2}
print(f"sum({o}) =", sum(o))

Python is a strongly typed language so summing incompatible types yields an error:

In [None]:
sum([1, 1, 2, 2, "3"])
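
The error is a `TypeError`. If the strings are known to hold numbers, they have to be converted explicitly before summing. One possible approach, sketched here with a generator expression (more on those later):

```python
mixed = [1, 1, 2, 2, "3"]

try:
    sum(mixed)
except TypeError as error:
    print("TypeError:", error)

# convert every element to int explicitly, then sum
total = sum(int(item) for item in mixed)
print(total)   # 9
```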

`min` and `max` return the smallest or the largest element, respectively.

In [None]:
o = [1, 2, -1, -10, 0]
print(f"min({o}) =", min(o))
print(f"max({o}) =", max(o))

`sorted` returns elements sorted (in a newly created list), `reversed` returns elements in reversed order.

In [None]:
o = [1, 2, -1, -10, 0]
print(f"sorted({o}) =", sorted(o))
# reversed returns an iterator so we have to use list(reversed(...))
# more on iterators later
print(f"list(reversed({o})) =", list(reversed(o)))
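
`sorted` also accepts the optional `reverse` and `key` arguments, and it always returns a list regardless of the input container. A brief sketch:

```python
o = [1, 2, -1, -10, 0]

print(sorted(o, reverse=True))    # [2, 1, 0, -1, -10]
print(sorted(o, key=abs))         # sorted by absolute value: [0, 1, -1, 2, -10]
print(sorted(("b", "c", "a")))    # a tuple goes in, a list comes out: ['a', 'b', 'c']
print(o)                          # the original is unchanged: [1, 2, -1, -10, 0]
```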

### `all`, `any` and an excursion to `bool`

`all` and `any` apply logical `and` and `or`, respectively, across all elements: `all` checks that every element is "true-ish", `any` that at least one element is.

In [None]:
o = [True, True, True]
print(f"all({o}) =", all(o))
print(f"any({o}) =", any(o))

o = [True, False, True]
print(f"all({o}) =", all(o))
print(f"any({o}) =", any(o))

This is a good opportunity to explain how Python converts different types, including containers, to boolean values.
This happens either explicitly via the `bool` function or, more often, implicitly in `if` and similar statements.

All numbers except for 0 are True:

In [None]:
print(bool(0))
print(bool(0.0))
print(bool(0.0 + 0j))
print(bool(-1))

Strings are True when not empty (having at least one character):

In [None]:
print(bool(""))
print(bool("text"))
# even "0" and "False" are True
print(bool("0"))
print(bool("False"))

Containers like `tuple`, `list`, `dict` or `set` are True only if not empty (having at least one item):

In [None]:
print(bool([]))
print(bool(()))
# this is a non-empty set, even though the element is False
print(bool({False}))

In case we need to check whether all / any elements are True or False, we have to use `all` or `any`:

In [None]:
print(all({0, 1}))
print(any({0, 1}))

The results of `all` and `any` for an empty set or list may not be intuitive, but they are consistent with the wording: `all` means "there is no false-ish element" and `any` means "there is at least one true-ish element".

In [None]:
o = []
print(f"all({o}) =", all(o))
print(f"any({o}) =", any(o))
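
In practice, `all` and `any` are often combined with a condition evaluated for each element. The sketch below uses a hypothetical list of ratings and generator expressions (covered later):

```python
ratings = [7, 9, 10, 4]

# are all ratings within the allowed range 1 - 10?
print(all(1 <= rating <= 10 for rating in ratings))   # True

# is there at least one top rating?
print(any(rating == 10 for rating in ratings))        # True

# is there any rating above the allowed maximum?
print(any(rating > 10 for rating in ratings))         # False
```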

# Additional materials 

If you are interested in more details about containers, you can check the [collections](https://docs.python.org/3/library/collections.html) module where you can find more advanced containers like `namedtuple`, `deque`, `Counter` or `OrderedDict`.
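
As a small taste of that module, the sketch below touches three of the containers mentioned. The names `Point`, `counts` and `queue` are made up for the illustration.

```python
from collections import Counter, deque, namedtuple

# Counter: count occurrences of hashable items
counts = Counter("abracadabra")
print(counts.most_common(1))      # [('a', 5)]

# namedtuple: a tuple whose fields can also be accessed by name
Point = namedtuple("Point", ["x", "y"])
p = Point(1, 2)
print(p.x, p[1])                  # 1 2

# deque: fast appends and pops at both ends
queue = deque([1, 2, 3])
queue.appendleft(0)
last = queue.pop()
print(queue, last)                # deque([0, 1, 2]) 3
```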