# Advanced structures

While a lot can be achieved using lists, sometimes they can be a bit cumbersome. In this lecture, we will discuss a few advanced data structures such as sets, tuples, and dictionaries. Although their functionality can be reproduced using lists, working with these structures can make your code much more readable and avoid errors due to working with data structures that are not suited for the application. 

Furthermore, we will consider advanced functionality of functions. In particular, it is often useful to provide default values for some arguments of a function. This can help keep your code simple and allows users to supply only the arguments they need for a given application. Additionally, we will discuss a slightly different version of a function, called a generator, which makes the function act like an iterator, similar to the `range()` function. This allows function outputs to be used in for-loops, for example, and function values to be computed one at a time when they are needed, which gives great flexibility in your programming.

### What you will learn

In this notebook we will cover the following topics:
* sets
* tuples
* dictionaries
* generators
* default values for functions

*&#169; Tobias Hartung, University of Bath 2021-2025. This problem sheet is copyright of Tobias Hartung, University of Bath. It is provided exclusively for educational purposes at the University and is to be downloaded or copied for your private study only. Further distribution, e.g. by upload to external repositories, is prohibited.*


## Sets

A set is an unordered collection of distinct elements. In principle, a set can be represented as a list by listing all its elements. However, if we do this, then we need to ensure that no duplicates are in the list, and the ordered nature of a list may not be what we want either. Using a Python set takes care of both issues, which can simplify your code and make operations such as membership tests much faster.


In [None]:
items = {(1,2):4, (4,2):2}

In [None]:
fruits = ['apple', 'banana', 'cherry','guava']
disliked_fruits = ['apple', 'guava']
[f for f in fruits if f not in disliked_fruits]

In [None]:
{'dog','cat','rat','dog'}

In [None]:
{(4,2),(2,1)}

In [None]:
L = [1,2,3,4,4] # recall how to define a list
print('type of L: ',type(L))

s1 = set([1,2,3,4,4]) # defining a set from a list
s2 = {1,3,3}          # defining a set directly
print('type of s1: ',type(s1))
print('type of s2: ',type(s2))

print('list L: ',L)
print('set s1: ',s1)
print('set s2: ',s2)

Note that sets are iterable but not ordered, that is you can write a for loop `for x in s1:` and it will go through each element `x` of `s1` but you cannot ask for the "first element" of the set because it is "unordered". As the following example shows, the order in which the set gets iterated over is not the same as the list used to define the set.

In [None]:
s = set([1,3,4,2,6]) # define set from list of elements
for x in s:
    print(x) # show all elements of the set

Be aware that the elements of a Python set must be "immutable" (more precisely, hashable), i.e., they cannot be changed after they are created in memory. Numbers and strings can be in sets, but lists cannot be elements of sets. If you want an immutable list-like object as an element of a set, you will need to use tuples (see below).
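
The restriction on set elements can be seen in a minimal sketch (the values and the name `points` are purely illustrative): adding a tuple works, whereas adding a list raises a `TypeError`.

```python
# Tuples are immutable, so they are allowed as set elements
points = {(1, 2), (3, 4)}
points.add((5, 6))

# A list is mutable, so Python refuses to store it in a set
try:
    points.add([7, 8])
    list_was_accepted = True
except TypeError as err:
    list_was_accepted = False
    print("cannot add a list:", err)

print(len(points))  # the set still contains only the three tuples
```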

Sets also have the `union`, `intersection`, `difference`, and `symmetric_difference` methods.

In [None]:
s1 = set([1,2,3,4])
s2 = set([3,4,5,6])

print("union of s1 and s2")
print(s1.union(s2))
print(s1|s2) # also union
print()

print("intersection of s1 and s2")
print(s1.intersection(s2))
print(s1&s2) # also intersection
print()

print("difference of s1 and s2")
print(s1.difference(s2))
print(s1-s2) # also difference
print()

print("symmetric difference of s1 and s2")
print(s1.symmetric_difference(s2))
print(s1^s2) # also symmetric difference

You can check whether two sets are disjoint with `isdisjoint()` and whether one set is a subset of another with `issubset()` or `<=`. The operator `<` determines whether we have a proper subset.

In [None]:
print(s1.isdisjoint(s2))
print(s1.issubset(s1))
print(s1<s1)

Similarly, we have `issuperset()`, `>=` and `>`. 
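
A short sketch of the superset checks, using two small example sets (`big` and `small` are illustrative names, not part of the cells above):

```python
big = {1, 2, 3, 4}
small = {1, 2}

print(big.issuperset(small))  # True: every element of small is in big
print(big >= small)           # True: same check written as an operator
print(big > small)            # True: big is a proper superset of small
print(big > big)              # False: a set is not a proper superset of itself
```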

Finally, we can `add()` elements to and `remove()` elements from a set, and check whether an element is in a set with `in`.

In [None]:
print(s1)
s1.add(9)
s1.remove(1)
print(s1)
print(2 in s1)

For example, sets are useful if you want to reduce lists to their unique values, e.g., ranges of functions. Then we can ask whether they are disjoint, the same, or similar questions.

In [None]:
import numpy as np

def f1(n):                       # define first function as round( sin(n pi / 2) )
    return round(np.sin(n * np.pi/2))

def f2(n):                       # define second function as modulo 3 
    return n%3

def f3(n):                       # define third function as round( sin(n pi / 2) )**2
    return round(np.sin(n * np.pi/2))**2

domain = range(100)

# compute ranges as lists
range_list_f1 = []
range_list_f2 = []
range_list_f3 = []
for n in domain:
    range_list_f1.append(f1(n))
    range_list_f2.append(f2(n))
    range_list_f3.append(f3(n))

# show ranges: imagine we chose domain = range(10000) or even larger
print('f1 range as list: ',range_list_f1)
print('f2 range as list: ',range_list_f2)
print('f3 range as list: ',range_list_f3)
print()

# map ranges to sets and print
range_f1 = set(range_list_f1)
range_f2 = set(range_list_f2)
range_f3 = set(range_list_f3)
print('f1 range as set: ',range_f1)
print('f2 range as set: ',range_f2)
print('f3 range as set: ',range_f3)
print()

# ask questions about ranges as sets
print('Are they the same?', range_f1 == range_f2)
print('Is range(f1) a subset of range(f2)?', range_f1 <= range_f2)
print('Is range(f3) a subset of range(f2)?', range_f3 <= range_f2)
print('Are they disjoint?', range_f1.isdisjoint(range_f2))
print('What are the common elements?', range_f1&range_f2)

## Tuples

Tuples are ordered, like lists, but immutable. As such, they can be used to represent constant vectors and can appear as elements of sets. They are defined using parentheses `()`. They are indexed just like lists; the only difference is that you cannot assign a value to a tuple element.

In [None]:
t = (1,2,3) # define a tuple
print(t)    # show the tuple
print(t[0]) # show a value of the tuple
t[1] = 5    # this causes an error, cannot change values in a tuple

The code above would work perfectly fine if `t` had been a list.

In [None]:
t = [1,2,3] # define a list
print(t)    # show the list
print(t[0]) # show a value of the list
t[1] = 5    # change value of list
print(t)    # show the list

If you have a function that has multiple return values, i.e., a function that ends in `return a,b,c`, then the returned object is the tuple `(a,b,c)`. To get access to the values in a tuple, we can use tuple unpacking.

In [None]:
t = (1,2,3) # t was redefined as a list above, so define it as a tuple again
a,b,c = t   # unpacking the tuple t into variables a, b, and c
print(a)
print(b)
print(c)
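
To connect unpacking with functions that return several values, here is a small sketch. The function `min_max_mean` is a hypothetical example and not part of the lecture code.

```python
def min_max_mean(values):
    """Return the smallest value, the largest value, and the mean of a non-empty list."""
    return min(values), max(values), sum(values) / len(values)

result = min_max_mean([2, 4, 9])
print(type(result))  # <class 'tuple'>
print(result)        # (2, 9, 5.0)

# unpack the returned tuple directly into three variables
lo, hi, mean = min_max_mean([2, 4, 9])
print(lo, hi, mean)
```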

Other than this immutability, tuples act just like lists and can be concatenated with `+` and sliced with `:`. 
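
A brief sketch of concatenation and slicing with tuples (the names `t1`, `t2`, and `joined` are illustrative). Both operations create new tuples and leave the originals unchanged.

```python
t1 = (1, 2, 3)
t2 = (4, 5)

joined = t1 + t2     # concatenation creates a new tuple
print(joined)        # (1, 2, 3, 4, 5)
print(joined[1:4])   # (2, 3, 4): slicing also returns a tuple
print(joined[-1])    # 5: negative indices work as for lists
print(t1)            # (1, 2, 3): the original tuple is unchanged
```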

## Dictionaries

Dictionaries are similar to lists, but instead of having a numerical index, they are indexed by a key. This can make your code much more readable. Consider the following example of a participant in a study.

In [None]:
participant = ['Smith', 'John', 32, 105] # data stored as list
participant = {                          # data stored as dictionary
    'last name': 'Smith', 
    'first name': 'John',
    'age': 32,
    'IQ': 105
}

print(participant['age'])

The call `participant['age']` is much more readable than the list equivalent `participant[2]`. Furthermore, if the data changes to include more or less information, the code `participant['age']` still works unchanged whereas the index in the list may be shifted and then `participant[2]` refers to the wrong data. 
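
The point about shifting indices can be sketched as follows. The extra `'title'` field and the names `person_list` and `person_dict` are assumptions made purely for illustration.

```python
person_list = ['Smith', 'John', 32, 105]
person_dict = {'last name': 'Smith', 'first name': 'John', 'age': 32, 'IQ': 105}

# the data format changes: a title is added as a new first field
person_list.insert(0, 'Dr')
person_dict['title'] = 'Dr'

print(person_list[2])      # 'John': index 2 no longer refers to the age
print(person_dict['age'])  # 32: access by key is unaffected
```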

When working with dictionaries, the important methods are `keys()` (returns a view of the keys), `values()` (returns a view of the values), and `pop()` (removes a key-value pair and returns its value). We can also add an entry to the dictionary by setting it with `dictionary[key] = value`.

In [None]:
print(participant.keys()) # show current keys
print(participant.values()) # show current values 
participant.pop('IQ') # remove IQ from dictionary
participant['score'] = 10 # add a 'score' entry to the dictionary
print(participant) # show changed dictionary

It is important to note that dictionaries are iterable and `for x in dict:` loops over the keys.

In [None]:
for x in participant:
    print(x)

If you wish to access both keys and values, you should use `.items()`.

In [None]:
for k,v in participant.items():
    print(k,v)

## Generators

Sometimes, lists can get very large but you don't need to have all elements at all times. For example, if you write `for n in range(10**100):` it is neither necessary nor possible to store 10^100 integers in memory. Instead, only the current number `n` is stored and incremented at each loop iteration. Generators let you do the same for your own sequences in Python.
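
As a rough illustration of this laziness, the sketch below creates the huge range and only ever looks at its first few values. The exact number reported by `sys.getsizeof` depends on the Python implementation, so it is printed without stating an expected value.

```python
import sys

huge = range(10**100)        # created instantly, no elements are stored
print(sys.getsizeof(huge))   # a few dozen bytes, independent of the length

for n in huge:
    if n >= 3:               # stop early, the remaining values are never computed
        break
    print(n)                 # prints 0, 1, 2
```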

Generators are defined like functions but use `yield` instead of `return`.

In [None]:
{2,3}

In [None]:
def fibonacci():
    """
    Generator object to generate the Fibonacci numbers starting with 1, 1, ...
    
    Input: None
    
    Output: integer representing the sequence of Fibonacci numbers
    """
    a = 1
    b = 1
    while True:
        yield a
        a,b = b,a + b

F = fibonacci()
print(F)
        

This defines `F` to be a generator of the Fibonacci sequence. It can generate arbitrarily many elements in the sequence but at any point in time only two numbers are stored (`a` and `b`). We can access the numbers by calling `next`.

In [None]:
print(next(F))
print(next(F))
print(next(F))
print(next(F))
print(next(F))
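
Because a generator is an iterator, it can also be used directly in a for-loop. The sketch below repeats a compact version of the `fibonacci` definition so that it runs on its own. Since the sequence is infinite, the loop needs its own stopping condition; the bound of 50 is an arbitrary choice.

```python
def fibonacci():
    """Compact copy of the Fibonacci generator defined above."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

small_fibs = []
for f in fibonacci():
    if f > 50:          # without this break the loop would never end
        break
    small_fibs.append(f)

print(small_fibs)  # [1, 1, 2, 3, 5, 8, 13, 21, 34]
```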

As such, the main difference between a generator and a function is that a function is closed and removed from memory once it has returned a value. For example, if we were to write a function whose body is some code (`do something`), followed by a `return` statement, followed by more code (`do something else`), then the function will never execute the `do something else` part because it stops everything it does as soon as a `return` statement is executed.
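
A minimal sketch of such a function, with placeholder names (`abstract_function`, `function_log`) that are assumptions for illustration. The list records which parts of the body actually ran.

```python
function_log = []

def abstract_function():
    function_log.append("do something")
    return "result"
    function_log.append("do something else")  # unreachable: placed after return

value = abstract_function()
print(value)         # result
print(function_log)  # ['do something']
```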

A generator, on the other hand, would have the same body with `yield` in place of `return`. Once it reaches the `yield` statement, it will also return a value, but the generator will stay in the state it is in. Once `next()` is called on the generator object again, the `do something else` code will be executed.
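
A minimal sketch of a generator with this shape, again with placeholder names (`abstract_generator`, `generator_log`) chosen for illustration. The second call to `next()` is given a default value so that reaching the end of the generator does not raise `StopIteration`.

```python
generator_log = []

def abstract_generator():
    generator_log.append("do something")
    yield "result"
    generator_log.append("do something else")  # runs when the generator is resumed

G = abstract_generator()
first = next(G)                 # runs the body up to the yield statement
print(first, generator_log)     # result ['do something']

leftover = next(G, "finished")  # resumes after the yield and reaches the end
print(leftover, generator_log)  # finished ['do something', 'do something else']
```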

Hence, a generator object will only cease to exist like a function does if the generator reaches the end of its code, i.e., the end of the `do something else` part in the abstract example above. For our Fibonacci generator, however, this is not possible because it is trapped in an infinite loop `while True`. If you want to stop a generator object like the Fibonacci generator (because you no longer need it), you need to close it manually. This can be achieved by calling the `.close()` method.

In [None]:
F.close()    # closing the Fibonacci generator
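
What closing means in practice can be sketched as follows: after `.close()`, asking the generator for another value raises `StopIteration`. The generator is redefined here in compact form so that the example is self-contained, and `gen` is an illustrative name.

```python
def fibonacci():
    """Compact copy of the Fibonacci generator defined above."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

gen = fibonacci()
print(next(gen), next(gen), next(gen))  # 1 1 2

gen.close()  # the generator is finished from now on

try:
    next(gen)
    still_running = True
except StopIteration:
    still_running = False

print(still_running)  # False
```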

## Default values

Finally, it can often be useful to define functions with default values for some arguments. This allows you to define more versatile functions while, at the same time, keeping the code readable when the additional functionality is not needed.

We can define default values by assigning a value to an argument in the `def function():` line.

For example, we could consider the Fibonacci generator above. But rather than having the first two elements fixed as 1 and 1, we could allow the first two values to be supplied by the user. If they are not given, we default back to 1 and 1.

In [None]:
def versatile_fibonacci(init=(1,1)): 
    """
    Generator object to generate the Fibonacci numbers starting with the two numbers specified in the tuple init.
    If init is not given, then the sequence starts with 1,1,...
    
    Input:
    init : tuple of size 2 : optional values for first two Fibonacci numbers
    
    Output:
    integer representing the sequence of Fibonacci numbers
    """
    
    # note that a is the next number to be yielded and b the one after it,
    # so we set a = F_1 = init[0] and b = F_2 = init[1]
    a = init[0]
    b = init[1]
    
    # now we repeat the same code
    while True:
        yield a
        a,b = b,a + b

# define the "normal" Fibonacci sequence by not passing initial values
fibo = versatile_fibonacci()
for i in range(5):
    print(next(fibo))

print() # make an empty line in the output

# but we can also define different starting values, like F_1 = 3 and F_2 = 3, 
# by passing the tuple (3,3) to the init argument
fibo_3_3 = versatile_fibonacci(init=(3,3))
for i in range(5):
    print(next(fibo_3_3))

Of course, default values and arguments without default values can be mixed. If you do this, then all arguments without default values need to be listed first. For example, we could consider a function that computes the `N`th Fibonacci number but allowing for the initial values to be set differently with 1 and 1 as defaults.

In [None]:
def nth_fibonacci(N,f1=1,f2=1):
    """
    This function computes the Nth Fibonacci number. 
    
    Input:
    N : integer : the N for the Nth Fibonacci number to be computed
    f1 : integer : optional argument for the first Fibonacci number to be specified; if unspecified f1 = 1
    f2 : integer : optional argument for the second Fibonacci number to be specified; if unspecified f2 = 1
    
    Output:
    integer f_n representing the Nth Fibonacci number.
    """
    
    # first check whether N is 1 or 2, i.e., one of the supplied values
    if N == 1: 
        return f1
    
    if N == 2:
        return f2
    
    # else we need to add the last two values N-2 times
    f_n_minus_1 = f2
    f_n_minus_2 = f1
    for i in range(N-2):
        f_n = f_n_minus_1 + f_n_minus_2 # add the last two numbers
        
        # shift the index, i.e., set new last and second to last numbers
        f_n_minus_2 = f_n_minus_1  
        f_n_minus_1 = f_n
        
    return f_n

print(nth_fibonacci(5))
print(nth_fibonacci(5,f1=3,f2=3))
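
As a further sketch of how defaults behave at the call site, consider a hypothetical function `scale` (not part of the lecture code) with one required argument and two defaults. Defaults can be overridden by position or by keyword, and it is possible to override only some of them.

```python
def scale(x, factor=2, offset=0):
    """Return x * factor + offset; factor and offset are optional."""
    return x * factor + offset

print(scale(5))                      # 10: both defaults are used
print(scale(5, 3))                   # 15: factor given by position
print(scale(5, offset=1))            # 11: only offset overridden, by keyword
print(scale(5, factor=3, offset=1))  # 16: both overridden by keyword
```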

# Check your understanding

##### Question 1
Which of the following define the set {'a', 'b', 'c'}?
```
A S = set('abc')
B S = set('a','b','c')
C S = set(['a','b','c'])
D S = {'a','b','c'}
E S = {('a','b','c')}
```

##### Question 2
What is the result of `{1,2,3} & set('123')`?
```
A {1,2,3}
B {'1','2','3'}
C {1,2,3,'1','2','3'}
D set()
E {}
```

##### Question 3
Which of the following are true of Python dictionaries?
```
A Dictionaries are accessed by key.
B Dictionaries can be nested to any depth.
C A dictionary can contain any object type except another dictionary.
D Items are accessed by their position in a dictionary.
E All the keys in a dictionary must be of the same type.
F Dictionaries are mutable.
```

##### Question 4
What is the type of `d = {('foo', 100),('bar', 200),('baz', 300)}`?

##### Question 5
What is the type of `d = {}`?

##### Question 6
Consider the dictionary `d = {'foo': 100, 'bar': 200, 'baz': 300}`. What happens if you write `d['bar':'baz']`?

##### Question 7
Consider `t=(1,2,3)`. How would you replace the `2` with a `4`?

##### Question 8
What structure would you use to compute one prime at a time starting with 2 or another chosen prime, that is, you may want to obtain 2, 3, 5, 7, ... one at a time and you may wish to choose the initial prime in the list?

```























```

# Answers
Q1: A, C, D
Q2: D
Q3: A, B, F
Q4: set
Q5: dictionary
Q6: It raises an exception. 
Q7: This is a trick question. Tuples are immutable.
Q8: A generator with default argument, e.g., `def primes(first_prime = 2):`
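
One possible sketch of the generator suggested for Q8. The signature `primes(first_prime = 2)` comes from the answer above; the primality test by trial division and the use of `math.isqrt` are illustrative assumptions, chosen for simplicity rather than speed.

```python
import math

def primes(first_prime=2):
    """Yield primes one at a time, starting the search at first_prime."""
    candidate = first_prime
    while True:
        # candidate is prime if no integer from 2 up to its square root divides it
        if candidate >= 2 and all(candidate % d != 0
                                  for d in range(2, math.isqrt(candidate) + 1)):
            yield candidate
        candidate += 1

P = primes()
print([next(P) for _ in range(5)])  # [2, 3, 5, 7, 11]

Q = primes(first_prime=7)
print([next(Q) for _ in range(3)])  # [7, 11, 13]
```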
