# Lesson 7: Advanced topics: decorators, \*args, \**kwargs, list comprehensions, generators, generator expressions and the itertools module

# Chapters:
Chapter 10: Decorators (and args/kwargs) <br>
Chapter 11: Advanced Iterations (generators, comprehensions) <br>
Itertools module <br>
Author: Jurre Hageman <br>

## Decorators

Decorators are an advanced topic, but many modules and frameworks (like Flask) make use of them, so a basic understanding of decorators is important. <br>
A decorator is a function that takes another function and extends its behaviour without changing the code of that function. Python supports the use of decorators with special syntactic sugar that simplifies their use. Let's start with some basic understanding of decorators:

Functions are objects in Python:

In [1]:
def my_function():
    print("OK")

print(type(my_function))

<class 'function'>


And it is possible to pass a function as an argument to another function and invoke it within that function:

In [2]:
def func1():
    print('two')


def func2(f):
    print('one')
    f()

    
func2(func1)

one
two


We can also define nested functions:

In [3]:
def func1():
    print('one')
    def func2():
        print('two')
    func2()

func1()

one
two


However, due to the scoping rules, we cannot invoke func2 from the outer scope:

In [4]:
def func1():
    print('one')
    def func2():
        print('two')


#func2() will produce a NameError
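A minimal sketch of what happens when the inner function is called from the outer scope. The `try`/`except` block is only there so that the example runs to completion; the variable `message` is introduced for illustration.

```python
def func1():
    print('one')
    def func2():
        print('two')


try:
    func2()  # func2 only exists inside func1, not in the outer scope
except NameError as error:
    message = str(error)
    print("NameError:", message)
```

The name func2 is only created while func1 runs, and only in the local scope of func1, so the outer scope never sees it.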

But we can also return a function from another function without invoking the second function:

In [5]:
def func1():
    print('one')
    def func2():
        print('two')
    return func2


x = func1()

one


The variable x now contains func2. We can invoke func2 as follows:

In [6]:
x()

two


If you understand the above concepts, we can continue to decorators. We first write a nested function. The inner function is a wrapper function:

In [7]:
def decorate_function(function):
   def function_wrapper(name):
       return "DNA is composed of {}".format(function(name))
   return function_wrapper


def get_message(seq):
   return "the nucleotides {}".format(seq)


get_message = decorate_function(get_message)

print(get_message("ATCG"))

DNA is composed of the nucleotides ATCG


A lot is happening here. 'decorate_function' contains an inner function, 'function_wrapper'. 'decorate_function' also takes another function (get_message) as its argument (received through the parameter 'function' in the function header). The inner function 'function_wrapper' invokes the function that 'decorate_function' received as its argument and augments its behaviour (it adds text to the string). Note that 'function_wrapper' itself is not invoked inside 'decorate_function'. It is just returned by 'decorate_function'.

Let's now describe the order of events that happens when the code runs:
First the two functions are declared:

In [8]:
def decorate_function(function):
   def function_wrapper(name):
       return "DNA is composed of {}".format(function(name))
   return function_wrapper


def get_message(seq):
   return "the nucleotides {}".format(seq)

Next, the following code is executed:

In [9]:
get_message = decorate_function(get_message)
print(get_message)

<function decorate_function.<locals>.function_wrapper at 0x10896ac80>


The name get_message on the left-hand side is a variable that will hold a function object. 'decorate_function' is invoked with the original get_message function as its argument. Note that the get_message function is not invoked yet; inside 'decorate_function' it is available through the parameter 'function'. 'decorate_function' contains a nested function, 'function_wrapper'. This function takes some text as its argument and invokes the original 'get_message' when 'function_wrapper' gets invoked. But 'function_wrapper' is not invoked yet. It is returned as a function object. The name 'get_message' is then rebound to this returned function, replacing the original function definition. So we can now invoke 'function_wrapper', which in turn calls the original 'get_message' function, with:

In [10]:
print(get_message("ATCG"))

DNA is composed of the nucleotides ATCG


Now the get_message function is decorated by 'decorate_function'. Note that the behaviour of 'get_message' is changed but not its code!

The above pattern is such an important pattern in Python that Python adds some syntactic sugar for it. We can rewrite the above as:

In [11]:
def decorate_function(function):
   def function_wrapper(name):
       return "DNA is composed of {}".format(function(name))
   return function_wrapper


@decorate_function
def get_message(seq):
   return "the nucleotides {}".format(seq)


print(get_message("ATCG"))

DNA is composed of the nucleotides ATCG


This reads as: 'add the functionality of decorate_function to get_message'. And that is exactly what happened...
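As a further illustration, here is a sketch of a decorator that counts how often a function is called. The decorator name `count_calls` and the `calls` attribute are made up for this example. It uses `functools.wraps` from the standard library so that the decorated function keeps its own name, and it forwards its arguments with \*args and \**kwargs, which are explained in the next section.

```python
import functools


def count_calls(function):
    """Hypothetical decorator: counts how often the decorated function is called."""
    @functools.wraps(function)  # copy the name and docstring of the original
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return function(*args, **kwargs)
    wrapper.calls = 0  # counter stored as an attribute on the wrapper
    return wrapper


@count_calls
def get_message(seq):
    """Describe a nucleotide sequence."""
    return "the nucleotides {}".format(seq)


print(get_message("ATCG"))
print(get_message("GGCC"))
print(get_message.calls)     # 2
print(get_message.__name__)  # get_message
```

Without `functools.wraps`, the last line would print the name of the inner function (wrapper) instead, as seen earlier when the decorated function was printed.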

## \*args and \**kwargs

\*args and \**kwargs are sometimes called magic variables. They can be very handy, so it is important to understand them. Remember from the lesson about functions that a function call contains arguments and the function header contains parameters:

In [12]:
def my_func(param):
    print(param)
    
arg = "hello"
my_func(arg)

hello


However, if the number of arguments does not match the number of parameters an error occurs:

In [13]:
def my_func(param):
    print(param)
    
arg1 = "hello"
arg2 = "world"
my_func(arg1, arg2)

TypeError: my_func() takes 1 positional argument but 2 were given

However, sometimes you might not know the number of arguments to expect. The \*args notation in the function header accepts any number of positional arguments. Only the \* is significant, so you could also write \*blablabla, but do not do that, as \*args is used by convention:

In [None]:
def my_func(*args):
    print(args)
    
arg1 = "hello"
arg2 = "world"

my_func(arg1) #1 argument
my_func(arg1, arg2) #2 arguments
my_func() #0 arguments

So now we do not have an error anymore. \*args in the function header accepts any number of positional arguments, which are available in the function as a tuple named args. <br>
Invoking my_func without arguments results in an empty tuple. <br>
However, we can also use \*args as an argument in the function call instead of as a parameter in the function header. This will unpack the tuple into separate positional arguments:

In [None]:
def my_func(param1, param2):
    print(param1)
    print(param2)
    
arg1 = "hello"
arg2 = "world"
args = (arg1, arg2)
my_func(*args)


Thus \*args in the function call UNPACKS argument lists while \*args in the function header PACKS an argument list!
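Both directions can be shown in one small sketch. The function name `total` and the list `values` are made up for this example.

```python
def total(*numbers):
    # In the header, *numbers PACKS all positional arguments into a tuple
    return sum(numbers)


values = [1, 2, 3]

print(total(1, 2, 3))  # 6
print(total(*values))  # 6, the * in the call UNPACKS the list
print(total())         # 0, an empty tuple sums to 0
```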

Remember from a previous lesson that it was possible to use positional arguments and keyword arguments:

In [None]:
def my_func(val1, val2):
    print(val1)
    print(val2)
    
my_func(val2=3, val1=4)

However, like the positional arguments, this function MUST accept exactly 2 arguments. We can use \**kwargs to accept any number (including 0) of keyword arguments:

In [None]:
def my_func(**kwargs): #Note that **kwargs is a parameter in a function header
    print(kwargs)
    
my_func(val1=1, val2=2, val3=3, val4=4) #4 keyword arguments
my_func() #0 keyword arguments

The keyword arguments will be __packed__ in a dictionary. Likewise, we can also use \**kwargs as argument in a function call to __unpack__ a dictionary:

In [None]:
def my_func(val1, val2, val3, val4):
    print(val1)
    print(val2)
    print(val3)
    print(val4)

kwargs = {'val1' : 1, 'val2' : 2, 'val3' : 3, 'val4' : 4}
my_func(**kwargs) #Note that **kwargs is now an argument in a function call
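Packing and unpacking of keyword arguments can also be combined. In this sketch the function names `make_record` and `format_gene`, as well as the gene name and length, are invented for illustration.

```python
def make_record(**fields):
    # **fields PACKS the keyword arguments into a dictionary
    return fields


def format_gene(name, length):
    return "{} ({} bp)".format(name, length)


record = make_record(name="geneA", length=1200)  # made-up example values
print(record)                 # {'name': 'geneA', 'length': 1200}
print(format_gene(**record))  # geneA (1200 bp)
```

Note that the keys of the dictionary must match the parameter names of the function that receives the unpacked dictionary.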


Using this information we can now write a generic function that accepts any number and type of arguments:

In [None]:
def accept_all(*args, **kwargs):
    if args:
        for arg in args:
            print(arg)
    if kwargs:
        for key, value in kwargs.items():
            print(key, value)

accept_all("bla", 10, 15, ['small', 'middle', 'big'], {"one": 1, 'two': 2}, naam="Jurre", age="40")

However, there are some rules: <br>
In the function header: first \*args and then \**kwargs. <br>
In the function call: positional arguments must come first and then keyword arguments. <br>
These are not all the rules. For a thorough overview:

Quoted from Mark Lutz Learning Python: <br>
- In a function call, all nonkeyword arguments (name) must appear first, followed
by all keyword arguments (name=value), followed by the \*name form, and, finally,
the \**name form, if used.
- In a function header, arguments must appear in the same order: normal arguments
(name), followed by any default arguments (name=value), followed by the
\*name form if present, followed by \**name, if used.
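The ordering in the function header can be sketched as follows. The function `show` and its parameter names are hypothetical; it simply returns what it received so the effect of each form is visible.

```python
def show(first, second="default", *args, **kwargs):
    # normal parameter, default parameter, *args, **kwargs (in that order)
    return first, second, args, kwargs


print(show(1))
# (1, 'default', (), {})

print(show(1, 2, 3, 4, colour="red"))
# (1, 2, (3, 4), {'colour': 'red'})

extra = (3, 4)
options = {"colour": "red"}
print(show(1, 2, *extra, **options))
# (1, 2, (3, 4), {'colour': 'red'})
```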

## Generators

Another more advanced concept is the generator function. Generators are often used to save memory. 
A generator function looks like a normal function, except that instead of returning a value, it yields as many values as it needs to. Calling the generator function returns a generator object. Each time a value is requested, Python runs the function body until the next yield, then saves the state of the generator so that it can be resumed when the next value is required. Because values are produced one at a time instead of all at once, this can save a lot of memory. Let's start with a simple generator function:

In [14]:
def my_generator():
    yield 1
    yield 2
    yield 3

print(my_generator())

<generator object my_generator at 0x1089831a8>


Invoking the function does not run its body; it only returns a generator object, so print shows information about that object.
To get the values we need to iterate over the generator:

In [15]:
for i in my_generator():
    print(i)

1
2
3


So how did this work? To understand this you need to understand a bit about the iteration protocol. Suppose we have a list with 3 elements. We can easily iterate over the elements using a for loop but we can do the same with next if we make an iterator from the list:

In [16]:
my_list = ['a', 'b', 'c']
for i in my_list:
    print(i)

my_list2 = ['d', 'e', 'f']
iter_object = iter(my_list2)
print(next(iter_object))
print(next(iter_object))
print(next(iter_object))
print(next(iter_object))

a
b
c
d
e
f


StopIteration: 

All went fine until the end was reached. The last print statement raised a StopIteration exception because the iterator was exhausted. Now back to the generator:

In [17]:
def my_generator():
    yield 1
    yield 2
    yield 3

generator_object = my_generator()
print(next(generator_object))
print(next(generator_object))
print(next(generator_object))
print(next(generator_object))

1
2
3


StopIteration: 

So what happened here:
- Each time next() is called on the generator object (either directly with next or implicitly in a for loop), the generator resumes execution from where it last yielded, not from the beginning of the function.
- If a generator function executes a return statement or reaches the end of its body, a StopIteration exception is raised.
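Both points can be sketched with a generator that stops early using return. The generator `first_two` is a made-up example, and the while loop is roughly what a for loop does behind the scenes: it keeps calling next() until StopIteration is raised.

```python
def first_two(items):
    """Hypothetical generator: yields at most the first two items."""
    count = 0
    for item in items:
        if count == 2:
            return  # ends the generator; next() then raises StopIteration
        yield item
        count += 1


generator_object = first_two(['a', 'b', 'c'])
collected = []
while True:
    try:
        value = next(generator_object)  # resumes after the last yield
    except StopIteration:
        break  # the generator is exhausted
    collected.append(value)

print(collected)                               # ['a', 'b']
print(next(generator_object, 'nothing left'))  # nothing left
```

The last line uses the optional second argument of next(), which is returned instead of raising StopIteration when the generator is exhausted.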


So why would this be useful? It is all about memory usage! Let's take a real example. Let's write a simple function to check if a number is a prime number:

In [20]:
import math

def is_prime(number):
    ''' check if this is a prime number'''
    if number > 1:
        if number == 2: #2 is the only even prime number
            return True
        if number % 2 == 0:
            return False
        for current in range(3, int(math.sqrt(number) + 1), 2): 
            if number % current == 0: 
                return False
        return True
    return False #numbers below 2 (0, 1 and negative numbers) are not prime

print(is_prime(3))
print(is_prime(4))
print(is_prime(-17))

True
False
False


Using this function we can generate a list of all prime numbers below the integer 100:

In [21]:
def get_primes(n):
    primes = []
    num=0
    while num < n:
        if is_prime(num):
            primes.append(num)
        num += 1
    return primes

print(get_primes(100))

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


This works as expected, but the list uses a considerable amount of memory if large numbers are involved:

In [22]:
import sys
print(sys.getsizeof(get_primes(100000)), "bytes")

77848 bytes


This is where the generator function shines:

In [23]:
def get_primes_from_generator_function(n):
    num=0
    while num < n:
        if is_prime(num):
            yield num
        num += 1


for i in get_primes_from_generator_function(100):
    print(i, end= " ")

2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 

So what is the deal here? We get the same prime numbers. But if you look closely, we have not received a list. It looks as if the while loop in the function and the for loop outside it are connected! The generator 'remembers' its state between values! And what about memory usage if large numbers are involved?

In [24]:
print(sys.getsizeof(get_primes_from_generator_function(100000)), "bytes")

88 bytes


The resulting generator object is much smaller in memory than the list object! Note that sys.getsizeof only measures the list container itself, not the integers it refers to, so the real difference is even larger. Using a generator function we can even safely write a while True loop to retrieve prime numbers.
What will be the first prime number above the integer 1000?

In [26]:
def get_next_prime_from_generator_function(num):
    '''prime number generator from a number till infinity...'''
    while True:
        if is_prime(num):
            yield num
        num += 1

num = 1000
my_generator_object = get_next_prime_from_generator_function(num)
print(next(my_generator_object))

1009


Get the subsequent 10 prime numbers:

In [27]:
for i in range(10):
    print(next(my_generator_object))

1013
1019
1021
1031
1033
1039
1049
1051
1061
1063


Remember that we can safely write the while True loop because we are using a generator function that pauses at each yield and 'remembers' its state until the next value is requested. Only one value is produced at a time, which saves a lot of memory!
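Instead of calling next() in a loop, a fixed number of values can also be taken from an endless generator with `itertools.islice` from the standard library. The sketch below is self-contained, so it uses its own compact `is_prime` (plain trial division, a simplification of the function above) and a generator named `primes_from`; both are assumptions made for this example.

```python
import itertools


def is_prime(number):
    """Simple primality test by trial division (simplified for this sketch)."""
    if number < 2:
        return False
    for divisor in range(2, int(number ** 0.5) + 1):
        if number % divisor == 0:
            return False
    return True


def primes_from(num):
    """Yield prime numbers starting at num, without an upper limit."""
    while True:
        if is_prime(num):
            yield num
        num += 1


# islice stops after 5 values, so the endless loop is never a problem
first_five = list(itertools.islice(primes_from(1000), 5))
print(first_five)  # [1009, 1013, 1019, 1021, 1031]
```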

## Comprehensions

## Generator expressions

## The Itertools module