<div style="background-color:lightgrey;
            padding:10px;
            color:black;
            border:black dashed 2px; 
            border-radius:5px;
            margin: 20px 0;">
            
            
# Functions



**Staff:** Walter Daelemans <br/>
**Support Material:** [Exercises](https://github.com/dtaantwerp/dtaantwerp.github.io/blob/DTA_Bootcamp_2021_students/exercises/07_functions.ipynb) <br/>
**Support Sessions:**  Thursday, October 6, 10:30AM

</div>

## Summary: what all programming languages have in common:

- Datatypes and procedures operating on them
    - `int`, `float`, `list`, `str`, `tuple`, `bool`, `dict`, ...
- Tests and boolean operators combining them
    - `==`, `in`, `<`, `>`, `and`, `or`, `not`, ...
- Variables and values (cf. assignment)
    - _name_ = _expression_
- Input and Output (often abbreviated: I/O)
    - `input()`, `print()`, `read()`, `write()`, ...
- Control structures for conditions, loops etc.
    - `if`, `elif`, `else`, `for`, `while`, `break`, `continue`, `pass`, ...
- A way of **extending** the language with your own procedures (and datatypes)
    - `def`

Let's revisit a typical programming idiom: a filter consisting of a `for` loop and a test.
Write code that takes a list of words as input and collects all 5-letter words starting with an 'a' in a new list.

In [None]:
words = ['abba', 'point', 'madam', 'feline', 'level', 'apple', 'google', 
         'anton', 'microsoft', 'facebook', 'goggle','amazon', 'oracle', 'almos']

result = []
for word in words:
    if len(word) == 5 and word[0] == 'a':
        result.append(word)
        
result

We can define a function using `def` that will encapsulate a piece of code and make it known to Python so that it can be reused. This means that it can be **called** or **invoked** with different arguments.

The syntax is the following:

```python
def name(arguments):
    body
    return 
```

We are free to choose a name for the function (make it informative!) and the names of the **arguments**. A function can have zero, one, or several arguments, which can be used in the **body** of the function. The body often contains one or more `return` statements that determine the output of a function call. A `return` statement is not obligatory, however: we can define functions that don't return a value but still do something (e.g. print a value).
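As a small illustration of a function without `return`, consider the sketch below; the names `greet` and `outcome` are made up for this example. The call still evaluates to something: when no `return` is executed, Python hands back the special value `None`.

```python
def greet(name):
    # No return statement: the function only prints.
    print("Hello, " + name + "!")

outcome = greet("Ada")   # prints the greeting
print(outcome)           # the call itself evaluates to None
```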

Let's look at an example function definition: we will turn the filter code we wrote above into a function.

In [None]:
def filter_words (word_list):
    result = []
    for word in word_list:
        if len(word) == 5 and word[0] == 'a':
            result.append(word)
    return result

In [None]:
filter_words(words)

As you can see, the result of calling the new function `filter_words` with the variable `words` as input is exactly the same as that of the previous code. The value of the input `words` (the list of words) is bound to the variable `word_list`. This is an internal variable that only has its value while the function is being called; afterwards it is unbound again. Finally, the `return` statement provides the output of the function call, in this case the filtered list. We can also call the function directly with a list or with another variable bound to a list.

In [None]:
filter_words(['albion', 'always', 'anton', 'anthony'])

In [None]:
words2 = ['a', 'basket', 'of', 'fruit', 'with', 'one', 'apple', 'and', 'one', 'banana']
filter_words(words2)

We can make our filter function more flexible by defining additional arguments for the initial letter and the length on the basis of which we will filter words.

In [None]:
def filter_words (w, first_letter, length):
    result = []
    for word in w:
        if len(word) == length and word[0] == first_letter:
            result.append(word)
    return result

In [None]:
filter_words(words, 'g', 6)

We can also define **default values** for arguments. If such an argument is not provided in the call, its default value is used.

In [None]:
def filter_words (w, first_letter='a', length=5):
    result = []
    for word in w:
        if len(word) == length and word[0] == first_letter:
            result.append(word)
    return result

In [None]:
filter_words(words)

In [None]:
filter_words(words, 'l') # 5-letter words is default

In [None]:
filter_words(words, 'g', 6)

The order of the arguments is important. Swapping the order of the initial letter and the length will lead to wrong output. 

In [None]:
filter_words(words, 6, 'g')

In some cases it is clearer to work with explicit keyword arguments (also called named arguments), which make the order of the arguments irrelevant. In our latest definition, the arguments `first_letter` and `length` have default values, so they can be left out or given by keyword in any order. The first argument `w` has no default value, so it must always be supplied; in the call below it is passed by position as the first argument.

In [None]:
filter_words(words, length=6, first_letter='g')
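The sketch below uses a hypothetical function `describe` (not part of the course material) with the same argument layout, to show that keyword arguments can be given in any order and that an omitted keyword argument falls back to its default.

```python
def describe(word, first_letter='a', length=5):
    # Return a short description of the test that would be applied to word.
    return word + ": starts with " + first_letter + ", length " + str(length)

# Keyword arguments may appear in any order.
one = describe('google', first_letter='g', length=6)
two = describe('google', length=6, first_letter='g')
print(one == two)   # True

# Only length is given; first_letter keeps its default 'a'.
partial = describe('apple', length=5)
print(partial)      # apple: starts with a, length 5
```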

The fact that we didn't get an error message with

```python
filter_words(words, 6, 'g')
```
points to bad (or at least dangerous) code. We could believe here that there just aren't any 6-letter words starting with 'g'. We didn't get an error because the `==` test used in the function definition accepts operands of different types: comparing an integer with a string simply yields `False`, even if 'semantically' the comparison doesn't make any sense. Python is a very flexible language, but the downside of that is that semantic errors are easy to make. In other languages, type checking is often used to make sure that functions are applied to the right type of arguments. We might consider doing a type check explicitly in our function definition:

In [None]:
def filter_words (w, first_letter='a', length=5):
    if not(isinstance(first_letter, str)) or not(isinstance(length, int)):
        return 'Wrong arguments!'
    result = []
    for word in w:
        if len(word) == length and word[0] == first_letter:
            result.append(word)
    return result

filter_words(words, 6, 'g')

The type-checking condition evaluated to `True`, so the function returned the warning string instead of a misleading empty list. This code also illustrates that a function definition can contain several `return` statements. As soon as a `return` is executed, the rest of the function body is skipped.
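Returning a string such as `'Wrong arguments!'` means the caller receives a string where a list was expected. A common alternative, sketched here under the assumption that raising exceptions is acceptable at this point in the course, is to raise a `TypeError`. The name `filter_words_strict` and the list `sample` are made up for this sketch.

```python
def filter_words_strict(w, first_letter='a', length=5):
    # Stop immediately with an error when the argument types are wrong.
    if not isinstance(first_letter, str) or not isinstance(length, int):
        raise TypeError("first_letter must be a str and length must be an int")
    result = []
    for word in w:
        if len(word) == length and word[0] == first_letter:
            result.append(word)
    return result

sample = ['google', 'goggle', 'apple']
print(filter_words_strict(sample, 'g', 6))   # ['google', 'goggle']

try:
    filter_words_strict(sample, 6, 'g')      # arguments swapped
except TypeError as error:
    print("TypeError:", error)
```

With this variant a swapped call can no longer be mistaken for a valid result, because the caller has to deal with the error explicitly.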

##### Advanced argument usage:

As a final variant, it is possible to define a function that accepts a number of arguments that is not fixed in advance, indicated by `*args` in the argument position of the function definition, and/or an arbitrary number of keyword arguments, indicated by `**kwargs`. When using both at the same time, `*args` should precede `**kwargs`. Inside the function, `args` is available as a tuple and `kwargs` as a dictionary.

In [None]:
def function_1 (*args):
    return args

function_1 (4, 6, 'apple')

In [None]:
def function_2 (**kwargs):
    return kwargs

function_2(a=3, b=6, c='apple')
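The two mechanisms can be combined in one definition. The sketch below uses a hypothetical `function_3` that simply returns what it received, so that the split between positional and keyword arguments is visible.

```python
def function_3(*args, **kwargs):
    # args is a tuple of positional arguments, kwargs a dict of keyword arguments.
    return args, kwargs

positional, named = function_3(4, 6, 'apple', a=3, b=6)
print(positional)   # (4, 6, 'apple')
print(named)        # {'a': 3, 'b': 6}
```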

### Local and global variables

It often happens that function definitions use argument names that already exist in the environment. That is not a problem: a variable used as an argument name is local to the function and only lives while the function is being called. The variable with the same name outside the function is never changed, so it still has its original value after the call.

In [None]:
x = 12

print(x)

def myfunc (x):
    return x

print(myfunc (21))

print(x)
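The same holds for variables that are assigned inside the function body, not only for arguments. In the sketch below (the names `total` and `add_one` are made up), the assignment inside the function creates a local variable and leaves the global one alone.

```python
total = 100

def add_one(number):
    total = number + 1   # creates a new local variable named total
    return total

print(add_one(5))   # 6
print(total)        # 100: the global variable is untouched
```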

## Decomposition and abstraction

When solving a programming problem, you will have to decompose that problem into subproblems until you reach subproblems that are easy to implement in a single function definition (*decomposition*). And whenever you find subproblems that can be reused in different parts of your code, you can give them a separate function definition (*abstraction*). Both processes lead to a situation where rather than having one big blob of code, you have a set of relatively short functions. So in summary:

- Split up a complex piece of code into simpler pieces of code (decomposition, 'factoring')
- Abstract over a reusable piece of code by giving it a name (e.g. a new function or test)
- Improved clarity, ease of change and debugging, reusability, ...

Let's have a look at a toy example.

In [None]:
# Problem: For each word in a list, check whether it is longer than 5 letters and if so, count the number of vowels

sentence = "I like the words serendipitous and fictitious"
words = sentence.split()
words

In [None]:
# The code should build a list of [word, vowel count] pairs for the long words
result_list = []

vowels = "aeiou"

for word in words: # for each word
    if len(word) > 5: # that is longer than 5 letters
        vowelcount = 0 # start a fresh count for this word
        for letter in word: # for each letter in that word
            if letter in vowels: # if a letter is a vowel
                vowelcount += 1 # increment vowelcount
        result_list.append([word, vowelcount]) # add the word and its count to the list

result_list

Let's make this clearer, more reusable and more elegant through the mechanism of decomposition. 

- We will define a separate function to test whether a letter is a vowel
- We will define a function that uses this test to count the number of vowels in a word
- We will use the abstracted functions in our toplevel function that counts vowels in long words

In [None]:
def is_vowel (letter):
    "Test to check whether a letter (input of type str, output of type bool) is a vowel"
    return letter in "aeiou"

In [None]:
is_vowel("a")

In [None]:
def vowel_count (word):
    "Count the number of vowels in a word (input of type str, output of type int)"
    vowelcount = 0
    for letter in word:
        if is_vowel(letter):
            vowelcount += 1
    return vowelcount

In [None]:
vowel_count("abracadabra")

In [None]:
# For each word in a list, check whether it is longer than 5 letters and if so, count the number of vowels

def long_words_vowelcount (word_list, minimal_length=5):
    """Input a list of words and return a list of pairs of words longer
    than 'minimal_length' with their number of vowels"""
    result = []
    for word in word_list:
        if len(word) > minimal_length:
            result.append([word, vowel_count(word)])
    return result

In [None]:
long_words_vowelcount(words)

In [None]:
long_words_vowelcount(words, 7)
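Abstraction pays off when a helper is reused for a different task. As a sketch, the hypothetical function `most_vowels` below (not part of the original material) reuses `vowel_count` unchanged to find the word with the most vowels. The two helpers are repeated so that the block runs on its own.

```python
def is_vowel(letter):
    "Test whether a letter is a vowel (same definition as above)"
    return letter in "aeiou"

def vowel_count(word):
    "Count the number of vowels in a word (same definition as above)"
    vowelcount = 0
    for letter in word:
        if is_vowel(letter):
            vowelcount += 1
    return vowelcount

def most_vowels(word_list):
    "Return the word with the highest vowel count (hypothetical helper)"
    best_word = None
    best_count = -1
    for word in word_list:
        count = vowel_count(word)
        if count > best_count:   # the first word wins in case of a tie
            best_word = word
            best_count = count
    return best_word

print(most_vowels("I like the words serendipitous and fictitious".split()))
```

Note that `is_vowel` only recognises lowercase vowels, so the capital 'I' in the example sentence counts as zero vowels.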

Finally, a word on function stacking (also known as nesting or composing function calls). We can apply a function to the output of another function, which was itself applied to the output of yet another function, and so on. This gives a structure like `f1(f2(f3(x)))` with an arbitrary number of functions, evaluated from the inside out. Because of its compactness and convenience you will see this used a lot in AI and NLP.

In [None]:
# Take a string as input, convert it to a list of word tokens, find the length of each token and return
# the length of the longest word

max(
    map(
        len, 
        ("the mirror and the light".split())))
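To see what the stacked expression does, it can help to unfold it into separate steps with intermediate variables. The sketch below does this for the same sentence; the variable names are chosen for illustration only.

```python
text = "the mirror and the light"

tokens = text.split()               # innermost call: ['the', 'mirror', 'and', 'the', 'light']
lengths = list(map(len, tokens))    # middle call: [3, 6, 3, 3, 5]
longest = max(lengths)              # outermost call: 6

# The step-by-step version gives the same answer as the stacked one.
print(longest == max(map(len, text.split())))   # True
```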

## Modules

Some Python functions, objects, and methods are available as soon as you start up Python (e.g. `len`); others live in modules that must be explicitly loaded using `import`. The reason for this is efficiency: frequently needed functionality is always available, while less frequently needed functionality is only loaded when you import it. A case in point is the `random` module, which contains, among others, the `randint` function for generating random integers. Of course, you need to know which modules contain which functions and what their names are, but this is something you will learn gradually. Let's try the `random` module.

In [None]:
import random

In [None]:
# help(random) # (lots of info)

In [None]:
help(random.randint)

In [None]:
random.randint(0, 10) # notice we have to specify the name of the module as well as the function name

Another way to import from a module is to specify explicitly which functions you need with `from` ... `import`. In that case you can just use the function name without having to specify the module name.

In [None]:
from random import randint

In [None]:
randint(0,1000)  # no need to specify the module name now

Yet another way to import is to provide an alias: `import` ... `as` ...  In that case the alias can be used instead of the module name.

In [None]:
import random as r # use the alias name to refer to the module

r.randint(0, 10)
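The three import styles are different routes to the same objects. The sketch below repeats the imports so that it runs on its own and checks this directly; it also checks that the value returned by `randint` lies within the given bounds, both end points included.

```python
import random
from random import randint
import random as r

# The alias and the module name refer to the same module object.
print(r is random)                 # True

# The imported name and the qualified name refer to the same function.
print(randint is random.randint)   # True

value = r.randint(0, 10)
print(0 <= value <= 10)            # True: both end points can occur
```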

You can write your own modules as well; we will return to this later.