# <u><p style="text-align: center;">Pure functions</p></u>

### Contents of this notebook 
* What a pure function is
* Difference between pure functions and non-pure functions
* Advantages of pure functions for big data applications

### Background

This notebook is about **pure functions**. We have already seen that functions are ‘packets’ of code that receive inputs (through arguments) and produce outputs (through return values). Functions can be used to calculate formulas, but also to write data to files, interact with users, etc. But when is a function a **pure** function? To understand this, we first need to introduce two properties that characterize how functions operate: **determinism** and **side-effects**. We will start with a short recap of what these terms mean, and then look at a few examples to help us recognize pure functions in Python.

#### <u>Deterministic functions</u>
When a function produces the <u>same outputs</u> when supplied with the <u>same inputs</u>, this function is called ***deterministic*** (the result can and will be determined from the input only). For example, a mathematical function like *addition*, which adds two values, will always give the same result when the same two values are given as arguments. But a function that makes use of a value that was not given as an argument (such as the current computer clock time) may give a different result from one call to the next, even when the arguments stay the same.
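
As a minimal sketch of this difference, the cell below compares two small functions. The names `add` and `add_with_clock` are hypothetical and chosen only for this illustration; `time.time` and `time.sleep` come from Python's standard module **time**.

```python
import time

def add(a, b):
    # Deterministic: the result depends only on the arguments a and b.
    return a + b

def add_with_clock(a, b):
    # Non-deterministic: the result also depends on the current clock time,
    # which is not passed in as an argument.
    return a + b + time.time()

print(add(2, 3) == add(2, 3))  # True: same inputs, same output

first = add_with_clock(2, 3)
time.sleep(0.05)               # wait a moment so that the clock moves on
second = add_with_clock(2, 3)
print(first == second)         # compares two calls made with the same arguments
```

The two calls to `add_with_clock` receive the same arguments, but because the clock has moved on between them, we should expect the two results to differ.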

#### <u>Side-effecting functions</u>
The second property concerns what a function does to its surroundings. When a function <u>changes</u> a part of a program <u>outside of the function</u> itself, we say that this function has ***side-effects*** (it is affecting its environment). A side-effect could be, for example, that the function has changed the value of a variable outside its own scope, or that it has printed something on screen.
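
A small sketch may help here. The names `counter`, `increment_counter` and `incremented` are hypothetical and used only for this illustration; the `global` keyword tells Python that the function is allowed to reassign a variable defined outside of it.

```python
counter = 0  # a variable defined outside the functions

def increment_counter():
    # Side-effect: changes the variable 'counter' outside the function.
    global counter
    counter = counter + 1

def incremented(value):
    # No side-effect: only computes and returns a new value.
    return value + 1

increment_counter()
print(counter)          # 1
print(incremented(5))   # 6
print(counter)          # 1 (calling 'incremented' did not touch 'counter')
```

Calling `increment_counter` changes `counter` even though nothing is returned or assigned at the call site, whereas `incremented` leaves everything outside itself as it was.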

#### <u>Pure functions</u>
Functions that are deterministic and have no side-effects are called **pure functions**. The use of pure functions is helpful in ensuring the correctness of our programs. If we use only pure functions, then our outcomes will always be predictable and reliable, because they do not depend on anything other than our functions and our input parameters. This is also a very important prerequisite for big data applications: if we use only pure functions, then it doesn't matter on which computer the function is executed, or by whom, or at what moment in time. The function result is always the same when the same input is used (deterministic), and calling the function leaves the rest of the program unchanged (no side-effects). The use of pure functions makes it easy to divide a big data task over multiple computers.
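
The idea of dividing a task can be sketched on a single computer. In the cell below, the hypothetical pure function `sum_of_squares` is applied once to a whole list and once to separate chunks of that list, processed in a different order, as if each chunk had been sent to a different computer. This is only an illustration under simplified assumptions, not a real distributed program.

```python
def sum_of_squares(numbers):
    # Pure: the result depends only on 'numbers', and nothing outside is changed.
    return sum(n * n for n in numbers)

data = [1, 2, 3, 4, 5, 6]

# Process everything in one go ...
total_in_one_go = sum_of_squares(data)

# ... or split the data into chunks and process them separately, in any order.
chunks = [data[0:2], data[2:4], data[4:6]]
partial_results = [sum_of_squares(chunk) for chunk in reversed(chunks)]
total_from_chunks = sum(partial_results)

print(total_in_one_go, total_from_chunks)  # 91 91
```

Because `sum_of_squares` neither reads nor changes anything outside itself, it makes no difference where or in which order the chunks are processed.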

### Code examples

In this section we are going to see examples of functions with the properties described above:   

***Example 1:*** demonstrates the effect of a **non-deterministic** function.  
***Example 2:*** demonstrates the effect of a function **without** and a function **with** side-effects.  
***Example 3:*** is another example of a **side-effecting** function.  
***Example 4:*** contains examples of **pure** functions.  

#### Example 1: a non-deterministic function
Below we use a function named `randint`. The call `randint(1, 10)` returns a **random** integer between 1 and 10, both endpoints included. This function is imported from Python's standard module **random**, so that it can be called from within our program. Try running the following cells and notice the outputs:

In [None]:
from random import randint

print(randint(1, 10))

In [None]:
print(randint(1, 10))

In [None]:
print(randint(1, 10))

The output of `randint` can change from one run to the next, even though we do not change its inputs (occasionally the same number may come up twice in a row, but we cannot predict it). Therefore, the function `randint` is non-deterministic.
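
Where does this variation come from? The **random** module keeps a hidden internal state that every call to `randint` reads and updates. The sketch below uses `random.seed` to put that hidden state in a known starting position twice; the seed value 42 and the variable names are arbitrary choices made for this illustration.

```python
import random

random.seed(42)  # put the hidden generator state in a known starting position
first_run = [random.randint(1, 10) for _ in range(5)]

random.seed(42)  # reset the hidden state to the same starting position
second_run = [random.randint(1, 10) for _ in range(5)]

print(first_run == second_run)  # True
```

With the same starting state, the same sequence of numbers comes out. So the result of `randint` depends on something that is not among its arguments, which is exactly what makes it non-deterministic.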

#### Example 2: side-effects
Let's first talk about <u>return values</u>. A Python function typically returns a single result, which we call its return value. A return value must be assigned to a variable, printed, saved to a file, or passed on to another function; otherwise it is lost. In the code below, `max` is used to find the maximum of two numbers, which is then assigned to a variable **x**:

In [None]:
x = 5
print('x before assignment:', x)
x = max(0,3)
print('x after assignment:', x)

So is this an example of a side-effect? Remember that functions without side-effects do not change the environment of a program. We see in the example above that the return value of the call to the `max` function is stored in variable **x** in `x = max(0,3)`. But this is not a side effect, because function `max` itself does not change the value of **x**. The function's return value is stored in **x** immediately *after* the call to `max`. That assignment is indeed changing the environment, because the value of **x** changes, but the assignment isn't part of the function `max` itself. So `max` is a function without side-effects.

However, there are functions which manage to change the environment of a program without relying on return values. For instance, `append` is such a function: it appends an item to a list. In the following example we use `append` to put numbers into a list named `number_list`. [To be precise, `append` is a method (hence the .-notation), which behaves like a function; see also the Introduction to Python notebook.] Notice the output of `append`:

In [None]:
number_list = []

append_return_value = number_list.append(1)
print('"append" return value:', append_return_value) 
print('number_list:', number_list)

We observe that the `append` function has no meaningful return value (it returns *None*), but after we have called it, our variable `number_list` has been changed! <br>
When we call `append` again (with argument `2` this time), we see that the value of `number_list` has changed again:

In [None]:
append_return_value = number_list.append(2)
print('"append" return value:', append_return_value) 
print('number_list:', number_list)

Notice that `append` always returns *None* and still manages to alter `number_list` without any assignment: the change happens during the execution of the call, not afterwards. This is called a side-effect. The function performs operations 'behind the scenes' that change the environment of our program (namely our variable `number_list`).
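
For comparison, here is a sketch of one way to get a longer list without a side-effect: the `+` operator on lists builds a new list and leaves the original untouched. The variable name `longer_list` is chosen only for this illustration.

```python
number_list = [1, 2]

# The + operator builds a new list; number_list itself is left untouched.
longer_list = number_list + [3]

print('number_list:', number_list)   # number_list: [1, 2]
print('longer_list:', longer_list)   # longer_list: [1, 2, 3]
```

Here the new value only reaches our program through the explicit assignment to `longer_list`, just as with `x = max(0,3)` earlier.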

#### Example 3: another side-effecting function
Another case of a side-effecting function is `sort`, which orders the elements of a list:

In [None]:
number_list = [5,4,1,0,3,2]

print(number_list.sort())
print(number_list)

Like `append`, `sort` returns *None* and changes the elements in `number_list` without any explicit assignment, so it is a function with a side-effect.
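
Python also has a built-in function `sorted`, which returns a new, ordered list and leaves its argument as it was. The sketch below contrasts it with `sort`; the variable name `ordered_list` is chosen only for this illustration.

```python
number_list = [5, 4, 1, 0, 3, 2]

# 'sorted' returns a new, ordered list and leaves its argument unchanged.
ordered_list = sorted(number_list)

print(ordered_list)  # [0, 1, 2, 3, 4, 5]
print(number_list)   # [5, 4, 1, 0, 3, 2]
```

When called on a list of numbers like this, `sorted` behaves as a function without side-effects: the ordered result is available only through its return value.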

#### Example 4: more examples of pure functions
Functions like `max`, which we saw in example 2, are pure since their output depends solely on the input, and they do not change their environment during their execution. Other examples of pure functions are:

In [None]:
# 'abs' returns the absolute value of a number
print("abs:", abs(-5))

# 'pow' raises a number to a power (here: 2 to the power 3)
print("pow:", pow(2, 3))

# 'str' can return the string representation of a number
print("str:", str(12))

# 'round' (called with one argument) rounds a number to the nearest integer
print("round:", round(12.51))

<span style="display:none" id="question1">W3sicXVlc3Rpb24iOiAiV2hpY2ggb2YgdGhlIGZvbGxvd2luZyBmdW5jdGlvbnMgYXJlIHB1cmU/IiwgInR5cGUiOiAibWFueV9jaG9pY2UiLCAiYW5zd2VycyI6IFt7ImNvZGUiOiAiYXBwZW5kIiwgImNvcnJlY3QiOiBmYWxzZSwgImZlZWRiYWNrIjogIidhcHBlbmQnIGlzIHNpZGUtZWZmZWN0aW5nIn0sIHsiY29kZSI6ICJtYXgiLCAiY29ycmVjdCI6IHRydWV9LCB7ImNvZGUiOiAibWluIiwgImNvcnJlY3QiOiB0cnVlfSwgeyJjb2RlIjogInJhbmRpbnQiLCAiY29ycmVjdCI6IGZhbHNlLCAiZmVlZGJhY2siOiAiJ3JhbmRpbnQnIGlzIG5vdCBkZXRlcm1pbmlzdGljIn1dfV0=</span>

<span style="display:none" id="question2">W3sicXVlc3Rpb24iOiAiV2hpY2ggb2YgdGhlIGZvbGxvd2luZyBzdGF0ZW1lbnRzIGFyZSB0cnVlPyIsICJ0eXBlIjogIm1hbnlfY2hvaWNlIiwgImFuc3dlcnMiOiBbeyJjb2RlIjogIkRldGVybWluaXN0aWMgZnVuY3Rpb25zIGFyZSBwdXJlIiwgImNvcnJlY3QiOiBmYWxzZSwgImZlZWRiYWNrIjogIlRoZXkgbXVzdCBhbHNvIG5vdCBoYXZlIHNpZGUtZWZmZWN0cyJ9LCB7ImNvZGUiOiAiU2lkZS1lZmZlY3RpbmcgZnVuY3Rpb25zIGNoYW5nZSBlbnRpdGllcyBvdXRzaWRlIG9mIHRoZWlyIHNjb3BlIiwgImNvcnJlY3QiOiB0cnVlfSwgeyJjb2RlIjogIlB1cmUgZnVuY3Rpb25zIG9ubHkgYWN0IGluIGVudGl0aWVzIGluc2lkZSB0aGVpciBib2R5IiwgImNvcnJlY3QiOiB0cnVlfSwgeyJjb2RlIjogIkZ1bmN0aW9ucyB0aGF0IHJldHVybiAgIGRpZmZlcmVudCB2YWx1ZXMgZm9yIHRoZSBzYW1lIGlucHV0IGFyZSBjYWxsZWQgICBkZXRlcm1pbmlzdGljIiwgImNvcnJlY3QiOiBmYWxzZSwgImZlZWRiYWNrIjogIlRoZXkgYXJlIG5vbi1kZXRlcm1pbmlzdGljIn1dfV0=</span>


### Quiz

In this section you can check your understanding of the contents of this notebook.

#### Q1:

In [None]:
from jupyterquiz import display_quiz

display_quiz("#question1")

#### Q2:

In [None]:
display_quiz("#question2")

### Further reading

* [Pure functions](https://en.wikipedia.org/wiki/Pure_function)
* [Side effects](https://en.wikipedia.org/wiki/Side_effect_(computer_science))