# Programming For Chemists: Control Flow Statements

**Importance for scientists:**
* Control flow statements allow for programs to make rudimentary decisions based on input or output.
* They allow for autonomous looping over large sets of data without having to manipulate them manually, which is the essence of computation.
* They can save a scientist valuable time by automation of many tasks.
* **You make the computer work for you**.

Control flow is the order in which a program’s code executes. The control flow of a Python program is regulated by conditional statements, loops, and function calls which will be covered in this session. First we will discuss how you can program a **decision** into your program using `if` statements.

## The if Statement

`if` statements are control flow statements used to run particular code only when a certain condition is satisfied; e.g. you want to end a program when a particular value is calculated. 

* The `if...elif...else` statements are used in Python for decision making and are one of the most common control flow statements you will use. Practically it means:

    `if condition 1 is met then run code A. else if condition 2 is met run code B. if neither of these conditions are satisfied, run code C`.

* Let's now turn this pseudo-code into Python code in conjunction with the conditionals we learned about in session 2:

<center><img src="https://raw.githubusercontent.com/adambaskerville/ProgrammingForChemists/master/images/if_statement_syntax.png" width="900" height="900" /></center>

*  <font color='red'>Complete the following implementation of the picture. Change the values of `x` and `y` to test the various outputs:</font>

In [None]:
x = 2
y = 5

if x > y: # Condition 1: If x is greater than y
    print("x is greater than y!") # Code A
elif :
    

* There can be zero or more `elif` parts, and the `else` part is optional. 
* The keyword `elif` is short for `else if`, and is useful to avoid excessive indentation. 
* The condition after each `if` or `elif` is evaluated as a **Boolean**, `True` or `False`, so conditions can be built from any Python data type.
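As a minimal sketch of the `if...elif...else` pattern with a chemical flavour, the following classifies a solution by its pH. The variable names `ph` and `label` and the value 4.2 are assumptions chosen purely for illustration; change `ph` to exercise each branch.

```python
# Hypothetical measurement; change this value to test each branch
ph = 4.2

if ph < 7:            # Condition 1
    label = "acidic"  # Code A
elif ph > 7:          # Condition 2
    label = "basic"   # Code B
else:                 # Neither condition was met, so ph is exactly 7
    label = "neutral" # Code C

print("A solution with pH", ph, "is", label)
```

Only one branch ever runs: Python tests the conditions from top to bottom and stops at the first one that is `True`.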

**Two very important syntax rules:**

1. Always include colons at the end of the `if`, `elif` and `else` lines.
2. Never miss the indentation (spaces or tabs) after the conditional statements. Most programming languages like C, C++, and Java use braces `{ }` to define where a block of code starts and ends whereas Python uses **indentation**. A code block (body of a function, `if` statement, etc...) starts with indentation and ends with the *first unindented line*. The amount of indentation is up to you, but it must be consistent throughout that block as this is how Python determines when a code block has ended. 

* Tabs vs. spaces is one of the longest running debates in programming and the [original Python style guide](https://www.python.org/dev/peps/pep-0008/#tabs-or-spaces) suggests spaces as the preferred indentation method. I personally use tabs for indentation as they are faster and more convenient, but the choice is up to you! 
* In the previous session we implemented multiple conditional statements at the same time which also applies for `if` statements. Consider checking if an integer is divisible by 14 **and** is less than 3500: 

In [None]:
x = 1582

if (x % 14 == 0) and (x < 3500): # If a number is exactly divisible by another then the remainder (modulus: %) will be 0
    print("The number is divisible by 14 and is less than 3500")

* Sometimes we might want to make another decision based on the initial decision which can be done by 'nesting' `if` statements. Consider the following example:

In [None]:
username = 'Adam1'
password = 'sherlock123'
    
if username == 'Adam1':
    if len(password) < 6:
        print("Your password is too short")
    else:
        print("Your password is a good length")
else:
    print("Username not recognised.")

* Python (CPython) allows at most **20** statically nested code blocks, a limit intended to keep memory usage at a sane level when executing nested blocks; exceeding it raises a `SyntaxError`. If you feel you need more than 20 nested `if` statements then there is certainly a better way to implement your code.
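Deep nesting can often be avoided by testing the failure cases first. The sketch below re-expresses the username and password check as a single `if...elif...else` chain, which is intended to behave the same as the nested version; the variable `message` is an addition for illustration.

```python
username = 'Adam1'
password = 'sherlock123'

if username != 'Adam1':        # Deal with the unknown user first
    message = "Username not recognised."
elif len(password) < 6:        # Only reached when the username matched
    message = "Your password is too short"
else:
    message = "Your password is a good length"

print(message)
```

Every branch now sits at the same indentation level, which tends to be easier to read as the number of checks grows.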

## The for Loop

`for` loops are synonymous with automation, highlighting a computer's real talent for running the same command repeatedly:

* `for` loops iterate over a collection of items, such as a list or dictionary, and run a block of code with each element from the collection. 
* Consider dividing each item in a list by 3 and printing the remainder of the division:

In [None]:
for i in [0, 1, 2, 3, 4, 5, 6]:
    print(i % 3)

* **Note again the importance of indentation**. We can rewrite the previous example using the `range` function, which produces a sequence of integers that can be iterated over and is commonplace in `for` loops:

In [None]:
for i in range(7):
    print(i % 3)

* This gives the exact same result as the first `for` loop. 
* Note that the integer in `range` is set to 7, not 6: `range(n)` produces `n` numbers starting from 0 and stops before reaching `n`, so `range(6)` would give 0, 1, 2, 3, 4, 5 and we need `range(7)` to get 0, 1, 2, 3, 4, 5, 6.
* `range` also accepts a second integer argument meaning ranges between numbers can be specified:

In [None]:
# Loop over all numbers from 0 -> 9 (10 numbers)
for i in range(10):
    print(i)

print("\n") # This inserts a newline character to separate the print outs

# Loop over all numbers from 5 -> 9 (5 numbers); the stop value 10 is not included
for i in range(5,10):
    print(i)
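`range` also accepts an optional third argument, the step size, which can be negative to count downwards. A short sketch; the names `evens` and `countdown` are chosen for illustration, and `list()` is used only so the whole sequence can be printed at once.

```python
# range(start, stop, step): start is included, stop is excluded
evens = list(range(0, 10, 2))      # every second number from 0 up to, but not including, 10
countdown = list(range(5, 0, -1))  # count down from 5, stopping before 0

print(evens)
print(countdown)
```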

* In session 2 we discussed the possible pitfalls of floating point rounding errors which are now further highlighted using `for` loops. Consider adding 0.1 to 0.2 a million times using a **cumulative sum**:

In [None]:
a = 0.2 # We tell Python the initial value of a
for i in range(1000000):
    a += 0.1  #This is the same as writing: a = a + 0.1
    
print(a)

* Note our use of the syntax `+=`: `a += 0.1` is equivalent to `a = a + 0.1` (new a = old a + 0.1). 
* The exact answer is `a = 100000.2`, and we can see that the error is now present at the 6th decimal place which is much more significant than we saw before. 
* When using loops in your program, especially nested `for` loops, take note of the possible sources of floating point rounding error and check that errors do not accumulate as your program runs. 
* A better option is to remove unnecessary loops and leverage mathematics where possible. In the previous example there is no need to use a `for` loop as it can simply be calculated as:

In [None]:
a = 0.2 + (1000000 * 0.1)

print(a) # This prints 100000.2 even though 0.1 cannot be represented exactly in binary.
         # A single multiplication and addition round only twice, so far less error builds up than in the loop, and print() shows the shortest decimal string that maps back to the stored float.
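When a cumulative sum cannot be replaced by a formula, two standard library tools can help: `math.fsum`, which adds floats while tracking the rounding error, and `math.isclose`, which compares floats within a tolerance instead of testing exact equality. A minimal sketch, with variable names chosen for illustration:

```python
import math

values = [0.1] * 10  # ten copies of 0.1; the exact total is 1

# Plain cumulative sum: a little rounding error is added at every step
naive_total = 0.0
for v in values:
    naive_total += v

# math.fsum keeps track of the lost digits while it adds
careful_total = math.fsum(values)

print(naive_total)
print(careful_total)
print(naive_total == 1.0)              # exact comparison is fragile
print(math.isclose(naive_total, 1.0))  # comparison within a tolerance
```

As a rule of thumb, avoid testing floats with `==` after a long calculation and compare within a tolerance instead.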

* If you want to loop through both the elements of a list **and** have an index for the elements as well, you can use Python's `enumerate` function:

In [None]:
for index, item in enumerate(['volume', 'pressure', 'temperature', 'velocity']):
    print(index, '::', item)

* `enumerate` generates tuples, which are unpacked into `index` (an integer) and `item` (the actual value from the list).
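By default `enumerate` counts from 0, but its optional `start` argument lets the count begin elsewhere, which is handy for numbered output. A small sketch; the list `labelled` is introduced only to collect the results.

```python
quantities = ['volume', 'pressure', 'temperature', 'velocity']

labelled = []
for number, item in enumerate(quantities, start=1):  # count from 1 instead of 0
    labelled.append(str(number) + '. ' + item)

print(labelled)
```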

### Exercise

<font color='red'>Why won't this program run?:</font>

In [None]:
for i in range(10)
    print i * 10)

## The while Loop

`while` loops allow us to execute a set of statements as long as a condition is `True`. Consider printing numbers less than 10, `while x is less than 10, print that number. If x >= 10 stop`:

In [None]:
x = 1

while x < 10:
    print(x)

    x += 1 # increment the value of x by 1

* We need to increment `x` by 1 at the end of each loop cycle otherwise the `while` loop will continue forever as `x` will always be equal to 1.  
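`while` loops are most useful when the number of repetitions is not known in advance. As an illustrative sketch, the loop below counts how many half-lives pass before less than 1% of a sample remains; the starting fraction and the threshold are made-up values.

```python
fraction_remaining = 1.0  # hypothetical starting fraction of the sample
threshold = 0.01          # stop once less than 1% remains
half_lives = 0

while fraction_remaining >= threshold:
    fraction_remaining /= 2  # one half-life passes
    half_lives += 1          # count it

print(half_lives, fraction_remaining)
```

A `for` loop would need the number of half-lives up front, whereas the `while` loop discovers it.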

### The break Statement

With the `break` statement we can stop a loop even if the `while` condition is `True`:

In [None]:
x = 1

while x < 10:
    print(x)
    if x == 6:
        break
    x += 1 

* As soon as the condition `x == 6` is satisfied the loop gets the command to `break` and will exit the entire loop.

### The continue Statement

With the `continue` statement we can stop the current iteration, and continue with the next. In this example we print out the numbers from 1 to 6 but skip the number 3:

In [None]:
x = 0

while x < 6:
    x += 1
    if x == 3:
        continue
    
    print(x)

* There is a similar statement to `continue` called `pass`. `pass` just means 'no operation' and does not do anything, whereas `continue` skips the rest of the current iteration and jumps to the next loop iteration:

In [None]:
x = True

while x:
    x = False
    continue
    print("This will not print") # continue skips the rest of this iteration and re-checks the loop condition, so this line is never reached

y = True
while y:
    y = False
    pass
    print("This will print") # pass doesn't do anything so lines after calling it are executed

### The else Statement

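`else` statements can be used in conjunction with `while` loops just as we did with `if` statements. The `else` block runs once, when the loop condition becomes `False`; it is skipped if the loop is exited with `break`.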

In [None]:
x = 1
while x < 6:
    print(x)
    x += 1
else:
    print("x is no longer less than 6")
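The `else` block of a loop is skipped when the loop is left with `break`, which makes it a convenient place for 'nothing was found' handling. A minimal sketch, where `target` and `found` are names chosen for illustration:

```python
target = 4
x = 1
found = False

while x < 6:
    if x == target:
        found = True
        break  # leaving the loop with break skips the else block
    x += 1
else:
    # Only runs if the loop condition became False without a break
    print("Loop finished without finding", target)

print(found, x)
```

Setting `target = 10` makes the loop run to completion, so the message in the `else` block is printed instead.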

## List Comprehension

List comprehension is an elegant way to define and create lists based on existing lists. 

* Common applications are to make new lists where each element is the result of some operation applied to each member of another sequence or iterable. 
* List comprehension combines lists, `for` loops and optionally `if` statements. The syntax is as follows:

    `[expression for element in iterable]`

* With the optional `if` condition:

    `[expression for element in iterable if condition]`

* Let's consider an example of using a list comprehension and how you could achieve the same result using regular `for` loop syntax. Consider the chemical formula for ethanol input as a string, and we want to extract each character into a new list to be used somewhere else in the program:


In [None]:
chemical_formula = 'C2H5OH'

# Using for loops separately
chem_characters = []
for i in chemical_formula:
    chem_characters.append(i) 

print(chem_characters)

# Using list comprehension
chem_characters = [i for i in chemical_formula]

print(chem_characters)

* List comprehensions are slightly faster than the precisely equivalent `for` loop syntax but the intended advantage of list comprehension is its elegant syntax, resulting in clear and easily understandable code.
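The optional `if` condition filters which elements end up in the new list. As a sketch using the same ethanol formula, the letters and the digits can be separated with the string methods `isalpha` and `isdigit`; the list names are chosen for illustration.

```python
chemical_formula = 'C2H5OH'

# Keep only the letters
element_letters = [c for c in chemical_formula if c.isalpha()]

# Keep only the digits, converting each one to an integer
atom_counts = [int(c) for c in chemical_formula if c.isdigit()]

print(element_letters)
print(atom_counts)
```

Note that this works character by character, so a two-letter symbol such as `Cl` or a two-digit count would be split apart; it is not a full formula parser.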

# Functions

Functions are blocks of code which only run when called, behaving similarly to mathematical functions such as

$$ 
f(x) = x + 6.
$$

* This function accepts any value of $x$ and will produce the corresponding result. 
* Functions bundle a set of instructions that are run repeatedly, or due to the complexity of the code, are better self-contained in a sub-program and called when needed. 
* Functions can take zero, one or several inputs and return zero, one or several outputs, and are a **key building block used in scientific computing**.

**Defining a function:**

* The keyword `def` introduces a function definition and must be followed by the function name and the parenthesized list of input parameters. 

In [None]:
def example_func1(x):
    return x + 6

def example_func2(x, y, z):
    return x + 6, y + 6, z + 6
    
x = example_func1(4)
print(x)

x, y, z = example_func2(4, 5, 9)
print(x, y, z)

* The statements that form the body of the function start at the next line, and must be indented as discussed previously. 
* The first statement of the function body can optionally be a string; this string is the function’s **documentation string**, or **docstring**. 
    * Docstrings are used by developers to automatically produce online or printed documentation; but more importantly to explain to readers and users of your code what the purpose of the function is. 
    * For very simple functions this is not necessary, but it is good practice to include docstrings in code that you write, so make a habit of it. 
    * In these tutorials we will be using the [numpy docstring convention, PEP257 superset](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) but there are multiple schemes. 
    
* Consider the following function that is an equivalent implementation of the chemical formula code above:

In [None]:
def chem_formula_to_character(chemical_formula):
    '''
    This is the function's docstring!
    
    This function takes a chemical formula input as a string, separates the characters and appends them to a list.
    
    Parameters
    ----------
    chemical_formula : string
                       This is a chemical formula in string format, e.g. "H2O".
    
    Returns
    -------
    chem_characters : list
                      This list contains the separated characters from the chemical_formula string.
    '''
    # Using list comprehension
    chem_characters = [i for i in chemical_formula]

    # Using a separate for loop (this rebuilds the same list and is shown only for comparison)
    chem_characters = []
    for i in chemical_formula:
        chem_characters.append(i)
        
    return chem_characters

print(chem_formula_to_character('C2H5OH'))
print(chem_formula_to_character('H2O'))
print(chem_formula_to_character('C9H8O4'))
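A docstring is stored with the function, so it can be read back while you work. The sketch below uses a hypothetical function, `celsius_to_kelvin`, written in the same numpy docstring style, and prints its docstring through the `__doc__` attribute.

```python
def celsius_to_kelvin(temp_c):
    '''
    Convert a temperature from degrees Celsius to kelvin.

    Parameters
    ----------
    temp_c : float
             Temperature in degrees Celsius.

    Returns
    -------
    temp_k : float
             Temperature in kelvin.
    '''
    temp_k = temp_c + 273.15
    return temp_k

print(celsius_to_kelvin.__doc__)  # the docstring exactly as written above
print(celsius_to_kelvin(25.0))    # call the function as usual
```

Calling `help(celsius_to_kelvin)` shows the same text in a formatted form, which is how the documentation of most Python libraries can be read from inside a notebook.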

# Worked Example: Sum of Multiples

Now that we know the basics of `if` statements and `for` and `while` loops, we will put our knowledge to the test and answer the [first question](https://projecteuler.net/problem=1) from Project Euler, similar to the **fizz buzz** interview question for programmers:

<p style="text-align: center;">
"If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000."
</p>

There are usually multiple ways to solve a problem; and there are some very succinct ways of solving this problem using more advanced Python, but it is perfectly solvable using just `if` and `for` statements in conjunction with what we learned about lists in session 2. **Firstly, let's replicate the given information about multiples of 3 and 5 below 10 by:**

1. Storing the natural numbers below 10 in a list.
2. Creating two empty lists, one to store multiples of 3 and one to store multiples of 5.
3. We will then loop over the natural number list using a `for` loop and test each number to see if it is divisible by 3 or 5 using the modulus operator, `%`. If the number is divisible by 3 or 5, it is appended to its relevant list which we then sum together after the `for` loop using the inbuilt `sum` function in Python which adds the items of an iterable object:

In [None]:
# Create list of natural numbers below 10
natural_num_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Create two empty lists, one for multiples of 3 and one for multiples of 5
mul_3 = []
mul_5 = []

# We now want to loop over the list of numbers
for i in natural_num_list:
    if (i % 3 == 0): # If number is exactly divisible by 3
        mul_3.append(i) # If True then append the number to the multiples of 3 list
    elif (i % 5 == 0): # If number is exactly divisible by 5
        mul_5.append(i) # If True then append the number to the multiples of 5 list
    else:
        pass # If neither condition is True then carry on

# Add the multiples together
sum_multiples = sum(mul_3) + sum(mul_5)

print(sum_multiples)

* Excellent! We have replicated the given information, which is a worthwhile habit to learn as your programming progresses. 
* Testing your program on model systems, known input and output or literature data is one of the primary ways to check your program is functioning as expected; 
    * **remember a program will produce numbers but it does not know if those numbers are correct; that is the job of the human programming it.** 

* We can now modify our program to find the multiples of 3 and 5 below 1000; which we will do by using `for` loops and the equivalent list comprehension to build a variable size list of the natural numbers, rather than having to manually type them all: 

In [None]:
# for loop notation
natural_num_list = []
for x in range(10):
    natural_num_list.append(x)

print(natural_num_list)

# Equivalent, more compact list comprehension notation
natural_num_list = [x for x in range(10)]

print(natural_num_list)

In [None]:
# There are more comments here than should usually be written. This is just because we are learning!

# Create list of natural numbers below 1000
natural_num_list = [x for x in range(1000)]

# Create two empty lists, one for multiples of 3 and one for multiples of 5
mul_3 = []
mul_5 = []

# We now want to loop over the list of numbers
for i in natural_num_list:
    if (i % 3 == 0) :   # If number is exactly divisible by 3
        mul_3.append(i) # If True then append the number to the multiples of 3 list
    elif (i % 5 == 0):  # If number is exactly divisible by 5
        mul_5.append(i) # If True then append the number to the multiples of 5 list 
    else:
        pass # If neither condition is True then pass

# Add the multiples together as required
sum_multiples = sum(mul_3) + sum(mul_5)

print(sum_multiples)

* **This prints `233168`, which is the correct answer!**
* We can go one step further and use list comprehension to compress the entire program into just a single line:

In [None]:
print(sum([i for i in range(1000) if i % 3 == 0 or i % 5 == 0]))
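One detail is worth checking: the loop versions use `if...elif` and the one-liner uses `or`, so a multiple of both 3 and 5 (such as 15) is only ever added once. The sketch below shows what would go wrong with two independent `if` statements; the variable names are chosen for illustration.

```python
double_counted = 0
counted_once = 0

for i in range(1000):
    # Two independent if statements: a multiple of 15 passes both tests
    if i % 3 == 0:
        double_counted += i
    if i % 5 == 0:
        double_counted += i

    # A single condition with `or`: each number is added at most once
    if i % 3 == 0 or i % 5 == 0:
        counted_once += i

print(double_counted, counted_once)
print(double_counted - counted_once)  # the sum of the multiples of 15 below 1000
```

The difference between the two totals is exactly the sum of the multiples of 15, the numbers that satisfy both conditions.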

* We have produced several equivalent programs which correctly calculate the answer, but do we even need to use `for` loops, `if` statements or lists? 
    * There are times where sitting down with a problem and analysing it carefully can reveal alternative solutions using more elegant mathematics than 'brute forcing' it with loops. 
    * We could instead solve this problem using the [**inclusion-exclusion principle**](https://mathworld.wolfram.com/Inclusion-ExclusionPrinciple.html) from a branch of mathematics called *combinatorics*, which is described at the **end of this workbook** as an alternative solution to this problem without requiring any `for` loops.  

## Review

In this session we covered:

* `if`, `elif` and `else` statements which allow us to include decisions within our program.
* `for` loops which allow us to iterate over a collection of items.
* `while` loops which allow us to execute a set of statements as long as a condition is `True`.
* List comprehension using a combination of lists, `if` statements and `for` loops.
* Functions which only run when called, behaving similarly to mathematical functions.

## Exercise

Each new term in the Fibonacci sequence is generated by adding the previous two terms, e.g. 1 + 2 = 3 -> 2 + 3 = 5 -> 3 + 5 = 8 etc... By starting with 1 and 2, the first 10 terms will be:

$$
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, \ldots
$$

By considering the terms in the Fibonacci sequence whose values do **not exceed four million**, find the sum of the **even-valued** terms. Below is code which generates the Fibonacci sequence; your task is to modify it to find the sum of these **even-valued** terms.

In [None]:
x = 0
y = 1

no_fib_iters = 10 # number of steps to take in the Fibonacci sequence

fib = [] # Create empty list to store the Fibonacci numbers
for i in range(no_fib_iters):
    fib_current = x + y # current Fibonacci value

    x = y # assign the second value in the sum to the first value for the next sum
    y = fib_current 

    fib.append(fib_current)

print(fib)

**Hints**: 

* Start by generating the given Fibonacci sequence by running the code to understand how it works.
* Copy, paste and edit the code to handle the summation part of the problem, a `while` loop may be useful.
* Extract out the even numbers from this sequence and add together, making sure no terms in the Fibonacci sequence exceed 4 million.
* The answer is **4613732**.
* **If you are stuck:** An example code answer is at the bottom of this notebook.

## Alternative solution to sums of multiples

When items can belong to more than one set (here the sets are the multiples of 3 and the multiples of 5), add the totals for each set and then subtract the total of the items that appear in both sets (here the multiples of 15, which would otherwise be counted twice):

$$
\underbrace{\sum\limits_{k_1 = 1}^{333}3k_1\phantom{15}}_{\text{Sum 1}} + \underbrace{\sum\limits_{k_2 = 1}^{199}5k_2 \phantom{15}}_{\text{Sum 2}} - \underbrace{\sum\limits_{k_3 = 1}^{66}15k_3}_{\text{Sum 3}} = 166833 + 99500 - 33165 = 233168
$$

* **Sum 1:** Sum up all numbers, which are multiples of 3: 3, 6, 9, 12, 15, 18, 21,..., 990, 993, 996, 999. We factor out 3 which is why the upper bound is 333, (333 x 3 = 999).

* **Sum 2:** Sum up all numbers, which are multiples of 5: 5, 10, 15, 20, 25, 30, 35,..., 990, 995. We factor out 5 which is why the upper bound is 199, (199 x 5 = 995).
* **Sum 3:** The multiples of 15, e.g. 990, appear in both Sum 1 and Sum 2, so we have counted them twice and need to subtract them once from the total. The number of multiples of 15 is found by dividing the highest multiple of 15 in the range by 15, i.e. 990/15 = 66, hence the third sum ranges from 1 to 66. 

This solution is more technical than the Python code we just developed, but when you stumble across elegant mathematical solutions to a problem that do not require writing code they are nearly always **the preferred solution.**
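Each of the three sums is an arithmetic series, so it can be evaluated without a loop using $\sum_{k=1}^{n} k = n(n+1)/2$. If you do want to check the arithmetic in Python, a minimal sketch is given below; the helper name `sum_of_multiples_below` is hypothetical and introduced only for this check.

```python
def sum_of_multiples_below(step, limit):
    '''
    Sum of all positive multiples of `step` that are below `limit`,
    using step * (1 + 2 + ... + n) = step * n * (n + 1) / 2.
    '''
    n = (limit - 1) // step  # how many multiples of step lie below limit
    return step * n * (n + 1) // 2

sum_1 = sum_of_multiples_below(3, 1000)
sum_2 = sum_of_multiples_below(5, 1000)
sum_3 = sum_of_multiples_below(15, 1000)

print(sum_1, sum_2, sum_3)
print(sum_1 + sum_2 - sum_3)
```

Unlike the loop versions, the work done here does not grow with the size of the limit.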

## Fibonacci summation code

In [None]:
x = 0
y = 1

num_limit = 4000000 # set the limit of the Fibonacci numbers to be 4000000

fib_even = 0 # Start the cumulative sum of even Fibonacci numbers
while x + y <= num_limit: # stop before generating a term that exceeds the limit
    fib_current = x + y

    x = y
    y = fib_current

    # Test if the number is even
    if fib_current % 2 == 0:
        fib_even += fib_current

# print the final sum of even Fibonacci numbers
print(fib_even)

In [None]:
# Below is a more efficient solution

def fibonacci_sum_even(limit):
    '''
    Fill in the docstring for practice!
    '''
    x, y = 0, 1
    while x < limit:
        if not x % 2:         
            yield x # yield is a keyword, not a function; look here for what it does: https://www.geeksforgeeks.org/use-yield-keyword-instead-return-keyword-python/
        x, y = y, x + y

print(sum(fibonacci_sum_even(4E6)))
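To see what `yield` does in isolation, here is a small sketch with a hypothetical generator, `first_squares`. A function containing `yield` returns a generator, which produces its values one at a time on request instead of building a whole list in memory.

```python
def first_squares(count):
    '''
    Yield the squares 0, 1, 4, ... of the first `count` non-negative integers.
    '''
    for k in range(count):
        yield k * k  # hand back one value, then pause here until the next is requested

squares = first_squares(4)  # nothing has been calculated yet

first = next(squares)  # run the function up to the first yield
rest = list(squares)   # collect the values that are left

print(first)
print(rest)
```

This is why the generator can be passed straight to `sum` in the solution above: `sum` requests the values one by one and no list of Fibonacci numbers is ever stored.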