# Control Flow and Data Structures

## Lesson Goal

 - Compose simple programs to perform the algebraic equations we have studied so far on:
  - single value variables.
  - data structures (holding multiple variables)
  - data imported from external sources (user input and external files)

## Objectives

 - Use control __statements__ and __loops__ to determine the flow of a program.  
 - Express collections of multiple variables as `list`, `tuple` and dictionary (`dict`) data structures.
 
- Use iteration to visit entries in a data structure 

- Learn to select the right data structure for an application

 - Input data to your programs from different sources (user input and external files).

## 1. Control Statements
In the last seminar we looked at a simple computer program that returned Boolean (True or False) variables. 

Based on the current time of day, the program answers two questions:


>__Is it lunchtime?__

>`True`


if it is lunch time.

>__Is it time for work?__

>`True`

if it is within working hours.

In [1]:
time = 13.05          # current time

work_starts = 8.00    # time work starts 
work_ends =  17.00    # time work ends

lunch_starts = 13.00  # time lunch starts
lunch_ends =   14.00  # time lunch ends

# variable lunchtime is True or False
lunchtime = time >= lunch_starts and time < lunch_ends

# variable work_time is True or False
work_time = time >= work_starts and time < work_ends


print("Is it lunchtime?")
print(lunchtime)
print("Is it time for work?")
print(work_time)

Is it lunchtime?
True
Is it time for work?
True


What if we now want our computer program to do something based on these answers?

To do this, we need to use *control statements*.

Control statements are a fundamental part of programming.

Here is a control statement in pseudo code:

    if A is true    
        Perform task X
        
For example 

    if lunchtime is true    
        Eat lunch


This is an `if` statement.

Control flow is used to make decisions in a program.

For example, we can use an `if` statement with an `else if` to check whether multiple conditions are true or false. 

    if A is true
        Perform task X (only)
        
    else if B is true
        Perform task Y (only)
        


    if lunchtime is true
        Eat lunch
        
    else if work_time is true
        Do work

Often it is useful to include an `else` statement.

If none of the `if` and `else if` statements are satisfied, the code following the `else` statement will be executed.

    if A is true
        Perform task X (only)
        
    else if B is true
        Perform task Y (only)
        
    else   
        Perform task Z (only)
        



    if lunchtime is true
        Eat lunch
        
    else if work_time is true
        Do work
        
    else   
        Go home
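
The lunchtime pseudo code above translates almost directly into Python. The sketch below assumes the same example times as the earlier cell and assumes that `work_time` is true within working hours; the variable `activity` is introduced here only for illustration. Python writes "else if" as `elif`.

```python
time = 13.05  # current time (same example value as before)

lunchtime = time >= 13.00 and time < 14.00
work_time = time >= 8.00 and time < 17.00

# Only the first branch whose condition is true is executed
if lunchtime:
    activity = "Eat lunch"
elif work_time:
    activity = "Do work"
else:
    activity = "Go home"

print(activity)
```

Because `lunchtime` is checked first, the program chooses lunch even though the time also falls within working hours.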

Let's get a better understanding of control flow statements by completing some examples. 

## 1.1 `if` and `else` statements

Below is a simple example that demonstrates a Python  if-else control statement. 

The input to the program is variable `x`.

The program prints a message and modifies `x`.

The message and the modification of `x` depend on the initial value of `x`.

__Note:__ 
 
 - In Python, "else if" is written: `elif`
 - The program uses the shortcut algebraic operators that you learnt to use in the last seminar. 

In [3]:
x = -10.0  # Initial x value

if x > 0.0:  
    print('Initial x is greater than zero')
    x -= 20.0
    
elif x < 0.0:  
    print('Initial x is less than zero')
    x += 21.0
    
else: 
    print('Initial x is not less than zero and not greater than zero, therefore it must be zero')
    x *= 2.5

# Print new x value
print('Modified x =', x)

Initial x is less than zero
Modified x = 11.0


__Try it yourself__

Try changing the value of `x` a few times.
Re-run the cell to see the different paths the program can follow.

Let's look more closely at the control statement example. 

The control statement begins with an `if`.<br>
This is followed by the expression to check.<br>
This is followed by `:`.<br>

```python
if x > 0.0:
```

Below that is a block of code, indented by four spaces, that is executed if the check (`x > 0.0`) is true:

```python
    print('Initial x is greater than zero')
    x -= 20.0
```
If the check (`x > 0.0`) is true:
 - The code above is executed.
 - The control block is exited.
 
If the check is false, then the `elif` (else if) check is performed:

```python
elif x < 0.0:
    print('Initial x is less than zero')
    x += 21.0
```      
If the check (`x < 0.0`) is true:
 - The code above is executed.
 - The control block is exited. 
 
 
If none of the preceding statements are true [(`x > 0.0`) is false and (`x < 0.0`) is false], the code following the `else` statement is executed.
 
 
```python
else:
    print('Initial x is not less than zero and not greater than zero, therefore it must be zero')
    x *= 2.5
```
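
Because the control block is exited as soon as one check succeeds, at most one branch runs, even when several checks would be true. A minimal sketch of this behaviour (the value `5.0`, the second check `x > 3.0` and the variable `message` are chosen here purely for illustration):

```python
x = 5.0

if x > 0.0:
    message = "x is greater than zero"
elif x > 3.0:
    # Also true for x = 5.0, but never reached because the first check succeeded
    message = "x is greater than three"
else:
    message = "x is zero or negative"

print(message)
```

The order of the checks therefore matters: put the most specific check first if more than one can be true.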


### Real-World Example: currency trading

__Read the problem below carefully.__

To make a commission (profit), a currency trader sells US dollars to travellers at a rate below the market rate. 

The multiplier they apply to calculate the reduced rate is shown in the table.  

|Amount (JPY)                                |Multiplier applied to market rate |
|--------------------------------------------|----------------------------------|
| Less than $100$                            | 0.9                              |
| From $100$ to less than $1,000$            | 0.925                            |
| From $1,000$ to less than $10,000$         | 0.95                             |
| From $10,000$ to less than $100,000$       | 0.97                             |
| $100,000$ or more                          | 0.98                             |

The currency trader incurs extra costs for handling cash. 

Therefore, if the transaction is made in cash, they retain an extra 10% after conversion. 
(If the transaction is made electronically, they do not.)  

At the current market rate 1 JPY is 0.0091 USD.

The *effective rate* is the rate that the customer is getting based on the amount in JPY to be changed.

In [5]:
JPY  = 200500  # The amount in JPY to be changed into USD
cash = True  # True if selling cash, otherwise False

market_rate = 0.0091  # 1 JPY is worth this many dollars at the market rate

# Apply the appropriate reduction depending on the amount being sold
if JPY < 100:
    USD = 0.9 * market_rate * JPY
    
elif JPY < 1000:  
    USD = 0.925 * market_rate * JPY
    
elif JPY < 10000:
    USD = 0.95 * market_rate * JPY
    
elif JPY < 100000:
    USD = 0.97 * market_rate * JPY
    
else:
    USD = 0.98 * market_rate * JPY

if cash:
    USD *= 0.9  # recall that this is shorthand for USD = 0.9*USD 
    
print("Amount in JPY sold:", JPY)
print("Amount in USD purchased:", USD)
print("Effective rate:", USD/JPY)

Amount in JPY sold: 200500
Amount in USD purchased: 1609.2531000000001
Effective rate: 0.0080262
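
As a check on the output above, the arithmetic for this case can be written out by hand. The amount of 200500 JPY falls in the last band (multiplier 0.98) and the sale is in cash (a further multiplier of 0.9):

\begin{align}
\text{USD} &= 0.98 \times 0.0091 \times 200500 \times 0.9 \\
&= 1609.2531
\end{align}

\begin{align}
\text{effective rate} &= \frac{1609.2531}{200500} \\
&= 0.0080262
\end{align}

The trailing digits in the printed value `1609.2531000000001` come from floating point rounding, not from the calculation itself.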


__Try it yourself__

Try changing the values of `JPY` and `cash` a few times.
Re-run the cell to see the different paths the program can follow.

## 1.2 `for` loops

A `for` loop is a block that repeats an operation a specified number of times (loops). 

We will learn its simplest and most common usage.

### 1.2.1 The `range()` function

To write a `for` loop, we first need the function `range()`.

`range(3, 6)` returns *integer* values starting from 3 and stopping before 6.

i.e.

> 3,4,5

Note this does not include 6.

We can change the starting value.
 
For example for integer values starting at 0 and ending at 4:
 
`range(0,4)`

returns:

> 0, 1, 2, 3

`range(4)` is a __shortcut__ for `range(0, 4)`.
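
The values described above can be checked directly. In this short sketch the built-in `list()` is used only to display the values that `range()` produces; the variable names are arbitrary.

```python
# list() collects the values produced by range() so that they can be printed
values_a = list(range(3, 6))
values_b = list(range(0, 4))
values_c = list(range(4))

print(values_a)
print(values_b)
print(values_c)
```

The last two lines print the same values, since `range(4)` starts from 0 by default.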

### 1.2.2 Simple `for` loops

In [1]:
for i in range(-2, 3):
    print(i)

-2
-1
0
1
2


The above executes 5 loops.

The statement 
```python
for i in range(-2, 3):
```
says that we want to loop over five integers, starting from -2. 

Each loop the value `i` increases by 1 (-2, -1, 0, 1, 2).
 
The code we want to execute inside the loop is indented four spaces: 
```python
    print(i)
```
The loop starts from -2 and executes this code for each value of i (-2, -1, 0, 1, 2).

In [12]:
for n in range(4):
    
    print("----")
    
    print(n, n**2)

----
0 0
----
1 1
----
2 4
----
3 9


The above executes 4 loops.

The statement 
```python
for n in range(4):
```
says that we want to loop over four integers, starting from 0. 

Each loop the value `n` increases by 1 (0, 1, 2, 3).
 
The code we want to execute inside the loop is indented four spaces: 
```python
    print("----")
    print(n, n**2)
```
The loop starts from zero and executes this code for each value of n (0, 1, 2, 3). 
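
A loop is often used to build up a result step by step. As a small sketch (the variable name `total` is an assumption made for this example), the squares printed above can be summed instead of printed:

```python
total = 0  # running sum, starts at zero

for n in range(4):
    total += n**2  # add the square of n to the running sum

print(total)
```

The variable `total` is created before the loop so that it is not reset at every iteration.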



__Try it yourself__

Go back and change the __range__ of input values and observe the change in output. 


If we want to step by three rather than one:

In [14]:
for n in range(0, 10, 3):
    print(n)

0
3
6
9


If we want to step backwards rather than forwards we must include the step size:

In [22]:
for n in range(10, 0, -1):
    print(n)

10
9
8
7
6
5
4
3
2
1


For example:

In [23]:
for n in range(10, 0):
    print(n)

This loop does not print any values because no values lie between 10 and 0 when counting in the positive direction from 10. 
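
The difference between the two ranges can be seen by collecting their values. This is a minimal sketch; `list()` and `len()` are used only to inspect the ranges, and the variable names are arbitrary.

```python
forward = list(range(10, 0))       # no step given: counts upwards from 10, so it is empty
backward = list(range(10, 0, -1))  # step of -1: counts down from 10 to 1

print(len(forward))
print(backward)
```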

__Try it yourself.__

In the cell below write a `for` loop that:
 - loops __backwards__ through a range starting at `n = 10` and ending at `n = 1`.
 - prints `n`$^2$ at each loop.


In [20]:
# For loop

### Real-world Example: conversion table from degrees Fahrenheit to degrees Celsius

We can use a `for` loop to create a conversion table from degrees Fahrenheit ($T_f$) to degrees Celsius ($T_c$).

Conversion formula:

$$
T_c = 5(T_f - 32)/9
$$

Computing the conversion from -100 F to 200 F in steps of 20 F (not including 200 F):

In [24]:
print("T_f,    T_c")

for Tf in range(-100, 200, 20):
    print(Tf, (Tf - 32) * 5 / 9)

T_f,    T_c
-100 -73.33333333333333
-80 -62.22222222222222
-60 -51.111111111111114
-40 -40.0
-20 -28.88888888888889
0 -17.77777777777778
20 -6.666666666666667
40 4.444444444444445
60 15.555555555555555
80 26.666666666666668
100 37.77777777777778
120 48.888888888888886
140 60.0
160 71.11111111111111
180 82.22222222222223
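
The long decimals make the table hard to read. One option, sketched here, is the built-in `round()` function, which is not used elsewhere in this lesson; rounding to one decimal place is an arbitrary choice.

```python
print("T_f,    T_c")

for Tf in range(-100, 200, 20):
    Tc = (Tf - 32) * 5 / 9   # same conversion formula as above
    print(Tf, round(Tc, 1))  # show the Celsius value to one decimal place
```

Only the printed value is rounded; `Tc` itself keeps its full precision.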


## 1.3 `while` loops

`for` loops perform an operation a specified number of times. 

A `while` loop performs a task while a specified statement is true. 

For example:

In [29]:
print("Start of while statement")
x = -2
while x < 5:
    print(x)
    x += 1  # Increment x
print("End of while statement")

Start of while statement
-2
-1
0
1
2
3
4
End of while statement


The code that follows the `while` statement is indented four spaces.

It is executed and repeated until `x < 5` is `False`.

It can be quite easy to crash your computer using a `while` loop. E.g.,
```python
x = -2
while x < 5:
    print(x)
```
will continue indefinitely because `x` never changes, so `x < 5` never becomes `False`. 

Therefore, we should consider which type of loop is appropriate for our program, e.g. it would be better to use a `for` loop for the example above. 

In [30]:
print("Start of for statement")
x = -2

for y in range(x,5):
    print(y)

    
print("End of for statement")

Start of for statement
-2
-1
0
1
2
3
4
End of for statement


The following is an example of where a `while` loop is appropriate.

In [27]:
x = 0.9

while x > 0.001:
    # Square x (shortcut x *= x)
    x = x * x
    print(x)

0.81
0.6561000000000001
0.43046721000000016
0.18530201888518424
0.03433683820292518
0.001179018457773862
1.390084523771456e-06


A `while` loop is appropriate here because we might not know beforehand how many steps are required before `x > 0.001` becomes false. 

If $x \ge 1$, the above would lead to an infinite loop, as `x` would never decrease (it stays at 1 or grows with every loop). 

e.g. 
```python
x = 2

while x > 0.001:
    x = x * x
    print(x)
```

To make the code robust, it would be good practice to check that $x < 1$ before entering the `while` loop.
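
One way to add such a check is sketched below. The lower bound `x >= 0.0` is an extra assumption made here (a negative value such as -2 would also grow after being squared), and the counter `steps` is added only to show how many loops were needed.

```python
x = 0.9
steps = 0  # counts how many times the loop body runs

# Only enter the loop if squaring x is guaranteed to make it smaller
if x >= 0.0 and x < 1.0:
    while x > 0.001:
        x = x * x
        steps += 1
        print(x)
else:
    print("x must be at least 0 and less than 1; loop skipped")

print("Number of steps:", steps)
```

With the check in place, changing the first line to `x = 2` prints the warning message instead of looping forever.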

__Try it for yourself:__

In the cell below:
 - Create a variable, `x`, with the initial value 50
 - Each loop, reduce the value of x by half
 - Exit the loop when `x` < 3

In [1]:
# While loop

## 1.4 `break`, `continue` and `pass`

### 1.4.1 `break`

Sometimes we want to break out of a `for` or `while` loop. 

For example, in a `for` loop we can check if something is true and then exit the loop prematurely, e.g.:

In [None]:
for x in range(10):
    print(x)
    
    if x == 5:
        print("Time to break out")
        break

Let's look at how we can use this in a program.

The program below checks the integers from 2 up to (but not including) 50, __finds the prime numbers__ and prints them. 

__Prime number:__ A natural number greater than 1 that has no positive divisors other than 1 and itself (2, 3, 5, 7, 11, 13, 17, ...).

In [8]:
N = 50  # Check numbers up to 50 for primes (excludes 50)

# Loop over all numbers from 2 to 50 (excluding 50)
for n in range(2, N):

    # Assume that n is prime
    n_is_prime = True

    # Check if n can be divided by m
    # m ranges from 2 to n (excluding n)
    for m in range(2, n):
        
        # Check if the remainder when n/m is equal to zero 
        # If the remainder is zero it is not a prime number
        if n % m == 0:   
            n_is_prime = False

    #  If n is prime, print to screen        
    if n_is_prime:
        print(n)

2
3
5
7
11
13
17
19
23
29
31
37
41
43
47


Notice that our program contains a second `for` loop. 

For each value of n, it loops through incrementing values of m in the range (2 to n):

```python
    # Check if n can be divided by m
    # m ranges from 2 to n (excluding n)
    for m in range(2, n):
```
before incrementing to the next value of n.

We call this a *nested* loop.

The indents in the code show where loops are nested.

<br>
<br>

Notice that one of the prime numbers is 17.

In the program below, a `break` statement is added. 

In [9]:
N = 50  # Check numbers up to 50 for primes (excludes 50)

# Loop over all numbers from 2 to 50 (excluding 50)
for n in range(2, N):

    # Assume that n is prime
    n_is_prime = True

    # Check if n can be divided by m
    # m ranges from 2 to n (excluding n)
    for m in range(2, n):
        
        # Check if the remainder when n/m is equal to zero 
        # If the remainder is zero it is not a prime number
        if n % m == 0:   
            n_is_prime = False

    #  If n is prime, print to screen        
    if n_is_prime:
        print(n)
        
    if n == 17:   
        break

2
3
5
7
11
13
17


If `n` is equal to 17, the program stops running the `for` loop:

```python
for n in range(2, N):
```

Only values up to 17 are printed. 

__Try it yourself.__

Re-write the break statement in the cell above.  

Try stopping the `for` loop at the first prime number greater than 20.

__Note:__ You do not need to delete the previous `break` statement.

You can make it a comment by adding a `#` at the start of each line:

```python
#    if n == 17:   
#        break
```
This allows you to refer to the code, but stops the program from running it. 

The program exits the loop at the `break` statement.
This means that any code within the loop, after the break statement is skipped. 

__In the cell below copy and paste your code from the cell above.__

__Try editing your code to print all of the prime numbers *under* 20.__

In [None]:
# Copy and paste your code here.

A simple way to do this is to place the break statement before we print the value of `n`.

    if n > 20:
        break

    #  If n is prime, print to screen        
    if n_is_prime:
        print(n)
        
If `n`$>20$ the program breaks out of the loop before printing the number.
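
A `break` statement exits only the innermost loop that contains it. As a sketch, the prime-number program above can use this in its inner loop to stop testing divisors as soon as one is found (the counter `primes_found` is added here only for illustration):

```python
N = 50            # Check numbers up to 50 for primes (excludes 50)
primes_found = 0  # number of primes printed so far

for n in range(2, N):

    # Assume that n is prime
    n_is_prime = True

    for m in range(2, n):
        if n % m == 0:
            n_is_prime = False
            break  # exits the inner loop only; the outer loop carries on

    if n_is_prime:
        print(n)
        primes_found += 1

print("Number of primes found:", primes_found)
```

The printed primes are the same as before, but the inner loop no longer tests every value of `m` for numbers that are already known not to be prime.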

### 1.4.2 `continue`

Sometimes, instead of stopping the loop we want to go to the next iteration in a loop, skipping the remaining code.

For this we use `continue`. 

The example below loops over 20 numbers (0 to 19) and checks if the number is divisible by 4. 

If the number is not divisible by 4:

- it prints a message 
- it moves to the next value. 

If the number is divisible by 4 it *continues* to the next value in the loop, without printing.

In [11]:
for j in range(20):
    
    if j % 4 == 0:  # Check remainder of j/4
        continue    # continue to next value of j
        
    print("Number is not divisible by 4:", j)

Number is not divisible by 4: 1
Number is not divisible by 4: 2
Number is not divisible by 4: 3
Number is not divisible by 4: 5
Number is not divisible by 4: 6
Number is not divisible by 4: 7
Number is not divisible by 4: 9
Number is not divisible by 4: 10
Number is not divisible by 4: 11
Number is not divisible by 4: 13
Number is not divisible by 4: 14
Number is not divisible by 4: 15
Number is not divisible by 4: 17
Number is not divisible by 4: 18
Number is not divisible by 4: 19
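
For comparison, the same messages can be produced without `continue` by reversing the check. This is only a sketch of the alternative; the counter `printed` is an assumption added to count the messages.

```python
printed = 0  # number of messages printed

for j in range(20):

    if j % 4 != 0:  # True when j is NOT divisible by 4
        print("Number is not divisible by 4:", j)
        printed += 1

print("Messages printed:", printed)
```

Using `continue` tends to be clearer when a lot of code follows the check, because that code does not need to be indented inside the `if` block.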


## 2. Data Structures

Often we want to manipulate data that is more meaningful than ranges of numbers.

These collections of variables might include:
 - the results of an experiment
 - a list of names
 - the components of a vector
 - a telephone directory with names and associated numbers.
 
Python has different __data structures__ that can be used to store and manipulate these values.

Like variable types (`string`, `int`,`float`...) different data structures behave in different ways.

Today we will learn to use `list`, `tuple` and dictionary (`dict`) data structures.

We will study the differences in how they behave so that you can learn to select the most suitable data structure for an application. 
 
 

Programs use data structures to collect data into useful packages. 

$$
r = [u, v, w]
$$

For example, rather than representing a vector `r` of length 3 using three separate floats `u`, `v` and `w`, we could represent 
it as a __list__ of floats:

`r = [u, v, w]`. 

We will learn what a __list__ is in a moment.

If we want to store the names of students in a laboratory group, rather than representing each student using an individual string variable, we could use a list of names, e.g.:



In [12]:
lab_group0 = ["Sarah", "John", "Joe", "Emily"]
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]

This is useful because we can perform operations on lists such as:
 - checking its length (number of students in a lab group)
 - sorting the names in the list into alphabetical order
 - making a list of lists (we call this a *nested list*):


In [13]:
lab_groups = [lab_group0, lab_group1]
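
The operations listed above can be sketched as follows. The functions `len()` and `sorted()` are explained in the sections that follow; the variable names `n_groups`, `n_students` and `alphabetical` are chosen here for illustration.

```python
lab_group0 = ["Sarah", "John", "Joe", "Emily"]
lab_group1 = ["Roger", "Rachel", "Amer", "Caroline", "Colin"]
lab_groups = [lab_group0, lab_group1]

n_groups = len(lab_groups)         # number of lab groups in the nested list
n_students = len(lab_group1)       # number of students in the second group
alphabetical = sorted(lab_group0)  # a new list in alphabetical order

print(n_groups)
print(n_students)
print(alphabetical)
```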

## 2.1 Lists

A list is a sequence of data. 

We call each item in the sequence an *element*. 

A list is constructed using square brackets.

A list can hold a mixture of types (`int`, `string`....).

In the example below, `lab_group0` is a list.



We can find the length (number of items) of a list using the function `len()`, by including the name of the list in the brackets. 

In [37]:
lab_group0 = ["Sara", "Mari", "Quang"]

size = len(lab_group0)

print("Lab group members:", lab_group0)

print("Size of lab group:", size)

print("Check the Python object type:", type(lab_group0))


Lab group members: ['Sara', 'Mari', 'Quang']
Size of lab group: 3
Check the Python object type: <class 'list'>
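
As noted earlier, a list can also hold a mixture of types. A short sketch with made-up values (a name, an age, a height and a Boolean):

```python
# Each element has a different type: str, int, float, bool
mixed = ["Sara", 21, 1.65, True]

print(mixed)
print("Number of items:", len(mixed))
print("Type of the second item:", type(mixed[1]))
```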


An empty list is created by

In [29]:
my_list = []

A list of length 5 with repeated values can be created by

In [30]:
my_list = ["Hello"]*5
print(my_list)

['Hello', 'Hello', 'Hello', 'Hello', 'Hello']


### 2.1.1 Iterating over lists

Looping over each item in a list is called *iterating*. 

To iterate over the members of the lab group we use a `for` loop: 

In [31]:
for member in lab_group0:
    print(member)

Sara
Mari
Quang


### 2.1.2 Manipulating lists 

There are many functions for manipulating lists. 

To sort the list by alphabetical order we use `sorted()`:

In [38]:
lab_group0 = ["Sara", "Mari", "Quang"]

print(lab_group0)

lab_group0 = sorted(lab_group0)

print(lab_group0)

['Sara', 'Mari', 'Quang']
['Mari', 'Quang', 'Sara']


As with `len()` we include the name of the list we want to sort in the brackets. 

There is a shortcut for sorting a list.

`sort` is known as a 'method' of a `list`. 

If we suffix a list with `.sort()`, it performs an *in-place* sort.

In [42]:
lab_group0 = ["Sara", "Mari", "Quang"]

print(lab_group0)

#lab_group0 = sorted(lab_group0)
lab_group0.sort()

print(lab_group0)

['Sara', 'Mari', 'Quang']
['Mari', 'Quang', 'Sara']
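
The difference between the two approaches is sketched below: `sorted()` returns a new list and leaves the original unchanged, whereas `.sort()` changes the list itself and returns `None`. The list `names` is a separate example list so that `lab_group0` is not affected.

```python
names = ["Sara", "Mari", "Quang"]

new_names = sorted(names)  # returns a new sorted list
print(names)               # the original order is unchanged
print(new_names)

result = names.sort()      # sorts names itself and returns None
print(names)
print(result)
```

A common mistake is to write `names = names.sort()`, which replaces the list with `None`.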


We can remove items from a list using the method `pop`.

We place the index of the element we wish to remove in brackets. 

In [44]:
print(lab_group0)

# Remove the second student 
# remember indexing starts from 0
# 1 is the second element

lab_group0.pop(1)
print(lab_group0)

['Mari', 'Quang', 'Sara']
['Mari', 'Sara']


We can add items at the end of a list using the method `append`.

We place the element we want to add to the end of the list in brackets. 

In [46]:
# Add new student "Lia" at the end of the list
lab_group0.append("Lia")
print(lab_group0)

['Mari', 'Sara', 'Lia']
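
The method `pop` also returns the element it removes, which can be stored in a variable. A minimal sketch using a separate example list `group` (the names `group` and `removed` are chosen for illustration):

```python
group = ["Mari", "Quang", "Sara"]

removed = group.pop(1)  # removes and returns the element at index 1
print(removed)
print(group)

group.append("Lia")     # adds "Lia" to the end of the list
print(group)
```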


### 2.1.3 Indexing

Lists store data in order.

We can select a single element of a list using its index.

You are familiar with this process; it is the same as selecting individual characters of a `string`:

In [49]:
a = "string"
b = a[1]
print(b)

t

In [50]:
first_member = lab_group0[0]
third_member = lab_group0[2]
print(first_member, third_member)

Mari Lia


Indices can be useful when looping through the items in a list.

In [53]:
# We can express the following for loop:
for i in lab_group0:
    print(i)
    
# as:
for i in range(len(lab_group0)):
    print(lab_group0[i])

Mari
Sara
Lia
Mari
Sara
Lia


In this case:
 - the first value in the range is 0.
 - the last value in the range is (list length - 1). 
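
Looping over indices is needed when we want to change the elements of a list, not just read them. A small sketch with made-up values:

```python
values = [1.0, 2.5, 4.0]

for i in range(len(values)):
    values[i] = 2 * values[i]  # overwrite element i with double its value

print(values)
```

Looping with `for v in values:` and assigning to `v` would not change the list, because `v` is only a temporary name for each element.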

Indexing can be useful for numerical computations. 

__Example: The dot product of two vectors:__

__Vector:__ A quantity with magnitude and direction.

Position vectors (or displacement vectors) in 3D space can always be expressed in terms of x,y, and z-directions.  

<img src="../../../ILAS_seminars/intro to python/img/3d_position_vector.png" alt="Drawing" style="width: 250px;"/>

The position vector 𝒓 indicates the position of a point in 3D space.

 - 𝒊 is the displacement one unit in the x-direction
 - 𝒋 is the displacement one unit in the y-direction
 - 𝒌 is the displacement one unit in the z-direction

The scalar coefficients 𝑥, 𝑦, 𝑧 give us the position vector 𝒓 = 𝑥𝒊 + 𝑦𝒋 + 𝑧𝒌

The ordered basis of a vector, 𝒓,  in 3D space is [𝒊,𝒋,𝒌]

The brackets [] indicate the order.

So we often express the position vector 𝒓 as $\mathbf{r} = [x, y, z]$

...which looks a lot like a Python list!

3D vectors are used to describe many physical quantities, such as force.

The __dot product__ is a really useful algebraic operation that takes two equal-length sequences of numbers (usually coordinate vectors) and returns a single number.

It can be expressed mathematically as:

__GEOMETRIC REPRESENTATION__

\begin{align}
\mathbf{A} \cdot \mathbf{B} = |\mathbf{A}| |\mathbf{B}| \cos(\theta)
\end{align}

<img src="../../../ILAS_seminars/intro to python/img/dot_product.gif" alt="Drawing" style="width: 250px;"/>

$|\mathbf{B}| \cos(\theta)$ is the component of $\mathbf{B}$ acting in the direction of $\mathbf{A}$.

__ALGEBRAIC REPRESENTATION__


>The dot product of two $n$-length-vectors is: 


\begin{align}
\mathbf{A} \cdot \mathbf{B} = \sum_{i=1}^n A_i B_i.
\end{align}

>For a three-vector consisting of $x$, $y$, and $z$ components,
>
\begin{align}
\mathbf{A} \cdot \mathbf{B} &= \sum_{i=1}^n A_i B_i \\
&= A_x B_x + A_y B_y + A_z B_z.
\end{align}

__Example:__ Dot product of vectors $\mathbf{A}$ = [1, 3, −5] and $\mathbf{B}$ = [4, −2, −1]:


\begin{align}
[1,3,-5]\cdot [4,-2,-1] &= (1)(4)+(3)(-2)+(-5)(-1)\\
&= 4-6+5\\
&= 3
\end{align}

We can solve this very easily using Python.



In [4]:
A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

# Compute dot-product
dot_product = 0.0

for i in range(len(A)):
    dot_product += A[i]*B[i]

print(dot_product)

3.0


By considering the __GEOMETRIC__ representation of the dot product we can see that this simple piece of code can allow us to quickly solve many engineering-related problems:
 - Test if two vectors are perpendicular (if perpendicular the dot product is equal to 0).
 - Find the angle between two vectors (from its cosine).
 - Find the magnitude of one vector in the direction of another
 (e.g. resolving forces into their component directions).   
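
The first two uses in the list above can be sketched with the same vectors `A` and `B` as before. The sketch assumes the `math` module from the Python standard library for the square root and inverse cosine; the vectors `C` and `D` and the other variable names are chosen here for illustration.

```python
import math

A = [1.0, 3.0, -5.0]
B = [4.0, -2.0, -1.0]

dot_product = 0.0
A_squared = 0.0  # will hold |A| squared
B_squared = 0.0  # will hold |B| squared

for i in range(len(A)):
    dot_product += A[i] * B[i]
    A_squared += A[i] * A[i]
    B_squared += B[i] * B[i]

# Rearranging A.B = |A||B|cos(theta) gives cos(theta)
cos_theta = dot_product / (math.sqrt(A_squared) * math.sqrt(B_squared))
theta_degrees = math.degrees(math.acos(cos_theta))

print("cos(theta):", cos_theta)
print("Angle in degrees:", theta_degrees)

# Perpendicular test: unit vectors along the x and y directions
C = [1.0, 0.0, 0.0]
D = [0.0, 1.0, 0.0]

dot_CD = 0.0
for i in range(len(C)):
    dot_CD += C[i] * D[i]

print("C and D are perpendicular:", dot_CD == 0.0)
```

For vectors with non-integer components, rounding errors can leave a dot product that is very small but not exactly zero, so comparing against a small tolerance is usually safer than testing `== 0.0`.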


__Example:__

When an object is pushed at an angle to its direction of motion, the dot product can tell us:

i) how much work is done by pushing the object for 10 m;

ii) how much of the force is actually helping the object move.

<img src="../../../ILAS_seminars/intro to python/img/dot_product.gif" alt="Drawing" style="width: 250px;"/>