# 7PAVITPR: Introduction to Statistical Programming
# Python practical 6

_Angus Roberts<br/>
Department of Biostatistics and Health Informatics<br/>
Institute of Psychiatry, Psychology and Neuroscience<br/>
King's College London<br/>_


# Flow of control: if, while, for, and range

Python's main control statements are:
- `if`
- `while`
- `for`

`if` and `while` are very similar to their counterparts in other languages. `for` behaves a little differently, as we will explain later.

We will also cover two built-in functions that can be used to extend the use of `for`:
- `range()`
- `enumerate()`

Python also has statements to break out of and force continuation of `while` and `for` control loops. We will not cover these - see the [Python tutorial](https://docs.python.org/3/tutorial/controlflow.html#break-and-continue-statements-and-else-clauses-on-loops) for details.



### Indentation

When defining flow of control, we need to know which statements a loop (`for` or `while`) or a condition (`if`) applies to. R and some other languages delineate these blocks of code with pairs of curly braces `{}`:

`while (test_1) {
  statement_a
  if (test_2) {
    statement_b
    statement_c
  }
}`

In these languages:

- Each control statement (in this case `while` and `if`) applies to everything in the braces that follow it.
- Any whitespace or indentation before each line is not necessary, and has no meaning to the program interpreter. It is a convention, and is there only to aid readability.

In Python, whitespace matters - it is meaningful.

- No braces are used
- Instead the amount of indentation determines the block of code
- All lines with the same amount of indentation are in the same block
- A colon is used at the end of a test, to signal that an associated block starts on the next line

The above R code in Python is:

`while test_1:
  statement_a
  if test_2:
    statement_b
    statement_c`
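
As a small runnable sketch of the same shape (the variable names `count` and `log` are made up for illustration), the code below records which lines run in which block. The two lines indented under the `if` only run when its test is true, and the final unindented line runs once, after the loop has finished.

```python
count = 0
log = []                      # hypothetical list used to record what ran

while count < 3:
    log.append('in loop')     # indented once: part of the while block
    if count == 1:
        log.append('in if')        # indented twice: part of the if block
        log.append('also in if')   # same indentation, so same block
    count = count + 1         # back to one level: while block, but not the if block

log.append('after loop')      # no indentation: runs once the loop has ended
print(log)
```

Try changing the indentation of the `'also in if'` line by one level and predict how the output changes before running it.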
    

## <font color=green>💬 Discussion point</font>
Can you think of any advantages or disadvantages with Python's meaningful whitespace approach?

### If statements

The simplest form of the `if` in Python is:

`if test:
  statement
else:
  statement`
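
A minimal sketch of this form, using a made-up temperature reading and threshold rather than user input:

```python
temperature = 38.2            # hypothetical reading, for illustration only

if temperature > 37.5:        # the test, followed by a colon
    status = 'fever'          # runs only when the test is true
else:
    status = 'normal'         # runs only when the test is false

print(status)
```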
  
## <font color=green>❓ Question</font>

Write an if statement to take an integer input from a user, and print out whether the integer is even or odd. You can make use of the following:

- the `input()` function, which takes input from the user (STDIN)
- the `int()` function, which casts a value to an integer
- the `%` operator, which finds the remainder
- the fact that zero has a truth value of false
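
The last two hints can be explored on their own before you combine them. This sketch, with arbitrary numbers, shows the remainder operator and how Python treats zero and non-zero integers as truth values:

```python
print(7 % 2)          # remainder when 7 is divided by 2
print(8 % 2)          # remainder when 8 is divided by 2
print(bool(7 % 2))    # a non-zero integer counts as true
print(bool(8 % 2))    # zero counts as false

remainder = 9 % 4     # arbitrary example values
if remainder:         # no comparison needed: the integer itself is the test
    message = 'non-zero remainder'
else:
    message = 'no remainder'
print(message)
```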


## <font color=green>⌨️ Your answer</font>



In [None]:
# Uncomment and complete the lines below to give your answer

value = input("integer please")
value = int(value)
if value % 2:
    print('odd')
else:
    print('even')



__Multiple conditions in an if statement__

What if we want to carry out multiple tests? We could nest our if statements, like this:



In [None]:
value = input('Enter a character: ')
if value == 'c':
    print('cytosine')
else:
    if value == 'g':
        print('guanine')
    else:
        if value == 't':
            print('thymine')
        else:
            print('something else')

But this will get tedious and hard to read very quickly. Instead, we can chain lots of tests together with `elif`, which is similar to `else if` in R and Java. Run the example below. Note how the final `else` catches any value not covered by the previous tests:

In [None]:
value = input('Enter a character: ')
if value == 'c':
    print('cytosine')
elif value == 'g':
    print('guanine')
elif value == 't':
    print('thymine')
else:
    print('something else')
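
One thing to keep in mind with a chain of tests is that Python checks them from top to bottom and runs only the first branch whose test is true. The sketch below uses a made-up `reading` and made-up thresholds to show how the order of the tests changes the result:

```python
reading = 85                      # hypothetical value, for illustration only

# Tests in an unhelpful order: the first test is true, so later ones are never tried
if reading > 50:
    label = 'high'
elif reading > 80:
    label = 'very high'           # never reached for any reading
else:
    label = 'normal'
print(label)

# Most specific test first
if reading > 80:
    label_fixed = 'very high'
elif reading > 50:
    label_fixed = 'high'
else:
    label_fixed = 'normal'
print(label_fixed)
```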

## <font color=green>❓ Question</font>

Write an if / elif / else statement to take an integer input from a user, and print out whether the integer is zero, negative, positive even or positive odd.

## <font color=green>⌨️ Your answer</font>

In [None]:
# Uncomment and complete the lines below to give your answer

# value = 
# if test:
#    print('zero')
# elif test:
#    print('negative')
# etc etc
# else:
# etc etc

value = input("integer please: ")
value = int(value)
if not value:
    print('zero')
elif value < 0:
    print('negative')
elif value % 2:
    print('positive odd')
else:
    print('positive even')



### While loops

`while` loops allow you to carry out some block of statements until a condition is no longer met. This is called the _termination condition_. They are most useful when you don't know in advance how many times you will need to repeat some statements, for example when handling data being streamed from some device, or when taking user input. They are less useful when you know how many data items you have before looping (for this, see the `for` loop below). Consequently, you may not use them often.

`while` loops make use of the same syntax as `if` statements:
- the keyword `while` followed by a condition
- a colon `:` after the condition
- indentation of the body of the loop
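
A minimal sketch of these three elements, using a made-up countdown so that no user input is needed:

```python
countdown = 3                     # hypothetical starting value
steps = []

while countdown > 0:              # keyword, condition, colon
    steps.append(countdown)       # indented body of the loop
    countdown = countdown - 1     # without this line the loop would never end

print(steps)
```

The body has to change something that the condition depends on, otherwise the condition stays true and the loop runs forever.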


Here's an example:



In [None]:
values = list()
print('Enter values.')
print('Enter an empty line to quit.')

value = input('Next value: ')    # initialise
while value:                     # termination condition
  values.append(value)
  value = input('Next value: ')  # reset


print('\nThe values were:')
print(values)

## <font color=green>💬 Discussion point</font>
What roles is the variable `value` playing in the above code?

## <font color=green>❓ Question</font>
- An app sends a stream of characters to some cloud based program as follows:
  - characters '1', '2', '3', '4', or '5' indicating user mood scores collected over the last day
  - the character 'z' to indicate all scores have been sent, and that the program should compute the mean
- Using keyboard input to simulate the stream of characters from the app, write the program that computes the mean


## <font color=green>⌨️ Your answer</font>


In [None]:
### Complete the code below, replacing the four REPLACE words with your own code


prompt = 'Next mood score (z to finish): '  # we will use this prompt multiple times

score = input(prompt)                       # get the first score
sum = 0                                     # initialise the sum
count = 0                                   # initialise the count
while score != 'z':                         # test for terminating condition
    sum = sum + int(score)                  # calculate the sum
    count = count + 1                       # increment the count
    score = input(prompt)                   # get the next score
    
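if count:                                   # guard against dividing by zero
    print('Mean score: ', sum/count)        # print out the answer
else:
    print('No scores were entered')         # 'z' was the first character sent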


## <font color=green>💬 Discussion point</font>
In your code above,
- why has the prompt string been placed in a variable, instead of using it directly in the `input()` statements?
- why does the `sum` variable have to be initialised?
- why are there two `input()` statements - couldn't we use one?

### For loops

`for` loops in Python have very similar functionality to those in R, in that they iterate over the items in any sequence. As you might expect, the syntax follows the same pattern as for `while` loops:

`for item in sequence:
   statement
   statement
   ...`
   
The `in` keyword indicates an assignment. Each item from the sequence is assigned to a variable, in turn. Here's an example for you to try:
   
   

In [None]:
readings = ['+++++','++','++', '+++']
for item in readings:
  print(len(item))
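
Lists are not the only sequences that `for` can iterate over. As a small sketch, assuming a made-up DNA string, the loop below visits each character of a string in turn and counts the G and C bases:

```python
sequence = 'GATTACA'              # hypothetical string, for illustration only
gc_count = 0

for base in sequence:             # each character is assigned to base in turn
    if base == 'G' or base == 'C':
        gc_count = gc_count + 1

print(gc_count)
```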

## <font color=green>❓ Question</font>

You have a list of integer results from some instrument run. On every instrument run, the first four and last four results in the list are artefacts and are discarded. Write a piece of code to count the number of remaining results over 50, and the number that are 50 or less. (___Hint:___ use list slices)
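
As a reminder of how slices behave, here is a short sketch on a made-up list of letters. A negative index counts back from the end of the list, and slicing returns a new list, leaving the original unchanged:

```python
letters = ['a', 'b', 'c', 'd', 'e', 'f']   # hypothetical list, for illustration only

without_first = letters[1:]       # everything except the first item
without_last = letters[:-1]       # everything except the last item
middle = letters[1:-1]            # everything except the first and last items

print(without_first)
print(without_last)
print(middle)
print(letters)                    # the original list is untouched
```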

## <font color=green>⌨️ Your answer</font>



In [None]:
# Complete the code below to give your answer

# Here are the results
results = [0, 0, 1, 1, 55, 67, 42, 78, 34, 56, 2, 2, 0, 0]

big_count = 0        # Initialise the count of big results (over 50) 
small_count = 0      # Initialise the count of small results (50 or less)

# Iterate over all of the list except first and last four values
for r in results[4:-4]:
    # increment the count variables according to the value of r
    if r > 50:
        big_count = big_count + 1
    else:
        small_count = small_count + 1

# Print out the results
print('Big:', big_count)
print('Small:', small_count)
    

## <font color=green>💬 Discussion point</font>
Take a look at the code below, and run it.

- What type of assignment is happening here?
- What do you think the built-in `zip` function is doing?


In [None]:
list_a = [12, 50, 45]
list_b = [3, 10, 5]

for a, b in zip(list_a, list_b):
    print(a/b)

## <font color=green>💬 Discussion point</font>

Take a look at the code below, and run it.

- What is the expression `items[:]` doing?
- Why do you think this is necessary?
- Try this thought experiment, walking through the code in your head (i.e. don't try running this for real!): what would happen if you used this as your `for` statement in the same code?
  - `for item in items:`

In [None]:
items = ['+++++','++','++', '+++']
for item in items[:]:
    if len(item) > 3:
        items.append('☺️' + item)
print(items)

#### Using range() with for loops

All of the examples of `for` that we have looked at above assume that we already have some sequence to iterate over. What if we have no sequence, but we know that we need to execute some statements a fixed number of times?

- We can use `range()` for this.
- `range(n)` creates the sequence [0, 1, 2, 3, ..., n-1] 
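
Strictly speaking, `range(n)` returns a range object rather than a list, so printing it directly does not show the individual numbers. A quick way to inspect its contents is to convert it with `list()`, as in this sketch:

```python
numbers = range(5)
print(numbers)                    # shows the range object, not its contents

first_five = list(numbers)        # convert to a list to see each value
print(first_five)
print(len(numbers))               # a range knows its own length
```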

For example, try running the code below:

In [None]:
for i in range(5):
    print(i*'+')

## <font color=green>❓ Question</font>

`range()` is described in the Python documentation [here](https://docs.python.org/3/library/stdtypes.html#range). Take a look. Now write a piece of code, using `for` and `range()`, that prints out the cubes (third powers) of the even numbers from 4 to 16. Make sure it prints the last one, 16 cubed!

## <font color=green>⌨️ Your answer</font>


In [None]:
# Type your answer here, replacing the REPLACE words:
for r in range(4,17,2):
    print(r**3)

#### Using enumeration with for loops

Often when looping over sequences, we want to make use of not only the current item, but also its numeric position. We can do this with the `enumerate()` function.

Try the following code:

In [None]:
bases = ['C', 'G', 'G', 'C', 'C', 'T', 'A', 'T', 'A']
for value in enumerate(bases):
    print(value)

This is what is happening:

- `enumerate()` iterates over the list.
- It creates a _tuple_ for each item.
- The _tuple_ consists of an int for the index of the item, and the item itself.

(A tuple is an immutable Python sequence.)
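
To look at these tuples more closely, one option is to collect them in a list and index into one of them. This is only a sketch, using a shortened, made-up list of bases:

```python
bases = ['C', 'G', 'T']               # shortened list, for illustration only

pairs = list(enumerate(bases))        # collect every (index, item) tuple
print(pairs)

first = pairs[0]                      # a single tuple
print(first[0])                       # the index part
print(first[1])                       # the item part
```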

## <font color=green>❓ Question</font>

Write a new version of the above code, using double assignment to get the index and the item into separate variables, and print them both.

## <font color=green>⌨️ Your answer</font>

In [None]:
# Edit the below code, changing the REPLACE words to give your answer

bases = ['C', 'G', 'G', 'C', 'C', 'T', 'A', 'T', 'A']
for i,v in enumerate(bases):
    print('Index:', i, 'Base:', v)