# Part 5. Making Choices
Adapted from [Programming with Python](http://swcarpentry.github.io/python-novice-inflammation/), [copyright © Software Carpentry](http://swcarpentry.github.io/python-novice-inflammation/license/), under the Creative Commons license [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/).

### Questions
* How can my programs do different things based on data values?  

### Objectives
* Write conditional statements including `if`, `elif`, and `else` branches.
* Correctly evaluate expressions containing `and` and `or`.

In our last lesson, we discovered something suspicious was going on in our inflammation data by drawing some plots. How can we use Python to automatically recognize the different features we saw, and take a different action for each? In this lesson, we’ll learn how to write code that runs only when certain conditions are true.


## Conditionals

We can ask Python to take different actions, depending on a condition, with an `if` statement:

In [None]:
num = 37
if num > 100:
    print('greater')
else:
    print('not greater')
print('done')

The second line of this code uses the keyword `if` to tell Python that we want to make a choice.
If the test that follows the `if` keyword is true,
the body of the `if`
(i.e., the lines indented underneath it) is executed.
If the test is false,
the body of the `else` is executed instead.
Only one or the other is ever executed:

![Executing a Conditional](images/python-flowchart-conditional.svg)

Conditional statements don't have to include an `else`.
If there isn't one,
Python simply does nothing if the test is false:

In [None]:
num = 53
print('before conditional...')
if num > 100:
    print('53 is greater than 100')
print('...after conditional')

We can also chain several tests together using `elif`,
which is short for "else if".
The following Python code uses `elif` to print the sign of a number.

In [None]:
num = -3

if num > 0:
    print(num, "is positive")
elif num == 0:
    print(num, "is zero")
else:
    print(num, "is negative")

One important thing to notice in the code above is that we use a double equals sign `==` to test for equality
rather than a single equals sign
because the latter is used to mean **assignment**.
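
As a minimal sketch of that difference (the variable names `count`, `is_five`, and `is_six` are made up for illustration), a single `=` stores a value, while `==` compares two values and produces `True` or `False`:

```python
count = 5              # single '=' assigns the value 5 to count
is_five = count == 5   # double '==' compares count with 5
is_six = count == 6    # the comparison does not change count

print(is_five)  # True
print(is_six)   # False
print(count)    # 5
```

Writing `if count = 5:` by mistake is a `SyntaxError` in Python, so this slip is caught before the code runs.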

We can also combine tests using `and` and `or`.
`and` is only true if both parts are true:

In [None]:
if (1 > 0) and (-1 > 0):
    print('both parts are true')
else:
    print('at least one part is not true')

while `or` is true if at least one part is true:

In [None]:
if (1 < 0) or (-1 < 0):
    print('at least one test is true')
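
To see every combination at once, the sketch below prints a small truth table for `and` and `or`. The names `x`, `y`, and `rows` are illustrative only.

```python
rows = []
for x in (True, False):
    for y in (True, False):
        # record and print both results for this pair of values
        rows.append((x, y, x and y, x or y))
        print(x, y, '| and:', x and y, '| or:', x or y)
```

Only the first row (`True`, `True`) makes `and` true, and only the last row (`False`, `False`) makes `or` false.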

## Checking our Data

Now that we've seen how conditionals work,
we can use them to check for the suspicious features we saw in our inflammation data.
In the first couple of plots, the maximum inflammation per day
seemed to rise like a straight line, one unit per day.
We can check for this inside the `for` loop we wrote with the following conditional:

```python
if data.max(axis=0)[0] == 0 and data.max(axis=0)[20] == 20:
    print('Suspicious looking maxima!')
```

We also saw a different problem in the third dataset;
the minima per day were all zero (looks like a healthy person snuck into our study).
We can also check for this with an `elif` condition:

```python
elif data.min(axis=0).sum() == 0:
    print('Minima add up to zero!')
```

And if neither of these conditions is true, we can use `else` to give the all-clear:

```python
else:
    print('Seems OK!')
```

Let's test that out:

In [None]:
import numpy as np

In [None]:
data = np.loadtxt(fname='./data/inflammation-01.csv', delimiter=',')

if data.max(axis=0)[0] == 0 and data.max(axis=0)[20] == 20:
    print('Suspicious looking maxima!')
elif data.min(axis=0).sum() == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')

In [None]:
data = np.loadtxt(fname='./data/inflammation-03.csv', delimiter=',')

if data.max(axis=0)[0] == 0 and data.max(axis=0)[20] == 20:
    print('Suspicious looking maxima!')
elif data.min(axis=0).sum() == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')

In this way,
we have asked Python to do something different depending on the condition of our data.
Here we printed messages in all cases,
but we could also imagine not using the `else` catch-all
so that messages are only printed when something is wrong,
freeing us from having to manually examine every plot for features we've seen before.
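
A rough sketch of that variant follows. To keep it runnable without the data files, it uses made-up per-day summaries (plain lists standing in for `data.max(axis=0)` and `data.min(axis=0)`); the dataset names and the `flagged` list are assumptions for illustration.

```python
# Each entry: (hypothetical name, maxima per day, minima per day)
summaries = [
    ('rising-maxima', list(range(40)), [1] * 40),
    ('zero-minima', [5] * 40, [0] * 40),
    ('clean', [5] * 40, [1] * 40),
]

flagged = []
for name, max_per_day, min_per_day in summaries:
    if max_per_day[0] == 0 and max_per_day[20] == 20:
        print(name + ': Suspicious looking maxima!')
        flagged.append(name)
    elif sum(min_per_day) == 0:
        print(name + ': Minima add up to zero!')
        flagged.append(name)
    # no else branch, so a dataset that passes both tests prints nothing

print(flagged)
```

Only the first two made-up datasets produce a message; the third passes silently.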

## Exercises

Consider this code:

```python
if 4 > 5:
    print('A')
elif 4 == 5:
    print('B')
elif 4 < 5:
    print('C')
```

Which of the following would be printed if you were to run this code? Why did you pick this answer?

1. A
2. B
3. C
4. B and C

### What is truth?

`True` and `False` are special words in Python called Booleans, which represent true
and false statements. However, they aren't the only values in Python that are true and false.
In fact, *any* value can be used in an `if` or `elif`.
After reading and running the code below,
explain what the rule is for which values are considered true and which are considered false.

In [None]:
if '':
    print('empty string is true')

In [None]:
if 'word':
    print('word is true')

In [None]:
if []:
    print('empty list is true')

In [None]:
if [1, 2, 3]:
    print('non-empty list is true')

In [None]:
if 0:
    print('zero is true')

In [None]:
if 1:
    print('one is true')

### That’s Not Not What I Meant

Sometimes it is useful to check whether some condition is not true. The Boolean operator `not` can do this explicitly. After reading and running the code below, write some `if` statements that use `not` to test the rule that you formulated in the previous challenge.

In [None]:
if not '':
    print('empty string is not true')

In [None]:
if not 'word':
    print('word is not true')

In [None]:
if not not True:
    print('not not True is true')

### Close Enough
Write some conditions that print `True` if the variable `a` is within 10% of the variable `b` and `False` otherwise. That is, is the absolute difference between them less than 10% of `b`?

You can use the built-in function `abs` to get the absolute value of the difference of two numbers.

Compare your implementation with a partner: do you get the same answer for all possible pairs of numbers?
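
One reason two implementations can disagree is the choice of which variable the 10% refers to. The sketch below is one possible reading, not the only valid answer; the values of `a` and `b` and the names `within_b` and `within_a` are chosen purely for illustration.

```python
a = 200
b = 181

difference = abs(a - b)   # 19

# Reading 1: the difference is less than 10% of b (18.1)
within_b = difference < 0.1 * abs(b)
# Reading 2: the difference is less than 10% of a (20.0)
within_a = difference < 0.1 * abs(a)

print(within_b)  # False
print(within_a)  # True
```

The same pair of numbers passes under one reading and fails under the other, so it is worth stating which one you mean.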

### In-Place Operators

Python (and most other languages in the C family) provides **in-place operators**
that work like this:

In [None]:
x = 1  # original value
x += 1 # add one to x, assigning result back to x
x *= 3 # multiply x by 3

print(x)

Write some code that sums the positive and negative numbers in a list separately, using in-place operators. Do you think the result is more or less readable than writing the same without in-place operators?
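
One possible sketch of this exercise, assuming a small made-up list called `numbers`:

```python
numbers = [-3, 7, 2, -8, 5]   # made-up example values

positive_sum = 0
negative_sum = 0
for n in numbers:
    if n > 0:
        positive_sum += n   # same as positive_sum = positive_sum + n
    else:
        negative_sum += n   # a zero would land here and change nothing

print(positive_sum, negative_sum)  # 14 -11
```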

### Sorting a List Into Buckets

The folder containing our data files has large data sets whose names start with “inflammation-”, small ones whose names start with “small-”, and possibly other files whose sizes we don’t know. Our goal is to sort those files into three lists called `large_files`, `small_files`, and `other_files` respectively. Add code to the template below to do this. Note that the string method `startswith` returns `True` if and only if the string it is called on starts with the string passed as an argument.

Your solution should:

* loop over the names of the files
* figure out which group each filename belongs to
* append the filename to that list

In [None]:
from glob import glob
filenames = glob('./data/*')
large_files = []
small_files = []
other_files = []

# loop over every path found in the data folder
for f in filenames:
    # if 'inflammation' is in the current filename, append it to large_files
    if 'inflammation' in f:
        large_files.append(f)
    # else if 'small' is in the current filename, append it to small_files
    elif 'small' in f:
        small_files.append(f)
    # otherwise, append it to other_files
    else:
        other_files.append(f)
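
The cell above tests whether a word appears anywhere in the path. A variant that uses `startswith`, as described in the exercise text, might look like the sketch below. Because `glob('./data/*')` returns paths that begin with the folder name, the sketch first strips the folder with `os.path.basename`. The paths in `example_paths` are hypothetical and are used so the code runs without the data folder.

```python
import os

# Hypothetical paths standing in for the result of glob('./data/*')
example_paths = [
    './data/inflammation-01.csv',
    './data/small-01.csv',
    './data/notes.txt',
    './data/small-inflammation-copy.csv',
]

large_files = []
small_files = []
other_files = []

for path in example_paths:
    name = os.path.basename(path)   # e.g. 'small-01.csv'
    if name.startswith('inflammation-'):
        large_files.append(path)
    elif name.startswith('small-'):
        small_files.append(path)
    else:
        other_files.append(path)

print(large_files)
print(small_files)
print(other_files)
```

With this approach the last made-up path counts as a small file, because its name starts with “small-”. A test using `in` would have classed it as large.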

### Counting Bases
Write a loop that counts the number of bases in a DNA sequence.

In [None]:
dna_sequence = 'CGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGACGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCGCGTACCGTCGACGATGCTACGATCGTCGATCGTAGTCGATCATCGATCGATCGCGATCGATCGATATCGATCGATATCATCGATGCATCGATCATCGATCGATCGATCGATCGACGATCGATCGATCGTAGCTAGCTAGCTAGATCGATCATCATCGTAGCTAGCTCGACTAGCTACGTACGATCGATGCATCGATCGTACGATCGATCGATCGATCGATCGATCGATCGATCGATCGATCGTAGCTAGCTACGATCG'

In [None]:
# initialize counters, e.g. G = 0
G = 0
A = 0
T = 0
C = 0
err = 0
# loop through the bases in the DNA sequence
for base in dna_sequence:
    # if the base is G, increment the G counter by 1,
    # and do the same for each of the other bases
    if base == 'G':
        G += 1
    elif base == 'A':
        A += 1
    elif base == 'T':
        T += 1
    elif base == 'C':
        C += 1
    else:
        # anything else is unrecognized: report it and count it as an error
        print('unrecognized base')
        err += 1

## Key Points
* Use `if condition` to start a conditional statement, `elif condition` to provide additional tests, and `else` to provide a default.
* The bodies of the branches of conditional statements must be indented.
* Use `==` to test for equality.
* `X and Y` is only true if both `X` and `Y` are true.
* `X or Y` is true if either `X` or `Y`, or both, are true.
* Zero, the empty string, and the empty list are considered false; all other numbers, strings, and lists are considered true.
* Nest loops to operate on multi-dimensional data.
* Put code whose parameters change frequently in a function, then call it with different parameter values to customize its behavior.