<a href="https://colab.research.google.com/github/lucywowen/csci191_ProgSci/blob/main/workshops/conditionals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Making choices

In our last workshop, we discovered something suspicious was going on in our inflammation data by drawing some plots. How can we use Python to automatically recognize the different features we saw, and take a different action for each? In this lesson, we’ll learn how to write code that runs only when certain conditions are true.

## Conditionals

We can ask Python to take different actions, depending on a condition, with an `if` statement:

In [1]:
num = 37 ## Try changing this value and see what happens!
if num > 100:
    print('greater')
else:
    print('not greater')
print('done')

not greater
done


The second line of this code uses the keyword `if` to tell Python that we want to make a choice. If the test that follows `if` is true, the body of the `if` (i.e., the set of lines indented underneath it) is executed, and “greater” is printed. If the test is false, the body of the `else` is executed instead, and “not greater” is printed. Only one or the other is ever executed before continuing on with program execution to print “done”:

<p align="center">
  <img src="https://swcarpentry.github.io/python-novice-inflammation/fig/python-flowchart-conditional.png" width="450" />
</p>

Conditional statements don’t have to include an `else`. If there isn’t one, Python simply does nothing if the test is false:

In [4]:
num = 53
print('before conditional...')
if num > 100:
    print(num, 'is greater than 100')
print('...after conditional')

before conditional...
...after conditional


We can also chain several tests together using `elif`, which is short for “else if”. The following Python code uses `elif` to print the sign of a number.

In [5]:
num = -3

if num > 0:
    print(num, 'is positive')
elif num == 0:
    print(num, 'is zero')
else:
    print(num, 'is negative')

-3 is negative


Note that to test for equality we use a double equals sign `==` rather than a single equals sign `=`, which is used to assign values.

Along with the `>` and `==` operators we have already used for comparing values in our conditionals, there are a few more options to know about:

- `>`: greater than
- `<`: less than
- `==`: equal to
- `!=`: does not equal
- `>=`: greater than or equal to
- `<=`: less than or equal to
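
Each of these operators produces a boolean result that can be printed directly. The following is a minimal sketch; the variables `a` and `b` and their values are chosen here purely for illustration and are not part of the inflammation data:

```python
# Illustrative values only.
a = 3
b = 5

print('a > b :', a > b)    # False
print('a < b :', a < b)    # True
print('a == b:', a == b)   # False
print('a != b:', a != b)   # True
print('a >= b:', a >= b)   # False
print('a <= b:', a <= b)   # True

# A comparison result can also be stored in a variable for later use.
not_equal = a != b
```

Try setting `a` and `b` to the same value and see which results change.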

We can also combine tests using `and` and `or`.  `and` is only true if both parts are true:

In [6]:
if (1 > 0) and (-1 >= 0):
    print('both parts are true')
else:
    print('at least one part is false')

at least one part is false


while `or` is true if at least one part is true:

In [7]:
if (1 < 0) or (1 >= 0):
    print('at least one test is true')

at least one test is true


`True` and `False` are special words in Python called booleans, which represent truth values. An expression such as `1 < 0` evaluates to `False`, while `-1 < 0` evaluates to `True`.
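
Because a comparison is just a value, it can be stored in a variable and combined with `and` and `or` like any other boolean. A small sketch, assuming an illustrative value for `num` and the hypothetical names `is_positive` and `is_large`:

```python
num = 37  # illustrative value; try changing it

# Store the result of each test in a variable.
is_positive = num > 0
is_large = num > 100

print(is_positive)               # True
print(is_large)                  # False
print(is_positive and is_large)  # False, because one part is false
print(is_positive or is_large)   # True, because at least one part is true

# A stored boolean can be used directly as the test of an if statement.
if is_positive or is_large:
    print('at least one test is true')
```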

For this workshop, we'll be using all the files in the `data` directory, so go ahead and load all of them in.

In [21]:
from google.colab import files
files.upload()

Saving inflammation-07.csv to inflammation-07 (1).csv
Saving inflammation-01.csv to inflammation-01 (2).csv
Saving inflammation-02.csv to inflammation-02 (1).csv
Saving inflammation-03.csv to inflammation-03 (1).csv
Saving inflammation-04.csv to inflammation-04 (1).csv
Saving inflammation-05.csv to inflammation-05 (1).csv
Saving inflammation-06.csv to inflammation-06 (1).csv
Saving inflammation-08.csv to inflammation-08 (1).csv
Saving inflammation-09.csv to inflammation-09 (1).csv
Saving inflammation-10.csv to inflammation-10 (1).csv
Saving inflammation-11.csv to inflammation-11 (1).csv
Saving inflammation-12.csv to inflammation-12 (1).csv
Saving small-01.csv to small-01.csv
Saving small-02.csv to small-02.csv
Saving small-03.csv to small-03.csv



## Checking out data

Now that we’ve seen how conditionals work, we can use them to check for the suspicious features we saw in our inflammation data. We are about to use functions provided by the `numpy` module again. Therefore, if you’re working in a new Python session, make sure to load the module and data with:


In [9]:
import numpy
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

From the first couple of plots, we saw that maximum daily inflammation exhibits a strange behavior and rises by one unit a day. Wouldn’t it be a good idea to detect such behavior and report it as suspicious? Let’s do that! However, instead of checking every single day of the study, let’s merely check if maximum inflammation at the beginning (day 0) and in the middle (day 20) of the study are equal to the corresponding day numbers.

In [10]:
max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')

Suspicious looking maxima!
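
If we did want to check every day rather than just two, one possible approach is to compare the whole array of daily maxima against the day numbers in a single step. This is only a sketch: `toy_data` is a small made-up array standing in for the real file, and using `numpy.arange` with `numpy.array_equal` is one option among several, not the method used in the rest of this lesson.

```python
import numpy

# Made-up data: 3 patients (rows) by 6 days (columns).
toy_data = numpy.array([[0, 0, 1, 2, 3, 1],
                        [0, 1, 2, 3, 4, 5],
                        [0, 1, 1, 2, 2, 3]])

daily_max = numpy.amax(toy_data, axis=0)    # maximum over patients for each day
day_numbers = numpy.arange(len(daily_max))  # 0, 1, 2, ... one entry per day

# array_equal is True only if every daily maximum matches its day number.
if numpy.array_equal(daily_max, day_numbers):
    print('Suspicious looking maxima on every day!')
else:
    print('Maxima do not simply track the day number.')
```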


We also saw a different problem in the third dataset; the minima per day were all zero (looks like a healthy person snuck into our study). We can also check for this with an `elif` condition:

In [None]:
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    print('Minima add up to zero!')

And if neither of these conditions is true, we can use `else` to give the all-clear:

In [None]:
else:
    print('Seems OK!')

Let’s test that out:

In [11]:
data = numpy.loadtxt(fname='inflammation-01.csv', delimiter=',')

max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')

Suspicious looking maxima!


In [13]:
data = numpy.loadtxt(fname='inflammation-03.csv', delimiter=',')

max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')

Minima add up to zero!


In this way, we have asked Python to do something different depending on the condition of our data. Here we printed messages in all cases, but we could also imagine not using the `else` catch-all so that messages are only printed when something is wrong, freeing us from having to manually examine every plot for features we’ve seen before.
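
As a rough sketch of that idea, the loop below runs the same two checks over several datasets and records a message only when something looks wrong. The arrays in `datasets` are tiny made-up stand-ins for the inflammation files (so day 2 plays the role of day 20), and the names `datasets` and `messages` are introduced here for illustration only.

```python
import numpy

# Made-up stand-ins for the real files: patients in rows, days in columns.
datasets = {
    'rising': numpy.array([[0, 1, 2], [0, 0, 1]]),
    'flat': numpy.array([[0, 0, 0], [1, 2, 1]]),
    'ok': numpy.array([[1, 3, 2], [2, 1, 1]]),
}

messages = []
for name, data in datasets.items():
    daily_max = numpy.amax(data, axis=0)
    daily_min = numpy.amin(data, axis=0)

    if daily_max[0] == 0 and daily_max[2] == 2:
        messages.append(name + ': suspicious looking maxima')
    elif numpy.sum(daily_min) == 0:
        messages.append(name + ': minima add up to zero')
    # No else branch: nothing is recorded when the data look fine.

for message in messages:
    print(message)
```

Only the first two datasets produce a message; the third passes both checks silently.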

### Understand it

Consider this code:

In [None]:
if 4 > 5:
    print('A')
elif 4 == 5:
    print('B')
elif 4 < 5:
    print('C')

Which of the following would be printed if you were to run this code? Why did you pick this answer?

- A
- B
- C
- B and C

`True` and `False` booleans are not the only values in Python that are true and false. In fact, any value can be used in an `if` or `elif`. After reading and running the code below, explain what the rule is for which values are considered true and which are considered false.

In [None]:
if '':
    print('empty string is true')
if 'word':
    print('word is true')
if []:
    print('empty list is true')
if [1, 2, 3]:
    print('non-empty list is true')
if 0:
    print('zero is true')
if 1:
    print('one is true')

Sometimes it is useful to check whether some condition is not true. The Boolean operator `not` can do this explicitly. After reading and running the code below, write some `if` statements that use `not` to test the rule that you formulated in the previous challenge.

In [None]:
if not '':
    print('empty string is not true')
if not 'word':
    print('word is not true')
if not not True:
    print('not not True is true')

Write some conditions that print `True` if the variable `a` is within 10% of the variable `b` and `False` otherwise. Compare your implementation with your neighbor's: do you get the same answer for all possible pairs of numbers?

Here's a hint: There is a built-in function `abs` that returns the absolute value of a number:
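
For instance, `abs` gives the same result for a number and its negative, which makes it handy for measuring how far apart two values are regardless of which one is larger. The numbers below are arbitrary illustrations:

```python
print(abs(-12))  # 12
print(abs(12))   # 12

# The distance between two numbers, whichever order they are given in.
difference = abs(3 - 10)
print(difference)              # 7
print(abs(10 - 3) == difference)  # True
```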

In [16]:
### Try it and compare with a neighbor

Python (and most other languages in the C family) provides in-place operators that work like this:

In [17]:
x = 1  # original value
x += 1 # add one to x, assigning result back to x
x *= 3 # multiply x by 3
print(x)

6
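
Each in-place operator is shorthand for an ordinary assignment that uses the variable on both sides. The sketch below, with illustrative values, shows two more in-place operators (`-=` and `/=`) next to their longer equivalents:

```python
# Using in-place operators.
x = 10
x -= 4  # same as x = x - 4, so x is now 6
x /= 2  # same as x = x / 2, so x is now 3.0 (division gives a float)

# The same steps written out in full.
y = 10
y = y - 4
y = y / 2

print(x, y)    # 3.0 3.0
print(x == y)  # True
```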


Write some code that sums the positive and negative numbers in a list separately, using in-place operators. Do you think the result is more or less readable than writing the same without in-place operators?

In [18]:
### Ok keep going!

In our data folder, large data sets are stored in files whose names start with “inflammation-” and small data sets in files whose names start with “small-”. We also have some other files that we do not care about at this point. We’d like to break all these files into three lists called `large_files`, `small_files`, and `other_files`, respectively.

Add code to the template below to do this. Note that the string method `startswith` returns `True` if and only if the string it is called on starts with the string passed as an argument, that is:

In [19]:
'String'.startswith('Str')

True

And it is case-sensitive:

In [20]:
'String'.startswith('str')

False

Use the following Python code as your starting point:

In [None]:
filenames = ['inflammation-01.csv',
             'inflammation-02.csv',
             'small-01.csv',
             'small-02.csv']
large_files = []
small_files = []
other_files = []  # for names that match neither prefix



Your solution should:

- loop over the names of the files
- figure out which group each filename belongs in
- append the filename to that list

In the end the three lists should be:



In [22]:
large_files = ['inflammation-01.csv', 'inflammation-02.csv']
small_files = ['small-01.csv', 'small-02.csv']
other_files = []  # none of the example filenames fall outside the two groups

In [23]:
### Ok try it!

Here's another one:

- Write a loop that counts the number of vowels in a character string.
- Test it on a few individual words and full sentences.
- Once you are done, compare your solution to your neighbor’s. Did you make the same decisions about how to handle the letter ‘y’ (which some people think is a vowel, and some do not)?


In [24]:
### Try it here

### Key points:

- Use `if condition` to start a conditional statement, `elif condition` to provide additional tests, and `else` to provide a default.
- The bodies of the branches of conditional statements must be indented.
- Use `==` to test for equality.
- `X and Y` is only true if both `X` and `Y` are true.
- `X or Y` is true if either `X` or `Y`, or both, are true.
- Zero, the empty string, and the empty list are considered false; all other numbers, strings, and lists are considered true.
- `True` and `False` represent truth values.
