# Python part 2


# 5. Making choices

In our last lesson, we discovered something suspicious was going on in our inflammation data by drawing some plots. How can we use Python to automatically recognize the different features we saw, and take a different action for each? In this lesson, we’ll learn how to write code that runs only when certain conditions are true.

In [1]:
# if-statement
num = 37
if num > 100: # notice colon
    print('greater') # notice indentation
else:
    print('not greater')
print('done')

not greater
done


In [2]:
num = 54
print('before conditional')
if num > 100:
    print(num, 'is greater than 100')
print('after conditional')

before conditional
after conditional


Think carefully about which cases your conditional statements actually cover, and whether those are the cases you intend to handle.

In [3]:
# Chain multiple if-statements
num = -3

if num > 0:
    print(num, 'is positive')
elif num == 0: 
    print(num, 'is zero')
else:
    print(num, 'is negative')


-3 is negative
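
Only the first branch whose condition is true runs, so the order of the tests matters. The following is a minimal sketch with made-up thresholds (10 and 100 are arbitrary, and the variable names `label` and `label_fixed` are only for illustration):

```python
num = 150

# Order matters: 150 > 10 is true, so the first branch runs
# and the 'elif' below is never reached.
if num > 10:
    label = 'greater than 10'
elif num > 100:
    label = 'greater than 100'
else:
    label = '10 or less'
print(label)

# Testing the most specific condition first gives the intended result.
if num > 100:
    label_fixed = 'greater than 100'
elif num > 10:
    label_fixed = 'greater than 10'
else:
    label_fixed = '10 or less'
print(label_fixed)
```

The first chain reports `greater than 10` even though `num` is also greater than 100, which is probably not what was intended.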


In [4]:
# Difference between == and =
print(num == 4)
num = 4 
print(num == 4)

False
True


In [5]:
# Combine comparisons with 'and'
if (1 > 0) and (-1 >= 0): # best practice to use parentheses
    print('both parts are true')
else:
    print('at least one part is false')

at least one part is false


In [6]:
# Combine with 'or'
if (1 > 0) or (-1 >= 0):
    print('at least one test is true')

at least one test is true


## Checking our Data

1. Let's rerun `inflammation_analysis.ipynb`
1. Discuss the data

Let's catch the suspicious data.

In [7]:
import numpy

In [8]:
# Let's inspect max values of the first dataset
data = numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')

max_inflammation = numpy.max(data, axis=0)
print(max_inflammation)

[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10. 11. 12. 13. 14. 15. 16. 17.
 18. 19. 20. 19. 18. 17. 16. 15. 14. 13. 12. 11. 10.  9.  8.  7.  6.  5.
  4.  3.  2.  1.]


In [9]:
# Let's test whether the maximum is 0 on day 0 and 20 on day 20
if (max_inflammation[0] == 0) and (max_inflammation[20] == 20):
    print('Suspicious looking maxima!')

Suspicious looking maxima!


In [10]:
data = numpy.loadtxt(fname='data/inflammation-03.csv', delimiter=',')

min_inflammation = numpy.min(data, axis=0)
print(min_inflammation)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]


In [11]:
# Let's use sum == 0
if numpy.sum(min_inflammation) == 0:
    print('Minima add up to zero')

Minima add up to zero


### Optional: Let's test all our datasets

1. Find files
1. Create for-loop over filenames
1. Load data
1. Test data

In [12]:
import glob
import numpy

# glob.glob returns files in arbitrary order, so sort them for a stable report
filenames = sorted(glob.glob('data/inflammation*.csv'))
for filename in filenames:
    data = numpy.loadtxt(fname=filename, delimiter=',')
    
    # Data to test
    max_inflammation = numpy.max(data, axis=0)
    min_inflammation = numpy.min(data, axis=0)
    
    if (max_inflammation[0] == 0) and (max_inflammation[20] == 20):
        print('Suspicious looking maxima in:', filename) 
    elif numpy.sum(min_inflammation) == 0:
        print('Minima add up to zero in:', filename) 
    else:
        print(filename, 'looks OK')

Suspicious looking maxima in: data\inflammation-01.csv
Suspicious looking maxima in: data\inflammation-02.csv
Minima add up to zero in: data\inflammation-03.csv
Suspicious looking maxima in: data\inflammation-04.csv
Suspicious looking maxima in: data\inflammation-05.csv
Suspicious looking maxima in: data\inflammation-06.csv
Suspicious looking maxima in: data\inflammation-07.csv
Minima add up to zero in: data\inflammation-08.csv
Suspicious looking maxima in: data\inflammation-09.csv
Suspicious looking maxima in: data\inflammation-10.csv
Minima add up to zero in: data\inflammation-11.csv
Suspicious looking maxima in: data\inflammation-12.csv


# 6. Creating functions


We’d like a way to package our code so that it is easier to reuse, and Python provides for this by letting us define things called ‘functions’.

In [13]:
# Simple example function to convert Fahrenheit to Celsius
def fahr_to_celsius(temp_F):
    temp_C = (temp_F - 32) * (5 / 9)
    return temp_C

In [14]:
temp_in_celsius = fahr_to_celsius(32)
print(temp_in_celsius)

0.0


In [15]:
# Second example to convert Celsius to Kelvin
def celsius_to_kelvin(temp_C):
    temp_K = temp_C + 273.15
    return temp_K

In [16]:
print('freezing point of water in Kelvin:', celsius_to_kelvin(0))

freezing point of water in Kelvin: 273.15


In [17]:
# Combine both functions to convert Fahrenheit to Kelvin
def fahr_to_kelvin(temp_F):
    temp_C = fahr_to_celsius(temp_F)
    temp_K = celsius_to_kelvin(temp_C)
    return temp_K

In [18]:
print('boiling point of water in Kelvin:', fahr_to_kelvin(212))

boiling point of water in Kelvin: 373.15


This is our first taste of how larger programs are built: we define basic operations, then combine them in ever-larger chunks to get the effect we want. Real-life functions will usually be larger than the ones shown here (typically half a dozen to a few dozen lines), but they shouldn’t be much longer than that, or the next person who reads them won’t be able to understand what’s going on. As a general guideline, a function should perform one task.

### Defining defaults

In [19]:
# Example function with default arguments
def display_number(a=1, b=2, c=3):
    print('a:', a, 'b', b, 'c', c)

In [20]:
display_number(50)

a: 50 b 2 c 3


In [21]:
display_number(50, 66)

a: 50 b 66 c 3


In [22]:
display_number(c=35)

a: 1 b 2 c 35


In [23]:
display_number(a=5, c=10)

a: 5 b 2 c 10


In [24]:
help(numpy.loadtxt)

Help on function loadtxt in module numpy:

loadtxt(fname, dtype=<class 'float'>, comments='#', delimiter=None, converters=None, skiprows=0, usecols=None, unpack=False, ndmin=0, encoding='bytes', max_rows=None, *, like=None)
    Load data from a text file.
    
    Each row in the text file must have the same number of values.
    
    Parameters
    ----------
    fname : file, str, or pathlib.Path
        File, filename, or generator to read.  If the filename extension is
        ``.gz`` or ``.bz2``, the file is first decompressed. Note that
        generators should return byte strings.
    dtype : data-type, optional
        Data-type of the resulting array; default: float.  If this is a
        structured data-type, the resulting array will be 1-dimensional, and
        each row will be interpreted as an element of the array.  In this
        case, the number of columns used must match the number of fields in
        the data-type.
    comments : str or sequence of str, optional
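
The signature in the help text explains why we have been writing `delimiter=','`: `fname` has no default and must be supplied, while `dtype` and `comments` come before `delimiter` in the parameter list. Passing `delimiter` by name lets us skip them. A minimal sketch, assuming a small in-memory CSV (made-up values) in place of a data file:

```python
import io
import numpy

# A small in-memory CSV stands in for a file on disk (made-up values).
text = io.StringIO('1,2,3\n4,5,6')

# fname is passed by position; delimiter is passed by name, so
# dtype and comments keep their default values.
data = numpy.loadtxt(text, delimiter=',')
print(data.shape)
print(data.dtype)
```

Because `dtype` was left at its default, the values are loaded as floats.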
  

In [25]:
# Adding a docstring
def display_number(a=1, b=2, c=3):
    ''' This is an example docstring
    
    
    '''
    print('a:', a, 'b', b, 'c', c)

In [26]:
help(display_number)

Help on function display_number in module __main__:

display_number(a=1, b=2, c=3)
    This is an example docstring
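
A docstring is most useful when it says what the function expects and what it returns. The sketch below documents `fahr_to_celsius` using the same Parameters layout seen in the `numpy.loadtxt` help text; this layout is one common convention, not a requirement of Python.

```python
def fahr_to_celsius(temp_F):
    '''Convert a temperature from Fahrenheit to Celsius.

    Parameters
    ----------
    temp_F : float
        Temperature in degrees Fahrenheit.

    Returns
    -------
    float
        Temperature in degrees Celsius.
    '''
    return (temp_F - 32) * (5 / 9)


# help() displays the text that Python stores in the __doc__ attribute
print(fahr_to_celsius.__doc__)
```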



# Refactoring

1. Update `inflammation_analysis.ipynb` by introducing functions
2. Create a new script called `processing.py` and copy the functions there
3. Create a new file called `inflammation_analysis_refactored.ipynb` and import our functions to run the analysis
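
As a starting point, here is one way the checks from lesson 5 could be wrapped in a function. This is only a sketch: the name `detect_problems`, the choice to pass in an already-loaded array, and the returned messages are assumptions, not the workshop's reference solution.

```python
# Possible contents of processing.py (a sketch, not the reference solution)
import numpy


def detect_problems(data):
    '''Return a short message describing what looks wrong with data.

    data is a 2D array with one row per patient and one column per day.
    '''
    max_inflammation = numpy.max(data, axis=0)
    min_inflammation = numpy.min(data, axis=0)

    if (max_inflammation[0] == 0) and (max_inflammation[20] == 20):
        return 'Suspicious looking maxima'
    elif numpy.sum(min_inflammation) == 0:
        return 'Minima add up to zero'
    else:
        return 'Looks OK'


# Made-up data: 3 patients, 40 days, every row counting 0, 1, 2, ...
fake_data = numpy.array([numpy.arange(40)] * 3)
print(detect_problems(fake_data))
```

In the refactored notebook the function could then be brought in with `from processing import detect_problems`, assuming `processing.py` sits in the same folder as the notebook.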

# 7. Errors and Exceptions

Every programmer encounters errors, both those who are just beginning, and those who have been programming for years. Encountering errors and exceptions can be very frustrating at times, and can make coding feel like a hopeless endeavour. However, understanding what the different types of errors are and when you are likely to encounter them can help a lot. Once you know why you get certain types of errors, they become much easier to fix.

In [27]:
def favorite_ice_cream():
    ice_cream = ['chocolate', 'vanilla', 'strawberry']
    print(ice_cream[3])
    
favorite_ice_cream()

IndexError: list index out of range

In [28]:
# SyntaxError
def some_function()
    msg = 'hello, world'
    print(msg)
     return msg

SyntaxError: invalid syntax (Temp/ipykernel_22352/1007663813.py, line 2)

In [29]:
# NameError
print(aa)

NameError: name 'aa' is not defined

In [30]:
# TypeError
number = 'one'
sum_values = number + 1

TypeError: can only concatenate str (not "int") to str

In [31]:
# File errors
file = open('inflammation-01.csv')

FileNotFoundError: [Errno 2] No such file or directory: 'inflammation-01.csv'
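
The name at the start of each message is the error type, and it is usually the quickest clue to what went wrong. The sketch below triggers four of the errors above in a single cell and prints the type of each. It relies on `try`/`except`, which this lesson does not cover and which is used here only so the cell keeps running after each error; the function names and the file name are made up.

```python
def index_error():
    ice_cream = ['chocolate', 'vanilla', 'strawberry']
    return ice_cream[3]            # valid indices are 0, 1 and 2


def name_error():
    return undefined_variable      # this name was never assigned


def type_error():
    return 'one' + 1               # a str and an int cannot be added


def file_error():
    return open('this-file-does-not-exist.csv')   # hypothetical file name


seen = []
for broken in [index_error, name_error, type_error, file_error]:
    try:
        broken()
    except Exception as error:
        # type(error).__name__ is the label printed before the colon
        seen.append(type(error).__name__)
        print(type(error).__name__, '-', error)
```

A `SyntaxError` is left out because Python reports it before any code in the cell runs.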

# 8. Defensive programming

The previous lessons have introduced the basic tools of programming: variables and lists, file I/O, loops, conditionals, and functions. What they haven’t done is show us how to tell whether a program is getting the right answer, and how to tell if it’s still getting the right answer as we make changes to it.

The bad news is that you will make mistakes. The good news is that knowing this tells us where to start: the first step toward getting the right answers from our programs is to assume that mistakes will happen and to guard against them.

This is called defensive programming, and the most common way to do it is to add **assertions** to our code so that it checks itself as it runs. An assertion is simply a statement that something must be true at a certain point in a program. When Python sees one, it evaluates the assertion’s condition. If it’s true, Python does nothing, but if it’s false, Python halts the program immediately and prints the error message if one is provided. For example, this piece of code halts as soon as the loop encounters a value that isn’t positive:

In [32]:
# Calculate sum of positive numbers
numbers = [1.5, 2.3, 0.7, -0.1, 4.4]
total = 0
for num in numbers:
    assert num > 0, 'Data should only contain positive values'
    total = total + num
print('total is', total)

AssertionError: Data should only contain positive values

### How can we catch empty datasets in our analysis?

In [33]:
import numpy

array = numpy.array([1, 1])
empty_array = numpy.array([])

print(array)
print(empty_array)

print(array.shape[0])
print(empty_array.shape[0])

assert array.shape[0] > 0, 'Expected non-empty array'
assert empty_array.shape[0] > 0, 'Expected non-empty array'

[1 1]
[]
2
0


AssertionError: Expected non-empty array

### Go to inflammation_analysis.ipynb for defensive programming

Write a function `detect_problems_defensive()` that checks pre-conditions (what must be true when the function starts) and post-conditions (what must be true of its result).
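
To show the pattern without giving away the exercise, here is a sketch that applies pre- and post-conditions to a different, hypothetical helper. The function `inflammation_range` and its particular checks are assumptions chosen for illustration.

```python
import numpy


def inflammation_range(data):
    '''Return max - min inflammation for each day (hypothetical helper).'''
    # Pre-conditions: what must be true before we start
    assert data.ndim == 2, 'Expected a 2D array (patients x days)'
    assert data.shape[0] > 0, 'Expected non-empty array'
    assert numpy.all(data >= 0), 'Inflammation values should not be negative'

    result = numpy.max(data, axis=0) - numpy.min(data, axis=0)

    # Post-conditions: what must be true of the result
    assert result.shape[0] == data.shape[1], 'Expected one value per day'
    assert numpy.all(result >= 0), 'A range cannot be negative'
    return result


# Made-up data: 2 patients, 3 days
print(inflammation_range(numpy.array([[0, 1, 2], [0, 3, 1]])))
```

Calling the function with `numpy.array([])` stops at the first assertion, because an empty array created this way has one dimension rather than two.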