# Learning Python Data Analysis

## Making Choices

Setup: https://swcarpentry.github.io/python-novice-inflammation/instructor/index.html#setup

Instruction: https://swcarpentry.github.io/python-novice-inflammation/instructor/07-cond.html

Objectives:
* Write conditional statements including if, elif, and else branches.
* Correctly evaluate expressions containing 'and' and 'or'.

In [None]:
# Python can take different actions depending on a condition by using an if statement:

num = 37
if num > 100:
    print('greater')
else:
    print('not greater')
print('done')

In [None]:
# No else needed...

num = 53
print('before conditional...')
if num > 100:
    print(num, 'is greater than 100')
print('...after conditional')

In [None]:
# chain several tests together using 'elif'

num = -3

if num > 0:
    print(num, 'is positive')
elif num == 0:  # to test for equality, use a double equals sign (==), not a single equals sign (=)
    print(num, 'is zero')
else:
    print(num, 'is negative')

## Comparing in Python
For comparing values in our conditionals we can use the following:

* '>': greater than
* '<': less than
* '==': equal to
* '!=': does not equal
* '>=': greater than or equal to
* '<=': less than or equal to

To combine tests, use 'and' and 'or'. 

In [None]:
# 'and' is only true if both parts are true:

if (1 > 0) and (-1 >= 0):
    print('both parts are true')
else:
    print('at least one part is false')

In [None]:
# 'or' is true if at least one part is true:
if (1 < 0) or (1 >= 0):
    print('at least one test is true')

True and False are special words in Python called booleans, which represent truth values.
An expression such as 1 < 0 evaluates to False, while -1 < 0 evaluates to True.
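
Because a comparison evaluates to a boolean, its result can be stored in a variable and combined later with 'and' and 'or'. The following is a minimal sketch; the variable names are made up for illustration.

```python
# Comparisons evaluate to the booleans True or False
is_less = 1 < 0         # False
is_negative = -1 < 0    # True

print(is_less)
print(is_negative)
print(type(is_less))    # <class 'bool'>

# Stored results can be combined just like the tests themselves
print(is_less and is_negative)   # False: one part is false
print(is_less or is_negative)    # True: at least one part is true
```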

## Checking our Data
Let's use conditions to check for the suspicious features in our inflammation data. We will use functions provided by the numpy module again.

In [None]:
import numpy
data = numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')

# If the maximum daily inflammation rises by exactly one unit per day, that looks suspicious.
# Let's check whether the maximum inflammation at the beginning (day 0) and in the middle (day 20) equals the corresponding day number.
max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')

We also saw a different problem in the third dataset; 
the minima per day were all zero (looks like a healthy person snuck into our study). 
We can also check for this with an 'elif' condition:
```
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    print('Minima add up to zero!')
```

And if neither of these conditions is true, we can use 'else' to give the all-clear:
```
else:
    print('Seems OK!')
```

In [None]:
# Put it all together
data = numpy.loadtxt(fname='data/inflammation-01.csv', delimiter=',')

max_inflammation_0 = numpy.amax(data, axis=0)[0]
max_inflammation_20 = numpy.amax(data, axis=0)[20]

if max_inflammation_0 == 0 and max_inflammation_20 == 20:
    print('Suspicious looking maxima!')
elif numpy.sum(numpy.amin(data, axis=0)) == 0:
    print('Minima add up to zero!')
else:
    print('Seems OK!')
    
# Also try running it on another file, such as the third dataset
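
To run the same checks on several datasets, the conditional can be wrapped in a function. The sketch below is one possible way to do that. The function name `check_data` and the small made-up arrays are assumptions for illustration and are not part of the lesson data. To use it on a real file, pass in the array returned by numpy.loadtxt.

```python
import numpy

def check_data(data):
    """Return a message describing what the checks found (hypothetical helper)."""
    max_per_day = numpy.amax(data, axis=0)
    min_per_day = numpy.amin(data, axis=0)

    if max_per_day[0] == 0 and max_per_day[20] == 20:
        return 'Suspicious looking maxima!'
    elif numpy.sum(min_per_day) == 0:
        return 'Minima add up to zero!'
    else:
        return 'Seems OK!'

# Made-up data: 3 patients, 40 days (not the real inflammation files)
days = numpy.arange(40)
suspicious = numpy.array([days, days, days // 2])                 # daily maxima equal the day number
zero_minima = numpy.array([days + 1, days + 2, numpy.zeros(40)])  # one patient is zero every day
ok = numpy.array([days + 1, days + 2, days + 3])                  # neither problem present

print(check_data(suspicious))
print(check_data(zero_minima))
print(check_data(ok))
```

Returning the message instead of printing it inside the function keeps the check reusable, for example inside a loop over file names.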

In [None]:
# Exercise - Before executing the code, predict what it will print

if 4 > 5:
    print('A')
elif 4 == 5:
    print('B')
elif 4 < 5:
    print('C')

In [None]:
# Exercise - Explain the outcome
if '':
    print('empty string is true')
if 'word':
    print('word is true')
if []:
    print('empty list is true')
if [1, 2, 3]:
    print('non-empty list is true')
if 0:
    print('zero is true')
if 1:
    print('one is true')

In [None]:
# Using the 'not' operator

if not '':
    print('empty string is not true')
if not 'word':
    print('word is not true')
if not not True:
    print('not not True is true')

In [None]:
# check if two values are within 10% of each other
a = 5
b = 5.1

if abs(a - b) <= 0.1 * abs(b):
    print('True')
else:
    print('False')
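
The standard library function math.isclose performs a similar relative-tolerance test. It is not identical to the check above: with rel_tol, the difference is compared against the larger of the two absolute values rather than against abs(b) specifically. A short sketch, reusing the same values:

```python
import math

a = 5
b = 5.1

# Manual check from the cell above: difference relative to b
manual = abs(a - b) <= 0.1 * abs(b)

# math.isclose: difference relative to the larger of |a| and |b|
builtin = math.isclose(a, b, rel_tol=0.1)

print(manual, builtin)

# Two values that differ by far more than 10%
far_apart = math.isclose(5, 10, rel_tol=0.1)
print(far_apart)
```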

In [None]:
# In-place operators
x = 1  # original value
x += 1 # add one to x, assigning result back to x
x *= 3 # multiply x by 3
print(x)

In [None]:
# Exercise - Write some code that sums the positive and negative numbers from the list separately
# print the result
test_list = [3, 4, 6, 1, -1, -5, 0, 7, -8]

In [None]:
'''
Exercise - Sort the following files using the 'String'.startswith('Str') method to populate each of the lists.
Put names starting with 'inflammation-' into the 'large_files' list, names starting with 'small-' into the 'small_files' list,
and put everything else into the 'other_files' list.
'''

# Note: a triple-quoted string like the one above is not a true comment (Python evaluates it and discards the result), but it is a convenient way to write notes that span multiple lines.

filenames = ['inflammation-01.csv',
         'myscript.py',
         'inflammation-02.csv',
         'small-01.csv',
         'small-02.csv']
large_files = []
small_files = []
other_files = []

# Bonus - use 'glob' to get the file names from the 'data' folder!

In [None]:
# Exercise - Write a loop that counts the number of vowels in a 'sentence'.
vowels = 'aeiouAEIOU'
sentence = 'Mary had a little lamb.'