# Modularity and Testing

# How I learned to write code
* Script
* One command after the other
* ... for thousands of lines
* like a book with no formatting, no paragraphs, no chapters

# Modularity

**breaking code into logically grouped pieces that can be strung together in different ways**

# Tools of Modularity:
* packages (e.g. numpy)
* modules (e.g. numpy.random)
* classes (e.g. numpy.ndarray)
* **functions** (e.g. numpy.random.randn)




# Why make your code modular?

1. Organization and Readability
2. Reusability
3. Testability

# 1. Organization and Readability

* Break code into individual thoughts
* Allows reader to get a high level overview and dig deeper if needed

In [1]:
my_list = [1, 2, 5, 7, 9, 4, 5, 4]
# Find the total
total = 0
for num in my_list:
    total = total + num
# Calculate the mean
mean = total / len(my_list)

In [2]:
def calc_total(my_list):
    '''
    Calculate the sum of all elements in a list
    Inputs: 
        my_list: list of numbers
    Outputs:
        total: sum of all elements in the input list
    '''
    total = 0
    for num in my_list:
        total = total + num
    return total

def calc_mean(my_list):
    '''
    calculate the mean of an input list
    Input:
        my_list: a list of numbers
    Output:
        mean: the mean of the input list
    '''
    total = calc_total(my_list)
    list_len = len(my_list)
    mean = total/list_len
    return mean

In [3]:
my_list = [1, 2, 5, 7, 9, 4, 5, 4]
mean = calc_mean(my_list)

# 2. Reusability

* Avoid copying and pasting
    - avoid forgetting to update a variable
    - avoid having to fix bugs in multiple places
* Don't have to reimplement every time
* e.g. once you've written your calc_mean, you can calculate the mean of any list in any piece of code
* Make code more flexible
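
As a small illustration of reuse, the sketch below calls one `calc_mean` on two unrelated lists instead of copying the loop twice. The data sets and the variable names `temperatures` and `rainfall` are made up for this example; the functions repeat the logic of the cells above so the block runs on its own.

```python
def calc_total(my_list):
    """Sum all elements of a list (same logic as calc_total above)."""
    total = 0
    for num in my_list:
        total = total + num
    return total


def calc_mean(my_list):
    """Mean of a list of numbers (same logic as calc_mean above)."""
    return calc_total(my_list) / len(my_list)


# Hypothetical data sets, made up for illustration
temperatures = [20.5, 22.0, 19.5]
rainfall = [0, 12, 3, 5]

# One function, many uses: no copied loop, so a bug only needs fixing once
mean_temperature = calc_mean(temperatures)
mean_rainfall = calc_mean(rainfall)
print(mean_temperature, mean_rainfall)
```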

# 3. Testability

* Much easier to identify unique failure modes for small chunks of code that have a single purpose (e.g. calculate the median)
* --> much easier to test an individual function than a pipeline or a script
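
To make the median example concrete, here is one possible sketch. `calc_median` is a hypothetical helper written for this illustration, not part of the code above. Because it does one thing, its failure modes are easy to list (odd length, even length, empty input), and each gets its own small test.

```python
def calc_median(my_list):
    """Median of a non-empty list of numbers (hypothetical helper)."""
    if len(my_list) == 0:
        raise ValueError("cannot take the median of an empty list")
    ordered = sorted(my_list)          # do not assume the input is sorted
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]         # odd length: the middle element
    return (ordered[middle - 1] + ordered[middle]) / 2  # even: average the two


def test_odd_length():
    assert calc_median([3, 1, 2]) == 2


def test_even_length():
    assert calc_median([4, 1, 3, 2]) == 2.5


def test_empty_list():
    try:
        calc_median([])
    except ValueError:
        pass  # this is the behaviour we want
    else:
        raise AssertionError("expected a ValueError for an empty list")


# Run the tests by hand; a test runner would normally do this for you
for test in (test_odd_length, test_even_length, test_empty_list):
    test()
```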

# Testing

1. How do you know if your code does what you think it does? You test it.
2. How do you make sure your code keeps doing what you think it should do? Write automated tests.

Scientists are good at the first one, but not as good at the second.

# Types of Testing
1. Defensive programming: inline checking of code (e.g. asserting inputs have correct form)
2. **Unit Tests**: testing the most basic components of your code (e.g. functions)
3. Integration Tests: testing how those components interact
4. Regression Tests: testing if anything has changed since you last trusted your results
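
A rough sketch of two of these types, under some assumptions: the `assert` statements inside `calc_mean` are one way to do defensive programming, and `subtract_mean` is a hypothetical second component added only so there is something to integrate with.

```python
def calc_mean(my_list):
    """Mean of a list, with inline (defensive) checks on the input."""
    assert len(my_list) > 0, "my_list must not be empty"
    assert all(isinstance(num, (int, float)) for num in my_list), \
        "my_list must contain only numbers"
    return sum(my_list) / len(my_list)


def subtract_mean(my_list):
    """Hypothetical second component: shift the data so its mean is zero."""
    mean = calc_mean(my_list)
    return [num - mean for num in my_list]


def test_subtract_mean_integration():
    # Integration test: do the two components work together?
    shifted = subtract_mean([1, 2, 3, 6])
    assert abs(calc_mean(shifted)) < 1e-12


test_subtract_mean_integration()
```

The inline checks run every time the function is called, while the integration test only runs when you ask for it.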


In [4]:
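# Anatomy of a Unit Test
# (template only: get_expected, func, args and kwargs are placeholders for your own code)

def test_func():
    expected = get_expected()          # the answer you know to be correct
    observed = func(*args, **kwargs)   # what your code actually returns
    assert expected == observed        # fail loudly if they differ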
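
Filled in for the `calc_mean` example, the template might look like the sketch below. The expected value was worked out by hand (37 / 8), and `math.isclose` is used instead of `==` because the result is a float; that choice is a suggestion, not part of the template.

```python
import math


def calc_mean(my_list):
    """Mean of a list of numbers (same behaviour as calc_mean above)."""
    return sum(my_list) / len(my_list)


def test_calc_mean():
    expected = 4.625                                 # by hand: 37 / 8
    observed = calc_mean([1, 2, 5, 7, 9, 4, 5, 4])
    assert math.isclose(expected, observed)


test_calc_mean()  # raises AssertionError if the test fails
```

If this lived in a file whose name starts with `test_`, pytest would find and run `test_calc_mean` without the explicit call on the last line.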


# My testing philosophy for scientists
* whatever testing you do to make sure your code does what you think it should do, put that code in an automated test
* this includes:
    - sanity checks
    - comparison against previous results (write a file with the results you've verified)
    - visual inspection (write it to a file and use in a regression test)

# Learn more:
* Chapter 18 of textbook
* https://realpython.com/python-testing/
* tools: unittest, nose, pytest

# Suggestions for getting started:
1. Write pseudo-code
    * write a set of directions or commands
    * each command will be a function
2. (Optional) write your tests (aka test-driven development; TDD)
    * write tests for each function
    * this seems to be easier in industry than in science
3. Fill in your functions
4. Write tests (if you haven't already done so)
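
The steps above could play out roughly as follows. The task (drop missing values, then average) and the names `remove_missing` and `main` are invented for this sketch.

```python
# Step 1: pseudo-code, one command per line
#   remove missing values
#   calculate the mean

# Step 3: each command becomes a function
def remove_missing(values):
    """Drop entries that are None (hypothetical cleaning step)."""
    return [value for value in values if value is not None]


def calc_mean(values):
    """Mean of a list of numbers."""
    return sum(values) / len(values)


# Steps 2 and 4: one test per function
def test_remove_missing():
    assert remove_missing([1, None, 3]) == [1, 3]


def test_calc_mean():
    assert calc_mean([1, 3]) == 2.0


# The script itself now reads like the pseudo-code
def main(values):
    cleaned = remove_missing(values)
    return calc_mean(cleaned)


test_remove_missing()
test_calc_mean()
print(main([1, None, 3]))
```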