# Testing & principles of design

![exploits](figures/exploits_of_a_mom.png)

Timothy E. Holy

Washington University in St. Louis

# Topics for today

- testing, reliability, and design
- how to write tests
- test coverage

# Reasons to be concerned about code quality

- science is increasingly reliant on computation

- experimental science has widely agreed-upon standards about positive and negative controls, etc.; there is nothing comparable for computation

- many estimates for error rates are scary: a widely-cited statistic (*Code Complete* by Steve McConnell) gives an industry average of 15-50 bugs per 1000 lines of *delivered* code.

- it's widely agreed that carefully-designed and tested software can beat this by one or more orders of magnitude

# One way to help: testing

- *static testing*: "code that analyzes code" (in Julia: [JET.jl](https://github.com/aviatesk/JET.jl) and [SnoopCompile.jl](https://github.com/timholy/SnoopCompile.jl))
- *dynamic testing*: run the code and check the results
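
A minimal sketch of dynamic testing with Julia's `Test` standard library. The function `clamp01` is a hypothetical example, not part of any package mentioned here; each `@test` runs the code and checks the result.

```julia
using Test

# Hypothetical function under test: restrict a number to the interval [0, 1]
clamp01(x) = min(max(x, 0), 1)

@test clamp01(-0.2) == 0     # below the interval
@test clamp01(0.3) == 0.3    # inside the interval, returned unchanged
@test clamp01(7) == 1        # above the interval
```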

## Testing can do more than just reduce bugs

Testing can help encourage development *processes* that result in better outcomes

# Challenges to writing good code: smart people

*(Acknowledgments to Evan Dorn, https://www.youtube.com/watch?v=HhwElTL-mdI)*

Many people who write code are so smart they can juggle a lot in their heads


"Sit down and start typing" is like an engineer who says "I don't need plans, just give me the bricks and I'll start construction."

May work for small projects, but limits you with complex projects

(story time)

# Advantages of *having* tests

- documents what you expect your code to do

- validates your initial implementation (to you and others)

- *it frees you to improve the code later*, largely without fearing that you'll trigger new bugs

(interactive demo)
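
One way to picture the last point, as a sketch: the same tests are run against an original implementation and a rewritten one. Both function names and the helper `check_sumsq` are hypothetical, chosen only for illustration.

```julia
using Test

# Original implementation: explicit loop
function sumsq_loop(x)
    s = zero(eltype(x))
    for xi in x
        s += xi^2
    end
    return s
end

# Later rewrite: shorter, relies on `sum` with a function argument
sumsq_refactored(x) = sum(abs2, x)

# The same checks apply to any implementation `f`
function check_sumsq(f)
    @testset "$f" begin
        @test f([1, 2, 3]) == 14
        @test f([0.5, -0.5]) == 0.5
    end
end

check_sumsq(sumsq_loop)
check_sumsq(sumsq_refactored)
```

If the rewrite had changed the behavior, the second call would report a failure immediately.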

# Test-driven development (TDD)

![TDD](figures/TDD.jpg)

# Advantages of *writing* tests

- "exercise" your design before going to the trouble to write it

- testability encourages good design: testable code tends to be "good code"

# Code that might be hard to test

```julia
function calculate(A, B, C, x, y, z; option1=nothing, option2=nothing, option3=nothing, option4=nothing...)
    
    # 100 lines of "preliminaries" that set options, convert arguments into standard formats, etc.
    
    # Now the real computation starts
    
end
```


Matlab is particularly bad about encouraging designs like these (one callable function per file, lack of namespaces)

But Python (& R?) are not free of such problems: with only one callable *method* per function, each function tends to accumulate many options

# Code that might be easier to test

```julia
# Special cases
calculate(A::AbstractVector, x) = ...   # in one dimension, only `A` and `x` are relevant
calculate(A::AbstractMatrix, B, C, x::Bool, y::AbstractVector, z::Real) = ...   # options unused if `x::Bool`
calculate(A::AbstractMatrix, B, C, x::AbstractVector{Bool}, y::AbstractVector, z::Real; option2=mean(x)/2) = ...

# General case
function calculate(A::AbstractMatrix, B, C, x, y, z; option1=nothing, option2=nothing, option3=nothing, option4=nothing...)
    zz = part_of_calculation(B, C; option2)
    return different_part_of_calculation(A, zz; option1)
end

part_of_calculation(B, C; option2=nothing) = ...

different_part_of_calculation(A, x; option1=size(A, 2)) = ...
```
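
A runnable sketch of why this layout helps. The bodies below are invented stand-ins (the real computations are elided above), and `rescale` and `combine` are hypothetical names playing the roles of `part_of_calculation` and `different_part_of_calculation`. Each piece gets its own small test, and the general case only needs to check that the pieces are wired together.

```julia
using Test

# Hypothetical pieces, each small enough to test directly
rescale(B, C; option2=1.0) = option2 .* (B .+ C)
combine(A::AbstractMatrix, zz; option1=size(A, 2)) = sum(A * zz) / option1

# Special case: in one dimension only `A` and `x` matter
calculate(A::AbstractVector, x) = x .* A

# General case: just composes the pieces
function calculate(A::AbstractMatrix, B, C; option1=size(A, 2), option2=1.0)
    zz = rescale(B, C; option2)
    return combine(A, zz; option1)
end

@testset "pieces" begin
    @test rescale([1, 2], [3, 4]) == [4.0, 6.0]
    @test rescale([1, 2], [3, 4]; option2=0.5) == [2.0, 3.0]
    @test combine([1 0; 0 1], [4.0, 6.0]) == 5.0
    @test combine([1 0; 0 1], [4.0, 6.0]; option1=1) == 10.0
end

@testset "composition" begin
    @test calculate([1, 2, 3], 2) == [2, 4, 6]
    @test calculate([1 0; 0 1], [1, 2], [3, 4]) == 5.0
end
```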


# Interaction between testing & development: a sketch

Suppose you have a huge directory of data files:

```
simulation_8-17-21_10000points_random_a=5.hdf5
simulation_8-21-21_10000points_random_a=50.hdf5
simulation_8-27-21_10000points_grid_a=5.hdf5
simulation_8-28-21_10000points_grid_a=50.hdf5
simulation_9-13-21_25000points_random_a=7.5_algorithm2.hdf5
simulation_10-4-2021_25000points_random_a=37.0_algorithm2.hdf5
...
```

You want to analyze the overall conclusions that can be drawn from these different executions.

# Pieces of your code
- load the data (from one file or many)
- perform analysis and synthesis
- plot the results

All in one function: testing is hard (you have to inspect the plots manually)

## Modularization

Idea: split the analysis & synthesis into separate pieces

```julia
function analysis1(filename)
    ...
end

function analysis2(filename1, filename2)
    ...
end
```

## Modularization fail

Scary thought: if I keep the same *data*, and just tweak the filename (how it encodes the parameters), will I get the same result?

Example: somewhere, your code assumes `a` is an integer, but then you analyze a file like
```
simulation_9-13-21_25000points_random_a=7.5_algorithm2.hdf5
```
Will your code treat `a` as if it were 7?
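
A sketch of how this can go wrong silently. Both parsers are hypothetical; the buggy one uses a regular expression that only matches the integer part, so it returns a plausible number rather than an error.

```julia
# Hypothetical parser that assumes `a` is an integer
function parse_a_buggy(filename)
    m = match(r"a=(\d+)", filename)          # stops at the decimal point
    m === nothing && error("no a= parameter found in $filename")
    return parse(Int, m.captures[1])
end

# Hypothetical fix: allow an optional fractional part
function parse_a(filename)
    m = match(r"a=(\d+(?:\.\d+)?)", filename)
    m === nothing && error("no a= parameter found in $filename")
    return parse(Float64, m.captures[1])
end

fname = "simulation_9-13-21_25000points_random_a=7.5_algorithm2.hdf5"
parse_a_buggy(fname)   # returns 7, with no error or warning
parse_a(fname)         # returns 7.5
```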

Problem with this design: `analysis1` has to be tested with:
- all the different kinds of `data` you need to handle
- AND all the different ways you might encode the parameters as a string in the filename


Will you notice the mistake just from looking at the plots?

## Aphorisms for *better* modularization

Single responsibility principle (SRP): code should do "one thing"

Don't repeat yourself (DRY) (bad design: write parsing code in each of `analysis1`, `analysis2`, etc.)

Other aphorisms that interact with testing:
- KISS (Keep it simple, stupid!)
- YAGNI (You aren’t going to need it) (don't implement it if it's not necessary)
- Hide implementation details

## A better design

```julia
struct Simulation
    date::Date              # from the `Dates` standard library
    npoints::Int
    initialization::Symbol
    a::Float64
    version::Int
end

function Simulation(filename)
    # handle the parsing, give user-friendly errors, ...
end

```
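
One possible way to fill in the constructor, as a sketch rather than the definitive implementation. The struct is repeated so the block runs on its own. Assumptions not stated in the filenames themselves: dates are month-day-year, two-digit years mean 20xx, and a missing `_algorithmN` suffix means version 1. `FILENAME_REGEX` is a hypothetical helper name.

```julia
using Dates, Test

struct Simulation
    date::Date
    npoints::Int
    initialization::Symbol
    a::Float64
    version::Int
end

# Hypothetical pattern for names like
# simulation_9-13-21_25000points_random_a=7.5_algorithm2.hdf5
const FILENAME_REGEX =
    r"^simulation_(\d{1,2})-(\d{1,2})-(\d{4}|\d{2})_(\d+)points_([a-z]+)_a=(\d+(?:\.\d+)?)(?:_algorithm(\d+))?\.hdf5$"

function Simulation(filename::AbstractString)
    m = match(FILENAME_REGEX, basename(filename))
    m === nothing && throw(ArgumentError("cannot parse simulation filename: $filename"))
    mo, d, y, npoints, init, a, version = m.captures
    yr = parse(Int, y)
    yr < 100 && (yr += 2000)    # assumption: two-digit years are 20xx
    return Simulation(Date(yr, parse(Int, mo), parse(Int, d)),
                      parse(Int, npoints),
                      Symbol(init),
                      parse(Float64, a),
                      version === nothing ? 1 : parse(Int, version))  # assumption: default version is 1
end

sim = Simulation("simulation_10-4-2021_25000points_random_a=37.0_algorithm2.hdf5")
@test sim.date == Date(2021, 10, 4)
@test sim.initialization === :random
@test sim.version == 2
@test_throws ArgumentError Simulation("notes.txt")
```

Because all the string handling lives in one place, the analysis functions can take a `Simulation` and never look at the filename.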

```julia
using Dates, Test    # `month` comes from Dates, `@test` from Test
sim = Simulation("simulation_8-27-21_10000points_grid_a=5.hdf5")
@test month(sim.date) == 8
@test sim.a == 5
sim = Simulation("simulation_8-27-21_10000points_grid_a=7.5.hdf5")
@test sim.a == 7.5
...
```

Get the parsing working to the point of confidence, *make commits*, then move on to the next task.

Strict TDD would say write the tests first, and that would be great, but keeping *testability* in mind when you write the code does almost as much good.

"Test-centered mindset" is more important than any particular workflow.

# Cons and pros of writing tests

- writing tests takes time


- writing bad code and fixing it takes even longer

- sometimes, having a good process for development makes you faster (but it takes practice)

# Limits to testing

- some things are so hard to test, it may not be worth it
- don't test trivial things ("add 1 to x")
- playing around is OK (but write tests for the parts you keep)
- when fixing bugs, sometimes you can't even write a minimal working example (MWE) until you verify that you've understood the bug by fixing it

Remember, the goal is good code no matter how you get it.

# Test coverage

You're thinking about contributing to an external package. Are you "coding without a net"?

![without a net](figures/without_a_net.jpg)

# Coverage examples

To avoid embarrassing anyone, let me pick examples of my own:
- encouraging case: https://github.com/timholy/FlameGraphs.jl
- discouraging case: https://github.com/timholy/ProfileView.jl


In discouraging cases, an exemplary workflow would be to:
1. submit PR(s) that improve coverage
2. see whether the maintainer merges them
3. once confident you won't break anything, start making the changes you want

# Special topics (1/2): errors, warnings, logs, & broken tests

Homework: read the documentation for Julia's `Test` standard library

- `@test_throws`: check that code produces an error
- `@test_warn` and `@test_logs`: check that code prints notices for the user
- `@test_broken`: your TODO list!
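
A short sketch of these macros in use. The function `parse_a` and its behavior (an error for a missing parameter, a warning for zero) are hypothetical, invented for this illustration.

```julia
using Test

# Hypothetical function: extract the parameter `a` from a filename
function parse_a(filename)
    m = match(r"a=(\d+(?:\.\d+)?)", filename)
    m === nothing && throw(ArgumentError("no a= parameter found in $filename"))
    a = parse(Float64, m.captures[1])
    a == 0 && @warn "a is zero"
    return a
end

@testset "errors, logs, and broken tests" begin
    @test parse_a("run_a=7.5.hdf5") == 7.5
    # The error path is part of the behavior, so test it too
    @test_throws ArgumentError parse_a("run.hdf5")
    # Check that the expected warning is logged...
    @test_logs (:warn, "a is zero") parse_a("run_a=0.hdf5")
    # ...and that ordinary input logs nothing
    @test_logs parse_a("run_a=7.5.hdf5")
    # Known limitation: scientific notation is not handled yet (TODO)
    @test_broken parse_a("run_a=1e3.hdf5") == 1000.0
end
```

`@test_broken` records the failing case without failing the test suite; if the case is later fixed, the test reports an unexpected pass so you remember to convert it to `@test`.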

# Special topics (2/2): tests for GUIs

Observables (also called reactive variables, signals/slots)

```julia
using Observables
obs = Observable(3)
on(obs) do val
    println("obs has value $val")
end
```

```julia
obs[] = 1
```

Strategy:
- user interaction does nothing but update the values of observables
- all program code depends only on observables
- test your code by having the tests set new values for the observables
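
The strategy above can be sketched as follows, assuming the Observables.jl package is installed. The names `slider`, `doubled`, and `seen` are hypothetical; `slider` stands in for whatever a real GUI widget would update.

```julia
using Observables, Test

# Stand-in for a GUI control: in a real app, user interaction would set this
slider = Observable(3)

# Program logic depends only on observables
doubled = map(x -> 2x, slider)

# Record every update, in place of redrawing a display
seen = Int[]
on(doubled) do val
    push!(seen, val)
end

# The test plays the role of the user
slider[] = 5
@test doubled[] == 10
@test seen == [10]
```

No window is opened and no mouse is moved, yet the logic that the GUI would trigger gets exercised.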

# Summary

- testing is an integral part of modern software development, and essential if you want to produce high-quality code

- leverage testing as a *systematic process* that you use to help improve the quality of your design

- tools for analyzing test coverage can help you identify & catch potential weaknesses before they bite you