# Cookies & Code - Testing: What, Why, and How!

## Learning Intentions

- Explain when and why a piece of code needs tests.
- Identify the key features of tests.
- Structure tests using the Arrange-Act-Assert framework.
- Write and run simple tests using pytest & ipytest.
- Outline how writing testable code differs from writing code.
- Be equipped to start writing your own tests **today**.

# What is testing?


## What is pytest?
There are many packages that can be used to help test Python code.
`pytest` is known for its simple language and concise syntax, particularly compared to `unittest`, which is part of the Python Standard Library.

You need to import `ipytest` to directly run pytest-style tests in a Jupyter Notebook.

In [1]:
# Import at beginning of notebook
import ipytest
import pytest
ipytest.autoconfig()

You'll also notice commands beginning with `%%ipytest -qq` in cells containing tests.
We'll investigate their role soon!

## What is a test?
A test is a *function* that has an *assert statement*.

`pytest` (or `ipytest`, in a notebook) runs each test function.

If its assert statements are:
- **true** — the test will **pass**!
- **false** or an **error** occurs — the test will **fail**.

Tests come in a few different flavours - we will deal with **unit tests** today. 

A unit test tests a small, isolated piece of code (a unit).


## Example Set 1 

Try running (shift + enter) the following cells of examples.

In [2]:
%%ipytest

def test_example():
    assert [1, 2, 3] == [1, 2, 3]

.                                                                                            [100%]
1 passed in 0.40s


In [3]:
%%ipytest
# What happens if you run the cell without the line above?

def test_will_fail():
    assert False

F                                                                                            [100%]
__________________________________________ test_will_fail __________________________________________

    def test_will_fail():
>       assert False
E       assert False

/var/folders/vv/d9ncb4ms2x1gl0mmkrvk8m7c0000gp/T/ipykernel_8145/3403248648.py:4: AssertionError
FAILED t_4d9120d0c8834c95b5cc1d988dccc58d.py::test_will_fail - assert False
1 failed in 0.08s


In [4]:
%%ipytest

def this_test_will_not_run():
    # By default, pytest only collects functions whose names start with "test".
    assert False


no tests ran in 0.00s


***Optional:** Experiment by replacing `%%ipytest` with:*
- `%%ipytest -vv`
- `%%ipytest -qq`

## Example Set 2 - Structuring Tests
Let's look at a slightly more interesting use case. We have a function, `example_func`, and we want to write some unit tests for it.

We will write tests using the Arrange-Act-Assert framework:

- **Arrange:** Create inputs to the function or class you are testing.
- **Act:** Call the function or class you are testing.
- **Assert:** Assert that you get the output you expected.

Look at how the tests below are structured:

In [5]:
def example_func(x: int, y: int) -> int:
    return x + y

Try running the tests. Some should fail! Can you fix them?

In [6]:
%%ipytest

def test_example_func():
    ## Arrange
    x = 10
    y = 15

    ## Act
    output = example_func(x,  y)

    ## Assert
    assert output == 25

def test_example_func_failing_test():
    ## This test fails! Can you fix it?
    ## NB: Requires changing 1 line.

    ## Arrange
    x = 20
    y = 15

    ## Act
    output = example_func(x, y)

    ## Assert
    assert output == 20

def test_example_func_fails_with_none():
    ## This test fails! What is the problem?
    ## NB: Requires changing 1 line.
    
    ## Arrange
    x = None
    y = 2

    ## Act & Assert
    with pytest.raises(ValueError):
        output = example_func(x, y)


.FF                                                                                          [100%]
__________________________________ test_example_func_failing_test __________________________________

    def test_example_func_failing_test():
        ## This test fails! Can you fix it?
        ## NB: Requires changing 1 line.

        ## Arrange
        x = 20
        y = 15

        ## Act
        output = example_func(x, y)

        ## Assert
>       assert output == 20
E       assert 35 == 20

/var/folde

# Why do we test?




- Writing automated unit tests allows us to change code and ensure we do not break existing functionality.
- Unit tests are a contract for the intent of the code.


## Benefits of testing

- Fewer bugs! A test suite grows as you fix bugs, so a fixed bug that reappears is caught straight away.
- Documents the code.
- Forces you to write more modular code.
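
As a rough illustration of how a test suite grows with each bug fix, here is a minimal sketch. The `average` function and its empty-list bug are hypothetical, invented for this example; they are not part of the workshop code.

```python
from typing import List


def average(values: List[float]) -> float:
    """Return the arithmetic mean, or 0.0 for an empty list.

    Hypothetical function: imagine an earlier version divided by
    len(values) unconditionally and crashed on an empty list.
    """
    if not values:
        return 0.0
    return sum(values) / len(values)


def test_average_of_known_values():
    # The original test, written alongside the function.
    assert average([1.0, 2.0, 3.0]) == 2.0


def test_average_of_empty_list_regression():
    # Added when the empty-list bug was fixed, so the same bug
    # is caught if it is ever reintroduced.
    assert average([]) == 0.0
```

In a notebook cell, put `%%ipytest` on the first line to run these tests. The second test also documents the intended behaviour for empty input, which the function signature alone does not reveal.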

## Does this code need tests?

I have a piece of code that runs on a server every day at 9am.
	
It's run for 15 years and has not been altered since 1992.
	
Should I go and write tests for this function?
	
What about a simple 5 line script that calls an API & sends an email?



## What code should I test?

Generally you should test code that meets one or more of the following conditions:
- Used in multiple places,
- Frequently changing or under active development,
- Important to be correct, or
- Complex, with lots of edge cases.

The tests are coupled to the implementation!

# How to Write Testable Code



Testable code should be:
- Modular,
- Deterministic,
- De-coupled.
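
To make "deterministic" and "de-coupled" a little more concrete, here is a small sketch under stated assumptions. `pick_target` is a hypothetical function, not part of the workshop code. The idea is that the source of randomness is passed in as an argument instead of being hidden inside the function, so a test can control it.

```python
import random
from typing import List, Optional


def pick_target(names: List[str], rng: Optional[random.Random] = None) -> str:
    """Pick one name at random (hypothetical example function).

    The random number generator is an argument rather than the global
    `random` module, so a test can supply a seeded generator.
    """
    if rng is None:
        rng = random.Random()
    return rng.choice(names)


def test_pick_target_is_repeatable_with_a_seed():
    ## Arrange
    names = ["Sirius", "Vega", "Rigel"]

    ## Act: two generators with the same seed produce the same choice
    first = pick_target(names, random.Random(42))
    second = pick_target(names, random.Random(42))

    ## Assert
    assert first == second
    assert first in names
```

The same pattern applies to other hidden dependencies such as the current time, a network call, or a file on disk: passing them in makes the function easier to test in isolation.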




## Example 3 - Refactoring

The code below reads data from a CSV file and constructs `Star` objects from it.

In [7]:
import csv
from typing import List

class Star:
    def __init__(self, name: str, distance: float, luminosity: float):
        self.name = name
        self.distance = distance
        self.luminosity = luminosity

def process_astronomy_data(filename: str) -> List[Star]:
    """
    Reads a CSV file of star data and returns a list of Star objects.

    Args:
        filename (str): Path to the CSV file containing star data.
    
    Returns:
        List[Star]: A list of Star objects created from the CSV file.
    """
    stars = []
    with open(filename, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            name = row['Name']
            distance = float(row['Distance'])
            luminosity = float(row['Luminosity'])
            star = Star(name, distance, luminosity)
            stars.append(star)
    
    return stars



What makes testing the above function hard? Think about all the operations that are occurring.

In [8]:
### Refactored version

from typing import List, Dict

def read_star_data(filename: str) -> List[Dict[str, str]]:
    """
    Reads a CSV file and returns the data as a list of dictionaries.
    
    Args:
        filename (str): Path to the CSV file containing star data.
    
    Returns:
        List[Dict[str, str]]: The raw data from the CSV as a list of dictionaries.
    """
    with open(filename, newline='') as csvfile:
        reader = csv.DictReader(csvfile)
        return [row for row in reader]

def create_star_objects(data: List[Dict[str, str]]) -> List[Star]:
    """
    Converts raw star data into Star objects.
    
    Args:
        data (List[Dict[str, str]]): Raw data containing star information.
    
    Returns:
        List[Star]: A list of Star objects created from the raw data.
    """
    stars = []
    for row in data:
        name = row['Name']
        distance = float(row['Distance'])
        luminosity = float(row['Luminosity'])
        star = Star(name, distance, luminosity)
        stars.append(star)
    return stars

def process_astronomy_data(filename: str) -> List[Star]:
    """
    Processes the astronomy data by reading and creating Star objects.
    
    Args:
        filename (str): Path to the CSV file containing star data.
    
    Returns:
        List[Star]: A list of Star objects.
    """
    raw_data = read_star_data(filename)
    return create_star_objects(raw_data)
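
Once reading and parsing are separate, the reading step can be tested on its own. One possible approach, sketched here, is pytest's built-in `tmp_path` fixture, which gives each test a fresh temporary directory. `read_rows` is a stand-in that mirrors `read_star_data`, repeated under a different name so the example runs by itself; the file name `stars.csv` is made up.

```python
import csv
from pathlib import Path
from typing import Dict, List


def read_rows(filename) -> List[Dict[str, str]]:
    """Stand-in for read_star_data, repeated so this example is self-contained."""
    with open(filename, newline="") as csvfile:
        return list(csv.DictReader(csvfile))


def test_read_rows_with_temporary_file(tmp_path: Path):
    ## Arrange: write a small CSV into the temporary directory pytest provides
    csv_path = tmp_path / "stars.csv"
    csv_path.write_text(
        "Name,Distance,Luminosity\nStar A,10,1000\nStar B,20,2000\n"
    )

    ## Act
    rows = read_rows(csv_path)

    ## Assert: values stay as strings at this stage
    assert len(rows) == 2
    assert rows[0] == {"Name": "Star A", "Distance": "10", "Luminosity": "1000"}
```

Because the test writes its own input, it does not depend on any file that lives alongside the notebook.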

In [9]:
%%ipytest -qq
from unittest.mock import patch, mock_open

sample_data = [
    {"Name": "Star A", "Distance": "10", "Luminosity": "1000"},
    {"Name": "Star B", "Distance": "20", "Luminosity": "2000"}
]

@pytest.mark.parametrize("data,expected_stars", [
    (sample_data, [
        Star("Star A", 10.0, 1000.0),
        Star("Star B", 20.0, 2000.0)
    ])
])
def test_create_star_objects(data, expected_stars):
    stars = create_star_objects(data)
    for star, expected_star in zip(stars, expected_stars):
        assert star.name == expected_star.name
        assert star.distance == expected_star.distance
        assert star.luminosity == expected_star.luminosity

# Using a dummy file to test data ingestion. Look at fake_data.csv.
def test_process_astronomy_data():
    # This test fails. Can you use the debugger to figure out why?
    stars = process_astronomy_data("fake_data.csv")
    assert len(stars) == 2
    assert stars[0].name == "Star A"
    assert stars[1].name == "Star B"

.F                                                                                           [100%]
___________________________________ test_process_astronomy_data ____________________________________

    def test_process_astronomy_data():
        # This test fails. Can you use the debugger to figure out why?
        stars = process_astronomy_data("fake_data.csv")
>       assert len(stars) == 2
E       assert 0 == 2
E        +  where 0 = len([])

/var/folders/vv/d9ncb4ms2x1gl0mmkrvk8m7c0000gp/T/ipykernel_8145/184344751.py:25: AssertionError
FAILED t_4d9120d0c8834c95b5cc1d988dccc58d.py::test_process_astronomy_data - assert 0 == 2


# Fancier things - if you have time

## Run test .py scripts using Jupyter Notebooks

### Run a single file of tests

In [10]:
!pytest test_example_functions.py -q

.                                                                        [100%]
1 passed in 0.26s


### Run all tests in directory

In [11]:
!pytest

platform darwin -- Python 3.13.2, pytest-8.3.5, pluggy-1.5.0
rootdir: /Users/jsmallwood/Documents/projects/cookies-testing
plugins: hypothesis-6.130.9, anyio-4.9.0
collected 15 items

test_example_functions.py .                                              [  6%]
test_using_parametrize.py ..............                                 [100%]



### Run tests with detailed outputs

In [12]:
! python -m pytest -vv

platform darwin -- Python 3.13.2, pytest-8.3.5, pluggy-1.5.0 -- /Users/jsmallwood/Documents/projects/cookies-testing/.venv/bin/python
cachedir: .pytest_cache
hypothesis profile 'default' -> database=DirectoryBasedExampleDatabase(PosixPath('/Users/jsmallwood/Documents/projects/cookies-testing/.hypothesis/examples'))
rootdir: /Users/jsmallwood/Documents/projects/cookies-testing
plugins: hypothesis-6.130.9, anyio-4.9.0
collected 15 items

test_example_functions.py::test_additive_inverse PASSED                  [  6%]
test_using_parametrize.py::test_add_three[-4--1] PASSED                  [ 13%]
test_using_parametrize.py::test_add_three[-1-2] PASSED                   [ 20%]
test_using_parametrize.py::test_add_three[0-3] PASSED                    [ 26%]
test_using_parametrize.py::test_add_three[9-12] PASSED                   [ 33%]
test_

## Using pytest.mark.parametrize

`pytest.mark.parametrize` lets the same test function run for many input/output pairs.

In [13]:
def square_number(x: int) -> int:
    return x ** 2

In [14]:
%%ipytest -qq
import pytest

@pytest.mark.parametrize("input_num,expected_output", [
    (2, 4), (-2, 4), (8, 64)
])
def test_square_number(input_num, expected_output):
    ## Arrange
    ## Nothing to do here

    ## Act
    output = square_number(input_num)

    ## Assert
    assert output == expected_output

...                                                                                          [100%]


## Fixtures

Fixtures let you reuse setup objects that many tests need, for instance the contents of an input file. Here's an example from pycodif:

In [15]:
## Just for an example - this won't run.
## The "example_frame" code is executed before each test
## and the output passed in the "example_frame" argument.
class TestCODIFFrame:

    @pytest.fixture()
    def example_frame(self):
        with open("tests/test_files/test_codif.codif", "rb") as f:
            codif = CODIFFrame(f)
        return codif

    def test_data_parsing(self, example_frame):
        assert hasattr(example_frame, "header")
        assert hasattr(example_frame, "data_array")
        assert hasattr(example_frame, "sample_timestamps")

    def test_data_values(self, example_frame):
        assert isinstance(example_frame.data_array, np.ndarray)
        assert example_frame.data_array.dtype == np.dtype("complex64")
        assert example_frame.data_array[0, 0] == -23 + 45j
        assert example_frame.data_array[0, -1] == 113 - 89j
        assert example_frame.data_array[-1, 0] == 45j
        assert example_frame.data_array[-1, -1] == -43 + 58j
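
Since the pycodif example above cannot be run here, the following is a minimal runnable sketch of the same idea. The names `make_sample_rows` and `sample_rows` are made up for illustration; the data mirrors the sample rows used in Example 3.

```python
import pytest


def make_sample_rows():
    """Build the shared test data (plain helper, hypothetical name)."""
    return [
        {"Name": "Star A", "Distance": "10", "Luminosity": "1000"},
        {"Name": "Star B", "Distance": "20", "Luminosity": "2000"},
    ]


@pytest.fixture()
def sample_rows():
    # pytest runs this for every test that lists `sample_rows` as an
    # argument and passes the return value in.
    return make_sample_rows()


def test_rows_have_expected_keys(sample_rows):
    for row in sample_rows:
        assert set(row) == {"Name", "Distance", "Luminosity"}


def test_first_row_is_star_a(sample_rows):
    assert sample_rows[0]["Name"] == "Star A"
```

Each test receives its own freshly built list, so one test modifying the data cannot affect another. In a notebook cell, put `%%ipytest` on the first line to run it.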

## Hypothesis

Hypothesis is a package that generates test cases for us from properties we state about the code.

In [18]:
%%ipytest

# Taken from hypothesis quickstart guide:
# https://hypothesis.readthedocs.io/en/latest/quickstart.html
from hypothesis import given, strategies as st

@given(st.integers(0, 200))
def test_integers(n):
    assert n < 50

F                                                                                            [100%]
__________________________________________ test_integers ___________________________________________

    @given(st.integers(0, 200))
>   def test_integers(n):

/var/folders/vv/d9ncb4ms2x1gl0mmkrvk8m7c0000gp/T/ipykernel_8145/2354737801.py:6: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

n = 50

    @given(st.integers(0, 200))
    def test_integers(n):
>       assert n < 50
E       assert 50 < 50
E       Falsifying example: test_integers(
E         

In [19]:
%%ipytest
from typing import Union
import numpy as np

## Some of these fail. Is that expected?
## Can you fix them? 
## NB: Look at the section on filtering here: https://hypothesis.readthedocs.io/en/latest/quickstart.html

def square(x: Union[int, float]) -> Union[int, float]:
    return x ** 2

def square_root(x: Union[int, float]) -> Union[int, float]:
    return x ** 0.5

@given(s=st.integers())
def test_inverses_integers(s):
    assert np.isclose(s, square(square_root(s)))
    assert np.isclose(s, square_root(square(s)))

@given(s=st.floats())
def test_inverses_floats(s):
    assert np.isclose(s, square(square_root(s)))
    assert np.isclose(s, square_root(square(s)))

FF                                                                                           [100%]
______________________________________ test_inverses_integers ______________________________________

    @given(s=st.integers())
>   def test_inverses_integers(s):

/var/folders/vv/d9ncb4ms2x1gl0mmkrvk8m7c0000gp/T/ipykernel_8145/3881189325.py:15: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

s = -1

    @given(s=st.integers())
    def test_inverses_integers(s):
        assert np.isclose(s, square(square_root(s)))
>       assert np.isclose(s, square_root(square(s)))
E       assert np.False_
E       