In [None]:
# %pip install pytest ipytest

# Pre-Workshop Exercises: Unit Testing with `pytest`


---

## Background




### What is Automated Testing?

Anything that your code can do, you can check.  If you find yourself manually checking for something, putting that check in an automated test helps you continue doing it as you develop your project further, automatically.

[`pytest`](https://docs.pytest.org/en/stable/) contains a test runner that looks for automated tests; one way it finds them is by looking for function names that start with the word `test_`, in file names that start with the word `test_`.  It runs each function it finds, and marks down whether running it:
  - **Passed**: The function ran with no errors
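  - **Failed**: The function raised an exception, most commonly an `AssertionError` from a failed check.
  - **Errored**: Something went wrong outside the test function itself, for example in a fixture or during setup.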

While [`pytest`](https://docs.pytest.org/en/stable/) is by far the most popular testing framework in Python, there are alternatives, including:

| Library | When to Choose it over Pytest |
| :-- | :-- |
| [`unittest`](https://docs.python.org/3/library/unittest.html) | When you want something built-in to Python. |
| [`doctest`](https://docs.python.org/3/library/doctest.html) | When you want automated test code in your function documentation |
| [`behave`](https://behave.readthedocs.io/en/latest/) | When you want automated test code to be readable and writable by non-coding teammates |
| [A PyTest Plugin](https://docs.pytest.org/en/stable/reference/plugin_list.html) | When you want to add features to your tests, make special test types easier to write, or work with tricky-to-test frameworks. |
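
To give a feel for the two built-in options in the table, here is a small sketch of the same check written for `unittest` and for `doctest`. The `add` function and the `TestAdd` class are hypothetical examples, and the tests are run programmatically only so the snippet is self-contained; normally you would use `python -m unittest` or `python -m doctest`.

```python
import doctest
import io
import unittest


def add(a, b):
    """Return the sum of a and b.

    >>> add(1, 2)
    3
    """
    return a + b


class TestAdd(unittest.TestCase):
    # unittest groups tests as methods on a TestCase subclass
    def test_add_1_2_is_3(self):
        self.assertEqual(add(1, 2), 3)


# Run the unittest case and capture its report in memory.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestAdd)
unittest_result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print("unittest passed:", unittest_result.wasSuccessful())

# Run the example in the docstring; doctest prints nothing when it passes.
doctest.run_docstring_examples(add, {"add": add}, name="add")
```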




### Why Write Automated Tests?


While software developers often discuss automated testing in terms of quality control for gaining others' trust in our code in large projects, in practice, automated testing tools are used for a wide variety of development tasks that help speed up the development process.  Here are just a few examples, to show how pervasive automated testing is:

  1. **Checklist Automation**: Do you find yourself rerunning your code a lot to confirm that it still works?  If it takes too long, this slows down your workflow and breaks your creative flow.  For each check you need to do, write an automated test and free up some brain space!

  1. **Colleague Onboarding**: Is everything ready for your colleagues to make contributions to your code?  If they can run the automated tests successfully, then you can be confident that things are installed and ready on their machines.

  1. **Troubleshooting Time Reduction**: Each time your code breaks, does it take you a few minutes to work out where the problem is?  Unit tests turn those minutes into seconds, making it easier to quickly pin down why your code isn't working.

  1. **Design Tools for Tricky Algorithms**: Does a particular function feel more like a brain teaser than usual?  Write a few tests as you work on it, to free up the brain space that would otherwise go to checking the code.

  1. **Code Structuring Guidance**: Wondering if your code is still modular?  Writing tests is a great software architecture check--unit-testable code is modular code!

  1. **UX Guidance**: Wondering if your functions are intuitive to use for others? A great check is to write unit tests, and see if the test code is complicated.  Simple tests mean intuitive interfaces!

  1. **Getting Started Help**: Not even sure how to get started with a project, or what you really should do?  Write a test for a program you would *like* to write!  The test will fail, of course (because the code isn't written yet), but you'll then have a clear idea of what needs to be done, and in what order.

  1. **Bug Fixing**: A user is reporting a bug in your code?  Write an automated test to recreate the bug, so that the test fails when the bug is present.  Then, fix the code!  

  1. **Code Review Simplification**: Want to speed up code review when accepting contributions?  Require tests on new code! If you're happy with the tests, and they pass, then the code is likely already good to go.

  1. **Mentoring Aid**: Have a junior who wants to contribute, but isn't sure how?  Sit together with them and write some tests that they should get to pass.  That way they have feedback on their progress to their goal, and you get code that works!
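
The **Bug Fixing** workflow above might look like the sketch below. Everything in it is hypothetical: the `mean` function, the bug report (a crash on an empty list), and the decision that an empty list should give `0.0` are assumptions made up for illustration.

```python
def mean(values):
    """Hypothetical function from a bug report: it used to crash on an empty list."""
    if not values:  # the fix: handle the empty case explicitly
        return 0.0
    return sum(values) / len(values)


def test_mean_of_empty_list_is_zero():
    # Regression test: recreates the reported bug.  Before the fix above it
    # failed with a ZeroDivisionError; now it passes and guards against a relapse.
    assert mean([]) == 0.0


def test_mean_of_1_2_3_is_2():
    # Existing behaviour should be unchanged by the fix.
    assert mean([1, 2, 3]) == 2.0
```

Writing the failing test first gives you a clear signal for when the bug is actually gone.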
  
  

### Unit Tests vs "Other" Tests

We talk a lot about unit tests, because they are easiest to make, and in a big project they make up the vast majority of the automated tests, but there are a lot of different types of tests--which you choose just depends on your goals for that test.  Here are a few other options out there:

| Test Type | When you want to... | Example |  
| :-- | :-- | :-- |
| **Unit Test** | Check that a function or method works. | Check: Calling `predict([1, 2])` returns a transformed array `[3, 4]`. |
| **Property Test** | A type of unit test that checks that all calls to a function or method result in something with a desired property. | Check: Calling `predict()` always returns a 1D array of floats. |
| **Integration Test** | Check that a function, method, or class calls other functions or methods the way you expected. | Check: Calling `predict()` calls the OpenAI API with certain parameters. |
| **System Test** | Check that the whole project works on a high level.  | Check: when I run my pipeline on my data, I get the figures I want. |
| **Smoke Test** | Checks that nothing is crashing. | Check: When I run my script, it doesn't error out. |
| **Behavior Test** | Checks that the program works along the user's expectations | Check: when I press the `fit` button, I see a model fit on-screen. |
| **Snapshot Test** | Checks that the program still does the same thing it did yesterday.  | Check: my pipeline still produces the same figure it did last time I ran the test. |
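
To make a few rows of the table concrete, here is a sketch of a unit test, a property-style test, and a smoke test for the same function. The `normalize` function is a hypothetical stand-in, and the property-style test loops over a few hand-picked inputs instead of generating them.

```python
def normalize(values):
    """Hypothetical function under test: scale values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def test_normalize_1_3_is_quarter_and_three_quarters():
    # Unit test: one specific input, one specific expected output.
    assert normalize([1, 3]) == [0.25, 0.75]


def test_normalize_output_always_sums_to_one():
    # Property-style test: the exact output doesn't matter, only that it sums to 1.
    for values in ([1, 3], [2, 2, 4], [5]):
        assert abs(sum(normalize(values)) - 1.0) < 1e-9


def test_normalize_does_not_crash():
    # Smoke test: no assertion at all, just confirm that nothing raises.
    normalize([1, 2, 3])
```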

In this workshop, we'll focus on **Unit Testing** and **Property Testing**.  If there's interest, we can expand to other test types in future workshops.

---



## Workshop Agenda

| Minutes | Activity | Requirements |
| :-- | :-- | :-- |
| 0 - 30 | Review the Pre-Workshop Exercises, Discuss Automated Testing |  *Complete this Notebook before the Course* |
| 30 - 100 | Breakout Rooms: Test-Driven Development Exercise | *Be Familiar with PyTest* |
| 100 - 110 | Break | --- |
| 110 - 190 | Breakout Rooms: Add Tests to Own Projects | *Have Your Project in GitHub, and an Idea of Something to Check* |
| 190 - 210 | Mini-Retrospective | --- |



---

## Exercises

Writing automated tests is work--it doesn't come for free.  Through these exercises, we'll get familiar with the basics of the `pytest` framework and a few useful supplementary libraries, writing unit tests in a concise manner, so we can spend less time writing boilerplate test code and more time building our projects.

### Setup

To make it easy to write and run automated tests in a notebook, we're using the [`ipytest`](https://pypi.org/project/ipytest/) package.  Run the code below to get it set up.

In [1]:
# Run this to install the packages used in these exercises
%pip install pytest ipytest hypothesis numpy pandas 

Collecting pytest
  Downloading pytest-8.4.2-py3-none-any.whl.metadata (7.7 kB)
Collecting ipytest
  Downloading ipytest-0.14.2-py3-none-any.whl.metadata (17 kB)
Collecting hypothesis
  Downloading hypothesis-6.146.0-py3-none-any.whl.metadata (5.6 kB)
Collecting numpy
  Downloading numpy-2.3.4-cp312-cp312-win_amd64.whl.metadata (60 kB)
Collecting pandas
  Downloading pandas-2.3.3-cp312-cp312-win_amd64.whl.metadata (19 kB)
Collecting iniconfig>=1 (from pytest)
  Downloading iniconfig-2.3.0-py3-none-any.whl.metadata (2.5 kB)
Collecting pluggy<2,>=1.5 (from pytest)
  Using cached pluggy-1.6.0-py3-none-any.whl.metadata (4.8 kB)
Collecting sortedcontainers<3.0.0,>=2.1.0 (from hypothesis)
  Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl.metadata (10 kB)
Collecting pytz>=2020.1 (from pandas)
  Using cached pytz-2025.2-py2.py3-none-any.whl.metadata (22 kB)
Collecting tzdata>=2022.7 (from pandas)
  Using cached tzdata-2025.2-py2.py3-none-any.whl.metadata (1.4 kB)
Downloading pytest-8.4

In [3]:
import pytest
import ipytest

# Make the "%%ipytest" cell magic work and set options,
# or just call `ipytest.autoconfig()` without options to make it all work easily.
ipytest.config(
    magics=True, 
    defopts="auto", 
    addopts=[
        "-q",  # quiet output
        "-W", "ignore:Module already imported so cannot be rewritten:pytest.PytestAssertRewriteWarning",
    ],
    coverage=False
)

{'addopts': ['-q',
  '-W',
 'clean': '[Tt]est*',
 'coverage': False,
 'defopts': 'auto',
 'display_columns': 100,
 'magics': True,
 'raise_on_error': False,
 'rewrite_asserts': False,
 'run_in_thread': False}

### 1. Checking Test Code


Automated tests have three main parts: a **Test Runner**, **Test Functions**, and **Test Assertions**.

| Term | Code | Description |
| :-- | :-- | :-- |
| **Test Runner** | `%%ipytest` | The program that finds, runs, and records tests and their results. |
| **Test Function** | `def test_xxx():` | The function run by the test runner. Must start with `test_`, followed by a description of what is checked.  |  
| **Test Assertion** | `assert x` | The check itself.  If all checks in a test function pass without error, then the test is considered to have passed. |
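
Here is a minimal sketch of how a test function and its assertion fit together. The `count_words` function is hypothetical, and the arrange/act/assert comments are just one common way to lay out a test body, not something `pytest` requires.

```python
def count_words(text):
    """Hypothetical function under test."""
    return len(text.split())


def test_count_words_in_three_word_sentence_is_3():
    # Arrange: set up the input
    sentence = "tests save time"
    # Act: call the code being checked
    observed = count_words(sentence)
    # Assert: the optional message after the comma is shown if the check fails
    assert observed == 3, f"expected 3 words, got {observed}"
```

In a notebook, the test runner is added by putting `%%ipytest` on the first line of the cell that contains the test function.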


Let's start with pre-written tests, fixing small parts of the code to see how the test runner gives feedback on what it finds.


**Task**: One of the tests below has an error in it, and so the test is failing. Use the output from pytest to find the failing test and fix it so all tests pass.

In [9]:
%%ipytest

def test_sum_1_2_is_3():
    assert sum([1, 2]) == 3


def test_sum_2_3_is_5():
    assert sum([2, 3]) == 5

..                                                                                           [100%]
2 passed in 0.06s




**Task**: Below are three unit tests that check for three different types of things: a value, a type, and an error.  Edit the code so all tests do their intended checks successfully.
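
As a separate sketch of those three kinds of check, using built-ins other than `sum`: a value check uses `==`, a type check uses `isinstance`, and an error check uses the `pytest.raises` context manager. The optional `match` argument additionally checks the error message against a regular expression.

```python
import pytest


def test_int_of_string_42_is_42():
    # Value check
    assert int("42") == 42


def test_division_of_ints_is_a_float():
    # Type check
    assert isinstance(6 / 3, float)


def test_int_of_non_numeric_string_raises_valueerror():
    # Error check: the test passes only if the block raises a ValueError
    # whose message matches the given pattern.
    with pytest.raises(ValueError, match="invalid literal"):
        int("abc")
```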

In [21]:
from typing import Union
Union[int, float]  # static typing
hasattr(1, '__add__')  # structural typing

True

In [None]:
%%ipytest

def test_sum_3_4_is_7():
    assert sum([3, 4]) == 7

def test_sum_of_ints_is_an_int():
    assert isinstance(sum([3, 2]), int)

def test_sum_strings_a_b_raises_typeerror():
    with pytest.raises(TypeError):
        sum(['a', 'b'])

...                                                                                          [100%]
3 passed in 0.07s


### 2. Checking Equality of Numpy Arrays with `numpy.testing` and Pandas DataFrames with `pandas.testing`

Because we do a lot of data science work, we often check that arrays and dataframes are what we expected.  To make this check easier, many packages include a `testing` subpackage with special `assert_*()` functions that simplify writing tests on their data structures.  Not only do they make the code easier to write, they also usually give quite descriptive error messages when the tests fail, making troubleshooting simpler.  Let's try it out with `numpy` and `pandas`.
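
As a quick sketch of what these helpers look like in use, with made-up arrays: note that the check sits inside a `test_` function so the test runner can find it, and that `numpy.testing` functions take the computed value first and the expected value second, which is how the `ACTUAL` and `DESIRED` labels in a failure message are assigned.

```python
import numpy as np
import numpy.testing as npt


def test_elementwise_product_of_two_arrays():
    observed = np.array([1, 2, 3]) * np.array([2, 2, 2])
    expected = np.array([2, 4, 6])
    # Argument order is (actual, desired)
    npt.assert_array_equal(observed, expected)


def test_mean_of_tenths_is_close_to_0p2():
    observed = np.array([0.1, 0.2, 0.3]).mean()
    # For floats, compare within a tolerance instead of exactly
    npt.assert_allclose(observed, 0.2)
```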

**Task**: Write a unit test to check that the computed numpy array is the expected one.  When the test fails, use the error messages to fix the test so that it passes.

In [22]:
%%ipytest

import numpy as np
import numpy.testing as npt
# npt.assert_array_equal()

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
expected = np.array([5, 7, 8])
observed = a + b
npt.assert_array_equal(expected, observed)


AssertionError: 
Arrays are not equal

Mismatched elements: 1 / 3 (33.3%)
Max absolute difference among violations: 1
Max relative difference among violations: 0.11111111
 ACTUAL: array([5, 7, 8])
 DESIRED: array([5, 7, 9])


no tests ran in 0.04s


**Task**: Write a unit test to check whether the two methods below produce the same dataframes.  When the test fails, use the error messages to fix the test so that it passes.
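
For reference, here is a sketch of `assert_frame_equal` on two small made-up dataframes. By default it is strict about values, dtypes, and the order of rows and columns; the `check_like=True` option relaxes only the ordering part.

```python
import pandas as pd
import pandas.testing as pdt


def test_frames_with_the_same_data_are_equal():
    left = pd.DataFrame({"x": [1, 2], "y": [3.0, 4.0]})
    right = pd.DataFrame({"x": [1, 2], "y": [3.0, 4.0]})
    pdt.assert_frame_equal(left, right)


def test_column_order_can_be_ignored():
    left = pd.DataFrame({"x": [1, 2], "y": [3.0, 4.0]})
    right = left[["y", "x"]]  # same data, columns in a different order
    # check_like=True ignores the order of the index and the columns
    pdt.assert_frame_equal(left, right, check_like=True)
```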

In [None]:
# import pandera  # Data validation libraries

In [None]:
%%ipytest

import pandas as pd
import pandas.testing as pdt
# pdt.assert_frame_equal()

# Method one: from dictionary
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [10, 11, 12]})

# Method two: Stepwise DataFrame Mutation
df2 = pd.DataFrame()
df2['b'] = [10, 11, 12]
df2['a'] = [1, 3, 3]

pdt.assert_frame_equal(df1, df2)

### 3. Test Parameterization: Check More Cases with Less Code


It's valuable to check many different sets of inputs and confirm that the outputs are all correct; strange little bugs appear in many functions when certain values aren't what we expected.  But writing a function for every single set of inputs we want to check is needlessly verbose.  Parametrizing test functions makes that code more condensed, and pytest provides a decorator for doing this: `@pytest.mark.parametrize()`.
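
A small sketch of the decorator in use, with a made-up example around `str.upper`: each entry in `cases` becomes its own test, and wrapping an entry in `pytest.param(..., id=...)` is an optional way to give that test a readable name in the report.

```python
import pytest

cases = [
    pytest.param("abc", "ABC", id="lowercase"),
    pytest.param("Abc", "ABC", id="mixed-case"),
    pytest.param("", "", id="empty-string"),
]


# The first argument names the test function's parameters; the second lists
# the values to pass in, one entry per test case.
@pytest.mark.parametrize("text,expected", cases)
def test_upper_returns_uppercase_text(text, expected):
    assert text.upper() == expected
```

With three entries in `cases`, the test runner reports three separate results, so one failing case does not hide the others.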



**Task**: The code below uses PyTest's parametrization feature.  Without writing a new function, use that feature to add two more checks (a.k.a. "test cases") to the tests below, so that a total of 4 tests run:
  - `3 + 7 = 10`
  - `-2 + 3 = 1`

In [24]:
%%ipytest

cases = [
    [[1, 2], 3],
    [[2, 3], 5],
    [[-2, 6], 4],
]
@pytest.mark.parametrize('inputs,output', cases)
def test_sum_of_integers(inputs, output):
    assert sum(inputs) == output


...                                                                                          [100%]
3 passed in 0.05s


**Task**: Rewrite the three test functions below into a single test function, using `parametrize` to continue checking each case individually. Note that pytest includes an `approx()` function for helping check floats, since there are often little rounding errors with them.
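
To see why `approx()` is needed, here is a short sketch using the classic `0.1 + 0.2` case. The explicit tolerance in the second test is an arbitrary value chosen for illustration.

```python
import pytest


def test_0p1_plus_0p2_is_approximately_0p3():
    # Exact comparison fails because of binary floating-point rounding...
    assert 0.1 + 0.2 != 0.3
    # ...but comparison within a small default tolerance succeeds.
    assert 0.1 + 0.2 == pytest.approx(0.3)


def test_approx_with_explicit_tolerance_and_lists():
    # `abs` sets an absolute tolerance
    assert 3.14 == pytest.approx(3.14159, abs=0.01)
    # approx also works element by element on sequences
    assert [0.1 + 0.2, 0.2 + 0.4] == pytest.approx([0.3, 0.6])
```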

In [None]:
%%ipytest

def test_5p2_minus_2p1_is_3p1():
    assert 5.2 - 2.1 == pytest.approx(3.1)

def test_6p5_minus_1p7_is_4p8():
    assert 6.5 - 1.7 == pytest.approx(4.8)

def test_0p3_minus_0p2_is_0p1():
    assert 0.3 - 0.2 == pytest.approx(0.1)

### 4. Property Testing with `hypothesis`

There are also cases where you want to check a bunch of inputs to make sure that the code works correctly, but:
  -  you don't know exactly which inputs are the best to check,
  -  and you aren't sure exactly how to calculate the expected result,
  -  but you know what aspect of the result you want to check (i.e. "property" you want the result to have).

This is called "**Property Testing**", and the `hypothesis` library helps with that.  Just describe the inputs that should go in, and write your test, and it will check your code with a wide range of inputs!
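
As a minimal sketch of the idea, here are two properties of the built-in `sorted` function. The choice of properties is just an illustration: neither test says what the sorted list should be, only what must be true of it for any list of integers.

```python
from hypothesis import given
from hypothesis import strategies as st


# `st.lists(st.integers())` describes the inputs: lists of integers of any
# length, including the empty list.  Hypothesis generates many of them.
@given(st.lists(st.integers()))
def test_sorting_twice_is_the_same_as_sorting_once(values):
    once = sorted(values)
    assert sorted(once) == once


@given(st.lists(st.integers()))
def test_sorting_keeps_the_length(values):
    assert len(sorted(values)) == len(values)
```

When a property fails, Hypothesis reports a small example input that triggers the failure, which is usually a good starting point for debugging.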

**Task**: The test function below isn't checking what it is meant to check (as described by the function name), and so Hypothesis keeps finding sets of inputs that make the test fail.  Fix the inputs and the test function body, so the test is correct.

In [28]:
%%ipytest 

from hypothesis import given
from hypothesis import strategies as st
# For the curious, a full list of "strategy" functions (how hypothesis generates inputs): 
# https://hypothesis.readthedocs.io/en/latest/reference/strategies.html


@given(
    st.lists(st.integers(min_value=1), min_size=1),
)
def test_sum_of_positive_integers_always_a_positive_integer(inputs):
    assert sum(inputs) > 0


.                                                                                            [100%]
1 passed in 0.40s


**Task**: Have hypothesis generate `float` values in order to test the function below.
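
As a hint, here is a sketch of the `st.floats()` strategy on two unrelated, made-up properties. By default it can also generate `nan` and infinities, which break many arithmetic comparisons, so it is common to exclude them or to bound the range.

```python
from hypothesis import given
from hypothesis import strategies as st

# Exclude the special values that rarely behave like ordinary numbers.
finite_floats = st.floats(allow_nan=False, allow_infinity=False)


@given(finite_floats)
def test_abs_of_a_finite_float_is_never_negative(x):
    assert abs(x) >= 0


# Giving both bounds restricts generation to that closed interval.
@given(st.floats(min_value=0.0, max_value=1.0))
def test_square_of_a_fraction_stays_between_0_and_1(x):
    assert 0.0 <= x * x <= 1.0
```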

In [None]:
%%ipytest 

from hypothesis import given
from hypothesis import strategies as st

@given(
    # Put a strategy here for the first float
    # Put a strategy here for the second float
)
def test_sum_of_two_floats_is_always_equivalent_to_using_plus_operator(first, second):
    assert sum([first, second]) == pytest.approx(first + second)
