# Fake It 'Till You Mock It

Welcome to the Mock Tutorial notebook. This is the resource we will use both to learn and to test what we learned. This notebook is designed to be loaded into a Jupyter Lab instance that has `pytest` and `ipytest` installed.

**NOTE** I hope you had a chance to go through the preparation notebook and validate your environment. If you have not done so, and you have any problems, please use the time in the first Hands-On section to run through that notebook. It has validation instructions as well as some useful troubleshooting.

In a new virtual environment, do

```
$ pip install pytest ipytest jupyterlab
```

(If you were in the previous tutorial, we are using a different pytest/Jupyter integration,
after some issues with the previous one.)

When this is done, launch Jupyter

```
$ jupyter lab
```

Click on the upload icon, and upload this notebook.

The next step will be to load the `ipytest` Jupyter extension:

In [None]:
import ipytest
ipytest.autoconfig()

There should not be any output from this step. If an error occurred saying "module not found", make sure the virtual environment has `ipytest` installed.

## Recap

This section quickly covers the basics of unit tests, pytest, and mocks.

* Unit tests: functions that exercise production code, usually isolated from environment.
* pytest: a unit test *runner*.
* mocks: objects that stand in for the "real objects" production code uses to interact with the environment.

This tutorial uses Jupyter to run tests.
While this is different from how tests run in "real life"
(usually via Continuous Integration, or locally via tools like `tox`),
it is a good way to learn how to write tests.
Pytest is the most popular test runner, so we will learn to write unit tests that take advantage of pytest.

Unit tests are especially useful in making sure your code works under *extreme conditions*. Does your certificate rotation code do the right thing when the file system is full? Does your web service work when the connection to the backend disconnects a lot? How does your data analysis code handle corrupted inputs?

Those are things that are sometimes hard to test in real systems, but are *exactly* the things that make the difference between shipping high-quality code and finding yourself in the incident review meeting.

We will start by running a simple test, that will mostly check that our environment is properly configured:

In [None]:
%%run_pytest[clean]

from unittest import mock
import pytest

@pytest.mark.parametrize('value', [1, 2])
def test_something(value):
    thing = mock.MagicMock()
    thing.return_value = value
    assert thing() == value + 1

This test had a few things that will be in a lot of our examples:

* Creating a mock
* Parametrizing a test
* Failing tests intentionally to see what the assertion errors look like.

Parametrizing tests is useful to run the *same test* with *different parameters*.
This is useful, for example, to check that your code works with both positive and negative numbers.

**HANDS ON** 

This is our first hands-on exercise. The next few will be a little more challenging, but this one is just to make sure we have our environment set up. We will have 10 minutes to finish it. We will get back to the tutorial at 16:50 (US/Pacific).

Try to run it in your environment: go into the notebook, and

* Execute the `import ipytest` cell, by clicking in and pressing Shift-Enter
* Execute the `run_pytest` cell, by clicking in and pressing Shift-Enter.

You should get a failure that looks much like the following one:

```
FF                                                                          [100%]
==================================== FAILURES =====================================
________________________________ test_something[1] ________________________________

value = 1

    @pytest.mark.parametrize('value', [1, 2])
    def test_something(value):
        thing = mock.MagicMock()
        thing.return_value = value
>       assert thing() == value + 1
E       AssertionError: assert 1 == (1 + 1)
E        +  where 1 = <MagicMock id='139863682412256'>()

<ipython-input-24-628b0c93adc5>:8: AssertionError
________________________________ test_something[2] ________________________________

value = 2

    @pytest.mark.parametrize('value', [1, 2])
    def test_something(value):
        thing = mock.MagicMock()
        thing.return_value = value
>       assert thing() == value + 1
E       AssertionError: assert 2 == (2 + 1)
E        +  where 2 = <MagicMock id='139863674357552'>()

<ipython-input-24-628b0c93adc5>:8: AssertionError
============================= short test summary info =============================
FAILED tmp46_xden7.py::test_something[1] - AssertionError: assert 1 == (1 + 1)
FAILED tmp46_xden7.py::test_something[2] - AssertionError: assert 2 == (2 + 1)
2 failed in 0.03s
```

If you did not, then something might not be installed and configured properly. Check again that `pytest` and `ipytest` are properly installed in your virtual environment.


## Mocking Recap (16:50 US/Pacific)

Sometimes, using real objects is hard, ill-advised, or complicated.

For example, a `requests.Session` connects to real websites:
using it in your unittests invites a...lot...of problems.

In [None]:
from unittest import mock

"Mocks" are a unittest concept: they produce objects that are substitutes for the real ones.
There's a whole cottage industry that will explain that "mock", "fake", and "stub" are all subtly different.
We'll use all of those interchangeably.

In [None]:
regular = mock.MagicMock()

def do_something(o):
    return o.something(5)

do_something(regular)

Mocks have "all the methods". The methods will usually return another Mock.
This can be changed by assigning to `return_value`.

In [None]:
%%run_pytest[clean]

def do_something(o):
    return o.something() + 1

def test_something():
    obj = mock.MagicMock(name="an object")
    obj.something.return_value = 2
    assert do_something(obj) == 4
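
As a minimal sketch of the two behaviors described above (the function name `default_and_configured` is made up for illustration): calling an unconfigured method gives back another mock, while a method with `return_value` set gives back exactly that value.

```python
from unittest import mock

def default_and_configured():
    obj = mock.MagicMock(name="an object")
    # No configuration: the call succeeds and returns another MagicMock.
    default_result = obj.anything(5)
    # With return_value set, the call returns exactly that value.
    obj.something.return_value = 2
    configured_result = obj.something()
    return default_result, configured_result

default_result, configured_result = default_and_configured()
print(isinstance(default_result, mock.MagicMock))  # True
print(configured_result)                           # 2
```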

It is also possible to override the "magic methods".

In [None]:
%%run_pytest[clean]

from unittest import mock

def test_magic():
    a = mock.MagicMock()
    a.__str__.return_value = "an a"
    assert str(a) == "a"
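
The same pattern should work for other magic methods that `MagicMock` supports. Here is a small sketch, with the helper name `make_sized_mock` invented for this example, that configures both `__str__` and `__len__`:

```python
from unittest import mock

def make_sized_mock():
    a = mock.MagicMock()
    # Magic methods are configured like any other method: via return_value.
    a.__str__.return_value = "an a"
    a.__len__.return_value = 3
    return a

a = make_sized_mock()
print(str(a))  # an a
print(len(a))  # 3
```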

We can also make sure that a mock does not have "extra" methods or attributes by using a spec.

In [None]:
%%run_pytest[clean]

import httpx

def bad_http_code(client):
    ## TYPO: psot instead of post
    client.psot("https://example.com/upload-data", json=dict(a=1))

from unittest import mock

def test_bad_http_code():
    dummy_client = mock.MagicMock(spec=httpx.Client)
    bad_http_code(dummy_client)
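
To see the effect of `spec` without depending on `httpx`, here is a sketch using a hypothetical `Uploader` class (not part of any library). Methods that exist on the spec work as usual, while a misspelled one raises `AttributeError`:

```python
from unittest import mock

class Uploader:
    """Hypothetical class, used only to provide a spec."""
    def post(self, url):
        raise NotImplementedError("never called in this sketch")

def misspelled_call_raises():
    client = mock.MagicMock(spec=Uploader)
    # post() exists on Uploader, so the mock allows it.
    client.post("https://example.com/upload-data")
    try:
        # psot() does not exist on Uploader, so the mock rejects it.
        client.psot("https://example.com/upload-data")
    except AttributeError:
        return True
    return False

print(misspelled_call_raises())  # True
```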

**HANDS ON**: Fix this test! Only change the marked line to get the test to pass. We will have 5 minutes for this exercise. We will reconvene at 17:00 (US/Pacific).

In [None]:
%%run_pytest[clean]

import httpx
from unittest import mock

def analyze_website(client):
    ret = client.get("http://example.com").text
    return ret[10:15]

def test_analyze():
    client = mock.MagicMock(name="client", spec=httpx.Client)
    pass # fix this line
    assert analyze_website(client) == "hello"

## Mock Side Effect: More Interesting Behavior (17:00 US/Pacific)

Sometimes, having a `MagicMock` that returns the same thing every time doesn't cut it.
For example, we expect `sys.stdin.readline()` to return different values,
not the same value throughout the test.

The `side_effect` attribute allows finer-grained control over what a magic mock returns
than `return_value` does.

### Iterable

One of the things that can be assigned to `side_effect` is
an *iterable*, such as a sequence or a generator.

This is a powerful feature -- it allows controlling each call's return value,
with little code.

In [None]:
%%run_pytest[clean]

from unittest import mock

def test_values():
    different_things = mock.MagicMock()
    different_things.side_effect = [1, 2, 3]
    assert different_things() == 1
    assert different_things() == 2
    assert different_things() == 4
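
One detail worth knowing: when the iterable runs out, the next call raises `StopIteration`. The sketch below illustrates this; the helper `drain` and the `"exhausted"` marker are made up for the example.

```python
from unittest import mock

def drain(callable_mock, attempts):
    """Call the mock `attempts` times, noting when its values run out."""
    results = []
    for _ in range(attempts):
        try:
            results.append(callable_mock())
        except StopIteration:
            # Raised once every value in side_effect has been used.
            results.append("exhausted")
    return results

source = mock.MagicMock()
source.side_effect = ["first", "second"]
print(drain(source, 3))  # ['first', 'second', 'exhausted']
```

In a test, an unexpected `StopIteration` is usually a sign that the code called the mock more times than planned.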


A more realistic example is when simulating file input.
In this case, we want to be able to control what `readline` returns
each time to pretend it is file input.

In [None]:
%%run_pytest[clean]

from unittest import mock

def parse_three_lines(fpin):
    line = fpin.readline()
    name, value = line.split()
    modifier = fpin.readline().strip()
    extra = fpin.readline().strip()
    return {name: f"{value}/{modifier}+{extra}"}

from io import TextIOBase
    
def test_parser():
    filelike = mock.MagicMock(spec=TextIOBase)
    filelike.readline.side_effect = [
        "thing important\n",
        "a-little\n",
        "to-some-people\n"
    ]
    value = parse_three_lines(filelike)
    assert value == dict(thing="important/a-little+to-most-people")


**HANDS ON** Fix the following cell. Only change the line marked. We will have 10 minutes for this exercise, and reconvene at 17:20.

In [None]:
%%run_pytest[clean]

def read_markdown_header(fpin):
    ret_value = {}
    line = fpin.readline()
    if line != "---\n":
        raise ValueError("invalid", line)
    for i in range(100):
        line = fpin.readline()
        if line == "---\n":
            break
        name, value = line.split(": ")
        ret_value[name] = value ## Only change this line
    return ret_value

from io import TextIOBase
from unittest import mock

def test_parser():
    filelike = mock.MagicMock(spec=TextIOBase)
    filelike.readline.side_effect = [
        "---\n",
        "title: Name of Post\n",
        "author: Some One\n",
        "---\n"
    ]
    value = read_markdown_header(filelike)
    assert value == dict(title="Name of Post", author="Some One")

### Exception

It is also possible to assign an *exception* to the `side_effect` attribute.
This will cause the call to raise this exception.
Using this feature allows simulating edge conditions in the environment:
often precisely the ones that

* You care about
* Are hard to simulate realistically
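
As a sketch of the idea, consider the "file system is full" case mentioned earlier. The function `save_report` and its return strings are hypothetical, invented for this example. Filling a real disk in a test is impractical, but raising the matching `OSError` from a mock takes one line:

```python
import errno
from unittest import mock

def save_report(fpout, text):
    """Hypothetical production code: write text, report a full disk."""
    try:
        fpout.write(text)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            return "disk-full"
        raise
    return "saved"

def test_save_report_disk_full():
    fpout = mock.MagicMock()
    # Every call to write() now raises this exception.
    fpout.write.side_effect = OSError(errno.ENOSPC, "No space left on device")
    assert save_report(fpout, "hello") == "disk-full"

def test_save_report_ok():
    fpout = mock.MagicMock()
    assert save_report(fpout, "hello") == "saved"
```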

One popular case is network issues: as per Murphy's law, they will always happen at 4am causing a pager to go off,
and never at 10am when you are sitting at your desk. The following is based on real code I wrote to test a network service.

In this simplified example, the code returns the length of the response line, or a negative number if a timeout has been reached. The number differs depending on where in the protocol negotiation the timeout happened, which allows the code to distinguish "connection timeout" from "response timeout", for example. Testing this code against a real server is hard: servers try hard to avoid outages! You could fork the server C code and add some "chaos" or...just use `side_effect` and mock.

In [None]:
%%run_pytest[clean]
import socket

def careful_reader(sock):
    sock.settimeout(5)
    try:
        sock.connect(("some.host", 8451))
    except socket.timeout:
        return -1
    try:
        sock.sendall(b"DO THING\n")
    except socket.timeout:
        return -2
    fpin = sock.makefile()
    try:
        line = fpin.readline()
    except socket.timeout:
        return -3
    return len(line.strip())

from io import TextIOBase
from unittest import mock

def test_reader():
    sock = mock.MagicMock(spec=socket.socket)
    sock.connect.side_effect = socket.timeout("too long")
    assert careful_reader(sock) == 1

**HANDS ON** Get all tests to work. Only change the lines marked. Because this exercise is a bit more subtle, you get one test
that already works for free. Now you only have to get the others to pass! We will have 10 minutes for this exercise. We will reconvene at 17:35.

In [None]:
%%run_pytest[clean]
import socket

def careful_reader(sock):
    sock.settimeout(5)
    try:
        sock.connect(("some.host", 8451))
    except socket.timeout:
        return -1
    try:
        sock.sendall(b"DO THING\n")
    except socket.timeout:
        return -2
    fpin = sock.makefile()
    print(fpin.readline.side_effect)
    try:
        line = fpin.readline()
    except socket.timeout:
        return -3
    return len(line.strip())

from io import TextIOBase
from unittest import mock

def test_reader_1():
    # This test works
    sock = mock.MagicMock(spec=socket.socket)
    sock.connect.side_effect = socket.timeout("too long")
    assert careful_reader(sock) == -1
    
def test_reader_2():
    sock = mock.MagicMock(spec=socket.socket)
    # Add only one line
    assert careful_reader(sock) == -2
    
def test_reader_3():
    sock = mock.MagicMock(spec=socket.socket)
    # Add only one line
    assert careful_reader(sock) == -3

def test_reader_pos():
    sock = mock.MagicMock(spec=socket.socket)
    # Add only one line
    assert careful_reader(sock) == 10


**BREAK**

We are going to take a break. Get food, stretch your muscles, relax. We will get back in 10 minutes, at 17:45 US/Pacific.

### Callable (17:45 US/Pacific)

As mentioned, the above example was simplified: real network service test code should verify that the results it got were correct to validate that the server works correctly. This means doing a synthetic request and looking for a correct result. The mock object has to emulate that. It has to perform some computation on the inputs.

Trying to test such code without performing any computation is difficult. The tests tend to be too *insensitive* or too *flakey*. An insensitive test is one that does not fail in the presence of bugs. A flakey test is one that sometimes fails, even when the code is correct. Here, our code is incorrect. The insensitive test does not catch it, while the flakey test would sometimes fail even if the code were fixed!

(This code is loosely based on a memcache checker I built for work.)

In [None]:
%%run_pytest[clean]
import socket
import random

def yolo_reader(sock):
    sock.settimeout(5)
    sock.connect(("some.host", 8451))
    fpin = sock.makefile()
    order = [0, 1]
    random.shuffle(order)
    while order:
        if order.pop() == 0:
            sock.sendall(b"GET KEY\n")
            key = fpin.readline().strip()
        else:
            sock.sendall(b"GET VALUE\n")
            value = fpin.readline().strip()
    return {value: key} ## Woops bug, should be {key: value}
    
from io import TextIOBase
from unittest import mock
import pytest

def test_insensitive_test():
    sock = mock.MagicMock(spec=socket.socket)
    sock.makefile.return_value.readline.return_value = "interesting\n"
    assert yolo_reader(sock) == {"interesting": "interesting"}
    
@pytest.mark.parametrize("does_nothing", [1, 2, 3, 4, 5])
def test_flakey_test(does_nothing):
    sock = mock.MagicMock(spec=socket.socket)
    sock.makefile.return_value.readline.side_effect = ["key\n", "value\n"]
    assert yolo_reader(sock) == {"key": "value"}

The final option for getting results from a mock object is to assign a *callable object* to `side_effect`. When the mock is called, it simply calls `side_effect` with the same arguments and returns the result. Why not just assign a callable object directly to the attribute? Have patience, we'll get to that in the next part!

In this example, our callable object (just a function) will assign a `return_value` to the attribute of another object. This is not that uncommon. We are simulating the environment, and in a real environment, poking one thing often has an effect on other things.

In [None]:
%%run_pytest[clean]
import socket
import random

def yolo_reader(sock):
    sock.settimeout(5)
    sock.connect(("some.host", 8451))
    fpin = sock.makefile()
    order = [0, 1]
    random.shuffle(order)
    while order:
        if order.pop() == 0:
            sock.sendall(b"GET KEY\n")
            key = fpin.readline().strip()
        else:
            sock.sendall(b"GET VALUE\n")
            value = fpin.readline().strip()
    return {key: value}  # Fixed: the previous version returned {value: key}
    
from io import TextIOBase
from unittest import mock

def test_yolo_well():
    sock = mock.MagicMock(spec=socket.socket)
    def sendall(data):
        cmd, name = data.decode("ascii").split()
        if name == "KEY":
            sock.makefile.return_value.readline.return_value = "key\n"
        elif name == "VALUE":
            sock.makefile.return_value.readline.return_value = "value\n"
        else:
            raise ValueError("got bad command", name)
    sock.sendall.side_effect = sendall
    assert yolo_reader(sock) == {"key": "value"}
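
A smaller sketch of the same mechanism, in a made-up setting: the function `fake_lookup` and the `store` object are hypothetical. The callable receives whatever arguments the mock was called with, and its return value becomes the result of the call.

```python
from unittest import mock

def fake_lookup(key, default=None):
    """Hypothetical stand-in for a real key-value lookup."""
    table = {"a": 1, "b": 2}
    return table.get(key, default)

store = mock.MagicMock()
# Each call to store.lookup(...) is forwarded to fake_lookup(...).
store.lookup.side_effect = fake_lookup

print(store.lookup("a"))             # 1
print(store.lookup("z", default=0))  # 0
```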

**HANDS ON** Fix this test! This is not an easy exercise, but it will teach you about how mock objects work. We will have 10 minutes for this exercise. We will reconvene at 18:00.

In [None]:
%%run_pytest[clean]
import socket
import random

def echo_computer(sock, num):
    sock.settimeout(5)
    sock.connect(("some.host", 8451))
    fpin = sock.makefile()
    value = random.randint(0, 4)
    sock.sendall(f"ECHO {value}\n".encode("ascii"))
    response = fpin.readline().strip()
    if response != str(value):
        return 0
    return 5 * num
   
from io import TextIOBase
from unittest import mock

def test_echo_computer():
    sock = mock.MagicMock(spec=socket.socket)
    fakefile = sock.makefile.return_value
    def sendall(data):
        pass # Change only this line
    sock.sendall.side_effect = sendall
    assert echo_computer(sock, 2) == 10

## Mock call arguments: X-Ray for Code (18:00 US/Pacific)

When writing a unit test, you are "away" from the code, but trying to peer into its guts to see how it behaves.
The Mock object is your sneaky spy. After it gets into the production code, it records everything faithfully.
This is how you can find what your code does, and whether it is the right thing.

### Call counts

The simplest thing is to just make sure that the code is called the expected number of times.
The `.call_count` attribute records exactly that.

In [None]:
%%run_pytest[clean]

def get_values(names, client):
    ret_value = []
    cache = {}
    for name in names:
        if name not in cache:
            value = client.get(f"https://httpbin.org/anything/grab?name={name}").json()['args']['name']
            cache[name] = value
        ret_value.append(cache[name])
    return ret_value

def test_get_values():
    client = mock.MagicMock()
    client.get.return_value.json.return_value = dict(args=dict(name="something"))
    result = get_values(['one', 'One'], client)
    assert result == ['something', 'something']
    assert client.get.call_count == 1
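
Call counts are also handy for checking retry logic. The following is a sketch under assumptions: `retry` is a hypothetical helper, not taken from any library. It combines a `side_effect` list containing exceptions (which are raised rather than returned) with `.call_count`:

```python
from unittest import mock

def retry(func, attempts):
    """Hypothetical helper: call func until it succeeds or attempts run out."""
    for _ in range(attempts):
        try:
            return func()
        except ConnectionError:
            continue
    raise ConnectionError("all attempts failed")

def test_retry_stops_after_success():
    func = mock.MagicMock()
    # Two failures followed by a success.
    func.side_effect = [ConnectionError("down"), ConnectionError("down"), "ok"]
    assert retry(func, attempts=5) == "ok"
    # Exactly three calls: no retries after the success.
    assert func.call_count == 3
```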

One benefit of checking `.call_count >= 1` as opposed to checking `.called` is that it is more resistant to silly typos.

In [None]:
%%run_pytest[clean]

def call_function(func):
    print("I'm going to call the function, really!")
    if False:
        func()
    print("I just called the function")

def test_call_function():
    # This test passes!
    func = mock.MagicMock()
    call_function(func)
    assert func.callled # TYPO -- Extra "l"
    
def test_call_function_carefully():
    # This test fails because it has a bug.
    func = mock.MagicMock()
    call_function(func)
    assert func.calll_count >= 1 # TYPO -- Extra "l"

def test_call_function_carefully_and_correctly():
    # This test fails because the function has a bug.
    func = mock.MagicMock()
    call_function(func)
    assert func.call_count >= 1 # We finally managed to not have a typo


Using `spec` diligently can prevent that. However, `spec` is not recursive. Even if the original mock object has a spec, rare is the test that makes sure that every single attribute it has *also* has a spec. However, using `.call_count` instead of `.called` is a simple hack that will completely eliminate the chance to make this error.
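
To make the difference concrete, here is a short sketch of what happens with each typo on a mock without a spec. The misspelled `callled` attribute is silently created as a new (truthy) mock, while comparing the misspelled `calll_count` with an integer raises `TypeError`:

```python
from unittest import mock

func = mock.MagicMock()

# The typo creates a brand new child mock, and mocks are truthy by default.
is_truthy = bool(func.callled)

# The typo here also creates a child mock, but mocks do not support
# ordering comparisons with integers by default, so this raises TypeError.
try:
    func.calll_count >= 1
    comparison_failed = False
except TypeError:
    comparison_failed = True

print(is_truthy)          # True
print(comparison_failed)  # True
```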

### Call arguments

In the following example, we want to make sure the code calls the method with the *correct* arguments.
When automating data center manipulations, it is important to get things right.
As they say, "To err is human, but to destroy an entire data center requires a robot with a bug."

We want to make sure our Paramiko-based automation will correctly get the sizes of files, even when the file names have spaces in them.

In [None]:
%%run_pytest[clean]

def get_remote_file_size(client, fname):
    client.connect('ssh.example.com')
    stdin, stdout, stderr = client.exec_command(f"ls -l {fname}")
    stdin.close()
    results = stdout.read()
    errors = stderr.read()
    stdout.close()
    stderr.close()
    if errors != '':
        raise ValueError("problem with command", errors)
    return int(results.split()[4])

import pytest
from unittest import mock
import shlex

@pytest.mark.parametrize("fname", ["readme.txt", "a file"])
def test_file_size(fname):
    client = mock.MagicMock()
    client.exec_command.return_value = [mock.MagicMock(name=str(i)) for i in range(3)]
    client.exec_command.return_value[1].read.return_value = f"""\
    -rw-rw-r--  1 user user    123 Jul 18 20:25 {fname}
    """
    client.exec_command.return_value[2].read.return_value = ""
    result = get_remote_file_size(client, fname)
    assert result == 123
    [args], kwargs = client.exec_command.call_args
    assert shlex.split(args) == ["ls", "-l", fname]
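
The `call_args` attribute unpacks into a tuple of positional arguments and a dictionary of keyword arguments. A minimal sketch, where `notify` and the `sender` object are hypothetical names used only for illustration:

```python
from unittest import mock

def notify(sender, user, message):
    """Hypothetical production code: one positional, two keyword arguments."""
    sender.send(user, body=message, urgent=False)

def test_notify_arguments():
    sender = mock.MagicMock()
    notify(sender, "alice", "hello")
    # call_args describes the most recent call to the mock.
    args, kwargs = sender.send.call_args
    assert args == ("alice",)
    assert kwargs == {"body": "hello", "urgent": False}
```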

### List of call args

Sometimes, this is not enough. Some code calls functions repeatedly, and we need to test that *all* calls are correct.
The most sophisticated X-Ray we have is `.call_args_list` which gives the entire history of what happened to the callable.

For this example, we will pretend that the (*real*) remote calculator API only allows multiplying two numbers. To *cube* a number, that is, to calculate `x**3`, we need two calls to the service. For superstitious reasons, we want to always put the bigger number first: maybe someone told us that it is faster this way.

In [None]:
%%run_pytest[clean]

import httpx
from unittest import mock

def calculate_cube(client, base):
    square = int(client.get(f"https://api.mathjs.org/v4/?expr={base}*{base}").text) # x*x
    return int(client.get(f"https://api.mathjs.org/v4/?expr={base}*{square}").text) # x*x*x

def test_calculate_cube():
    client = mock.MagicMock(spec=httpx.Client)
    client.get.side_effect = [mock.MagicMock(text=str(x)) for x in [25, 125]]
    assert calculate_cube(client, 5) == 125
    assert client.get.call_count == 2
    squaring, cubing = client.get.call_args_list
    args, kwargs = squaring
    assert kwargs == {}
    assert args == tuple(["https://api.mathjs.org/v4/?expr=5*5"])
    args, kwargs = cubing
    assert kwargs == {}
    ## Make sure bigger number comes first!
    assert args == tuple(["https://api.mathjs.org/v4/?expr=25*5"])
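
When the exact sequence of calls is known in advance, the whole history can also be compared against a list of `mock.call` objects in one assertion. A sketch, with the hypothetical function `log_steps` standing in for production code:

```python
from unittest import mock

def log_steps(logger, steps):
    """Hypothetical production code: log each step with its number."""
    for number, step in enumerate(steps, start=1):
        logger.info("step %d: %s", number, step)

def test_log_steps_history():
    logger = mock.MagicMock()
    log_steps(logger, ["fetch", "parse"])
    # Each mock.call(...) describes the arguments of one expected call, in order.
    assert logger.info.call_args_list == [
        mock.call("step %d: %s", 1, "fetch"),
        mock.call("step %d: %s", 2, "parse"),
    ]
```

This style is compact, but it only fits when every argument of every call is deterministic.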

**HANDS ON** The code uses random, and currently the test is flakey. Fix it so that the test always passes. It is parametrized to run five times so that there is only a 1/32 chance that it will pass if it does not do the right thing. We will have 5 minutes for this exercise, and reconvene at 18:20.

In [None]:
%%run_pytest[clean]

import httpx
from unittest import mock
import pytest
import random

def calculate_fifth_power(client, base):
    square = int(client.get(f"https://api.mathjs.org/v4/?expr={base}*{base}").text)
    hypercube = int(client.get(f"https://api.mathjs.org/v4/?expr={square}*{square}").text)
    # random order
    args = [hypercube, base]
    random.shuffle(args)
    result = int(client.get(f"https://api.mathjs.org/v4/?expr={args[0]}*{args[1]}").text)
    return result

@pytest.mark.parametrize("does_nothing", [1, 2, 3, 4, 5])
def test_calculate_cube(does_nothing):
    client = mock.MagicMock(spec=httpx.Client)
    client.get.side_effect = [mock.MagicMock(text=str(x)) for x in [25, 625, 3125]]
    assert calculate_fifth_power(client, 5) == 3125
    assert client.get.call_count == 3
    squaring, hypercubing, final = client.get.call_args_list
    args, kwargs = squaring
    assert kwargs == {}
    assert args == tuple(["https://api.mathjs.org/v4/?expr=5*5"])
    args, kwargs = hypercubing
    assert kwargs == {}
    args, kwargs = final
    assert kwargs == {}
    [url] = args
    constant, expr = url.split("=", 1)
    assert constant == "https://api.mathjs.org/v4/?expr"
    assert expr == "625*5" # Change only this line

## Recap: Deep Dive into Mocks (18:20 US/Pacific)

Mocks have a *lot* of power. Like any powerful tool, using it improperly is a fast way to get into a big mess. But by properly using `.return_value`, `.side_effect`, and the various `.call*` attributes, it is possible to write the best sort of unit tests. A good unit test is one that

* Will fail in the presence of incorrect code
* Will pass in the presence of correct code

"Quality" is not binary. It exists on a spectrum. The *badness* of a unit test will be determined by:

* How many errors it lets pass ("missing alarms" AKA "false negatives" or, if you are a statistician, "type 2 errors")
* How many *correct* code changes it fails ("false alarms" AKA "false positives" or, if you are a statistician, "type 1 errors")

When using mocks, take the time to think about both metrics to evaluate whether this mock, and this unit test, will help or hinder you.