# Mock: A tool for unit testing

We all know the joys of unit testing. But how do you isolate the functionality you want to test when your function calls external resources, or when it uses computation-heavy steps such as machine-learning models?

Mocking code paths you don't want to test can be a good way to do this. The standard library's `unittest` package contains a module called `mock`, which supercharges (test-related) mockery. Its central object, the `Mock`, lets you avoid writing your own stubs, and helps you verify what is and isn't being called on the mock.

https://docs.python.org/3/library/unittest.mock.html

ChiPy monthly meeting, 10 March 2016<br>
Stephen Hoover<br>
@StephenActual<br>
https://github.com/stephen-hoover

In [1]:
from unittest import mock  # Standard library

# Use these for examples
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import base

## What is the Mock?
It is whatever you want it to be.

In [2]:
mymock = mock.Mock()

Ask it what attributes it has -- it will happily agree that everything exists, creating new Mock objects to return.

In [3]:
mymock.an_attribute

<Mock name='mock.an_attribute' id='4550270928'>

In [4]:
mymock.a_method()

<Mock name='mock.a_method()' id='4554675312'>

In [5]:
mymock.long.string.of.stuff.leading.to.snakes()

<Mock name='mock.long.string.of.stuff.leading.to.snakes()' id='4554678056'>

If you ask for the same thing twice, you get back the same Mock object.

In [6]:
mymock.an_attribute

<Mock name='mock.an_attribute' id='4550270928'>

You can store values inside the Mock.

In [7]:
mymock.create_something().value = 'spam'
result = mymock.create_something()
print(result.value)

spam
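Storing a value this way works because of caching: a Mock hands back the same child for the same attribute, and the same `return_value` for every call. A minimal sketch of that behavior (the attribute names are arbitrary):

```python
from unittest import mock

m = mock.Mock()

# Attribute access is cached: the same child Mock comes back each time.
same_attribute = m.an_attribute is m.an_attribute

# Calling a child returns its `return_value`, which is also one cached Mock.
same_result = m.create_something() is m.create_something.return_value

# So a value set on one call's result is visible on the next call's result.
m.create_something().value = 'spam'
stored = m.create_something().value
```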


You can tell it what to return when you call things as functions.

In [8]:
mymock.create_estimator.return_value = LogisticRegression()
model = mymock.create_estimator()
print(model)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)
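Return values can also be configured when the Mock is created, or in bulk later with `configure_mock`. A small sketch, using a plain string as a stand-in for a real estimator:

```python
from unittest import mock

# Configure a nested return value at construction time...
m = mock.Mock(**{'create_estimator.return_value': 'a fitted model'})
first = m.create_estimator()
second = m.create_estimator(C=10.0)  # arguments don't change what comes back

# ...or afterwards, possibly for several attributes at once.
m.configure_mock(**{'score.return_value': 0.75})
score = m.score()
```

The Mock returns the configured value no matter what arguments it receives; it just records them for later inspection.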


If you want a more opinionated mock, you can make it look like another object.

In [9]:
mock_lr = mock.Mock(spec=LogisticRegression)

In [10]:
isinstance(mock_lr, LogisticRegression)

True

In [11]:
mock_lr.fit

<Mock name='mock.fit' id='4554741520'>

In [12]:
mock_lr.not_a_method

AttributeError: Mock object has no attribute 'not_a_method'
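Note that `spec` only checks attribute names. If you also want call signatures checked, `mock.create_autospec` is one option. A sketch with a hypothetical `Classifier` class standing in for a real estimator:

```python
from unittest import mock


class Classifier:
    """Hypothetical stand-in for a real estimator class."""

    def fit(self, X, y):
        return self

    def predict_proba(self, X):
        return [[0.5, 0.5] for _ in X]


# `spec` checks names only: a call with the wrong arguments still passes.
loose = mock.Mock(spec=Classifier)
loose.fit()

# `create_autospec` also checks signatures on the methods.
strict = mock.create_autospec(Classifier, instance=True)
try:
    strict.fit()  # missing X and y
    signature_checked = False
except TypeError:
    signature_checked = True

strict.fit([[0.0]], [1])  # matches the real signature, so this is fine
```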

Use the `side_effect` attribute to make the Mock return different values on successive calls, or raise an exception.

In [13]:
mymock.snake.side_effect = [0, 1, ValueError('No more snakes!')]

In [14]:
mymock.snake()

0

In [15]:
mymock.snake()

1

In [16]:
mymock.snake()

ValueError: No more snakes!
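`side_effect` can also be a function, which is handy when the return value should depend on the arguments. A short sketch (the `double` name is made up for illustration), which also shows what happens when a list runs out:

```python
from unittest import mock

m = mock.Mock()

# A callable side_effect receives the call's arguments; whatever it
# returns becomes the Mock's return value.
m.double.side_effect = lambda x: 2 * x
doubled = [m.double(1), m.double(21)]

# An iterable side_effect is consumed one item per call. Once it is
# exhausted, the next call raises StopIteration.
m.snake.side_effect = [0, 1]
snakes = [m.snake(), m.snake()]
try:
    m.snake()
    exhausted = False
except StopIteration:
    exhausted = True
```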

## The Mock in tests

Use the Mock to avoid running unnecessary external code in tests. Do we really need to verify that scikit-learn's logistic regression works in our unit tests?

In [17]:
def find_most_probable(estimator: base.ClassifierMixin, 
                       X: np.ndarray) -> int:
    yhat_proba = estimator.predict_proba(X)
    i_max = np.argmax(yhat_proba[:,1])
    return i_max

In [18]:
def test_most_probable():
    mock_lr = mock.Mock(spec=LogisticRegression)
    yhat_proba = np.array([[0.1, 0.9], [0.5, 0.5], [0.01, 0.99]])
    mock_lr.predict_proba.return_value = yhat_proba
    test_input = np.arange(3)
    i_max = find_most_probable(mock_lr, X=test_input)
    
    # Did we get the right answer?
    assert i_max==2
    
    # We should be calling `predict_proba` on the input...
    mock_lr.predict_proba.assert_called_once_with(test_input)
    
    # Not `predict`
    mock_lr.predict.assert_not_called()
    
    return True
test_most_probable()

True
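One thing to watch for in tests like this: the `assert_*` helpers are methods, and they only check something when you actually call them. A minimal sketch (the `predict` name is just for illustration):

```python
from unittest import mock

m = mock.Mock()
m.predict('x')

# Without parentheses this is only a reference to a bound method.
# Nothing is checked, so nothing can fail.
unchecked = m.predict.assert_not_called

# With parentheses the check runs, and fails because `predict` was called.
try:
    m.predict.assert_not_called()
    failed = False
except AssertionError:
    failed = True

# The recorded call is also available for direct inspection.
last_call = m.predict.call_args
```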

## Patching
Insert your Mock into other classes or modules!

In [19]:
def train_and_find_most_probable(estimator: base.ClassifierMixin,
                                 X: np.ndarray,
                                 y: np.ndarray) -> int:
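    # `sklearn.cross_validation` was removed in scikit-learn 0.20;
    # `model_selection` is its replacement.
    from sklearn import model_selection
    # Ask for class probabilities; the default returns class labels.
    yhat_proba = model_selection.cross_val_predict(
        estimator, X, y, method='predict_proba')
    i_max = np.argmax(yhat_proba[:,1])
    return i_max

@mock.patch('sklearn.model_selection')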
def test_train_and_find_most_probable(mock_validation):
    yhat_proba = np.array([[0.1, 0.9], [0.5, 0.5], [0.01, 0.99]])
    mock_validation.cross_val_predict.return_value = yhat_proba
    test_X, test_y = np.arange(3), np.zeros(3)
    i_max = train_and_find_most_probable(None, test_X, test_y)
    
    # Did we get the right answer?
    assert i_max==2
    
    assert mock_validation.cross_val_predict.call_count == 1
    
    return True

test_train_and_find_most_probable()

True
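`mock.patch` also works as a context manager, which makes it easy to see that the patch is undone afterwards. A self-contained sketch using `os.getcwd` from the standard library (the `describe_cwd` helper and the fake path are made up for this example):

```python
import os
from unittest import mock


def describe_cwd():
    # Looks up `os.getcwd` at call time, so patching `os.getcwd` affects it.
    return 'cwd is ' + os.getcwd()


# Inside the `with` block, `os.getcwd` is replaced by a MagicMock.
with mock.patch('os.getcwd', return_value='/fake/dir') as fake_getcwd:
    inside = describe_cwd()

# On exit the original function is restored.
outside = describe_cwd()
```

The target string names the place where the object is looked up, which is why the decorator form must point at the module the code under test actually reads from.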

## Is there such a thing as too much mockery?

In [20]:
import boto3
def upload_to_s3(fname: str, 
                 bucket: str, 
                 keypath: str,
                 credentials: dict=None):
    s3 = boto3.resource('s3', **(credentials or {}))
    
    if list(s3.Bucket(bucket).objects.filter(Prefix=keypath)):
        raise ValueError('Target location is not empty!')
    s3.meta.client.upload_file(fname, bucket, keypath)
    
@mock.patch('__main__.boto3')
def test_upload(mock_boto):
    upload_to_s3('local', 'mybucket', 'special/key', 
                 {'aws_secret_access_key': 'shhh'})

    assert len(mock_boto.mock_calls) == 6

    mock_boto.resource.assert_called_once_with(
        's3', aws_secret_access_key='shhh')

    s3 = mock_boto.resource()
    s3.Bucket.assert_called_once_with('mybucket')
    s3.Bucket().objects.filter.assert_called_once_with(
        Prefix='special/key')
    s3.meta.client.upload_file.assert_called_once_with(
        'local', 'mybucket', 'special/key')
    
    return True
    
test_upload()

True

What are you really testing at this point?
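One possible answer is to test only the behavior you care about, rather than mirroring every call. The sketch below assumes a hypothetical variant of the function, `upload_if_empty`, which receives the S3 resource as an argument so no patching is needed. It checks only the "don't overwrite" guard and whether an upload happens:

```python
from unittest import mock


def upload_if_empty(s3, fname, bucket, keypath):
    """Hypothetical variant of `upload_to_s3` that takes the S3 resource."""
    if list(s3.Bucket(bucket).objects.filter(Prefix=keypath)):
        raise ValueError('Target location is not empty!')
    s3.meta.client.upload_file(fname, bucket, keypath)


def test_refuses_to_overwrite():
    s3 = mock.Mock()
    # Pretend the bucket already holds an object under this prefix.
    s3.Bucket.return_value.objects.filter.return_value = ['existing-object']
    try:
        upload_if_empty(s3, 'local', 'mybucket', 'special/key')
    except ValueError:
        pass
    else:
        raise AssertionError('expected ValueError')
    s3.meta.client.upload_file.assert_not_called()
    return True


def test_uploads_when_empty():
    s3 = mock.Mock()
    s3.Bucket.return_value.objects.filter.return_value = []
    upload_if_empty(s3, 'local', 'mybucket', 'special/key')
    s3.meta.client.upload_file.assert_called_once_with(
        'local', 'mybucket', 'special/key')
    return True
```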