Inheriting from **collections.abc** for custom container types

*NB: see the notes on ABCs in concepts_A-L.pynb.* The collections module has some concrete classes that derive from ABCs; these can, of course, be subclassed further. In addition, the collections.abc submodule has ABCs that can be used to test whether a class or instance provides a particular interface, for example whether it is hashable or a mapping. By subclassing a built-in type such as list, you get all the standard functionality and semantics of a list and can add custom methods:

In [21]:
class FreqList(list):
    def __init__(self, items):
        super().__init__(items)
        
    def frequency(self):
        counts = {item: 0 for item in self}
        for item in self:
            counts[item] += 1
        return counts
    
foo = FreqList(['a', 'b', 'a', 'c', 'd', 'c'])
foo.frequency()

{'a': 2, 'b': 1, 'c': 2, 'd': 1}
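
The interface tests mentioned above can be sketched with isinstance and issubclass. This is a minimal illustration using only built-in types; the values are arbitrary.

```python
from collections.abc import Hashable, Mapping, Sequence

print(isinstance({}, Mapping))        # True
print(isinstance([], Mapping))        # False
print(isinstance((1, 2), Hashable))   # True
print(isinstance([1, 2], Hashable))   # False: list sets __hash__ to None
print(issubclass(list, Sequence))     # True
```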

Suppose you wanted to create an object that provides sequence semantics such as indexing. For this you would implement \__getitem\__, the method that is called when an object is indexed:

In [22]:
bar = [1, 2, 3]
bar[0]
# is interpreted as:
bar.__getitem__(0)

1

A custom container type needs a large number of methods to be implemented correctly. You may forget to implement e.g. the count or index methods, which would be expected on a sequence like a list or tuple. To avoid this problem, the collections.abc module defines a set of ABCs that describe the methods expected of each container type. If you subclass one of them, Python refuses to instantiate your class until its abstract methods are defined, and the ABC supplies the remaining methods (such as count and index for a Sequence) as mixins built on top of them. For example:

In [23]:
from collections.abc import Sequence

class BadType(Sequence):
    pass

foo = BadType()

TypeError: Can't instantiate abstract class BadType with abstract methods __getitem__, __len__
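
For contrast, here is a minimal sketch of a subclass that does define the two abstract methods. Playlist and its track names are hypothetical, chosen only for illustration.

```python
from collections.abc import Sequence


class Playlist(Sequence):
    """A read-only sequence of track names (hypothetical example class)."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    # The two abstract methods that Sequence requires:
    def __getitem__(self, index):
        return self._tracks[index]

    def __len__(self):
        return len(self._tracks)


playlist = Playlist(['intro', 'verse', 'chorus', 'verse'])

# count, index, __contains__, __iter__ and __reversed__ are inherited mixins
print(playlist.count('verse'))     # 2
print(playlist.index('chorus'))    # 2
print('intro' in playlist)         # True
print(list(reversed(playlist)))    # ['verse', 'chorus', 'verse', 'intro']
```

Only \__getitem\__ and \__len__ were written by hand; everything else comes from the Sequence ABC.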

In [12]:
from collections import defaultdict, Counter, deque, namedtuple
from pprint import pprint

colors = (('Luke', 'Yellow'), ('Mike', 'Green'),)

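# like dict, defaultdict accepts a sequence of (key, value) tuples as well as a dict
fav_colors = defaultdict(str, colors)

# assignment works as for a dict; reading a missing key would insert and
# return str() == '' instead of raising KeyError
fav_colors['John'] = 'Yellow'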
pprint(fav_colors)

defaultdict(<class 'str'>,
            {'John': 'Yellow',
             'Luke': 'Yellow',
             'Mike': 'Green'})
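
The behaviour of defaultdict on a missing key is easiest to see when reading rather than assigning. The following is a small sketch with made-up data that groups words by their first letter.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list() is called to create the value the first time a key is seen
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}

# Reading a missing key returns the default and also inserts the key
print(by_letter['z'])      # []
print('z' in by_letter)    # True
```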


In [15]:
import json

# a recursive factory: every missing key creates another nested defaultdict
tree = lambda: defaultdict(tree)
some_dict = tree()
some_dict['colours']['favourite'] = "yellow"
# json.dumps returns a JSON string representation of the Python object
print(json.dumps(some_dict))
pprint(some_dict)

{"colours": {"favourite": "yellow"}}
defaultdict(<function <lambda> at 0x000001BF6BA4C950>,
            {'colours': defaultdict(<function <lambda> at 0x000001BF6BA4C950>,
                                    {'favourite': 'yellow'})})


In [15]:
Counter('abcbab') # a dict sub-class 

Counter({'a': 2, 'b': 3, 'c': 1})

**deque** provides a double-ended queue: elements can be appended to and popped from either end. The number of items can also be capped with maxlen, in which case appending to a full deque discards an item from the opposite end:

In [16]:
d = deque(range(6))
[d.popleft(), d.pop()], d

([0, 5], deque([1, 2, 3, 4]))

In [17]:
d = deque(range(5), maxlen=5)
d.append(5)
d

deque([1, 2, 3, 4, 5])
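
One common use of a bounded deque is a sliding window. The sketch below assumes a hypothetical helper, moving_average, that keeps only the most recent `window` values.

```python
from collections import deque


def moving_average(values, window=3):
    """Return the average of each full window of consecutive values."""
    recent = deque(maxlen=window)   # the oldest value drops out automatically
    averages = []
    for value in values:
        recent.append(value)
        if len(recent) == window:
            averages.append(sum(recent) / window)
    return averages


print(moving_average([1, 2, 3, 4, 5]))  # [2.0, 3.0, 4.0]
```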

**namedtuple** gives a meaningful name to each position in a tuple, so that fields can be accessed by name as well as by index.

In [18]:
color = (55, 155, 255)
Color = namedtuple('Color', ['red', 'green', 'blue'])
named_color = Color(*color)
print(color[0], named_color.red)

55 55
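
A named tuple is still a tuple, so it unpacks and compares like one. The sketch below shows this along with the \_replace and \_asdict helper methods; the colour values are arbitrary.

```python
from collections import namedtuple

Color = namedtuple('Color', ['red', 'green', 'blue'])
teal = Color(red=0, green=128, blue=128)

# behaves like an ordinary tuple
red, green, blue = teal
print(teal == (0, 128, 128))     # True

# instances are immutable; _replace returns a modified copy
darker = teal._replace(green=64)
print(darker)                    # Color(red=0, green=64, blue=128)

# _asdict maps field names to values
print(dict(teal._asdict()))      # {'red': 0, 'green': 128, 'blue': 128}
```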


**Debugging and Profiling**

**Timeit**

This module provides a simple way to time small bits of Python code. It has both a command-line interface and a callable one. For example:

$ python3 -m timeit "'-'.join([str(n) for n in range(100)])"

10000 loops, best of 3: 55.4 usec per loop

In [1]:
def f(x):
    return x**2
def g(x):
    return x**4
def h(x):
    return x**8

import timeit
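# timeit.timeit() runs the statement 1,000,000 times by default and returns the total time in seconds
print(timeit.timeit('[func(42) for func in (f,g,h)]', globals=globals()))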

4.296773656118973
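
The callable interface also lets you control the number of executions and repeat the measurement. The sketch below uses timeit.repeat with arbitrary values for number and repeat. Timings vary by machine, so no output is shown.

```python
import timeit


def f(x):
    return x ** 2


# 5 separate timing runs, each executing the statement 10,000 times
timings = timeit.repeat('f(42)', globals={'f': f}, repeat=5, number=10_000)

# the minimum is usually the most useful figure: higher values mostly
# reflect interference from other processes rather than the code itself
best = min(timings)
print('{:.3f} usec per call'.format(best / 10_000 * 1e6))
```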


**Development Tools**



**unittest** 

The unittest module provides a rich set of tools for constructing and running tests.  It supports test automation, sharing of setup and shutdown code for tests, aggregation of tests into collections, and independence of the tests from the reporting framework. To achieve this, unittest supports some important concepts in an object-oriented way:

**Test fixture** represents the preparation needed to perform one or more tests, and any associated cleanup actions. This may involve, for example, creating temporary or proxy databases, directories, or starting a server process.

**Test case** is the individual unit of testing. It checks for a specific response to a particular set of inputs.

**Test suite** is a collection of test cases, test suites, or both. It is used to aggregate tests that should be executed together.

**Test runner** is a component which orchestrates the execution of tests and provides the outcome to the user.

The crux of each test is a call to assertEqual() to check for an expected result; assertTrue() or assertFalse() to verify a condition; or assertRaises() to verify that a specific exception gets raised. These methods are used instead of the assert statement so the test runner can accumulate all test results and produce a report.

In [20]:
import unittest

class TestStringMethods(unittest.TestCase):

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # check that s.split fails when the separator is not a string
        with self.assertRaises(TypeError):
            s.split(2)

if __name__ == '__main__':
    unittest.main()
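
The fixture, suite and runner concepts described above can be sketched together. TempDirTestCase is a hypothetical example: its fixture is a temporary directory created in setUp and removed in tearDown. The suite is built and run explicitly instead of through unittest.main().

```python
import io
import os
import shutil
import tempfile
import unittest


class TempDirTestCase(unittest.TestCase):
    def setUp(self):
        # fixture: runs before every test method
        self.workdir = tempfile.mkdtemp()

    def tearDown(self):
        # cleanup: runs after every test method, even if the test failed
        shutil.rmtree(self.workdir)

    def test_starts_empty(self):
        self.assertEqual(os.listdir(self.workdir), [])

    def test_can_create_file(self):
        path = os.path.join(self.workdir, 'data.txt')
        with open(path, 'w') as handle:
            handle.write('hello')
        self.assertTrue(os.path.exists(path))


# test suite: the collected test cases; test runner: executes them and reports
suite = unittest.TestLoader().loadTestsFromTestCase(TempDirTestCase)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=2).run(suite)
print(result.testsRun, result.wasSuccessful())  # 2 True
```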

In [None]:
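import unittest
import mymod  # placeholder for the module under test

class MyTestCase(unittest.TestCase):
    def test1(self):
        # assumes mymod defines SomeCoolException and that myfunc raises it
        self.assertRaises(mymod.SomeCoolException, mymod.myfunc)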

In [21]:
def broken_function():
    raise Exception('This is broken')

class MyTestCase(unittest.TestCase):
    def test(self):
        with self.assertRaises(Exception) as context:
            broken_function()

        # an exception is not a container, so test against its message string
        self.assertIn('This is broken', str(context.exception))

Passing the -v option to your test script will instruct unittest.main() to enable a higher level of verbosity. The unittest module can be used from the command line to run tests from modules, classes or even individual test methods.

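Tests can be skipped, either unconditionally or depending on a condition, by applying one of the following decorators to individual tests (expected failures are marked separately with @unittest.expectedFailure):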

    @unittest.skip("demonstrating skipping")

    @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
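
A minimal sketch of these decorators in use is given below. SkippingTestCase and its test names are hypothetical. The number of skipped tests depends on the platform, so only the overall result is printed.

```python
import io
import sys
import unittest


class SkippingTestCase(unittest.TestCase):
    @unittest.skip("demonstrating skipping")
    def test_never_runs(self):
        self.fail("should not be reached")

    @unittest.skipUnless(sys.platform.startswith("win"), "requires Windows")
    def test_windows_only(self):
        self.assertTrue(sys.platform.startswith("win"))

    @unittest.expectedFailure
    def test_known_bug(self):
        # a failure here is recorded as expected, not as a test failure
        self.assertEqual(1, 2)


suite = unittest.TestLoader().loadTestsFromTestCase(SkippingTestCase)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)

for test, reason in result.skipped:
    print(test.id().split('.')[-1], '->', reason)
print(result.wasSuccessful())  # True
```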

Nose and py.test are third-party testing frameworks with a lighter-weight syntax for writing tests. For example:

    assert func(10) == 42

**unittest.mock** 

patch() acts as a function decorator, class decorator or a context manager. Inside the body of the function or with statement, the target is patched with a new object. When the function/with statement exits the patch is undone.
A common use is for mocking return values of a function being called within a function being tested, so that it can be tested in isolation.

    import unittest
    from unittest.mock import patch

    class MockingTestTestCase(unittest.TestCase):

        @patch('my_app.my_file.function_to_mock')
        def test_my_func(self, test_patch):
            test_patch.return_value = 'Mocked this'
            ...

        @patch('my_app.my_file.function_to_mock', return_value='Mocked this')
        def test_my_func_again(self, test_patch):
            ...

In doing so, the mocked function is replaced with a Mock object created by the decorator. When it is called, a Mock object returns its return_value attribute, which is by default a new Mock object but can easily be assigned within the unit test or passed to the Mock class constructor.

The basic principle is that you patch where an object is looked up, not necessarily the same place as where it is defined. An example is in testing a function which uses requests.get() and checking that an Exception is raised for a certain condition. This shows how the patch decorator allows you to perform side effects, including raising an exception when a mock is called:

    from requests.exceptions import HTTPError, RequestException

    # HTTPError is a subclass of RequestException, so assertRaises below catches it
    @patch('my_app.my_file.requests.get', side_effect=HTTPError)
    def test_my_func_raises_HTTPError(self, test_patch):
        ...
        with self.assertRaises(RequestException):
            self.my_func(url)
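
The same idea can be sketched in a self-contained form using only the standard library. Here fetch_status is a hypothetical function under test. It looks up urlopen through the urllib.request module at call time, so 'urllib.request.urlopen' is the target to patch. No network access takes place.

```python
import io
import unittest
import urllib.request
from unittest.mock import patch
from urllib.error import URLError


def fetch_status(url):
    """Hypothetical function under test: return the HTTP status, or None."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status
    except URLError:
        return None


class FetchStatusTestCase(unittest.TestCase):
    @patch('urllib.request.urlopen', side_effect=URLError('network down'))
    def test_returns_none_when_unreachable(self, mock_urlopen):
        # calling the mock raises URLError instead of opening a connection
        self.assertIsNone(fetch_status('http://example.invalid/'))
        mock_urlopen.assert_called_once_with('http://example.invalid/')

    def test_returns_status(self):
        with patch('urllib.request.urlopen') as mock_urlopen:
            # configure the object yielded by the "with" statement
            mock_urlopen.return_value.__enter__.return_value.status = 200
            self.assertEqual(fetch_status('http://example.invalid/'), 200)


suite = unittest.TestLoader().loadTestsFromTestCase(FetchStatusTestCase)
result = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
print(result.testsRun, result.wasSuccessful())  # 2 True
```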

Objects of the Mock class have the attributes 'called' and 'call_count' which give a boolean value of whether or not the mocked object was called and how many times it was called: 

In [6]:
from unittest.mock import Mock

mock = Mock(return_value=None)
a = mock.called
mock()
mock()
b = mock.called
c = mock.call_count
print(a, b, c)

False True 2
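
Beyond called and call_count, a Mock also records the arguments of each call. In this short sketch, notify is a hypothetical collaborator and the arguments are arbitrary.

```python
from unittest.mock import Mock, call

notify = Mock(return_value='sent')   # hypothetical collaborator
notify('alice', subject='hi')
notify('bob', subject='bye')

# assert_called_with checks the most recent call only
notify.assert_called_with('bob', subject='bye')

# call_args_list holds every call in order
print(notify.call_args_list == [call('alice', subject='hi'),
                                call('bob', subject='bye')])   # True

# a failed assertion raises AssertionError
try:
    notify.assert_called_with('carol')
except AssertionError:
    print('last call was not for carol')
```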


**Functools**

**Function caching**

Function caching stores the return values of a function keyed by its arguments. It can save time when an expensive or I/O-bound function is repeatedly called with the same arguments. The maxsize argument tells lru_cache how many of the most recent return values to keep:

In [9]:
from functools import lru_cache

@lru_cache(maxsize=32)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)

print([fib(n) for n in range(10)])
# uncache the return values
fib.cache_clear()

[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
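
To see what the cache is doing, a function decorated with lru_cache exposes cache_info(). The sketch below uses maxsize=None (an unbounded cache) so that no entries are evicted.

```python
from functools import lru_cache


@lru_cache(maxsize=None)   # None means the cache can grow without bound
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(20))             # 6765
info = fib.cache_info()
print(info)                # CacheInfo(hits=18, misses=21, maxsize=None, currsize=21)

fib.cache_clear()
print(fib.cache_info())    # CacheInfo(hits=0, misses=0, maxsize=None, currsize=0)
```

Each of fib(0) to fib(20) is computed once (21 misses); every other recursive call is answered from the cache.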


The **importlib** package provides an implementation of the import statement. Its import_module function imports a module from a name given as a string, which makes it possible, for example, to choose which module to import depending on a condition at run time:

    from importlib import import_module
    ...
    my_module = 'import_this'
    try: 
        import_module('a.b.{}'.format(my_module))
    except ImportError: 
        ... 
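
A runnable sketch of the same pattern: try a list of candidate module names and use the first one that imports. The helper load_first_available and the module name hypothetical_fast_json are made up for illustration; the latter is assumed not to be installed.

```python
from importlib import import_module


def load_first_available(names):
    """Return the first module in names that can be imported."""
    for name in names:
        try:
            return import_module(name)
        except ImportError:
            continue
    raise ImportError('none of {} could be imported'.format(names))


# 'hypothetical_fast_json' is a made-up name assumed not to exist,
# so the standard-library json module is used as the fallback
json_module = load_first_available(['hypothetical_fast_json', 'json'])
print(json_module.__name__)
print(json_module.dumps({'a': 1}))   # {"a": 1}
```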

**Itertools**

**itertools.chain**(\*iterables) is used for treating consecutive sequences as a single sequence. Equivalent to:

In [6]:
def chain(*iterables):
    # chain('ABC', 'DEF') --> A B C D E F
    for it in iterables:
        for element in it:
            yield element
            
[i + 1 for i in chain([1,2,3], [6,7,8])]

[2, 3, 4, 7, 8, 9]

**itertools.islice**(iterable, start, stop[, step]) creates an iterator over part of an *existing* iterable, whereas a normal slice of a list builds a copy of the selected items in memory. This avoids the extra memory and computation needed to create new lists:

In [8]:
import itertools 

# a real list, so that slicing it below does build a new list
data1 = list(range(10))

# This creates a NEW list
data1[2:5]

# This creates an iterator that iterates over the EXISTING list
itertools.islice(data1, 2, 5)

data2 = [1, 2, 3]
data3 = [4, 5, 6]

# This creates a NEW list
data2 + data3

# This creates an iterator that iterates over the EXISTING lists
itertools.chain(data2, data3)

<itertools.chain at 0x2189ba84048>
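
Because islice and chain are lazy, they also work on iterables that cannot be sliced or concatenated at all, such as generators and infinite iterators. In this short sketch, squares is a hypothetical generator.

```python
from itertools import chain, count, islice


def squares():
    """Yield 1, 4, 9, 16, ... without end."""
    for n in count(1):
        yield n * n


# items at positions 2, 3 and 4 of an infinite stream
print(list(islice(squares(), 2, 5)))              # [9, 16, 25]

# chain a finite sequence onto an infinite one, then take the first five items
print(list(islice(chain('ABC', count()), 5)))     # ['A', 'B', 'C', 0, 1]
```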