# Mutation Analysis

In the [chapter on coverage](Coverage.ipynb), we showed how one can identify which parts of a program are executed by a test suite, and hence get a sense of how effectively a set of test cases covers the program structure. However, is structural coverage a good measure of effectiveness? One problem with structural coverage measures is that they do not check whether the program executions generated by the test suite were actually correct. That is, for coverage, an execution that produces a wrong output that goes unnoticed by the test suite counts exactly the same as an execution that produces the right output. Indeed, if one deletes the assertions in a typical test case, the coverage of the new test suite does not change, yet the new test suite is much less useful than the original one.
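The point about deleted assertions can be checked with a minimal sketch. It assumes a toy function `classify()` and a small line tracer built on `sys.settrace()`; both are hypothetical stand-ins for illustration, not the coverage machinery from the coverage chapter.

```python
import sys

def classify(n):
    # Toy function under test (hypothetical, for illustration only).
    if n < 0:
        return 'negative'
    return 'non-negative'

def covered_lines(test):
    """Run `test` and return the line offsets within `classify` that were executed."""
    first = classify.__code__.co_firstlineno
    lines = set()

    def tracer(frame, event, arg):
        # Record only 'line' events that occur inside classify().
        if frame.f_code is classify.__code__ and event == 'line':
            lines.add(frame.f_lineno - first)
        return tracer

    old = sys.gettrace()
    sys.settrace(tracer)
    try:
        test()
    finally:
        sys.settrace(old)  # restore whatever tracer was active before
    return lines

def test_with_assertions():
    assert classify(-1) == 'negative'
    assert classify(1) == 'non-negative'

def test_without_assertions():
    # Same calls, but the results are never checked.
    classify(-1)
    classify(1)

same_coverage = (covered_lines(test_with_assertions) ==
                 covered_lines(test_without_assertions))
print(same_coverage)
```

Both test functions execute the same lines of `classify()`, so a coverage measure cannot tell them apart, even though only the first one would notice a wrong result.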

This is certainly not an optimal state of affairs. How can we verify that our tests are actually useful? One alternative (hinted at in the chapter on coverage) is to inject bugs into the program and evaluate how effective the test suite is at catching these injected bugs. However, that introduces another problem: how do we produce these bugs in the first place? Any manual effort is likely to be biased by the developer's preconceptions about where bugs are likely to occur and what effects they would have. Further, writing good bugs is likely to take a significant amount of time for a very indirect benefit. Hence, such a solution is not sufficient.

Mutation analysis offers an alternative solution. Its insight is to consider the probability of inserting a bug from the perspective of a programmer. If one assumes that each program element receives a sufficiently similar amount of attention, one can further assume that each token in the program has a similar probability of being incorrectly transcribed. Of course, the programmer will correct any mistakes that get detected by the compiler (or other static analysis tools). So the set of valid tokens that differ from the original and make it past the compilation stage is considered to be the set of possible _mutations_, which represent the _probable faults_ in the program. A test suite is then judged by its ability to detect (and hence prevent) such mutations. The proportion of mutants detected over all _valid_ mutants produced is taken as the _mutation score_.

In this chapter, we see how one can implement mutation analysis for Python programs. The mutation score obtained represents the ability of a program analysis tool to prevent faults, and can be used to judge static test suites, test generators such as fuzzers, and also static and symbolic execution frameworks.
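As a rough illustration of the mutation score, the following sketch mutates comparison operators in a toy function and reports the fraction of mutants that a test detects. The function `max_of()`, the `SWAPS` table, and both tests are assumptions made up for this example. The sketch also omits timeouts, which a mutant containing an infinite loop would require.

```python
import ast

# Toy program under test (hypothetical).
SRC = '''
def max_of(a, b):
    if a > b:
        return a
    return b
'''

# Assumed replacement table: each operator maps to one plausible "mistyped" variant.
SWAPS = {ast.Gt: ast.Lt, ast.Lt: ast.Gt, ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
         ast.GtE: ast.LtE, ast.LtE: ast.GtE}

class SwapComparison(ast.NodeTransformer):
    """Swap the first operator of the `target`-th comparison visited (0-based)."""
    def __init__(self, target):
        self.target = target
        self.seen = 0
        self.changed = False

    def visit_Compare(self, node):
        index = self.seen
        self.seen += 1
        if index == self.target and type(node.ops[0]) in SWAPS:
            node.ops[0] = SWAPS[type(node.ops[0])]()
            self.changed = True
        return self.generic_visit(node)

def mutants(src):
    """Yield one mutated version of max_of() per swappable comparison."""
    count = sum(isinstance(n, ast.Compare) for n in ast.walk(ast.parse(src)))
    for target in range(count):
        mutator = SwapComparison(target)
        tree = ast.fix_missing_locations(mutator.visit(ast.parse(src)))
        if mutator.changed:  # skip "mutants" identical to the original
            namespace = {}
            exec(compile(tree, '<mutant>', 'exec'), namespace)
            yield namespace['max_of']

def strong_test(f):
    assert f(1, 2) == 2
    assert f(2, 1) == 2

def weak_test(f):
    # Executes the same code but checks almost nothing.
    assert f(1, 2) is not None
    assert f(2, 1) is not None

def mutation_score(test, src):
    """Fraction of valid mutants for which `test` fails."""
    detected = total = 0
    for mutant in mutants(src):
        total += 1
        try:
            test(mutant)
        except Exception:  # any failure counts as a detection
            detected += 1
    return detected / total

print(mutation_score(strong_test, SRC), mutation_score(weak_test, SRC))
```

Here the single mutant replaces `a > b` with `a < b`. The strong test detects it, while the weak test does not, although both tests execute the same code.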

**Prerequisites**

* You need some understanding of how a program is executed.
* You should have read [the chapter on coverage](Coverage.ipynb).

In [None]:
def triangle_type(a, b, c):
    if a == b:
        if b == c:
            return 'Equilateral'
        else:
            return 'Isosceles'
    else:
        if b == c:
            return "Isosceles"
        else:
            if a == c:
                return "Isosceles"
            else:
                return "Scalene"

In [None]:
import types

In [None]:
def import_code(code, name):
    # Create an empty module and execute the given source inside its namespace.
    module = types.ModuleType(name)
    exec(code, module.__dict__)
    return module

In [None]:
import inspect

In [None]:
triangle = import_code(inspect.getsource(triangle_type), 'triangle_type')

In [None]:
triangle.triangle_type(1,1,1)

In [None]:
import unittest

In [None]:
class TestTriangle(unittest.TestCase):

    def test_equilateral(self):
        assert triangle.triangle_type(1,1,1) == 'Equilateral'

    def test_isosceles(self):
        assert triangle.triangle_type(1,2,1) == 'Isosceles'
        assert triangle.triangle_type(2,2,1) == 'Isosceles'
        assert triangle.triangle_type(1,2,2) == 'Isosceles'

    def test_scalene(self):
        assert triangle.triangle_type(1,2,3) == 'Scalene'

In [None]:
def suite(test_class):
    suite = unittest.TestSuite()
    for f in test_class.__dict__:
        if f.startswith('test_'):
            suite.addTest(test_class(f))
    return suite

In [None]:
runner = unittest.TextTestRunner(verbosity=0, failfast=True)
runner.run(suite(TestTriangle))

In [None]:
import fuzzingbook_utils

In [None]:
from SymbolicFuzzer import ArcCoverage

In [None]:
class ArcCoverage(ArcCoverage):
    def show_coverage(self, fn):
        src = inspect.getsource(fn)
        name = fn.__name__
        covered = set([lineno for method, lineno in self._trace if method == name])
        for i, s in enumerate(src.split('\n')):
            print('%s %2d: %s' % ('#' if i + 1 in covered else ' ', i + 1, s))

In [None]:
with ArcCoverage() as cov:
    print(suite(TestTriangle).run(unittest.TestResult()))

In [None]:
cov.show_coverage(triangle_type)

In [None]:
class WeakTestTriangle(unittest.TestCase):
    def test_equilateral(self):
        assert triangle.triangle_type(1,1,1) == 'Equilateral'

    def test_isosceles(self):
        assert triangle.triangle_type(1,2,1) != 'Equilateral'
        assert triangle.triangle_type(2,2,1) != 'Equilateral'
        assert triangle.triangle_type(1,2,2) != 'Equilateral'

    def test_scalene(self):
        assert triangle.triangle_type(1,2,3) != 'Equilateral'

In [None]:
with ArcCoverage() as cov:
    print(suite(WeakTestTriangle).run(unittest.TestResult()))

In [None]:
cov.show_coverage(triangle_type)

In [None]:
import ast

In [None]:
class StmtDeletionMutator(ast.NodeTransformer):
    def __init__(self, mutate_lst=None):
        self.count = 0
        self.mutate_lst = [] if mutate_lst is None else mutate_lst

    def specific_visitor(self, node):
        self.count += 1  # candidate statements are numbered starting from 1
        if self.count in self.mutate_lst:
            return ast.Pass()
        return self.generic_visit(node)

    def visit_Return(self, node): return self.specific_visitor(node)
    def visit_Delete(self, node): return self.specific_visitor(node)

    def visit_Assign(self, node): return self.specific_visitor(node)
    def visit_AnnAssign(self, node): return self.specific_visitor(node)
    def visit_AugAssign(self, node): return self.specific_visitor(node)

    def visit_Raise(self, node): return self.specific_visitor(node)
    def visit_Assert(self, node): return self.specific_visitor(node)

    def visit_Global(self, node): return self.specific_visitor(node)
    def visit_Nonlocal(self, node): return self.specific_visitor(node)

    def visit_Expr(self, node): return self.specific_visitor(node)

    def visit_Pass(self, node): return self.specific_visitor(node)
    def visit_Break(self, node): return self.specific_visitor(node)
    def visit_Continue(self, node): return self.specific_visitor(node)

In [None]:
def get_mutation_count(mysrc):
    sdc = StmtDeletionMutator()
    sdc.visit(ast.parse(mysrc))
    return sdc.count

In [None]:
get_mutation_count(inspect.getsource(triangle_type))

In [None]:
import astunparse

In [None]:
def generate_mutant(mysrc, mutate_lst):
    v = StmtDeletionMutator(mutate_lst).visit(ast.parse(mysrc))
    return astunparse.unparse(v)

In [None]:
import difflib

In [None]:
triangle_src = astunparse.unparse(ast.parse(inspect.getsource(triangle_type)))

In [None]:
mutant_1_4 = generate_mutant(triangle_src, [1, 4])

In [None]:
for i in difflib.unified_diff(triangle_src.split('\n'), mutant_1_4.split('\n'), fromfile='triangle', tofile='mutant_1_4', n=3):
    print(i)

In [None]:
from ExpectError import ExpectTimeout

In [None]:
def evalmutant(mname, mutant_src, test_module):
    test_module.__dict__[mname] = import_code(mutant_src, mname)
    with ExpectTimeout(1):
        # Returns a TestResult; if it is not wasSuccessful(), the mutant was detected.
        return test_module.runTest()
    return True  # reached only when the run was cut short by the timeout

In [None]:
def mytest(lst_locations, mname, msrc, test_module, log=False):
    try:
        mutant_src = generate_mutant(msrc, lst_locations)
        if log:
            for line in difflib.unified_diff(msrc.split('\n'),
                                          mutant_src.split('\n'),
                                          fromfile=mname,
                                          tofile="%s_%s"  % (
                                              mname, '_'.join([str(j) for j in lst_locations])),
                                          n=3):
                print(line)
        return evalmutant(mname, mutant_src, test_module)
    except SyntaxError:
        print('Syntax!', lst_locations)
        return None

In [None]:
def mutate(mname, msrc_, test_module, log=False):
    msrc = astunparse.unparse(ast.parse(msrc_))
    num_mutations = get_mutation_count(msrc)
    res = []
    for m in range(num_mutations):
        d = mytest([m+1], mname, msrc, test_module, log=log)
        if d is not None: # mutant with syntax error is not a valid mutant
            res.append(d)
        if log:
            print(d)
    total = len(res)
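    # A timed-out run yields a plain True rather than a TestResult; count it as detected.
    no_detection = len([r for r in res if hasattr(r, 'wasSuccessful') and r.wasSuccessful()])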
    score = ((total - no_detection)/total)
    return score

In [None]:
def runTest():
    return suite(TestTriangle).run(unittest.TestResult())

In [None]:
import sys

In [None]:
mutate('triangle', inspect.getsource(triangle_type), sys.modules[__name__], log=False)

In [None]:
def runTest():
    return suite(WeakTestTriangle).run(unittest.TestResult())

In [None]:
mutate('triangle', inspect.getsource(triangle_type), sys.modules[__name__], log=False)

In [None]:
import ControlFlow as cfg

In [None]:
gcd_src = """\
def gcd(a, b):
    if a<b:
        c: int = a
        a: int = b
        b: int = c

    while b != 0 :
        c: int = a
        a: int = b
        b: int = c % b
    return a
"""

In [None]:
class TestGCD(unittest.TestCase):

    def test_simple(self):
        assert cfg.gcd(1,0) == 1
        
    def test_mirror(self):
        assert cfg.gcd(0,1) == 1

In [None]:
def runTest():
    runner = unittest.TextTestRunner(verbosity=0, failfast=True)
    return runner.run(suite(TestGCD))

In [None]:
runTest()

In [None]:
get_mutation_count(gcd_src)

In [None]:
mutate('cfg', gcd_src, sys.modules[__name__], log=True)
