# Demo for SFLKit

## What is SFLKit

SFLKit is a workbench for statistical fault localization. It comes with the fundamental concepts of statistical debugging and spectrum-based fault localization.

You can use SFLKit out of the box through its command-line interface `sfl.py` or as a library, as we do in this demonstration. We designed SFLKit to be highly configurable and extensible with novel concepts.

To install SFLKit, execute:

In [1]:
!pip install .

Processing /Users/marius/Desktop/work/projects/sflkit
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: sflkit
  Building wheel for sflkit (setup.py) ... done
  Created wheel for sflkit: filename=sflkit-0.0.1-py3-none-any.whl size=36084 sha256=3905ae28e454f777a27417a62c8136a88f497b0b5256d0c9f97c3857558fc6f4
  Stored in directory: /private/var/folders/09/pt1hglws43n7fh5521n6zyyh0000gn/T/pip-ephem-wheel-cache-lhtf00j_/wheels/66/6f/12/7a57ca5f6f40197ae4f96b58c9e8b176ee261d1660520b4fbd
Successfully built sflkit
Installing collected packages: sflkit
  Attempting uninstall: sflkit
    Found existing installation: sflkit 0.0.1
    Uninstalling sflkit-0.0.1:
      Successfully uninstalled sflkit-0.0.1
Successfully installed sflkit-0.0.1


In [2]:
import enum
import importlib
import inspect
import os
import shutil

from IPython.display import HTML

from sflkit.color import ColorCode
from sflkit import instrument_config, analyze_config
from sflkit.config import Config

## A Faulty Program

First, we need a faulty program. We chose an implementation of the `middle(x, y, z)` function that returns the *middle* number of its three arguments. For example, `middle(1, 3, 2)` should return 2 because `1 < 2` and `2 < 3`. We introduced a fault in line 7 of this implementation: `m = y` should be `m = x`.

In [3]:
def middle(x, y, z):
    m = z
    if y < z:
        if x < y:
            m = y
        elif x < z:
            m = y  # bug
    else:
        if x > y:
            m = y
        elif x > z:
            m = x
    return m

Next, we introduce a class to capture the results of test runs. `TestResult` is an enum with two possible values, `PASS` and `FAIL`. `PASS` denotes a passing test case and `FAIL` a failing one.

In [4]:
class TestResult(enum.Enum):
    
    def __repr__(self):
        return self.value
    
    PASS = 'PASS'
    FAIL = 'FAIL'

Now we implement a test function that takes the three arguments of `middle(x, y, z)` and an expected result. This test function compares the return of `middle(x, y, z)` with the desired value and returns `PASS` if they match and `FAIL` otherwise.

In [5]:
def test_middle(x, y, z, expected):
    try:
        if middle(x, y, z) == expected:
            return TestResult.PASS
        else:
            return TestResult.FAIL
    except BaseException:
        return TestResult.FAIL

Let's check the results for some permutations of the numbers 1, 2, and 3. The expected value is 2 in all cases.

In [6]:
test_middle(3, 2, 1, expected=2)

PASS

In [7]:
test_middle(3, 1, 2, expected=2)

PASS

In [8]:
test_middle(2, 1, 3, expected=2)

FAIL

As you can see, the result of `middle(2, 1, 3)` does not match the expected value 2. Hence, we found a failing test case.
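To see how narrowly the fault is triggered, the following sketch checks all six orderings of 1, 2, and 3 against a reference. It is only an illustration: `buggy_middle` is a renamed copy of the faulty function so the block stands on its own, and `reference_middle` is a hypothetical oracle based on sorting, not part of SFLKit.

```python
from itertools import permutations


def buggy_middle(x, y, z):
    # Copy of the faulty middle() from this demo, renamed for self-containment.
    m = z
    if y < z:
        if x < y:
            m = y
        elif x < z:
            m = y  # bug
    else:
        if x > y:
            m = y
        elif x > z:
            m = x
    return m


def reference_middle(x, y, z):
    # Hypothetical oracle: the middle value is the second of the sorted triple.
    return sorted((x, y, z))[1]


# Collect every ordering on which the faulty function disagrees with the oracle.
failing_inputs = [
    args
    for args in permutations((1, 2, 3))
    if buggy_middle(*args) != reference_middle(*args)
]
print(failing_inputs)  # [(2, 1, 3)]
```

Only one of the six orderings exposes the fault, which is why the test suite needs to include `(2, 1, 3)` for the analysis to have a failing run to work with.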

## Instrument the Program

Next, we want to leverage SFLKit to find the location in the code that is most likely to contain the fault.

Let us first get the source of our function and write it to a file so that we have a file to instrument and analyze.

We leverage Python's `inspect` to get the source code.

In [11]:
source = inspect.getsource(middle)
print(source)

def middle(x, y, z):
    m = z
    if y < z:
        if x < y:
            m = y
        elif x < z:
            m = y  # bug
    else:
        if x > y:
            m = y
        elif x > z:
            m = x
    return m



We also define the file we write the source to and the Python file we will work on, namely `middle.py` and `tmp.py`, respectively.

In [12]:
middle_py = 'middle.py'
tmp_py = 'tmp.py'

We write the source code to `middle.py`.

In [13]:
with open(middle_py, 'w') as fp:
    fp.write(source)

Let's update our test function to import `middle(x, y, z)` from this module and run it.

In [14]:
def test_middle_import(x, y, z, expected):
    from middle import middle
    try:
        if middle(x, y, z) == expected:
            return TestResult.PASS
        else:
            return TestResult.FAIL
    except BaseException:
        return TestResult.FAIL

We repeat the tests to check that our setup works with the import.

In [15]:
test_middle_import(3, 2, 1, expected=2), test_middle_import(3, 1, 2, expected=2), test_middle_import(2, 1, 3, expected=2)

(PASS, PASS, FAIL)

The results match the earlier runs, so the import-based setup works.

### Configuring SFLKit

The `Config` class provides convenient access to `SFLKit` by defining the fundamental concepts we want to investigate.

The config needs a few pieces of information. First, it needs the path to the source we want to investigate, which we already have in `middle_py`. Next, it needs an output path for the instrumented file, `tmp_py`. We also need:

- The language of our subject, which is `'python'`.
- The predicates we want to investigate. Let's start with `'line'`.
- The evaluation metric for the predicates, i.e., the similarity coefficient. We use `'Tarantula'`.
- A list of the event files from passing and failing test runs used during the analysis.

In [16]:
language='python'
predicates='line'
metrics='Tarantula'
passing='event-files/0,event-files/1'
failing='event-files/2'

We define a function that gives us a `Config` object, so we do not need to create it manually every time we change something.

In [17]:
def get_config():
    return Config.config(path=middle_py, working=tmp_py, language=language, predicates=predicates, metrics=metrics, passing=passing, failing=failing)

Now we can define a function that instruments our subject. We leverage `SFLKit`'s `instrument_config()`, which takes a config we create with our `get_config()` and instruments the subject. The function can also print the content of the instrumented Python file.

In [18]:
def instrument(out=True):
    instrument_config(get_config())
    if out:
        with open(tmp_py, 'r') as fp:
            print(fp.read())

Now we instrument our `middle.py` subject and check the results.

In [19]:
instrument()

import sflkit.instrumentation.lib


def middle(x, y, z):
    sflkit.instrumentation.lib.add_line_event('middle.py', 2, 0)
    m = z
    sflkit.instrumentation.lib.add_line_event('middle.py', 3, 1)
    if y < z:
        sflkit.instrumentation.lib.add_line_event('middle.py', 4, 2)
        if x < y:
            sflkit.instrumentation.lib.add_line_event('middle.py', 5, 3)
            m = y
        else:
            sflkit.instrumentation.lib.add_line_event('middle.py', 6, 4)
            if x < z:
                sflkit.instrumentation.lib.add_line_event('middle.py', 7, 5)
                m = y
    else:
        sflkit.instrumentation.lib.add_line_event('middle.py', 9, 6)
        if x > y:
            sflkit.instrumentation.lib.add_line_event('middle.py', 10, 7)
            m = y
        else:
            sflkit.instrumentation.lib.add_line_event('middle.py', 11, 8)
            if x > z:
                sflkit.instrumentation.lib.add_line_event('middle.py', 12, 9)
                m = x
    

As you can see, the instrumentation added an import of a library that ships with `SFLKit` at the beginning, gluing the execution of the files together. Moreover, the instrumentation inserted a call to a library function in front of each executable line to track the executed lines.
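To make the idea of line instrumentation concrete, here is a minimal sketch that inserts a probe call before every statement using Python's standard `ast` module. It is not SFLKit's implementation: `add_probes` and `record_line` are hypothetical names chosen for this illustration, and the sketch only handles the nested `if`/`else` blocks that occur in `middle`.

```python
import ast

SOURCE = """\
def middle(x, y, z):
    m = z
    if y < z:
        if x < y:
            m = y
        elif x < z:
            m = y  # bug
    else:
        if x > y:
            m = y
        elif x > z:
            m = x
    return m
"""


def add_probes(statements):
    """Return the statements with a record_line(<lineno>) call before each one."""
    instrumented = []
    for stmt in statements:
        # Instrument nested blocks (the bodies of if/else) first.
        for field in ("body", "orelse"):
            nested = getattr(stmt, field, None)
            if isinstance(nested, list) and nested:
                setattr(stmt, field, add_probes(nested))
        # Build the expression statement `record_line(<lineno>)`.
        probe = ast.Expr(
            value=ast.Call(
                func=ast.Name(id="record_line", ctx=ast.Load()),
                args=[ast.Constant(value=stmt.lineno)],
                keywords=[],
            )
        )
        instrumented.append(probe)
        instrumented.append(stmt)
    return instrumented


tree = ast.parse(SOURCE)
for node in tree.body:
    if isinstance(node, ast.FunctionDef):
        node.body = add_probes(node.body)
ast.fix_missing_locations(tree)  # give the new probe nodes source positions

# Hypothetical stand-in for an event library: collect line numbers in a list.
executed_lines = []
namespace = {"record_line": executed_lines.append}
exec(compile(tree, "<instrumented>", "exec"), namespace)

namespace["middle"](2, 1, 3)
print(executed_lines)  # [2, 3, 4, 6, 7, 13]
```

As in the instrumented file shown above, each `elif` is probed as a statement of its own, because the parser represents it as an `if` nested in the `else` branch.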

## Get the Events

In [20]:
def test_tmp(x, y, z, expected): 
    import tmp
    importlib.reload(tmp)
    tmp.sflkit.instrumentation.lib.reset()
    try:
        if tmp.middle(x, y, z) == expected:
            return TestResult.PASS
        else:
            return TestResult.FAIL
    except BaseException:
        return TestResult.FAIL
    finally:
        tmp.sflkit.instrumentation.lib.dump_events()
        del tmp

In [21]:
event_files = 'event-files'

In [22]:
def run_tests():
    if os.path.exists(event_files):
        shutil.rmtree(event_files)
    os.mkdir(event_files)
    os.environ['EVENTS_PATH'] = os.path.join(event_files, '0')
    test_tmp(3, 2, 1, expected=2)
    os.environ['EVENTS_PATH'] = os.path.join(event_files, '1')
    test_tmp(3, 1, 2, expected=2)
    os.environ['EVENTS_PATH'] = os.path.join(event_files, '2')
    test_tmp(2, 1, 3, expected=2)

In [23]:
run_tests()

In [24]:
def analyze():
    run_tests()
    return analyze_config(get_config())

In [25]:
results = analyze()

In [26]:
results

{'LINE': {'Tarantula': [[middle.py:7]:1.0,
   [middle.py:6]:0.6666666666666666,
   [middle.py:4]:0.6666666666666666,
   [middle.py:13]:0.5,
   [middle.py:3]:0.5,
   [middle.py:2]:0.5,
   [middle.py:10]:0.0,
   [middle.py:9]:0.0]}}
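The scores above can be reproduced by hand. The sketch below assumes the standard Tarantula formula, `(failed / total_failed) / (failed / total_failed + passed / total_passed)`, and uses line coverage that was traced manually for the three tests; the names `coverage` and `tarantula` are illustrative and not part of SFLKit's API.

```python
# Lines of middle.py executed by each test, traced by hand.
coverage = {
    (3, 2, 1): {2, 3, 9, 10, 13},    # passing
    (3, 1, 2): {2, 3, 4, 6, 13},     # passing
    (2, 1, 3): {2, 3, 4, 6, 7, 13},  # failing
}
failing_tests = {(2, 1, 3)}


def tarantula(line):
    """Standard Tarantula suspiciousness of a line for the coverage above."""
    total_failed = len(failing_tests)
    total_passed = len(coverage) - total_failed
    failed = sum(
        1 for test, lines in coverage.items()
        if line in lines and test in failing_tests
    )
    passed = sum(
        1 for test, lines in coverage.items()
        if line in lines and test not in failing_tests
    )
    fail_ratio = failed / total_failed
    pass_ratio = passed / total_passed
    if fail_ratio + pass_ratio == 0:
        return 0.0  # line was never executed
    return fail_ratio / (fail_ratio + pass_ratio)


all_lines = sorted(set().union(*coverage.values()))
scores = {line: tarantula(line) for line in all_lines}

# Print the lines from most to least suspicious.
for line, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"middle.py:{line} {score:.4f}")
```

Line 7 is executed only by the failing test, so it receives the maximum score of 1.0, while lines 9 and 10 are executed only by a passing test and score 0.0.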

In [27]:
code = ColorCode(results['LINE']['Tarantula'])

In [28]:
HTML(code.code(middle_py, source, color=True, suspiciousness=True))

## Change the Analysis Object

In [None]:
predicates='def_use'

In [None]:
instrument(out=False)

In [None]:
results = analyze()

In [None]:
code = ColorCode(results['DEF_USE']['Tarantula'])

In [None]:
HTML(code.code(middle_py, source, color=True, suspiciousness=True))

## Change the Metric

In [None]:
metrics='Ochiai'

In [None]:
instrument(out=False)
results = analyze()

In [None]:
code = ColorCode(results['DEF_USE']['Ochiai'])

In [None]:
HTML(code.code(middle_py, source, color=True, suspiciousness=True))
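For intuition on how the metric changes the ranking, the following sketch computes Ochiai scores on the line coverage of the three tests, assuming the standard definition `failed / sqrt(total_failed * (failed + passed))`. It works on hand-traced line coverage rather than the def-use predicates analyzed above, and the names `coverage` and `ochiai` are illustrative, not SFLKit's API.

```python
import math

# Lines of middle.py executed by each test, traced by hand.
coverage = {
    (3, 2, 1): {2, 3, 9, 10, 13},    # passing
    (3, 1, 2): {2, 3, 4, 6, 13},     # passing
    (2, 1, 3): {2, 3, 4, 6, 7, 13},  # failing
}
failing_tests = {(2, 1, 3)}


def ochiai(line):
    """Standard Ochiai suspiciousness of a line for the coverage above."""
    total_failed = len(failing_tests)
    failed = sum(
        1 for test, lines in coverage.items()
        if line in lines and test in failing_tests
    )
    passed = sum(
        1 for test, lines in coverage.items()
        if line in lines and test not in failing_tests
    )
    denominator = math.sqrt(total_failed * (failed + passed))
    if denominator == 0:
        return 0.0  # line was never executed
    return failed / denominator


all_lines = sorted(set().union(*coverage.values()))
ochiai_scores = {line: ochiai(line) for line in all_lines}

# Print the lines from most to least suspicious.
for line, score in sorted(ochiai_scores.items(), key=lambda item: -item[1]):
    print(f"middle.py:{line} {score:.4f}")
```

On this small example, Ochiai orders the lines the same way Tarantula does, with line 7 on top, but the intermediate scores differ.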