# Tests4Py Statistical Fault Localization

This notebook demonstrates how to use Tests4Py for statistical fault localization.
We use the `middle_2` subject as the running example.

## Importing the API

To access the API, we import `tests4py.api` along with the `sfl` module, which provides the fault localization functions.

In [1]:
from pathlib import Path

import tests4py.api as t4p
from tests4py import sfl

tests4py :: INFO     :: Loading projects


## Retrieving the Subject

Next, we check out the subject so that we have a local copy to work with.

In [2]:
report = t4p.checkout(t4p.middle_2)

tests4py :: INFO     :: Copying https://github.com/smythi93/middle from /Users/marius/.t4p/projects/middle into /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/middle_2... 
tests4py :: INFO     :: Resetting git at /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/middle_2 to 029cb8beb7bfc0f2853dfa9504dcdfcc753b051e
tests4py :: INFO     :: Creating tmp location at /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/tmp_middle
tests4py :: INFO     :: Copying required files to /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/tmp_middle
tests4py :: INFO     :: Checkout buggy commit id eed99fa2741bd28744231dfcac0ea34679532bf9
tests4py :: INFO     :: Copying required files from /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/tmp_middle
tests4py :: INFO     :: Create info file
tests4py :: INFO     :: Copying resources for middle_2


We keep the location of the checked-out source from `report`, since the analysis step needs it later on.

In [3]:
src = report.location

## Instrumenting the Subject

Now we can instrument the subject so that it records the events needed for fault localization.
This step also installs the subject and all its dependencies in a virtual environment.
If the required Python version is not already installed, it is installed as well, which may take a while.

In [4]:
dst = Path("tmp", "sfl")
sfl.sflkit_instrument(dst, events="line")

tests4py :: INFO     :: Checking whether Tests4Py project
tests4py :: INFO     :: Loading projects
sflkit :: INFO     :: I found 10 events in /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/middle_2/src/middle/__init__.py.
sflkit :: INFO     :: I found 10 events in /Users/marius/Desktop/work/projects/Tests4Py/notebooks/tmp/middle_2.
tests4py :: INFO     :: Checking whether Tests4Py project
tests4py :: INFO     :: Loading projects
tests4py :: INFO     :: Checking for platform darwin
tests4py :: INFO     :: Check for activated python version
tests4py :: INFO     :: Using pyenv python 3.10.9
tests4py :: INFO     :: Activating virtual env
tests4py :: INFO     :: Run setup
tests4py :: INFO     :: Set compiled flag


{
    "command": "sfl",
    "subcommand": "instrument",
    "successful": true,
    "project": "middle_2"
}

## Executing the Relevant Tests

Next, we execute the relevant test cases on the instrumented subject to collect the fault localization information.

In [5]:
sfl.sflkit_get_events(work_dir_or_project=dst)

tests4py :: INFO     :: Checking whether Tests4Py project
tests4py :: INFO     :: Loading projects
tests4py :: INFO     :: Checking for platform darwin
tests4py :: INFO     :: Check for activated python version
tests4py :: INFO     :: Using pyenv python 3.10.9
tests4py :: INFO     :: Activating virtual env


{
    "command": "sfl",
    "subcommand": "events",
    "successful": true,
    "project": "middle_2"
}
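To make the collected information more concrete, the following minimal sketch shows how per-test line events can be aggregated into a coverage spectrum. It does not use the SFLKit API; the `runs` data and the `build_spectrum` helper are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical recorded runs: (set of executed line numbers, did the test pass?)
runs = [
    ({2, 3, 5, 6}, False),
    ({2, 3, 4}, True),
    ({2, 8, 9}, True),
]


def build_spectrum(runs):
    """Count, per line, how many failing and passing runs executed it."""
    spectrum = defaultdict(lambda: {"failed": 0, "passed": 0})
    for lines, passed in runs:
        key = "passed" if passed else "failed"
        for line in lines:
            spectrum[line][key] += 1
    return dict(spectrum)


spectrum = build_spectrum(runs)
```

A line that is executed mostly by failing runs, such as line 6 in this made-up data, is a natural candidate for a high suspiciousness score.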

## Analyzing the Fault Localization Information

Now we can leverage the fault localization information to analyze the subject and find the most likely faulty lines.

In [6]:
report = sfl.sflkit_analyze(work_dir_or_project=dst, 
                            src=src,
                            predicates="line")

tests4py :: INFO     :: Checking whether Tests4Py project
tests4py :: INFO     :: Loading projects


Now we retrieve the suggested lines based on the Ochiai metric.

In [7]:
from sflkit.analysis.analysis_type import AnalysisType
from sflkit.analysis.spectra import Spectrum

suggestions = report.analyzer.get_sorted_suggestions(
    src,
    type_=AnalysisType.LINE,
    metric=Spectrum.Ochiai,
)
suggestions

[[src/middle/__init__.py:6]:0.7071067811865475,
 [src/middle/__init__.py:5]:0.5773502691896258,
 [src/middle/__init__.py:3]:0.5,
 [src/middle/__init__.py:2]:0.4082482904638631,
 [src/middle/__init__.py:12, src/middle/__init__.py:4, src/middle/__init__.py:10, src/middle/__init__.py:8, src/middle/__init__.py:9]:0.0]
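For reference, the Ochiai score of a line is the number of failing runs that executed it, divided by the square root of the total number of failing runs times the number of all runs that executed it. The sketch below is a standalone implementation of that formula, not SFLKit's own code. The counts in the calls are assumptions: they are one combination consistent with the scores shown above (a single failing test), and the actual test counts are not stated in this notebook.

```python
import math


def ochiai(failed_executed, passed_executed, total_failed):
    """Ochiai suspiciousness: ef / sqrt(total_failed * (ef + ep))."""
    denominator = math.sqrt(total_failed * (failed_executed + passed_executed))
    # A line that no run executed gets a score of 0 instead of dividing by zero.
    return failed_executed / denominator if denominator else 0.0


# Assumed counts: one failing test overall, varying numbers of passing runs.
score_line_6 = ochiai(failed_executed=1, passed_executed=1, total_failed=1)
score_line_5 = ochiai(failed_executed=1, passed_executed=2, total_failed=1)
score_line_3 = ochiai(failed_executed=1, passed_executed=3, total_failed=1)
score_unhit = ochiai(failed_executed=0, passed_executed=2, total_failed=1)
```

Under these assumed counts, the fewer passing runs execute a line that the failing run also executes, the higher the line ranks.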

## Evaluation

Tests4Py can also report the actual faulty lines, i.e., the lines that were changed by the fix.

In [8]:
faulty_lines = t4p.get_faulty_lines(t4p.middle_2)
faulty_lines

[src/middle/__init__.py:6]

As you can see, the top-ranked suggestion is indeed the actual faulty line.

In [9]:
assert faulty_lines[0] in suggestions[0].lines
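Beyond checking the top suggestion, a ranking is often evaluated by how many lines a developer would have to inspect before reaching the fault. The sketch below computes a worst-case count in which every line tied with the faulty one is inspected first. It works on plain tuples rather than SFLKit's suggestion objects; the `groups` list mirrors the output above, and `worst_case_rank` is a hypothetical helper.

```python
# Ranking copied from the output above as (lines, score) groups, best first.
groups = [
    (["src/middle/__init__.py:6"], 0.7071067811865475),
    (["src/middle/__init__.py:5"], 0.5773502691896258),
    (["src/middle/__init__.py:3"], 0.5),
    (["src/middle/__init__.py:2"], 0.4082482904638631),
    (
        [
            "src/middle/__init__.py:12",
            "src/middle/__init__.py:4",
            "src/middle/__init__.py:10",
            "src/middle/__init__.py:8",
            "src/middle/__init__.py:9",
        ],
        0.0,
    ),
]


def worst_case_rank(groups, faulty_line):
    """Number of lines inspected until the faulty line is surely covered.

    Lines sharing a score are counted together, so ties count against us.
    Returns None if the faulty line was never suggested.
    """
    inspected = 0
    for lines, _score in groups:
        inspected += len(lines)
        if faulty_line in lines:
            return inspected
    return None


rank = worst_case_rank(groups, "src/middle/__init__.py:6")
```

For `middle_2` the faulty line sits alone in the first group, so a single inspected line suffices.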

## Summary

With its direct integration of fault localization tools, Tests4Py is a powerful tool for evaluating approaches
in this area.