
# Evaluating different fitness functions for EvoGFuzz

In our project we implement the three fitness functions given in the **EvoGFuzz** paper: a ```naive```, an ```improved``` and a ```sophisticated``` approach. We then came up with a new approach that builds on the ```sophisticated``` approach and aims to improve it. We call it the ```ratioed sophisticated``` approach. In this notebook we evaluate each of these approaches.


We use the same example as **EvoGFuzz** and therefore need to define our calculator, its oracle and the grammar.

In [1]:
import math

def calculator(inp: str) -> float:
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )

In [2]:
# Make sure you use the OracleResult from the debugging_framework library
from debugging_framework.input.oracle import OracleResult

def oracle(inp: str) -> OracleResult:
    # A ValueError (e.g. math.sqrt of a negative number) is the failure we look for.
    try:
        calculator(inp)
    except ValueError:
        return OracleResult.FAILING

    return OracleResult.PASSING
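
To illustrate which inputs the oracle classifies as failing, here is a minimal standalone sketch. It assumes a stand-in ```OracleResult``` enum (hypothetical, only so that the snippet runs without the debugging_framework library) and repeats the calculator and oracle from above.

```python
import math
from enum import Enum


class OracleResult(Enum):
    # Stand-in for debugging_framework's OracleResult (assumption for a standalone run).
    PASSING = "PASSING"
    FAILING = "FAILING"


def calculator(inp: str) -> float:
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )


def oracle(inp: str) -> OracleResult:
    try:
        calculator(inp)
    except ValueError:
        return OracleResult.FAILING
    return OracleResult.PASSING


# math.sqrt raises a ValueError for negative arguments; cos accepts them.
results = {inp: oracle(inp) for inp in ["sqrt(4)", "sqrt(-4)", "cos(-4)"]}
for inp, result in results.items():
    print(inp, result.name)
```

Within this grammar, only ```sqrt``` applied to a negative value should raise the ```ValueError``` the oracle looks for.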

In [3]:
from debugging_framework.types import Grammar
from debugging_framework.fuzzingbook.grammar import is_valid_grammar

CALCGRAMMAR: Grammar = {
    "<start>":
        ["<function>(<term>)"],

    "<function>":
        ["sqrt", "tan", "cos", "sin"],
    
    "<term>": ["-<value>", "<value>"], 
    
    "<value>":
        ["<integer>.<integer>",
         "<integer>"],

    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
}
    
assert is_valid_grammar(CALCGRAMMAR)
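
To get a feeling for the inputs this grammar describes, the following sketch expands it randomly. The ```expand``` helper and its ```max_depth``` cut-off are our own illustration and not part of EvoGFuzz or the debugging_framework; the grammar is repeated as a plain dict so that the snippet runs on its own.

```python
import random
import re

CALCGRAMMAR = {
    "<start>": ["<function>(<term>)"],
    "<function>": ["sqrt", "tan", "cos", "sin"],
    "<term>": ["-<value>", "<value>"],
    "<value>": ["<integer>.<integer>", "<integer>"],
    "<integer>": ["<digit><integer>", "<digit>"],
    "<digit>": ["1", "2", "3", "4", "5", "6", "7", "8", "9"],
}

NONTERMINAL = re.compile(r"<[^<> ]+>")


def expand(symbol: str, rng: random.Random, depth: int = 0, max_depth: int = 8) -> str:
    """Randomly expand `symbol` (hypothetical helper, for illustration only)."""
    alternatives = CALCGRAMMAR[symbol]
    if depth >= max_depth:
        # Past the cut-off, take the shortest alternative so the recursion ends.
        choice = min(alternatives, key=len)
    else:
        choice = rng.choice(alternatives)
    # Replace every nonterminal in the chosen alternative by its own expansion.
    return NONTERMINAL.sub(
        lambda m: expand(m.group(0), rng, depth + 1, max_depth), choice
    )


samples = [expand("<start>", random.Random(seed)) for seed in range(5)]
for sample in samples:
    print(sample)
```

Every generated string has the shape ```<function>(<term>)```, for example a function name applied to an optionally negative number built from the digits 1 to 9.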

For the new fitness functions we also need some helper functions that we use later on.

We start by defining a function ```count_expansions``` for the ```improved``` approach. Given the children of the root of a derivation tree, it returns the number of expansions in that tree.

We then define the ```calculate_sophisticated_score_structure``` function, which calculates a structure score that favors complex, deeply nested expansions. For every node we raise its degree (its number of children) to the power of its height, counted from the root, and sum up the results.

Finally, ```calculate_height_and_degreeSum``` computes the same sum and additionally returns the height of the tree, which we need for the ```ratioed sophisticated``` approach.

In [4]:
def count_expansions(children):
    if children == []:
        return 0

    counter = 1

    for child in children:
        _, next_children = child
        counter += count_expansions(next_children)    
    
    return counter
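
As a small worked example, the following sketch applies ```count_expansions``` to a hand-built derivation tree of ```sqrt(1)```. It assumes that trees are nested ```(symbol, children)``` tuples and that leaves carry an empty child list, which is what the helper's unpacking suggests; the helper is repeated so that the snippet runs on its own.

```python
def count_expansions(children):
    # Same helper as in the notebook cell above.
    if children == []:
        return 0
    counter = 1
    for child in children:
        _, next_children = child
        counter += count_expansions(next_children)
    return counter


# Hand-built derivation tree of "sqrt(1)" (assumed tuple layout).
tree = (
    "<start>",
    [
        ("<function>", [("sqrt", [])]),
        ("(", []),
        ("<term>", [("<value>", [("<integer>", [("<digit>", [("1", [])])])])]),
        (")", []),
    ],
)

_, root_children = tree
number_expansions = count_expansions(root_children)
# One expansion per node that has children:
# <start>, <function>, <term>, <value>, <integer>, <digit>
print(number_expansions)  # 6
```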

In [5]:
def calculate_sophisticated_score_structure(children, height):
    score = 0
    
    for child in children:
        node, next_children = child        
        score += len(next_children)**height
        score += calculate_sophisticated_score_structure(next_children, height+1)

    return score

In [6]:
def calculate_height_and_degreeSum(children, height):
    max_height = height
    score = 0
    
    for child in children:
        node, next_children = child
        score += len(next_children)**height
        next_score, next_height = calculate_height_and_degreeSum(next_children, height+1)
        score += next_score
        if next_height > max_height:
            max_height = next_height
            
    return score, max_height
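
The following sketch shows how nesting influences the score. It compares hand-built derivation trees of ```sqrt(1)``` and ```sqrt(-1.2)```, again assuming nested ```(symbol, children)``` tuples with empty child lists at the leaves. The ```leaf``` and ```integer``` builders are hypothetical helpers for this example only; the two score helpers are repeated so that the snippet runs on its own.

```python
def calculate_sophisticated_score_structure(children, height):
    score = 0
    for child in children:
        _, next_children = child
        score += len(next_children) ** height
        score += calculate_sophisticated_score_structure(next_children, height + 1)
    return score


def calculate_height_and_degreeSum(children, height):
    max_height = height
    score = 0
    for child in children:
        _, next_children = child
        score += len(next_children) ** height
        next_score, next_height = calculate_height_and_degreeSum(next_children, height + 1)
        score += next_score
        max_height = max(max_height, next_height)
    return score, max_height


def leaf(symbol):
    # Hypothetical builder: a terminal node without children.
    return (symbol, [])


def integer(digit):
    # Hypothetical builder: <integer> -> <digit> -> terminal digit.
    return ("<integer>", [("<digit>", [leaf(digit)])])


simple = ("<start>", [("<function>", [leaf("sqrt")]), leaf("("),
                      ("<term>", [("<value>", [integer("1")])]), leaf(")")])
nested = ("<start>", [("<function>", [leaf("sqrt")]), leaf("("),
                      ("<term>", [leaf("-"),
                                  ("<value>", [integer("1"), leaf("."), integer("2")])]),
                      leaf(")")])

score_simple = calculate_sophisticated_score_structure([simple], 0)
score_nested = calculate_sophisticated_score_structure([nested], 0)
degree_sum, height = calculate_height_and_degreeSum([nested], 0)
print(score_simple, score_nested)  # 6 17
print(degree_sum, height)          # 17 6
```

In the second tree, the ```<value>``` node has three children at height 2 and contributes 3**2 = 9 on its own, which is why the score grows from 6 to 17. The first component returned by ```calculate_height_and_degreeSum``` equals the sophisticated structure score.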

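Fitness functions:
Every fitness function adds a feedback score, which rewards inputs that the oracle classifies as failing, to a structure score, which rewards structural properties of the input. The ```naive``` approach uses the length of the input string as structure score. The ```improved``` approach uses the squared number of expansions divided by ```lam``` times the input length. The ```sophisticated``` approach uses the degree sum from above together with a feedback score of ```2**100```. Our ```ratioed sophisticated``` approach divides that degree sum by ```lam``` times the height of the derivation tree.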

In [7]:
from evogfuzz.input import Input

In [8]:
def naive_fitness_function(test_input: Input) -> int:
    score_structure = len(str(test_input))
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure

In [9]:
def improved_fitness_function(test_input: Input) -> float:
    _, children = test_input.tree
    number_expansions = count_expansions(children)
    lam = 100
    score_structure = (number_expansions**2)/(lam * len(str(test_input)))
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure

In [10]:
def sophisticated_fitness_function(test_input: Input) -> float:
    score_structure = calculate_sophisticated_score_structure([test_input.tree],0)
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 2**100
    else:
        score_feedback = 0
    return score_feedback + score_structure

In [11]:
def ratio_sophisticated_fitness_function(test_input: Input) -> float:
    lam = 2**50
    degreeSums, height = calculate_height_and_degreeSum([test_input.tree],0)
    score_structure = degreeSums/(lam * height)
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure
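
To get a feeling for the magnitudes involved, the following sketch evaluates the structure scores for the single input ```sqrt(-1.2)```. The tree statistics (8 expansions, degree sum 17, height 6) are values we worked out by hand with the helper functions above for a hand-built tree of this input, so treat them as assumptions of this example rather than output of EvoGFuzz.

```python
# Hand-computed statistics for the input "sqrt(-1.2)" (assumptions of this example).
input_length = len("sqrt(-1.2)")  # 10 characters
number_expansions = 8             # result of count_expansions
degree_sum = 17                   # result of calculate_height_and_degreeSum, first value
height = 6                        # result of calculate_height_and_degreeSum, second value

# Structure scores as defined in the four fitness functions.
naive_structure = input_length
improved_structure = (number_expansions ** 2) / (100 * input_length)
sophisticated_structure = degree_sum
ratio_structure = degree_sum / (2 ** 50 * height)

# The input is failing, so each function adds its feedback score.
naive_fitness = 100 + naive_structure
improved_fitness = 100 + improved_structure
sophisticated_fitness = 2 ** 100 + sophisticated_structure
ratio_fitness = 100 + ratio_structure

print(naive_fitness)     # 110
print(improved_fitness)  # 100.064
print(ratio_fitness)     # 100.0
```

For this input, the ```ratioed sophisticated``` structure score is so small that adding it to the feedback score of 100 does not change the floating-point result. The ```sophisticated``` fitness stays exact because Python integers have arbitrary precision.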

Finally, we define one EvoGFuzz instance per fitness function. For comparison, we start with an instance that uses the standard fitness function and then define one for each of ours.


In [12]:
from evogfuzz.evogfuzz_class import EvoGFuzz

def eval_fitness(eval_iterations,initial_inputs,iterations):
    """
    Evaluate the different fitness functions.
    :param eval_iterations: The number of evaluation runs over which the found exception inputs are summed up.
    :param initial_inputs: The inputs from which EvoGFuzz starts to train.
    :param iterations: The number of iterations EvoGFuzz trains.
    :return: For each fitness function, a dict with the total number of found exception inputs per iteration.
    """
    
    dict_stand = {i: 0 for i in range(iterations)}
    dict_naive = {i: 0 for i in range(iterations)}
    dict_impr = {i:0 for i in range(iterations)}
    dict_soph = {i:0 for i in range(iterations)}
    dict_ratio_soph = {i:0 for i in range(iterations)}
    
    for i in range(eval_iterations):
        epp_stand = EvoGFuzz(
            grammar=CALCGRAMMAR,
            oracle=oracle,
            inputs=initial_inputs,
            iterations=iterations
        )
        
        epp_naive = EvoGFuzz(
            grammar=CALCGRAMMAR,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=naive_fitness_function,
            iterations=iterations
        )
        
        epp_impr = EvoGFuzz(
            grammar=CALCGRAMMAR,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=improved_fitness_function,
            iterations=iterations
        )
        
        epp_soph = EvoGFuzz(
            grammar=CALCGRAMMAR,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=sophisticated_fitness_function,
            iterations=iterations
        )

        epp_ratio_soph = EvoGFuzz(
            grammar=CALCGRAMMAR,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=ratio_sophisticated_fitness_function,
            iterations=iterations
        )
        
        # Run each fuzzer and add its per-iteration findings to the matching totals.
        # Every loop iterates over the keys of its own result dict.
        found_exc_inp_stand = epp_stand.fuzz()
        for iteration in found_exc_inp_stand.keys():
            dict_stand[iteration] += len(found_exc_inp_stand[iteration])

        found_exc_inp_naive = epp_naive.fuzz()
        for iteration in found_exc_inp_naive.keys():
            dict_naive[iteration] += len(found_exc_inp_naive[iteration])

        found_exc_inp_impr = epp_impr.fuzz()
        for iteration in found_exc_inp_impr.keys():
            dict_impr[iteration] += len(found_exc_inp_impr[iteration])

        found_exc_inp_soph = epp_soph.fuzz()
        for iteration in found_exc_inp_soph.keys():
            dict_soph[iteration] += len(found_exc_inp_soph[iteration])

        found_exc_inp_ratio_soph = epp_ratio_soph.fuzz()
        for iteration in found_exc_inp_ratio_soph.keys():
            dict_ratio_soph[iteration] += len(found_exc_inp_ratio_soph[iteration])

    return dict_stand, dict_naive, dict_impr, dict_soph, dict_ratio_soph

In [13]:
def print_total_found_exc(dict_found_exc_inp, fitness):
    counter = 0
    for iteration in dict_found_exc_inp.keys():
        counter += dict_found_exc_inp[iteration]

    print(f"EvoGFuzz found {counter} bug-triggering inputs with the {fitness} fitness function!")

In [14]:
def print_output(dict_found_exc_inp, fitness):
    print(fitness, end="\n")
    for iteration in dict_found_exc_inp.keys():
        print(iteration, ":", dict_found_exc_inp[iteration], end="\n")
    print("\n\n")

In [17]:
initial_inputs = ['sqrt(1)', 'cos(912)', 'tan(4)']

dict_stand, dict_naive, dict_impr, dict_soph, dict_ratio_soph = eval_fitness(eval_iterations=2, initial_inputs=initial_inputs, iterations=20)

print_total_found_exc(dict_stand, "standard")
print_output(dict_stand, "standard")

print_total_found_exc(dict_naive, "naive")
print_output(dict_naive, "naive")

print_total_found_exc(dict_impr, "improved")
print_output(dict_impr, "improved")

print_total_found_exc(dict_soph, "sophisticated")
print_output(dict_soph, "sophisticated")

print_total_found_exc(dict_ratio_soph, "ratio_sophisticated")
print_output(dict_ratio_soph, "ratio_sophisticated")

EvoGFuzz found 3384 bug-triggering inputs with the standard fitness function!
standard
0 : 0
1 : 0
2 : 29
3 : 43
4 : 58
5 : 94
6 : 138
7 : 228
8 : 228
9 : 228
10 : 228
11 : 228
12 : 229
13 : 229
14 : 230
15 : 230
16 : 230
17 : 230
18 : 251
19 : 253



EvoGFuzz found 5106 bug-triggering inputs with the naive fitness function!
naive
0 : 0
1 : 0
2 : 55
3 : 153
4 : 209
5 : 282
6 : 282
7 : 282
8 : 282
9 : 282
10 : 289
11 : 289
12 : 289
13 : 291
14 : 295
15 : 298
16 : 316
17 : 373
18 : 403
19 : 436



EvoGFuzz found 495 bug-triggering inputs with the improved fitness function!
improved
0 : 0
1 : 9
2 : 26
3 : 26
4 : 26
5 : 26
6 : 26
7 : 26
8 : 26
9 : 26
10 : 26
11 : 26
12 : 26
13 : 26
14 : 26
15 : 26
16 : 26
17 : 26
18 : 26
19 : 44



EvoGFuzz found 4434 bug-triggering inputs with the sophisticated fitness function!
sophisticated
0 : 0
1 : 10
2 : 47
3 : 88
4 : 145
5 : 164
6 : 227
7 : 259
8 : 259
9 : 259
10 : 261
11 : 281
12 : 281
13 : 281
14 : 281
15 : 285
16 : 291
17 : 317
18 : 330
19 : 368
