
# Evaluating different fitness functions for EvoGFuzz

In our project we implement the three fitness functions proposed in the **EvoGFuzz** paper: a ```naive```, an ```improved``` and a ```sophisticated``` approach. We then devised a new approach that builds on the ```sophisticated``` one and aims to improve it. We call it the ```ratioed sophisticated``` approach. In this notebook we evaluate each of these approaches.


We use the same example as **EvoGFuzz** and therefore need to define our calculator, its oracle and the grammar.

In [58]:
import math

def calculator(inp: str) -> float:
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )
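
To make the failure modes concrete, the following sketch restates `calculator` and probes it with a few hand-picked inputs. The sample inputs and the helper `outcome` are our own illustrations and are not part of EvoGFuzz.

```python
import math

def calculator(inp: str) -> float:
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )

def outcome(inp: str) -> str:
    """Return the result of calculator(inp), or the name of the raised exception."""
    try:
        return repr(calculator(inp))
    except Exception as exc:
        return type(exc).__name__

# Illustrative inputs: two that evaluate normally and two that raise.
for sample in ["sqrt(4)", "cos(0)", "sqrt(-1)", "1 / 0"]:
    print(f"{sample!r}: {outcome(sample)}")
```

Any input for which `calculator` raises an exception counts as a bug-triggering input in the rest of this notebook.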

In [59]:
# Make sure you use the OracleResult from the debugging_framework library
from debugging_framework.input.oracle import OracleResult

def oracle(inp: str):
    try:
        calculator(inp)
    except Exception:
        return OracleResult.FAILING
    
    return OracleResult.PASSING

In [60]:
from debugging_framework.types import Grammar
from debugging_framework.fuzzingbook.grammar import is_valid_grammar

CALC_GRAMMAR: Grammar = {
    "<start>":
        ["<function>(<term>)"],

    "<function>":
        ["sqrt", "tan", "cos", "sin"],
    
    "<term>": ["-<value>", "<value>"], 
    
    "<value>":
        ["<integer>.<integer>",
         "<integer>"],

    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
}
    
assert is_valid_grammar(CALC_GRAMMAR)
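
To get a feeling for the inputs this grammar describes, here is a minimal sketch of a random expander. It is not the generator used by EvoGFuzz; `random_expand` and `NONTERMINAL` are hypothetical names introduced only for this illustration, and the grammar is copied so that the snippet runs on its own.

```python
import random
import re

# Copy of CALC_GRAMMAR so that the sketch is self-contained.
CALC_GRAMMAR = {
    "<start>": ["<function>(<term>)"],
    "<function>": ["sqrt", "tan", "cos", "sin"],
    "<term>": ["-<value>", "<value>"],
    "<value>": ["<integer>.<integer>", "<integer>"],
    "<integer>": ["<digit><integer>", "<digit>"],
    "<digit>": ["1", "2", "3", "4", "5", "6", "7", "8", "9"],
}

# Splits an expansion string into nonterminals and the text between them.
NONTERMINAL = re.compile(r"(<[^<> ]+>)")

def random_expand(symbol: str, grammar: dict, rng: random.Random) -> str:
    """Expand `symbol` by picking a random alternative at every step."""
    if symbol not in grammar:
        return symbol  # terminal text is emitted unchanged
    expansion = rng.choice(grammar[symbol])
    parts = [part for part in NONTERMINAL.split(expansion) if part]
    return "".join(random_expand(part, grammar, rng) for part in parts)

rng = random.Random(0)
samples = [random_expand("<start>", CALC_GRAMMAR, rng) for _ in range(5)]
print(samples)
```

Every generated string has the shape `function(number)` with an optional minus sign and an optional fractional part.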

In [61]:
EXPR_GRAMMAR:  Grammar = {
    "<start>":
        ["<expr>"],

    "<expr>":
        ["<term> + <expr>", "<term> - <expr>", "<term>"],

    "<term>":
        ["<factor> * <term>", "<factor> / <term>", "<factor>"],

    "<factor>":
        ["+<factor>",
         "-<factor>",
         "(<expr>)",
         "<integer>.<integer>",
         "<integer>"],

    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
}
assert is_valid_grammar(EXPR_GRAMMAR)

In [62]:
COMPL_CALC_GRAMMAR: Grammar = {
    "<start>":
        ["<expr>"],

    "<expr>":
        ["<term> + <expr>", "<term> - <expr>", "<term>"],

    "<term>":
        ["<factor> * <term>", "<factor> / <term>", "<factor>", "<function>(<term>)"],

    "<function>":
        ["sqrt", "tan", "cos", "sin"],

    "<factor>":
        ["+<factor>",
         "-<factor>",
         "(<expr>)",
         "<integer>.<integer>",
         "<integer>"],

    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
}
assert is_valid_grammar(COMPL_CALC_GRAMMAR)

For the new fitness functions we also need some helper functions that we use later on.

We start by defining a function ```count_expansions``` for the ```improved``` approach, which returns the number of expansions in a derivation tree when given the children of its root.

We then define the ```calculate_sophisticated_score_structure``` function, which calculates a structure score that rewards complex, deeply nested expansions: we sum up the degree of each node raised to the power of its height.

Our ```ratioed sophisticated``` approach divides by the maximum height of the tree, so we also define a function similar to the previous one that additionally returns the maximum height of the derivation tree. We named it ```calculate_height_and_degreeSum```. Finally, ```get_diff_expansions``` collects the set of distinct expansions that occur in a derivation tree.

In [63]:
def count_expansions(children):
    if children == []:
        return 0

    counter = 1

    for child in children:
        _, next_children = child
        counter += count_expansions(next_children)    
    
    return counter
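
As a worked example, the sketch below applies `count_expansions` to a hand-built derivation tree for `sqrt(1)` under `CALC_GRAMMAR`. We assume the usual representation of a tree as nested `(symbol, children)` tuples in which terminals have an empty list of children.

```python
def count_expansions(children):
    if children == []:
        return 0

    counter = 1

    for child in children:
        _, next_children = child
        counter += count_expansions(next_children)

    return counter

# Hand-built derivation tree for "sqrt(1)", written as (symbol, children) tuples.
tree = (
    "<start>", [
        ("<function>", [("sqrt", [])]),
        ("(", []),
        ("<term>", [("<value>", [("<integer>", [("<digit>", [("1", [])])])])]),
        (")", []),
    ],
)

_, root_children = tree
print(count_expansions(root_children))  # 6
```

The result matches the six nonterminals that are expanded: `<start>`, `<function>`, `<term>`, `<value>`, `<integer>` and `<digit>`.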

In [64]:
def calculate_sophisticated_score_structure(children, height):
    score = 0
    
    for child in children:
        node, next_children = child        
        score += len(next_children)**height
        score += calculate_sophisticated_score_structure(next_children, height+1)

    return score
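
The following sketch shows how branching deeper in the tree raises this score. The helpers `integer` and `call` are hypothetical and only build small `(symbol, children)` trees for `sqrt(1)` and `sqrt(1.2)`.

```python
def calculate_sophisticated_score_structure(children, height):
    score = 0
    for child in children:
        _, next_children = child
        score += len(next_children) ** height
        score += calculate_sophisticated_score_structure(next_children, height + 1)
    return score

def integer(digit):
    """Hypothetical helper: subtree for a one-digit <integer>."""
    return ("<integer>", [("<digit>", [(digit, [])])])

def call(value_children):
    """Hypothetical helper: tree for sqrt(<value>) with the given <value> children."""
    return ("<start>", [
        ("<function>", [("sqrt", [])]),
        ("(", []),
        ("<term>", [("<value>", value_children)]),
        (")", []),
    ])

flat = call([integer("1")])                                 # sqrt(1)
branching = call([integer("1"), (".", []), integer("2")])   # sqrt(1.2)

print(calculate_sophisticated_score_structure([flat], 0))       # 6
print(calculate_sophisticated_score_structure([branching], 0))  # 16
```

In the second tree `<value>` has three children at height 2 and contributes $3^2 = 9$ on its own. Because the root sits at height 0, its degree always contributes exactly 1.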

In [65]:
def calculate_height_and_degreeSum(children, height):
    max_height = height
    score = 0
    
    for child in children:
        node, next_children = child
        score += len(next_children)**height
        next_score, next_height = calculate_height_and_degreeSum(next_children, height+1)
        score += next_score
        if next_height > max_height:
            max_height = next_height
            
    return score, max_height
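
Note that, despite its name, the function returns the degree sum first and the maximum height second. A small sketch on a hypothetical tree with two branches of different depth (the symbols `<a>`, `<b>` and `<c>` are made up for illustration):

```python
def calculate_height_and_degreeSum(children, height):
    max_height = height
    score = 0

    for child in children:
        _, next_children = child
        score += len(next_children) ** height
        next_score, next_height = calculate_height_and_degreeSum(next_children, height + 1)
        score += next_score
        if next_height > max_height:
            max_height = next_height

    return score, max_height

# Hypothetical tree: the left branch has two levels, the right branch three.
tree = ("<start>", [
    ("<a>", [("x", [])]),
    ("<b>", [("<c>", [("y", [])])]),
])

degree_sum, max_height = calculate_height_and_degreeSum([tree], 0)
print(degree_sum, max_height)  # 4 4
```

The reported height is the number of node levels on the longest path from the root (here `<start>`, `<b>`, `<c>`, `y`), so it is at least 1 for any tree, which keeps the later division well defined.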

In [66]:
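def get_diff_expansions(children, exp_set):
    """Collect the distinct expansions found in the given subtrees into exp_set."""
    # An expansion is identified by the concatenated symbols of the children;
    # an empty list of children therefore yields the empty string.
    expansion = ""
    for child in children:
        node, _ = child
        expansion += node

    exp_set.add(expansion)

    # Recurse into every child; exp_set is mutated in place.
    for child in children:
        _, next_children = child
        get_diff_expansions(next_children, exp_set)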

    return exp_set
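
To see what ends up in the set, the sketch below runs a compact restatement of `get_diff_expansions` on a hand-built `(symbol, children)` tree for `sqrt(1)`. The tree is our own illustration.

```python
def get_diff_expansions(children, exp_set):
    # Concatenate the child symbols to identify this expansion.
    expansion = "".join(node for node, _ in children)
    exp_set.add(expansion)

    # Recurse into every child; exp_set is mutated in place.
    for _, next_children in children:
        get_diff_expansions(next_children, exp_set)

    return exp_set

tree = (
    "<start>", [
        ("<function>", [("sqrt", [])]),
        ("(", []),
        ("<term>", [("<value>", [("<integer>", [("<digit>", [("1", [])])])])]),
        (")", []),
    ],
)

expansions = get_diff_expansions([tree], set())
print(sorted(expansions))
print(len(expansions))  # 8
```

Since the function is called with the root wrapped in a list, the set also contains `<start>` itself, and every terminal contributes the empty string once.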

As mentioned before, the **EvoGFuzz** paper took a very simple approach to calculating the fitness of each input. The paper itself suggested three different functions that might improve the outcome in future work: a ```naive```, an ```improved``` and a ```sophisticated``` approach.

The ```naive``` approach simply takes the length of the input. To reward a failing input, we add 100 to the score. We implement this in the ```naive_fitness_function```.

For the ```improved``` approach we count the expansions of the derivation tree, square that count and divide it by the length of the input multiplied by a parameter $\lambda$. We reward failing inputs as before. This is implemented in the ```improved_fitness_function```.

The ```sophisticated``` approach rewards more complex expansions. For that we need the degree of each node, which is raised to the power of the node's height. As these fitness scores tend to get quite large, we reward failing inputs by adding $2^{100}$ instead of 100. This is implemented in the ```sophisticated_fitness_function```.
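
The short sketch below illustrates why a bonus of 100 would be too small here: the contribution of a single node grows exponentially with its height. The function `node_contribution` is a hypothetical helper, and the degree of 3 is chosen only for illustration.

```python
def node_contribution(degree: int, height: int) -> int:
    """Contribution of one node with `degree` children at the given height."""
    return degree ** height

for height in (1, 5, 10, 20):
    print(height, node_contribution(3, height))

# A node with three children exceeds a bonus of 100 at height 5 already.
print(node_contribution(3, 5) > 100)         # True
# The bonus of 2**100 still dominates such a node at height 63, but not at 64.
print(node_contribution(3, 63) < 2 ** 100)   # True
print(node_contribution(3, 64) > 2 ** 100)   # True
```

So the bonus of $2^{100}$ separates failing from passing inputs only as long as the derivation trees stay moderately deep.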

For our approach, we wanted to put the complexity of the expansions in relation to the height of the derivation tree. We calculate our score as follows:
$$
\text{score}_{\text{structure}}(x) = \frac{\sum_{v \in V} \deg(v)^{h(v)}}{\lambda \cdot h} \text{,}
$$
where $V$ is the set of nodes of the derivation tree of $x$, $h(v)$ is the height of node $v$ and $h$ is the maximum height of the tree. We reward a failing input the same way as in the first two approaches. It is implemented in the ```ratio_sophisticated_fitness_function```.
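
To get a sense of the magnitudes involved, the following sketch evaluates the formula with $\lambda = 2^{50}$, the value used in our implementation, for hypothetical tree statistics (a degree sum of 16 and a maximum height of 6). The helper `ratio_structure_score` is introduced only for this illustration.

```python
from fractions import Fraction

LAMBDA = 2 ** 50  # value of lam used in ratio_sophisticated_fitness_function

def ratio_structure_score(degree_sum: int, max_height: int, lam: int = LAMBDA) -> Fraction:
    """Structure score: degree sum divided by lambda times the maximum height."""
    return Fraction(degree_sum, lam * max_height)

# Hypothetical tree statistics, chosen only for illustration.
score = ratio_structure_score(degree_sum=16, max_height=6)

print(float(score))                 # a value on the order of 1e-15
print(score < 100)                  # True
print(100 + float(score) == 100.0)  # True in IEEE 754 double precision
```

For such small trees the structure score is far below the bonus of 100, and once the bonus is added in floating-point arithmetic the structure term can be rounded away entirely. Whether this matters in practice depends on how large the degree sums of the generated trees become.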

In [67]:
from evogfuzz.input import Input

In [68]:
def naive_fitness_function(test_input: Input) -> int:
    score_structure = len(str(test_input))
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure
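
One consequence of the fixed bonus is that a long passing input can outrank a short failing one. The sketch below shows this with stand-ins: `OracleResult` and `FakeInput` are hypothetical replacements for the `debugging_framework` and `evogfuzz` classes, so that the snippet runs without those libraries.

```python
from dataclasses import dataclass
from enum import Enum

class OracleResult(Enum):
    """Stand-in for debugging_framework's OracleResult (assumption)."""
    PASSING = "PASSING"
    FAILING = "FAILING"

@dataclass
class FakeInput:
    """Hypothetical stand-in for evogfuzz's Input: only text and oracle."""
    text: str
    oracle: OracleResult

    def __str__(self) -> str:
        return self.text

def naive_fitness_function(test_input) -> int:
    score_structure = len(str(test_input))
    score_feedback = 100 if test_input.oracle == OracleResult.FAILING else 0
    return score_feedback + score_structure

short_failing = FakeInput("sqrt(-1)", OracleResult.FAILING)
long_passing = FakeInput("cos(" + "1" * 120 + ")", OracleResult.PASSING)

print(naive_fitness_function(short_failing))  # 108
print(naive_fitness_function(long_passing))   # 125
```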

In [69]:
def improved_fitness_function(test_input: Input) -> float:
    _, children = test_input.tree
    number_expansions = count_expansions(children)
    lam = 100
    score_structure = (number_expansions**2)/(lam * len(str(test_input)))
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure
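
As a quick numeric check of the structure term, the sketch below compares two inputs of equal length. The helper `improved_structure_score` is hypothetical; the expansion counts (6 for `sqrt(1)` and 8 for `tan(12)`) were counted by hand under `CALC_GRAMMAR`.

```python
import math

def improved_structure_score(number_expansions: int, input_length: int, lam: int = 100) -> float:
    """Squared number of expansions divided by lambda times the input length."""
    return (number_expansions ** 2) / (lam * input_length)

# Both inputs are seven characters long but differ in their expansion count.
score_sqrt_1 = improved_structure_score(6, len("sqrt(1)"))
score_tan_12 = improved_structure_score(8, len("tan(12)"))

print(score_sqrt_1)
print(score_tan_12)
print(score_tan_12 > score_sqrt_1)  # True
```

With $\lambda = 100$ both values stay well below 1, so the bonus of 100 for failing inputs clearly dominates the ranking.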

In [70]:
def sophisticated_fitness_function(test_input: Input) -> int:
    score_structure = calculate_sophisticated_score_structure([test_input.tree],0)
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 2**100
    else:
        score_feedback = 0
    return score_feedback + score_structure

In [71]:
def ratio_sophisticated_fitness_function(test_input: Input) -> float:
    lam = 2**50
    degreeSums, height = calculate_height_and_degreeSum([test_input.tree],0)
    score_structure = degreeSums/(lam * height)
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure

In [72]:
def diff_expansions_fitness_function(test_input: Input) -> int:
    score_structure = len(get_diff_expansions([test_input.tree], set()))
    if test_input.oracle == OracleResult.FAILING:
        score_feedback = 100
    else:
        score_feedback = 0
    return score_feedback + score_structure

Finally, we can define EvoGFuzz instances for all the different fitness functions. For comparison, we start with an instance that uses the standard fitness function and then define one for each of ours.


In [73]:
from evogfuzz.evogfuzz_class import EvoGFuzz

def eval_fitness(eval_iterations, initial_inputs, iterations, grammar):
    """
    Evaluate the different fitness functions.
    :param eval_iterations: The number of evaluation runs over which the found exception inputs are summed up.
    :param initial_inputs: The inputs from which EvoGFuzz starts.
    :param iterations: The number of iterations each EvoGFuzz instance runs.
    :param grammar: The grammar passed to each EvoGFuzz instance.
    :return: One dict per fitness function, mapping each iteration to the total number of found exception inputs.
    """
    
    dict_stand = {i: 0 for i in range(iterations)}
    dict_naive = {i: 0 for i in range(iterations)}
    dict_impr = {i:0 for i in range(iterations)}
    dict_soph = {i:0 for i in range(iterations)}
    dict_ratio_soph = {i:0 for i in range(iterations)}
    dict_diff_exp = {i:0 for i in range(iterations)}
    
    for i in range(eval_iterations):
        epp_stand = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            iterations=iterations
        )
        
        epp_naive = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=naive_fitness_function,
            iterations=iterations
        )
        
        epp_impr = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=improved_fitness_function,
            iterations=iterations
        )
        
        epp_soph = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=sophisticated_fitness_function,
            iterations=iterations
        )

        epp_ratio_soph = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=ratio_sophisticated_fitness_function,
            iterations=iterations
        )
        
        epp_diff_exp = EvoGFuzz(
            grammar=grammar,
            oracle=oracle,
            inputs=initial_inputs,
            fitness_function=diff_expansions_fitness_function,
            iterations=iterations
        )
        
        found_exc_inp_stand = epp_stand.fuzz()
        for iteration in found_exc_inp_stand.keys():
            dict_stand[iteration] += len(found_exc_inp_stand[iteration])
        
        found_exc_inp_naive = epp_naive.fuzz()
        for iteration in found_exc_inp_naive.keys():
            dict_naive[iteration] += len(found_exc_inp_naive[iteration])
            
        found_exc_inp_impr = epp_impr.fuzz()
        for iteration in found_exc_inp_impr.keys():
            dict_impr[iteration] += len(found_exc_inp_impr[iteration])
            
        found_exc_inp_soph = epp_soph.fuzz()
        for iteration in found_exc_inp_soph.keys():
            dict_soph[iteration] += len(found_exc_inp_soph[iteration])
            
        found_exc_inp_ratio_soph = epp_ratio_soph.fuzz()
        for iteration in found_exc_inp_ratio_soph.keys():
            dict_ratio_soph[iteration] += len(found_exc_inp_ratio_soph[iteration])
            
        found_exc_inp_diff_exp = epp_diff_exp.fuzz()
        for iteration in found_exc_inp_diff_exp.keys():
            dict_diff_exp[iteration] += len(found_exc_inp_diff_exp[iteration])

    return dict_stand, dict_naive, dict_impr, dict_soph, dict_ratio_soph, dict_diff_exp
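
The aggregation step repeated above for each fitness function can be sketched in isolation. As the loops above imply, we assume that `fuzz()` returns a mapping from iteration index to the inputs found in that iteration; the helper `accumulate` and the two sample runs are hypothetical.

```python
def accumulate(totals: dict, found: dict) -> None:
    """Add the number of inputs found per iteration to the running totals."""
    for iteration, inputs in found.items():
        totals[iteration] += len(inputs)

iterations = 3
totals = {i: 0 for i in range(iterations)}

# Hypothetical results of two fuzzing runs: iteration -> found inputs.
run_1 = {0: ["sqrt(-1)"], 1: [], 2: ["sqrt(-2)", "sqrt(-3)"]}
run_2 = {0: [], 1: ["sqrt(-4)"], 2: ["sqrt(-5)"]}

for run in (run_1, run_2):
    accumulate(totals, run)

print(totals)  # {0: 1, 1: 1, 2: 3}
```

A helper of this kind would let the six nearly identical loops in `eval_fitness` share one implementation.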

In [74]:
def print_total_found_exc(dict_found_exc_inp, iterations, fitness):
    total = dict_found_exc_inp[iterations-1]
    print(f"EvoGFuzz found {total} bug-triggering inputs with the {fitness} fitness function!")

In [75]:
def print_output(dict_found_exc_inp, fitness):
    print(fitness, end="\n")
    for iteration in dict_found_exc_inp.keys():
        print(iteration, ":", dict_found_exc_inp[iteration], end="\n")
    print("\n\n")

In [84]:
eval_iterations = 190
initial_inputs = ['sqrt(1)', 'cos(912)', 'tan(4)']  # ['2 + 2', '-6 / 9', '-(23 * 7)']
iterations = 10
grammar = COMPL_CALC_GRAMMAR

dict_stand, dict_naive, dict_impr, dict_soph, dict_ratio_soph, dict_diff_exp = eval_fitness(eval_iterations, initial_inputs, iterations, grammar)

print_total_found_exc(dict_stand, iterations, "standard")
print_output(dict_stand, "standard")

print_total_found_exc(dict_naive, iterations, "naive")
print_output(dict_naive, "naive")

print_total_found_exc(dict_impr, iterations, "improved")
print_output(dict_impr, "improved")

print_total_found_exc(dict_soph, iterations, "sophisticated")
print_output(dict_soph, "sophisticated")

print_total_found_exc(dict_ratio_soph, iterations, "ratioed sophisticated")
print_output(dict_ratio_soph, "ratioed sophisticated")

print_total_found_exc(dict_diff_exp, iterations, "different expansions")
print_output(dict_diff_exp, "different expansions")

SyntaxError: at 'sqrt(1)' (<string>)