# First Steps towards Grammar-Based Fuzzing
<hr/>

In this notebook, we will learn about fuzzers and how to use them to produce failures.
**Fuzzing** is a powerful testing technique where we feed random inputs to a program to see if it crashes or behaves unexpectedly.

<div class="alert alert-success alertsuccess">
[Tip]: To execute the Python code in the code cell below, click on the cell to select it and press <kbd>Shift</kbd> + <kbd>Enter</kbd>.
</div>

In [None]:
print("Hello, this is a notebook!")

## Part 1: Simple Fuzzing

First, let's start with a simple fuzzing function that generates random strings.

In [None]:
import string
import random

def simple_fuzzer(max_length=50, char_set=string.ascii_letters + string.digits + string.punctuation):
    """A simple fuzzer that creates a string of random characters."""
    length = random.randint(0, max_length)
    return ''.join(random.choice(char_set) for _ in range(length))

Now, we simply need to run our fuzzing function `simple_fuzzer()` multiple times and use the output to test the program or service we're interested in.

In [None]:
for _ in range(10):
    print(simple_fuzzer())

Congratulations! We can now use this input generator to test some programs!
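
When a fuzzing run finds something interesting, it helps to be able to repeat it. The following is a minimal sketch of one way to do that; `seeded_fuzzer` is a hypothetical variant of `simple_fuzzer()` that takes an explicit `random.Random` instance, and the seed value is arbitrary:

```python
import random
import string

def seeded_fuzzer(rng: random.Random, max_length: int = 50,
                  char_set: str = string.ascii_letters + string.digits + string.punctuation) -> str:
    """Like simple_fuzzer(), but draws from an explicit random generator (hypothetical helper)."""
    length = rng.randint(0, max_length)
    return ''.join(rng.choice(char_set) for _ in range(length))

# Two generators created with the same seed yield the same sequence of inputs
rng_a = random.Random(42)
first_run = [seeded_fuzzer(rng_a) for _ in range(5)]

rng_b = random.Random(42)
second_run = [seeded_fuzzer(rng_b) for _ in range(5)]

print(first_run == second_run)
```

Passing the generator in explicitly keeps the fuzzer independent of the global `random` state, so other code that draws random numbers does not disturb the sequence.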

## Part 2: Fuzzing a Program

Our program under investigation is `The Calculator`. This program acts as a typical calculator, capable of evaluating not just arithmetic expressions but also trigonometric functions, such as sine, cosine, and tangent. Furthermore, it also supports the calculation of the square root of a given number.

In [None]:
import math

def calculator(inp: str) -> float:
    """
        A simple calculator function that can evaluate arithmetic expressions 
        and perform basic trigonometric functions and square root calculations.
    """
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )

**Side Note:** In the `calculator`, we use Python's `eval` function, which takes a string and evaluates it as a Python expression. We provide a dictionary as the second argument to eval, mapping names to corresponding mathematical functions. This enables us to use the function names directly within the input string. 
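
As a small illustration of this mechanism, the sketch below evaluates expressions against a reduced mapping (`allowed_names` is a name chosen for this example). Note that Python inserts its built-ins into the dictionary if no `__builtins__` key is present, so the mapping adds names but does not sandbox the expression:

```python
import math

# The second argument to eval() is the globals namespace for the expression
allowed_names = {"sqrt": math.sqrt, "cos": math.cos}

result = eval("sqrt(16) + cos(0)", allowed_names)
print(result)  # 4.0 + 1.0 = 5.0

# Names that are neither in the mapping nor built-ins are unknown
try:
    eval("log(10)", allowed_names)
    log_failed = False
except NameError as e:
    log_failed = True
    print("NameError:", e)

# Built-ins such as abs() remain reachable, because eval() added them
print(eval("abs(-3)", allowed_names))
print("__builtins__" in allowed_names)
```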

In [None]:
# Evaluating the cosine of approximately 6π (6 * 3.141)
print(calculator('cos(6*3.141)'))

In [None]:
# Calculating the square root of 36
print(calculator('sqrt(6*6)'))

Each of these calls to the calculator will evaluate the provided string as a mathematical expression, and print the result.

Let's use our `simple_fuzzer()` function to generate test inputs for the `calculator()` function:

In [None]:
# Using the simple_fuzzer to generate a random input for the calculator
inp = simple_fuzzer()
try:
    calculator(inp)
except Exception as e:
    print(f"Input '{inp}' triggered an Exception!", e)

From the above experiment, we observe that the majority of the exceptions we encounter stem from the parsing stage of the input. As a consequence, we're unable to probe the internal functionality of our program effectively.
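
To make this observation more concrete, one could tally the exception types over many random inputs. The sketch below is self-contained and only mirrors the setup above: the seed, the number of trials, and the names `CHARS`, `NAMES`, and `exception_counts` are choices made for this example, and the exact counts depend on the seed and the Python version:

```python
import math
import random
import string
from collections import Counter

CHARS = string.ascii_letters + string.digits + string.punctuation
NAMES = {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}

rng = random.Random(0)  # fixed seed, chosen arbitrarily
trials = 200
exception_counts = Counter()

for _ in range(trials):
    # Same idea as simple_fuzzer(): a random string of up to 50 characters
    inp = ''.join(rng.choice(CHARS) for _ in range(rng.randint(0, 50)))
    try:
        eval(inp, dict(NAMES))
    except Exception as e:
        # Count failures by exception type, e.g. SyntaxError or NameError
        exception_counts[type(e).__name__] += 1

print(exception_counts.most_common())
```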

To address this issue, we add a basic syntax check at the start of the `calculator()` function, which raises a `CalculatorSyntaxError` for rejected inputs. This way, we can identify syntactically incorrect inputs early on:

In [None]:
import math

class CalculatorSyntaxError(Exception):
    """
    Exception raised for errors in the calculator input syntax.
    """
    pass

def calculator(inp: str) -> float:
    """
    A simple calculator function that evaluates arithmetic and basic trigonometric functions.
    It checks the syntax of the input string before execution.
    """
    
    if not inp.startswith(('sqrt', 'cos', 'sin', 'tan')):
        # Simple syntax check to verify if input string starts with valid calculator functions
        raise CalculatorSyntaxError(f"'{inp}' is not a valid calculator input")
    
    return eval(
        str(inp), {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan}
    )

In this refined code, the `calculator()` function now checks whether the input string begins with one of the valid function names: 'sqrt', 'cos', 'sin', or 'tan'. If not, it raises a `CalculatorSyntaxError`, signalling that the input is syntactically incorrect.
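
The prefix check is deliberately coarse. The following sketch repeats the definitions from the cell above and adds a hypothetical helper `outcome()` to show three cases: an accepted input, an input rejected by the check, and an input that passes the check but still fails inside `eval`:

```python
import math

class CalculatorSyntaxError(Exception):
    """Raised when the input does not start with a known function name."""

def calculator(inp: str) -> float:
    if not inp.startswith(('sqrt', 'cos', 'sin', 'tan')):
        raise CalculatorSyntaxError(f"'{inp}' is not a valid calculator input")
    return eval(inp, {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan})

def outcome(inp: str) -> str:
    """Return the result as text, or the name of the raised exception (hypothetical helper)."""
    try:
        return repr(calculator(inp))
    except Exception as e:
        return type(e).__name__

for sample in ["sqrt(16)", "1 + 1", "sqrt("]:
    print(sample.ljust(10), outcome(sample))
```

The input `1 + 1` is valid arithmetic but is now rejected, while `sqrt(` passes the prefix check and only fails later with Python's own `SyntaxError`.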

In [None]:
try:
    calculator(simple_fuzzer())
except CalculatorSyntaxError as e:
    print(e)

Now, to find new defects, we need to introduce an oracle that tells us whether a triggered error is something we expect or a new/unknown defect. The `OracleResult` is an enum with three possible values: `NO_BUG`, `BUG`, and `UNDEF`. `NO_BUG` denotes a passing test case, `BUG` a failing one, and `UNDEF` an input that could not be classified because it was rejected as syntactically invalid.

In [None]:
from enum import Enum

class OracleResult(Enum):
    BUG = "BUG"
    NO_BUG = "NO_BUG"
    UNDEF = "UNDEF"
    
    def __repr__(self):
        return self.value

    def __str__(self):
        return self.value

The following `oracle` function acts as an intermediary that handles and classifies the exceptions raised by the `calculator` function for a given input.

In [None]:
def oracle(inp: str):
    """
    This function serves as an oracle or intermediary that catches and handles exceptions 
    generated by the 'calculator' function. The oracle function is used in the context of fuzz testing.
    It aims to determine whether an input triggers a bug in the 'calculator' function.

    Args:
        inp (str): The input string to be passed to the 'calculator' function.

    Returns:
        OracleResult: An enumerated type 'OracleResult' indicating the outcome of the function execution.
            - OracleResult.NO_BUG: Returned if the calculator function executes without any exception.
            - OracleResult.UNDEF: Returned if the calculator function raises a CalculatorSyntaxError, i.e., the input was rejected as syntactically invalid.
            - OracleResult.BUG: Returned if the calculator function raises an exception other than CalculatorSyntaxError, indicating a potential bug.
    """
    try:
        calculator(inp)
    except CalculatorSyntaxError as e:
        # print(e)
        return OracleResult.UNDEF
    except Exception as e:
        return OracleResult.BUG
    
    return OracleResult.NO_BUG

This **oracle** function is used in the context of fuzzing to determine the impact of various inputs on the program under test (in our case the _calculator_). When the `calculator` function runs without raising an exception, the **oracle** returns `OracleResult.NO_BUG`. When the input is rejected with a `CalculatorSyntaxError`, the **oracle** returns `OracleResult.UNDEF`. When the `calculator` function raises any other exception, the **oracle** interprets this as a potential bug in the `calculator` and returns `OracleResult.BUG`.
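
A short, self-contained sketch of the three possible outcomes follows. It repeats condensed versions of the definitions from the cells above so that it can run on its own; the sample inputs are chosen for this example:

```python
import math
from enum import Enum

class OracleResult(Enum):
    BUG = "BUG"
    NO_BUG = "NO_BUG"
    UNDEF = "UNDEF"

class CalculatorSyntaxError(Exception):
    pass

def calculator(inp: str) -> float:
    if not inp.startswith(('sqrt', 'cos', 'sin', 'tan')):
        raise CalculatorSyntaxError(f"'{inp}' is not a valid calculator input")
    return eval(inp, {"sqrt": math.sqrt, "sin": math.sin, "cos": math.cos, "tan": math.tan})

def oracle(inp: str) -> OracleResult:
    try:
        calculator(inp)
    except CalculatorSyntaxError:
        return OracleResult.UNDEF   # rejected by the syntax check
    except Exception:
        return OracleResult.BUG     # any other exception
    return OracleResult.NO_BUG      # ran without raising

# sqrt(4) succeeds, sqrt(-1) raises a ValueError, 'hello' fails the syntax check
for sample in ["sqrt(4)", "sqrt(-1)", "hello"]:
    print(sample.ljust(10), oracle(sample).value)
```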

In [None]:
for _ in range(10):
    inp = simple_fuzzer()
    print(inp.ljust(50), oracle(inp))

However, using our `simple_fuzzer()`, it becomes apparent that we're unable to trigger any exceptions beyond parsing-related ones (`CalculatorSyntaxError`). This is because most of the randomly generated input strings are not valid input and are hence rejected during the parsing stage. This means that our current fuzzing approach isn't effective at testing deeper, more functional aspects of our program. Therefore, we need a more sophisticated strategy to generate test inputs that can pass the parsing stage and potentially expose functional bugs in our program.

## Part 3: Simple Grammar-Based Fuzzing

<div class="alert alert-info">
[Info]: We use the basic functionality provided by <a href="https://www.fuzzingbook.org">The Fuzzingbook</a>. For a more detailed description of Grammars, have a look at the chapter <a href="https://www.fuzzingbook.org/html/Grammars.html">Fuzzing with Grammars</a>.
</div>

This section focuses on implementing a grammar-based fuzzing approach. This methodology will allow us to create more complex and relevant input strings, which have a higher likelihood of triggering deeper, non-syntactic bugs in the target program.

In [None]:
from typing import Dict, List

# A grammar in the context of our fuzzing approach is a dictionary where:
# - The keys are nonterminal symbols, representing a category of expressions or structures.
# - The values are lists of possible expansions or rules for each nonterminal symbol.
Grammar = Dict[str, List[str]]

In this definition, a `Grammar` is a Python dictionary. Each key-value pair represents a rule in our grammar:
- The key is a `str` that serves as a nonterminal symbol. Nonterminal symbols are placeholders for patterns or structures that can be expanded or replaced with other symbols (which can be terminal or nonterminal).
- The value is a `List[str]` that provides the potential expansions for that nonterminal symbol. Each string in this list is a rule that describes one possible form the nonterminal symbol can take. It can consist of a combination of terminal and nonterminal symbols.

Using such a grammar structure helps guide our fuzzer to generate more meaningful and diverse inputs for testing.
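
As a toy illustration of this structure, consider the sketch below. The grammar `greeting_grammar` and the helper `find_nonterminals` are made up for this example; the helper assumes that nonterminals are written as `<name>` without spaces:

```python
import re
from typing import Dict, List

Grammar = Dict[str, List[str]]

def find_nonterminals(expansion: str) -> List[str]:
    """Return all <...> symbols in an expansion (hypothetical helper)."""
    return re.findall(r"<[^<> ]*>", expansion)

greeting_grammar: Grammar = {
    "<start>": ["<greeting>, <name>!"],   # mixes nonterminals and terminal text
    "<greeting>": ["Hello", "Hi"],        # terminal-only expansions
    "<name>": ["Alice", "Bob"],
}

for symbol, expansions in greeting_grammar.items():
    for expansion in expansions:
        print(symbol.ljust(12), "->", repr(expansion).ljust(24), find_nonterminals(expansion))
```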

The following code represents a simple context-free grammar for our calculator function. It describes inputs that consist of a single call to `sqrt`, `tan`, `cos`, or `sin` with an optionally negated integer or decimal number as its argument:

In [None]:
calculator_grammar = {
    "<start>":
        ["<function>(<term>)"],

    "<function>":
        ["sqrt", "tan", "cos", "sin"],
    
    "<term>": ["-<value>", "<value>"], 
    
    "<value>":
        ["<integer>.<integer>",
         "<integer>"],

    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
}

Each key-value pair in the calculator_grammar dictionary defines a nonterminal symbol (as the key) and its potential expansions (as the values). The fuzzer will use this grammar to generate valid mathematical expressions for testing the calculator.
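
Before fuzzing with a hand-written grammar, a quick consistency check can catch typos in symbol names. The sketch below repeats the grammar so that it runs on its own; `undefined_nonterminals` and `unreachable_nonterminals` are hypothetical helpers written for this example and assume the `<name>` notation without spaces:

```python
import re

calculator_grammar = {
    "<start>": ["<function>(<term>)"],
    "<function>": ["sqrt", "tan", "cos", "sin"],
    "<term>": ["-<value>", "<value>"],
    "<value>": ["<integer>.<integer>", "<integer>"],
    "<integer>": ["<digit><integer>", "<digit>"],
    "<digit>": ["1", "2", "3", "4", "5", "6", "7", "8", "9"],
}

def used_nonterminals(grammar):
    """All nonterminals that appear on the right-hand side of some rule."""
    return {nt
            for expansions in grammar.values()
            for expansion in expansions
            for nt in re.findall(r"<[^<> ]*>", expansion)}

def undefined_nonterminals(grammar):
    """Symbols that are used in an expansion but have no rule of their own."""
    return used_nonterminals(grammar) - set(grammar)

def unreachable_nonterminals(grammar, start_symbol="<start>"):
    """Symbols that have a rule but are never used (apart from the start symbol)."""
    return set(grammar) - used_nonterminals(grammar) - {start_symbol}

print(undefined_nonterminals(calculator_grammar))
print(unreachable_nonterminals(calculator_grammar))
```

Both sets are empty for this grammar, so every symbol is defined and used.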

In [None]:
from fuzzingbook.Grammars import nonterminals
import random

class ExpansionError(Exception):
    """
    Exception raised for errors in the expansion process.
    """
    pass


def simple_grammar_fuzzer(grammar: Grammar, 
                          start_symbol: str = "<start>",
                          max_nonterminals: int = 3,
                          max_expansion_trials: int = 100,
                          log: bool = False) -> str:
    """
    A simple grammar fuzzer that generates strings based on a given grammar.

    Args:
        grammar (Grammar): The grammar based on which the strings are generated.
        start_symbol (str, optional): The symbol in the grammar where the fuzzer begins generation. Defaults to "<start>".
        max_nonterminals (int, optional): The maximum number of nonterminals allowed in an expanded string. 
            Prevents the generation of excessively large strings. Defaults to 3.
        max_expansion_trials (int, optional): The maximum number of attempts to expand a nonterminal. 
            Prevents infinite loops in grammar expansion. Defaults to 100.
        log (bool, optional): If set to True, prints the expansion progress. Defaults to False.

    Returns:
        str: The generated string based on the provided grammar.

    Raises:
        ExpansionError: If the maximum number of expansion trials is reached without a valid expansion.
    """

    term = start_symbol # <start>
    expansion_trials = 0

    while len(nonterminals(term)) > 0:
        # Select a nonterminal symbol from the current term
        symbol_to_expand = random.choice(nonterminals(term)) #1

        # Select a random expansion for this symbol
        expansions = grammar[symbol_to_expand] # [<function>(<term>)]
        expansion = random.choice(expansions) # <function>(<term>)

        # Replace the chosen nonterminal symbol with the new expansion
        new_term = term.replace(symbol_to_expand, expansion, 1)

        # Check if the number of nonterminals in the new term is below the threshold
        if len(nonterminals(new_term)) < max_nonterminals:
            term = new_term
            if log:
                # Log the current replacement and the resulting term
                print(f"{symbol_to_expand} -> {expansion}".ljust(40), term)
            expansion_trials = 0
        else:
            # If we can't find a suitable expansion after max_expansion_trials, raise an error
            expansion_trials += 1
            if expansion_trials >= max_expansion_trials:
                raise ExpansionError(f"Cannot expand {repr(term)}")

    return term

This function generates strings based on a provided grammar. It starts from the start symbol and repeatedly picks a random nonterminal in the current string together with a random expansion for it. An expansion is only applied if the resulting string contains fewer than `max_nonterminals` nonterminals; otherwise it is discarded and another random choice is made. The process ends once no nonterminals are left. If `max_expansion_trials` consecutive attempts are rejected, the function raises an `ExpansionError`, which prevents infinite loops in grammar expansion.
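
The acceptance check on the number of nonterminals can be traced by hand for a single step. In the sketch below, `count_nonterminals` is a stand-in written for this example (it assumes the `<name>` notation without spaces), and the intermediate term is picked arbitrarily:

```python
import re

def count_nonterminals(term: str) -> int:
    """Count <...> symbols in a term (stand-in for len(nonterminals(term)))."""
    return len(re.findall(r"<[^<> ]*>", term))

max_nonterminals = 3
term = "<function>(<value>)"   # an intermediate term with two nonterminals

decisions = {}
for expansion in ["<integer>.<integer>", "<integer>"]:
    candidate = term.replace("<value>", expansion, 1)
    # Same condition as in simple_grammar_fuzzer(): strictly below the threshold
    accepted = count_nonterminals(candidate) < max_nonterminals
    decisions[candidate] = accepted
    print(candidate.ljust(35), "accepted" if accepted else "rejected")
```

With the threshold of 3, the decimal expansion is rejected at this point because it would leave three nonterminals in the term. It can only be applied once `<function>` has already been replaced.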

In [None]:
simple_grammar_fuzzer(grammar=calculator_grammar, log=True)

Let's put our `simple_grammar_fuzzer()` to the test by applying it to our `calculator`. For each input, we print the input string itself along with the result as determined by our oracle function. This helps us to see the variety of inputs that our fuzzer can generate and the different outcomes they lead to in our calculator function:

In [None]:
# Generating and testing 10 inputs using our simple grammar-based fuzzer
for _ in range(10):
    inp = simple_grammar_fuzzer(grammar=calculator_grammar)
    print(inp.ljust(40), oracle(inp))

With our `simple_grammar_fuzzer()`, we can now generate grammar-conformant inputs that pass the syntax check and exercise the calculator's actual functionality. Inputs for which the oracle reports `BUG`, such as the square root of a negative number, reveal genuine defects in our calculator.

## Part 4: Probabilistic Grammar-Based Fuzzing

<div class="alert alert-info">
[Info]: For this chapter on probabilistic fuzzing, we use the functions provided by <a href="https://www.fuzzingbook.org">The Fuzzingbook</a>. For a more detailed description of the ProbabilisticGrammarFuzzer, have a look at the chapter <a href="https://www.fuzzingbook.org/html/ProbabilisticGrammarFuzzer.html">Probabilistic Grammar Fuzzing</a>.
</div>

In this section, we delve into "Probabilistic Grammar-Based Fuzzing". An essential aspect of this approach is the use of probability distributions. To illustrate this, we consider the Law of Leading Digits, also known as Benford's Law.

## Law of the leading digits

Benford's Law reveals a surprising phenomenon about leading digits in many sets of numerical data: smaller digits tend to occur as the leading digit more frequently. Specifically, '1' appears as the leading digit about 30% of the time, while '9' appears just under 5% of the time.

We can calculate the probability of each leading digit (1 through 9) occurring using the following formula:

In [None]:
def prob_leading_digit(d: int) -> float:
    """Calculates the probability of a digit to be the leading digit
    according to Benford's Law.
    """
    return math.log10(d + 1) - math.log10(d)

In [None]:
import math
digit_probs = [prob_leading_digit(d) for d in range(1, 10)]
[(d, "%.3f" % digit_probs[d - 1]) for d in range(1, 10)]

We observe that smaller digits indeed have higher probabilities, as per Benford's Law. This understanding will guide us when we apply probabilities in grammar-based fuzzing.
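
As a sanity check, the nine probabilities should sum to one, since the logarithms telescope to `log10(10) - log10(1)`. The sketch below repeats the formula so that it runs on its own and compares two values with the rounded figures quoted above:

```python
import math

def prob_leading_digit(d: int) -> float:
    """Probability of d being the leading digit according to Benford's Law."""
    return math.log10(d + 1) - math.log10(d)

probs = {d: prob_leading_digit(d) for d in range(1, 10)}
total = sum(probs.values())

print(math.isclose(total, 1.0))                  # True
print(round(probs[1], 3), round(probs[9], 3))    # 0.301 0.046
```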

Next, we incorporate the concept of Benford's Law into our Calculator grammar. By doing this, we assign probabilities to the leading digits, thereby transforming our grammar into a probabilistic grammar:

In [None]:
from fuzzingbook.Grammars import is_valid_grammar
from fuzzingbook.Grammars import opts
from fuzzingbook.Grammars import Grammar


probabilistic_calculator_grammar = {
    "<start>":
        ["<function>(<term>)"],

    "<function>":
        ["sqrt", "tan", "cos", "sin"],
    
    "<term>": ["-<value>", "<value>"], 
    
    "<value>":
        ["<leadinteger>.<integer>",
         "<leadinteger>"],

    "<leadinteger>":
        ["<leaddigit><integer>", "<leaddigit>"],
    
    # Benford's law: frequency distribution of leading digits
    "<leaddigit>":
        [("1", opts(prob=0.301)),
         ("2", opts(prob=0.176)),
         ("3", opts(prob=0.125)),
         ("4", opts(prob=0.097)),
         ("5", opts(prob=0.079)),
         ("6", opts(prob=0.067)),
         ("7", opts(prob=0.058)),
         ("8", opts(prob=0.051)),
         ("9", opts(prob=0.046)),
         ],

    # Remaining digits are equally distributed
    "<integer>":
        ["<digit><integer>", "<digit>"],

    "<digit>":
        ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"],
}

assert is_valid_grammar(probabilistic_calculator_grammar)

In this updated grammar, we differentiate between `<leaddigit>` and `<digit>`. For `<leaddigit>`, each option (from "1" to "9") is assigned a probability value based on Benford's Law. This creates a probabilistic bias towards lower digits for the leading digit of a number. The rest of the digits (`<digit>`), however, maintain a uniform distribution.
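
The effect of such weights can be illustrated without any fuzzing library. The sketch below uses `random.choices` from the standard library to draw leading digits with the probabilities from the grammar; it only illustrates weighted selection and makes no claim about how `ProbabilisticGrammarFuzzer` is implemented internally. The seed and sample size are arbitrary:

```python
import random
from collections import Counter

lead_digits = ["1", "2", "3", "4", "5", "6", "7", "8", "9"]
weights = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

rng = random.Random(1)  # fixed seed, chosen arbitrarily
samples = rng.choices(lead_digits, weights=weights, k=10_000)
frequencies = Counter(samples)

# Observed relative frequencies should be close to the assigned weights
for digit, weight in zip(lead_digits, weights):
    print(digit, weight, frequencies[digit] / len(samples))
```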

We are now ready to use our probabilistic grammar to generate inputs for our calculator. Because of the probabilities assigned in the grammar, the leading digits of the generated numbers approximately follow Benford's Law:

In [None]:
from fuzzingbook.ProbabilisticGrammarFuzzer import ProbabilisticGrammarFuzzer

fuzzer = ProbabilisticGrammarFuzzer(probabilistic_calculator_grammar)

for i in range(20):
    print(fuzzer.fuzz())

This script creates a new instance of the `ProbabilisticGrammarFuzzer` from the
`fuzzingbook` using our `probabilistic_calculator_grammar`. Then, we generate and print 20 fuzzed inputs, demonstrating the integration of Benford's Law within our fuzzing approach.

## Part 4.1 Input Generation Strategies: Learning from Samples

### Common Input Generation (Give me `More of the Same`)

By analyzing and learning from a collection of typical examples, we can derive a "typical" probability distribution. This, in turn, enables us to generate more "typical" or "common" inputs. This is particularly beneficial in regression testing, where we want to ensure that the system's existing functionalities continue to work as expected with standard, frequently observed inputs.

In [None]:
# Let's import the Probabilistic Grammar Miner
from fuzzingbook.ProbabilisticGrammarFuzzer import ProbabilisticGrammarMiner
from fuzzingbook.Parser import EarleyParser

# We instantiate a miner with the Earley parser
probabilistic_grammar_miner = ProbabilisticGrammarMiner(
    EarleyParser(calculator_grammar))

To extrapolate a probabilistic grammar from a set of sample inputs, we invoke the `mine_probabilistic_grammar()` function with a list of inputs. Here, we provide a couple of sample arithmetic expressions. Then, we derive a new probabilistic grammar, which reflects the characteristics of our provided inputs:

In [None]:
# to learn from a set of samples you can invoke the function mine_probabilistic_grammar(List[str])
inputs = ['sqrt(36)', 'cos(6.282)']

learned_probabilistic_grammar = probabilistic_grammar_miner.mine_probabilistic_grammar(
    inputs)

In [None]:
from pprint import pprint

pprint(learned_probabilistic_grammar['<integer>'])

This way, the learned grammar is reflective of the sample inputs, allowing us to generate similar "typical" or "common" inputs for testing.
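
The underlying idea is to count how often each expansion is used in the samples and to turn these counts into relative frequencies. The sketch below illustrates this for the `<function>` rule only. It uses a regular expression as a shortcut instead of parse trees, so it is a simplification for illustration and not the miner's actual implementation:

```python
import re
from collections import Counter

functions = ["sqrt", "tan", "cos", "sin"]   # expansions of <function>
samples = ['sqrt(36)', 'cos(6.282)']

# Count which function name each sample starts with
counts = Counter(re.match(r"[a-z]+", sample).group() for sample in samples)

# Normalize the counts into relative frequencies per expansion
total = sum(counts[f] for f in functions)
function_probs = {f: counts[f] / total for f in functions}

print(function_probs)  # {'sqrt': 0.5, 'tan': 0.0, 'cos': 0.5, 'sin': 0.0}
```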

Moving forward, let's create a new fuzzer using the probabilistic grammar we just derived:

In [None]:
from fuzzingbook.ProbabilisticGrammarFuzzer import ProbabilisticGrammarFuzzer

# Instantiate a new fuzzer
fuzzer = ProbabilisticGrammarFuzzer(learned_probabilistic_grammar, min_nonterminals=3)

This fuzzer, which has been instantiated with the grammar learned from the provided inputs, will generate fuzzed inputs that exhibit similar characteristics to those initial inputs. Let's demonstrate this by generating and printing 20 fuzzed inputs:

In [None]:
# Generated inputs will be similar to the initial inputs
for i in range(20):
    print(fuzzer.fuzz())

Through this, you can observe how the inputs generated by the fuzzer are similar to the samples we provided, which serves to demonstrate the power of **probabilistic grammar-based fuzzing**.

### Learning from Failure-Inducing Inputs

Now, let's use the Probabilistic Grammar Miner to learn the distribution of inputs that induce failures.

By analyzing and learning from inputs that have previously led to failures, we can generate similar inputs that effectively probe the areas around the original, failure-inducing inputs. This is highly valuable in assessing the thoroughness of applied fixes.

For each of these failure-inducing inputs, we print the input itself and its corresponding oracle result:

In [None]:
# Failure inducing Inputs
failure_inducing_samples = ['sqrt(-24)', 'sqrt(-2)']

for inp in failure_inducing_samples:
    print(inp.ljust(20), oracle(inp))

Next, we initialize a new Grammar Miner with the Earley Parser, employing our original calculator grammar. We then use this miner to learn the distribution of the failure-inducing samples. Subsequently, we generate and print 10 similar samples using a fuzzer instantiated with the learned probabilistic grammar:

In [None]:
# New Prob. Grammar Miner for the CALCULATOR Grammar
probabilistic_grammar_miner = ProbabilisticGrammarMiner(
    EarleyParser(calculator_grammar))

# Let's learn the distribution of the failure-inducing samples
learned_probabilistic_grammar = probabilistic_grammar_miner.mine_probabilistic_grammar(
    failure_inducing_samples)

# Generate similar samples
fuzzer = ProbabilisticGrammarFuzzer(learned_probabilistic_grammar, min_nonterminals=3)

for _ in range(10):
    inp = fuzzer.fuzz()
    print(inp.ljust(20), oracle(inp))

These new samples, while not identical to the original failure-inducing inputs, will bear similar traits and are likely to challenge the robustness of any implemented fixes.



## Part 5: Evolutionary Grammar Based Input Generation

In [None]:
# you might need to install the latest version of evogfuzz
# uncomment to install via pip
#!pip install evogfuzz

In [None]:
from evogfuzz.oracle import OracleResult

def oracle(inp: str):
    try:
        calculator(str(inp))
    except CalculatorSyntaxError as e:
        # print(e)
        return OracleResult.NO_BUG
    except Exception as e:
        return OracleResult.BUG
    
    return OracleResult.NO_BUG

In [None]:
from evogfuzz.evogfuzz_class import EvoGGen

evo = EvoGGen(
    grammar=calculator_grammar,
    oracle=oracle,
    inputs=['sqrt(-1)'],
    transform_grammar=True,
    iterations=15
)
generated_grammar, failing_inputs = evo.optimize()

In [None]:
for inp in list(failing_inputs)[:20]:
    print(inp)

In [None]:
from pprint import pprint

pprint(generated_grammar)

In [None]:
from fuzzingbook.ProbabilisticGrammarFuzzer import ProbabilisticGrammarFuzzer

fuzzer = ProbabilisticGrammarFuzzer(generated_grammar)
for _ in range(10):
    inp = fuzzer.fuzz()
    print(inp.ljust(40), oracle(inp))

## Part 6: Explaining Bugs

In [None]:
# you might need to install the latest version of alhazen-py
# uncomment to install via pip
#!pip install alhazen-py

In [None]:
from alhazen.oracle import OracleResult

def oracle(inp: str):
    try:
        calculator(str(inp))
    except CalculatorSyntaxError as e:
        # print(e)
        return OracleResult.NO_BUG
    except Exception as e:
        return OracleResult.BUG
    
    return OracleResult.NO_BUG

In [None]:
from alhazen.alhazen import Alhazen
from alhazen.features import NUMERIC_INTERPRETATION_FEATURE, EXISTENCE_FEATURE

alhazen = Alhazen(
    grammar=calculator_grammar,
    initial_inputs=["sqrt(-1)", "cos(9)"],
    evaluation_function=oracle,
    features={NUMERIC_INTERPRETATION_FEATURE, EXISTENCE_FEATURE},
    max_iter=20,
)

alhazen.run()
alhazen.show_model()