# ESBMC-AI AICodeRepair Experiments

This notebook applies the findings from the [previous single iteration experiments](https://github.com/Yiannis128/aicoderepair_llmtests) to ESBMC-AI and re-runs the tests to measure performance.

## Methodology

Use the same samples as the previous experiments to directly compare performance.

For each sample run:

1. Show only the latest code state (last code generation) in the iterative APR loop, and original iteration:
    * Single line
    * Contextual
2. Show all the previous code states:
    * Single line
    * Contextual
3. Data Analysis:
    1. Determine the optimal way to show the history of code
    2. Which number of iterations is the best one?
    3. What is the number of lines of code that change during repair?

Total Experiments:
* We will use the best prompt template + role + ESBMC output type from the previous experiments: 2
* We have 100 experiments.

## Files Needed

This notebook will use:

* `samples`
* `esbmc_output`
* `includes`

All other files are generated as part of the pipeline.

# Running Experiments with GPT 3.5 Turbo

In [1]:
import os
from io import TextIOWrapper
from time import time
from math import floor
from subprocess import run, PIPE, STDOUT

import tiktoken
from time import sleep
from typing import Optional
from dotenv import get_key as load_dotenv, get_key
from openai import OpenAI, Completion

### AI Params + Definitions

In [2]:
MAX_TOKENS: int = 16385
os.environ["ESBMC_AI_CFG_PATH"] = "./config.json"

### Initializing Logger
Log everything using these simple custom print and write functions. Be aware that opening `log.txt` while the notebook is running may show an outdated state until the buffer is flushed. Editing the log file before `log_file.close()` is called may corrupt it.
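
As a minimal illustration of the buffering caveat, the sketch below writes to a scratch file (a hypothetical stand-in, not the notebook's `log.txt`) and shows that the entry is reliably visible to a second reader once `flush()` has been called:

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_log.txt")

demo_log = open(demo_path, "a")
demo_log.write("Log: first entry\n")

# Before this call, the entry may still sit in the write buffer.
demo_log.flush()

# A second, independent reader now sees the flushed content.
with open(demo_path, "r") as reader:
    visible = reader.read()

demo_log.close()
print(visible)
```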

In [3]:
def close_logger():
    # Close logger
    try:
        if not log_file.closed:
            log_file.close()
    except NameError:
        pass

In [4]:
close_logger()

In [5]:
# Initialize logger.
def init_logger(file_path: str = "log.txt") -> TextIOWrapper:
    log_file: TextIOWrapper
    try:
        log_file = open(file_path, "a")
    except OSError:
        # open() raises OSError (e.g. a missing directory), so fall back to the default log.
        log_file = open("log.txt", "w")
    return log_file

def log_str(text: str = "") -> None:
    assert not log_file.closed, "The log file is closed."
    if len(text) == 0:
        log_file.write("\n")
    else:
        log_file.write(f"Log: {time()}: {text}\n")
    
def print_and_log(text: str = "") -> None:
    assert not log_file.closed, "The log file is closed."
    if len(text) == 0:
        log_file.write("\n")
        print()
    else:
        text = str(time()) + ": " + text
        log_file.write("Log: " + text + "\n")
        print(text)

def print_logging_session_message():
    print_and_log("Notice: Starting new logging session.")
    log_file.flush()

### Load and Parse Data

##### ESBMC Extract Parts of Output

In [6]:
# Load all the sample names to process as part of the experiments. They NEED to be sorted.
sample_names: list[str] = sorted(os.listdir("samples"))

### Define Prompts

The following prompts are going to be iterated through. The prompts `simple_prompts_no_esbmc` and `simple_prompts` are the baseline prompts.

In [7]:
# Best prompts from LLM aicoderepair tests

persona_prompt: list[str] = [
    "From now on, act as an {role} that repairs AI C code. You will be shown AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains a stack trace along with what type of error has occurred and its location. Provide the repaired C code as output, as would an {role}. Aside from the corrected source code, do not output any other text. The code is\n\n```c\n{source_code}\n```\n\nThe ESBMC output is\n\n```\n{esbmc_output}\n```\n",
    "From now on, act as an {role} that repairs AI C code. You will be shown a line of AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains what type of error has occurred and its location. Provide the repaired C code as output, as would an {role}. Aside from the corrected line of source code, do not output any other text. The code is\n\n```c\n{source_code}\n```\n\nThe ESBMC output is\n\n```\n{esbmc_output}\n```\nGuideline: Always prefer to repair using a single line of C code, unless neccessary.\nGuideline: Read the error in the ESBMC output and try to repair the fault.",
]

persona_prompt_flipped: list[str] = [
    "From now on, act as an {role} that repairs AI C code. You will be shown AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains a stack trace along with what type of error has occurred and its location. Provide the repaired C code as output, as would an {role}. Aside from the corrected source code, do not output any other text. The ESBMC output is\n\n```\n{esbmc_output}\n```\n\nThe source code is\n\n```c\n{source_code}\n```",
    "From now on, act as an {role} that repairs AI C code. You will be shown a line of AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains what type of error has occurred and its location. Provide the repaired C code as output, as would an {role}. Aside from the corrected line of source code, do not output any other text. The ESBMC output is\n\n```\n{esbmc_output}\n```\n\nThe source code is\n\n```c\n{source_code}\n```\nGuideline: Always prefer to repair using a single line of C code, unless neccessary.\nGuideline: Read the error in the ESBMC output and try to repair the fault.",
]

old_prompt = {
    "system": [
    {
      "role": "System",
      "content": "You are an secure code generator that parses vulnerable source code, and output from a program called ESBMC, which contains vulnerability information about the source code. You should use the output from ESBMC to find the problem, and correct the source code. ESBMC is always correct. You shall add a NULL check for every heap allocation you make. From this point on, you can only reply in source code. You shall only output source code as whole. Reply OK if you understand."
    },
    {
      "role": "AI",
      "content": "OK"
    },
    {
      "role": "Human",
      "content": "The following text is the source code of the program, reply OK if you understand:\n\n```c\n{source_code}\n```"
    },
    {
      "role": "AI",
      "content": "OK"
    },
    {
      "role": "Human",
      "content": "The following text is the output of ESBMC, reply OK if you understand:\n\n```\n{esbmc_output}```"
    },
    {
      "role": "AI",
      "content": "OK"
    }
  ],
  "initial": "Generate a correction for the source code provided. Show the code only. Do not reply with acknowledgements."
}

all_prompts: list = persona_prompt + persona_prompt_flipped
persona_roles: str = "Automated code repair tool"
all_prompts = [prompt.replace("{role}", persona_roles) for prompt in all_prompts]

# No role applied to original prompt
all_prompts.append(old_prompt)
    
for prompt in all_prompts:
    print("--------------------------")
    print(prompt)

--------------------------
From now on, act as an Automated code repair tool that repairs AI C code. You will be shown AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains a stack trace along with what type of error has occurred and its location. Provide the repaired C code as output, as would an Automated code repair tool. Aside from the corrected source code, do not output any other text. The code is

```c
{source_code}
```

The ESBMC output is

```
{esbmc_output}
```

--------------------------
From now on, act as an Automated code repair tool that repairs AI C code. You will be shown a line of AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains what type of error has occurred and its location. Provide the repaired C code as output, as would an Automated code repair tool. Aside from the corrected line of source code, do not output any other text. The code is

```c
{source_code}
```

The ESBMC output is

`

# Experiment Execution

The source code and the ESBMC output are too large for the LLM's context length. Three strategies are proposed to see if they can alleviate the problem:

1. Constant: Split by line or by character with no structure (brutal split)
2. ~~Structural: Split semantically (function by function)~~
3. Contextual: Split from the failure and show the code before it
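
The constant strategy can be sketched as a fixed-size split by line count. `split_constant` and its chunk size are hypothetical names chosen for illustration; the notebook does not define them.

```python
def split_constant(source_code: str, lines_per_chunk: int) -> list[str]:
    """Split source code into consecutive chunks of at most `lines_per_chunk` lines."""
    if lines_per_chunk <= 0:
        raise ValueError("lines_per_chunk must be positive")
    lines = source_code.splitlines(True)  # keep line endings so chunks can be re-joined
    return [
        "".join(lines[i:i + lines_per_chunk])
        for i in range(0, len(lines), lines_per_chunk)
    ]

chunks = split_constant("a\nb\nc\nd\ne\n", 2)
print(chunks)  # ['a\nb\n', 'c\nd\n', 'e\n']
```

Because the split ignores structure, a chunk boundary can fall in the middle of a function, which is the weakness the contextual strategy tries to avoid.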

#### Notation

Let `C` be the source code and `L={l1, l2, l3, ..., ln}` be the sequence of its line lengths, where `n` is the number of lines in `C` and `lx` is the length of the `x`th line in `C`. So `l1` is the length of the first line, and so on. `E` represents the length of the line with the error and `e` is the index of that line in `L`, such that `E=L[e]=le`.

In [8]:
def run_esbmc_ai(file_name: str) -> tuple[int, str]:
    cmd = ["pipenv", "run", "esbmc-ai", "-vvvrc", "fix-code", file_name]
    result = run(cmd, stdout=PIPE, stderr=STDOUT)
    exit_code: int = result.returncode
    output: str = result.stdout.decode("utf-8")
    return exit_code, output

## ~~Contextual Strategy~~

This strategy takes the line at which the error has occurred, along with a ratio split of the lines before and after it. Here, `85%` is allocated before and `10%` after: we want to give as much information as possible about how the code looked before the error, while still including some lines after it for context.

The following variables are declared:
* `LTOKENS=MAX_TOKENS*0.85` - The window of tokens to keep before the error line.
* `UTOKENS=MAX_TOKENS*0.10` - The window of tokens to keep after the error line.

The lower bound line index is calculated like so:
1. We want the largest `il` such that `Sl = Σ{k=0}{il} L[e-k] <= LTOKENS`.
2. The constraints are as follows: `0<=il<=e` and `0<=Sl<=LTOKENS`.

Similarly, the upper bound line index is calculated like so:
1. We want the largest `iu` such that `Su = Σ{k=0}{iu} L[e+k] <= UTOKENS`.
2. The constraints are as follows: `0<=iu<=n-e` and `0<=Su<=UTOKENS`.

The combined window will be `L[e-il:e+iu]`, which fills at most `95%` of `MAX_TOKENS`; the other `5%` is allocated to the ESBMC output's counterexample stack trace and/or violated property.
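
The window calculation can be sketched as two greedy walks away from the error line. This is an illustration under stated assumptions, not the notebook's `get_lower_bound`/`get_upper_bound`: `count_tokens` is a whitespace-based stand-in for the real tokenizer, `window_bounds` is a hypothetical name, indices are 0-based, and the error line is counted in both sums, as in the formulas above.

```python
def count_tokens(text: str) -> int:
    """Stand-in tokenizer (whitespace split); an assumption made for this sketch only."""
    return len(text.split())

def window_bounds(lines: list[str], e: int, ltokens: int, utokens: int) -> tuple[int, int]:
    """Return inclusive (lower, upper) line indices of the window around error line `e`."""
    lengths = [count_tokens(line) for line in lines]

    # Walk backwards from the error line while the lower budget allows.
    lower, used = e, lengths[e]
    while lower > 0 and used + lengths[lower - 1] <= ltokens:
        lower -= 1
        used += lengths[lower]

    # Walk forwards from the error line while the upper budget allows.
    upper, used = e, lengths[e]
    while upper < len(lines) - 1 and used + lengths[upper + 1] <= utokens:
        upper += 1
        used += lengths[upper]

    return lower, upper

demo_lines = [
    "int a = 0;",
    "int b = 1;",
    "int c = a / b;",
    "int d = c / a;",  # error line, index 3
    "return d;",
    "}",
]
bounds = window_bounds(demo_lines, e=3, ltokens=12, utokens=8)
print(bounds)  # (2, 4)
```

If the error line alone exceeds a budget, this sketch still returns the error line itself, so the window is never empty.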

#### Experimental Loop

In [None]:
def run_contextual_sample(
        prompt_idx: int, 
        role_idx: int,
        esbmc_output_type: str, 
        file_idx: int) -> None:
    prompt: str = all_prompts[prompt_idx]
    file_name_key: str = sorted(data_samples.keys())[file_idx]
    role: str = persona_roles[role_idx]
    
    # Name will be sorted by experimental order, not by filename, as common experiments can be
    # found near each other.
    file_name: str = f"{prompt_idx}.{role_idx}.{esbmc_output_type}.{file_idx}.{os.path.basename(file_name_key)}"
    
    print_and_log()
    print_and_log(f"Notice: Checkpoint {file_name}")
    
    # Get CE or VP output for ESBMC.
    esbmc_output: str
    if esbmc_output_type == "ce":
        esbmc_output = data_esbmc_output[file_name_key]
    else:
        esbmc_output = data_vp_output[file_name_key]

    esbmc_output = get_esbmc_output_sized(esbmc_output)

    source_code: str = data_samples[file_name_key]
    source_code_lines: list[str] = source_code.splitlines(True)
    
    # Get the used ESBMC output length in order to calculate how to add it to the prompt template.
    esbmc_output_token_len: int = num_tokens_from_string(esbmc_output)

    # TODO If we get errors here pass full esbmc output instead of trimmed.
    err_line: int = get_source_code_err_line(esbmc_output)

    # Trim the source code in order to give it to the LLM.
    lower_bound: int = get_lower_bound(source_code_lines, err_line, esbmc_output_token_len * 0.9)
    # The upper bound is inclusive.
    upper_bound: int = get_upper_bound(source_code_lines, err_line, esbmc_output_token_len * 0.1)
    trimmed_sc: str = "".join(source_code_lines[lower_bound:upper_bound+1])
    
    try:
        log_str(f"LLM Raw Input:\n{get_llm_message(prompt, trimmed_sc, esbmc_output, role)}")
        
        delta: float = time()
        # Role will be passed, if the prompt does not contain {role} then it will be not used.
        llm_output_raw = run_sample(prompt, trimmed_sc, esbmc_output, role)
        delta = time() - delta

        log_str(f"Raw Response:\n\n{llm_output_raw}")
        print_and_log(f"Notice: Duration: {delta}")
        
        llm_output = get_code_from_solution(llm_output_raw)
        log_str(f"Extracted Code:\n\n{llm_output}")
        
        # Save patch
        with open(f"results/{file_name}", "w") as file:
            file.write(llm_output)

        # Stitch together patch
        patched_source: str = apply_patch_brutal_replacement(source_code, llm_output, lower_bound, upper_bound)

        # Save patched source
        with open(f"samples-patched/{file_name}", "w") as file:
            file.write(patched_source)
    except Exception as e:
        print_and_log(f"Notice: error: {file_name_key}: {e}")
    
    print_and_log()

    log_file.flush()
    
    # Write progress
    with open("progress.txt", "a") as file:
        # Write the file name and subdirectory in progress in order to know which file is being
        # processed in what subdirectory.
        file.write(f"{file_name}\n")
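
`apply_patch_brutal_replacement` is not defined in this notebook. A minimal sketch of the stitching step, assuming the behaviour exercised by the unit tests at the end of the notebook (0-based, inclusive bounds), might look like the following; `replace_line_range` is a hypothetical name.

```python
def replace_line_range(source: str, patch: str, lower: int, upper: int) -> str:
    """Replace lines lower..upper (0-based, inclusive) of `source` with `patch`."""
    lines = source.split("\n")
    # Slice assignment swaps the whole range for the single patch entry.
    lines[lower:upper + 1] = [patch]
    return "\n".join(lines)

text = "\n".join(["a", "b", "c", "d", "e", "f", "g"])
patched = replace_line_range(text, "1", 2, 5)
print(patched)
```

The replacement is "brutal" in the sense that everything shown to the LLM is overwritten by whatever it returns, with no attempt to diff or merge.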

#### Run Experiments

In [None]:
print_and_log()
print_and_log("Running Contextual Strategy")

# Loop through prompts
for prompt_idx, prompt in enumerate(all_prompts):
    print_and_log()
    print_and_log(f"Notice: Running new cycle with prompt ({prompt_idx})")
    # Try all the roles
    # Check if a {role} tag is in the prompt string and use roles in that case.
    role_count: int
    if "{role}" in prompt:
        print_and_log("Notice: Prompt has roles. Will cycle roles.")
        role_count = len(persona_roles)
    else:
        print_and_log("Notice: Prompt has no roles. Roles will not be cycled or used (0).")
        role_count = 1

    # Loop through the different roles.
    for role_idx in range(role_count):
        # Loop through violated property ESBMC output and counterexample ESBMC output.
        for esbmc_output_type in ["ce", "vp"]:
            # Loop through files
            for file_idx, file_name_key in enumerate(sorted(data_samples.keys())):
                # Check if file has already been processed and skip.
                if os.path.exists("progress.txt"):
                    with open("progress.txt", "r") as file:
                        progress_files: list[str] = file.read().splitlines()
                        file_name: str = f"{prompt_idx}.{role_idx}.{esbmc_output_type}.{file_idx}.{os.path.basename(file_name_key)}"
                        if file_name in progress_files:
                            log_str(f"Skipping already processed file: {file_name}")
                            continue

                run_contextual_sample(
                    prompt_idx=prompt_idx, 
                    role_idx=role_idx, 
                    file_idx=file_idx, 
                    esbmc_output_type=esbmc_output_type,
                )

## Single Line Experiment

Run ESBMC-AI in single line mode. The ESBMC output type is `vp` (violated property).

### Config Code

In [9]:
import json

def set_config_prompt(new_prompt: str | dict, temperature: float, save_dir: str = ".") -> None:
    """Creates a new config with the provided prompt as the FCM initial prompt."""
    with open("config_template.json", "r") as read_file:
        file_content = json.load(read_file)
    
    file_content["chat_modes"]["generate_solution"]["temperature"] = temperature
    
    if isinstance(new_prompt, str):
        file_content["chat_modes"]["generate_solution"]["initial"] = new_prompt
    elif isinstance(new_prompt, dict):
        file_content["chat_modes"]["generate_solution"]["system"] = new_prompt["system"]
        file_content["chat_modes"]["generate_solution"]["initial"] = new_prompt["initial"]
    else:
        raise ValueError(f"Invalid type {type(new_prompt)}")
        
    with open(f"{save_dir}/config.json", "w") as write_file:
        json.dump(file_content, write_file)
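
To make the shape of the update easier to see, here is an in-memory sketch of the same logic. The `template` dictionary is a hypothetical minimal stand-in that contains only the keys `set_config_prompt` touches; the real `config_template.json` is expected to hold more. `apply_prompt` is likewise a name introduced for illustration.

```python
import copy

# Hypothetical minimal template; only the keys accessed by set_config_prompt are shown.
template = {
    "chat_modes": {
        "generate_solution": {"temperature": 1.0, "system": [], "initial": ""}
    }
}

def apply_prompt(config: dict, new_prompt, temperature: float) -> dict:
    """Return a copy of `config` with the prompt and temperature overridden."""
    updated = copy.deepcopy(config)
    mode = updated["chat_modes"]["generate_solution"]
    mode["temperature"] = temperature
    if isinstance(new_prompt, str):
        # Plain string prompts only replace the initial message.
        mode["initial"] = new_prompt
    elif isinstance(new_prompt, dict):
        # Dictionary prompts carry their own system messages as well.
        mode["system"] = new_prompt["system"]
        mode["initial"] = new_prompt["initial"]
    else:
        raise ValueError(f"Invalid type {type(new_prompt)}")
    return updated

config = apply_prompt(template, "Repair the code.", 0.4)
print(config["chat_modes"]["generate_solution"])
```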

### Run Experiments

In [10]:
def run_esbmc_ai_single_line(
        prompt_idx: int,
        file_idx: int,
        save_dir: str,
        exp_dir: str) -> None:
    file_name: str = sample_names[file_idx]
    
    print_and_log()
    print_and_log(f"Notice: Checkpoint {file_idx}/{len(sample_names)} samples/{file_name}")
    
    # Call ESBMC-AI
    try:        
        delta: float = time()
        exit_code, esbmc_ai_output = run_esbmc_ai(f"samples/{file_name}")
        delta = time() - delta

        log_str(f"Notice: ESBMC-AI Output:\n{esbmc_ai_output}")
        print_and_log(f"Notice: Duration: {delta}")
        
        # Save output
        with open(f"{save_dir}/{file_name}", "w") as file:
            file.write(esbmc_ai_output)
    except Exception as e:
        print_and_log(f"Notice: error: {file_name}: {e}")
    
    print_and_log()

    log_file.flush()
    
    # Write progress
    with open(f"{exp_dir}/progress-single-{prompt_idx}.txt", "a") as file:
        # Write the file name and subdirectory in progress in order to know which file is being
        # processed in what subdirectory.
        file.write(f"{file_name}\n")
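
The progress file is what makes a run resumable: each finished sample is appended to it, and a restarted run can skip anything already listed. A small sketch of that idea, using a scratch file and the hypothetical helpers `load_progress` and `mark_done`:

```python
import os
import tempfile

def load_progress(progress_path: str) -> set[str]:
    """Return the set of sample names already recorded in the progress file."""
    if not os.path.exists(progress_path):
        return set()
    with open(progress_path, "r") as file:
        return set(file.read().splitlines())

def mark_done(progress_path: str, name: str) -> None:
    """Append a finished sample name to the progress file."""
    with open(progress_path, "a") as file:
        file.write(f"{name}\n")

# Hypothetical scratch file and sample names, for demonstration only.
progress_path = os.path.join(tempfile.mkdtemp(), "progress-demo.txt")
samples = ["a.c", "b.c", "c.c"]
mark_done(progress_path, "a.c")  # pretend an earlier run finished a.c

done = load_progress(progress_path)
pending = [name for name in samples if name not in done]
print(pending)  # ['b.c', 'c.c']
```

Reading the file once into a set avoids re-opening it for every sample, at the cost of not noticing entries written by another process during the run.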

In [None]:
for temperature in ["0.0", "0.4", "0.7", "1.0", "1.3"]:
    temperature_path = f"temperature_{temperature}_results"
    if not os.path.isdir(temperature_path):
        os.mkdir(temperature_path)
        
    log_file = init_logger(file_path=f"{temperature_path}/log.txt")
    print_logging_session_message()
    
    print_and_log()
    print_and_log("Running ESBMC-AI: Single Line")
    
    # Loop through prompts
    for prompt_idx, prompt in enumerate(all_prompts):
        print_and_log()
        print_and_log(f"Notice: Running new cycle with prompt ({prompt_idx}): {prompt}")

        results_dir: str = f"{temperature_path}/results-{prompt_idx}"
        # Create directories if not made already
        if not os.path.exists(results_dir):
            os.mkdir(results_dir)

        print_and_log(f"Creating new config.json from template with prompt: {prompt_idx}")
        set_config_prompt(new_prompt=prompt, temperature=float(temperature))

        # Loop through files
        for file_idx, file_name in enumerate(sample_names):
            # Check if file has already been processed and skip.
            progress_file_path = f"{temperature_path}/progress-single-{prompt_idx}.txt"
            if os.path.exists(progress_file_path):
                with open(progress_file_path, "r") as file:
                    progress_files: list[str] = file.read().splitlines()
                    if file_name in progress_files:
                        log_str(f"Skipping already processed file: {file_name}")
                        continue

            run_esbmc_ai_single_line(
                prompt_idx=prompt_idx,
                file_idx=file_idx, 
                save_dir=results_dir,
                exp_dir=temperature_path,
            )

1713989262.7220867: Notice: Starting new logging session.

1713989262.7324674: Running ESBMC-AI: Single Line

1713989262.7325: Notice: Running new cycle with prompt (0): From now on, act as an Automated code repair tool that repairs AI C code. You will be shown AI C code, along with ESBMC output. Pay close attention to the ESBMC output, which contains a stack trace along with what type of error has occurred and its location. Provide the repaired C code as output, as would an Automated code repair tool. Aside from the corrected source code, do not output any other text. The code is

```c
{source_code}
```

The ESBMC output is

```
{esbmc_output}
```

1713989262.7380824: Creating new config.json from template with prompt: 0

1713989262.7648926: Notice: Checkpoint 0/100 samples/cartpole_0_safe.c-amalgamation-74.c


## ~~3. Clang Feedback To LLM~~

Perform a second iteration fix where the LLM is provided Clang feedback and asked to repair it. The results in `samples-patched` will be used along with `esbmc_output_1` (meaning 1st attempt). The results will be placed in `results-2` and `samples-patched-2`.

### Load ESBMC Output


In [None]:
clang_file_keys: list[str] = sorted(os.listdir("esbmc_output_1/"))
data_samples_1: dict[str, str] = {}
data_esbmc_output_1: dict[str, str] = {}

for file_name in clang_file_keys:
    if not file_name.endswith(".c"):
        continue
    
    # Read samples-patched (Generated previously)
    with open(f"samples-patched/{file_name}", "r") as file:
        data_samples_1[file_name] = file.read()

    # Read esbmc_output_1 (Generated by running ./eval-esbmc.sh)
    with open(f"esbmc_output_1/{file_name}", "r") as file:
        data_esbmc_output_1[file_name] = file.read()

### Create Directories

In [None]:
dirs: list[str] = ["samples-patched-2", "results-2"]

# Avoid shadowing the built-in dir(); makedirs with exist_ok skips existing directories.
for dir_path in dirs:
    os.makedirs(dir_path, exist_ok=True)

### Get Error Line

Clang reports errors differently, so this method will be used to find the error line. An example of the format: `samples-patched/12.0.ce.36.cartpole_2_safe.c-amalgamation-43.c:177:1: error:`

In [None]:
def get_source_code_err_line_clang(esbmc_output: str, filepath: str) -> int:
    """Gets the error line reported in the clang output."""
    lines: list[str] = esbmc_output.splitlines()
    for line in lines:
        # Find the first line containing a filename along with error.
        line_split: list[str] = line.split(":")
        # Skip lines that are too short to be of the form "file:line:col: error: ...".
        if len(line_split) < 4:
            continue
        # Check for the filename
        if line_split[0] == filepath and " error" in line_split[3]:
            return int(line_split[1])
            
    raise Exception(f'Could not find error line in {filepath}')
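
Clang diagnostics follow a `path:line:column: error:` shape, so a regular expression is an alternative to splitting on `:`. The sketch below is not the notebook's method; `CLANG_ERROR_RE` and `find_clang_error_line` are hypothetical names, and the diagnostic message in the sample input is made up for the demonstration.

```python
import re
from typing import Optional

# Matches "<path>:<line>:<column>: error" and "<path>:<line>:<column>: fatal error".
CLANG_ERROR_RE = re.compile(r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): (?:fatal )?error")

def find_clang_error_line(clang_output: str, filepath: str) -> Optional[int]:
    """Return the 1-based line of the first error reported for `filepath`, if any."""
    for line in clang_output.splitlines():
        match = CLANG_ERROR_RE.match(line)
        if match and match.group("path") == filepath:
            return int(match.group("line"))
    return None

sample_path = "samples-patched/12.0.ce.36.cartpole_2_safe.c-amalgamation-43.c"
sample_output = (
    "some unrelated line\n"
    f"{sample_path}:177:1: error: made-up message for the demo\n"
)
found = find_clang_error_line(sample_output, sample_path)
print(found)
```

Returning `None` instead of raising leaves the decision about missing errors to the caller.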

### Run Experiments

In [None]:
def run_contextual_sample_clang(
        prompt_idx: int, 
        role_idx: int,
        file_idx: int,
        file_name_key: str) -> None:
    prompt: str = all_prompts[prompt_idx]
    role: str = persona_roles[role_idx]
    
    # Name will be sorted by experimental order, not by filename, as common experiments can be
    # found near each other.
    file_name: str = f"{prompt_idx}.{role_idx}.{file_idx}.{os.path.basename(file_name_key)}"
    
    print_and_log()
    print_and_log(f"Notice: Checkpoint {file_name}")

    esbmc_output: str = get_esbmc_output_sized(data_esbmc_output_1[file_name_key])

    source_code: str = data_samples_1[file_name_key]
    source_code_lines: list[str] = source_code.splitlines(True)
    
    # Get the used ESBMC output length in order to calculate how to add it to the prompt template.
    esbmc_output_token_len: int = num_tokens_from_string(esbmc_output)

    # 0 based, pass full ESBMC output to get error line
    err_line: int = get_source_code_err_line_clang(data_esbmc_output_1[file_name_key], f"samples-patched/{file_name_key}") - 1

    # Trim the source code in order to give it to the LLM.
    lower_bound: int = get_lower_bound(source_code_lines, err_line, esbmc_output_token_len * 0.9)
    # The upper bound is inclusive.
    upper_bound: int = get_upper_bound(source_code_lines, err_line, esbmc_output_token_len * 0.1)
    trimmed_sc: str = "".join(source_code_lines[lower_bound:upper_bound+1])
    
    try:
        log_str(f"LLM Raw Input:\n{get_llm_message(prompt, trimmed_sc, esbmc_output, role)}")
        
        delta: float = time()
        # Role will be passed, if the prompt does not contain {role} then it will be not used.
        llm_output_raw = run_sample(prompt, trimmed_sc, esbmc_output, role)
        delta = time() - delta

        log_str(f"Raw Response:\n\n{llm_output_raw}")
        print_and_log(f"Notice: Duration: {delta}")
        
        llm_output = get_code_from_solution(llm_output_raw)
        log_str(f"Extracted Code:\n\n{llm_output}")
        
        # Save patch
        with open(f"results-2/{file_name}", "w") as file:
            file.write(llm_output)

        # Stitch together patch
        patched_source: str = apply_patch_brutal_replacement(source_code, llm_output, lower_bound, upper_bound)

        # Save patched source
        with open(f"samples-patched-2/{file_name}", "w") as file:
            file.write(patched_source)
    except Exception as e:
        print_and_log(f"Notice: error: {file_name_key}: {e}")
    
    print_and_log()

    log_file.flush()
    
    # Write progress
    with open("progress-2.txt", "a") as file:
        # Write the file name and subdirectory in progress in order to know which file is being
        # processed in what subdirectory.
        file.write(f"{file_name}\n")

In [None]:
print_and_log()
print_and_log("Running Contextual Strategy: 2nd Iteration Clang Assist")

# Loop through clang files
for file_name_key in clang_file_keys:
    file_name_split: list[str] = file_name_key.split(".")
    
    # Get prompt from file
    prompt_idx: int = int(file_name_split[0])
    # Get role from file
    role_idx: int = int(file_name_split[1])
    # Get CE/VP from file
    esbmc_output_type: str = file_name_split[2]
    # Get file_idx
    file_idx: int = int(file_name_split[3])

    # Check if file has already been processed and skip.
    if os.path.exists("progress-2.txt"):
        with open("progress-2.txt", "r") as file:
            progress_files: list[str] = file.read().splitlines()
            file_name: str = f"{prompt_idx}.{role_idx}.{file_idx}.{os.path.basename(file_name_key)}"
            if file_name in progress_files:
                log_str(f"Skipping already processed file: {file_name}")
                continue

    if "VERIFICATION" in data_esbmc_output_1[file_name_key]:
        print_and_log(f"Skipping {file_idx} {file_name_key} as it has VERIFICATION")
        continue

    run_contextual_sample_clang(
        prompt_idx=prompt_idx, 
        role_idx=role_idx, 
        file_idx=file_idx,
        file_name_key=file_name_key,
    )

# Unit Tests

In [None]:
import unittest

class TestNotebook(unittest.TestCase):
    
    def test_get_code_from_solution(self):
        self.assertEqual(get_code_from_solution("This is a code block:\n\n```c\naaa\n```"), "aaa")
        self.assertEqual(get_code_from_solution("This is a code block:\n\n```\nabc\n```"), "abc")
        self.assertEqual(get_code_from_solution("This is a code block:```abc\n```"), "This is a code block:```abc\n```")

    def test_apply_brutal_replacement_strategy(self):
        text = "\n".join(["a", "b", "c", "d", "e", "f", "g"])
        answer = "\n".join(["a", "b", "1", "g"])
        self.assertEqual(apply_patch_brutal_replacement(text, "1", 2, 5), answer)
        text = "\n".join(["a", "b", "c", "d", "e", "f", "g"])
        answer = "\n".join(["a", "b", "c", "1", "e", "f", "g"])
        self.assertEqual(apply_patch_brutal_replacement(text, "1", 3, 3), answer)

    def test_get_source_code_err_line(self):
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reinforcement_learning/cartpole_48_safe.c-amalgamation-6.c"]), 323)
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reinforcement_learning/cartpole_92_safe.c-amalgamation-14.c"]), 221)
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reinforcement_learning/cartpole_95_safe.c-amalgamation-80.c"]), 285)
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reinforcement_learning/cartpole_26_safe.c-amalgamation-74.c"]), 299)
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reach_prob_density/robot_5_safe.c-amalgamation-13.c"]), 350)
        self.assertEqual(get_source_code_err_line(data_esbmc_output["reach_prob_density/vdp_1_safe.c-amalgamation-28.c"]), 247)

    def test_get_lower_bound(self):
        source_code: str = data_samples["reinforcement_learning/cartpole_48_safe.c-amalgamation-6.c"]
        source_code_lines: list[str] = source_code.splitlines(True)

        # Test the additional tokens and if they affect the line.
        self.assertEqual(get_lower_bound(source_code_lines, 323, 0), 176)
        self.assertEqual(get_lower_bound(source_code_lines, 323, 500), 177)
        self.assertEqual(get_lower_bound(source_code_lines, 323, 1000), 178)
        
        # Test if the length is accounted for.
        self.assertLess(num_tokens_from_string(source_code[174:323]), LTOKENS)
        self.assertLess(num_tokens_from_string(source_code[175:323]), LTOKENS - 500)
        self.assertLess(num_tokens_from_string(source_code[176:323]), LTOKENS - 1000)
    
    def test_get_upper_bound(self):
        source_code: str = data_samples["reach_prob_density/robot_5_safe.c-amalgamation-13.c"]
        source_code_lines: list[str] = source_code.splitlines(True)
        
        # Test the additional tokens and if they affect the line.
        self.assertEqual(get_upper_bound(source_code_lines, 323, 0), 444)
        self.assertEqual(get_upper_bound(source_code_lines, 323, 500), 422)
        self.assertEqual(get_upper_bound(source_code_lines, 323, 1000), 379)

        # Test if the length is accounted for.
        self.assertLess(num_tokens_from_string(source_code[323:444]), UTOKENS)
        self.assertLess(num_tokens_from_string(source_code[323:422]), UTOKENS - 500)
        self.assertLess(num_tokens_from_string(source_code[323:379]), UTOKENS - 1000)

    def test_get_esbmc_output_sized(self):
        esbmc_output: str = data_esbmc_output["reach_prob_density/gcas_5_safe.c-amalgamation-149.c"]
        self.assertGreater(num_tokens_from_string(esbmc_output), MAX_ESBMC_OUTPUT)
        self.assertLessEqual(num_tokens_from_string(get_esbmc_output_sized(esbmc_output)), MAX_ESBMC_OUTPUT)

unittest.main(argv=[''], verbosity=2, exit=False)