# Writing Custom Tests

You can also create custom tests with AutoRedTeam, which gives you more flexibility when the
default tests do not suit your needs. Although this is a bit more advanced than the previous example,
it is still straightforward and takes only a handful of lines of code.

Below, we are going to craft a simple test of factuality. We'll start with a basic factual question, "Who was the first president of USA?",
generate a few paraphrases of it, and write a test that checks whether the model still gives the intended answer to each paraphrase.

Let's start by importing a few classes from `autoredteam`.

In [1]:
from autoredteam.tests.base import Test
from autoredteam.perturbations.paraphrase import PegasusT5
from autoredteam.detectors.base import StringAbsenceDetector



The base `Test` class lets you write custom tests, the `PegasusT5` class wraps a paraphrase generation model, and `StringAbsenceDetector` checks whether a specified string is absent from the output. A detector is called on an 'attempt', which is essentially an LLM input-output pair.
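To make the idea of a string-absence check concrete, here is a minimal standalone sketch in plain Python. It does not use AutoRedTeam, and the function name `string_absent` and the case-insensitive matching are assumptions made for illustration; the library's detector may behave differently.

```python
from typing import Iterable


def string_absent(output: str, substrings: Iterable[str]) -> bool:
    """Return True when none of the expected substrings occur in the output.

    Assumption: matching is case-insensitive. This is a hypothetical helper,
    not the library's implementation.
    """
    lowered = output.lower()
    return not any(s.lower() in lowered for s in substrings)


# An attempt is modeled here as a plain (prompt, output) pair.
attempt = (
    "Who was the first president of USA?",
    "The first President of the United States was George Washington.",
)

# The expected answer is present, so the check does not flag this attempt.
flagged = string_absent(attempt[1], ["George Washington"])
print(flagged)
```

An output that never mentions the expected string, such as "I do not know.", would be flagged by the same check.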

Let's start by generating a few paraphrases.

In [2]:
prompt = "Who was the first president of USA?"
pp_class = PegasusT5()
pp_prompts = pp_class.perturb_prompt(prompt)
pp_prompts

Some weights of PegasusForConditionalGeneration were not initialized from the model checkpoint at tuner007/pegasus_paraphrase and are newly initialized: ['model.decoder.embed_positions.weight', 'model.encoder.embed_positions.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


['Who was the first president of the USA?',
 'Who was the first president of the United States?',
 'Who was the first president of the US?',
 'Who was the first President of the USA?',
 'Who was the first American president?',
 'The first president of the USA?']

FirstPresident                                                            base.StringAbsenceDetector:   12/  12 ( 100.0%) passed


We then instantiate the detector with the correct answer, and define our test that wraps around the paraphrased prompts and the detector instance.


In [3]:
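# The detector looks for the expected answer in each model output.
custom_detector = StringAbsenceDetector(substrings=["George Washington"])

# The test bundles the paraphrased prompts with the detector.
custom_test = Test(
    name="FirstPresident",
    prompts=pp_prompts,
    detectors=[custom_detector],
)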
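Conceptually, a test like this can be pictured as a loop over prompts, generations, and a check on each output. The sketch below is only a mental model under that assumption, not AutoRedTeam's actual implementation; `run_test` and `stub_agent` are hypothetical names, and the stub stands in for a real LLM call.

```python
from typing import Callable, List


def run_test(prompts: List[str],
             agent: Callable[[str], str],
             expected: List[str],
             generations: int = 1) -> List[bool]:
    """Query the agent `generations` times per prompt and record pass/fail.

    An attempt passes when at least one expected substring appears in the output.
    """
    results = []
    for prompt in prompts:
        for _ in range(generations):
            output = agent(prompt)
            results.append(any(s in output for s in expected))
    return results


def stub_agent(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM call; always gives the same answer.
    return "The first President of the United States was George Washington."


results = run_test(
    prompts=["Who was the first American president?",
             "Who was the first president of the US?"],
    agent=stub_agent,
    expected=["George Washington"],
    generations=2,
)
print(f"{sum(results)}/{len(results)} passed")
```

With two prompts and two generations each, the sketch evaluates four attempts in total.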


That's it, we're now ready to run our test! To do this, we use a Mistral-7B model hosted on OctoAI, as before.

In [6]:
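# Uncomment and fill in your token if it is not already set in the environment.
# import os
# os.environ["OCTO_API_TOKEN"] = "<YOUR OCTO API TOKEN>"
from autoredteam.agents.octo import OctoAPI

# Request two generations per prompt.
agent = OctoAPI(name="mistral-7b-instruct-fp16", generations=2)
custom_test.run(agent)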


As we see above, our test `FirstPresident` passes 12 out of 12 attempts (100%) on the Mistral 7B model. This indicates that the model generated the correct answer in every attempt.
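The count of 12 is consistent with six paraphrased prompts sampled twice each (`generations=2`). The short sketch below just spells out that arithmetic; `pass_rate` is a hypothetical helper written for this illustration, not part of the library.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return the percentage of attempts that passed."""
    if total == 0:
        raise ValueError("no attempts were evaluated")
    return 100.0 * passed / total


num_prompts = 6   # paraphrases returned by the perturbation step
generations = 2   # outputs sampled per prompt
total_attempts = num_prompts * generations

print(total_attempts)                 # 12
print(pass_rate(12, total_attempts))  # 100.0
```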

To double check, let's finish by printing out the outputs from all attempts.


In [7]:
custom_test.eval_outputs

[' The first President of the United States was George Washington. He served two terms from April 30, 1789, to March 4, 1797. Washington played a key role in the founding of the country and set many precedents for the presidency. His leadership and vision helped establish a strong foundation for the new nation.',
 " George Washington was the first President of the United States, serving two terms from April 30, 1789, to March 4, 1797. He played a key role during the American Revolution as a military leader, and later becoming the founding father who set many precedents for the presidency. Washington's leadership and unifying influence helped establish the foundations of American democracy.",
 ' The first President of the United States was George Washington. He served two terms in office from April 30, 1789, to March 4, 1797. Washington played a crucial role in the founding of the United States and was unanimously elected by the Electoral College after the Constitutional Convention in 1