# Solvers
## Introduction

## Installation

In [1]:
%pip install -q openai anthropic ipywidgets colorama
import os
os.environ['XDG_RUNTIME_DIR'] = "/tmp"

from helpers.reporter.pretty import pretty_results


## Your first solver

- Typically you build a test `dataset` made up of test cases.
- Each entry in a dataset is called a `Sample`.
- A test is called a `Task`: we run a `Task` on all of our `Samples`.

- A task takes each `Sample` and first runs it through a `solver`.
- A `solver` takes the input of the `Sample` and processes it.
- The simplest solver, `generate`, sends this input to an LLM.
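To make the vocabulary above concrete before involving the library, here is a minimal, library-free sketch. `ToySample`, `toy_generate`, and `run_task` are hypothetical names used only for illustration; they are not part of `inspect_ai`, and the "model" is a stub function rather than a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ToySample:
    """Hypothetical stand-in for a dataset entry: an input plus a message log."""
    input: str
    messages: List[str] = field(default_factory=list)


# A toy "solver" is any function that takes a sample and returns it updated.
ToySolver = Callable[[ToySample], ToySample]


def toy_generate(fake_model: Callable[[str], str]) -> ToySolver:
    """Build a solver that sends the sample input to a (stubbed) model."""
    def solve(sample: ToySample) -> ToySample:
        sample.messages.append(f"user: {sample.input}")
        sample.messages.append(f"assistant: {fake_model(sample.input)}")
        return sample
    return solve


def run_task(dataset: List[ToySample], solvers: List[ToySolver]) -> List[ToySample]:
    """Run every solver, in order, on every sample in the dataset."""
    for sample in dataset:
        for solve in solvers:
            solve(sample)
    return dataset


results = run_task(
    dataset=[ToySample(input="say hello")],
    solvers=[toy_generate(lambda prompt: prompt.upper())],  # stub "model"
)
print(results[0].messages)
```

The real library adds scoring, logging, and async execution on top of this basic shape.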

In [2]:
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate

@task
def simple_generate() -> Task:

    dataset=[
        Sample(
            input="Generate a javascript file name hello-world.js that prints out hello"
        )
    ]

    return Task(dataset=dataset,
        solver=[
            generate(model = "openai/gpt-4o-mini",)
        ],
    )

# We set the display to none, as we do our own reporting
results = eval(simple_generate, log_level="info",display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini
input : Generate a javascript file name hello-world.js that prints out hello
target: 
user       > Generate a javascript file name hello-world.js that prints out hello
assistant  > Sure! Here’s a simple JavaScript code that prints "Hello, World!" to the console. You can name the file `hello-world.js`.

```javascript
// hello-world.js
console.log("Hello, World!");
```

To create the file:

1. Open a text editor of your choice.
2. Copy the code above.
3. Save the file with the name `hello-world.js`.

You can run this JavaScript file using Node.js by executing the following command in your terminal:

```bash
node hello-world.js
```

This will output:

```
Hello, World!
```
**** End Sample ******************************************************************


## Set default model
Instead of specifying a model every time, we can set the default model via the environment variable `INSPECT_EVAL_MODEL`.
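The idea of a default can be sketched in plain Python. This is an illustration only: `pick_model` is a hypothetical helper, not an `inspect_ai` function, and it assumes a simple precedence in which an explicitly passed model name wins and the environment variable is the fallback.

```python
import os
from typing import Optional


def pick_model(explicit: Optional[str] = None) -> str:
    """Hypothetical helper: explicit argument first, then the environment variable."""
    name = explicit or os.environ.get("INSPECT_EVAL_MODEL")
    if not name:
        raise ValueError("no model given and INSPECT_EVAL_MODEL is not set")
    return name


os.environ["INSPECT_EVAL_MODEL"] = "openai/gpt-4o-mini"

default_choice = pick_model()                  # falls back to the environment variable
explicit_choice = pick_model("openai/gpt-4o")  # explicit argument takes precedence
print(default_choice, explicit_choice)
```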

In [3]:
# Set default model to gpt-4o-mini
os.environ['INSPECT_EVAL_MODEL'] = "openai/gpt-4o-mini"

from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate

@task
def simple_generate() -> Task:

    dataset=[
        Sample(
            input="Generate a javascript file name hello-world.js that prints out hello"
        )
    ]

    return Task(dataset=dataset,
        solver=[
            generate()
        ],
    )

results = eval(simple_generate,log_level="info",display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini
input : Generate a javascript file name hello-world.js that prints out hello
target: 
user       > Generate a javascript file name hello-world.js that prints out hello
assistant  > Certainly! Below is a simple JavaScript code snippet that will print "Hello" to the console. You can create a file named `hello-world.js` and include the following code:

```javascript
// hello-world.js
console.log("Hello");
```

To create the file, follow these steps:

1. Open your text editor or Integrated Development Environment (IDE).
2. Create a new file and name it `hello-world.js`.
3. Copy the code above into the file.
4. Save the file.

To run the script, you can use Node.js. Open your terminal or command prompt and navigate to the directory where the `hello-world.js` file is located. Run the following command:

```bash
node hello-world.js
```

This will print "Hello" to the console.
**** End Sample **************************************

## Running multiple models
- We can also set the model we want to use in the evaluation run itself.
- This allows us to run the same `Task` on different models without changing the task.

In [4]:
# Set default model to gpt-4o-mini
os.environ['INSPECT_EVAL_MODEL'] = "openai/gpt-4o-mini"

from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.solver import generate

@task
def simple_generate() -> Task:

    dataset=[
        Sample(
            input="Generate a javascript file name hello-world.js that prints out hello"
        )
    ]


    return Task(dataset=dataset,
        solver=[
            generate()
        ],
    )

models = [
    "openai/gpt-4o-mini",
    "openai/gpt-4o",
]
results = eval(simple_generate,log_level="info", model = models, display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini


Status: success Model: openai/gpt-4o
input : Generate a javascript file name hello-world.js that prints out hello
target: 
user       > Generate a javascript file name hello-world.js that prints out hello
assistant  > You can create a simple JavaScript file named `hello-world.js` that prints "hello" to the console. Below is the content you should include in that file:

```javascript
// hello-world.js
console.log("hello");
```

### Steps to Create the File:
1. Open a text editor or an integrated development environment (IDE) of your choice.
2. Copy the code above.
3. Save the file with the name `hello-world.js`.

### Running the JavaScript File:
To run the script, you can use Node.js if you have it installed. Open your terminal or command prompt, navigate to the directory where your `hello-world.js` file is located, and run the following command:

```bash
node hello-world.js
```

This will execute the script, and you should see "hello" printed in the terminal.
**** End Sample ******************************************************************

## Adding a system prompt
- As part of our test, we can also specify a system prompt.
- It is added to the conversation before `generate()` sends it off to the LLM.
- The `system_message` solver takes either a string or, if given a filename, the contents of that file.
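The "string or filename" behaviour can be pictured with a small standard-library sketch. `load_template` is a hypothetical helper written for this illustration; the library's actual lookup rules may differ, so treat this as an assumption about the general idea rather than a description of its implementation.

```python
import tempfile
from pathlib import Path


def load_template(value: str) -> str:
    """Hypothetical helper: return file contents if `value` names a file, else `value` itself."""
    try:
        path = Path(value)
        if path.is_file():
            return path.read_text(encoding="utf-8")
    except OSError:
        # Strings that are not valid paths (for example, too long) are treated as literal text.
        pass
    return value


# Case 1: the argument is the path of an existing file.
with tempfile.TemporaryDirectory() as tmp:
    prompt_file = Path(tmp) / "system_prompt.txt"
    prompt_file.write_text("You are a code generation assistant.", encoding="utf-8")
    from_file = load_template(str(prompt_file))

# Case 2: the argument is the prompt text itself.
from_string = load_template("Only output code.")

print(from_file)
print(from_string)
```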

In [5]:
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.solver import system_message, generate
from textwrap import dedent

@task
def system_message_solver() -> Task:

    dataset=[
        Sample(
            input="Generate a javascript file name hello-world.js that prints out hello"
        )
    ]

    SYSTEM_PROMPT = """You are a code generation assistant. Your task is to generate code based on the input provided.
    You should ensure that the generated code meets the requirements specified in the input.
    You should also ensure that the generated code is syntactically correct and can be executed without errors.
    Make sure to output the code and not any additional text or comments.
    """
    
    return Task(dataset=dataset,
        solver=[
            system_message(dedent(SYSTEM_PROMPT)),
            generate()
        ],
    )

results = eval(system_message_solver,log_level="info",display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini
input : Generate a javascript file name hello-world.js that prints out hello
target: 
system     > You are a code generation assistant. Your task is to generate code based on the input provided.
    You should ensure that the generated code meets the requirements specified in the input.
    You should also ensure that the generated code is syntactically correct and can be executed without errors.
    Make sure to output the code and not any additional text or comments.

user       > Generate a javascript file name hello-world.js that prints out hello
assistant  > ```javascript
// hello-world.js
console.log('hello');
```
**** End Sample ******************************************************************
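The system message in the output above still carries the four-space indentation on its continuation lines. That is expected: `textwrap.dedent` only removes whitespace common to every line, and the first line of `SYSTEM_PROMPT` starts directly after the opening quotes with no indentation. One possible way to avoid this, shown here with a shortened prompt, is to start the string with a backslash so that every line shares the same margin:

```python
from textwrap import dedent

# The first line has no leading whitespace, so there is no common margin to strip.
unchanged = dedent("""You are a code generation assistant.
    Output only code.
    """)

# The backslash suppresses the first newline, so every line is indented equally.
cleaned = dedent("""\
    You are a code generation assistant.
    Output only code.
    """)

print(repr(unchanged))
print(repr(cleaned))
```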


## Basic picture
- The input of a `Sample` is not limited to text; we can send images too.
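The example below passes a file path. As a hedged aside, image inputs to chat APIs are often also expressed as base64 `data:` URLs, and the sketch shows how such a string can be built with the standard library. `to_data_url` is a hypothetical helper, and whether `ContentImage` accepts this exact form should be checked against the `inspect_ai` documentation.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Hypothetical helper: encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


# The 8-byte PNG file signature stands in for real image data here.
png_signature = b"\x89PNG\r\n\x1a\n"
data_url = to_data_url(png_signature)
print(data_url)
```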

In [6]:
from inspect_ai import Task, task, eval
from inspect_ai.dataset import Sample
from inspect_ai.model import ChatMessageUser, ContentImage, ContentText
from inspect_ai.solver import generate

@task
def picture_basic():
    dataset = [
        Sample(input = [
            ChatMessageUser(content = [
                ContentImage(image="data/screenshot.png"),
                ContentText(text="What is this a picture of?")
            ])
        ])
    ]

    task = Task(
        dataset=dataset,
        solver=generate(),
    )
    return task


results = eval(picture_basic,log_level="info",display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini
input : [ChatMessageUser(id='VRbGB9yMVhXu8xwi3s9xqZ', content=[ContentText(type='text', text='(Image)', refusal=None), ContentText(type='text', text='What is this a picture of?', refusal=None)], source=None, internal=None, role='user', tool_call_id=None)]
target: 

user       +> What is this a picture of?
assistant  > This is a screenshot of the Uber website, showcasing the interface for booking a ride. It includes fields for entering pickup and drop-off locations, options for selecting the date and time, and a map displayed on the right side. There's also a cookie consent notification at the bottom.
**** End Sample ******************************************************************


## Custom solver
- It's easy to build your own solver.
- <https://inspect.aisi.org.uk/solvers.html#custom-solvers>
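The shape of a custom solver is a factory: an outer function captures configuration and returns an async `solve` function that receives the state and returns it updated. The library-free sketch below shows only that shape. `ToyState` and `template_solver` are hypothetical names, and the model call is replaced by a stub so the example runs without an API key.

```python
import asyncio
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyState:
    """Hypothetical stand-in for the state a solver receives."""
    input_text: str
    messages: List[str] = field(default_factory=list)


def template_solver(template: str):
    """Outer function: captures configuration and returns the async solve step."""
    async def solve(state: ToyState) -> ToyState:
        prompt = template.format(question=state.input_text)
        # A real solver would await a model call here; this stub echoes the prompt.
        state.messages.append(f"assistant: (reply to) {prompt}")
        return state
    return solve


solve = template_solver("Generate Java code that does the following: {question}.")
final_state = asyncio.run(solve(ToyState(input_text="a function that prints out hello")))
print(final_state.messages[0])
```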

In [18]:
from inspect_ai.model import ChatMessageAssistant, Model, get_model
from inspect_ai.solver import Generate, Solver, TaskState, solver


CODE_TEMPLATE = "Generate java code that does the following: {question}. Only output the code, do not add any comments or explanations."

@solver
def code_generator(
    model: str | Model | None = None,
) -> Solver:

    # resolve model
    model = get_model(model)

    # here is where the logic happens
    async def solve(state: TaskState, generate: Generate) -> TaskState:
        # call the model directly with the templated prompt
        response = await model.generate(
            CODE_TEMPLATE.format(
                question=state.input_text
            )
        )

        # add the model's completion as an assistant message
        state.messages.append(
            ChatMessageAssistant(
                content=response.completion,
            )
        )

        # return the updated state
        return state

    return solve

@task
def code_solver() -> Task:
    dataset = [
        Sample(
            input="a function that prints out hello"
        )
    ]

    return Task(
        dataset=dataset,
        solver=code_generator(model="openai/gpt-4o-mini"),
    )


results = eval(code_solver,log_level="info",display="none")
print(pretty_results(results))


Status: success Model: openai/gpt-4o-mini
input : a function that prints out hello
target: 
user       > a function that prints out hello
assistant  > ```java
public class HelloWorld {
    public static void main(String[] args) {
        printHello();
    }

    public static void printHello() {
        System.out.println("Hello");
    }
}
```
**** End Sample ******************************************************************
