# Project 3: Large Language Models

## Getting started

### Python setup

**Create a virtual environment**

If you are working on a lab computer, check whether a virtual environment already exists in `/var/csc306`. In a terminal do:
```
ls /var/csc306
```
If you see a folder named `csc306.venv`, someone has already created a virtual environment on this computer.

If not, run:
```
python3 -m venv /var/csc306/csc306.venv
```

(If you are working on your own computer, you can put the virtual environment wherever you want. Replace `/var/csc306` accordingly.)

**Activate virtual environment**

```
source /var/csc306/csc306.venv/bin/activate
```

**Install libraries**
```
pip install ipykernel openai python-dotenv rich rank_bm25
```

**Select a kernel**

To execute cells in this notebook, you need to connect to a Python kernel, and you want that kernel to use the virtual environment you created. In VS Code, press `Ctrl+Shift+P` and run `Python: Select Interpreter`. That should bring up a drop-down menu. Choose "Enter interpreter path" and then "Browse your file system to find a Python interpreter". Navigate to `/var/csc306/csc306.venv/bin/python`.

Now click "Select Kernel" in the top right of your VS Code editor panel. Then choose "Python environments". Your virtual environment should be one of the options.

Now, you should be able to run the following cell.

In [None]:
print("Hello Python!")

**Autoreload imported Python files when they change**

In [2]:
# Executing this cell will ensure that imported modules (.py files) will automatically
# be reloaded when they change. (However, objects that were defined with the old
# version of the class won't change.)
%load_ext autoreload
%autoreload 2

### Create an OpenAI client

An OpenAI API key will be sent to you.

Create a `.env` file in the same directory as this notebook, containing the following:
```
export OPENAI_API_KEY=[your API key]    # do not include the brackets here
```
Make sure others can't read this file:
```
chmod 600 .env
```

**Be sure to keep the key secret.  It gives access to a billable account.** If OpenAI finds it on the public web, they will invalidate it, and then no one (including you) can use this key to make requests anymore.

Now you can execute the following to get an OpenAI client object.

In [3]:
from tracking import new_default_client, read_usage
client = new_default_client()

That fetches your API key and calls `openai.OpenAI()` to make a new **client** object, whose job is to talk to the OpenAI **server** over HTTP.  (The `OpenAI` constructor has some optional arguments that configure these HTTP messages. However, the defaults should work fine for you.)

That command also saved the new client in `tracking.default_client`, which is the client that the starter code will use by default whenever it needs to talk to the OpenAI server. Thus, you should **rerun the above cell** to get a new client if you change the `default_model` in `tracking.py`, if the API key in `.env` changes, or if the organization associated with that key changes.
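
To make the `.env` format concrete, here is a minimal standard-library sketch of how a line such as `export OPENAI_API_KEY=...    # comment` can be read. It is only an illustration: the starter code presumably relies on `python-dotenv` (installed above) rather than anything like this, and `read_env_file` and the placeholder key are hypothetical names made up for the example.

```python
import tempfile
from pathlib import Path


def read_env_file(path):
    """Parse KEY=value lines, tolerating a leading `export` and trailing `# comments`."""
    values = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and full-line comments
        if line.startswith("export "):
            line = line[len("export "):]  # the shell keyword is optional
        key, sep, value = line.partition("=")
        if not sep:
            continue                      # not an assignment; ignore it
        value = value.split(" #", 1)[0]   # drop an inline comment
        values[key.strip()] = value.strip().strip("'\"")
    return values


# Demonstrate on a throwaway file so that no real key is involved.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("export OPENAI_API_KEY=sk-placeholder    # not a real key\n")
    env_path.chmod(0o600)                 # same permissions as `chmod 600 .env`
    parsed = read_env_file(env_path)

print(sorted(parsed))   # show key names only; never print the secret itself
```

Printing only the key names is deliberate: it confirms the file was read without echoing the secret into the notebook output.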

### Try the model!

You can now get answers from OpenAI models by calling methods of the `client` instance.

Here is the function from class. Try it to make sure you can access the OpenAI API.

In [None]:
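def complete(client, s: str, model="gpt-3.5-turbo-0125", **kwargs):
    # Send `s` as a single user message; extra keyword arguments
    # (e.g., n, temperature, max_tokens) are passed through to the API.
    response = client.chat.completions.create(messages=[{"role": "user", "content": s}],
                                              model=model,
                                              **kwargs)
    # One string per requested completion (there are `n` choices).
    return [choice.message.content for choice in response.choices]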

complete(client, "I went to the store and I bought apples, bananas, cherries, donuts, eggs",
         n=10, temperature=0.6, max_tokens=96)

### Compute a function using instructions and few-shot prompting

Let's try prompting the model with a sequence of multiple messages. In this case, we provide some instructions as well as few-shot prompting (actually just one-shot in this case).

Instructions are in the `system` message. The few-shot prompting consists of example inputs (`user` messages) followed by their example outputs (`assistant` messages). Then we give our real input (the final `user` message), and hope that the LLM will continue the pattern by generating an analogous output (a new `assistant` message).

In [None]:
import rich
response = client.chat.completions.create(messages=[{ "role": "system",      # instructions
                                                      "content": "Reverse the order of the words." },
                                                    { "role": "user",        # input
                                                      "content": "Good things come to those who wait." },
                                                    { "role": "assistant",   # output
                                                      "content": "Wait who those to come things good." },
                                                    { "role": "user",        # input
                                                      "content": "Colorless green ideas sleep furiously." }],
                                          model="gpt-4o-mini", temperature=0)
#rich.print(response)
response.choices[0].message.content

By modifying this call, can you get it to produce different versions of the output? Some possible behaviors you could try to arrange:

* a specific alternative output format, e.g., wait, who, those, to, come, things, good
* match the input's own formatting in the output (same capitalization and punctuation, including commas)
* reverse the phrases rather than reversing the words, e.g., To those who wait come good things.

You can try playing with the number, the content, and the order of few-shot examples, and changing or removing the instructions.

What happens if the examples conflict with the instructions?
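
When experimenting with the number and order of examples, it can help to build the `messages` list programmatically instead of editing the literal by hand. The helper below is one possible sketch; `few_shot_messages` is a hypothetical name, not part of the starter code.

```python
def few_shot_messages(instructions, examples, query):
    """Build a chat `messages` list: system instructions, example pairs, then the real input."""
    messages = []
    if instructions:                                   # allow dropping the instructions
        messages.append({"role": "system", "content": instructions})
    for example_input, example_output in examples:     # order is preserved
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": query})
    return messages


messages = few_shot_messages(
    "Reverse the order of the words.",
    [("Good things come to those who wait.", "Wait who those to come things good.")],
    "Colorless green ideas sleep furiously.",
)
for m in messages:
    print(f"{m['role']:>9}: {m['content']}")
```

The result can be passed as the `messages=` argument of `client.chat.completions.create`, exactly as in the cell above. Reordering or removing entries in the `examples` list, or passing an empty string for `instructions`, gives you the variations suggested in this section.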

### Check your usage so far

Please be careful not to write loops that use lots and lots of tokens. That will cost us money, and could hit the per-day usage limit that is shared by the whole class.

Execute the cell below whenever you want to see your cost so far. Or, just open `usage_openai.json` as a tab in your IDE.

In [None]:
read_usage()
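
One simple way to keep a loop from running away is to cap the number of requests it may make. The sketch below uses a hypothetical `CallBudget` class (not part of the starter code) and a stand-in for the API call, so it spends no tokens; it complements `read_usage()` rather than replacing it.

```python
class CallBudget:
    """Hypothetical guard: raise once a fixed number of calls has been spent."""

    def __init__(self, max_calls):
        self.max_calls = max_calls
        self.used = 0

    def spend(self):
        if self.used >= self.max_calls:
            raise RuntimeError(f"call budget of {self.max_calls} exhausted")
        self.used += 1


budget = CallBudget(max_calls=3)
prompts = ["a", "b", "c", "d", "e"]
answered = []
try:
    for p in prompts:
        budget.spend()                  # check the budget *before* each request
        answered.append(p.upper())      # stand-in for a real API call
except RuntimeError as err:
    print("stopped early:", err)
print(answered)
```

Here the loop stops after three of the five prompts, which is the point: a bug in the loop condition costs at most `max_calls` requests.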

## Dialogues and dialogue agents

The goal of this assignment is to create a good "argubot" that will talk to people about controversial topics and broaden their minds.

### A first argubot (Airhead)

You can have a conversation right now with a really bad argubot named Airhead. Try asking it about climate change! When you're done, reply with an empty string.

(The `converse()` method calls Python's `input()` function, which will prompt you for input at the command line or by popping up a box in your IDE. In VS Code, the input box appears at the top edge of the window.)


In [None]:
import argubots
d = argubots.airhead.converse()

A bot (short for "robot") is a system that acts autonomously. That corresponds to the AI notion of an agent — a system that uses some policy to choose actions to take.

The airhead agent above (defined in `argubots.py`) uses a particularly simple policy.
It is an instance of a simple Agent subclass called `ConstantAgent` (defined in `agents.py`).
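
To illustrate what "a particularly simple policy" means, here is a toy agent whose action never depends on the conversation so far. This is only a sketch of the idea: `ToyConstantAgent`, its `response` method, and the canned reply are made up for illustration, and the real `ConstantAgent` in `agents.py` may be organized differently.

```python
class ToyConstantAgent:
    """Toy stand-in (not the starter code's class): a policy that ignores its input."""

    def __init__(self, name, reply):
        self.name = name
        self.reply = reply

    def response(self, history):
        # The "policy": whatever has been said so far, choose the same action.
        return self.reply


toy = ToyConstantAgent("Toy", "That's interesting.")   # made-up canned reply
history = ["What do you think about climate change?", "But what about the data?"]
replies = [toy.response(history[: i + 1]) for i in range(len(history))]
print(replies)
```

A more capable agent has the same shape, but its policy actually reads `history` before choosing what to say.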

The result of talking to `airhead` is a `Dialogue` object (defined in `dialogue.py`). Let's look at it.

In [None]:
rich.print(d)

Each turn of this dialogue is just a tiny dictionary:

In [None]:
d[0]

### An LLM argubot (Alice)

In other CS courses, you may have encountered "conversations" between characters named Alice and Bob.

Let's try talking to the Alice of this homework, who is a much stronger baseline than Airhead. Your job in this assignment is to improve upon Alice. We'll meet Bob later.

In [None]:
# call with argument d if you want to append to the previous conversation
alicechat = argubots.alice.converse()

As you may have guessed, `alice` is powered by a prompted LLM. You can find the specific prompt in `argubots.py`.

So, while `agents.py` provides the core functionality for `Agent` objects, the argubot agents like `alice` -- and the ones that you will write! -- go into `argubots.py` instead. This is just to keep the files small.

### Simulating human characters (Bob & friends)

You'll talk to your own argubots to get a qualitative feeling for their strengths and weaknesses.
But can you really be sure you're making progress? For that, a quantitative measure can be helpful.

Ultimately, you should test an argubot like Alice by having it argue with many real humans — not just you — and using some rubric to score the resulting dialogues. But that would be slow and complicated to arrange.

So, meet Bob! He's just a simulated human. You won't edit him: he is part of the development set. Here is some information about him (from `characters.py`):

In [None]:
import characters
rich.print(characters.bob)

You can't talk directly to `characters.bob` because that's just a data object. However, you can construct a simple agent that uses that data (plus a few more instructions) to prompt an LLM.

(Which LLM does it prompt? The `CharacterAgent` constructor (defined in `agents.py`) defaults to `gpt-4o-mini` as specified in `tracking.py`. But you can override that using keyword arguments.)

Try talking to Bob about climate change, too.

In [None]:
from agents import CharacterAgent
# actually, agents.bob is already defined this way
bob = CharacterAgent(characters.bob)
# returns a dialogue, but we've already seen it so we don't want to print it again
bob.converse()
# don't print anything for this notebook cell
None

Of course, a proper user study can't just be conducted with one human user.

So, meet our bevy of beautiful Bobs! (They're not actually all named Bob — we continued on in the alphabet.)

In [None]:
import agents
agents.devset

In [None]:
agents.cara.converse()
None

You can see the underlying character data here in the notebook. Your argubot will have to deal with all of these topics and styles!

In [None]:
rich.print(characters.devset)

### Simulating conversation

We can make Alice and Bob chat.

In [None]:
from dialogue import Dialogue
d = Dialogue()                                              # empty dialogue
d = d.add('Alice', "Do you think it's okay to eat meat?")   # add first turn
print(d)

In [None]:
d = agents.bob.respond(d)
d = argubots.alice.respond(d)
rich.print(d)

In [None]:
d = agents.bob.respond(d)
d = argubots.alice.respond(d)
rich.print(d)

Anyway, let's see what happens when Alice and Bob talk for a while...

In [None]:
from simulate import simulated_dialogue
d = simulated_dialogue(argubots.alice, agents.bob, 8)
rich.print(d)
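
Conceptually, a simulated dialogue is just the alternation from the previous cells wrapped in a loop. The sketch below shows that loop with scripted toy agents, so it runs without calling an LLM. Everything in it is an assumption made for illustration: the names `toy_simulated_dialogue` and `ScriptedAgent`, the choice that the first agent speaks first, the reading of the third argument as a total number of turns, and the `content` key. Check `simulate.py` for the actual behavior.

```python
def toy_simulated_dialogue(a, b, num_turns, starter=()):
    """Alternate between two agents for `num_turns` turns, starting with `a`."""
    d = list(starter)
    speakers = [a, b]
    for i in range(num_turns):
        d = speakers[i % 2].respond(d)   # each agent returns the extended dialogue
    return d


class ScriptedAgent:
    """Toy agent that cycles through canned lines instead of calling an LLM."""

    def __init__(self, name, lines):
        self.name = name
        self.lines = lines
        self.count = 0

    def respond(self, d):
        line = self.lines[self.count % len(self.lines)]
        self.count += 1
        # Return a new list rather than mutating the one passed in.
        return d + [{"speaker": self.name, "content": line}]


toy_alice = ScriptedAgent("Alice", ["Why?", "Say more."])
toy_bob = ScriptedAgent("Bob", ["Because."])
toy_d = toy_simulated_dialogue(toy_alice, toy_bob, 4)
print([turn["speaker"] for turn in toy_d])
```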

Sometimes this kind of conversation seems to stall out, with Bob in particular repeating himself a lot. Alice doesn't seem to have a good strategy for getting him to open up. Maybe you can do a better job talking to Bob, and that will give you some ideas about how to improve Alice?

In [None]:
# your name, pulled from an earlier dialogue
myname = alicechat[0]['speaker']
# reuse the same first two turns, then type your own lines!
agents.bob.converse(d[0:2].rename('Alice', myname))
None
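
The cells above treat a dialogue as a value: `d.add(...)` and `d.rename(...)` are used for their return values, `d[0]` yields one turn, and `d[0:2]` yields a shorter dialogue. A toy data structure with the same behavior might look like the sketch below. `ToyDialogue` is a hypothetical stand-in, and the `content` key is an assumption (only the `speaker` key is visible in this notebook); see `dialogue.py` for the real class.

```python
class ToyDialogue(tuple):
    """Toy stand-in for `Dialogue`: an immutable tuple of turn dictionaries."""

    def add(self, speaker, content):
        # Return a *new* dialogue with one more turn; the original is untouched.
        return ToyDialogue(self + ({"speaker": speaker, "content": content},))

    def rename(self, old, new):
        # Return a copy in which every turn by `old` is attributed to `new`.
        return ToyDialogue(
            {**turn, "speaker": new if turn["speaker"] == old else turn["speaker"]}
            for turn in self
        )

    def __getitem__(self, index):
        result = super().__getitem__(index)
        # Slices should stay dialogues; single indexes return one turn dict.
        return ToyDialogue(result) if isinstance(index, slice) else result


d0 = ToyDialogue()
d1 = d0.add("Alice", "Do you think it's okay to eat meat?").add("Bob", "Yes, I do.")
d2 = d1[0:1].rename("Alice", "Sam")
print(len(d0), len(d1), d2[0]["speaker"])
```

Because nothing is modified in place, `d1` still attributes its first turn to Alice after `d2` is created, which is what makes it safe to reuse the opening turns of an earlier conversation.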

You can also try talking to the other characters and having Alice (or Airhead) talk to them.

<div class="alert alert-block alert-warning">
❓❓❓ <b>Task 1</b>
</div>

Define an additional character.

In [51]:
from characters import Character

# See characters.py for how to use the Character class.
# Add the definition of your character here.

**Note:** Please don't change the dev set — the characters we just loaded must stay the same. Your job in this homework is to improve the argubot (or at least try). And that means improving it according to a fixed and stable evaluation measure.