### <span style="color:lightgray">October 2024</span>

# Programming with LLMs
---

### Matt Hall, Equinor &nbsp; `mtha@equinor.com`

<span style="color:lightgray">&copy;2024  Matt Hall, Equinor &nbsp; | &nbsp; licensed CC BY, please share this work</span>

## Set up an environment

You will need:

- `jupyter` (if you want to run this notebook)
- `tiktoken`
- `python-dotenv` (NOT just 'dotenv')
- `openai`

You can also optionally install LangChain (needed for the Agent demo, below):

- `langchain`
- `langchain-community`

## Set up secrets

Make a file called `.env` or `secrets.txt` with the following contents (the values here are only placeholders; I will give you the correct key in class):

```text
AZURE_OPENAI_ENDPOINT=https://openai-common.openai.azure.com/
AZURE_OPENAI_KEY=9b0ce5dc32341ab0bf8cd9b5bf18e1f0
```

We can read environment variables from this file:

In [None]:
# 💥 Either use a file called `.env` to store these, or
# 💥 before proceeding, add secrets.txt to .gitignore

from dotenv import load_dotenv

__ = load_dotenv("secrets.txt")  # Or load_dotenv() if the file is called .env.

Now you can read the constants from the environment:

In [None]:
import os

os.getenv("AZURE_OPENAI_ENDPOINT")
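
A missing variable makes `os.getenv()` return `None` without complaint, which only shows up later as a confusing client error. As a minimal sketch, we can check up front that both settings are present. The helper name `missing_settings` is made up for this example; the variable names are the ones from the secrets file.

```python
import os

# Variable names as used in the secrets file.
REQUIRED = ("AZURE_OPENAI_ENDPOINT", "AZURE_OPENAI_KEY")

def missing_settings(names=REQUIRED):
    """Return the names of any variables that are unset or empty."""
    # missing_settings is a hypothetical helper, not part of any library.
    return [name for name in names if not os.getenv(name)]

missing = missing_settings()
if missing:
    print("Missing settings:", ", ".join(missing))
```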

## Define the client and make a request

In [None]:
from openai import AzureOpenAI


MODEL = "gpt-35-turbo" # "gpt-4o" is multimodal but more expensive.

CLIENT = AzureOpenAI(
    azure_endpoint=os.getenv("AZURE_OPENAI_ENDPOINT"),
    api_key=os.getenv("AZURE_OPENAI_KEY"),
    api_version="2024-02-01",
)

prompt = "Define AI in one sentence."

<div style="border: 2px solid green; border-radius: 10px; padding: 8px; background: #DDFFDD">
<h3>EXERCISE</h3>

[Check out the docs](https://platform.openai.com/docs/overview) to figure out what a `message` looks like and define it below.

<a title="Look at the 'Developer quick start' > Python"><strong>Hover for a hint</strong></a>
</div>

In [None]:
message = ...  # Put your message object here.
 
CLIENT.chat.completions.create(
    model=MODEL,
    messages=[message],
)

<div style="border: 2px solid green; border-radius: 10px; padding: 8px; background: #DDFFDD">
<h3>EXERCISE</h3>

Extract the plain text answer to your question.

<a title="You are looking for the attribute called choices > message > content"><strong>Hover for a hint</strong></a>
</div>

<div style="border: 2px solid green; border-radius: 10px; padding: 8px; background: #DDFFDD">
<h3>EXERCISE</h3>

Write a function called `ask()` to contain this code.
</div>

In [None]:
def ask() -> str:
    """Your code here."""
    return ...

## A tokenizer

We can use `tiktoken` for tokenization.

In [None]:
import tiktoken

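def tokenize(prompt):
    """Split a prompt into the text of its tokens."""
    encoding = tiktoken.encoding_for_model(MODEL)
    tokens = encoding.encode(prompt)
    # A token can hold part of a multi-byte character, so decode leniently.
    return [
        encoding.decode_single_token_bytes(token).decode("utf-8", errors="replace")
        for token in tokens
    ]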

tokenize("Stratigraphically.")
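
Tokens are sequences of bytes, not characters, so a single token is not guaranteed to be valid UTF-8 on its own. The sketch below uses plain Python bytes (no `tiktoken`) to show what happens when a multi-byte character is split, and one way to decode it without raising an error. The helper `safe_decode` is a made-up name for this illustration.

```python
def safe_decode(token_bytes: bytes) -> str:
    """Decode bytes as UTF-8, substituting U+FFFD for incomplete sequences."""
    # safe_decode is a hypothetical helper, not part of tiktoken.
    return token_bytes.decode("utf-8", errors="replace")

whole = "é".encode("utf-8")  # Two bytes: b'\xc3\xa9'
first_half = whole[:1]       # An incomplete UTF-8 sequence.

print(safe_decode(whole))
print(repr(safe_decode(first_half)))
```

A strict `.decode()` on `first_half` would raise `UnicodeDecodeError` instead.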

## Embeddings

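An LLM learns embeddings (vector representations of text) during training. Dedicated embedding models, like the one used below, return a single vector for a whole piece of text.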

In [None]:
def get_embedding(text, model="text-embedding-3-large"):
    text = text.replace("\n", " ")
    response = CLIENT.embeddings.create(input=[text], model=model)
    return response.data[0].embedding

get_embedding("Equinor is an energy company.")
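
Embedding vectors are usually compared with cosine similarity. Here is a minimal sketch in pure Python, using toy 3-dimensional vectors in place of real embeddings. The function `cosine_similarity` is written for this example and is not part of the `openai` package.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embeddings.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0
```

With real embeddings you would pass in two results from `get_embedding()`; values closer to 1 indicate more similar texts.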

## Conversations

We can fake a conversation by storing the chat 'steps' and passing them back to the model on each new request.

In [None]:
class Convo:
    def __init__(self, temperature=0, model='gpt-35-turbo'):
        self.temperature = temperature
        self.model = model
        self.messages = []

    def ask(self, prompt):
        self.messages.append({"role": "user", "content": prompt})
        response = CLIENT.chat.completions.create(
            model=self.model,
            temperature=self.temperature,
            max_tokens=1024,
            messages=self.messages
        )
        content = response.choices[0].message.content
        self.messages.append({'role': 'assistant',  'content': content})
        return content

    def history(self):
        return self.messages

In [None]:
convo = Convo()
convo.ask("I'm Matt, who are you?")

In [None]:
convo.ask("What's my name?")

In [None]:
convo.history()
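
Because the whole history is sent with every request, the prompt grows with each turn. One simple option, sketched here as an assumption rather than a recommendation, is to send only the most recent messages. The helper `trim_history` and its `max_messages` parameter are made up for this example, and the model will not remember anything that has been trimmed away.

```python
def trim_history(messages, max_messages=6):
    """Return only the most recent `max_messages` messages."""
    # trim_history is a hypothetical helper, not part of the Convo class.
    return messages[-max_messages:] if max_messages > 0 else []

# A fake ten-message history, alternating user and assistant turns.
history = [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message {i}"}
    for i in range(10)
]

recent = trim_history(history)
print(len(recent), recent[0]["content"])
```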

## Including images in the context

We can encode an image and send it with the prompt. Be careful about image size!

In [None]:
import httpx
import base64
import mimetypes

def ask_about_image(prompt, image_url=None, model='gpt-4o'):
    """Ask ChatGPT about an (optional) image."""
    content = []

    if image_url is not None:
        # Guess the media type from the file extension, e.g. .jpg -> image/jpeg.
        image_media_type = mimetypes.guess_type(image_url)[0] or "image/jpeg"
        image = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
        image_content = {
              "type": "image_url",
              "image_url": {"url": f"data:{image_media_type};base64,{image}"}
            }
        content.append(image_content)

    content.append({"type": "text", "text": prompt})
    
    messages = [{"role": "user", "content": content},]
    response = CLIENT.chat.completions.create(
        model=model,  # Deployment name.
        temperature=0.5,
        max_tokens=1024,
        messages=messages
    )
    
    return response.choices[0].message.content

In [None]:
from IPython.display import Image

image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/2/2c/Falla_normal_Morro_Solar_Peru.jpg/640px-Falla_normal_Morro_Solar_Peru.jpg"
Image(image_url)

In [None]:
ask_about_image("What kind of fault is this?", image_url)
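
On image size: base64 turns every 3 bytes into 4 characters, so the encoded image is about a third larger than the file. This sketch builds a data URL from placeholder bytes so we can compare the lengths. The helper `to_data_url` is a made-up name for this example, and the zero-filled bytes stand in for a real image.

```python
import base64

def to_data_url(image_bytes: bytes, media_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL."""
    # to_data_url is a hypothetical helper, not part of the openai package.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{encoded}"

raw = bytes(3000)  # Stand-in for 3000 bytes of image data.
url = to_data_url(raw)
print(len(raw), len(url))
```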
