In [1]:
# | output: false
import os

from dotenv import load_dotenv

load_dotenv()

True

# ENV Setup {.smaller}

- create a virtual env
```
python3 -m venv env
source env/bin/activate
```

- install packages
```
pip install tiktoken
pip install openai
pip install instructor
pip install transformers
pip install torch
pip install python-dotenv
pip install notebook
```

- add env vars in .env file

```
OPENAI_API_KEY=<key>
TOGETHER_API_KEY=<key>
```


# Background


## NLP Through The Years  {.smaller}

- [ELIZA (MIT), 1964-1967](https://en.wikipedia.org/wiki/ELIZA#Response_and_legacy), [CS25 V4: Lecture 1 (Spring 2024)](https://docs.google.com/presentation/d/1oXPs3LXtIVIsVbwTyGjAWj_aWvak9c1uNC4uhkS6glk/edit#slide=id.gea1aecfd7b_0_0)

:::: {.columns}

::: {.column width="50%"}
![](static_blog_imgs/eliza.png){height=20% width=80%}

:::

::: {.column width="50%"}
![](static_blog_imgs/timeline.png){height=60% width=80%}
:::

::::

## Word Embeddings {.smaller}
- represent each word as an embedding (vector of numbers)
- useful computations such as distance (cosine/euclidean)
- mapping of words onto a semantic space
- example: Word2Vec (2013), GloVe, BERT, ELMo

![](static_blog_imgs/word-vectors.png)
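
The distance computations mentioned above can be sketched in a few lines. This is a minimal illustration using only the standard library; the three-dimensional vectors are made up for the example and are not real Word2Vec or GloVe embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 3-d vectors for illustration; real embeddings have hundreds of dimensions
embeddings = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.9, 0.15],
    "taco": [0.1, 0.2, 0.95],
}

for word in ("queen", "taco"):
    cos = cosine_similarity(embeddings["king"], embeddings[word])
    dist = euclidean_distance(embeddings["king"], embeddings[word])
    print(f"king vs {word}: cosine={cos:.3f}, euclidean={dist:.3f}")
```

With these toy values, "queen" scores a higher cosine similarity and a smaller Euclidean distance to "king" than "taco" does, which is the kind of structure a semantic space is meant to capture.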


## Attention and Transformers

- [Image Source: nlp-with-transformers book](https://github.com/nlp-with-transformers/notebooks/blob/main/03_transformer-anatomy.ipynb)

![](static_blog_imgs/self-attention.png){height=60% width=60%}


## Transformer & Multi-Head Attention


- [Attention Is All You Need: Paper](https://arxiv.org/pdf/1706.03762.pdf)

:::: {.columns}

::: {.column width="50%"}
![](static_blog_imgs/attention-transformer-paper1.png){height=70% width=70%}
:::

::: {.column width="50%"}
![](static_blog_imgs/attention-transformer-paper2.png)
:::

::::


## What is an LLM (large language model)?

- LLMs are scaled up versions of the Transformer architecture (millions/billions of parameters)
- Most modern LLMs are **decoder only transformers**
- Trained on massive amounts of “general” textual data
- Training objective is typically “next token prediction”: $P(w_{t+1} \mid w_t, w_{t-1}, \ldots, w_1)$
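
The next token prediction objective can be written out in standard notation. The symbols $\theta$ (model parameters) and $T$ (sequence length) are introduced here for the sketch and do not appear elsewhere in these slides. The probability of a whole sequence factorizes into next-token terms, and training minimizes the negative log-likelihood of each token given the tokens before it:

$$
P_\theta(w_1, \ldots, w_T) = \prod_{t=1}^{T} P_\theta(w_t \mid w_1, \ldots, w_{t-1})
$$

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log P_\theta(w_t \mid w_1, \ldots, w_{t-1})
$$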


## Next Token Prediction

- LLMs are next token predictors
- "It is raining today, so I will take my _______." 

![](static_blog_imgs/next_token1.png)

![](static_blog_imgs/next_token2.png)
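
A rough sketch of what happens at the blank: the model assigns a score (logit) to every token in its vocabulary, a softmax turns the scores into probabilities, and a decoding rule picks one. The candidate tokens and their scores below are hypothetical, chosen only to illustrate the mechanics.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    max_logit = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - max_logit) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


# Hypothetical scores for the blank in "... so I will take my _______."
logits = {" umbrella": 5.0, " raincoat": 3.5, " car": 2.0, " sunglasses": -1.0}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy decoding: take the most likely token

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok!r}: {p:.3f}")
print("greedy choice:", repr(next_token))
```

Greedy decoding is only one option; sampling from `probs` instead gives more varied text.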


## Tokenization with tiktoken library {.smaller}

- The first step is to convert the input text into **tokens**
- Each **token** has an id in the vocabulary

In [2]:
# | echo: true

import tiktoken

enc = tiktoken.encoding_for_model("gpt-4-0125")
encoded_text = enc.encode("tiktoken is great!")
encoded_text

[83, 1609, 5963, 374, 2294, 0]

In [3]:
# | echo: true

[enc.decode([token]) for token in encoded_text]

['t', 'ik', 'token', ' is', ' great', '!']

In [4]:
# | echo: true

enc.decode([83, 1609, 5963, 374, 2294, 0])

'tiktoken is great!'
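
To make the text to ids to text round trip concrete without any library, here is a toy tokenizer. It uses a hypothetical six-entry vocabulary and greedy longest-match lookup, which is a simplification and not the byte pair encoding that tiktoken actually implements; the ids are therefore different from the real ones above.

```python
# Hypothetical toy vocabulary; real vocabularies are learned from data and far larger
vocab = {"t": 0, "ik": 1, "token": 2, " is": 3, " great": 4, "!": 5}
id_to_token = {i: tok for tok, i in vocab.items()}


def encode(text):
    """Greedy longest-match tokenization (a simplification, not real BPE)."""
    ids, pos = [], 0
    while pos < len(text):
        for end in range(len(text), pos, -1):  # try the longest piece first
            piece = text[pos:end]
            if piece in vocab:
                ids.append(vocab[piece])
                pos = end
                break
        else:
            raise ValueError(f"no token for text starting at {text[pos:]!r}")
    return ids


def decode(ids):
    """Map ids back to their strings and join them."""
    return "".join(id_to_token[i] for i in ids)


ids = encode("tiktoken is great!")
print(ids)
print([id_to_token[i] for i in ids])
print(decode(ids))
```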



## Tokenization with transformers library {.smaller}

In [5]:
# | echo: true
# | warning: false

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

texts = [
    "I love summer",
    "I love tacos",
]
inputs = tokenizer(
    texts,
    return_tensors="pt",
    padding="max_length",
    max_length=16,
    truncation=True,
).input_ids
print(inputs)

print(inputs.shape)  # (B, T)
print(tokenizer.vocab_size)
for row in inputs:
    print(tokenizer.convert_ids_to_tokens(row))



tensor([[  101,  1045,  2293,  2621,   102,     0,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0],
        [  101,  1045,  2293, 11937, 13186,   102,     0,     0,     0,     0,
             0,     0,     0,     0,     0,     0]])
torch.Size([2, 16])
30522
['[CLS]', 'i', 'love', 'summer', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
['[CLS]', 'i', 'love', 'ta', '##cos', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']
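
The padding seen in the output can be reproduced by hand. The sketch below right-pads each sequence with the `[PAD]` id (0 in the output above) and also builds an attention mask, which the Hugging Face tokenizer returns as `attention_mask` alongside `input_ids`. The function name `pad_batch` is made up for this example, and its truncation is naive: it simply cuts the list and does not preserve a trailing `[SEP]`.

```python
PAD_ID = 0  # id of [PAD], as seen in the output above


def pad_batch(sequences, max_length):
    """Right-pad (or naively truncate) each sequence and build a matching mask."""
    input_ids, attention_mask = [], []
    for seq in sequences:
        seq = seq[:max_length]  # naive truncation, ignores special tokens
        n_pad = max_length - len(seq)
        input_ids.append(seq + [PAD_ID] * n_pad)
        attention_mask.append([1] * len(seq) + [0] * n_pad)  # 1 = real token, 0 = padding
    return input_ids, attention_mask


batch = [
    [101, 1045, 2293, 2621, 102],          # [CLS] i love summer [SEP]
    [101, 1045, 2293, 11937, 13186, 102],  # [CLS] i love ta ##cos [SEP]
]
input_ids, attention_mask = pad_batch(batch, max_length=8)
for ids, mask in zip(input_ids, attention_mask):
    print(ids, mask)
```

The mask lets the model ignore padding positions, so both rows can share one tensor of shape (B, T).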


## Tokenization is the First Step {.smaller}

![](static_blog_imgs/first_step.png)

## LLMs are not great at math. Why? {.smaller}

- because tokenization splits numbers into arbitrary multi-digit chunks, and next token prediction generates an answer instead of computing it

What is the average of:  2009 1746 4824 8439

![](static_blog_imgs/llms_suck_at_math.png)





In [6]:
# | echo: true

encoded_text = enc.encode("What is the average of:  2009 1746 4824 8439")
print(encoded_text)

[3923, 374, 279, 5578, 315, 25, 220, 220, 1049, 24, 220, 11771, 21, 220, 21984, 19, 220, 23996, 24]


In [7]:
# | echo: true

print([enc.decode([token]) for token in encoded_text])

['What', ' is', ' the', ' average', ' of', ':', ' ', ' ', '200', '9', ' ', '174', '6', ' ', '482', '4', ' ', '843', '9']
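
For contrast, an interpreter computes the average exactly. The second half of the sketch mimics the split seen in the output above by cutting each number into chunks of up to three digits; `chunk_digits` is a made-up helper that reproduces the observed pattern, not the tokenizer's actual algorithm.

```python
numbers = [2009, 1746, 4824, 8439]

# Exact arithmetic is trivial for an interpreter
print(sum(numbers) / len(numbers))  # 4254.5


def chunk_digits(number, size=3):
    """Split the digits left to right into chunks of up to `size` characters."""
    digits = str(number)
    return [digits[i:i + size] for i in range(0, len(digits), size)]


# The model never sees "2009" as one unit, only pieces such as "200" and "9"
print([chunk_digits(n) for n in numbers])
```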


## Basic Transformer Architecture - Further Reading {.smaller}
- Lots of resources online
- Some of the ones I enjoyed while learning:
    - Chapter 3 of the book [Natural Language Processing With Transformers](https://www.oreilly.com/library/view/natural-language-processing/9781098136789/) 
    - Andrej Karpathy's video [Let's build GPT: from scratch, in code, spelled out](https://www.youtube.com/watch?v=kCc8FmEb1nY) 
    - Sebastian Raschka's Blog Post [Understanding and Coding Self-Attention, Multi-Head Attention, Cross-Attention, and Causal-Attention in LLMs](https://magazine.sebastianraschka.com/p/understanding-and-coding-self-attention) 
    - Omar Sanseviero's Blog Post [The Random Transformer](https://osanseviero.github.io/hackerllama/blog/posts/random_transformer/) 
    - [The Illustrated Transformer](https://jalammar.github.io/illustrated-transformer/) 
    - The original paper: [Attention Is All You Need](https://arxiv.org/abs/1706.03762)

# Instruction Models

## Base Models VS Instruct Models {.smaller}

- `meta-llama/Meta-Llama-3-8B` (base model)

![](static_blog_imgs/base_model_example.png)

## Base Models VS Instruct Models {.smaller}

- `meta-llama/Meta-Llama-3-8B-Instruct`

![](static_blog_imgs/instruct_model_example.png)

## Popular Instruction Fine-Tuned LLMs {.smaller}

- closed
    - [OpenAI](https://platform.openai.com/docs/models): `gpt-4-turbo-2024-04-09`, `gpt-3.5-turbo-0125`, etc.
    - [Anthropic](https://docs.anthropic.com/claude/docs/models-overview): `opus`, `sonnet`, `haiku`
    - [Google](https://blog.google/technology/ai/google-gemini-next-generation-model-february-2024/): `Gemini 1.5`
- open
    - [Meta](https://llama.meta.com/llama3/): `Llama-3-8B-Instruct`, `Llama-3-70B-Instruct`
    - [Mistral](https://mistral.ai/technology/#models): `Mistral 7B`, `Mixtral 8x7B`, `Mixtral 8x22B`
    - [Qwen](https://github.com/QwenLM/Qwen): `Qwen-1.8B`, `Qwen-7B`, `Qwen-14B`, `Qwen-72B`
    - [HuggingFace](https://huggingface.co/HuggingFaceH4): `Zephyr-ORPO-141b-A35b-v0.1`
    - Databricks: `DBRX-Instruct-Preview`
    - [NousResearch](https://nousresearch.com/releases/): `Hermes-2-Pro-Mistral-7B`
    - [Cohere](https://docs.cohere.com/docs/command-r-plus): `Command R+`

 
## The Gap is closing {.smaller}

- [image source - Maxime Labonne](https://x.com/maximelabonne/status/1779801605702836454), [🏆 LMSYS Chatbot Arena](https://chat.lmsys.org/?leaderboard)
- [another fun animation](https://x.com/jannchie/status/1784621770018058651)

![](static_blog_imgs/closed_vs_open.png)

## Aligning language models 

- There is a large body of theory and research behind creating instruction models
- It is not covered here; these resources are good starting points:
- [Check out this recent talk, Aligning open language models, from Nathan Lambert](https://docs.google.com/presentation/d/1quMyI4BAx4rvcDfk8jjv063bmHg4RxZd9mhQloXpMn0/edit#slide=id.g2ca00c5c0f9_0_0)
- [State of GPT Keynote By Andrej Karpathy](https://www.youtube.com/watch?v=bZQun8Y4L2A)
- [Large Language Model Course by Maxime Labonne](https://github.com/mlabonne/llm-course)


# OpenAI Compatible LLM Inference

## OpenAI Compatible LLM Inference


In [8]:
# | echo: true

import openai

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)



The main characters from Lord of the Rings are Frodo Baggins, Samwise Gamgee, Aragorn, Legolas, Gimli, Gandalf, Boromir, Merry and Pippin, and Gollum.


## OpenAI Compatible LLM Inference

- [together.ai](https://www.together.ai/)

```{.python code-line-numbers="3,5"}
import openai

client = openai.OpenAI(api_key=os.environ.get("TOGETHER_API_KEY"), base_url="https://api.together.xyz/v1")
chat_completion = client.chat.completions.create(
    model="META-LLAMA/LLAMA-3-70B-CHAT-HF",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)
```

## OpenAI Compatible LLM Inference {.smaller}

- [together.ai](https://www.together.ai/)

In [9]:
# | echo: true

import openai

client = openai.OpenAI(api_key=os.environ.get("TOGETHER_API_KEY"), base_url="https://api.together.xyz/v1")
chat_completion = client.chat.completions.create(
    model="META-LLAMA/LLAMA-3-70B-CHAT-HF",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)

The main characters from J.R.R. Tolkien's epic fantasy novel "The Lord of the Rings" are:

1. **Frodo Baggins**: The hobbit who inherits the One Ring from Bilbo Baggins and undertakes the perilous journey to destroy it in the fires of Mount Doom.
2. **Samwise Gamgee** (Sam): Frodo's loyal hobbit servant and friend, who accompanies him on his quest.
3. **Aragorn (Strider)**: A human warrior who becomes the leader of the Fellowship of the Ring and helps guide Frodo on his journey. He is the rightful King of Gondor.
4. **Legolas**: An elf archer who joins the Fellowship and provides skilled marksmanship and agility.
5. **Gimli**: A dwarf warrior who joins the Fellowship and provides strength and combat skills.
6. **Gandalf the Grey**: A powerful wizard who helps guide Frodo on his quest and provides wisdom and magical assistance.
7. **Boromir**: A human warrior from the land of Gondor, who joins the Fellowship but ultimately tries to take the Ring from Frodo.
8. **Merry Brandybuck** and *

## OpenAI Compatible LLM Inference {.smaller}

- local inference with [ollama](https://ollama.com/)

In [10]:
# | echo: true
import openai

client = openai.OpenAI(api_key="ollama", base_url="http://localhost:11434/v1")
chat_completion = client.chat.completions.create(
    model="llama3",
    messages=[
        {"role": "user", "content": "Who are the main characters from Lord of the Rings?."},
    ],
)
response = chat_completion.choices[0].message.content
print(response)

The main characters in J.R.R. Tolkien's "Lord of the Rings" trilogy, which includes "The Fellowship of the Ring", "The Two Towers", and "The Return of the King", are:

1. Frodo Baggins: The hobbit who inherits the One Ring from Bilbo and sets out on a quest to destroy it in the fires of Mount Doom.
2. Samwise Gamgee (Sam): Frodo's loyal hobbit servant and friend, who accompanies him on his journey to Mordor.
3. Aragorn (Strider): A human warrior who leads the Fellowship and helps them navigate the perilous lands of Middle-earth.
4. Legolas: An elf archer who joins the Fellowship and fights alongside them against Sauron's armies.
5. Gimli: A dwarf warrior who also joins the Fellowship, seeking to avenge his father's death at the hands of orcs.
6. Boromir: The human son of the Steward of Gondor, who tries to take the One Ring from Frodo for the benefit of his own people.
7. Meriadoc Brandybuck (Merry) and Peregrin Took (Pippin): Two hobbit friends of Frodo's who accompany him on his jour
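
The three examples differ only in `api_key` and `base_url`, so the provider choice can be reduced to a small lookup. This is a sketch under assumptions: the `PROVIDERS` dict and the `client_kwargs` helper are made up for illustration, while the URLs and environment variable names are the ones used in the slides.

```python
import os

# URLs and env var names come from the slides; the dict layout is an assumption
PROVIDERS = {
    "openai": {"base_url": None, "api_key_env": "OPENAI_API_KEY"},
    "together": {"base_url": "https://api.together.xyz/v1", "api_key_env": "TOGETHER_API_KEY"},
    "ollama": {"base_url": "http://localhost:11434/v1", "api_key_env": None},
}


def client_kwargs(provider):
    """Build the keyword arguments to pass to openai.OpenAI for a provider."""
    cfg = PROVIDERS[provider]
    if cfg["api_key_env"] is None:
        api_key = "ollama"  # placeholder value, same as in the ollama slide
    else:
        api_key = os.environ.get(cfg["api_key_env"])
    kwargs = {"api_key": api_key}
    if cfg["base_url"] is not None:
        kwargs["base_url"] = cfg["base_url"]  # omit to use the library default
    return kwargs


print(client_kwargs("ollama"))
```

With the `openai` package installed, `openai.OpenAI(**client_kwargs("together"))` would then create the same client as in the together.ai slide.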

## Chat Templates {.smaller}




In [11]:
# | echo: true
# | warning: false

from transformers import AutoTokenizer

checkpoint = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)



- Each model has its own expected input format. For Llama 3 it looks like this:

```python
"""
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a friendly chatbot who always responds in the style of a pirate<|eot_id|><|start_header_id|>user<|end_header_id|>

How many helicopters can a human eat in one sitting?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
```

- With chat templates we can use this familiar standard:

In [12]:
# | echo: true
# | warning: false

messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")
print(tokenizer.decode(tokenized_chat[0]))

<|begin_of_text|><|start_header_id|>system<|end_header_id|>

You are a friendly chatbot who always responds in the style of a pirate<|eot_id|><|start_header_id|>user<|end_header_id|>

How many helicopters can a human eat in one sitting?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
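
To see what the template is doing, the same string can be assembled by hand. The function below is a hand-written approximation of the layout printed above, not the Jinja template shipped with the model; `render_llama3_chat` is a made-up name.

```python
def render_llama3_chat(messages, add_generation_prompt=True):
    """Hand-written approximation of the Llama 3 prompt layout shown above."""
    text = "<|begin_of_text|>"
    for message in messages:
        # Each turn: a role header, a blank line, the content, then an end-of-turn token
        text += f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
        text += f"{message['content'].strip()}<|eot_id|>"
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply
        text += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return text


messages = [
    {
        "role": "system",
        "content": "You are a friendly chatbot who always responds in the style of a pirate",
    },
    {"role": "user", "content": "How many helicopters can a human eat in one sitting?"},
]
prompt = render_llama3_chat(messages)
print(prompt)
```

In practice `tokenizer.apply_chat_template` is the safer choice, since the exact format differs between models and a hand-rolled string is easy to get subtly wrong.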



# Structured Output

## Structured Output {.smaller}


In [13]:
# | echo: true

import openai

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {
            "role": "user",
            "content": "Who are the main characters from Lord of the Rings?. "
            "For each character give the name, race, "
            "favorite food, skills, weapons, and a fun fact.",
        },
    ],
)
response = chat_completion.choices[0].message.content
print(response)

1. Frodo Baggins
- Race: Hobbit
- Favorite food: Mushrooms
- Skills: Determination, stealth, resilience
- Weapons: Sting (his sword)
- Fun fact: Frodo is the only character to have directly interacted with the One Ring and survived its corrupting influence.

2. Aragorn (also known as Strider)
- Race: Human (Dunedain)
- Favorite food: Lembas bread
- Skills: Swordsmanship, tracking, leadership
- Weapons: Anduril (his sword), bow and arrows
- Fun fact: Aragorn is the heir to the throne of Gondor and the rightful King of Arnor.

3. Gandalf
- Race: Maia (wizard)
- Favorite food: Pipe-weed
- Skills: Magic, wisdom, leadership
- Weapons: Glamdring (his sword), staff
- Fun fact: Gandalf is actually one of the Maiar, a group of powerful beings who serve the Valar (gods) in the world of Middle-earth.

4. Legolas
- Race: Elf
- Favorite food: Waybread (Lembas)
- Skills: Archery, agility, keen eyesight
- Weapons: Bow and arrows, knives
- Fun fact: Legolas is the son of Thranduil, the Elven King of t

## Structured Output {.smaller}

- [JSON mode](https://platform.openai.com/docs/guides/text-generation/json-mode) and [Function Calling](https://platform.openai.com/docs/guides/function-calling) give us structured output
- [instructor - library](https://github.com/jxnl/instructor) - ["Pydantic is all you need"](https://www.youtube.com/watch?v=yj-wSRJwrrc)
 

In [14]:
# | echo: true

import instructor
import openai
from pydantic import BaseModel

client = instructor.from_openai(openai.OpenAI())


# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Extract structured data from natural language
user_info = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "Chris is 38 years old."}],
)
print(user_info.model_dump())
print(user_info.name)
print(user_info.age)

{'name': 'Chris', 'age': 38}
Chris
38


## Structured Output {.smaller}

In [15]:
# | echo: true

from typing import List

import instructor
import openai
from pydantic import BaseModel, field_validator

client = instructor.from_openai(openai.OpenAI())


class Character(BaseModel):
    name: str
    race: str
    fun_fact: str
    favorite_food: str
    skills: List[str]
    weapons: List[str]


class Characters(BaseModel):
    characters: List[Character]

    @field_validator("characters")
    @classmethod
    def validate_characters(cls, v):
        if len(v) < 10:
            raise ValueError(f"The number of characters must be at least 10, but it is {len(v)}")
        return v

In [16]:
# | echo: true

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[
        {
            "role": "user",
            "content": "Who are the main characters from Lord of the Rings?. "
            "For each character give the name, race, "
            "favorite food, skills, weapons, and a fun fact. Give me at least 10 different characters.",
        },
    ],
    response_model=Characters,
    max_retries=4,
)

from pprint import pprint

pprint(response.model_dump())

{'characters': [{'favorite_food': 'Mushrooms',
                 'fun_fact': 'Frodo is the nephew of Bilbo Baggins.',
                 'name': 'Frodo Baggins',
                 'race': 'Hobbit',
                 'skills': ['Ringbearer', 'Stealth', 'Courage'],
                 'weapons': ['Sting', 'Phial of Galadriel']},
                {'favorite_food': 'Lembas bread',
                 'fun_fact': 'Aragorn is the rightful heir to the throne of '
                             'Gondor.',
                 'name': 'Aragorn',
                 'race': 'Man',
                 'skills': ['Swordsmanship', 'Leadership', 'Tracking'],
                 'weapons': ['Anduril', 'Bow and Arrow']},
                {'favorite_food': 'Roast Pork',
                 'fun_fact': 'Gimli is the son of Gloin, one of the Dwarves in '
                             "'The Hobbit'.",
                 'name': 'Gimli',
                 'race': 'Dwarf',
                 'skills': ['Axe throwing', 'Smithing', 'Courage'],
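
The idea behind a validator combined with `max_retries` can be sketched as a loop: parse the reply, validate it, and on failure send the error back to the model and ask again. The code below is a conceptual illustration only, not instructor's actual implementation; `fake_llm` is a hypothetical stand-in for a model call, and the validation is done by hand instead of with Pydantic so the example runs with the standard library alone.

```python
import json


def validate_characters(payload):
    """Raise ValueError if the payload breaks the rules; return it otherwise."""
    characters = payload.get("characters")
    if not isinstance(characters, list):
        raise ValueError("'characters' must be a list")
    if len(characters) < 10:
        raise ValueError(f"The number of characters must be at least 10, but it is {len(characters)}")
    return payload


def fake_llm(messages):
    """Hypothetical model: too few characters at first, enough after seeing an error."""
    saw_error = any("Validation error" in m["content"] for m in messages)
    count = 10 if saw_error else 3
    return json.dumps({"characters": [{"name": f"Character {i}"} for i in range(count)]})


def create_with_retries(messages, max_retries=4):
    """Call the model, validate the reply, and feed errors back until it passes."""
    messages = list(messages)
    for attempt in range(1 + max_retries):
        raw = fake_llm(messages)
        try:
            return validate_characters(json.loads(raw)), attempt
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            messages.append({"role": "assistant", "content": raw})
            messages.append({"role": "user", "content": f"Validation error: {err}. Please fix and try again."})
    raise RuntimeError("no valid response within the retry budget")


result, retries_used = create_with_retries(
    [{"role": "user", "content": "Give me at least 10 different characters."}]
)
print(len(result["characters"]), "characters, retries used:", retries_used)
```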
 

## Function Calling {.smaller}


In [17]:
# | echo: true

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather_forecast",
            "description": "Provides a weather forecast for a given location and date.",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}, "date": {"type": "string"}},
                "required": ["location", "date"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "book_flight",
            "description": "Book a flight.",
            "parameters": {
                "type": "object",
                "properties": {
                    "departure_city": {"type": "string"},
                    "arrival_city": {"type": "string"},
                    "departure_date": {"type": "string"},
                    "return_date": {"type": "string"},
                    "num_passengers": {"type": "integer"},
                    "cabin_class": {"type": "string"},
                },
                "required": [
                    "departure_city",
                    "arrival_city",
                    "departure_date",
                    "return_date",
                    "num_passengers",
                    "cabin_class",
                ],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "send_slack_message",
            "description": "Send a slack message to specific channel.",
            "parameters": {
                "type": "object",
                "properties": {"channel_name": {"type": "string"}, "message": {"type": "string"}},
                "required": ["channel_name", "message"],
            },
        },
    },
]

import json
from datetime import date

import openai

client = openai.OpenAI()
chat_completion = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "system", "content": f"Today's date is {date.today()}"},
        {
            "role": "user",
            "content": """This coming Friday I need to book a flight from Halifax, NS to Austin, Texas. 
                                    It will be me and my friend and we need first class seats. 
                                    We will come back on Sunday. Let me know what I should pack for clothes 
                                    according to the weather there each day. Also please remind my team on 
                                    the DEV slack channel that I will be out of office on Friday. 
                                    1. Book the flight. 
                                    2. Let me know the weather. 
                                    3. Send the slack message.""",
        },
    ],
    tools=tools,
)

# tool_calls is None when the model replies without calling any tool
for tool in chat_completion.choices[0].message.tool_calls or []:
    print(f"function name: {tool.function.name}")
    print(f"function arguments: {json.loads(tool.function.arguments)}")
    print()

function name: book_flight
function arguments: {'departure_city': 'Halifax', 'arrival_city': 'Austin', 'departure_date': '2024-05-03', 'return_date': '2024-05-05', 'num_passengers': 2, 'cabin_class': 'First'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-03'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-04'}

function name: get_weather_forecast
function arguments: {'location': 'Austin, Texas', 'date': '2024-05-05'}

function name: send_slack_message
function arguments: {'channel_name': 'DEV', 'message': 'I will be out of office this Friday, May 3, 2024. Please reach out via email if urgent.'}
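
The model only returns function names and JSON arguments; running the functions is up to the application. A minimal dispatch sketch follows, assuming stub implementations of the three tools (the real ones would call a weather API, a booking system, and Slack). `TOOL_REGISTRY` and `run_tool_call` are made-up names for this example.

```python
import json


def get_weather_forecast(location, date):
    return f"(stub) forecast for {location} on {date}"


def book_flight(departure_city, arrival_city, departure_date, return_date, num_passengers, cabin_class):
    return f"(stub) booked {num_passengers} {cabin_class} seats {departure_city} -> {arrival_city}"


def send_slack_message(channel_name, message):
    return f"(stub) posted to #{channel_name}: {message}"


# Map the names declared in `tools` to the Python functions that implement them
TOOL_REGISTRY = {f.__name__: f for f in (get_weather_forecast, book_flight, send_slack_message)}


def run_tool_call(name, arguments_json):
    """Look up the function by name and call it with the JSON-decoded arguments."""
    if name not in TOOL_REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return TOOL_REGISTRY[name](**json.loads(arguments_json))


# Two of the calls printed above, as (name, JSON string) pairs; the message is abbreviated
calls = [
    ("get_weather_forecast", '{"location": "Austin, Texas", "date": "2024-05-03"}'),
    ("send_slack_message", '{"channel_name": "DEV", "message": "I will be out of office this Friday."}'),
]
results = [run_tool_call(name, args) for name, args in calls]
for result in results:
    print(result)
```

In a full loop, each result would be sent back to the model as a `tool` message so it can write the final answer.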


# RAG: Retrieval Augmented Generation {.smaller}

## RAG: Step 1 - Index your Documents {.smaller}

- RAG is a technique for augmenting LLM knowledge with additional data.
- image source: [langchain docs](https://python.langchain.com/docs/use_cases/question_answering/)

![](static_blog_imgs/rag_part1.png)

## RAG: Step 2 - Query and Prompt LLM

![](static_blog_imgs/rag_part2.png){height=60% width=60%}
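
Both steps fit in a short toy example. The sketch below uses bag-of-words counts as a stand-in for a real embedding model and a plain list as a stand-in for a vector database; the documents and the question are made up. The final prompt is what would be sent to the LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': bag-of-words counts (real systems use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Step 1: index the (made-up) documents
documents = [
    "The office is closed on public holidays.",
    "Expense reports are due on the last Friday of each month.",
    "The VPN requires two-factor authentication.",
]
index = [(doc, embed(doc)) for doc in documents]


# Step 2: retrieve the most similar documents and build the prompt
def retrieve(question, k=1):
    query = embed(question)
    ranked = sorted(index, key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(question, k=1):
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


question = "When are expense reports due?"
print(build_prompt(question))
```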

## RAG Resources

- Vector DBs
    - [weaviate](https://weaviate.io/developers/weaviate/starter-guides/generative) 
    - [pinecone](https://www.pinecone.io/learn/retrieval-augmented-generation/)
    - [vespa](https://blog.vespa.ai/scaling-personal-ai-assistants-with-streaming-mode/)
    - [qdrant](https://qdrant.tech/articles/what-is-rag-in-ai/)

- LLM Frameworks (not required for production, but useful for learning and proofs of concept):
    - [LlamaIndex](https://www.llamaindex.ai/)
    - [langchain](https://python.langchain.com/docs/use_cases/question_answering/)

# MultiModal

## MultiModal {.smaller}

![](https://i.pinimg.com/736x/6e/71/0d/6e710de5084379ba6a57b77e6579084f.jpg)

## MultiModal {.smaller}


In [18]:
# | echo: true

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is unusual about this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://i.pinimg.com/736x/6e/71/0d/6e710de5084379ba6a57b77e6579084f.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)

The unusual aspect of this image is a man ironing clothes on an ironing board placed on top of a taxi in the middle of a busy street. This is an uncommon sight, as ironing typically takes place in domestic or commercial indoor settings. The juxtaposition of such a mundane, home-based activity with the fast-paced, outdoor environment of a city street is quite remarkable and humorous. Additionally, both the ironing board and the taxi are branded with the same logo, suggesting that this scene might be part of a promotional event or public stunt to attract attention.
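
The `image_url` field also accepts a base64 data URL, which is useful for local files that are not hosted anywhere. A small sketch of building such a message; `image_message` is a made-up helper, and the bytes below are a placeholder so the example runs without a file.

```python
import base64


def image_message(prompt, image_bytes, mime_type="image/jpeg"):
    """Build a user message whose image is embedded as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


# Placeholder bytes, not a real image; in practice read them from a file,
# e.g. open("photo.jpg", "rb").read() (hypothetical path)
fake_bytes = b"\xff\xd8\xff\xe0 placeholder"
message = image_message("What is unusual about this image?", fake_bytes)
print(message["content"][1]["image_url"]["url"][:40])
```

The resulting dict can be passed in `messages=[message]` in place of the URL-based message used above.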


## MultiModal {.smaller}

![](https://media.makeameme.org/created/it-worked-fine.jpg)

## MultiModal {.smaller}

In [23]:
# | echo: true

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://media.makeameme.org/created/it-worked-fine.jpg",
                    },
                },
            ],
        }
    ],
    max_tokens=300,
)

print(response.choices[0].message.content)

The image is a meme featuring two juxtaposed elements. In the background, there is a scene of a house on fire with firefighters and emergency responders at the site, attempting to manage the situation. In the foreground, there is a young girl smirking at the camera with a knowing expression. Overlaid text reads, "IT WORKED FINE IN DEV, IT'S A DEVOPS PROBLEM NOW," humorously suggesting that a problem developed during the software development stage is now a problem for the DevOps team to handle. The meme uses the incongruity between the calm and mischievous expression of the girl and the chaotic scene behind her to underline its comedic message about shifting blame in a development context.


## MultiModal

![](https://storage.googleapis.com/pai-images/a6d0952a331d40489b216e7f3f1ff6ed.jpeg)

## MultiModal {.smaller}


In [21]:
# | echo: true

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Give me a long list of visual search tags/keywords so I can "
                    "index this image in my visual search index. Respond in JSON format {'labels': ['label1', ...]}",
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://storage.googleapis.com/pai-images/a6d0952a331d40489b216e7f3f1ff6ed.jpeg",
                    },
                },
            ],
        }
    ],
    response_format={"type": "json_object"},
)

print(response.choices[0].message.content)

{
  "labels": [
    "animated character",
    "wizard",
    "Minion",
    "fantasy",
    "3D illustration",
    "cute",
    "magic",
    "staff",
    "long beard",
    "blue hat",
    "glasses",
    "overalls",
    "adventure",
    "comical character",
    "grey beard",
    "wooden staff",
    "round glasses",
    "yellow",
    "character design",
    "creative",
    "digital art",
    "sorcerer",
    "cartoon",
    "funny",
    "elderly character",
    "mystical",
    "storybook",
    "cloak",
    "leather belt",
    "buckle"
  ]
}


## Code Interpreter (Data Analysis)

- give the LLM access to Python
- your own little data analyst to give tasks to

[example](https://chat.openai.com/c/2a470564-6868-408a-ae61-b7c1c24c40a5)

# Fine Tuning

## Fine Tuning
- todo
- [axolotl](https://github.com/OpenAccess-AI-Collective/axolotl?tab=readme-ov-file)
- [torchtune](https://github.com/pytorch/torchtune)

# Agents

## Agents
- todo

# Resources


## Resources {.smaller}

:::: {.columns}

::: {.column width="33%"}
- [Jeremy Howard](https://x.com/jeremyphoward)
    - [A Hackers' Guide to Language Models](https://www.youtube.com/watch?v=jkrNMKz9pWU&t=186s)
- [Hamel Husain](https://twitter.com/HamelHusain)
- [Maxime Labonne](https://twitter.com/maximelabonne)
- [Sebastian Raschka](https://twitter.com/rasbt)
- [anton](https://twitter.com/abacaj)
- [Teknium](https://twitter.com/Teknium1)
- [Simon Willison](https://twitter.com/simonw)
- [ThursdAI podcast](https://open.spotify.com/show/2J3lqMPD0BUI0bF9KJYKc1?si=9eed369fd01a4ade) 
- [Latent Space](https://www.latent.space/)

:::

::: {.column width="33%"}
- [Jay Alammar](https://twitter.com/JayAlammar)
- [Omar Sanseviero](https://twitter.com/osanseviero)
- [Jason Liu](https://twitter.com/jxnlco)
- [Omar Khattab](https://twitter.com/lateinteraction)
- [Wing Lian](https://twitter.com/winglian)
- [Nous Research](https://twitter.com/NousResearch)
- [Alex Albert](https://twitter.com/alexalbert__)
- [Matt Shumer](https://twitter.com/mattshumer_)
- [lmsysorg](https://twitter.com/lmsysorg)
- [Axolotl](https://twitter.com/axolotl_ai)
- [Nathan Lambert](https://twitter.com/natolambert)
:::

::: {.column width="33%"}
- [Tanishq Abraham](https://twitter.com/iScienceLuvr)
- [Philipp Schmid](https://twitter.com/_philschmid)
- [Tim Dettmers](https://twitter.com/Tim_Dettmers)
- [Eugene Yan](https://twitter.com/eugeneyan)
- [Georgi Gerganov](https://twitter.com/ggerganov)
- [Jim Fan](https://twitter.com/DrJimFan)
- [swyx](https://twitter.com/swyx)
- [Charles Frye](https://twitter.com/charles_irl)
- [Jonathan Frankle](https://twitter.com/jefrankle)
- [Nils Reimers](https://twitter.com/Nils_Reimers)
- [Alignment Lab AI](https://twitter.com/alignment_lab)
- [people I follow](https://twitter.com/cleavey1985/following)
:::

::::