# Section 1 Exercises

## Setup

### Using this notebook with Ollama

- Install Ollama
- Pull the `llama3.1` model by running `ollama pull llama3.1`
- Create a `.env` file containing:

```ini
MODE=ollama
```

- Restart the kernel

### Using this notebook with GitHub Models

- Create a `.env` file containing:

```ini
MODE=github
```

- Go to [Access Tokens](https://github.com/settings/personal-access-tokens)
- Create a new **fine-grained token**
- Set the expiration date to cover how long you plan to use the token
- Expand **Account permissions**
- Find **Models** and change its access to read-only
- Save
- Copy the key
- Add it to `.env` as `GITHUB_TOKEN=gha....`
- Restart the kernel
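
The notebook's `utils` module is not shown here, so the following is only a sketch of how the two `.env` setups above might be turned into client settings. The helper names `parse_env` and `resolve_settings` are hypothetical, and both endpoint URLs are assumptions rather than values taken from this notebook.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Assumed OpenAI-compatible endpoints (not read from the notebook's utils module).
BASE_URLS = {
    "ollama": "http://localhost:11434/v1",
    "github": "https://models.github.ai/inference",
}


def resolve_settings(env):
    """Return (base_url, api_key) for the MODE found in the parsed .env values."""
    mode = env.get("MODE", "").lower()
    if mode not in BASE_URLS:
        raise ValueError(f"Unsupported MODE: {mode!r}")
    if mode == "github":
        api_key = env.get("GITHUB_TOKEN")
        if not api_key:
            raise ValueError("MODE=github requires GITHUB_TOKEN in .env")
    else:
        # The client insists on a key, but a local Ollama server does not check it.
        api_key = "ollama"
    return BASE_URLS[mode], api_key


example_env = "MODE=ollama\n"
base_url, api_key = resolve_settings(parse_env(example_env))
print(base_url)
```

Failing early on a missing token or an unknown mode tends to give a clearer error than a rejected request later in the notebook.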


In [1]:
%pip install -r requirements.txt

Note: you may need to restart the kernel to use updated packages.


# OpenAI Byte-Pair-Encodings

Many AI models use OpenAI's byte-pair encodings (BPE):

- `cl100k_base` (GPT-3.5 to GPT-4)
- `o200k_base` (GPT-4o, o1, o3, GPT-4.1)

The easiest and fastest way to use these encodings is with the `tiktoken` library.

In [2]:
# Tokenization for OpenAI models

from tiktoken import encoding_for_model, get_encoding

# Load the encoding for a known OpenAI model
encoding = encoding_for_model("gpt-3.5-turbo")
# or use a specific encoding
# encoding = get_encoding("cl100k_base")

# Encode the text
text = "Hello, world!"
tokens = encoding.encode(text)

tokens

[9906, 11, 1917, 0]

In [3]:
# Convert tokens back to text
decoded_text = encoding.decode(tokens)

decoded_text

'Hello, world!'
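
To make the idea behind byte-pair encoding more concrete, here is a toy sketch of the merge step: start from raw UTF-8 bytes and repeatedly fuse the most frequent adjacent pair into one token. This is an illustration only. The function names are made up for this example, and real encodings such as `cl100k_base` use a fixed merge table learned from a large corpus rather than merges computed from the input text.

```python
from collections import Counter


def most_frequent_pair(tokens):
    """Return the most common adjacent pair of tokens, or None if there is none."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0] if pairs else None


def merge_pair(tokens, pair):
    """Replace each non-overlapping occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def toy_bpe(text, num_merges):
    """Split text into single bytes, then apply `num_merges` greedy merges."""
    tokens = [bytes([b]) for b in text.encode("utf-8")]
    for _ in range(num_merges):
        pair = most_frequent_pair(tokens)
        if pair is None:
            break
        tokens = merge_pair(tokens, pair)
    return tokens


toy_tokens = toy_bpe("low lower lowest", num_merges=2)
print(toy_tokens)
```

After two merges the repeated prefix `low` becomes a single token, so the 16 input bytes shrink to 10 tokens. Joining the tokens always gives back the original bytes.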

# Hugging Face Tokenizers

Many open models use custom tokenizers and byte-pair-encodings.

Hugging Face hosts tokenizers alongside its models, so you can download them and process inputs locally.

The `tokenizers` package on PyPI provides this, and the `transformers` package makes downloading and using them even easier.

Use the `AutoTokenizer` class to download and load the tokenizer for a given model.

You will need to create an account on Hugging Face and log in using:

```bash
huggingface-cli login
```


In [4]:
# Using huggingface tokenizers
from transformers import AutoTokenizer

# Load the tokenizer for a specific model
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

# Encode the text
tokens = tokenizer(text, 
                   add_special_tokens=False, # Skip special tokens such as end-of-text markers
                   return_offsets_mapping=False) # Set to True to also get each token's character offsets

tokens



{'input_ids': [9707, 11, 1879, 0], 'attention_mask': [1, 1, 1, 1]}

In [5]:
# Alternatively, you can use the tokenizer directly
tokens = tokenizer.encode(text)
tokens

[9707, 11, 1879, 0]

In [6]:
# Decode the tokens back to text
decoded_text = tokenizer.decode(tokens)

decoded_text

'Hello, world!'

In [7]:
# Input text doesn't need to be English or ASCII
text = "你好，世界！"
tokens = encoding.encode(text)
decoded_text = encoding.decode(tokens)

tokens, decoded_text

([57668, 53901, 3922, 3574, 244, 98220, 6447], '你好，世界！')

In [8]:
# However, the number of tokens may differ
# depending on the encoding used
text = "你好，世界！"

encoding_gpt2 = get_encoding("gpt2") # Old, but used in some open models
encoding_cl100 = get_encoding("cl100k_base") # GPT-3.5 to GPT-4
# Newer encoding: about double the vocabulary of cl100k_base,
# and more efficient for languages like Chinese and Japanese
encoding_o200 = get_encoding("o200k_base")

print("GPT-2 is ", len(encoding_gpt2.encode(text)), "tokens")
print("CL100k_base is ", len(encoding_cl100.encode(text)), "tokens")
print("and o200 is", len(encoding_o200.encode(text)), "tokens")

GPT-2 is  14 tokens
CL100k_base is  7 tokens
and o200 is 4 tokens
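
One reason the counts vary so much is that these encodings work on UTF-8 bytes rather than on characters. The short check below uses only the standard library and shows how many bytes the example string occupies. An encoding with few merges for these byte sequences may need several tokens per character, while a larger vocabulary can cover a whole character, or several, with one token.

```python
text = "你好，世界！"
utf8_bytes = text.encode("utf-8")

print(len(text), "characters")
print(len(utf8_bytes), "UTF-8 bytes")

# Each of these characters, including the full-width punctuation, takes 3 bytes.
bytes_per_char = {ch: len(ch.encode("utf-8")) for ch in text}
print(bytes_per_char)
```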


# Using OpenAI client for chat completions

The OpenAI Python client has become a de facto standard, and most models work with OpenAI's API spec.

So even if you're not using OpenAI, you can still use the Python client to talk to GitHub Models, local Ollama models, and many third-party LLMs.

Normally you install it with `pip install openai`, but it's already listed in this notebook's requirements.

In [1]:
# Your first conversation with a model
from openai import OpenAI
import utils

# If you change the environment variables, you need to restart the kernel
base_url = utils.get_base_url()
api_key = utils.get_api_key()

if utils.MODE == "github":
    model = "openai/gpt-4.1-nano"  # A fast, small model
elif utils.MODE == "ollama":
    model = "llama3.1"  # Llama (the model family) and Ollama (the local runtime) are separate projects
else:
    raise ValueError(f"Unsupported MODE: {utils.MODE!r}")

# The client is an instance of the OpenAI class. Older versions of the library used
# module-level globals, so you may still see snippets written for that API.

client = OpenAI(
    base_url=base_url,
    api_key=api_key,
)

response = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "What is the capital of France?",
        }
    ],
    temperature=0.5,  # or top_p=0.9
    n=1, # Number of results to return. If you want multiple options, increase this

    # Here are some extra parameters you might need in future

    # presence_penalty=0.0, # Positive values encourage new topics. Default is 0, range is -2 to 2
    # frequency_penalty=0.0, # Positive values discourage repeating the same tokens. Default is 0, range is -2 to 2
    # max_tokens=100, # Maximum number of tokens to return
    # stop=None, # A string or list of strings; generation stops when one of them is produced
)

response.choices[0].message.content

'The capital of France is Paris.'
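
The chat completions API is stateless, so a follow-up question only has context if earlier turns are sent again in `messages`. The sketch below shows one way to build up that list. `add_turn` is a hypothetical helper written for this example, not part of the OpenAI client.

```python
def add_turn(messages, role, content):
    """Return a new message list with one more turn appended."""
    if role not in {"system", "user", "assistant"}:
        raise ValueError(f"Unexpected role: {role!r}")
    return messages + [{"role": role, "content": content}]


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
]

# Record the model's reply, then ask a follow-up that relies on it.
history = add_turn(history, "assistant", "The capital of France is Paris.")
history = add_turn(history, "user", "What is its population?")

for message in history:
    print(message["role"], ":", message["content"])
```

Passing `history` as the `messages` argument of the next `client.chat.completions.create` call would let the model resolve "its" to Paris.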

# Exercise 1: Adjust the temperature and see how it affects the output

Copy the code above and ask the model to write a **poem about the moon**.
Try different values for `temperature` and see how they affect the output.
Try varying `frequency_penalty` and `presence_penalty` too.
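
As a rough mental model for this exercise, temperature divides the model's raw scores (logits) before they are converted into probabilities. Low values sharpen the distribution toward the top choice, and high values flatten it. The sketch below uses made-up logits and a hypothetical helper, `softmax_with_temperature`. Real inference servers may apply further steps, such as top-p filtering, that are not shown here.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after scaling by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for temperature in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

The top candidate's probability shrinks as the temperature rises, which is why higher settings tend to produce more varied poems.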

In [None]:
# Create your moon poem

Below is a grading system that uses a second LLM call to score your poem. You will get a grade from A to F along with the reasoning. See if you can adjust your prompt to get a better grade.

In [3]:
# Put a better poem here
poem = """
Oh moonie moonie, shining bright,
I love to see you in the night.
You light the way for all to see,
And fill my heart with joy and glee.
"""

grade = utils.grade_poem(client, model, poem)

from IPython.display import display, Markdown

display(Markdown(grade))

Grade: D

Explanation:

Classic Poetic Structure: The poem employs a simple rhyme scheme and rhythmic pattern, but it lacks formal structure or complexity that would elevate its poetic quality.

Non-Repetitiveness: The poem is brief and does not contain redundant ideas or words, so it performs well in this regard.

Interesting Content: The content is straightforward and somewhat juvenile. It expresses affection for the moon but doesn't offer deeper insight, imagery, or a captivating narrative element.

Creativity: The language and imagery are very basic and cliché. Terms like "shining bright," "light the way," and "fill my heart with joy and glee" are common and lack originality, reducing the poem's imaginative appeal.

Overall, while the poem is lighthearted and sweet, it lacks the depth, originality, and poetic sophistication deserving of higher grades. It earns a grade of D.
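
If you want to compare several attempts, it can help to pull the letter grade out of the grader's text. The sketch below assumes the response contains a `Grade: X` line like the sample above. The actual format returned by `utils.grade_poem` may vary between runs and models, so treat `extract_grade` as a hypothetical helper.

```python
import re

# Matches "Grade: D" and also bold variants such as "**Grade:** B".
GRADE_PATTERN = re.compile(r"Grade:\s*\**\s*([A-F])\b")


def extract_grade(markdown_text):
    """Return the letter grade found in the grader's text, or None."""
    match = GRADE_PATTERN.search(markdown_text)
    return match.group(1) if match else None


sample = "Grade: D\n\nExplanation:\n\nThe poem employs a simple rhyme scheme."
print(extract_grade(sample))
```

Returning `None` instead of raising keeps a loop over many poems running even when one response does not follow the expected format.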

In [13]:
# Zero, one, and few-shot prompting