In [10]:
import os
from openai import OpenAI
from dotenv import load_dotenv
load_dotenv(override=True)

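# OpenRouter key loaded from .env (os.getenv returns None if the variable is missing)
api_key = os.getenv('OPENROUTER_API_KEY')

# Ollama's OpenAI-compatible endpoint ignores the key, so pass a placeholder rather than the real one
ollama_client = OpenAI(api_key='ollama', base_url="http://localhost:11434/v1")
# Despite its name, this client talks to OpenRouter, not to OpenAI directly
openai_client = OpenAI(api_key=api_key, base_url="https://openrouter.ai/api/v1")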

MODEL_GPT = 'gpt-4o-mini'
MODEL_LLAMA = 'llama3.2'
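
Since `os.getenv` returns `None` silently when a variable is missing, a small guard can surface a misconfigured `.env` before any request is sent. This is a minimal sketch; `require_env` and `HYPOTHETICAL_DEMO_KEY` are illustrative names, not part of the notebook.

```python
import os

def require_env(name):
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value

# Demo only: set a hypothetical variable so the sketch runs without a real key
os.environ.setdefault("HYPOTHETICAL_DEMO_KEY", "sk-demo")
print("key loaded:", bool(require_env("HYPOTHETICAL_DEMO_KEY")))
```

In the notebook, the same check could wrap `'OPENROUTER_API_KEY'` before the clients are created.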

In [11]:
system_prompt = """
You are a technical assistant and you can give answers and explanations to technical questions about code, algorithms, etc.
You can also help with general questions.
"""

def create_prompt(user_prompt):
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user",   "content": user_prompt}
    ]
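
To make the message format concrete, the sketch below builds the same two-message list and prints it. `create_prompt_demo` and `demo_system_prompt` are stand-ins introduced here so the block runs on its own; they mirror `create_prompt` but are not part of the notebook.

```python
import json

# Stand-in for the notebook's system_prompt, shortened for the demo
demo_system_prompt = "You are a technical assistant."

def create_prompt_demo(user_prompt, system_prompt=demo_system_prompt):
    """Same shape as create_prompt: one system message followed by one user message."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

messages = create_prompt_demo("What is tokenization?")
print(json.dumps(messages, indent=2))
```

The list is what gets passed as `messages=` in the calls that follow: the system message sets the assistant's behaviour, and the user message carries the question.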

In [12]:
# GPT with streaming
response = openai_client.chat.completions.create(
    model=MODEL_GPT,
    messages=create_prompt("What is tokenization in large language models?"),
    stream=True
)
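for chunk in response:
    # Some chunks (e.g. the final one) carry no text, so skip empty choices and None deltas
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end='', flush=True)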

Tokenization is a critical process in natural language processing (NLP) and is especially important in the context of large language models (LLMs) like GPT-3 or BERT. It involves breaking down text into smaller units, called tokens, which can then be processed by these models. Here's a breakdown of what tokenization entails:

### 1. **Purpose of Tokenization**
   - **Text Preprocessing**: Tokenization simplifies the handling of text by converting it from a continuous string into discrete units.
   - **Standardization**: It allows for consistent processing of input text, regardless of its original format.

### 2. **Types of Tokenization**
   - **Word Tokenization**: Splits text into individual words or terms. For example, "Hello, world!" becomes ["Hello", "world"].
   - **Subword Tokenization**: Breaks down words into smaller components (subwords) that capture the morphological structure of languages. For example, "unhappiness" may be tokenized into ["un", "happiness"]. This approach he
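
The streaming cell above prints each delta as it arrives but does not keep the text. If the full answer is needed afterwards, the deltas can be collected into one string. This is a minimal sketch: `collect_stream` and `fake_chunk` are hypothetical helpers, and the fake chunks only imitate the `choices[0].delta.content` shape used above so the example runs without a network call.

```python
from types import SimpleNamespace

def collect_stream(chunks):
    """Join the text deltas of a streamed chat completion into one string."""
    parts = []
    for chunk in chunks:
        # A chunk may have no choices, and a delta may carry no text (content is None)
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:
            parts.append(delta)
    return "".join(parts)

def fake_chunk(text):
    """Build an object shaped like a streamed chunk, for offline testing only."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])

demo_chunks = [
    fake_chunk("Token"),
    fake_chunk("ization"),
    fake_chunk(None),               # e.g. a final chunk without text
    SimpleNamespace(choices=[]),    # e.g. a chunk with no choices
]
full_text = collect_stream(demo_chunks)
print(full_text)
```

With a real response, `collect_stream(response)` would take the place of the printing loop. A stream can be iterated only once, so choose between printing and collecting, or do both inside the same loop.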

In [13]:
# llama3.2 via the local Ollama server, without streaming
response = ollama_client.chat.completions.create(
    model=MODEL_LLAMA,
    messages=create_prompt("What is tokenization in large language models?")
)
print(response.choices[0].message.content)

Tokenization is a crucial preprocessing step for large language models (LLMs). In essence, it's the process of breaking down text into individual units, called tokens. These tokens can be words, subwords, characters, or even symbols.

Here's why tokenization is important in LLMs:

**Goals of Tokenization:**

1.  **Input representation**: Tokenization allows the model to process input text as a sequence of tokens, which can be fed into the neural network.
2.  **Output generation**: When generating output from a language model, tokens represent individual words or phrases.

**Types of Tokens:**

1.  **Word tokens**: Individual words (e.g., "hello").
2.  **Subword tokens**: Smaller units within words that capture sub-linguistic aspects, such as prefixes and suffixes (e.g., ".in" for "into", "word", etc.). Subwords help with out-of-vocabulary (OOV) word handling.
3.  **Character-level tokenization**: Breaking down text into individual characters.

**Tokenization Challenges:**

1.  **Word b