<div align="center">
<img src="https://poorit.in/image.png" alt="Poorit" width="40" style="vertical-align: middle;"> <b>AI SYSTEMS ENGINEERING 1</b>

## Unit 1 Exercises: Tokenization and Conversation Memory

**CV Raman Global University, Bhubaneswar**  
*AI Center of Excellence*

</div>

---

Complete the exercises below using the helper functions and setup provided. Each question has one or more code cells in which you replace the `___` placeholders with your solution.

## Setup

Run the cells below to install packages, import libraries, and define helper functions.

In [None]:
# Install required packages
!pip install -q openai tiktoken

In [None]:
# Import required libraries
import os
import tiktoken
from openai import OpenAI

In [None]:
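# Configure the Gemini API key
from google.colab import userdata  # Colab only; needed for Option 1 below
from getpass import getpass

# Option 1 (Colab Secrets): uncomment this line and comment out Option 2
# GEMINI_API_KEY = userdata.get("GEMINI_API_KEY")
# Option 2: type or paste the key at the prompt (input is hidden)
GEMINI_API_KEY = getpass("Enter your Gemini API key: ")
# Gemini's OpenAI-compatible endpoint, so the OpenAI client can be reused
GEMINI_BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai/"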

client = OpenAI(
    base_url=GEMINI_BASE_URL,
    api_key=GEMINI_API_KEY
)

MODEL = "gemini-2.0-flash-lite"
print(f"Gemini configured with model: {MODEL}")
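
If you run this notebook outside Colab, one alternative is to read the key from an environment variable and fall back to a hidden prompt. The sketch below is only an illustration: `load_api_key` is a hypothetical helper, not part of the course setup, and the variable name `GEMINI_API_KEY` is assumed to match the one used above.

```python
import os
from getpass import getpass


def load_api_key(env_var="GEMINI_API_KEY"):
    """Return the key from the environment, or prompt for it if unset (hypothetical helper)."""
    key = os.environ.get(env_var)
    if not key:
        # Nothing in the environment: ask interactively without echoing the input
        key = getpass(f"Enter your {env_var}: ")
    return key
```

Calling `load_api_key()` only prompts when the variable is missing, so the same cell works in scripts and in interactive sessions.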

In [None]:
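# Tokenizer and helper functions

# tiktoken implements OpenAI's tokenizers. Gemini tokenizes text differently, so the
# counts and costs below are GPT-4o-mini estimates, not exact figures for MODEL.
encoding = tiktoken.encoding_for_model("gpt-4o-mini")

# GPT-4o-mini pricing (as of 2024), in USD
INPUT_PRICE_PER_1M = 0.15   # $0.15 per 1M input tokens
OUTPUT_PRICE_PER_1M = 0.60  # $0.60 per 1M output tokens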


def count_tokens(text):
    """Count the number of tokens in a text."""
    return len(encoding.encode(text))


def estimate_cost(input_tokens, output_tokens):
    """Estimate cost in USD."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M
    return input_cost + output_cost
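
As a quick sanity check of the cost formula, here is a worked example with made-up token counts (1,000 input and 500 output tokens, chosen only for illustration). The constants and function are copied from the helper cell so the snippet runs on its own.

```python
# Prices copied from the helper cell (GPT-4o-mini, USD per 1M tokens)
INPUT_PRICE_PER_1M = 0.15
OUTPUT_PRICE_PER_1M = 0.60


def estimate_cost(input_tokens, output_tokens):
    """Estimate cost in USD."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_1M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_1M
    return input_cost + output_cost


# Illustrative counts only:
# 1,000 / 1,000,000 * 0.15 = 0.00015 and 500 / 1,000,000 * 0.60 = 0.00030
example_cost = estimate_cost(1_000, 500)
print(f"Estimated cost: ${example_cost:.6f}")  # Estimated cost: $0.000450
```

Note that output tokens cost four times as much as input tokens at these prices, so a short prompt with a long reply can cost more than the reverse.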

In [None]:
# Conversation class for managing multi-turn conversations

class Conversation:
    """Manage a conversation with memory."""

    def __init__(self, system_prompt="You are a helpful assistant"):
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_message):
        self.messages.append({"role": "user", "content": user_message})

        response = client.chat.completions.create(
            model=MODEL,
            messages=self.messages
        )

        assistant_message = response.choices[0].message.content
        self.messages.append({"role": "assistant", "content": assistant_message})

        return assistant_message

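    def get_token_count(self):
        """Approximate token count of the whole history (message content only)."""
        # Role labels and per-message formatting overhead are not counted.
        return sum(count_tokens(m["content"]) for m in self.messages)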
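
To see the bookkeeping that `Conversation` does without spending API calls, the sketch below uses an offline stand-in. `OfflineConversation` and its canned reply are hypothetical and exist only for illustration; the list handling mirrors the class above.

```python
class OfflineConversation:
    """Same message bookkeeping as Conversation, with a canned reply instead of an API call."""

    def __init__(self, system_prompt="You are a helpful assistant"):
        self.messages = [{"role": "system", "content": system_prompt}]

    def chat(self, user_message):
        self.messages.append({"role": "user", "content": user_message})
        # Hypothetical stand-in for client.chat.completions.create(...)
        assistant_message = f"(canned reply to: {user_message})"
        self.messages.append({"role": "assistant", "content": assistant_message})
        return assistant_message


demo = OfflineConversation()
demo.chat("Hello")
demo.chat("What did I just say?")

# Inspect which roles are stored, in order
roles = [m["role"] for m in demo.messages]
print(roles)  # ['system', 'user', 'assistant', 'user', 'assistant']
```

In the real class, this entire list is what gets passed as `messages` on every call.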

---

## Q1: Token Cost Estimator

Using the `count_tokens()` helper (built on `tiktoken`) and the `estimate_cost()` function, calculate the estimated API cost for the following scenario:

- **System prompt:** `"You are a helpful tutor who explains concepts clearly to college students."`
- **User message:** The full text of the Indian national anthem (Jana Gana Mana) in English transliteration
- **Assume 150 output tokens**

Print the input token count, output token count, and total estimated cost in USD.

In [None]:
# System prompt
system_prompt = "You are a helpful tutor who explains concepts clearly to college students."

# User message: Indian national anthem in English transliteration
user_message = """
Jana Gana Mana Adhinayaka Jaya He
Bharata Bhagya Vidhata
Punjab Sindh Gujarat Maratha
Dravida Utkala Banga
Vindhya Himachala Yamuna Ganga
Uchchala Jaladhi Taranga
Tava Shubha Name Jage
Tava Shubha Ashisha Mange
Gahe Tava Jaya Gatha
Jana Gana Mangala Dayaka Jaya He
Bharata Bhagya Vidhata
Jaya He Jaya He Jaya He
Jaya Jaya Jaya Jaya He
"""

# Count input tokens (system + user combined)
input_tokens = count_tokens(___) + count_tokens(___)
output_tokens = ___  # Given in the question

# Estimate cost
cost = estimate_cost(___, ___)

print(f"Input tokens:  {input_tokens}")
print(f"Output tokens: {output_tokens}")
print(f"Estimated cost: ${cost:.6f}")

---

## Q2: Context Window Usage

Write a function `check_context_fit(text, model_name, context_window)` that:

1. Counts the tokens in the given text
2. Calculates what percentage of the context window it uses
3. Prints and returns whether the text fits within the context window (`True`/`False`)

Test it with a string that repeats `"CV Raman Global University is a great place to study. "` **500 times**, using GPT-4o-mini's context window of 128,000 tokens.

Then increase the repetition count until it **exceeds** the context window. What repetition count is needed?
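
Once you know roughly how many tokens one repetition adds, you can estimate the threshold arithmetically instead of by trial and error. The sketch below assumes the token count grows linearly with the repetition count (only approximately true, since tokens can merge across repetition boundaries) and that "exceeds" means strictly more tokens than the window. `min_repeats_to_exceed` is a hypothetical helper, and the value `10` is a placeholder, not the measured count for the sentence above.

```python
def min_repeats_to_exceed(tokens_per_repeat, context_window):
    """Smallest n such that n * tokens_per_repeat > context_window (hypothetical helper)."""
    return context_window // tokens_per_repeat + 1


# Placeholder value: measure the real tokens-per-repetition with count_tokens()
estimate = min_repeats_to_exceed(10, 128_000)
print(estimate)  # 12801
```

Treat the result as a starting guess and confirm it with `check_context_fit`, because the linear assumption may be off by a few tokens.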

In [None]:
def check_context_fit(text, model_name, context_window):
    """Check if text fits within a model's context window."""
    # Step 1: Count tokens
    tokens = count_tokens(___)

    # Step 2: Calculate percentage
    percentage = (___ / ___) * 100

    # Step 3: Check fit
    fits = tokens ___ context_window  # which comparison operator?

    print(f"Model: {model_name}")
    print(f"Tokens: {tokens:,}")
    print(f"Context window: {context_window:,}")
    print(f"Usage: {percentage:.2f}%")
    print(f"Fits: {fits}")
    return fits

In [None]:
# Test with 500 repetitions
text_500 = "CV Raman Global University is a great place to study. " * ___
check_context_fit(text_500, "gpt-4o-mini", 128_000)

# Now find the repetition count that exceeds the context window
# Hint: try increasing the multiplier until fits becomes False
print("\n--- Finding the limit ---")
text_big = "CV Raman Global University is a great place to study. " * ___
check_context_fit(text_big, "gpt-4o-mini", 128_000)

---

## Q3: Conversation Memory in Action

Using the `Conversation` class, have a conversation with the model where you:

1. Tell it: `"I am studying Computer Science at CV Raman University."`
2. Tell it: `"My favorite programming language is Python."`
3. Ask it: `"Based on what you know about me, suggest a project idea."`
4. Print `get_token_count()` after **each** message.

In [None]:
# Create a conversation
conv = Conversation()

# Message 1
print("User: I am studying Computer Science at CV Raman University.")
print("Assistant:", conv.chat("I am studying Computer Science at CV Raman University."))
print(f"Token count: {conv.get_token_count()}\n")

# Message 2
print("User: My favorite programming language is Python.")
print("Assistant:", conv.chat("___"))
print(f"Token count: {conv.get_token_count()}\n")

# Message 3
print("User: Based on what you know about me, suggest a project idea.")
print("Assistant:", conv.chat("___"))
print(f"Token count: {conv.get_token_count()}")

**Written Response:** Does the token count grow? Explain in 2–3 sentences why this happens and what it means for the cost of long conversations.

*Your answer here:*



---

## Q4: Stateless Proof

**Without** using the `Conversation` class, make two **separate** API calls:

1. First call: `"My name is [your name] and I study [your branch]."`
2. Second call: `"What is my name and what do I study?"`

Show that the second call **cannot** answer the question.

Then make a **third** call where you manually include both messages in the `messages` list, and show that now it **can** answer.

In [None]:
# Call 1: Introduce yourself (replace the placeholders with your info)
messages_1 = [
    {"role": "user", "content": "My name is ___ and I study ___."}
]

response_1 = client.chat.completions.create(model=MODEL, messages=___)
print("Call 1 response:", response_1.choices[0].message.content)

In [None]:
# Call 2: Ask without any context — a brand new messages list
messages_2 = [
    {"role": "user", "content": "What is my name and what do I study?"}
]

response_2 = client.chat.completions.create(model=___, messages=___)
print("Call 2 response:", response_2.choices[0].message.content)

In [None]:
# Call 3: Include the full conversation history so the model has context
messages_3 = [
    {"role": "user", "content": "My name is ___ and I study ___."},
    {"role": "assistant", "content": "___"},  # Copy response_1 text here
    {"role": "user", "content": "What is my name and what do I study?"}
]

response_3 = client.chat.completions.create(model=___, messages=___)
print("Call 3 response:", response_3.choices[0].message.content)

**Written Response:** Explain in 2–3 sentences why the second call cannot answer the question but the third call can.

*Your answer here:*



---

**Course Information:**
- **Institution:** CV Raman Global University, Bhubaneswar
- **Program:** AI Center of Excellence
- **Course:** AI Systems Engineering 1
- **Developed by:** [Poorit Technologies](https://poorit.in) - *Transform Graduates into Industry-Ready Professionals*

---