# AI Integrations for Developers — Exam

## Project Objective

The main objective is to create an **AI chatbot** that can answer questions by retrieving information from the **provided PDF file**.  

The chatbot should be able to:  
- Parse and understand the content of the PDF.  
- Use the extracted information to provide relevant and accurate answers.  
- Respond specifically to the **questions in the final cell of the notebook**.  


## General Instructions

- This notebook is a **template** where you must put your code.  
- You should **fill in all empty variables** and complete the code so that when I download your notebook and click **Run all**, all cells execute correctly and provide the answers.  
- ⚠️ **Do NOT hardcode your API key**. Use Colab environment variables (`%env OPENAI_API_KEY=your_key_here`) and access them in your code.  
- You may **create more cells** if needed. It is recommended that your code is well-structured and split logically into separate cells.  
- The function **`ask_ai(query)`** must be implemented by you. All queries will call this function to check your solution.  
- ✅ **Test cases will be created by me (the instructor).** You are **not allowed to modify, remove, or add to the test cases cell**. Your code must work correctly with the provided test cases.  
- You are **ONLY ALLOWED** to use the following:  
  - **Models:** OpenAI or Anthropic  
  - **Technologies:** LangChain or vanilla Python code  
  - **Vector Store:** Chroma DB

ℹ Before starting, please read the test queries in the final cell to understand the expected outputs.

🚨 **Any student who does not follow the template, does not stick to the required format, or whose code does not execute properly will be disqualified.**


### Important

Fill in **all the variables** in the cell.  
❌ **Do NOT put your API key directly in the code.**  
✅ The cell must be set up to take the API key from the Colab environment variables.
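
A minimal sketch of loading the key without hardcoding it, assuming it was either set with `%env` or stored in Colab Secrets. The helper name `load_api_key` is hypothetical and not part of the template; the `google.colab` import is guarded because it only exists inside Colab.

```python
import os


def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, falling back to Colab Secrets."""
    key = os.environ.get(name)
    if key:
        return key
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
    except ImportError as exc:
        raise RuntimeError(f"{name} is not set in the environment") from exc
    return userdata.get(name)
```

Checking the environment first keeps the notebook consistent with the `%env` instruction, while the Secrets fallback avoids typing the key into a cell at all.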


In [1]:
# ================================
# 🔧 RAG Configuration Variables
# ================================

# ⚠️ Do NOT put your API key here directly.
# Make sure you set your API key in Colab like this:
# %env OPENAI_API_KEY=your_key_here

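import os

# API Key: read from the environment first (set via %env),
# then fall back to Colab Secrets if the variable is not set
API_KEY = os.environ.get("OPENAI_API_KEY")
if not API_KEY:
    from google.colab import userdata
    API_KEY = userdata.get("OPENAI_API_KEY")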

# Prompt & Model Settings
PROMPT = """
<context>
You are provided with context chunks retrieved from a company handbook.
Only use this context to answer questions.
If the answer is not in the context, reply exactly: "Not in the guide."
</context>

<role>
You are a helpful assistant answering questions from a company handbook.
Only use the provided context to answer.
If the answer is not found in the context, say: 'Not in the guide.'
Keep answers concise and factual.
</role>

<user_info>
The user is a generative AI enthusiast with a strong interest in practical applications of LLMs.
They enjoy designing and refining Retrieval-Augmented Generation (RAG) pipelines and experimenting
with advanced prompt engineering techniques.
</user_info>

<examples>
Q: How many words should effective prompts average?
A: Effective prompts should average around 21 words.

Q: What does 'persona' mean in prompt writing?
A: 'Persona' refers to the role or identity assigned to the AI, which influences its responses.
</examples>

<style>
- Answer in clear, professional business language.
</style>

<format>
Your response must be plain text.
Do not include explanations of your reasoning.
</format>
"""                                          # e.g. "Summarize the document in 3 sentences"
MODEL = "gpt-4o-mini"                        # e.g. "gpt-4"
EMBEDDING_MODEL = "text-embedding-3-small"   # e.g. "text-embedding-ada-002"

# Chunking Parameters
CHUNK_SIZE = 300             # e.g. 500
CHUNK_OVERLAP = 50           # e.g. 50
TOP_N_RESULTS = 8            # e.g. 3

# Generation Parameters
OUTPUT_LENGTH = 420          # e.g. 200
TEMPERATURE = 0.2            # e.g. 0.7

### Code Organization

Create more cells if needed and put your code in them.  
It is **recommended** that your code is well-structured, split logically, and kept in separate cells for clarity.


In [2]:
# ================================
# 🔧 Install packages
# ================================

!pip install chromadb pypdf openai tiktoken




In [3]:
# ================================
# 📂 File Upload (PDF)
# ================================

from google.colab import files

# Upload a PDF file
uploaded = files.upload()

# Get filename
pdf_path = list(uploaded.keys())[0]
print(f"✅ Uploaded file: {pdf_path}")


Saving Gemini-Prompting-Guide.pdf to Gemini-Prompting-Guide.pdf
✅ Uploaded file: Gemini-Prompting-Guide.pdf


In [4]:
# ================================
# 📖 Extract text from PDF
# ================================

from pypdf import PdfReader

# Read PDF
reader = PdfReader(pdf_path)

# Extract text from all pages; `or ""` guards against pages that yield no text
extracted_text = "\n".join((page.extract_text() or "") for page in reader.pages) + "\n"

# Save extracted text to a file for verification
text_file = "extracted_text.txt"
with open(text_file, "w", encoding="utf-8") as f:
    f.write(extracted_text)

print(f"✅ Text extracted and saved to {text_file} (length: {len(extracted_text)} chars)")

# Option to download file
# from google.colab import files
# files.download(text_file)


✅ Text extracted and saved to extracted_text.txt (length: 111461 chars)
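
The extracted text contains table-of-contents lines with long dot leaders, which show up in the chunk previews printed by the search cells. One optional way to reduce that noise before chunking is sketched here. The function name `clean_extracted_text` and the four-dot threshold are assumptions for illustration, not part of the template, and applying it would change the chunk statistics reported in this notebook.

```python
import re


def clean_extracted_text(text: str) -> str:
    # Collapse runs of 4+ dots (table-of-contents leaders) into a single space.
    text = re.sub(r"\.{4,}", " ", text)
    # Collapse repeated spaces and tabs, but keep line breaks.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


print(clean_extracted_text("Marketing ............ Page 37"))
```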


In [5]:
# ================================
# ✂️ Sentence-aware Chunking
# ================================

import re
import tiktoken

# Load tokenizer for the embedding model
enc = tiktoken.encoding_for_model(EMBEDDING_MODEL)

def num_tokens(text: str) -> int:
    return len(enc.encode(text))

def split_into_sentences(text: str):
    # Simple regex-based sentence splitter
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    return [s for s in sentences if s]

def chunk_text(text, chunk_size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    sentences = split_into_sentences(text)
    chunks, current_chunk, current_tokens = [], [], 0

    for sent in sentences:
        sent_tokens = num_tokens(sent)

        # If adding this sentence would exceed the chunk size, save the current chunk.
        # The `current_chunk` guard avoids emitting an empty chunk when the very
        # first sentence is already longer than chunk_size.
        if current_chunk and current_tokens + sent_tokens > chunk_size:
            chunks.append(" ".join(current_chunk))
            # Start the new chunk with trailing sentences from the previous one,
            # carrying sentences over until at least `overlap` tokens are reached
            overlap_sentences = []
            while current_chunk and num_tokens(" ".join(overlap_sentences)) < overlap:
                overlap_sentences.insert(0, current_chunk.pop())
            current_chunk = overlap_sentences
            current_tokens = num_tokens(" ".join(current_chunk))

        # Add sentence to current chunk
        current_chunk.append(sent)
        current_tokens += sent_tokens

    # Add last chunk
    if current_chunk:
        chunks.append(" ".join(current_chunk))

    return chunks

# Create chunks
chunks = chunk_text(extracted_text, CHUNK_SIZE, CHUNK_OVERLAP)

# Save chunks to file
chunks_file = "chunks_preview.txt"
with open(chunks_file, "w", encoding="utf-8") as f:
    for i, chunk in enumerate(chunks):
        f.write(f"--- Chunk {i+1} ---\n{chunk}\n\n")

# Print statistics
print(f"✅ Total Chunks: {len(chunks)}")
chunk_lengths = [num_tokens(c) for c in chunks]
print(f"📊 Avg tokens per chunk: {sum(chunk_lengths)//len(chunk_lengths)}")
print(f"📊 Min tokens: {min(chunk_lengths)}, Max tokens: {max(chunk_lengths)}")

# Preview first chunks
for i, chunk in enumerate(chunks[:3]):
    print(f"\n🔍 Chunk {i+1} ({num_tokens(chunk)} tokens):\n{chunk[:400]}...\n")

# Allow download
# from google.colab import files
# files.download(chunks_file)


✅ Total Chunks: 105
📊 Avg tokens per chunk: 286
📊 Min tokens: 120, Max tokens: 301

🔍 Chunk 1 (261 tokens):
1
October 2024 edition
A quick-start handbook 
for effective prompts

2
Writing effective prompts 
From the very beginning, Google Workspace was built to allow you to collaborate in real time with other people. Now, you can also collaborate with AI using Gemini for Google Workspace to help boost your productivity and 
creativity without sacrificing privacy or security. The embedded generative AI-p...


🔍 Chunk 2 (293 tokens):
This guide provides you with the foundational skills to write effective prompts when using Gemini for Workspace. You can think of a prompt as a conversation starter with your AI-powered assistant. You might write several 
prompts as the conversation progresses. While the possibilities are virtually endless, you can put consistent 
best practices to work today. The four main areas to consider when ...


🔍 Chunk 3 (300 tokens):
Express complete thoughts in  
f
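
To make the overlap behaviour easier to see, here is a simplified, self-contained variant of the sentence-aware chunker. It is a sketch only: `count_tokens` counts whitespace-separated words as a stand-in for the tiktoken encoder, and the names `count_tokens` and `chunk_sentences` are hypothetical.

```python
import re


def count_tokens(text: str) -> int:
    # Stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def chunk_sentences(text: str, chunk_size: int, overlap: int) -> list[str]:
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sent in sentences:
        if current and count_tokens(" ".join(current + [sent])) > chunk_size:
            chunks.append(" ".join(current))
            # Carry trailing sentences forward until the overlap budget is met.
            carried = []
            while current and count_tokens(" ".join(carried)) < overlap:
                carried.insert(0, current.pop())
            current = carried
        current.append(sent)
    if current:
        chunks.append(" ".join(current))
    return chunks


demo = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
for chunk in chunk_sentences(demo, chunk_size=6, overlap=3):
    print(chunk)
```

With a chunk size of 6 words and an overlap of 3, each chunk holds two sentences and repeats the last sentence of the previous chunk, so a fact that straddles a boundary still appears whole in at least one chunk.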

In [6]:
# ================================
# 🔑 Create Embeddings for Chunks
# ================================

import json
from openai import OpenAI

client = OpenAI(api_key=API_KEY)

embeddings = []

print("⏳ Generating embeddings...")

for i, chunk in enumerate(chunks):
    response = client.embeddings.create(
        model=EMBEDDING_MODEL,
        input=chunk
    )
    vector = response.data[0].embedding
    embeddings.append({
        "id": f"chunk_{i+1}",
        "text": chunk,
        "embedding": vector
    })

    if (i+1) % 10 == 0 or i == len(chunks)-1:
        print(f"✅ Processed {i+1}/{len(chunks)} chunks")

# Save to JSONL file
embeddings_file = "chunk_embeddings.jsonl"
with open(embeddings_file, "w", encoding="utf-8") as f:
    for e in embeddings:
        f.write(json.dumps(e) + "\n")

print(f"\n✅ Saved embeddings to {embeddings_file} (total {len(embeddings)})")

# Allow download
# from google.colab import files
# files.download(embeddings_file)


⏳ Generating embeddings...
✅ Processed 10/105 chunks
✅ Processed 20/105 chunks
✅ Processed 30/105 chunks
✅ Processed 40/105 chunks
✅ Processed 50/105 chunks
✅ Processed 60/105 chunks
✅ Processed 70/105 chunks
✅ Processed 80/105 chunks
✅ Processed 90/105 chunks
✅ Processed 100/105 chunks
✅ Processed 105/105 chunks

✅ Saved embeddings to chunk_embeddings.jsonl (total 105)
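
The loop above sends one request per chunk. The embeddings endpoint also accepts a list of strings as `input`, so requests can be grouped. The helper below is a sketch of that batching logic only: `embed_in_batches` and `fake_embed` are hypothetical names, and `fake_embed` stands in for a real call so the example runs offline.

```python
from typing import Callable, Sequence


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 32,
) -> list[list[float]]:
    """Embed texts in fixed-size batches, preserving input order."""
    vectors: list[list[float]] = []
    for start in range(0, len(texts), batch_size):
        batch = list(texts[start:start + batch_size])
        vectors.extend(embed_batch(batch))
    return vectors


# Stand-in embedder for illustration only: one number per text (its length).
def fake_embed(batch: list[str]) -> list[list[float]]:
    return [[float(len(text))] for text in batch]


print(embed_in_batches(["a", "bb", "ccc"], fake_embed, batch_size=2))
```

In this notebook, the callable passed as `embed_batch` would presumably wrap `client.embeddings.create(model=EMBEDDING_MODEL, input=batch)` and return the `.embedding` of each item in `response.data`.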


In [7]:
# ================================
# 🗄️ Insert Embeddings into ChromaDB
# ================================

import chromadb

# Create Chroma client (in-memory for now, can persist later)
chroma_client = chromadb.Client()

# Delete collection (for reruns); ignore the error raised when it does not exist yet
try:
    chroma_client.delete_collection(name="prompting_guide")
except Exception:
    pass

# Create collection
collection = chroma_client.get_or_create_collection(name="prompting_guide", metadata={"hnsw:space": "cosine"})

# Insert embeddings
ids = [e["id"] for e in embeddings]
documents = [e["text"] for e in embeddings]
vectors = [e["embedding"] for e in embeddings]

collection.add(
    ids=ids,
    documents=documents,
    embeddings=vectors
)

print(f"✅ Inserted {len(ids)} chunks into Chroma collection 'prompting_guide'")

# Quick check: count items
print("📊 Collection count:", collection.count())

# Sample
sample = collection.peek()
print("\n🔍 Sample:", sample)


✅ Inserted 105 chunks into Chroma collection 'prompting_guide'
📊 Collection count: 105

🔍 Sample: {'ids': ['chunk_1', 'chunk_2', 'chunk_3', 'chunk_4', 'chunk_5', 'chunk_6', 'chunk_7', 'chunk_8', 'chunk_9', 'chunk_10'], 'embeddings': array([[ 0.03427482,  0.0243559 ,  0.03334006, ..., -0.0135022 ,
        -0.00385592, -0.00366117],
       [ 0.01721646,  0.03215443,  0.01425211, ..., -0.0180767 ,
        -0.00821298, -0.00122207],
       [ 0.03158743,  0.03273606,  0.0329449 , ..., -0.03111753,
        -0.00375264,  0.02439541],
       ...,
       [-0.00027881,  0.02642701,  0.06068236, ..., -0.0209523 ,
        -0.00974704,  0.01115409],
       [-0.01558568,  0.03539576,  0.0568851 , ..., -0.01010839,
         0.00940651, -0.01378834],
       [-0.00827033,  0.04495667,  0.03185001, ..., -0.00583506,
         0.01812769, -0.01005574]]), 'documents': ['1\nOctober 2024 edition\nA quick-start handbook \nfor effective prompts\n\n2\nWriting effective prompts \nFrom the very beginning, Google 

In [8]:
# ================================
# 🔍 Vector Search Function
# ================================

def search_chunks(query, n_results=TOP_N_RESULTS):
    # Embed the query
    response = client.embeddings.create(
        model=EMBEDDING_MODEL,
        input=query
    )
    query_embedding = response.data[0].embedding

    # Query Chroma
    results = collection.query(
        query_embeddings=[query_embedding],
        n_results=n_results
    )

    # Format results
    retrieved = []
    for i in range(len(results["ids"][0])):
        retrieved.append({
            "id": results["ids"][0][i],
            "text": results["documents"][0][i],
            "distance": results["distances"][0][i]
        })

    return retrieved




In [9]:
# 🔍 Quick test
test_query = "Which Google apps integrate with Gemini?"
results = search_chunks(test_query, n_results=3)

print(f"🔎 Query: {test_query}\n")
for r in results:
    print(f"📌 {r['id']} (distance: {r['distance']:.4f})")
    print(r["text"][:400] + "...\n")

🔎 Query: Which Google apps integrate with Gemini?

📌 chunk_5 (distance: 0.3471)
Page 28
Human resources ........................................................... Page 32
Marketing .................................................................... Page 37
Project management ........................................................ Page 46
Sales ......................................................................... Page 50
Small business owners and entrepreneurs ............

📌 chunk_6 (distance: 0.3484)
Engaging with Gemini in the side panel 
of your Workspace apps allows you to create highly personalized generative AI outputs that are based on your 
own files and documents — even if they aren’t Google Docs. You can generate personalized emails in seconds 
referencing your own Docs to pull in relevant context, generate Slides that are based on information directly  
from your own briefs or report...

📌 chunk_64 (distance: 0.3976)
(Gemini in Docs)
Content Marketing Manager
Use case
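
Retrieved chunks always come back ranked, even when none of them is a good match. One optional refinement is to drop results above a distance cutoff before building the context. The sketch below assumes the result dictionaries returned by `search_chunks`; the function name `filter_by_distance`, the 0.5 cutoff, and the `chunk_99` entry are made up for illustration.

```python
def filter_by_distance(retrieved: list[dict], max_distance: float = 0.5) -> list[dict]:
    """Keep only results whose distance is at or below the cutoff."""
    return [r for r in retrieved if r["distance"] <= max_distance]


sample_results = [
    {"id": "chunk_5", "distance": 0.3471},
    {"id": "chunk_64", "distance": 0.3976},
    {"id": "chunk_99", "distance": 0.71},  # made-up entry for illustration
]
print([r["id"] for r in filter_by_distance(sample_results)])
```

A suitable cutoff depends on the embedding model and the document, so it would need tuning against the test queries.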

In [10]:
# ================================
# 🧪 Debug Search Similarity
# ================================

debug_query = "Which Google apps integrate with Gemini?"

# Embed the query
response = client.embeddings.create(
    model=EMBEDDING_MODEL,
    input=debug_query
)
query_vector = response.data[0].embedding

# Search top 10 matches
results = collection.query(
    query_embeddings=[query_vector],
    n_results=10
)

print(f"🔎 Query: {debug_query}\n")
for i in range(len(results["ids"][0])):
    dist = results["distances"][0][i]
    txt = results["documents"][0][i][:300].replace("\n", " ")
    print(f"📌 {results['ids'][0][i]} | distance={dist:.4f}")
    print(f"   {txt}\n")


🔎 Query: Which Google apps integrate with Gemini?

📌 chunk_5 | distance=0.3471
   Page 28 Human resources ........................................................... Page 32 Marketing .................................................................... Page 37 Project management ........................................................ Page 46 Sales ...............................

📌 chunk_6 | distance=0.3484
   Engaging with Gemini in the side panel  of your Workspace apps allows you to create highly personalized generative AI outputs that are based on your  own files and documents — even if they aren’t Google Docs. You can generate personalized emails in seconds  referencing your own Docs to pull in relev

📌 chunk_64 | distance=0.3977
   (Gemini in Docs) Content Marketing Manager Use case: Deliver personalized content to customers at scale You want to create copy for a five-step email nurture cadence for your new product. You open a new Google  Doc and prompt Gemini in the Docs side p
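
Because the collection was created with `hnsw:space` set to `cosine`, the distances printed above are cosine distances: 1 minus the cosine similarity, so lower values mean closer matches. A plain-Python sketch of that quantity (the function name `cosine_distance` is hypothetical):

```python
import math


def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # identical direction
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

Identical directions give a distance of 0 and orthogonal vectors give 1, which puts the values around 0.35 to 0.40 seen above in the "related but not near-identical" range.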

## Test Cases (Final Cell)

The final cell must contain your **test cases**.  
When executed, the AI should provide correct answers to the given questions **based on the PDF file**.


### AI Query Function

In this cell, you must implement the function **ask_ai(query)**.  
This function will be the final execution point of your pipeline (RAG / LLM).  
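
As an illustration of the prompt-assembly step in such a pipeline, the sketch below joins retrieved chunks into a context block and stops once a character budget is used up. The function name `build_user_prompt` and the `max_chars` budget are assumptions; the `Context / Question / Answer` layout mirrors the one used in this notebook.

```python
def build_user_prompt(query: str, chunks: list[str], max_chars: int = 6000) -> str:
    """Join retrieved chunks into a context block, stopping at a character budget."""
    selected, used = [], 0
    for chunk in chunks:
        if used + len(chunk) > max_chars:
            break
        selected.append(chunk)
        used += len(chunk)
    context = "\n\n".join(selected)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


print(build_user_prompt("What is a persona?", ["Chunk A.", "Chunk B."]))
```

Since chunks arrive ordered by similarity, truncating from the end drops the least relevant material first.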


In [11]:
# ================================
# ❓ AI Query Function (with Debug Mode)
# ================================

DEBUG = False  # 🔎 Toggle evidence printing

def ask_ai(query: str) -> str:
    """
    Executes the final RAG / LLM pipeline with optional debug mode.
    Input:
        query (str): The question you want to ask the AI.
    Output:
        str: The AI's answer based on the PDF file.
    """
    # Step 1: Retrieve top-N chunks
    retrieved = search_chunks(query, n_results=TOP_N_RESULTS)
    context = "\n\n".join([r["text"] for r in retrieved])

    # Debug mode: show retrieved evidence
    if DEBUG:
        print(f"\n🔎 DEBUG: Retrieved {len(retrieved)} chunks for query → {query}\n")
        for r in retrieved:
            print(f"📌 {r['id']} (distance={r['distance']:.4f})")
            print(r["text"][:300].replace("\n", " ") + "...\n")

    # Step 2: Build QA prompt
    system_prompt = PROMPT

    user_prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

    # Step 3: Call GPT model
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        max_tokens=OUTPUT_LENGTH,
        temperature=TEMPERATURE
    )

    # Step 4: Return the model's answer
    # `content` can be None in some responses, so fall back to an empty string
    return (response.choices[0].message.content or "").strip()


### Test Queries

Use this cell to test your function with different queries.  
The answers must be generated correctly based on the PDF file.  
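
Before running the real queries, a small smoke test can confirm that the function honours its contract: every query returns a non-empty string, and out-of-scope questions produce the fallback phrase defined in `PROMPT`. This is a sketch under assumptions: `run_smoke_test` and `stub_ask` are hypothetical names, and `stub_ask` stands in for `ask_ai` so the example runs without an API call.

```python
from typing import Callable

FALLBACK = "Not in the guide."  # fallback phrase defined in PROMPT


def run_smoke_test(ask: Callable[[str], str], queries: list[str]) -> dict[str, str]:
    """Call `ask` on every query and check each answer is a non-empty string."""
    answers = {}
    for q in queries:
        answer = ask(q)
        assert isinstance(answer, str) and answer.strip(), f"Empty answer for: {q}"
        answers[q] = answer
    return answers


# Stub standing in for the real ask_ai, for illustration only.
def stub_ask(query: str) -> str:
    return "Gmail and Docs." if "Gemini" in query else FALLBACK


smoke_results = run_smoke_test(
    stub_ask,
    ["Which apps integrate with Gemini?", "How do I bake bread?"],
)
print(smoke_results)
```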


In [12]:
# ================================
# 🔍 Example Queries for Testing
# ================================

queries = [
    "Which Google apps integrate with Gemini?",
    "What are two benefits of using natural language in prompts?",
    "How should executives use prompts differently than frontline workers?",
    "What is the purpose of giving constraints in prompts?",
    # Bulgarian: "How is bean soup made? Which source is the information from?"
    "Как се прави бобена чорба? От кой източник е информацията?"
]

# Call the AI with each query
for q in queries:
    print(f"Q: {q}")
    print(f"A: {ask_ai(q)}\n")


Q: Which Google apps integrate with Gemini?
A: The Google apps that integrate with Gemini are Gmail, Google Docs, Google Sheets, Google Meet, Google Slides, and Gemini Advanced.

Q: What are two benefits of using natural language in prompts?
A: 1. It allows for clearer communication, making it easier for the AI to understand the user's intent.  
2. It creates a more conversational tone, enhancing the interaction between the user and the AI.

Q: How should executives use prompts differently than frontline workers?
A: Executives should use prompts to integrate them into their daily tasks efficiently, focusing on strategic decision-making and urgent tasks while on the go. They may require prompts that are concise and tailored to their specific roles, whereas frontline workers might focus more on direct customer interactions and operational tasks.

Q: What is the purpose of giving constraints in prompts?
A: The purpose of giving constraints in prompts is to generate specific results by inc