# Notebook 2: RAG Examples

This notebook walks through concrete RAG examples built on the shared pipeline from Notebook 1. Every example reuses `rag_utils`, so setup happens only once.

- **Example 1:** Single-doc Q&A (product list)
- **Example 2:** Answer with in-text citations (underwriting philosophy, citing the source chunk)
- **Example 3:** A different question on the same index (target market)
- **Chunking:** Sentence-based vs token-based, compared side by side
- **Evaluation:** When to measure retrieval (Precision@K, hit rate)

## Setup (once)

In [1]:
from dotenv import load_dotenv
from openai import OpenAI
from rag_utils import load_document, chunk_by_sentences, embed, build_index

load_dotenv()
client = OpenAI()

CHAT_MODEL = "gpt-4o-mini"
TOP_K = 8

doc = load_document()
chunks = chunk_by_sentences(doc)
chroma_client, coll = build_index(chunks, collection_name="rag", client=client)
print(f"Loaded {len(chunks)} chunks, index ready.")

Loaded 34 chunks, index ready.


## Example 1: Single-doc Q&A — product list

In [2]:
question1 = "What products does AI Agent Insure offer?"
q_emb = embed([question1], client=client)[0]
res = coll.query(query_embeddings=[q_emb], n_results=TOP_K)
retrieved = res["documents"][0]
ctx = "\n".join(retrieved)

prompt = f"""Use only the following context to answer. If the context doesn't contain the answer, say so.

Context:
{ctx}

Question: {question1}

Answer:"""

ans = client.chat.completions.create(model=CHAT_MODEL, messages=[{"role": "user", "content": prompt}], temperature=0)
print("Q:", question1)
print("A:", ans.choices[0].message.content)

Q: What products does AI Agent Insure offer?
A: AI Agent Insure offers a portfolio of AI-native insurance products structured around distinct AI risk domains, including:

- AI Infrastructure & Operations Protection
- Agentic AI Liability Insurance
- Autonomous Systems & Robotics Coverage
- Compliance & Regulatory Shield
- Agentic Workflow Uptime Insurance
- Intellectual Property & Output Protection
- Model & Data Security Insurance
- AI Incident Response & Crisis Management
- Synthetic Data & Dataset Integrity Coverage


## Example 2: Answer with in-text citations

We label each retrieved chunk with its ID from the index and ask the model to cite the chunk (e.g. Chunk 12) that supports each fact it states.

In [3]:
question2 = "What is AI Agent Insure's underwriting philosophy?"
q_emb = embed([question2], client=client)[0]
res = coll.query(query_embeddings=[q_emb], n_results=TOP_K)
retrieved_docs = res["documents"][0]
retrieved_ids = res["ids"][0]

# Prefix each chunk with its index ID so the model can cite it
numbered_ctx = "\n".join(f"[Chunk {i}] {d}" for i, d in zip(retrieved_ids, retrieved_docs))
prompt_cite = f"""Use only the following context. For each fact you state, cite the source in parentheses, e.g. (Chunk 0). If the context doesn't contain the answer, say so.

Context:
{numbered_ctx}

Question: {question2}

Answer:"""

ans = client.chat.completions.create(model=CHAT_MODEL, messages=[{"role": "user", "content": prompt_cite}], temperature=0)
print("Q:", question2)
print("A:", ans.choices[0].message.content)

Q: What is AI Agent Insure's underwriting philosophy?
A: AI Agent Insure's underwriting philosophy evaluates risk at the intersection of technical architecture, operational controls, and governance (Chunk 12).
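
A cited answer is only useful if the citations point at chunks that were actually in the context. The following is a minimal sketch of such a check, not part of `rag_utils`: the helper names `extract_cited_ids` and `check_citations` are hypothetical, and it assumes citations follow the `Chunk <id>` pattern requested in the prompt and that IDs contain only letters, digits, underscores, or hyphens.

```python
import re


def extract_cited_ids(answer):
    """Return chunk IDs cited as 'Chunk <id>' in the answer, in order, without duplicates."""
    cited = []
    # Assumption: IDs are made of word characters or hyphens (e.g. "12" or "doc-3").
    for chunk_id in re.findall(r"Chunk\s+([\w-]+)", answer):
        if chunk_id not in cited:
            cited.append(chunk_id)
    return cited


def check_citations(answer, retrieved_ids):
    """Report which cited IDs were not among the retrieved chunk IDs."""
    allowed = {str(i) for i in retrieved_ids}
    cited = extract_cited_ids(answer)
    return {
        "cited": cited,
        "unknown": [c for c in cited if c not in allowed],
    }


# Illustrative values only; in the notebook you would pass the model's answer
# text and the `retrieved_ids` list from the query result.
sample_answer = "The philosophy evaluates risk across three areas (Chunk 12)."
sample_ids = ["12", "3", "7"]
print(check_citations(sample_answer, sample_ids))
```

An empty `unknown` list only shows that the cited chunks were in the context; it does not show that each chunk supports the claim attached to it.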


## Example 3: Different question — target market

Same index, different query. Retrieval surfaces the section of the document that is relevant to the new question, with no change to the pipeline.

In [4]:
question3 = "Who are AI Agent Insure's target market segments?"
q_emb = embed([question3], client=client)[0]
res = coll.query(query_embeddings=[q_emb], n_results=TOP_K)
retrieved = res["documents"][0]
ctx = "\n".join(retrieved)

prompt = f"""Use only the following context to answer. If the context doesn't contain the answer, say so.

Context:
{ctx}

Question: {question3}

Answer:"""

ans = client.chat.completions.create(model=CHAT_MODEL, messages=[{"role": "user", "content": prompt}], temperature=0)
print("Q:", question3)
print("A:", ans.choices[0].message.content)

Q: Who are AI Agent Insure's target market segments?
A: AI Agent Insure's target market segments include:

- AI startups and LLM application providers
- Enterprises deploying autonomous agents and workflows
- Robotics and autonomous vehicle developers
- Synthetic data and model training organizations
- Regulated industries adopting AI (healthcare, finance, legal)
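
The three examples rely on the same step: embed the question, then ask the index for the `TOP_K` nearest chunks. As a rough illustration of that step, here is a toy top-k search over hand-written vectors. It assumes cosine similarity as the ranking metric; the index built by `rag_utils` may be configured with a different distance, and the function names and vectors below are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunk_vecs, k):
    """Return (chunk_index, score) pairs for the k most similar chunks, best first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
toy_chunks = [
    [1.0, 0.0, 0.0],  # chunk 0: same direction as the query
    [0.0, 1.0, 0.0],  # chunk 1: orthogonal to the query
    [0.9, 0.1, 0.0],  # chunk 2: close to the query
]
toy_query = [1.0, 0.0, 0.0]

for idx, score in top_k(toy_query, toy_chunks, k=2):
    print(idx, round(score, 3))
```

Changing the question changes only `toy_query`; the chunk vectors stay put, which is why one index can serve all three examples.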


## Chunking: sentence vs token-based

The chunking strategy determines which chunks exist and therefore what retrieval can return. Below we **compare** the two strategies: the same question is run against two indexes (sentence-based and token-based), with retrieval and an answer for each.

In [5]:
from rag_utils import chunk_by_tokens

chunks_sent = chunk_by_sentences(doc)
chunks_tok = chunk_by_tokens(doc, max_tokens=80, overlap_tokens=20)
print(f"Sentence chunks: {len(chunks_sent)}")
print(f"Token chunks (80 tok, 20 overlap): {len(chunks_tok)}")
# Preview one token chunk, truncated to 100 characters
sample = chunks_tok[6]
print("Sample token chunk:", (sample[:100] + "...") if len(sample) > 100 else sample)

Sentence chunks: 34
Token chunks (80 tok, 20 overlap): 15
Sample token chunk: AI Liability Insurance
- Autonomous Systems & Robotics Coverage
- Compliance & Regulatory Shield
- A...
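
The implementations of `chunk_by_sentences` and `chunk_by_tokens` live in `rag_utils` and are not shown in this notebook. To make the difference concrete, here is a simplified sketch of both ideas. It is an assumption-laden stand-in, not the `rag_utils` code: the function names are hypothetical, sentences are split with a naive regex, and whitespace-separated words stand in for model tokens.

```python
import re


def sketch_chunk_by_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace (naive sentence splitter)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sketch_chunk_by_tokens(text, max_tokens, overlap_tokens):
    """Fixed-size sliding windows; consecutive windows share `overlap_tokens` words."""
    if not 0 <= overlap_tokens < max_tokens:
        raise ValueError("overlap_tokens must be >= 0 and smaller than max_tokens")
    words = text.split()  # assumption: one whitespace-separated word = one token
    step = max_tokens - overlap_tokens
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last window already reached the end of the text
    return chunks


sample_text = "Alpha beta gamma. Delta epsilon zeta eta. Theta iota!"
print(sketch_chunk_by_sentences(sample_text))
print(sketch_chunk_by_tokens(sample_text, max_tokens=4, overlap_tokens=1))
```

In the token-based output the first window ends with a word from the second sentence, the same boundary effect visible in the sample token chunk above, which starts partway through the product list.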


In [6]:
# Build separate indexes; same question for both strategies
_, coll_sent = build_index(chunks_sent, collection_name="rag_sentence", client=client)
_, coll_tok = build_index(chunks_tok, collection_name="rag_token", client=client)
compare_q = "What products does AI Agent Insure offer?"
q_emb = embed([compare_q], client=client)[0]

res_sent = coll_sent.query(query_embeddings=[q_emb], n_results=TOP_K)
res_tok = coll_tok.query(query_embeddings=[q_emb], n_results=TOP_K)
retrieved_sent = res_sent["documents"][0]
retrieved_tok = res_tok["documents"][0]

def answer_from_chunks(chunks, question):
    """Answer `question` using only `chunks` as context (same prompt as Example 1)."""
    ctx = "\n".join(chunks)
    prompt = f"""Use only the following context to answer. If the context doesn't contain the answer, say so.

Context:
{ctx}

Question: {question}

Answer:"""
    r = client.chat.completions.create(model=CHAT_MODEL, messages=[{"role": "user", "content": prompt}], temperature=0)
    return r.choices[0].message.content

answer_sent = answer_from_chunks(retrieved_sent, compare_q)
answer_tok = answer_from_chunks(retrieved_tok, compare_q)

print("=" * 60)
print("SENTENCE CHUNKING")
print("=" * 60)
print("Retrieved (preview):", [d[:50] + "..." for d in retrieved_sent[:3]])
print("Answer:", answer_sent)
print()
print("=" * 60)
print("TOKEN CHUNKING (80 tok, 20 overlap)")
print("=" * 60)
print("Retrieved (preview):", [d[:50] + "..." for d in retrieved_tok[:3]])
print("Answer:", answer_tok)

SENTENCE CHUNKING
Retrieved (preview): ['Company Overview  AI Agent Insure is a specialty i...', 'AI Agent Insure provides tailored insurance soluti...', '# AI Agent Insure ## Comprehensive Company Profile...']
Answer: AI Agent Insure offers a portfolio of AI-native insurance products structured around distinct AI risk domains, including:

- AI Infrastructure & Operations Protection
- Agentic AI Liability Insurance
- Autonomous Systems & Robotics Coverage
- Compliance & Regulatory Shield
- Agentic Workflow Uptime Insurance
- Intellectual Property & Output Protection
- Model & Data Security Insurance
- AI Incident Response & Crisis Management
- Synthetic Data & Dataset Integrity Coverage

TOKEN CHUNKING (80 tok, 20 overlap)
Retrieved (preview): ['The company is designed as an AI-native insurer ra...', 'AI Liability Insurance\n- Autonomous Systems & Robo...', 'AI retrieval systems, including vector databases a...']
Answer: AI Agent Insure offers the following products:

- Autonomous Syst

**Takeaway:** Sentence chunking yields more chunks, each holding whole sentences (34 here); token chunking yields fewer, fixed-size chunks (15 here) that can cut across sentences and list items. Because the chunks differ, retrieval and the final answer can differ too, as the two answers above show. Tune chunk size and overlap for your documents and queries.
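
Tuning chunk size and overlap is easier with a number to compare. The outline at the top names two retrieval metrics, Precision@K and hit rate; the sketch below shows one common way to compute them. The function names and the toy data are illustrative assumptions: in practice `relevant_ids` would come from a small hand-labelled set of questions with the chunk IDs that answer them.

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved IDs that are relevant.

    Divides by k even if fewer than k IDs were retrieved (one common convention).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    hits = sum(1 for chunk_id in retrieved_ids[:k] if chunk_id in relevant)
    return hits / k


def hit_rate(runs, k):
    """Share of queries with at least one relevant ID among the top-k results."""
    if not runs:
        return 0.0
    hits = sum(1 for retrieved, relevant in runs if set(retrieved[:k]) & set(relevant))
    return hits / len(runs)


# Toy data: (retrieved IDs in rank order, hand-labelled relevant IDs) per query.
toy_runs = [
    (["12", "3", "7"], ["12"]),  # relevant chunk retrieved at rank 1
    (["5", "9", "1"], ["2"]),    # relevant chunk not retrieved
]

print(precision_at_k(toy_runs[0][0], toy_runs[0][1], k=3))
print(hit_rate(toy_runs, k=3))
```

Running the same labelled questions against the sentence index and the token index gives a like-for-like comparison, as long as relevance labels are defined per index, since chunk IDs do not carry over between chunking strategies.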