# Quick Start: RAG with Chroma + Ollama (Python)

This Jupyter notebook walks through building a Retrieval-Augmented Generation (RAG) pipeline that uses **Chroma** for vector storage and **Ollama** for both embeddings and LLM generation.

**CSV Structure (expected):**
```csv
TicketId,Project,Question,Answer
1001,CRM Suite,Cannot log into the CRM; getting 'invalid credentials' even though my password is correct.,"We reset the user's password, cleared browser cache, and verified SSO token freshness. Issue resolved."
```
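
If you do not have a CSV yet, the following is a minimal sketch that writes the sample row above with Python's `csv` module, which quotes fields containing commas automatically. The helper names `write_sample_csv` and `read_csv_rows` and the file name `tickets_sample.csv` are assumptions made for illustration.

```python
import csv

# Columns expected by this notebook.
FIELDNAMES = ["TicketId", "Project", "Question", "Answer"]

# The single sample row shown in the CSV structure above.
SAMPLE_ROWS = [
    {
        "TicketId": "1001",
        "Project": "CRM Suite",
        "Question": "Cannot log into the CRM; getting 'invalid credentials' even though my password is correct.",
        "Answer": "We reset the user's password, cleared browser cache, and verified SSO token freshness. Issue resolved.",
    }
]

def write_sample_csv(path, rows=SAMPLE_ROWS):
    """Write rows to `path`; DictWriter quotes any field that contains a comma."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)

def read_csv_rows(path):
    """Read the file back as a list of dicts keyed by column name."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

if __name__ == "__main__":
    write_sample_csv("tickets_sample.csv")  # hypothetical file name
    print(read_csv_rows("tickets_sample.csv")[0]["TicketId"])
```

Point `CSV_PATH` at the generated file if you want to try the notebook with this single row.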

---
## Prerequisites
- Install & run **Ollama**: `ollama serve` and pull models (e.g., `ollama pull llama3.1`, `ollama pull nomic-embed-text`).
- Install Python packages: `pip install chromadb pandas ollama`.
- Place your CSV file (e.g., `tickets.csv`) in the same working directory as this notebook.


## 0. Imports & Configuration
Set paths, model names, and retrieval parameters. Adjust as needed.

In [None]:
import os
import time
import pandas as pd

# Vector DB (Chroma)
import chromadb

# Ollama Python client for embeddings + generation
import ollama

# Configuration (edit as needed)
CSV_PATH = "tickets.csv"  # Path to your CSV file
CHROMA_DIR = "./chroma_store"  # Persistent directory for Chroma
COLLECTION_NAME = "tickets_qa"
EMBED_MODEL = "nomic-embed-text"  # Ollama embedding model
LLM_MODEL = "llama3.1:latest"  # Ollama chat/generation model (e.g., 'mistral')

TOP_K = 3  # Number of closest documents to fetch
EMBED_BATCH = 32  # (Not used directly; simple loop embedding)


## 1. Helper Functions
Embedding, document formatting, CSV loading, Chroma initialization, ingestion, retrieval, prompt building, and generation.

In [51]:
def embed_texts_ollama(texts, model=EMBED_MODEL):
    """
    Embed a list of strings using Ollama's embeddings endpoint.
    Returns list of vectors (list[float]).
    NOTE: Ensure `ollama serve` is running and the embedding model is pulled.
    """
    vectors = []
    for t in texts:
        resp = ollama.embeddings(model=model, prompt=t)
        vectors.append(resp["embedding"])
        time.sleep(0.01)  # gentle pacing
    return vectors

def row_to_document(row):
    """
    Construct a semantically rich text representation to improve similarity search.
    Combines Question + Answer + metadata.
    """
    return (
        f"TicketId: {row['TicketId']}\n"
        f"Project: {row['Project']}\n"
        f"Question: {row['Question']}\n"
        f"Resolution: {row['Answer']}\n"
    )

def load_csv(csv_path):
    """
    Load the CSV with expected columns.
    """
    df = pd.read_csv(csv_path)
    required_cols = {"TicketId", "Project", "Question", "Answer"}
    # Report only the columns that are actually missing, not the full required set.
    missing = required_cols - set(df.columns)
    if missing:
        raise ValueError(f"CSV missing required columns: {sorted(missing)}")
    df["TicketId"] = df["TicketId"].astype(str)
    return df


def get_or_create_collection(persist_dir, collection_name):
    """
    Create a persistent Chroma client + collection using the new client API.
    This stores data on disk at `persist_dir`.
    """
    # Use PersistentClient for local on-disk storage (SQLite-backed in current releases)
    client = chromadb.PersistentClient(path=persist_dir)
    collection = client.get_or_create_collection(name=collection_name)
    return client, collection


def ingest_dataframe(collection, df):
    """
    Create embeddings for each row and upsert into Chroma.
    """
    ids = df["TicketId"].tolist()
    documents = [row_to_document(r) for _, r in df.iterrows()]
    metadatas = [
        {
            "TicketId": r["TicketId"],
            "Project": r["Project"],
            "Question": r["Question"],
            "Answer": r["Answer"]
        }
        for _, r in df.iterrows()
    ]
    embeddings = embed_texts_ollama(documents, model=EMBED_MODEL)
    collection.upsert(
        ids=ids,
        documents=documents,
        metadatas=metadatas,
        embeddings=embeddings,
    )

def retrieve_context(collection, query, top_k=TOP_K):
    """
    Embed the query, then perform similarity search in Chroma.
    Returns list of dicts with metadata + document + distance.
    """
    query_vec = embed_texts_ollama([query], model=EMBED_MODEL)[0]
    results = collection.query(
        query_embeddings=[query_vec],
        n_results=top_k,
        include=["documents", "metadatas", "distances"]
    )
    contexts = []
    for i in range(len(results["documents"][0])):
        contexts.append({
            "document": results["documents"][0][i],
            "metadata": results["metadatas"][0][i],
            "distance": results["distances"][0][i],
        })
    return contexts

def build_augmented_prompt(user_query, contexts):
    """
    Combine retrieved context into a single prompt for the LLM.
    """
    context_texts = []
    for c in contexts:
        meta = c["metadata"]
        ctx = (
            f"- TicketId: {meta.get('TicketId')}\n"
            f"  Project: {meta.get('Project')}\n"
            f"  Question: {meta.get('Question')}\n"
            f"  Resolution: {meta.get('Answer')}\n"
        )
        context_texts.append(ctx)
    context_block = "\n".join(context_texts)
    system_instructions = (
        "You are a helpful support assistant. Use the CONTEXT to answer the USER QUERY.\n"
        "If the context is not sufficient, say what additional info is needed.\n"
        "Cite TicketIds or Projects when relevant."
    )
    prompt = (
        f"{system_instructions}\n\n"
        f"CONTEXT:\n{context_block}\n\n"
        f"USER QUERY:\n{user_query}\n\n"
        f"ASSISTANT:"
    )
    return prompt

def generate_with_ollama(prompt, model=LLM_MODEL):
    """
    Call Ollama to generate a completion.
    """
    resp = ollama.generate(model=model, prompt=prompt)
    return resp["response"]
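
The configuration defines `EMBED_BATCH`, but the embedding helper above processes texts one at a time. If you want to work in groups (for example, to upsert in batches or to log progress), a minimal sketch of a batching helper could look like this. The name `batched` is hypothetical and not part of Chroma or Ollama.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Example: group five texts into batches of two.
texts = ["a", "b", "c", "d", "e"]
for batch in batched(texts, 2):
    print(batch)
# ['a', 'b']
# ['c', 'd']
# ['e']
```

Inside `ingest_dataframe`, one option is to loop over `batched(documents, EMBED_BATCH)` and call `embed_texts_ollama` once per batch.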

## 2. Load CSV
Checks columns, casts `TicketId` to string for ID consistency.

In [52]:
# Load the CSV (ensure the file exists in the working directory)
df = load_csv(CSV_PATH)
df.head()

| | TicketId | Project | Question | Answer |
|---|---|---|---|---|
| 0 | 1001 | CRM Suite | Cannot log into the CRM; getting 'invalid cred... | We reset the user's password, cleared browser ... |
| 1 | 1002 | Billing Portal | Invoice PDF fails to download; the button spin... | Enabled CDN path for PDF endpoint and added a ... |
| 2 | 1003 | Mobile App | App crashes when opening notifications on Andr... | Patched NotificationManager usage and added nu... |
| 3 | 1004 | Data Platform | Scheduled ETL job did not run last night; data... | Restarted Airflow scheduler, re-ran backfill, ... |
| 4 | 1005 | Helpdesk | Password reset emails are not being delivered ... | DKIM record was misconfigured. Fixed DNS recor... |


## 3. Initialize Chroma (Persistent)
Creates or gets a collection. `PersistentClient` stores the data on disk under `CHROMA_DIR`, so it survives restarts.

In [53]:
client, collection = get_or_create_collection(CHROMA_DIR, COLLECTION_NAME)
collection

Collection(name=tickets_qa)

## 4. Ingest: Embed & Upsert
Computes embeddings using **Ollama** and upserts into **Chroma** with metadata.

In [54]:
print("Embedding & upserting documents into Chroma...")
ingest_dataframe(collection, df)

# No manual persist step is required with PersistentClient.
# Data is automatically stored at the path you passed to PersistentClient.
print("Ingestion complete. Data is persisted automatically at:", CHROMA_DIR)

Embedding & upserting documents into Chroma...
Ingestion complete. Data is persisted automatically at: ./chroma_store
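
Because `TicketId` doubles as the Chroma ID, duplicate values in the CSV can cause trouble: depending on the Chroma version, a call with repeated IDs may be rejected, or later rows may overwrite earlier ones. A minimal sketch of a pre-ingestion check follows; `find_duplicate_ids` is a hypothetical helper, not a Chroma function.

```python
from collections import Counter

def find_duplicate_ids(ids):
    """Return the sorted list of IDs that occur more than once."""
    counts = Counter(ids)
    return sorted(i for i, n in counts.items() if n > 1)

# Illustrative IDs only; in the notebook you would pass df["TicketId"].tolist().
ids = ["1001", "1002", "1001", "1003"]
print(find_duplicate_ids(ids))  # ['1001']
```

If the result is non-empty, resolve the duplicates (or build composite IDs) before calling `ingest_dataframe`.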


## 5. Retrieve Relevant Context
Embeds the user query, runs similarity search, and shows top results.

In [55]:
user_query = (
    "A user reports invalid credentials in CRM but is sure the password is correct. What should we do?"
)
contexts = retrieve_context(collection, user_query, top_k=TOP_K)
print("Top retrieved contexts:")
for i, c in enumerate(contexts, 1):
    print(f"[{i}] TicketId={c['metadata']['TicketId']} | Project={c['metadata']['Project']} | distance={c['distance']:.4f}")
contexts[:2]  # preview first two


Top retrieved contexts:
[1] TicketId=1001 | Project=CRM Suite | distance=131.9055
[2] TicketId=1005 | Project=Helpdesk | distance=303.9214
[3] TicketId=1007 | Project=HR Portal | distance=337.5979


[{'document': "TicketId: 1001\nProject: CRM Suite\nQuestion: Cannot log into the CRM; getting 'invalid credentials' even though my password is correct.\nResolution: We reset the user's password, cleared browser cache, and verified SSO token freshness. Issue resolved.\n",
  'metadata': {'Project': 'CRM Suite',
   'TicketId': '1001',
   'Answer': "We reset the user's password, cleared browser cache, and verified SSO token freshness. Issue resolved.",
   'Question': "Cannot log into the CRM; getting 'invalid credentials' even though my password is correct."},
  'distance': 131.90553283691406},
 {'document': 'TicketId: 1005\nProject: Helpdesk\nQuestion: Password reset emails are not being delivered to users.\nResolution: DKIM record was misconfigured. Fixed DNS records and retried the mail queue. Delivery confirmed.\n',
  'metadata': {'Project': 'Helpdesk',
   'Answer': 'DKIM record was misconfigured. Fixed DNS records and retried the mail queue. Delivery confirmed.',
   'TicketId': '1005'
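
The distances printed above are large numbers rather than values between 0 and 1. A likely explanation, assuming the collection uses Chroma's default settings, is that the default metric is squared L2 distance and the embeddings are not length-normalized. The sketch below contrasts squared L2 with cosine distance in plain Python; the two-dimensional vectors are made up for illustration.

```python
import math

def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity; depends on direction only, not on vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the length

print(squared_l2(a, b))       # 25.0
print(cosine_distance(a, b))  # 0.0
```

With either metric, a smaller distance means a closer match. Chroma releases commonly accept `metadata={"hnsw:space": "cosine"}` when a collection is created; verify this against the documentation for your installed version before relying on it.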

## 6. Build Augmented Prompt
Combines retrieved context with instructions for grounded answering.

In [56]:
prompt = build_augmented_prompt(user_query, contexts)
print(prompt[:800] + ("\n...[truncated]..." if len(prompt) > 800 else ""))


You are a helpful support assistant. Use the CONTEXT to answer the USER QUERY.
If the context is not sufficient, say what additional info is needed.
Cite TicketIds or Projects when relevant.

CONTEXT:
- TicketId: 1001
  Project: CRM Suite
  Question: Cannot log into the CRM; getting 'invalid credentials' even though my password is correct.
  Resolution: We reset the user's password, cleared browser cache, and verified SSO token freshness. Issue resolved.

- TicketId: 1005
  Project: Helpdesk
  Question: Password reset emails are not being delivered to users.
  Resolution: DKIM record was misconfigured. Fixed DNS records and retried the mail queue. Delivery confirmed.

- TicketId: 1007
  Project: HR Portal
  Question: Employees cannot upload documents; the upload fails with a 413 error.
  R
...[truncated]...
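
Not every retrieved ticket is relevant: in this example, the third ticket (an upload failure in the HR Portal) concerns a different problem from the query. One option is to drop weak matches before building the prompt. The sketch below filters by a distance cutoff; `filter_by_distance` and the cutoff value are hypothetical, and a useful threshold depends on the embedding model and distance metric, so it needs tuning on your own data.

```python
def filter_by_distance(contexts, max_distance):
    """Keep only contexts whose 'distance' is at most `max_distance`."""
    return [c for c in contexts if c["distance"] <= max_distance]

# Same shape as the dicts returned by retrieve_context; distances are
# rounded from the retrieval output shown earlier.
example_contexts = [
    {"metadata": {"TicketId": "1001"}, "distance": 131.9},
    {"metadata": {"TicketId": "1005"}, "distance": 303.9},
    {"metadata": {"TicketId": "1007"}, "distance": 337.6},
]

kept = filter_by_distance(example_contexts, max_distance=200.0)
print([c["metadata"]["TicketId"] for c in kept])  # ['1001']
```

If the filter removes everything, the prompt's instruction to ask for additional information becomes the fallback.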


## 7. Generate with Ollama LLM
Calls the chosen chat/generation model to produce an answer grounded in the retrieved context.

In [57]:
answer = generate_with_ollama(prompt, model=LLM_MODEL)
print("--- LLM Answer ---")
print(answer)


--- LLM Answer ---
Based on the provided context, I would recommend following the steps taken in TicketId 1001 to resolve the issue.

To confirm, let's try:

1. Reset the user's password
2. Clear browser cache
3. Verify SSO token freshness

If these steps do not resolve the issue, please provide more information about the error message or any other symptoms you're experiencing, and I'll be happy to help further!


## 8. (Optional) Re-ingest / Reset Collection
You can clear and re-ingest if you iterate on the dataset or embedding strategy.

In [58]:
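# Uncomment to reset the collection and re-ingest.
# Dropping and recreating the collection avoids an empty `where` filter,
# which Chroma may reject.
# client.delete_collection(COLLECTION_NAME)
# client, collection = get_or_create_collection(CHROMA_DIR, COLLECTION_NAME)
# ingest_dataframe(collection, df)
# No client.persist() call is needed; PersistentClient writes to disk automatically.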

## Notes & Tips
- Use the **same embedding model** for documents and queries.
- Ensure `ollama serve` is running and models are pulled.
- For longer tickets, consider **chunking** documents (split into smaller parts).
- You can filter by metadata in Chroma (e.g., `where={"Project": "CRM Suite"}`).
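
The chunking tip above can be sketched as a fixed-size character splitter with overlap. This is one simple strategy among several (sentence-based or token-based splitting are alternatives); `chunk_text`, `chunk_size`, and `overlap` are hypothetical names chosen for illustration.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split `text` into chunks of up to `chunk_size` characters.

    Consecutive chunks share `overlap` characters, so text near a chunk
    boundary appears in both neighboring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# ['abcd', 'defg', 'ghij']
```

When ingesting chunks, each one needs its own unique ID (for example, the ticket ID plus a chunk index), with the original `TicketId` kept in the metadata so answers can still cite it.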

## Next Steps
- Add feedback logging, re-ranking, and prompt templates.
- Wrap in an API (FastAPI) or CLI for production use.
- Use PyMuPDF and Tesseract to extract text from PDFs.
- Use LlamaIndex to split long PDF text into smaller pieces with semantic chunking.
- Deploy the pipeline as a web server with FastAPI and Docker.