<a href="https://colab.research.google.com/github/SadeghMahmoudAbadi/Open-Source-LLM-on-Colab/blob/main/6-RAG/answer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q chromadb litellm sentence-transformers tenacity pydantic

In [27]:
import os
from google.colab import userdata
from google.colab import drive
from chromadb import PersistentClient
from litellm import completion
from pydantic import BaseModel, Field
from pathlib import Path
from tenacity import retry, wait_exponential
from sentence_transformers import SentenceTransformer
from IPython.display import Markdown, display

In [24]:
os.environ["OPENROUTER_API_KEY"] = userdata.get("OPENROUTER_API_KEY")

In [4]:
drive.mount('/content/drive/')

Mounted at /content/drive/


In [18]:
MODEL = "openrouter/x-ai/grok-4.1-fast"
DB_NAME = "/content/drive/MyDrive/datasets/preprocessed_db"
KNOWLEDGE_BASE_PATH = Path("/content/drive/MyDrive/datasets/knowledge-base")

collection_name = "docs"
embedding_model = "sentence-transformers/all-MiniLM-L6-v2"
wait = wait_exponential(multiplier=1, min=10, max=240)

chroma = PersistentClient(path=DB_NAME)
collection = chroma.get_or_create_collection(collection_name)

RETRIEVAL_K = 20
FINAL_K = 10
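
A note on the retry policy: `wait` only controls the delay between attempts. The `@retry(wait=wait)` decorators in this notebook set no stop condition, so a persistent failure (an invalid API key, for example) would be retried indefinitely. The sketch below reproduces the same backoff shape with the standard library and adds an attempt cap. `backoff_schedule` and `call_with_retries` are hypothetical helpers, not tenacity functions, and the clamping formula is an assumption about how `wait_exponential(multiplier=1, min=10, max=240)` behaves.

```python
import time


def backoff_schedule(attempts, multiplier=1, min_wait=10, max_wait=240):
    """Return the delay in seconds before each retry, clamped to [min_wait, max_wait]."""
    return [
        max(min_wait, min(multiplier * 2 ** n, max_wait))
        for n in range(attempts)
    ]


def call_with_retries(fn, max_attempts=5, sleep=time.sleep):
    """Call fn(), retrying on any exception, and give up after max_attempts calls."""
    delays = backoff_schedule(max_attempts - 1)
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])


print(backoff_schedule(9))
```

With tenacity itself, passing `stop=stop_after_attempt(n)` to `@retry` alongside `wait=wait` gives a comparable cap.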

In [7]:
SYSTEM_PROMPT = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
Your answer will be evaluated for accuracy, relevance and completeness, so make sure it only answers the question and fully answers it.
If you don't know the answer, say so.
For context, here are specific extracts from the Knowledge Base that might be directly relevant to the user's question:
{context}

With this context, please answer the user's question. Be accurate, relevant and complete.
"""

In [8]:
class Result(BaseModel):
    page_content: str
    metadata: dict


class RankOrder(BaseModel):
    order: list[int] = Field(
        description="The order of relevance of chunks, from most relevant to least relevant, by chunk id number"
    )

In [9]:
@retry(wait=wait)
def rerank(question, chunks):
    system_prompt = """
    You are a document re-ranker.
    You are provided with a question and a list of relevant chunks of text from a query of a knowledge base.
    The chunks are provided in the order they were retrieved; this should be approximately ordered by relevance, but you may be able to improve on that.
    You must rank order the provided chunks by relevance to the question, with the most relevant chunk first.
    Reply only with the list of ranked chunk ids, nothing else. Include all the chunk ids you are provided with, reranked.
    """
    user_prompt = f"The user has asked the following question:\n\n{question}\n\nOrder all the chunks of text by relevance to the question, from most relevant to least relevant. Include all the chunk ids you are provided with, reranked.\n\n"
    user_prompt += "Here are the chunks:\n\n"
    for index, chunk in enumerate(chunks):
        user_prompt += f"# CHUNK ID: {index + 1}:\n\n{chunk.page_content}\n\n"
    user_prompt += "Reply only with the list of ranked chunk ids, nothing else."
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]
    response = completion(model=MODEL, messages=messages, response_format=RankOrder)
    reply = response.choices[0].message.content
    order = RankOrder.model_validate_json(reply).order
    return [chunks[i - 1] for i in order]
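
The list comprehension at the end of `rerank` trusts the model to return every id exactly once and in range. An id of `0` would silently pick the last chunk through negative indexing, and an omitted id would drop a chunk. A more defensive variant might look like the sketch below; `apply_rank_order` is a hypothetical helper, not part of the notebook.

```python
def apply_rank_order(chunks, order):
    """Reorder chunks by 1-based ids, tolerating invalid, duplicate, or missing ids."""
    seen = set()
    ranked = []
    for chunk_id in order:
        # Skip ids that are out of range or have already been used.
        if 1 <= chunk_id <= len(chunks) and chunk_id not in seen:
            seen.add(chunk_id)
            ranked.append(chunks[chunk_id - 1])
    # Append anything the model left out, keeping the original retrieval order.
    for chunk_id, chunk in enumerate(chunks, start=1):
        if chunk_id not in seen:
            ranked.append(chunk)
    return ranked


print(apply_rank_order(["a", "b", "c", "d"], [3, 0, 3, 9, 1]))
```

Here the invalid ids `0` and `9` and the repeated `3` are ignored, and the unranked chunks `b` and `d` are appended at the end.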

In [10]:
def make_rag_messages(question, history, chunks):
    context = "\n\n".join(
        f"Extract from {chunk.metadata['source']}:\n{chunk.page_content}" for chunk in chunks
    )
    system_prompt = SYSTEM_PROMPT.format(context=context)
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": question}]
    )
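
To make the message layout concrete, the self-contained sketch below builds the list for a follow-up question. `Chunk` is a stand-in for `Result`, `TEMPLATE` is a stand-in for `SYSTEM_PROMPT`, and the file path and texts are invented for illustration.

```python
from dataclasses import dataclass, field

TEMPLATE = "Answer using these extracts:\n{context}"  # stand-in for SYSTEM_PROMPT


@dataclass
class Chunk:  # stand-in for the notebook's Result model
    page_content: str
    metadata: dict = field(default_factory=dict)


def build_messages(question, history, chunks):
    """System prompt with context first, then prior turns, then the new question."""
    context = "\n\n".join(
        f"Extract from {c.metadata['source']}:\n{c.page_content}" for c in chunks
    )
    return (
        [{"role": "system", "content": TEMPLATE.format(context=context)}]
        + history
        + [{"role": "user", "content": question}]
    )


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello! How can I help?"},
]
chunks = [Chunk("Example extract text.", {"source": "example/file.md"})]
messages = build_messages("What products exist?", history, chunks)
print([m["role"] for m in messages])  # ['system', 'user', 'assistant', 'user']
```

The retrieved context lives only in the system message, so it is rebuilt for each question, while `history` carries the earlier turns unchanged.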

In [17]:
@retry(wait=wait)
def rewrite_query(question, history=None):
    """Rewrite the user's question as a more specific query that is more likely to surface relevant content in the Knowledge Base."""
    history = history or []  # avoid a shared mutable default argument
    message = f"""
    You are in a conversation with a user, answering questions about the company Insurellm.
    You are about to look up information in a Knowledge Base to answer the user's question.

    This is the history of your conversation so far with the user:
    {history}

    And this is the user's current question:
    {question}

    Respond only with a short, refined question that you will use to search the Knowledge Base.
    It should be a VERY short specific question most likely to surface content. Focus on the question details.
    IMPORTANT: Respond ONLY with the precise Knowledge Base query, nothing else.
    """
    response = completion(model=MODEL, messages=[{"role": "system", "content": message}])
    return response.choices[0].message.content
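
`rewrite_query` interpolates `history` directly into the prompt, so the model sees the Python `repr` of a list of dicts. Rendering the turns as a plain transcript may be easier for the model to read. The sketch below shows one way to do that; `format_history` is a hypothetical helper and the sample messages are illustrative.

```python
def format_history(history):
    """Render chat messages as 'Role: text' lines; an empty history gets a placeholder."""
    if not history:
        return "(no previous messages)"
    return "\n".join(
        f"{message['role'].capitalize()}: {message['content']}" for message in history
    )


print(format_history([
    {"role": "user", "content": "Who is Avery?"},
    {"role": "assistant", "content": "The CEO."},
]))
```

The result could then be substituted for `{history}` in the prompt.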

In [12]:
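def merge_chunks(chunks, others):
    """Append chunks from `others` whose text is not already present, preserving order."""
    merged = chunks[:]
    seen = {chunk.page_content for chunk in chunks}
    for chunk in others:
        if chunk.page_content not in seen:
            seen.add(chunk.page_content)  # also guards against duplicates within `others`
            merged.append(chunk)
    return merged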

In [13]:
# Load the embedding model once; constructing it on every call is slow.
encoder = SentenceTransformer(embedding_model)


def fetch_context_unranked(question):
    """Return the RETRIEVAL_K chunks nearest to the question's embedding."""
    query = encoder.encode(question).tolist()
    results = collection.query(query_embeddings=[query], n_results=RETRIEVAL_K)
    return [
        Result(page_content=document, metadata=metadata)
        for document, metadata in zip(results["documents"][0], results["metadatas"][0])
    ]
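
Conceptually, `collection.query` returns the `n_results` stored embeddings closest to the query embedding. The toy brute-force sketch below illustrates that idea with squared Euclidean distance. Chroma uses an approximate index instead, and the distance metric depends on how the collection was configured, so the metric here is an assumption. `top_k`, the 2-D vectors, and the document labels are all made up for illustration.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query, embeddings, documents, k):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        range(len(embeddings)),
        key=lambda i: squared_l2(query, embeddings[i]),
    )
    return [documents[i] for i in ranked[:k]]


docs = ["auto insurance", "home insurance", "company history"]
vecs = [[1.0, 0.0], [0.9, 0.2], [0.0, 1.0]]
print(top_k([1.0, 0.1], vecs, docs, k=2))  # ['auto insurance', 'home insurance']
```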

In [14]:
def fetch_context(original_question):
    rewritten_question = rewrite_query(original_question)
    chunks1 = fetch_context_unranked(original_question)
    chunks2 = fetch_context_unranked(rewritten_question)
    chunks = merge_chunks(chunks1, chunks2)
    reranked = rerank(original_question, chunks)
    return reranked[:FINAL_K]
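
`fetch_context` searches with both the original and the rewritten question, merges the two result lists, reranks against the original question, and keeps the top `FINAL_K`. The sketch below expresses the same control flow with injected stand-ins so it can be run offline. `fetch_context_with`, the stub functions, and the toy index are hypothetical.

```python
def fetch_context_with(question, rewrite, retrieve, rerank, final_k):
    """Dual-query retrieval: search twice, merge without duplicates, rerank, truncate."""
    rewritten = rewrite(question)
    merged = []
    for chunk in retrieve(question) + retrieve(rewritten):
        if chunk not in merged:
            merged.append(chunk)
    return rerank(question, merged)[:final_k]


# Stubs standing in for the LLM calls and the vector store.
index = {
    "Who is Avery?": ["bio", "careers page"],
    "Avery Lancaster role": ["bio", "org chart"],
}
result = fetch_context_with(
    "Who is Avery?",
    rewrite=lambda q: "Avery Lancaster role",
    retrieve=lambda q: index[q],
    rerank=lambda q, chunks: sorted(chunks),  # alphabetical stand-in for the LLM reranker
    final_k=2,
)
print(result)  # ['bio', 'careers page']
```

Querying twice costs one extra embedding lookup but lets the rewritten query surface chunks that the user's literal wording would miss.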

In [15]:
@retry(wait=wait)
def answer_question(question: str, history: list[dict] | None = None) -> tuple[str, list]:
    """
    Answer a question using RAG and return the answer and the retrieved context.
    """
    history = history or []  # avoid a shared mutable default argument
    chunks = fetch_context(question)
    messages = make_rag_messages(question, history, chunks)
    response = completion(model=MODEL, messages=messages)
    return response.choices[0].message.content, chunks
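
`answer_question` accepts a `history` list but does not update it, so a multi-turn caller has to append each exchange itself. A minimal sketch of that bookkeeping follows. `chat_turn` is a hypothetical wrapper, and `fake_answer` is a stand-in for `answer_question` so the example runs without an API key.

```python
def chat_turn(question, history, answer_fn):
    """Run one turn and return (reply, new_history) without mutating the input list."""
    reply, _chunks = answer_fn(question, history)
    new_history = history + [
        {"role": "user", "content": question},
        {"role": "assistant", "content": reply},
    ]
    return reply, new_history


def fake_answer(question, history):
    # Stand-in for answer_question: reports how many prior messages it received.
    return f"seen {len(history)} prior messages", []


history = []
_, history = chat_turn("first question", history, fake_answer)
reply, history = chat_turn("second question", history, fake_answer)
print(reply)  # seen 2 prior messages
```

In the notebook, `answer_question` would be passed as `answer_fn` in place of the stub.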

In [28]:
display(Markdown(answer_question("Who is Avery?")[0]))

**Avery Lancaster** is the **Co-Founder and Chief Executive Officer (CEO)** of Insurellm, based in **San Francisco, California**. Born on **March 15, 1985**, she earns a current salary of **$225,000**.

### Career Highlights:
- **Co-Founded Insurellm in 2015**: Launched the company as an insurtech startup with its first product, **Markellm** (a consumer-insurance marketplace). Under her leadership, it expanded to products like **Carllm** (auto insurance), **Homellm** (home insurance), and **Rellm** (enterprise reinsurance), growing to 200 employees and 12 US offices by 2020. She's renowned for innovative leadership, risk management, and positioning Insurellm as a leading Insurance Tech provider.
- **Prior Roles**:
  - **2013–2015**: Senior Product Manager at Innovate Insurance Solutions, developing tech-sector insurance products.
  - **2010–2013**: Business Analyst at Edge Analytics, analyzing insurance market trends.

### Performance History:
- **2015**: Exceeds Expectations (successful launches, funding).
- **2016**: Meets Expectations (growth with operational challenges).
- **2017**: Developing (competition, sales dips, new strategies).
- **2018**: Exceeds Expectations (product launches, market share gains).
- **2019**: Meets Expectations (steady growth, morale issues).
- **2020**: Below Expectations (COVID-19 impacts, delayed shifts).
- **2021**: Exceptional (remote work transition, high satisfaction/sales).
- **2022**: Satisfactory (team rebuilding in saturated market).
- **2023**: Exceeds Expectations (regained leadership in personalized insurance).

### Key Initiatives:
- **Professional Development**: Leadership training, conferences, partnerships.
- **Diversity & Inclusion**: Improved team representation since 2021.
- **Work-Life Balance**: Flexible conditions, team check-ins.
- **Community Engagement**: Financial literacy programs for underserved groups.

Avery has shown resilience, driving Insurellm's transformation amid challenges like competition and the pandemic, and is recognized for her strategic impact in insurtech.

In [29]:
display(Markdown(answer_question("Who is Lancaster and what is carllm?")[0]))

**Avery Lancaster** is the Co-Founder and CEO of Insurellm. She founded the company in 2015 in San Francisco, California, where she continues to lead as CEO (current salary: $225,000). Born on March 15, 1985, she's known for her innovative leadership in insurtech, with prior roles including Senior Product Manager at Innovate Insurance Solutions (2013-2015) and Business Analyst at Edge Analytics (2010-2013).

**Carllm** is Insurellm's AI-powered auto insurance portal and product, designed to help insurance companies streamline coverage with personalized solutions at minimal costs. Key features include:
- AI-Powered Risk Assessment
- Instant Quoting
- Customizable Coverage Plans
- Fraud Detection (advanced analytics to identify fraudulent claims)
- Customer Insights Dashboard (for behavioral insights, claims patterns, and trends)
- Mobile Integration
- Automated Customer Support (24/7 AI chatbots)

It's part of Insurellm's product portfolio, launched after the initial Markellm marketplace, and powers contracts like the one with TechDrive Insurance under the Professional Tier. Future plans include expanded automaker partnerships and multi-language AI support by 2026.