# RAG Continued.

## Expert Question Answerer for InsureLLM
LangChain 1.0 implementation of a RAG pipeline.

This uses the vector store we created last time (with the Hugging Face all-MiniLM-L6-v2 embedding model).

In [14]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_ollama import ChatOllama

from langchain_chroma import Chroma
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_huggingface import HuggingFaceEmbeddings
import gradio as gr
import os

In [30]:
# Price is a factor for our company, so we're going to use a low-cost model

MODEL = "gpt-oss:120b"
DB_NAME = "vector_db"
load_dotenv(override=True)
OLLAMA_API_KEY = os.getenv('OLLAMA_API_KEY')
if not OLLAMA_API_KEY:
    raise ValueError("OLLAMA_API_KEY is not set; add it to your .env file")

## Connect to Chroma; use Hugging Face all-MiniLM-L6-v2

In [8]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)

### Set up the 2 key LangChain objects: retriever and llm

#### A sidebar on "temperature":
- Controls how diverse the output is
- A temperature of 0 means that the output should be predictable
- Higher temperature for more variety in answers

Some people describe temperature as being like 'creativity', but that's not quite right:
- It actually controls which tokens get selected during inference
- temperature=0 means: always select the token with highest probability
- temperature=1 usually means: a token with 10% probability should be picked 10% of the time

Note: a temperature of 0 doesn't guarantee that outputs will always be reproducible.

Note 2: if you want creativity, use the System Prompt!
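
To make the sidebar concrete, here is a minimal sketch of temperature-scaled sampling in plain Python. It is an illustration under simplifying assumptions, not how any particular model server implements it: the function names `apply_temperature` and `sample_token` and the example logits are hypothetical, and real inference stacks usually combine temperature with other settings such as top-p.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities at a given temperature."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Pick a token index according to the temperature-adjusted probabilities."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits, chosen so that temperature=1 gives probabilities 0.6, 0.3, 0.1.
logits = [math.log(0.6), math.log(0.3), math.log(0.1)]
for t in (0, 0.5, 1, 2):
    print(t, [round(p, 3) for p in apply_temperature(logits, t)])
```

At temperature 1 the 10% token keeps its 10% chance; lower temperatures sharpen the distribution toward the top token, and higher temperatures flatten it.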

In [31]:
retriever = vectorstore.as_retriever()
llm = ChatOllama(
    temperature=0,
    model=MODEL,
    base_url="https://ollama.com",
)

### These LangChain objects implement the method `invoke()`

In [32]:
retriever.invoke("Who is Alex?")

[Document(id='9d40dc7f-98bb-4ab4-9652-854165b1ae12', metadata={'source': 'knowledge-base\\employees\\Alex Thomson.md', 'doc_type': 'employees'}, page_content='## Other HR Notes\n- Alex Thomson is an active member of the Diversity and Inclusion committee at Insurellm and has participated in various community outreach programs.  \n- Alex has received external training on advanced CRM usage, which has subsequently improved team efficiency and productivity.\n- Continuous professional development through attending sales conventions and workshops, with plans to pursue certification in Sales Enablement in 2024.\n- Recognized by peers for promoting a supportive and high-energy team environment, often organizing team-building activities to enhance camaraderie within the SDR department. \n\n--- \n**Comment:** Alex Thomson is considered a cornerstone of Insurellm’s sales team and has a bright future within the organization.'),
 Document(id='a0f62600-f1bf-4b4d-b33a-85e1cc6e39ef', metadata={'source

In [33]:
llm.invoke("Who is Alex?")

AIMessage(content='I’m not sure which “Alex” you’re referring to—there are lots of well‑known people, fictional characters, and personal acquaintances with that name. Could you give me a bit more context? For example, are you thinking of:\n\n* A public figure (e.g., Alex Trebek, Alex Rodriguez, Alex Morgan, etc.)  \n* A character from a book, TV show, or movie (e.g., Alex DeLarge from *A Clockwork Orange*, Alex Danvers from *Supergirl*, etc.)  \n* Someone you know personally or a colleague  \n\nLet me know, and I’ll be happy to give you the information you’re looking for!', additional_kwargs={}, response_metadata={'model': 'gpt-oss:120b', 'created_at': '2025-12-14T16:05:00.84605426Z', 'done': True, 'done_reason': 'stop', 'total_duration': 2092342157, 'load_duration': None, 'prompt_eval_count': 71, 'prompt_eval_duration': None, 'eval_count': 237, 'eval_duration': None, 'model_name': 'gpt-oss:120b', 'model_provider': 'ollama'}, id='lc_run--019b1d9b-a340-7aa2-96d8-1e997e4f323c-0', usage_m

## Time to put this together!

In [34]:
SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""

In [36]:
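def answer_question(question: str, history):
    # `history` is passed in by Gradio's ChatInterface; it is not used here,
    # so each question is answered independently of earlier turns
    # retrieve the relevant info from vector db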
    docs = retriever.invoke(question)
    # prepare context based on retrieved docs
    context = "\n\n".join(doc.page_content for doc in docs)
    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)
    response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])
    return response.content
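
The prompt-assembly step can be tried on its own, without a vector store or an LLM. The sketch below is a standalone illustration: `FakeDocument` is a hypothetical stand-in for a LangChain `Document` (only `page_content` is used), `build_system_prompt` is a hypothetical helper name, and the template is a shortened copy of the one above.

```python
from dataclasses import dataclass


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document; only page_content is used."""
    page_content: str


TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
If relevant, use the given context to answer any question.
Context:
{context}
"""


def build_system_prompt(docs, template=TEMPLATE):
    # Join the retrieved chunks with blank lines, then substitute them into the template.
    context = "\n\n".join(doc.page_content for doc in docs)
    return template.format(context=context)


docs = [
    FakeDocument("Chunk A about employees."),
    FakeDocument("Chunk B about products."),
]
print(build_system_prompt(docs))
```

Because `str.format` only parses the template, curly braces inside the retrieved chunks are inserted as-is; braces in the template itself would need to be doubled.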

In [37]:
answer_question("Who is Alex?", [])

'**Alex Thomson** is one of Insurellm’s standout team members:\n\n| Detail | Information |\n|--------|--------------|\n| **Full Name** | Alex Thomson |\n| **Job Title** | Sales Development Representative (SDR) – currently also a Team Lead for a small group of 5 SDRs |\n| **Location** | Austin, Texas |\n| **Date of Birth** | March\u202f15\u202f1995 |\n| **Current Salary** | $65,000 base (plus performance bonuses) |\n| **Career Highlights at Insurellm** | • Joined the company in\u202fNov\u202f2022 as an SDR<br>• Promoted to Team Lead for special projects in\u202fJan\u202f2023<br>• Created a training module for new SDRs in\u202fAug\u202f2023<br>• Consistently collaborates with Marketing to develop new lead‑generation strategies |\n| **Compensation History** | 2022 – $65\u202fk base + $13\u202fk bonus (20% of base)<br>2023 – $75\u202fk base + $15\u202fk bonus (20% of base) |\n| **Key Awards & Recognitions** | • “SDR of the Year” (2022)<br>• Monthly MVP (3 times in 2023) |\n| **Professional

In [38]:
gr.ChatInterface(answer_question).launch()

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.


