## RAG Day 3

### Expert Question Answerer for InsureLLM

LangChain 1.0 implementation of a RAG pipeline.

This uses the vector store we created last time (with the Hugging Face `all-MiniLM-L6-v2` embedding model).

In [1]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

from langchain_chroma import Chroma
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_huggingface import HuggingFaceEmbeddings
import gradio as gr

In [2]:
MODEL = "gpt-5-nano"
DB_NAME = "vector_db"
load_dotenv(override=True)

True

### Connect to Chroma; use Hugging Face all-MiniLM-L6-v2

In [3]:
embeddings = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)

### Set up the 2 key LangChain objects: retriever and llm

#### A sidebar on "temperature":
- Controls how diverse the output is
- A temperature of 0 means that the output should be predictable
- Higher temperature for more variety in answers

Some people describe temperature as being like 'creativity' but that's not quite right
- It actually controls which tokens get selected during inference
- temperature=0 means: always select the token with highest probability
- temperature=1 usually means: sample from the model's probabilities as they are, so a token with 10% probability should be picked about 10% of the time
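The bullets above can be sketched in a few lines of plain Python. This is only an illustration of the usual softmax-with-temperature idea, not how any particular provider implements sampling. The function name `apply_temperature` and the three-token vocabulary are made up for the example.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Turn raw token scores into sampling probabilities at a given temperature."""
    if temperature == 0:
        # Greedy decoding: all probability mass goes to the highest-scoring token
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical 3-token vocabulary whose unscaled probabilities are 70%, 20%, 10%
logits = [math.log(0.7), math.log(0.2), math.log(0.1)]

probs_t1 = apply_temperature(logits, 1.0)    # unchanged distribution
probs_cold = apply_temperature(logits, 0.5)  # sharper: favours the top token
probs_t0 = apply_temperature(logits, 0)      # always the top token

# Sample many times at temperature=1 and measure how often the 10% token appears
rng = random.Random(42)
samples = rng.choices(range(3), weights=probs_t1, k=10_000)
share_of_rare_token = samples.count(2) / len(samples)
print(probs_t1, probs_cold, probs_t0, share_of_rare_token)
```

Lowering the temperature concentrates probability on the most likely token; raising it spreads probability more evenly across the alternatives.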

Note: a temperature of 0 doesn't mean outputs will always be reproducible. You also need to set a random seed. We will do that in weeks 6-8. (Even then, it's not always reproducible.)

Note 2: if you want creativity, use the System Prompt!

In [4]:
retriever = vectorstore.as_retriever()
llm = ChatOpenAI(temperature=0, model_name=MODEL)

### These LangChain objects implement the method `invoke()`

In [5]:
retriever.invoke("Who is Avery?")

[Document(id='c9c41fdd-c5ee-4188-9acf-ea8edf8fc4d6', metadata={'source': 'knowledge-base\\employees\\Avery Lancaster.md', 'doc_type': 'employees'}, page_content="## Other HR Notes\n- **Professional Development**: Avery has actively participated in leadership training programs and industry conferences, representing Insurellm and fostering partnerships.  \n- **Diversity & Inclusion Initiatives**: Avery has championed a commitment to diversity in hiring practices, seeing visible improvements in team representation since 2021.  \n- **Work-Life Balance**: Feedback revealed concerns regarding work-life balance, which Avery has approached by implementing flexible working conditions and ensuring regular check-ins with the team.\n- **Community Engagement**: Avery led community outreach efforts, focusing on financial literacy programs, particularly aimed at underserved populations, improving Insurellm's corporate social responsibility image.  \n\nAvery Lancaster has demonstrated resilience and a

In [6]:
llm.invoke("Who is Avery?")

AIMessage(content='Avery could refer to a lot of different things, so I need a bit more context. Do you mean a person, a fictional character, or a brand?\n\n- People: Avery is a unisex given name. Notable individuals include Avery Brooks (actor/director, Star Trek DS9). If you have a last name, I can tell you more.\n- Fictional characters: For example, Sergeant Major Avery Johnson is a character in the Halo video game series.\n- Brand/company: Avery is known for labels and office supplies (Avery Dennison is the company behind the Avery label brand).\n\nIf you can share where you saw the name or any extra detail (book, show, field, or a last name), I can give a specific answer.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1510, 'prompt_tokens': 10, 'total_tokens': 1520, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 1344, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'a

## Time to put this together!

In [7]:
SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""

In [8]:
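def answer_question(question: str, history):
    """Answer a question with RAG; `history` is supplied by Gradio but not used here."""
    # Retrieve the chunks most relevant to the question
    docs = retriever.invoke(question)
    # Join the chunk texts into a single context string
    context = "\n\n".join(doc.page_content for doc in docs)
    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)
    # Send the system prompt (with context) plus the user's question to the LLM
    response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])
    return response.content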
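To see what the LLM actually receives, here is a small offline sketch of the prompt-assembly step, with no API key or vector store needed. `FakeDoc` and `build_system_prompt` are hypothetical stand-ins for this illustration, and the two chunk texts are invented placeholders rather than real retrieved documents.

```python
from dataclasses import dataclass

SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""

@dataclass
class FakeDoc:
    """Stand-in for a retrieved document: only page_content is needed here."""
    page_content: str

def build_system_prompt(docs):
    # Same two steps as in answer_question: join the chunks, fill the template
    context = "\n\n".join(doc.page_content for doc in docs)
    return SYSTEM_PROMPT_TEMPLATE.format(context=context)

# Placeholder chunks standing in for what the retriever would return
docs = [FakeDoc("First retrieved chunk."), FakeDoc("Second retrieved chunk.")]
system_prompt = build_system_prompt(docs)
print(system_prompt)
```

Printing the assembled prompt like this is a quick way to check which chunks ended up in the context when an answer looks wrong.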

In [9]:
answer_question("Who is Averi Lancaster?", [])

'Avery Lancaster. It looks like “Averi Lancaster” is a misspelling in your query. Based on the records, Avery Lancaster is the Co-Founder and CEO of Insurellm.\n\nKey details:\n- Title: Co-Founder & CEO\n- Location: San Francisco, California\n- Insurellm tenure: 2015–present\n- Background: Prior to Insurellm, Senior Product Manager at Innovate Insurance Solutions (2013–2015)\n- Notable: Known for innovative leadership, risk management expertise, and driving Insurellm into the mainstream insurance market\n- Other notes in the record: There is a conflicting entry that mentions “January 2021 – Present: Senior Data Engineer” and references “Maxine,” which appears to be a data-entry error. If you’re seeing this in internal records, I’d recommend reconciling the discrepancy.\n\nIf you meant a different person or want a quick summary for another member, tell me and I’ll help clarify.'

## What could possibly come next? 😂

In [10]:
gr.ChatInterface(answer_question).launch()

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.




## Admit it - you thought RAG would be more complicated than that!!