## RAG Day 3

### Expert Question Answerer for Insurellm

LangChain 1.0 implementation of a RAG pipeline.

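This uses the vector store we created last time. The query-time embedding model must match the one the store was built with: the code below uses OpenAI `text-embedding-3-large`; swap in `HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")` if your store was built with that model instead.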

In [22]:
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings

from langchain_chroma import Chroma
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_huggingface import HuggingFaceEmbeddings
import gradio as gr

In [15]:
MODEL = "gpt-4.1-nano"
DB_NAME = "vector_db"
load_dotenv(override=True)

True

### Connect to Chroma, using the same embedding model the store was built with

In [30]:
embeddings = OpenAIEmbeddings(model="text-embedding-3-large")
vectorstore = Chroma(persist_directory=DB_NAME, embedding_function=embeddings)

### Set up the 2 key LangChain objects: retriever and llm

#### A sidebar on "temperature":
- Controls how diverse the output is
- A temperature of 0 means that the output should be predictable
- Higher temperature for more variety in answers

Some people describe temperature as being like 'creativity' but that's not quite right
- It actually controls which tokens get selected during inference
- temperature=0 means: always select the token with highest probability
- temperature=1 usually means: sample from the model's probabilities unchanged, so a token with 10% probability is picked about 10% of the time

Note: a temperature of 0 doesn't mean outputs will always be reproducible. You also need to set a random seed. We will do that in weeks 6-8. (Even then, it's not always reproducible.)

Note 2: if you want creativity, use the System Prompt!
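
To make the sidebar concrete, here is a minimal sketch of temperature-scaled sampling in plain Python. It is an illustration, not how any particular provider implements it: the function names and the three example logits are hypothetical, and real inference stacks often layer other sampling controls on top.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive; use greedy_pick for temperature=0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_pick(logits):
    """temperature=0 in practice: always take the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_token(logits, temperature, rng=random):
    """Sample a token index from the temperature-scaled distribution."""
    if temperature == 0:
        return greedy_pick(logits)
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits, chosen so that temperature=1 gives roughly 70% / 20% / 10%
logits = [math.log(0.7), math.log(0.2), math.log(0.1)]

print(softmax_with_temperature(logits, 1.0))   # close to [0.7, 0.2, 0.1]
print(softmax_with_temperature(logits, 0.25))  # sharper: the top token dominates
print(softmax_with_temperature(logits, 4.0))   # flatter: closer to uniform
print(greedy_pick(logits))                     # index of the most likely token
```

Dividing the logits by a small temperature exaggerates the gaps between tokens, which is why low temperatures behave almost like always picking the top token.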

In [31]:
retriever = vectorstore.as_retriever()
llm = ChatOpenAI(temperature=0, model_name=MODEL)

### These LangChain objects implement the method `invoke()`

In [32]:
retriever.invoke("Who is Avery?")

[Document(id='bcb6fc0c-eda4-4fb3-8d46-924d468c3123', metadata={'doc_type': 'employees', 'source': 'C:\\Users\\abdullah\\Desktop\\main\\llm_engineering\\week5\\knowledge-base\\employees\\Avery Lancaster.md'}, page_content='# Avery Lancaster\n\n## Summary\n- **Date of Birth**: March 15, 1985\n- **Job Title**: Co-Founder & Chief Executive Officer (CEO)\n- **Location**: San Francisco, California\n- **Current Salary**: $225,000'),
 Document(id='c8768f21-4415-47fd-879d-90e4f9772647', metadata={'source': 'C:\\Users\\abdullah\\Desktop\\main\\llm_engineering\\week5\\knowledge-base\\employees\\Avery Lancaster.md', 'doc_type': 'employees'}, page_content="- **2010 - 2013**: Business Analyst at Edge Analytics  \n  Prior to joining Innovate, Avery worked as a Business Analyst, focusing on market trends and consumer preferences in the insurance space. This position laid the groundwork for Avery’s future entrepreneurial endeavors.\n\n## Annual Performance History\n- **2015**: **Exceeds Expectations*
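
For intuition about what `retriever.invoke()` does, here is a toy sketch: embed the query, score every stored chunk by cosine similarity, and return the top `k`. Everything in it is a stand-in. `ToyDocument`, `ToyRetriever`, `toy_embed` and the tiny vocabulary are hypothetical, and the real retriever relies on Chroma's index and a learned embedding model rather than word counts.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical four-word vocabulary standing in for a real embedding model
VOCAB = ["avery", "ceo", "claims", "policy"]


def toy_embed(text):
    """Count how often each vocabulary word appears (a crude stand-in for an embedding)."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ToyDocument:
    page_content: str
    vector: list


class ToyRetriever:
    """Mimics the shape of a retriever: invoke(query) returns the k closest documents."""

    def __init__(self, documents, embed, k=2):
        self.documents = documents
        self.embed = embed
        self.k = k

    def invoke(self, query):
        query_vector = self.embed(query)
        ranked = sorted(
            self.documents,
            key=lambda doc: cosine_similarity(query_vector, doc.vector),
            reverse=True,
        )
        return ranked[: self.k]


texts = [
    "Claims are handled within ten days",
    "Avery Lancaster is the CEO",
    "Each policy renews annually",
]
documents = [ToyDocument(text, toy_embed(text)) for text in texts]

toy_retriever = ToyRetriever(documents, toy_embed, k=1)
results = toy_retriever.invoke("Who is Avery?")
print(results[0].page_content)
```

The language model never sees the vectors; it only receives the text of whichever chunks rank highest.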

In [None]:
llm.invoke("Who is Avery?")

AIMessage(content="Avery is a given name that can be used for both males and females. It can also be a surname. Without additional context, it's difficult to determine which specific Avery you're referring to. If you can provide more details—such as a full name, profession, or context—I’d be happy to help identify the person you're asking about.", additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 68, 'prompt_tokens': 11, 'total_tokens': 79, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_f0bc439dc3', 'id': 'chatcmpl-CrSyGrI6jFPi3nqrUaR5P2ctzB9jA', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='lc_run--d7d86a6c-0bea-4ac2-90e4-124a7a5b6305-0', usage_metadata={'inp

## Time to put this together!

In [33]:
SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""

In [34]:
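def answer_question(question: str, history):
    """Answer a question with retrieved context. Gradio passes `history`, but it is not used here."""
    # Retrieve the chunks most similar to the question
    docs = retriever.invoke(question)
    # Join the chunks into a single context string, separated by blank lines
    context = "\n\n".join(doc.page_content for doc in docs)
    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)
    # Send the context-filled system prompt plus the user's question to the model
    response = llm.invoke([SystemMessage(content=system_prompt), HumanMessage(content=question)])
    return response.content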

In [35]:
answer_question("Who is Averi Lancaster?", [])

'It seems there might be a typo in the name. If you are referring to Avery Lancaster, she is the Co-Founder and CEO of Insurellm. She has been with the company since 2015 and is known for her innovative leadership in the insurance technology industry. If you meant someone else, please let me know!'
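
To see exactly what the model receives, it can help to build the system prompt offline, without calling the retriever or the LLM. The sketch below assumes a hypothetical `FakeDocument` stand-in and a helper named `build_system_prompt`; the two chunks are shortened excerpts of the retriever output shown earlier.

```python
from dataclasses import dataclass

SYSTEM_PROMPT_TEMPLATE = """
You are a knowledgeable, friendly assistant representing the company Insurellm.
You are chatting with a user about Insurellm.
If relevant, use the given context to answer any question.
If you don't know the answer, say so.
Context:
{context}
"""


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a retrieved document; only page_content is needed."""
    page_content: str


def build_system_prompt(docs):
    """Join the chunk texts with blank lines and drop them into the template."""
    context = "\n\n".join(doc.page_content for doc in docs)
    return SYSTEM_PROMPT_TEMPLATE.format(context=context)


docs = [
    FakeDocument("# Avery Lancaster\n- **Job Title**: Co-Founder & Chief Executive Officer (CEO)"),
    FakeDocument("- **2010 - 2013**: Business Analyst at Edge Analytics"),
]

prompt = build_system_prompt(docs)
print(prompt)
```

`str.format` only parses the template itself, so curly braces inside a retrieved chunk are inserted as-is and do not raise an error.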

## What could possibly come next? 😂

In [36]:
gr.ChatInterface(answer_question).launch()

* Running on local URL:  http://127.0.0.1:7862
* To create a public link, set `share=True` in `launch()`.




## Admit it - you thought RAG would be more complicated than that!!