# RAG Application

![Simple RAG](../../images/simple_rag.png)

In this notebook, we're going to set up a simple RAG application that we'll be using as we learn more about LangSmith.

RAG (Retrieval-Augmented Generation) is a popular technique for supplying an LLM with relevant documents so that it can better answer users' questions.
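As a rough illustration of the idea (not the retriever used later in this notebook), the sketch below ranks a few hard-coded snippets by word overlap with the question and pastes the best match into a prompt. The snippets and the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical, invented for this example. A real application would typically use embeddings and a vector store instead of word overlap.

```python
import re

# Toy "knowledge base"; these snippets are made up for illustration.
DOCS = [
    "LangSmith is a platform for tracing and evaluating LLM applications.",
    "Solar panels convert sunlight into electricity.",
    "A vector store indexes documents by their embeddings.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents sharing the most words with the question."""
    q_words = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, docs: list[str]) -> str:
    """Place the retrieved context in front of the question."""
    context = "\n\n".join(docs)
    return f"Context: {context}\n\nQuestion: {question}"


top_docs = retrieve("What is LangSmith used for?", DOCS)
prompt = build_prompt("What is LangSmith used for?", top_docs)
print(prompt)
```

The prompt would then be sent to the model, which answers from the supplied context rather than from its training data alone.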

In our case, we are going to index some LangSmith documentation!

LangSmith makes it easy to trace any LLM application, no LangChain required!

### Setup

Make sure you set your environment variables, including your OpenAI API key.

In [1]:
# You can set them inline!
import os
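
One way to set them inline is sketched below. `LANGSMITH_TRACING`, `LANGSMITH_API_KEY`, and `LANGSMITH_PROJECT` are the names the LangSmith SDK reads, and `OPENAI_API_KEY` is the name the OpenAI client reads. The placeholder values and the project name are assumptions to replace with your own, and the helper modules imported later in this notebook may read other variables as well. `setdefault` leaves any value already present in the environment untouched.

```python
import os

# Placeholder values: replace them with your own keys before running.
os.environ.setdefault("LANGSMITH_TRACING", "true")
os.environ.setdefault("LANGSMITH_API_KEY", "<your-langsmith-api-key>")
os.environ.setdefault("LANGSMITH_PROJECT", "rag-application-demo")  # hypothetical project name
os.environ.setdefault("OPENAI_API_KEY", "<your-openai-api-key>")
```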

In [2]:
# Or you can use a .env file
# from dotenv import load_dotenv
# load_dotenv(dotenv_path=".env", override=True)

### Simple RAG application

In [3]:
from typing import List

import nest_asyncio
from langsmith import traceable

# Project-local helpers; an Ollama-based alternative is left commented out.
# from llmhelper import get_llm_from_ollama, get_retriever
from common_genai_utils.geminihelper import get_llm
from common_genai_utils.llmhelper import get_retriever


APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the latest question in the conversation. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
"""

llm_client = get_llm()  # LLM client returned by the project helper
nest_asyncio.apply()  # patch asyncio so nested event loops work inside Jupyter
retriever = get_retriever(llm_model=llm_client)


@traceable(run_type="chain", metadata={"vector_db": "Chroma Vector DB"})
def retrieve_documents(question: str):
    """Return documents fetched from a vector store based on the user's question."""
    return retriever.invoke(question)


@traceable(run_type="chain")
def generate_response(question: str, documents):
    """Format the inputs, then call `call_model` to generate a model response."""
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_model(messages)


@traceable(
    run_type="llm",
    metadata={"ls_model_name": "Llama on Ollama", "ls_provider": "Meta"},
)
def call_model(messages: List[dict]) -> str:
    """Return the chat completion output from the configured LLM client."""
    return llm_client.invoke(messages)


@traceable(run_type="chain")
def langsmith_rag(question: str):
    """Run the full RAG pipeline.

    - Calls `retrieve_documents` to fetch documents
    - Calls `generate_response` to generate a response based on the fetched documents
    - Returns the model response
    """
    # `langsmith_extra` attaches metadata to this specific run at call time.
    documents = retrieve_documents(
        question,
        langsmith_extra={"metadata": {"runtime_metadata": "Testing runtime metadata"}},
    )
    response = generate_response(question, documents)
    return response


Loaded env variables from current directory ->  False
Loaded env variables from parent directory ->  True


  from .autonotebook import tqdm as notebook_tqdm
USER_AGENT environment variable not set, consider setting it to identify your requests.


model name used -  models/gemini-1.5-pro


The cell above should take a little less than a minute to run, because it indexes and stores the LangSmith documentation in a SKLearn vector database.
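
To make the prompt assembly in `generate_response` concrete, the sketch below rebuilds the same message list with a stand-in document class, so it runs without a retriever or an LLM. `FakeDoc`, `build_messages`, and the shortened `SYSTEM_PROMPT` are hypothetical names for this example. The only property assumed of real retrieved documents is the `page_content` attribute used in the cell above.

```python
from dataclasses import dataclass

SYSTEM_PROMPT = "Use the retrieved context to answer the question."  # shortened stand-in


@dataclass
class FakeDoc:
    """Stand-in for a retrieved document; only `page_content` is needed."""
    page_content: str


def build_messages(question: str, documents: list[FakeDoc]) -> list[dict]:
    """Join the document texts and wrap them in a system + user message pair."""
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context: {formatted_docs} \n\n Question: {question}"},
    ]


docs = [FakeDoc("LangSmith traces LLM calls."), FakeDoc("Traces are grouped into projects.")]
messages = build_messages("What is LangSmith?", docs)
for message in messages:
    print(message["role"], "->", message["content"])
```

Printing the messages this way is a quick check that the retrieved context actually reaches the user turn before the model is called.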

In [4]:
question = "What is LangSmith? and What is it used for?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

ResponseError: model "models/gemini-1.5-pro" not found, try pulling it first

In [None]:
# Another question, to test whether the LLM can answer the query.
question = "What is solar energy?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

### Let's take a look in LangSmith!