# RAG Application

![Simple RAG](../../images/simple_rag.png)

In this notebook, we're going to set up a simple RAG application that we'll be using as we learn more about LangSmith.

RAG (Retrieval-Augmented Generation) is a popular technique for supplying an LLM with relevant documents so that it can better answer questions from users.

In our case, we are going to index some LangSmith documentation!
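
To make the retrieval step concrete, here is a minimal sketch in plain Python. It is not how this notebook indexes documents (that is handled by `get_vector_db_retriever` further down). The corpus sentences, the bag-of-words scoring, and the helper names `tokenize`, `cosine`, `retrieve`, and `build_prompt` are all illustrative assumptions.

```python
import math
import re
from collections import Counter
from typing import List

# Toy corpus standing in for indexed documentation (made-up sentences).
DOCS = [
    "Tracing records the inputs and outputs of each step in an application.",
    "A dataset is a collection of examples used for evaluation.",
    "Prompts can be versioned and edited in a playground.",
]

def tokenize(text: str) -> Counter:
    """Lowercase the text and count its word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the question."""
    q = tokenize(question)
    ranked = sorted(docs, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, context: List[str]) -> str:
    """Place the retrieved context ahead of the question."""
    joined = "\n\n".join(context)
    return f"Context: {joined}\n\nQuestion: {question}"

question = "What is a dataset used for?"
top = retrieve(question, DOCS)
prompt = build_prompt(question, top)
print(prompt)
```

A real retriever would typically compare embedding vectors rather than word counts, but the shape is the same: score documents against the question, keep the best matches, and put them in the prompt.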

LangSmith makes it easy to trace any LLM application, no LangChain required!

### Setup

Make sure you set your environment variables, including your LangSmith API key.

In [1]:
# You can set them inline!
import os
os.environ["LANGCHAIN_API_KEY"] = ""
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "langsmith-academy"

In [2]:
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../../.env", override=True)

True
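
Because the inline cell above leaves `LANGCHAIN_API_KEY` as an empty placeholder, it can help to confirm the variables are actually populated before going further. The following is a small sketch; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names introduced here, and the variable list simply mirrors the setup cell.

```python
import os

# Variable names taken from the setup cell above.
REQUIRED_VARS = ["LANGCHAIN_API_KEY", "LANGCHAIN_TRACING_V2", "LANGCHAIN_PROJECT"]

def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in the given mapping."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]

missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing or empty:", ", ".join(missing))
else:
    print("All required variables are set.")
```

Passing a plain dictionary as `env` makes the helper easy to try out without touching the real environment.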

In [5]:
!ollama --version

ollama version is 0.5.13


### Simple RAG application

In [3]:
import os

# Set USER_AGENT before importing utils: the loaders it pulls in appear to
# check this variable at import time and warn if it is unset
os.environ["USER_AGENT"] = "LangSmithRAG/1.0 (Python; Ollama)"

from langsmith import traceable
import requests
from typing import List
import nest_asyncio
from utils import get_vector_db_retriever

MODEL_PROVIDER = "ollama"
MODEL_NAME = "llama3.2:latest"
APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks. 
Use the retrieved context to answer the question in one concise sentence, focusing only on the answer itself. 
If the context lacks sufficient information, say "I don't have enough context to answer accurately" and stop there."""

OLLAMA_API_URL = "http://localhost:11434/api/generate"
nest_asyncio.apply()
retriever = get_vector_db_retriever()

"""
retrieve_documents
- Returns documents fetched from a vectorstore based on the user's question
"""
@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

"""
generate_response
- Calls `call_ollama` to generate a model response after formatting inputs
"""
@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs}\n\nQuestion: {question}"
        }
    ]
    prompt = "\n".join([f"{msg['role']}: {msg['content']}" for msg in messages])
    return call_ollama(prompt)

"""
call_ollama
- Sends the prompt string to Ollama's `/api/generate` endpoint and returns the generated text
"""
@traceable(run_type="llm")
def call_ollama(
    messages: str,
    model: str = MODEL_NAME,
    temperature: float = 0.0
) -> str:
    payload = {
        "model": model,
        "prompt": messages,
        # Ollama reads sampling parameters from the nested "options" object
        "options": {"temperature": temperature},
        "stream": False
    }
    response = requests.post(OLLAMA_API_URL, json=payload)
    response.raise_for_status()
    return response.json()["response"]

"""
langsmith_rag
- Calls `retrieve_documents` to fetch documents
- Calls `generate_response` to generate a response based on the fetched documents
- Returns the model response
"""
@traceable(run_type="chain")
def langsmith_rag(question: str, **kwargs):
    documents = retrieve_documents(question)
    # print(f"Retrieved documents: {[doc.page_content for doc in documents]}")
    response = generate_response(question, documents)
    # print(f"Generated response: {response}")
    return response

# Example usage
if __name__ == "__main__":
    # First question
    question1 = "What is the capital of France?"
    answer1 = langsmith_rag(question1)
    print(answer1)

    # Second question with metadata
    question2 = "What is LangSmith used for?"
    answer2 = langsmith_rag(question2, langsmith_extra={"metadata": {"website": "www.google.com"}})
    print(answer2)

USER_AGENT environment variable not set, consider setting it to identify your requests.


The answer to "What is the capital of France?" is Paris.
LangSmith appears to be a programming language, although its primary use isn't explicitly stated in the provided context.


The cell above should take a little less than a minute to run, because it indexes and stores LangSmith documentation in an SKLearn vector database.
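
The request body that `call_ollama` sends can be inspected without a running Ollama server. The sketch below mirrors how `generate_response` flattens chat-style messages into a single prompt, and assumes that Ollama's `/api/generate` endpoint reads sampling parameters such as `temperature` from a nested `options` object. `flatten_messages` and `build_generate_payload` are hypothetical helpers, not part of the notebook, and the message contents are placeholders.

```python
import json

def flatten_messages(messages) -> str:
    """Join chat-style messages into one 'role: content' prompt string."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)

def build_generate_payload(
    prompt: str,
    model: str = "llama3.2:latest",
    temperature: float = 0.0,
) -> dict:
    """Build a non-streaming request body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        # Assumption: sampling parameters belong in the nested "options" object.
        "options": {"temperature": temperature},
        "stream": False,
    }

# Placeholder messages, shaped like the ones built in generate_response.
messages = [
    {"role": "system", "content": "Answer in one sentence."},
    {"role": "user", "content": "Context: ...\n\nQuestion: What is LangSmith used for?"},
]
payload = build_generate_payload(flatten_messages(messages))
print(json.dumps(payload, indent=2))
```

Printing the payload this way is a quick check that the system prompt, context, and question all reach the model in the order you expect.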

In [4]:
question = "What is LangSmith used for?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

LangSmith appears to be a programming language or development tool, but its specific use case and purpose are not clearly defined in the provided context.


### Let's take a look in LangSmith!