# RAG Application

![Simple RAG](../../images/simple_rag.png)

In this notebook, we're going to set up a simple RAG application that we'll be using as we learn more about LangSmith.

RAG (Retrieval-Augmented Generation) is a popular technique that retrieves documents relevant to a user's question and passes them to the LLM as context, so the model can ground its answer in them.
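To make the retrieve-then-generate flow concrete, here is a minimal sketch that runs without any API calls. It is a toy, not the implementation used in this notebook: the `DOCS` corpus, `tokenize`, `retrieve`, and `build_messages` are hypothetical names, and simple word overlap stands in for embedding-based search.

```python
import re
from typing import Dict, List, Set

# Hypothetical mini-corpus, made up for illustration only.
DOCS = [
    "LangSmith traces record the inputs and outputs of each step in an LLM application.",
    "Datasets in LangSmith hold example inputs and optional reference outputs.",
    "Evaluators score the outputs of an application against a dataset.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: List[str], k: int = 2) -> List[str]:
    """Retrieval step: rank documents by how many words they share with the question."""
    q_tokens = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_tokens & tokenize(d)), reverse=True)
    return ranked[:k]


def build_messages(question: str, docs: List[str]) -> List[Dict[str, str]]:
    """Augmentation step: place the retrieved text in the prompt next to the question."""
    context = "\n\n".join(docs)
    return [
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user", "content": f"Context: {context}\n\nQuestion: {question}"},
    ]


question = "What do traces record?"
messages = build_messages(question, retrieve(question, DOCS, k=1))
print(messages[1]["content"])
```

The resulting `messages` list is what would be sent to a chat model in the generation step; a real application swaps the word-overlap ranking for a vector search.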

In our case, we are going to index some LangSmith documentation!

LangSmith makes it easy to trace any LLM application, no LangChain required!

### Setup

Make sure you set your environment variables, including your OpenAI and LangSmith API keys.
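A quick check like the following can catch a missing key before any API call fails. This is a sketch under assumptions: `missing_env_vars` is a hypothetical helper, and treating these two variables as required is a guess based on the names this notebook sets. Run it after the variables have been set or loaded.

```python
import os
from typing import List, Mapping

# Assumed to be required; the names match the variables set in this notebook.
REQUIRED_VARS = ["OPENAI_API_KEY", "LANGSMITH_API_KEY"]


def missing_env_vars(required: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```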

In [1]:
#!pip3 install --upgrade lxml_html_clean

In [2]:
# # You can set them inline!
# import os
# os.environ["OPENAI_API_KEY"] = ""
# os.environ["LANGSMITH_API_KEY"] = ""
# os.environ["LANGSMITH_TRACING"] = "true"
# os.environ["LANGSMITH_PROJECT"] = "langsmith-academy"

In [1]:
import os
# Or you can use a .env file
from dotenv import load_dotenv
load_dotenv(dotenv_path="../../.env", override=True)

True

### Simple RAG application

In [2]:
from langsmith import traceable
from openai import OpenAI
from typing import List
import nest_asyncio
from utils import get_vector_db_retriever

MODEL_PROVIDER = "openai"
MODEL_NAME = "gpt-4o-mini"
APP_VERSION = 1.0
RAG_SYSTEM_PROMPT = """You are an assistant for question-answering tasks. 
Use the following pieces of retrieved context to answer the latest question in the conversation. 
If you don't know the answer, just say that you don't know. 
Use three sentences maximum and keep the answer concise.
"""

openai_client = OpenAI()
nest_asyncio.apply()
retriever = get_vector_db_retriever()

"""
retrieve_documents
- Returns documents fetched from a vectorstore based on the user's question
"""
@traceable(run_type="chain")
def retrieve_documents(question: str):
    return retriever.invoke(question)

"""
generate_response
- Calls `call_openai` to generate a model response after formatting inputs
"""
@traceable(run_type="chain")
def generate_response(question: str, documents):
    formatted_docs = "\n\n".join(doc.page_content for doc in documents)
    messages = [
        {
            "role": "system",
            "content": RAG_SYSTEM_PROMPT
        },
        {
            "role": "user",
            "content": f"Context: {formatted_docs} \n\n Question: {question}"
        }
    ]
    return call_openai(messages)

"""
call_openai
- Returns the chat completion output from OpenAI
"""
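@traceable(run_type="llm")
def call_openai(
    messages: List[dict], model: str = MODEL_NAME, temperature: float = 0.0
):
    # Returns the full chat completion object, not a string; callers read
    # the text from response.choices[0].message.content.
    return openai_client.chat.completions.create(
        model=model,
        messages=messages,
        temperature=temperature,
    )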

"""
langsmith_rag
- Calls `retrieve_documents` to fetch documents
- Calls `generate_response` to generate a response based on the fetched documents
- Returns the model response
"""
@traceable(run_type="chain")
def langsmith_rag(question: str):
    documents = retrieve_documents(question)
    response = generate_response(question, documents)
    return response.choices[0].message.content


USER_AGENT environment variable not set, consider setting it to identify your requests.
Fetching pages: 100%|###############################################################################| 197/197 [00:35<00:00,  5.53it/s]


Running the cell above should take a little less than a minute: it fetches the LangSmith documentation pages, indexes them, and stores them in an SKLearn vector database.
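For intuition about what indexing and querying a vector database involves, here is a toy in-memory version. It is only a sketch: `ToyVectorStore`, `embed`, and `cosine` are hypothetical names, and bag-of-words counts stand in for the learned embeddings a real vector store would use.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self, texts: List[str]) -> None:
        # Indexing: embed every document once, up front.
        self.entries: List[Tuple[str, Counter]] = [(t, embed(t)) for t in texts]

    def search(self, query: str, k: int = 1) -> List[str]:
        # Querying: embed the query and return the k most similar documents.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore([
    "Create a dataset from a CSV file.",
    "Trace an application with the traceable decorator.",
])
print(store.search("How do I create a dataset?"))
```

The up-front embedding pass in `__init__` is the part that corresponds to the one-time indexing cost; each later query only embeds the question and compares it against the stored vectors.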

In [5]:
question = "What is LangSmith used for?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

LangSmith is a platform designed for building production-grade LLM applications, allowing users to monitor and evaluate their applications for improved reliability. It provides features for tracing application requests, evaluating application quality over time, and testing prompts with version control. Additionally, it is framework agnostic, meaning it can be used with or without LangChain's frameworks.


In [6]:
question = "How can I evaluate my app ?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

To evaluate your app, you should create a dataset with test inputs and optionally expected outputs, define a target function that specifies what you're evaluating, and use evaluators to score the outputs of your target function. You can also consider different evaluation techniques such as backtesting, pairwise evaluation, or online evaluation depending on your needs. This structured approach will help you measure performance and identify areas for improvement.


In [7]:
question = "How do I create a dataset ?"
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

You can create a dataset in LangSmith by navigating to the Datasets & Experiments page and clicking on "+ New Dataset." From there, you can either import an existing dataset from a CSV or JSONL file, create an empty dataset, or add examples manually or via an LLM. Additionally, you can define a schema for your dataset to ensure it conforms to a specific JSON structure.


In [8]:
question = "I created the new dataset, now how do I run an evaluation with it ? "
ai_answer = langsmith_rag(question, langsmith_extra={"metadata": {"website": "www.google.com"}})
print(ai_answer)

To run an evaluation with your newly created dataset, navigate to the prompt playground and select your dataset from the "Test over dataset" dropdown. Then, add a prompt by selecting an existing one or creating a new one, ensuring that the input variables match the dataset keys. Finally, you can run the evaluation to see how well it scores across different contexts or scenarios.


### Test at scale

In [9]:
prompt_question_generator = "Generate a question about Langsmith, the tool. How to start, how to create new traces, etc. "
prompt_question_generator += "Answer only with the question, nothing more. Don't generate questions that already were generated"

messages = [
    {
        "role": "system",
        "content": prompt_question_generator
    }
]

for i in range(100):
    response = call_openai(messages)
    generated_question = response.choices[0].message.content
    print("Q:", generated_question)

    messages.append({"role": "user", "content": f"previously generated question: {generated_question}"})
    
    ai_answer = langsmith_rag(generated_question, langsmith_extra={"metadata": {"text": "AI-generated"}})
    print("R:", ai_answer)
    print("#" * 30)
    print(messages)
    print("#" * 30)

Q: What are the steps to create a new trace in Langsmith?
R: To create a new trace in LangSmith, you need to set up tracing in your application, which can be done through automatic or manual instrumentation. Once your application is configured, you can log traces by using the LangSmith SDK or API, ensuring that you pass the necessary hostname if using a no-auth configuration. Finally, you can add metadata and tags to your traces for better organization and analysis.
##############################
[{'role': 'system', 'content': "Generate a question about Langsmith, the tool. How to start, how to create new traces, etc.Answer only with the question, nothing more. Don't generate questions that already were generated"}, {'role': 'user', 'content': 'previously generated question: What are the steps to create a new trace in Langsmith?'}]
##############################
Q: How can I integrate Langsmith with my existing development environment?
R: To integrate LangSmith with your existing devel

KeyboardInterrupt: 