# Simple GenAI RAG App with Langchain

### Use case:
- Suppose a website has text content we want to ask questions about
- Extract that content from the page (web scraping)
- Split the text into chunks and embed each chunk
- Use an LLM with a context-restricted prompt so answers come only from that page


### 🔐 Load Environment Variables

Here we load API keys and set environment variables using the `.env` file.  
- Make sure `.env` is listed in `.gitignore` so the keys are never committed
- `LANGCHAIN_TRACING_V2` enables tracing in LangSmith


In [21]:
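import os
from dotenv import load_dotenv

# load_dotenv() reads .env and adds its keys to os.environ
load_dotenv()

# Fail early with a clear message; assigning os.getenv(...) back into
# os.environ raises a TypeError when the key is missing
for key in ("GROQ_API_KEY", "MISTRAL_API_KEY", "LANGCHAIN_API_KEY"):
    if not os.getenv(key):
        raise RuntimeError(f"{key} is not set; add it to your .env file")

# Enable LangSmith tracing for this project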
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "SIMPLE_GENAI_APP"

### 🧠 Load the LLM

We're using Groq's `gemma2-9b-it` model via `langchain_groq`.  
This will be our core language model for answering questions.


In [22]:
from langchain_groq import ChatGroq
llm = ChatGroq(model="gemma2-9b-it")

### 🌐 Load Documents from the Web

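We use `WebBaseLoader` to pull documentation directly from LangSmith's website.  
`loader.load()` returns a list of `Document` objects, each with `page_content` (the extracted text) and `metadata` (source URL, title, and so on).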
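
To make the shape of a loaded document concrete, here is a minimal standard-library sketch of the idea: strip the tags from a page and keep the text plus some metadata. `SimpleDocument` and `html_to_document` are hypothetical names used only for illustration; they are not LangChain's classes, and the real loader's parsing may differ in detail.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a loaded document (not LangChain's class)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


class _TextExtractor(HTMLParser):
    """Collects visible text and the <title>, skipping script/style bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.title = ""
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore CSS and JavaScript
        if self._in_title:
            self.title += data
        if data.strip():
            self.parts.append(data.strip())


def html_to_document(html, source):
    """Turn raw HTML into one SimpleDocument with text and metadata."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return SimpleDocument(
        page_content="\n".join(parser.parts),
        metadata={"source": source, "title": parser.title.strip()},
    )


sample_html = (
    "<html><head><title>Demo</title><style>p{}</style></head>"
    "<body><p>Hello</p><p>World</p></body></html>"
)
doc = html_to_document(sample_html, source="https://example.com")
print(doc.page_content)
print(doc.metadata)
```

The title text ends up inside `page_content` as well as in `metadata`, which matches what the real output below shows for the LangSmith page.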


In [23]:
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com/?_gl=1*1sq7mh8*_gcl_au*MjU3OTg0MzA2LjE3NTE3MDM2NDY.*_ga*ODYzNDc1NDYwLjE3NTE3MDM2NDY.*_ga_47WX3HKKY2*czE3NTE5MDM3MTAkbzIkZzAkdDE3NTE5MDM3MTAkajYwJGwwJGgw")
docs = loader.load()
docs

[Document(metadata={'source': 'https://docs.smith.langchain.com/?_gl=1*1sq7mh8*_gcl_au*MjU3OTg0MzA2LjE3NTE3MDM2NDY.*_ga*ODYzNDc1NDYwLjE3NTE3MDM2NDY.*_ga_47WX3HKKY2*czE3NTE5MDM3MTAkbzIkZzAkdDE3NTE5MDM3MTAkajYwJGwwJGgw', 'title': 'Get started with LangSmith | 🦜️🛠️ LangSmith', 'description': 'LangSmith is a platform for building production-grade LLM applications.', 'language': 'en'}, page_content="\n\n\n\n\nGet started with LangSmith | 🦜️🛠️ LangSmith\n\n\n\n\n\n\n\n\nSkip to main contentOur Building Ambient Agents with LangGraph course is now available on LangChain Academy!API ReferenceRESTPythonJS/TSSearchRegionUSEUGo to AppGet StartedObservabilityEvaluationPrompt EngineeringDeployment (LangGraph Platform)AdministrationSelf-hostingPricingReferenceCloud architecture and scalabilityAuthz and AuthnAuthentication methodsdata_formatsEvaluationDataset transformationsRegions FAQsdk_referenceGet StartedOn this pageGet started with LangSmith\nLangSmith is a platform for building production-grade 

### 📚 Split Documents into Chunks

We split the documents into chunks of at most 500 characters with a 100-character overlap.  
Smaller chunks keep each embedding focused on one topic, and the overlap reduces the chance that a sentence is cut in half at a chunk boundary.


In [24]:
from langchain_text_splitters import RecursiveCharacterTextSplitter as RCTS

# chunk_size and chunk_overlap are measured in characters
splitter = RCTS(chunk_size=500, chunk_overlap=100)
final_docs = splitter.split_documents(docs)
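
To see how `chunk_size` and `chunk_overlap` interact, the following is a simplified sliding-window sketch in plain Python. It is not what `RecursiveCharacterTextSplitter` does internally (that splitter prefers to break on separators such as paragraphs and lines, so real chunks are usually shorter and uneven); `sliding_window_chunks` is a hypothetical helper that only illustrates the arithmetic.

```python
def sliding_window_chunks(text, chunk_size=500, chunk_overlap=100):
    """Cut text into fixed-size windows where neighbours share chunk_overlap chars."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


text = "0123456789" * 120  # 1200 characters
chunks = sliding_window_chunks(text, chunk_size=500, chunk_overlap=100)

print([len(c) for c in chunks])            # [500, 500, 400]
print(chunks[0][-100:] == chunks[1][:100])  # True: the shared overlap
```

With a step of 400 characters, the windows start at offsets 0, 400 and 800, so every chunk repeats the last 100 characters of the one before it.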

### 🧠 Generate Embeddings and Store in VectorDB

- Use Mistral's `mistral-embed` model to embed document chunks.
- Store vectors in FAISS, an efficient in-memory vector store.


In [25]:
from langchain_community.vectorstores import FAISS
from langchain_mistralai import MistralAIEmbeddings

# Embed every chunk with mistral-embed and index the vectors in FAISS
embedding = MistralAIEmbeddings(model="mistral-embed")
db = FAISS.from_documents(final_docs, embedding)



### 🔍 Run a Basic Similarity Search

Before creating a full RAG pipeline, test if your query finds relevant context chunks from the FAISS DB.


In [26]:
# Query the vector store directly
query = "The quality and development speed of AI applications depends on"
result = db.similarity_search(query)
result[0].page_content

'Get started by adding tracing to your application.\nCreate dashboards to view key metrics like RPS, error rates and costs.\n\nEvals\u200b\nThe quality and development speed of AI applications depends on high-quality evaluation datasets and metrics to test and optimize your applications on. The LangSmith SDK and UI make building and running high-quality evaluations easy.'
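
Conceptually, a similarity search embeds the query and returns the stored chunks whose vectors lie closest to it. The sketch below shows that ranking step on hand-written three-dimensional toy vectors using Euclidean (L2) distance, which, as far as I know, is the default for LangChain's FAISS wrapper. The vectors, texts, and the `toy_similarity_search` helper are made up for illustration; real `mistral-embed` vectors have far more dimensions.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def toy_similarity_search(query_vec, index, k=4):
    """Return the texts of the k entries nearest to query_vec.

    index is a list of (text, vector) pairs.
    """
    ranked = sorted(index, key=lambda item: l2_distance(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Made-up vectors standing in for embedded chunks
toy_index = [
    ("evaluation datasets and metrics", [0.9, 0.1, 0.0]),
    ("tracing and dashboards", [0.1, 0.9, 0.0]),
    ("prompt engineering", [0.0, 0.2, 0.9]),
]

query_vec = [0.8, 0.2, 0.1]  # pretend embedding of the user's query
top = toy_similarity_search(query_vec, toy_index, k=2)
print(top)  # ['evaluation datasets and metrics', 'tracing and dashboards']
```

FAISS does the same job with optimized index structures, so it stays fast when the number of chunks grows.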

### 🧩 Create Document Chain with Contextual Prompt

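`create_stuff_documents_chain` inserts ("stuffs") the documents it receives into the prompt's `{context}` placeholder so the LLM answers from that text.  
This is the “augmented generation” half of RAG (Retrieval-Augmented Generation); the retrieval half is connected in a later step.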


In [27]:
# Retrieval chain, document chain
from langchain.chains.combine_documents import create_stuff_documents_chain
# create_stuff_documents_chain gives the LLM context to draw on when answering questions
from langchain_core.prompts import ChatPromptTemplate

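# {context} receives the documents; {input} receives the user's question.
# Without an {input} placeholder the question never reaches the LLM.
prompt = ChatPromptTemplate.from_template(
    """
Answer the following question based only on the provided context:
<context>
{context}
</context>

Question: {input}"""
)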

document_chain = create_stuff_documents_chain(llm, prompt)


### 🧪 Manually Test the Document Chain

Pass a hand-written `Document` as context to see how the chain behaves before connecting it to the retriever.


In [28]:
from langchain_core.documents import Document
document_chain.invoke(
    {
        "input":"The quality and development speed of AI applications depends on what?",
        "context":[Document(page_content="""The quality and development speed of AI applications depends on high-quality evaluation datasets and metrics to test and optimize your applications on. The LangSmith SDK and UI make building and running high-quality evaluations easy.Get started by creating your first evaluation.Quickly assess the performance of your application using our off-the-shelf evaluators as a starting point.Analyze results of evaluations in the LangSmith UI and compare results over time.Easily collect human feedback on your data to improve your application.""")]
    }
)

'According to the provided text, the LangSmith SDK and UI make it easy to build and run high-quality evaluations for AI applications. \n'
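
A rough way to picture what "stuffing" means: the chunk texts are joined into one string and substituted into the template. The sketch below uses plain `str.format` as a stand-in for the prompt template, and `stuff_documents` is a hypothetical helper, not LangChain's implementation; the blank-line separator is an assumption. It also shows why the template needs a placeholder for the question: a value with no matching placeholder is silently dropped.

```python
def stuff_documents(template, page_contents, question, separator="\n\n"):
    """Join the chunk texts and fill the template's placeholders."""
    context = separator.join(page_contents)
    # str.format ignores keyword arguments that have no placeholder
    return template.format(context=context, input=question)


template_with_question = (
    "Answer the following question based only on the provided context:\n"
    "<context>\n{context}\n</context>\n\n"
    "Question: {input}"
)
template_without_question = (
    "Answer the following question based only on the provided context:\n"
    "<context>\n{context}\n</context>"
)

chunks = [
    "The quality and development speed of AI applications depends on "
    "high-quality evaluation datasets and metrics.",
    "The LangSmith SDK and UI make building and running evaluations easy.",
]
question = "The quality and development speed of AI applications depends on what?"

full_prompt = stuff_documents(template_with_question, chunks, question)
context_only_prompt = stuff_documents(template_without_question, chunks, question)

print(question in full_prompt)          # True
print(question in context_only_prompt)  # False: the model never sees the question
```

When the model only sees the context, it tends to summarize it instead of answering, so it is worth printing the final prompt once while debugging.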

### 🔗 Connect Retriever to Document Chain

This turns our setup into a full RAG chain.  
Retriever fetches relevant chunks → chain injects them into the prompt.


In [29]:
from langchain.chains import create_retrieval_chain

# The retriever fetches the relevant chunks from FAISS;
# the document chain inserts them into the prompt and calls the LLM.
retriever = db.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

### 💡 Get Final Response from RAG Pipeline

The query is passed through the retrieval chain which:
1. Fetches context via FAISS
2. Injects it into prompt
3. Sends to LLM (Groq)

You get the final answer here!
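
The three steps above can be sketched with plain functions to show the data flow and the shape of the result. Everything here is a stand-in: `make_retrieval_chain`, the keyword-overlap retriever, and the stub that replaces the LLM are hypothetical and exist only to illustrate the order of operations. The one detail taken from the real chain is that its result is a dictionary with `input`, `context`, and `answer` keys.

```python
def make_retrieval_chain(retrieve, answer_from_docs):
    """Compose a retriever and a document chain into one callable."""
    def invoke(inputs):
        question = inputs["input"]
        docs = retrieve(question)                       # 1. fetch context
        answer = answer_from_docs(                      # 2 + 3. fill prompt, call model
            {"input": question, "context": docs}
        )
        return {"input": question, "context": docs, "answer": answer}
    return invoke


corpus = [
    "quality depends on evaluation datasets and metrics",
    "tracing shows what your application is doing",
]


def keyword_retriever(query, k=1):
    """Toy retriever: rank texts by the number of words shared with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda text: len(words & set(text.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def stub_document_chain(inputs):
    """Stands in for prompt + LLM; simply echoes the context it was given."""
    return f"(stub answer) based on: {inputs['context'][0]}"


toy_chain = make_retrieval_chain(keyword_retriever, stub_document_chain)
toy_response = toy_chain({"input": "quality depends on what"})

print(list(toy_response))       # ['input', 'context', 'answer']
print(toy_response["context"])  # ['quality depends on evaluation datasets and metrics']
```

Inspecting the `context` entry of a real response in the same way is a quick check that the retriever returned the chunks you expected.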


In [30]:
# get response from the LLM
response = retrieval_chain.invoke({"input":"The quality and development speed of AI applications depends on what?"})
response["answer"]

"LangSmith offers LLM-native observability, which helps you gain insights into your application's performance throughout its development lifecycle.  \n"