# 🧠 Retrieval-Augmented Generation (RAG) with LangChain, Groq & OpenAI
This notebook walks through a minimal RAG pipeline built with the LangChain framework.
It retrieves relevant chunks from a vector store and passes them to an LLM as context for question answering.

Technologies used:
- `LangChain`
- `Groq` (DeepSeek LLM)
- `OpenAI` Embeddings
- `FAISS` Vector Store
- `.env` configuration

## 1. 🔧 Environment Setup
Load environment variables and necessary packages.

In [1]:
from langchain_groq import ChatGroq
from dotenv import load_dotenv
import os

load_dotenv()

True
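
Before calling any API, it can help to confirm that the `.env` file actually supplied the keys the later cells rely on. The sketch below is one way to do that. `missing_env_vars` is a hypothetical helper (not part of LangChain), and the variable list assumes the `.env` file defines `GROQ_API_KEY` for `ChatGroq` alongside the `OPENAI_API_KEY` used for embeddings.

```python
import os

# Hypothetical helper: list the required variables that are unset or empty.
def missing_env_vars(names, environ=None):
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]

# Assumed variable names; adjust to match your own .env file.
REQUIRED = ["GROQ_API_KEY", "OPENAI_API_KEY"]

missing = missing_env_vars(REQUIRED)
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running this right after `load_dotenv()` surfaces a missing key early instead of as an authentication error in a later cell.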

## 2. 🧠 Load the LLM
We use the `deepseek-r1-distill-llama-70b` model served by Groq.

In [3]:
# Initialize the Groq LLM
llm = ChatGroq(model='deepseek-r1-distill-llama-70b')

# Test basic generation
response = llm.invoke('What is the capital of France?')
print(response)

content='<think>\n\n</think>\n\nThe capital of France is Paris.' additional_kwargs={} response_metadata={'token_usage': {'completion_tokens': 12, 'prompt_tokens': 10, 'total_tokens': 22, 'completion_time': 0.046966364, 'prompt_time': 0.000311747, 'queue_time': 0.20253216, 'total_time': 0.047278111}, 'model_name': 'deepseek-r1-distill-llama-70b', 'system_fingerprint': 'fp_1bbe7845ec', 'service_tier': 'on_demand', 'finish_reason': 'stop', 'logprobs': None} id='run--59c2190c-9453-4acd-a045-c2625fda5ead-0' usage_metadata={'input_tokens': 10, 'output_tokens': 12, 'total_tokens': 22}
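
The output above shows that this model wraps its reasoning in `<think>...</think>` tags before the answer. If only the final answer is wanted, the tags can be stripped from the text. This is a minimal sketch using a regular expression. `strip_reasoning` is a hypothetical helper, and it assumes the reasoning always appears in well-formed `<think>` blocks.

```python
import re

# Matches a <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def strip_reasoning(text: str) -> str:
    """Remove <think> blocks and surrounding whitespace (hypothetical helper)."""
    return THINK_BLOCK.sub("", text).strip()

# The content string copied from the output above.
raw = '<think>\n\n</think>\n\nThe capital of France is Paris.'
print(strip_reasoning(raw))
```

In the notebook this would be applied as `strip_reasoning(response.content)`.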


## 3. 📦 Embedding & Vector Store
We use OpenAI embeddings and FAISS as the retriever backend.

In [4]:
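# Integrations live in dedicated packages; the old `langchain.*` import paths are deprecated
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.document_loaders import TextLoader

# OpenAIEmbeddings reads OPENAI_API_KEY from the environment populated by load_dotenv()
embedding_model = OpenAIEmbeddings(model='text-embedding-3-small')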


## 4. 📚 Document Loading & Chunking
We split documents into manageable chunks before embedding.

In [6]:
# Load sample text
loader = TextLoader('sample.txt')  # Replace with your own file; TextLoader raises RuntimeError if it cannot be read
documents = loader.load()

# Split documents into chunks
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(documents)

RuntimeError: Error loading sample.txt
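
To make `chunk_size` and `chunk_overlap` concrete, here is a simplified sketch of fixed-size chunking with overlap in plain Python. It is an illustration only: `chunk_text` is a hypothetical function, and `CharacterTextSplitter` itself first splits on a separator and then merges the pieces, so its real chunks will generally differ from these character windows.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Slide a window of chunk_size characters, stepping by chunk_size - chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks

# Synthetic 1200-character text, used only for illustration.
sample = "".join(str(i % 10) for i in range(1200))
pieces = chunk_text(sample)
print([len(p) for p in pieces])  # [500, 500, 300]
```

The last 50 characters of each chunk repeat at the start of the next one, so a sentence that straddles a boundary still appears whole in at least one chunk.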

## 5. 🧠 Create FAISS Vector Store
Embed the chunks and build the FAISS index.

In [None]:
vectorstore = FAISS.from_documents(chunks, embedding_model)
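
Conceptually, the index maps each chunk to a vector and answers a query by returning the chunks whose vectors are closest to the query vector. The sketch below shows that idea with brute-force cosine similarity over made-up 3-dimensional vectors. It is a stand-in for illustration, not how FAISS is implemented: FAISS uses optimized index structures and may use a different distance metric, and real embeddings have far more dimensions. All names and vectors here are hypothetical.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, indexed, k=2):
    """Return the texts of the k entries most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Toy "index": (chunk text, made-up embedding vector)
toy_index = [
    ("chunk about fine-tuning", [0.9, 0.1, 0.0]),
    ("chunk about tokenizers", [0.1, 0.9, 0.0]),
    ("chunk about benchmarks", [0.7, 0.0, 0.7]),
]
print(top_k([1.0, 0.0, 0.2], toy_index, k=2))
```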

## 6. 🔍 Retrieval & Question Answering
Use the retriever to answer a user question based on retrieved context.

In [None]:
# Expose the vector store as a retriever that returns the 5 most similar chunks
retriever = vectorstore.as_retriever(search_kwargs={'k': 5})

# Retrieve relevant documents for a query
query = 'What are llama2 fine-tuning benchmarks?'
docs = retriever.invoke(query)  # replaces the deprecated get_relevant_documents()

# Combine context with LLM prompt
context = '\n'.join([doc.page_content for doc in docs])
final_prompt = f"Answer the question based on the context below:\n\n{context}\n\nQuestion: {query}"

print(llm.invoke(final_prompt))
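
The prompt assembly above can be factored into a small function so that it is easy to test and to adjust. This is one possible sketch, not the only way to structure a RAG prompt. `build_prompt` is a hypothetical helper, and the chunk numbering and the fallback text for an empty retrieval result are assumptions added for illustration.

```python
def build_prompt(chunks, query):
    """Combine retrieved chunk texts and the question into a single prompt string."""
    if not chunks:
        # Assumed fallback so the model is told explicitly that nothing was retrieved.
        context = "(no relevant context was retrieved)"
    else:
        # Number the chunks so the boundaries between them stay visible to the model.
        context = "\n\n".join(
            f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
        )
    return (
        "Answer the question based on the context below:\n\n"
        f"{context}\n\n"
        f"Question: {query}"
    )

print(build_prompt(["First chunk.", "Second chunk."], "What is in the chunks?"))
```

In the notebook it would be called as `build_prompt([doc.page_content for doc in docs], query)`, and the result passed to `llm.invoke`.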