<a href="https://colab.research.google.com/github/Aparnamol-KS/CodeCompanion-GroqAI/blob/main/Part_C_RAG_Bootcamp_GROQ.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Day 3: Retrieval-Augmented Generation (RAG) with GROQ LLaMA3


This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system using **LangChain** and the **GROQ API** (LLaMA3 models).  
We'll walk through creating embeddings, storing them in a vector database (Chroma), and querying them through a custom LangChain-compatible `GroqLLM` wrapper.

## RAG = Retriever + Reader (LLM)


In [None]:

!pip install -U langchain langchain-community openai chromadb faiss-cpu sentence-transformers tiktoken groq



## Architecture Overview
```text
[User Query] → [Retriever → VectorDB] → [Relevant Chunks] → [GROQ LLM] → [Answer]
```
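
To make the flow in the diagram concrete, here is a minimal, dependency-free sketch of the same stages. Everything in it is illustrative and not part of LangChain or GROQ: `retrieve` ranks chunks by shared words instead of embeddings, and `fake_llm` is a hypothetical stand-in for the model call.

```python
import re
from typing import Callable, List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Toy retriever: rank chunks by how many query words they share."""
    query_words = tokenize(query)
    ranked = sorted(chunks, key=lambda c: len(query_words & tokenize(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    """Place the retrieved chunks ahead of the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for the LLM; it only reports the prompt size."""
    return f"(model reply to a {len(prompt)}-character prompt)"


def rag_answer(query: str, chunks: List[str],
               llm: Callable[[str], str] = fake_llm, k: int = 2) -> str:
    """Query -> retriever -> relevant chunks -> LLM -> answer."""
    context = retrieve(query, chunks, k)
    return llm(build_prompt(query, context))


chunks = [
    "Chroma stores embedding vectors on disk.",
    "The retriever returns the chunks most similar to the query.",
    "LLaMA3 generates the final answer from the retrieved context.",
]
top = retrieve("Which chunks does the retriever return?", chunks, k=1)
print(top[0])
print(rag_answer("Which chunks does the retriever return?", chunks))
```

The rest of the notebook replaces each toy piece with a real one: embeddings and Chroma for `retrieve`, and the GROQ-hosted model for `fake_llm`.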


In [None]:

from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain.text_splitter import RecursiveCharacterTextSplitter
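from langchain_community.document_loaders import TextLoader  # loaders live in langchain-community
from langchain.chains import RetrievalQA
from langchain_core.language_models.llms import LLM  # base class for custom LLM wrappers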

from typing import List, Optional
from groq import Groq
import os


## Step 1: Load Sample README File

In [None]:

sample_text = '''# Sample Project

This project demonstrates an example of a LangChain-powered RAG pipeline. It uses Chroma for vector search and a GROQ-hosted LLaMA3 model for response generation.

## Features

- Document embedding
- Vector similarity search
- LLM-based QA over documents
'''

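# Write the sample README to disk so TextLoader can read it back
with open("sample_readme.txt", "w", encoding="utf-8") as f:
    f.write(sample_text)

# Load the file as a list of LangChain Document objects
loader = TextLoader("sample_readme.txt", encoding="utf-8")
documents = loader.load()

# Split into chunks of at most 500 characters, with 50 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = text_splitter.split_documents(documents)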
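
To see what `chunk_size` and `chunk_overlap` mean, the sketch below slides a fixed-size window over a string. It is a simplification: `RecursiveCharacterTextSplitter` prefers to break on separators such as blank lines, newlines, and spaces before falling back to raw characters, so its real output will generally differ. The function name `split_with_overlap` is hypothetical.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


pieces = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1)
print(pieces)  # ['abcd', 'defg', 'ghij']
```

Each chunk starts with the last character of the previous one, which is the overlap. Overlap helps a sentence that straddles a boundary stay retrievable from at least one chunk.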


## Step 2: Create Embeddings & Store in Chroma

In [None]:

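# Embed each chunk with a local sentence-transformers model
embedding = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
# Index the chunks in Chroma and keep the index in the rag_chroma_groq directory
vectorstore = Chroma.from_documents(docs, embedding, persist_directory="rag_chroma_groq")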
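
Conceptually, the vector store answers a query by comparing the query's embedding with every stored chunk embedding and returning the closest ones. The sketch below shows that idea with cosine similarity on tiny hand-made vectors. It is an assumption-laden illustration: real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and Chroma's own distance metric and index structure are configurable and may differ. The helper names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float], doc_vecs: List[Sequence[float]],
          k: int = 1) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar document vectors."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Three toy "chunk embeddings" and one toy "query embedding"
doc_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
query_vec = [1.0, 0.1, 0.0]
best_index, best_score = top_k(query_vec, doc_vecs, k=1)[0]
print(best_index, round(best_score, 3))
```

A brute-force scan like this is fine for a handful of chunks. Vector databases exist to make the same lookup fast at much larger scale.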


## Step 3: Define GROQ LLM Wrapper

In [None]:

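class GroqLLM(LLM):
    """Minimal LangChain LLM wrapper around the GROQ chat completions API."""

    model: str = "llama3-8b-8192"
    api_key: str = ""  # Leave empty to fall back to the GROQ_API_KEY environment variable
    temperature: float = 0.0

    def _call(self, prompt: str, stop: Optional[List[str]] = None, run_manager=None, **kwargs) -> str:
        # Prefer an explicitly passed key; otherwise read it from the environment
        client = Groq(api_key=self.api_key or os.environ.get("GROQ_API_KEY"))

        messages = [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ]

        response = client.chat.completions.create(
            model=self.model,
            messages=messages,
            temperature=self.temperature,
            stop=stop,  # Forward stop sequences instead of silently ignoring them
        )

        return response.choices[0].message.content

    @property
    def _llm_type(self) -> str:
        # Identifier LangChain uses when logging or serializing this LLM
        return "groq-llm"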
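
Rather than pasting a key into the notebook, one option is to read it from an environment variable and fail early with a clear message if it is missing. The helper below is a hypothetical sketch, not part of the `groq` package, and the variable name `GROQ_API_KEY` is an assumed convention.

```python
import os


def get_groq_api_key(env_var: str = "GROQ_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is unset or blank."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before calling GROQ.")
    return key
```

With this in place, the wrapper could be created as `GroqLLM(api_key=get_groq_api_key())`, which keeps the secret out of the saved notebook.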


## Step 4: Build RAG Pipeline with GROQ

In [None]:

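retriever = vectorstore.as_retriever()
# Read the key from the GROQ_API_KEY environment variable rather than hard-coding it
groq_llm = GroqLLM(api_key=os.environ.get("GROQ_API_KEY", ""))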

qa_chain = RetrievalQA.from_chain_type(
    llm=groq_llm,
    retriever=retriever,
    return_source_documents=True
)


## Step 5: Ask a Question

In [None]:

query = "What does this project demonstrate?"
result = qa_chain.invoke({"query": query})  # invoke() replaces the deprecated direct call
print("Answer:", result["result"])
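
Because the chain was built with `return_source_documents=True`, the result dictionary also carries the retrieved chunks under `"source_documents"`. The sketch below formats them for display. `format_sources` and `FakeDocument` are hypothetical names: `FakeDocument` only mimics the `page_content` and `metadata` attributes of a LangChain document so the example runs without calling the API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeDocument:
    """Stand-in exposing the two attributes read below."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def format_sources(result: Dict[str, Any], max_chars: int = 80) -> List[str]:
    """Return one line per retrieved chunk: its index, source file, and a short snippet."""
    lines = []
    for i, doc in enumerate(result.get("source_documents", []), start=1):
        source = doc.metadata.get("source", "unknown")
        # Collapse whitespace so multi-line chunks fit on one line
        snippet = " ".join(doc.page_content.split())[:max_chars]
        lines.append(f"[{i}] {source}: {snippet}")
    return lines


demo_result = {
    "result": "It demonstrates a RAG pipeline.",
    "source_documents": [
        FakeDocument("# Sample Project\n\nThis project demonstrates ...",
                     {"source": "sample_readme.txt"})
    ],
}
for line in format_sources(demo_result):
    print(line)
```

Passing the real `result` from the cell above to `format_sources` should work the same way, since `TextLoader` records the file path in each document's `metadata["source"]`.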
