# RAG Workflow with LangChain, Groq API, and FAISS

In this assignment, you will:

1. Load web content using LangChain.
2. Store embeddings of the content in FAISS, a vector database.
3. Use the Groq API as the LLM that answers questions over the retrieved content via Retrieval-Augmented Generation (RAG).

## Objectives

- Learn how to use LangChain for data loading from the web.
- Understand embeddings and how to store/retrieve them with FAISS.
- Implement a RAG workflow using the Groq LLM API.

## Setup and Installation


In [3]:
# Install required libraries
!pip install langchain groq faiss-cpu sentence-transformers beautifulsoup4 langchain-community langchain-groq


## Import Libraries

In [4]:
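import os

# Loader, embeddings, and vector store integrations live in langchain_community;
# the older langchain.* import paths for them are deprecated.
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA
from langchain_groq import ChatGroq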



## Step 1: Load and Prepare Data

Documentation:

- WebBaseLoader: https://python.langchain.com/docs/integrations/document_loaders/web_base/
- RecursiveCharacterTextSplitter: https://python.langchain.com/api_reference/text_splitters/character/langchain_text_splitters.character.RecursiveCharacterTextSplitter.html
- Getting started with embeddings (Hugging Face blog): https://huggingface.co/blog/getting-started-with-embeddings
- FAISS vector store: https://python.langchain.com/docs/integrations/vectorstores/faiss/

In [7]:
# Define the URL to load
url = "https://huggingface.co/blog/getting-started-with-embeddings"
# Load the web content
loader = WebBaseLoader(url)
docs = loader.load()

In [8]:
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
documents = text_splitter.split_documents(docs)
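
To make the `chunk_size=500, chunk_overlap=50` settings concrete, here is a minimal sketch of a fixed-width sliding-window splitter. It is a simplification, not what `RecursiveCharacterTextSplitter` actually does (the real splitter prefers to break on separators such as paragraph and line breaks); the function name `split_with_overlap` and the sample string are hypothetical.

```python
def split_with_overlap(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Naive splitter: each chunk starts (chunk_size - chunk_overlap) characters after the previous one."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks


# Hypothetical sample: 1200 characters of repeating digits
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_with_overlap(sample)

print([len(c) for c in chunks])          # chunk lengths
print(chunks[0][-50:] == chunks[1][:50])  # consecutive chunks share 50 characters
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.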

## Step 2: Generate Embeddings and Store in FAISS

In [9]:
embeddings_model = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")


In [15]:
vectorstore = FAISS.from_documents(documents, embeddings_model)
vectorstore.save_local("faiss_index")

print("Embeddings stored in FAISS index.")

Embeddings stored in FAISS index.
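
Conceptually, a vector store keeps one vector per chunk and returns the chunks whose vectors lie closest to the query vector. The sketch below shows that idea as an exhaustive scan in plain Python, using Euclidean (L2) distance, which I believe is the default for the LangChain FAISS wrapper. The three-dimensional vectors and the helper names `l2_distance` and `top_k` are made up for illustration; real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and FAISS performs the search far more efficiently.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, index, k=2):
    """Return the k (distance, text) pairs closest to query_vec by exhaustive scan."""
    scored = [(l2_distance(query_vec, vec), text) for text, vec in index]
    return sorted(scored)[:k]


# Toy index: (chunk text, made-up embedding)
toy_index = [
    ("Embeddings map text to vectors.", [0.9, 0.1, 0.0]),
    ("FAISS searches vectors efficiently.", [0.1, 0.9, 0.0]),
    ("Groq serves LLMs through an API.", [0.0, 0.1, 0.9]),
]

query_vec = [0.8, 0.2, 0.0]  # pretend this is the embedded question
results = top_k(query_vec, toy_index, k=2)
for dist, text in results:
    print(f"{dist:.3f}  {text}")
```

Smaller distances mean closer matches, so the first result is the chunk whose vector points in nearly the same direction as the query.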


## Step 3: Setup Groq LLM

In [25]:
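from getpass import getpass
# Never hard-code an API key in a notebook; read it from the environment or prompt for it.
if "GROQ_API_KEY" not in os.environ:
    os.environ["GROQ_API_KEY"] = getpass("Enter your Groq API key: ")
llm = ChatGroq(model_name="llama3-8b-8192")
print("Groq LLM setup complete.")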

Groq LLM setup complete.


## Step 4: Retrieval-Augmented Generation (RAG)

In [26]:
retriever = vectorstore.as_retriever()
qa_chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)
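
By default, `RetrievalQA.from_chain_type` uses the "stuff" strategy: the retrieved chunks are concatenated into a single prompt together with the question, and that prompt is sent to the LLM. The sketch below illustrates the idea only. The function `build_stuffed_prompt`, the template wording, and the sample chunks are hypothetical and are not LangChain's actual prompt.

```python
def build_stuffed_prompt(question: str, chunks: list[str]) -> str:
    """Concatenate retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Hypothetical chunks standing in for what the retriever would return
retrieved = [
    "An embedding is a numerical representation of a piece of information.",
    "The representation captures the semantic meaning of what is being embedded.",
]

prompt = build_stuffed_prompt("What are embeddings in NLP?", retrieved)
print(prompt)
```

Because every retrieved chunk is placed in one prompt, this strategy only works while the combined text fits in the model's context window, which is one reason the documents were split into small chunks earlier.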

## Step 5: Querying the System

In [27]:
query = "What are embeddings in NLP?"
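# .run() is deprecated; .invoke() returns a dict with the answer under "result"
answer = qa_chain.invoke({"query": query})["result"]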

In [28]:
print(f"Question: {query}\n")
print(f"Answer: {answer}\n")

Question: What are embeddings in NLP?

Answer: In NLP, an embedding is a numerical representation of a piece of information, such as text, documents, or images. The representation captures the semantic meaning of what is being embedded, making it robust for many industry applications.

