RAG Application using Hugging Face repo

Simple RAG in Google Colab (No API Key, 100% Free)
1. Install Required Libraries

        !pip install langchain faiss-cpu sentence-transformers transformers

        !pip install -U langchain-community

        !pip install pypdf

In [1]:
from google.colab import files

uploaded = files.upload()


Saving apache-kafka-beginner-guide.pdf to apache-kafka-beginner-guide (1).pdf


2. Import Libraries

In [2]:
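# Third-party integrations live in langchain_community (installed in step 1);
# the older langchain.* import paths for them are deprecated.
from langchain_text_splitters import CharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.llms import HuggingFacePipeline
from langchain_community.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA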

from transformers import pipeline


In [3]:
pdf_path = "/content/apache-kafka-beginner-guide.pdf"   # after upload, file will be here
loader = PyPDFLoader(pdf_path)
documents = loader.load()

3. Prepare Knowledge Base

In [4]:
# Optional: a tiny in-memory knowledge base, handy for testing without a PDF
sample_texts = [
    "Java Spring Boot is a framework for building REST APIs and microservices.",
    "Kafka is used for real-time data streaming and messaging.",
    "AWS provides cloud-native services like S3, EC2, and Lambda."
]

# Split the loaded PDF pages into chunks of up to ~500 characters with a 50-character overlap
splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
docs = splitter.split_documents(documents)

# To index the sample texts instead of the PDF, use this line instead:
# docs = splitter.create_documents(sample_texts)
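
To make `chunk_size` and `chunk_overlap` concrete, here is a minimal pure-Python sketch of fixed-size chunking with overlap. It is a simplification and not how `CharacterTextSplitter` is implemented: the LangChain splitter first splits on a separator (a blank line by default) and then merges pieces up to `chunk_size`, so a page without that separator can stay in one piece. The function name `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size=500, chunk_overlap=50):
    """Slice text into windows of chunk_size characters.

    Consecutive windows share chunk_overlap characters, so a sentence cut
    at a window boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


# Hypothetical sample text, repeated to get something longer than one chunk
sample = "Kafka stores streams of records. " * 40
pieces = chunk_text(sample, chunk_size=500, chunk_overlap=50)
print(len(pieces), [len(p) for p in pieces])
```

The last 50 characters of each chunk are repeated at the start of the next one, which is the effect `chunk_overlap=50` is meant to have.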


4. Create Vector Database (FAISS)

In [5]:
# Use HuggingFace embeddings (free)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Build the FAISS vector store from the split chunks (docs), not the raw pages
db = FAISS.from_documents(docs, embeddings)
retriever = db.as_retriever()


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.
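
Conceptually, the retriever embeds the query with the same model and returns the chunks whose vectors lie closest to it. The sketch below illustrates that idea with hand-made 3-dimensional vectors and cosine similarity. It is an illustration only: the toy vectors and the function names are hypothetical, real `all-MiniLM-L6-v2` embeddings have 384 dimensions, and the LangChain FAISS wrapper ranks by Euclidean (L2) distance by default rather than cosine similarity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=2):
    """Return the k texts whose vectors are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy "index": (text, made-up embedding) pairs, not real model output
toy_index = [
    ("Kafka is used for real-time data streaming and messaging.", [0.9, 0.1, 0.0]),
    ("Java Spring Boot is a framework for building REST APIs and microservices.", [0.1, 0.9, 0.1]),
    ("AWS provides cloud-native services like S3, EC2, and Lambda.", [0.0, 0.2, 0.9]),
]

# Made-up embedding standing in for the query "What is Kafka used for?"
toy_query = [0.8, 0.2, 0.1]
print(top_k(toy_query, toy_index, k=1))
```

With the real vector store, the equivalent lookup is `db.similarity_search(query, k=...)`, which the retriever calls for you.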


5. Load a Free HuggingFace Model for Generation

In [6]:
# Use a small open-source model (free to run in Colab CPU)
llm_pipeline = pipeline("text2text-generation", model="google/flan-t5-small")

llm = HuggingFacePipeline(pipeline=llm_pipeline)


Device set to use cpu


6. Build RAG Chain

In [7]:
qa = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=retriever,
    return_source_documents=True
)
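
`RetrievalQA.from_chain_type` defaults to the "stuff" chain type: the retrieved chunks are concatenated into a single prompt together with the question, and that prompt is sent to the LLM. The sketch below shows the idea in plain Python. The function name `build_stuff_prompt` and the prompt wording are hypothetical and are not LangChain's actual template.

```python
def build_stuff_prompt(question, chunks):
    """Concatenate retrieved chunks and the question into one prompt string."""
    context = "\n\n".join(chunks)
    return (
        "Use the following context to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


# Stand-ins for what the retriever would return
retrieved = [
    "Kafka is used for real-time data streaming and messaging.",
    "AWS provides cloud-native services like S3, EC2, and Lambda.",
]
prompt = build_stuff_prompt("What is Kafka used for?", retrieved)
print(prompt)
```

Because everything goes into one prompt, long chunks can exceed what a small model such as `flan-t5-small` handles well. One way to keep the prompt short is to retrieve fewer chunks, for example `db.as_retriever(search_kwargs={"k": 2})`.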


7. Ask Questions

In [9]:
query = "What is Kafka used for?"
result = qa.invoke({"query": query})  # calling qa(query) directly is deprecated

print("Question:", query)
print("Answer:", result["result"])
print("Source:", result["source_documents"][0].page_content)


Question: What is Kafka used for?
Answer: Building applications and systems in need of real-time streaming
Source: What is Apache Kafka?

Apache Kafka is a publish-subscribe (pub-sub) message system that allows messages (also called records) to be sent between processes, applications, and servers. Simply said - Kafka stores streams of records.

A record can include any kind of information. It could, for example, have information about an event that has happened on a website or could be a simple text message that triggers an event so another application may connect to the system and process or reprocess it.

Unlike most messaging systems, the message queue in Kafka (also called a log) is persistent. The data sent is stored until a specified retention period has passed by. Noticeable for Apache Kafka is that records are not deleted when consumed.

An Apache Kafka cluster consists of a chosen number of brokers/servers (also called nodes). Apache Kafka itself is