
# 1. Install & Import Libraries

In [13]:
# Install necessary libraries
!pip install -U langchain langchain-community langchain-text-splitters transformers sentence-transformers faiss-cpu pypdf

# Imports (the loaders, embeddings and vector stores live in langchain_community;
# the older langchain.embeddings / langchain.vectorstores paths are deprecated)
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline



# 2. Load and Preview the PDF

In [15]:
import os
from google.colab import files

# Upload the document manually; files.upload() returns a dict keyed by filename
uploaded = files.upload()

# Rename the uploaded file to 'document.pdf' for convenience,
# using whatever name it was uploaded under instead of a hard-coded one
uploaded_name = next(iter(uploaded))
os.rename(uploaded_name, "document.pdf")


# Load the document
loader = PyPDFLoader("document.pdf")
docs = loader.load()
print(f"Loaded {len(docs)} pages.")
print(docs[0].page_content[:500])


Saving The four pillars of effective communication design [Slides].pdf to The four pillars of effective communication design [Slides].pdf
Loaded 15 pages.
Please do not copy without permission. © ExploreAI 2023.
The four pillars of effective communication design
Design for impactful communication


# 3. Split the Document into Chunks

In [16]:
splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = splitter.split_documents(docs)
print(f"Total Chunks Created: {len(chunks)}")
print(chunks[0].page_content[:300])

Total Chunks Created: 22
Please do not copy without permission. © ExploreAI 2023.
The four pillars of effective communication design
Design for impactful communication
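
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of a plain sliding-window splitter. It is a simplification and not how `RecursiveCharacterTextSplitter` works internally: the real splitter first tries to break on separators such as paragraph breaks, newlines and spaces. The function name `sliding_chunks` and the toy string are made up for illustration.

```python
def sliding_chunks(text, chunk_size=500, chunk_overlap=50):
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper for illustration only; it ignores separators.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = sliding_chunks(sample, chunk_size=10, chunk_overlap=3)
print(pieces)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Each piece repeats the last three characters of the previous one, which is what the overlap is for: a sentence that straddles a boundary still appears whole in at least one chunk.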


# 4. Create Embeddings & Vector Store

In [17]:
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vectorstore = FAISS.from_documents(chunks, embeddings)
retriever = vectorstore.as_retriever()
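
Conceptually, the retriever embeds the question and returns the chunks whose vectors lie closest to it. The sketch below shows that idea with toy two-dimensional vectors and Euclidean distance. It assumes the LangChain FAISS wrapper's default distance is Euclidean (L2) and that the default number of results is 4; both are worth checking against the installed version. `top_k` and the toy vectors are hypothetical and are not part of the pipeline.

```python
import math


def top_k(query_vec, doc_vecs, k=4):
    """Return indices of the k vectors closest to query_vec (Euclidean distance)."""
    distances = [(math.dist(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    distances.sort()  # smallest distance first
    return [idx for _, idx in distances[:k]]


# Toy "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions
toy_docs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
toy_query = [0.9, 0.1]

nearest = top_k(toy_query, toy_docs, k=2)
print(nearest)
# [0, 2]
```

FAISS does the same nearest-neighbour lookup with an optimized index, which matters once there are many more chunks than the 22 produced here.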

# 5. Load the LLM and Define RAG Query Function

In [18]:
model_name = "google/flan-t5-large"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
flan_pipeline = pipeline("text2text-generation", model=model, tokenizer=tokenizer)

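def query_rag(question):
    # Retrieve the chunks most similar to the question
    # (invoke() replaces the deprecated get_relevant_documents())
    relevant_docs = retriever.invoke(question)
    # Concatenate the retrieved chunks into a single context string
    context = "\n".join(doc.page_content for doc in relevant_docs)
    prompt = f"Answer the question using only the context:\n\nContext:\n{context}\n\nQuestion: {question}\n\nAnswer:"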

    response = flan_pipeline(
        prompt,
        max_new_tokens=200,
        temperature=0.9,
        top_k=50,
        top_p=0.9,
        do_sample=True
    )
    return response[0]['generated_text']

Device set to use cpu
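
One thing to watch in `query_rag` is prompt length: several 500-character chunks plus the instructions can approach the input length FLAN-T5 was trained on (512 tokens, as far as I know), so later context may be truncated or handled poorly. A possible mitigation, sketched below, is to cap the context by a character budget before building the prompt. `build_prompt` and the 1500-character default are illustrative assumptions, not part of the original notebook, and a token-based budget would be more precise.

```python
def build_prompt(question, passages, max_context_chars=1500):
    """Join passages in ranked order until the character budget is used up.

    Hypothetical helper; the budget is a rough stand-in for a token limit.
    """
    kept, used = [], 0
    for passage in passages:
        if used + len(passage) > max_context_chars:
            break  # adding this passage would exceed the budget
        kept.append(passage)
        used += len(passage)
    context = "\n".join(kept)
    return (
        "Answer the question using only the context:\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\n\nAnswer:"
    )


# Toy passages: the third one no longer fits in the 1500-character budget
demo = build_prompt("What are the four pillars?", ["a" * 1000, "b" * 400, "c" * 400])
print(len(demo))
```

Inside `query_rag`, the passages would be `[doc.page_content for doc in relevant_docs]`, which keeps the highest-ranked chunks and drops the rest.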


# 6. Sample Queries and Comparison

In [20]:
# Document-based answer
print("🔹 Answer from RAG:")
print(query_rag("Summarize the key points of this document in a paragraph of 200 words."))

# Generic answer without context (optional comparison)
no_context_prompt = "Summarize the key points of a document in a paragraph of 200 words."
print("\n🔸 Answer without document context:")
print(flan_pipeline(no_context_prompt, max_new_tokens=200)[0]['generated_text'])

🔹 Answer from RAG:
The four pillars checklist is a checklist for visualisations.

🔸 Answer without document context:
The following is a summary of the key points of the document.
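
A single question gives only a rough impression of the difference between the two paths. The sketch below runs a list of questions through both and collects the answers side by side. `compare_answers` is a hypothetical helper, and the two stand-in functions exist only so the example runs without loading the model.

```python
def compare_answers(questions, rag_fn, baseline_fn):
    """Collect the RAG answer and the no-context answer for each question."""
    rows = []
    for question in questions:
        rows.append({
            "question": question,
            "rag": rag_fn(question),
            "baseline": baseline_fn(question),
        })
    return rows


# Stand-ins for illustration; in the notebook, pass query_rag and a
# function that calls flan_pipeline on the bare question instead.
def fake_rag(question):
    return f"[grounded in document] {question}"


def fake_baseline(question):
    return f"[no context] {question}"


rows = compare_answers(["What are the four pillars?"], fake_rag, fake_baseline)
for row in rows:
    print(row["question"])
    print("  RAG:     ", row["rag"])
    print("  Baseline:", row["baseline"])
```

Note that the two paths above also differ in decoding: `query_rag` samples with `temperature=0.9`, while the no-context call uses the pipeline defaults. For a like-for-like comparison, both calls should use the same generation settings.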
