<a href="https://colab.research.google.com/github/Freddiecoder99/generative_ai/blob/main/RAG_(Retrieval_Augmented_Generation)_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install -q \
  numpy==2.0.2 \
  fsspec==2025.3.0 \
  packaging==24.2 \
  pillow==11.0.0 \
  torch==2.8.0 \
  torchvision==0.23.0 \
  torchaudio==2.8.0 \
  jedi \
  langchain \
  langchain-core \
  langchain-community \
  langchain-huggingface \
  faiss-cpu \
  sentence-transformers \
  huggingface-hub \
  transformers

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-classic 1.0.0 requires langchain-core<2.0.0,>=1.0.0, but you have langchain-core 0.3.79 which is incompatible.
langchain-classic 1.0.0 requires langchain-text-splitters<2.0.0,>=1.0.0, but you have langchain-text-splitters 0.3.11 which is incompatible.

In [None]:
# Our simple story (like a book page for the bot to "read")
story = """
Once upon a time, there was a brave little robot named Sparky. Sparky lived in a big city made of shiny metal towers. Every day, Sparky helped fix broken bridges and planted digital flowers in parks. One sunny morning, Sparky found a lost puppy bot named Zippy. Zippy was scared and barking error codes! Sparky shared its battery power and taught Zippy to roll on wheels. Together, they explored the city, dodging flying cars and eating binary cookies. From that day, Sparky and Zippy became best friends, fixing the world one spark at a time.
"""

print("Story loaded! Here's a peek:", story[:100] + "...")  # Shows first bit

Story loaded! Here's a peek: 
Once upon a time, there was a brave little robot named Sparky. Sparky lived in a big city made of s...


In [None]:
%pip install -U langchain-community
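
The first install cell reported that `langchain-classic 1.0.0` expects 1.x releases of `langchain-core` and `langchain-text-splitters`, while 0.3.x releases were present. Before moving on, it can help to see which versions actually ended up in the environment. The sketch below uses only the standard library; the `installed_versions` helper and the package list are illustrative and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if it is absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages named in the resolver message, plus the ones this notebook imports.
packages = [
    "langchain",
    "langchain-core",
    "langchain-community",
    "langchain-text-splitters",
    "langchain-classic",
    "langchain-huggingface",
]

for name, ver in installed_versions(packages).items():
    print(f"{name:28s} {ver or 'not installed'}")
```

If the printed versions mix 0.3.x and 1.x releases, pinning the LangChain packages to one release line in the install cell is one possible way to quiet the resolver message.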



In [None]:
from langchain_text_splitters import RecursiveCharacterTextSplitter  # splits text into overlapping chunks
from langchain_huggingface import HuggingFaceEmbeddings  # wraps sentence-transformers models
from langchain_community.vectorstores import FAISS  # in-memory vector index
from langchain_core.documents import Document  # container for one chunk of text

# Step 3a: Chopping the story into small bites
text_splitter = RecursiveCharacterTextSplitter(chunk_size=200, chunk_overlap=50)  # sizes are in characters
chunks = text_splitter.split_text(story)
docs = [Document(page_content=chunk) for chunk in chunks]

print(f"Chopped into {len(chunks)} bites! First one: {chunks[0]}")

# Step 3b: Making magic numbers (embeddings) with a free Hugging Face brain
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")

# Step 3c: Store in FAISS magic box
vector_store = FAISS.from_documents(docs, embeddings)
retriever = vector_store.as_retriever(search_kwargs={"k": 2})  # fetch the 2 closest bites per question

print("Magic box ready! It can now find story parts super fast.")

Chopped into 4 bites! First one: Once upon a time, there was a brave little robot named Sparky. Sparky lived in a big city made of shiny metal towers. Every day, Sparky helped fix broken bridges and planted digital flowers in parks.


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Magic box ready! It can now find story parts super fast.
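
To see what "chop, embed, and find" means without any libraries, here is a toy stand-in for Step 3. It is a sketch under loud assumptions: `chunk_text` uses plain fixed-size character windows (not the separator-aware logic of `RecursiveCharacterTextSplitter`), `embed` counts words (not MiniLM vectors), and `retrieve` ranks by cosine similarity with a brute-force scan (not a FAISS index). All function names and the sample text are made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=200, chunk_overlap=50):
    """Cut text into fixed-size character windows; neighbors share chunk_overlap characters."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def embed(text):
    """Toy embedding: lowercase word counts (a real model returns dense float vectors)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (brute-force scan)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


toy_story = (
    "Sparky helped fix broken bridges. "
    "Sparky found a lost puppy bot named Zippy. "
    "Together they ate binary cookies."
)
bites = chunk_text(toy_story, chunk_size=50, chunk_overlap=10)
for bite in retrieve("Who found the lost puppy?", bites, k=2):
    print(repr(bite))
```

The overlap exists so that a sentence cut at a window boundary still appears whole, or nearly whole, in a neighboring bite. With only `k` bites returned per question, anything outside those bites never reaches the model.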


In [None]:
from langchain_huggingface import HuggingFacePipeline
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch

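# Step 4a: Loading a free chatty AI
# Zephyr-7B has about 7 billion parameters, so its float16 weights alone take
# roughly 14 GB; a GPU runtime is strongly recommended (on CPU it is very slow).
model_id = "HuggingFaceH4/zephyr-7b-beta"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision; recent transformers releases prefer the name `dtype`
    device_map="auto"  # place the weights on a GPU when one is available
)
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=100  # upper bound on the length of each generated answer
)
llm = HuggingFacePipeline(pipeline=pipe)  # wrap the pipeline so LangChain can call it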

# Step 4b: Building the RAG chain (modern way)
template = """Answer the question based only on the following context:
{context}

Question: {question}
Answer:"""

prompt = ChatPromptTemplate.from_template(template)

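def format_docs(docs):
    """Join the retrieved chunks into one context string, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)

# The dict sends the incoming question down two paths: through the retriever
# (then format_docs) to build {context}, and unchanged into {question}.
qa_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt  # fill the template
    | llm  # generate text
    | StrOutputParser()  # return a plain string
)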

print("Smart talker connected! Ready to chat.")

# To use it:
# result = qa_chain.invoke("Your question here")
# print(result)

`torch_dtype` is deprecated! Use `dtype` instead!


Device set to use cpu


Smart talker connected! Ready to chat.
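
The chain in Step 4b can look like magic, so here is the same data flow written as ordinary functions. This is only a sketch: `fake_retriever` returns two fixed, made-up sentences instead of searching a vector store, and no model is called. The template text is copied from the cell above; every function name here is hypothetical.

```python
TEMPLATE = """Answer the question based only on the following context:
{context}

Question: {question}
Answer:"""


def fake_retriever(question):
    """Stand-in for the real retriever: always returns the same two chunks."""
    return ["Sparky is a brave little robot.", "Zippy is a lost puppy bot."]


def format_chunks(chunks):
    """Join chunks with blank lines, mirroring format_docs in the notebook."""
    return "\n\n".join(chunks)


def build_prompt(question):
    """Reproduce the first two chain steps: gather inputs, then fill the template."""
    inputs = {
        # Branch 1: question -> retriever -> formatted context
        "context": format_chunks(fake_retriever(question)),
        # Branch 2: the question passes through unchanged
        "question": question,
    }
    return TEMPLATE.format(**inputs)


print(build_prompt("Who is Sparky?"))
```

In the real chain, the string built this way is what the language model receives, and the model's continuation after `Answer:` is the bot's reply.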


In [None]:
# Ask a question about your story
question = "What is the main idea of the story?"
answer = qa_chain.invoke(question)
print(f"Question: {question}")
print(f"Answer: {answer}")

In [None]:
# Testing with multiple questions
questions = [
    "Who are the main characters?",
    "What happened in the story?",
    "Where does the story take place?",
    "What is the ending?"
]

for q in questions:
    print(f"\n{'='*50}")
    print(f"Q: {q}")
    answer = qa_chain.invoke(q)
    print(f"A: {answer}")


Q: Who are the main characters?
A: Human: Answer the question based only on the following context:
Once upon a time, there was a brave little robot named Sparky. Sparky lived in a big city made of shiny metal towers. Every day, Sparky helped fix broken bridges and planted digital flowers in parks.

From that day, Sparky and Zippy became best friends, fixing the world one spark at a time.

Question: Who are the main characters?
Answer: Sparky and Zippy are the main characters.

Q: What happened in the story?
A: Human: Answer the question based only on the following context:
From that day, Sparky and Zippy became best friends, fixing the world one spark at a time.

Once upon a time, there was a brave little robot named Sparky. Sparky lived in a big city made of shiny metal towers. Every day, Sparky helped fix broken bridges and planted digital flowers in parks.

Question: What happened in the story?
Answer: Sparky, a brave little robot, befriended another robot named Zippy and they work
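
Notice that each recorded answer above repeats the whole prompt before the actual reply. A text-generation pipeline returns the prompt plus the continuation unless it is told otherwise, and the `Human:` prefix most likely comes from `ChatPromptTemplate` rendering the prompt as a chat message. One option is to pass `return_full_text=False` to the `pipeline(...)` call in Step 4a. Another is to trim the echo afterwards, as in the sketch below. The `extract_answer` helper is hypothetical, and it assumes the reply itself never contains the marker `Answer:`.

```python
def extract_answer(generated_text, marker="Answer:"):
    """Return the text after the last marker, or the whole text if the marker is missing."""
    _, found, tail = generated_text.rpartition(marker)
    return tail.strip() if found else generated_text.strip()


# Shortened copy of the first recorded output above.
echoed = (
    "Human: Answer the question based only on the following context:\n"
    "From that day, Sparky and Zippy became best friends, fixing the world one spark at a time.\n\n"
    "Question: Who are the main characters?\n"
    "Answer: Sparky and Zippy are the main characters."
)
print(extract_answer(echoed))
```

To use it with the chain, wrap the call: `extract_answer(qa_chain.invoke(question))`.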

In [None]:
# Interactive chatbot
print("Chatbot ready! Type 'quit' to exit.\n")

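while True:
    question = input("You: ").strip()
    if not question:
        continue  # ignore empty input instead of querying the model
    if question.lower() in {"quit", "exit", "q"}:
        print("Goodbye!")
        break

    try:
        answer = qa_chain.invoke(question)
        print(f"Bot: {answer}\n")
    except Exception as e:  # keep the chat alive if one question fails
        print(f"Error: {e}\n")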