<a href="https://colab.research.google.com/github/rudhra029-source/PRD_RAG/blob/main/PRD_RAG_Chatbot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Install Required Libraries

This cell installs the libraries required to load web pages, create embeddings, build a vector store, and create a UI.

In [32]:
%%capture
%pip install -qqqq \
    faiss-cpu \
    "gradio>=4.0" \
    "langchain>=0.3.16" \
    "langchain-community>=0.3.16" \
    "langchain-openai>=0.2.14" \
    "langchain-text-splitters>=0.3.4" \
    sentence-transformers


API Keys

This cell stores the OpenAI API key in an environment variable so that LangChain can use OpenAI models without the key being passed explicitly to each call. Replace the placeholder value with your own key.

In [33]:
import os

os.environ["OPENAI_API_KEY"] = "sk"
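
As an alternative to pasting the key into the cell, the key could be requested at run time. The sketch below is one assumed way to do that and is not part of the original notebook; `ensure_api_key` is a hypothetical helper that relies only on the standard-library `os` and `getpass` modules.

```python
import getpass
import os

def ensure_api_key(name="OPENAI_API_KEY", env=None, prompt_fn=getpass.getpass):
    """Return the key stored under `name`, prompting for it only if it is missing.

    `ensure_api_key` is a hypothetical helper, not a LangChain or OpenAI API.
    """
    env = os.environ if env is None else env
    if not env.get(name):
        # getpass does not echo the typed value into the cell output
        env[name] = prompt_fn(f"Enter {name}: ")
    return env[name]
```

Calling `ensure_api_key()` once before creating the LLM would leave an existing value untouched and prompt only when the variable is empty or unset.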

PRD Webpage

This cell loads the text content of the chosen web page (the Formlabs PRD article) into a LangChain Document so that the rest of the code can read and process it.


In [26]:
from langchain_community.document_loaders import WebBaseLoader

url = "https://formlabs.com/blog/product-requirements-document-prd-with-template/"
loader = WebBaseLoader(url)
docs = loader.load()

print("Number of documents:", len(docs))
print(docs[0].page_content[:1000])


Number of documents: 1
How to Write a Product Requirements Document (PRD) - With Free Template | FormlabsSkip to Main ContentSelect SiteFormlabsformlabs.comCurrentDentaldental.formlabs.com3D PrintersMaterialsSoftwareApplicationsLearnSupportContactStoreAll PostsGuidesHow to Write a Product Requirements Document (PRD) - With Free TemplateEngineeringGuidesThe Product Requirements Document or PRD describes all aspects of a new idea required or desired to make its realization a success. The PRD is the bridge between the often vague project briefing and the highly detailed engineering implementation plan. It is also known as the Program Of Requirements (POR), Design Specification, or Product Nucleus in some circles. 
The PRD provides designers and developers with a realistic sense of what is required in the product in terms of what it should look and feel like, how it should function, how it should carry out the brand and help the business, and in which ways it should be used.White PaperGuid
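
To make the loading step less opaque, here is a rough standard-library sketch of turning HTML into plain text. It is an illustration under stated assumptions, not how WebBaseLoader is implemented; `TextExtractor` and `html_to_text` are hypothetical names. Joining the text pieces with a space avoids fused words like those visible in the output above.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text and ignore the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())

def html_to_text(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```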

Chunking

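This cell splits the long PRD document into smaller overlapping chunks (up to 800 characters each, with a 200-character overlap) so that each chunk fits comfortably within the model’s context window and neighboring chunks share enough text to preserve meaning across chunk boundaries.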


In [34]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=800,
    chunk_overlap=200,
)

split_docs = text_splitter.split_documents(docs)

print("Number of chunks:", len(split_docs))
print(split_docs[0].page_content[:500])


Number of chunks: 35
How to Write a Product Requirements Document (PRD) - With Free Template | FormlabsSkip to Main ContentSelect SiteFormlabsformlabs.comCurrentDentaldental.formlabs.com3D PrintersMaterialsSoftwareApplicationsLearnSupportContactStoreAll PostsGuidesHow to Write a Product Requirements Document (PRD) - With Free TemplateEngineeringGuidesThe Product Requirements Document or PRD describes all aspects of a new idea required or desired to make its realization a success. The PRD is the bridge between the of
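
The effect of chunk_size and chunk_overlap can be seen with a plain sliding window. This is a simplified sketch, not the algorithm RecursiveCharacterTextSplitter uses (that splitter prefers to break on separators such as paragraph and line breaks); `chunk_text` is a hypothetical helper.

```python
def chunk_text(text, chunk_size=800, chunk_overlap=200):
    """Split `text` into fixed-size windows that overlap by `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks
```

With the notebook's settings, each window starts 600 characters after the previous one, so the last 200 characters of one chunk are repeated at the start of the next.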


Creating Embeddings and Vector Store

This cell converts each text chunk into a numeric vector (embedding) that captures its meaning, using a sentence-transformers model through HuggingFaceEmbeddings. It then builds a FAISS vector store so the system can quickly find the chunks most relevant to a user's question, and exposes the store as a retriever that returns the top 4 chunks.

In [35]:
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

model_name = "sentence-transformers/all-MiniLM-L6-v2"
hf_embeddings = HuggingFaceEmbeddings(model_name=model_name)

vectorstore = FAISS.from_documents(split_docs, hf_embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
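
The retrieval idea can be illustrated without any libraries. The sketch below ranks toy vectors by cosine similarity and returns the indices of the best k. Cosine similarity is an assumption chosen for illustration (the FAISS index used by the notebook may rank by a different distance), and `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k=4):
    """Return the indices of the k chunk vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx) for idx, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]
```

A vector store does the same kind of ranking, but over an index built for speed rather than a linear scan.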


Prompt Template + LCEL (LangChain Expression Language) + LLM RAG QA

This cell defines a RAG chain: the retriever pulls the relevant chunks, the chain inserts them into the system prompt as context, and the LLM answers the question based only on that context.

In [38]:
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Prompt: system + human with {context} and {question}
prompt = ChatPromptTemplate.from_messages([
    (
        "system",
        "You are a helpful assistant. Use ONLY the provided context to answer. "
        "If the information is clearly not in the context, say you don't know.\n\nCONTEXT:\n{context}"
    ),
    ("human", "{question}")
])

def format_docs(docs):
    return "\n\n".join(d.page_content for d in docs)

# LCEL-style RAG chain (like in the class notebook)
rag_chain = (
    {
        "context": retriever | format_docs,
        "question": RunnablePassthrough()
    }
    | prompt
    | llm
    | StrOutputParser()
)
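
The same data flow can be written as ordinary function calls, which may help when reading the piped version. This is a sketch only: `answer_with_rag` is a hypothetical function, and `retrieve` and `generate` are stand-ins for the retriever and the LLM rather than LangChain objects.

```python
SYSTEM_TEMPLATE = (
    "You are a helpful assistant. Use ONLY the provided context to answer. "
    "If the information is clearly not in the context, say you don't know."
    "\n\nCONTEXT:\n{context}"
)

def answer_with_rag(question, retrieve, generate):
    """Run retrieve -> format -> prompt -> generate as plain function calls.

    `retrieve(question)` is assumed to return a list of chunk strings and
    `generate(messages)` a string; both are hypothetical stand-ins.
    """
    chunks = retrieve(question)
    context = "\n\n".join(chunks)  # same joining rule as format_docs
    messages = [
        ("system", SYSTEM_TEMPLATE.format(context=context)),
        ("human", question),
    ]
    return generate(messages)
```

Note that the question is used twice: once as the retrieval query and once as the human message.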


Testing the Prompt

In [39]:
rag_chain.invoke("What is a product requirements document?")


'A Product Requirements Document (PRD) describes all aspects of a new idea required or desired to make its realization a success. It serves as a bridge between the often vague project briefing and the highly detailed engineering implementation plan. The PRD outlines various components such as hardware, software, user experience, and more, and it makes development goals specific by using clear language and the SMART-principle (specific, measurable, achievable, relevant, and time-bound).'

Chat History

This cell adds a simple conversation memory and a chat() function that combines the most recent exchanges (up to five) with the new question before sending the combined text through the RAG chain.

In [30]:
# Simple chat history list: each item is (user_message, assistant_answer)
chat_history = []

def chat(question: str):
    """
    Conversational wrapper around rag_chain.
    Keeps recent history so the model can understand follow-up questions.
    """
    global chat_history

    # 1) Build a text version of recent conversation (last 5 exchanges)
    history_text = ""
    for user_msg, assistant_msg in chat_history[-5:]:
        history_text += f"User: {user_msg}\nAssistant: {assistant_msg}\n"

    # 2) Combine history + new question into one 'composed' question
    composed_question = f"""
Here is the recent conversation:

{history_text}

The user now asks: {question}

If needed, use the previous messages to understand what the user is referring to.
"""

    # 3) Call your existing RAG chain with this composed question
    answer = rag_chain.invoke(composed_question)

    # 4) Save this turn in history and return the answer
    chat_history.append((question, answer))
    return answer
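
Because the chain passes its input to both the retriever and the prompt, the composed text also serves as the retrieval query. One possible refinement, sketched here as an assumption rather than the notebook's method, is to build the text in a small pure function and send the bare question when there is no history yet. `compose_question` is a hypothetical helper.

```python
def compose_question(history, question, max_turns=5):
    """Build the text sent to the chain from recent turns plus the new question.

    `history` is a list of (user_message, assistant_answer) tuples.
    """
    # history[-0:] would return the whole list, so handle max_turns <= 0 explicitly
    recent = history[-max_turns:] if max_turns > 0 else []
    if not recent:
        # No earlier turns: the retrieval query is just the question
        return question

    history_text = "\n".join(f"User: {u}\nAssistant: {a}" for u, a in recent)
    return (
        "Here is the recent conversation:\n\n"
        f"{history_text}\n\n"
        f"The user now asks: {question}\n\n"
        "If needed, use the previous messages to understand what the user is referring to."
    )
```

Keeping this step free of side effects also makes it easy to check without calling the model.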


Chat History Testing

In [31]:
# First question
print("Q1:", "What is a product requirements document?")
print("A1:", chat("What is a product requirements document?"))

# Follow‑up question referring to the first answer
print("\nQ2:", "What does it say about the objectives section?")
print("A2:", chat("What does it say about the objectives section?"))


Q1: What is a product requirements document?
A1: A Product Requirements Document (PRD) describes all aspects of a new idea required or desired to make its realization a success. It serves as a bridge between the often vague project briefing and the highly detailed engineering implementation plan. The PRD outlines the project's objectives, user needs, goals, and various requirements that must be met for the product to be successful. It is an essential tool for the design and product development process.

Q2: What does it say about the objectives section?
A2: The objectives section of the Product Requirements Document (PRD) includes a description of the project’s objectives, which encompasses the overall vision for the product and high-level goals for the company. It should also include success metrics, Key Performance Indicators (KPIs), and a timeframe wherever possible.


Gradio Chat UI

In [40]:
import gradio as gr

def gradio_chat(user_message, history):
    """
    Gradio wrapper around your chat() function.
    history is a list of [user, assistant] pairs from the UI.
    """
    global chat_history
    # sync internal history with what Gradio shows
    chat_history = [(u, a) for (u, a) in history]

    # use your conversational RAG bot
    answer = chat(user_message)

    # update UI history
    history.append((user_message, answer))
    return "", history

with gr.Blocks() as demo:
    gr.Markdown("# PRD Q&A Chatbot")
    gr.Markdown("Ask questions about the Product Requirements Document article.")

    chatbot = gr.Chatbot()
    msg = gr.Textbox(label="Your question")
    clear = gr.Button("Clear chat")

    msg.submit(gradio_chat, [msg, chatbot], [msg, chatbot])
    clear.click(lambda: None, None, chatbot, queue=False)

demo.launch()




It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://5f3e6b942cde1a3fcb.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
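
The wrapper above assumes the Chatbot passes history as (user, assistant) pairs. Recent Gradio versions also support a list of role/content dictionaries when the component is created with `type="messages"`. If the notebook were switched to that format, a converter along these lines could keep chat() unchanged. This is a sketch; `messages_to_pairs` is a hypothetical helper, and it assumes each message's content is a plain string.

```python
def messages_to_pairs(messages):
    """Convert [{"role": ..., "content": ...}, ...] into (user, assistant) tuples."""
    pairs = []
    pending_user = None
    for message in messages:
        role = message.get("role")
        content = message.get("content")
        if role == "user":
            pending_user = content
        elif role == "assistant" and pending_user is not None:
            pairs.append((pending_user, content))
            pending_user = None
    # A trailing user message with no reply yet is left out
    return pairs
```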




Testing the RAG chain and chat() function


In [20]:
rag_chain.invoke("What is a product requirements document?")


'A Product Requirements Document (PRD) describes all aspects of a new idea required or desired to make its realization a success. It serves as a bridge between the often vague project briefing and the highly detailed engineering implementation plan. The PRD outlines various components such as hardware, software, user experience, and more, and makes development goals specific using the SMART-principle (specific, measurable, achievable, relevant, and time-bound).'

In [21]:
chat("What is a product requirements document?")


'A Product Requirements Document (PRD) describes all aspects of a new idea required or desired to make its realization a success. It serves as a bridge between the often vague project briefing and the highly detailed engineering implementation plan. The PRD is also known as the Program Of Requirements (POR), Design Specification, or Product Nucleus in some circles. It is an essential tool for the design and product development process, helping to evaluate future obstacles, establish evaluation criteria for success, convey product priorities, and improve multidisciplinary teamwork.'