# **Generative AI Project: RAG-based Conversational Assistant for Standards and Design Guidelines**

## Project Overview
This project implements a Retrieval-Augmented Generation (RAG) conversational assistant for querying standards and design guidelines. It uses the Groq-hosted Llama-3.3-70b-Specdec language model to generate detailed responses and HuggingFace's sentence-transformers/all-MiniLM-L6-v2 to create document embeddings. The system loads PDF documents from a local directory, stores their embeddings in FAISS, and uses chat history to interpret follow-up questions, so that answers are based on the content of the loaded documents.

In [45]:
import os
from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain_groq import ChatGroq
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain.chains import create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

In [46]:
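import os
from dotenv import load_dotenv

# Read HF_TOKEN and GROQ_API_KEY from a local .env file
load_dotenv()

# os.environ only accepts strings, so set HF_TOKEN only when it is defined
hf_token = os.getenv("HF_TOKEN")
if hf_token:
    os.environ["HF_TOKEN"] = hf_token
groq_api_key = os.getenv("GROQ_API_KEY")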


### **1 : Load PDF documents**

* Purpose: Load PDF documents from the directory for further processing.
* Explanation: The PyPDFDirectoryLoader extracts text from all PDFs in the specified directory.

In [47]:
# Step 1: Load PDF documents
from langchain_community.document_loaders import PyPDFDirectoryLoader
loader = PyPDFDirectoryLoader("standards")
docs = loader.load()
docs

[Document(metadata={'source': 'standards\\Ernst_Neufert_ARCHITECTS_DATA.pdf', 'page': 0}, page_content="Ernst Neufert \nARCHITECTS' DATA \nSecond (International) English Edition \nGeneral editor Vincent Jones \nEditorial consultant George Atkinson OBE BAArch) RIBA \nUSA editor Wm Dudley Hunt Jr BSc BArch FAIA \nEditor John Thackara \nDeputy editor Richard Miles \nb \nBlackwell \nScience \nThis document \nL \ncontains 447 pages \n"),
 Document(metadata={'source': 'standards\\Ernst_Neufert_ARCHITECTS_DATA.pdf', 'page': 1}, page_content="© 1980 by \nBlackwell Science Ltd \nEditorial Offices: \nOsney Mead, Oxford 0X2 OEL \n25 John Street, London WC1 N 2BL \n23 Ainslie Place, Edinburgh EH3 6AJ \n350 Main Street, Maiden \nMA 02148 5018, USA \n54 University Street, Canton \nVictoria 3053, Australia \n10, rue Casimir Delavigne \n75006 Paris, France \nOther Editorial Offices: \nBlackwell Wissenschafts-Venlag GmbH \nKurfurstendamm 57 \n10707 Berlin, Germany \nBlackwell Science KK \nMG Koderimach

### **2: Initialize Groq LLM**
* Purpose: Use the Groq LLM for generating responses based on retrieved information.
* Explanation: The Groq LLM is configured with the API key to enable communication.

In [67]:
# Step 2: Initialize Groq LLM
from langchain_groq import ChatGroq
llm = ChatGroq(model = "Llama-3.3-70b-Specdec", groq_api_key = groq_api_key)

### **3: Initialize embeddings**
* Purpose: Generate vector embeddings for the documents.
* Explanation: sentence-transformers/all-MiniLM-L6-v2 converts text into numerical vectors for similarity-based retrieval.

In [49]:
# Step 3: Initialize embeddings
from langchain_huggingface.embeddings import HuggingFaceEmbeddings
embeddings = HuggingFaceEmbeddings(model_name ="sentence-transformers/all-MiniLM-L6-v2")
embeddings


HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2', cache_folder=None, model_kwargs={}, encode_kwargs={}, multi_process=False, show_progress=False)
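
To make "similarity-based retrieval" concrete, the sketch below compares hand-written toy vectors with cosine similarity. The vectors and the `cosine_similarity` helper are illustrative assumptions, not output from all-MiniLM-L6-v2 (which produces 384-dimensional vectors).

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional "embeddings", made up for illustration only.
chair_query = [0.9, 0.1, 0.0]
chair_chunk = [0.8, 0.2, 0.1]
hospital_chunk = [0.1, 0.0, 0.9]

sim_related = cosine_similarity(chair_query, chair_chunk)
sim_unrelated = cosine_similarity(chair_query, hospital_chunk)
print(f"related: {sim_related:.3f}, unrelated: {sim_unrelated:.3f}")
```

A chunk whose vector points in nearly the same direction as the query vector scores close to 1.0, which is why it would be ranked ahead of the unrelated chunk.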

### **4: Splitting the documents**
* Purpose: Split large documents into smaller chunks for better retrieval accuracy.
* Explanation: The splitter creates chunks with overlap, ensuring context continuity.

In [50]:
# Step 4: Splitting the documents
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 10000, chunk_overlap = 1000)
splits = text_splitter.split_documents(docs)
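
The effect of `chunk_size` and `chunk_overlap` can be sketched with a plain sliding window over characters. This is a simplified stand-in, not the algorithm used by `RecursiveCharacterTextSplitter` (which also tries to break on separators such as paragraphs and sentences); the `chunk_with_overlap` helper is hypothetical.

```python
def chunk_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into fixed-size windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = chunk_with_overlap(sample, chunk_size=10, chunk_overlap=3)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last three characters of the previous one, which is the "context continuity" the overlap is meant to provide. According to its model card, all-MiniLM-L6-v2 truncates input beyond 256 word pieces by default, so it may be worth checking whether `chunk_size = 10000` characters leaves most of each chunk unembedded.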

### **5: Use FAISS for vector storage**
* Purpose: Store document embeddings in a vector database for efficient retrieval.
* Explanation: FAISS is a fast similarity search library that indexes and retrieves vectors.

In [51]:
# Step 5: Use FAISS for vector storage
from langchain_community.vectorstores import FAISS
vector_store = FAISS.from_documents(documents = splits, embedding = embeddings)
retriever = vector_store.as_retriever()
retriever


VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000019D9336BDF0>, search_kwargs={})
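
Conceptually, the retriever embeds the query and returns the chunks whose vectors lie closest to it. The brute-force sketch below shows that idea with Euclidean distance over toy vectors. It is only an assumption-laden illustration: FAISS uses optimized index structures, and the `nearest_chunks` helper, the vectors, and the value of `k` are all hypothetical.

```python
import math


def nearest_chunks(query_vec, indexed, k=2):
    """Return the texts of the k vectors closest to query_vec (Euclidean distance)."""
    ranked = sorted(indexed, key=lambda item: math.dist(query_vec, item[1]))
    return [text for text, _ in ranked[:k]]


# Hypothetical (text, vector) pairs standing in for embedded document chunks.
index = [
    ("hospital ward layout", [0.9, 0.1, 0.0]),
    ("dining chair dimensions", [0.1, 0.9, 0.0]),
    ("stair riser heights", [0.0, 0.2, 0.9]),
]

query_vector = [0.2, 0.8, 0.1]  # toy embedding of a question about chairs
results = nearest_chunks(query_vector, index, k=2)
print(results)
```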

### **6: Create prompt templates**
* Purpose: Define reusable prompt templates for the system and contextualized question reformulation.
* Explanation: Prompts guide the LLM in generating specific responses based on provided context and history.

In [68]:
# Step 6: Create prompt templates
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

## contextualize_question_system_prompt
contextualize_question_system_prompt = (
    "Based on the chat history and the most recent user question, "
    "which may refer to previous context in the conversation, "
    "reformulate the question so it can be understood independently of the chat history. "
    "Do not provide an answer; simply rephrase the question if necessary, otherwise return it unchanged."
)

contextualize_question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_question_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)


## System prompt
system_prompt = (
    # Adjacent string literals are joined with no separator,
    # so each sentence ends with a trailing space.
    "You are a highly knowledgeable assistant tasked with providing accurate answers to questions. "
    "Utilize the given pieces of retrieved information to craft your response. "
    "If the answer is not present in the provided context, indicate that you do not know. "
    "Your responses should be detailed and comprehensive."
    "\n\n"
    "{context}"
)

question_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}")
    ]
)
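
As a rough mental model of how such a template is filled at run time, the sketch below assembles a message list from a system template, the stored chat history, and the new user input. It is a simplified stand-in for `ChatPromptTemplate` and `MessagesPlaceholder`, not their implementation; `build_messages` and the sample messages are hypothetical.

```python
def build_messages(system_template, chat_history, user_input, **variables):
    """Fill the system template, then splice the history in before the new question."""
    system_text = system_template.format(**variables)
    return [("system", system_text), *chat_history, ("human", user_input)]


# Made-up history and context, for illustration only.
history = [
    ("human", "What is the minimum corridor width?"),
    ("ai", "I do not know."),
]
messages = build_messages(
    "Answer using this context:\n\n{context}",
    history,
    "And for stairs?",
    context="(retrieved text goes here)",
)
for role, text in messages:
    print(f"{role}: {text}")
```

The history sits between the system message and the latest question, which mirrors the order of the entries passed to `from_messages`.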



### **7: Create retriever and history-aware retriever**
* Purpose: Build a retriever that considers chat history when retrieving documents.
* Explanation: The retriever ensures relevance by reformulating queries based on chat history.

In [69]:
# Step 7: Create retriever and history-aware retriever
from langchain.chains import create_history_aware_retriever, create_retrieval_chain
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_question_prompt)
history_aware_retriever


RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
| VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000019D9336BDF0>, search_kwargs={}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], typing.Annotated[langchain_core.messages.ai.AIMessageChunk, Tag(tag='A
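
The printed `RunnableBranch` shows two paths: with no chat history the raw input goes straight to the retriever, otherwise the question is first reformulated and the result is retrieved. A minimal sketch of that control flow, assuming hypothetical stand-in functions `fake_rewrite` (in place of the LLM call) and `fake_retrieve` (in place of the FAISS retriever):

```python
def history_aware_retrieve(inputs, rewrite_question, retrieve):
    """Retrieve directly when there is no history, otherwise rewrite the question first."""
    if not inputs.get("chat_history"):
        return retrieve(inputs["input"])
    standalone = rewrite_question(inputs["chat_history"], inputs["input"])
    return retrieve(standalone)


seen_queries = []  # records what actually reaches the retriever


def fake_retrieve(query):
    seen_queries.append(query)
    return [f"chunk for: {query}"]


def fake_rewrite(chat_history, question):
    # A real LLM would use the history; this stub just appends a fixed hint.
    return f"{question} (about dining chairs)"


first = history_aware_retrieve({"input": "Seat height?"}, fake_rewrite, fake_retrieve)
follow_up = history_aware_retrieve(
    {
        "input": "And the seat depth?",
        "chat_history": [("human", "Seat height?"), ("ai", "(earlier answer)")],
    },
    fake_rewrite,
    fake_retrieve,
)
print(seen_queries)
```

Only the follow-up question is rewritten, so the first turn of a session costs one LLM call fewer.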

### **8: Combine documents**
* Purpose: Combine retrieved documents for processing by the LLM.
* Explanation: The chain organizes retrieved chunks into a coherent structure for answering questions.

In [70]:
# Step 8: Combine documents
from langchain.chains.combine_documents import create_stuff_documents_chain
question_answer_chain = create_stuff_documents_chain(llm, question_prompt)
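
"Stuffing" here means concatenating the retrieved chunks and placing the result into the prompt's `{context}` slot. The sketch below shows that idea only; the `stuff_documents` helper, the separator, and the sample chunks are assumptions for illustration rather than the library's code.

```python
def stuff_documents(chunks, system_template, separator="\n\n"):
    """Join all retrieved chunks and insert them into the {context} placeholder."""
    context = separator.join(chunks)
    return system_template.format(context=context)


# Made-up chunk texts standing in for retrieved page content.
retrieved = ["First retrieved chunk.", "Second retrieved chunk."]
prompt_text = stuff_documents(retrieved, "Use this context to answer.\n\n{context}")
print(prompt_text)
```

Because every retrieved chunk is placed in a single prompt, the combined length of the chunks has to fit within the model's context window.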

### **9: Create retrieval chain**
* Purpose: Link the retriever and document combiner into a RAG pipeline.
* Explanation: The retrieval chain integrates retrieval and answer generation.

In [71]:
# Step 9: Create retrieval chain
from langchain.chains import create_retrieval_chain
rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
rag_chain

RunnableBinding(bound=RunnableAssign(mapper={
  context: RunnableBinding(bound=RunnableBranch(branches=[(RunnableLambda(lambda x: not x.get('chat_history', False)), RunnableLambda(lambda x: x['input'])
           | VectorStoreRetriever(tags=['FAISS', 'HuggingFaceEmbeddings'], vectorstore=<langchain_community.vectorstores.faiss.FAISS object at 0x0000019D9336BDF0>, search_kwargs={}))], default=ChatPromptTemplate(input_variables=['chat_history', 'input'], input_types={'chat_history': list[typing.Annotated[typing.Union[typing.Annotated[langchain_core.messages.ai.AIMessage, Tag(tag='ai')], typing.Annotated[langchain_core.messages.human.HumanMessage, Tag(tag='human')], typing.Annotated[langchain_core.messages.chat.ChatMessage, Tag(tag='chat')], typing.Annotated[langchain_core.messages.system.SystemMessage, Tag(tag='system')], typing.Annotated[langchain_core.messages.function.FunctionMessage, Tag(tag='function')], typing.Annotated[langchain_core.messages.tool.ToolMessage, Tag(tag='tool')], ty
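
The printed structure suggests that the chain first assigns a `context` key (the retrieved documents) and then an `answer` key, while keeping the original inputs. A minimal sketch of that data flow, assuming hypothetical stub functions in place of the retriever and the question-answer chain:

```python
def run_retrieval_chain(inputs, retrieve, answer_from_context):
    """Add 'context' and then 'answer' to the input dictionary."""
    context = retrieve(inputs)
    answer = answer_from_context({**inputs, "context": context})
    return {**inputs, "context": context, "answer": answer}


def fake_retriever(inputs):
    # Stub: a real retriever would return matching document chunks.
    return [f"chunk about {inputs['input']}"]


def fake_answer_chain(inputs):
    # Stub: a real chain would send the stuffed prompt to the LLM.
    return f"Answered from {len(inputs['context'])} chunk(s)"


result = run_retrieval_chain(
    {"input": "stair dimensions", "chat_history": []},
    fake_retriever,
    fake_answer_chain,
)
print(sorted(result))
```

This is why the sample invocations read the generated text from the `answer` key of the response.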

### **10: Manage chat history**
* Purpose: Maintain session-based chat history for context retention.
* Explanation: ChatMessageHistory stores interactions by session ID.

In [72]:
# Step 10: Manage chat history
from langchain_community.chat_message_histories import ChatMessageHistory
from langchain_core.chat_history import BaseChatMessageHistory

store = {}

def get_session_history(session_id: str) -> BaseChatMessageHistory:
    if session_id not in store:
        store[session_id] = ChatMessageHistory()
    return store[session_id]


### **11: Add Runnable with message history**
* Purpose: Integrate the RAG chain with message history for conversational capabilities.
* Explanation: RunnableWithMessageHistory enables history-aware responses.

In [73]:
# Step 11: Add Runnable with message history
from langchain_core.runnables.history import RunnableWithMessageHistory
conversational_rag_chain = RunnableWithMessageHistory(
    rag_chain,
    get_session_history,
    input_messages_key = "input",
    history_messages_key = "chat_history",
    output_messages_key = "answer"
)
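
The behaviour configured above can be pictured as a thin wrapper: look up the session's history, pass it to the chain under `chat_history`, then append the new question and answer. The sketch below illustrates that idea with plain lists and a stub chain; `invoke_with_history`, `session_store`, and `echo_chain` are hypothetical names, and this is not how `RunnableWithMessageHistory` is implemented.

```python
from typing import Callable

session_store: dict[str, list[tuple[str, str]]] = {}


def get_history(session_id: str) -> list[tuple[str, str]]:
    """Return the message list for a session, creating it on first use."""
    return session_store.setdefault(session_id, [])


def invoke_with_history(chain: Callable[[dict], dict], user_input: str, session_id: str) -> dict:
    history = get_history(session_id)
    # Pass a copy so the chain sees only messages from earlier turns.
    result = chain({"input": user_input, "chat_history": list(history)})
    history.append(("human", user_input))
    history.append(("ai", result["answer"]))
    return result


def echo_chain(inputs: dict) -> dict:
    # Stub chain that reports how much history it received.
    return {"answer": f"{len(inputs['chat_history'])} earlier messages"}


a1 = invoke_with_history(echo_chain, "first question", "user_1")
a2 = invoke_with_history(echo_chain, "follow-up", "user_1")
b1 = invoke_with_history(echo_chain, "unrelated", "user_2")
print(a1["answer"], "|", a2["answer"], "|", b1["answer"])
```

Histories are keyed by session ID, so a second session starts with an empty history even though the first one already holds two messages.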

### Sample Invocations
* Purpose: Query the system with a hospital design-related question.
* Explanation: Responses are generated based on retrieved and history-aware context.

In [74]:
response_1 = conversational_rag_chain.invoke(
    {"input": "What components and design standards, with dimensions, should I follow when designing Hospitals: special units? Give the output in detail."},
    config  = {
        "configurable": {"session_id": "user_1"}
    },
)

print("User_1")
print("\n")
print(f'''response_1: {response_1["answer"]}''')

User_1


response_1: Designing hospital special units requires careful consideration of various components and design standards to ensure efficient and effective patient care. Here's a detailed overview of the key components and design standards with dimensions to follow:

**1. Intensive Care Units (ICUs):**

* Bed unit basic space module: 11-15 m²/unit
* Number of beds: 6-7 maximum per unit
* Distance from control station or viewpoint to patient: minimal, so that equipment can be read and patient can be seen
* Nurse/physician station: 4-10 m²
* Support area for medication station, utility, and treatment: 11-15 m²
* Amenities such as rest room, locker, and WC: 0.5-1.0 m² per bed
* Design considerations:
	+ Will patient be conscious, require privacy, toilet, constant nursing attention?
	+ Will location or configuration of unit help or hinder patient's recovery?
	+ Can staff see all patients easily? Is ratio patients/staff station appropriate?

**2. Neurosurgical Units:**

* Ratio of pop

* Purpose: Query the system about furniture design standards.
* Explanation: The system retrieves relevant information and generates a detailed response.

In [75]:
response_2 = conversational_rag_chain.invoke(
    {"input": "What are the body measurements for a dining chair under the section DIMENSIONS & SPACE REQUIREMENTS?"},
    config  = {
        "configurable": {"session_id": "user_1"}
    },
)

print("User_1")
print("\n")
print(f'''response_2: {response_2["answer"]}''')

User_1


response_2: According to the provided text under the section "DIMENSIONS & SPACE REQUIREMENTS", the body measurements for a dining chair are as follows:

**Man* Seat width: 400-450 mm (15.7-17.7 in)
* Seat depth: 400-450 mm (15.7-17.7 in)
* Seat height: 450-460 mm (17.7-18.1 in)
* Backrest height: 800-850 mm (31.5-33.5 in)
* Armrest height: 650-700 mm (25.6-27.6 in)
* Armrest width: 50-70 mm (2-2.8 in)

**Clearances**

* Clearance between top of chair seat and underside of table top: 250-300 mm (9.8-11.8 in)
* Clearance between back of legs and front edge of seat: 50-70 mm (2-2.8 in)
* Clearance between thigh and underside of table top: 100-150 mm (3.9-5.9 in)

**Other measurements**

* Elbow rest height: 900-1000 mm (35.4-39.4 in)
* Eye level: 1050-1150 mm (41.3-45.3 in)
* Reach distance: 600-700 mm (23.6-27.6 in)

These measurements are based on European data and are intended to provide a general guideline for designing dining chairs. However, it's essential to note that bod

### Key Features
* LLM Model: Groq Llama-3.3-70b-Specdec.
* Embedding Model: HuggingFace sentence-transformers/all-MiniLM-L6-v2.
* Vector Database: FAISS.
* Chat History: Session-based chat for personalized interactions.
* RAG Pipeline: Combines retrieval and generation for accurate, contextual responses.