![Alt text](1.png)


# Q&A System with PDF

This project demonstrates a question-answering (Q&A) system built with LangChain and Groq. You upload a PDF file, ask questions about its contents, and receive answers that a large language model (LLM) generates from the most relevant passages of the PDF.

## How it Works

1. **PDF Processing:**
   - The project begins by taking a PDF file as input.
   - The PDF is loaded and split into smaller chunks of text.
   - Each chunk is embedded using Azure OpenAI embeddings.
   - The embeddings are then stored in a vector database (FAISS) for efficient similarity search.

2. **Question Answering:**
   - When a user asks a question, the system performs a similarity search in the vector database using the question.
   - The most relevant text chunks from the PDF are retrieved.
   - These chunks are passed as context to the Groq LLM (Mixtral-8x7b-32768).
   - The LLM generates an answer to the user's question based on the provided context.

## Requirements

- Python 3.12
- Dependencies are listed in `requirements.txt`; install them with `pip install -r requirements.txt`.

## Environment Variables

- **GROQ_API_KEY:** Your Groq API key.
- **AZURE_OPENAI_ENDPOINT:** Your Azure OpenAI endpoint.
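- **MAJNU:** Your Azure OpenAI deployment name. Note that the code in this README passes this name directly as `deployment='MAJNU'` instead of reading it from the environment, so replace it with your own deployment name.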
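
A minimal sketch of a startup check for these settings, assuming only the two names the code reads through `os.environ` (`GROQ_API_KEY` and `AZURE_OPENAI_ENDPOINT`). The helper name `missing_env_vars` is hypothetical and not part of the project:

```python
import os

# Names the project code reads from the environment (taken from this README).
REQUIRED_VARS = ("GROQ_API_KEY", "AZURE_OPENAI_ENDPOINT")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Report problems early instead of failing later with a KeyError.
missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Running a check like this before creating the LLM client gives a clearer message than the `KeyError` raised by `os.environ['GROQ_API_KEY']`.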

## Usage

1. **Upload PDF:** Provide a PDF file as input.
2. **Ask a Question:** Enter your question related to the PDF's content.
3. **Receive Answer:** The system will generate an answer based on the PDF's content.

### Example Interaction

- **Document:** SampleSet Assignment (A PDF file uploaded)
- **Question:** What are the UI frameworks provided?
- **Response:**

    For your interactive question-answering (QA) bot interface, you can use the following UI frameworks:

    1. **Streamlit** ([streamlit.io](https://streamlit.io/)): Streamlit is an open-source app framework for machine learning and data science projects. It allows you to create interactive web apps with Python in just a few minutes. Streamlit is a great choice for building your QA bot interface due to its simplicity and ease of use. You can use Streamlit to create forms for uploading PDF documents and inputting user queries.

    2. **Gradio** ([gradio.app](https://gradio.app/)): Gradio is an open-source library for creating machine learning user interfaces. With Gradio, you can create interactive web interfaces for your models with just a few lines of code. Gradio is an excellent choice for your QA bot interface as it offers features like file uploads, text input, and real-time output rendering.

    You can choose either Streamlit or Gradio based on your preferences and familiarity with the framework. Both options should be suitable for your needs.

- **Source:** Part 2: Interactive QA Bot Interface
  Problem Statement: Develop an interactive interface for the QA bot from Part 1, allowing users to input queries and retrieve answers in real time. The interface should...



# General RAG Pipeline

![Alt text](1.png)

The Retrieval-Augmented Generation (RAG) pipeline combines retrieval-based methods with generative models to enhance the process of generating responses based on external documents. It consists of three main stages: Ingestion, Retrieval, and Generation.

## 1. Ingestion

In this phase, documents are processed to create a searchable format for effective retrieval:

- **Documents**: Raw text documents that contain the information to be used.
- **Chunks**: The documents are split into smaller, manageable pieces called chunks. This helps in better indexing and retrieval, making the process more efficient.
- **Embedding**: Each chunk is transformed into a vector representation using an embedding model. These embeddings capture the semantic meaning of the text and allow for efficient similarity searches.
- **Index (Database)**: The embeddings are stored in an index (often a vector database) that supports fast retrieval based on similarity searches.
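
The chunking step can be sketched as a fixed-size sliding window with overlap. This is a simplified stand-in, not LangChain's `RecursiveCharacterTextSplitter` (which also tries to split on separators such as paragraph breaks); the function name `split_text` is hypothetical, and the default sizes mirror the `chunk_size=1000, chunk_overlap=20` used in the project code:

```python
def split_text(text, chunk_size=1000, chunk_overlap=20):
    """Split `text` into windows of at most `chunk_size` characters.

    Consecutive windows share `chunk_overlap` characters so that a sentence
    cut at a boundary still appears intact in one of the two chunks.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(split_text("abcdefghij", chunk_size=5, chunk_overlap=2))
```

With a 10-character input, a window of 5, and an overlap of 2, each chunk starts 3 characters after the previous one.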

## 2. Retrieval

This phase focuses on fetching relevant chunks based on a user query:

- **Query**: The user submits a query or question seeking information.
- **Index**: The query is embedded with the same embedding model used during ingestion, and the resulting vector is used to search the previously created index.
- **Top K Results**: The retrieval system fetches the top K most relevant chunks from the index based on their similarity to the query embedding. This step ensures that only the most pertinent information is considered for response generation.
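
As an illustration of top-K retrieval, here is a small sketch that ranks chunks by cosine similarity. The `embed` function is a toy word-count embedding assumed for the example only; a real system would call an embedding model and a vector index such as FAISS instead. All function names are hypothetical:

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (a stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=3):
    """Return the k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


chunks = [
    "streamlit builds web apps",
    "faiss stores vectors",
    "gradio builds model demos",
]
print(top_k("faiss vectors", chunks, k=1))
```

The tokenization here is naive (whitespace only, no punctuation handling), which is one reason learned embeddings are used in practice.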

## 3. Generation

In this final phase, a response is generated using the retrieved information:

- **Top K Results**: The selected top K chunks from the retrieval stage are passed to a generative model (such as GPT, or the Groq-hosted Mixtral model used in this project).
- **Response to User**: The generative model uses the context provided by the retrieved chunks to formulate a coherent and relevant response to the user's query. This response can be a summary, answer, or any relevant information extracted and synthesized from the retrieved data.
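
One way to pass the retrieved chunks to the model is to assemble them into a single prompt string. The sketch below is an assumption about how such a prompt could look; the wording of the instructions and the name `build_prompt` are hypothetical and do not come from the project code:

```python
def build_prompt(question, chunks):
    """Combine retrieved chunks and the user question into one prompt string."""
    # Number the chunks so the model (and the reader) can tell them apart.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


print(build_prompt("Which UI frameworks are suggested?",
                   ["Streamlit is an app framework.", "Gradio builds model UIs."]))
```

The resulting string can then be sent to any chat or completion model; telling the model to rely only on the supplied context is a common way to reduce unsupported answers.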

## Summary

The RAG pipeline effectively combines retrieval and generation, allowing for more informed and contextually relevant responses by leveraging external knowledge sources.


## Highlights of Our Document-Based Conversational Bot Using RAG 📌

![Alt text](2.png)

- **Doc-loader**: Loads documents for processing efficiently. 📄
- **Text-splitter**: Divides text into manageable sections for better analysis. ✂️
- **Embedding**: Converts text into vector representations for similarity searches. 🔍
- **Chroma/FAISS DB**: Provides database solutions for storing and retrieving embeddings. 🗄️
- **Streamlit Client**: Facilitates user interaction with the data retrieval system. 💻
- **LLM (Large Language Model)**: Powers natural language understanding and generation. 🤖
- **Memory Management**: Enhances system performance by storing and recalling previous interactions. 🧠
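
The code shown in this README does not include a memory component, so the following is only a sketch of what storing and recalling previous interactions could look like. The class name `ConversationMemory` and the `max_turns` limit are assumptions for illustration:

```python
from collections import deque


class ConversationMemory:
    """Keep the most recent question/answer pairs of a conversation."""

    def __init__(self, max_turns=3):
        # A bounded deque drops the oldest turn automatically when full.
        self._turns = deque(maxlen=max_turns)

    def add(self, question, answer):
        """Record one completed exchange."""
        self._turns.append((question, answer))

    def as_text(self):
        """Render the stored turns as text that can be prepended to a prompt."""
        return "\n".join(
            f"User: {question}\nAssistant: {answer}"
            for question, answer in self._turns
        )


memory = ConversationMemory(max_turns=2)
memory.add("What is FAISS?", "A library for vector similarity search.")
memory.add("Who maintains it?", "It is developed by Meta.")
print(memory.as_text())
```

Bounding the history keeps the prompt within the model's context window at the cost of forgetting older turns.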

## Key Insights 🔑

- **Efficient Document Processing**: The use of doc-loaders and text-splitters optimizes the initial handling of large datasets, allowing for quicker access to information. 📂
- **Vector Representation**: Embeddings transform textual data into vectors, enabling nuanced similarity searches and improving retrieval accuracy. 🔗
- **Robust Databases**: Chroma and FAISS serve as essential tools for organizing and managing embeddings, ensuring fast retrieval times and scalability. 🏢
- **User-Centric Design**: The integration of a Streamlit client allows for a more interactive and user-friendly experience, making complex data accessible. 🎨
- **Advanced Language Models**: LLMs are crucial for generating coherent responses, highlighting the importance of language understanding in AI applications. 📚
- **Dynamic Memory Utilization**: Effective memory management can enhance the system’s ability to provide contextually relevant information, improving user engagement. 🔄
- **Vector Store Loading**: Loading vector stores efficiently is vital for maintaining performance, particularly in applications requiring real-time data access. ⏱️

## Implementation

```python
import os

from dotenv import load_dotenv
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import FAISS
from langchain_core.prompts import PromptTemplate
from langchain_groq import ChatGroq
from langchain_openai import AzureOpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# 1. Load environment variables from a local .env file
load_dotenv()

# 2. Initialize the LLM with the Groq API key
groq_api_key = os.environ['GROQ_API_KEY']
llm = ChatGroq(model="mixtral-8x7b-32768", groq_api_key=groq_api_key)

# 3. Define the prompt template
# Note: query_file() below builds its input with an f-string and does not use this template.
prompt = PromptTemplate(template="Answer the question.\nQuestion: {question}\nHelpful Answers:",
                        input_variables=['question'])
```

```python
def process_pdf(file_name):
    """Load a PDF, split it into chunks, and index the chunks in a FAISS vector store."""
    # Load the PDF (PyPDFLoader returns one Document per page)
    pdf_loader = PyPDFLoader(file_name)
    pages = pdf_loader.load()

    # Split the loaded pages into overlapping text chunks
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
    text_chunks = text_splitter.split_documents(pages)

    # Embed the chunks with Azure OpenAI and store them in FAISS
    # 'MAJNU' is the Azure OpenAI deployment name; replace it with your own
    embeddings = AzureOpenAIEmbeddings(deployment='MAJNU', azure_endpoint=os.environ['AZURE_OPENAI_ENDPOINT'])
    vectordb = FAISS.from_documents(text_chunks, embeddings)

    return vectordb
```

```python
def query_file(vectordb, question):
    """Answer a question from the top matching chunks and return (answer, source)."""
    # Retrieve the 3 chunks most similar to the question
    docs = vectordb.similarity_search(question, k=3)
    context = "\n".join(doc.page_content for doc in docs)

    # Prepare the input for the LLM: the question followed by the retrieved context
    input_text = f"{question}\n\nContext: {context}"

    # Invoke the chat model; it returns a message whose text is in .content
    response = llm.invoke(input_text)
    answer = response.content

    # Report only the first (most similar) chunk as the source, truncated to 200 characters.
    # The "Source:" label is added by the caller when printing.
    source = f"{docs[0].page_content[:200]}..." if docs else "No sources available."
    return answer, source
```


```python
# Build the vector store from a local PDF
pdf_path = 'm.pdf'
vectordb = process_pdf(pdf_path)

# Sample query
question = "What are the UI frameworks provided?"
answer, source = query_file(vectordb, question)
print("Answer:", answer)
print('-' * 80)
print("Source:", source)
```

Answer: For your interactive QA bot interface, you can use the following UI frameworks:

1. Streamlit (<https://streamlit.io/>):
Streamlit is an open-source app framework built specifically for machine learning and data science projects. It allows you to create interactive web apps with Python in just a few minutes. Streamlit is a great choice for creating data-intensive and ML-based web applications, and it has built-in support for uploading files, displaying text, and real-time data processing.
2. Gradio (<https://gradio.app/>):
Gradio is a Python library for creating user interfaces for machine learning models. It enables you to share your models with others by creating user interfaces that are both simple and powerful. Gradio is designed to work with any ML library and supports file uploads, real-time input/output, and multiple input types.

For this task, you can choose either Streamlit or Gradio to build the frontend interface. Here's a brief comparison of the two:

* Streamlit i