<a href="https://colab.research.google.com/github/Shaz-gif/tiny-jarvis/blob/main/AI_Bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI Chatbot Assignment
### Apport Software Solutions Private Limited
### Role :  AI Engineer
Name : Shashwat Raj \\
Entry Number: 2021MT10259 \\
College: IIT Delhi \\
**Please open this notebook in Colab using the sharing link below, because it contains images and documentation links that display better there.** \\
Link to this colab notebook: - [Shashwat's Colab](https://colab.research.google.com/drive/13W-X7MPBSBww1eNnhCHS1x5U6yqptFPi?usp=sharing)

## Important Note

Due to the numerous dependencies, the trial-and-error nature of the development process, and the need to document the code alongside implementation, I decided to use Google Colab for this project. This setup allows for an efficient and organized workflow.

I want to clarify that the code has not been uploaded anywhere on the internet and is only stored privately in my Colab account. It is accessible solely through a private link, ensuring adherence to proper conduct codes and data privacy practices.



# **Chatbot for PDF-Based Q&A**

## **Objective**

Create a chatbot that can answer questions based on the content of a provided PDF document. If the chatbot cannot find the answer in the PDF, it should respond:

"Sorry, I didn’t understand your question. Do you want to connect with a live agent?"

---

## **Assignment Requirements**

### 1. **PDF Understanding**
- The chatbot should load the provided PDF and use its content as the knowledge base.
- It must accurately retrieve and display answers from the PDF content.

### 2. **Fallback Response**
- If the chatbot cannot find the answer in the PDF, it must provide a fallback response:
  "Sorry, I didn’t understand your question. Do you want to connect with a live agent?"

### 3. **User Interaction**
- Develop an interface (text-based or graphical) where users can ask questions and receive responses.
- Ensure the interaction is clear and intuitive.

### 4. **Problem-Solving**
- You are responsible for deciding the tools, libraries, and frameworks to use.
- Research and implement methods for handling PDF data, querying information, and building chatbot functionality.

---

## **Guidelines**

- Focus on applying AI concepts and practical problem-solving skills.
- Ensure your solution is modular and maintainable.
- Document your approach, including the decisions made and challenges encountered.

---

## **Deliverables**

### 1. **Chatbot Application**
- A functional chatbot that meets the requirements.

### 2. **Documentation**
- Describe your approach to solving the problem.
- Include details on tools and techniques used, and how the fallback logic is implemented.


![Piping and Instrumentation Diagram](https://drive.google.com/uc?id=15AosHDCUudfYp21mDChKeEELV_zUgKvW)

*Image made by Shashwat: [Canva Link](https://www.canva.com/design/DAGX6P98jv0/_hkeMRWL92X_AQCRPkP76Q/edit?utm_content=DAGX6P98jv0&utm_campaign=designshare&utm_medium=link2&utm_source=sharebutton)*


# **Piping and Instrumentation Diagram (P&ID): End-to-End Design of AI Bot**

The diagram above illustrates the architecture and workflow of an AI chatbot designed to answer questions based on documents such as PDF, DOC, and TXT files. The components are explained below:

---

## **1. Data Source**
- **Input Formats:** The user uploads files in supported formats:
  - **PDF** : Implemented
  - **DOC** : Not implemented; possible extension
  - **TXT** : Not implemented; possible extension
- **Purpose:** These files act as the knowledge base for the chatbot.
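
The DOC and TXT branches are not implemented in this notebook. As a rough sketch of how such an extension could look, a loader might dispatch on the file extension. The function name `load_document_text` and its error handling are assumptions made for illustration, and only the plain-text branch is filled in.

```python
from pathlib import Path


def load_document_text(path):
    """Return the text of a document, dispatching on its file extension.

    Hypothetical helper: only the .txt branch is implemented here.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".txt":
        # Plain text needs no parsing; skip undecodable bytes rather than fail
        return Path(path).read_text(encoding="utf-8", errors="ignore")
    if suffix == ".pdf":
        # In this notebook, PDFs are handled by extract_text_from_pdf()
        raise NotImplementedError("use extract_text_from_pdf() for PDF files")
    raise ValueError(f"Unsupported file type: {suffix or 'no extension'}")
```

A DOC branch would need an additional parsing library, which is outside the scope of this sketch.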

---

## **2. Data Parsing**
- **Chunking:**
  - The uploaded documents are broken down into smaller, manageable chunks (e.g., `chunk01`, `chunk02`, etc.).
  - **Reason:** This helps in processing and embedding the text more efficiently for querying.

---

## **3. Embedding Models**
- **Purpose:** Convert the text chunks into vector embeddings, which are mathematical representations of the text.
- **Process:**
  - Each chunk is processed using a pre-trained **embedding model**. In this project, I used SentenceTransformer ([Sentence Transformers documentation](https://sbert.net/)).
  - Outputs are **vector embeddings**, which capture the semantic meaning of the text.

---

## **4. Data Warehouse (Storage)**
- **Embedded Text Corpus:**
  - The embeddings are stored in a database for efficient retrieval.
  - Example: For PDF1, embeddings might include sentences like "Approval of class...".

---

## **5. Vector Database**
- **Examples:** Tools such as **Pinecone** or **SingleStore** can be used; this project uses **Pinecone**.
- **Functionality:**
  - The vector embeddings from the embedding model are stored in this database.
  - This allows for fast similarity searches when a user query is processed.

---

## **6. Query Processing**
- **User Query:**
  - The user inputs a query (e.g., "What is date of exam ...?").
  - The query is embedded into a vector format using the same embedding model.
- **Cosine Similarity:**
  - The vectorized query is compared with the stored embeddings using cosine similarity to determine relevance.
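
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. In this notebook Pinecone performs the comparison; the pure-Python sketch below only illustrates the calculation, using made-up 3-dimensional vectors rather than real 384-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors for illustration only
query = [1.0, 0.0, 1.0]
chunks = {
    "chunk-a": [1.0, 0.0, 1.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [1.0, 1.0, 0.0],
}

# Rank chunks from most to least similar to the query
ranked = sorted(
    ((name, cosine_similarity(query, vec)) for name, vec in chunks.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

The identical vector ranks first, the partially overlapping one second, and the orthogonal one last.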

---

## **7. Results**
- **Similarity Scores:**
  - Retrieved chunks (labelled as tokens in the diagram) are ranked by how closely they match the query (e.g., token1: 0.98, token4: 0.79, etc.).
  - These results are presented to the user in descending order of similarity.

---

## **8. User Interface**
- **Purpose:** Provides an intuitive platform for user interaction.
- **Features:**
  - Users can input queries and view responses.
  - Can be extended to more sophisticated user interactions and UI.

---

## **Backend (Highlighted in the Diagram)**
- The backend handles:
  - Parsing and embedding the uploaded documents.
  - Storing embeddings in a vector database.
  - Processing user queries and retrieving results based on similarity.

---

This architecture supports efficient question answering over the uploaded document, while the fallback mechanism keeps the interaction user-friendly when no relevant content is found.


### PDF Text Extraction Functionality


This code enables PDF file upload and text extraction using `ipywidgets` and `PyPDF2`.

1. **`extract_text_from_pdf(pdf_path)`**:
   - This function accepts a file path (`pdf_path`), opens the PDF file, and extracts text from all of its pages.
   - It uses `PyPDF2.PdfReader` to read the PDF and the `extract_text()` method to retrieve text content.
   - Returns the extracted text or `None` if an error occurs.

2. **`on_upload_change(change)`**:
   - This is a callback function that gets triggered when a file is uploaded using the `widgets.FileUpload`.
   - It extracts the uploaded file, saves it as `uploaded_pdf.pdf` in the local path `/content/`, and prints a success message.
   - Then, it calls `extract_text_from_pdf()` to extract text from the saved PDF and displays the result or an error message.

3. **File Upload Widget (`uploader`)**:
   - `widgets.FileUpload` is used to create a file upload interface that accepts only `.pdf` files and allows a single file to be uploaded.
   - The `observe()` method listens for file uploads and calls `on_upload_change()` when the file is uploaded.

4. **File Upload Process**:
   - The widget allows the user to upload a PDF file. After the file is uploaded, it triggers the callback to save the file and extract the text.

This code provides an interactive way to upload and extract text from a PDF file in a Jupyter notebook environment.



In [None]:
import PyPDF2
from IPython.display import display
import ipywidgets as widgets

pdf_text = None

In [None]:
def extract_text_from_pdf(pdf_path):
    """
    Extracts text from a PDF file.
    Args:
        pdf_path (str): Path to the PDF file.
    Returns:
        str: Combined text from all pages of the PDF.
    """
    text = ""
    try:
        with open(pdf_path, 'rb') as file:
            pdf_reader = PyPDF2.PdfReader(file)
            for page in pdf_reader.pages:
                # Guard against pages with no extractable text (e.g. scanned images)
                text += page.extract_text() or ""
        return text
    except Exception as e:
        print(f"Error while reading PDF: {e}")
        return None


In [None]:
def on_upload_change(change):
    """
    Callback function that is triggered when a file is uploaded.
    """
    global pdf_text
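    value = change["new"]
    if not value:
        return  # nothing uploaded (e.g. the widget was cleared)
    # ipywidgets 7 exposes a dict keyed by filename; ipywidgets 8 exposes a tuple of dicts
    uploaded_file = next(iter(value.values())) if isinstance(value, dict) else value[0]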

    file_path = "/content/uploaded_pdf.pdf"
    with open(file_path, 'wb') as f:
        f.write(uploaded_file['content'])

    print(f"File uploaded successfully: {file_path}")

    pdf_text = extract_text_from_pdf(file_path)
    if pdf_text:
        print(f"Extracted {len(pdf_text)} characters from the PDF.")
    else:
        print("Failed to extract text.")



In [None]:
uploader = widgets.FileUpload(
    accept='.pdf',  # Only accept PDF files
    multiple=False  # Only allow a single file to be uploaded
)


uploader.observe(on_upload_change, names='value')

display(uploader)


FileUpload(value={}, accept='.pdf', description='Upload')

File uploaded successfully: /content/uploaded_pdf.pdf
Extracted 94788 characters from the PDF.


# Preprocessing the Extracted Text
### Function 1: `split_text_into_chunks`

This function splits a given text into smaller chunks to optimize it for processing by embedding models.

- **Arguments:**
  - `text` (str): The input text to be split.
  - `chunk_size` (int, default=500): The maximum size of each chunk in characters.

- **Returns:**
  - A list of text chunks, where each chunk is a substring of the input text with a size no greater than `chunk_size`.

- **Explanation:**
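  - The function iterates over the words in the input text and builds chunks. When adding the next word would push the current chunk past `chunk_size`, the current chunk is added to the list and a new chunk begins with that word. Words are never split, so a single word longer than `chunk_size` forms its own oversized chunk.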
  - After processing all words, any remaining text in `current_chunk` is added as the final chunk.
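
To make the behaviour concrete, here is a small worked example. It uses a compact restatement of the same greedy word-based idea (named `split_words` here, an assumed name chosen to avoid clashing with the notebook's function) and an artificially small `chunk_size`.

```python
def split_words(text, chunk_size):
    """Greedy word-based chunking: a compact restatement for illustration."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if current and len(candidate) > chunk_size:
            # The word does not fit: close the current chunk and start a new one
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


sample = "Ragging of students in any form is strictly prohibited"
for chunk in split_words(sample, chunk_size=20):
    print(len(chunk), repr(chunk))
```

With `chunk_size=20` the sentence splits into three chunks of 19, 14, and 19 characters, and joining the chunks with spaces reproduces the original sentence.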




In [None]:
def split_text_into_chunks(text, chunk_size=500):
    """
    Splits the text into smaller chunks.
    Args:
        text (str): The input text.
        chunk_size (int): Maximum size of each chunk in characters.
    Returns:
        list: List of text chunks.
    """
    chunks = []
    current_chunk = ""
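    for word in text.split():
        # Close the current chunk when adding this word (plus a space) would exceed the limit;
        # the emptiness check avoids emitting an empty first chunk for an overlong first word
        if current_chunk and len(current_chunk) + len(word) + 1 > chunk_size:
            chunks.append(current_chunk.strip())
            current_chunk = word
        else:
            current_chunk += " " + word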
    if current_chunk:
        chunks.append(current_chunk.strip())
    return chunks


In [None]:
text_chunks = split_text_into_chunks(pdf_text, chunk_size=500)

print(f"Created {len(text_chunks)} text chunks.")


Created 191 text chunks.


## Errors and Issues

### Embedding Generation Error

While generating embeddings using OpenAI's `text-embedding-ada-002` model, an error occurred with the message:

"You exceeded your current quota, please check your plan and billing details."

This error indicates that the usage quota for the current OpenAI plan was exceeded. To proceed with generating embeddings, a free model is used instead.

For more details on quota limits and billing, refer to the [OpenAI API error codes documentation](https://platform.openai.com/docs/guides/error-codes/api-errors).
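
In this notebook the switch to another model was made by hand, but the same idea could be automated. The sketch below is only an illustration: `primary_embed` and `fallback_embed` are hypothetical callables standing in for the two embedding backends, and the stand-in functions return dummy vectors.

```python
def embed_with_fallback(text_chunks, primary_embed, fallback_embed):
    """Try the primary embedding backend; use the fallback if it raises.

    Both backends are passed in as callables so this sketch stays
    independent of any particular library.
    """
    try:
        return primary_embed(text_chunks), "primary"
    except Exception as error:  # e.g. a quota or network error
        print(f"Primary embedding backend failed: {error}")
        return fallback_embed(text_chunks), "fallback"


# Stand-in backends for demonstration only
def failing_backend(chunks):
    raise RuntimeError("You exceeded your current quota")


def dummy_backend(chunks):
    return [[float(len(chunk))] for chunk in chunks]


vectors, used = embed_with_fallback(["abc", "de"], failing_backend, dummy_backend)
print(used, vectors)
```

Note that the two models produce vectors of different dimensions (1536 for `text-embedding-ada-002`, 384 for `all-MiniLM-L6-v2`), so the vector index has to be created for whichever backend is actually used.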


### Alternative: Using Sentence Transformers for Embeddings

Due to the error encountered with OpenAI's `text-embedding-ada-002` model (quota exceeded), I opted to use the **Sentence Transformer** model as an alternative. While Sentence Transformers may be slightly less accurate or efficient in some cases compared to OpenAI's embeddings, it is a great free alternative for generating embeddings from text.

The Sentence Transformer model can generate high-quality embeddings and is widely used for tasks like semantic search, clustering, and text similarity analysis.

By switching to this model, we can continue with the embedding generation process without needing a paid OpenAI plan.



In [None]:
# !pip install -qU \
#     pinecone-client==3.0.2 \
#     openai==1.10.0 \
#     datasets==2.16.1

# from openai import OpenAI

# client = OpenAI(
#     api_key="******"
# )


# !pip install openai --upgrade  # Ensure you have the latest OpenAI package installed

# import openai

# openai.api_key = "******"


In [None]:
# import openai

# def generate_embeddings_v2(text_chunks):
#     """
#     Generates embeddings for a list of text chunks using OpenAI's text-embedding-ada-002 model.
#     Args:
#         text_chunks (list): List of text chunks to embed.
#     Returns:
#         list: List of embeddings (one for each text chunk).
#     """
#     embeddings = []
#     try:
#         # Use the new interface to batch process embeddings
#         response = openai.Embedding.create(
#             model="text-embedding-ada-002",
#             input=text_chunks  # Pass the entire list of text chunks at once
#         )
#         embeddings = [item['embedding'] for item in response['data']]  # Extract embeddings
#     except Exception as e:
#         print(f"Error generating embeddings: {e}")
#     return embeddings


In [None]:

# # Generate embeddings for the text chunks
# if pdf_text and 'text_chunks' in locals():
#     embeddings = generate_embeddings_v2(text_chunks)
#     if embeddings:
#         print(f"Generated embeddings for {len(embeddings)} chunks.")
# else:
#     print("No text chunks to process!")


Error generating embeddings: You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.


## Using Sentence Transformers

#### Sentence Transformer Architecture

Sentence Transformers are built on top of transformer-based models (such as BERT, RoBERTa, or DistilBERT) and are specifically designed for tasks that require sentence or text embeddings. The model generates dense vector representations of sentences that capture their semantic meaning, making them ideal for tasks like semantic search, clustering, and text similarity analysis. The architecture uses techniques like **Siamese networks** and **triplet loss** to train the model on sentence pairs, optimizing it for producing meaningful sentence-level embeddings.

![Sentence Transformer Architecture](https://www.researchgate.net/profile/Mohamed-Gaber-2/publication/353487642/figure/fig2/AS:1050256590503937@1627412092943/Siamese-Sentence-Transformer-STransformer-Architecture.ppm)

Source: [Sentence Transformers](https://sbert.net/)



In [None]:
!pip install sentence-transformers




In [None]:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')


### Function: `generate_embeddings`

This function generates embeddings for a list of text chunks using the **SentenceTransformers** model.

- **Arguments:**
  - `text_chunks` (list): A list of text chunks that need to be embedded.

- **Returns:**
  - A list of embeddings corresponding to each text chunk, where each embedding is a dense vector representation of the chunk's semantic meaning.

- **Explanation:**
  - The function uses the `encode` method from the SentenceTransformers model to generate embeddings for all the text chunks at once.
  - The `show_progress_bar=True` argument displays a progress bar during the embedding generation process, especially useful when processing large datasets.
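
`encode` processes its input in batches, and the library's default `batch_size` is 32. As a quick arithmetic check of how many batches a given number of chunks needs (the helper name is an assumption for illustration):

```python
import math


def number_of_batches(num_chunks, batch_size=32):
    """How many batches are needed to encode num_chunks items."""
    return math.ceil(num_chunks / batch_size)


# 191 is the number of chunks created from the uploaded PDF in this notebook
print(number_of_batches(191))
```

For the 191 chunks produced here, this gives 6 batches, which is the total the progress bar counts up to.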


In [None]:

def generate_embeddings(text_chunks):
    """
    Generate embeddings for a list of text chunks using SentenceTransformers.
    Args:
        text_chunks (list): List of text chunks to embed.
    Returns:
        list: List of embeddings for the text chunks.
    """
    embeddings = model.encode(text_chunks, show_progress_bar=True)
    return embeddings


In [None]:
# Generate embeddings for the text chunks
if pdf_text and 'text_chunks' in locals():
    embeddings = generate_embeddings(text_chunks)
    print(f"Generated embeddings for {len(embeddings)} chunks.")
else:
    print("No text chunks to process!")


Batches:   0%|          | 0/6 [00:00<?, ?it/s]

Generated embeddings for 191 chunks.


### Setting Up Pinecone for Storing Embeddings

Pinecone is a fully managed vector database designed for similarity search and machine learning applications. It allows you to store, index, and search vector embeddings at scale. Pinecone makes it easy to manage and query large collections of high-dimensional vectors, such as embeddings generated from text or images, without needing to manage the underlying infrastructure.

#### What is Pinecone?

Pinecone is a cloud-native vector database that provides a high-performance platform to store, search, and retrieve vector embeddings. It is optimized for applications such as:

- **Semantic Search**: Quickly retrieving relevant documents based on the similarity of their embeddings.
- **Recommendation Systems**: Finding similar items (e.g., products, movies) based on user preferences or item characteristics.
- **Anomaly Detection**: Identifying outliers by comparing vectors in a database.



![Vector database](https://miro.medium.com/v2/resize:fit:1200/1*4Z90ZgAq7nDdkuWe1bWnlg.jpeg)

Source: [Vector Databases](https://www.pinecone.io/learn/vector-database/)


#### How Pinecone Works

1. **Embedding Generation**: First, you generate vector embeddings from your text or data using machine learning models like Sentence Transformers or OpenAI’s embeddings.
   
2. **Storing Vectors**: Once you have the embeddings, you store them in Pinecone's vector database. Each vector is typically stored with metadata, such as IDs or other related information.

3. **Similarity Search**: When you query the database, Pinecone finds the stored vectors closest to your input vector using efficient nearest-neighbor search under the index's similarity metric (such as cosine similarity or Euclidean distance).

4. **Real-Time Updates**: Pinecone supports real-time updates, allowing you to add, delete, or modify vectors dynamically as new data comes in.
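
The four steps above can be mimicked with a toy in-memory store. This is not Pinecone's API or implementation: `TinyVectorStore` is a hypothetical class that does a brute-force cosine scan over every stored vector, whereas a real vector database relies on an index to avoid that.

```python
import math


class TinyVectorStore:
    """Toy in-memory vector store (illustration only, brute-force search)."""

    def __init__(self):
        self._records = {}  # id -> (vector, metadata)

    def upsert(self, record_id, vector, metadata=None):
        # Insert a new record or overwrite an existing one with the same id
        self._records[record_id] = (list(vector), metadata or {})

    def delete(self, record_id):
        self._records.pop(record_id, None)

    def query(self, vector, top_k=3):
        # Score every stored vector against the query, highest first
        scored = [
            {"id": rid, "score": self._cosine(vector, vec), "metadata": meta}
            for rid, (vec, meta) in self._records.items()
        ]
        scored.sort(key=lambda m: m["score"], reverse=True)
        return scored[:top_k]

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0


store = TinyVectorStore()
store.upsert("chunk-0", [1.0, 0.0], {"text": "placement guidelines"})
store.upsert("chunk-1", [0.0, 1.0], {"text": "exam policies"})
best = store.query([0.9, 0.1], top_k=1)[0]
print(best["id"], best["metadata"]["text"])
```

The query vector points mostly along the first axis, so `chunk-0` is returned as the closest record.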

#### Why Use Pinecone?

- **Scalability**: Pinecone can handle billions of vectors and ensures fast and reliable similarity searches even as your data grows.
- **Managed Service**: It abstracts away the complexity of managing vector databases, letting you focus on building applications rather than maintaining infrastructure.
- **Efficiency**: Pinecone offers highly optimized and low-latency search, making it suitable for real-time applications.

By using Pinecone to store and index your embeddings, you can easily scale up your search and retrieval tasks, making it an essential tool for any application that involves similarity search or machine learning.


In [None]:
!pip install pinecone-client




### Setting Up Pinecone Index

This code sets up a **Pinecone** vector database to store embeddings.

1. **Initialize Pinecone**: The `Pinecone` client is initialized using an API key to authenticate the connection.

2. **Check and Create Index**:
   - It checks if the index `pdf-chatbot-index` exists.
   - If not, it creates a new index with a dimension of `384` (suitable for **all-MiniLM-L6-v2** embeddings), using **cosine similarity** as the metric.

3. **Connect to Index**: The code then connects to the specified index for future operations.

This setup enables efficient similarity search using Pinecone’s managed vector database.


In [None]:
from pinecone import Pinecone, ServerlessSpec
import os

# Initialize Pinecone with your API key.
# My own key has been removed: set the PINECONE_API_KEY environment variable
# (or replace the placeholder) before running, otherwise the notebook will raise an error.
pc = Pinecone(api_key=os.environ.get("PINECONE_API_KEY", "*********"))


In [None]:
# pc.delete_index(index_name)

# print(f"Deleted the index: {index_name}")

Deleted the index: pdf-chatbot-index


In [None]:
index_name = "pdf-chatbot-index"

# Create the index only if it does not already exist
# (the dimension must match the embedding dimension)
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=384,    # all-MiniLM-L6-v2 produces 384-dimensional embeddings
        metric="cosine",  # use cosine similarity for text embeddings
        spec=ServerlessSpec(cloud='aws', region='us-east-1')
    )


In [None]:

# Connect to the Pinecone index
index = pc.Index(index_name)

print(f"Connected to Pinecone index: {index_name}")


Connected to Pinecone index: pdf-chatbot-index


### Upserting Embeddings into Pinecone

1. **Generate Unique IDs & Prepare Data**: Unique IDs are created for each text chunk, and embeddings are paired with these IDs and metadata (the original text) in preparation for insertion into Pinecone.

2. **Upsert Data**: The `upsert()` method inserts the embeddings, IDs, and metadata into the Pinecone index, enabling efficient similarity search.
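
For larger documents it is common to send vectors in several batches rather than in a single request. A minimal batching helper is sketched below; the batch size of 100 is an assumption chosen for illustration, not a documented limit, and the records are stand-ins for the real `(id, embedding, metadata)` tuples.

```python
def batched(items, batch_size=100):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Stand-in for the (id, embedding, metadata) tuples built in the notebook
records = [(f"chunk-{i}", [0.0], {"text": ""}) for i in range(191)]
for batch in batched(records):
    # In the notebook this would be: index.upsert(batch)
    print(f"would upsert {len(batch)} vectors")
```

With 191 records this yields one batch of 100 and one of 91.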


In [None]:
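# Generate unique IDs for the text chunks
ids = [f"chunk-{i}" for i in range(len(embeddings))]

# Pair each ID with its embedding (as a plain list of floats) and the original text as metadata
vectors = [
    (chunk_id, embedding.tolist(), {"text": chunk})
    for chunk_id, embedding, chunk in zip(ids, embeddings, text_chunks)
]
index.upsert(vectors)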

print(f"Upserted {len(vectors)} embeddings into the Pinecone index.")

Upserted 191 embeddings into the Pinecone index.


### Querying Pinecone for Similar Chunks

1. **Generate Query Embedding**:
   - The user's question (`query_text`) is converted into an embedding using the Sentence Transformers model.

2. **Query Pinecone**:
   - The query embedding is used to search the Pinecone index, retrieving the top `k` most similar results. The results include metadata (text chunks) along with their similarity scores.

3. **Filter Results by Similarity**:
   - Results are filtered based on a **similarity threshold**. Only matches with scores above the threshold are considered valid and returned.



In [None]:
def query_pinecone(query_text, top_k=5, similarity_threshold=0.4):
    """
    Query the Pinecone index with a user question and retrieve the most similar chunks.
    Args:
        query_text (str): The user's question.
        top_k (int): Number of top results to retrieve.
        similarity_threshold (float): Minimum similarity score to consider a valid match.
    Returns:
        list: Matching results with scores above the threshold.
    """
    # Generate embedding for the query
    query_embedding = model.encode([query_text])[0]
    query_embedding_list = query_embedding.tolist()

    # Query Pinecone for the top-k most similar results
    results = index.query(vector=query_embedding_list, top_k=top_k, include_metadata=True)

    # Filter results based on the similarity threshold
    matches = [
        match for match in results["matches"] if match["score"] >= similarity_threshold
    ]

    return matches


In [None]:
def user_query(user_question):
    """Print the matching chunks for a question, or the fallback message."""
    results = query_pinecone(user_question)

    if results:
        print(f"Top matches: {results}")
        for match in results:
            print(f"{match['score']:.2f}: {match['metadata']['text']}")
    else:
        print("Sorry, I didn’t understand your question. Do you want to connect with a live agent?")

## Some Example Usage

In [None]:
# These questions are likely to be answered by the uploaded PDF


user_question = "What are the consequences of Ragging & Sexual Harassment?"
user_query(user_question)


Top matches: [{'id': 'chunk-188',
 'metadata': {'text': 'Ragging & Sexual Harassment •Ragging & Sexual '
                      'Harassment of fellow students is strictly prohibited. '
                      'Any student/s found guilty of ragging and/or abe\x7fng '
                      'ragging, whether ac9vely or passively, or being a part '
                      'of a conspiracy to promote ragging, is liable to be '
                      'punished as per the rules. Ragging ofen ends up in '
                      'sexual or physical harassment for the vic9m. Ragging '
                      'mostly leads to sexual abuse or harassment. •Ragging of '
                      'students in any form is strictly prohibited inside and '
                      'outside the campus. The'},
 'score': 0.673096478,
 'values': []}, {'id': 'chunk-189',
 'metadata': {'text': 'ins9tute maintains a zero tolerance policy towards '
                      'ragging. All issues in this regards will be dealt with '

In [None]:
user_question = "Student Support Services Guidelines"

user_query(user_question)


Top matches: [{'id': 'chunk-185',
 'metadata': {'text': 'today has been an integral part of educa9on and is '
                      'currently evolving to meet and exceed student '
                      'expecta9ons. To ensure all your Queries/Concerns/Issues '
                      'are dealt within acceptable 9meframe and to utmost '
                      'sa9sfac9on, kindly follow the student support services '
                      'guidelines. Policies and Procedures •Students who have '
                      'received creden9als for Student Portal, can raise their '
                      'queries online and will receive a request number for '
                      'tracking purposes. •Students who are wai9ng for '
                      '“Student Portal” access can'},
 'score': 0.59887141,
 'values': []}, {'id': 'chunk-184',
 'metadata': {'text': 'cases will be reviewed by NGASCE management. A student '
                      'shall be provided upto a maximum of 3 interviews. In '


In [None]:
user_question = "Placement Guidelines"

user_query(user_question)

Top matches: [{'id': 'chunk-179',
 'metadata': {'text': 'applicable) •Aiested copies of Grade Sheets/Mark sheets '
                      '/ Final Cer9ﬁcate •Copy/ies of Prospectus or '
                      'communica9on received from Professional Body/ '
                      'Management / Educa9onal Ins9tu9on/s as applicable, '
                      'requiring you to submit transcripts. Placement '
                      'Guidelines: Placement assistance is oﬀered to students '
                      'however it is the preroga9ve of the Schools & Campuses '
                      'to decide, which of the programs this service should be '
                      'oﬀered. Students are expected to maintain decorum and '
                      'abide by the guidelines during'},
 'score': 0.610876501,
 'values': []}, {'id': 'chunk-180',
 'metadata': {'text': 'placement processes. In the event of non-conformance to '
                      'the placement guidelines, the School reserves the right 

In [None]:
# These questions are unlikely to be answered by the uploaded PDF
user_question = "Are we alone in this Universe?"
user_query(user_question)



Sorry, I didn’t understand your question. Do you want to connect with a live agent?


In [None]:
user_question = "Apport Software Solutions Private Limited"
user_query(user_question)

Sorry, I didn’t understand your question. Do you want to connect with a live agent?


In [None]:
user_question = "Seating Plan"
user_query(user_question)

Sorry, I didn’t understand your question. Do you want to connect with a live agent?


# Using "facebook/bart-large-cnn" for better summarization

![Piping and Instrumentation Diagram](https://drive.google.com/uc?id=11qtF7oPjv5YzeMCgUH9QnhzlannWglFD)



### Facebook BART Architecture

BART (Bidirectional and Auto-Regressive Transformers) is a sequence-to-sequence model that combines the strengths of BERT's bidirectional encoder and GPT's autoregressive decoder. It is trained using a denoising autoencoder approach, where the model learns to reconstruct corrupted text. BART excels at tasks like text summarization, generation, and translation due to its ability to both understand and generate text. Its hybrid architecture allows it to perform well on a variety of natural language processing tasks, making it a versatile and powerful tool in NLP.

![BART architecture](https://production-media.paperswithcode.com/methods/Screen_Shot_2020-06-01_at_9.49.47_PM.png)

Source: [BART Docs](https://huggingface.co/docs/transformers/en/model_doc/bart)

In [None]:
!pip install transformers torch




### Fallback Response and Concise Answer Generation

#### Purpose:
This approach ensures that the chatbot always provides a response, even when no relevant matches are found for the user's query.

#### How It Works:
1. **Fallback Response**: If the chatbot doesn't find any valid matches with a high enough similarity score, it returns a predefined fallback response:  
   *“Sorry, I didn’t understand your question. Do you want to connect with a live agent?”*

2. **Concise Response**: If valid matches are found (i.e., matches with a score above the specified threshold), the chatbot combines the matching text, uses a summarization model (e.g., BART or T5) to generate a concise, user-friendly response, and presents it to the user.

#### Why It Is Used:
- **Ensures User Satisfaction**: The fallback response ensures the chatbot remains functional even when it cannot find a relevant answer, preventing an empty or irrelevant response.
- **Natural and Concise**: The summarization step converts lengthy or complex matches into concise, easy-to-read answers, improving the user experience.
- **Smooth Transition**: If the chatbot can't answer the query, the fallback message offers a smooth transition to a live agent, ensuring continuous support.
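
The control flow can be sketched independently of Pinecone and BART. In the sketch below, `summarize` is a hypothetical callable standing in for the summarization model, `first_sentence` is a deliberately crude stand-in for it, and the matches are hand-written dictionaries in the same shape as the query results shown earlier.

```python
FALLBACK = "Sorry, I didn’t understand your question. Do you want to connect with a live agent?"


def answer(matches, summarize, similarity_threshold=0.4):
    """Return a summary of sufficiently similar matches, or the fallback text."""
    valid = [m for m in matches if m["score"] >= similarity_threshold]
    if not valid:
        return FALLBACK
    combined = " ".join(m["metadata"]["text"] for m in valid)
    return summarize(combined)


# Stand-in summarizer: keeps only the first sentence
def first_sentence(text):
    return text.split(". ")[0].rstrip(".") + "."


matches = [
    {"score": 0.67, "metadata": {"text": "Ragging is strictly prohibited. Offenders are punished."}},
    {"score": 0.21, "metadata": {"text": "Unrelated text."}},
]
print(answer(matches, first_sentence))
print(answer([], first_sentence))
```

The low-scoring match is dropped before summarization, and an empty result list leads straight to the fallback message.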


![Piping and Instrumentation Diagram](https://drive.google.com/uc?id=1A-Y2fCpNg9gwFDn4RnXdqQLw2_AzakPC)

*Overall design*

**Correction:** The step from Results into the Summarizer (BART) also belongs to the backend.

In [None]:
from transformers import pipeline

# Load a pre-trained summarization model (BART fine-tuned on CNN/DailyMail)
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

In [None]:
def generate_concise_response(query_text, matches, similarity_threshold=0.4):
    """
    Generate a concise and user-friendly response by rephrasing the matched text.
    If no valid matches are found, returns a fallback response.
    Args:
        query_text (str): The user's query.
        matches (list): List of top matches with metadata.
        similarity_threshold (float): Minimum similarity score to consider a match valid.
    Returns:
        str: Concise response based on matched text, or fallback response.
    """

    # Keep only matches that clear the similarity threshold
    valid_matches = [match for match in matches if match["score"] >= similarity_threshold]

    if not valid_matches:
        # Fallback response when nothing relevant was retrieved
        return "Sorry, I didn’t understand your question. Do you want to connect with a live agent?"

    # Merge the matched chunks into a single passage for the summarizer
    combined_text = " ".join(match['metadata']['text'] for match in valid_matches)

    # Deterministic (non-sampled) summary of 50 to 200 tokens
    summarized_response = summarizer(combined_text, max_length=200, min_length=50, do_sample=False)

    return summarized_response[0]['summary_text']


## Use Cases

In [None]:
user_question = "what are the consequence of ragging"
results = query_pinecone(user_question)


concise_response = generate_concise_response(user_question, results)

print(concise_response)


Ragging of students in any form is strictly prohibited inside and outside the campus. Ragging ofen ends up in sexual or physical harassment for the vic9m. The ins9tute maintains a zero tolerance policy towards ragging. All issues in this regards will be dealt with utmost urgency.


In [None]:
user_question = "what are Term End Examinaon Eligibility & Policies"
results = query_pinecone(user_question)


concise_response = generate_concise_response(user_question, results)

print(concise_response)


Term End Examina9on Credence is 70%. Students are expected to complete the academic cycle of the Semester enrolled for. Students cannot directly appear for Re-Sit Term End Exams (April/Sept) Students can register directly for the term end examina9on based on the eligibility.


## Fallback Responses

In [None]:
user_question = "Do Aliens exists ?"
results = query_pinecone(user_question)


concise_response = generate_concise_response(user_question, results)

print(concise_response)

Sorry, I didn’t understand your question. Do you want to connect with a live agent?


In [None]:
user_question = "What are ASR Models"
results = query_pinecone(user_question)


concise_response = generate_concise_response(user_question, results)

print(concise_response)

Sorry, I didn’t understand your question. Do you want to connect with a live agent?


# Fallback Logic

---

### Handling Fallback Responses and Threshold Optimization

#### 1. **Threshold for Similarity Score**

The threshold for similarity scores plays a crucial role in determining when the bot should respond with relevant information and when it should trigger a fallback response.

- **Precision** ensures that only highly relevant matches are considered.
- **Recall** ensures that potential relevant information isn’t overlooked.

**Threshold Optimization:**

- **High Threshold (e.g., 0.9)**: May result in many fallback responses, leading to user frustration.
- **Low Threshold (e.g., 0.3)**: Can cause irrelevant or weak responses, impacting user satisfaction.

**Optimal Threshold Selection**:
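- Start with a **reasonable threshold** (e.g., 0.6). The code in this notebook uses 0.4, and the relevant matches shown earlier scored roughly 0.60 to 0.67.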
- **Iterate and experiment** based on feedback or test cases.
- Adjust based on **domain knowledge** of the PDF content.
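
One way to make the "iterate and experiment" step concrete is to sweep candidate thresholds over a small labelled set of test queries. The scores and labels below are invented for illustration and are not measured results: `top_score` is the best similarity returned for a query, and `answerable` records whether the PDF actually contains the answer.

```python
def evaluate_threshold(examples, threshold):
    """Precision and recall of the 'bot answers' decision at a given threshold."""
    answered = [ex for ex in examples if ex["top_score"] >= threshold]
    true_positives = sum(1 for ex in answered if ex["answerable"])
    answerable_total = sum(1 for ex in examples if ex["answerable"])
    # Convention chosen here: an empty denominator counts as a perfect score
    precision = true_positives / len(answered) if answered else 1.0
    recall = true_positives / answerable_total if answerable_total else 1.0
    return precision, recall


# Invented evaluation set (not measured results)
examples = [
    {"top_score": 0.67, "answerable": True},
    {"top_score": 0.61, "answerable": True},
    {"top_score": 0.45, "answerable": True},
    {"top_score": 0.42, "answerable": False},
    {"top_score": 0.20, "answerable": False},
]

for threshold in (0.3, 0.4, 0.5, 0.6):
    precision, recall = evaluate_threshold(examples, threshold)
    print(f"threshold={threshold:.1f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy set, raising the threshold from 0.4 to 0.5 removes the one wrongly answered query but also drops one answerable query, which is the precision and recall trade-off described above.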

#### 2. **Graceful Fallback Handling**

Fallback responses should be **empathetic** and **context-aware** to maintain user engagement.

- **Tone**: Ensure responses are friendly and supportive, e.g.,  
  *“I couldn’t find an exact match. Would you like to ask something else or speak to a live agent?”*
  
- **Clarification**: If unable to provide an exact match, clarify the limitation, e.g.,  
  *“Your query seems a bit specific. Would you like to rephrase or try another question?”*

- **Suggestions**: Propose related topics or areas of interest to keep the conversation flowing, e.g.,  
  *“Do you mean placement guidelines or eligibility criteria?”*

#### 3. **Confidence Level Adjustment**

- **Dynamic Threshold Adjustment**: Adapt the response to the confidence of the match rather than using a single cut-off. For **lower-confidence matches** (e.g., scores between 0.6-0.7), consider a more flexible fallback response:  
  *“I found some information but it may not be what you’re looking for. Would you like me to try again?”*
  
- **Higher Confidence** (e.g., scores above 0.8) should result in more confident, concise responses.
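
A tiered response along these lines could look as follows. The band edges (0.6 and 0.8) follow the examples in the text; the function name, the treatment of everything between the two edges as "tentative", and the exact wording returned are assumptions.

```python
def respond_by_confidence(best_score, answer_text, low=0.6, high=0.8):
    """Pick a response style from the best match's similarity score."""
    if best_score >= high:
        return answer_text  # confident: return the answer directly
    if best_score >= low:
        # Tentative: give the answer but flag the uncertainty
        return (
            "I found some information but it may not be what you’re looking for: "
            + answer_text
        )
    return "Sorry, I didn’t understand your question. Do you want to connect with a live agent?"


print(respond_by_confidence(0.85, "Ragging is strictly prohibited."))
print(respond_by_confidence(0.65, "Ragging is strictly prohibited."))
print(respond_by_confidence(0.30, "Ragging is strictly prohibited."))
```

The three calls exercise the confident, tentative, and fallback branches in turn.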

#### 4. **Best Practices**

- **Monitor and Fine-Tune**: Regularly track fallback response frequency and adjust the threshold as needed.
  
- **Context-Awareness**: Maintain context for follow-up queries, especially when exact matches are not found.

- **Aggregated Responses**: For queries with multiple matches, aggregate top results and rephrase them concisely for a better response.


#### Conclusion:
- Fine-tuning the similarity threshold is a balance between **precision** and **recall**, with **user feedback** guiding adjustments.
- **Fallback responses** should be empathetic and provide options for re-engagement, ensuring a smooth and professional user experience.


# Sources
- [Learn about Vector Databases (Pinecone)](https://www.pinecone.io/learn/vector-database/)
- [Sentence Transformers (SBERT)](https://sbert.net/)
- [New and Improved Embedding Model (OpenAI)](https://openai.com/index/new-and-improved-embedding-model/)
- [Colab Notebook Example](https://colab.research.google.com/drive/13W-X7MPBSBww1eNnhCHS1x5U6yqptFPi?usp=sharing)
