<a href="https://colab.research.google.com/github/SureshkumarRadadiya/SOP-CHATBOT/blob/main/SOP_CHATBOT.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


#Business Problem:
A leading AI consulting firm handles large volumes of technical documents such as regulatory guidelines, compliance manuals, research papers, and operational workflows. These documents are often lengthy and complex, making manual summarization and the creation of Standard Operating Procedures (SOPs) inefficient and inconsistent. The current manual approach delays the production of structured SOPs and introduces the risk of errors. To enhance operational efficiency and maintain consistency, an automated solution is required to quickly and accurately summarize these documents and generate concise, well-organized SOPs.

###Business objective:
Maximize efficiency and consistency

###Business constraint:
Minimize Cost, Maximize Scalability

###Business Success Criteria:
A reduction in document processing time by at least 30%.

###Economic Success Criteria:
A reduction in operational costs by at least 30%.

###ML Success Criteria:
Achieve an accuracy of at least 90% on generated SOP summaries.


#Step 1: Setting up the Environment



In [1]:
!pip install transformers sentence-transformers faiss-cpu langchain

Collecting faiss-cpu
  Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (4.4 kB)
Downloading faiss_cpu-1.9.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (27.5 MB)
Installing collected packages: faiss-cpu
Successfully installed faiss-cpu-1.9.0.post1


#Step 2: Import required libraries

In [2]:
!pip install langchain-community pymupdf

from transformers import pipeline
from sentence_transformers import SentenceTransformer
import faiss
import fitz  # PyMuPDF, used for PDF text extraction
# LangChain imports are not used in the cells below
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

Collecting langchain-community
  Downloading langchain_community-0.3.8-py3-none-any.whl.metadata (2.9 kB)
Collecting SQLAlchemy<2.0.36,>=1.4 (from langchain-community)
  Downloading SQLAlchemy-2.0.35-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (9.6 kB)
Collecting dataclasses-json<0.7,>=0.5.7 (from langchain-community)
  Downloading dataclasses_json-0.6.7-py3-none-any.whl.metadata (25 kB)
Collecting httpx-sse<0.5.0,>=0.4.0 (from langchain-community)
  Downloading httpx_sse-0.4.0-py3-none-any.whl.metadata (9.0 kB)
Collecting langchain<0.4.0,>=0.3.8 (from langchain-community)
  Downloading langchain-0.3.8-py3-none-any.whl.metadata (7.1 kB)
Collecting langchain-core<0.4.0,>=0.3.21 (from langchain-community)
  Downloading langchain_core-0.3.21-py3-none-any.whl.metadata (6.3 kB)
Collecting pydantic-settings<3.0.0,>=2.4.0 (from langchain-community)
  Downloading pydantic_settings-2.6.1-py3-none-any.whl.metadata (3.5 kB)
Collecting marshmallow<4.0.0,>=3.18.0 (from datac

#Step 3: Load the summarization model and embedding model


In [3]:
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



#Step 4: Load and Process the PDF Document


In [4]:
def load_pdf(file_path):
    # Read PDF and extract text
    doc = fitz.open(file_path)
    text = ""
    for page in doc:
        text += page.get_text()
    doc.close()
    return text

# Load and preprocess the PDF content


In [5]:
document_text = load_pdf('/content/RESEARCH_ARTICLE_1.pdf')
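Text extracted from a PDF usually keeps the page's line breaks and end-of-line hyphenation, which can split words across chunk boundaries later on. The following is a minimal sketch of a cleanup step, assuming plain-text input such as the string returned by `load_pdf`. The helper name `clean_extracted_text` is hypothetical and not part of the original notebook.

```python
import re

def clean_extracted_text(text):
    """Normalize whitespace in text extracted from a PDF (illustrative helper)."""
    # Join words that were hyphenated across a line break: "retrie-\nval" -> "retrieval"
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Replace line breaks (and the spaces around them) with a single space
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)
    # Collapse any remaining runs of whitespace
    text = re.sub(r"\s{2,}", " ", text)
    return text.strip()

sample = "retrie-\nval augmented\n generation  works"
cleaned = clean_extracted_text(sample)
```

Note that the hyphen rule also merges genuine compounds that happen to break at a line end (for example "fine-" followed by "tuning"), so it is a trade-off rather than a safe default.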

#Step 5: Split Document into Chunks and Create Embeddings

In [6]:
chunk_size = 500  # Characters per chunk (not tokens); keep chunks within the model's input limit
chunks = [document_text[i:i+chunk_size] for i in range(0, len(document_text), chunk_size)]
chunk_embeddings = embedding_model.encode(chunks)
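Fixed-width slicing can cut a sentence in half at a chunk boundary, so neither neighbouring chunk contains it in full. One common mitigation is to let consecutive chunks overlap. The sketch below assumes character-based windows as in the cell above. The function name `split_into_chunks` and the `overlap` parameter are illustrative and not part of the original notebook.

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into character windows; consecutive windows share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks

example_chunks = split_into_chunks("abcdefghij", chunk_size=4, overlap=1)
```

With `overlap=0` this produces the same chunks as the list comprehension above.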

#Step 6: Set up FAISS Index for efficient retrieval


In [7]:
embedding_dim = chunk_embeddings.shape[1]
index = faiss.IndexFlatL2(embedding_dim)
index.add(chunk_embeddings)
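`IndexFlatL2` performs an exact (brute-force) search and reports squared Euclidean distances. To make that concrete, here is a pure-Python sketch of the same computation on toy 2-D vectors. The function `l2_search` is a hypothetical stand-in for illustration, not FAISS code, and it makes no attempt at FAISS's performance.

```python
def l2_search(vectors, query, k):
    """Return (squared_distances, indices) of the k vectors closest to `query`."""
    scored = []
    for idx, vec in enumerate(vectors):
        # Squared Euclidean distance between the stored vector and the query
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()  # Smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]

# Toy "embeddings": three 2-D vectors
vectors = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
distances, indices = l2_search(vectors, query=[0.9, 0.1], k=2)
```

Here the query lies closest to the second vector, then the first, so `indices` is `[1, 0]`.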

#Step 7: Define retrieval function


In [8]:
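def retrieve_relevant_chunks(query, k=3):
    # Embed the query and look up its k nearest chunks in the FAISS index
    query_embedding = embedding_model.encode([query])
    distances, indices = index.search(query_embedding, k)
    # FAISS pads results with -1 when fewer than k chunks exist; skip those
    return [chunks[i] for i in indices[0] if i != -1]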

#Step 8: Generate SOP response based on retrieved chunks




In [9]:
def generate_sop(query):
    relevant_chunks = retrieve_relevant_chunks(query)
    combined_text = " ".join(relevant_chunks)
    summary = summarizer(combined_text, max_length=150, min_length=50, do_sample=False)
    return summary[0]['summary_text']

#Step 9: Implement the Chatbot Interface


In [10]:
def chatbot():
    print("Welcome to the SOP Chatbot! Type 'exit' to stop.")
    while True:
        user_input = input("User: ")
        if user_input.strip().lower() == 'exit':
            break
        response = generate_sop(user_input)
        print("SOP Bot:", response)

#Step 10: Start the chatbot



In [11]:
chatbot()

Welcome to the SOP Chatbot! Type 'exit' to stop.
User: RAG
SOP Bot: The integration of Retriever-Augmented Generation RAG, fine-tuning, and prompt engineering significantly enhanced a LLM chatbot's performance. RAG offers a middle ground, especially for real-time data retrieval needs.
User: exit


**PDF Loading:** The load_pdf function uses PyMuPDF to extract text from each page in the PDF file.


**Document Chunks and Embeddings:** The PDF content is split into chunks, each embedded using all-MiniLM-L6-v2 for vector-based retrieval.


**FAISS Index Setup:** We use FAISS to store and retrieve chunks based on query similarity, ensuring that only the most relevant sections of the document are used for SOP generation.


**RAG Pipeline:** The generate_sop function retrieves relevant chunks, concatenates them, and then summarizes the combined text.


**Interactive Chatbot:** The chatbot function handles user input, retrieves relevant document parts, and generates a structured SOP response.




#SECOND PART

In [12]:
# Step 1: Install necessary libraries if not already installed
# !pip install PyMuPDF sentence-transformers faiss-cpu transformers
# (time is part of the Python standard library and needs no installation)

import fitz  # PyMuPDF for PDF processing
from sentence_transformers import SentenceTransformer
import faiss
from transformers import pipeline
import time
import numpy as np

In [13]:
# Step 2: Load Models
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

In [14]:
# Step 3: Load and Process PDF
def load_pdf(file_path):
    doc = fitz.open(file_path)
    text = ""
    for page in doc:
        text += page.get_text()
    doc.close()
    return text

In [15]:
# Load and preprocess the document
document_text = load_pdf('/content/RESEARCH_ARTICLE_1.pdf')

In [16]:
# Step 4: Split Document and Create Embeddings
chunk_size = 500  # Characters per chunk (not tokens)
chunks = [document_text[i:i+chunk_size] for i in range(0, len(document_text), chunk_size)]
chunk_embeddings = embedding_model.encode(chunks)

In [17]:
# Step 5: Set up FAISS Index
embedding_dim = chunk_embeddings.shape[1]
index = faiss.IndexFlatL2(embedding_dim)
index.add(chunk_embeddings)

In [18]:
# Step 6: Define retrieval and SOP generation functions with timing
def retrieve_relevant_chunks(query, k=3):
    query_embedding = embedding_model.encode([query])
    distances, indices = index.search(query_embedding, k)
    # FAISS pads results with -1 when fewer than k chunks exist; skip those
    return [chunks[i] for i in indices[0] if i != -1]

In [19]:
def generate_sop(query):
    start_time = time.time()  # Track processing time
    relevant_chunks = retrieve_relevant_chunks(query)
    combined_text = " ".join(relevant_chunks)
    summary = summarizer(combined_text, max_length=150, min_length=50, do_sample=False)
    processing_time = time.time() - start_time  # Calculate time taken
    return summary[0]['summary_text'], processing_time

In [20]:
# Step 7: Performance Metrics Logging
processing_times = []
cost_per_doc = 0.10  # Hypothetical cost per document
total_costs = 0
accuracy_threshold = 0.90

In [21]:
# Define a function to simulate accuracy and track metrics
def evaluate_performance(query, expected_summary):
    global total_costs
    generated_summary, processing_time = generate_sop(query)
    processing_times.append(processing_time)
    total_costs += cost_per_doc

    # Simulated accuracy: share of the expected summary's unique words that also
    # appear in the generated summary (for demonstration purposes)
    expected_words = set(expected_summary.split())
    common_words = set(generated_summary.split()).intersection(expected_words)
    accuracy = len(common_words) / len(expected_words) if expected_words else 0.0

    return generated_summary, processing_time, accuracy
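The overlap score above only asks how much of the expected summary is covered, so a very long generated summary can score well simply by containing many words. A slightly more informative variant reports unigram precision, recall, and F1, in the spirit of ROUGE-1. This is a simplified sketch and not the official ROUGE implementation. The helper name `unigram_overlap` is hypothetical.

```python
from collections import Counter

def unigram_overlap(generated, expected):
    """Return (precision, recall, f1) based on case-insensitive word counts."""
    gen = Counter(generated.lower().split())
    exp = Counter(expected.lower().split())
    # Counter intersection keeps the minimum count of each shared word
    overlap = sum((gen & exp).values())
    precision = overlap / sum(gen.values()) if gen else 0.0
    recall = overlap / sum(exp.values()) if exp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

precision, recall, f1 = unigram_overlap("the cat sat", "the cat sat on the mat")
```

In this example every generated word appears in the expected text (precision 1.0), but only half of the expected words are covered (recall 0.5).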

In [4]:
def chatbot():
    print("Welcome to the SOP Chatbot! Type 'exit' to stop.")
    expected_summary = "Expected SOP summary text here."  # Placeholder for evaluation
    expected_words = set(expected_summary.split())
    baseline_time = 10.0  # Hypothetical manual processing time per document (seconds)
    baseline_cost = 0.15  # Hypothetical manual cost per document (USD)
    results = []  # One (processing_time, generated_summary, accuracy) tuple per query
    while True:
        user_input = input("User: ")
        if user_input.strip().lower() == 'exit':
            break

        generated_summary, processing_time = generate_sop(user_input)

        # Simulated accuracy (replace with your actual accuracy calculation)
        common_words = set(generated_summary.split()).intersection(expected_words)
        accuracy = len(common_words) / len(expected_words) if expected_words else 0  # Handle empty expected summary

        results.append((processing_time, generated_summary, accuracy))

        print("SOP Bot:", generated_summary)
        print(f"Processing Time: {processing_time:.2f} seconds")
        print(f"Estimated Accuracy: {accuracy:.2%}")

    # Final metrics calculation (outside the loop)
    if results:
        avg_processing_time = np.mean([pt for pt, _, _ in results])
        # A reduction is the share of the baseline that was saved, not the ratio to it
        time_reduction = (1 - avg_processing_time / baseline_time) * 100

        # Total cost based on the actual number of queries
        total_costs = len(results) * cost_per_doc
        cost_reduction = (1 - cost_per_doc / baseline_cost) * 100

        print("\n--- Performance Summary ---")
        print(f"Average Processing Time per Document: {avg_processing_time:.2f} seconds")
        print(f"Estimated Time Reduction: {time_reduction:.2f}%")
        print(f"Total Operational Cost: ${total_costs:.2f}")
        print(f"Estimated Cost Reduction: {cost_reduction:.2f}%")

        accuracy_values = [accuracy for _, _, accuracy in results]
        print(f"Accuracy Threshold Achieved: {'Yes' if np.mean(accuracy_values) >= accuracy_threshold else 'No'}")
    else:
        print("No queries processed.")

chatbot()

Welcome to the SOP Chatbot! Type 'exit' to stop.
User: exit
No queries processed.


**Processing Time Calculation:** Each SOP generation records the time taken, and these timings are averaged to give the processing time per document. The time reduction is then computed against a hypothetical baseline (e.g., 10 seconds for manual processing) and compared with the 30% target.


**Cost Calculation:** A hypothetical cost of $0.10 per document is accumulated for each processed query. The reduction percentage compares this automated cost with a hypothetical baseline cost of $0.15 per document.
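A percentage reduction is conventionally the share of the baseline that was saved: (baseline - actual) / baseline × 100. The short sketch below applies that formula to the hypothetical baselines used in this notebook (10 seconds and $0.15 per document). The helper name `percent_reduction` and the 2.5-second average are illustrative assumptions, not measured results.

```python
def percent_reduction(baseline, actual):
    """Percentage by which `actual` falls below `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - actual) / baseline * 100

# Cost: $0.15 baseline vs. $0.10 automated cost per document (roughly a one-third saving)
cost_reduction = percent_reduction(baseline=0.15, actual=0.10)

# Time: 10 s baseline vs. an assumed 2.5 s average processing time
time_reduction = percent_reduction(baseline=10.0, actual=2.5)
```

Under these assumptions the cost reduction clears the 30% target by a small margin, while the time reduction depends entirely on the measured average.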


**Accuracy Calculation:** The accuracy of generated SOPs is estimated from the word overlap between each generated summary and an "expected" summary for the query. This is a rough proxy rather than a true accuracy measure, and the expected summary in the code is a placeholder; the target is an average of at least 90%.


**Performance Summary:** After running the chatbot, metrics are displayed to summarize the reductions in time and cost, and confirm if the accuracy threshold has been met.