# Text Summarizer Plugin

## Introduction

This Chrome extension works with a Flask server and an OpenVINO backend to summarize webpages (via URL) and PDFs (via upload). It uses LangChain tools for text splitting and vector store management, and supports follow-up questions on the summarized content.

## How it Works

<img width="1000" alt="image" src="./assets/Text-Summarizer-Overview.png">


## Pre-requisites

### Install the following tools/packages:
   - [Git on Windows](https://git-scm.com/downloads)
   - [Miniforge](https://conda-forge.org/download/)
   - [Google Chrome for Windows](https://www.google.com/chrome/)


### Follow the below steps to prepare the environment

Before converting the models and running the plugin, make sure you have completed the [steps to prepare the environment](./README.md/#prerequisites):
- Clone the Text-Summarizer Plugin repository
- Create the conda environment and install the necessary packages

### Download and Convert the Huggingface Model to OpenVINO IR Format:

#### Login to Huggingface:
Private or gated models, such as Meta Llama, require a Hugging Face access token. Generate a token and request access to the model as described in the [Hugging Face documentation](https://huggingface.co/docs/hub/en/models-gated).

In [None]:
from huggingface_hub import login
login()

#### Converting a huggingface model to OpenVINO
Convert the models using `optimum-cli`

In [None]:
! mkdir models

In [None]:
! optimum-cli export openvino --model Qwen/Qwen2-7B-Instruct --weight-format int4 models/ov_qwen7b

In [None]:
! optimum-cli export openvino --model meta-llama/Llama-2-7b-chat-hf --weight-format int4 models/ov_llama_2

>**Note**: Llama models are hosted in a gated repository, so [raise an access request](https://www.llama.com/llama-downloads) before exporting them.


### Load the extension

To load an unpacked extension in developer mode:
- Go to the Extensions page by entering **chrome://extensions** in a new tab. (By design chrome:// URLs are not linkable.)
    - Alternatively, **click the Extensions menu puzzle button and select Manage Extensions** at the bottom of the menu.
    - Or, click the Chrome menu, hover over More Tools, then select Extensions.
- Enable **Developer Mode** by clicking the toggle switch next to Developer mode.
- Click the **Load unpacked** button and select the extension directory.
- Refer to [Chrome’s development documentation](https://developer.chrome.com/docs/extensions/get-started/tutorial/hello-world#load-unpacked) for further details.

<img src="./assets/load_extension.png" width=250 height=250 >







### Pin the extension
Pin the extension to the toolbar for quick access.

<img src="./assets/pin_extension.png" height=250 width=250>


### Code Sample Structure
The browser plugin code has two parts: a backend folder and an extension folder.
- **Backend** - The backend folder contains two Python files, `code.py` and `server.py`
  - `code.py` manages data pre-processing tasks
  - `server.py` manages the Flask server-side operations
- **Extension** - The extension folder contains the front-end code required for the browser plugin (`popup.html`, `popup.js`, `style.css`, `manifest.json`)


## Backend code for Text Summarization 

### Importing the necessary libraries

In [None]:
from transformers import AutoTokenizer, pipeline
from optimum.intel import OVModelForCausalLM
from langchain_community.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.document_loaders import WebBaseLoader, PyPDFLoader

### Prompt Templates for Summarization & Question Answering Bot
Two prompt templates are declared here for later use: one for summarization and one for the follow-up queries asked in the question answering bot.

In [None]:
#prompt template for summarization
summary_template= """Write a concise summary of the following: "{context}" CONCISE SUMMARY: """
#prompt template for query
query_template="""Use the following pieces of context to answer the question at the end.
    If you don't know the answer, just say that you don't know, don't try to make up an answer.
    Use 10 words maximum and keep the answer as concise as possible in one sentence.
    Always say "thanks for asking!" at the end of the answer.
 
    {context}
 
    Question: {question}
 
    Helpful Answer:"""
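
As a rough illustration of how the placeholders in these templates get filled, the sketch below uses plain `str.format`. LangChain's `PromptTemplate` performs a comparable substitution for its default format. The templates are shortened copies of the ones above, and the sample context and question are made up for the example.

```python
# Shortened copies of the templates above, kept small for illustration.
summary_template = 'Write a concise summary of the following: "{context}" CONCISE SUMMARY: '
query_template = (
    "Use the following pieces of context to answer the question at the end.\n"
    "{context}\n"
    "Question: {question}\n"
    "Helpful Answer:"
)

# Hypothetical inputs, used only to show the substitution.
sample_context = "OpenVINO is a toolkit for optimizing and deploying AI inference."
sample_question = "What is OpenVINO?"

# The summary template needs only the context; the query template needs both values.
summary_prompt = summary_template.format(context=sample_context)
query_prompt = query_template.format(context=sample_context, question=sample_question)

print(summary_prompt)
print(query_prompt)
```

Both prompts end with their marker text (`CONCISE SUMMARY:` and `Helpful Answer:`), which is what the backend functions later search for when trimming the model output.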

### Pre-process the Input File

* Through the browser plugin, users submit a webpage URL or a PDF file for summarization.
* The LangChain document loaders WebBaseLoader and PyPDFLoader load page content from a webpage or PDF. The loaded documents are then used for retrieval during summarization and question answering.
* The loaded page data is split with RecursiveCharacterTextSplitter, which recursively splits text into smaller pieces at the character level, and embeddings are created with HuggingFaceEmbeddings.
* In RAG, embeddings play a crucial role in retrieving the documents relevant to a given query. A Sentence Transformers model generates an embedding for each document chunk in the knowledge base.
* These embeddings are stored in ChromaDB for later retrieval. Chroma is a vector store and embeddings database designed from the ground up to make it easy to build AI applications with embeddings.

In [None]:
def pre_processing(loader):
    """
        This function performs the following steps in order:
        1. Loads page content from the webpage/PDF
        2. Splits the page data using RecursiveCharacterTextSplitter & creates embeddings using HuggingFaceEmbeddings
        3. Stores the embeddings in ChromaDB for later retrieval
        input: loader contains page data from a Webpage/PDF
        output: returns a vectorstore
    """
    try:
        page_data = loader.load()
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=20)
        all_splits = text_splitter.split_documents(page_data)
        embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2")
        global vectorstore
        vectorstore = Chroma.from_documents(documents=all_splits, embedding=embeddings)  
        return vectorstore
    except Exception as e:
        print("Error while processing Webpage/PDF page content\n")
        raise e
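
To make the `chunk_size=1000` and `chunk_overlap=20` settings more concrete, here is a deliberately simplified sketch of splitting with overlap. It uses fixed-size character windows only. `RecursiveCharacterTextSplitter` additionally tries to break on separators such as paragraph and line breaks, which this sketch does not attempt. The function name `split_with_overlap` is hypothetical.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=20):
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# 2500 characters of sample data, split with the settings used above.
sample = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample)
print([len(chunk) for chunk in chunks])
```

The last 20 characters of each chunk reappear at the start of the next one, so a sentence cut at a chunk boundary is still partly visible to the retriever in both chunks.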

### Load LLM models
The module below:
1. Fetches the OpenVINO-converted model & compiles it on the GPU.
2. Generates a HuggingFace pipeline for text generation.
3. Returns the model.

In [None]:
def load_llm(model_id):
    """
        Meta Llama2 & Qwen 7B models are converted to OpenVINO IR Format. This function compiles those converted models on GPU.
        input: user selected model_id from plugin
        output: compiled model with openvino
    """
    if model_id:
        try:
            if model_id=="Meta LLama 2":
                model_path=r"models\ov_llama_2"
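            elif model_id=="Qwen 7B Instruct":
                model_path=r"models\ov_qwen7b"
            else:
                # Fail with a clear message instead of an unbound model_path
                raise ValueError(f"Unsupported model_id: {model_id}")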
            model = OVModelForCausalLM.from_pretrained(model_path , device='GPU')
            tokenizer = AutoTokenizer.from_pretrained(model_path)
            pipe=pipeline(
                "text-generation",
                model=model,
                tokenizer=tokenizer,
                max_new_tokens=4096,  
                device=model.device
            )
            global llm_model 
            llm_model = HuggingFacePipeline(pipeline=pipe)
            return llm_model
        except Exception as e:
            print("Failed to load the model. Please check whether the model_path is correct\n")
            raise e
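
The model selection in `load_llm` can also be expressed as a lookup table. The sketch below is one possible arrangement, not the plugin's actual code: `MODEL_DIRS` and `resolve_model_path` are hypothetical names, and the directory names assume the exported models live under a `models` folder as in the export step. Using `pathlib` avoids hard-coding a Windows path separator.

```python
from pathlib import Path

# Display names sent by the plugin, mapped to the exported model directories.
MODEL_DIRS = {
    "Meta LLama 2": "ov_llama_2",
    "Qwen 7B Instruct": "ov_qwen7b",
}


def resolve_model_path(model_id, models_root="models"):
    """Return the model directory for `model_id`, or raise ValueError if unknown."""
    try:
        return Path(models_root) / MODEL_DIRS[model_id]
    except KeyError:
        raise ValueError(f"Unsupported model_id: {model_id!r}") from None


print(resolve_model_path("Qwen 7B Instruct").as_posix())
```

An unknown `model_id` then produces an explicit error rather than a failure later, when the model is loaded.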

###  URL Summarization
For URL summarization, when the end user enters a URL into the plugin, **WebBaseLoader** loads the web page content, which is then passed into the RetrievalQA chain. The chain is queried with a summarization request and returns a concise summary.
* **WebBaseLoader** is a LangChain document loader designed to load documents from the web. It is used when the documents for retrieval are not stored locally or in a Hugging Face dataset, but are instead located on the web.
* **RetrievalQA** is a type of question answering system that uses a retriever to fetch relevant documents given a question, and then uses a reader to extract the answer from the retrieved documents.

The module below:
1. Loads the page data from a webpage using WebBaseLoader.
2. Pre-processes the data & stores it in a vector store.
3. Passes the prompt, vectorstore & LLM model into the chain & returns the summary to the plugin.

In [None]:
def pre_process_url_data(urls):
    """
        When an end user pastes a URL into the plugin, the page data is loaded and passed to the RetrievalQA chain,
        and the output is returned to the plugin.
        input: Webpage URL(str).
        output: Glance Summary of the fetched URL.
    """
    try:
        loader = WebBaseLoader(urls)
        global summ_vectorstore 
        summ_vectorstore = pre_processing(loader) # Common Helper function for processing data.
        prompt = PromptTemplate(
            template=summary_template,
            input_variables=["context"]  # summary_template only uses {context}
        )
    
        qa_chain = RetrievalQA.from_chain_type(
            llm=llm_model,
            retriever=summ_vectorstore.as_retriever(),
            chain_type="stuff",
            chain_type_kwargs={"prompt": prompt},
            return_source_documents=False,
        )
        
        question = "Please summarize the entire book in one paragraph of 100 words"
        summary = qa_chain(question)
        response = summary['result']
        summary_start = response.find("CONCISE SUMMARY:")
        concise_summary = response[summary_start + len("CONCISE SUMMARY:"):].strip()
        return concise_summary
    except Exception as e:
        print("Failed to summarize webpage\n")
        raise e
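
One detail of the marker-based trimming is worth a closer look: `str.find` returns `-1` when the marker is absent, so `response[summary_start + len(marker):]` silently drops the first `len(marker) - 1` characters of the response. The sketch below shows that behavior and one possible alternative based on `str.partition`. The helper name `extract_after_marker` and the sample strings are hypothetical.

```python
def extract_after_marker(response, marker):
    """Return the text after the first `marker`, or the whole response if it is absent."""
    head, found, tail = response.partition(marker)
    return tail.strip() if found else head.strip()


marker = "CONCISE SUMMARY:"
with_marker = 'Write a concise summary of the following: "page text" CONCISE SUMMARY: Flask serves the plugin.'
without_marker = "Flask serves the plugin."

# find() returns -1 here, so the slice starts at len(marker) - 1 and loses text.
naive = without_marker[without_marker.find(marker) + len(marker):].strip()
print(repr(naive))

print(extract_after_marker(with_marker, marker))
print(extract_after_marker(without_marker, marker))
```

With the helper, both sample inputs produce the full sentence, whether or not the model output contains the marker.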

### URL Question Answering BOT
The module below:
1. Receives follow-up questions related to the summary after URL summarization.
2. Builds the prompt from the global **query template** and passes it, together with the vectorstore and the compiled model, to the chain.
3. When a question is passed to the RetrievalQA chain, the chain returns a precise answer in a single sentence.


In [None]:
def qa_on_url_summarized_text(query):
    """
        This function takes the user's follow-up question after URL summarization, searches the vectorstore for relevant context & returns an answer of at most 10 words.
        input: user's follow-up question(str)
        output: Answer to the conversations.
    """
    try:
        prompt = PromptTemplate(
            template=query_template,
            input_variables=["context", "question"]
            )
        reduce_chain = RetrievalQA.from_chain_type(
                llm=llm_model,
                retriever=summ_vectorstore.as_retriever(),
                chain_type="stuff",
                chain_type_kwargs={"prompt": prompt},
                return_source_documents=False
            )
        summary = reduce_chain({'query': query})
        response = summary['result']
        summary_start = response.find("Helpful Answer:")
        concise_summary = response[summary_start + len("Helpful Answer:"):].strip()
        return concise_summary
    except Exception as e:
        print("Error in Webpage Summarizer QA BoT\n")
        raise e
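
Conceptually, a retrieval chain with `chain_type="stuff"` retrieves the chunks most relevant to the question and places them all into the `{context}` slot of a single prompt. The toy sketch below imitates that flow with the standard library only. It ranks chunks by shared words instead of embedding similarity, so it is an analogy rather than a description of how LangChain or Chroma score documents. All function names and sample texts are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, chunks, k=2):
    """Rank chunks by the number of words shared with the question (toy scoring)."""
    question_words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(question_words & tokenize(c)), reverse=True)
    return ranked[:k]


def build_stuff_prompt(question, chunks, template):
    """Join the retrieved chunks and place them in the prompt ("stuff" strategy)."""
    context = "\n\n".join(retrieve(question, chunks))
    return template.format(context=context, question=question)


chunks = [
    "Flask is a lightweight web framework for Python.",
    "OpenVINO accelerates inference on Intel hardware.",
    "Chroma stores embeddings for similarity search.",
]
template = "{context}\n\nQuestion: {question}\nHelpful Answer:"
question = "What does OpenVINO accelerate?"
prompt = build_stuff_prompt(question, chunks, template)
print(prompt)
```

Only the top `k` chunks reach the model, which keeps the prompt short even when the source page or PDF is long.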

### PDF Summarization

When an end user uploads a PDF file to the plugin, the page data is loaded using **PyPDFLoader** and passed into the RetrievalQA chain to generate a concise summary. When a question is asked after summarization, a precise answer is returned.

* **PyPDFLoader** is a document loader within the LangChain framework specifically designed to handle PDF files. It allows you to extract text from PDF documents and load them into a format suitable for language models and other text-based applications.

The module below:
1. Loads the page data from a PDF using PyPDFLoader.
2. Pre-processes the data & stores it in a vector store.
3. Passes the prompt, vectorstore & LLM model into the chain & returns the summary to the plugin.

In [None]:
def pre_process_pdf_data(pdf):
    """
        When an end user uploads a PDF into the plugin, the page data is loaded and passed to the RetrievalQA chain,
        and the output is returned to the plugin.
        input: PDF path(str).
        output: Glance Summary of the uploaded PDF.
    """
    try:
        loader = PyPDFLoader(pdf, extract_images=False)
        global pdf_vectorstore
        pdf_vectorstore=pre_processing(loader)
    
        prompt = PromptTemplate(
            template=summary_template,
            input_variables=["context"]  # summary_template only uses {context}
        )
        reduce_chain = RetrievalQA.from_chain_type(
            llm=llm_model,
            retriever=pdf_vectorstore.as_retriever(),
            chain_type="stuff",
            chain_type_kwargs={"prompt": prompt},
            return_source_documents=False,
        )
        question = "Please summarize the entire book in 100 words."
        summary = reduce_chain({'query': question})

        response = summary['result']
        summary_start = response.find("CONCISE SUMMARY:")
        concise_summary = response[summary_start + len("CONCISE SUMMARY:"):].strip()
        return concise_summary
    except Exception as e:
        print("Failed to summarize PDF \n")
        raise e


### PDF Question Answering BOT
The module below:
1. Receives follow-up questions from end users about the summary after PDF summarization.
2. Builds the prompt from the global **query template** and passes it, together with the vectorstore and the compiled model, to the chain.
3. When a question is passed to the RetrievalQA chain, the chain returns a precise answer in a single sentence.

In [None]:
def qa_on_pdf_summarized_text(query):
    """
        This function takes the user's follow-up question after PDF summarization, searches the vectorstore for relevant context & returns an answer of at most 10 words.
        input: user's follow-up question(str)
        output: Answer to the conversations.
    """
    try:
        prompt = PromptTemplate(
            template=query_template,
            input_variables=["context", "question"]
            )
        reduce_chain = RetrievalQA.from_chain_type(
                llm=llm_model,
                retriever=pdf_vectorstore.as_retriever(),
                chain_type="stuff",
                chain_type_kwargs={"prompt": prompt},
                return_source_documents=False
            )
        summary = reduce_chain({'query': query})
        response = summary['result']
        summary_start = response.find("Helpful Answer:")
        concise_summary = response[summary_start + len("Helpful Answer:"):].strip()
        return concise_summary
    except Exception as e:
        print("Error in PDF Summarizer QA BoT")
        raise e


## Server-side code

### Importing necessary packages

In [None]:
import time
from flask import Flask, Response, request, jsonify
from flask_cors import CORS
import tempfile
import chromadb

### Initializing the flask app and enabling CORS
Here we initialize the Flask app and enable CORS, which allows the app to be accessed from other origins such as the browser extension. `ALLOWED_EXTENSIONS` lists the file types that may be uploaded to the application.


In [None]:
app = Flask(__name__)
CORS(app)  # This will enable CORS for all routes
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}
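
The routes in this section do not show how `ALLOWED_EXTENSIONS` is checked. A common pattern for this kind of check is a small helper like the one sketched below. The name `allowed_file` is an assumption and is not taken from the plugin code.

```python
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}


def allowed_file(filename):
    """Return True when the filename has an extension listed in ALLOWED_EXTENSIONS."""
    if "." not in filename:
        return False
    # Compare case-insensitively so that "report.PDF" is accepted as well.
    extension = filename.rsplit(".", 1)[1].lower()
    return extension in ALLOWED_EXTENSIONS


print(allowed_file("report.PDF"), allowed_file("script.exe"), allowed_file("README"))
```

A filename check like this only inspects the name. It complements, but does not replace, the content type check done in the upload route.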

### Model Selection
The below module:
1. Fetches the model selected by the end-user through model_id.
2. Loads the model which would further trigger the model compilation function present in the main code for summarization.


In [None]:
@app.route('/select-model', methods=['POST'])
def select_model():
    """
        Model selection function which would further trigger Model compilation function.
    """
    try:
        global current_model
        data = request.get_json()
        model_id = data.get('model_id')
        current_model = load_llm(model_id)
        return jsonify({'message': f'Model {model_id} loaded successfully.'}), 200
    
    except Exception:
        return jsonify({'message': f'Failed to load model'}), 500
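
For reference, a client could call this route as sketched below. The request is only built, not sent, so the snippet runs without a server. The base URL `http://127.0.0.1:5000` is an assumption based on the port used when the app is started, and `build_select_model_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_select_model_request(model_id, base_url="http://127.0.0.1:5000"):
    """Build (but do not send) a JSON POST request for the /select-model route."""
    payload = json.dumps({"model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/select-model",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_select_model_request("Qwen 7B Instruct")
print(req.get_method(), req.full_url)

# With the Flask server running, the request could be sent like this:
#     with urllib.request.urlopen(req) as response:
#         print(json.load(response)["message"])
```

The `Content-Type: application/json` header matters because the route reads the body with `request.get_json()`.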

### Yielding Summary onto the plugin
This generator function streams the response content to the plugin chunk by chunk.

In [None]:
def stream_output(process_function, *args):
    """
        Generator function to stream output from a process function.
    """
    try:
        for chunk in process_function(*args):
            if chunk is not None:
                yield f"{chunk}"
    except Exception:
        yield f"Error while streaming output"
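
Since the summarization functions return a single string, iterating over their result yields one character at a time. The sketch below reproduces the generator with two hypothetical stand-in functions to show the difference between returning a string and yielding larger chunks.

```python
def stream_output(process_function, *args):
    """Same shape as the generator above: yield each item the function produces."""
    for chunk in process_function(*args):
        if chunk is not None:
            yield f"{chunk}"


def fake_summary_as_string(url):
    """Hypothetical stand-in that returns the whole summary as one string."""
    return "Short summary."


def fake_summary_as_chunks(url):
    """Hypothetical stand-in that yields the summary word by word."""
    for word in "Short summary.".split():
        yield word + " "


# Iterating over a string produces single characters.
print(list(stream_output(fake_summary_as_string, "https://example.com")))
# Iterating over a generator produces whatever chunks it yields.
print(list(stream_output(fake_summary_as_chunks, "https://example.com")))
```

Either form works with a streamed `Response`. The second simply sends fewer, larger pieces.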


### URL processing code
This route reads the URL entered by the user in the plugin and triggers the URL summarization function from the main code.

In [None]:
@app.route('/process-url', methods=['POST'])
def process_url():
    """
        Fetches URL from the plugin & triggers the URL summarization function.
    """
    try:
        data = request.get_json()
        url = data.get('url')  
        if not url:
            return jsonify({'message': 'No URL provided'}), 400
        chromadb.api.client.SharedSystemClient.clear_system_cache()
        return Response(stream_output(pre_process_url_data, [url]), content_type='text/event-stream')
    
    except Exception: 
        return jsonify({'message': 'Error while processing URL'}), 500


### PDF processing code
This route takes the PDF uploaded by the user and triggers the PDF summarization function from the main code.

In [None]:
@app.route('/upload-pdf', methods=['POST'])
def upload_pdf():
    """
        Once the PDF is uploaded, the PDF summarization function is triggered.
    """
   
    pdf_file = request.files['pdf'] 
    if pdf_file and pdf_file.content_type == 'application/pdf':
        try:
            with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_pdf:
                pdf_file.save(temp_pdf.name)
                temp_pdf_path = temp_pdf.name
                print(temp_pdf_path)
           
            chromadb.api.client.SharedSystemClient.clear_system_cache()
            return Response(stream_output(pre_process_pdf_data, temp_pdf_path), content_type='text/event-stream')
 
        except Exception:
            return jsonify({"message": "Error processing PDF"}), 500
 
    else:
        return jsonify({"message": "Invalid file type. Please upload a PDF."}), 400
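
Because the temporary file is created with `delete=False`, it stays on disk until something removes it. The sketch below shows one way to pair the temporary file with cleanup, using hypothetical helper names and a stand-in processing step instead of the real summarization call.

```python
import os
import tempfile


def save_to_temp_pdf(data):
    """Write bytes to a named temporary .pdf file and return its path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_pdf:
        temp_pdf.write(data)
        return temp_pdf.name


def process_and_clean_up(data, process_function):
    """Run `process_function` on the temporary file, then always remove the file."""
    temp_pdf_path = save_to_temp_pdf(data)
    try:
        return process_function(temp_pdf_path)
    finally:
        os.remove(temp_pdf_path)


# Hypothetical processing step: report the file size instead of summarizing.
seen_paths = []


def file_size(path):
    seen_paths.append(path)
    return os.path.getsize(path)


result = process_and_clean_up(b"%PDF-1.4 example bytes", file_size)
print(result, os.path.exists(seen_paths[0]))
```

With a streamed response the file has to outlive the request handler, so in that case the cleanup would need to run after the generator has finished rather than in the route itself.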
 

### PDF Query code for Question Answering Bot
Once the PDF summarization is done, the user can ask a follow-up question. This route passes the query to the PDF question answering function from the main code.

In [None]:
@app.route('/your_query_pdf', methods=['POST'])
def pdf_process_query():
    """
        This function triggers the PDF Question Answering Bot
    """
    try:
        data = request.get_json()
        query = data.get('query') if data else None
        if not query:
            return jsonify({'message': 'No query provided'}), 400
        response_message=str(qa_on_pdf_summarized_text(query))
        return jsonify({'message': response_message}), 200
    except Exception:
        return jsonify({'message': 'Error in PDF QA Bot'}), 500


### URL Query code for Question Answering Bot
Once the URL summarization is done, the user can ask a follow-up question. This route passes the query to the URL question answering function from the main code.

In [None]:
@app.route('/your_query_url', methods=['POST'])
def url_process_query():
    """
        This function triggers the URL question answering Bot
    """
    try:
        data = request.get_json()
        query = data.get('query') if data else None
        if not query:
            return jsonify({'message': 'No query provided'}), 400
        response_message=str(qa_on_url_summarized_text(query))
        return jsonify({'message': response_message}), 200
    except Exception:
        return jsonify({'message': 'Error in URL QA Bot'}), 500


### Calling the main function
This code snippet ensures that the Flask development server starts only when the application is run directly.

In [None]:
if __name__ == "__main__":
    app.run(port=5000)