**Making the UI using Gradio**

Gradio is an open-source Python package for quickly building a demo or web application around a machine learning model, an API, or any arbitrary Python function. The UI is built in six steps:

1. Loading the saved model for abstractive summarization
2. Defining function for abstractive summarization
3. Defining function for extractive summarization (rule based approach)
4. Defining the function to choose between the type of summarization
5. Defining the Gradio interface
6. Launching the interface using a public URL

**Installing required libraries**

In [18]:
!pip install gradio "transformers[sentencepiece]" sentence-transformers nltk

**Loading the Abstractive Model**

In [19]:
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, pipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

model_ckpt = "/content/drive/MyDrive/model"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model_pegasus = AutoModelForSeq2SeqLM.from_pretrained(model_ckpt).to(device)

**Abstractive summarization function**

In [20]:
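# Build the pipeline once so it is not re-created on every request.
pipe = pipeline("summarization", model=model_pegasus, tokenizer=tokenizer, device=0 if device == "cuda" else -1)

def abstractive_summarization(text):
    """Generate an abstractive summary with the saved model."""
    # truncation=True clips inputs longer than the model's maximum length.
    summary = pipe(text, truncation=True)[0]["summary_text"]
    return summary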
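
Sequence-to-sequence checkpoints accept only a limited number of input tokens, so a long document may need to be split before it is passed to `abstractive_summarization`. The sketch below is one possible approach and is not part of the original notebook: `chunk_text` and `max_words` are hypothetical names, the sentence split is a naive regular expression, and word count is only a rough stand-in for token count.

```python
import re

def chunk_text(text, max_words=300):
    """Split text into chunks of roughly max_words words, breaking at sentence ends."""
    # Naive sentence split on ., ! or ? followed by whitespace (assumes plain prose).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        # A single sentence longer than max_words still becomes its own chunk.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be summarized separately, for example `[abstractive_summarization(c) for c in chunk_text(long_text)]`, and the partial summaries joined.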

**Extractive summarization function**

In [21]:
import nltk
import numpy as np
import pandas as pd
from sentence_transformers import SentenceTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from nltk.cluster import KMeansClusterer
from scipy.spatial import distance_matrix

nltk.download('punkt')
model = SentenceTransformer('stsb-roberta-base')

def extractive_summarization(text):
    sentences = nltk.sent_tokenize(text)
    sentences = [sentence.strip() for sentence in sentences]
    df = pd.DataFrame(sentences, columns=['sentences'])

    vectorizer = TfidfVectorizer()
    tfidf_matrix = vectorizer.fit_transform(df['sentences'])
    # Sum the TF-IDF weights per sentence and flatten the sparse result to a 1-D array.
    tfidf_scores = np.asarray(tfidf_matrix.sum(axis=1)).ravel()
    normalized_tfidf_scores = tfidf_scores / tfidf_scores.sum()
    df['tfidf_score'] = normalized_tfidf_scores

    def get_sent_embeddings(sent):
        embeddings = model.encode([sent])
        return embeddings[0]

    df['embeddings'] = df['sentences'].apply(get_sent_embeddings)

    # Roughly one cluster per three sentences, but never fewer than one.
    n_clusters = max(1, len(df) // 3)
    iterations = 25
    X = np.array(df['embeddings'].tolist())

    kcluster = KMeansClusterer(n_clusters, distance=nltk.cluster.util.cosine_distance, repeats=iterations, avoid_empty_clusters=True)
    assigned_clusters = kcluster.cluster(X, assign_clusters=True)
    df['Cluster'] = assigned_clusters
    df['Centroid'] = df['Cluster'].apply(lambda x: kcluster.means()[x])

    def distance_from_centroid(row):
        dist_matrix = distance_matrix([row['embeddings']], [row['Centroid'].tolist()])[0][0]
        return dist_matrix

    df['distance_from_centroid'] = df.apply(distance_from_centroid, axis=1)
    df['combined_score'] = df['distance_from_centroid'] * df['tfidf_score']

    sents = df.sort_values(by='combined_score', ascending=True).groupby('Cluster').head(1)['sentences'].tolist()
    summary = ' '.join(sents)
    return summary

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
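
The per-cluster selection step is easier to follow on toy data. The sketch below is an illustration only: it uses hand-made 2-D vectors and cluster labels in place of real sentence embeddings and k-means output, and the helper name `closest_to_centroid` is hypothetical. It keeps the sentence nearest to each cluster centroid, which is the distance part of the scoring above; the TF-IDF weighting is left out, and the result is returned in document order rather than score order.

```python
import math
from collections import defaultdict

def closest_to_centroid(sentences, vectors, labels):
    """Return, for each cluster, the sentence whose vector is nearest the cluster mean."""
    # Group sentence indices by their cluster label.
    groups = defaultdict(list)
    for idx, label in enumerate(labels):
        groups[label].append(idx)

    picked = []
    for label in sorted(groups):
        members = groups[label]
        dim = len(vectors[members[0]])
        # Centroid = component-wise mean of the member vectors.
        centroid = [sum(vectors[i][d] for i in members) / len(members) for d in range(dim)]
        # Euclidean distance, the same metric scipy's distance_matrix uses by default.
        best = min(members, key=lambda i: math.dist(vectors[i], centroid))
        picked.append(best)

    # Emit the chosen sentences in their original order.
    return [sentences[i] for i in sorted(picked)]

toy_sentences = ["A", "B", "C", "D", "E", "F"]
toy_vectors = [(0, 0), (1, 0), (5, 0), (10, 10), (10, 11), (10, 15)]
toy_labels = [0, 0, 0, 1, 1, 1]
print(closest_to_centroid(toy_sentences, toy_vectors, toy_labels))
```

Here cluster 0 has centroid (2, 0), so "B" is nearest, and cluster 1 has centroid (10, 12), so "E" is nearest.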


**Interface using Gradio**

In [23]:
import gradio as gr

# Function to choose between abstractive and extractive
def summarize(text, summarization_type):
    if summarization_type == "Abstractive":
        summary = abstractive_summarization(text)
    else:
        summary = extractive_summarization(text)
    return summary

# Define the Gradio Interface
iface = gr.Interface(
    fn=summarize,
    inputs=[
        gr.Textbox(lines=5, label="Input Text"),
        # Preselect a default so the type is never None when the user submits.
        gr.Radio(["Abstractive", "Extractive"], value="Abstractive", label="Summarization Type")
    ],
    outputs=[
        gr.Textbox(label="Summary", placeholder="Generated summary will appear here.")
    ],
    title="Text Summarizer",
    description="Choose the type of summarization and input the text.",
)

# Launch the interface
iface.launch(share=True, inline=True)

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Running on public URL: https://66b8dafd039b11fad5.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
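
The routing in `summarize` can be checked without loading any model or launching the UI. The sketch below is a hypothetical variant, not the notebook's code: `make_summarizer`, `fake_abstractive`, and `fake_extractive` are names made up for illustration, and the empty-input message is an added assumption about what the UI should show.

```python
def make_summarizer(abstractive_fn, extractive_fn):
    """Build a summarize(text, summarization_type) function from two back ends."""
    backends = {"Abstractive": abstractive_fn, "Extractive": extractive_fn}

    def summarize(text, summarization_type):
        # Reject empty input early so the caller gets a message, not a stack trace.
        if not text or not text.strip():
            return "Please enter some text to summarize."
        # Mirror the original else-branch: anything but "Abstractive" is extractive.
        fn = backends.get(summarization_type, extractive_fn)
        return fn(text)

    return summarize

# Stand-in summarizers so the routing can be exercised without any model.
def fake_abstractive(text):
    return "ABS:" + text[:10]

def fake_extractive(text):
    return "EXT:" + text.split(".")[0]

summarize_stub = make_summarizer(fake_abstractive, fake_extractive)
print(summarize_stub("Hello world. Bye.", "Abstractive"))
print(summarize_stub("Hello world. Bye.", "Extractive"))
```

With the real functions, `make_summarizer(abstractive_summarization, extractive_summarization)` could be passed as `fn` to `gr.Interface` in place of `summarize`.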


