# **Text Summarization using NLP**


**What is text summarization?**

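Text summarization is the process of extracting the most important information from a source text to produce a shorter version of it. This notebook takes an extractive approach: it scores each sentence by the frequency of the words it contains and keeps the highest-scoring sentences.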

**Why automatic text summarization?**



1.   Summaries reduce reading time.
2.   When researching documents, summaries make the selection process easier.
3.   Automatic summarization improves the effectiveness of indexing.
4.   Automatic summarization algorithms tend to be less biased than human summarizers.
5.   Personalized summaries are useful in question-answering systems because they provide personalized information.
6.   Using automatic or semi-automatic summarization systems enables commercial abstract services to increase the number of text documents they are able to process.

**How to do text summarization**


*   Text cleaning
*   Sentence tokenization
*   Word tokenization
*   Word-frequency table
*   Summarization 
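
The five steps above can be sketched without any GUI or NLP library. This is a minimal illustration, not the notebook's implementation: the `summarize` function, its `ratio` parameter, the tiny `STOP_WORDS` set, and the regex-based tokenization are all assumptions made for the example (the notebook itself uses spaCy for tokenization and stop words).

```python
import re
from collections import Counter
from heapq import nlargest

# Hypothetical, deliberately tiny stop-word list (illustration only)
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "that", "the", "this", "to"}


def content_words(sentence: str) -> list:
    """Word tokenization: lowercase alphabetic tokens minus stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text: str, ratio: float = 0.3) -> str:
    # 1. Text cleaning: collapse runs of whitespace
    cleaned = re.sub(r"\s+", " ", text).strip()
    # 2. Sentence tokenization: naive split after '.', '!' or '?'
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", cleaned) if s]
    # 3 + 4. Word tokenization and word-frequency table
    counts = Counter(w for s in sentences for w in content_words(s))
    if not counts:
        return ""
    top = max(counts.values())
    weights = {w: c / top for w, c in counts.items()}  # most frequent word -> 1.0
    # 5. Summarization: score sentences, keep the best ones in reading order
    scores = {i: sum(weights[w] for w in content_words(s)) for i, s in enumerate(sentences)}
    keep = max(1, int(len(sentences) * ratio))
    best = nlargest(keep, scores, key=scores.get)
    return " ".join(sentences[i] for i in sorted(best))


print(summarize("Cats sleep a lot. Cats chase mice and cats climb trees. Dogs bark."))
```

With three sentences and `ratio=0.3`, a single sentence is kept: the one that mentions the most frequent word most often.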

**Importing Libraries**

In [11]:
import tkinter as tk
from tkinter import scrolledtext
from tkinter import messagebox
from heapq import nlargest
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

**Load the Stop Words and Language Model**

In [12]:
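# STOP_WORDS is already a set; keeping it as a set makes membership tests fast
stopwords = set(STOP_WORDS)
# Requires the model to be installed: python -m spacy download en_core_web_sm
nlp = spacy.load('en_core_web_sm')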
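
The summarizer in this notebook filters tokens with an `in punctuation` check. Because `string.punctuation` is a plain string, that check is a substring test, so a multi-character token such as `...` is not recognized as punctuation. If that matters for your input, a stricter check could look like the sketch below. The helper name `is_punctuation` is hypothetical; spaCy tokens also expose an `is_punct` attribute that serves a similar purpose.

```python
from string import punctuation


def is_punctuation(token_text: str) -> bool:
    """True when a non-empty token consists only of ASCII punctuation characters."""
    return bool(token_text) and all(ch in punctuation for ch in token_text)


print("," in punctuation)      # True
print("..." in punctuation)    # False: "..." is not a substring of string.punctuation
print(is_punctuation("..."))   # True
print(is_punctuation("e.g."))  # False: contains letters
```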

**Function to Summarize the Input and Display the Result**

In [13]:
def summarize_text():
    # "end-1c" skips the trailing newline that the Text widget always appends
    text = input_textbox.get("1.0", "end-1c").strip()
    if not text:
        messagebox.showwarning("Warning", "Please enter some text to summarize.")
        return
    doc = nlp(text)

    # Count lowercased words, skipping stop words, punctuation, and whitespace tokens
    word_frequencies = {}
    for token in doc:
        word = token.text.lower()
        if word in stopwords or word in punctuation or token.is_space:
            continue
        word_frequencies[word] = word_frequencies.get(word, 0) + 1

    if not word_frequencies:
        messagebox.showwarning("Warning", "The text contains no words to score.")
        return

    # Normalize so that the most frequent word has a weight of 1.0
    max_frequency = max(word_frequencies.values())
    for word in word_frequencies:
        word_frequencies[word] = word_frequencies[word] / max_frequency

    # Score each sentence as the sum of the weights of its words
    sentence_tokens = list(doc.sents)
    sentence_scores = {}
    for sent in sentence_tokens:
        for token in sent:
            word = token.text.lower()
            if word in word_frequencies:
                sentence_scores[sent] = sentence_scores.get(sent, 0) + word_frequencies[word]

    # Keep roughly 30% of the sentences, but always at least one
    select_length = max(1, int(len(sentence_tokens) * 0.3))
    summary = nlargest(select_length, sentence_scores, key=sentence_scores.get)
    final_summary = ' '.join(sent.text.strip() for sent in summary)

    output_textbox.delete("1.0", tk.END)
    output_textbox.insert(tk.END, final_summary)
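
One detail of the selection step is worth seeing in isolation: `nlargest(n, scores, key=scores.get)` iterates over the dictionary's keys and returns the `n` keys with the highest values, ordered from highest to lowest score. Joining them directly therefore yields sentences in ranked order, not reading order. The toy data below is made up for illustration, and sorting the selected indices is only one possible way to restore the original order.

```python
from heapq import nlargest

# Made-up sentences and scores, keyed by sentence position
sentences = ["First point.", "Minor aside.", "Key finding.", "Closing remark."]
scores = {0: 1.2, 1: 0.4, 2: 2.5, 3: 0.9}

ranked = nlargest(2, scores, key=scores.get)  # keys with the two highest scores
in_order = sorted(ranked)                     # same keys, back in reading order

print(ranked)                                    # [2, 0]
print(" ".join(sentences[i] for i in ranked))    # Key finding. First point.
print(" ".join(sentences[i] for i in in_order))  # First point. Key finding.
```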

**Function to Copy Output Text**

In [14]:
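import pyperclip  # third-party package: pip install pyperclip

def copy_to_clipboard():
    # "end-1c" skips the trailing newline that the Text widget always appends
    summary = output_textbox.get("1.0", "end-1c")
    pyperclip.copy(summary)
    messagebox.showinfo("Info", "Summary copied to clipboard!")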
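
If adding the third-party `pyperclip` dependency is undesirable, Tk widgets provide their own `clipboard_clear` and `clipboard_append` methods. The sketch below is an alternative, not what this notebook uses; the helper name `copy_with_tk` is hypothetical, and it assumes it is passed the `root` window created later in the notebook.

```python
def copy_with_tk(widget, text: str) -> None:
    """Replace the clipboard contents with `text` using Tk's built-in clipboard."""
    widget.clipboard_clear()       # drop whatever is currently on the clipboard
    widget.clipboard_append(text)  # then add the new text


# Possible use inside the notebook, once `root` and `output_textbox` exist:
# copy_with_tk(root, output_textbox.get("1.0", "end-1c"))
```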

**Main Window**

In [15]:
root = tk.Tk()
root.title("Text Summarization using NLP")
root.geometry("800x600")

**Input Box**

In [16]:
input_label = tk.Label(root, text="Enter Text to Summarize:", font=("Arial", 14))
input_label.pack(pady=10)
input_textbox = scrolledtext.ScrolledText(root, height=10, width=50, font=("Arial", 12))
input_textbox.pack(pady=10)

**Summarize Button**

In [17]:
summarize_button = tk.Button(root, text="Summarize", command=summarize_text, font=("Arial", 12), bg="green", fg="white")
summarize_button.pack(pady=10)

**Output Box**

In [18]:
output_label = tk.Label(root, text="Summarized Text:", font=("Arial", 14))
output_label.pack(pady=10)
output_textbox = scrolledtext.ScrolledText(root, height=10, width=50, font=("Arial", 12))
output_textbox.pack(pady=10)

**Copy Button**

In [19]:
copy_button = tk.Button(root, text="Copy to Clipboard", command=copy_to_clipboard, font=("Arial", 12), bg="green", fg="white")
copy_button.pack(pady=10)

**Main Loop**

In [20]:
root.mainloop()