<a href="https://colab.research.google.com/github/ARJUN108-verma/Elite_Tech_internship/blob/main/TEXT_SUMMARIZATION_TOOL.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

TEXT SUMMARIZATION TOOL

In [1]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.probability import FreqDist
from collections import defaultdict
import string

In [2]:
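# Download required NLTK data (tokenizer models and the stop-word list).
# Note: newer NLTK releases may additionally need nltk.download('punkt_tab').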
nltk.download('punkt')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

In [3]:
class TextSummarizer:
    def __init__(self):
        self.stop_words = set(stopwords.words('english') + list(string.punctuation))

    def preprocess_text(self, text):
        """Clean and prepare the text for processing"""
        # Tokenize into sentences
        sentences = sent_tokenize(text)

        # Tokenize words for each sentence
        word_sentences = [word_tokenize(sentence.lower()) for sentence in sentences]

        # Remove stopwords and punctuation
        filtered_sentences = []
        for words in word_sentences:
            filtered_words = [word for word in words if word not in self.stop_words]
            filtered_sentences.append(filtered_words)

        return sentences, filtered_sentences

    def calculate_sentence_scores(self, filtered_sentences):
        """Calculate importance scores for each sentence"""
        # Flatten all words to calculate word frequencies
        words = [word for sentence in filtered_sentences for word in sentence]
        word_frequencies = FreqDist(words)

        # Score each sentence as the sum of the frequencies of its words.
        # Every word here was counted above, so no membership check is needed.
        sentence_scores = defaultdict(int)
        for i, sentence in enumerate(filtered_sentences):
            for word in sentence:
                sentence_scores[i] += word_frequencies[word]

        return sentence_scores

    def generate_summary(self, text, summary_length=3):
        """Generate a summary of the input text"""
        # Preprocess the text
        original_sentences, filtered_sentences = self.preprocess_text(text)

        # Calculate sentence importance scores
        sentence_scores = self.calculate_sentence_scores(filtered_sentences)

        # Select top N sentences for the summary
        ranked_sentences = sorted(
            sentence_scores.items(),
            key=lambda x: x[1],
            reverse=True
        )[:summary_length]

        # Sort selected sentences by their original order
        summary_sentences = [original_sentences[i] for i, _ in sorted(ranked_sentences)]

        # Combine sentences into summary
        summary = ' '.join(summary_sentences)
        return summary
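
The scoring idea in the class above (count how often each non-stop word occurs, score a sentence as the sum of its word counts, keep the top sentences in document order) can be checked by hand on a tiny input. The following is a minimal sketch using only the standard library, not the notebook's NLTK pipeline: the short `STOP_WORDS` set, the regex-based `split_sentences`, and the names `content_words` and `summarize` are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK's English list is far longer).
STOP_WORDS = {"the", "a", "an", "is", "are", "of", "and", "to", "in", "on", "at", "it"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize(text, summary_length=2):
    sentences = split_sentences(text)
    tokenized = [content_words(s) for s in sentences]

    # Document-wide word counts, playing the role FreqDist has in the class.
    frequencies = Counter(word for words in tokenized for word in words)

    # A sentence's score is the sum of the frequencies of its content words.
    scores = {i: sum(frequencies[w] for w in words) for i, words in enumerate(tokenized)}

    # Keep the highest-scoring sentences, then restore document order.
    top = sorted(scores, key=scores.get, reverse=True)[:summary_length]
    return " ".join(sentences[i] for i in sorted(top)), scores


sample = (
    "Cats sleep a lot. Cats and dogs are pets. "
    "Dogs bark at night. The weather is mild."
)
summary, scores = summarize(sample, summary_length=2)
print(scores)   # {0: 4, 1: 5, 2: 4, 3: 2}
print(summary)  # Cats sleep a lot. Cats and dogs are pets.
```

Here "cats" and "dogs" each occur twice, so the second sentence scores 2 + 2 + 1 = 5. Sentences 0 and 2 tie at 4, and the stable sort keeps the earlier one. Because scores are plain sums, longer sentences tend to rank higher; dividing by the number of content words would be one way to offset that, at the cost of favoring very short sentences.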


In [4]:
def main():
    print("Text Summarization Tool")
    print("Enter/Paste your content (press Enter then Ctrl+D to finish):")

    # Read multiline input
    contents = []
    while True:
        try:
            line = input()
        except EOFError:
            break
        contents.append(line)

    text = '\n'.join(contents)

    if not text.strip():
        print("No input provided. Exiting.")
        return

    summarizer = TextSummarizer()

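    while True:
        try:
            summary_length = int(input("\nEnter number of sentences for summary (default 3): ") or 3)
            # Zero or negative values would make the slice in generate_summary misbehave
            if summary_length < 1:
                raise ValueError
            break
        except ValueError:
            print("Please enter a valid positive number.")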

    print("\nGenerating summary...\n")
    summary = summarizer.generate_summary(text, summary_length)

    print("=== Summary ===")
    print(summary)

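    # Calculate compression ratio from the number of sentences actually kept;
    # the summary can never contain more sentences than the original text
    original_length = len(sent_tokenize(text))
    if original_length > 0:
        kept = min(summary_length, original_length)
        compression = (1 - (kept / original_length)) * 100
        print(f"\nSummary reduced text by {compression:.1f}% "
              f"({original_length} sentences → {kept} sentences)")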
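
The input loop in `main()` stops only on `EOFError`. In a terminal, Ctrl+D delivers that, but in a notebook front end such as Colab the end-of-file signal may never arrive, which would leave the cell waiting for more input. One possible alternative, sketched below under that assumption, is to also stop on a sentinel line. The helper name `read_multiline`, the `"END"` sentinel, and the injectable `read_line` parameter are hypothetical and not part of the original notebook.

```python
def read_multiline(read_line=input, sentinel="END"):
    """Collect lines until EOF or until a line equal to `sentinel` is entered."""
    lines = []
    while True:
        try:
            line = read_line()
        except EOFError:
            break  # Ctrl+D in a terminal still works as before
        if line.strip() == sentinel:
            break  # explicit end marker for front ends without EOF
        lines.append(line)
    return "\n".join(lines)


# Demonstration with a scripted reader instead of real keyboard input.
scripted = iter(["First paragraph.", "", "Second paragraph.", "END", "ignored"])
text = read_multiline(read_line=lambda: next(scripted))
print(repr(text))  # 'First paragraph.\n\nSecond paragraph.'
```

A sentinel word rather than an empty line is used here so that blank lines between pasted paragraphs are preserved. Passing the reader as a parameter keeps the function testable without a keyboard.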

In [None]:
if __name__ == "__main__":
    main()

Text Summarization Tool
Enter/Paste your content (press Enter then Ctrl+D to finish):
Climate Change: Causes and Consequences  The Earth's climate has changed throughout history. Just in the last 650,000 years, there have been seven cycles of glacial advance and retreat, with the abrupt end of the last ice age about 11,700 years ago marking the beginning of the modern climate era — and of human civilization. Most of these climate changes are attributed to very small variations in Earth's orbit that change the amount of solar energy our planet receives.  However, the current warming trend is of particular significance because it is unequivocally the result of human activity since the mid-20th century and proceeding at a rate that is unprecedented over millennia. The planet's average surface temperature has risen about 1.18 degrees Celsius since the late 19th century, a change driven largely by increased carbon dioxide emissions into the atmosphere and other human activities.  The indust