In [None]:
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize

In [None]:
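# One-time download of tokenizer models and stop-word lists. If a recent NLTK
# release raises LookupError for 'punkt_tab', download that resource as well.
nltk.download('punkt')
nltk.download('stopwords')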

[nltk_data] Downloading package punkt to /home/sandeep/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     /home/sandeep/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [None]:
text = """
    In today’s rapidly evolving business environment, organizations are constantly seeking ways to gain a competitive advantage by improving operational efficiency and making data-driven decisions. One of the major challenges faced by business analysts, like Janani, is dealing with the overwhelming amount of textual information available from various sources such as market reports, industry publications, customer reviews, financial statements, and competitor updates. Analyzing such massive volumes of text manually can be time-consuming and prone to human error. As a result, there is a growing need for automated tools that can quickly process and summarize large datasets into actionable insights.

    Industry trends change at a fast pace, and businesses must stay informed to adapt their strategies accordingly. Analysts need to identify emerging technologies, shifting consumer preferences, regulatory changes, and global market fluctuations. However, manually reading through dozens of lengthy articles every day reduces productivity and delays decision-making. Automated summarization provides a fast and efficient solution by condensing these large documents into short, meaningful summaries that highlight only the most important information.

    Competitor analysis is another crucial area where automated text summarization proves beneficial. Companies frequently release press announcements, product updates, financial results, and marketing campaigns. Tracking these activities manually is nearly impossible on a daily basis. With summarization tools, analysts can instantly extract key competitor insights, enabling businesses to respond quickly with strategic planning, product improvements, or targeted marketing efforts.

    Additionally, the widespread use of social media and online platforms has led to an explosion of user-generated content. Customer feedback, reviews, and public sentiment often contain valuable insights that can shape business decisions. Yet, the sheer volume of this data makes manual review impractical. Automated summarization allows analysts to distill customer opinions, identify recurring issues, and uncover positive trends that can guide product development and enhance customer satisfaction.

    Artificial intelligence and natural language processing technologies play a central role in building effective summarization tools. By leveraging algorithms that analyze word frequency, sentence importance, and contextual meaning, these systems can produce accurate summaries that retain the essence of the original text. This not only saves time for analysts but also ensures consistency and reduces the risk of missing critical information.

    In conclusion, automated text summarization is becoming an essential component of modern business intelligence. It empowers analysts like Janani to navigate the vast amount of textual data efficiently, extract meaningful insights, and make informed decisions quickly. As organizations continue to embrace digital transformation, tools like SummAI will play a vital role in enhancing productivity, improving strategic planning, and maintaining a competitive edge in the marketplace.

"""

In [None]:
# Split the text into word tokens; punctuation marks become separate tokens.
words = word_tokenize(text)

In [None]:
stop_words = set(stopwords.words("english"))

In [None]:
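# Lowercase each token and drop English stop words. Punctuation tokens such as
# ',' and '.' are not in the stop-word list, so they stay in the result.
filtered_words = [w.lower() for w in words if w.lower() not in stop_words]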


In [None]:
print("Filtered Words:")
print(filtered_words)
print("\n")

Filtered Words:
['today', '’', 'rapidly', 'evolving', 'business', 'environment', ',', 'organizations', 'constantly', 'seeking', 'ways', 'gain', 'competitive', 'advantage', 'improving', 'operational', 'efficiency', 'making', 'data-driven', 'decisions', '.', 'one', 'major', 'challenges', 'faced', 'business', 'analysts', ',', 'like', 'janani', ',', 'dealing', 'overwhelming', 'amount', 'textual', 'information', 'available', 'various', 'sources', 'market', 'reports', ',', 'industry', 'publications', ',', 'customer', 'reviews', ',', 'financial', 'statements', ',', 'competitor', 'updates', '.', 'analyzing', 'massive', 'volumes', 'text', 'manually', 'time-consuming', 'prone', 'human', 'error', '.', 'result', ',', 'growing', 'need', 'automated', 'tools', 'quickly', 'process', 'summarize', 'large', 'datasets', 'actionable', 'insights', '.', 'industry', 'trends', 'change', 'fast', 'pace', ',', 'businesses', 'must', 'stay', 'informed', 'adapt', 'strategies', 'accordingly', '.', 'analysts', 'need',
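
The filtered list above still contains punctuation tokens (`','`, `'.'`, `'’'`), which later end up in the frequency table. A minimal sketch of one way to drop them is shown below. The helper `filter_tokens` and the tiny `SAMPLE_STOP_WORDS` set are hypothetical names used only for illustration; they are not part of NLTK, and the notebook would pass its own `stop_words` set instead.

```python
# Hypothetical, tiny stop-word set for illustration only; the notebook uses
# NLTK's full English list instead.
SAMPLE_STOP_WORDS = {"the", "is", "of", "and", "a", "in", "s"}


def filter_tokens(tokens, stop_words):
    """Lowercase tokens, then drop stop words and tokens with no letter or digit."""
    kept = []
    for token in tokens:
        lowered = token.lower()
        if lowered in stop_words:
            continue
        # Keeps hyphenated words such as 'data-driven' but drops ',', '.', '’'.
        if not any(ch.isalnum() for ch in lowered):
            continue
        kept.append(lowered)
    return kept


sample_tokens = ["In", "today", "’", "s", "data-driven", "business", ",", "analysis", "."]
print(filter_tokens(sample_tokens, SAMPLE_STOP_WORDS))
# → ['today', 'data-driven', 'business', 'analysis']
```

Checking for at least one alphanumeric character, rather than requiring the whole token to be alphanumeric, is a design choice that preserves hyphenated terms like `time-consuming`.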

In [None]:
freq_table = {}

In [None]:
# Count how often each filtered token occurs.
for word in filtered_words:
    freq_table[word] = freq_table.get(word, 0) + 1

In [None]:
print("Frequency Table:")
print(freq_table)
print("\n")

Frequency Table:
{'today': 1, '’': 1, 'rapidly': 1, 'evolving': 1, 'business': 4, 'environment': 1, ',': 36, 'organizations': 2, 'constantly': 1, 'seeking': 1, 'ways': 1, 'gain': 1, 'competitive': 2, 'advantage': 1, 'improving': 2, 'operational': 1, 'efficiency': 1, 'making': 1, 'data-driven': 1, 'decisions': 3, '.': 22, 'one': 1, 'major': 1, 'challenges': 1, 'faced': 1, 'analysts': 6, 'like': 3, 'janani': 2, 'dealing': 1, 'overwhelming': 1, 'amount': 2, 'textual': 2, 'information': 3, 'available': 1, 'various': 1, 'sources': 1, 'market': 2, 'reports': 1, 'industry': 2, 'publications': 1, 'customer': 4, 'reviews': 2, 'financial': 2, 'statements': 1, 'competitor': 3, 'updates': 2, 'analyzing': 1, 'massive': 1, 'volumes': 1, 'text': 4, 'manually': 3, 'time-consuming': 1, 'prone': 1, 'human': 1, 'error': 1, 'result': 1, 'growing': 1, 'need': 2, 'automated': 5, 'tools': 4, 'quickly': 3, 'process': 1, 'summarize': 1, 'large': 2, 'datasets': 1, 'actionable': 1, 'insights': 4, 'trends': 2, 'c
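
The same kind of table can be built with `collections.Counter` from the standard library, which also makes it easy to inspect the most frequent tokens. This is a small sketch on made-up sample data (`sample_words` is an assumption, not the notebook's `filtered_words`).

```python
from collections import Counter

# Made-up sample tokens for illustration.
sample_words = ["analysts", "text", "analysts", "tools", "text", "analysts"]

# Counter is a dict subclass, so it supports the same lookups as a plain dict.
sample_freq = Counter(sample_words)

print(sample_freq.most_common(2))
# → [('analysts', 3), ('text', 2)]
```

A `Counter` returns 0 for missing keys instead of raising `KeyError`, which can simplify later scoring code.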

In [None]:
# Split the text into sentences; the string's leading newline stays on the first one.
sentences = sent_tokenize(text)


In [None]:
sentence_value = {}

In [None]:
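# Score each sentence by summing the frequencies of the table entries it contains.
for sentence in sentences:
    lowered = sentence.lower()
    for word, freq in freq_table.items():
        # Substring test, not a whole-word match: 'text' also matches inside
        # 'textual', and punctuation entries such as ',' add their counts too.
        if word in lowered:
            sentence_value[sentence] = sentence_value.get(sentence, 0) + freq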

In [None]:
print("Sentence Scores:")
print(sentence_value)
print("\n")

Sentence Scores:
{'\nIn today’s rapidly evolving business environment, organizations are constantly seeking ways to gain a competitive advantage by improving operational efficiency and making data-driven decisions.': 88, 'One of the major challenges faced by business analysts, like Janani, is dealing with the overwhelming amount of textual information available from various sources such as market reports, industry publications, customer reviews, financial statements, and competitor updates.': 115, 'Analyzing such massive volumes of text manually can be time-consuming and prone to human error.': 40, 'As a result, there is a growing need for automated tools that can quickly process and summarize large datasets into actionable insights.': 86, 'Industry trends change at a fast pace, and businesses must stay informed to adapt their strategies accordingly.': 79, 'Analysts need to identify emerging technologies, shifting consumer preferences, regulatory changes, and global market fluctuations
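
The scores above come from a substring check, so an entry such as `'text'` also counts toward sentences that only contain `'textual'`, and `','` adds its count to every sentence with a comma. A possible alternative, sketched below, matches whole tokens instead. The function `score_sentences` and the regex tokenizer are assumptions made for this example (a dependency-free stand-in for `word_tokenize`), and the sample data is invented.

```python
import re


def score_sentences(sentences, freq_table):
    """Score each sentence by summing frequencies of its distinct whole-word tokens."""
    scores = {}
    for sentence in sentences:
        # Regex stand-in for a tokenizer: runs of letters/digits, optionally hyphenated.
        tokens = re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
        # Each distinct token counts once, mirroring the once-per-entry loop above.
        score = sum(freq_table.get(token, 0) for token in set(tokens))
        if score:
            scores[sentence] = score
    return scores


# Invented sample data for illustration.
sample_freq = {"text": 2, "tools": 1, ",": 5}
sample_sentences = ["Textual data, everywhere.", "Tools process text."]

print(score_sentences(sample_sentences, sample_freq))
# → {'Tools process text.': 3}
```

With the substring check, the first sample sentence would score 7 (2 for `'text'` inside `'textual'` plus 5 for the comma); with whole-token matching it scores nothing and is left out.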

In [None]:
if len(sentence_value) == 0:
    print("No sentences found in the input text.")
else:
    # Use the mean sentence score as the selection threshold.
    avg_score = sum(sentence_value.values()) / len(sentence_value)

    # Keep sentences that score above the mean, preserving their original order.
    summary = ""
    for sentence in sentences:
        if sentence in sentence_value and sentence_value[sentence] > avg_score:
            summary += sentence + " "

    print("------ SUMMARY ------")
    print(summary)

------ SUMMARY ------
One of the major challenges faced by business analysts, like Janani, is dealing with the overwhelming amount of textual information available from various sources such as market reports, industry publications, customer reviews, financial statements, and competitor updates. With summarization tools, analysts can instantly extract key competitor insights, enabling businesses to respond quickly with strategic planning, product improvements, or targeted marketing efforts. Automated summarization allows analysts to distill customer opinions, identify recurring issues, and uncover positive trends that can guide product development and enhance customer satisfaction. It empowers analysts like Janani to navigate the vast amount of textual data efficiently, extract meaningful insights, and make informed decisions quickly. As organizations continue to embrace digital transformation, tools like SummAI will play a vital role in enhancing productivity, improving strategic plann
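
The selection step above can be wrapped in a small function with an adjustable threshold, so the summary length can be tuned without editing the loop. This is only a sketch: the `summarize` helper and its `threshold` parameter are hypothetical names, and the sample sentences and scores are invented.

```python
def summarize(sentences, scores, threshold=1.0):
    """Join, in original order, the sentences scoring above threshold * mean score."""
    if not scores:
        return ""
    avg_score = sum(scores.values()) / len(scores)
    selected = [s for s in sentences if scores.get(s, 0) > threshold * avg_score]
    return " ".join(selected)


# Invented sample data for illustration; the mean score here is 7.
sample_sentences = ["Alpha one.", "Beta two.", "Gamma three."]
sample_scores = {"Alpha one.": 10, "Beta two.": 4, "Gamma three.": 7}

print(summarize(sample_sentences, sample_scores))
# → Alpha one.
print(summarize(sample_sentences, sample_scores, threshold=0.5))
# → Alpha one. Beta two. Gamma three.
```

A threshold above 1.0 gives a shorter summary and a threshold below 1.0 a longer one. Long sentences tend to collect higher raw scores, so dividing each score by the sentence's token count before thresholding is another option worth trying.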