# Automated Text Summarization

In [1]:
text = """Mahendra Singh Dhoni, known for his calm demeanor and strategic acumen, is celebrated as one of the most impactful leaders in cricket history. His remarkable ability to stay composed under pressure and make game-changing decisions has earned him admiration from players and fans alike. Dhoni's leadership style is characterized by a blend of patience and intuition, guiding his team with a quiet confidence that often defies the chaos of high-stakes matches. Beyond his tactical brilliance, he is also recognized for his sportsmanship and dedication to the game, leaving an enduring legacy in the world of cricket. Dhoni, widely known as MS Dhoni, is renowned not only for his leadership but also for his impressive skills as a wicketkeeper-batsman. His career is marked by several significant achievements, including leading the Indian cricket team to victory in major tournaments like the ICC T-twenty World Cup and the ICC Cricket World Cup. Dhoni's journey from a small-town cricketer to an iconic figure in the sport is a testament to his talent and perseverance. Off the field, he is admired for his philanthropic efforts and his role as a mentor to younger players. Dhoni's legacy extends beyond his tactical prowess; it encompasses his ability to inspire and his contributions to both cricket and society."""

In [2]:
len(text)

1313

### `1.` Importing the required libraries

In [3]:
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

In [4]:
stop_words = set(STOP_WORDS)  # a set keeps the membership tests below fast

In [5]:
summarize = spacy.load('en_core_web_sm')  # spaCy's small English pipeline (tokenizer, sentence boundaries)

In [6]:
doc = summarize(text)

In [7]:
tokens = [token.text for token in doc]
print(tokens)

['Mahendra', 'Singh', 'Dhoni', ',', 'known', 'for', 'his', 'calm', 'demeanor', 'and', 'strategic', 'acumen', ',', 'is', 'celebrated', 'as', 'one', 'of', 'the', 'most', 'impactful', 'leaders', 'in', 'cricket', 'history', '.', 'His', 'remarkable', 'ability', 'to', 'stay', 'composed', 'under', 'pressure', 'and', 'make', 'game', '-', 'changing', 'decisions', 'has', 'earned', 'him', 'admiration', 'from', 'players', 'and', 'fans', 'alike', '.', 'Dhoni', "'s", 'leadership', 'style', 'is', 'characterized', 'by', 'a', 'blend', 'of', 'patience', 'and', 'intuition', ',', 'guiding', 'his', 'team', 'with', 'a', 'quiet', 'confidence', 'that', 'often', 'defies', 'the', 'chaos', 'of', 'high', '-', 'stakes', 'matches', '.', 'Beyond', 'his', 'tactical', 'brilliance', ',', 'he', 'is', 'also', 'recognized', 'for', 'his', 'sportsmanship', 'and', 'dedication', 'to', 'the', 'game', ',', 'leaving', 'an', 'enduring', 'legacy', 'in', 'the', 'world', 'of', 'cricket', '.', 'Dhoni', ',', 'widely', 'known', 'as', '

In [8]:
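# string.punctuation already contains '.', '(', ')' and ','; the newline is the only new character.
punctuation = punctuation + '\n' + '.' + '(' + ')' + ','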
punctuation

'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~\n.(),'

### `2.` Text Cleaning and Word Frequencies

In [9]:
freq_counter = {}
for word in tokens:
    key = word.lower()
    # Skip stop words and punctuation; count everything else case-insensitively.
    if key not in stop_words and key not in punctuation:
        freq_counter[key] = freq_counter.get(key, 0) + 1

In [10]:
print(freq_counter)

{'mahendra': 1, 'singh': 1, 'dhoni': 6, 'known': 2, 'calm': 1, 'demeanor': 1, 'strategic': 1, 'acumen': 1, 'celebrated': 1, 'impactful': 1, 'leaders': 1, 'cricket': 5, 'history': 1, 'remarkable': 1, 'ability': 2, 'stay': 1, 'composed': 1, 'pressure': 1, 'game': 2, 'changing': 1, 'decisions': 1, 'earned': 1, 'admiration': 1, 'players': 2, 'fans': 1, 'alike': 1, 'leadership': 2, 'style': 1, 'characterized': 1, 'blend': 1, 'patience': 1, 'intuition': 1, 'guiding': 1, 'team': 2, 'quiet': 1, 'confidence': 1, 'defies': 1, 'chaos': 1, 'high': 1, 'stakes': 1, 'matches': 1, 'tactical': 2, 'brilliance': 1, 'recognized': 1, 'sportsmanship': 1, 'dedication': 1, 'leaving': 1, 'enduring': 1, 'legacy': 2, 'world': 3, 'widely': 1, 'ms': 1, 'renowned': 1, 'impressive': 1, 'skills': 1, 'wicketkeeper': 1, 'batsman': 1, 'career': 1, 'marked': 1, 'significant': 1, 'achievements': 1, 'including': 1, 'leading': 1, 'indian': 1, 'victory': 1, 'major': 1, 'tournaments': 1, 'like': 1, 'icc': 2, 't': 1, 'cup': 2,
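The same tally can also be written with `collections.Counter` from the standard library. The sketch below is a minimal illustration on a toy token list; `count_words`, `toy_tokens`, and `toy_stop_words` are names made up for this example and stand in for the notebook's `tokens` and `stop_words`.

```python
from collections import Counter
from string import punctuation

def count_words(tokens, stop_words):
    """Count lowercased tokens, skipping stop words and punctuation marks."""
    skip = set(punctuation) | {"\n"}  # single characters to ignore
    return Counter(
        tok.lower()
        for tok in tokens
        if tok.lower() not in stop_words and tok.lower() not in skip
    )

# Toy data for illustration only (not the notebook's full token list).
toy_tokens = ["Dhoni", ",", "the", "cricket", "leader", ".",
              "Dhoni", "'s", "cricket", "legacy", "."]
toy_stop_words = {"the", "'s"}

word_counts = count_words(toy_tokens, toy_stop_words)
print(word_counts.most_common(2))
```

Using a set of single characters makes the punctuation check an exact match, whereas `key not in punctuation` on a string is a substring test.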

In [11]:
max(freq_counter.values())

6

In [12]:
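# Caution: max() is re-evaluated on every pass while the values are being
# overwritten, so the divisor shrinks as the loop runs (6, then 5, then 3, ...).
# The scores printed below reflect that and are not all relative to one maximum;
# reading the maximum once before the loop would give a consistent normalization.
for word in freq_counter.keys():
    freq_counter[word] = freq_counter[word] / max(freq_counter.values())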

In [13]:
print(freq_counter)  # score for every word, based on how often the word occurs in the text

{'mahendra': 0.16666666666666666, 'singh': 0.16666666666666666, 'dhoni': 1.0, 'known': 0.4, 'calm': 0.2, 'demeanor': 0.2, 'strategic': 0.2, 'acumen': 0.2, 'celebrated': 0.2, 'impactful': 0.2, 'leaders': 0.2, 'cricket': 1.0, 'history': 0.3333333333333333, 'remarkable': 0.3333333333333333, 'ability': 0.6666666666666666, 'stay': 0.3333333333333333, 'composed': 0.3333333333333333, 'pressure': 0.3333333333333333, 'game': 0.6666666666666666, 'changing': 0.3333333333333333, 'decisions': 0.3333333333333333, 'earned': 0.3333333333333333, 'admiration': 0.3333333333333333, 'players': 0.6666666666666666, 'fans': 0.3333333333333333, 'alike': 0.3333333333333333, 'leadership': 0.6666666666666666, 'style': 0.3333333333333333, 'characterized': 0.3333333333333333, 'blend': 0.3333333333333333, 'patience': 0.3333333333333333, 'intuition': 0.3333333333333333, 'guiding': 0.3333333333333333, 'team': 0.6666666666666666, 'quiet': 0.3333333333333333, 'confidence': 0.3333333333333333, 'defies': 0.333333333333333
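In the printed scores, `'known'` (count 2) comes out as 0.4 even though the largest count is 6, because the divisor is looked up again on each pass after some values have already been scaled. A minimal sketch of normalizing against a single maximum, read once before any value changes, is shown here; `normalize` and `toy_counts` are illustrative names, and the counts are copied from a few entries of the frequency table printed earlier.

```python
def normalize(counts):
    """Return a new dict with every count divided by the largest count."""
    max_count = max(counts.values())  # read once, before any value changes
    return {word: count / max_count for word, count in counts.items()}

# A few counts taken from the frequency table printed earlier.
toy_counts = {"mahendra": 1, "dhoni": 6, "known": 2, "cricket": 5, "world": 3}

normalized = normalize(toy_counts)
print(normalized["dhoni"], normalized["world"])  # 1.0 0.5
```

Building a new dictionary, rather than updating `toy_counts` in place, keeps the raw counts available and makes the result independent of iteration order.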

### `3.` Sentence Tokenization

In [14]:
sentence_tokens = list(doc.sents)  # sentence spans detected by spaCy
print(sentence_tokens)

[Mahendra Singh Dhoni, known for his calm demeanor and strategic acumen, is celebrated as one of the most impactful leaders in cricket history., His remarkable ability to stay composed under pressure and make game-changing decisions has earned him admiration from players and fans alike., Dhoni's leadership style is characterized by a blend of patience and intuition, guiding his team with a quiet confidence that often defies the chaos of high-stakes matches., Beyond his tactical brilliance, he is also recognized for his sportsmanship and dedication to the game, leaving an enduring legacy in the world of cricket., Dhoni, widely known as MS Dhoni, is renowned not only for his leadership but also for his impressive skills as a wicketkeeper-batsman., His career is marked by several significant achievements, including leading the Indian cricket team to victory in major tournaments like the ICC T-twenty World Cup and the ICC Cricket World Cup., Dhoni's journey from a small-town cricketer to a

In [15]:
sentence_score = {}

for sentence in sentence_tokens:
    for word in sentence:
        key = word.text.lower()
        # Add each scored word's normalized frequency to its sentence's running total.
        if key in freq_counter:
            sentence_score[sentence] = sentence_score.get(sentence, 0) + freq_counter[key]

In [16]:
sentence_score

{Mahendra Singh Dhoni, known for his calm demeanor and strategic acumen, is celebrated as one of the most impactful leaders in cricket history.: 4.466666666666668,
 His remarkable ability to stay composed under pressure and make game-changing decisions has earned him admiration from players and fans alike.: 5.333333333333333,
 Dhoni's leadership style is characterized by a blend of patience and intuition, guiding his team with a quiet confidence that often defies the chaos of high-stakes matches.: 6.666666666666665,
 Beyond his tactical brilliance, he is also recognized for his sportsmanship and dedication to the game, leaving an enduring legacy in the world of cricket.: 6.0,
 Dhoni, widely known as MS Dhoni, is renowned not only for his leadership but also for his impressive skills as a wicketkeeper-batsman.: 6.566666666666666,
 His career is marked by several significant achievements, including leading the Indian cricket team to victory in major tournaments like the ICC T-twenty Worl

### `4.` Selecting the top 40% of sentences by score

In [17]:
from heapq import nlargest

In [18]:
len(sentence_score) * 0.4  # 40% of the sentence count; rounded up to a whole number in the next cell

3.6

In [19]:
summary = nlargest(4, sentence_score, key=sentence_score.get)  # n = 4 is 3.6 rounded up
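The value 4 used here is 3.6 rounded up by hand. A small sketch of deriving it from the ratio, so the step keeps working for texts of other lengths, might look like the following; `top_n_count` is a hypothetical helper, and rounding up (rather than down) is an assumption chosen to match the 4 used in this notebook.

```python
import math
from heapq import nlargest

def top_n_count(num_sentences, ratio=0.4):
    """Number of sentences to keep: ratio of the total, rounded up, at least 1."""
    return max(1, math.ceil(num_sentences * ratio))

# Toy scores keyed by sentence label (illustrative values only).
toy_scores = {"s1": 4.5, "s2": 5.3, "s3": 6.7, "s4": 6.0, "s5": 6.6}

n = top_n_count(len(toy_scores))  # ceil(5 * 0.4) = 2
top = nlargest(n, toy_scores, key=toy_scores.get)
print(n, top)  # 2 ['s3', 's5']
```

With nine sentences, as in this notebook, `top_n_count(9)` gives 4.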

### `5.` Get Summary

In [20]:
print(summary)

[His career is marked by several significant achievements, including leading the Indian cricket team to victory in major tournaments like the ICC T-twenty World Cup and the ICC Cricket World Cup., Dhoni's journey from a small-town cricketer to an iconic figure in the sport is a testament to his talent and perseverance., Dhoni's legacy extends beyond his tactical prowess; it encompasses his ability to inspire and his contributions to both cricket and society., Off the field, he is admired for his philanthropic efforts and his role as a mentor to younger players.]


In [21]:
final_summary = [sent.text for sent in summary]  # each item in summary is a sentence span
print(" ".join(final_summary))

His career is marked by several significant achievements, including leading the Indian cricket team to victory in major tournaments like the ICC T-twenty World Cup and the ICC Cricket World Cup. Dhoni's journey from a small-town cricketer to an iconic figure in the sport is a testament to his talent and perseverance. Dhoni's legacy extends beyond his tactical prowess; it encompasses his ability to inspire and his contributions to both cricket and society. Off the field, he is admired for his philanthropic efforts and his role as a mentor to younger players.
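`nlargest` returns sentences from highest to lowest score, so the summary above does not follow the order of the original text (the "legacy" sentence is printed before the "Off the field" sentence that precedes it in the source). If source order is preferred, the selected sentences can be re-sorted by position. The sketch below uses plain strings and list indices; with spaCy `Span` objects, sorting on each sentence's `start` token index should serve the same purpose. `summarize_in_order` and the toy data are illustrative names and values.

```python
from heapq import nlargest

def summarize_in_order(sentences, scores, n):
    """Pick the n highest-scoring sentences, then return them in source order."""
    top_indices = nlargest(n, range(len(sentences)), key=lambda i: scores[i])
    return " ".join(sentences[i] for i in sorted(top_indices))

# Toy sentences and scores (illustrative values only).
toy_sentences = ["First point.", "Minor aside.", "Key result.", "Closing note."]
toy_sentence_scores = [2.0, 0.5, 3.0, 1.0]

print(summarize_in_order(toy_sentences, toy_sentence_scores, 2))
# First point. Key result.
```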
