#### Extractive Text Summary Generation Model

In [1]:
text = """"Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code clarity with its clean syntax and indentation-based structure. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python’s extensive standard library and vibrant ecosystem of third-party packages make it a versatile tool for a wide range of applications, from web development and automation to data science, machine learning, and artificial intelligence. Popular frameworks like Django and Flask enable rapid web development, while libraries such as NumPy, Pandas, and Scikit-learn support scientific computing and data analysis. Python is widely used in scripting, automation, cybersecurity, and DevOps, making it a crucial skill in modern software development. With a strong community and comprehensive documentation, Python remains one of the most beginner-friendly languages, attracting both newcomers and experienced developers alike. Its cross-platform compatibility allows applications to run seamlessly across different operating systems. Python’s simple syntax, combined with its powerful capabilities, ensures that it remains a dominant language in programming, continually evolving to meet the demands of cutting-edge technology, making it an indispensable tool for developers worldwide."""

In [2]:
len(text)

1437

In [3]:
#!pip install spacy

In [4]:
#!python -m spacy download en_core_web_sm

In [5]:
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation

In [6]:
nlp = spacy.load('en_core_web_sm')

In [7]:
summary = nlp(text)

In [8]:
summary

"Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code clarity with its clean syntax and indentation-based structure. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming. Python’s extensive standard library and vibrant ecosystem of third-party packages make it a versatile tool for a wide range of applications, from web development and automation to data science, machine learning, and artificial intelligence. Popular frameworks like Django and Flask enable rapid web development, while libraries such as NumPy, Pandas, and Scikit-learn support scientific computing and data analysis. Python is widely used in scripting, automation, cybersecurity, and DevOps, making it a crucial skill in modern software development. With a strong community and comprehensive documentation, Python remains one of the most beginner-f

In [9]:
len(summary)

239

In [10]:
# Keep lowercased tokens that are not stop words, punctuation, or newlines
tokens = [
    token.text.lower()
    for token in summary
    if not token.is_stop and not token.is_punct and token.text != '\n'
]

In [11]:
tokens

['python',
 'high',
 'level',
 'interpreted',
 'programming',
 'language',
 'known',
 'simplicity',
 'readability',
 'created',
 'guido',
 'van',
 'rossum',
 'released',
 '1991',
 'python',
 'emphasizes',
 'code',
 'clarity',
 'clean',
 'syntax',
 'indentation',
 'based',
 'structure',
 'supports',
 'multiple',
 'programming',
 'paradigms',
 'including',
 'procedural',
 'object',
 'oriented',
 'functional',
 'programming',
 'python',
 'extensive',
 'standard',
 'library',
 'vibrant',
 'ecosystem',
 'party',
 'packages',
 'versatile',
 'tool',
 'wide',
 'range',
 'applications',
 'web',
 'development',
 'automation',
 'data',
 'science',
 'machine',
 'learning',
 'artificial',
 'intelligence',
 'popular',
 'frameworks',
 'like',
 'django',
 'flask',
 'enable',
 'rapid',
 'web',
 'development',
 'libraries',
 'numpy',
 'pandas',
 'scikit',
 'learn',
 'support',
 'scientific',
 'computing',
 'data',
 'analysis',
 'python',
 'widely',
 'scripting',
 'automation',
 'cybersecurity',
 'devops

In [12]:
len(tokens)

134

In [13]:
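# Alternative filter: keep only content words, selected by part-of-speech tag.
# Note: the frequency counts in the later cells use `tokens`, not `tokens1`.
tokens1 = []
stopwords = list(STOP_WORDS)
allowed = ['ADJ', 'PROPN', 'VERB', 'NOUN']
for token in summary:
    # Skip stop words and punctuation characters
    if token.text in stopwords or token.text in punctuation:
        continue
    if token.pos_ in allowed:
        tokens1.append(token.text)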

In [14]:
tokens1

['Python',
 'high',
 'level',
 'interpreted',
 'programming',
 'language',
 'known',
 'simplicity',
 'readability',
 'Created',
 'Guido',
 'van',
 'Rossum',
 'released',
 'Python',
 'emphasizes',
 'code',
 'clarity',
 'clean',
 'syntax',
 'indentation',
 'based',
 'structure',
 'supports',
 'multiple',
 'programming',
 'paradigms',
 'including',
 'procedural',
 'object',
 'oriented',
 'functional',
 'programming',
 'Python',
 'extensive',
 'standard',
 'library',
 'vibrant',
 'ecosystem',
 'party',
 'packages',
 'versatile',
 'tool',
 'wide',
 'range',
 'applications',
 'web',
 'development',
 'automation',
 'data',
 'science',
 'machine',
 'learning',
 'artificial',
 'intelligence',
 'Popular',
 'frameworks',
 'Django',
 'Flask',
 'enable',
 'rapid',
 'web',
 'development',
 'libraries',
 'NumPy',
 'Pandas',
 'Scikit',
 'learn',
 'support',
 'scientific',
 'computing',
 'data',
 'analysis',
 'Python',
 'scripting',
 'automation',
 'cybersecurity',
 'DevOps',
 'making',
 'crucial',
 's

In [15]:
len(tokens1)

127

In [16]:
from collections import Counter

In [17]:
word_cnt = Counter(tokens)

In [18]:
word_cnt

Counter({'python': 6,
         'programming': 4,
         'development': 3,
         'language': 2,
         'syntax': 2,
         'tool': 2,
         'applications': 2,
         'web': 2,
         'automation': 2,
         'data': 2,
         'making': 2,
         'remains': 2,
         'developers': 2,
         'high': 1,
         'level': 1,
         'interpreted': 1,
         'known': 1,
         'simplicity': 1,
         'readability': 1,
         'created': 1,
         'guido': 1,
         'van': 1,
         'rossum': 1,
         'released': 1,
         '1991': 1,
         'emphasizes': 1,
         'code': 1,
         'clarity': 1,
         'clean': 1,
         'indentation': 1,
         'based': 1,
         'structure': 1,
         'supports': 1,
         'multiple': 1,
         'paradigms': 1,
         'including': 1,
         'procedural': 1,
         'object': 1,
         'oriented': 1,
         'functional': 1,
         'extensive': 1,
         'standard': 1,
         'libra

In [19]:
freq_max = max(word_cnt.values())

In [20]:
freq_max

6

In [21]:
for word in word_cnt.keys():
    word_cnt[word] = word_cnt[word]/freq_max

In [22]:
word_cnt  # frequencies normalized to the range 0-1

Counter({'python': 1.0,
         'programming': 0.6666666666666666,
         'development': 0.5,
         'language': 0.3333333333333333,
         'syntax': 0.3333333333333333,
         'tool': 0.3333333333333333,
         'applications': 0.3333333333333333,
         'web': 0.3333333333333333,
         'automation': 0.3333333333333333,
         'data': 0.3333333333333333,
         'making': 0.3333333333333333,
         'remains': 0.3333333333333333,
         'developers': 0.3333333333333333,
         'high': 0.16666666666666666,
         'level': 0.16666666666666666,
         'interpreted': 0.16666666666666666,
         'known': 0.16666666666666666,
         'simplicity': 0.16666666666666666,
         'readability': 0.16666666666666666,
         'created': 0.16666666666666666,
         'guido': 0.16666666666666666,
         'van': 0.16666666666666666,
         'rossum': 0.16666666666666666,
         'released': 0.16666666666666666,
         '1991': 0.16666666666666666,
         'emphas
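
The normalization step above divides every count by the largest count, so the most frequent word gets weight 1.0. A minimal sketch of the same idea as a reusable function follows; the name `normalized_frequencies` and the sample token list are hypothetical and not part of the notebook.

```python
from collections import Counter


def normalized_frequencies(tokens):
    """Return each token's count divided by the highest count.

    Hypothetical helper: it returns a new dict instead of
    overwriting the Counter in place as the notebook does.
    """
    counts = Counter(tokens)
    if not counts:
        return {}
    freq_max = max(counts.values())
    return {word: count / freq_max for word, count in counts.items()}


# Illustrative input only, not the notebook's token list
sample_tokens = ["python", "programming", "python", "web", "programming", "python"]
weights = normalized_frequencies(sample_tokens)
print(weights)
```

Returning a new dict leaves the raw counts available for inspection, which can help when checking why a sentence scored as it did.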

In [23]:
sent_tokens = [sent.text for sent in summary.sents]

In [24]:
sent_tokens

['"Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code clarity with its clean syntax and indentation-based structure.',
 'It supports multiple programming paradigms, including procedural, object-oriented, and functional programming.',
 'Python’s extensive standard library and vibrant ecosystem of third-party packages make it a versatile tool for a wide range of applications, from web development and automation to data science, machine learning, and artificial intelligence.',
 'Popular frameworks like Django and Flask enable rapid web development, while libraries such as NumPy, Pandas, and Scikit-learn support scientific computing and data analysis.',
 'Python is widely used in scripting, automation, cybersecurity, and DevOps, making it a crucial skill in modern software development.',
 'With a strong community and comprehensive documentation, Python remains one 

In [25]:
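# Score each sentence by summing the normalized frequency of its words.
# Note: membership is tested on word.lower(), but the lookup uses the original
# casing, so capitalized words (e.g. 'Python') add 0 to the score. Words with
# attached punctuation (e.g. 'programming.') do not match at all.
sent_score = {}
for sent in sent_tokens:
    for word in sent.split():
        if word.lower() in word_cnt:
            sent_score[sent] = sent_score.get(sent, 0) + word_cnt[word]
        print(word)  # debug output: every whitespace-separated word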

"Python
is
a
high-level,
interpreted
programming
language
known
for
its
simplicity
and
readability.
Created
by
Guido
van
Rossum
and
first
released
in
1991,
Python
emphasizes
code
clarity
with
its
clean
syntax
and
indentation-based
structure.
It
supports
multiple
programming
paradigms,
including
procedural,
object-oriented,
and
functional
programming.
Python’s
extensive
standard
library
and
vibrant
ecosystem
of
third-party
packages
make
it
a
versatile
tool
for
a
wide
range
of
applications,
from
web
development
and
automation
to
data
science,
machine
learning,
and
artificial
intelligence.
Popular
frameworks
like
Django
and
Flask
enable
rapid
web
development,
while
libraries
such
as
NumPy,
Pandas,
and
Scikit-learn
support
scientific
computing
and
data
analysis.
Python
is
widely
used
in
scripting,
automation,
cybersecurity,
and
DevOps,
making
it
a
crucial
skill
in
modern
software
development.
With
a
strong
community
and
comprehensive
documentation,
Python
remains
one
of
the
most
beginner-f

In [26]:
sent_score

{'"Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code clarity with its clean syntax and indentation-based structure.': 2.833333333333333,
 'It supports multiple programming paradigms, including procedural, object-oriented, and functional programming.': 1.3333333333333335,
 'Python’s extensive standard library and vibrant ecosystem of third-party packages make it a versatile tool for a wide range of applications, from web development and automation to data science, machine learning, and artificial intelligence.': 3.6666666666666665,
 'Popular frameworks like Django and Flask enable rapid web development, while libraries such as NumPy, Pandas, and Scikit-learn support scientific computing and data analysis.': 2.0000000000000004,
 'Python is widely used in scripting, automation, cybersecurity, and DevOps, making it a crucial skill in modern software development.':
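
The scores above are lower than the word weights suggest: the lookup `word_cnt[word]` uses the original casing, so a capitalized word such as `Python` contributes 0, and words with trailing punctuation such as `programming.` never match. The following is a minimal sketch of a case-insensitive variant, assuming that stripping surrounding punctuation is acceptable. The helper name `score_sentences` and the sample data are hypothetical, and this version will produce different scores from those shown in the notebook.

```python
from string import punctuation


def score_sentences(sentences, weights):
    """Sum the weight of every word in each sentence.

    Hypothetical helper: words are lowercased and stripped of
    surrounding punctuation before the lookup. Possessives
    ("Python's") and hyphenated words stay joined and so may
    still miss their entry in `weights`.
    """
    strip_chars = punctuation + "“”‘’"
    scores = {}
    for sent in sentences:
        total = 0.0
        for raw in sent.split():
            word = raw.strip(strip_chars).lower()
            total += weights.get(word, 0.0)
        scores[sent] = total
    return scores


# Illustrative weights and sentences only
weights = {"python": 1.0, "programming": 0.5, "web": 0.25}
sentences = [
    "Python supports functional programming.",
    "Flask is a web framework.",
]
scores = score_sentences(sentences, weights)
print(scores)
```

Iterating over spaCy tokens instead of `sent.split()` would handle possessives and hyphens more cleanly, at the cost of keeping the `Span` objects rather than plain strings.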

In [27]:
import pandas as pd

In [28]:
pd.DataFrame(list(sent_score.items()), columns=['Sentence', 'Score'])

|   | Sentence | Score |
|---|----------|-------|
| 0 | "Python is a high-level, interpreted programmi... | 2.833333 |
| 1 | It supports multiple programming paradigms, in... | 1.333333 |
| 2 | Python’s extensive standard library and vibran... | 3.666667 |
| 3 | Popular frameworks like Django and Flask enabl... | 2.0 |
| 4 | Python is widely used in scripting, automation... | 1.166667 |
| 5 | With a strong community and comprehensive docu... | 1.666667 |
| 6 | Its cross-platform compatibility allows applic... | 1.333333 |
| 7 | Python’s simple syntax, combined with its powe... | 3.333333 |


In [29]:
from heapq import nlargest

In [30]:
sents = 3  # number of sentences to keep in the summary
n = nlargest(sents, sent_score, key=sent_score.get)

In [31]:
" ".join(n)

'Python’s extensive standard library and vibrant ecosystem of third-party packages make it a versatile tool for a wide range of applications, from web development and automation to data science, machine learning, and artificial intelligence. Python’s simple syntax, combined with its powerful capabilities, ensures that it remains a dominant language in programming, continually evolving to meet the demands of cutting-edge technology, making it an indispensable tool for developers worldwide. "Python is a high-level, interpreted programming language known for its simplicity and readability. Created by Guido van Rossum and first released in 1991, Python emphasizes code clarity with its clean syntax and indentation-based structure.'
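
`nlargest` returns sentences in descending score order, which is why the opening sentence of the text appears last in the joined summary above. If the summary should read in the order of the source text, one option is to select the top sentences first and then restore their original positions. The sketch below assumes that behavior is wanted; `top_sentences_in_order` and the sample data are hypothetical.

```python
from heapq import nlargest


def top_sentences_in_order(sentences, scores, n):
    """Pick the n highest-scoring sentences, then restore document order."""
    best = nlargest(n, sentences, key=lambda s: scores.get(s, 0.0))
    chosen = set(best)
    return [s for s in sentences if s in chosen]


# Illustrative sentences and scores only
sentences = ["First sentence.", "Second sentence.", "Third sentence.", "Fourth sentence."]
scores = {
    "First sentence.": 2.8,
    "Second sentence.": 1.3,
    "Third sentence.": 3.7,
    "Fourth sentence.": 3.3,
}
ordered = top_sentences_in_order(sentences, scores, 3)
print(" ".join(ordered))
```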

##### Extractive summary done