In [1]:
# !pip install gensim sumy transformers nltk beautifulsoup4

## TextRank Summarizer

In [2]:
# Import the TextRank summarizer
from sumy.summarizers.text_rank import TextRankSummarizer

# Import the parser and tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer

In [3]:
text = '''
The Bengal tiger is a population of the Panthera tigris tigris subspecies and the nominate tiger
subspecies. It ranks among the biggest wild cats alive today.[2][3] It is considered to belong 
to the world's charismatic megafauna. The tiger is estimated to have been present in the Indian
subcontinent since the Late Pleistocene, for about 12,000 to 16,500 years.[5][6][7] Today, it is
threatened by poaching, loss and fragmentation of habitat, and was estimated at comprising fewer
than 2,500 wild individuals by 2011. None of the Tiger Conservation Landscapes within its range 
is considered large enough to support an effective population of more than 250 adult individuals.
The Bengal tiger's historical range covered the Indus River valley until the early 19th century,
almost all of India, Pakistan, southern Nepal, Bangladesh, Bhutan, and southwestern China. 
Today, it inhabits India, Bangladesh, Nepal, Bhutan, and southwestern China.[6] India's tiger 
population was estimated at 2,603–3,346 individuals by 2018.[9] Around 300–500 individuals are 
estimated in Bangladesh,[8] 355 in Nepal by 2022,[10] and 90 individuals in Bhutan by 2015.
Felis tigris was the scientific name used by Carl Linnaeus in 1758 for the tiger.[12] 
It was subordinated to the genus Panthera by Reginald Innes Pocock in 1929. Bengal is the 
traditional type locality of the species and the nominate subspecies Panthera tigris tigris.
The validity of several tiger subspecies in continental Asia was questioned in 1999. 
Morphologically, tigers from different regions vary little, and gene flow between populations 
in those regions is considered to have been possible during the Pleistocene. Therefore, it was 
proposed to recognise only two subspecies as valid, namely P. t. tigris in mainland Asia, and 
P. t. sondaica in the Greater Sunda Islands and possibly in Sundaland.[14] The nominate 
subspecies P. t. tigris constitutes two clades: the northern clade comprises the Siberian and 
Caspian tiger populations, and the southern clade all remaining continental tiger populations.
[15] The extinct and living tiger populations in continental Asia have been subsumed to P. t. 
tigris since the revision of felid taxonomy in 2017.[1] Results of a genetic analysis of 32 
tiger samples indicate that the Bengal tiger samples grouped into a different clade than the 
Siberian tiger samples'''

In [4]:
# Count the sentences in the sample text (needs NLTK's Punkt tokenizer data to be downloaded)
from nltk.tokenize import sent_tokenize
len(sent_tokenize(text))

19
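
The sample text still carries Wikipedia citation markers such as `[2][3]`, and after sentence splitting these can end up glued to the start of the following sentence. A minimal sketch of stripping them with a regular expression before parsing, assuming every marker is a bracketed integer; `strip_citations` is a hypothetical helper, not part of sumy or NLTK:

```python
import re

# Matches bracketed numeric citation markers such as [2] or [15].
CITATION_RE = re.compile(r"\[\d+\]")

def strip_citations(raw_text):
    """Remove [n] citation markers and collapse the whitespace left behind."""
    without_markers = CITATION_RE.sub("", raw_text)
    return re.sub(r"\s+", " ", without_markers).strip()

sample = ("It ranks among the biggest wild cats alive today.[2][3] It is considered\n"
          "to belong to the world's charismatic megafauna.")
print(strip_citations(sample))
```

The cleaned string could then be passed to `PlaintextParser.from_string` in place of the raw text. Note that the sentence count may differ after cleaning.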

In [5]:
# Initializing my parser
my_parser = PlaintextParser.from_string(text,Tokenizer('english'))

In [6]:
# Creating a summary of 3 sentences
text_rank_summarizer = TextRankSummarizer()
summary = text_rank_summarizer(my_parser.document,sentences_count=3)

In [7]:
for sent in summary:
    print(sent)

The Bengal tiger is a population of the Panthera tigris tigris subspecies and the nominate tiger subspecies.
The tiger is estimated to have been present in the Indian subcontinent since the Late Pleistocene, for about 12,000 to 16,500 years.
[14] The nominate subspecies P. t. tigris constitutes two clades: the northern clade comprises the Siberian and Caspian tiger populations, and the southern clade all remaining continental tiger populations.
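
The `TextRankSummarizer` call hides the ranking step. As a rough illustration of the idea (not sumy's actual implementation), the sketch below builds a sentence graph weighted by word overlap, normalised by the log of the sentence lengths as in the original TextRank formulation, and scores it with a PageRank-style power iteration. The naive regex sentence splitter, the function names, and the damping factor of 0.85 are all assumptions made for the example:

```python
import math
import re

def split_sentences(text):
    # Naive splitter for illustration: break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def word_set(sentence):
    return set(re.findall(r"[a-z']+", sentence.lower()))

def similarity(a, b):
    # Shared-word count normalised by the log lengths of both sentences.
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))

def textrank_scores(sentences, damping=0.85, iterations=50):
    n = len(sentences)
    if n == 0:
        return []
    bags = [word_set(s) for s in sentences]
    # Symmetric weight matrix with no self-loops.
    weights = [[similarity(bags[i], bags[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional
            # to the weight of the edge between j and i.
            incoming = sum(weights[j][i] / out_sums[j] * scores[j]
                           for j in range(n) if out_sums[j] > 0)
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores

def summarize(text, sentences_count=3):
    sentences = split_sentences(text)
    scores = textrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:sentences_count]
    return [sentences[i] for i in sorted(top)]  # restore document order

demo = ("Tigers live in forests. Tigers hunt deer in forests at night. "
        "Deer eat grass. The weather was pleasant.")
for sentence in summarize(demo, sentences_count=2):
    print(sentence)
```

Sentences that share vocabulary with many others accumulate score, while an unrelated sentence (the last one in `demo`) only ever receives the baseline `(1 - damping) / n`.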


## Using a URL (Web Scraping)

In [8]:
# `import urllib` alone does not guarantee that the `request` submodule is loaded
import urllib.request
from bs4 import BeautifulSoup

In [9]:
# Fetch the page; urlopen returns an HTTP response object, not a URL string
response = urllib.request.urlopen("https://en.wikipedia.org/wiki/Rajgad_Fort")

In [10]:
data = response.read()

In [11]:
# Parse the HTML and extract all text nodes (this includes menus, link boxes and edit links)
soup = BeautifulSoup(data, 'html.parser')
text = soup.get_text()

In [12]:
len(sent_tokenize(text))

62

In [13]:
# Initializing my parser
my_parser = PlaintextParser.from_string(text,Tokenizer('english'))

In [14]:
# Creating a summary of 3 sentences
text_rank_summarizer = TextRankSummarizer()
summary = text_rank_summarizer(my_parser.document,sentences_count=3)

In [15]:
for sent in summary:
    print(sent)

History[edit] The fort has stood witness to many significant historic events including the birth of Chhatrapati Shivaji's son Rajaram I, the death of Shivaji's wife Saibai, the return of Shivaji from Agra, the burial of Afzal Khan's head in the Mahadarwaja walls of Balle Killa, the strict words of Sonopant Dabir to Shivaji.
Amazing Maharashtra Rajgad Fort Information in Marathi Pune Trekkers vteForts in MaharashtraAhmednagar district Ahmednagar Fort Bahadurgad Bhairavgad Bitangad Harishchandragad Kaladgad Kharda Kunjargad Madan Fort Manjarsumbha fort Pabargad Patta Fort Ratangad Akola district Akola Fort Balapur Narnala Amravati district Amner Fort Gawilghur Aurangabad district Antur Fort Daulatabad Fort Chandrapur district Chandrapur Fort Ballarpur Fort Bhadravati Fort Manikgad Dhule district Bhamer Laling Thalner Kolhapur district Bhudargad Gandharvgad Panhala Pargadh Pavangad Samangad Vishalgad Latur district Udgir Mumbai City district Bombay Castle Dongri Fort Fort George Mahim For
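
The second "sentence" in this output is a navigation link list rather than article prose, because `soup.get_text()` returns every text node on the page. One way to reduce this noise is to keep only text inside `<p>` elements. The sketch below does this with the standard library's `html.parser` on a small inline HTML string; `ParagraphExtractor` and `extract_paragraph_text` are hypothetical names, and the assumption that the article body lives in `<p>` tags has not been checked against the page used here. With BeautifulSoup, joining `p.get_text()` over `soup.find_all('p')` should have a comparable effect.

```python
from html.parser import HTMLParser

class ParagraphExtractor(HTMLParser):
    """Collect text that appears inside <p> elements only."""

    def __init__(self):
        super().__init__()
        self._inside = False    # True while inside a <p> element
        self._chunks = []       # text pieces of the current paragraph
        self.paragraphs = []    # finished paragraphs

    def _flush(self):
        # Join the collected pieces and normalise whitespace.
        paragraph = " ".join("".join(self._chunks).split())
        if paragraph:
            self.paragraphs.append(paragraph)
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()       # tolerate a previous <p> that was never closed
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

def extract_paragraph_text(html):
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.paragraphs)

sample_html = """
<html><body>
<div class="navbox"><a href="#">Forts in Maharashtra</a> Akola Fort Balapur</div>
<p>First paragraph of the <b>article</b> body.</p>
<p>Second paragraph.</p>
</body></html>
"""
print(extract_paragraph_text(sample_html))
```

The returned string could be handed to `PlaintextParser.from_string` in place of the full page text.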

## LexRank Summarizer

In [16]:
# Import the LexRank summarizer
from sumy.summarizers.lex_rank import LexRankSummarizer

In [17]:
# Re-define the sample text, because `text` was overwritten by the scraped page above
text = '''
The Bengal tiger is a population of the Panthera tigris tigris subspecies and the nominate tiger
subspecies. It ranks among the biggest wild cats alive today.[2][3] It is considered to belong 
to the world's charismatic megafauna. The tiger is estimated to have been present in the Indian
subcontinent since the Late Pleistocene, for about 12,000 to 16,500 years.[5][6][7] Today, it is
threatened by poaching, loss and fragmentation of habitat, and was estimated at comprising fewer
than 2,500 wild individuals by 2011. None of the Tiger Conservation Landscapes within its range 
is considered large enough to support an effective population of more than 250 adult individuals.
The Bengal tiger's historical range covered the Indus River valley until the early 19th century,
almost all of India, Pakistan, southern Nepal, Bangladesh, Bhutan, and southwestern China. 
Today, it inhabits India, Bangladesh, Nepal, Bhutan, and southwestern China.[6] India's tiger 
population was estimated at 2,603–3,346 individuals by 2018.[9] Around 300–500 individuals are 
estimated in Bangladesh,[8] 355 in Nepal by 2022,[10] and 90 individuals in Bhutan by 2015.
Felis tigris was the scientific name used by Carl Linnaeus in 1758 for the tiger.[12] 
It was subordinated to the genus Panthera by Reginald Innes Pocock in 1929. Bengal is the 
traditional type locality of the species and the nominate subspecies Panthera tigris tigris.
The validity of several tiger subspecies in continental Asia was questioned in 1999. 
Morphologically, tigers from different regions vary little, and gene flow between populations 
in those regions is considered to have been possible during the Pleistocene. Therefore, it was 
proposed to recognise only two subspecies as valid, namely P. t. tigris in mainland Asia, and 
P. t. sondaica in the Greater Sunda Islands and possibly in Sundaland.[14] The nominate 
subspecies P. t. tigris constitutes two clades: the northern clade comprises the Siberian and 
Caspian tiger populations, and the southern clade all remaining continental tiger populations.
[15] The extinct and living tiger populations in continental Asia have been subsumed to P. t. 
tigris since the revision of felid taxonomy in 2017.[1] Results of a genetic analysis of 32 
tiger samples indicate that the Bengal tiger samples grouped into a different clade than the 
Siberian tiger samples'''

In [18]:
# Initializing my parser
my_parser = PlaintextParser.from_string(text,Tokenizer('english'))

In [19]:
# Create a summary of 3 sentences
lex_rank_summarizer = LexRankSummarizer()
summary = lex_rank_summarizer(my_parser.document,sentences_count=3)

In [20]:
for sent in summary:
    print(sent)

The Bengal tiger is a population of the Panthera tigris tigris subspecies and the nominate tiger subspecies.
[6] India's tiger population was estimated at 2,603–3,346 individuals by 2018.
[9] Around 300–500 individuals are estimated in Bangladesh,[8] 355 in Nepal by 2022,[10] and 90 individuals in Bhutan by 2015.
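
LexRank is also graph-based, but it compares sentences by the cosine similarity of their TF-IDF vectors rather than by raw word overlap. The sketch below illustrates only that similarity measure plus the simple thresholded degree-centrality variant; the full method runs a PageRank-style iteration on the same graph. The smoothed IDF formula, the threshold of 0.1, and all function names are assumptions for the example and are not taken from sumy:

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    return re.findall(r"[a-z']+", sentence.lower())

def tfidf_vectors(sentences):
    """Return one {word: tf * idf} dict per sentence."""
    term_counts = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences each word occurs.
    doc_freq = Counter(word for counts in term_counts for word in counts)
    idf = {word: math.log(n / df) + 1.0 for word, df in doc_freq.items()}
    return [{w: tf * idf[w] for w, tf in counts.items()} for counts in term_counts]

def cosine(u, v):
    dot = sum(weight * v[word] for word, weight in u.items() if word in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def degree_centrality(sentences, threshold=0.1):
    """Count, for each sentence, how many others exceed the similarity threshold."""
    vectors = tfidf_vectors(sentences)
    n = len(sentences)
    return [sum(1 for j in range(n)
                if j != i and cosine(vectors[i], vectors[j]) > threshold)
            for i in range(n)]

demo_sentences = [
    "Tigers live in forests.",
    "Tigers hunt deer in forests.",
    "Deer eat grass.",
    "The weather was pleasant.",
]
print(degree_centrality(demo_sentences))
```

The second demo sentence shares vocabulary with both of its neighbours, so it ends up with the highest degree, while the unrelated last sentence gets zero.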


## LSA Summarizer

In [21]:
# Import the LSA summarizer
from sumy.summarizers.lsa import LsaSummarizer

In [22]:
# Initializing my parser
my_parser = PlaintextParser.from_string(text,Tokenizer('english'))

In [23]:
# Create a summary of 3 sentences
lsa_summarizer = LsaSummarizer()
summary = lsa_summarizer(my_parser.document,sentences_count=3)

In [24]:
for sent in summary:
    print(sent)

Today, it inhabits India, Bangladesh, Nepal, Bhutan, and southwestern China.
[12] It was subordinated to the genus Panthera by Reginald Innes Pocock in 1929.
Morphologically, tigers from different regions vary little, and gene flow between populations in those regions is considered to have been possible during the Pleistocene.


## Using a transformers pipeline for text summarization

In [25]:
import transformers

In [26]:
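# No model name is given, so transformers falls back to its default
# summarization checkpoint, which is downloaded on first use
pipeline = transformers.pipeline("summarization")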

In [28]:
# Generate a deterministic (non-sampled) summary of between 5 and 200 tokens
summary_text = pipeline(text, max_length=200,
                        min_length=5, do_sample=False)[0]['summary_text']

print(summary_text)

 The Bengal tiger ranks among the biggest wild cats alive today . It is estimated to have been present in the Indian subcontinent since the Late Pleistocene, for about 12,000 to 16,500 years . The tiger's historical range covered the Indus River valley until the early 19th century . Today, it is threatened by poaching, loss and fragmentation of habitat .
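
Transformer summarization models accept only a limited number of input tokens, so longer inputs such as a full scraped page may need to be split before summarizing. A sketch that groups sentences into chunks under a word budget follows; `chunk_sentences`, the naive regex sentence splitter, and the budget values are illustrative assumptions (word counts only loosely approximate token counts), and `sent_tokenize` could replace the regex split:

```python
import re

def chunk_sentences(text, max_words=400):
    """Group sentences into chunks of at most max_words words each.

    A single sentence longer than max_words becomes its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, current_len = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append(" ".join(current))
            current, current_len = [], 0
        current.append(sentence)
        current_len += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks

demo = "One two three. Four five six. Seven eight nine."
print(chunk_sentences(demo, max_words=6))

# Each chunk could then be summarized separately with the pipeline object, e.g.:
# partial = [pipeline(c, max_length=60, min_length=5, do_sample=False)[0]['summary_text']
#            for c in chunk_sentences(text)]
```

The partial summaries can be concatenated, or summarized once more if a single short result is needed.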
