# Introduction

In [1]:
# Suppress all Python warnings for the rest of the notebook
import warnings
warnings.filterwarnings('ignore')

# BEGIN: fix Python or Notebook SSL CERTIFICATE_VERIFY_FAILED
# Note: this workaround disables HTTPS certificate verification for the whole session.
import os, ssl
if (not os.environ.get('PYTHONHTTPSVERIFY', '') and getattr(ssl, '_create_unverified_context', None)):
    ssl._create_default_https_context = ssl._create_unverified_context
# END: fix Python or Notebook SSL CERTIFICATE_VERIFY_FAILED
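
The cell above silences every warning for the rest of the session. A narrower alternative is to suppress warnings only around a specific call. The sketch below does this with a hypothetical helper, `call_quietly`, which is not part of the original notebook.

```python
import warnings


def call_quietly(func, *args, **kwargs):
    """Run func with warnings suppressed only for the duration of the call."""
    with warnings.catch_warnings():
        # The filter change is undone automatically when the block exits.
        warnings.simplefilter("ignore")
        return func(*args, **kwargs)
```

With this helper, a noisy call can be wrapped as `call_quietly(some_function, argument)` while warnings elsewhere in the notebook stay visible.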

## Installing prerequisite libraries
* https://pypi.org/project/bert-extractive-summarizer/

In [2]:
!pip install sumy transformers sentencepiece



### Import libraries

In [3]:
from sumy.nlp.tokenizers import Tokenizer
from sumy.parsers.plaintext import PlaintextParser
from sumy.summarizers.lex_rank import LexRankSummarizer

In [4]:
# Path to the plain-text article to summarize
content = "Text_Summarize_Text/content.txt"

# Number of sentences each extractive summarizer should return
output_sentences_count = 10

# Read the whole article into memory
with open(content, "r", encoding="utf-8") as f:
    article = f.read()

In [5]:
# Parse the article and split it into sentences with an English tokenizer
my_parser = PlaintextParser.from_string(article, Tokenizer('english'))

# Creating a summary of output_sentences_count sentences
lex_rank_summarizer = LexRankSummarizer()
lexrank_summary = lex_rank_summarizer(my_parser.document, sentences_count = output_sentences_count)

# Printing the summary
for sentence in lexrank_summary:
    print(sentence)

Onion and garlic, both are considered superfoods due to their numerous proven health benefits.
They add distinct flavours to the food and are inseparable parts of Indian cuisine.
However, Ayurveda does not support the usage of these two ingredients in your diet.
But what is the reason behind this?
The real reason
It is not that Ayurveda does not acknowledge the health benefits of onion and garlic.
Ayurveda recognizes onions and garlic as blood purifiers.
As per Ayurveda, both these ingredients produce excessive heat in the body.
These two ingredients are even avoided by people practicing meditation or following a spiritual path, as consumptions of onion and garlic are known to increase anger, aggression, ignorance, anxiety, and increase in sexual desire.
Health benefits of onion and garlic
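
LexRank ranks sentences by how central they are in a graph whose edges connect similar sentences. The following is a minimal sketch of that idea, not sumy's implementation: it assumes plain bag-of-words cosine similarity (sumy weights terms differently), and the function names, threshold, and damping value are illustrative choices.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two bag-of-words Counters."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def lexrank_scores(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Score sentences by centrality in a thresholded similarity graph."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    # Adjacency matrix: 1 where two sentences are similar enough, else 0.
    adjacency = [
        [1.0 if cosine_similarity(bags[i], bags[j]) > threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    # Row-normalise so that every row is a probability distribution.
    for row in adjacency:
        total = sum(row)
        for j in range(n):
            row[j] = row[j] / total if total else 1.0 / n
    # Power iteration towards the stationary distribution (PageRank-style).
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * adjacency[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores
```

The highest-scoring sentences would then be selected and printed in their original order, which is roughly what `LexRankSummarizer` does for `sentences_count` sentences.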


## LSA (Latent Semantic Analysis)

In [6]:
from sumy.summarizers.lsa import LsaSummarizer

# creating the summarizer
lsa_summarizer = LsaSummarizer()
lsa_summary = lsa_summarizer(my_parser.document, sentences_count = output_sentences_count)

# Printing the summary
for sentence in lsa_summary:
    print(sentence)

Onion and garlic, both are considered superfoods due to their numerous proven health benefits.
They add distinct flavours to the food and are inseparable parts of Indian cuisine.
However, Ayurveda does not support the usage of these two ingredients in your diet.
Moreover, garlic is used to prepare various ayurvedic medicines.
But Ayurveda does not support their excessive usage as it considers onion as tamasic in nature (makes people irritable) and garlic to be rajsic (disturbed sleep and drained energy) in nature.
As per Ayurveda, both these ingredients produce excessive heat in the body.
Ayurveda recommends having onion and garlic in low quantity.
Ayurveda principles are mostly confused with spirituality and yoga that recommends avoiding both the ingredient as both are believed to distract a person’s focus and attention.
These two ingredients are even avoided by people practicing meditation or following a spiritual path, as consumptions of onion and garlic are known to increase anger,

## Luhn summarization: sentences are scored by the frequency of significant words, an approach related to TF-IDF (Term Frequency-Inverse Document Frequency)

In [7]:
from sumy.summarizers.luhn import LuhnSummarizer

# Creating the summarizer
luhn_summarizer = LuhnSummarizer()
luhn_summary = luhn_summarizer(my_parser.document, sentences_count = output_sentences_count)

# Printing the summary
for sentence in luhn_summary:
    print(sentence)

Onion and garlic, both are considered superfoods due to their numerous proven health benefits.
Whether you are preparing curry, stew or soup, onion and garlic are important ingredients in it.
However, Ayurveda does not support the usage of these two ingredients in your diet.
It is not that Ayurveda does not acknowledge the health benefits of onion and garlic.
But Ayurveda does not support their excessive usage as it considers onion as tamasic in nature (makes people irritable) and garlic to be rajsic (disturbed sleep and drained energy) in nature.
As per Ayurveda, both these ingredients produce excessive heat in the body.
It is true that our body needs some heat, but excessive heat may increase the risk of other health problems.
Ayurveda principles are mostly confused with spirituality and yoga that recommends avoiding both the ingredient as both are believed to distract a person’s focus and attention.
These two ingredients are even avoided by people practicing meditation or following 
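
Luhn's method scores a sentence by how densely it packs frequent, non-trivial ("significant") words. The sketch below is a simplified illustration, not sumy's code: it assumes a tiny hand-written stop-word list, treats any word that occurs at least twice as significant, and uses a single span per sentence instead of Luhn's gap-limited clusters. All names are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "and", "are", "as", "in", "is", "it", "of", "the", "to", "was"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def significant_words(sentences, min_count=2):
    """Non-stop words that occur at least min_count times in the document."""
    counts = Counter(w for s in sentences for w in tokenize(s) if w not in STOP_WORDS)
    return {w for w, c in counts.items() if c >= min_count}


def luhn_score(sentence, significant):
    """(number of significant words)^2 divided by the span that brackets them."""
    words = tokenize(sentence)
    positions = [i for i, w in enumerate(words) if w in significant]
    if not positions:
        return 0.0
    span = positions[-1] - positions[0] + 1
    return len(positions) ** 2 / span


def luhn_summary(sentences, count):
    """Pick the top-scoring sentences and return them in document order."""
    significant = significant_words(sentences)
    ranked = sorted(sentences, key=lambda s: luhn_score(s, significant), reverse=True)
    chosen = set(ranked[:count])
    return [s for s in sentences if s in chosen]
```

Calling `luhn_summary(list_of_sentences, 10)` mirrors the `sentences_count` argument used in the cell above.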

## KL-Sum: another extractive method

In [8]:
from sumy.summarizers.kl import KLSummarizer
kl_summarizer = KLSummarizer()
kl_summary = kl_summarizer(my_parser.document, sentences_count = output_sentences_count)

# Printing the summary
for sentence in kl_summary:
    print(sentence)

Onion and garlic, both are considered superfoods due to their numerous proven health benefits.
They add distinct flavours to the food and are inseparable parts of Indian cuisine.
But what is the reason behind this?
The real reason
It is not that Ayurveda does not acknowledge the health benefits of onion and garlic.
Ayurveda recognizes onions and garlic as blood purifiers.
Moreover, garlic is used to prepare various ayurvedic medicines.
Health benefits of onion and garlic
Due to its anti-bacterial, anti-fungal and anti-viral properties, garlic is known to reduce inflammation and lower high blood pressure.
It is even recommended for people trying to lose weight.
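
KL-Sum builds a summary greedily: at each step it adds the sentence that keeps the summary's word distribution closest, in Kullback-Leibler divergence, to that of the whole document. The following is a minimal sketch under stated assumptions (unigram counts, additive smoothing with an arbitrary constant of 0.01, hypothetical function names); sumy's `KLSummarizer` differs in its details.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def kl_divergence(doc_counts, summary_counts, smoothing=0.01):
    """KL(document || summary) over the document vocabulary, with smoothing."""
    vocab = list(doc_counts)
    doc_total = sum(doc_counts.values())
    summary_total = sum(summary_counts[w] for w in vocab) + smoothing * len(vocab)
    divergence = 0.0
    for word in vocab:
        p = doc_counts[word] / doc_total
        q = (summary_counts[word] + smoothing) / summary_total
        divergence += p * math.log(p / q)
    return divergence


def kl_sum(sentences, count):
    """Greedily pick sentences that minimise the divergence from the document."""
    doc_counts = Counter(w for s in sentences for w in tokenize(s))
    chosen = []                 # indices of the selected sentences
    summary_counts = Counter()  # word counts of the summary so far
    while len(chosen) < min(count, len(sentences)):
        best_index, best_value = None, float("inf")
        for i, sentence in enumerate(sentences):
            if i in chosen:
                continue
            candidate = summary_counts + Counter(tokenize(sentence))
            value = kl_divergence(doc_counts, candidate)
            if value < best_value:
                best_index, best_value = i, value
        chosen.append(best_index)
        summary_counts += Counter(tokenize(sentences[best_index]))
    # Return the selected sentences in document order.
    return [sentences[i] for i in sorted(chosen)]
```

Because the criterion rewards covering the document's vocabulary, KL-Sum tends to pick sentences that introduce words the summary is still missing, which may explain why its output above differs from the LexRank and Luhn selections.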


## Summarization with T5 Transformers

In [9]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the pretrained T5 model and its tokenizer
my_model = T5ForConditionalGeneration.from_pretrained('t5-small')
tokenizer = T5Tokenizer.from_pretrained('t5-small')

# Encode the article, truncating explicitly to at most 750 tokens
input_ids = tokenizer.encode(article, return_tensors='pt', max_length=750, truncation=True)
summary_ids = my_model.generate(input_ids)

# Decode and print the summary
t5_summary = tokenizer.decode(summary_ids[0])
print(t5_summary)

<pad> <extra_id_0> and garlic are considered superfoods due to their proven health benefits. They add distinct

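The output above begins with `<pad> <extra_id_0>` and stops mid-sentence. Two likely causes are that the article was passed without the `summarize: ` task prefix that T5 checkpoints were trained with, and that `generate()` was left at its short default output length. The sketch below shows one way to address both. The helper names and the length and beam settings are assumptions, not part of the original notebook.

```python
def build_t5_input(text, prefix="summarize: "):
    """Prepend the task prefix that tells T5 which task to perform."""
    return prefix + text.strip()


def summarize_with_t5(text, model_name="t5-small",
                      max_input_tokens=512, max_summary_tokens=80):
    """Summarise text with a T5 checkpoint (downloads the model on first use)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)

    input_ids = tokenizer.encode(
        build_t5_input(text),
        return_tensors="pt",
        max_length=max_input_tokens,
        truncation=True,                 # truncate explicitly
    )
    summary_ids = model.generate(
        input_ids,
        max_length=max_summary_tokens,   # upper bound on the summary length
        num_beams=4,
        early_stopping=True,
    )
    # skip_special_tokens drops markers such as <pad> from the decoded text.
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

A call such as `print(summarize_with_t5(article))` would replace the encode, generate, and decode steps of the cell above.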

## GPT-2 Transformers

In [10]:
# Importing model and tokenizer
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Instantiating the model and tokenizer with gpt-2
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
model = GPT2LMHeadModel.from_pretrained('gpt2')

# Encoding text to get input ids & pass them to model.generate()
inputs = tokenizer.batch_encode_plus([article], return_tensors='pt', max_length=750, truncation=True)
summary_ids = model.generate(inputs['input_ids'], early_stopping=True)

# Decode and print the summary
GPT_summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(GPT_summary)

Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.
Input length of input_ids is 384, but ``max_length`` is set to 20.This can lead to unexpected behavior. You should consider increasing ``config.max_length`` or ``max_length``.


Onion and garlic, both are considered superfoods due to their numerous proven health benefits. They add distinct flavours to the food and are inseparable parts of Indian cuisine. Whether you are preparing curry, stew or soup, onion and garlic are important ingredients in it. However, Ayurveda does not support the usage of these two ingredients in your diet. But what is the reason behind this?
The real reason

It is not that Ayurveda does not acknowledge the health benefits of onion and garlic. Ayurveda recognizes onions and garlic as blood purifiers. Moreover, garlic is used to prepare various ayurvedic medicines. But Ayurveda does not support their excessive usage as it considers onion as tamasic in nature (makes people irritable) and garlic to be rajsic (disturbed sleep and drained energy) in nature. As per Ayurveda, both these ingredients produce excessive heat in the body.

It is true that our body needs some heat, but excessive heat may increase the risk of other health problems. 
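
The log above reports an input of 384 tokens against a `max_length` of 20, so the model has essentially no room to generate, and the printed "summary" appears to be the input text itself. GPT-2 is also a plain language model, not a summarizer. One common zero-shot trick is to append a `TL;DR:` marker and decode only the newly generated tokens. The sketch below illustrates this. It assumes a transformers version that supports `max_new_tokens`, the helper names and token budgets are hypothetical, and the quality of the result is not guaranteed.

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the token ids generated after the prompt."""
    return output_ids[prompt_length:]


def summarize_with_gpt2(text, model_name="gpt2", max_article_tokens=900,
                        max_new_tokens=60, marker="\nTL;DR:"):
    """Prompt GPT-2 with the article followed by a TL;DR marker."""
    # Imported lazily so that defining this helper does not need torch/transformers.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained(model_name)
    model = GPT2LMHeadModel.from_pretrained(model_name)

    # Truncate the article first so that the marker is never cut off.
    article_ids = tokenizer.encode(text.strip(), max_length=max_article_tokens,
                                   truncation=True)
    prompt_ids = article_ids + tokenizer.encode(marker)

    output_ids = model.generate(
        torch.tensor([prompt_ids]),
        max_new_tokens=max_new_tokens,        # budget for generated tokens only
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
    )
    new_ids = strip_prompt(output_ids[0].tolist(), len(prompt_ids))
    return tokenizer.decode(new_ids, skip_special_tokens=True).strip()
```

The prompt budget of 900 tokens plus 60 new tokens stays within GPT-2's 1024-token context window.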

## XLM Transformers

In [11]:
# Importing model and tokenizer
from transformers import XLMWithLMHeadModel, XLMTokenizer

# Instantiating the model and tokenizer 
tokenizer = XLMTokenizer.from_pretrained('xlm-mlm-en-2048')
model = XLMWithLMHeadModel.from_pretrained('xlm-mlm-en-2048')

# Encoding text to get input ids & pass them to model.generate()
inputs = tokenizer.batch_encode_plus([article], return_tensors='pt')
summary_ids = model.generate(inputs['input_ids'], early_stopping=True)

# Decode and print the summary
XLM_summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
print(XLM_summary)

Some weights of XLMWithLMHeadModel were not initialized from the model checkpoint at xlm-mlm-en-2048 and are newly initialized: ['transformer.position_ids']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Input length of input_ids is 376, but ``max_length`` is set to 20.This can lead to unexpected behavior. You should consider increasing ``config.max_length`` or ``max_length``.


onion and garlic, both are considered superfoods due to their numerous proven health benefits. they add distinct flavours to the food and are inseparable parts of indian cuisine. whether you are preparing curry, stew or soup, onion and garlic are important ingredients in it. however, ayurveda does not support the usage of these two ingredients in your diet. but what is the reason behind this? the real reasonit is not that ayurveda does not acknowledge the health benefits of onion and garlic. ayurveda recognizes onions and garlic as blood purifiers. moreover, garlic is used to prepare various ayurvedic medicines. but ayurveda does not support their excessive usage as it considers onion as tamasic in nature ( makes people irritable ) and garlic to be rajsic ( disturbed sleep and drained energy ) in nature. as per ayurveda, both these ingredients produce excessive heat in the body.it is true that our body needs some heat, but excessive heat may increase the risk of other health problems. 
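
As with GPT-2, the log reports an input of 376 tokens against a `max_length` of 20, and the printed text appears to be the lowercased article rather than a shorter summary. A simple length-ratio check can flag such echo outputs before they are mistaken for summaries. The sketch below uses hypothetical helper names and an arbitrary 0.8 threshold.

```python
import re


def word_tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def compression_ratio(summary, source):
    """Summary length divided by source length, measured in words."""
    source_words = word_tokens(source)
    if not source_words:
        return 0.0
    return len(word_tokens(summary)) / len(source_words)


def looks_like_echo(summary, source, max_ratio=0.8):
    """Flag a 'summary' that is nearly as long as the text it summarises."""
    return compression_ratio(summary, source) > max_ratio
```

For example, `looks_like_echo(XLM_summary, article)` could be checked before a model's output is accepted as a summary.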