<a href="https://colab.research.google.com/github/sagarrokad1/Text-Summarization/blob/main/Text_Summarization_Using_Bert_and_Spacy.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##**Text summarization with two methods and two libraries: BERT and spaCy**

### **The text summarized in this notebook is a book on overcoming fear and anxiety: "Free Yourself from Fears: Overcoming Anxiety and Living Without Worry"**

### **Installing the necessary libraries**

In [None]:
! pip install bert-extractive-summarizer
! pip install spacy
! pip install transformers # > 2.2.0
! pip install neuralcoref
! pip install folium==0.2.1

###**Importing Libraries**

In [None]:
from summarizer import Summarizer
import spacy
from spacy.lang.en.stop_words import STOP_WORDS
from string import punctuation
from pprint import pprint
from heapq import nlargest
import string

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


###**Using BERT for text summarization**

###**Load Data**

In [None]:
with open("/content/drive/MyDrive/AlmaBetter/Free Yourself from Fears_ Overcoming Anxiety and Living Without Worry - PDF Room.txt", 'r') as file:
    data = file.read().replace('\n', '')

In [None]:
data = data.replace("\ufeff", "")

In [None]:
data = data.replace('\x0c', '')

In [None]:
translation_table = str.maketrans('', '', string.digits)
data = data.translate(translation_table)
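
The recorded outputs in this notebook contain fused words such as `forFree` and `andthat`, which is what happens when newlines are replaced with an empty string. The block below is a minimal sketch of an alternative cleaning step that joins lines with a single space instead. The function name `clean_text` and the `sample` string are hypothetical and only serve as illustration; the notebook's recorded outputs were not produced with this function.

```python
import re
import string


def clean_text(raw: str) -> str:
    """Strip extraction debris while keeping word boundaries intact."""
    # Drop the byte-order mark and form-feed characters, as in the cells above.
    text = raw.replace("\ufeff", "").replace("\x0c", "")
    # Remove digits, as in the cell above.
    text = text.translate(str.maketrans("", "", string.digits))
    # Collapse every run of whitespace (including newlines) into one space.
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical sample that mimics the debris handled above.
sample = "\ufeffPraise for\nFree Yourself From Fears\x0c\nsee page 12"
print(clean_text(sample))
```

Keeping one space per line break should also help spaCy's sentence splitter later on, since it no longer sees two words glued into one token.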

In [None]:
data[0:200]

'Praise forFree Yourself From Fears“It has been said that the two great emotions are love and fear, andthat they form the foundation for all others. Fear tends to be thesource of most of the difficult '

In [None]:
model = Summarizer()

Some weights of the model checkpoint at bert-large-uncased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


####**Taking only 10 sentences for the summary**

In [None]:
result = model(data, num_sentences=10, min_length=60)
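
As far as I understand it, `bert-extractive-summarizer` picks sentences by embedding each one with BERT, clustering the embeddings with k-means, and keeping the sentence closest to each cluster centroid. The block below is a toy, pure-Python illustration of that selection idea, not the library's code. The helper names `sq_dist` and `nearest_to_centroids`, the evenly spaced starting centroids, and the 2-D `embeddings` are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_to_centroids(vectors, k, iterations=10):
    """Toy k-means: return sorted indices of the vectors closest to each centroid."""
    n = len(vectors)
    # Deterministic start: k evenly spaced vectors (an assumption for this sketch).
    centroids = [list(vectors[(i * n) // k]) for i in range(k)]
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in vectors:
            best = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[best].append(v)
        # Move each centroid to the mean of its cluster (unchanged if the cluster is empty).
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    # For each centroid keep the index of the closest vector; sorting restores document order.
    chosen = {min(range(n), key=lambda i: sq_dist(vectors[i], c)) for c in centroids}
    return sorted(chosen)


# Hypothetical 2-D "sentence embeddings"; real BERT vectors have far more dimensions.
embeddings = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (5.0, 5.1), (5.1, 5.0), (5.0, 5.0)]
picked = nearest_to_centroids(embeddings, k=2)
print(picked)
```

In this picture, `num_sentences=10` corresponds to the number of clusters, so the summary holds one representative sentence per cluster.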

In [None]:
full = ''.join(result)  # result appears to be a single string already, so this join leaves it unchanged

#### **Final Summary**

In [None]:
pprint(full)

('Praise forFree Yourself From Fears“It has been said that the two great '
 'emotions are love and fear, andthat they form the foundation for all others. '
 'Sometimes it comes completely from our imaginationabout what might happen. '
 'Which would you rather live in?If you feel afraid when you are in the air:o '
 'Become aware of your feelings, and be curious about what you arefeeling.o '
 'Relax your body (see page ).o Use a safety anchor (page ).Fear of '
 'authorityMany people are frightened of authority figures—a fear that '
 'oftenstarts in childhood, where authority is power and has both the ability '
 'and the permission to hurt. What changes when you do this?How do you feel '
 'now?This skill works because the fear comes not from the authority '
 'itself,but from how you are thinking about it. The area ofinfluence is much '
 'smaller than the area of concern. Googlewhacking is a joke, but behind '
 'thehumor it is clear that the only way to get through the informationglut is '

## **Applying the second method for text summarization using the spaCy library**

In [None]:
nlp = spacy.load('en_core_web_sm')

In [None]:
document = nlp(data)

####**Splitting the text into tokens**

In [None]:
tokens = [token.text for token in document]

In [None]:
stopwords = list(STOP_WORDS)

####**Frequency of each word in the text**

In [None]:
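word_freq = {}
for word in document:
    # Skip stop words and punctuation (both checks are case-insensitive)
    if word.text.lower() not in stopwords:
        if word.text.lower() not in punctuation:
            # Keys keep the token's original casing, so "Fear" and "fear" are counted separately
            if word.text not in word_freq.keys():
                word_freq[word.text] = 1
            else:
                word_freq[word.text] += 1
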
#print(word_freq)
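
The counting loop stores keys with the token's original casing, while the scoring cell further down looks words up in lowercase, so capitalized occurrences are counted under a separate key. A minimal sketch of a case-consistent variant, using `collections.Counter` on plain strings, could look like this. The function name `word_frequencies` and the toy inputs are hypothetical, and in the notebook the tokens would come from the spaCy `document`.

```python
from collections import Counter
from string import punctuation

PUNCT = set(punctuation)  # single punctuation characters


def word_frequencies(tokens, stopwords):
    """Count tokens case-insensitively, skipping stop words, punctuation and blanks."""
    counts = Counter()
    for tok in tokens:
        key = tok.lower()
        if key in stopwords or key in PUNCT or not key.strip():
            continue
        counts[key] += 1
    return counts


# Toy input standing in for [token.text for token in document].
toy_tokens = ["Fear", "is", "a", "signal", ".", "fear", "protects", "us", "."]
toy_stopwords = {"is", "a", "us"}
freqs = word_frequencies(toy_tokens, toy_stopwords)
print(freqs)
```

With lowercase keys, the later lookup `word.text.lower() in word_freq` sees the combined count of every casing of a word.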

####**Finding maximum frequency**

In [None]:
max_freq = max(word_freq.values())
max_freq

516

####**Normalizing each word's frequency by the maximum frequency**

In [None]:
for word in word_freq.keys():
    word_freq[word] = word_freq[word]/max_freq

#print(word_freq)
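
Dividing by the maximum count gives the most frequent word a weight of 1.0 and every other word a weight between 0 and 1, so the values are relative weights rather than percentages of the text. A small sketch with made-up counts (the function name `normalize` and the numbers are hypothetical):

```python
def normalize(freqs):
    """Scale raw counts so that the most frequent word has weight 1.0."""
    top = max(freqs.values())
    return {word: count / top for word, count in freqs.items()}


weights = normalize({"fear": 4, "signal": 2, "anchor": 1})
print(weights)
```

This sketch builds a new dictionary instead of overwriting the counts in place, which keeps the raw counts available for inspection.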

####**Splitting the text into sentences**

In [None]:
sen_tokens = [sent for sent in document.sents]

####**Allotting a score to each sentence**

In [None]:
sen_scores = {}
for sent in sen_tokens:
    for word in sent:
        if word.text.lower() in word_freq.keys():
            if sent not in sen_scores.keys():
                sen_scores[sent] = word_freq[word.text.lower()]
            else:
                sen_scores[sent] += word_freq[word.text.lower()]
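
The scoring loop adds up the normalized weight of every known word in a sentence. The same idea can be sketched on plain strings as follows. The function name `score_sentences`, the whitespace tokenization, and the toy data are assumptions for illustration, whereas the notebook itself iterates over spaCy sentence spans.

```python
def score_sentences(sentences, weights):
    """Sum the weights of the known words in each sentence."""
    scores = {}
    for sent in sentences:
        total = sum(weights.get(word.lower(), 0.0) for word in sent.split())
        if total > 0:  # sentences without any known word get no entry
            scores[sent] = total
    return scores


toy_weights = {"fear": 1.0, "signal": 0.5}
toy_sentences = ["Fear is a signal", "Relax your body", "Fear of fear"]
scores = score_sentences(toy_sentences, toy_weights)
print(scores)
```

Because the score is a sum, longer sentences tend to score higher. That may be one reason run-together headings show up among the top-ranked sentences; dividing by sentence length would be one possible adjustment.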

In [None]:
length = int(len(sen_tokens)*0.003)  # keep roughly 0.3% of all sentences
length

16

#### **Finding the sentences with the highest scores**

In [None]:
summary = nlargest(length, sen_scores, key = sen_scores.get)
summary

[Primary fear will always be there to protect us, but many people livetheir lives within fear: fear of risk, fear of failure, fear of authority, fearof loss.,
 THE FEARThe value under the fearSelf-esteemThe value of respectFear of looking stupidFear of commitmentDEALING WITH FEAR IN THE BODYControlling your feeling of fearControlling fear through breathingControlling fear through feelingControlling fear through relaxationDEALING WITH FEAR IN THE MINDHow,
 PART IIUnreal Fear—Fear as FoeCHAPTER Fear in TimeThe present has three dimensions… the present of past things, the presentof present things and the present of future things.,
 Part III deals with authentic fear: fear as a signal to keep you outof danger and take action, and how to distinguish authentic fear fromunreal fear.,
 People who do brave acts are not fearless—the very reason we admirethem and see them as brave is because they feel the fear and do itanyway, unlike the usual response, which is to feel the fear and holdback.,
 W
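
`heapq.nlargest` returns its results ordered from highest to lowest score, not in the order the sentences appear in the text. If reading order matters, the selected sentences can be sorted back by position. A minimal sketch with hypothetical sentences and scores:

```python
from heapq import nlargest

# Toy sentences in document order, with made-up scores.
toy_sentences = [
    "Fear is a signal.",
    "Relax your body.",
    "Fear of fear is common.",
    "Use an anchor.",
]
toy_scores = dict(zip(toy_sentences, [1.5, 0.2, 2.0, 0.4]))

# Highest-scoring sentences first.
top = nlargest(2, toy_sentences, key=toy_scores.get)
# Same sentences, restored to their original order in the text.
in_order = sorted(top, key=toy_sentences.index)

print(top)
print(in_order)
```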

####**Joining the sentences with the highest scores**

In [None]:
final_summary = [sent.text for sent in summary]  # each item is a spaCy sentence span
summary = ' '.join(final_summary)

####**Final summary using spaCy**

In [None]:
summary

'Primary fear will always be there to protect us, but many people livetheir lives within fear: fear of risk, fear of failure, fear of authority, fearof loss. THE FEARThe value under the fearSelf-esteemThe value of respectFear of looking stupidFear of commitmentDEALING WITH FEAR IN THE BODYControlling your feeling of fearControlling fear through breathingControlling fear through feelingControlling fear through relaxationDEALING WITH FEAR IN THE MINDHow PART IIUnreal Fear—Fear as FoeCHAPTER Fear in TimeThe present has three dimensions… the present of past things, the presentof present things and the present of future things. Part III deals with authentic fear: fear as a signal to keep you outof danger and take action, and how to distinguish authentic fear fromunreal fear. People who do brave acts are not fearless—the very reason we admirethem and see them as brave is because they feel the fear and do itanyway, unlike the usual response, which is to feel the fear and holdback. We talk ofF

In [None]:
len(data)

381672

In [None]:
len(summary)

2391
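
The two character counts above give a rough compression ratio for the spaCy summary. A small sketch of the arithmetic, using the recorded values (the function name `compression_ratio` is hypothetical):

```python
def compression_ratio(original_len: int, summary_len: int) -> float:
    """Fraction of the original text length kept in the summary."""
    return summary_len / original_len


# Character counts taken from the outputs of len(data) and len(summary) above.
ratio = compression_ratio(381672, 2391)
print(f"{ratio:.2%}")
```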