# Sentence-Ranking
**Sentence ranking** is a popular approach for text summarization, where sentences are scored based on their importance and the top-ranked sentences are selected to form the summary. Here are some pros and cons of using sentence ranking for text summarization:

## Pros:

* It is a simple and intuitive approach that can be easily implemented.
* It can handle different types of text, such as news articles, scientific papers, and social media posts.
* Because sentences are copied verbatim, it preserves the original wording and, when the selected sentences are kept in source order, much of the original structure of the text.
* It can be combined with other techniques, such as sentence clustering and sentence compression, to improve the quality of summaries.
* It can be evaluated using standard metrics, such as ROUGE and BLEU, which allow for objective comparison with other summarization models.

## Cons:

* It can be sensitive to the choice of ranking algorithm and feature set, which can affect the quality of the summary.
* It may not capture the overall meaning of the text and may miss important information.
* It may generate redundant or repetitive information, especially when multiple sentences convey similar information.
* It may not handle text with complex syntax or domain-specific terminology well, which can lead to inaccuracies in the summary.
* It may not be able to generate summaries that are novel or creative, as it relies on the input text for content.

Overall, sentence ranking is a widely used and effective approach for text summarization, but its limitations should be considered when evaluating its performance and potential applications.

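These are the ROUGE-1 and BLEU scores we achieved on the example in this notebook, with the generated summary compared against the full source text rather than a separate human-written reference summary: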

      ROUGE Score:
      Precision: 0.833
      Recall: 0.331
      F1-Score: 0.474

      BLEU Score: 0.556


Here are some research papers that use sentence ranking for text summarization:

1. "TextRank: Bringing Order into Texts" by R. Mihalcea and P. Tarau. This paper introduces the TextRank algorithm, which is a graph-based approach for sentence ranking and has been widely used for text summarization.

2. "Graph-based Ranking Algorithms for Sentence Extraction, Applied to Text Summarization" by R. Mihalcea. This paper compares the performance of different graph-based ranking algorithms, including the PageRank-style ranking used by TextRank, for extractive text summarization.

3. "Enhancing Sentence Extraction-Based Single-Document Summarization with Supervised Methods" by D. Das and A. Sarkar. This paper proposes a supervised learning approach for sentence ranking based on features such as sentence length, position, and similarity to the document title.

4. "A Neural Attention Model for Abstractive Sentence Summarization" by A. Rush et al. This paper uses a neural attention model for abstractive summarization. Unlike the extractive methods above, it generates the summary word by word conditioned on the input sentence rather than ranking and selecting source sentences, so it serves as a point of comparison.

These papers demonstrate the versatility and effectiveness of sentence ranking for text summarization, and highlight the potential for combining this approach with other techniques to improve the quality of summaries.
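
The graph-based idea behind TextRank can be illustrated with a minimal sketch. It is not the reference implementation from the paper: the regex tokenizer, the demo sentences, and the function names (`tokenize`, `overlap_similarity`, `textrank_scores`) are assumptions made for illustration. Sentences are nodes, edge weights are a word-overlap similarity normalized by log sentence lengths, and scores come from a weighted PageRank-style iteration.

```python
import math
import re

def tokenize(sentence):
    """Lowercase a sentence and return its alphanumeric tokens (simplified tokenizer)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def overlap_similarity(tokens_a, tokens_b):
    """Shared-word count normalized by the log lengths of both sentences."""
    if len(tokens_a) < 2 or len(tokens_b) < 2:
        return 0.0  # log(1) = 0 would make the denominator degenerate
    shared = len(set(tokens_a) & set(tokens_b))
    return shared / (math.log(len(tokens_a)) + math.log(len(tokens_b)))

def textrank_scores(sentences, damping=0.85, iterations=50):
    """Run a weighted PageRank-style iteration over the sentence similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [[0.0 if i == j else overlap_similarity(tokens[i], tokens[j])
                for j in range(n)] for i in range(n)]
    out_sums = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to the edge weight j -> i
            incoming = sum(weights[j][i] / out_sums[j] * scores[j]
                           for j in range(n) if out_sums[j] > 0)
            new_scores.append((1 - damping) + damping * incoming)
        scores = new_scores
    return scores

# Hypothetical demo sentences, not taken from the notebook's input text
demo_sentences = [
    "The vaccination drive will expand to people over 60.",
    "People over 60 are now covered by the vaccination drive.",
    "Daily case counts have been declining in recent weeks.",
]
demo_scores = textrank_scores(demo_sentences)
for score, sentence in sorted(zip(demo_scores, demo_sentences), reverse=True):
    print(f"{score:.3f}  {sentence}")
```

In this toy graph the first two sentences share most of their words and reinforce each other, while the third has no neighbours and keeps only the baseline score of `1 - damping`.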

In [None]:
!pip install rouge
!pip install nltk
from rouge import Rouge 
import nltk
import nltk.translate.bleu_score as bleu
nltk.download('stopwords')
nltk.download('punkt')  # recent NLTK releases may also require nltk.download('punkt_tab')
import numpy as np
import pandas as pd
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


In [None]:
text ="""
 India's Health Ministry has announced that the country's COVID-19 vaccination drive will now be expanded to include people over the age of 60 and those over 45 with co-morbidities. The move is expected to cover an additional 270 million people, making it one of the largest vaccination drives in the world.The decision was taken after a meeting of the National Expert Group on Vaccine Administration for COVID-19 (NEGVAC), which recommended the expansion of the vaccination program. The NEGVAC also suggested that private hospitals may be allowed to administer the vaccine, although the details of this are yet to be finalized.India began its vaccination drive in mid-January, starting with healthcare and frontline workers. Since then, over 13 million doses have been administered across the country. However, the pace of the vaccination drive has been slower than expected, with concerns raised over vaccine hesitancy and logistical challenges.The expansion of the vaccination drive to include the elderly and those with co-morbidities is a major step towards achieving herd immunity and controlling the spread of the virus in India. The Health Ministry has also urged eligible individuals to come forward and get vaccinated at the earliest.India has reported over 11 million cases of COVID-19, making it the second-worst affected country in the world after the United States. The country's daily case count has been declining in recent weeks, but experts have warned that the pandemic is far from over and that precautions need to be maintained.
In summary, India's Health Ministry has announced that the country's COVID-19 vaccination drive will be expanded to include people over 60 and those over 45 with co-morbidities, covering an additional 270 million people. The decision was taken after a meeting of the National Expert Group on Vaccine Administration for COVID-19, and is a major step towards achieving herd immunity and controlling the spread of the virus in India."""

In [None]:
# Preprocess the text: lowercase it, then split it into sentences and words
stop_words = set(stopwords.words('english'))
stemmer = PorterStemmer()
sentences = sent_tokenize(text.lower())
words = word_tokenize(text.lower())

# Drop stop words and stem the remaining tokens
filtered_words = [stemmer.stem(word) for word in words if word not in stop_words]

# Build the TF-IDF matrix: one row per sentence, one column per vocabulary term
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(sentences)
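
To make the TF-IDF weighting concrete, here is a small pure-Python sketch of the same idea. It is an approximation under stated assumptions, not scikit-learn's code: the regex tokenizer is simplified, the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` and the L2 row normalization follow the defaults documented for `TfidfVectorizer`, and the helper name `tfidf_rows` and the demo sentences are hypothetical.

```python
import math
import re
from collections import Counter

def tfidf_rows(sentences):
    """Return one {term: weight} dict per sentence, with each row L2-normalized."""
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    n = len(tokenized)
    # Document frequency: in how many sentences each term appears
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term frequency times smoothed inverse document frequency
        raw = {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1)
               for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        rows.append({t: w / norm for t, w in raw.items()})
    return rows

# Hypothetical demo sentences
demo = ["the vaccine drive expands", "the vaccine is safe", "cases are declining"]
rows = tfidf_rows(demo)

# One simple sentence score: the sum of the TF-IDF weights in its row
row_sum_scores = [sum(row.values()) for row in rows]
print(row_sum_scores)
```

Terms that occur in fewer sentences receive a larger weight, so in the first demo sentence `expands` outweighs `the`.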

In [None]:
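# Score each sentence by summing the TF-IDF weights of the filtered words it contains.
# Note: filtered_words holds stems, so a word only counts when its stem is itself a vocabulary term.
vocabulary = vectorizer.vocabulary_  # maps each term to its column index in X
sentence_scores = []
for i in range(len(sentences)):
    sentence_score = 0
    for word in filtered_words:
        # get_feature_names() was removed in scikit-learn 1.2; a vocabulary lookup is equivalent and faster
        if word in vocabulary:
            sentence_score += X[i, vocabulary[word]]
    sentence_scores.append(sentence_score)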

# Sort the sentences by score, highest first
ranked_sentences = sorted(((sentence_scores[i], s) for i, s in enumerate(sentences)), reverse=True)

# Select the top N sentences
top_n = 3
selected_sentences = [sentence for _, sentence in ranked_sentences[:top_n]]

In [None]:
# Generate the summary
summary = " ".join(selected_sentences)
print(summary)

in summary, india's health ministry has announced that the country's covid-19 vaccination drive will be expanded to include people over 60 and those over 45 with co-morbidities, covering an additional 270 million people. the decision was taken after a meeting of the national expert group on vaccine administration for covid-19, and is a major step towards achieving herd immunity and controlling the spread of the virus in india. 
 india's health ministry has announced that the country's covid-19 vaccination drive will now be expanded to include people over the age of 60 and those over 45 with co-morbidities.
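
The output above shows the redundancy problem listed under the cons: the first and third selected sentences say almost the same thing. One common remedy is a greedy, MMR-style (maximal marginal relevance) selection that penalizes candidates similar to sentences already chosen. The sketch below is an illustration, not part of the original pipeline. The function names, the Jaccard similarity, the `trade_off` value, and the demo data are all assumptions; the input uses the same `(score, sentence)` tuple layout as `ranked_sentences`.

```python
import re

def token_set(sentence):
    """Set of lowercase alphanumeric tokens in a sentence."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))

def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def select_diverse(scored_sentences, top_n=3, trade_off=0.7):
    """Greedily pick sentences, trading relevance against overlap with earlier picks."""
    candidates = [(score, s, token_set(s)) for score, s in scored_sentences]
    max_score = max((score for score, _, _ in candidates), default=0.0) or 1.0
    chosen = []
    while candidates and len(chosen) < top_n:
        def mmr(item):
            score, _, tokens = item
            # Redundancy is the highest similarity to any sentence already chosen
            redundancy = max((jaccard(tokens, c[2]) for c in chosen), default=0.0)
            return trade_off * (score / max_score) - (1 - trade_off) * redundancy
        best = max(candidates, key=mmr)
        chosen.append(best)
        candidates.remove(best)
    return [s for _, s, _ in chosen]

# Hypothetical scores; the sentences are shortened variants of the notebook's output
demo_ranked = [
    (3.0, "the vaccination drive will be expanded to people over 60."),
    (2.9, "the vaccination drive will now be expanded to include people over 60."),
    (2.0, "daily case counts have been declining in recent weeks."),
]
picked = select_diverse(demo_ranked, top_n=2)
print(picked)
```

With these demo values the near-duplicate second sentence is skipped in favour of the lower-scored but novel third sentence. Raising `trade_off` toward 1.0 moves the behaviour back toward plain top-N ranking.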


In [None]:
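# Compare the generated summary (hypothesis) against the full source text (reference).
# Recall is bounded by how much of the source a 3-sentence summary can cover.
rouge = Rouge()
scores = rouge.get_scores(summary, text)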
print("ROUGE Score:")
print("Precision: {:.3f}".format(scores[0]['rouge-1']['p']))
print("Recall: {:.3f}".format(scores[0]['rouge-1']['r']))
print("F1-Score: {:.3f}".format(scores[0]['rouge-1']['f']))

ROUGE Score:
Precision: 0.833
Recall: 0.331
F1-Score: 0.474
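
As a consistency check, the reported F1 is the harmonic mean of precision and recall: 2 · 0.833 · 0.331 / (0.833 + 0.331) ≈ 0.474. The sketch below shows how unigram precision, recall, and F1 can be computed by hand. It is a simplified illustration with whitespace tokenization and clipped counts; the `rouge` package may tokenize and count differently, so its numbers need not match exactly. The function name `rouge_1` and the demo strings are assumptions.

```python
from collections import Counter

def rouge_1(hypothesis, reference):
    """Unigram precision, recall and F1 using clipped token counts."""
    hyp_counts = Counter(hypothesis.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count of each shared token
    overlap = sum((hyp_counts & ref_counts).values())
    precision = overlap / max(sum(hyp_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1

p, r, f = rouge_1("the drive will expand", "the vaccination drive will expand soon")
print(f"P={p:.3f} R={r:.3f} F1={f:.3f}")  # prints P=1.000 R=0.667 F1=0.800
```

A short hypothesis drawn entirely from a longer reference yields high precision and low recall, which is the pattern seen in the scores above.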


In [None]:
from nltk.translate.bleu_score import sentence_bleu

def summary_to_sentences(summary):
    # Split the summary into sentences using the '.' character as a separator
    sentences = summary.split('.')
    
    # Convert each sentence into a list of words
    sentence_lists = [sentence.split() for sentence in sentences]
    
    return sentence_lists

def paragraph_to_wordlist(paragraph):
    # Split the paragraph into words using whitespace as a separator
    words = paragraph.split()
    return words

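# References: each '.'-delimited chunk of the source text as a token list.
# `text` keeps its original casing while `summary` is lowercased, so capitalized tokens cannot match.
reference_paragraph = text
reference_summary = summary_to_sentences(reference_paragraph)
# Hypothesis: the generated summary as a single token list
predicted_paragraph = summary
predicted_summary = paragraph_to_wordlist(predicted_paragraph)

# sentence_bleu(references, hypothesis) with the default uniform 1- to 4-gram weights
score = sentence_bleu(reference_summary, predicted_summary)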
print(score)

0.5559999307354189


In [None]:
print("BLEU Score: {:.3f}".format(score))

BLEU Score: 0.556
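
To show what BLEU measures, the sketch below implements a unigram-only version: clipped precision multiplied by the brevity penalty. This is a deliberate simplification. Full BLEU, as computed by `sentence_bleu` with its default weights, takes a geometric mean over 1- to 4-gram precisions, so the numbers will differ. The function name `unigram_bleu` and the demo tokens are assumptions, and the reference list is assumed to be non-empty.

```python
import math
from collections import Counter

def unigram_bleu(candidate, references):
    """BLEU restricted to unigrams: clipped precision times brevity penalty."""
    cand_counts = Counter(candidate)
    # For each token, the most times it appears in any single reference
    max_ref_counts = Counter()
    for ref in references:
        for token, count in Counter(ref).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)
    clipped = sum(min(count, max_ref_counts[token])
                  for token, count in cand_counts.items())
    precision = clipped / len(candidate) if candidate else 0.0
    # Use the reference length closest to the candidate length (ties go to the shorter one)
    closest_ref_len = min((len(ref) for ref in references),
                          key=lambda n: (abs(n - len(candidate)), n))
    if len(candidate) > closest_ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - closest_ref_len / max(len(candidate), 1))
    return brevity_penalty * precision

demo_reference = ["the drive expands".split()]
repeated = unigram_bleu("the the drive".split(), demo_reference)
short = unigram_bleu("the drive".split(), demo_reference)
print(f"{repeated:.3f} {short:.3f}")
```

Clipping stops the repeated `the` from being credited twice, and the brevity penalty lowers the score of the two-token candidate even though every one of its tokens matches.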
