## Mandate-4 Contributions
### Notebook-2

In this notebook, we train the model on a random sample of the PAWS data, after swapping the input and target columns and shuffling the rows. First, download the PAWS data:

In [None]:
!mkdir data
!wget https://storage.googleapis.com/paws/english/paws_wiki_labeled_final.tar.gz -P data
!tar -xvf data/paws_wiki_labeled_final.tar.gz -C data
!mv data/final/* data
!rm -r data/final

--2023-05-04 02:39:08--  https://storage.googleapis.com/paws/english/paws_wiki_labeled_final.tar.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 173.194.193.128, 173.194.194.128, 173.194.195.128, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|173.194.193.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4687157 (4.5M) [application/gzip]
Saving to: ‘data/paws_wiki_labeled_final.tar.gz’


2023-05-04 02:39:08 (110 MB/s) - ‘data/paws_wiki_labeled_final.tar.gz’ saved [4687157/4687157]

final/test.tsv
final/
final/train.tsv
final/dev.tsv
--2023-05-04 02:39:09--  http://qim.fs.quoracdn.net/quora_duplicate_questions.tsv
Resolving qim.fs.quoracdn.net (qim.fs.quoracdn.net)... 151.101.1.2, 151.101.65.2, 151.101.129.2, ...
Connecting to qim.fs.quoracdn.net (qim.fs.quoracdn.net)|151.101.1.2|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 58176133 (55M) [text/tab-separated-values]
Saving to: ‘data/quora_duplicate

Next, install the training libraries along with `evaluate` and `rouge-score`, which are used later to score the model's output.

In [None]:
!pip install simpletransformers
!pip install utils
!pip install transformers==4.23.1

!pip install evaluate
!pip install rouge-score

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting simpletransformers
  Downloading simpletransformers-0.63.11-py3-none-any.whl (250 kB)
Collecting transformers>=4.6.0
  Downloading transformers-4.28.1-py3-none-any.whl (7.0 MB)
Collecting seqeval
  Downloading seqeval-1.2.2.tar.gz (43 kB)
  Preparing metadata (setup.py) ... done
Collecting sentencepiece
  Downloading sentencepiece-0.1.99-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)

In [None]:
import os
from datetime import datetime
import logging

import pandas as pd
from sklearn.model_selection import train_test_split
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs


Functions for loading the data and pre-processing the datasets.

In [None]:
import warnings

import pandas as pd


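def load_data(file_path, input_text_column, target_text_column, label_column, keep_label=1):
    """Load a TSV file and keep only the rows whose label equals keep_label."""
    # on_bad_lines="skip" replaces error_bad_lines=False, which was deprecated
    # in pandas 1.3 and removed in pandas 2.0.
    df = pd.read_csv(file_path, sep="\t", on_bad_lines="skip")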
    df = df.loc[df[label_column] == keep_label]
    df = df.rename(
        columns={input_text_column: "input_text", target_text_column: "target_text"}
    )
    df = df[["input_text", "target_text"]]
    df["prefix"] = "paraphrase"

    return df


def clean_unnecessary_spaces(out_string):
    if not isinstance(out_string, str):
        warnings.warn(f">>> {out_string} <<< is not a string.")
        out_string = str(out_string)
    out_string = (
        out_string.replace(" .", ".")
        .replace(" ?", "?")
        .replace(" !", "!")
        .replace(" ,", ",")
        .replace(" ' ", "'")
        .replace(" n't", "n't")
        .replace(" 'm", "'m")
        .replace(" 's", "'s")
        .replace(" 've", "'ve")
        .replace(" 're", "'re")
        .replace("@", "")
        .replace("#", "")
        .replace("*", "")
    )
    return out_string
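
To see what `clean_unnecessary_spaces` does to text in the space-separated style of the raw PAWS files, here is a small self-contained sketch. It uses a condensed copy of the function above (`clean_spaces_demo`, a hypothetical name covering only a subset of the replacements) and made-up sample sentences, so treat it as an illustration rather than the notebook's exact preprocessing.

```python
def clean_spaces_demo(text):
    """Condensed, illustrative copy of clean_unnecessary_spaces (subset of rules)."""
    for old, new in [
        (" .", "."),
        (" ?", "?"),
        (" ,", ","),
        (" n't", "n't"),
        (" 's", "'s"),
        ("#", ""),
        ("@", ""),
        ("*", ""),
    ]:
        text = text.replace(old, new)
    return text


# Hypothetical inputs written like the tokenised PAWS sentences.
samples = [
    "William Henry Harman was born in Waynesboro , Virginia .",
    "They do n't know what the team 's plan is ?",
    "Remove #hash and @mention *markers* .",
]
cleaned = [clean_spaces_demo(s) for s in samples]
for before, after in zip(samples, cleaned):
    print(f"{before!r} -> {after!r}")
```

The replacements are applied in order, so the space before punctuation is removed first and the symbol characters are dropped last.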


Load the Google PAWS data, keeping only the pairs labelled as paraphrases (label 1):

In [None]:
# Google Data
train_df = pd.read_csv("data/train.tsv", sep="\t").astype(str)
eval_df = pd.read_csv("data/dev.tsv", sep="\t").astype(str)

train_df = train_df.loc[train_df["label"] == "1"]
eval_df = eval_df.loc[eval_df["label"] == "1"]

train_df = train_df.rename(
    columns={"sentence1": "input_text", "sentence2": "target_text"}
)
eval_df = eval_df.rename(
    columns={"sentence1": "input_text", "sentence2": "target_text"}
)

train_df = train_df[["input_text", "target_text"]]
eval_df = eval_df[["input_text", "target_text"]]

train_df["prefix"] = "paraphrase"
eval_df["prefix"] = "paraphrase"


In [None]:
train_df

Unnamed: 0,input_text,target_text,prefix
1,The NBA season of 1975 -- 76 was the 30th seas...,The 1975 -- 76 season of the National Basketba...,paraphrase
3,When comparable rates of flow can be maintaine...,The results are high when comparable flow rate...,paraphrase
4,It is the seat of Zerendi District in Akmola R...,It is the seat of the district of Zerendi in A...,paraphrase
5,William Henry Henry Harman was born on 17 Febr...,"William Henry Harman was born in Waynesboro , ...",paraphrase
7,With a discrete amount of probabilities Formul...,Given a discrete set of probabilities formula ...,paraphrase
...,...,...,...
49384,"The Romanesque language , Galician ( Galego ) ...",The Romance language currently spoken in Galic...,paraphrase
49390,Note that k is a vector consisting of three in...,It is necessary to note that k is a vector con...,paraphrase
49393,"Tim Henman won in the final 6 -- 2 , 7 -- 6 , ...","Tim Tim Henman won 6 -- 2 , 7 -- 6 against Yev...",paraphrase
49395,He was considered an active member of the coun...,He was considered an active member of the Coun...,paraphrase


In [None]:

train_df = train_df[["prefix", "input_text", "target_text"]]
eval_df = eval_df[["prefix", "input_text", "target_text"]]

train_df = train_df.dropna()
eval_df = eval_df.dropna()

train_df["input_text"] = train_df["input_text"].apply(clean_unnecessary_spaces)
train_df["target_text"] = train_df["target_text"].apply(clean_unnecessary_spaces)

eval_df["input_text"] = eval_df["input_text"].apply(clean_unnecessary_spaces)
eval_df["target_text"] = eval_df["target_text"].apply(clean_unnecessary_spaces)

print(train_df)

train_df

           prefix                                        target_text  \
1      paraphrase  The NBA season of 1975 -- 76 was the 30th seas...   
3      paraphrase  When comparable rates of flow can be maintaine...   
4      paraphrase  It is the seat of Zerendi District in Akmola R...   
5      paraphrase  William Henry Henry Harman was born on 17 Febr...   
7      paraphrase  With a discrete amount of probabilities Formul...   
...           ...                                                ...   
49384  paraphrase  The Romanesque language, Galician ( Galego ), ...   
49390  paraphrase  Note that k is a vector consisting of three in...   
49393  paraphrase  Tim Henman won in the final 6 -- 2, 7 -- 6, ag...   
49395  paraphrase  He was considered an active member of the coun...   
49397  paraphrase  She was in Cork on June 24 and arrived on 8 Ju...   

                                              input_text  
1      The 1975 -- 76 season of the National Basketba...  
3      The result

Unnamed: 0,prefix,target_text,input_text
1,paraphrase,The NBA season of 1975 -- 76 was the 30th seas...,The 1975 -- 76 season of the National Basketba...
3,paraphrase,When comparable rates of flow can be maintaine...,The results are high when comparable flow rate...
4,paraphrase,It is the seat of Zerendi District in Akmola R...,It is the seat of the district of Zerendi in A...
5,paraphrase,William Henry Henry Harman was born on 17 Febr...,"William Henry Harman was born in Waynesboro, V..."
7,paraphrase,With a discrete amount of probabilities Formul...,Given a discrete set of probabilities formula ...
...,...,...,...
49384,paraphrase,"The Romanesque language, Galician ( Galego ), ...",The Romance language currently spoken in Galic...
49390,paraphrase,Note that k is a vector consisting of three in...,It is necessary to note that k is a vector con...
49393,paraphrase,"Tim Henman won in the final 6 -- 2, 7 -- 6, ag...","Tim Tim Henman won 6 -- 2, 7 -- 6 against Yevg..."
49395,paraphrase,He was considered an active member of the coun...,He was considered an active member of the Coun...


Swap the input and target columns and shuffle the rows. We then keep 2000 rows for training and the first 250 rows for evaluation.

In [None]:
# Swap the two text columns in a single rename call.
train_df = train_df.rename(
    columns={"input_text": "target_text", "target_text": "input_text"}
)

In [None]:
train_df

Unnamed: 0,prefix,target_text,input_text
1,paraphrase,The NBA season of 1975 -- 76 was the 30th seas...,The 1975 -- 76 season of the National Basketba...
3,paraphrase,When comparable rates of flow can be maintaine...,The results are high when comparable flow rate...
4,paraphrase,It is the seat of Zerendi District in Akmola R...,It is the seat of the district of Zerendi in A...
5,paraphrase,William Henry Henry Harman was born on 17 Febr...,"William Henry Harman was born in Waynesboro, V..."
7,paraphrase,With a discrete amount of probabilities Formul...,Given a discrete set of probabilities formula ...
...,...,...,...
49384,paraphrase,"The Romanesque language, Galician ( Galego ), ...",The Romance language currently spoken in Galic...
49390,paraphrase,Note that k is a vector consisting of three in...,It is necessary to note that k is a vector con...
49393,paraphrase,"Tim Henman won in the final 6 -- 2, 7 -- 6, ag...","Tim Tim Henman won 6 -- 2, 7 -- 6 against Yevg..."
49395,paraphrase,He was considered an active member of the coun...,He was considered an active member of the Coun...


In [None]:
print(type(train_df))
train_df_shuffled = train_df.sample(frac = 1)
train_df_shuffled = train_df_shuffled.head(2000)
eval_df = eval_df.head(250)
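
The column swap and row shuffle can be tried on a toy frame first. The sketch below is a minimal illustration with made-up values; `toy_df`, `n_rows`, and the `random_state` seed are assumptions added here (the notebook itself does not fix a seed, so its sample differs on every run).

```python
import pandas as pd

# Toy stand-in for train_df; the values are made up for illustration.
toy_df = pd.DataFrame(
    {
        "prefix": ["paraphrase"] * 4,
        "input_text": ["a1", "b1", "c1", "d1"],
        "target_text": ["a2", "b2", "c2", "d2"],
    }
)

# Swap the two text columns by renaming both at once.
swapped = toy_df.rename(
    columns={"input_text": "target_text", "target_text": "input_text"}
)

# Shuffle all rows, then keep the first n_rows of the shuffled frame.
# random_state is an assumption added so the shuffle is repeatable.
n_rows = 3
shuffled = swapped.sample(frac=1, random_state=42).head(n_rows)

print(shuffled)
```

Renaming only changes the labels, not the column positions, which is why the displayed frames list `target_text` before `input_text` after the swap.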

In [None]:
eval_df

Unnamed: 0,prefix,input_text,target_text
1,paraphrase,They were there to enjoy us and they were ther...,They were there for us to enjoy and they were ...
2,paraphrase,"After the end of the war in June 1902, Higgins...","In August, after the end of the war in June 19..."
3,paraphrase,From the merger of the Four Rivers Council and...,Shawnee Trails Council was formed from the mer...
4,paraphrase,The group toured extensively and became famous...,The group toured extensively and was famous in...
5,paraphrase,Kathy and her husband Pete Beale ( Peter Dean ...,Kathy and her husband Peter Dean ( Pete Beale ...
...,...,...,...
548,paraphrase,"Like the Pacific Maritime Ecozone, Coquitlam l...","Like Pacific Maritime Ecozone, Coquitlam is in..."
549,paraphrase,"The nations of Macedonia, Kenya, Azerbaijan, U...","The nations Macedonia, Kenya, Azerbaijan, Urug..."
552,paraphrase,"On 29 November 1171, Gonzalo signed a document...","On 29 November 1171, Gonzalo signed a charter ..."
554,paraphrase,The remains of two Armenian churches still pre...,The remains of two still preserved Armenian ch...


In [None]:
train_df_shuffled

Unnamed: 0,prefix,target_text,input_text
35288,paraphrase,He graduated from the Military School in Sofia...,He graduated from the Military School in Sofia...
31457,paraphrase,"He developed players like Stefan Bogomilov , B...","He developed players like Stefan Bogomilov , B..."
40845,paraphrase,"When the police made the discovery , Latimer d...","When the police made the discovery , Latimer d..."
18572,paraphrase,The school is headed by the Neelaveni Thayaram...,The school is managed by the Neelaveni Thayara...
28385,paraphrase,Later he joined Calcutta Football League and p...,He later joined the Calcutta Football League a...
...,...,...,...
9882,paraphrase,Flashpoint : Cold War Crisis is a tactical sho...,Operation Flashpoint : Cold War Crisis is a ta...
37936,paraphrase,kinetic compensation : an increase in the pree...,Kinetic Compensation : An increase in the pree...
45912,paraphrase,The defect in the mechanism could not be detec...,The defect of the mechanism could not be seen ...
2511,paraphrase,"The participants included TJ Trinidad , Joem B...","TJ Trinidad , Biboy Ramirez , Eric Fructuoso ,..."


In [None]:

model_args = Seq2SeqArgs()
model_args.eval_batch_size = 64
model_args.evaluate_during_training = True
model_args.evaluate_during_training_steps = 2500
model_args.evaluate_during_training_verbose = True
model_args.fp16 = False
model_args.learning_rate = 5e-5
model_args.max_seq_length = 128
model_args.num_train_epochs = 4
model_args.overwrite_output_dir = True
model_args.reprocess_input_data = True
model_args.save_eval_checkpoints = False
model_args.save_steps = -1
model_args.train_batch_size = 8
model_args.use_multiprocessing = False
model_args.do_sample = True
model_args.num_beams = None
model_args.num_return_sequences = 3
model_args.max_length = 128
model_args.top_k = 50
model_args.top_p = 0.95
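
The sampling settings above (`do_sample = True`, `top_k = 50`, `top_p = 0.95`) restrict which tokens can be drawn at each decoding step. The following is a minimal pure-Python sketch of that idea on a toy distribution. The function name `top_k_top_p_filter` and the four-token vocabulary are hypothetical, and the actual filtering inside the transformers library works on logits and differs in detail.

```python
def top_k_top_p_filter(probs, top_k, top_p):
    """Return a renormalised distribution restricted by top-k, then top-p."""
    # Rank tokens from most to least probable and keep the top_k best.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    ranked = [(tok, p / total) for tok, p in ranked]

    # Keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}


# Toy next-token distribution (made up for illustration).
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
filtered = top_k_top_p_filter(toy_probs, top_k=3, top_p=0.8)
print(filtered)
```

With `num_return_sequences = 3`, three such sampled sequences are returned per input, which is why each test sentence later in the notebook prints three predictions.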


In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Mounted at /content/gdrive


Here we continue training the model that was trained on the MSR data in the previous notebook.

In [None]:
model = Seq2SeqModel(
    encoder_decoder_type="bart",
    encoder_decoder_name="/content/gdrive/My Drive/2MayNewModel/best_model",
    args=model_args
)

model.train_model(train_df_shuffled, eval_data=eval_df)

  0%|          | 0/2000 [00:00<?, ?it/s]

Epoch:   0%|          | 0/4 [00:00<?, ?it/s]

Running Epoch 0 of 4:   0%|          | 0/250 [00:00<?, ?it/s]

  0%|          | 0/250 [00:00<?, ?it/s]

Running Epoch 1 of 4:   0%|          | 0/250 [00:00<?, ?it/s]

  0%|          | 0/250 [00:00<?, ?it/s]

Running Epoch 2 of 4:   0%|          | 0/250 [00:00<?, ?it/s]

  0%|          | 0/250 [00:00<?, ?it/s]

Running Epoch 3 of 4:   0%|          | 0/250 [00:00<?, ?it/s]

  0%|          | 0/250 [00:00<?, ?it/s]

(1000,
 {'global_step': [250, 500, 750, 1000],
  'eval_loss': [0.17218295112252235,
   0.16605503857135773,
   0.16605303436517715,
   0.16605303436517715],
  'train_loss': [0.14816062152385712,
   0.106998011469841,
   0.09026952087879181,
   0.10656018555164337]})
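
The step counts in this output follow directly from the data size and the training arguments. The short calculation below is a sketch that assumes a single device and no gradient accumulation (neither is set explicitly in the notebook).

```python
import math

train_rows = 2000        # rows kept from the shuffled training set
train_batch_size = 8     # model_args.train_batch_size
num_train_epochs = 4     # model_args.num_train_epochs
eval_every_steps = 2500  # model_args.evaluate_during_training_steps

# One optimisation step per batch (assumes no gradient accumulation).
steps_per_epoch = math.ceil(train_rows / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(steps_per_epoch, total_steps)    # 250 1000
print(eval_every_steps > total_steps)  # True
```

This matches the `global_step` values `[250, 500, 750, 1000]` reported above. Because `evaluate_during_training_steps` (2500) is larger than the total number of steps, the step-based evaluation interval is never reached in this run; the recorded evaluations line up with the end of each epoch.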

Save the model to a separate folder in Google Drive:

In [None]:
# !pwd
# !cd /content/outputs && ls -l
# !ls -l
!cp -r "/content/outputs/best_model" "/content/gdrive/My Drive/4May"

Test the new model:

In [None]:
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

model = BartForConditionalGeneration.from_pretrained("/content/gdrive/My Drive/4May/best_model", forced_bos_token_id=0)
tok = BartTokenizer.from_pretrained("/content/gdrive/My Drive/4May/best_model")

example_english_phrase = "Transformers Transformers are fast plus efficient"
batch = tok(example_english_phrase, return_tensors="pt")
generated_ids = model.generate(batch["input_ids"])
generated_sentence = tok.batch_decode(generated_ids, skip_special_tokens=True)

print(generated_sentence)

In [None]:
import logging

from simpletransformers.seq2seq import Seq2SeqModel


logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.ERROR)

model = Seq2SeqModel(encoder_decoder_type="bart",
                     encoder_decoder_name="/content/gdrive/My Drive/4May/best_model"
                     ,use_cuda=False
                     )

to_predict = ["They were there to enjoy us and they were there to pray for us.",
              "What is the best way to play Cricket?",
              "This is the best way to create food.",
              "What are different ways to play badminton?",
              "We should not talk in rude manner to guests.",
              "I do not know how many times I have prepared this model.",
              "This is how you make cake",
              "We remove special symbols like #, @, *, etc. We cannot remove useful punctuations like question mark because the paraphrase of a question will be a question only. We use  regex or python string functions.",
              "Amrozi accused his brother, whom he called \"the witness\", of deliberately distorting  his evidence.",
              "What are the ways to commit suicide?",
              "How can I get started using Quora?",
              "How to play chess?"
              ]

print(type(to_predict))
predicted = []
for sentence in to_predict:
    # predict() returns one list of candidate paraphrases per input sentence.
    preds = model.predict([sentence])
    predicted.append(preds[0])

    print("---------------------------------------------------------")
    print(sentence)

    print()
    print("Predictions >>>")
    for pred in preds[0]:
        print(pred)

    print("---------------------------------------------------------")
    print()

<class 'list'>


Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
They were there to enjoy us and they were there to pray for us.

Predictions >>>
They were there to enjoy us and they were there for us to pray for us.
They were there to enjoy us and they were there for us to pray for us.
They were there to enjoy us and they were there for us to pray for us.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What is the best way to play Cricket?

Predictions >>>
What is the best way to play cricket?
What is the best way to play cricket?
What is the best way to play cricket?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
This is the best way to create food.

Predictions >>>
This is the best way to create food.
This is the best way to create food.
This is the best way to create food.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What are different ways to play badminton?

Predictions >>>
What are some different ways to play badminton?
What are some different ways to play badminton?
What are some different ways to play badminton?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
We should not talk in rude manner to guests.

Predictions >>>
We should not talk in a rude manner to guests.
We should not talk in a rude manner to guests.
We should not talk in a rude manner to guests.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
I do not know how many times I have prepared this model.

Predictions >>>
I do not know how many times I have prepared this model.
I do not know how many times I have prepared this model.
I do not know how many times I have prepared this model.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
This is how you make cake

Predictions >>>
This is how you make a cake from scratch
This is how you make a cake - how do I bake one?
This is how you make a cake from scratch
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
We remove special symbols like #, @, *, etc. We cannot remove useful punctuations like question mark because the paraphrase of a question will be a question only. We use  regex or python string functions.

Predictions >>>
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use regex or python string functions.
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use regex or python string functions.
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use regex or python string functions.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Amrozi accused his brother, whom he called "the witness", of deliberately distorting  his evidence.

Predictions >>>
Amrozi accused his brother, whom he called "the witness, of deliberately distorting his evidence.
Amrozi accused his brother, whom he called "the witness, of deliberately distorting his evidence.
Amrozi accused his brother, whom he called "the witness, of deliberately distorting his evidence.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What are the ways to commit suicide?

Predictions >>>
How can I commit suicide?
How can I commit suicide?
How can I commit suicide?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
How can I get started using Quora?

Predictions >>>
How can I get started with Quora?
How can I get started with Quora?
How can I get started with Quora?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
How to play chess?

Predictions >>>
How can I play chess?
How can I play chess?
How can I play chess?
---------------------------------------------------------



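### Evaluation using ROUGE score
We now score the model's output with ROUGE, starting with the sentences above.
Note that the call below passes the input sentences as `predictions` and the generated paraphrases as `references`, so the score measures word overlap between each input and its paraphrases rather than accuracy against gold targets.
Below are the scores for the model trained in this notebook: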

In [None]:
import evaluate
rouge = evaluate.load('rouge')

results = rouge.compute(predictions=to_predict, references=predicted)
print(results)

{'rouge1': 0.8656432748538013, 'rouge2': 0.7579131652661064, 'rougeL': 0.8654553049289891, 'rougeLsum': 0.8648287385129491}
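
To make the `rouge1` number more concrete, here is a simplified sketch of a unigram-overlap F-measure between one input sentence and one generated paraphrase. The helper names `tokenize` and `rouge1_f` are hypothetical, and the tokenisation is an assumption chosen for brevity; the `rouge-score` library's own tokenisation and its aggregation over several references differ, so this will not reproduce the figures above exactly.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplification)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rouge1_f(candidate, reference):
    """Unigram-overlap F-measure between two strings."""
    cand_counts = Counter(tokenize(candidate))
    ref_counts = Counter(tokenize(reference))
    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


source = "How to play chess?"
paraphrase = "How can I play chess?"
score = rouge1_f(source, paraphrase)
print(round(score, 4))  # 0.6667
```

Three of the four input tokens reappear in the five-token paraphrase, giving precision 0.75 and recall 0.6. Under this setup a score close to 1 means the output is nearly a copy of the input, so a very high score is not necessarily a sign of better paraphrasing.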


ROUGE scores for the model trained in Mandate-3, on the same sentences:

In [None]:
import logging

from simpletransformers.seq2seq import Seq2SeqModel


logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.ERROR)

model = Seq2SeqModel(encoder_decoder_type="bart", 
                     encoder_decoder_name="/content/gdrive/My Drive/best_model"
                     ,use_cuda=True
                     )

print(type(to_predict))
predicted = []
for sentence in to_predict:
    # predict() returns one list of candidate paraphrases per input sentence.
    preds = model.predict([sentence])
    predicted.append(preds[0])

    print("---------------------------------------------------------")
    print(sentence)

    print()
    print("Predictions >>>")
    for pred in preds[0]:
        print(pred)

    print("---------------------------------------------------------")
    print()

rouge = evaluate.load('rouge')

results = rouge.compute(predictions=to_predict, references=predicted)
print("ACCURACY ::::")
print(results)

<class 'list'>


Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
They were there to enjoy us and they were there to pray for us.

Predictions >>>
They were there to enjoy us and they were there also to pray for us.
They were there to enjoy us and they were there also to pray for us.
They were there to enjoy us and they were there also to pray for us.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What is the best way to play Cricket?

Predictions >>>
What is the best way to play cricket?
How do I play cricket?
How do I play cricket?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
This is the best way to create food.

Predictions >>>
How can I create my own food?
This is the best way to create food.
How can I create my own food?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What are different ways to play badminton?

Predictions >>>
How do I play badminton?
How do I play badminton?
How do I play badminton?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
We should not talk in rude manner to guests.

Predictions >>>
We should not talk in a rude manner to guests.
We should not talk in a rude manner to guests.
We should not talk in a rude manner to guests.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
I do not know how many times I have prepared this model.

Predictions >>>
I do not know how many times I have prepared this model.
How many times I have prepared this model?
How many times have I prepared this model?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
This is how you make cake

Predictions >>>
How do I make a cake?
How do I make a cake?
How do I make a cake?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
We remove special symbols like #, @, *, etc. We cannot remove useful punctuations like question mark because the paraphrase of a question will be a question only. We use  regex or python string functions.

Predictions >>>
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use  regex or python string functions.
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use  regex or python string functions. We cannot remove useful punctuations like question mark because the answer is a question.
We remove special symbols like #, @, *, etc. because the paraphrase of a question will be a question only. We use  regex or python string functions. We cannot remove useful punctuations like question mark because the answer is a question.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Amrozi accused his brother, whom he called "the witness", of deliberately distorting  his evidence.

Predictions >>>
Amrozi accused his brother, whom he called "the witness", of deliberately distorting  his evidence.
Amrozi accused his brother, whom he called "the witness", of deliberately distorting  his evidence.
Amrozi accused his brother, whom he called "the witness", of deliberately distorting  his evidence.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
What are the ways to commit suicide?

Predictions >>>
How can I commit suicide?
How can I commit suicide?
How can I commit suicide?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
How can I get started using Quora?

Predictions >>>
How do I get started with Quora?
How do I get started with Quora?
How do I get started with Quora?
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
How to play chess?

Predictions >>>
How do I play chess?
How do I play chess?
How do I play chess?
---------------------------------------------------------

ACCURACY ::::
{'rouge1': 0.7882248725261429, 'rouge2': 0.6397759103641457, 'rougeL': 0.7779830322933771, 'rougeLsum': 0.7827344942816812}


In [None]:
!cd data && ls -l

Code for evaluating on the test split of the PAWS dataset:

In [None]:
# Google Data
test_df = pd.read_csv("data/test.tsv", sep="\t").astype(str)
test_df = test_df.loc[test_df["label"] == "1"]
test_df = test_df.rename(
    columns={"sentence1": "input_text", "sentence2": "target_text"}
)

test_df = test_df[["input_text", "target_text"]]
test_df["prefix"] = "paraphrase"

test_df = test_df[["prefix", "input_text", "target_text"]]

test_df = test_df.dropna()

test_df["input_text"] = test_df["input_text"].apply(clean_unnecessary_spaces)
test_df["target_text"] = test_df["target_text"].apply(clean_unnecessary_spaces)


test_df

Unnamed: 0,prefix,input_text,target_text
2,paraphrase,"In January 2011, the Deputy Secretary General ...","In January 2011, FIBA Asia deputy secretary ge..."
5,paraphrase,"During her sophomore, junior and senior summer...","During her second, junior and senior summers, ..."
7,paraphrase,"His father emigrated to Missouri in 1868, but ...",His father emigrated to Missouri in 1868 but r...
9,paraphrase,It is situated south of Köroğlu Mountains and ...,It is situated south of Köroğlu - mountains an...
10,paraphrase,The Río Blanco mine is a large copper mine loc...,The Río Blanco - Mine is a large copper mine i...
...,...,...,...
7990,paraphrase,Twice Sparrow sold the island twice to Thomas ...,Sparrow twice sold the island to Thomas Polloc...
7994,paraphrase,The name in Tupi means `` insensitive stone ''...,"The name in Tupi means '' hard stone ``, '' in..."
7995,paraphrase,"The company has branches in Tokyo, based in th...",The company has branches in Tokyo based in Sai...
7997,paraphrase,The modern coat of arms of Bavaria was designe...,The modern coat of arms of Bavaria was designe...


In [None]:
test_df_accuracy = test_df.head(70)

In [None]:
test_df_accuracy

Unnamed: 0,prefix,input_text,target_text
2,paraphrase,"In January 2011, the Deputy Secretary General ...","In January 2011, FIBA Asia deputy secretary ge..."
5,paraphrase,"During her sophomore, junior and senior summer...","During her second, junior and senior summers, ..."
7,paraphrase,"His father emigrated to Missouri in 1868, but ...",His father emigrated to Missouri in 1868 but r...
9,paraphrase,It is situated south of Köroğlu Mountains and ...,It is situated south of Köroğlu - mountains an...
10,paraphrase,The Río Blanco mine is a large copper mine loc...,The Río Blanco - Mine is a large copper mine i...
...,...,...,...
164,paraphrase,The Thrash - Metal - Group Anthrax opened Kiss...,The thrash metal group Anthrax opened for Kiss...
165,paraphrase,"According to the Bureau of Meteorology, the lo...","According to the Bureau of Meteorology, the lo..."
171,paraphrase,He was born in a small kingdom of 1000 `` li '...,He was born in a small kingdom of 1000 `` li '...
173,paraphrase,Holodecks can also be used to encourage studen...,Holodecks can also be used to encourage pupils...


In [None]:
to_predict_small = [
     input_text
    for  input_text in test_df_accuracy["input_text"].tolist()
]

to_predict_small

In [None]:
from simpletransformers.seq2seq import Seq2SeqModel

# Load the fine-tuned BART checkpoint saved during training.
# Set use_cuda=False if no GPU is available.
model = Seq2SeqModel(
    encoder_decoder_type="bart",
    encoder_decoder_name="/content/gdrive/My Drive/4May/best_model",
    use_cuda=True,
)

# predicted[k] holds the list of candidate paraphrases for to_predict_small[k].
predicted = []
for sentence in to_predict_small:
    # predict() takes a list of inputs; preds[0] is the candidate list for this sentence.
    preds = model.predict([sentence])
    predicted.append(preds[0])

    print("---------------------------------------------------------")
    print(sentence)
    print()
    print("Predictions >>>")
    for pred in preds[0]:
        print(pred)
    print("---------------------------------------------------------")
    print()

Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
In January 2011, the Deputy Secretary General of FIBA Asia, Hagop Khajirian, inspected the venue together with SBP - President Manuel V. Pangilinan.

Predictions >>>
In January 2011 the deputy secretary general of FIBA Asia, Hagop Khajirian, inspected the venue together with SBP President Manuel V. Pangilinan.
In January 2011 the deputy secretary general of FIBA Asia, Hagop Khajirian, inspected the venue together with SBP President Manuel V. Pangilinan.
In January 2011 the deputy secretary general of FIBA Asia, Hagop Khajirian, inspected the venue together with SBP President Manuel V. Pangilinan.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
During her sophomore, junior and senior summers, she spent half of it with her Alaska team, and half playing, and living in Oregon.

Predictions >>>
During her sophomore, junior and senior summers she spent half with her Alaska team and half playing and living in Oregon.
During her sophomore, junior and senior summers she spent half with her Alaska team and half playing and living in Oregon.
During her sophomore, junior and senior summers she spent half of it playing with her Alaska team and half playing in Oregon and living in Portland.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
His father emigrated to Missouri in 1868, but returned when his wife became ill and before the rest of the family could go to America.

Predictions >>>
His father emigrated to Missouri in 1868 but returned when his wife became ill and before the rest of the family could go to America.
His father emigrated to Missouri in 1868 but returned when his wife became ill and before the rest of the family could go to America.
His father emigrated to Missouri in 1868 but returned when his wife became ill and before the rest of the family could go to America.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
It is situated south of Köroğlu Mountains and to the north of Bolu.

Predictions >>>
It is situated south of Köroğlu Mountains and north of Bolu.
It is situated south of Köroğlu Mountains and north of Bolu.
It is situated south of Köroğlu Mountains and north of Bolu.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The Río Blanco mine is a large copper mine located in the north of Peru in Loreto Region.

Predictions >>>
The Río Blanco mine is a large copper mine located in the north of Peru in the Loreto Region.
The Río Blanco mine is a large copper mine located in the north of Peru in the Loreto Region.
The Río Blanco mine is a large copper mine located in the north of Peru in the Loreto Region.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The spectral levels of light that can be measured by plants for photosynthesis is similar to, but not the same as what's used by lumens.

Predictions >>>
The spectral levels of light that can be measured by plants for photosynthesis are similar to but not the same as those used by lumens.
The spectral levels of light that can be measured by plants for photosynthesis are similar to, but not the same as lumens.
The spectral levels of light that can be measured by plants for photosynthesis are similar to, but not the same as lumens.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The Sunset Sunset Road comes from right and becomes Briscoe Mountain Road.

Predictions >>>
The Sunset Sunset Road comes from the right and becomes Briscoe Mountain Road.
The Sunset Sunset Road comes from the right and becomes Briscoe Mountain Road.
The Sunset Sunset Road comes from the right and becomes Briscoe Mountain Road.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Mr. Thuso Nokwanda Mbedu was born in Pietermaritzburg as Thuso Mbebu.

Predictions >>>
Thuso Nokwanda Mbedu was born in Pietermaritzburg as Thuso Mbebu.
Thuso Nokwanda Mbedu was born in Pietermaritzburg as Thuso Mbebu.
Thuso Nokwanda Mbedu was born in Pietermaritzburg as Thuso Mbebu.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Netanya is located on the Israeli Mediterranean Coastal Plain, the historic land bridge between Europe, Africa, and Asia.

Predictions >>>
Netanya is located on the Mediterranean Coastal Plain, the historic land bridge between Europe, Africa and Asia.
Netanya is located on the Mediterranean Coastal Plain, the historic land bridge between Europe, Africa and Asia.
Netanya is located on the Mediterranean Coastal Plain, the historic land bridge between Europe, Africa and Asia.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Owned by Rick and Sheri Dorritie is Megasaurus and owned by Mike West Transaurus.

Predictions >>>
Owned by Rick and Sheri Dorritie is Megasaurus and is owned by Mike West Transaurus.
Owned by Rick and Sheri Dorritie is Megasaurus and is owned by Mike West Transaurus.
Owned by Rick and Sheri Dorritie is Megasaurus and is owned by Mike West Transaurus.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
All five events started the last day and concluded with the final on the first day.

Predictions >>>
All five events started on the last day and concluded with the final on the first day of competition.
All five events started on the last day and concluded with the final on the first day of competition.
All five events started on the last day and concluded with the final on the first day.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The Nokasippi River, the Mississippi River and the Little Nokasippi River are all in the area.

Predictions >>>
The Nokasippi River, the Mississippi River and the Little Nokassippi River are all in the area.
The Nokasippi River, the Mississippi River and the Little Nokassippi River are all in the area.
The Nokasippi River, the Mississippi River and the Little Nokassippi River are all in the area.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Some reports state `` 30 years or more '', while others claim `` 50 years or more ''.

Predictions >>>
Some reports state `` 30 years or more '', while others claim `` 50 years or longer ''.
Some reports state `` 30 years or more '', while others claim `` 50 years or longer ''.
Some reports state `` 30 years or more '', while others claim `` 50 years or longer ''.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The resignation of Councillor Horace Muspratt ( party?, St. Peter, reported April 15, 1908 ) was elected to the council on 2 June 1909.

Predictions >>>
The resignation of Councillor Horace Muspratt ( party member, St. Peter, reported April 15, 1908 ) was elected to the council on June 2, 1909.
The resignation of Councillor Horace Muspratt ( party member, St. Peter, reported April 15, 1908 ) was elected to the council on June 2, 1909.
The resignation of Councillor Horace Muspratt ( party member, St. Peter, reported April 15, 1908 ) was elected to the council on June 2, 1909.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
He was born in New York City in East Broadway on October 23, 1806.

Predictions >>>
He was born on October 23, 1806 in East Broadway in New York City.
He was born in New York City on October 23, 1806 in East Broadway.
He was born on October 23, 1806 in East Broadway in New York City.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
He always added his own surname of Bhosle and then treated the child like his own son.

Predictions >>>
He always added his own surname of Bhosle to the child and then treated the child like his own son.
He always added his own surname of Bhosle to the child and then treated the child like his own son.
He always added his own surname of Bhosle to the child and then treated the child like his own son.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
After selling his house on 30 Harare drive to the Canadian Embassy he bought Bromley Farm just outside Marondera, Zimbabwe.

Predictions >>>
After selling his house on 30 Harare Drive to the Canadian Embassy he bought Bromley Farm just outside Marondera, Zimbabwe.
After selling his house on 30 Harare Drive to the Canadian Embassy he bought Bromley Farm just outside Marondera, Zimbabwe.
After selling his house on 30 Harare Drive to the Canadian Embassy he bought Bromley Farm just outside Marondera, Zimbabwe.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Axl Rose had wanted the Seattle rock band Nirvana to be the opening act, but frontman Kurt Cobain declined.

Predictions >>>
Axl Rose had wanted the Seattle rock band Nirvana to be the opening act but frontman Kurt Cobain declined the offer.
Axl Rose had wanted the Seattle rock band Nirvana to be the opening act but frontman Kurt Cobain declined the offer.
Axl Rose had wanted the Seattle rock band Nirvana to be the opening act but frontman Kurt Cobain declined the offer.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The U.S. Route 191 leads north from Douglas to Interstate 10 near Willcox, the Arizona State Route 80 leads west to Bisbee and northeast to Interstate 10 in New Mexico.

Predictions >>>
The U.S. Route 191 leads north from Douglas to Interstate 10 near Willcox, the Arizona State Route 80 leads west to Bisbee and northeast to Interstate 20 in New Mexico.
The U.S. Route 191 leads north from Douglas to Interstate 10 near Willcox and northeast to Arizona State Route 80, which leads west to Bisbee and northeast towards Interstate 10 in New Mexico.
The U.S. Route 191 leads north from Douglas to Interstate 10 near Willcox, the Arizona State Route 80 leads west to Bisbee and northeast to Interstate 20 in New Mexico.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
In 1986, Ray and sitcom actress Marla Gibbs were among the notables who helped dedicate Billie Holiday's star on the Hollywood Walk of Fame.

Predictions >>>
In 1986, Ray and sitcom actress Marla Gibbs were among the notables who helped dedicate Billie Holiday's star on the Hollywood Walk of Fame.
In 1986, Ray and sitcom actress Marla Gibbs were among the notables who helped dedicate Billie Holiday's star on the Hollywood Walk of Fame.
In 1986, Ray and sitcom actress Marla Gibbs were among the notables who helped dedicate Billie Holiday's star on the Hollywood Walk of Fame.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Torre is a Wall Street Journal, USA Today, New York Times, and Amazon International's Bestselling Author.

Predictions >>>
Torre is a Wall Street Journal, USA Today, New York Times and Amazon International best-selling author.
Torre is a Wall Street Journal, USA Today, New York Times and Amazon International best-selling author.
Torre is a Wall Street Journal, USA Today, New York Times and Amazon International best-selling author.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Vic participated in many different countries and fought for Armenia in the 2000 Olympic Games in Sydney, Australia.

Predictions >>>
Vic participated in many different countries and fought for Armenia in the 2000 Olympic Games in Sydney, Australia.
Vic participated in many different countries and fought for Armenia in the 2000 Olympic Games in Sydney, Australia.
Vic participated in many different countries and fought for Armenia in the 2000 Olympic Games in Sydney, Australia.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The construction of the PRL began with the Chatswood section to Epping, which began in November 2002.

Predictions >>>
The construction of the PRL began with the Chatswood section to Epping in November 2002, which was completed.
The construction of the PRL began with the Chatswood section to Epping in November 2002, which was completed.
The construction of the PRL began with the Chatswood section to Epping in November 2002, which was completed.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Evariste Baizeau ( June 3, 1821 - Nantes, February 6, 1910 ) was a French military physician.

Predictions >>>
Evariste Baizeau ( June 3, 1821 in Nantes - February 6, 1910 ) was a French military physician.
Evariste Baizeau ( June 3, 1821 - February 6, 1910 ) was a French military physician.
Evariste Baizeau ( June 3, 1821 in Nantes - February 6, 1910 ) was a French military physician.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Howarth, during an April 1994 `` Soap Opera Update '' interview, said that the only thing that he admired about Todd was his clothing style.

Predictions >>>
Howarth, during an April 1994 `` Soap Opera Update '' interview, said that the only thing he admired about Todd was his clothing style.
Howarth, during an April 1994 `` Soap Opera Update '' interview, said that the only thing he admired about Todd was his clothing style.
In an April 1994 `` Soap Opera Update '' interview, Howarth said that the only thing he admired about Todd was his clothing style.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
His sister Mary ( born 25 February 1956 ) won the 1500m in San Sebastian in 1977 and then became the Commonwealth Games champion in 1978 in Edmonton.

Predictions >>>
His sister Mary ( born February 25, 1956 ) won the 1500m in San Sebastian in 1977 and then became Commonwealth Games champion in 1978 in Edmonton.
His sister Mary ( born February 25, 1956 ) won the 1500m in San Sebastian in 1977 and then became Commonwealth Games champion in 1978 in Edmonton.
His sister Mary ( born February 25, 1956 ) won the 1500m in San Sebastian in 1977 and then became Commonwealth Games champion in 1978 in Edmonton.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The private Catholic schools are Fort Meigs, Frank, Toth and Woodland Elementary Schools, which make up the primary public school, Saint Rose, the fifth.

Predictions >>>
The private Catholic schools are Fort Meigs, Frank, Toth and Woodland Elementary Schools, which make up the fifth primary public school Saint Rose, the fifth.
The private Catholic schools are Fort Meigs, Frank, Toth and Woodland Elementary Schools, which make up the fifth primary public school Saint Rose, the fifth.
The private Catholic schools are Fort Meigs, Frank, Toth and Woodland Elementary Schools, which make up the primary public school Saint Rose, the fifth.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
A new South stand was built in 2005 and 2006 to make the stadium into a complete bowl.

Predictions >>>
A new South stand was built in 2005 and 2006 to make the stadium a complete bowl.
A new South stand was built in 2005 and 2006 to make the stadium a complete bowl.
A new South stand was built in 2005 and 2006 to make the stadium a complete bowl.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
When he died in 1719, his eldest son, also called Joseph, followed initially Joseph Truman.

Predictions >>>
When he died in 1719, his eldest son, also called Joseph Truman, was initially Joseph Truman.
When he died in 1719, his eldest son, also called Joseph Truman, was initially Joseph Truman.
When he died in 1719, his eldest son, also called Joseph Truman, was named initially Joseph Truman.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
They do not follow the pattern of caldera growth and destruction that other shield volcanoes do ; caldera may form, but they generally do not disappear.

Predictions >>>
They do not follow the pattern of caldera growth and destruction that other shield volcanoes do ; calderas may form, but they generally do not disappear.
They do not follow the pattern of caldera growth and destruction that other shield volcanoes do ; calderas may form, but they generally do not disappear.
They do not follow the pattern of caldera growth and destruction that other shield volcanoes do ; calderas may form, but they generally do not disappear.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The first landing in Lae Airfield was possessed by Ernest Mustar on April 19, 1927 in a De Havilland DH.37 by Guinea Gold Airways from Wau.

Predictions >>>
The first landing in Lae Airfield was made by Ernest Mustar on April 19, 1927 in a De Havilland DH.37 by Guinea Gold Airways from Wau.
The first landing in Lae Airfield was made by Ernest Mustar on April 19, 1927 in a De Havilland DH.37 by Guinea Gold Airways from Wau.
The first landing in Lae Airfield was made by Ernest Mustar on April 19, 1927 in a De Havilland DH.37 by Guinea Gold Airways from Wau.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Between Langensteinbach and the newly created Schießhüttenäcker railway station, a short double track section was also built in 2011.

Predictions >>>
A short double track section was also built between Langensteinbach and the newly created Schießhüttenäcker railway station in 2011.
A short double track section was also built between Langensteinbach and the newly created Schießhüttenäcker railway station in 2011.
In 2011 a short double track section was also built between Langensteinbach and the newly created Schießhüttenäcker railway station.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The other two rivers are the Mangles River, and the Matiri River.

Predictions >>>
The other two rivers are the Mangles River and the Matiri River.
The other two rivers are the Mangles River and the Matiri River.
The other two rivers are the Mangles River and the Matiri River.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
A new system Of CCE is also implemented in the school for a continuous and a comprehensive evaluation of students.

Predictions >>>
A new system of CCE is also implemented in the school for continuous and comprehensive evaluation of students.
A new system of CCE is also implemented in the school for continuous and comprehensive evaluation of students.
A new system of CCE is also implemented in the school for continuous and comprehensive evaluation of students.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
At the bottom of the hydrometer is a weighted bulb and at the top is a pan for small weights.

Predictions >>>
At the bottom of the hydrometer is a weighted bulb and at the top is a pan for small weights.
At the bottom of the hydrometer is a weighted bulb and at the top is a pan for small weights.
At the bottom of the hydrometer is a weighted bulb and at the top is a pan for small weights.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
He attended the Liceo Classico in Germany before studying chemistry at the University of Munich in L 'Aquila.

Predictions >>>
He attended the Liceo Classico in Germany before studying chemistry at the University of Munich in L 'Aquila.
He attended the Liceo Classico in Germany before studying chemistry at the University of Munich in L 'Aquila.
He attended the Liceo Classico in Germany before studying chemistry at the University of Munich in L 'Aquila in Italy.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
David Spinozza plays the acoustic guitar in the song, Liberty DeVitto plays the drums and Robert Freedman the horn and string orchestration.

Predictions >>>
David Spinozza plays the acoustic guitar in the song, Liberty DeVitto plays the drums and Robert Freedman provides the horn and string orchestration.
David Spinozza plays the acoustic guitar in the song, Liberty DeVitto plays the drums and Robert Freedman provides the horn and string orchestration.
David Spinozza plays the acoustic guitar in the song, Liberty DeVitto plays the drums and Robert Freedman provides the horn and string orchestration.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
The school belongs to the Jefferson District of VHSL Region II ( Virginia High School League ).

Predictions >>>
The school belongs to the Jefferson District of the VHSL Region II ( Virginia High School League ).
The school belongs to the Jefferson District of the VHSL Region II ( Virginia High School League ).
The school belongs to the Jefferson District of the VHSL Region II ( Virginia High School League ).
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
It is native to Chiapas, Guatemala, Honduras, El Salvador, Panama, Guerrero, Oaxaca, Veracruz.

Predictions >>>
It is native to Chiapas, Guatemala, Honduras, El Salvador, Panama, Guerrero, Oaxaca, Veracruz and Mexico.
It is native to Chiapas, Guatemala, Honduras, El Salvador, Panama, Guerrero, Oaxaca, Veracruz and Mexico.
It is native to Chiapas, Guatemala, Honduras, El Salvador, Panama, Guerrero, Oaxaca, Veracruz and Mexico.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Ruffell was employed as an engineer in Southampton Docks and died on 3 October 1940 in Southampton on his 73rd birthday.

Predictions >>>
Ruffell was employed as an engineer in Southampton Docks and died on October 3, 1940 in Southampton on his 73rd birthday.
Ruffell was employed as an engineer in Southampton Docks and died on October 3, 1940 in Southampton on his 73rd birthday.
Ruffell was employed as an engineer in Southampton Docks and died on October 3, 1940 in Southampton on his 73rd birthday.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
David Ray Griffin advocated a sophisticated form of panpsychism, called `` Panexperientialism `` by A. N. Whitehead.

Predictions >>>
David Ray Griffin advocated a sophisticated form of panpsychism, called `` Panexperientialism '' by A. N. Whitehead.
David Ray Griffin advocated a sophisticated form of panpsychism, called `` Panexperientialism '' by A. N. Whitehead.
David Ray Griffin advocated a sophisticated form of panpsychism, called `` Panexperientialism '' by A. N. Whitehead.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Sukumar Prasad is a southern Indian guitarist who was the first Carnatic musician to play the South Indian musical art form of Carnatic music on the electric guitar.

Predictions >>>
Sukumar Prasad is a southern Indian guitarist who was the first Carnatic musician to play the South Indian musical art form of Carnatic music on electric guitar.
Sukumar Prasad is a southern Indian guitarist who was the first Carnatic musician to play the South Indian musical art form of Carnatic music on electric guitar.
Sukumar Prasad is a southern Indian guitarist who was the first Carnatic musician to play the South Indian musical art form of Carnatic music on electric guitar.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Four times between 1777 and 1860, bells were consecrated, and three new ones were mentioned in 1861.

Predictions >>>
Four times between 1777 and 1860, bells were consecrated and three new ones were mentioned in 1861.
Four times between 1777 and 1860, bells were consecrated and three new ones were mentioned in 1861.
Four times between 1777 and 1860, bells were consecrated and three new ones were mentioned in 1861.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Wu Xiang's son, Wu Sangui, was the commander of the Shanhai Pass, the last major obstacle between Manchu and Beijing.

Predictions >>>
Wu Xiang's son, Wu Sangui, was the commander of the Shanhai Pass, the last major obstacle between Manchu and Beijing.
Wu Xiang's son, Wu Sangui, was the commander of the Shanhai Pass, the last major obstacle between Manchu and Beijing.
Wu Xiang's son, Wu Sangui, was the commander of the Shanhai Pass, the last major obstacle between Manchu and Beijing.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
In physical cosmology, the energy of the cosmological vacuum appears as the cosmological constant.

Predictions >>>
In physical cosmology, the energy of the cosmological vacuum is used as the cosmolological constant.
In physical cosmology, the energy of the cosmological vacuum is used as the cosmolological constant.
In physical cosmology, the energy of the cosmological vacuum is represented as the cosmolological constant.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Windows XP - Mode runs Windows XP on a separate computer and displays applications in virtual windows on the Windows 7 desktop.

Predictions >>>
Windows XP - Mode runs Windows XP on a separate computer and displays applications in virtual windows on the Windows 7 desktop.
Windows XP - Mode runs Windows XP on a separate computer and displays applications in virtual windows on the Windows 7 desktop.
Windows XP - Mode runs Windows XP on a separate computer and displays applications in virtual windows on the Windows 7 desktop.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Torre is a New York Times, USA Today, Wall Street Journal and Amazon International bestselling novelist.

Predictions >>>
Torre is a New York Times, USA Today, Wall Street Journal and Amazon International bestselling novelist.
Torre is a New York Times, USA Today, Wall Street Journal and Amazon International bestselling novelist.
Torre is a New York Times, USA Today, Wall Street Journal and Amazon International bestselling novelist.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
In the following weeks on `` Raw '' Hardy repeatedly began to attack Lita and seduce Kane.

Predictions >>>
In the following weeks on `` Raw '' Hardy repeatedly attacked Lita and seduced Kane.
In the following weeks on `` Raw '' Hardy repeatedly began to attack Lita and seduce Kane.
In the following weeks on `` Raw '' Hardy repeatedly began to attack Lita and seduce Kane.
---------------------------------------------------------



Generating outputs:   0%|          | 0/1 [00:00<?, ?it/s]

---------------------------------------------------------
Players not released by the Warriors were required to play in the 2006 Bartercard Cup.

Predictions >>>
Players not released by the Warriors were required to play in the 2006 Bartercard Cup.
Players not released by the Warriors were required to play in the 2006 Bartercard Cup in 2006.
Players not released by the Warriors were required to play in the 2006 Bartercard Cup in 2006.
---------------------------------------------------------



---------------------------------------------------------
Lorna Saycon Espina, the ninth mayor, is the incumbent mayor of the Municipality of Mahayag.

Predictions >>>
Lorna Saycon Espina, the ninth mayor of Mahayag Municipality, is the incumbent mayor of the municipality.
Lorna Saycon Espina, the ninth mayor of Mahayag Municipality, is the incumbent mayor of the municipality.
Lorna Saycon Espina, the ninth mayor of Mahayag Municipality, is the incumbent mayor of the municipality.
---------------------------------------------------------



---------------------------------------------------------
Both trains are hauled by a Guwahati Loco Shed based WDM-2 diesel locomotive from Rangiya to Dibrugarh and vice versa.

Predictions >>>
Both trains are hauled by a Guwahati Loco Shed based WDM-2 diesel locomotive from Rangiya to Dibrugarh and vice versa.
Both trains are hauled by a Guwahati Loco Shed based WDM-2 diesel locomotive from Rangiya to Dibrugarh and vice versa.
Both trains are hauled by a Guwahati Loco Shed based WDM-2 diesel locomotive from Rangiya to Dibrugarh and vice versa.
---------------------------------------------------------



---------------------------------------------------------
Hans Bock ( October 5, 1928 - January 21, 2008 ) was a German chemist, born in Königstein, Taunus, and died in Hamburg.

Predictions >>>
Hans Bock ( October 5, 1928 -- January 21, 2008 ) was a German chemist, born in Königstein, Taunus and died in Hamburg.
Hans Bock ( October 5, 1928 -- January 21, 2008 ) was a German chemist, born in Königstein, Taunus and died in Hamburg.
Hans Bock ( October 5, 1928 -- January 21, 2008 ) was a German chemist, born in Königstein, Taunus and died in Hamburg.
---------------------------------------------------------



---------------------------------------------------------
Having won the 2001 Super League, the Knights traveled to England to play the 2002 World Club Challenge against NRL Premiership champions, the Bradford Bulls.

Predictions >>>
The Knights traveled to England to play the 2002 World Club Challenge against the NRL Premiership champions, the Bradford Bulls in 2001.
The Knights traveled to England to play the 2002 World Club Challenge against the NRL Premiership champions, the Bradford Bulls in 2001.
The Knights traveled to England to play the 2002 World Club Challenge against the NRL Premiership champions, the Bradford Bulls in 2001.
---------------------------------------------------------



---------------------------------------------------------
The interior of the resettled Palais de Danse was then redesigned by the griffins the following year.

Predictions >>>
The interior of the resettled Palais de Danse was then redesigned by the griffins the following year.
The interior of the resettled Palais de Danse was then redesigned by the griffins the following year.
The interior of the resettled Palais de Danse was then redesigned by the griffins the following year.
---------------------------------------------------------



---------------------------------------------------------
In 1727, Mary Mary Carr married Caleb Carr, daughter of Lyndon ( Stanton ) Carr, and granddaughter of Governor Edward and Hannah.

Predictions >>>
Mary Mary Carr married Caleb Carr in 1727, daughter of Lyndon ( Stanton ) Carr and granddaughter of Governor Edward and Hannah.
Mary Mary Carr married Caleb Carr in 1727, daughter of Lyndon ( Stanton ) Carr and granddaughter of Governor Edward and Hannah.
Mary Mary Carr married Caleb Carr in 1727, daughter of Lyndon ( Stanton ) Carr and granddaughter of Governor Edward and Hannah.
---------------------------------------------------------



---------------------------------------------------------
Kurt Treu ( 15 September 1928 in Vienna, Austria -- 6 June 1991 in Karja, Saare County, Estonia ), was a German classical Philologist.

Predictions >>>
Kurt Treu ( September 15, 1928 in Vienna, Austria -- June 6, 1991 in Karja, Saare County, Estonia ) was a German classical philologist.
Kurt Treu ( September 15, 1928 in Vienna, Austria -- June 6, 1991 in Karja, Saare County, Estonia ) was a German classical philologist.
Kurt Treu ( September 15, 1928 in Vienna, Austria -- June 6, 1991 in Karja, Saare County, Estonia ) was a German classical philologist.
---------------------------------------------------------



---------------------------------------------------------
Peterson wore  52 on his uniform during his rookie season, and switched to  19 in 1967 after Bob Friend was traded to the New York Mets.

Predictions >>>
Peterson wore 52 on his uniform during his rookie season and switched to 19 in 1967 after Bob Friend was traded to the New York Mets.
Peterson wore 52 on his uniform during his rookie season and switched to 19 in 1967 after Bob Friend was traded to the New York Mets.
Peterson wore 52 on his uniform during his rookie season and switched to 19 in 1967 after Bob Friend was traded to the New York Mets.
---------------------------------------------------------



---------------------------------------------------------
Suzanne Ashworth ( also Dick ) is a fictional figure from the British Channel 4 - soap opera `` Hollyoaks '', played by Suzanne Hall.

Predictions >>>
Suzanne Ashworth ( also Dick ) is a fictional character from the British Channel 4 soap opera `` Hollyoaks '', played by Suzanne Hall.
Suzanne Ashworth ( also Dick ) is a fictional character from the British Channel 4 soap opera `` Hollyoaks '', played by Suzanne Hall.
Suzanne Ashworth ( also Dick ) is a fictional character from the British Channel 4 soap opera `` Hollyoaks '', played by Suzanne Hall.
---------------------------------------------------------



---------------------------------------------------------
Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.

Predictions >>>
Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.
Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.
Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.
---------------------------------------------------------



---------------------------------------------------------
In 2002 Global Exchange was awarded the Domestic Human Rights Award by CRLA, and international NGO based in San Francisco.

Predictions >>>
In 2002 Global Exchange was awarded the Domestic Human Rights Award by CRLA, an international NGO based in San Francisco.
In 2002 Global Exchange was awarded the Domestic Human Rights Award by CRLA, an international NGO based in San Francisco.
In 2002 Global Exchange was awarded the Domestic Human Rights Award by CRLA, an international NGO based in San Francisco.
---------------------------------------------------------



---------------------------------------------------------
White was born in PEI, educated in Toronto, and lived in Halifax. She has a BA in English from Dalhousie University.

Predictions >>>
White was born in PEI, educated in Toronto and lived in Halifax for many years and has a BA in English from Dalhousie University.
White was born in PEI, educated in Toronto and lived in Halifax for most of her life and has a BA in English from Dalhousie University.
White was born in PEI, educated in Toronto and lived in Halifax for many years and has a BA in English from Dalhousie University.
---------------------------------------------------------



---------------------------------------------------------
From a Vedic point of view, these 11 yoga upanishads belong to the Vedantic shaka ( school ).

Predictions >>>
From a Vedic point of view, these 11 yoga upanishads belong to the Vedantic Shaka ( school ).
From a Vedic point of view, these 11 yoga upanishads belong to the Vedantic Shaka ( school ).
From a Vedic point of view, these 11 yoga upanishads belong to the Vedantic Shaka ( school ).
---------------------------------------------------------



---------------------------------------------------------
He grew up in Greater London and finally lived in Surbiton with his parents, then in Surrey, now in Sussex.

Predictions >>>
He grew up in Greater London and finally lived in Surbiton with his parents, then in Surrey, now in Sussex.
He grew up in Greater London and finally lived in Surbiton with his parents, then in Surrey, now in Sussex.
He grew up in Greater London and finally lived in Surbiton with his parents, then in Surrey, now in Sussex.
---------------------------------------------------------



---------------------------------------------------------
Denise Foster is a survivor of an attack by Sanchez when she was 17 years old, near railroad tracks in Buffalo, New York.

Predictions >>>
Denise Foster is a survivor of an attack by Sanchez when she was 17 years old near railroad tracks in Buffalo, New York.
Denise Foster is a survivor of an attack by Sanchez when she was 17 years old near railroad tracks in Buffalo, New York.
Denise Foster is a survivor of an attack by Sanchez when she was 17 years old near railroad tracks in Buffalo, New York.
---------------------------------------------------------



---------------------------------------------------------
The layout of Belmont is similar to that of Twin Pines park and other parks in Davey Glen Park.

Predictions >>>
The layout of Belmont is similar to that of Twin Pines Park and other parks in Davey Glen Park.
The layout of Belmont is similar to that of Twin Pines Park and other parks in Davey Glen Park.
The layout of Belmont is similar to that of Twin Pines Park and other parks in Davey Glen Park.
---------------------------------------------------------



---------------------------------------------------------
The Thrash - Metal - Group Anthrax opened Kiss at the Thomas'Mack Show in Costa Mesa and at the Pacific Amphitheatre Show in Las Vegas.

Predictions >>>
The Thrash Metal - Group Anthrax opened Kiss at the Thomas Mack Show in Costa Mesa and at the Pacific Amphitheatre Show in Las Vegas.
The Thrash Metal Group Anthrax opened Kiss at the Thomas Mack Show in Costa Mesa and the Pacific Amphitheatre Show in Las Vegas.
The Thrash Metal - Group Anthrax opened Kiss at the Thomas Mack Show in Costa Mesa and the Pacific Amphitheatre Show in Las Vegas.
---------------------------------------------------------



---------------------------------------------------------
According to the Bureau of Meteorology, the lowest temperature ever recorded in Toowoomba was on 12 February 2017, while the highest was on 12 July 1965.

Predictions >>>
According to the Bureau of Meteorology, the lowest temperature ever recorded in Toowoomba was on February 12, 2017, while the highest was on July 12, 1965.
According to the Bureau of Meteorology, the lowest temperature ever recorded in Toowoomba was on February 12, 2017, while the highest was on July 12, 1965.
According to the Bureau of Meteorology, the lowest temperature ever recorded in Toowoomba was on February 12, 2017, while the highest was on July 12, 1965.
---------------------------------------------------------



---------------------------------------------------------
He was born in a small kingdom of 1000 `` li '' northeast of Wa ( Japan ).

Predictions >>>
He was born in a small kingdom of 1000 `` li '' northeast of Wa ( Japan ).
He was born in a small kingdom of 1000 `` li '' northeast of Wa ( Japan ).
He was born in a small kingdom of 1000 `` li '' northeast of Wa ( Japan ).
---------------------------------------------------------



---------------------------------------------------------
Holodecks can also be used to encourage students to build a scene or to even describe a scene.

Predictions >>>
Holodecks can also be used to encourage students to build a scene or even describe a scene.
Holodecks can also be used to encourage students to build a scene or even describe a scene.
Holodecks can also be used to encourage students to build a scene or even describe a scene.
---------------------------------------------------------



---------------------------------------------------------
Chuck Chuck Campbell of `` Go Knoxville '' compared Ariana Grande in `` Rare '' with Stefani's vocals.

Predictions >>>
Chuck Campbell of `` Go Knoxville '' compared Ariana Grande in `` Rare '' with Stefani's vocals.
Chuck Campbell of `` Go Knoxville '' compared Ariana Grande in `` Rare '' with Stefani's vocals.
Chuck Campbell of `` Go Knoxville '' compared Ariana Grande in `` Rare '' with Stefani's vocals.
---------------------------------------------------------



In [None]:
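# !pip install evaluate
# !pip install rouge-score
import evaluate

# Load the ROUGE metric from the Hugging Face `evaluate` library.
rouge = evaluate.load('rouge')

# Measure n-gram overlap between the input sentences and the generated paraphrases.
# Note: the inputs are passed as `predictions` and the model outputs as `references`.
results = rouge.compute(predictions=to_predict_small, references=predicted)
print(results)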

{'rouge1': 0.9714710335873413, 'rouge2': 0.9198488512746381, 'rougeL': 0.9534912982368438, 'rougeLsum': 0.953876186023664}
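
To make the numbers above easier to read, here is a minimal sketch of what a ROUGE-1 F1 score measures: clipped unigram overlap between two sentences. The helper `rouge1_f1` is a hypothetical illustration, not the `rouge-score` implementation; it assumes plain lowercasing and whitespace tokenization, so its values will generally differ from those reported by `evaluate`.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two lowercased, whitespace-tokenized strings."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# One input sentence and one generated paraphrase taken from the output above.
source = ("Holodecks can also be used to encourage students to build a scene "
          "or to even describe a scene.")
paraphrase = ("Holodecks can also be used to encourage students to build a scene "
              "or even describe a scene.")
print(rouge1_f1(paraphrase, source))
```

Because the score here compares each paraphrase with its own input sentence, a value close to 1 mainly indicates high lexical overlap with the source, which is also what an unchanged copy would produce.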


### Checking Semantic Similarity

To check semantic similarity, we compute the cosine similarity between the embeddings of the two sentences, using the sentence-transformers library.
A score close to 1 indicates that the two sentences are semantically similar.

In [None]:
!pip install sentence-transformers

In [6]:
from sentence_transformers import SentenceTransformer, util

sentences = ["What is the best way to play cricket?", "How do I play cricket?"]

# Load a pre-trained sentence-embedding model.
model = SentenceTransformer('all-MiniLM-L6-v2')

# Encode each sentence into an embedding tensor.
embedding_1 = model.encode(sentences[0], convert_to_tensor=True)
embedding_2 = model.encode(sentences[1], convert_to_tensor=True)

# Cosine similarity between the two embeddings (closer to 1 means more similar).
print("Similarity between given sentences is : ", util.pytorch_cos_sim(embedding_1, embedding_2))

Similarity between given sentences is :  tensor([[0.9132]], device='cuda:0')
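
As a rough illustration of the quantity reported above, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below uses only the standard library; the function name `cosine_similarity` and the toy three-dimensional vectors are hypothetical stand-ins for real sentence embeddings.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Similarity is undefined for a zero vector; return 0.0 by convention here.
        return 0.0
    return dot / (norm_u * norm_v)


# Toy vectors (assumed values, not real embeddings).
vec_a = [0.2, 0.7, 0.1]
vec_b = [0.25, 0.65, 0.05]
print(cosine_similarity(vec_a, vec_b))
```

The score depends only on the direction of the vectors, not their magnitude, so scaling an embedding does not change its similarity to another.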


## Conclusion:

We have implemented a paraphrase model by fine-tuning the pre-trained BART model. The following observations were made:
1. After the last round of fine-tuning, the model no longer generates questions for assertive (declarative) sentences.
2. The ROUGE score of the model has increased:
   - Earlier model: rouge1: 0.7882248725261429, rougeL: 0.7779830322933771
   - Final model: rouge1: 0.8656432748538013, rougeL: 0.8654553049289891
3. Sometimes the model returns the input sentence unchanged.
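
One way to quantify how often the model returns its input unchanged is to count exact matches after light normalization. This is a minimal sketch, not part of the original pipeline: `copy_rate` is a hypothetical helper, and it assumes one prediction per source sentence.

```python
def copy_rate(sources, predictions):
    """Fraction of predictions identical to their source, ignoring case and extra spaces."""
    def normalize(text):
        return " ".join(text.lower().split())

    if not sources:
        return 0.0
    unchanged = sum(
        normalize(src) == normalize(pred)
        for src, pred in zip(sources, predictions)
    )
    return unchanged / len(sources)


# Two input/prediction pairs taken from the output shown earlier.
sources = [
    "Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.",
    "Holodecks can also be used to encourage students to build a scene or to even describe a scene.",
]
predictions = [
    "Sites were selected at Maiduguri, Kaduna, Lagos, Minna, Kano and Oshogbo.",
    "Holodecks can also be used to encourage students to build a scene or even describe a scene.",
]
print(copy_rate(sources, predictions))
```

Reporting this rate alongside ROUGE helps separate genuine paraphrases from verbatim copies, since both yield high overlap with the source.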

## References:
1. https://huggingface.co/docs/transformers/training
2. https://pub.towardsai.net/fine-tune-bart-for-translation-on-wmt16-dataset-and-train-new-tokenizer-4d0fbdc4aa2e
3. https://simpletransformers.ai/
4. https://www.sbert.net/
5. https://medium.com/@priyankads/rouge-your-nlp-results-b2feba61053a