# Project Examination Guidelines and Instructions



### 1. Aim
**Comparative Study of Encoder-Decoder Architectures with Attention Mechanisms**

### 2. Objectives
To implement and compare different encoder-decoder architectures:
- Without Attention (LSTM/GRU-based)
- With Attention (Bahdanau/Luong)
- With Self-Attention (Transformer)

On a chosen task in the domain of **Text-to-Text** or **Image-to-Text** generation.

---

### 3. Task Selection Guidelines

**Guidelines**
- Identify and study a research paper from reputed journals.
- Implement all 3 architectures (No Attention, With Attention, With Self-Attention).
- Use evaluation metrics based on the selected task.
- Compare and analyze models using graphs, tables, and visualizations.
- Highlight interpretability, especially using attention maps.

**Task Domains (Choose One)**

**Text-to-Text Tasks:**
- Question Answering
- Paraphrase Generation
- Grammar Correction
- Text Simplification
- Dialogue Generation
- Headline Generation

**Image-to-Text Tasks:**
- Image Captioning
- Visual Question Answering (VQA)

---

### 4. Implementation Process

#### 4.1 Task 1: Identify a Research Paper
- Select a recent research paper (ACL, arXiv, IEEE, Springer, etc.)
- Focus: Use of Encoder-Decoder with Attention or Transformer
- Based on your selected task domain

#### 4.2 Task 2: Implement Encoder-Decoder without Attention
- Use LSTM/GRU-based encoder-decoder architecture
- Train on a small dataset (e.g., MS-COCO for image captioning, SQuAD for QA, IWSLT14 for translation)
- Evaluate using:
  - BLEU / ROUGE / METEOR / CIDEr (based on task)
  - Training time
  - Inference speed

#### 4.3 Task 3: Implement Encoder-Decoder with Attention
- Add Attention (embedding layer details)
- Use the same dataset and architecture as in Task 2
- Visualize attention weights (for interpretability)
- Compare performance improvements
- Evaluate using the same metrics

#### 4.4 Task 4: Implement Encoder-Decoder with Self-Attention
- Implement a Transformer-based encoder-decoder
- Use Multi-head Attention, Positional Encoding, and Layer Norms
- Train on the same task and dataset
- Evaluate using the same metrics

#### 4.5 Task 5: Analyze and Compare

| Criteria            | LSTM/GRU (No Attention) | Attention (Bahdanau/Luong) | Transformer (Self-Attention) |
|---------------------|--------------------------|------------------------------|-------------------------------|
| Accuracy / BLEU     |                          |                              |                               |
| ROUGE / METEOR      |                          |                              |                               |
| CIDEr / SPICE       | (if applicable)          |                              |                               |
| Training Time       |                          |                              |                               |
| Inference Speed     |                          |                              |                               |
| Model Complexity    |                          |                              |                               |
| Interpretability    |                          | ✔ (Attention Maps)           | ✔ (Attention Heads)           |

**Evaluation Metrics (per Task Domain)**

| Task Domain   | Suggested Metrics                            |
|---------------|-----------------------------------------------|
| Text-to-Text  | BLEU, ROUGE, METEOR, Accuracy                |
| Image-to-Text | BLEU, CIDEr, METEOR, SPICE                  |
| Any Task      | Loss curve, Training Time, Inference Speed  |

---

### 5. Deliverables

Submit the following as a zip folder or GitHub repo:

#### 5.1 Presentation / Slides (Google Slides and its PDF)
- Paper summary
  - Aim, Objectives, Problem statements, Methodology
- Model diagrams and architecture
- Dataset description
- Metric-wise performance comparison
- Graphs (training curves, attention maps)
- Final analysis table and discussion

#### 5.2 Code Files
- Well-commented code for all 3 models
- Scripts for training, evaluation, and visualization

#### 5.3 Dataset Link or Sample

- **Demo Video** showcasing model training and output

---

### 6. LinkedIn Post for Project Demonstration

Prepare a LinkedIn post for showcasing your project:
- Poster / Graphical Abstract of project
- Brief project overview (task, models, objectives)
- Key results and comparisons (metrics like BLEU, ROUGE, etc.)
- Visualizations (e.g., attention maps, graphs)
- Link to GitHub repo or demo video (if applicable)
- Mention any collaborations or research papers used
- **Note:** Every diagram, figure, or poster shared must be designed by the project group members (no copying from external sources)

**Sample Poster Template:**  
All major highlights should be covered in the poster.

---

### 7. Rubrics

| Category                | Excellent (Full Marks) | Good | Average |
|------------------------|------------------------|------|---------|
| **Understanding & Research** (5M) | Clear problem statement, strong paper summary, deep understanding of attention/Transformer-based architectures (5M) | Adequate problem clarity and basic understanding (3–4M) | Vague problem definition, weak grasp of concepts (0–2M) |
| **Implementation** (10M) | All three models correctly implemented, clean modular code, detailed comments (8–10M) | Core models implemented, some modularity and comments (5–7M) | Poor or missing implementation, minimal clarity (0–4M) |
| **Evaluation & Analysis** (5M) | Proper metrics, clear comparisons, attention maps, and detailed discussion (5M) | Some metrics and visualizations, limited insight (3–4M) | Poor analysis, no visualizations (0–2M) |
| **Presentation & Output** (5M) | Professional report, clear poster, good LinkedIn post, clean GitHub/demo (5M) | Mostly complete work, basic demo/post (3–4M) | Poorly structured or incomplete work (0–2M) |

---



## Guidelines for Project Examination

All students appearing for the project examination must adhere to the following:

- **Assessment Panel:**  
  The project examination will be conducted jointly by an external examiner and the internal examiners.

- **Equal Contribution:**  
  Each project member is required to contribute equally in:
  - Project Implementation
  - Demonstration
  - Presentation

- **Dress Code:**  
  Formal attire is compulsory for all students during the project examination.

- **Pre-Examination Requirements:**  
  Prior to the practical examination, students must:
  - Submit the **Course Exit Feedback**
  - Ensure all project-related submissions are completed

> ⚠️ Non-compliance with the above instructions may lead to disqualification from the examination.




# **MDM Lab Project**

| Group Member | Dept |  PRN  |
|-------------------|------|-------|
|  Kaustubh Wagh    | E&TC | 70021 |
| Jayesh Deshmukh   | COMP | 40203 |
| Alvin Abraham     | E&TC | 70132 |

> **Title**: Comparative Study of Encoder-Decoder Architectures for Paraphrase Generation<br>

> **Problem Statement:** To develop and compare encoder-decoder architectures with varying attention mechanisms for effective paraphrase generation.<br>

> **Research Paper Identified: https://aclanthology.org/2021.emnlp-main.414/** <br> Jianing Zhou and Suma Bhat. 2021. Paraphrase Generation: A Survey of the State of the Art. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 5075–5086, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics. <br> <br>

>**Video Referenced: https://aclanthology.org/2021.emnlp-main.414.mp4** <br>

>**Dataset Identified - Quora Question Pairs Dataset : https://www.kaggle.com/datasets/quora/question-pairs-dataset**

<br>

**Objectives:** <br>
- To implement an encoder-decoder model without attention using LSTM/GRU for paraphrase generation.

- To integrate Bahdanau and Luong attention mechanisms into the encoder-decoder architecture and evaluate their impact.

- To design and train a Transformer-based model utilizing self-attention for paraphrase generation.

- To compare the performance of all three models using appropriate evaluation metrics such as BLEU, ROUGE, and METEOR.

- To analyze model interpretability using attention maps and visualizations.

- To study and document the strengths and limitations of each architecture in the context of paraphrase generation.

---

# **0. Dataset and Text Preprocessing**

### **0.1 Importing Dataset from Kaggle**

In [None]:
from google.colab import files
files.upload()

Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"kaustubhwagh17","key":"69b44e72d9300c0d2d3a864ace2b43a1"}'}

In [None]:
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

In [None]:
!kaggle datasets download -d quora/question-pairs-dataset

Dataset URL: https://www.kaggle.com/datasets/quora/question-pairs-dataset
License(s): other


In [None]:
!unzip -q question-pairs-dataset.zip -d quora_data/

In [None]:
!pip install pandas
!pip install matplotlib
!pip install scikit-learn
!pip install nltk
!pip install tensorflow



In [None]:
import pandas as pd
import numpy as np
import string
from string import digits
import matplotlib.pyplot as plt
import tensorflow as tf
%matplotlib inline

import re
from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Input,LSTM, Embedding, Dense, Bidirectional
from tensorflow.keras.models import Model
import nltk

In [None]:
df = pd.read_csv('quora_data/questions.csv')
df.head(3)

Unnamed: 0,id,qid1,qid2,question1,question2,is_duplicate
0,0,1,2,What is the step by step guide to invest in sh...,What is the step by step guide to invest in sh...,0
1,1,3,4,What is the story of Kohinoor (Koh-i-Noor) Dia...,What would happen if the Indian government sto...,0
2,2,5,6,How can I increase the speed of my internet co...,How can Internet speed be increased by hacking...,0


In [None]:
df.shape

(404351, 6)

In [None]:
df = df[df['is_duplicate'] == 1]

# Subsample to a manageable size (e.g., 5000 or 10000 pairs)
df = df.sample(n=60000, random_state=42).reset_index(drop=True)


df.dropna(subset=['question1', 'question2'], inplace=True)
df.head(3)

Unnamed: 0,id,qid1,qid2,question1,question2,is_duplicate
0,279065,548630,548631,What is it like to be undergraduate students?,What is it like to be an undergraduate student?,1
1,87099,172963,172964,Money: What would a world without money be like?,What would the world be like if money didn't e...,1
2,314999,618180,618181,What are some Punjabi jokes?,What are some good Punjabi jokes?,1


In [None]:
df.shape

(60000, 6)

In [None]:
# Define contraction mapping for basic negations
contractions = {
    "isn't": "is not", "aren't": "are not", "wasn't": "was not", "weren't": "were not",
    "haven't": "have not", "hasn't": "has not", "hadn't": "had not",
    "won't": "will not", "wouldn't": "would not", "don't": "do not",
    "doesn't": "does not", "didn't": "did not", "can't": "cannot", "couldn't": "could not",
    "shouldn't": "should not", "mightn't": "might not", "mustn't": "must not",
    "n't": " not"  # catch remaining negations like couldn't => could not
}

# Function to expand contractions and clean text
def clean_text(text):
    text = text.lower()

    # Handle common contractions
    for contraction, expanded in contractions.items():
        text = re.sub(r'\b' + re.escape(contraction) + r'\b', expanded, text)

    # Remove punctuation and digits
    text = re.sub(r"[^a-z\s]", "", text)
    text = re.sub(r'\s+', ' ', text).strip()

    return text

# Apply to both question columns
df['question1'] = df['question1'].apply(clean_text)
df['question2'] = df['question2'].apply(clean_text)


# Preview the cleaned data
df[['question1', 'question2']].head()

Unnamed: 0,question1,question2
0,what is it like to be undergraduate students,what is it like to be an undergraduate student
1,money what would a world without money be like,what would the world be like if money did not ...
2,what are some punjabi jokes,what are some good punjabi jokes
3,how was the international space station built,how was international space station made
4,what are the best skullcandy earbuds,what are the best skullcandy earbuds and headp...


In [None]:
# Define min and max lengths (in words)
MIN_LEN = 3
MAX_LEN = 13 # you can tune this based on dataset

# Apply length filtering to both questions
df = df[df['question1'].str.split().apply(len).between(MIN_LEN, MAX_LEN)]
df = df[df['question2'].str.split().apply(len).between(MIN_LEN, MAX_LEN)]
# remove duplicate pairs
df = df[df['question1'] != df['question2']]

# adding start and end tokens
df['target_input'] = df['question2'].apply(lambda x: '<start> ' + x)
df['target_output'] = df['question2'].apply(lambda x: x + ' <end>')

df[['question1', 'question2']].head()

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['target_input'] = df['question2'].apply(lambda x: '<start> ' + x)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df['target_output'] = df['question2'].apply(lambda x: x + ' <end>')


Unnamed: 0,question1,question2
0,what is it like to be undergraduate students,what is it like to be an undergraduate student
1,money what would a world without money be like,what would the world be like if money did not ...
2,what are some punjabi jokes,what are some good punjabi jokes
3,how was the international space station built,how was international space station made
4,what are the best skullcandy earbuds,what are the best skullcandy earbuds and headp...


In [None]:
df.shape

(47012, 8)

In [None]:
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Use a shared tokenizer for both input and output (optional: separate if needed)
tokenizer = Tokenizer(filters='', oov_token='<OOV>')
tokenizer.fit_on_texts(df['question1'].tolist() + df['target_input'].tolist() + df['target_output'].tolist())

# Convert texts to sequences
input_seq = tokenizer.texts_to_sequences(df['question1'])
target_input_seq = tokenizer.texts_to_sequences(df['target_input'])
target_output_seq = tokenizer.texts_to_sequences(df['target_output'])

# Define max length (based on percentiles or max in dataset)
MAX_INPUT_LEN = max([len(seq) for seq in input_seq])
MAX_TARGET_LEN = max([len(seq) for seq in target_input_seq])

# Pad sequences
input_seq = pad_sequences(input_seq, maxlen=MAX_INPUT_LEN, padding='post')
target_input_seq = pad_sequences(target_input_seq, maxlen=MAX_TARGET_LEN, padding='post')
target_output_seq = pad_sequences(target_output_seq, maxlen=MAX_TARGET_LEN, padding='post')

# Vocabulary size
vocab_size = len(tokenizer.word_index) + 1


In [None]:
print(tokenizer.sequences_to_texts([input_seq[0]]))
print(tokenizer.sequences_to_texts([target_input_seq[0]]))


['what is it like to be undergraduate students <OOV> <OOV> <OOV> <OOV> <OOV>']
['<start> what is it like to be an undergraduate student <OOV> <OOV> <OOV> <OOV>']


In [None]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense

embedding_dim = 128
lstm_units = 256

# Encoder
encoder_inputs = Input(shape=(MAX_INPUT_LEN,))
enc_emb = Embedding(input_dim=vocab_size, output_dim=embedding_dim, mask_zero=True)(encoder_inputs)
encoder_lstm, state_h, state_c = LSTM(lstm_units, return_state=True)(enc_emb)
encoder_states = [state_h, state_c]

# Decoder
decoder_inputs = Input(shape=(MAX_TARGET_LEN,))
dec_emb_layer = Embedding(input_dim=vocab_size, output_dim=embedding_dim, mask_zero=True)
dec_emb = dec_emb_layer(decoder_inputs)
decoder_lstm = LSTM(lstm_units, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(dec_emb, initial_state=encoder_states)
decoder_dense = Dense(vocab_size, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

# Define model
model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.summary()


In [None]:
import numpy as np

target_output_seq = np.expand_dims(target_output_seq, -1)

from tensorflow.keras.callbacks import EarlyStopping

earlystop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)


In [None]:
history = model.fit(
    [input_seq, target_input_seq], target_output_seq,
    batch_size=64,
    epochs=5,
    validation_split=0.2,
    callbacks=[earlystop]
)

Epoch 1/5
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m48s[0m 78ms/step - accuracy: 0.1291 - loss: 6.0459 - val_accuracy: 0.1968 - val_loss: 4.6119
Epoch 2/5
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m81s[0m 77ms/step - accuracy: 0.2121 - loss: 4.3494 - val_accuracy: 0.2376 - val_loss: 4.0829
Epoch 3/5
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.2494 - loss: 3.7876 - val_accuracy: 0.2677 - val_loss: 3.7465
Epoch 4/5
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.2829 - loss: 3.3502 - val_accuracy: 0.2912 - val_loss: 3.4651
Epoch 5/5
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.3112 - loss: 2.9457 - val_accuracy: 0.3065 - val_loss: 3.2645


In [None]:
history = model.fit(
    [input_seq, target_input_seq], target_output_seq,
    batch_size=64,
    epochs=10,
    validation_split=0.2,
    callbacks=[earlystop]
)

Epoch 1/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m45s[0m 77ms/step - accuracy: 0.3358 - loss: 2.6032 - val_accuracy: 0.3203 - val_loss: 3.1187
Epoch 2/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.3581 - loss: 2.3226 - val_accuracy: 0.3293 - val_loss: 3.0251
Epoch 3/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 78ms/step - accuracy: 0.3774 - loss: 2.0908 - val_accuracy: 0.3365 - val_loss: 2.9631
Epoch 4/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.3972 - loss: 1.8755 - val_accuracy: 0.3404 - val_loss: 2.9271
Epoch 5/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 78ms/step - accuracy: 0.4180 - loss: 1.6842 - val_accuracy: 0.3464 - val_loss: 2.9056
Epoch 6/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m45s[0m 77ms/step - accuracy: 0.4372 - loss: 1.5288 - val_accuracy: 0.3500 - val_loss: 2.8899
Epoch 7/10
[1m5

In [None]:
history = model.fit(
    [input_seq, target_input_seq], target_output_seq,
    batch_size=64,
    epochs=10,
    validation_split=0.2,
    callbacks=[earlystop]
)

Epoch 1/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m45s[0m 77ms/step - accuracy: 0.4731 - loss: 1.2536 - val_accuracy: 0.3560 - val_loss: 2.9146
Epoch 2/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 78ms/step - accuracy: 0.4871 - loss: 1.1348 - val_accuracy: 0.3566 - val_loss: 2.9321
Epoch 3/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m81s[0m 77ms/step - accuracy: 0.4998 - loss: 1.0491 - val_accuracy: 0.3576 - val_loss: 2.9560
Epoch 4/10
[1m588/588[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m82s[0m 77ms/step - accuracy: 0.5141 - loss: 0.9533 - val_accuracy: 0.3587 - val_loss: 2.9874


In [None]:
model.save('model1.keras')

In [None]:
# Encoder model (same as during training)
encoder_model = Model(encoder_inputs, encoder_states)

# Decoder model
decoder_state_input_h = Input(shape=(lstm_units,))
decoder_state_input_c = Input(shape=(lstm_units,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]

# Decoder LSTM (with previous states for inference)
dec_emb2 = dec_emb_layer(decoder_inputs)
decoder_lstm2 = LSTM(lstm_units, return_sequences=True, return_state=True)
decoder_outputs2, state_h2, state_c2 = decoder_lstm2(dec_emb2, initial_state=decoder_states_inputs)

# Decoder output layer
decoder_outputs2 = decoder_dense(decoder_outputs2)

# Define the full decoder model
decoder_model = Model([decoder_inputs] + decoder_states_inputs, [decoder_outputs2, state_h2, state_c2])


In [None]:
def generate_paraphrase(input_seq, max_length=MAX_TARGET_LEN):
    # Get initial states from the encoder
    states_value = encoder_model.predict(input_seq)

    # Initialize the target sequence (start with <start> token)
    target_seq = np.array([tokenizer.texts_to_sequences(['<start>'])[0]])

    # Collect the output sequence
    generated_sequence = []

    # Iterate to predict one word at a time
    for _ in range(max_length):
        # Get the next word prediction and states
        output_tokens, state_h, state_c = decoder_model.predict([target_seq] + states_value)

        # Get the predicted word index
        predicted_word_idx = np.argmax(output_tokens[0, -1, :])

        # Convert word index to word
        predicted_word = tokenizer.index_word.get(predicted_word_idx, '')

        # Stop if we predict the <end> token
        if predicted_word == '<end>' or predicted_word == '':
            break

        # Add predicted word to sequence
        generated_sequence.append(predicted_word)

        # Update target sequence and states
        target_seq = np.array([tokenizer.texts_to_sequences([predicted_word])[0]])
        states_value = [state_h, state_c]

    return ' '.join(generated_sequence)


In [None]:
text = "How would i know if I am sick"
mlen = len(text.split()) + 2
sample_input_seq = tokenizer.texts_to_sequences([text])
sample_input_seq = pad_sequences(sample_input_seq, maxlen=MAX_INPUT_LEN, padding='post')

# Generate paraphrase
paraphrase = generate_paraphrase(sample_input_seq, max_length=mlen)
print(paraphrase)


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 142ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 129ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 31ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 31ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 37ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 36ms/step
how how how persons persons persons persons most most proposals


In [None]:
print(tokenizer.word_index)  # Check if '<end>' exists in the word index




In [None]:
model.save('model1.keras')

In [None]:
predicted_sentences = []
actual_target_sentences = []

for i in range(len(input_seq)):
    input_example = input_seq[i:i+1]  # keep batch dimension
    predicted_text = generate_paraphrase(input_example)
    predicted_sentences.append(predicted_text)

    # Remove <start> and <end> from actual target
    true_text = df.iloc[i]['target_output'].replace('<start>', '').replace('<end>', '').strip()
    actual_target_sentences.append(true_text)

    if i == 100:  # limit for speed — increase if needed
        break


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 32ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 32ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 38ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 66ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 129ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 124ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 34ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 

In [41]:
!pip install nltk rouge-score




In [42]:
import nltk
nltk.download('punkt')  # for tokenization


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.


True

In [43]:
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

smoothie = SmoothingFunction().method4
bleu_scores = []

for ref, pred in zip(actual_target_sentences, predicted_sentences):
    reference = [ref.split()]
    candidate = pred.split()
    bleu = sentence_bleu(reference, candidate, smoothing_function=smoothie)
    bleu_scores.append(bleu)

average_bleu = sum(bleu_scores) / len(bleu_scores)
print(f"Average BLEU Score: {average_bleu:.4f}")


Average BLEU Score: 0.0142


In [44]:
!pip install rouge

Collecting rouge
  Downloading rouge-1.0.1-py3-none-any.whl.metadata (4.1 kB)
Downloading rouge-1.0.1-py3-none-any.whl (13 kB)
Installing collected packages: rouge
Successfully installed rouge-1.0.1


In [45]:
from rouge import Rouge

rouge = Rouge()
rouge_scores = rouge.get_scores(predicted_sentences, actual_target_sentences, avg=True)

print("ROUGE Scores:")
for metric, scores in rouge_scores.items():
    print(f"{metric}: {scores}")


ROUGE Scores:
rouge-1: {'r': 0.11359954676786352, 'p': 0.16962517680339453, 'f': 0.13025418445429482}
rouge-2: {'r': 0.0, 'p': 0.0, 'f': 0.0}
rouge-l: {'r': 0.11249943675686241, 'p': 0.16714992927864208, 'f': 0.1287309552082933}
