# Vietnamese-English Translation with RNN

This notebook preprocesses the data, then trains, evaluates, and runs an RNN-based translation model in both directions (en-vi and vi-en), using code pulled from GitHub.

## Setup

If the code is not already available, clone the repository from GitHub first. Alternatively, upload the repository files to Kaggle and run this notebook directly against them.

In [None]:
# Install required packages
!pip install -q sacrebleu datasets sentencepiece bert-score

In [None]:
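# Create output directories. The paths are relative, so they match the
# /kaggle/working/... paths used in later cells only when the working
# directory is /kaggle/working.
import os
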
os.makedirs('models', exist_ok=True)
os.makedirs('plots', exist_ok=True)
os.makedirs('cache', exist_ok=True)

# Clean cache if needed
# !rm -rf cache/*

## Preprocess the Dataset

In [None]:
# Preprocess datasets
!python preprocess.py --direction en-vi --cache_dir /kaggle/working/cache
!python preprocess.py --direction vi-en --cache_dir /kaggle/working/cache

## Train the Models

Set the parameters for training:

In [None]:
# Set training parameters
EMB_DIM = 256
HIDDEN_DIM = 512
NUM_LAYERS = 2
DROPOUT = 0.3
BATCH_SIZE = 64
EPOCHS = 10
LEARNING_RATE = 0.001

In [None]:
# Train English to Vietnamese
!python train.py \
    --direction en-vi \
    --emb_dim {EMB_DIM} \
    --hidden_dim {HIDDEN_DIM} \
    --num_layers {NUM_LAYERS} \
    --dropout {DROPOUT} \
    --batch_size {BATCH_SIZE} \
    --n_epochs {EPOCHS} \
    --learning_rate {LEARNING_RATE} \
    --model_dir /kaggle/working/models

In [None]:
# Train Vietnamese to English
!python train.py \
    --direction vi-en \
    --emb_dim {EMB_DIM} \
    --hidden_dim {HIDDEN_DIM} \
    --num_layers {NUM_LAYERS} \
    --dropout {DROPOUT} \
    --batch_size {BATCH_SIZE} \
    --n_epochs {EPOCHS} \
    --learning_rate {LEARNING_RATE} \
    --model_dir /kaggle/working/models
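
The `{EMB_DIM}`-style placeholders in the two training cells are expanded by IPython before the shell command runs, so those cells work only inside a notebook. As a rough sketch, the same invocation could be assembled in plain Python for use in a script. The helper name `build_train_command` is hypothetical; the flags are the ones passed to `train.py` above, and nothing else about `train.py` is assumed.

```python
import sys


def build_train_command(direction, params, model_dir):
    """Return the train.py invocation as an argument list for subprocess.run.

    Hypothetical helper: it only mirrors the flags used in the notebook cells.
    """
    command = [sys.executable, "train.py", "--direction", direction]
    for flag, value in params.items():
        command += [f"--{flag}", str(value)]
    command += ["--model_dir", model_dir]
    return command


# Same values as the training-parameter cell.
params = {
    "emb_dim": 256,
    "hidden_dim": 512,
    "num_layers": 2,
    "dropout": 0.3,
    "batch_size": 64,
    "n_epochs": 10,
    "learning_rate": 0.001,
}

command = build_train_command("en-vi", params, "/kaggle/working/models")
print(" ".join(command))
# To actually launch training: subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting issues, and `check=True` would make a failed training run raise an exception instead of passing silently.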

## Evaluate the Models

We evaluate the models with two metrics:
- BLEU: A traditional metric that measures n-gram overlap between the model output and the reference translation
- BERTScore: A newer metric that compares contextual embeddings of the output and the reference to estimate semantic similarity
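
To make "n-gram overlap" concrete, here is a minimal pure-Python sketch of clipped n-gram precision, the quantity BLEU is built on. It is a simplification for illustration, not what `evaluate.py` or `sacrebleu` computes: full BLEU combines the precisions for n = 1 to 4 with a geometric mean and a brevity penalty, and `sacrebleu` also applies its own tokenization. The function names and the example sentences are made up.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(hypothesis, reference, n):
    """Fraction of hypothesis n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a correct word does not inflate the score.
    """
    hyp_counts = Counter(ngrams(hypothesis.split(), n))
    ref_counts = Counter(ngrams(reference.split(), n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return matched / total


reference = "the cat sat on the mat"
hypothesis = "the cat sat on a mat"

unigram_precision = clipped_precision(hypothesis, reference, 1)
bigram_precision = clipped_precision(hypothesis, reference, 2)
print(f"1-gram precision: {unigram_precision:.3f}")  # 5 of 6 words match
print(f"2-gram precision: {bigram_precision:.3f}")   # 3 of 5 bigrams match
```

A single wrong word costs one unigram but two bigrams, which is why higher-order precisions drop faster and why BLEU rewards fluent word order as well as word choice.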

In [None]:
# Evaluate English to Vietnamese
!python evaluate.py \
    --direction en-vi \
    --model_path /kaggle/working/models/en-vi-rnn.pt \
    --examples 3

In [None]:
# Evaluate Vietnamese to English
!python evaluate.py \
    --direction vi-en \
    --model_path /kaggle/working/models/vi-en-rnn.pt \
    --examples 3

## Example Translations

In [None]:
# English to Vietnamese
!python translate.py \
    --direction en-vi \
    --model_path /kaggle/working/models/en-vi-rnn.pt \
    --text "Hello, how are you today?"

In [None]:
# Vietnamese to English
!python translate.py \
    --direction vi-en \
    --model_path /kaggle/working/models/vi-en-rnn.pt \
    --text "Xin chào, bạn khỏe không?"

## Save the Models

The training cells save the models to the `/kaggle/working/models` directory. Download them from there, or use Kaggle's output feature to keep them for later use.

In [None]:
# Check the models directory
!ls -la /kaggle/working/models
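
If downloading the checkpoints one by one is inconvenient, the directory could be bundled into a single archive first. The sketch below uses the standard-library `shutil.make_archive`; the helper name `archive_models` and the archive name `models_backup` are illustrative choices, and the demonstration runs on a temporary directory with a placeholder file rather than on real checkpoints.

```python
import shutil
import tempfile
from pathlib import Path


def archive_models(model_dir, archive_base):
    """Zip everything under model_dir and return the path of the .zip file."""
    return shutil.make_archive(str(archive_base), "zip", root_dir=str(model_dir))


# Demonstration on a throwaway directory with a placeholder checkpoint file.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "models"
    model_dir.mkdir()
    (model_dir / "en-vi-rnn.pt").write_bytes(b"placeholder")

    archive_path = archive_models(model_dir, Path(tmp) / "models_backup")
    archive_name = Path(archive_path).name
    archive_exists = Path(archive_path).exists()
    print(archive_name)
```

On Kaggle, the equivalent call would be `archive_models('/kaggle/working/models', '/kaggle/working/models_backup')`, which writes `models_backup.zip` next to the `models` directory.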

## Interactive Translation (Optional)

Note: This only works in a notebook environment that supports the `input()` function. If Kaggle does not support it, run this cell locally after downloading the trained models.

In [None]:
# Import the required modules
import torch
from utils import preprocess_text
from evaluate import translate_sentence
from model import create_model

# Function for interactive translation
def interactive_translate(model_path, direction):
    # Setup device
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    print(f"Using device: {device}")
    
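    # Load model. The checkpoint stores the vocabularies and training arguments
    # alongside the weights; on PyTorch 2.6+, torch.load defaults to
    # weights_only=True, so pass weights_only=False (trusted files only) if
    # loading fails on those extra objects.
    checkpoint = torch.load(model_path, map_location=device)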
    src_vocab = checkpoint['src_vocab']
    tgt_vocab = checkpoint['tgt_vocab']
    model_args = checkpoint['args']
    
    # Create model
    model = create_model(
        src_vocab_size=len(src_vocab),
        tgt_vocab_size=len(tgt_vocab),
        embedding_dim=model_args['emb_dim'],
        hidden_dim=model_args['hidden_dim'],
        num_layers=model_args['num_layers'],
        dropout=model_args['dropout'],
        device=device
    )
    
    # Load weights and switch to inference mode so dropout is disabled
    model.load_state_dict(checkpoint['model_state_dict'])
    model.eval()
    print(f"Model loaded from {model_path}")
    
    # Determine source and target languages
    if direction == 'en-vi':
        src_lang, tgt_lang = 'en', 'vi'
    else:
        src_lang, tgt_lang = 'vi', 'en'
    
    # Interactive mode
    print(f"\nInteractive translation mode ({src_lang} -> {tgt_lang})")
    print("Enter text to translate, or 'q' to quit.")
    
    while True:
        text = input(f"\n{src_lang} > ").strip()
        
        if text.lower() == 'q':
            break
        if not text:
            # Skip empty input instead of translating an empty sentence
            continue
        
        preprocessed_text = preprocess_text(text, src_lang)
        translation = translate_sentence(preprocessed_text, src_vocab, tgt_vocab, model, device)
        
        print(f"{tgt_lang} > {translation}")

# Uncomment to run interactive translation
# interactive_translate('/kaggle/working/models/en-vi-rnn.pt', 'en-vi')

## Comparing RNN vs LSTM Performance

This notebook implements translation using simple RNN units. Here are some expected differences compared to LSTM models:

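1. RNNs are simpler and have fewer parameters than LSTMs: at the same input and hidden sizes, a standard LSTM layer holds four sets of weights (three gates plus the candidate cell state) where a simple RNN layer holds one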
2. RNNs may struggle with long-range dependencies due to the vanishing gradient problem
3. LSTMs typically achieve better performance on translation tasks due to their ability to control information flow through gates
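
The first two points can be illustrated with a small pure-Python sketch. It assumes the weight layout used by PyTorch's `nn.RNN` and `nn.LSTM` (two bias vectors per layer, unidirectional); whether this repository's `model.py` uses those modules is an assumption, and the count covers only one recurrent stack, not embeddings or the output projection. The second function uses a scalar toy recurrence, not the real model, to show how the gradient shrinks as it is propagated back through more time steps.

```python
import math


def recurrent_params(input_dim, hidden_dim, num_layers, gates):
    """Parameter count of a stacked recurrent network.

    gates=1 models a simple RNN, gates=4 a standard LSTM. Each layer has an
    input-to-hidden matrix, a hidden-to-hidden matrix, and two bias vectors.
    """
    total = 0
    for layer in range(num_layers):
        layer_input = input_dim if layer == 0 else hidden_dim
        total += gates * (hidden_dim * (layer_input + hidden_dim) + 2 * hidden_dim)
    return total


def gradient_through_time(weight, inputs):
    """|d h_T / d h_0| for the scalar recurrence h_t = tanh(weight * h_{t-1} + x_t)."""
    h, grad = 0.0, 1.0
    for x in inputs:
        h = math.tanh(weight * h + x)
        # Chain rule: each step multiplies by weight * tanh'(pre-activation).
        grad *= weight * (1.0 - h * h)
    return abs(grad)


# Sizes from the training-parameter cell: EMB_DIM=256, HIDDEN_DIM=512, NUM_LAYERS=2.
rnn_params = recurrent_params(256, 512, 2, gates=1)
lstm_params = recurrent_params(256, 512, 2, gates=4)
print(f"RNN stack:  {rnn_params:,} parameters")   # 919,552
print(f"LSTM stack: {lstm_params:,} parameters")  # 3,678,208

grad_short = gradient_through_time(0.9, [0.5] * 5)
grad_long = gradient_through_time(0.9, [0.5] * 50)
print(f"gradient after 5 steps:  {grad_short:.3e}")
print(f"gradient after 50 steps: {grad_long:.3e}")
```

Because every step multiplies the gradient by a factor below 1 in this toy setting, the signal from early tokens fades geometrically with sequence length. LSTM gates add a path along the cell state that can keep this factor close to 1.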

The BLEU and BERTScore results from this notebook can be compared quantitatively against an LSTM model, provided that model is evaluated with the same metrics on the same test data.