[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/translation.ipynb)
[![View on GitHub](https://img.shields.io/badge/View_on-GitHub-blue?logo=github)](https://github.com/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/translation.ipynb)

# Translation with BART and T5

## 🎯 Learning Objectives
By the end of this notebook, you will understand:
- How translation works as a sequence-to-sequence task
- How BART and T5 models handle translation
- How to use encoder-decoder architectures for translation
- How to implement basic translation with Hugging Face

## 📋 Prerequisites
- Basic understanding of machine learning concepts
- Familiarity with Python and PyTorch
- Knowledge of NLP fundamentals (refer to [NLP Learning Journey](https://github.com/vuhung16au/nlp-learning-journey))

## 📚 What We'll Cover
1. **Setup**: Environment and device detection
2. **Basic Translation with Pipeline**: The quickest way to translate text
3. **BART for Translation**: How BART adapts to translation
4. **T5 for Translation**: Using T5's text-to-text framework
5. **Model Comparison**: Running the same sentence through each model
6. **Summary**: Key takeaways and next steps

## What is Translation?

Translation converts text from one language to another while preserving its meaning. It is a sequence-to-sequence task: the model reads a token sequence in the source language and generates a new token sequence in the target language. Encoder-decoder models such as **BART** and **T5** are built for exactly this kind of task.
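To make the encoder-decoder idea concrete, here is a minimal pure-Python sketch of greedy autoregressive decoding. It is a toy, not how BART or T5 are implemented: the "encoder" simply stores the source tokens, and the "decoder" scores candidates with a hypothetical two-word lexicon. All names here (`encode`, `next_token_scores`, `greedy_decode`) are made up for illustration.

```python
from typing import Dict, List

BOS, EOS = "<s>", "</s>"  # start and end markers for the target sequence


def encode(source_tokens: List[str]) -> List[str]:
    """Toy encoder: the 'memory' the decoder reads is just the source tokens."""
    return list(source_tokens)


def next_token_scores(memory: List[str], generated: List[str]) -> Dict[str, float]:
    """Toy decoder step: score candidate next tokens given memory and prefix."""
    lexicon = {"hello": "hallo", "world": "welt"}  # hypothetical word list
    position = len(generated) - 1  # target tokens emitted so far, excluding BOS
    if position < len(memory):
        word = memory[position]
        return {lexicon.get(word, word): 1.0, EOS: 0.0}
    return {EOS: 1.0}  # source exhausted: ending the sequence scores highest


def greedy_decode(source_tokens: List[str], max_length: int = 10) -> List[str]:
    """Generate one token at a time, always taking the highest-scoring one."""
    memory = encode(source_tokens)
    generated = [BOS]
    while len(generated) < max_length:
        scores = next_token_scores(memory, generated)
        best = max(scores, key=scores.get)
        if best == EOS:
            break
        generated.append(best)
    return generated[1:]  # drop BOS


print(greedy_decode(["hello", "world"]))
```

Real models replace the lexicon with learned attention over the encoder output, and usually replace greedy selection with beam search, but the loop structure (encode once, then decode token by token until an end marker or a length limit) is the same.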

### How BART Works for Translation
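The original BART adapts to translation by adding a separate, randomly initialized encoder that maps the source language to an input that can be decoded into the target language. This new encoder's embeddings are passed to the pretrained encoder instead of the original word embeddings. The multilingual variant used later in this notebook, mBART-50, takes a different route: it is pretrained on many languages and selects the output language with a language-code token, so no extra encoder is needed.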

### How T5 Works for Translation
T5 treats translation as a text-to-text task, where the input is prefixed with "translate [source] to [target]:" followed by the source text.
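The prefix format can be sketched as a small helper. This is an illustration, not part of the Transformers API: `build_t5_prompt` and `T5_TARGETS_FROM_ENGLISH` are hypothetical names, and the language check assumes an original T5 checkpoint that has not been fine-tuned on additional pairs.

```python
# Translation pairs in the original T5 training mixture all start from English.
# Assumption: the checkpoint in use has not been fine-tuned on other pairs.
T5_TARGETS_FROM_ENGLISH = {"German", "French", "Romanian"}


def build_t5_prompt(text: str, source_lang: str = "English",
                    target_lang: str = "German") -> str:
    """Build a T5-style translation prompt: 'translate X to Y: <text>'."""
    if source_lang != "English" or target_lang not in T5_TARGETS_FROM_ENGLISH:
        raise ValueError(
            f"Pair {source_lang} -> {target_lang} is not covered by this sketch"
        )
    return f"translate {source_lang} to {target_lang}: {text.strip()}"


print(build_t5_prompt("Hello world!"))
```

Language names are spelled out in full ("German", not "de") because the prefix is ordinary text that the model learned during training.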

## 1. Setup and Environment

Let's start by importing the necessary libraries and setting up our environment.

In [None]:
# Import essential libraries
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForSeq2SeqLM
import warnings
warnings.filterwarnings('ignore')

print("✅ Libraries imported successfully!")
print(f"PyTorch version: {torch.__version__}")

In [None]:
# Device detection for optimal performance
def get_device():
    """
    Get the best available device for PyTorch operations.
    
    Priority order: CUDA > MPS (Apple Silicon) > CPU
    
    Returns:
        torch.device: The optimal device for current hardware
    """
    if torch.cuda.is_available():
        device = torch.device("cuda")
        print(f"🚀 Using CUDA GPU: {torch.cuda.get_device_name()}")
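    # getattr guards against older PyTorch builds that lack torch.backends.mps
    elif getattr(torch.backends, "mps", None) is not None and torch.backends.mps.is_available():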
        device = torch.device("mps")
        print("🍎 Using Apple MPS for Apple Silicon optimization")
    else:
        device = torch.device("cpu")
        print("💻 Using CPU (consider GPU for better performance)")
    
    return device

# Get the optimal device
device = get_device()

## 2. Basic Translation with Pipeline

The simplest way to get started with translation is using Hugging Face's pipeline interface.
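A translation pipeline returns a list of dictionaries, each with a `translation_text` key, and passing a list of strings yields one dictionary per input. The sketch below shows one way to unpack that shape. So that it runs without downloading a model, it uses `fake_translator`, a hypothetical stand-in that mimics the return shape; `translate_batch` is likewise an illustrative helper, not a library function.

```python
from typing import Callable, Dict, List

PipelineOutput = List[Dict[str, str]]


def translate_batch(translator: Callable[[List[str]], PipelineOutput],
                    texts: List[str]) -> List[str]:
    """Translate several texts in one call and return plain strings."""
    if not texts:
        return []
    outputs = translator(texts)
    return [item["translation_text"] for item in outputs]


def fake_translator(texts: List[str]) -> PipelineOutput:
    """Hypothetical stand-in for a pipeline: upper-cases instead of translating."""
    return [{"translation_text": text.upper()} for text in texts]


print(translate_batch(fake_translator, ["Hello", "Good morning"]))
```

With a real pipeline object, the same helper would be called as `translate_batch(translator, texts)`.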

In [None]:
# Create a simple translation pipeline
print("🔄 Loading translation model...")
print("This may take a few minutes on first run (downloading model)")

try:
    # Use a simple English to German translation model
    translator = pipeline(
        "translation",
        model="Helsinki-NLP/opus-mt-en-de",
        device=device  # pass the torch.device directly so CUDA, MPS, and CPU are all honored
    )
    
    print("✅ Translation pipeline loaded successfully")
    
except Exception as e:
    print(f"❌ Error loading translation model: {e}")
    print("💡 This model might not be available. Translation features may be limited.")
    translator = None

In [None]:
# Test basic translation
if translator:
    sample_texts = [
        "Hello, how are you today?",
        "I love learning about artificial intelligence.",
        "The weather is beautiful today."
    ]
    
    print("🌐 English to German Translation Examples")
    print("=" * 50)
    
    for i, text in enumerate(sample_texts, 1):
        print(f"\n{i}. 🇺🇸 English: {text}")
        
        try:
            result = translator(text)
            german_text = result[0]['translation_text']
            print(f"   🇩🇪 German: {german_text}")
            
        except Exception as e:
            print(f"   ❌ Translation failed: {e}")
else:
    print("❌ Translation pipeline not available")

## 3. Using BART for Translation

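BART (Bidirectional and Auto-Regressive Transformers) is an encoder-decoder model that can be adapted for translation tasks. This section uses mBART-50, a multilingual BART variant fine-tuned for translation, and loads the tokenizer and model manually instead of going through a pipeline.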
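Multilingual BART variants such as mBART-50 choose the output language by forcing the first generated token to be a language code such as `de_DE`. The toy sketch below illustrates the idea on plain Python lists: at the first decoding step, every score except the forced token's is set to negative infinity. The vocabulary and the function name `force_first_token` are hypothetical; the real library works on tensors and real token ids differ.

```python
import math
from typing import Dict, List

# Hypothetical toy vocabulary; real mBART-50 token ids are different.
VOCAB: Dict[str, int] = {"<s>": 0, "</s>": 1, "de_DE": 2, "fr_XX": 3, "Hallo": 4}


def force_first_token(scores: List[float], step: int,
                      forced_token_id: int) -> List[float]:
    """At step 0, leave only the forced token selectable; later steps pass through."""
    if step != 0:
        return list(scores)
    return [0.0 if i == forced_token_id else -math.inf for i in range(len(scores))]


raw_scores = [0.1, 0.2, 0.3, 0.9, 0.5]  # unforced, "fr_XX" (id 3) would win
forced = force_first_token(raw_scores, step=0, forced_token_id=VOCAB["de_DE"])
best = max(range(len(forced)), key=forced.__getitem__)
print(best == VOCAB["de_DE"])
```

Because the decoder conditions on everything generated so far, starting with the German language code steers the rest of the output toward German.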

In [None]:
# Load a BART-based translation model manually
print("🔄 Loading BART-based translation model...")

try:
    # Use a multilingual BART model for translation
    model_name = "facebook/mbart-large-50-one-to-many-mmt"
    
    # Load tokenizer and model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    
    # Move model to optimal device
    model.to(device)
    
    print("✅ BART model loaded successfully")
    
    # Set source and target languages
    # mBART uses language codes like "en_XX" for English, "de_DE" for German
    tokenizer.src_lang = "en_XX"  # English source
    
    def translate_with_bart(text, target_lang="de_DE"):
        """
        Translate English text with the mBART-50 one-to-many model
        
        Args:
            text: Input text in source language
            target_lang: Target language code
        
        Returns:
            Translated text
        """
        # Tokenize input
        inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
        inputs = {k: v.to(device) for k, v in inputs.items()}
        
        # Generate translation
        with torch.no_grad():
            generated_tokens = model.generate(
                **inputs,
                # Force the first generated token to be the target language code
                forced_bos_token_id=tokenizer.convert_tokens_to_ids(target_lang),
                max_length=50,
                num_beams=2,
                early_stopping=True
            )
        
        # Decode output
        translated_text = tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]
        return translated_text
    
    bart_available = True
    
except Exception as e:
    print(f"❌ Error loading BART model: {e}")
    print("💡 The BART examples below will be skipped")
    bart_available = False

In [None]:
# Test BART translation
if bart_available:
    test_text = "Machine learning is revolutionizing technology."
    
    print("🤖 BART Translation Example")
    print("=" * 30)
    print(f"🇺🇸 English: {test_text}")
    
    try:
        translated = translate_with_bart(test_text)
        print(f"🇩🇪 German (BART): {translated}")
    except Exception as e:
        print(f"❌ BART translation failed: {e}")
else:
    print("❌ BART model not available for demonstration")

## 4. Using T5 for Translation

T5 (Text-to-Text Transfer Transformer) treats every NLP task as a text-to-text problem, including translation.

In [None]:
# Create T5 translation pipeline
print("🔄 Loading T5 model for translation...")

try:
    # Use T5 small model for demonstration
    t5_translator = pipeline(
        "text2text-generation",
        model="t5-small",
        device=device  # pass the torch.device directly so CUDA, MPS, and CPU are all honored
    )
    
    print("✅ T5 model loaded successfully")
    
    def translate_with_t5(text, source_lang="English", target_lang="German"):
        """
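        Translate text using T5's text-to-text format
        
        Note: the original T5 checkpoints were trained to translate from
        English to German, French, and Romanian only. Other language
        pairs are likely to produce poor output.
        
        Args:
            text: Input text to translate
            source_lang: Source language name, spelled out (e.g. "English")
            target_lang: Target language name, spelled out (e.g. "German")
        
        Returns:
            Translated text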
        """
        # T5 uses task prefixes - format for translation
        prompt = f"translate {source_lang} to {target_lang}: {text}"
        
        result = t5_translator(prompt, max_length=50, num_return_sequences=1)
        return result[0]['generated_text']
    
    t5_available = True
    
except Exception as e:
    print(f"❌ Error loading T5 model: {e}")
    print("💡 T5 model might not be available")
    t5_available = False

In [None]:
# Test T5 translation
if t5_available:
    test_sentences = [
        "Hello world!",
        "How are you doing today?",
        "I enjoy learning new languages."
    ]
    
    print("📝 T5 Translation Examples")
    print("=" * 30)
    
    for i, sentence in enumerate(test_sentences, 1):
        print(f"\n{i}. 🇺🇸 English: {sentence}")
        
        try:
            translated = translate_with_t5(sentence)
            print(f"   🇩🇪 German (T5): {translated}")
        except Exception as e:
            print(f"   ❌ Translation failed: {e}")
else:
    print("❌ T5 model not available for demonstration")

## 5. Understanding Model Differences

Let's compare how different models approach the same translation task.
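One way to structure such a comparison is a small harness that runs every available translator on the same text and records failures instead of stopping. This is a sketch under assumptions: `compare_translators` is a hypothetical helper, and the two demo callables stand in for real models so the example runs offline.

```python
from typing import Callable, Dict


def compare_translators(text: str,
                        translators: Dict[str, Callable[[str], str]]) -> Dict[str, str]:
    """Run each translator on the same text, capturing errors per model."""
    results: Dict[str, str] = {}
    for name, translate in translators.items():
        try:
            results[name] = translate(text)
        except Exception as exc:  # one failing model should not stop the others
            results[name] = f"ERROR: {exc}"
    return results


def broken_model(text: str) -> str:
    """Hypothetical stand-in for a model that failed to load."""
    raise RuntimeError("model not loaded")


# Stand-ins for real models; str.upper just makes the output visibly different.
demo_translators = {"upper": str.upper, "broken": broken_model}
for name, output in compare_translators("hi", demo_translators).items():
    print(f"{name}: {output}")
```

With the models from this notebook, the dictionary could map names to small wrappers around `translator`, `translate_with_bart`, and `translate_with_t5`.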

In [None]:
# Compare different approaches
test_phrase = "Artificial intelligence is changing the world."

print("🔍 Model Comparison")
print("=" * 40)
print(f"📝 Original: {test_phrase}")
print()

# Pipeline translation (OPUS-MT)
if translator:
    try:
        pipeline_result = translator(test_phrase)
        print(f"🔧 Pipeline (OPUS-MT): {pipeline_result[0]['translation_text']}")
    except Exception as e:
        print(f"🔧 Pipeline (OPUS-MT): ❌ {e}")

# BART translation
if bart_available:
    try:
        bart_result = translate_with_bart(test_phrase)
        print(f"🤖 BART (mBART): {bart_result}")
    except Exception as e:
        print(f"🤖 BART (mBART): ❌ {e}")

# T5 translation
if t5_available:
    try:
        t5_result = translate_with_t5(test_phrase)
        print(f"📝 T5: {t5_result}")
    except Exception as e:
        print(f"📝 T5: ❌ {e}")

print("\n💡 Each model has different strengths:")
print("   • OPUS-MT: Specialized for specific language pairs")
print("   • BART/mBART: Good for multilingual translation")
print("   • T5: Flexible text-to-text approach")

## Summary

Congratulations! You've learned the fundamentals of machine translation with encoder-decoder models.

### 🔑 Key Concepts Mastered
- **Sequence-to-Sequence Tasks**: Understanding how translation works as an encoder-decoder problem
- **BART for Translation**: How BART adapts with an additional source-language encoder, and how mBART uses language codes instead
- **T5 for Translation**: Using text-to-text format for translation tasks
- **Model Comparison**: Different approaches and their trade-offs

### 📈 Best Practices Learned
- Use specialized models (OPUS-MT) for specific language pairs
- Consider multilingual models (mBART) for broader language coverage
- Leverage T5's flexibility for various text-to-text tasks
- Always handle errors gracefully in production systems

### 🚀 Next Steps
- **Advanced Models**: Explore more sophisticated models like NLLB (No Language Left Behind)
- **Fine-tuning**: Learn to fine-tune models on domain-specific data
- **Evaluation**: Study translation quality metrics (BLEU, METEOR, BERTScore)
- **Production**: Build scalable translation APIs and services

### 📚 Additional Resources
- [Hugging Face Translation Guide](https://huggingface.co/docs/transformers/tasks/translation)
- [BART Documentation](https://huggingface.co/docs/transformers/model_doc/bart)
- [T5 Documentation](https://huggingface.co/docs/transformers/model_doc/t5)
- [mBART Documentation](https://huggingface.co/docs/transformers/model_doc/mbart)

As this notebook shows, OPUS-MT, mBART, and T5 all follow the same encoder-decoder pattern even though they differ in training data and interface. Recognizing that shared pattern makes it easier to understand new models and to adapt existing ones to your specific needs.

---

## About the Author

**Vu Hung Nguyen** - AI Engineer & Researcher

Connect with me:
- 🌐 **Website**: [vuhung16au.github.io](https://vuhung16au.github.io/)
- 💼 **LinkedIn**: [linkedin.com/in/nguyenvuhung](https://www.linkedin.com/in/nguyenvuhung/)
- 💻 **GitHub**: [github.com/vuhung16au](https://github.com/vuhung16au/)

*This notebook is part of the [HF Transformer Trove](https://github.com/vuhung16au/hf-transformer-trove) educational series.*