[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/masked-language-model.ipynb)
[![View on GitHub](https://img.shields.io/badge/View_on-GitHub-blue?logo=github)](https://github.com/vuhung16au/hf-transformer-trove/blob/main/examples/basic1.4/masked-language-model.ipynb)

# Masked Language Model Demonstration

## 🎯 Learning Objectives
By the end of this notebook, you will understand:
- What masked language modeling is and how it works
- How to use Hugging Face transformers for mask filling
- How to interpret model predictions and confidence scores
- Basic applications of masked language models

## 📋 Prerequisites
- Basic understanding of machine learning concepts
- Familiarity with Python and PyTorch
- Knowledge of NLP fundamentals (refer to [NLP Learning Journey](https://github.com/vuhung16au/nlp-learning-journey))

## 📚 What We'll Cover
1. **Setup**: Import libraries and set up environment
2. **Basic Masked Language Modeling**: Using the fill-mask pipeline
3. **Understanding Results**: Interpreting predictions and confidence scores
4. **Practical Examples**: Testing with different contexts
5. **Context Sensitivity**: Seeing how context changes the predicted word
6. **Summary**: Key takeaways and next steps

## What is Masked Language Modeling?

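Masked Language Modeling (MLM) is a self-supervised learning task in which the model learns to predict tokens that have been hidden in a sentence. Each hidden token is replaced by a special mask token whose spelling depends on the model: BERT uses `[MASK]`, while RoBERTa-style models use `<mask>`, the form used throughout this notebook. This objective is how models like BERT were pre-trained to capture language context and semantics.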
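To make the training setup concrete, here is a minimal sketch of how a masked training example could be built from a sentence. It is a simplification under stated assumptions: tokens are split on whitespace rather than into subwords, the helper name `mask_tokens` and its parameters are hypothetical, and the 15% default follows the rate commonly cited for BERT. Real pre-training recipes are more involved.

```python
import random

MASK_TOKEN = "<mask>"  # assumption: RoBERTa-style mask token; BERT uses "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace roughly `mask_prob` of the tokens with MASK_TOKEN.

    Returns the masked tokens and a dict mapping each masked position
    to the original token, which is what the model learns to predict.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    # Always mask at least one token so short sentences still yield a target.
    n_to_mask = max(1, round(len(tokens) * mask_prob))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = masked[pos]
        masked[pos] = MASK_TOKEN
    return masked, targets


tokens = "the capital of france is paris".split()
masked, targets = mask_tokens(tokens)
print(" ".join(masked))  # the input the model would see
print(targets)           # position -> original token the model must recover
```

The `targets` dictionary plays the role of the labels: no human annotation is needed because the original text supplies them, which is why the task is called self-supervised.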

**Key Applications:**
- Text completion and suggestion
- Grammar and spell checking
- Language understanding evaluation
- Creative writing assistance

## 1. Setup and Environment

Let's start by importing the necessary libraries and setting up our environment.

In [None]:
# Install required packages (uncomment and run if needed)
# !pip install transformers torch

# Import essential libraries
import torch
from transformers import pipeline
import warnings
warnings.filterwarnings('ignore')

print("✅ Libraries imported successfully!")
print(f"PyTorch version: {torch.__version__}")

In [None]:
# Device detection for optimal performance
def get_device():
    """
    Get the best available device for PyTorch operations.
    
    Priority order: CUDA > MPS (Apple Silicon) > CPU
    
    Returns:
        torch.device: The optimal device for current hardware
    """
    if torch.cuda.is_available():
        device = torch.device("cuda")
        print(f"🚀 Using CUDA GPU: {torch.cuda.get_device_name()}")
    elif torch.backends.mps.is_available():
        device = torch.device("mps")
        print("🍎 Using Apple MPS for Apple Silicon optimization")
    else:
        device = torch.device("cpu")
        print("💻 Using CPU (consider GPU for better performance)")
    
    return device

# Get the optimal device
device = get_device()

## 2. Basic Masked Language Modeling

The Hugging Face `transformers` library provides a simple pipeline interface for masked language modeling. Let's start with a basic example.

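The fill-mask pipeline automatically loads a default pre-trained model when none is specified. At the time of writing this is usually a distilled RoBERTa checkpoint rather than BERT itself, which is why the examples use the `<mask>` token instead of `[MASK]`; the next cell prints the model that was actually loaded.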

In [None]:
# Create a fill-mask pipeline
# This automatically downloads and uses a default masked language model
print("📥 Loading fill-mask pipeline...")
# Pass the device detected earlier so inference runs on GPU/MPS when available
unmasker = pipeline("fill-mask", device=device)

print("✅ Pipeline loaded successfully!")
print(f"📊 Default model: {unmasker.model.config.name_or_path}")
print(f"🏷️  Model type: {unmasker.model.config.model_type}")
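Because the mask token differs between model families, hard-coding `<mask>` in every sentence ties the examples to one kind of model. The sketch below shows one way to keep sentences model-agnostic. The helper `with_mask` and the `{mask}` placeholder are hypothetical names introduced here for illustration, and requiring exactly one placeholder is a simplification.

```python
def with_mask(template: str, mask_token: str, placeholder: str = "{mask}") -> str:
    """Substitute a neutral placeholder with a model-specific mask token."""
    if template.count(placeholder) != 1:
        raise ValueError(f"expected exactly one {placeholder!r} in the template")
    return template.replace(placeholder, mask_token)


# With a loaded pipeline, the token would come from the tokenizer:
#   text = with_mask("The capital of France is {mask}.", unmasker.tokenizer.mask_token)
print(with_mask("The capital of France is {mask}.", "<mask>"))  # RoBERTa-style
print(with_mask("The capital of France is {mask}.", "[MASK]"))  # BERT-style
```

Reading the token from `unmasker.tokenizer.mask_token` means the same sentence templates keep working if the pipeline is later created with a different model.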

In [None]:
# Basic masked language modeling example
text = "The capital of France is <mask>."

print(f"🎯 Input text: {text}")
print("\n🤖 Model predictions:")
print("=" * 30)

# Get predictions (top 3 most likely words)
predictions = unmasker(text, top_k=3)

# Display results
for i, pred in enumerate(predictions, 1):
    print(f"{i}. '{pred['token_str']}' (confidence: {pred['score']:.3f})")
    print(f"   Complete sentence: {pred['sequence']}")
    print()

## 3. Understanding Results

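Let's examine what the pipeline returns and how to interpret it. Each `score` is the probability the model assigns to that token at the masked position, so the scores of the top predictions sum to at most 1.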

In [None]:
def analyze_prediction_results(predictions):
    """
    Analyze and explain masked language model predictions.
    
    Args:
        predictions: List of prediction dictionaries from fill-mask pipeline
    """
    print("📊 PREDICTION ANALYSIS")
    print("=" * 40)
    
    total_confidence = sum(pred['score'] for pred in predictions)
    
    print(f"Number of predictions: {len(predictions)}")
    print(f"Total confidence across top predictions: {total_confidence:.3f}")
    print(f"Average confidence: {total_confidence/len(predictions):.3f}\n")
    
    for i, pred in enumerate(predictions, 1):
        confidence_level = (
            "Very High" if pred['score'] > 0.5 else
            "High" if pred['score'] > 0.2 else
            "Medium" if pred['score'] > 0.1 else
            "Low"
        )
        
        print(f"Prediction #{i}:")
        print(f"  Token: '{pred['token_str']}'")
        print(f"  Score: {pred['score']:.4f} ({pred['score']*100:.2f}%)")
        print(f"  Confidence Level: {confidence_level}")
        print(f"  Token ID: {pred['token']}")
        print()

# Analyze our previous predictions
analyze_prediction_results(predictions)
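To see where such scores come from, the following sketch applies a softmax to a handful of raw model outputs (logits) and then takes the top-k entries, which is, as far as I understand it, the same idea the fill-mask pipeline applies over its full vocabulary. The four-word vocabulary and the logit values are invented for illustration and are not real model output.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probs, vocab, k):
    """Return the k most probable (token, probability) pairs."""
    ranked = sorted(zip(vocab, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Toy vocabulary and logits, invented for illustration (not real model output).
vocab = ["paris", "lyon", "london", "banana"]
logits = [4.0, 2.0, 1.5, -1.0]

probs = softmax(logits)
best = top_k(probs, vocab, k=3)
for token, p in best:
    print(f"{token}: {p:.3f}")
print(f"top-3 probability mass: {sum(p for _, p in best):.3f}")
```

The top-3 mass stays below 1 because the remaining probability belongs to tokens outside the top-k. With a real vocabulary of tens of thousands of tokens, a low total across the top predictions suggests the model is spreading its belief over many candidates.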

## 4. Practical Examples

Let's test the masked language model with different types of sentences to see how context affects predictions.

In [None]:
# Various test examples
test_examples = [
    "I love to eat <mask> for breakfast.",
    "The weather today is very <mask>.",
    "She works as a <mask> in the hospital.",
    "The <mask> is shining brightly today.",
    "Machine learning is a subset of <mask> intelligence."
]

print("🧪 TESTING DIFFERENT CONTEXTS")
print("=" * 50)

for i, example in enumerate(test_examples, 1):
    print(f"\n{i}. Input: {example}")
    
    try:
        predictions = unmasker(example, top_k=2)
        print("   Top predictions:")
        
        for j, pred in enumerate(predictions, 1):
            print(f"      {j}. {pred['token_str']} (score: {pred['score']:.3f})")
            
    except Exception as e:
        print(f"   Error: {e}")

## 5. Context Sensitivity Example

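Let's demonstrate how context changes what fits the masked slot. The sentences below are built around two ambiguous words, "bank" and "bat": the same word could plausibly fill the mask in each pair, but with a different meaning, and the model may or may not choose it.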

In [None]:
# Context sensitivity examples
context_examples = [
    ("He went to the <mask> to deposit money.", "Financial context"),
    ("They sat on the <mask> of the river.", "Geographic context"),
    ("The <mask> flew through the dark cave.", "Animal context"),
    ("He swung the <mask> at the baseball.", "Sports context")
]

print("🧠 CONTEXT SENSITIVITY DEMONSTRATION")
print("=" * 50)

for example, context_type in context_examples:
    print(f"\nContext: {context_type}")
    print(f"Sentence: {example}")
    
    predictions = unmasker(example, top_k=3)
    print("Predictions:")
    
    for i, pred in enumerate(predictions, 1):
        print(f"  {i}. {pred['token_str']} ({pred['score']:.3f})")
    print()
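Reading the printed lists by eye works for four sentences, but a small helper makes the check explicit: did the expected word appear among the predictions, and at which rank? The sketch below is one way to do that. The function `rank_of` is a hypothetical helper, and the sample list is a hand-written placeholder in the pipeline's output shape, not real model output.

```python
def rank_of(predictions, expected):
    """Return the 1-based rank of `expected` among predictions, or None.

    Tokens are stripped and lowercased before comparison because some
    tokenizers attach a leading space to whole-word tokens.
    """
    for rank, pred in enumerate(predictions, 1):
        if pred["token_str"].strip().lower() == expected.lower():
            return rank
    return None


# Hand-written placeholder in the pipeline's output shape (not real model output).
fake_predictions = [
    {"token_str": " bank", "score": 0.60},
    {"token_str": " store", "score": 0.10},
]
print(rank_of(fake_predictions, "bank"))  # → 1
print(rank_of(fake_predictions, "bat"))   # → None
```

With the real pipeline, the call would look like `rank_of(unmasker(example, top_k=3), "bank")`, giving a quick signal of whether the model resolved the context the way the sentence intended.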

---

## 📋 Summary

### 🔑 Key Concepts Mastered
- **Masked Language Modeling**: Understanding how models predict missing words using context
- **Fill-Mask Pipeline**: Using Hugging Face transformers for simple mask filling tasks
- **Prediction Interpretation**: Understanding confidence scores and model uncertainty
- **Context Sensitivity**: How surrounding words influence predictions

### 📈 Best Practices Learned
- Use the fill-mask pipeline for quick masked language modeling tasks
- Consider confidence scores when evaluating predictions
- Test with various contexts to understand model behavior
- Use device detection for optimal performance across different hardware

### 🚀 Next Steps
- **Notebook 05**: Explore question answering with transformers
- **Advanced Topics**: Custom model fine-tuning for specific domains
- **Documentation**: [Hugging Face Fill-Mask Documentation](https://huggingface.co/docs/transformers/task_summary#masked-language-modeling)

### 💡 Key Takeaways
> **Masked Language Modeling** is a powerful technique that helps models understand language context and semantics. The fill-mask pipeline provides an easy way to leverage pre-trained models for text completion and understanding tasks.

---

## About the Author

**Vu Hung Nguyen** - AI Engineer & Researcher

Connect with me:
- 🌐 **Website**: [vuhung16au.github.io](https://vuhung16au.github.io/)
- 💼 **LinkedIn**: [linkedin.com/in/nguyenvuhung](https://www.linkedin.com/in/nguyenvuhung/)
- 💻 **GitHub**: [github.com/vuhung16au](https://github.com/vuhung16au/)

*This notebook is part of the [HF Transformer Trove](https://github.com/vuhung16au/hf-transformer-trove) educational series.*