# Advanced AI Models: MOE, SAM, LCM, MLM Examples

This notebook covers specialized AI architectures:
- **MOE**: Mixture of Experts
- **SAM**: Segment Anything Model
- **LCM**: Latent Consistency Model
- **MLM**: Masked Language Model

---

In [None]:
# Install required packages
!pip install transformers torch diffusers accelerate segment-anything pillow matplotlib numpy opencv-python -q

In [None]:
import torch
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import requests
from io import BytesIO

print("✓ Imports complete")

---
# Part 1: MOE - Mixture of Experts

## What is MOE?
- **Architecture**: Multiple expert sub-models with a router
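- **Efficiency**: Only a few experts run per token (2 of 8 in Mixtral)
- **Size**: Mixtral 8x7B has 46.7B total parameters, not 8 × 7B = 56B, because the experts share the attention layers; only ~12.9B are active per token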
- **Example**: Mixtral 8x7B

### How it works:
```
Input Token → Router → Selects 2 Experts → Process → Output
               ↓
        [Expert 1] [Expert 2] [Expert 3] [Expert 4]
        [Expert 5] [Expert 6] [Expert 7] [Expert 8]
```
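
The routing diagram above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a linear router and toy linear experts; `top_k_gating`, `moe_layer`, and the random weights are hypothetical names and values chosen for the example, not Mixtral's actual implementation.

```python
import numpy as np

def top_k_gating(router_logits, k=2):
    """Pick the k highest-scoring experts and softmax-normalize their scores."""
    top_idx = np.argsort(router_logits)[::-1][:k]
    top_logits = router_logits[top_idx]
    exp = np.exp(top_logits - top_logits.max())  # subtract max for stability
    return top_idx, exp / exp.sum()

def moe_layer(token, router_weights, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    router_logits = router_weights @ token  # one logit per expert
    idx, gate = top_k_gating(router_logits, k)
    return sum(g * experts[i](token) for i, g in zip(idx, gate))

# Toy setup (assumed sizes): 8 experts, hidden size 4
rng = np.random.default_rng(0)
hidden, num_experts = 4, 8
router_weights = rng.normal(size=(num_experts, hidden))
expert_mats = [rng.normal(size=(hidden, hidden)) for _ in range(num_experts)]
experts = [lambda x, W=W: W @ x for W in expert_mats]

token = rng.normal(size=hidden)
output = moe_layer(token, router_weights, experts)
print(output.shape)
```

Only two of the eight expert matrices are multiplied per token, which is where the compute savings come from.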

## Example 1: Using Mixtral 8x7B (MOE Model)

In [None]:
from transformers import AutoTokenizer, AutoModelForCausalLM

# Mixtral is a sparse MOE model. It is not loaded here because the weights
# need roughly 90GB of GPU memory in half precision (less with 4-bit quantization).
model_name = "mistralai/Mixtral-8x7B-Instruct-v0.1"

print(f"Model: {model_name} (not loaded in this demo)\n")

print("MOE Architecture Details:")
print("- Total parameters: 46.7B")
print("- Active per token: ~12.9B")
print("- Number of experts: 8 per layer")
print("- Experts used per token: 2")
print("- Result: per-token compute of a ~13B dense model, with quality that")
print("  Mistral AI reports as comparable to Llama 2 70B on most benchmarks")

## Example 2: MOE Concept Demonstration

In [None]:
# Simulate MOE routing logic
class SimpleMOE:
    """Simplified MOE demonstration"""
    
    def __init__(self, num_experts=8):
        self.num_experts = num_experts
        self.expert_names = [
            "Math Expert", "Code Expert", "Science Expert", "History Expert",
            "Language Expert", "Creative Expert", "Logic Expert", "General Expert"
        ]
    
    def route(self, query):
        """Route query to appropriate experts"""
        # Simplified routing based on keywords
        keywords = {
            0: ["math", "calculate", "equation", "number"],
            1: ["code", "program", "function", "algorithm"],
            2: ["science", "physics", "chemistry", "biology"],
            3: ["history", "historical", "past", "century"],
            4: ["translate", "language", "grammar"],
            5: ["creative", "story", "poem", "artistic"],
            6: ["logic", "reasoning", "puzzle"],
            7: []  # General catch-all
        }
        
        query_lower = query.lower()
        scores = [0] * self.num_experts
        
        for expert_id, words in keywords.items():
            for word in words:
                if word in query_lower:
                    scores[expert_id] += 1
        
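        # Select the top 2 experts among those that matched at least one keyword
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        top_experts = [i for i in ranked if scores[i] > 0][:2]
        
        # Fall back to the general expert when fewer than 2 experts matched
        if len(top_experts) < 2:
            top_experts.append(len(self.expert_names) - 1)
        
        return [self.expert_names[i] for i in top_experts]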

# Test the routing
moe = SimpleMOE()

queries = [
    "Calculate the factorial of 10",
    "Write a Python function to sort a list",
    "Explain photosynthesis",
    "Who was the first president?",
    "Write a haiku about spring"
]

print("MOE Routing Demonstration:\n")
for query in queries:
    experts = moe.route(query)
    print(f"Query: {query}")
    print(f"Selected Experts: {', '.join(experts)}\n")

## MOE Benefits

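```python
# Traditional dense model (e.g. a 70B model)
dense_total_params = 70e9
dense_active_params = 70e9   # every parameter is used for every token

# MOE model (Mixtral 8x7B)
moe_total_params = 46.7e9    # less than 8 * 7B = 56B: experts share attention layers
moe_active_params = 12.9e9   # only 2 of 8 experts run per token

# Per-token compute scales with active parameters; memory scales with total parameters
print(f"Active fraction of MOE: {moe_active_params / moe_total_params:.0%}")
print(f"Compute vs. dense 70B:  {moe_active_params / dense_active_params:.0%}")
```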
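
The 46.7B and 12.9B figures can be reproduced with some arithmetic. The sketch below assumes the hyperparameters from the published Mixtral-8x7B configuration (hidden size 4096, expert FFN size 14336, 32 layers, 32 query heads, 8 key/value heads, vocabulary 32000, untied input and output embeddings); the variable names are illustrative.

```python
# Assumed hyperparameters (from the published Mixtral-8x7B config)
hidden, ffn, layers, vocab = 4096, 14336, 32, 32000
n_heads, n_kv_heads = 32, 8
head_dim = hidden // n_heads
n_experts, experts_per_token = 8, 2

expert_ffn = 3 * hidden * ffn  # gate, up, and down projections
attention = 2 * hidden * hidden + 2 * hidden * (n_kv_heads * head_dim)  # q, o + k, v
router = hidden * n_experts    # one logit per expert
norms = 2 * hidden             # two normalization layers per block
embeddings = 2 * vocab * hidden  # input embedding + output head
final_norm = hidden

def param_count(experts_counted):
    """Parameters when `experts_counted` experts per layer are included."""
    per_layer = attention + router + norms + experts_counted * expert_ffn
    return layers * per_layer + embeddings + final_norm

total = param_count(n_experts)
active = param_count(experts_per_token)
print(f"Total:  {total / 1e9:.1f}B")   # Total:  46.7B
print(f"Active: {active / 1e9:.1f}B")  # Active: 12.9B
```

Attention, embeddings, and the router are shared across experts, so only the feed-forward blocks are multiplied by 8. That is why the total is well below 56B.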

---
# Part 2: SAM - Segment Anything Model

## What is SAM?
- **Purpose**: Universal image segmentation
- **Input**: Image + prompt (point, box, or mask; text prompts are explored in the paper but not included in the released code)
- **Output**: Precise object masks
- **Provider**: Meta AI

### Use Cases:
- Object isolation
- Background removal
- Image editing
- Medical imaging
- Autonomous vehicles
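
Most of these use cases start from the same operation: applying a binary mask to an image. Below is a minimal sketch of background removal, assuming the mask is a boolean array with the same height and width as the image (the form SAM's masks take). `cut_out` and the tiny synthetic image are hypothetical stand-ins.

```python
import numpy as np

def cut_out(image, mask):
    """Return an RGBA copy of `image` whose pixels outside `mask` are transparent."""
    if image.shape[:2] != mask.shape:
        raise ValueError("mask must match the image height and width")
    alpha = np.where(mask, 255, 0).astype(np.uint8)
    return np.dstack([image, alpha])

# Synthetic 4x6 gray image with a 2x3 "object" region
image = np.full((4, 6, 3), 200, dtype=np.uint8)
mask = np.zeros((4, 6), dtype=bool)
mask[1:3, 2:5] = True

rgba = cut_out(image, mask)
print(rgba.shape, int((rgba[..., 3] == 255).sum()))
```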

## Example 3: Using SAM for Image Segmentation

In [None]:
# Note: SAM requires specific installation
# pip install git+https://github.com/facebookresearch/segment-anything.git

print("SAM (Segment Anything Model) Overview:\n")
print("Architecture:")
print("- Model size: ~600M parameters")
print("- Speed: image embedding is computed once; each prompt then decodes in ~50ms")
print("- Accuracy: Strong zero-shot segmentation across many image domains")
print("\nPrompting Options:")
print("1. Point prompt: Click on object")
print("2. Box prompt: Draw bounding box")
print("3. Text prompt: Explored in the paper, not in the released code")
print("4. Automatic: Segment everything in image")

In [None]:
# Simulated SAM workflow
def demonstrate_sam_workflow():
    """
    Demonstrate SAM segmentation workflow
    """
    print("SAM Workflow Example:\n")
    
    print("Step 1: Load image as an RGB array")
    print("   image = np.array(Image.open('photo.jpg').convert('RGB'))")
    
    print("\nStep 2: Initialize SAM")
    print("   sam = sam_model_registry['vit_h'](checkpoint='sam_vit_h_4b8939.pth')")
    print("   predictor = SamPredictor(sam)")
    
    print("\nStep 3: Generate embeddings")
    print("   predictor.set_image(image)")
    
    print("\nStep 4: Prompt with point/box")
    print("   input_point = np.array([[500, 375]])  # (N, 2) array of x, y coordinates")
    print("   input_label = np.array([1])  # 1 = foreground point, 0 = background")
    
    print("\nStep 5: Get segmentation mask")
    print("   masks, scores, logits = predictor.predict(")
    print("       point_coords=input_point,")
    print("       point_labels=input_label")
    print("   )")
    
    print("\nStep 6: Use mask")
    print("   - Isolate object")
    print("   - Remove background")
    print("   - Create composite")
    
demonstrate_sam_workflow()
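
With `multimask_output=True`, `SamPredictor.predict` returns several candidate masks together with a predicted quality score for each. A common next step is to keep the highest-scoring one. The sketch below uses synthetic arrays in place of real predictor output; `select_best_mask` is a hypothetical helper, not part of the `segment_anything` package.

```python
import numpy as np

def select_best_mask(masks, scores):
    """Return the candidate mask with the highest predicted quality score."""
    best = int(np.argmax(scores))
    return masks[best], float(scores[best])

# Stand-in for predictor output: 3 candidate masks of shape (H, W) and their scores
masks = np.zeros((3, 4, 4), dtype=bool)
masks[0, :1, :1] = True
masks[1, :2, :2] = True
masks[2, :3, :3] = True
scores = np.array([0.62, 0.91, 0.88])

best_mask, best_score = select_best_mask(masks, scores)
print(int(best_mask.sum()), best_score)
```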

## Example 4: SAM Use Cases

In [None]:
sam_use_cases = {
    "E-commerce": "Remove background from product photos",
    "Medical": "Segment organs/tumors in medical imaging",
    "Photography": "Select and edit specific objects",
    "Autonomous Vehicles": "Identify pedestrians, vehicles, obstacles",
    "Agriculture": "Identify crops, diseases, pests",
    "Robotics": "Object recognition and manipulation",
    "Video Editing": "Rotoscoping and object isolation",
    "AR/VR": "Real-time object segmentation"
}

print("SAM Real-World Applications:\n")
for industry, use in sam_use_cases.items():
    print(f"• {industry}: {use}")

---
# Part 3: LCM - Latent Consistency Model

## What is LCM?
- **Purpose**: Ultra-fast image generation
- **Speed**: 1-4 steps vs 50+ for traditional diffusion
- **Quality**: Maintains high quality
- **Use**: Real-time AI art generation

## Example 5: LCM for Fast Image Generation

In [None]:
from diffusers import DiffusionPipeline

# Load LCM model
print("Loading LCM (Latent Consistency Model)...\n")

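# Use half precision only on GPU; float16 is poorly supported on CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7",
    torch_dtype=dtype
).to(device)

if device == "cuda":
    print("✓ Using GPU acceleration")
else:
    print("⚠ Running on CPU (slower)")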

print("\nLCM loaded successfully!")

In [None]:
import time

def generate_image_lcm(prompt, num_steps=4):
    """Generate image using LCM"""
    start = time.time()
    
    image = pipe(
        prompt=prompt,
        num_inference_steps=num_steps,
        guidance_scale=8.0,
        height=512,
        width=512
    ).images[0]
    
    elapsed = time.time() - start
    return image, elapsed

# Generate images
prompts = [
    "A serene landscape with mountains and lake, digital art",
    "A futuristic city at night, cyberpunk style",
    "A cute robot reading a book, cartoon style"
]

print("Generating images with LCM (4 steps):\n")

for i, prompt in enumerate(prompts, 1):
    print(f"Image {i}: {prompt}")
    image, time_taken = generate_image_lcm(prompt)
    print(f"Generation time: {time_taken:.2f}s")
    
    # Display image
    plt.figure(figsize=(6, 6))
    plt.imshow(image)
    plt.axis('off')
    plt.title(f"Prompt: {prompt[:40]}...")
    plt.show()
    print()

## Example 6: LCM vs Traditional Diffusion Speed Comparison

In [None]:
# Speed comparison (illustrative timings, not measured in this notebook;
# real numbers depend on GPU, resolution, and scheduler)
comparison = {
    "Method": ["Stable Diffusion", "SDXL", "LCM"],
    "Steps": [50, 40, 4],
    "Time (seconds)": [5.2, 8.5, 0.6],
    "Quality": ["High", "Very High", "High"]
}

print("Speed Comparison:\n")
print(f"{'Method':<20} {'Steps':<10} {'Time (s)':<12} {'Quality'}")
print("-" * 60)
for i in range(len(comparison["Method"])):
    print(f"{comparison['Method'][i]:<20} {comparison['Steps'][i]:<10} "
          f"{comparison['Time (seconds)'][i]:<12} {comparison['Quality'][i]}")

sd_time, lcm_time = comparison["Time (seconds)"][0], comparison["Time (seconds)"][-1]
print(f"\n✨ With these example numbers, LCM is ~{sd_time / lcm_time:.0f}x faster than Stable Diffusion")
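
Hard-coded timings are only indicative, so it can be worth measuring on your own hardware. The sketch below is one possible timing harness. `time_steps` is a hypothetical helper, and `fake_generate` is a stand-in whose cost grows with the step count so the block runs without a model.

```python
import time

def time_steps(generate, step_counts, repeats=3):
    """Time generate(num_steps) per step count; return the median seconds for each."""
    results = {}
    for steps in step_counts:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            generate(steps)
            samples.append(time.perf_counter() - start)
        samples.sort()
        results[steps] = samples[len(samples) // 2]
    return results

def fake_generate(num_steps):
    """Stand-in for a diffusion pipeline: work is proportional to num_steps."""
    total = 0
    for _ in range(num_steps * 20_000):
        total += 1
    return total

timings = time_steps(fake_generate, [4, 50])
for steps, seconds in timings.items():
    print(f"{steps:>3} steps: {seconds:.4f}s")
```

To time the real pipeline, `generate` could be something like `lambda n: pipe(prompt, num_inference_steps=n)`. On a GPU, call `torch.cuda.synchronize()` before reading the clock so queued kernels are included.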

---
# Part 4: MLM - Masked Language Model

## What is MLM?
- **Architecture**: BERT-style bidirectional
- **Training**: Predict masked tokens
- **Purpose**: Better suited to understanding text than generating it
- **Examples**: BERT, RoBERTa, ALBERT
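
The "predict masked tokens" objective can be sketched as follows. This follows the recipe described in the BERT paper: about 15% of tokens are selected, and of those 80% become `[MASK]`, 10% become a random token, and 10% stay unchanged. `mask_tokens` and the tiny vocabulary are hypothetical, and real implementations work on token IDs rather than strings.

```python
import random

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=0):
    """Return (inputs, labels); labels are None where no prediction is scored."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover the original token
            roll = rng.random()
            if roll < 0.8:
                inputs.append("[MASK]")           # 80%: replace with [MASK]
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the token unchanged
        else:
            inputs.append(tok)
            labels.append(None)
    return inputs, labels

tokens = "the quick brown fox jumps over the lazy dog near the river bank".split()
vocab = ["cat", "tree", "runs", "blue"]
inputs, labels = mask_tokens(tokens, vocab)
print(inputs)
print(labels)
```

Because the model sees tokens on both sides of each masked position, it learns bidirectional context, which is what makes MLMs strong at understanding tasks.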

## Example 7: Using BERT (MLM) for Text Classification

In [None]:
from transformers import pipeline

# Load DistilBERT (a distilled BERT, pretrained with masked language modeling)
# fine-tuned on SST-2 for binary sentiment analysis
print("Loading DistilBERT (MLM-pretrained) for classification...\n")

sentiment_analyzer = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english"
)

print("✓ DistilBERT model loaded!")

In [None]:
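# Sentiment analysis examples. The SST-2 model only predicts POSITIVE or NEGATIVE,
# so neutral or mixed texts are forced into one of the two labels.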
texts = [
    "This product is absolutely amazing! Best purchase ever!",
    "Terrible quality. Very disappointed with this.",
    "It's okay, nothing special but works fine.",
    "I love the design but the price is too high.",
    "Fast shipping and excellent customer service!"
]

print("Sentiment Analysis with DistilBERT:\n")

for text in texts:
    result = sentiment_analyzer(text)[0]
    sentiment = result['label']
    confidence = result['score']
    
    print(f"Text: {text}")
    print(f"Sentiment: {sentiment} (confidence: {confidence:.2%})\n")

## Example 8: MLM Fill-Mask Task

In [None]:
# Load BERT for fill-mask
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# Examples with masked tokens
sentences = [
    "The capital of France is [MASK].",
    "Python is a [MASK] programming language.",
    "The [MASK] is shining brightly today.",
    "Artificial [MASK] is transforming technology."
]

print("Fill-Mask Task (Predicting Masked Words):\n")

for sentence in sentences:
    results = fill_mask(sentence, top_k=3)
    print(f"Input: {sentence}\n")
    print("Top predictions:")
    for i, result in enumerate(results, 1):
        word = result['token_str'].strip()
        score = result['score']
        print(f"  {i}. '{word}' (confidence: {score:.2%})")
    print()

## Example 9: Named Entity Recognition (NER) with MLM

In [None]:
# Load NER pipeline; aggregation_strategy="simple" merges WordPiece tokens
# and B-/I- tags into whole entities
ner = pipeline(
    "ner",
    model="dbmdz/bert-large-cased-finetuned-conll03-english",
    aggregation_strategy="simple"
)

text = """
Apple Inc. was founded by Steve Jobs in Cupertino, California.
The company released the iPhone in 2007, which revolutionized
the smartphone industry. Tim Cook became CEO in 2011.
"""

print("Named Entity Recognition:\n")
print(f"Text: {text}\n")

entities = ner(text)

# Group entities by type, skipping duplicates
entity_groups = {}
for entity in entities:
    words = entity_groups.setdefault(entity['entity_group'], [])
    if entity['word'] not in words:
        words.append(entity['word'])

print("Extracted Entities:\n")
for entity_type, words in entity_groups.items():
    print(f"{entity_type}: {', '.join(words)}")
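
Without an aggregation strategy, the `ner` pipeline returns one prediction per WordPiece token, with `B-`/`I-` tag prefixes and `##` marking continuation pieces. The sketch below shows one way such tokens could be merged into whole entities. `merge_entities` is a hypothetical helper and the sample predictions are hand-written; it assumes each token dict carries the `entity`, `word`, and `index` keys and ignores punctuation handling.

```python
def merge_entities(tokens):
    """Merge token-level NER predictions into (label, text) spans."""
    spans = []  # each item: [label, text, index of last token]
    for tok in tokens:
        prefix, _, label = tok["entity"].partition("-")
        word = tok["word"]
        adjacent = bool(spans) and tok["index"] == spans[-1][2] + 1
        if adjacent and word.startswith("##"):
            spans[-1][1] += word[2:]   # glue a subword piece onto the previous word
            spans[-1][2] = tok["index"]
        elif adjacent and prefix == "I" and spans[-1][0] == label:
            spans[-1][1] += " " + word  # next word of the same entity
            spans[-1][2] = tok["index"]
        else:
            spans.append([label, word, tok["index"]])  # start a new entity
    return [(label, text) for label, text, _ in spans]

# Hand-written sample predictions (not real model output)
sample = [
    {"entity": "I-PER", "word": "Steve", "index": 1},
    {"entity": "I-PER", "word": "Job", "index": 2},
    {"entity": "I-PER", "word": "##s", "index": 3},
    {"entity": "I-LOC", "word": "Cup", "index": 6},
    {"entity": "I-LOC", "word": "##ert", "index": 7},
    {"entity": "I-LOC", "word": "##ino", "index": 8},
    {"entity": "I-LOC", "word": "California", "index": 10},
]
print(merge_entities(sample))
# [('PER', 'Steve Jobs'), ('LOC', 'Cupertino'), ('LOC', 'California')]
```

Checking token adjacency via `index` keeps separate entities of the same type, such as two locations split by a comma, from being merged into one.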

## Example 10: Question Answering with MLM

In [None]:
# Load QA model
qa_model = pipeline("question-answering", model="distilbert-base-cased-distilled-squad")

context = """
The Transformer architecture was introduced in the paper 'Attention Is All You Need' 
by Vaswani et al. in 2017. It revolutionized natural language processing by using 
self-attention mechanisms instead of recurrent neural networks. The architecture 
consists of an encoder and decoder, each with multiple layers. BERT, GPT, and many 
modern language models are based on this architecture.
"""

questions = [
    "Who introduced the Transformer architecture?",
    "When was the Transformer introduced?",
    "What does the Transformer use instead of RNNs?",
    "What models are based on Transformer architecture?"
]

print("Question Answering with DistilBERT:\n")
print(f"Context: {context}\n")
print("Questions and Answers:\n")

for question in questions:
    result = qa_model(question=question, context=context)
    print(f"Q: {question}")
    print(f"A: {result['answer']} (confidence: {result['score']:.2%})\n")

## MLM vs LLM Comparison

In [None]:
print("MLM vs LLM Comparison:\n")

comparison = [
    ["Aspect", "MLM (BERT)", "LLM (GPT)"],
    ["-" * 20, "-" * 30, "-" * 30],
    ["Architecture", "Bidirectional (Encoder)", "Unidirectional (Decoder)"],
    ["Training", "Masked token prediction", "Next token prediction"],
    ["Best For", "Understanding & Classification", "Generation & Completion"],
    ["Context", "Full sentence context", "Left-to-right context"],
    ["Speed", "Fast (single forward pass)", "Slower (one pass per token)"],
    ["Use Cases", "NER, Classification, QA", "Text generation, Chat"],
    ["Size", "110M - 340M params (BERT)", "~100M to 400B+ params"],
    ["Generation", "❌ Weak", "✅ Strong"],
    ["Understanding", "✅ Strong", "✅ Strong"]
]

for row in comparison:
    print(f"{row[0]:<20} {row[1]:<30} {row[2]:<30}")
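
The "Bidirectional" versus "Unidirectional" row comes down to the attention mask. A minimal NumPy sketch, where entry `[i, j]` is 1 if token `i` may attend to token `j` (the 4-token sequence length is arbitrary):

```python
import numpy as np

seq_len = 4

# Encoder (BERT-style): every token attends to every other token
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)

# Decoder (GPT-style): each token attends only to itself and earlier tokens
causal_mask = np.tril(bidirectional_mask)

print("Bidirectional:\n", bidirectional_mask)
print("Causal:\n", causal_mask)
```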

---
# Summary

## Models Covered:

### 1. MOE (Mixture of Experts)
- ✅ Efficient scaling with expert routing
- ✅ Mixtral 8x7B: ~12.9B active out of 46.7B total parameters
- ✅ Best for: Multi-domain tasks efficiently

### 2. SAM (Segment Anything Model)
- ✅ Universal image segmentation
- ✅ Multiple prompting options
- ✅ Best for: Object isolation, medical imaging, editing

### 3. LCM (Latent Consistency Model)
- ✅ Ultra-fast image generation (4 steps)
- ✅ About 10x fewer denoising steps than traditional diffusion (4 vs. 40-50)
- ✅ Best for: Real-time AI art, rapid prototyping

### 4. MLM (Masked Language Model)
- ✅ Bidirectional text understanding
- ✅ Excellent for classification and NER
- ✅ Best for: Sentiment analysis, QA, entity extraction

## Key Takeaways:

| Model | When to Use |
|-------|-------------|
| **MOE** | Need efficiency with multi-domain expertise |
| **SAM** | Image segmentation, object isolation |
| **LCM** | Fast image generation, real-time applications |
| **MLM** | Text classification, NER, understanding tasks |

## Next Steps:
- Combine models for complex workflows
- Explore LMM/MLLM for multimodal tasks
- Learn about LAM for action-taking AI
- Study communication protocols (MCP, A2A)