[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ruliana/pytorch-katas/blob/main/dan_1/kata_06_suki_dual_behavior_predictor_unrevised.ipynb)

## 🏮 The Ancient Scroll Unfurls 🏮

# THE TWIN MYSTERIES OF SUKI'S NATURE: CLASSIFICATION MEETS REGRESSION
Dan Level: 1 (Temple Sweeper) | Time: 45 minutes | Sacred Arts: Multi-output learning, Classification + Regression, Dual loss functions

## 📜 THE CHALLENGE

Master Pai-Torch sits in contemplative silence, observing Suki's graceful movements around the temple courtyard. "Grasshopper," the ancient master finally speaks, "you have learned to predict single truths - when Suki will eat, whether doors will stick, which ceremonies are occurring. But observe closely: Suki's behavior contains not one mystery, but two intertwined enigmas that dance together like yin and yang." The master's eyes gleam with ancient wisdom. "She exists in distinct modes - the focused hunter tracking temple mice, and the serene napper basking in sunbeams. Yet within each mode, her energy pulses with varying intensity, from barely stirring to explosive action."

"The ultimate Dan 1 challenge awaits you: create a single neural network that can simultaneously classify Suki's behavioral mode AND predict her activity intensity level. You must learn to weave together the discrete art of classification with the continuous flow of regression, understanding when to use each approach for different aspects of the same phenomenon. This is the sacred dual nature of machine learning - knowing both what category something belongs to, and how much of that category it embodies."

## 🎯 THE SACRED OBJECTIVES

- [ ] **Master Multi-Output Networks**: Build a single model that produces two different types of predictions
- [ ] **Understand Classification vs Regression**: Learn when each approach serves different aspects of the same problem
- [ ] **Combine Different Loss Functions**: Calculate separate losses for classification and regression outputs
- [ ] **Interpret Dual Predictions**: Understand how to read and validate both categorical and continuous outputs
- [ ] **Visualize Complex Relationships**: Create plots that show both discrete categories and continuous values


In [None]:
# 📦 FIRST CELL - ALL IMPORTS AND CONFIGURATION
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import matplotlib.pyplot as plt
import numpy as np
from typing import Tuple

# Global configuration constants
DEFAULT_CHAOS_LEVEL = 0.15
SACRED_SEED = 42
HUNTING_MODE = 0
NAPPING_MODE = 1

# Set reproducibility
torch.manual_seed(SACRED_SEED)

## 🐱 THE SACRED DATA GENERATION SCROLL

*Master Pai-Torch gestures toward Suki, who is currently stretched luxuriously in a sunbeam*

"Observe, Grasshopper. Suki's behavior follows ancient patterns that even she may not fully understand. Her activity level depends on the time since her last meal and the temple's ambient energy, but her mode - hunting versus napping - follows deeper rhythms that shift like the temple's spiritual tides."

In [None]:
def generate_suki_dual_behavior_data(n_observations: int = 200, chaos_level: float = 0.15,
                                   sacred_seed: int = 42) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    """
    Generate observations of Suki's dual nature: behavioral mode and activity intensity.

    Ancient wisdom reveals:
    - Mode depends on: spiritual_energy threshold (hunting if > 0.4, napping otherwise)
    - Intensity follows: base_intensity + hours_factor * hours_since_meal + spiritual_boost
    - Hunting mode: base_intensity = 30, hours_factor = 3.0, spiritual_boost = spiritual_energy * 40
    - Napping mode: base_intensity = 10, hours_factor = 1.5, spiritual_boost = (1-spiritual_energy) * 25

    Args:
        n_observations: Number of Suki behavior observations
        chaos_level: Amount of feline unpredictability (0.0 = perfectly predictable, 1.0 = pure chaos)
        sacred_seed: Ensures consistent mystical randomness

    Returns:
        Tuple of (input_features, behavioral_modes, activity_intensities)
        - input_features: [hours_since_meal, spiritual_energy] (n_observations, 2)
        - behavioral_modes: 0=hunting, 1=napping (n_observations,) as long tensor
        - activity_intensities: 0-100 energy level (n_observations, 1) as float tensor
    """
    torch.manual_seed(sacred_seed)
    
    # Suki's input features
    hours_since_meal = torch.rand(n_observations) * 8  # 0-8 hours range
    spiritual_energy = torch.rand(n_observations)      # 0-1 temple energy level
    
    # Combine features for input
    input_features = torch.stack([hours_since_meal, spiritual_energy], dim=1)
    
    # Determine behavioral mode (classification target)
    # Hunting mode (0) when spiritual energy > 0.4, napping mode (1) otherwise
    behavioral_modes = (spiritual_energy <= 0.4).long()  # 0=hunting, 1=napping
    
    # Calculate activity intensity (regression target)
    activity_intensities = torch.zeros(n_observations)
    
    # Hunting mode calculations
    hunting_mask = behavioral_modes == 0
    activity_intensities[hunting_mask] = (
        30 +  # base hunting intensity
        3.0 * hours_since_meal[hunting_mask] +  # hunger drives activity
        spiritual_energy[hunting_mask] * 40  # spiritual boost for hunters
    )
    
    # Napping mode calculations
    napping_mask = behavioral_modes == 1
    activity_intensities[napping_mask] = (
        10 +  # base napping intensity
        1.5 * hours_since_meal[napping_mask] +  # mild hunger effect
        (1 - spiritual_energy[napping_mask]) * 25  # boost grows as temple energy drops
    )
    
    # Add feline chaos to intensity (cats are wonderfully unpredictable)
    chaos = torch.randn(n_observations) * chaos_level * activity_intensities.std()
    activity_intensities = activity_intensities + chaos
    
    # Keep intensities within reasonable bounds
    activity_intensities = torch.clamp(activity_intensities, 0, 100)
    
    return input_features, behavioral_modes, activity_intensities.unsqueeze(1)

def visualize_suki_mysteries(features: torch.Tensor, modes: torch.Tensor, 
                           intensities: torch.Tensor, predictions: Tuple[torch.Tensor, torch.Tensor] = None):
    """Display the twin mysteries of Suki's behavioral patterns."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    hours = features[:, 0].numpy()
    spiritual = features[:, 1].numpy()
    modes_np = modes.numpy()
    intensities_np = intensities.squeeze().numpy()
    
    # Plot 1: Mode Classification (discrete categories)
    hunting_mask = modes_np == 0
    napping_mask = modes_np == 1
    
    ax1.scatter(hours[hunting_mask], spiritual[hunting_mask], 
               c='red', alpha=0.6, label='Hunting Mode', s=60)
    ax1.scatter(hours[napping_mask], spiritual[napping_mask], 
               c='blue', alpha=0.6, label='Napping Mode', s=60)
    
    ax1.axhline(y=0.4, color='purple', linestyle='--', alpha=0.7,
               label='Mode Threshold (Spiritual Energy = 0.4)')
    ax1.set_xlabel('Hours Since Last Meal')
    ax1.set_ylabel('Temple Spiritual Energy')
    ax1.set_title('Suki\'s Behavioral Mode Classification')
    ax1.legend()
    ax1.grid(True, alpha=0.3)
    
    # Plot 2: Activity Intensity Regression (continuous values)
    scatter = ax2.scatter(hours, spiritual, c=intensities_np, 
                         cmap='viridis', alpha=0.7, s=60)
    plt.colorbar(scatter, ax=ax2, label='Activity Intensity (0-100)')
    
    if predictions is not None:
        pred_modes, pred_intensities = predictions
        # Show prediction accuracy with different markers
        correct_mode = (torch.argmax(pred_modes, dim=1) == modes)
        ax2.scatter(hours[~correct_mode.numpy()], spiritual[~correct_mode.numpy()], 
                   marker='x', c='red', s=100, alpha=0.8, label='Mode Prediction Error')
        
    ax2.set_xlabel('Hours Since Last Meal')
    ax2.set_ylabel('Temple Spiritual Energy')
    ax2.set_title('Suki\'s Activity Intensity Regression')
    if predictions is not None:
        ax2.legend()
    ax2.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()

# Generate and visualize the sacred data
features, modes, intensities = generate_suki_dual_behavior_data()
print(f"Generated {len(features)} observations of Suki's dual nature")
print(f"Input features shape: {features.shape} (hours_since_meal, spiritual_energy)")
print(f"Behavioral modes shape: {modes.shape} (0=hunting, 1=napping)")
print(f"Activity intensities shape: {intensities.shape} (0-100 energy level)")
print(f"\nMode distribution: {torch.bincount(modes)} [hunting, napping]")
print(f"Intensity range: {intensities.min():.1f} - {intensities.max():.1f}")

visualize_suki_mysteries(features, modes, intensities)
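
The rules listed in the generator's docstring can be checked by hand for a single observation. The sketch below is a plain-Python restatement of those noise-free rules. The helper name `expected_suki_behavior` is hypothetical and not part of the kata. It ignores the chaos term, so real samples will scatter around these values.

```python
def expected_suki_behavior(hours_since_meal: float, spiritual_energy: float) -> tuple:
    """Noise-free mode and intensity for one observation, following the docstring rules."""
    if spiritual_energy > 0.4:
        mode = 0  # hunting
        intensity = 30 + 3.0 * hours_since_meal + spiritual_energy * 40
    else:
        mode = 1  # napping
        intensity = 10 + 1.5 * hours_since_meal + (1 - spiritual_energy) * 25
    # Mirror the 0-100 clamp applied by the generator
    return mode, max(0.0, min(100.0, intensity))


hunting_example = expected_suki_behavior(2.0, 0.8)  # 30 + 6 + 32
napping_example = expected_suki_behavior(6.0, 0.2)  # 10 + 9 + 20
print(hunting_example)
print(napping_example)
```

Two hours after a meal with high temple energy gives hunting mode at an intensity of about 68. Six hours after a meal with low energy gives napping mode at about 39.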

## 💃 FIRST MOVEMENTS: THE DUAL-OUTPUT NEURAL NETWORK

*Master Pai-Torch raises a weathered hand*

"Now comes the true test, Grasshopper. You must forge a single neural network that speaks two languages - the binary whispers of classification and the flowing songs of regression. Like a master calligrapher who can write both poetry and accounting ledgers with the same brush, your model must produce both categorical truths and continuous wisdom."

In [None]:
class SukiDualBehaviorPredictor(nn.Module):
    """A mystical artifact that understands both the discrete and continuous nature of Suki."""

    def __init__(self, input_features: int = 2, hidden_size: int = 16):
        super().__init__()
        
        # TODO: Create the shared hidden layer
        # Hint: This layer learns features useful for both tasks
        self.shared_layer = None
        
        # TODO: Create the classification head (mode prediction)
        # Hint: Output should have 2 neurons for binary classification (hunting vs napping)
        self.mode_classifier = None
        
        # TODO: Create the regression head (intensity prediction)
        # Hint: Output should have 1 neuron for continuous intensity values
        self.intensity_regressor = None

    def forward(self, features: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        """Channel dual wisdom through the mystical network."""
        # TODO: Pass input through shared hidden layer with ReLU activation
        # Hint: F.relu() applies the activation function
        hidden = None
        
        # TODO: Get mode classification logits (raw outputs)
        # Hint: Don't apply softmax here - let the loss function handle it
        mode_logits = None
        
        # TODO: Get intensity regression output
        # Hint: No activation needed for regression output
        intensity_output = None
        
        return mode_logits, intensity_output

def train_dual_model(model: nn.Module, features: torch.Tensor, mode_targets: torch.Tensor, 
                    intensity_targets: torch.Tensor, epochs: int = 1500, learning_rate: float = 0.01) -> dict:
    """
    Train the dual-output model with separate losses for classification and regression.

    Returns:
        Dictionary containing training history for both tasks
    """
    # TODO: Choose classification loss function
    # Hint: CrossEntropyLoss takes raw logits and integer class labels (it applies log-softmax internally)
    classification_criterion = None
    
    # TODO: Choose regression loss function
    # Hint: MSELoss works well for continuous value prediction
    regression_criterion = None

    # TODO: Choose optimizer for all model parameters
    # Hint: Adam optimizer often works well for multi-task learning
    optimizer = None

    # Track training progress
    history = {
        'classification_losses': [],
        'regression_losses': [],
        'total_losses': [],
        'mode_accuracies': []
    }

    for epoch in range(epochs):
        # TODO: CRITICAL - Clear gradients from previous iteration
        # Hint: Always banish the old gradient spirits first
        
        # TODO: Forward pass - get both predictions
        mode_logits, intensity_predictions = None, None

        # TODO: Calculate classification loss
        # Hint: Compare mode_logits with mode_targets
        classification_loss = None

        # TODO: Calculate regression loss
        # Hint: Compare intensity_predictions with intensity_targets
        regression_loss = None

        # TODO: Combine both losses (equal weighting for now)
        # Hint: total_loss = classification_loss + regression_loss
        total_loss = None

        # TODO: Backward pass - compute gradients
        
        # TODO: Update parameters

        # Track progress
        history['classification_losses'].append(classification_loss.item())
        history['regression_losses'].append(regression_loss.item())
        history['total_losses'].append(total_loss.item())
        
        # Calculate mode prediction accuracy
        with torch.no_grad():
            predicted_modes = torch.argmax(mode_logits, dim=1)
            accuracy = (predicted_modes == mode_targets).float().mean().item()
            history['mode_accuracies'].append(accuracy)

        # Report progress to Master Pai-Torch
        if (epoch + 1) % 300 == 0:
            print(f'Epoch [{epoch+1}/{epochs}]:')
            print(f'  Classification Loss: {classification_loss.item():.4f}')
            print(f'  Regression Loss: {regression_loss.item():.4f}')
            print(f'  Total Loss: {total_loss.item():.4f}')
            print(f'  Mode Accuracy: {accuracy:.1%}')
            if accuracy > 0.85 and regression_loss.item() < 100:
                print("  💫 The dual spirits of learning dance in harmony!")
            print()

    return history

def plot_dual_training_progress(history: dict):
    """Visualize the journey of dual learning."""
    fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(15, 10))
    
    epochs = range(1, len(history['total_losses']) + 1)
    
    # Total loss
    ax1.plot(epochs, history['total_losses'], 'purple', linewidth=2)
    ax1.set_title('Total Combined Loss')
    ax1.set_xlabel('Epoch')
    ax1.set_ylabel('Loss')
    ax1.grid(True, alpha=0.3)
    
    # Classification loss
    ax2.plot(epochs, history['classification_losses'], 'red', linewidth=2)
    ax2.set_title('Classification Loss (Mode Prediction)')
    ax2.set_xlabel('Epoch')
    ax2.set_ylabel('Cross-Entropy Loss')
    ax2.grid(True, alpha=0.3)
    
    # Regression loss
    ax3.plot(epochs, history['regression_losses'], 'blue', linewidth=2)
    ax3.set_title('Regression Loss (Intensity Prediction)')
    ax3.set_xlabel('Epoch')
    ax3.set_ylabel('MSE Loss')
    ax3.grid(True, alpha=0.3)
    
    # Mode accuracy
    ax4.plot(epochs, [acc * 100 for acc in history['mode_accuracies']], 'green', linewidth=2)
    ax4.axhline(y=85, color='gold', linestyle='--', alpha=0.7, label='Mastery Threshold (85%)')
    ax4.set_title('Mode Classification Accuracy')
    ax4.set_xlabel('Epoch')
    ax4.set_ylabel('Accuracy (%)')
    ax4.legend()
    ax4.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
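
Before filling in the training loop, it may help to confirm what each loss function expects. The sketch below uses random stand-in tensors rather than a real model. The tensor values and variable names are illustrative assumptions. It shows that `nn.CrossEntropyLoss` pairs `(batch, 2)` float logits with `(batch,)` long class indices, while `nn.MSELoss` pairs float predictions and float targets of the same shape.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

batch_size = 4
# Stand-ins for the two outputs of a dual-head model (random, for shape checking only)
mode_logits = torch.randn(batch_size, 2)      # raw scores, one column per mode
intensity_pred = torch.randn(batch_size, 1)   # one continuous value per row

# Illustrative targets: class indices (long) and intensities (float)
mode_targets = torch.tensor([0, 1, 0, 1])
intensity_targets = torch.tensor([[68.0], [39.0], [57.0], [38.5]])

classification_loss = nn.CrossEntropyLoss()(mode_logits, mode_targets)
regression_loss = nn.MSELoss()(intensity_pred, intensity_targets)

# Both losses reduce to 0-dim scalars, so they can be added together
print(classification_loss.shape, regression_loss.shape)
print(mode_targets.dtype, intensity_targets.dtype)
```

If the intensity targets had shape `(4,)` instead of `(4, 1)`, `nn.MSELoss` would broadcast against the `(4, 1)` predictions and emit a warning. This is why the generator returns intensities with `unsqueeze(1)`.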

## ⚡ THE TRIALS OF MASTERY

*Master Pai-Torch's eyes gleam with anticipation*

"The moment of truth approaches, Grasshopper. Your dual-natured network must prove its worth through four sacred trials. Only when both the discrete and continuous spirits bow to your will shall you achieve true mastery."

### Trial 1: Basic Dual Mastery
- [ ] Classification loss decreases consistently (the category spirits align)
- [ ] Regression loss decreases consistently (the intensity flows harmoniously)
- [ ] Final mode classification accuracy above 85% (Suki approves of your categorical wisdom)
- [ ] Final intensity regression MSE below 100 (your continuous predictions flow true)
- [ ] Model produces outputs of correct shapes for both tasks
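
One way to check a training history against this checklist is sketched below. The helper `check_trial_one` is hypothetical. It reads "decreases consistently" as "the mean of the last 10% of epochs is below the mean of the first 10%", which is an assumption and not the kata's definition. The history values are made up purely to exercise the helper.

```python
def check_trial_one(history: dict, accuracy_threshold: float = 0.85,
                    mse_threshold: float = 100.0) -> dict:
    """Summarise a training history against the Trial 1 checklist."""
    def decreased(losses):
        # Assumption: compare the mean of the last 10% with the mean of the first 10%
        window = max(1, len(losses) // 10)
        return sum(losses[-window:]) / window < sum(losses[:window]) / window

    return {
        'classification_loss_decreased': decreased(history['classification_losses']),
        'regression_loss_decreased': decreased(history['regression_losses']),
        'accuracy_ok': history['mode_accuracies'][-1] > accuracy_threshold,
        'mse_ok': history['regression_losses'][-1] < mse_threshold,
    }


# Made-up history with the same keys that train_dual_model returns
toy_history = {
    'classification_losses': [0.70, 0.50, 0.30, 0.20],
    'regression_losses': [900.0, 400.0, 150.0, 60.0],
    'total_losses': [900.7, 400.5, 150.3, 60.2],
    'mode_accuracies': [0.55, 0.70, 0.85, 0.92],
}
trial_report = check_trial_one(toy_history)
print(trial_report)
```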

### Trial 2: Understanding Test

In [None]:
def test_your_dual_wisdom(model):
    """Master Pai-Torch's evaluation of your dual understanding."""
    model.eval()  # Set to evaluation mode
    
    # Test with known scenarios
    test_features = torch.tensor([
        [2.0, 0.8],  # Short hunger, high spiritual energy → hunting mode, noise-free intensity ≈ 68
        [6.0, 0.2],  # Long hunger, low spiritual energy → napping mode, noise-free intensity ≈ 39
        [1.0, 0.6],  # Very short hunger, medium-high energy → hunting mode, noise-free intensity ≈ 57
        [4.0, 0.1]   # Medium hunger, very low energy → napping mode, noise-free intensity ≈ 38.5
    ])
    
    with torch.no_grad():
        mode_logits, intensity_predictions = model(test_features)
    
    # Check output shapes
    assert mode_logits.shape == (4, 2), f"Mode logits shape should be (4, 2), got {mode_logits.shape}"
    assert intensity_predictions.shape == (4, 1), f"Intensity shape should be (4, 1), got {intensity_predictions.shape}"
    
    # Convert logits to probabilities and predictions
    mode_probabilities = F.softmax(mode_logits, dim=1)
    predicted_modes = torch.argmax(mode_logits, dim=1)
    
    print("🔮 Test Scenario Analysis:")
    scenarios = [
        "Short hunger (2h), High energy (0.8)",
        "Long hunger (6h), Low energy (0.2)", 
        "Very short hunger (1h), Medium-high energy (0.6)",
        "Medium hunger (4h), Very low energy (0.1)"
    ]
    
    for i, scenario in enumerate(scenarios):
        mode_pred = "Hunting" if predicted_modes[i] == 0 else "Napping"
        mode_conf = mode_probabilities[i, predicted_modes[i]].item()
        intensity_pred = intensity_predictions[i, 0].item()
        
        print(f"  {scenario}:")
        print(f"    Mode: {mode_pred} (confidence: {mode_conf:.1%})")
        print(f"    Intensity: {intensity_pred:.1f}")
    
    # Logical consistency checks
    expected_modes = [0, 1, 0, 1]  # Based on spiritual energy thresholds
    mode_accuracy = (predicted_modes == torch.tensor(expected_modes)).float().mean()
    
    assert mode_accuracy >= 0.75, f"Mode prediction accuracy {mode_accuracy:.1%} - the categorical spirits need more training!"
    assert torch.all(intensity_predictions >= 0), "Intensity predictions must be non-negative!"
    assert torch.all(intensity_predictions <= 150), "Intensity predictions seem unreasonably high!"
    
    print(f"\n✨ Logical Consistency: {mode_accuracy:.1%} of modes match expected patterns")
    print("🎉 Master Pai-Torch nods with deep satisfaction - your dual understanding blooms like a lotus!")
    
    model.train()  # Return to training mode

## 🌸 THE FOUR PATHS OF MASTERY: PROGRESSIVE EXTENSIONS

*Master Pai-Torch settles into a meditative posture*

"Your foundation grows strong, but true mastery requires exploration of the deeper mysteries. Four paths stretch before you, each revealing new aspects of the dual nature of learning."

### Extension 1: Cook Oh-Pai-Timizer's Recipe Balance
*"A master chef knows that different dishes require different seasonings!"*

*Cook Oh-Pai-Timizer bustles over, carrying two different-sized ladles*

"Ah, Grasshopper! I see you've learned to cook both savory and sweet dishes simultaneously. But notice - in my kitchen, I don't add equal amounts of salt and sugar to every recipe! Sometimes the main dish needs more attention, sometimes the dessert. Your dual model is like a complex meal - each part may need different amounts of attention."

**NEW CONCEPTS**: Loss weighting, balancing different task importance, hyperparameter tuning
**DIFFICULTY**: +15% (still Dan 1, but more nuanced training)
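
The ladle metaphor comes down to scale: at the start of training the two losses can differ by several orders of magnitude. The sketch below uses an idealised "untrained" state with all-zero outputs and three illustrative intensity targets. These are assumptions for the arithmetic, not outputs of the kata's model. It shows how a regression weight of 0.01 changes each loss's share of the total.

```python
import math
import torch
import torch.nn as nn

# All-zero logits mean a 50/50 prediction, so cross-entropy is ln(2) for any targets
mode_logits = torch.zeros(3, 2)
mode_targets = torch.tensor([0, 1, 0])
classification_loss = nn.CrossEntropyLoss()(mode_logits, mode_targets)

# Predicting 0 for intensities on a 0-100 scale gives a far larger MSE
intensity_pred = torch.zeros(3, 1)
intensity_targets = torch.tensor([[20.0], [50.0], [80.0]])  # illustrative values
regression_loss = nn.MSELoss()(intensity_pred, intensity_targets)

unweighted_total = classification_loss + regression_loss
weighted_total = 1.0 * classification_loss + 0.01 * regression_loss

unweighted_share = (regression_loss / unweighted_total).item()
weighted_share = (0.01 * regression_loss / weighted_total).item()

print(f"classification loss: {classification_loss.item():.3f}")
print(f"regression loss:     {regression_loss.item():.1f}")
print(f"regression share of total, unweighted: {unweighted_share:.1%}")
print(f"regression share of total, weighted:   {weighted_share:.1%}")
```

With equal weights the regression term makes up almost all of the total, so its gradients dominate what the shared layer learns. Down-weighting it gives the classification term a visible share.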

In [None]:
def train_with_balanced_losses(model: nn.Module, features: torch.Tensor, mode_targets: torch.Tensor,
                              intensity_targets: torch.Tensor, classification_weight: float = 1.0,
                              regression_weight: float = 0.01, epochs: int = 1500) -> dict:
    """
    Train with weighted losses to balance the importance of classification vs regression.
    
    Args:
        classification_weight: How much to emphasize mode prediction accuracy
        regression_weight: How much to emphasize intensity prediction accuracy
        
    Returns:
        Training history with weighted losses
    """
    # TODO: Implement weighted training
    # Hint: total_loss = classification_weight * classification_loss + regression_weight * regression_loss
    # Try weight combinations with different ratios, e.g. (1.0, 0.1), (1.0, 0.01), (1.0, 0.001)
    pass

def compare_loss_weightings():
    """Compare different loss weighting strategies."""
    # TODO: Train multiple models with different loss weightings
    # Compare final accuracies and regression errors
    # Which weighting gives the best balance?
    pass

# TRIAL: Find the optimal loss weighting for your dual-task model
# SUCCESS: Achieve balanced performance where both tasks perform well
# INSIGHT: Learn that classification and regression losses have different scales

### Extension 2: He-Ao-World's Observation Uncertainty
*"These old eyes sometimes see things that might not be quite right..."*

*He-Ao-World shuffles over, looking slightly embarrassed*

"Oh dear! I've been observing Suki for months now, but I'm starting to worry... Some of my observations might be a bit uncertain. Sometimes I write down 'hunting' when Suki might have been transitioning between modes. And my intensity measurements? Well, let's just say these old eyes aren't as sharp as they used to be for fine distinctions."

**NEW CONCEPTS**: Model confidence, prediction uncertainty, probabilistic outputs
**DIFFICULTY**: +25% (still Dan 1, but with uncertainty quantification)
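
As a warm-up on toy numbers, the sketch below turns three hand-written logit rows into probabilities and flags the ones whose top probability falls below 0.6. The logits are hypothetical and do not come from a trained model. Note that a softmax probability measures how far apart the logits are. It is not a calibrated estimate of correctness.

```python
import torch
import torch.nn.functional as F

# Hypothetical logits for three observations (columns: hunting, napping)
example_logits = torch.tensor([
    [3.0, -1.0],   # clearly favours hunting
    [0.2, 0.0],    # nearly a coin flip
    [-2.0, 2.0],   # clearly favours napping
])

probabilities = F.softmax(example_logits, dim=1)        # each row sums to 1
confidence, predicted_mode = probabilities.max(dim=1)   # top probability and its class index
is_uncertain = confidence < 0.6                         # illustrative threshold

for row in range(example_logits.shape[0]):
    print(f"mode={predicted_mode[row].item()} "
          f"confidence={confidence[row].item():.3f} "
          f"uncertain={is_uncertain[row].item()}")
```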

In [None]:
def analyze_prediction_confidence(model: nn.Module, features: torch.Tensor):
    """
    Analyze how confident your model is in its dual predictions.
    
    Returns:
        Dictionary with confidence metrics for both tasks
    """
    model.eval()
    with torch.no_grad():
        mode_logits, intensity_predictions = model(features)
        
        # TODO: Calculate mode prediction confidence
        # Hint: Use softmax to get probabilities, then look at the max probability
        mode_probabilities = None
        mode_confidence = None  # Maximum probability for each prediction
        
        # TODO: Identify uncertain predictions
        # Hint: Low confidence means the model is unsure
        uncertain_threshold = 0.6  # Below this confidence, we're uncertain
        uncertain_predictions = None
        
        return {
            'mode_confidences': mode_confidence,
            'uncertain_count': uncertain_predictions.sum().item(),
            'average_confidence': mode_confidence.mean().item()
        }

def visualize_uncertainty(features: torch.Tensor, modes: torch.Tensor, 
                         intensities: torch.Tensor, model: nn.Module):
    """Show where your model is most and least confident."""
    # TODO: Create a visualization showing:
    # - Original data points
    # - Color-coded by model confidence
    # - Mark uncertain predictions with different symbols
    pass

# TRIAL: Identify which types of Suki behavior are hardest to predict
# SUCCESS: Understand when your model is confident vs uncertain
# MASTERY: Learn that knowing what you don't know is as important as knowing what you do know

### Extension 3: Master Pai-Torch's Architecture Wisdom
*"The structure of understanding shapes the nature of wisdom."*

*Master Pai-Torch draws patterns in the temple sand*

"Observe, Grasshopper. Your current network shares all learning between both tasks - a single hidden layer serves both classification and regression. But consider: might some knowledge be specific to each task? Perhaps the hunting-detection spirits require different wisdom than the intensity-measuring spirits. Sometimes, partial separation leads to greater harmony."

**NEW CONCEPTS**: Multi-task architecture design, shared vs task-specific layers, architectural choices
**DIFFICULTY**: +35% (still Dan 1, but exploring architecture variations)
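
Part of the trade-off the master describes is model size. The sketch below only counts parameters for the two layouts, using the default sizes from this kata (16 hidden units for the simple model, 12 shared plus 8 task-specific for the enhanced one). It does not define a forward pass. The `nn.ModuleDict` containers and the `count_parameters` helper are illustrative stand-ins, not the kata's classes.

```python
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    """Total number of trainable scalars in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)


# Layout 1: one shared layer feeding both heads directly
fully_shared = nn.ModuleDict({
    'shared': nn.Linear(2, 16),
    'mode_head': nn.Linear(16, 2),
    'intensity_head': nn.Linear(16, 1),
})

# Layout 2: a shared layer, then one private hidden layer per task
partially_shared = nn.ModuleDict({
    'shared': nn.Linear(2, 12),
    'mode_hidden': nn.Linear(12, 8),
    'intensity_hidden': nn.Linear(12, 8),
    'mode_head': nn.Linear(8, 2),
    'intensity_head': nn.Linear(8, 1),
})

fully_shared_count = count_parameters(fully_shared)
partially_shared_count = count_parameters(partially_shared)
print(f"fully shared:     {fully_shared_count} parameters")
print(f"partially shared: {partially_shared_count} parameters")
```

Each `nn.Linear(in, out)` contributes `in * out + out` parameters, so the counts are 48 + 34 + 17 = 99 and 36 + 104 + 104 + 18 + 9 = 271. The partially shared layout has roughly 2.7 times the capacity, which is worth keeping in mind when comparing accuracy.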

In [None]:
class EnhancedSukiPredictor(nn.Module):
    """An advanced dual-output model with both shared and task-specific layers."""
    
    def __init__(self, input_features: int = 2, shared_size: int = 12, task_specific_size: int = 8):
        super().__init__()
        
        # TODO: Create shared representation layer
        # Hint: This captures common patterns useful for both tasks
        self.shared_layer = None
        
        # TODO: Create task-specific hidden layers
        # Hint: Mode classification might benefit from different features than intensity regression
        self.mode_specific_layer = None
        self.intensity_specific_layer = None
        
        # TODO: Create output layers
        self.mode_output = None
        self.intensity_output = None
    
    def forward(self, features: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        # TODO: Implement forward pass with both shared and task-specific processing
        # Architecture: input → shared → [mode_specific, intensity_specific] → outputs
        pass

def compare_architectures():
    """Compare simple shared vs enhanced shared+specific architectures."""
    # TODO: Train both architectures and compare:
    # - Training speed
    # - Final accuracy/error
    # - Model complexity (parameter count)
    # Which architecture works better for Suki's dual nature?
    pass

# TRIAL: Design and compare different multi-task architectures
# SUCCESS: Understand the trade-offs between shared and task-specific processing
# INSIGHT: Learn that architecture design is itself a form of inductive bias

### Extension 4: Suki's Temporal Behavior Mystery
*"Time flows like a river, and Suki's nature flows with it."*

*Suki suddenly appears, sits perfectly still for exactly 3.7 seconds, then gracefully leaps onto Master Pai-Torch's shoulder*

*Master Pai-Torch smiles knowingly* "The sacred cat reminds us that her behavior is not just about the present moment, but about the flow of time itself. What if her current mode influences how her intensity changes? What if hunting behavior creates momentum that affects future intensity, while napping creates a different temporal pattern?"

**NEW CONCEPTS**: Sequence modeling, temporal dependencies, state-dependent dynamics
**DIFFICULTY**: +45% (advanced Dan 1, introducing time-series concepts)
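
To make "state-dependent dynamics" concrete, the sketch below rolls a single intensity value forward under a fixed mode sequence. It is a deterministic toy, not the kata's data generator. The helper name `simulate_intensity` and every numeric parameter (`start`, `build_rate`, `decay`, `jump`) are assumptions chosen for illustration.

```python
def simulate_intensity(modes, start=40.0, build_rate=2.0, decay=0.9, jump=10.0):
    """Roll intensity forward one step at a time, driven by the mode sequence.

    All numeric defaults are illustrative assumptions, not values from the kata.
    """
    intensities = [start]
    for previous_mode, current_mode in zip(modes[:-1], modes[1:]):
        value = intensities[-1]
        if current_mode != previous_mode:
            # Mode transition: jump up when hunting starts, down when napping starts
            value += jump if current_mode == 0 else -jump
        elif current_mode == 0:
            value += build_rate   # hunting: intensity builds up additively
        else:
            value *= decay        # napping: intensity decays multiplicatively
        intensities.append(max(0.0, min(100.0, value)))  # keep within 0-100
    return intensities


mode_sequence = [0] * 5 + [1] * 5   # five hunting steps, then five napping steps
intensity_sequence = simulate_intensity(mode_sequence)
for step, (mode, intensity) in enumerate(zip(mode_sequence, intensity_sequence)):
    print(f"t={step} mode={'hunting' if mode == 0 else 'napping'} intensity={intensity:.2f}")
```

The next intensity depends on both the current mode and the previous intensity, which a model that sees only `[hours_since_meal, spiritual_energy]` at one instant cannot represent.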

In [None]:
def generate_temporal_suki_data(sequence_length: int = 50, n_sequences: int = 20):
    """
    Generate time-series data showing how Suki's mode affects her intensity patterns.
    
    Returns:
        Sequences where current mode influences future intensity changes
    """
    # TODO: Create sequences where:
    # - Hunting mode: intensity tends to build up over time
    # - Napping mode: intensity tends to decay over time
    # - Mode transitions create intensity jumps
    pass

def analyze_temporal_patterns(sequences):
    """
    Discover how mode affects intensity evolution over time.
    """
    # TODO: Analyze:
    # - How does intensity change during hunting sequences?
    # - How does intensity change during napping sequences?
    # - What happens at mode transitions?
    pass

def visualize_temporal_wisdom(sequences):
    """Show how Suki's modes create different temporal dynamics."""
    # TODO: Create time-series plots showing:
    # - Mode over time (discrete)
    # - Intensity over time (continuous)
    # - How they interact across different sequences
    pass

# TRIAL: Discover and model the temporal relationships in Suki's behavior
# SUCCESS: Understand how classification and regression can be temporally coupled
# MASTERY: Recognize that static models miss the dynamic dance of real behavior

## 🔥 CORRECTING YOUR FORM: A STANCE IMBALANCE

*Master Pai-Torch observes your training ritual with a careful eye*

"Your eager mind races ahead of your disciplined form, Grasshopper. See how your dual-task stance wavers? The spirits of classification and regression must dance in harmony, but I observe discord in your technique."

*A previous disciple left this flawed dual training ritual. The form has lost its balance - can you restore proper technique?*

In [None]:
def unsteady_dual_training(model, features, mode_targets, intensity_targets, epochs=800):
    """This dual training stance has lost its balance - your form needs correction! 🥋"""
    classification_criterion = nn.CrossEntropyLoss()
    regression_criterion = nn.MSELoss()
    optimizer = optim.Adam(model.parameters(), lr=0.01)

    for epoch in range(epochs):
        # Forward pass - get both predictions
        mode_logits, intensity_predictions = model(features)
        
        # Calculate losses
        classification_loss = classification_criterion(mode_logits, mode_targets)
        regression_loss = regression_criterion(intensity_predictions, intensity_targets)
        total_loss = classification_loss + regression_loss
        
        # Backward pass and optimization
        total_loss.backward()
        optimizer.step()

        if epoch % 200 == 0:
            print(f'Epoch {epoch}: Total Loss = {total_loss.item():.4f}')
            print(f'  Classification: {classification_loss.item():.4f}')
            print(f'  Regression: {regression_loss.item():.4f}')

    return model

# DEBUGGING CHALLENGE: Can you spot the critical error in this dual training ritual?
# HINT: The Gradient Spirits from both tasks are accumulating and interfering with each other
# MASTER'S WISDOM: "In dual-task learning, the undisciplined gradient accumulates old wisdom from both classification and regression, creating confusion in the parameter updates. The wise practitioner clears the slate before each new learning cycle."

# What happens when you run this code? What symptoms do you observe?
# How would you fix this to restore proper dual-task training form?
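
The hint about accumulating gradients can be observed in isolation, away from the training loop. The sketch below uses a single scalar parameter and a trivial loss, both made up for illustration. It shows that each `backward()` call adds to the stored `.grad` rather than replacing it.

```python
import torch

weight = torch.tensor(1.0, requires_grad=True)

(3.0 * weight).backward()
first_grad = weight.grad.item()         # gradient of 3*w with respect to w

(3.0 * weight).backward()               # second backward pass without clearing
accumulated_grad = weight.grad.item()   # the new gradient was added to the old one

weight.grad.zero_()                     # clear the stored gradient by hand
(3.0 * weight).backward()
fresh_grad = weight.grad.item()

print(first_grad, accumulated_grad, fresh_grad)
```

In a training loop, `optimizer.zero_grad()` plays the role of the manual `weight.grad.zero_()` call for every parameter the optimizer manages.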

## 🎓 THE GRADUATION CEREMONY

*Master Pai-Torch rises slowly, a satisfied gleam in ancient eyes*

"Exceptional work, Grasshopper. You have achieved something truly remarkable - the ability to see both the forest and the trees, the category and the quantity, the discrete and the continuous. Your neural network now speaks two languages fluently, understanding both 'what type' and 'how much' of Suki's mystical nature."

*Suki purrs approvingly and demonstrates a perfect hunting pounce followed immediately by a serene nap*

"The sacred cat herself approves. You have learned that real-world problems rarely ask just one question - they demand understanding of multiple interrelated aspects simultaneously. Classification and regression are not competitors, but dance partners in the grand ballet of machine learning."

### 🏆 MASTERY ACHIEVED

**Core Skills Mastered:**
- ✅ Multi-output neural network architecture
- ✅ Combining classification and regression in a single model
- ✅ Different loss functions for different output types
- ✅ Dual performance evaluation (accuracy + MSE)
- ✅ Understanding when to use discrete vs continuous prediction

**Advanced Techniques Explored:**
- ✅ Loss weighting and task balancing
- ✅ Prediction confidence and uncertainty
- ✅ Shared vs task-specific architectures
- ✅ Temporal dependencies in multi-task learning

**Sacred Wisdom Gained:**
*"The master understands that reality is neither purely categorical nor purely continuous, but a harmonious blend of both. True intelligence lies not in choosing between classification and regression, but in knowing when and how to use each for different aspects of the same phenomenon."*

### 🚀 THE PATH FORWARD

Your journey as a Temple Sweeper reaches its pinnacle, but the path to mastery continues:

- **Dan 2 (Temple Guardian)**: Learn to protect against overfitting in multi-task scenarios
- **Dan 3 (Weapon Master)**: Explore specialized architectures for complex multi-output problems
- **Dan 4 (Combat Innovator)**: Design custom loss functions for novel multi-task challenges
- **Dan 5 (Mystic Arts Master)**: Create generative models that produce structured multi-modal outputs

*Master Pai-Torch bows respectfully*

"Go forth, newly-promoted Guardian candidate. The temple doors await your protection, and with them, new mysteries of regularization, validation, and robust learning. May the dual spirits of classification and regression guide your continued journey!"

*Suki meows once - a sound that somehow conveys both categorical certainty and regression-worthy intensity*