# Model-Agnostic Meta-Learning (MAML)

**Model-Agnostic Meta-Learning (MAML)** is one of the most influential algorithms in meta-learning. It learns initial parameters that can be **fine-tuned to new tasks** with just a few gradient steps.

In simple words: MAML helps models *learn how to learn* quickly.

## 📘 1. Motivation

Instead of training a separate model for every task, we want a model that can **adapt** to a new task using a small amount of data. MAML achieves this by optimizing the model parameters for **fast adaptation**.

💡 Example: A model trained with MAML on many different classification tasks can quickly adapt to a task with **new, unseen classes** using just a few labeled samples per class.

## ⚙️ 2. MAML Workflow

The training involves two nested loops:

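1. **Inner Loop:** For each task, start from the shared parameters and take a few gradient descent steps on that task's training data to obtain task-adapted parameters.
2. **Outer Loop:** Update the shared *meta-parameters* using the gradients of every task's loss on held-out data, evaluated at that task's adapted parameters.

This way, the outer loop learns an initialization that is *sensitive* to each task's loss: a few small gradient steps away from it produce large improvements, which is what enables fast learning.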

## 🧩 3. Mathematical Formulation

Let \( \theta \) be the model parameters and \( T_i \) be a task with loss \( L_{T_i}(\theta) \).

- Inner Loop (Task-specific adaptation):
  \[ \theta'_i = \theta - \alpha \nabla_\theta L_{T_i}(\theta) \]

- Outer Loop (Meta-update):
  \[ \theta \leftarrow \theta - \beta \nabla_\theta \sum_i L_{T_i}(\theta'_i) \]

where \( \alpha \) is the inner learning rate and \( \beta \) is the meta learning rate.
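
Applying the chain rule to the meta-update makes the second-order term explicit. The following is a short derivation sketch, assuming a single inner step and a twice-differentiable loss; \( I \) denotes the identity matrix, which the text above does not name:

\[ \nabla_\theta L_{T_i}(\theta'_i) = \left( I - \alpha \nabla^2_\theta L_{T_i}(\theta) \right) \nabla_{\theta'_i} L_{T_i}(\theta'_i) \]

The factor \( I - \alpha \nabla^2_\theta L_{T_i}(\theta) \) is the Jacobian of \( \theta'_i \) with respect to \( \theta \). The Hessian inside it is the source of the second-order cost; replacing the whole factor with \( I \) gives a first-order approximation of the meta-gradient.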

## 🧱 4. Implementation (PyTorch Example)

We'll create a simple model and a minimal version of the MAML training loop.

In [ ]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.optim import Adam

class MAMLNetwork(nn.Module):
    """Small MLP regressor: 1 input -> 40 hidden units (ReLU) -> 1 output."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(1, 40)
        self.fc2 = nn.Linear(40, 1)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        return self.fc2(x)

### Inner Loop: Task-Specific Adaptation

In [ ]:
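def inner_update(model, loss_fn, x, y, lr=0.01):
    """Return parameters adapted by one gradient step; the model itself is not modified."""
    y_pred = model(x)
    loss = loss_fn(y_pred, y)
    params = list(model.parameters())
    # create_graph=True keeps the gradient's own graph so the meta-update can differentiate through this step.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    updated_params = [p - lr * g for p, g in zip(params, grads)]
    return updated_params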
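
The adapted parameters are only useful for meta-learning if they stay connected to the original weights in the autograd graph. The sketch below checks this on made-up data. It assumes PyTorch 2.0 or newer for `torch.func.functional_call`, and `net` is a hypothetical stand-in with the same 1 -> 40 -> 1 shape as the network above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call  # assumption: PyTorch 2.0 or newer

torch.manual_seed(0)

# Hypothetical stand-in model with the same 1 -> 40 -> 1 shape as the notebook's network.
net = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))
names = [name for name, _ in net.named_parameters()]
params = list(net.parameters())

# Toy support (training) and query (held-out) data, made up for illustration.
x_s, y_s = torch.randn(10, 1), torch.randn(10, 1)
x_q, y_q = torch.randn(10, 1), torch.randn(10, 1)

# One inner step; create_graph=True records the gradient computation itself.
support_loss = F.mse_loss(net(x_s), y_s)
grads = torch.autograd.grad(support_loss, params, create_graph=True)
adapted = [p - 0.01 * g for p, g in zip(params, grads)]

# Run the model with the adapted weights, then backpropagate to the original ones.
query_pred = functional_call(net, dict(zip(names, adapted)), (x_q,))
query_loss = F.mse_loss(query_pred, y_q)
query_loss.backward()

# The adapted tensors are non-leaf nodes, and every original parameter received a gradient.
adapted_is_non_leaf = all(a.grad_fn is not None for a in adapted)
grads_reached_theta = all(p.grad is not None for p in params)
print(adapted_is_non_leaf, grads_reached_theta)
```

Calling the model as `net(x_q)` instead of going through `functional_call` would evaluate the original weights and silently ignore the inner step.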

### Outer Loop: Meta-Update

In [ ]:
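from torch.func import functional_call  # requires PyTorch 2.0 or newer

def maml_train(model, tasks, loss_fn, lr_inner=0.01, lr_outer=0.001, epochs=3):
    optimizer = Adam(model.parameters(), lr=lr_outer)
    param_names = [name for name, _ in model.named_parameters()]

    for epoch in range(epochs):
        meta_loss = 0.0
        for x_train, y_train, x_val, y_val in tasks:
            # Inner loop: adapt to the task on its training data.
            adapted_params = inner_update(model, loss_fn, x_train, y_train, lr_inner)

            # Evaluate the held-out data with the adapted parameters, not the original ones.
            adapted = dict(zip(param_names, adapted_params))
            y_pred_val = functional_call(model, adapted, (x_val,))
            loss_val = loss_fn(y_pred_val, y_val)
            meta_loss += loss_val

        # Outer loop: backpropagate through the inner updates to the meta-parameters.
        optimizer.zero_grad()
        meta_loss.backward()
        optimizer.step()

        print(f"Epoch {epoch+1}, Meta Loss: {meta_loss.item():.4f}")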
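
The training loop expects `tasks` to yield `(x_train, y_train, x_val, y_val)` tuples, but the notebook does not define any. One possible way to build them is sketched below with a sine-wave regression toy problem, which suits a network with one input and one output. The helper `sample_sine_task` and all of its ranges are assumptions made for illustration.

```python
import math
import torch

def sample_sine_task(n_train=10, n_val=10, generator=None):
    """Hypothetical helper: one regression task y = amplitude * sin(x + phase)."""
    # Each task has its own amplitude and phase (ranges chosen for illustration).
    amplitude = torch.empty(1).uniform_(0.1, 5.0, generator=generator)
    phase = torch.empty(1).uniform_(0.0, math.pi, generator=generator)
    x = torch.empty(n_train + n_val, 1).uniform_(-5.0, 5.0, generator=generator)
    y = amplitude * torch.sin(x + phase)
    # Split the same task's samples into training and held-out parts.
    return x[:n_train], y[:n_train], x[n_train:], y[n_train:]

gen = torch.Generator().manual_seed(0)
# A list (not a generator) so the same tasks can be iterated in every epoch.
tasks = [sample_sine_task(generator=gen) for _ in range(8)]
```

With the cells above already run, `maml_train(MAMLNetwork(), tasks, F.mse_loss)` would train on this batch of tasks.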

## 🧠 5. Key Insights

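- MAML is **model-agnostic**: it works with any model that is trained by gradient descent, regardless of architecture.
- It applies to any objective that can be optimized with gradient descent, including **supervised** and **reinforcement** learning, and in principle **unsupervised** tasks.
- It learns initialization weights that are optimized for *fast adaptation*, not for performance before adaptation.
- The full meta-gradient requires **second-order derivatives** because it differentiates through the inner-loop update, which is expensive in both compute and memory.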

## ⚡ 6. Variants of MAML

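- **First-Order MAML (FOMAML):** Approximates the meta-gradient by ignoring second-order terms, using the gradient at the adapted parameters directly.
- **Reptile:** A simpler first-order method that moves the initialization toward the task-adapted weights, with no second-order derivatives.
- **Meta-SGD:** Learns the inner-loop learning rates along with the initial parameters.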
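
To make the first-order idea concrete, here is a minimal FOMAML sketch for a single task. It assumes PyTorch 2.0 or newer, and `fomaml_task_grads` is a hypothetical helper name, not part of any library. The adapted weights are detached, so the held-out gradient never flows back through the inner step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call  # assumption: PyTorch 2.0 or newer

def fomaml_task_grads(model, loss_fn, x_train, y_train, x_val, y_val, lr_inner=0.01):
    """Hypothetical helper: first-order meta-gradient for one task."""
    names = [name for name, _ in model.named_parameters()]
    params = list(model.parameters())

    # Inner step without create_graph: no second-order graph is built.
    train_loss = loss_fn(model(x_train), y_train)
    grads = torch.autograd.grad(train_loss, params)
    adapted = [(p - lr_inner * g).detach().requires_grad_(True)
               for p, g in zip(params, grads)]

    # Held-out loss at the adapted weights; differentiate w.r.t. those weights only.
    val_pred = functional_call(model, dict(zip(names, adapted)), (x_val,))
    val_loss = loss_fn(val_pred, y_val)
    meta_grads = torch.autograd.grad(val_loss, adapted)
    return meta_grads, val_loss.item()

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(1, 40), nn.ReLU(), nn.Linear(40, 1))  # stand-in model
x = torch.randn(20, 1)
y = torch.sin(x)  # toy targets, made up for illustration

meta_grads, val_loss = fomaml_task_grads(net, F.mse_loss, x[:10], y[:10], x[10:], y[10:])

# Use the first-order gradients as if they were gradients of the initial weights.
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
optimizer.zero_grad()
for p, g in zip(net.parameters(), meta_grads):
    p.grad = g.clone()
optimizer.step()
```

With several tasks, the per-task gradients would be summed before the optimizer step. The trade-off is lower memory and compute in exchange for an approximate meta-gradient.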

## 💡 7. Real-World Applications

- Few-shot image recognition.
- Robotics: adapting to new environments.
- Healthcare: adapting to new patient data.
- Reinforcement learning agents that generalize across tasks.

## ✅ 8. Summary

- **MAML** is a cornerstone of modern meta-learning.
- It finds an initialization that enables fast learning on new tasks.
- Though computationally heavy, its principles inspire many later algorithms.

➡️ Next: `05-Few_Shot_Learning_with_MAML.ipynb`